Embedding layer (Input)
Source files in EpyNN/epynn/embedding/.
See Appendix - Notations for mathematical conventions.
Layer architecture
In EpyNN, the Embedding - or input - layer must be the first layer of every Neural Network. This layer is not trainable but binds the data to be forwarded through the network. Importantly, it contains specific procedures for data pre-processing in the epynn.embedding.dataset module.
- class epynn.embedding.models.Embedding(X_data=None, Y_data=None, relative_size=(2, 1, 0), batch_size=None, X_encode=False, Y_encode=False, X_scale=False)[source]
Bases: epynn.commons.models.Layer
Definition of an embedding layer prototype.
- Parameters
X_data (list[list[float or str or list[float or str]]] or NoneType, optional) – Dataset containing sample features, defaults to None which returns an empty layer.
Y_data (list[int or list[int]] or NoneType, optional) – Dataset containing sample labels, defaults to None.
relative_size (tuple[int], optional) – Relative sizes of the training, validation and testing sets, defaults to (2, 1, 0).
batch_size (int or NoneType, optional) – For training batches, defaults to None which makes a single batch out of the training data.
X_encode (bool, optional) – Set to True to one-hot encode sample features, defaults to False.
Y_encode (bool, optional) – Set to True to one-hot encode sample labels, defaults to False.
X_scale (bool, optional) – Set to True to normalize sample features within [0, 1], defaults to False.
Upon instantiation, the Embedding layer can be instructed to one-hot encode sample features and/or labels. It can also apply a global scaling of features within [0, 1]. The batch_size argument can be set to prepare training batches.
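As a minimal usage sketch of this constructor, under the assumption that the feature values and labels below are made up purely for illustration:

import numpy as np

from epynn.embedding.models import Embedding

# Hypothetical dataset: 10 samples with 4 numerical features each
X_data = np.random.uniform(0, 1, (10, 4)).tolist()
# One hypothetical binary label per sample
Y_data = [i % 2 for i in range(10)]

# Bind the data to the network input: split into training, validation
# and testing sets with relative sizes 2:1:1, keep a single training batch
# and scale features within [0, 1]
embedding = Embedding(X_data=X_data,
                      Y_data=Y_data,
                      relative_size=(2, 1, 1),
                      batch_size=None,
                      X_scale=True)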
Shapes
- Embedding.compute_shapes(A)[source]
Wrapper for epynn.embedding.parameters.embedding_compute_shapes().
- Parameters
A (numpy.ndarray) – Output of forward propagation from previous layer.

def embedding_compute_shapes(layer, A):
    """Compute forward shapes and dimensions from input for layer.
    """
    X = A    # Input of current layer

    layer.fs['X'] = X.shape    # (m, .. )

    layer.d['m'] = layer.fs['X'][0]        # Number of samples (m)
    layer.d['n'] = X.size // layer.d['m']  # Number of features (n)

    return None

Within an Embedding layer, shapes of interest include:
- Input X of shape (m, …) with m equal to the number of samples. The number of input dimensions is unknown a priori.
- The number of features n per sample can still be determined formally: it is equal to the size of the input X divided by the number of samples m (see the sketch below).
Note that:
- The Embedding layer is like a pass-through layer, except that it is the first layer of the Network. Therefore, it does not receive an input from a previous layer, because there is none.
- The Embedding layer is not trainable and does not transform the data during the training phase. The input dimensions do not need to be known by the layer.
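The shape logic above can be reproduced as a small self-contained sketch with plain NumPy; the arrays are made up for illustration:

import numpy as np

# Hypothetical input: m = 8 samples of 5 features each
X = np.zeros((8, 5))

m = X.shape[0]       # Number of samples (m) -> 8
n = X.size // m      # Number of features per sample (n) -> 5

# The same rule holds for inputs with more dimensions,
# e.g. sequence data of shape (m, steps, features)
X_seq = np.zeros((8, 3, 5))
n_seq = X_seq.size // X_seq.shape[0]    # -> 15 features per sample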
Forward
- Embedding.forward(A)[source]
Wrapper for epynn.embedding.forward.embedding_forward().
- Parameters
A (numpy.ndarray) – Output of forward propagation from previous layer.
- Returns
Output of forward propagation for current layer.
- Return type
numpy.ndarray

def embedding_forward(layer, A):
    """Forward propagate signal to next layer.
    """
    # (1) Initialize cache
    X = initialize_forward(layer, A)

    # (2) Pass forward
    A = layer.fc['A'] = X

    return A    # To next layer

The forward propagation function in the Embedding layer k includes:
(1): Input X in current layer k is equal to the user-provided sample features, as a whole or in batches, depending on user choices and on training or prediction mode.
(2): Output A of current layer k is equal to input X.
\[\begin{split}\begin{alignat*}{2} & x^{k}_{m,d_1...d_n} &&= X\_data \tag{1} \\ & a^{k}_{m,d_1...d_n} &&= x^{k}_{m,d_1...d_n} \tag{2} \end{alignat*}\end{split}\]
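The identity mapping in (1)-(2) can be sketched with plain NumPy; the function and cache dictionary below are illustrative stand-ins for the EpyNN internals, not the library code itself:

import numpy as np

def embedding_forward_sketch(fc, A):
    """Pass-through forward step: the layer output equals its input."""
    X = A              # (1) X is the user-provided batch of sample features
    A = fc['A'] = X    # (2) Output A equals X, cached for the backward pass
    return A

fc = {}                                  # Stand-in for layer.fc
X_batch = np.random.rand(4, 3)           # Hypothetical batch: 4 samples, 3 features
A_out = embedding_forward_sketch(fc, X_batch)
assert np.array_equal(A_out, X_batch)    # Identity mapping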
Backward
- Embedding.backward(dX)[source]
Wrapper for epynn.embedding.backward.embedding_backward().
- Parameters
dX (numpy.ndarray) – Output of backward propagation from next layer.
- Returns
Output of backward propagation for current layer.
- Return type
numpy.ndarray

def embedding_backward(layer, dX):
    """Backward propagate error gradients to previous layer.
    """
    # (1) Initialize cache
    dA = initialize_backward(layer, dX)

    # (2) Pass backward
    dX = layer.bc['dX'] = dA

    return None    # No previous layer

The backward propagation function in the Embedding layer k includes:
(1): dA is the gradient of the loss with respect to the output of forward propagation A for current layer k. It is equal to the gradient of the loss with respect to the input of forward propagation for next layer k+1.
(2): The gradient of the loss dX with respect to the input of forward propagation X for current layer k is mathematically equal to dA. However, the Embedding layer returns None because there is no previous layer.
\[\begin{split}\begin{alignat*}{2} & \delta^{\kp}_{m,d_1...d_n} &&= \frac{\partial \mathcal{L}}{\partial a^{k}_{m,d_1...d_n}} = \frac{\partial \mathcal{L}}{\partial x^{\kp}_{m,d_1...d_n}} \tag{1} \\ & \delta^{k}_{m,d_1...d_n} &&= \frac{\partial \mathcal{L}}{\partial x^{k}_{m,d_1...d_n}} = \frac{\partial \mathcal{L}}{\partial a^{\km}_{m,d_1...d_n}} = \varnothing \tag{2} \end{alignat*}\end{split}\]
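The corresponding backward step can be sketched the same way; again, the names below are illustrative stand-ins rather than the EpyNN internals:

import numpy as np

def embedding_backward_sketch(bc, dX):
    """Cache the incoming gradient; return None since there is no previous layer."""
    dA = dX              # (1) Gradient of the loss w.r.t. the layer output A
    dX = bc['dX'] = dA   # (2) Identical gradient w.r.t. the layer input X
    return None          # Nothing to propagate before the input layer

bc = {}                                   # Stand-in for layer.bc
dA_next = np.random.rand(4, 3)            # Hypothetical gradient from layer k+1
out = embedding_backward_sketch(bc, dA_next)
assert out is None and np.array_equal(bc['dX'], dA_next)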
Gradients
- Embedding.compute_gradients()[source]
Wrapper for epynn.embedding.parameters.embedding_compute_gradients(). Dummy method, there are no gradients to compute in layer.

def embedding_compute_gradients(layer):
    """Compute gradients with respect to weight and bias for layer.
    """
    # No gradients to compute for Embedding layer

    return None

The Embedding layer is not a trainable layer. It has no trainable parameters such as weight W or bias b. Therefore, there are no parameter gradients to compute.
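For completeness, a trivial sketch of this no-op gradient step for a layer with no trainable parameters; the empty dict is a made-up stand-in for a gradients container:

def embedding_compute_gradients_sketch(g):
    """No weight W or bias b in the Embedding layer, so nothing to compute."""
    return None

g = {}                                  # Stand-in gradients container
embedding_compute_gradients_sketch(g)
assert g == {}                          # No parameter gradients were produced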