Embedding layer (Input)

Source files in EpyNN/epynn/embedding/.

See Appendix - Notations for mathematical conventions.

Layer architecture


In EpyNN, the Embedding - or input - layer must be the first layer of every Neural Network. This layer is not trainable but binds the data to be forwarded through the network. Importantly, it contains specific procedures for data pre-processing in the epynn.embedding.dataset module.

class epynn.embedding.models.Embedding(X_data=None, Y_data=None, relative_size=(2, 1, 1), batch_size=None, X_encode=False, Y_encode=False, X_scale=False)[source]

Bases: epynn.commons.models.Layer

Definition of an embedding layer prototype.

  • X_data (list[list[float or str or list[float or str]]] or NoneType, optional) – Dataset containing sample features, defaults to None which returns an empty layer.

  • Y_data (list[int or list[int]] or NoneType, optional) – Dataset containing sample labels, defaults to None.

  • relative_size (tuple[int], optional) – Relative proportions of the training, validation and testing sets, defaults to (2, 1, 1).

  • batch_size (int or NoneType, optional) – For training batches, defaults to None which makes a single batch out of the training data.

  • X_encode (bool, optional) – Set to True to one-hot encode features, defaults to False.

  • Y_encode (bool, optional) – Set to True to one-hot encode labels, defaults to False.

  • X_scale (bool, optional) – Normalize sample features within [0, 1], defaults to False.

Upon instantiation, the Embedding layer can be instructed to one-hot encode sample features and/or labels. It can also apply a global scaling of features within [0, 1]. The batch_size argument can be set to prepare training batches.
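The one-hot encoding and global [0, 1] scaling options can be sketched with plain NumPy. This is a minimal illustration of the transformations the X_encode and X_scale arguments describe, not EpyNN's actual implementation; the sample data and the symbols/index helpers are hypothetical.

```python
import numpy as np

# Hypothetical string features for three samples (two features each)
X_data = [['A', 'T'], ['G', 'C'], ['T', 'A']]

# One-hot encoding (X_encode=True): map each symbol to a unit vector
symbols = sorted({f for sample in X_data for f in sample})  # ['A', 'C', 'G', 'T']
index = {s: i for i, s in enumerate(symbols)}

X = np.zeros((len(X_data), len(X_data[0]), len(symbols)))
for i, sample in enumerate(X_data):
    for j, feature in enumerate(sample):
        X[i, j, index[feature]] = 1  # One unit vector per feature

# Global scaling within [0, 1] (X_scale=True) on numeric features
X_num = np.array([[0., 5.], [10., 2.5], [7.5, 1.]])
X_scaled = (X_num - X_num.min()) / (X_num.max() - X_num.min())
```

Note that the scaling is global: a single minimum and maximum are taken over the whole array, so relative magnitudes between features are preserved.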



Wrapper for epynn.embedding.parameters.embedding_compute_shapes().


Parameters

A (numpy.ndarray) – Output of forward propagation from previous layer.

def embedding_compute_shapes(layer, A):
    """Compute forward shapes and dimensions from input for layer."""
    X = A    # Input of current layer

    layer.fs['X'] = X.shape    # (m, ...)

    layer.d['m'] = layer.fs['X'][0]        # Number of samples (m)
    layer.d['n'] = X.size // layer.d['m']  # Number of features (n)

    return None

Within an Embedding layer, shapes of interest include:

  • Input X of shape (m, …) with m equal to the number of samples. The number of input dimensions is unknown a priori.

  • The number of features n per sample can still be determined formally: it is equal to the size of the input X divided by the number of samples m.
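These two quantities follow directly from NumPy array attributes. A minimal sketch, assuming a hypothetical input of 32 samples of 28 × 28 values:

```python
import numpy as np

# Input with an a priori unknown number of dimensions, e.g. (m, 28, 28)
X = np.random.rand(32, 28, 28)

m = X.shape[0]   # Number of samples (m): first dimension of X
n = X.size // m  # Number of features (n) per sample: 28 * 28 = 784
```

Integer division is safe here because X.size is by construction a multiple of m.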

Note that:

  • The Embedding layer is like a pass-through layer except that it is the first layer of the Network. Therefore, it does not receive an input from the previous layer because there is none.

  • The Embedding layer is not trainable and does not transform the data during the training phase. The input dimensions do not need to be known by the layer.



Wrapper for epynn.embedding.forward.embedding_forward().


Parameters

A (numpy.ndarray) – Output of forward propagation from previous layer.

Returns

Output of forward propagation for current layer.

Return type

numpy.ndarray
def embedding_forward(layer, A):
    """Forward propagate signal to next layer."""
    # (1) Initialize cache
    X = initialize_forward(layer, A)

    # (2) Pass forward
    A = layer.fc['A'] = X

    return A   # To next layer

The forward propagation function in the Embedding layer k includes:

  • (1): Input X in current layer k is equal to the user-provided sample features, either as a whole or in batches, depending on user choices and on whether the network runs in training or prediction mode.

  • (2): Output A of current layer k is equal to input X.

\[\begin{split}\begin{alignat*}{2} & x^{k}_{m,d_1...d_n} &&= X\_data \tag{1} \\ & a^{k}_{m,d_1...d_n} &&= x^{k}_{m,d_1...d_n} \tag{2} \end{alignat*}\end{split}\]
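The identity pass-through in steps (1)-(2) can be sketched without the rest of the framework. Here a plain dict stands in for the layer's forward cache layer.fc; the function name and data are illustrative only.

```python
import numpy as np

fc = {}   # Stand-in for layer.fc, the forward cache

def embedding_forward_sketch(A):
    """Pass-through: the Embedding output equals its input."""
    X = A              # (1) Input of the layer
    A = fc['A'] = X    # (2) Output equals input, cached for backward pass
    return A           # To next layer

X_batch = np.arange(6.).reshape(2, 3)   # Two samples, three features
A = embedding_forward_sketch(X_batch)
```

The output array is the very same object as the input, which is what makes the layer a zero-cost entry point for the data.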



Wrapper for epynn.embedding.backward.embedding_backward().


Parameters

dX (numpy.ndarray) – Output of backward propagation from next layer.

Returns

Output of backward propagation for current layer.

Return type

numpy.ndarray

def embedding_backward(layer, dX):
    """Backward propagate error gradients to previous layer."""
    # (1) Initialize cache
    dA = initialize_backward(layer, dX)

    # (2) Pass backward
    dX = layer.bc['dX'] = dA

    return None    # No previous layer

The backward propagation function in the Embedding layer k includes:

  • (1): dA is the gradient of the loss with respect to the output of forward propagation A for current layer k. It is equal to the gradient of the loss with respect to the input of forward propagation for next layer k+1.

  • (2): The gradient of the loss dX with respect to the input of forward propagation X for current layer k is mathematically equal to dA. However, the Embedding layer returns None because there is no previous layer.

\[\begin{split}\begin{alignat*}{2} & \delta^{\kp}_{m,d_1...d_n} &&= \frac{\partial \mathcal{L}}{\partial a^{k}_{m,d_1...d_n}} = \frac{\partial \mathcal{L}}{\partial x^{\kp}_{m,d_1...d_n}} \tag{1} \\ & \delta^{k}_{m,d_1...d_n} &&= \frac{\partial \mathcal{L}}{\partial x^{k}_{m,d_1...d_n}} = \frac{\partial \mathcal{L}}{\partial a^{\km}_{m,d_1...d_n}} = \varnothing \tag{2} \end{alignat*}\end{split}\]
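The backward pass in steps (1)-(2) can likewise be sketched in isolation. A plain dict stands in for the layer's backward cache layer.bc; the function name and data are illustrative only.

```python
import numpy as np

bc = {}   # Stand-in for layer.bc, the backward cache

def embedding_backward_sketch(dX):
    """Store the incoming gradient; nothing propagates past the input layer."""
    dA = dX          # (1) Gradient of the loss from next layer k+1
    bc['dX'] = dA    # (2) Mathematically dX = dA, cached on the layer...
    return None      # ...but there is no previous layer to return it to

out = embedding_backward_sketch(np.ones((2, 3)))
```

Returning None rather than the gradient array is what terminates the backward chain at the front of the network.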



Wrapper for epynn.embedding.parameters.embedding_compute_gradients(). Dummy method: there are no gradients to compute in this layer.

def embedding_compute_gradients(layer):
    """Compute gradients with respect to weight and bias for layer."""
    # No gradients to compute for Embedding layer

    return None

The Embedding layer is not a trainable layer. It has no trainable parameters such as a weight W or bias b, and therefore no parameter gradients to compute.