Fully Connected (Dense)
===============================

Source files in ``EpyNN/epynn/dense/``.

See `Appendix - Notations`_ for mathematical conventions.

.. _Appendix - Notations: glossary.html#notations

Layer architecture
------------------------------

.. image:: _static/Dense/Dense-01.svg
   :alt: Dense

A fully-connected or *Dense* layer is an object containing a number of *units*, provided with functions for parameter *initialization* and non-linear *activation* of inputs.

.. autoclass:: epynn.dense.models.Dense
   :show-inheritance:

Shapes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.dense.models.Dense.compute_shapes

.. literalinclude:: ./../epynn/dense/parameters.py
   :pyobject: dense_compute_shapes
   :language: python

Within a *Dense* layer, shapes of interest include:

* Input *X* of shape *(m, n)*, with *m* the number of samples and *n* the number of features per sample.
* Weight *W* of shape *(n, u)*, with *n* the number of features per sample and *u* the number of units in the current layer *k*.
* Bias *b* of shape *(1, u)*, with *u* the number of units in the layer.

Note that:

* The shapes of parameters *W* and *b* are independent of the number of samples *m*.
* The number of features *n* per sample may also be described as the number of units in the previous layer *k-1*, although this definition is less general.

.. image:: _static/Dense/Dense1-01.svg

Forward
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.dense.models.Dense.forward

.. literalinclude:: ./../epynn/dense/forward.py
   :pyobject: dense_forward
   :language: python

The forward propagation function in a *Dense* layer *k* includes:

* (1): Input *X* in the current layer *k* is equal to the output *A* of the previous layer *k-1*.
* (2): *Z* is computed as the dot product of *X* and *W*, to which the bias *b* is added.
* (3): Output *A* is computed by applying a non-linear *activation* function to *Z*.

Note that:

* *Z* may be referred to as the *(biased) weighted sum of inputs by parameters* or as the *linear activation product*.
* *A* may be referred to as the *non-linear activation product* or simply as the output of *Dense* layer *k*.

.. image:: _static/Dense/Dense2-01.svg

.. math::

    \begin{alignat*}{2}
    & x^{k}_{mn} &&= a^{\km}_{mn} \tag{1} \\
    \\
    & z^{k}_{mu} &&= x^{k}_{mn} \cdot W^{k}_{nu} \\
    &            &&+ b^{k}_{u} \tag{2} \\
    \\
    & a^{k}_{mu} &&= a_{act}(z^{k}_{mu}) \tag{3}
    \end{alignat*}

Backward
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.dense.models.Dense.backward

.. literalinclude:: ./../epynn/dense/backward.py
   :pyobject: dense_backward
   :language: python

The backward propagation function in a *Dense* layer *k* includes:

* (1): *dA* is the gradient of the loss with respect to the output *A* of forward propagation for the current layer *k*. It is equal to the gradient of the loss with respect to the input of forward propagation for the next layer *k+1*.
* (2): *dZ* is the gradient of the loss with respect to *Z*. It is computed by element-wise multiplication of *dA* with the derivative of the non-linear *activation* function evaluated at *Z*.
* (3): The gradient of the loss *dX* with respect to the input of forward propagation *X* for the current layer *k* is computed as the dot product of *dZ* and the transpose of *W*.

Note that:

* The expression *gradient of the loss with respect to* is equivalent to *partial derivative of the loss with respect to*.
* The variable *dA* is often referred to as the *error term* for layer *k+1*, and *dX* as the error term for layer *k*.
* In contrast to the forward pass, parameters here weight *dZ*, which has shape *(m, u)*. Therefore, the transpose of *W*, with shape *(u, n)*, is used to compute the dot product.

.. image:: _static/Dense/Dense3-01.svg

.. math::

    \begin{alignat*}{2}
    & \delta^{\kp}_{mu} &&= \frac{\partial \mathcal{L}}{\partial a^{k}_{mu}} = \frac{\partial \mathcal{L}}{\partial x^{\kp}_{mu}} \tag{1} \\
    \\
    & \frac{\partial \mathcal{L}}{\partial z^{k}_{mu}} &&= \delta^{\kp}_{mu} \\
    & &&* a_{act}'(z^{k}_{mu}) \tag{2} \\
    \\
    & \delta^{k}_{mn} &&= \frac{\partial \mathcal{L}}{\partial x^{k}_{mn}} = \frac{\partial \mathcal{L}}{\partial a^{\km}_{mn}} = \frac{\partial \mathcal{L}}{\partial z^{k}_{mu}} \cdot W^{k~{\intercal}}_{nu} \tag{3}
    \end{alignat*}

Gradients
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.dense.models.Dense.compute_gradients

.. literalinclude:: ./../epynn/dense/parameters.py
   :pyobject: dense_compute_gradients
   :language: python

The function to compute parameter gradients in a *Dense* layer *k* includes:

* (1.1): *dW* is the gradient of the loss with respect to *W*. It is computed as the dot product of the transpose of *X* and *dZ*.
* (1.2): *db* is the gradient of the loss with respect to *b*. It is computed by summing *dZ* along the axis corresponding to the number of samples *m*.

Note that:

* The transpose of *X*, with shape *(n, m)*, is used for the dot product with *dZ* of shape *(m, u)*.

.. math::

    \begin{alignat*}{2}
    & \frac{\partial \mathcal{L}}{\partial W^{k}_{nu}} &&= x^{k~{\intercal}}_{mn} \cdot \frac{\partial \mathcal{L}}{\partial z^{k}_{mu}} \tag{1.1} \\
    \\
    & \frac{\partial \mathcal{L}}{\partial b^{k}_{u}} &&= \sum_{m = 1}^M \frac{\partial \mathcal{L}}{\partial z^{k}_{mu}} \tag{1.2}
    \end{alignat*}
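Taken together, the *Shapes*, *Forward*, *Backward* and *Gradients* steps above can be reproduced in a few lines of plain NumPy. The sketch below is illustrative only and independent from the EpyNN source: the sigmoid activation, the random inputs and all variable names are arbitrary choices made for this example.

.. code-block:: python

    import numpy as np


    def sigmoid(z):
        """Example non-linear activation a_act."""
        return 1 / (1 + np.exp(-z))


    def sigmoid_prime(z):
        """Derivative a_act' of the activation."""
        s = sigmoid(z)
        return s * (1 - s)


    m, n, u = 4, 3, 2                       # m samples, n features, u units

    rng = np.random.default_rng(0)

    X = rng.standard_normal((m, n))         # Input X of shape (m, n)
    W = rng.standard_normal((n, u))         # Weight W of shape (n, u)
    b = np.zeros((1, u))                    # Bias b of shape (1, u)

    # Forward: (2) linear activation product, (3) non-linear activation.
    Z = np.dot(X, W) + b                    # (m, u)
    A = sigmoid(Z)                          # (m, u)

    # Backward: dA would come from layer k+1; random here for illustration.
    dA = rng.standard_normal((m, u))

    dZ = dA * sigmoid_prime(Z)              # (2): element-wise product, (m, u)
    dX = np.dot(dZ, W.T)                    # (3): error term for layer k, (m, n)

    # Gradients: (1.1) and (1.2).
    dW = np.dot(X.T, dZ)                    # (n, u), same shape as W
    db = np.sum(dZ, axis=0, keepdims=True)  # (1, u), same shape as b

    assert dX.shape == X.shape and dW.shape == W.shape and db.shape == b.shape

Note how the parameter gradients *dW* and *db* always match the shapes of *W* and *b*, regardless of the number of samples *m*, as stated in the *Shapes* subsection.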
Live examples
------------------------------

The Dense layer is used as an output layer in each of the `Network training examples`_ provided with EpyNN.

Pure Feed-Forward Neural Networks within these examples can be accessed directly from:

* `Dummy Boolean - Basics with Perceptron`_
* `Dummy string - Feed-Forward (FF)`_
* `Protein Modification - Feed-Forward (FF)`_
* `Dummy time - Feed-Forward (FF)`_
* `Author and music - Feed-Forward (FF)`_
* `Dummy image - Feed-Forward (FF)`_
* `MNIST Database - Feed-Forward (FF)`_

.. _Network training examples: run_examples.html
.. _Dummy Boolean - Basics with Perceptron: epynnlive/dummy_boolean/train.html#Perceptron---Single-layer-Neural-Network
.. _Dummy string - Feed-Forward (FF): epynnlive/dummy_string/train.html#Feed-Forward-(FF)
.. _Protein Modification - Feed-Forward (FF): epynnlive/ptm_protein/train.html#Feed-Forward-(FF)
.. _Dummy time - Feed-Forward (FF): epynnlive/dummy_time/train.html#Feed-Forward-(FF)
.. _Author and music - Feed-Forward (FF): epynnlive/author_music/train.html#Feed-Forward-(FF)
.. _Dummy image - Feed-Forward (FF): epynnlive/dummy_image/train.html#Feed-Forward-(FF)
.. _MNIST Database - Feed-Forward (FF): epynnlive/captcha_mnist/train.html#Feed-Forward-(FF)
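For a dependency-free complement to the live examples, the sketch below trains a single *Dense* unit by gradient descent on a toy logical OR problem, using the forward pass and parameter gradients derived above. This is a didactic sketch, not EpyNN code: the dataset, learning rate, number of epochs and the use of a sigmoid output with binary cross-entropy are assumptions made for the illustration.

.. code-block:: python

    import numpy as np


    def sigmoid(z):
        return 1 / (1 + np.exp(-z))


    # Toy binary task: m=4 samples, n=2 features, u=1 unit (logical OR).
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    Y = np.array([[0.], [1.], [1.], [1.]])

    rng = np.random.default_rng(1)
    W = rng.standard_normal((2, 1)) * 0.1   # (n, u)
    b = np.zeros((1, 1))                    # (1, u)

    lr = 0.1                                # learning rate

    for epoch in range(1000):
        # Forward pass, equations (1)-(3).
        Z = np.dot(X, W) + b
        A = sigmoid(Z)

        # For a sigmoid output with binary cross-entropy loss,
        # dL/dZ simplifies to (A - Y).
        dZ = A - Y

        # Parameter gradients, equations (1.1) and (1.2).
        dW = np.dot(X.T, dZ)
        db = np.sum(dZ, axis=0, keepdims=True)

        # Gradient-descent update.
        W -= lr * dW
        b -= lr * db

    # Predictions approach [[0.], [1.], [1.], [1.]].
    print(np.round(sigmoid(np.dot(X, W) + b), 2))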