.. EpyNN documentation master file, created by
sphinx-quickstart on Tue Jul 6 18:46:11 2021.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
.. toctree::
Appendix
===============================
Notations
-------------------------------
These conventions relate to the mathematical expressions on EpyNN's website. Divergences from the Python code are highlighted where applicable.
Arithmetic operators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:math:`+` and :math:`-`
Element-wise addition/subtraction between matrices, addition/subtraction of a scalar to/from each element of a matrix, or addition/subtraction between scalars.
:math:`*` and :math:`/`
Element-wise multiplication/division between matrices (See `Hadamard product (matrices)`_ on Wikipedia), multiplication/division of a matrix by a scalar, or multiplication/division between scalars.
:math:`\cdot`
Dot product between matrices (See `Dot product`_ on Wikipedia).
.. _Hadamard product (matrices): https://en.wikipedia.org/wiki/Hadamard_product_(matrices)
.. _Dot product: https://en.wikipedia.org/wiki/Dot_product
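The distinction between element-wise and dot products can be sketched with NumPy, which EpyNN builds on:

```python
import numpy as np

# Two 2x2 matrices to contrast element-wise and dot products.
A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

hadamard = A * B      # element-wise (Hadamard) product
dot = np.dot(A, B)    # matrix (dot) product

print(hadamard)       # [[ 5. 12.] [21. 32.]]
print(dot)            # [[19. 22.] [43. 50.]]
```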
Names of matrices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Layers input and output:**
:math:`X`
Input of forward propagation.
:math:`A`
Output of forward propagation.
:math:`\frac{\partial \mathcal{L}}{\partial A}`
Input of backward propagation. Referred to as ``dA`` in Python code.
:math:`\frac{\partial \mathcal{L}}{\partial X}`
Output of backward propagation. Referred to as ``dX`` in Python code.
**Layers parameters:**
:math:`W`
Weight applied to inputs for *Dense* and *Convolution* layers.
:math:`U`
Weight applied to inputs for *RNN*, *LSTM* and *GRU* layers.
:math:`V`
Weight applied to hidden cell state for *RNN*, *LSTM* and *GRU* layers.
:math:`b`
Bias added to weighted sums.
**Linear and non-linear activation products:**
:math:`Z~and~A`
For *Dense* and *Convolution* layers, :math:`Z` is the weighted sum of inputs, also known as the linear activation product, while :math:`A` is the non-linear activation product.
:math:`Z~and~A`
For *Embedding*, *Pooling*, *Dropout* and *Flatten* layers, :math:`Z` is the result of layer processing and is equal to the output :math:`A` of the same layer. These layers apply no linear or non-linear activation; the names are kept only for homogeneity.
:math:`h\_~and~h`
For recurrent *RNN*, *LSTM* and *GRU* layers, the underscore appended to the variable name denotes the linear activation product while the underscore-free variable denotes the non-linear activation product. Note that the underscore notation also applies to partial derivatives.
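As an illustrative sketch of the :math:`Z`/:math:`A` naming above (not EpyNN's actual implementation, and using sigmoid as an arbitrary example activation), a Dense-like forward pass produces both products:

```python
import numpy as np

def dense_forward(X, W, b):
    """Illustrative Dense forward pass: linear activation product Z,
    then non-linear activation product A (sigmoid, as an example)."""
    Z = np.dot(X, W) + b          # weighted sum of inputs (linear)
    A = 1. / (1. + np.exp(-Z))    # non-linear activation of Z
    return Z, A

X = np.array([[1., 2.]])          # one example (m=1), two features (n=2)
W = np.zeros((2, 3))              # maps n=2 inputs to u=3 units
b = np.zeros((1, 3))

Z, A = dense_forward(X, W, b)
# With zero weights and bias, Z is all zeros and sigmoid(0) = 0.5
```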
Dimensions and indexing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Uppercase and lowercase letters represent dimensions and corresponding index, respectively.
In the Python code, note that dimension *D* is stored in the layer's ``.d`` dictionary attribute as ``layer.d['d']``, while the corresponding index *d* is a plain namespace variable such as ``d``.
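A simplified sketch of this convention (a hypothetical ``Layer`` class, not EpyNN's actual one) could look like:

```python
# Hypothetical sketch: dimension values live in the layer's .d
# dictionary, while the matching lowercase names are plain loop
# variables in a method's namespace.
class Layer:
    def __init__(self, units):
        self.d = {'u': units}    # dimension U stored under key 'u'

layer = Layer(units=4)

for u in range(layer.d['u']):    # index u iterates over dimension U
    pass
```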
**Frequently used:**
:math:`K, k`
Number of layers in network.
:math:`U, u`
Number of units in layer :math:`k`.
:math:`M, m`
Number of training examples.
:math:`N, n`
Number of features per training example.
Note that when layer :math:`k-1` is a *Dense* layer, or a recurrent *RNN*, *GRU* or *LSTM* layer with *sequences=False*, :math:`N` is equal to the number of units in layer :math:`k-1`.
**Related to recurrent architectures:**
:math:`S, s`
Number of steps in sequence.
:math:`E, e`
Number of elements per step in the sequence.
Note that in this context, :math:`S * E = N`.
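The relationship :math:`S * E = N` amounts to flattening the step and element axes, which can be sketched with a NumPy reshape:

```python
import numpy as np

# A sequence of S=5 steps with E=4 elements per step flattens
# into N = S * E = 20 features per example.
S, E = 5, 4
X = np.zeros((1, S, E))           # one example (m=1)
X_flat = X.reshape(1, S * E)      # shape (1, 20), i.e. (m, N)
print(X_flat.shape)               # (1, 20)
```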
**Related to CNN:**
:math:`H, h`
Height of features.
:math:`W, w`
Width of features.
:math:`D, d`
Depth of features.
:math:`Sh, s_h`
Stride height.
:math:`Sw, s_w`
Stride width.
:math:`Oh, o_h`
Output height.
:math:`Ow, o_w`
Output width.
:math:`Fh, f_h`
Filter height (Convolution).
:math:`Fw, f_w`
Filter width (Convolution).
:math:`Ph, p_h`
Pool height (Pooling).
:math:`Pw, p_w`
Pool width (Pooling).
Note that in this context, :math:`H * W * D = N`.
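Using the symbols above, the output dimensions of a convolution follow the standard formula for the case without padding (this is the textbook relation; EpyNN's exact handling of padding may differ):

```python
# Output dimensions of a convolution without padding:
# Oh = (H - Fh) // sh + 1, and likewise for width.
H, W = 28, 28      # input height and width
fh, fw = 3, 3      # filter height and width
sh, sw = 1, 1      # stride height and width

oh = (H - fh) // sh + 1
ow = (W - fw) // sw + 1
print(oh, ow)      # 26 26
```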
Glossary
-------------------------------
To avoid reinventing the wheel, some of the definitions below are adapted from external resources.
.. glossary::
:sorted:
Dense
Fully-connected layer made of one or more nodes. Each node receives input from all nodes in the previous layer.
RNN
Recurrent layer made of one or more unit cells, with a single activation (hidden cell state).
LSTM
Recurrent layer made of one or more unit cells, with three gates and two activations (hidden and memory cell states).
GRU
Recurrent layer made of one or more unit cells, with two gates and one activation (hidden cell state).
Embedding
Input layer in EpyNN; more generally, any process or object that prepares or contains the data fed to the layer that follows the input layer.
Model
A specific design of a neural network which incorporates layers of given architecture.
CNN
Convolutional neural network; type of neural network commonly used in image recognition and processing.
Convolution
Layer used in CNNs to merge input data with a filter or kernel, producing a feature map.
Pooling
Compression layer used in CNNs that reduces the spatial size of a representation, thereby lowering the number of parameters and the amount of computation in the network.
Cost
Scalar value obtained by averaging (or otherwise reducing) the loss over training examples.
Loss
Error with respect to one loss function, computed for each training example from the corresponding output probability.
Activation
Function that defines how the weighted sum of the input is transformed into an output.
Dropout
Dropping out units in one layer for neural network regularization.
Flatten
May refer to a reshaping layer that reduces 2D+ data to 2D data on the forward pass and reverses the operation on the backward pass.
Class (Python)
Prototype of an object.
Layer
Collection of nodes or units operating together at a specific depth within a neural network.
Neural Network
Series of algorithms that endeavor to recognize underlying relationships in a set of data.
Metrics
Function used to judge the performance of one model.
Weight
Parameter within layers that transforms input data to output data.
Bias
Additional set of parameters in one layer, added to the products of weight-input operations with respect to units.
Trainable
May refer to architecture layers incorporating unfrozen trainable parameters (weight, bias).
List (Python)
Mutable data type containing an ordered and indexed sequence.
Tuple (Python)
Immutable sequence data type made of any type of values.
Mutable (Python)
Object whose internal state can be changed.
Immutable (Python)
Object whose internal state cannot be changed.
String (Python)
Immutable sequence data type made of characters.
Integer (Python)
Whole number: zero, or a positive or negative number without a fractional part.
Float (Python)
Number with a fractional part, represented in floating point.
Dictionary (Python)
Collection of data organized as key: value pairs; insertion-ordered since Python 3.7.
Set (Python)
Collection which is unordered and unindexed.
Gate
Acts as a learned threshold that controls how much information flows through or is retained within a unit, for example in *LSTM* and *GRU* layers.
Instantiate (Python)
To create an object instance.
Instantiation (Python)
The action of creating an object instance.
Instance (Python)
An individual object of a certain class.
Hyperparameters
May refer to settings whose value is used to control the learning process.
Parameters
May refer to trainable parameters within a neural network, namely weights and bias.
Feed-Forward
Type of layer architecture wherein units do not contain loops.
Recurrent
Type of layer architecture wherein units contain loops, allowing information to be stored within one unit with respect to sequential data.
Neuron
May be equivalent to unit.
Node
May be equivalent to unit.
Cell
In the context of recurrent networks, one cell may be equivalent to one unit.
Unit
The functional entity within a layer; each layer is composed of a certain number of units.
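Several of the Python terms defined in this glossary can be illustrated in a few lines:

```python
# Quick illustration of the Python data types defined above.
items = [1, 2, 3]        # list: mutable, ordered, indexed sequence
items[0] = 0             # in-place change is allowed (mutable)

point = (1, 2)           # tuple: immutable sequence
# point[0] = 0           # would raise TypeError (immutable)

config = {'lr': 0.1}     # dictionary: key: value pairs
unique = {1, 2, 2, 3}    # set: unordered, no duplicates

print(items)             # [0, 2, 3]
print(len(unique))       # 3
```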