.. EpyNN documentation master file, created by
   sphinx-quickstart on Tue Jul 6 18:46:11 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

.. toctree::

Pooling (CNN)
===================================

Source files in ``EpyNN/epynn/pooling/``.

See `Appendix - Notations`_ for mathematical conventions.

.. _Appendix - Notations: glossary.html#notations

Layer architecture
-----------------------------------

.. image:: _static/Pooling/pool-01.svg
   :alt: Pooling

A *Pooling* layer can be seen as a *data compression* layer. It is provided with *pool_size* and *strides* arguments upon instantiation. They define, respectively, the dimensions of the window used for the *pooling* operation and the steps by which the window is moved between each operation.

.. autoclass:: epynn.pooling.models.Pooling
   :show-inheritance:

Shapes
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.pooling.models.Pooling.compute_shapes

.. literalinclude:: ./../epynn/pooling/parameters.py
   :pyobject: pooling_compute_shapes
   :language: python

Within a *Pooling* layer, shapes of interest include:

* Input *X* of shape *(m, h, w, d)* with *m* the number of samples, *h* the height of features, *w* the width of features and *d* the depth of features.
* Output height *oh*, defined from the ratio of the difference between *h* and the pooling height *ph* over the stride height *sh*. The float ratio is rounded down to the nearest integer, to which one is added.
* Output width *ow*, defined from the ratio of the difference between *w* and the pooling width *pw* over the stride width *sw*. The float ratio is rounded down to the nearest integer, to which one is added.

These two rules are illustrated in the short sketch below the figure.

.. image:: _static/Pooling/pool1-01.svg
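As a quick check of these rules, the output dimensions can be reproduced with a few lines of NumPy. This is a minimal sketch, not the EpyNN source; the input shape and the *pool_size* and *strides* values are arbitrary examples.

.. code-block:: python

    import numpy as np

    # Hypothetical input: 10 samples of 28 x 28 features with depth 1.
    X = np.random.standard_normal((10, 28, 28, 1))

    m, h, w, d = X.shape    # (m, h, w, d)
    ph, pw = (2, 2)         # pool_size
    sh, sw = (2, 2)         # strides

    # Floor of the ratio, plus one.
    oh = (h - ph) // sh + 1
    ow = (w - pw) // sw + 1

    print(oh, ow)           # 14 14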
Forward
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.pooling.models.Pooling.forward

.. literalinclude:: ./../epynn/pooling/forward.py
   :pyobject: pooling_forward

.. image:: _static/Pooling/pool2-01.svg

The forward propagation function in a *Pooling* layer *k* includes:

* (1): Input *X* in current layer *k*, with shape *(m, h, w, d)*, is equal to the output *A* of previous layer *k-1*.
* (2): *Xb* is an array of blocks with shape *(oh, ow, m, ph, pw, d)* made by iterative slicing of *X* with respect to *ph*, *pw* and *sh*, *sw*.
* (3): Given *Xb* with shape *(oh, ow, m, ph, pw, d)*, the operation moves axis 2 to position 0, yielding *Xb* with shape *(m, oh, ow, ph, pw, d)*.
* (4): The layer output *Z* with shape *(m, oh, ow, d)* is computed by pooling each input block within *Xb* over the block dimensions, on axes 4 and 3.

Note that:

* This is not the vanilla implementation of a *Pooling* layer. The naive implementation is fully iterative, whereas the version depicted here is a stride-groups optimization that takes advantage of vectorized NumPy operations (see the sketch at the end of this section).
* To preserve code homogeneity across layers, the output of the *Pooling* layer is *A*, which is equal to *Z*.
* Steps (1-3) are identical between the *Convolution* and *Pooling* layers, because both layers process the input *X* by blocks using window sizes and strides.

.. math::

    \begin{alignat*}{2}
    & x^{k}_{mhwd} &&= a^{\km}_{mhwd} \tag{1} \\
    & \_Xb^{k} &&= blocks(X^{k}) \tag{2} \\
    & Xb^{k} &&= moveaxis(\_Xb^{k}) \tag{3} \\
    & Z^{k} &&= pool(Xb^{k}) \tag{4}
    \end{alignat*}

.. math::

    \begin{alignat*}{2}
    & where~blocks~is~defined~as: \\
    &~~~~~~~~~~~~~~~~~~~~~~~blocks:\mathcal{M}_{M,H,W,D}(\mathbb{R}) &&\to \mathcal{M}_{Oh,Ow,M,Fh,Fw,D}(\mathbb{R}) \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~X = \mathop{(x_{mhwd})}_{\substack{1 \le m \le M \\ 1 \le h \le H \\ 1 \le w \le W \\ 1 \le d \le D}} &&\to Y = \mathop{(y_{o_{h}o_{w}mf_hf_wd})}_{\substack{1 \le o_h \le Oh \\ 1 \le o_w \le Ow \\ 1 \le m \le M \\ 1 \le f_h \le Fh \\ 1 \le f_w \le Fw \\ 1 \le d \le D}} \\
    &~~~~~~~~~~~~~~~~~~~~~~with~Fh, Fw, Sh, Sw \in &&~\mathbb{N^*_+} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\forall{h} \in &&~\{1,..,H-Fh~|~h \pmod{Sh} = 0\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\forall{w} \in &&~\{1,..,W-Fw~|~w \pmod{Sw} = 0\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Y = && X[:, h:h+Fh, w:w+Fw, :] \\
    \\
    & where~moveaxis~is~defined~as: \\
    &~~~~~~~moveaxis:\mathcal{M}_{Oh,Ow,M,Fh,Fw,D}(\mathbb{R}) &&\to \mathcal{M}_{M,Oh,Ow,Fh,Fw,D}(\mathbb{R}) \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~X &&\to Y \\
    \\
    & where~pool~is~defined~as: \\
    &~~~~~~~~~~~~~~~pool:\mathcal{M}_{M,Oh,Ow,Fh,Fw,D}(\mathbb{R}) &&\to \mathcal{M}_{M,Oh,Ow,D}(\mathbb{R}) \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~X = \mathop{(x_{mo_{h}o_{w}f_hf_wd})}_{\substack{1 \le m \le M \\ 1 \le o_h \le Oh \\ 1 \le o_w \le Ow \\ 1 \le f_h \le Fh \\ 1 \le f_w \le Fw \\ 1 \le d \le D}} &&\to Y = \mathop{(y_{mo_{h}o_{w}d})}_{\substack{1 \le m \le M \\ 1 \le o_h \le Oh \\ 1 \le o_w \le Ow \\ 1 \le d \le D}} \\
    \\
    &~~~~~~~~~~~~~~~~~~~~\forall{m, o_h, o_w, d} \in \{1,..,M\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,Oh\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,Ow\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,D\}&& \\
    \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y_{mo_{h}o_{w}d} &&= \max\limits_{f_{h} = 1}^{Fh}\max\limits_{f_{w} = 1}^{Fw} x_{mo_{h}o_{w}f_{h}f_{w}d}
    \end{alignat*}
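The stride-groups logic above can be illustrated with plain NumPy, as in the sketch below. This is a simplified standalone illustration, not the ``pooling_forward()`` source included above; the helper ``extract_blocks()`` and all shapes are hypothetical examples.

.. code-block:: python

    import numpy as np


    def extract_blocks(X, ph, pw, sh, sw):
        """Slice X of shape (m, h, w, d) into blocks of shape (oh, ow, m, ph, pw, d)."""
        m, h, w, d = X.shape
        blocks = [[X[:, hs:hs + ph, ws:ws + pw, :]
                   for ws in range(0, w - pw + 1, sw)]
                  for hs in range(0, h - ph + 1, sh)]
        return np.array(blocks)


    X = np.random.standard_normal((10, 28, 28, 1))   # (1) Input A from previous layer

    Xb = extract_blocks(X, ph=2, pw=2, sh=2, sw=2)   # (2) Blocks of shape (oh, ow, m, ph, pw, d)
    Xb = np.moveaxis(Xb, 2, 0)                       # (3) Shape becomes (m, oh, ow, ph, pw, d)

    Z = np.max(Xb, axis=(4, 3))                      # (4) Max pooling over pw (axis 4) and ph (axis 3)

    print(Z.shape)    # (10, 14, 14, 1)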
Backward
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.pooling.models.Pooling.backward

.. literalinclude:: ./../epynn/pooling/backward.py
   :pyobject: pooling_backward

.. image:: _static/Pooling/pool3-01.svg

The backward propagation function in a *Pooling* layer *k* includes:

* (1): *dA* with shape *(m, oh, ow, d)* is the gradient of the loss with respect to the output of forward propagation *A* for current layer *k*. It is equal to the gradient of the loss with respect to the input of forward propagation for next layer *k+1*.
* (2): Two new axes are inserted before the last axis of *dZ*, expanding its shape from *(m, oh, ow, d)* to *(m, oh, ow, 1, 1, d)* and reintroducing the dimensions of *Xb* with shape *(m, oh, ow, ph, pw, d)*.
* (3): The gradient of the loss *dX* with respect to the input of forward propagation *X* for current layer *k* is initialized as a zero array of the same shape as *X*.
* (4hw): For each row *h* with respect to *oh* and for each column *w* with respect to *ow*, the input block *Xb* of coordinates *[:, h, w, :, :, :]* is retrieved.
* (5hw): For each row *h* with respect to *oh* and for each column *w* with respect to *ow*, the values pooled from *Xb* are retrieved from *Z* to yield *Zb*. Repeat operations are applied on *Zb* in order to reconstruct the shape of *Xb*.
* (6hw): The *mask* is an array of the same shape as *Xb* and *Zb*. The conditional expression *(Xb == Zb)* returns *1* at the coordinates of *Xb* whose value was pooled into *Zb*, and *0* elsewhere.
* (7hw): *dZb* is retrieved from *dZ* with respect to *h* and *w*. Repeat operations are applied on *dZb* in order to reconstruct the shape of *mask*.
* (8hw): *dXb* is the product of *dZb* by *mask*. As a consequence, *dXb* is equal to zero everywhere except at the coordinates which correspond to the pooled value.
* (9hw): *dXb* is the gradient of the loss with respect to *Xb* and is added to *dX* with respect to the current window coordinates.

These steps are reproduced in the standalone sketch at the end of this section.

Note that:

* One window represents a set of coordinates. When strides are smaller than the pooling window, a value with given coordinates may be part of more than one window. This is why the operation *dX[:, hs:he, ws:we, :] += dXb* is used instead of *dX[:, hs:he, ws:we, :] = dXb*.

.. math::

    \begin{alignat*}{2}
    & \delta^{\kp}_{mo_{h}o_{w}d} &&= \frac{\partial \mathcal{L}}{\partial a^{k}_{mo_{h}o_{w}d}} = \frac{\partial \mathcal{L}}{\partial x^{\kp}_{mo_{h}o_{w}d}} \tag{1} \\
    & \frac{\partial \mathcal{L}}{\partial zb^{k}_{mo_{h}o_{w}11d}} &&= expand\_dims(\frac{\partial \mathcal{L}}{\partial a^{k}_{mo_{h}o_{w}d}}) \tag{2} \\
    & \frac{\partial \mathcal{L}}{\partial X^{k}} &&= [\frac{\partial \mathcal{L}}{\partial x^{k}_{mhwd}}] \in \{0\}^{M \times H \times W \times D} \tag{3} \\
    \\
    & xb^{k}_{mp_{h}p_{w}d} &&= Xb^{k}_{mo_{h}o_{w}p_{h}p_{w}d}[:, h, w, :, :, :] \tag{4hw} \\
    \\
    & \_zb^{k}_{m11d} &&= z^{k}_{mo_{h}o_{w}d}[:, h:h+1, w:w+1, :] \tag{5.1hw} \\
    & Zb^{k} &&= repeat(\_Zb) \tag{5.2hw} \\
    \\
    & M &&= \begin{cases} 1, & Xb = Zb, \\ 0, & Xb \ne Zb \end{cases} \tag{6hw} \\
    \\
    & \frac{\partial \mathcal{L}}{\partial \_zb^{k}_{m11d}} &&= \frac{\partial \mathcal{L}}{\partial zb^{k}_{mo_{h}o_{w}11d}}[:, h, w, :, :, :] \tag{7.1hw} \\
    & \frac{\partial \mathcal{L}}{\partial Zb^{k}} &&= repeat(\frac{\partial \mathcal{L}}{\partial \_Zb^{k}}) \tag{7.2hw} \\
    \\
    & \frac{\partial \mathcal{L}}{\partial Xb^{k}} &&= M * \frac{\partial \mathcal{L}}{\partial Zb^{k}} \tag{8hw} \\
    \\
    & \Delta\frac{\partial \mathcal{L}}{\partial X^{k}_{mp_hp_wd}} &&= \frac{\partial \mathcal{L}}{\partial Xb^{k}_{mp_hp_wd}} \tag{9hw} \\
    \end{alignat*}

.. math::

    \begin{alignat*}{2}
    & where~expand\_dims~is~defined~as: \\
    &~~~~~~expand\_dims:\mathcal{M}_{M,Oh,Ow,D}(\mathbb{R}) &&\to \mathcal{M}_{M,Oh,Ow,1,1,D}(\mathbb{R}) \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~X &&\to Y \\
    \\
    & where~repeat~is~defined~as: \\
    &~~~~~~~~~~~~~~~~~~~~repeat:\mathcal{M}_{M,1,1,D}(\mathbb{R}) &&\to \mathcal{M}_{M,Ph,Pw,D}(\mathbb{R}) \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~X = \mathop{(x_{m11d})}_{\substack{1 \le m \le M \\ 1 \le d \le D}} &&\to Y = \mathop{(y_{mp_hp_wd})}_{\substack{1 \le m \le M \\ 1 \le p_h \le Ph \\ 1 \le p_w \le Pw \\ 1 \le d \le D}} \\
    &~~~~~~~~~~~~~~~\forall{m, p_h, p_w, d} \in \{1,..,M\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,Ph\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,Pw\} \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\times \{1,..,D\} \\
    \\
    &~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y_{mp_{h}p_{w}d} &&= x_{m11d}
    \end{alignat*}
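The mask-and-route logic of steps (2) through (9hw) can be sketched with plain NumPy as follows. This is a minimal standalone illustration, not the ``pooling_backward()`` source included above; the function name ``pooling_backward_sketch()`` is hypothetical and the forward cache (*X*, *Xb*, *Z*) is assumed to follow the shapes given above.

.. code-block:: python

    import numpy as np


    def pooling_backward_sketch(dA, X, Xb, Z, ph, pw, sh, sw):
        """Route gradients back through a max pooling layer.

        dA : (m, oh, ow, d) gradient of the loss w.r.t. the layer output
        X  : (m, h, w, d) input saved from the forward pass
        Xb : (m, oh, ow, ph, pw, d) blocks saved from the forward pass
        Z  : (m, oh, ow, d) output saved from the forward pass
        """
        m, oh, ow, d = dA.shape

        dZ = dA[:, :, :, np.newaxis, np.newaxis, :]    # (2) -> (m, oh, ow, 1, 1, d)
        dX = np.zeros_like(X)                          # (3) gradient w.r.t. X

        for h in range(oh):
            hs, he = h * sh, h * sh + ph
            for w in range(ow):
                ws, we = w * sw, w * sw + pw

                block = Xb[:, h, w, :, :, :]                       # (4hw) (m, ph, pw, d)

                Zb = Z[:, h:h + 1, w:w + 1, :]                     # (5hw) pooled values
                Zb = np.repeat(np.repeat(Zb, ph, axis=1), pw, axis=2)

                mask = (block == Zb)                               # (6hw) 1 where the max was pooled

                dZb = dZ[:, h, w, :, :, :]                         # (7hw) (m, 1, 1, d)
                dZb = np.repeat(np.repeat(dZb, ph, axis=1), pw, axis=2)

                dXb = dZb * mask                                   # (8hw) zero except at pooled values

                dX[:, hs:he, ws:we, :] += dXb                      # (9hw) accumulate over windows

        return dX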
Gradients
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. automethod:: epynn.pooling.models.Pooling.compute_gradients

The *Pooling* layer is not a *trainable* layer. It has no *trainable* parameters such as weight *W* or bias *b*. Therefore, there are no parameter gradients to compute.

Live examples
-----------------------------------

* `Dummy image - Convolutional Neural Network (CNN)`_
* `MNIST Database - Convolutional Neural Network (CNN)`_

You may also like to browse all `Network training examples`_ provided with EpyNN.

.. _Network training examples: run_examples.html
.. _Dummy image - Convolutional Neural Network (CNN): epynnlive/dummy_image/train.html#Convolutional-Neural-Network-(CNN)
.. _MNIST Database - Convolutional Neural Network (CNN): epynnlive/captcha_mnist/train.html#Convolutional-Neural-Network-(CNN)