{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basics with numerical time-series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Find this notebook at `EpyNN/epynnlive/dummy_time/train.ipynb`.\n", "* Regular python code at `EpyNN/epynnlive/dummy_time/train.py`.\n", "\n", "Run the notebook online with [Google Colab](https://colab.research.google.com/github/Synthaze/EpyNN/blob/main/epynnlive/dummy_time/train.ipynb).\n", "\n", "**Level: Intermediate**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook we will review:\n", "\n", "* Handling univariate time series data to proceed with Neural Network regression.\n", "* Training of Feed-Forward (FF) and Recurrent Neural Network (RNN) for binary classification tasks.\n", "* Overfitting of the model to the training data and the impact of Dropout regularization.\n", "\n", "**It is assumed that the following *basics* notebooks were already reviewed:**\n", "\n", "* [Basics with Perceptron (P)](../dummy_boolean/train.ipynb)\n", "* [Basics with string sequence](../dummy_string/train.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**This notebook does not enhance, extend or replace EpyNN's documentation.**\n", "\n", "**Relevant documentation pages for the current notebook:**\n", "\n", "* [Fully Connected (Dense)](https://epynn.net/Dense.html)\n", "* [Recurrent Neural Network (RNN)](https://epynn.net/RNN.html)\n", "* [Dropout - Regularization](https://epynn.net/Dropout.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Environment and data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Follow [this link](prepare_dataset.ipynb) for details about data preparation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Briefly, sample features are univariate time series which may consist of only white noise (negative) or white noise supplemented with a pure sine-wave of random frequency (positive). 
\n", "\n", "The goal of the game is to train a Neural Network which may be able to detect if sample features do or do not contain a true signal." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# EpyNN/epynnlive/dummy_time/train.ipynb\n", "# Install dependencies\n", "!pip3 install --upgrade-strategy only-if-needed epynn\n", "\n", "# Standard library imports\n", "import random\n", "\n", "# Related third party imports\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "# Local application/library specific imports\n", "import epynn.initialize\n", "from epynn.commons.maths import relu, softmax\n", "from epynn.commons.library import (\n", "    configure_directory,\n", "    read_model,\n", ")\n", "from epynn.network.models import EpyNN\n", "from epynn.dropout.models import Dropout\n", "from epynn.embedding.models import Embedding\n", "from epynn.flatten.models import Flatten\n", "from epynn.rnn.models import RNN\n", "from epynn.dense.models import Dense\n", "from epynnlive.dummy_time.prepare_dataset import prepare_dataset\n", "from epynnlive.dummy_time.settings import se_hPars\n", "\n", "\n", "########################## CONFIGURE ##########################\n", "random.seed(1)\n", "np.random.seed(1)\n", "\n", "np.set_printoptions(threshold=10)\n", "\n", "np.seterr(all='warn')\n", "\n", "configure_directory()\n", "\n", "\n", "############################ DATASET ##########################\n", "X_features, Y_label = prepare_dataset(N_SAMPLES=1024)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check what we retrieved." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(128, 1)\n", "[[-0.35376783]\n", " [-0.17895082]\n", " [-0.29156932]\n", " ...\n", " [-0.25263271]\n", " [-0.74700792]\n", " [-0.16726332]]\n" ] } ], "source": [ "print(X_features[0].shape)\n", "print(X_features[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is called a *univariate time series* because it contains a single measurement per time step. Note we retrieved data with a sampling rate of 128 Hz and duration of 1 second for a total of 128 points." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(X_features[0], label='label: %s' % Y_label[0])\n", "plt.plot(X_features[1], label='label: %s' % Y_label[1])\n", "plt.legend()\n", "plt.show()\n", "plt.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While sample features corresponding to ``label: 1`` do not contain a true signal, sample features corresponding to ``label: 0`` do contain a true signal of random frequency." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Feed-Forward (FF)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Feed-Forward Neural Networks are advantageous compared to other architectures because they are relatively fast to train, and most often less sensitive to training parameters, also called *hyperparameters*. They are therefore commonly a first choice when dealing with a new problem, before further insights are available." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Embedding" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In EpyNN, sample features (X_features) and labels (Y_label) are passed as the corresponding arguments when instantiating the *embedding* or *input* layer. This layer is required in EpyNN and is always the first layer of the network architecture. It performs a simple forward pass but contains several routines to prepare data according to user choices." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "embedding = Embedding(X_data=X_features,\n", "                      Y_data=Y_label,\n", "                      Y_encode=True,\n", "                      relative_size=(2, 1, 0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we instructed the class constructor ``Embedding`` to one-hot encode the set of sample labels with ``Y_encode=True``. 
We also asked for the data to be split so that the training set contains ``2/3`` of the whole set and the validation set contains ``1/3``." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The training procedure is exclusively driven by the training data. However, one may judge how general the model can be by comparing the training metrics with those computed on the validation set." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Flatten-(Dense)n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As introduced before, sample features are univariate time series and as such the set of sample features has 3 dimensions." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(683, 128, 1)\n" ] } ], "source": [ "print(embedding.dtrain.X.shape)  # (m, s, v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It contains 683 samples (m), each described by a sequence of 128 features (s), each containing 1 element (v).\n", "\n", "However, the fully-connected or *dense* layer can only process two-dimensional input arrays. That is the reason why we need to invoke a *flatten* layer in between the *embedding* and *dense* layers." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(683, 128, 1)\n", "(683, 128)\n", "(683, 128, 1)\n" ] } ], "source": [ "flatten = Flatten()\n", "\n", "# Original shape (m, s, v)\n", "print(embedding.dtrain.X.shape)\n", "\n", "# Flatten on forward pass (m, s * v)\n", "print(flatten.forward(embedding.dtrain.X).shape)\n", "\n", "# Reverse on backward pass (m, s, v)\n", "print(flatten.backward(flatten.forward(embedding.dtrain.X)).shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In other words, the *flatten* layer is an adapter.\n", "\n", "Let's now build the network architecture." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "name = 'Flatten_Dense-64-relu_Dense-2-softmax'\n", "\n", "# Lower the default learning rate from 0.1 to 0.005\n", "se_hPars['learning_rate'] = 0.005\n", "\n", "flatten = Flatten()\n", "\n", "hidden_dense = Dense(64, relu)\n", "\n", "dense = Dense(2, softmax)\n", "\n", "layers = [embedding, flatten, hidden_dense, dense]\n", "\n", "model = EpyNN(layers=layers, name=name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The network - or model - architecture is composed of the mandatory *embedding* input layer, the *flatten* adapter layer, one *hidden_dense* layer with 64 nodes and a *ReLU* activation function, and finally an output *dense* layer with 2 nodes - because we have two distinct one-hot encoded labels - and a *softmax* activation function.\n", "\n", "We now initialize the model and instruct it to use a *Mean Squared Error* cost function, with network *seeding* for reproducibility, and we provide our custom hyperparameters." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m--- EpyNN Check OK! --- \u001b[0m\r" ] } ], "source": [ "model.initialize(loss='MSE', seed=1, se_hPars=se_hPars.copy(), end='\\r')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ``initialize`` method performs a dry epoch which includes all steps but omits the parameter update.\n", "\n", "We can now proceed with the training." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[37mEpoch 99 - Batch 0/0 - Accuracy: 1.0 Cost: 0.00075 - TIME: 7.75s RATE: 1.29e+01e/s TTC: 0s \u001b[0m\n", "\n", "+-------+----------+----------+----------+-------+--------+-------+--------------------------------------------------+\n", "| \u001b[1m\u001b[37mepoch\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[32maccuracy\u001b[0m | | \u001b[1m\u001b[31mMSE\u001b[0m | | \u001b[37mExperiment\u001b[0m |\n", "| | \u001b[37mDense\u001b[0m | \u001b[37mDense\u001b[0m | \u001b[1m\u001b[32mdtrain\u001b[0m | \u001b[1m\u001b[32mdval\u001b[0m | \u001b[1m\u001b[31mdtrain\u001b[0m | \u001b[1m\u001b[31mdval\u001b[0m | |\n", "+-------+----------+----------+----------+-------+--------+-------+--------------------------------------------------+\n", "| \u001b[1m\u001b[37m0\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.722\u001b[0m | \u001b[1m\u001b[32m0.695\u001b[0m | \u001b[1m\u001b[31m0.197\u001b[0m | \u001b[1m\u001b[31m0.214\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m10\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.975\u001b[0m | \u001b[1m\u001b[32m0.868\u001b[0m | \u001b[1m\u001b[31m0.031\u001b[0m | \u001b[1m\u001b[31m0.097\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m20\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.894\u001b[0m | \u001b[1m\u001b[31m0.008\u001b[0m | \u001b[1m\u001b[31m0.080\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m30\u001b[0m | 
\u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.900\u001b[0m | \u001b[1m\u001b[31m0.004\u001b[0m | \u001b[1m\u001b[31m0.077\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m40\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.909\u001b[0m | \u001b[1m\u001b[31m0.003\u001b[0m | \u001b[1m\u001b[31m0.076\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m50\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.909\u001b[0m | \u001b[1m\u001b[31m0.002\u001b[0m | \u001b[1m\u001b[31m0.075\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m60\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.909\u001b[0m | \u001b[1m\u001b[31m0.001\u001b[0m | \u001b[1m\u001b[31m0.075\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m70\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.912\u001b[0m | \u001b[1m\u001b[31m0.001\u001b[0m | \u001b[1m\u001b[31m0.075\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m80\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.912\u001b[0m | \u001b[1m\u001b[31m0.001\u001b[0m | \u001b[1m\u001b[31m0.075\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m90\u001b[0m | 
\u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.912\u001b[0m | \u001b[1m\u001b[31m0.001\u001b[0m | \u001b[1m\u001b[31m0.075\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m99\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m1.000\u001b[0m | \u001b[1m\u001b[32m0.909\u001b[0m | \u001b[1m\u001b[31m0.001\u001b[0m | \u001b[1m\u001b[31m0.074\u001b[0m | \u001b[37m1635012768_Flatten_Dense-64-relu_Dense-2-softmax\u001b[0m |\n", "+-------+----------+----------+----------+-------+--------+-------+--------------------------------------------------+\n" ] } ], "source": [ "model.train(epochs=100, init_logs=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can observe in the logs that an accuracy of 1 was reached on the training set by epoch 20, with a cost near zero. By contrast, we can observe that - although good - the metrics and cost computed on the validation set lagged behind from the early epochs of training." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model.plot(path=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When plotting the accuracy and cost for training and validation sets, these differences are even more obvious.\n", "\n", "This is called *overfitting* of the model to the training data. In other words, the model represents very well - and even exactly - the data from which it was trained, but lacks such performance on independent data.\n", "\n", "There are many ways to limit such overfitting when designing or training a Neural Network. Herein we are going to experiment with the *dropout* regularization method to see how it can impact such unwanted overfitting behavior." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For code, maths and pictures behind the *Flatten* and *Dense* layers, follow these links:\n", "\n", "* [Flatten - Adapter](https://epynn.net/Flatten.html)\n", "* [Fully Connected (Dense)](https://epynn.net/Dense.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Flatten-(Dense)n with Dropout" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can follow [this link](https://epynn.net/Dropout.html) for details on the *dropout* layer.\n", "\n", "Briefly, the *Dropout* regularization layer randomly sets part of the output of the previous layer to zero and forwards the result to the next layer. 
By introducing such instability in the network, the layer limits how closely the network can fit the training data, which may help prevent overfitting, that is, a model too closely tied to the training data.\n", "\n", "In EpyNN, the class constructor ``Dropout()`` takes an argument ``drop_prob`` which represents the probability for one element of the input array to be dropped, that is, set to zero in the output array.\n", "\n", "For instance, the masks below keep each element with probability 1, 0.5 and 0, which corresponds to ``drop_prob`` values of 0, 0.5 and 1." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-0.40461632 0.56992124 -0.188923 -1.92933133 -0.69195013]\n", " [ 1.04594288 -0.42813771 0.89180849 0.84383519 0.65651707]\n", " [-0.42446322 -1.20019422 2.12126642 0.27575357 -0.67915118]\n", " [-0.79589247 -0.40698592 -1.54560858 0.40910712 -0.89735926]\n", " [-1.02323887 -0.73387254 -0.17314366 0.50633278 2.35972254]]\n", "[[-0.40461632 0.56992124 -0.188923 -1.92933133 -0.69195013]\n", " [ 1.04594288 -0.42813771 0.89180849 0.84383519 0.65651707]\n", " [-0.42446322 -1.20019422 2.12126642 0.27575357 -0.67915118]\n", " [-0.79589247 -0.40698592 -1.54560858 0.40910712 -0.89735926]\n", " [-1.02323887 -0.73387254 -0.17314366 0.50633278 2.35972254]]\n", "[[-0.40461632 0.56992124 -0. -1.92933133 -0.69195013]\n", " [ 0. -0. 0.89180849 0. 0. ]\n", " [-0.42446322 -1.20019422 2.12126642 0.27575357 -0.67915118]\n", " [-0.79589247 -0.40698592 -1.54560858 0.40910712 -0.89735926]\n", " [-0. -0. -0. 0. 2.35972254]]\n", "[[-0. 0. -0. -0. -0.]\n", " [ 0. -0. 0. 0. 0.]\n", " [-0. -0. 0. 0. -0.]\n", " [-0. -0. -0. 0. -0.]\n", " [-0. -0. -0. 0. 0.]]\n" ] } ], "source": [ "test_array = np.random.standard_normal((5, 5))\n", "\n", "D1 = (np.random.uniform(0, 1, test_array.shape) < 1)\n", "D05 = (np.random.uniform(0, 1, test_array.shape) < 0.5)\n", "D0 = (np.random.uniform(0, 1, test_array.shape) < 0)\n", "\n", "print(test_array)\n", "print(test_array * D1)  # keep all elements (drop_prob = 0) - No dropout\n", "print(test_array * D05)  # keep about half (drop_prob = 0.5) - Common value\n", "print(test_array * D0)  # keep nothing (drop_prob = 1) - Output is null" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Usually, values for ``drop_prob`` are within 0-0.5.\n", "\n", "Let's build the same Feed-Forward network as above but with two *Dropout* layers added." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "name = 'Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax'\n", "\n", "se_hPars['learning_rate'] = 0.005\n", "\n", "flatten = Flatten()\n", "\n", "dropout1 = Dropout(drop_prob=0.2)\n", "\n", "hidden_dense = Dense(64, relu)\n", "\n", "dropout2 = Dropout(drop_prob=0.5)\n", "\n", "dense = Dense(2, softmax)\n", "\n", "layers = [embedding, flatten, dropout1, hidden_dense, dropout2, dense]\n", "\n", "model = EpyNN(layers=layers, name=name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have set up a first *dropout1* layer between the *flatten* and *hidden_dense* layers as well as a second one, *dropout2*, between *hidden_dense* and *dense*.\n", "\n", "Note the ``drop_prob`` values are different for the two *dropout* layers. We will drop 0.2 of the input for the first, and 0.5 for the second. Those settings are quite empirical: it is best to test and see.\n", "\n", "Initialize with the same settings as in the no-dropout setup." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m--- EpyNN Check OK! --- \u001b[0m\r" ] } ], "source": [ "model.initialize(loss='MSE', seed=1, se_hPars=se_hPars.copy(), end='\\r')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see the *dropout* layers did appear in the check. Let's proceed with training." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[37mEpoch 99 - Batch 0/0 - Accuracy: 0.993 Cost: 0.0111 - TIME: 8.1s RATE: 1.23e+01e/s TTC: 0s \u001b[0m\n", "\n", "+-------+----------+----------+----------+-------+--------+-------+------------------------------------------------------------------------+\n", "| \u001b[1m\u001b[37mepoch\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[32maccuracy\u001b[0m | | \u001b[1m\u001b[31mMSE\u001b[0m | | \u001b[37mExperiment\u001b[0m |\n", "| | \u001b[37mDense\u001b[0m | \u001b[37mDense\u001b[0m | \u001b[1m\u001b[32mdtrain\u001b[0m | \u001b[1m\u001b[32mdval\u001b[0m | \u001b[1m\u001b[31mdtrain\u001b[0m | \u001b[1m\u001b[31mdval\u001b[0m | |\n", "+-------+----------+----------+----------+-------+--------+-------+------------------------------------------------------------------------+\n", "| \u001b[1m\u001b[37m0\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.518\u001b[0m | \u001b[1m\u001b[32m0.510\u001b[0m | \u001b[1m\u001b[31m0.303\u001b[0m | \u001b[1m\u001b[31m0.305\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m10\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.791\u001b[0m | \u001b[1m\u001b[32m0.736\u001b[0m | \u001b[1m\u001b[31m0.143\u001b[0m | \u001b[1m\u001b[31m0.173\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m20\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.922\u001b[0m | \u001b[1m\u001b[32m0.891\u001b[0m | \u001b[1m\u001b[31m0.061\u001b[0m | \u001b[1m\u001b[31m0.085\u001b[0m | 
\u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m30\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.952\u001b[0m | \u001b[1m\u001b[32m0.897\u001b[0m | \u001b[1m\u001b[31m0.044\u001b[0m | \u001b[1m\u001b[31m0.079\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m40\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.971\u001b[0m | \u001b[1m\u001b[32m0.924\u001b[0m | \u001b[1m\u001b[31m0.030\u001b[0m | \u001b[1m\u001b[31m0.066\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m50\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.977\u001b[0m | \u001b[1m\u001b[32m0.927\u001b[0m | \u001b[1m\u001b[31m0.026\u001b[0m | \u001b[1m\u001b[31m0.068\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m60\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.980\u001b[0m | \u001b[1m\u001b[32m0.915\u001b[0m | \u001b[1m\u001b[31m0.022\u001b[0m | \u001b[1m\u001b[31m0.065\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m70\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.987\u001b[0m | \u001b[1m\u001b[32m0.918\u001b[0m | \u001b[1m\u001b[31m0.017\u001b[0m | \u001b[1m\u001b[31m0.061\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m80\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | 
\u001b[1m\u001b[32m0.985\u001b[0m | \u001b[1m\u001b[32m0.921\u001b[0m | \u001b[1m\u001b[31m0.014\u001b[0m | \u001b[1m\u001b[31m0.058\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m90\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.988\u001b[0m | \u001b[1m\u001b[32m0.891\u001b[0m | \u001b[1m\u001b[31m0.012\u001b[0m | \u001b[1m\u001b[31m0.073\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m99\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[37m5.00e-03\u001b[0m | \u001b[1m\u001b[32m0.974\u001b[0m | \u001b[1m\u001b[32m0.938\u001b[0m | \u001b[1m\u001b[31m0.021\u001b[0m | \u001b[1m\u001b[31m0.055\u001b[0m | \u001b[37m1635012776_Flatten_Dropout-02_Dense-64-relu_Dropout-05_Dense-2-softmax\u001b[0m |\n", "+-------+----------+----------+----------+-------+--------+-------+------------------------------------------------------------------------+\n" ] } ], "source": [ "model.train(epochs=100, init_logs=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is clear that the gap between the training and validation sets is much reduced now, both for the accuracy metric and the MSE cost." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model.plot(path=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In brief, think about using *dropout* layers to reduce overfitting, alone or in combination with other methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For code, maths and pictures behind the *Dropout* layer, follow this link:\n", "\n", "* [Dropout - Regularization](https://epynn.net/Dropout.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recurrent Neural Network (RNN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When dealing with any sort of sequential data, it is often suggested to use recurrent architectures because they can process three-dimensional input arrays and take advantage of the *sequential* nature of sample features." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Embedding" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's embed our data again, using the same settings as before, and proceed with a quick refresher." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(683, 128, 1)\n", "(683, 1)\n" ] } ], "source": [ "embedding = Embedding(X_data=X_features,\n", "                      Y_data=Y_label,\n", "                      Y_encode=True,\n", "                      relative_size=(2, 1, 0))\n", "\n", "print(embedding.dtrain.X.shape)  # (m, s, v)\n", "print(embedding.dtrain.X[:, 0].shape)  # Input shape at sequence step 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have 683 samples (m) with sequential features of length 128 (s), each step holding a single value (v), which makes it a univariate time series.\n", "\n", "We recalled before that sample features may be white noise, or white noise combined with a pure sine-wave of random frequency. 
While there is no expected **correlation through time** within the white noise, we expect a **periodic pattern** to repeat in the case where a pure sine-wave is also present in sample features.\n", "\n", "Recurrent layers are said to have some internal memory of such periodic patterns because:\n", "\n", "* One recurrent layer is made of recurrent units or cells.\n", "* One unit processes the steps of a sequence one at a time within a ``for`` loop.\n", "* The trick is: the output of every iteration - every step - is *fed back* into the unit at the next *step forward* along the sequence.\n", "\n", "Said differently, the output of one recurrent cell does not depend only on the input at a given sequence step.\n", "\n", "This output of one iteration in the sequence, called ``h`` for *hidden cell state*, becomes more and more \"impregnated\" by the outputs of previous iterations as we step forward in the sequence.\n", "\n", "These statements may be hard to understand. See [RNN - Forward](https://epynn.net/RNN.html) for some graphical elements." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### RNN-Dense" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we will instantiate a simple *RNN* layer composed of 10 units, which forwards the hidden state of the last sequence step to the output *dense* layer." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "name = 'RNN-10_Flatten_Dense-2-softmax'\n", "\n", "se_hPars['learning_rate'] = 0.01\n", "se_hPars['softmax_temperature'] = 5\n", "\n", "rnn = RNN(10)\n", "\n", "dense = Dense(2, softmax)\n", "\n", "layers = [embedding, rnn, dense]\n", "\n", "model = EpyNN(layers=layers, name=name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In contrast to the Feed-Forward network seen above, the *Flatten* layer is absent herein. This is because, for each sample, the RNN layer returns only the last hidden state, of shape (10,). 
The flatten operation is therefore unnecessary. 
Note that RNN in EpyNN may return all hidden states sequences (``RNN(10, sequences=True)``) which then would require the use of a flatten layer because the shape would be (128, 10).\n", "\n", "Let's give a try to this RNN-based network." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m--- EpyNN Check OK! --- \u001b[0m\r" ] } ], "source": [ "model.initialize(loss='MSE', seed=1, se_hPars=se_hPars.copy(), end='\\r')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start training." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[37mEpoch 99 - Batch 0/0 - Accuracy: 0.693 Cost: 0.2105 - TIME: 30.35s RATE: 3.29e+00e/s TTC: 1s \u001b[0m\n", "\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n", "| \u001b[1m\u001b[37mepoch\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[32maccuracy\u001b[0m | | \u001b[1m\u001b[31mMSE\u001b[0m | | \u001b[37mExperiment\u001b[0m |\n", "| | \u001b[37mRNN\u001b[0m | \u001b[37mDense\u001b[0m | \u001b[1m\u001b[32mdtrain\u001b[0m | \u001b[1m\u001b[32mdval\u001b[0m | \u001b[1m\u001b[31mdtrain\u001b[0m | \u001b[1m\u001b[31mdval\u001b[0m | |\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n", "| \u001b[1m\u001b[37m0\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.533\u001b[0m | \u001b[1m\u001b[32m0.499\u001b[0m | \u001b[1m\u001b[31m0.248\u001b[0m | \u001b[1m\u001b[31m0.251\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m10\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.561\u001b[0m | 
\u001b[1m\u001b[32m0.507\u001b[0m | \u001b[1m\u001b[31m0.247\u001b[0m | \u001b[1m\u001b[31m0.249\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m20\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.564\u001b[0m | \u001b[1m\u001b[32m0.510\u001b[0m | \u001b[1m\u001b[31m0.245\u001b[0m | \u001b[1m\u001b[31m0.248\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m30\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.581\u001b[0m | \u001b[1m\u001b[32m0.513\u001b[0m | \u001b[1m\u001b[31m0.242\u001b[0m | \u001b[1m\u001b[31m0.246\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m40\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.599\u001b[0m | \u001b[1m\u001b[32m0.528\u001b[0m | \u001b[1m\u001b[31m0.239\u001b[0m | \u001b[1m\u001b[31m0.244\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m50\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.609\u001b[0m | \u001b[1m\u001b[32m0.548\u001b[0m | \u001b[1m\u001b[31m0.236\u001b[0m | \u001b[1m\u001b[31m0.241\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m60\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.625\u001b[0m | \u001b[1m\u001b[32m0.589\u001b[0m | \u001b[1m\u001b[31m0.231\u001b[0m | \u001b[1m\u001b[31m0.237\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m70\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.638\u001b[0m | \u001b[1m\u001b[32m0.622\u001b[0m | 
\u001b[1m\u001b[31m0.226\u001b[0m | \u001b[1m\u001b[31m0.232\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m80\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.657\u001b[0m | \u001b[1m\u001b[32m0.628\u001b[0m | \u001b[1m\u001b[31m0.221\u001b[0m | \u001b[1m\u001b[31m0.227\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m90\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.673\u001b[0m | \u001b[1m\u001b[32m0.651\u001b[0m | \u001b[1m\u001b[31m0.215\u001b[0m | \u001b[1m\u001b[31m0.222\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m99\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.694\u001b[0m | \u001b[1m\u001b[32m0.663\u001b[0m | \u001b[1m\u001b[31m0.210\u001b[0m | \u001b[1m\u001b[31m0.216\u001b[0m | \u001b[37m1635012784_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n" ] } ], "source": [ "model.train(epochs=100, init_logs=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We observe that the model was still converging at the end of the training. Interestingly, there is little overfitting: *accuracy* is only slightly lower and *cost* (MSE) only slightly higher for the validation set than for the training set." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model.plot(path=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This model clearly needs more training epochs to converge. The gap between training and validation metrics remains small, so overfitting is not the main concern at this stage." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For code, maths and pictures behind the *RNN* layer, follow this link:\n", "\n", "* [Recurrent Neural Network (RNN)](https://epynn.net/RNN.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### RNN-Dense with SGD" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above we used Gradient Descent (GD) optimization, meaning that:\n", "\n", "* All training examples are in a single batch.\n", "* There is a single weight update per training epoch.\n", "\n", "Below we set ``batch_size`` to ``32`` to use Stochastic Gradient Descent (SGD), meaning that:\n", "\n", "* All training examples are divided into batches of size 32, yielding about ``N_TRAIN // 32`` batches, where ``N_TRAIN`` denotes the number of samples in the training set (21 batches here, since the ``Batch 20/20`` counter in the training log below is zero-indexed).\n", "* There are as many weight updates per training epoch as there are training batches." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "embedding = Embedding(X_data=X_features,\n", " Y_data=Y_label,\n", " Y_encode=True,\n", " batch_size=32,\n", " relative_size=(2, 1, 0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the same network as before." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "name = 'RNN-10_Flatten_Dense-2-softmax'\n", "\n", "se_hPars['learning_rate'] = 0.01\n", "se_hPars['softmax_temperature'] = 5\n", "\n", "rnn = RNN(10)\n", "\n", "dense = Dense(2, softmax)\n", "\n", "layers = [embedding, rnn, dense]\n", "\n", "model = EpyNN(layers=layers, name=name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We initialize and train the network." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[37mEpoch 99 - Batch 20/20 - Accuracy: 0.969 Cost: 0.02007 - TIME: 57.12s RATE: 1.75e+00e/s TTC: 1s \u001b[0m\n", "\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n", "| \u001b[1m\u001b[37mepoch\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[37mlrate\u001b[0m | \u001b[1m\u001b[32maccuracy\u001b[0m | | \u001b[1m\u001b[31mMSE\u001b[0m | | \u001b[37mExperiment\u001b[0m |\n", "| | \u001b[37mRNN\u001b[0m | \u001b[37mDense\u001b[0m | \u001b[1m\u001b[32mdtrain\u001b[0m | \u001b[1m\u001b[32mdval\u001b[0m | \u001b[1m\u001b[31mdtrain\u001b[0m | \u001b[1m\u001b[31mdval\u001b[0m | |\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n", "| \u001b[1m\u001b[37m0\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.540\u001b[0m | \u001b[1m\u001b[32m0.516\u001b[0m | \u001b[1m\u001b[31m0.247\u001b[0m | \u001b[1m\u001b[31m0.250\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m10\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.638\u001b[0m | \u001b[1m\u001b[32m0.630\u001b[0m | \u001b[1m\u001b[31m0.226\u001b[0m | \u001b[1m\u001b[31m0.230\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m20\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.900\u001b[0m | \u001b[1m\u001b[32m0.903\u001b[0m | \u001b[1m\u001b[31m0.119\u001b[0m | \u001b[1m\u001b[31m0.117\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m30\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | 
\u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.895\u001b[0m | \u001b[1m\u001b[32m0.900\u001b[0m | \u001b[1m\u001b[31m0.097\u001b[0m | \u001b[1m\u001b[31m0.094\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m40\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.959\u001b[0m | \u001b[1m\u001b[32m0.971\u001b[0m | \u001b[1m\u001b[31m0.038\u001b[0m | \u001b[1m\u001b[31m0.029\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m50\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.974\u001b[0m | \u001b[1m\u001b[32m0.977\u001b[0m | \u001b[1m\u001b[31m0.025\u001b[0m | \u001b[1m\u001b[31m0.022\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m60\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.974\u001b[0m | \u001b[1m\u001b[32m0.977\u001b[0m | \u001b[1m\u001b[31m0.023\u001b[0m | \u001b[1m\u001b[31m0.019\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m70\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.972\u001b[0m | \u001b[1m\u001b[32m0.974\u001b[0m | \u001b[1m\u001b[31m0.024\u001b[0m | \u001b[1m\u001b[31m0.024\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m80\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.966\u001b[0m | \u001b[1m\u001b[32m0.974\u001b[0m | \u001b[1m\u001b[31m0.028\u001b[0m | \u001b[1m\u001b[31m0.024\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m90\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | 
\u001b[1m\u001b[32m0.985\u001b[0m | \u001b[1m\u001b[32m0.991\u001b[0m | \u001b[1m\u001b[31m0.014\u001b[0m | \u001b[1m\u001b[31m0.010\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "| \u001b[1m\u001b[37m99\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[37m1.00e-02\u001b[0m | \u001b[1m\u001b[32m0.982\u001b[0m | \u001b[1m\u001b[32m0.991\u001b[0m | \u001b[1m\u001b[31m0.016\u001b[0m | \u001b[1m\u001b[31m0.009\u001b[0m | \u001b[37m1635012814_RNN-10_Flatten_Dense-2-softmax\u001b[0m |\n", "+-------+----------+----------+----------+-------+--------+-------+-------------------------------------------+\n" ] } ], "source": [ "model.initialize(loss='MSE', seed=1, se_hPars=se_hPars.copy(), end='\\r')\n", "\n", "model.train(epochs=100, init_logs=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the same number of training epochs, accuracy is much higher." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "model.plot(path=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also no overfitting: validation metrics closely track training metrics, which makes this an excellent model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For code, maths and pictures behind the *RNN* layer, follow this link:\n", "\n", "* [Recurrent Neural Network (RNN)](https://epynn.net/RNN.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Write, read & predict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A trained model can be written to disk as follows:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mMake: /media/synthase/beta/EpyNN/epynnlive/dummy_time/models/1635012814_RNN-10_Flatten_Dense-2-softmax.pickle\u001b[0m\n" ] } ], "source": [ "model.write()\n", "\n", "# model.write(path='/your/custom/path')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A model can be read from disk as follows:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "model = read_model()\n", "\n", "# model = read_model(path='/your/custom/path')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can retrieve new features and predict on them." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "X_features, _ = prepare_dataset(N_SAMPLES=10)\n", "\n", "dset = model.predict(X_features)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sample identifiers, predicted labels and output probabilities can be extracted as follows:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 0 [0.98718813 0.01281187]\n", "1 0 [0.94121987 0.05878013]\n", "2 1 [0.02139625 0.97860375]\n", "3 0 [0.95959614 0.04040386]\n", "4 1 [0.02610764 0.97389236]\n", "5 1 [0.01849713 0.98150287]\n", "6 1 [0.018713 0.981287]\n", "7 1 [0.01967183 0.98032817]\n", "8 0 [0.96032797 0.03967203]\n", "9 0 [0.93915586 0.06084414]\n" ] } ], "source": [ "for n, pred, probs in zip(dset.ids, dset.P, dset.A):\n", " print(n, pred, probs)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 4 }