Dummy dataset

  • Find this notebook at EpyNN/epynnlive/dummy_boolean/prepare_dataset.ipynb.

  • Regular python code at EpyNN/epynnlive/dummy_boolean/prepare_dataset.py.

Run the notebook online with Google Colab.

Level: Beginner

This notebook is part of the series on preparing data for Neural Network regression with EpyNN.

In addition to the topic-specific content, it explains several basic programming concepts that matter in this context.

What is a Boolean data-type?

The Boolean data type has only two possible values, named True and False in most programming languages. In Python, bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic and comparisons. Calculations on Boolean data are fast, and Boolean data and outputs are also easier to process than those of other data types.
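As a quick illustration of how Python treats Boolean values as integers, the sketch below uses only built-in behavior. The variable names are chosen for this example and do not come from EpyNN.

```python
# bool is a subclass of int, so True and False compare equal to 1 and 0.
is_int_subclass = issubclass(bool, int)
equals_one_and_zero = (True == 1) and (False == 0)

# Illustrative Boolean features for a single sample (hypothetical values).
features = [True, False, True, True]

# Summing Booleans counts the True values.
n_true = sum(features)
n_false = len(features) - n_true

print(is_int_subclass, equals_one_and_zero, n_true, n_false)
```

Because Booleans sum like integers, counting how many features are True needs no explicit conversion.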

Examples of real-world topics well suited to the Boolean data type include molecular interactions, gene regulation, and disease prediction and diagnosis, among many others.

Why prepare a dummy dataset with Boolean features?

A dummy dataset is a set of data with no real-world significance. However, a dummy dataset can be built so that the results of Neural Network regression are predictable. When the law to be modeled is known a priori, it is easier to judge whether the Neural Network is working optimally. When dealing with samples described by Boolean features, it is therefore good, time-saving practice to first test the code and settings on simple problems built from dummy data, which should work if no mistake was introduced in the procedure.
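A minimal sketch of this idea follows, assuming a labeling rule chosen here for illustration: a sample is positive when it contains more True than False features. The function name make_dummy_boolean_samples, its parameters, and the rule itself are hypothetical and are not presented as the prepare_dataset() function from EpyNN.

```python
import random


def make_dummy_boolean_samples(n_samples=5, n_features=11, seed=0):
    """Return (features, labels) for a dummy dataset with a known rule.

    Hypothetical helper for illustration only. The assumed rule is:
    label is 1 if a sample has more True than False features, else 0.
    An odd n_features avoids ties.
    """
    rng = random.Random(seed)  # Seeded generator for reproducible output
    features = []
    labels = []

    for _ in range(n_samples):
        # Draw each Boolean feature at random.
        sample = [rng.choice([True, False]) for _ in range(n_features)]
        # Apply the known law that a network should be able to learn.
        label = 1 if sample.count(True) > sample.count(False) else 0
        features.append(sample)
        labels.append(label)

    return features, labels


X_features, Y_label = make_dummy_boolean_samples()

for sample, label in zip(X_features, Y_label):
    print(label, sample)
```

Since the rule linking features to labels is known in advance, a network that fails to fit such data points to a mistake in the code or settings rather than to a hard problem.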

Live examples

The function prepare_dataset() presented herein is used in the following live examples:

  • Notebook at EpyNN/epynnlive/dummy_boolean/train.ipynb or following this link.

  • Regular python code at EpyNN/epynnlive/dummy_boolean/train.py.