Other Helper Functions

run_montecarlo(models, dataset, open_dataset=None, iterations=500, epochs=150, batch_size=100, display_freq=1, validation_split=0.2, validation_data=None, debug=False, polar=False, do_all=True, do_conf_mat=True)

This function is used to compare the performance of different neural networks.

Note

If you only want to compare a CVNN model with an equivalent RVNN model, you can use mlp_run_real_comparison_montecarlo instead.

Runs the simulation for every model and compares the results.
Saves several files into ./log/montecarlo/date/of/run/:
1. run_data.csv: Full performance information for every iteration of each model at each epoch
2. <model.name>_statistical_result.csv: Statistical results of all iterations of each model per epoch (mean, median, std, etc.)
3. models_details.json: A full detailed description of each model to be trained
4. (Optional) run_summary.txt: User-friendly summary of the run models and data
5. (Optional) plot/ folder with the corresponding plots generated by MonteCarloAnalyzer.do_all()

Parameters:

models – List of cvnn.CvnnModel to be compared.
dataset – cvnn.dataset.Dataset with the dataset to be used for training
open_dataset – If the dataset is saved inside a folder and must be opened, the path of the Dataset to be opened. Otherwise None (default)
iterations – Number of iterations to be done for each model
epochs – Number of epochs for each iteration
batch_size – Batch size at each iteration
display_freq – Frequency in terms of epochs of when to do a checkpoint
polar – Boolean indicating whether the RVNN should receive the real and imaginary parts (False) or the amplitude and phase (True)
do_all – If true (default) it creates a plot/ folder with the plots generated by MonteCarloAnalyzer.do_all()
validation_split – Float between 0 and 1. Fraction of the input data to be used as test set (the rest will be used as train set). Default: 0.2. This input is ignored if validation_data is given.
validation_data –
Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. This parameter takes precedence over validation_split. It can be:
- tuple (x_val, y_val) of Numpy arrays or tensors. Preferred data type (less overhead).
- A tf.data dataset.
do_conf_mat – Generate a confusion matrix based on results.
verbose –
Verbosity mode. One of:
- 0 or ‘silent’: No output at all
- 1 or False: Progress bar per iteration
- 2 or True or ‘debug’: Progress bar per epoch
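
The polar flag above chooses how complex inputs are presented to the real-valued network. The following is a minimal sketch of the two representations using only the Python standard library. The helper name complex_to_real_features and the concatenation order are assumptions made for illustration and are not taken from the library.

```python
import cmath


def complex_to_real_features(z, polar=False):
    """Flatten a complex vector into a real vector of twice the length.

    polar=False -> [real parts..., imaginary parts...]
    polar=True  -> [amplitudes..., phases in radians...]
    The concatenation order is an assumption made for this sketch.
    """
    if polar:
        pairs = [cmath.polar(v) for v in z]   # (amplitude, phase) per element
    else:
        pairs = [(v.real, v.imag) for v in z]  # (real, imaginary) per element
    first = [p[0] for p in pairs]
    second = [p[1] for p in pairs]
    return first + second


sample = [3 + 4j, 1j]
cartesian = complex_to_real_features(sample, polar=False)
polar_form = complex_to_real_features(sample, polar=True)
```

Either way, the real-valued network sees twice as many input features as the complex-valued one.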

Returns:

(string) Full path to the run_data.csv generated file. It can be used by cvnn.data_analysis.SeveralMonteCarloComparison to compare several runs.
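
The per-epoch statistics files listed above summarise all iterations of a model at each epoch. As a rough illustration of that kind of aggregation, and not the library's own implementation, the sketch below computes the mean, median and standard deviation per epoch from hypothetical accuracy values. The data layout, the function name per_epoch_statistics and the use of the sample standard deviation are assumptions.

```python
import statistics

# Hypothetical accuracies: one inner list per Monte Carlo iteration,
# one value per epoch (3 iterations x 2 epochs).
runs = [
    [0.50, 0.70],
    [0.60, 0.80],
    [0.70, 0.75],
]


def per_epoch_statistics(runs):
    """Return one dict of summary statistics per epoch."""
    summary = []
    # zip(*runs) groups the values of all iterations by epoch.
    for epoch, values in enumerate(zip(*runs), start=1):
        summary.append({
            "epoch": epoch,
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "std": statistics.stdev(values),  # sample standard deviation (assumption)
        })
    return summary


stats = per_epoch_statistics(runs)
```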

mlp_run_real_comparison_montecarlo(dataset: cvnn.dataset.Dataset, open_dataset=None, iterations=1000, epochs=150, batch_size=100, display_freq=1, optimizer='sgd', shape_raw=None, activation='cart_relu', debug=False, polar=False, do_all=True, dropout=0.5, validation_split=0.2, validation_data=None, capacity_equivalent=True, equiv_technique='ratio', do_conf_mat=True)

This function is used to compare CVNN vs RVNN performance over any dataset.

Automatically creates two Multi-Layer Perceptrons (MLP), one complex and one real.
Runs the simulation for both models and compares the results.
Saves several files into ./logs/montecarlo/<year>/<month>/<day>/run_<time>/:
1. run_summary.txt: Summary of the run models and data
2. run_data.csv: Full performance information for every iteration of each model at each epoch
3. complex_network_statistical_result.csv: Statistical results of all iterations of CVNN per epoch
4. real_network_statistical_result.csv: Statistical results of all iterations of RVNN per epoch
5. (Optional) plot/ folder with the corresponding plots generated by MonteCarloAnalyzer.do_all()

Parameters:

dataset – cvnn.dataset.Dataset with the dataset to be used for training
open_dataset – If the dataset is saved inside a folder and must be opened, the path of the Dataset to be opened. Otherwise None (default)
iterations – Number of iterations to be done for each model
epochs – Number of epochs for each iteration
batch_size – Batch size at each iteration
display_freq – Frequency in terms of epochs of when to do a checkpoint.
optimizer – Optimizer to be used. Keras optimizers are not allowed. Can be either cvnn.optimizers.Optimizer or a string listed in opt_dispatcher.
shape_raw – List of sizes of each hidden layer. For example, [64] will generate a CVNN with one hidden layer of size 64. If None (default), the example shape is used.
activation – Activation function to be used at each hidden layer
polar – Boolean indicating whether the RVNN should receive the real and imaginary parts (False) or the amplitude and phase (True)
do_all – If true (default) it creates a plot/ folder with the plots generated by MonteCarloAnalyzer.do_all()
dropout – (float) Dropout to be used at each hidden layer. If None it will not use any dropout.

validation_split – Float between 0 and 1. Fraction of the input data to be used as test set (the rest will be used as train set). Default: 0.2. This input is ignored if validation_data is given.
validation_data –
Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. This parameter takes precedence over validation_split. It can be:
- tuple (x_val, y_val) of Numpy arrays or tensors. Preferred data type (less overhead).
- A tf.data dataset.

capacity_equivalent –
An equivalent model can be equivalent in terms of layer neurons or trainable parameters (capacity equivalent according to this paper).
- True: creates a capacity-equivalent model in terms of trainable parameters
- False: doubles all layer sizes (except the last one if classifier=True)
equiv_technique –
Used to define the strategy of the capacity-equivalent model. This parameter is ignored if capacity_equivalent=False.
- ‘ratio’: neurons_real_valued_layer[i] = r * neurons_complex_valued_layer[i], with ‘r’ constant for all ‘i’
- ‘alternate’: Method described in this paper, where one alternates between multiplying by 2 or 1. The special case in the middle is treated as a compromise between the two.
do_conf_mat – Generate a confusion matrix based on results.
verbose –
Verbosity mode. One of:
- 0 or ‘silent’: No output at all
- 1 or False: Progress bar per iteration
- 2 or True or ‘debug’: Progress bar per epoch

Returns:

(string) Full path to the run_data.csv generated file. It can be used by cvnn.data_analysis.SeveralMonteCarloComparison to compare several runs.
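
To see why capacity_equivalent distinguishes between doubling the layer sizes and matching the number of trainable parameters, it helps to count parameters. The sketch below is plain arithmetic and does not use the library. It assumes fully connected layers with biases, counts each complex weight as two real parameters, and assumes the complex output layer has the same number of units as the real one. All function names are hypothetical.

```python
def complex_mlp_real_params(layer_sizes):
    """Real-valued trainable parameters of a complex MLP.

    layer_sizes = [inputs, hidden..., outputs]. Each complex weight or
    bias counts as two real parameters.
    """
    return sum(2 * (n_in * n_out + n_out)
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def real_mlp_params(layer_sizes):
    """Trainable parameters of a real MLP with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def doubled_real_shape(complex_sizes):
    """Double every layer size except the output layer (classifier case)."""
    return [2 * n for n in complex_sizes[:-1]] + [complex_sizes[-1]]


# 128 complex inputs, one hidden layer of 64 units, 2 output classes.
complex_shape = [128, 64, 2]
real_shape = doubled_real_shape(complex_shape)  # [256, 128, 2]

complex_count = complex_mlp_real_params(complex_shape)
real_count = real_mlp_params(real_shape)
```

Under these assumptions the doubled real network has 33154 parameters against 16772 for the complex one, roughly twice as many, which is the gap a capacity-equivalent model is meant to close.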

run_gaussian_dataset_montecarlo(iterations=1000, m=10000, n=128, param_list=None, epochs=150, batch_size=100, display_freq=1, optimizer='sgd', shape_raw=None, activation='cart_relu', debug=False, polar=False, do_all=True, dropout=None)

This function is used to compare CVNN vs RVNN performance over statistically non-circular data.

Generates complex-valued correlated Gaussian noise with the characteristics given by the inputs.
It then runs a Monte Carlo simulation of several iterations of both a CVNN and an equivalent RVNN model.
Saves several files into ./log/montecarlo/date/of/run/:
1. run_summary.txt: Summary of the run models and data
2. run_data.csv: Full performance information for every iteration of each model at each epoch
3. complex_network_statistical_result.csv: Statistical results of all iterations of CVNN per epoch
4. real_network_statistical_result.csv: Statistical results of all iterations of RVNN per epoch
5. (Optional) plot/ folder with the corresponding plots generated by MonteCarloAnalyzer.do_all()

Parameters:

iterations – Number of iterations to be done for each model
m – Total size of the dataset (number of examples)
n – Number of features (size of each input vector)
param_list –
A list whose length equals the number of classes. Each element is itself a list of length 3 with the values [correlation_coeff, sigma_x, sigma_y]. Example for dataset type A of paper [CIT2020-BARRACHINA]:
```
param_list = [
    [0.5, 1, 1],
    [-0.5, 1, 1]
]
```
Default: None will default to the example.
epochs – Number of epochs for each iteration
batch_size – Batch size at each iteration
display_freq – Frequency in terms of epochs of when to do a checkpoint.
optimizer – Optimizer to be used. Keras optimizers are not allowed. Can be either cvnn.optimizers.Optimizer or a string listed in opt_dispatcher.
shape_raw – List of sizes of each hidden layer. For example, [64] will generate a CVNN with one hidden layer of size 64. If None (default), the example shape is used.
activation – Activation function to be used at each hidden layer
polar – Boolean indicating whether the RVNN should receive the real and imaginary parts (False) or the amplitude and phase (True)
do_all – If true (default) it creates a plot/ folder with the plots generated by MonteCarloAnalyzer.do_all()
dropout – (float) Dropout to be used at each hidden layer. If None it will not use any dropout.
verbose –
Verbosity mode. One of:
- 0 or ‘silent’: No output at all
- 1 or False: Progress bar per iteration
- 2 or True or ‘debug’: Progress bar per epoch

Returns:

(string) Full path to the run_data.csv generated file. It can be used by cvnn.data_analysis.SeveralMonteCarloComparison to compare several runs.
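
Each param_list entry, [correlation_coeff, sigma_x, sigma_y], describes one class of correlated Gaussian noise. The sketch below shows one common way to draw such samples with the Python standard library, assuming sigma_x and sigma_y are the standard deviations of the real and imaginary parts and correlation_coeff is the correlation between them. It is not the library's generator, the function names are hypothetical, and it draws scalar samples rather than vectors of n features.

```python
import math
import random


def correlated_complex_samples(rho, sigma_x, sigma_y, count, seed=0):
    """Draw complex samples x + 1j*y with zero-mean Gaussian x and y.

    x and y have standard deviations sigma_x and sigma_y and
    correlation coefficient rho.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = sigma_x * z1
        # Mixing z1 into y introduces the requested correlation with x.
        y = sigma_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        samples.append(complex(x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation between the real and imaginary parts."""
    xs = [s.real for s in samples]
    ys = [s.imag for s in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


param_list = [[0.5, 1, 1], [-0.5, 1, 1]]  # one entry per class
classes = [
    correlated_complex_samples(*params, count=20000, seed=label)
    for label, params in enumerate(param_list)
]
estimates = [sample_correlation(c) for c in classes]
```

With the two example entries, the classes differ only in the sign of the correlation between real and imaginary parts, so the estimated correlations should come out close to 0.5 and -0.5.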

[CIT2020-BARRACHINA]

Jose Agustin Barrachina, Chenfang Ren, Christele Morisseau, Gilles Vieillard, Jean-Philippe Ovarlez “Complex-Valued vs. Real-Valued Neural Networks for Classification Perspectives: An Example on Non-Circular Data” arXiv:2009.08340 ML Stat, Sep. 2020. Available: https://arxiv.org/abs/2009.08340.