SL6: California housing dataset – regression with ANN

30 mins
# data analysis and wrangling
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
# import random as rnd

# visualization
import matplotlib.pyplot as plt
# %matplotlib inline

# machine learning
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import metrics

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

1. Browse the Keras library (tutorial and documentation cited in the slides)

2. Load the California housing dataset

Some info about the California_housing_dataset #

  • #samples (instances): 20640

  • variables: 8 numeric predictors, 1 target

    • Predictors:
      • MedInc (mi): median income in block
      • HouseAge (ha): median house age in block
      • AveRooms (ar): average number of rooms
      • AveBedrms (ab): average number of bedrooms
      • Population (p): block population
      • AveOccup (ao): average house occupancy
      • Latitude (lt): house block latitude
      • Longitude (lg): house block longitude
    • Response:
      • Target (v): median house value for California districts
  • Missing values: none

Data Acquisition #

# Load the California Housing dataset
df_train = pd.read_csv('../input/californiahousingdataset/train.csv',sep=',')
df_test = pd.read_csv('../input/californiahousingdataset/test.csv',sep=',')
# Some stats
print(f"We have {df_train.shape[0] + df_test.shape[0]} observations, split into:\n\
      * {df_train.shape[0]} training observations;\n\
      * {df_test.shape[0]} test observations.\n\
There are {df_train.isna().sum().sum() + df_test.isna().sum().sum()} missing values in the dataset.")
We have 20640 observations, split into:
      * 16385 training observations;
      * 4255 test observations.
There are 0 missing values in the dataset.

Data pre-processing #

# Drop a useless feature
df_train = df_train.drop(columns='Unnamed: 0')
df_test = df_test.drop(columns='Unnamed: 0')


16385 rows × 9 columns

Split the dataset into Training and Test sets #

# Training set
predictorsTrain = df_train.loc[:, df_train.columns != 'v']
responseTrain = df_train['v']

# Test set
predictorsTest = df_test.loc[:, df_test.columns != 'v']
responseTest = df_test['v']

Standardization #

# Standardize "predictorsTrain"
predictorsTrainMeans = predictorsTrain.mean()
predictorsTrainStds = predictorsTrain.std()
predictorsTrain_std = (predictorsTrain - predictorsTrainMeans)/predictorsTrainStds # standardized variables of predictorTrain

# Standardize "predictorsTest" using the mean and std of predictorsTrain, so that no test-set information enters the preprocessing
predictorsTest_std = (predictorsTest - predictorsTrainMeans)/predictorsTrainStds # standardized variables of predictorTest
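
As a minimal sketch of the same idea on toy numbers (the values and the helper name `standardize` are made up for illustration and are not part of the notebook), the mean and standard deviation are computed on the training column only and then reused for the test column. The sample standard deviation is used, which is also what pandas `.std()` returns by default.

```python
from statistics import mean, stdev

def standardize(values, center, scale):
    """Return (v - center) / scale for every value."""
    return [(v - center) / scale for v in values]

# Toy single-feature columns (hypothetical numbers, not taken from the dataset)
train_col = [2.0, 4.0, 6.0, 8.0]
test_col = [5.0, 11.0]

# The statistics come from the training column only
mu = mean(train_col)
sigma = stdev(train_col)  # sample standard deviation (n - 1 in the denominator)

train_std = standardize(train_col, mu, sigma)
test_std = standardize(test_col, mu, sigma)  # reuses the training mean and std

print(train_std)
print(test_std)
```

After this transformation the training column has mean 0 and standard deviation 1; the test column generally does not, and that is expected.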

Split the training set into Train and Validation sets #

Splitting the dataset is essential for an unbiased evaluation of prediction performance. In most cases, it’s enough to split your dataset randomly into three subsets:

  • The training set is applied to train, or fit, your model. For example, you use the training set to find the optimal coefficients for linear regression.

  • The validation set is used for unbiased model evaluation during hyperparameter tuning. For example, when you want to find the optimal number of neurons in a neural network, you experiment with different values. For each considered setting of hyperparameters, you fit the model with the training set and assess its performance with the validation set.

  • The test set is needed for an unbiased evaluation of the final model. Don’t use it for fitting or validation.
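
The three roles above can be sketched with a plain-Python index split. This is only an illustration under assumed names (`three_way_split` and its parameters are hypothetical); the notebook itself receives the test set as a separate file and uses scikit-learn's `train_test_split` for the train/validation part.

```python
import random

def three_way_split(n_samples, test_frac, val_frac, seed=3):
    """Shuffle the indices 0..n_samples-1 and cut them into train/val/test lists.

    val_frac is applied to what remains after the test part is removed.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_frac)
    n_val = round((n_samples - n_test) * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(100, test_frac=0.2, val_frac=0.2)
print(len(train_idx), len(val_idx), len(test_idx))  # 64 16 20
```

The three index lists are disjoint, so no sample is used both for fitting and for evaluation.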

I chose to split the training set into two parts: a small fraction (20%) becomes the validation set, on which the model is evaluated, and the rest (80%) is used to train the model.

# Set the random seed
random_seed = 3 # a random_state parameter may be provided to control the random number generator used
# Split the train and the validation set for the fitting
X_train, X_val, y_train, y_val = train_test_split(predictorsTrain_std, responseTrain, test_size = 0.2, random_state = random_seed)
X_train.shape, y_train.shape, X_val.shape, y_val.shape
((13108, 8), (13108,), (3277, 8), (3277,))

Rename the data

# Rename our data

## Training set - already done it above when I created the validation set
# X_train = X_train
# X_val = X_val
# y_train = y_train
# y_val = y_val

## Test set
X_test = predictorsTest_std
y_test = responseTest
# # Since Keras models are trained on Numpy arrays of input data and labels:

# # Training set
# # X_train = X_train.values
# # X_val = X_val.values
# # y_train = y_train.values
# # y_val = y_val.values

# # Test set
# X_test = X_test.values
# y_test = y_test.values
Warning: Converting the data into NumPy arrays makes the fitting process much slower!

Generating the first Artificial Neural Network #

3. Generate the artificial neural network model analyzed in the slides and compare its results with those obtained by the structures defined below

Create the ANN #

Let's create a simple model with the Keras Sequential API:

  • Dense is a fully connected layer: every neuron in the previous layer is connected to every neuron in the Dense layer.
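
To make "fully connected" concrete, here is a small plain-Python sketch of what one dense layer computes: each neuron takes a weighted sum of all inputs, adds its bias, and applies the activation. The function names and the toy weights are assumptions for illustration; Keras does the same computation with tensors.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def dense_forward(x, weights, biases, activation=relu):
    """Compute the output of one fully connected layer.

    x       : list of input values (length n_inputs)
    weights : one list of n_inputs weights per neuron
    biases  : one bias per neuron
    """
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b  # every input feeds every neuron
        outputs.append(activation(z))
    return outputs

# Toy layer with 2 inputs and 3 neurons (hypothetical values)
x = [1.0, 2.0]
weights = [[0.5, -1.0], [1.0, 1.0], [0.0, 2.0]]
biases = [0.0, -1.0, 0.5]
print(dense_forward(x, weights, biases))  # [0.0, 2.0, 4.5]
```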
model = Sequential()

# Input Layer
model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))

# Hidden Layers
model.add(Dense(30, activation='relu'))
model.add(Dense(40, activation='relu'))

# Output Layer
model.add(Dense(1)) # single output neuron for the regression target (the summary lists dense_3 with shape (None, 1)); the default linear activation is assumed

Compile network #

# Compile the model
model.compile(optimizer ='adam',           # Optimizer: an algorithm for first-order stochastic gradient descent
              loss = 'mean_squared_error', # Loss function: the objective that the model will try to minimize
              metrics=[metrics.mae])       # A list of metrics: used to judge the performance of your model
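
For reference, the loss and the metric chosen above can be written out in a few lines of plain Python. This is a sketch on made-up numbers, not the Keras implementation; the function names simply mirror the Keras identifiers.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between prediction and target."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between prediction and target."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy targets and predictions (hypothetical values)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 6.0]

print(mean_squared_error(y_true, y_pred))   # 1.3125
print(mean_absolute_error(y_true, y_pred))  # 0.875
```

The single large error (2.0) dominates the squared loss, while it contributes only linearly to the MAE.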

Fitting procedure #

EPOCHS = 150 # 150 epochs are too many when using NumPy arrays

print(f"Train on {X_train.shape[0]} samples, validate on {X_val.shape[0]} samples.")

# train the model on the training set, validating on the 20% held-out validation split
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val), # validation_data: data on which to evaluate the loss and any model metrics at the end of each epoch
                    epochs=EPOCHS,                  # epochs: number of iterations of the training phase 
                    batch_size=32)                  # batch_size: number of samples per gradient update (default: 32)
Train on 13108 samples, validate on 3277 samples.
Epoch 1/150

410/410 [==============================] - 2s 3ms/step - loss: 1.2253 - mean_absolute_error: 0.7533 - val_loss: 0.5908 - val_mean_absolute_error: 0.5529
410/410 [==============================] - 1s 2ms/step - loss: 0.2205 - mean_absolute_error: 0.3197 - val_loss: 0.2908 - val_mean_absolute_error: 0.3615
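
The `410/410` counter in the log is the number of batches processed per epoch. A quick check of that arithmetic, assuming the last, smaller batch is kept (the helper name `steps_per_epoch` is only illustrative):

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(13108))  # 410 batches for the training split
print(steps_per_epoch(4255))   # 133 batches for the test set
```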
print('- Stats on Training set:',
      '\n\t* Loss:\t\t', history.history['loss'][-1],
      '\n\t* MAE:\t\t', history.history['mean_absolute_error'][-1],
      '\n- Stats on Validation set:',
      '\n\t* loss:\t\t', history.history['val_loss'][-1],
      '\n\t* MAE:\t\t', history.history['val_mean_absolute_error'][-1])
model.summary()
- Stats on Training set: 
	* Loss:		 0.2204761952161789 
	* MAE:		 0.3197171688079834 
- Stats on Validation set: 
	* loss:		 0.29083022475242615 
	* MAE:		 0.3615073263645172
Model: "sequential"
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                90        
dense_1 (Dense)              (None, 30)                330       
dense_2 (Dense)              (None, 40)                1240      
dense_3 (Dense)              (None, 1)                 41        
Total params: 1,701
Trainable params: 1,701
Non-trainable params: 0
Warning: The summary reports 90 parameters for the first Dense layer, whereas the slides appear to report only 80.

A count of 80 would suggest the formula $$\#param = \#inp \cdot \#neurons_{layer_1},$$ but the true formula, which adds one bias per neuron, is $$\#param = (\#inp + 1) \cdot \#neurons_{layer_1}.$$ With 8 inputs and 10 neurons this gives 90. Maybe the slides use #inputs = 7, which would give 80. [1]
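
The formula can be checked against the summary above with a few lines of Python. The helper `dense_param_count` is a hypothetical name introduced for this sketch; it applies (inputs + 1) × neurons layer by layer.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Return the number of trainable parameters of each Dense layer in a stack."""
    counts = []
    fan_in = n_inputs
    for units in layer_sizes:
        counts.append((fan_in + 1) * units)  # weights plus one bias per neuron
        fan_in = units                       # this layer's outputs feed the next layer
    return counts

counts = dense_param_count(8, [10, 30, 40, 1])
print(counts, sum(counts))               # [90, 330, 1240, 41] 1701
print(dense_param_count(7, [10])[0])     # 80, the value obtained with 7 inputs
```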

Evaluate the model #

On the Training set #

# Initialize the figure
width, height = 10, 5
nfig = 2
fig = plt.figure(figsize = (width*nfig,height))


# SBP 0: LOSS
ax1 = fig.add_subplot(1, nfig, 1)
ax1.plot(range(1,EPOCHS+1),history.history['loss'], color='darkblue', label="Training loss")
ax1.plot(range(1,EPOCHS+1),history.history['val_loss'], color='darkorange', label="Validation loss")

ax1.legend(loc='best', shadow=True)
ax1.grid(color='grey', linestyle='-', linewidth=0.5)

# SBP 1: MAE

ax2 = fig.add_subplot(1, nfig, 2)
ax2.plot(range(1,EPOCHS+1),history.history['mean_absolute_error'], color='darkblue', label="Training MAE")
ax2.plot(range(1,EPOCHS+1),history.history['val_mean_absolute_error'], color='darkorange',label="Validation MAE")

ax2.legend(loc='best', shadow=True)
ax2.set_ylabel('Mean Absolute Error',fontsize=14)
ax2.grid(color='grey', linestyle='-', linewidth=0.5)

# plt.suptitle("Stats on Training set",fontsize=25)
# plt.subplots_adjust(top=0.8) # change title position


On the Test set #

def root_mean_squared_error(y_true, y_pred):
    return np.sqrt(np.mean((y_pred - y_true)**2))
# Compute LOSS and MAE on Test set
loss, mae = model.evaluate(X_test, y_test, verbose = 0) # with verbose = 1 the progress bar shows 133/133, the number of batches:
                                                        # ceil(X_test.shape[0]/32) (default batch_size = 32)
# Compute RMSE on Test set
y_pred = model.predict(X_test)

A = y_test.values     # convert into a numpy array
B = y_pred.flatten()  # flatten the (n, 1) array returned by the predict method

rmse = root_mean_squared_error(A, B)
print(f"- Statistics on the Test set:\n\
\t* Test Loss: {loss}\n\
\t* Test MAE: {mae}\n\
\t* Test RMSE: {rmse}")
- Statistics on the Test set:
	* Test Loss: 0.2790662944316864
	* Test MAE: 0.35706254839897156
	* Test RMSE: 0.5282672929159882
Warning: we don’t store these results now, because the same model is fitted again below.
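
Since the loss is the mean squared error, the test RMSE should simply be its square root. A quick sanity check with the two values printed above (copied from the output; small differences are expected because Keras averages in single precision):

```python
import math

test_loss = 0.2790662944316864  # MSE reported by model.evaluate above
test_rmse = 0.5282672929159882  # RMSE computed with NumPy above

print(math.sqrt(test_loss))
print(math.isclose(math.sqrt(test_loss), test_rmse, rel_tol=1e-6))
```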

Generating other ANN models #

4. Test the following network structures and compare the results in terms of training/validation MAE/loss, RMSE on test set:

  • 1 layer containing a single neuron
  • 1 layer containing 3 neurons
  • 1 layer containing 10 neurons
  • 2 layers containing respectively 10 and 30 neurons
  • 3 layers containing respectively 10, 30 and 40 neurons
def create_model(network):
    num_layers = len(network)
    model = Sequential()

    # Input Layer
    model.add(Dense(network[0], input_dim=X_train.shape[1], activation='relu'))

    # Hidden Layers
    if num_layers > 1:
        for i in range(1,num_layers):            
            model.add(Dense(network[i], activation='relu'))

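    # Output Layer
    model.add(Dense(1)) # single output neuron (every summary below ends with a (None, 1) layer); linear activation assumed
    return model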
def get_test_stats(model, xtest, ytest, verbose_flag):
    # Compute LOSS and MAE on Test set
    loss, mae = model.evaluate(xtest, ytest, verbose = verbose_flag)
    # Compute RMSE on Test set
    y_pred = model.predict(xtest)
    rmse = root_mean_squared_error(ytest.values, y_pred.flatten()) # use the function argument, not the global y_test
    return loss, mae, rmse, y_pred
DOE = [[1], [3], [10], [10,30], [10,30,40]] #Design of experiment
from time import time

# Store the info in order to compare the results with the following models.
training_loss, training_MAE = [], []
val_loss, val_MAE = [], []
test_loss, test_MAE, test_RMSE = [], [], []
net_struct, net_epochs, pred_list = [], [], [] #info about the network setting

print(f"Now we perform {len(DOE)} ANN models.\n")

for idx, network in enumerate(DOE): # we consider as "MODEL #0" the one shown above!
    print(f"[INFO] MODEL #{idx+1} using {network} neurons. [{idx+1}/{len(DOE)}]\n")
    custom_model = create_model(network)
    ## Compile the model
    custom_model.compile(optimizer ='adam',loss = 'mean_squared_error', metrics=[metrics.mae])

    ## Train the model on the training set, validating on the 20% held-out validation split
    print(f"[INFO] Fitting using {EPOCHS} epochs...")
    print(f"Train on {X_train.shape[0]} samples, validate on {X_val.shape[0]} samples.")
    tstart = time()
    custom_history = custom_model.fit(X_train, y_train,
                        validation_data=(X_val, y_val),
                        epochs=EPOCHS,
                        verbose = 0)
    tend = time() - tstart
    print(f"\n...OK, fitted the model in {tend}s.")
    ## Summary
    print("\n[INFO] Summary:")
    custom_model.summary()
    ## Test set statistics
    print("\n[INFO] Evaluate the model on Test set:")
    loss, mae, rmse, y_pred = get_test_stats(custom_model, X_test, y_test, verbose_flag = 1)
    print('\n[INFO] Statistics:\
\n- Stats on Training set:',
'\n\t* Loss:\t\t', custom_history.history['loss'][-1],
'\n\t* MAE:\t\t', custom_history.history['mean_absolute_error'][-1],
'\n- Stats on Validation set:',
'\n\t* loss:\t\t', custom_history.history['val_loss'][-1],
'\n\t* MAE:\t\t', custom_history.history['val_mean_absolute_error'][-1],
'\n- Stats on Test set:',
'\n\t* loss:\t\t', loss,
'\n\t* MAE:\t\t', mae,
'\n\t* RMSE:\t\t', rmse)
    ## Store all the statistics
    # store training info (one value per epoch)
    training_loss.append(custom_history.history['loss'])
    training_MAE.append(custom_history.history['mean_absolute_error'])

    # store val info (one value per epoch)
    val_loss.append(custom_history.history['val_loss'])
    val_MAE.append(custom_history.history['val_mean_absolute_error'])

    # store test info
    test_loss.append(loss)
    test_MAE.append(mae)
    test_RMSE.append(rmse)
    #structure of the network
    net_struct.append(network)
    net_epochs.append(EPOCHS)
    pred_list.append(y_pred)

print(f"Performed all the {len(DOE)} models.")
Now we perform 5 ANN models.

[INFO] MODEL #1 using [1] neurons. [1/5]

[INFO] Fitting using 150 epochs...
Train on 13108 samples, validate on 3277 samples.

...OK, fitted the model in 81.33795595169067s.

[INFO] Summary:
Model: "sequential_1"
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 1)                 9         
dense_5 (Dense)              (None, 1)                 2         
Total params: 11
Trainable params: 11
Non-trainable params: 0

[INFO] Evaluate the model on Test set:
133/133 [==============================] - 0s 1ms/step - loss: 0.5043 - mean_absolute_error: 0.5226

[INFO] Statistics:
- Stats on Training set: 
	* Loss:		 0.5019071698188782 
	* MAE:		 0.5186200737953186 
- Stats on Validation set: 
	* loss:		 0.5074825882911682 
	* MAE:		 0.5189610123634338 
- Stats on Test set: 
	* loss:		 0.5042912364006042 
	* MAE:		 0.5225676894187927 
	* RMSE:		 0.7101348192496855

[INFO] MODEL #2 using [3] neurons. [2/5]

[INFO] Fitting using 150 epochs...
Train on 13108 samples, validate on 3277 samples.

...OK, fitted the model in 86.69740176200867s.

[INFO] Summary:
Model: "sequential_2"
Layer (type)                 Output Shape              Param #   
dense_6 (Dense)              (None, 3)                 27        
dense_7 (Dense)              (None, 1)                 4         
Total params: 31
Trainable params: 31
Non-trainable params: 0

[INFO] Evaluate the model on Test set:
133/133 [==============================] - 0s 1ms/step - loss: 0.4569 - mean_absolute_error: 0.4864

[INFO] Statistics:
- Stats on Training set: 
	* Loss:		 0.45196104049682617 
	* MAE:		 0.4827933609485626 
- Stats on Validation set: 
	* loss:		 0.4575823247432709 
	* MAE:		 0.4799402058124542 
- Stats on Test set: 
	* loss:		 0.4569326639175415 
	* MAE:		 0.4863855540752411 
	* RMSE:		 0.6759679391200216

[INFO] MODEL #3 using [10] neurons. [3/5]

[INFO] Fitting using 150 epochs...
Train on 13108 samples, validate on 3277 samples.

...OK, fitted the model in 85.105064868927s.

[INFO] Summary:
Model: "sequential_3"
Layer (type)                 Output Shape              Param #   
dense_8 (Dense)              (None, 10)                90        
dense_9 (Dense)              (None, 1)                 11        
Total params: 101
Trainable params: 101
Non-trainable params: 0

[INFO] Evaluate the model on Test set:
133/133 [==============================] - 0s 1ms/step - loss: 0.3562 - mean_absolute_error: 0.4191

[INFO] Statistics:
- Stats on Training set: 
	* Loss:		 0.3352857232093811 
	* MAE:		 0.40571486949920654 
- Stats on Validation set: 
	* loss:		 0.36747753620147705 
	* MAE:		 0.41564399003982544 
- Stats on Test set: 
	* loss:		 0.35619547963142395 
	* MAE:		 0.4191289246082306 
	* RMSE:		 0.59682117017886

[INFO] MODEL #4 using [10, 30] neurons. [4/5]

[INFO] Fitting using 150 epochs...
Train on 13108 samples, validate on 3277 samples.

...OK, fitted the model in 106.76856350898743s.

[INFO] Summary:
Model: "sequential_4"
Layer (type)                 Output Shape              Param #   
dense_10 (Dense)             (None, 10)                90        
dense_11 (Dense)             (None, 30)                330       
dense_12 (Dense)             (None, 1)                 31        
Total params: 451
Trainable params: 451
Non-trainable params: 0

[INFO] Evaluate the model on Test set:
133/133 [==============================] - 0s 1ms/step - loss: 0.2938 - mean_absolute_error: 0.3772

[INFO] Statistics:
- Stats on Training set: 
	* Loss:		 0.25864139199256897 
	* MAE:		 0.3488193154335022 
- Stats on Validation set: 
	* loss:		 0.29905804991722107 
	* MAE:		 0.37346842885017395 
- Stats on Test set: 
	* loss:		 0.29382142424583435 
	* MAE:		 0.37718966603279114 
	* RMSE:		 0.5420530397809932

[INFO] MODEL #5 using [10, 30, 40] neurons. [5/5]

[INFO] Fitting using 150 epochs...
Train on 13108 samples, validate on 3277 samples.

...OK, fitted the model in 123.2047529220581s.

[INFO] Summary:
Model: "sequential_5"
Layer (type)                 Output Shape              Param #   
dense_13 (Dense)             (None, 10)                90        
dense_14 (Dense)             (None, 30)                330       
dense_15 (Dense)             (None, 40)                1240      
dense_16 (Dense)             (None, 1)                 41        
Total params: 1,701
Trainable params: 1,701
Non-trainable params: 0

[INFO] Evaluate the model on Test set:
133/133 [==============================] - 0s 2ms/step - loss: 0.2787 - mean_absolute_error: 0.3495

[INFO] Statistics:
- Stats on Training set: 
	* Loss:		 0.234622061252594 
	* MAE:		 0.33155113458633423 
- Stats on Validation set: 
	* loss:		 0.2967351973056793 
	* MAE:		 0.35640445351600647 
- Stats on Test set: 
	* loss:		 0.27867719531059265 
	* MAE:		 0.3495338559150696 
	* RMSE:		 0.5278987847617789

Performed all the 5 models.
Warning: be careful when running the cell above more than once, because each run appends further results to the lists used for the final statistics.
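
One possible way to avoid this, sketched here with a stand-in for the fitting step, is to build a fresh container on every run instead of appending to lists defined in another cell. The names `run_experiments` and `fake_fit` are hypothetical and not part of the notebook.

```python
def run_experiments(structures, fit_fn):
    """Return a new dict of results keyed by network structure.

    A fresh dict is created on every call, so re-running cannot accumulate
    duplicate entries.
    """
    results = {}
    for network in structures:
        results[tuple(network)] = fit_fn(network)
    return results

def fake_fit(network):
    """Stand-in for the real training step: just returns the total neuron count."""
    return sum(network)

DOE = [[1], [3], [10], [10, 30], [10, 30, 40]]
first = run_experiments(DOE, fake_fit)
second = run_experiments(DOE, fake_fit)  # running again gives the same 5 entries
print(len(first), len(second))  # 5 5
```

An equivalent, simpler option is to re-initialize the storage lists at the top of the same cell that contains the loop.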
# Collect all the most useful data into a DataFrame
stats = pd.DataFrame({
    'ANN_structure': net_struct,
    'ANN_epochs': net_epochs,
    'Training Loss': [last for *_, last in training_loss],
    'Training MAE': [last for *_, last in training_MAE],
    'Validation Loss': [last for *_, last in val_loss],
    'Validation MAE': [last for *_, last in val_MAE],
    'Test Loss': test_loss,
    'Test MAE': test_MAE,
    'Test RMSE': test_RMSE
})
stats

   ANN_structure  ANN_epochs  Training Loss  Training MAE  Validation Loss  Validation MAE  Test Loss  Test MAE  Test RMSE
3  [10, 30]       150         0.261056       0.350203      0.307047         0.368865        0.299178   0.371206  0.546971
4  [10, 30, 40]   150         0.239802       0.333529      0.308254         0.359592        0.298137   0.361128  0.546019

5. Generate a chart in which the performance of these models is displayed and compared

Compare the results #

On the Training set #

# Initialize the figure
width, height = 10, 5
nfig = 2
fig = plt.figure(figsize = (width*nfig,height))

# SBP 1: LOSS on Training set
ax1 = fig.add_subplot(1, nfig, 1);
for i in range(0,len(DOE)):
    ax1.plot(range(1,EPOCHS+1), training_loss[i], label="training_nn: " + str(net_struct[i]))
ax1.legend(loc='best', shadow=True)
ax1.set_title('Loss on Training set',fontsize=18);
ax1.grid(color='grey', linestyle='-', linewidth=0.5);

# SBP 2: LOSS on Validation set
ax2 = fig.add_subplot(1, nfig, 2);
for i in range(0,len(DOE)):
    ax2.plot(range(1,EPOCHS+1), val_loss[i], label="val_nn: " + str(net_struct[i]))
ax2.legend(loc='best', shadow=True)
ax2.set_title('Loss on Validation set',fontsize=18);
ax2.grid(color='grey', linestyle='-', linewidth=0.5);


# Initialize the figure
width, height = 10, 5
nfig = 2
fig = plt.figure(figsize = (width*nfig,height))

# SBP 1: MAE on Training set
ax1 = fig.add_subplot(1, nfig, 1);
for i in range(0,len(DOE)):
    ax1.plot(range(1,EPOCHS+1), training_MAE[i], label="training_nn: " + str(net_struct[i]))
ax1.legend(loc='best', shadow=True)
ax1.set_title('MAE on Training set',fontsize=18);
ax1.grid(color='grey', linestyle='-', linewidth=0.5);

# SBP 2: MAE on Validation set
ax2 = fig.add_subplot(1, nfig, 2);
for i in range(0,len(DOE)):
    ax2.plot(range(1,EPOCHS+1), val_MAE[i], label="val_nn: " + str(net_struct[i]))
ax2.legend(loc='best', shadow=True)
ax2.set_title('MAE on Validation set',fontsize=18);
ax2.grid(color='grey', linestyle='-', linewidth=0.5);


On the Test set #

# Initialize the figure
width, height = 10, 5
fig = plt.figure(figsize = (width,height))

ax1 = fig.add_subplot(1, 1, 1)
ax1.bar(range(1,len(DOE)+1), test_RMSE,width=0.4)

ax1.set_title('RMSE on Test set: comparison',fontsize=18)
ax1.grid(color='grey', linestyle='-', linewidth=0.5)

# change the x-axis
xrange = list(range(1, len(DOE)+1))
squad = [str(s) for s in net_struct] # plain strings avoid passing ragged nested lists to matplotlib
ax1.set_xticks(xrange) # fix the tick positions before assigning the labels
ax1.set_xticklabels(squad, minor=False, rotation=45)

for xx,yy in zip(xrange,test_RMSE):
    ax1.text(xx -0.15, yy + .005, str(round(yy, 3)), color='darkblue', fontweight='bold')
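
The same comparison can also be read off numerically. The sketch below ranks the five structures by the test RMSE values printed in the training log above (copied by hand; the summary table comes from a different run, so its numbers differ slightly). The dictionary name is an assumption made for this example.

```python
# Test RMSE per structure, copied from the log output above
test_rmse_logged = {
    "[1]": 0.7101348192496855,
    "[3]": 0.6759679391200216,
    "[10]": 0.59682117017886,
    "[10, 30]": 0.5420530397809932,
    "[10, 30, 40]": 0.5278987847617789,
}

# Sort from the lowest (best) to the highest RMSE
ranking = sorted(test_rmse_logged.items(), key=lambda kv: kv[1])
for name, value in ranking:
    print(f"{name:>14}: {value:.3f}")

best = ranking[0][0]
print("Lowest test RMSE:", best)
```

In this run the RMSE decreases monotonically as neurons and layers are added, with the gain from the third hidden layer being the smallest.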


The first and the last models #

ns = 50 # number of samples to visualize
id_one, id_two = 0,-1 # index of the two models we want to compare

# Initialize the figure
width, height = 7, 10 # single pic
rows, columns = 3, 2
fig = plt.figure(figsize = (width*rows,height*columns))

idx_model = id_one
y_pred = pred_list[idx_model] # prediction of a certain model

## SBP 1
ax1 = fig.add_subplot(rows, columns, 1);
for i in range(0,ns+1):
    ax1.plot(i,y_pred[i], 'darkorange',marker='o')
    ax1.plot(i,y_test[i], 'b',marker='o')
    ax1.plot([i, i], [y_pred[i], y_test[i]], color='grey') # distance btw y_test and y_pred
ax1.set_ylabel('median house value',fontsize=14);
ax1.set_title(f'House prices prediction:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax1.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 3
ax3 = fig.add_subplot(rows, columns, 3);
ax3.plot(range(1,EPOCHS+1), training_loss[idx_model], label="training loss")
ax3.plot(range(1,EPOCHS+1), val_loss[idx_model], label="val loss")
ax3.set_title(f'Loss:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax3.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 5
ax5 = fig.add_subplot(rows, columns, 5);
ax5.plot(range(1,EPOCHS+1), training_MAE[idx_model], label="training MAE")
ax5.plot(range(1,EPOCHS+1), val_MAE[idx_model], label="val MAE")
ax5.set_title(f'MAE:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax5.grid(color='grey', linestyle='-', linewidth=0.5);

idx_model = id_two
y_pred = pred_list[idx_model] # prediction of a certain model

## SBP 2
ax2 = fig.add_subplot(rows, columns, 2);
for i in range(0,ns+1):
    ax2.plot(i,y_pred[i], 'darkorange',marker='o')
    ax2.plot(i,y_test[i], 'b',marker='o')
    ax2.plot([i, i], [y_pred[i], y_test[i]], color='grey') # distance btw y_test and y_pred
ax2.legend(['prediction','actual value'])
ax2.set_ylabel('median house value',fontsize=14);
ax2.set_title(f'House prices prediction\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax2.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 4
ax4 = fig.add_subplot(rows, columns, 4);
ax4.plot(range(1,EPOCHS+1), training_loss[idx_model], label="training loss")
ax4.plot(range(1,EPOCHS+1), val_loss[idx_model], label="val loss")
ax4.set_title(f'Loss:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax4.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 6
ax6 = fig.add_subplot(rows, columns, 6);
ax6.plot(range(1,EPOCHS+1), training_MAE[idx_model], label="training MAE")
ax6.plot(range(1,EPOCHS+1), val_MAE[idx_model], label="val MAE")
ax6.set_title(f'MAE:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax6.grid(color='grey', linestyle='-', linewidth=0.5);

# set the spacing between subplots
plt.suptitle(f"Comparison ANN structure:\n{net_struct[id_one]} vs. {net_struct[id_two]}", fontsize=20)


The last two models #

ns = 50 # number of samples to visualize
id_one, id_two = -2,-1 # index of the two models we want to compare

# Initialize the figure
width, height = 7, 10 # single pic
rows, columns = 3, 2
fig = plt.figure(figsize = (width*rows,height*columns))

idx_model = id_one
y_pred = pred_list[idx_model] # prediction of a certain model

## SBP 1
ax1 = fig.add_subplot(rows, columns, 1);
for i in range(0,ns+1):
    ax1.plot(i,y_pred[i], 'darkorange',marker='o')
    ax1.plot(i,y_test[i], 'b',marker='o')
    ax1.plot([i, i], [y_pred[i], y_test[i]], color='grey') # distance btw y_test and y_pred
ax1.set_ylabel('median house value',fontsize=14);
ax1.set_title(f'House prices prediction:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax1.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 3
ax3 = fig.add_subplot(rows, columns, 3);
ax3.plot(range(1,EPOCHS+1), training_loss[idx_model], label="training loss")
ax3.plot(range(1,EPOCHS+1), val_loss[idx_model], label="val loss")
ax3.set_title(f'Loss:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax3.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 5
ax5 = fig.add_subplot(rows, columns, 5);
ax5.plot(range(1,EPOCHS+1), training_MAE[idx_model], label="training MAE")
ax5.plot(range(1,EPOCHS+1), val_MAE[idx_model], label="val MAE")
ax5.set_title(f'MAE:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax5.grid(color='grey', linestyle='-', linewidth=0.5);

idx_model = id_two
y_pred = pred_list[idx_model] # prediction of a certain model

## SBP 2
ax2 = fig.add_subplot(rows, columns, 2);
for i in range(0,ns+1):
    ax2.plot(i,y_pred[i], 'darkorange',marker='o')
    ax2.plot(i,y_test[i], 'b',marker='o')
    ax2.plot([i, i], [y_pred[i], y_test[i]], color='grey') # distance btw y_test and y_pred
ax2.legend(['prediction','actual value'])
ax2.set_ylabel('median house value',fontsize=14);
ax2.set_title(f'House prices prediction\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax2.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 4
ax4 = fig.add_subplot(rows, columns, 4);
ax4.plot(range(1,EPOCHS+1), training_loss[idx_model], label="training loss")
ax4.plot(range(1,EPOCHS+1), val_loss[idx_model], label="val loss")
ax4.set_title(f'Loss:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax4.grid(color='grey', linestyle='-', linewidth=0.5);

## SBP 6
ax6 = fig.add_subplot(rows, columns, 6);
ax6.plot(range(1,EPOCHS+1), training_MAE[idx_model], label="training MAE")
ax6.plot(range(1,EPOCHS+1), val_MAE[idx_model], label="val MAE")
ax6.set_title(f'MAE:\nANN structure: {net_struct[idx_model]}',fontsize=18);
ax6.grid(color='grey', linestyle='-', linewidth=0.5);

# set the spacing between subplots
plt.suptitle(f"Comparison ANN structure:\n{net_struct[id_one]} vs. {net_struct[id_two]}", fontsize=20)


  1. The formula for counting the parameters of a Dense layer can be found here.