Why did this Keras Python program fail?

I followed a tutorial on YouTube and accidentally left out model.add(Dense(6, activation='relu')) in my Keras model, and I got 36% accuracy. After I added this line, the accuracy rose to 86%. Why did this happen?

This is the code:

from sklearn.model_selection import train_test_split
import keras
from keras.models import Sequential
from keras.layers import Dense 
import numpy as np
classifications = 3
dataset = np.loadtxt('wine.csv', delimiter=",")
X = dataset[:,1:14]
Y = dataset[:,0:1]
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.66, 
y_train = keras.utils.to_categorical(y_train-1, classifications)
y_test = keras.utils.to_categorical(y_test-1, classifications)
model = Sequential()
model.add(Dense(10, input_dim=13, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(6, activation='relu')) # This is the code I missed
model.add(Dense(6, activation='relu'))
model.add(Dense(4, activation='relu'))
model.add(Dense(2, activation='relu'))
model.add(Dense(classifications, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=15, epochs=2500, validation_data=(x_test, y_test))

2 answers

  • answered 2018-04-15 02:09 Wendong Zheng

    In my opinion, it may be the ratio of your training set to your test set. You hold out 66% of the data as the test set, so the model is trained on only 34% of it and may be underfitting. With so little training data, removing one Dense layer can change the accuracy much more than it otherwise would. Set test_size=0.2 and check again how much the missing layer changes the accuracy.
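    To make the split sizes concrete, here is a small sketch. It assumes wine.csv is the 178-row UCI Wine dataset (the question does not say so), and the helper split_sizes is a hypothetical name that mirrors how train_test_split rounds a fractional test_size up.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Mirrors train_test_split when only test_size is given:
    the test count is rounded up and the rest is used for training.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


# Assumption: 178 rows, as in the UCI Wine dataset.
N_SAMPLES = 178

for test_size in (0.66, 0.2):
    n_train, n_test = split_sizes(N_SAMPLES, test_size)
    print(f"test_size={test_size}: {n_train} training rows, {n_test} test rows")
```

    Under that assumption, test_size=0.66 leaves 60 rows for training three classes, while test_size=0.2 leaves 142.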

  • answered 2018-04-15 02:53 Shashi Tunga

    The number of layers is a hyperparameter, just like the learning rate or the number of neurons. These play an important role in determining the accuracy. So in your case:

    model.add(Dense(6, activation='relu')) 

    This layer played the key role. We cannot know exactly what each of these layers is doing. The best we can do is hyperparameter tuning to find the best combination of hyperparameters.
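    A minimal sketch of what such tuning could look like for the number and width of hidden layers. The function names (candidate_hidden_layers, build_model, best_architecture), the candidate depths and widths, and the epoch count are all illustrative assumptions, not values from the question. It also assumes you pass a validation split separate from the final test set, so the chosen architecture is not tuned on the data used to report accuracy.

```python
from itertools import product


def candidate_hidden_layers(depths=(3, 4, 5), widths=(6, 8, 10)):
    """Return one tuple of hidden layer sizes per (depth, width) pair."""
    return [(width,) * depth for depth, width in product(depths, widths)]


def build_model(hidden_units, n_features=13, n_classes=3):
    """Build a Dense/ReLU classifier with the given hidden layer sizes."""
    # Imported lazily so the candidate list can be inspected without Keras.
    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(hidden_units[0], input_dim=n_features, activation="relu"))
    for units in hidden_units[1:]:
        model.add(Dense(units, activation="relu"))
    model.add(Dense(n_classes, activation="softmax"))
    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model


def best_architecture(x_train, y_train, x_val, y_val, candidates, epochs=200):
    """Train each candidate and return the one with the best validation accuracy."""
    scores = {}
    for hidden_units in candidates:
        model = build_model(hidden_units)
        model.fit(x_train, y_train, batch_size=15, epochs=epochs, verbose=0)
        # With metrics=["accuracy"], evaluate returns [loss, accuracy].
        _, accuracy = model.evaluate(x_val, y_val, verbose=0)
        scores[hidden_units] = accuracy
    return max(scores, key=scores.get), scores
```

    Because the weights are initialised randomly, a single run per candidate can be noisy; repeating each candidate a few times and averaging the accuracy gives a fairer comparison.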