This is not what I had expected!

I have trained a CNN on SVHN. The test accuracy is about 0.93, and overall the model works really well on single-digit images. If I test the model with an image that contains a single digit, such as the following:

it works great, with the expected class probability close to `1`.

But if I supply the model with a random image, like a `house`

or a `lion`

, it still predicts a class with a probability close to 1. I cannot understand why; I would have expected very low probabilities for every class.

Here is how I created the network.

```
import math
import tensorflow.keras as keras

batch_size = 32            # example value
num_train_samples = 73257  # size of the SVHN training set

model = keras.Sequential()
# First Conv Layer
model.add(keras.layers.Conv2D(filters=96, kernel_size=(11, 11), strides=(4, 4), padding="same", input_shape=(227, 227, 3)))
model.add(keras.layers.Activation("relu"))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding="same"))
# ... more convolution layers ...
# ... some fully connected layers ...
# Final fully connected layer
model.add(keras.layers.Dense(10))
model.add(keras.layers.Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.RMSprop(lr=0.0001), metrics=['accuracy'])

data_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
train_generator = data_generator.flow_from_directory(
    'train',
    target_size=(227, 227),
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='categorical'
)
model.fit_generator(
    train_generator,
    epochs=12,
    steps_per_epoch=math.ceil(num_train_samples / batch_size),
    verbose=2
)
```
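For completeness, this is roughly how I obtain a prediction. The model below is a tiny untrained stand-in with the same 10-way softmax head, not my real network; it only illustrates the call I make. Note that even pure random noise comes back as a probability distribution:

```python
import numpy as np
import tensorflow.keras as keras

# Tiny stand-in with the same 10-way softmax output as my real model
# (random weights, untrained) - only to illustrate the prediction call.
stand_in = keras.Sequential([
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

# Even pure noise produces a probability distribution over the 10 digits
noise = np.random.rand(1, 227, 227, 3).astype("float32")
probs = stand_in.predict(noise)[0]

predicted_class = int(probs.argmax())  # class index 0-9
confidence = float(probs.max())        # probability of that class
```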

As can be seen in the code above, I have used:

- `categorical_crossentropy` as the loss function
- `softmax` as the final-layer activation

There are 10 classes, 0 to 9. Would I also need an 11th class containing random images? That sounds very weird to me. Did I choose the wrong loss / activation functions?
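To make concrete what puzzles me about the probabilities, here is a minimal NumPy sketch (the logits are made up, not from my model): softmax divides each exponentiated logit by the sum of all of them, so the 10 outputs always sum to 1 regardless of the input.

```python
import numpy as np

# Hypothetical logits for the 10 digit classes - the numbers are made up
logits = np.array([2.0, -1.0, 0.5, 3.5, -0.5, 1.0, -2.0, 0.0, 1.5, -1.5])

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

probs = softmax(logits)
total = probs.sum()  # sums to 1 (up to floating-point error), by construction
```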