CNN model with both image data and pre-extracted features
I am trying to implement a CNN model to classify images into their corresponding classes. The images are of size 64x64x3. My dataset consists of 25,000 images plus a CSV file of 14 pre-extracted features per image, such as color, length, etc. I want to build a CNN model that makes use of both the image data and the features for training and prediction. How can I implement such a model in Python with Keras?
1 answer

I'm going to start out assuming that you can import the data without any issues, that you have already separated the x data into images and features, and that you have the y data as the labels of each image.
You can use the Keras functional API to build a neural network that takes multiple inputs.
from keras.models import Model
from keras.layers import Conv2D, Dense, Input, Embedding, Reshape, concatenate

img = Input(shape=(64, 64, 3))
features = Input(shape=(14,))

embedded = Embedding(input_dim=14, output_dim=60*32)(features)
embedded = Reshape(target_shape=(14, 60, 32))(embedded)

encoded = Conv2D(32, (3, 3), activation='relu')(img)
encoded = Conv2D(32, (3, 3), activation='relu')(encoded)

x = concatenate([embedded, encoded], axis=1)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
main_output = Dense(1, activation='sigmoid', name='main_output')(x)

model = Model([img, features], [main_output])
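As an alternative sketch (my own, not part of the answer above): a more common pattern for mixing images with tabular features is to flatten the convolutional output and concatenate it with the raw feature vector before the dense head. The layer sizes here are illustrative, and the sigmoid head assumes binary labels:

```python
from keras.models import Model
from keras.layers import Conv2D, Dense, Flatten, Input, MaxPooling2D, concatenate

img = Input(shape=(64, 64, 3))    # raw image branch
features = Input(shape=(14,))     # pre-extracted CSV features

x = Conv2D(32, (3, 3), activation='relu')(img)
x = MaxPooling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Flatten()(x)

merged = concatenate([x, features])   # join both branches
merged = Dense(64, activation='relu')(merged)
out = Dense(1, activation='sigmoid')(merged)  # assumes binary labels

model = Model(inputs=[img, features], outputs=out)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
```

Training would then look like model.fit([image_array, feature_array], labels, ...), with both arrays in the same sample order; for more than two classes, swap the head for Dense(n_classes, activation='softmax') with categorical cross-entropy.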
See also questions close to this topic

Implement the ID3 algorithm from scratch
I would like to implement the ID3 algorithm, but I am running into problems.
First of all I would like to make 2 functions.
The first one is the following:
def func1_learn(X, y, impurity_measure="entropy"):
    # This function should learn a decision tree with entropy as the impurity measure
so that if I call the func1_learn function using:
func1_learn(X, y, impurity_measure = "entropy")
or like this:
func1_learn(X, y)
then in both cases, the function should learn a decision tree with entropy as impurity measure.
My second function should predict class label of some new data point x.
def func2_predict(x, tree):
    # Predict the class label of some new data point x
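As a starting point for the impurity computation mentioned above, here is a hedged numpy sketch of Shannon entropy (the helper name is my own, not from the question):

```python
import numpy as np

def entropy(y):
    """Shannon entropy (in bits) of an array of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

print(entropy(np.array([0, 0, 1, 1])))  # 1.0 (a 50/50 split is maximal for 2 classes)
print(entropy(np.array([1, 1, 1, 1])))  # a pure node has zero entropy
```

ID3 then picks, at each node, the split that maximizes information gain: the parent's entropy minus the weighted entropy of the children.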

Sklearn supervised learning with 3d array
NOTE: You can probably ignore the paragraph below if you have deep technical knowledge of sklearn and ML in general.
I am working on indexing image objects based on their position in an image. Their index is relative to the other image objects in each image, which varies significantly, so simple math will not work for indexing them. Moreover, I have tried to index them via their middle x coordinate in the image, but that only yields an accuracy of ~75% with sklearn's DecisionTreeRegressor. Now I want to try to train a model to index them from their detection box's x1,y1 or x1,x2,y1,y2 coordinates (obtained from tensorflow object recognition + a pretrained neural network).
So here's my question:
Is an array such as
[
  [[x0_0_0, x0_0_1],   # < object 1 x,y coords for image 1
   [x0_1_0, x0_1_1]],  # < object 2 x,y coords for image 1
  [ ... ],
  [[xn_0_0, xn_0_1],   # < object 1 x,y coords for image n
   [xn_1_0, xn_1_1]]   # < object 2 x,y coords for image n
]
with a target array of
[[y0_0, y0_1],  # < indices of objects 1 and 2 in image 1
 [ ... ],
 [yn_0, yn_1]]  # < indices of objects 1 and 2 in image n
Viable for use in any supervised ML algorithms packaged in sklearn?
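One point worth noting (a hedged sketch of my own, not from the question): sklearn estimators generally expect X as a 2-D array of shape (n_samples, n_features), so a nested array like the one above is usually flattened to one row per image first:

```python
import numpy as np

# Hypothetical data: 5 images, 2 objects each, (x, y) coords per object.
n_images = 5
X3d = np.random.rand(n_images, 2, 2)        # shape (5, 2, 2)
y = np.random.randint(0, 2, (n_images, 2))  # one index per object

# Flatten each image's coordinates into a single feature row.
X2d = X3d.reshape(n_images, -1)             # shape (5, 4)
print(X2d.shape)  # (5, 4)

# X2d plus the 2-column y can now be passed to an estimator that supports
# multi-output targets, e.g. DecisionTreeRegressor.fit(X2d, y).
```

So the answer is generally yes, provided the 3-D array is reshaped to 2-D and the chosen estimator supports multi-output y.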

Image convolution with a 4D Kernel in opencv
Given an image, I'm trying to apply convolution with a (3 x 3 x 3 x 64) kernel:
cv2.filter2D(img, -1, np.random.rand(3, 3, 3, 64))
Gives:
error: /Users/travis/build/skvark/opencv-python/opencv/modules/imgproc/src/filterengine.hpp:363: error: (-215) anchor.inside(Rect(0, 0, ksize.width, ksize.height)) in function normalizeAnchor
In fact in the documentation it says:
kernel – convolution kernel (or rather a correlation kernel), a single-channel floating point matrix; if you want to apply different kernels to different channels, split the image into separate color planes using split() and process them individually.
Is there any other opencv function that can convolve a kernel with more than two dimensions, or do I have to use two for loops applying filter2D?
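The per-channel processing the docs describe can be sketched as follows. To stay self-contained, a tiny NumPy cross-correlation helper stands in for cv2.filter2D (this illustrates the bookkeeping, not OpenCV's implementation):

```python
import numpy as np

def correlate2d_valid(channel, kernel):
    """Minimal 2-D cross-correlation ('valid' region), standing in for cv2.filter2D."""
    kh, kw = kernel.shape
    h, w = channel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(channel[i:i+kh, j:j+kw] * kernel)
    return out

img = np.random.rand(8, 8, 3)           # H x W x 3 image
kernels = np.random.rand(3, 3, 3, 64)   # 3x3 spatial, 3 in-channels, 64 out-channels

# One output map per output channel: filter each colour plane with its own
# 2-D slice of the 4-D kernel and sum the planes (a conv layer's forward pass).
out = np.stack([
    sum(correlate2d_valid(img[:, :, c], kernels[:, :, c, k]) for c in range(3))
    for k in range(64)
], axis=-1)
print(out.shape)  # (6, 6, 64)
```

So yes, in plain OpenCV this ends up as two loops (output channels and colour planes); frameworks with a real conv layer (e.g. a Conv2D with fixed weights) do the same computation in one call.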

Understanding where part of dimension is lost
I'm trying to build something that can upscale an image by a factor of 2 through a Residual Neural Network. Here is the current code:
filters = 256
kernel_size = 3
strides = 1

inputLayer = Input(shape=(img_height, img_width, img_depth))
conv1 = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=strides)(inputLayer)

res = Conv2D(filters=filters,
             kernel_size=kernel_size,
             padding='same',  # (kernel_size // 2)
             strides=strides,
             activation='relu')(conv1)
res = Conv2D(filters=256,
             kernel_size=3,
             padding='same',  # (kernel_size // 2)
             strides=strides)(res)
res = Add()([conv1, res])

for i in range(31):  # 32 - 1
    res1 = Conv2D(filters=filters,
                  kernel_size=kernel_size,
                  padding='same',  # (kernel_size // 2)
                  strides=strides,
                  activation='relu')(res)
    res = Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 padding='same',  # (kernel_size // 2)
                 strides=strides)(res)
    res = Add()([res1, res])

conv2 = Conv2D(256, 3, 1)(res)
a = Add()([conv1, conv2])
up = UpSampling2D(size=2)(a)
outputLayer = Conv2D(filters=3, kernel_size=1, strides=1)(up)

model = Model(inputs=inputLayer, outputs=outputLayer)
Somehow, I get this error:
ValueError: Operands could not be broadcast together with shapes (1398, 1398, 256) (1396, 1398, 256)
The initial input is 1400x1400x3. What exactly is making me lose part of those 1400 pixels?
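For reference, the size arithmetic of an unpadded ('valid') convolution can be checked directly. This is a small sketch of my own, but the formula out = (in + 2*padding - kernel) // stride + 1 is the standard one:

```python
def conv_out_size(size, kernel=3, stride=1, padding=0):
    """Standard output-size formula for a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Each unpadded 3x3 convolution trims 2 pixels per spatial dimension:
print(conv_out_size(1400))                 # 1398
print(conv_out_size(conv_out_size(1400)))  # 1396
# Adding a 1398-wide tensor to a 1396-wide one then fails to broadcast,
# which matches the shapes in the error message above.
```

In the code in question, conv1 and conv2 both use the default padding='valid', while the residual blocks use padding='same', which is why the two Add() operands end up 2 pixels apart.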
Can we apply feature scaling to "independent variable" in a dataset?
I have a dataset with 8 dependent variables (2 of them categorical). I have applied ExtraTreeClassifier() to eliminate some of the dependent variables. I have also feature-scaled X and y:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)
X = sc.transform(X)
y = sc.fit_transform(y)
y = sc.transform(y)
And after this I have split the dataset like
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_new, encoded2, test_size=0.25, random_state=0)
And now I am applying the DecisionTreeRegressor algorithm for prediction. But I want the actual prediction (right now I am getting scaled values). How do I do that? Is there any other approach? The way I have done it gives RMSE = 0.02, and without feature-scaling the dependent variable, RMSE = 18.4. Please suggest how to solve this kind of problem.
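On getting predictions back on the original scale: StandardScaler computes x_scaled = (x - mean) / std, so its inverse_transform simply undoes that. A minimal numpy sketch of the arithmetic (made-up numbers, sklearn itself not required here):

```python
import numpy as np

y = np.array([[10.0], [20.0], [30.0], [40.0]])
mu, sigma = y.mean(axis=0), y.std(axis=0)

y_scaled = (y - mu) / sigma      # what StandardScaler.fit_transform does
y_back = y_scaled * sigma + mu   # what StandardScaler.inverse_transform does

print(np.allclose(y_back, y))  # True
```

With sklearn this is simply y_pred_actual = sc_y.inverse_transform(y_pred_scaled), using the same scaler that was fitted on y. Note also that comparing RMSE across scaled and unscaled targets is misleading, since the units differ.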
Estimator API: AttributeError: 'NoneType' object has no attribute 'dtype'
I have already looked up the previous answers to this problem but it has not been resolved yet. I am implementing a YOLO algorithm (for object detection) from scratch and am having problem in training part.
For training, I am using the tf.estimator API, with code similar to the CNN MNIST example in tensorflow. I am getting the following error:
Traceback (most recent call last):
  File "recover_v3.py", line 663, in <module>
    model.train(input_fn=train_input_fn, steps=1)
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 376, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1145, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1170, in _train_model_default
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1133, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "recover_v3.py", line 584, in cnn_model_fn
    loss=loss, global_step=tf.train.get_global_step())
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 400, in minimize
    grad_loss=grad_loss)
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 494, in compute_gradients
    self._assert_valid_dtypes([loss])
  File "/home/nyummvc019/miniconda3/envs/tf_0/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 872, in _assert_valid_dtypes
    dtype = t.dtype.base_dtype
AttributeError: 'NoneType' object has no attribute 'dtype'
The code related to the loss function in the main file is shown below (similar to the official CNN MNIST example):
if mode == tf.estimator.ModeKeys.TRAIN:
    # This gives the LOSS for each image in the batch.
    # It is importing the loss function from another file (called loss_fn).
    # Apparently it returns None (not sure)
    loss = loss_fn.loss_fn(logits, labels)

    optimizer = tf.train.AdamOptimizer(learning_rate=params["learning_rate"])
    train_op = optimizer.minimize(
        loss=loss,
        global_step=tf.train.get_global_step())

    # Wrap all of this in an EstimatorSpec.
    spec = tf.estimator.EstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=train_op,
        eval_metric_ops=None)
    return spec
Previous answers to similar problems suggested that the loss function is returning nothing. However, when I try the loss function with randomly generated arrays, it works fine and yields normal values.
Also, if I return a constant like 10.0 from the loss function, I still get the same error.
I am not sure how to proceed now. Also, is there any way I could print the loss returned by the loss function? Apparently, the tf.estimator API starts a tensorflow session by itself, and if I try to create another session (in order to print the value returned by the loss function), I get other errors.

Why is ReLU a nonlinear activation function?
As I understand it, in a deep neural network we apply an activation function (g) after applying the weights (w) and bias (b): z := w * X + b, then a := g(z). So there is a composition of functions (g ∘ z), and the activation function is what lets our model learn functions other than linear functions. I see how the Sigmoid and Tanh activation functions make our model nonlinear, but I have some trouble seeing how ReLU (which takes the max of 0 and z) can make a model nonlinear. If every z were always positive, it would be as if there were no activation function at all.
So my question is: why does ReLU make a neural network model nonlinear?
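Since the question reasons about linearity, here is a small numeric sketch of my own using the standard linearity test, f(u + v) = f(u) + f(v), which ReLU fails:

```python
import numpy as np

relu = lambda z: np.maximum(0.0, z)

u, v = 1.0, -2.0
lhs = relu(u + v)        # relu(-1) = 0
rhs = relu(u) + relu(v)  # 1 + 0   = 1
print(lhs, rhs)          # 0.0 1.0 -> additivity fails, so ReLU is not linear
```

The intuition in the question is right in spirit: on inputs that never cross zero, ReLU does act like the identity. The nonlinearity comes from the kink at 0, and in a trained network different units are active for different inputs, so stacked layers can combine many such kinks into highly non-linear functions.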

How do I find the derivative of softmax in Python
I am having trouble implementing backpropagation for multi-class classification on the CIFAR-10 dataset.
My neural network has 2 layers
forward propagation
X -> L1 -> L2
The weights w2 are initialized randomly:
np.random.randn(L1.shape[0], X.shape[0]) * 0.01
X is input of size (no_features * number of examples)
Z1 = (w1 * x) + b1
A1 = relu(Z1)
L1 has ReLu activation
Z2 = (w2 * A1) + b2
A2 = softmax(Z2)
L2 has softmax activation
The cost is calculated using this equation:
cost = -(1/m)*np.sum((Y * np.log(A2)) + ((1 - Y)*np.log(1 - A2)))
back propagation
The derivative of the cost is calculated as:
dA2 = -(1/m)*(np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))
where dA2 is the derivative of the cost with respect to A2, and Y is the one-hot encoded true values.
Now how do I proceed from here? How do I find dZ2 (the derivative with respect to Z2) using dA2?
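For the derivative of softmax itself, the standard Jacobian is J[i][j] = s_i * (δ_ij - s_j). Here is a hedged numpy sketch of my own (function names are not from the question), verified against a numerical gradient:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shifted for numerical stability
    return e / e.sum()

def softmax_jacobian(z):
    """dA/dZ for softmax: J[i, j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    return np.diag(s) - np.outer(s, s)

z = np.array([0.5, -1.0, 2.0])
J = softmax_jacobian(z)

# Numerical check: perturb each z_j and compare with the analytic Jacobian.
eps = 1e-6
J_num = np.stack(
    [(softmax(z + eps * np.eye(3)[j]) - softmax(z - eps * np.eye(3)[j])) / (2 * eps)
     for j in range(3)],
    axis=1)
print(np.allclose(J, J_num, atol=1e-6))  # True
```

Given dA2, the chain rule then gives dZ2 = J.T @ dA2 per example; with the usual categorical cross-entropy cost for softmax this collapses to the well-known dZ2 = A2 - Y.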

AttributeError: 'NoneType' object has no attribute '_inbound_nodes' while trying to add multiple keras Dense layers
I'm trying to add multiple Dense layers together, like this:
def tst_1():
    inputs = Input((3, 1000, 1))

    dense10 = Dense(224, activation='relu')(inputs[0, :, 1])
    dense11 = Dense(112, activation='relu')(dense10)
    dense12 = Dense(56, activation='relu')(dense11)

    dense20 = Dense(224, activation='relu')(inputs[1, :, 1])
    dense21 = Dense(112, activation='relu')(dense20)
    dense22 = Dense(56, activation='relu')(dense21)

    dense30 = Dense(224, activation='relu')(inputs[2, :, 1])
    dense31 = Dense(112, activation='relu')(dense30)
    dense32 = Dense(56, activation='relu')(dense31)

    flat = keras.layers.Add()([dense12, dense22, dense32])

    dense1 = Dense(224, activation='relu')(flat)
    drop1 = Dropout(0.5)(dense1)
    dense2 = Dense(112, activation='relu')(drop1)
    drop2 = Dropout(0.5)(dense2)
    dense3 = Dense(32, activation='relu')(drop2)
    densef = Dense(1, activation='sigmoid')(dense3)

    model = Model(inputs=inputs, outputs=densef)
    model.compile(optimizer=Adam(), loss='binary_crossentropy', metrics=['accuracy'])
    return model
model = tst_1()
model.summary()
but I got this error:
...
/usr/local/lib/python2.7/dist-packages/keras/engine/network.pyc in build_map(tensor, finished_nodes, nodes_in_progress, layer, node_index, tensor_index)
   1310             ValueError: if a cycle is detected.
   1311         """
-> 1312         node = layer._inbound_nodes[node_index]
   1313
   1314         # Prevent cycles.
AttributeError: 'NoneType' object has no attribute '_inbound_nodes'

Cannot find the reason behind very low accuracy in my Convolutional Net model with Keras
I have built a Convolutional Neural Network model and trained it on the Handwritten Dataset, using epochs=2 and sending the training data in batches of 128, but I cannot find the reason behind its very low accuracy.
The code is :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import keras
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings('ignore')
import tables
from keras.models import Sequential
from keras.utils import np_utils
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Flatten, Dropout, Dense
from keras.utils import to_categorical

#hd=pd.read_hdf('data.h5')
hd=pd.read_csv('../input/handwritten_data_785.csv')
hd.head()

Y=hd.iloc[:,0]
X=hd.iloc[:,1:]
Y=to_categorical(Y)
print("X.shape ",X.shape)
print("Y.shape ",Y.shape)
type(Y)

input_shape=(28,28,1)
n_classes=Y_train.shape[1]
batch_size=128
epochs=2

model=Sequential()
model.add(Conv2D(filters=32,kernel_size=(4,4),strides=(1,1),padding='same',activation='relu',input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))#,strides=(1,1)))
model.add(Conv2D(filters=64,kernel_size=(4,4),strides=(1,1),padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2),strides=(1,1)))
model.add(Flatten())
model.add(Dense(1000,activation='relu'))
model.add(Dense(n_classes,activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.SGD(lr=0.05),metrics=["accuracy"])
model.fit(X_train,Y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(X_test,Y_test))
model.evaluate(X_test,Y_test,verbose=0)
Can anyone point out the reason behind such low accuracy? Have I divided the dataset correctly?
The output Accuracy is :
Train on 279027 samples, validate on 93010 samples
Epoch 1/2
279027/279027 [==============================] - 63s 225us/step - loss: 15.6456 - acc: 0.0293 - val_loss: 15.6455 - val_acc: 0.0293
Epoch 2/2
279027/279027 [==============================] - 58s 208us/step - loss: 15.6455 - acc: 0.0293 - val_loss: 15.6455 - val_acc: 0.0293
[15.64552185918654, 0.02931942801857274]

Image preprocessing in convolutional neural network yields lower accuracy in Keras vs Tflearn
I'm trying to convert this tflearn DCNN sample (using image preprocessing and augmentation) to keras:
Tflearn sample:
import tflearn
from tflearn.data_utils import shuffle, to_categorical
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation

# Data loading and preprocessing
from tflearn.datasets import cifar10
(X, Y), (X_test, Y_test) = cifar10.load_data()
X, Y = shuffle(X, Y)
Y = to_categorical(Y, 10)
Y_test = to_categorical(Y_test, 10)

# Real-time data preprocessing
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()

# Real-time data augmentation
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)

# Convolutional network building
network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)
network = conv_2d(network, 32, 3, activation='relu')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu')
network = conv_2d(network, 64, 3, activation='relu')
network = max_pool_2d(network, 2)
network = fully_connected(network, 512, activation='relu')
network = dropout(network, 0.5)
network = fully_connected(network, 10, activation='softmax')
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy',
                     learning_rate=0.001)

# Train using classifier
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=50, shuffle=True,
          validation_set=(X_test, Y_test),
          show_metric=True, batch_size=96,
          run_id='cifar10_cnn')
This yielded the following results after 50 epochs:
Training Step: 26050 | total loss: 0.35260 | time: 144.306s
| Adam | epoch: 050 | loss: 0.35260 - acc: 0.8785 | val_loss: 0.64622 - val_acc: 0.8212 -- iter: 50000/50000
I then tried to convert it to Keras using the same DCNN layers, parameters and image preprocessing/augmentation:
import numpy as np
from keras.datasets import cifar10
from keras.callbacks import TensorBoard
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D, UpSampling2D, AtrousConvolution2D
from keras.layers.advanced_activations import LeakyReLU, PReLU
from keras.utils import np_utils
from keras.preprocessing.image import ImageDataGenerator
from keras import backend as K
import matplotlib
from matplotlib import pyplot as plt

np.random.seed(1337)

batch_size = 96   # how many images to process at once
nb_classes = 10   # how many types of objects we can detect in this set
nb_epoch = 50     # how long we train the system
img_rows, img_cols = 32, 32  # image dimensions
nb_filters = 32   # number of convolutional filters to use
pool_size = (2, 2)    # size of pooling area for max pooling
kernel_size = (3, 3)  # convolution kernel size

(X_train, Y_train), (X_test, Y_test) = cifar10.load_data()
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 3)
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 3)
input_shape = (img_rows, img_cols, 3)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(Y_train, nb_classes)
Y_test = np_utils.to_categorical(Y_test, nb_classes)

datagen = ImageDataGenerator(featurewise_center=True,
                             featurewise_std_normalization=True,
                             horizontal_flip=True,
                             rotation_range=25)
datagen.fit(X_train)

model = Sequential()
model.add(Conv2D(nb_filters, kernel_size, padding='valid', input_shape=input_shape, activation='relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Conv2D(nb_filters*2, kernel_size, activation='relu'))
model.add(Conv2D(nb_filters*2, kernel_size, activation='relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Set up TensorBoard
tb = TensorBoard(log_dir='./logs')

history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                              epochs=nb_epoch, shuffle=True, verbose=1,
                              validation_data=(X_test, Y_test), callbacks=[tb])

score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print("Accuracy: %.2f%%" % (score[1]*100))

plt.plot(history.epoch, history.history['val_acc'], 'o', label='validation')
plt.plot(history.epoch, history.history['acc'], 'o', label='training')
plt.legend(loc=0)
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.grid(True)
plt.show()
This yielded far worse validation accuracy results:
Epoch 50/50
521/521 [==============================] - 84s 162ms/step - loss: 0.4723 - acc: 0.8340 - val_loss: 3.2970 - val_acc: 0.2729
Test score: 3.2969648239135743
Accuracy: 27.29%
Can anyone help me understand why? Have I misapplied/misunderstood image preprocessing/augmentation in Keras?

Reshaping RGB image is not consistent with Lecture Notes(coursera)
I'm taking the coursera deep learning specialization course and got confused about reshaping an RGB image into an array.
image = np.array([[[ 0.67826139, 0.29380381],
                   [ 0.90714982, 0.52835647],
                   [ 0.4215251 , 0.45017551]],
                  [[ 0.92814219, 0.96677647],
                   [ 0.85304703, 0.52351845],
                   [ 0.19981397, 0.27417313]],
                  [[ 0.60659855, 0.00533165],
                   [ 0.10820313, 0.49978937],
                   [ 0.34144279, 0.94630077]]])
print("image2vector(image) = " + str(image2vector(image)))
Here is an image of shape (3, 3, 2) before turning it into a one-dimensional vector. Here 2 is the number of channels of the image, which are white and black respectively. The instructions of the assignment use X_flatten = X.reshape(X.shape[0], -1).T to reshape the image. We get the following array.
Clearly the procedure turns the original image into a one-dimensional vector in the following order:
pixel 1 W channel followed by pixel 1 B channel,
pixel 2 W channel followed by pixel 2 B channel,
pixel 3 W channel followed by pixel 3 B channel,
pixel 4 W channel followed by pixel 4 B channel,
...
pixel 9 W channel followed by pixel 9 B channel
But this is not consistent with the lecture notes, which flatten the Red channel into an array, then the Green and Blue channels, and finally concatenate them. This has been confusing me for a while. I would appreciate your help. Thank you.
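The two flattening orders can be compared directly in numpy. A small sketch of my own with a 2x2 image and 2 channels (the -1 in reshape means "infer this dimension"):

```python
import numpy as np

# 2x2 image with 2 channels (W, B): entry [i, j, c] is pixel (i, j), channel c.
img = np.arange(8).reshape(2, 2, 2)

# Assignment-style flattening: row-major order interleaves the channels
# (pixel 1 W, pixel 1 B, pixel 2 W, pixel 2 B, ...).
flat_interleaved = img.reshape(-1)
print(flat_interleaved)   # [0 1 2 3 4 5 6 7]

# Lecture-style flattening: all of channel W first, then all of channel B.
flat_by_channel = img.transpose(2, 0, 1).reshape(-1)
print(flat_by_channel)    # [0 2 4 6 1 3 5 7]
```

Both orderings carry exactly the same information; a fully-connected layer does not care which one you pick, as long as you apply the same convention to every example.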

Pipeline for image classification with small objects relative to image size
I have a dataset of images, which are mostly background noise except for tiny regions where there is an object of interest (one per image). Each image has a prediction target of either 0 or 1, indicating the class of the object.
My goal is to train a NN which will perform a binary classification on the object of interest.
What kind of pipeline should I use?
Just an end-to-end CNN?
Is there some simple way of, I dunno, breaking the problem into a) finding the object of interest and b) classifying it?
Can I train and predict better by having my model use sliding windows somehow, even if I do not have labels on each patch of the image?
Examples of problems that fall into this scheme: tumor classification (malignant/benign), distinguishing between planes and birds in the sky near an airport.