How to use an Embedding layer instead of one-hot encoding before passing input
I am able to train my seq2seq model when one-hot encoded input is passed to the fit function. How would I achieve the same thing if the input is not one-hot encoded?
The following code works:
def seqModel():
    latent_dim = 256  # Latent dimensionality of the encoding space.
    encoder_inputs = Input(shape=(None, input_vocab_size))
    encoder = LSTM(latent_dim, return_state=True)
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
    encoder_states = [state_h, state_c]
    decoder_inputs = Input(shape=(None, num_decoder_tokens))
    decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                         initial_state=encoder_states)
    decoder_dense = Dense(num_decoder_tokens, activation='softmax')
    decoder_outputs = decoder_dense(decoder_outputs)
    model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
    return model
def train(data):
    model = seqModel()
    # compile and get data
    model.fit([to_one_hot(input_texts, num_encoder_tokens),
               to_one_hot(target_text, num_decoder_tokens)],
              outputs, batch_size=3, epochs=5)
I am asked not to one-hot encode in the train method. How would I do it in the seqModel method? Is an Embedding layer the right way to replace one-hot encoding?
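An Embedding layer is the usual replacement: strictly speaking it does not one-hot encode, it maps each integer token id directly to a learned dense vector, which is equivalent to a one-hot encoding followed by a dense projection. Below is a sketch of the model rewritten this way (hypothetical vocabulary sizes; it assumes the input and target texts are integer-encoded sequences, and uses sparse_categorical_crossentropy so the targets can stay integer-encoded too):

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

latent_dim = 256
input_vocab_size, output_vocab_size = 1000, 1200  # hypothetical sizes

# Encoder: integer ids -> dense vectors via Embedding (no one-hot needed)
encoder_inputs = Input(shape=(None,))                   # (batch, time) of ids
enc_emb = Embedding(input_vocab_size, latent_dim)(encoder_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: also consumes integer ids through its own Embedding
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(output_vocab_size, latent_dim)(decoder_inputs)
decoder_outputs = LSTM(latent_dim, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
decoder_outputs = Dense(output_vocab_size, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
# sparse loss: targets are integer ids of shape (batch, time), not one-hot
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

With this setup, fit can be called on the raw integer sequences directly, with the shifted target sequence as labels, and no to_one_hot call is needed anywhere.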
See also questions close to this topic

How to run the multigp code in Matlab?
The multigp code is at https://github.com/SheffieldML/multigp. After I downloaded it, I copied it into my Matlab directory. However, almost none of the code runs. For example, demGgToy1.m fails because there is no mapLoadData function in the multigp directory, and I suspect other scripts fail because of similarly missing functions, so I don't know how to solve it. Are there other packages I need to download and place in the Matlab directory, and how should I set up the path?

How can I change neural network to detect small objects with size 2x2 pixels?
I need an algorithm for detecting objects using neural networks.
I have objects ranging from 2x2 to 30x30 pixels on images with resolution 1024x1024.
Although the objects are small, their context is important, so the receptive field of the final activations should cover that context, but it shouldn't be too large either. That is, a point in the sky is not the same as a point on a wall.
I use neural networks at resolution 1024x1024.
How can I change my neural network to detect small objects with size 2x2 pixels:
- Should I increase the number of filters in the layers?
- Should I increase the convolutional kernel size, for example from the default 3x3 to 5x5?
- Should I use Concat layers, Shortcut (Residual) layers, Inception blocks, or something else?
And what neural network should I use to detect small objects with size 2x2 pixels: Yolo v2, SSD, DenseNet, ResNet, PVANet+, ... ?
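Since the trade-off above hinges on the receptive field of the final activations, it can help to compute it explicitly for a candidate stack of conv/pool layers before choosing kernel sizes. The standard recurrence is rf += (kernel - 1) * jump, jump *= stride, applied layer by layer; a small sketch:

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples, applied in order.
    Returns the receptive field (in input pixels) of one final activation."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump   # each layer widens the field by (k-1) input-jumps
        jump *= s              # striding enlarges the step between activations
    return rf

# e.g. three 3x3 convs with stride 1, then a 2x2 pool with stride 2
rf = receptive_field([(3, 1), (3, 1), (3, 1), (2, 2)])
```

Comparing this number against both the 2x2 object size and the size of the context you consider relevant gives a concrete basis for deciding between deeper stacks, larger kernels, or multi-scale (Inception/shortcut) blocks.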

In TensorFlow, how can one filter a two-dimensional tensor by the unique values of one of its columns?
In TensorFlow, I have a tensor with 512 rows and 2 columns. What I want to do is: filter column 2 of the tensor based on the unique values of column 1, and then, for each unique value of column 1, process the corresponding values of column 2 in an inner loop.
So, as an example, I have a 2-dimensional tensor whose value (after evaluating it in a session) looks like the following:
[[ 509, 270], [ 533, 568], [ 472, 232], ..., [ 6, 276], [ 331, 165], [ 401, 1144]]
509, 533, 472, ... are elements of column 1 and 270, 568, 232, ... are elements of column 2.
Is there a way that I can define the following 2 steps within the graph (not while executing the session)?
get unique values of column 1
for each `unique_value` in column 1:
    values_in_column2 = values in column 2 corresponding to `unique_value` (filter column 2 by `unique_value`)
    some_function(values_in_column2)
I can do the above steps while running the session, but I would like to define them in the graph, which I can then run in a session after defining many subsequent steps.
Is there any way to do this? I appreciate any kind of help in this regard.
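If `some_function` is a reduction (sum, mean, max, ...), the two steps above can be expressed entirely inside the graph with `tf.unique` plus a segment op; for arbitrary per-group functions, `tf.map_fn` over the unique values with `tf.boolean_mask` is an alternative. A sketch in TF 2 style, with a small stand-in tensor in place of the real 512x2 one:

```python
import tensorflow as tf

t = tf.constant([[509, 270],
                 [533, 568],
                 [509, 232],
                 [533, 276]])
col1, col2 = t[:, 0], t[:, 1]

# unique values of column 1, plus an index mapping each row to its group
unique_vals, group_idx = tf.unique(col1)

# apply `some_function` per group; segment ops cover reductions like sum/mean
per_group_sum = tf.math.unsorted_segment_sum(
    col2, group_idx, num_segments=tf.size(unique_vals))
```

All of these ops are ordinary graph ops, so the same code works inside a TF 1 graph definition and can be extended with further steps before any session is run.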
Keras: Is it okay to define a model inside of a call function of a custom layer?
Is it feasible to define a model containing custom layers with trainable weights inside another custom layer that has no trainable weights of its own?
Basically this means that I have:
1. A model containing one custom layer named custom_layer_main
2. In this custom_layer_main (in the call function) I define a shared-weights model (model_shared_weights) that operates the same way on 2 input images
3. The output tensors of model_shared_weights are needed for the operations of custom_layer_main
This means that my main model contains one custom layer (custom_layer_main), but in the call function of this custom layer I define a shared-weights model so that two branches operate on the two images.
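For what it's worth, a common pattern (a sketch of one way to do it, not the only one) is to create the shared-weights sub-model once in `__init__` and only apply it in `call`, so its weights are created and tracked a single time instead of on every forward pass:

```python
import tensorflow as tf

class CustomLayerMain(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # build the shared-weights sub-model ONCE here, not inside call()
        self.shared = tf.keras.Sequential([
            tf.keras.layers.Conv2D(8, 3, padding='same', activation='relu'),
            tf.keras.layers.GlobalAveragePooling2D(),
        ])

    def call(self, inputs):
        img_a, img_b = inputs
        # the same sub-model (same weights) is applied to both branches
        return self.shared(img_a) - self.shared(img_b)
```

Because the sub-model is an attribute of the layer, its trainable weights are automatically collected into the outer model, which is usually what breaks when the sub-model is constructed inside call.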

Regarding converting 1D to 2D in TensorFlow
Given one-dimensional data, how can I reshape it into a 2D matrix so that I can leverage the existing 2D convolution in TensorFlow?
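A plain tf.reshape is enough, as long as the target height x width equals the original length and the result is given the [batch, height, width, channels] layout that tf.nn.conv2d expects. A sketch with a hypothetical length-1024 series:

```python
import tensorflow as tf

signal = tf.range(1024, dtype=tf.float32)        # a 1-D series of length 1024
# tf.nn.conv2d expects [batch, height, width, channels]
image_like = tf.reshape(signal, [1, 32, 32, 1])  # 32 * 32 == 1024
```

Note that reshaping imposes an artificial 2D neighborhood on the data; if the series is genuinely one-dimensional, tf.nn.conv1d (or a [1, 1, length, 1] reshape with 1xk kernels) often makes more sense.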

Weird Nan loss for custom Keras loss
I'm trying to implement a custom loss in Keras but can't get it to work.
I have implemented it in numpy and with keras.backend:
def log_rmse_np(y_true, y_pred):
    d_i = np.log(y_pred) - np.log(y_true)
    loss1 = np.sum(np.square(d_i)) / np.size(d_i)
    loss2 = np.square(np.sum(d_i)) / (2 * np.square(np.size(d_i)))
    loss = loss1 - loss2
    print('np_loss = %s - %s = %s' % (loss1, loss2, loss))
    return loss

def log_rmse(y_true, y_pred):
    d_i = K.log(y_pred) - K.log(y_true)
    loss1 = K.mean(K.square(d_i))
    loss2 = K.square(K.sum(K.flatten(d_i), axis=-1)) / \
        (K.cast_to_floatx(2) * K.square(K.cast_to_floatx(K.int_shape(K.flatten(d_i))[0])))
    loss = loss1 - loss2
    return loss
When I test and compare the losses with the following function everything seems to work just fine.
def check_loss(_shape):
    if _shape == '2d':
        shape = (6, 7)
    elif _shape == '3d':
        shape = (5, 6, 7)
    elif _shape == '4d':
        shape = (8, 5, 6, 7)
    elif _shape == '5d':
        shape = (9, 8, 5, 6, 7)
    y_a = np.random.random(shape)
    y_b = np.random.random(shape)
    out1 = K.eval(log_rmse(K.variable(y_a), K.variable(y_b)))
    out2 = log_rmse_np(y_a, y_b)
    print('shapes:', str(out1.shape), str(out2.shape))
    print('types: ', type(out1), type(out2))
    print('log_rmse: ', np.linalg.norm(out1))
    print('log_rmse_np: ', np.linalg.norm(out2))
    print('difference: ', np.linalg.norm(out1 - out2))
    assert out1.shape == out2.shape
    #assert out1.shape == shape[1]

def test_loss():
    shape_list = ['2d', '3d', '4d', '5d']
    for _shape in shape_list:
        check_loss(_shape)
        print('======================')

test_loss()
The above code prints:
np_loss = 1.34490449177 - 0.000229461787517 = 1.34467502998
shapes: () ()
types:  <class 'numpy.float32'> <class 'numpy.float64'>
log_rmse:     1.34468
log_rmse_np:  1.34467502998
difference:   3.41081509703e-08
======================
np_loss = 1.68258448859 - 7.67580654591e-05 = 1.68250773052
shapes: () ()
types:  <class 'numpy.float32'> <class 'numpy.float64'>
log_rmse:     1.68251
log_rmse_np:  1.68250773052
difference:   1.42057615005e-07
======================
np_loss = 1.99736933814 - 0.00386228512295 = 1.99350705302
shapes: () ()
types:  <class 'numpy.float32'> <class 'numpy.float64'>
log_rmse:     1.99351
log_rmse_np:  1.99350705302
difference:   2.53924863358e-08
======================
np_loss = 1.95178217182 - 1.60006871892e-05 = 1.95176617114
shapes: () ()
types:  <class 'numpy.float32'> <class 'numpy.float64'>
log_rmse:     1.95177
log_rmse_np:  1.95176617114
difference:   3.78277884572e-08
======================
I never get an exception when I compile and fit my model with this loss, and when I run the model with the 'adam' optimizer everything works fine. However, with this loss Keras keeps showing a nan loss:
Epoch 1/10000
 17/256 [>.............................] - ETA: 124s - loss: nan
Kind of stuck here... Am I doing something wrong?
Using Tensorflow 1.4 on Ubuntu 16.04
Update:
After a suggestion by Marcin Możejko I updated the code, but unfortunately the training loss is still NaN:
def get_log_rmse(normalization_constant):
    def log_rmse(y_true, y_pred):
        d_i = K.log(y_pred) - K.log(y_true)
        loss1 = K.mean(K.square(d_i))
        loss2 = K.square(K.sum(K.flatten(d_i), axis=-1)) / \
            K.cast_to_floatx(2 * normalization_constant ** 2)
        loss = loss1 - loss2
        return loss
    return log_rmse
Then the model is compiled via:
model.compile(optimizer='adam', loss=get_log_rmse(batch_size))
Update 2:
The model summary looks like this:
Layer (type)                 Output Shape           Param #
=================================================================
input_2 (InputLayer)         (None, 160, 256, 3)    0
block1_conv1 (Conv2D)        (None, 160, 256, 64)   1792
block1_conv2 (Conv2D)        (None, 160, 256, 64)   36928
block1_pool (MaxPooling2D)   (None, 80, 128, 64)    0
block2_conv1 (Conv2D)        (None, 80, 128, 128)   73856
block2_conv2 (Conv2D)        (None, 80, 128, 128)   147584
block2_pool (MaxPooling2D)   (None, 40, 64, 128)    0
block3_conv1 (Conv2D)        (None, 40, 64, 256)    295168
block3_conv2 (Conv2D)        (None, 40, 64, 256)    590080
block3_conv3 (Conv2D)        (None, 40, 64, 256)    590080
block3_conv4 (Conv2D)        (None, 40, 64, 256)    590080
block3_pool (MaxPooling2D)   (None, 20, 32, 256)    0
block4_conv1 (Conv2D)        (None, 20, 32, 512)    1180160
block4_conv2 (Conv2D)        (None, 20, 32, 512)    2359808
block4_conv3 (Conv2D)        (None, 20, 32, 512)    2359808
block4_conv4 (Conv2D)        (None, 20, 32, 512)    2359808
block4_pool (MaxPooling2D)   (None, 10, 16, 512)    0
conv2d_transpose_5 (Conv2DTr (None, 10, 16, 128)    1048704
up_sampling2d_5 (UpSampling2 (None, 20, 32, 128)    0
conv2d_transpose_6 (Conv2DTr (None, 20, 32, 64)     131136
up_sampling2d_6 (UpSampling2 (None, 40, 64, 64)     0
conv2d_transpose_7 (Conv2DTr (None, 40, 64, 32)     32800
up_sampling2d_7 (UpSampling2 (None, 80, 128, 32)    0
conv2d_transpose_8 (Conv2DTr (None, 80, 128, 16)    8208
up_sampling2d_8 (UpSampling2 (None, 160, 256, 16)   0
conv2d_2 (Conv2D)            (None, 160, 256, 1)    401
=================================================================
Total params: 11,806,401
Trainable params: 11,806,401
Non-trainable params: 0
Update 3:
Sample y_true:
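For what it's worth, a frequent source of NaN in log-based losses like this one (an assumption here, since the data isn't shown) is y_true or y_pred containing zeros or negative values: K.log then yields -inf/nan, which poisons the gradients even if the forward-pass tests on random positive data succeed. A sketch of the updated loss with an epsilon floor added:

```python
import numpy as np
from tensorflow.keras import backend as K

def get_log_rmse(normalization_constant):
    def log_rmse(y_true, y_pred):
        # floor both tensors at a tiny epsilon so K.log never sees <= 0
        y_true = K.maximum(y_true, K.epsilon())
        y_pred = K.maximum(y_pred, K.epsilon())
        d_i = K.log(y_pred) - K.log(y_true)
        loss1 = K.mean(K.square(d_i))
        loss2 = K.square(K.sum(d_i)) / K.cast_to_floatx(
            2.0 * normalization_constant ** 2)
        return loss1 - loss2
    return log_rmse
```

A quick way to confirm this diagnosis without retraining is to evaluate the original loss on a batch of the real y_true/y_pred values and check whether any nan/inf appears.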

LSTM model for sentiment analysis
I am using this tutorial, https://github.com/rvinas/sentiment_analysis_tensorflow , in order to classify short text messages. In the tutorial, predictions are made with a previously trained model, but I am struggling to find a way to build the test data from input messages.
Thanks for the help.
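At prediction time the raw messages have to go through the same preprocessing as the training data: tokenize, map words to the same integer ids, and pad to the same fixed length. A minimal sketch with a hypothetical vocabulary (in practice the vocabulary and max_len come from the tutorial's training step):

```python
# hypothetical vocabulary built during training: word -> integer id
vocab = {'<pad>': 0, '<unk>': 1, 'great': 2, 'movie': 3, 'bad': 4}
max_len = 6  # same sequence length the model was trained with

def encode_message(message, vocab, max_len):
    # unknown words fall back to the <unk> id, as during training
    ids = [vocab.get(w, vocab['<unk>']) for w in message.lower().split()]
    # pad/truncate to the fixed length the LSTM expects
    return (ids + [vocab['<pad>']] * max_len)[:max_len]
```

A batch of such encoded messages (as a 2-D integer array) can then be fed to the trained model's prediction op exactly like the test split was.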

LSTM Sequence Prediction in Keras just outputs last step in the input
I am currently working with Keras using Tensorflow as the backend. I have a LSTM Sequence Prediction model shown below that I am using to predict one step ahead in a data series (input 30 steps [each with 4 features], output predicted step 31).
model = Sequential()
model.add(LSTM(input_dim=4, output_dim=75, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(output_dim=4))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
return model
The issue I'm having is that after training the model and testing it, even on the same data it trained on, what it outputs is essentially the 30th step of the input. My first thought was that the patterns in my data must be too complex to predict accurately, at least with this relatively simple model, so the best answer it can return is essentially the last element of the input. To limit the possibility of overfitting I've tried turning the training epochs down to 1, but the same behavior appears. I've never observed this behavior before, though, and I have worked with this type of data before with successful results (for context, I'm using vibration data taken from 4 points on a complex physical system that has active stabilizers; the prediction is used in a PID loop for stabilization, hence why, at least for now, I'm using a simpler model to keep things fast).
Does that sound like the most likely cause, or does anyone have another idea? Has anyone seen this behavior before? In case it helps with visualization, here is what the prediction looks like for one vibration point compared to the desired output (note: these screenshots are zoomed-in selections of a very large dataset; as @MarcinMożejko noticed, I did not zoom quite the same both times, so any offset between the images is due to that. The intent is to show the horizontal offset between the prediction and the true data within each image):
...and compared to the 30th step of the input:
Note: Each data point seen by the Keras model is an average over many actual measurements, with the averaging window moved along in time. This is done because the vibration data is extremely chaotic at the smallest resolution I can measure, so instead I use this moving-average technique to predict the larger movements (which are the more important ones to counteract anyway). That is why the offset in the first image appears as many points off instead of just one: it is 'one average', or 100 individual points, of offset.
Edit 1: code used to get from the input datasets X_test, Y_test to the plots shown above
model_1 = lstm.build_model()  # The function above, pulled from another file 'lstm'
model_1.fit(X_test, Y_test, nb_epoch=1)
prediction = model_1.predict(X_test)
temp_predicted_sensor_b = (prediction[:, 0] + 1) * X_b_orig[:, 0]
sensor_b_y = (Y_test[:, 0] + 1) * X_b_orig[:, 0]
plot_results(temp_predicted_sensor_b, sensor_b_y)
plot_results(temp_predicted_sensor_b, X_b_orig[:, 29])
For context:
X_test.shape = (41541, 30, 4)
Y_test.shape = (41541, 4)
X_b_orig is the raw (averaged as described above) data from the b sensor. It is multiplied with the prediction and input data when plotting, to undo the normalization I apply to improve the prediction. It has shape (41541, 30).
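One way to confirm the model has truly collapsed to "repeat the last step" rather than merely resembling it is to compute the MSE of an explicit persistence baseline and compare it against the model's test MSE; near-equal values make the diagnosis concrete. A sketch with random stand-in data (the real X_test/Y_test come from the project):

```python
import numpy as np

rng = np.random.default_rng(0)
X_test = rng.normal(size=(100, 30, 4))   # stand-in for the real (41541, 30, 4)
Y_test = rng.normal(size=(100, 4))       # stand-in for the real (41541, 4)

persistence = X_test[:, -1, :]           # "prediction" = copy of the 30th step
mse_persistence = np.mean((persistence - Y_test) ** 2)

# compare against np.mean((model.predict(X_test) - Y_test) ** 2):
# if the two MSEs are nearly equal, the model has effectively learned
# the identity/persistence solution
```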
Edit 2
Here is a link to a complete project setup to demonstrate this behavior:

Transform Keras model to CNTK model
We are trying to change this example, https://docs.microsoft.com/en-us/azure/machine-learning/preview/scenario-tdsp-biomedical-recognition , to use a CNTK model instead of a Keras model.
This is the link to the original code: https://github.com/Azure/MachineLearningSamples-BiomedicalEntityExtraction . It is divided into several sections (01_data_acquisition_and_understanding, 02_modeling, 03_deployment). We are going to focus on the second section, 02_modeling, and the 02_model_creation part.
To do that, the model, which was constructed this way:
self.model = Sequential()
self.model.add(Embedding(self.wordvecs.shape[0], self.wordvecs.shape[1],
                         input_length=train_X.shape[1],
                         weights=[self.wordvecs], trainable=False))
for i in range(0, num_layers):
    if network_type == 'unidirectional':
        self.model.add(LSTM(num_hidden_units, return_sequences=True))
    else:
        self.model.add(Bidirectional(LSTM(num_hidden_units, return_sequences=True)))
    self.model.add(Dropout(dropout))
self.model.add(TimeDistributed(Dense(train_Y.shape[2], activation='softmax')))
is changed for this implementation:
def __create_model(self, features, num_classes, num_hidden_units, dropout):
    embedding = C.layers.Embedding(weights=self.wordvecs)(features)
    unidirectional1 = C.layers.Recurrence(C.layers.LSTM(num_hidden_units))(embedding)
    bidirectional1 = C.layers.Recurrence(C.layers.LSTM(num_hidden_units),
                                         go_backwards=True)(embedding)
    splice1 = C.splice(unidirectional1, bidirectional1)
    dropout1 = C.layers.Dropout(dropout)(splice1)
    unidirection2 = C.layers.Recurrence(C.layers.LSTM(num_hidden_units))(dropout1)
    bidirectional2 = C.layers.Recurrence(C.layers.LSTM(num_hidden_units),
                                         go_backwards=True)(dropout1)
    splice2 = C.splice(unidirection2, bidirectional2)
    dropout2 = C.layers.Dropout(dropout)(splice2)
    last = C.sequence.last(dropout2)
    model = C.layers.Dense(num_classes)(last)
    return model
where num_classes is equal to train_Y.shape[2] in the previous code and features is C.sequence.input_variable(self.wordvecs.shape[0]).
self.wordvecs has shape (66962, 50), train_X has shape (15380, 613), and train_Y has shape (15380, 613, 8).
But we get the following error when training starts:
The trailing dimensions of the Value shape '[0 x 613]' do not match the Variable 'Input('Input3', [#, *], [66962])' shape '[66962]'.
The error makes sense, because the embedding layer uses the weights of self.wordvecs, with 66962 entries, while the training data has another shape: 15380 sentences with a maximum length of 613. However, how can we express that wordvecs contains the vocabulary to use and that the training data holds indexes into that vocabulary?
This link, https://github.com/Crisgonmu/EntityExtractorCNTK , has the code to reproduce the issue. The model creation is in the EntityExtractor.py file.
Thanks

How to get one hot encoding of specific words in a text in Pandas?
Let's say I have a dataframe and a list of words, i.e.
toxic = ['bad', 'horrible', 'disguisting']
df = pd.DataFrame({'text': ['You look horrible',
                            'You are good',
                            'you are bad and disguisting']})
main = pd.concat([df, pd.DataFrame(columns=toxic)]).fillna(0)
samp = main['text'].str.split().apply(lambda x: [i for i in toxic if i in x])
for i, j in enumerate(samp):
    for k in j:
        main.loc[i, k] = 1
This leads to:
   bad  disguisting  horrible                         text
0    0            0         1            You look horrible
1    0            0         0                 You are good
2    1            1         0  you are bad and disguisting
This is a bit faster than get_dummies, but for-loops in pandas are not desirable when there is a huge amount of data.
I tried str.get_dummies, but it one-hot encodes every word in the series, which makes it a bit slower:
pd.concat([df, main['text'].str.get_dummies(' ')[toxic]], 1)

                          text  bad  horrible  disguisting
0            You look horrible    0         1            0
1                 You are good    0         0            0
2  you are bad and disguisting    1         0            1
If I try the same with sklearn:
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
le.fit(toxic)
main['text'].str.split().apply(le.transform)

this leads to ValueError: y contains new labels. Is there a way to ignore that error in sklearn? And how can I improve the speed of achieving the same result; is there any other fast way of doing this?
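One loop-free alternative (a sketch; it assumes the toxic words are plain tokens, so \b word boundaries match them) is a single vectorized str.contains pass per word, avoiding the Python-level row loop entirely:

```python
import pandas as pd

toxic = ['bad', 'horrible', 'disguisting']
df = pd.DataFrame({'text': ['You look horrible',
                            'You are good',
                            'you are bad and disguisting']})

# one vectorized containment check per toxic word; no per-row Python loop
for w in toxic:
    df[w] = df['text'].str.contains(rf'\b{w}\b', case=False).astype(int)
```

This still makes one pass per word in the toxic list, but each pass is vectorized in pandas, which typically scales much better than iterating over rows.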

Python sklearn OneHotEncoding categorical and sometimes repeated values
This is my problem with sklearn's OneHotEncoder.
With an array a = [1,2,3,4,5,6,7,8,9,22], i.e. ALL UNIQUE, of a.shape=[10,1] (after reshape(-1,1)), a [10,10] matrix of one-hot encoded values is returned:
array([[ 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [ 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [ 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]])
But with an array like a = [1,2,2,4,4,6,7,8,9,22], i.e. NON-UNIQUE, of a.shape=[10,1] (after reshape(-1,1)), a [10,8] matrix of one-hot encoded values is returned:
array([[ 1., 0., 0., 0., 0., 0., 0., 0.],
       [ 0., 1., 0., 0., 0., 0., 0., 0.],
       [ 0., 1., 0., 0., 0., 0., 0., 0.],
       [ 0., 0., 1., 0., 0., 0., 0., 0.],
       [ 0., 0., 1., 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 1., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 1., 0., 0., 0.],
       [ 0., 0., 0., 0., 0., 1., 0., 0.],
       [ 0., 0., 0., 0., 0., 0., 1., 0.],
       [ 0., 0., 0., 0., 0., 0., 0., 1.]])
But I cannot use this, as my input placeholder expects a [10,10] matrix as input. Can anyone help me handle non-unique values in sklearn's OneHotEncoder?
P.S. Adding the parameter n_values=10 gives an error saying:
ValueError: Feature out of bounds for n_values=10
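In current scikit-learn the column count can be pinned by listing the full category set explicitly via the categories= parameter (the successor of the deprecated n_values=); the output then stays [10,10] regardless of which values actually occur in a given array. A sketch:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# fix the full set of expected values up front (must cover 22, which is why
# n_values=10 failed: 22 is out of bounds for the range 0..9)
cats = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 22]]
enc = OneHotEncoder(categories=cats)

a = np.array([1, 2, 2, 4, 4, 6, 7, 8, 9, 22]).reshape(-1, 1)
onehot = enc.fit_transform(a).toarray()
# onehot.shape is (10, 10) even though only 8 distinct values appear
```

Repeated input values simply map to the same one-hot row, and categories absent from a particular batch keep an all-zero column, so the matrix shape matches the placeholder every time.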