How to use an Embedding layer instead of one-hot encoding before passing input
I am able to train my seq2seq model when one-hot encoded input is passed to the fit function. How would I achieve the same thing if the input is not one-hot encoded?
The following code works:
def seqModel():
    latent_dim = 256  # Latent dimensionality of the encoding space.
    encoder_inputs = Input(shape=(None, num_encoder_tokens))
    encoder = LSTM(latent_dim, return_state=True)
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
    encoder_states = [state_h, state_c]
    decoder_inputs = Input(shape=(None, num_decoder_tokens))
    decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                         initial_state=encoder_states)
    decoder_dense = Dense(num_decoder_tokens, activation='softmax')
    decoder_outputs = decoder_dense(decoder_outputs)
    model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
    return model
def train(data):
    model = seqModel()
    # compile and get data
    model.fit([to_one_hot(input_texts, num_encoder_tokens),
               to_one_hot(target_text, num_decoder_tokens)],
              outputs, batch_size=3, epochs=5)
I am asked not to one-hot encode in the train method. How would I do it in the seqModel method? Is Embedding the right way to one-hot encode?
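Strictly speaking, an Embedding layer does not one-hot encode: it maps integer token ids directly to learned dense vectors, which removes the need for one-hot input altogether. A minimal sketch of how the model above could take integer sequences instead (the vocabulary sizes here are made-up stand-ins, not taken from the question):

```python
# Sketch: feed integer-encoded sequences; Embedding maps each token id to a
# dense vector, so no one-hot arrays are built anywhere.
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding
from tensorflow.keras.models import Model

input_vocab_size = 1000   # assumption
output_vocab_size = 1200  # assumption
latent_dim = 256

# Encoder: (batch, timesteps) integer ids -> dense vectors -> final states.
encoder_inputs = Input(shape=(None,))
enc_emb = Embedding(input_vocab_size, latent_dim, mask_zero=True)(encoder_inputs)
_, state_h, state_c = LSTM(latent_dim, return_state=True)(enc_emb)

# Decoder: same idea, initialised with the encoder states.
decoder_inputs = Input(shape=(None,))
dec_emb = Embedding(output_vocab_size, latent_dim, mask_zero=True)(decoder_inputs)
decoder_outputs, _, _ = LSTM(latent_dim, return_sequences=True,
                             return_state=True)(dec_emb,
                                                initial_state=[state_h, state_c])
decoder_outputs = Dense(output_vocab_size, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
# Targets can also stay as integer ids with sparse_categorical_crossentropy,
# so nothing needs one-hot encoding in train() either.
model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')
```

With this setup fit() receives plain integer arrays of shape (batch, timesteps) for both encoder and decoder inputs.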
See also questions close to this topic

Multivariate Gaussian distribution
I was going through Andrew Ng's machine learning course and was a bit confused about the difference between the Gaussian distribution and the multivariate Gaussian distribution. As I understand it, the multivariate Gaussian distribution is for models where we have multiple features like x1, x2, ..., xn, but the same thing could be done with the original (univariate) model. I know I am totally confused here.
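The two coincide exactly when the covariance matrix is diagonal: then the multivariate density is just the product of independent univariate Gaussians. The extra power of the multivariate form is modelling correlations between features. A small numeric check (values are arbitrary):

```python
# Sketch: with a diagonal covariance, the multivariate Gaussian density equals
# the product of independent univariate densities; adding an off-diagonal
# (correlation) term changes the density, which no univariate product captures.
import numpy as np
from scipy.stats import norm, multivariate_normal

x = np.array([0.5, -1.0])
mu = np.array([0.0, 0.0])

# Diagonal covariance: identical to modelling x1 and x2 separately.
diag_cov = np.diag([1.0, 4.0])
p_joint = multivariate_normal.pdf(x, mean=mu, cov=diag_cov)
p_indep = norm.pdf(x[0], 0.0, 1.0) * norm.pdf(x[1], 0.0, 2.0)
assert np.isclose(p_joint, p_indep)

# Correlated covariance (still positive definite): a different density.
corr_cov = np.array([[1.0, 1.5], [1.5, 4.0]])
p_corr = multivariate_normal.pdf(x, mean=mu, cov=corr_cov)
assert not np.isclose(p_corr, p_indep)
```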

Predict future values after using polynomial regression in python
I'm currently using TensorFlow and scikit-learn to try to make a model that can predict the amount of sales of a certain product, X, based on the outdoor temperature in Celsius.
I took my dataset for the temperature and set it as the x variable, and the amount of sales as the y variable. As seen in the picture below, there is some correlation between the temperature and the amount of sales:
First and foremost, I tried to do linear regression to see how well it'd fit. This is the code for that:
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(x_train, y_train)  # fit tries to fit the x variable and y variable

# Let's try to plot it out.
y_pred = model.predict(x_train)
plt.scatter(x_train, y_train)
plt.plot(x_train, y_pred, 'r')
plt.legend(['Predicted Line', 'Observed data'])
plt.show()
This resulted in a predicted line that had a pretty poor fit:
A very nice feature of sklearn, however, is that you can try to predict a value based on a temperature, so if I were to write
model.predict(15)
I'd get the output
array([6949.05567873])
This is exactly what I want; I just wanted the line to fit better, so instead I tried polynomial regression with sklearn by doing the following:
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=8, include_bias=False)  # the bias avoids the need for an intercept column
x_new = poly.fit_transform(x_train)

new_model = LinearRegression()
new_model.fit(x_new, y_train)

# plotting
y_prediction = new_model.predict(x_new)  # this actually predicts x...?
plt.scatter(x_train, y_train)
plt.plot(x_new[:, 0], y_prediction, 'r')
plt.legend(['Predicted line', 'Observed data'])
plt.show()
The line seems to fit better now:
My problem, however, is that I can't use new_model.predict(x), since it results in "ValueError: shapes (1,1) and (8,) not aligned: 1 (dim 1) != 8 (dim 0)". I understand that this is because I'm using a degree-8 polynomial, but is there any way for me to predict the y-axis value based on ONE temperature using the polynomial regression model?
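The model was trained on the 8 expanded polynomial features, so a single temperature has to go through the same fitted transformer before predict. A sketch (the x_train/y_train here are made-up stand-ins, since the question's data is not shown):

```python
# Sketch: reuse the fitted PolynomialFeatures transformer so one temperature is
# expanded to the same 8 features the regression was trained on.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Stand-in training data (assumption; replace with the real x_train/y_train).
x_train = np.linspace(0.0, 30.0, 50).reshape(-1, 1)
y_train = 200 + 15 * x_train.ravel() ** 2

poly = PolynomialFeatures(degree=8, include_bias=False)
x_new = poly.fit_transform(x_train)
new_model = LinearRegression().fit(x_new, y_train)

# Predicting for ONE temperature: transform first, then predict.
one_temp = poly.transform([[15]])         # shape (1, 8)
prediction = new_model.predict(one_temp)  # shape (1,)
```

The key point is that predict always expects the same feature layout as fit, so the raw scalar 15 must become the row [15, 15**2, ..., 15**8] via poly.transform.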

How to solve a ResourceExhaustedError error in Python reinforcement learning
I use Python 3.6.5, TensorFlow, and Spyder. My GPU is an NVIDIA GT 750M. I am implementing a policy gradient with gym's MsPacman-v0, but I get an error when I run it:
ResourceExhaustedError: OOM when allocating tensor with shape[408576,32]
    [[Node: gradients/dense/MatMul_grad/MatMul_1 = MatMul[T=DT_FLOAT, transpose_a=true, transpose_b=false,
      _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, gradients/dense/BiasAdd_grad/tuple/control_dependency)]]

Caused by op 'gradients/dense/MatMul_grad/MatMul_1', defined at:
  File "/home/xuenzhu/anaconda3/lib/python3.6/site-packages/spyder/utils/ipython/start_kernel.py", line 269, in <module>
    main()
  File "/home/xuenzhu/anaconda3/lib/python3.6/site-packages/spyder/utils/ipython/start_kernel.py", line 265, in main
    kernel.start()
The first offending line:
self.r: self.r_buffer
The second offending line:
model.train()
This is the complete code:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import gym
import sys

MODEL_SAVE_PATH = "MNIST_model/"
MODEL_NAME = "mnist_model"
sys.path.append('.')

class Agent(object):
    def __init__(self, a_space, s_space, **options):
        self.session = tf.Session()
        self.a_space, self.s_space = a_space, s_space
        self.s_buffer, self.a_buffer, self.r_buffer = [], [], []
        self.filter1 = tf.Variable(tf.random_normal([3, 3, 1, 32]))
        self.filter2 = tf.Variable(tf.random_normal([3, 3, 32, 64]))
        self._init_options(options)
        self._init_input()
        self._init_nn()
        self._init_op()

    def _init_input(self):
        self.s = tf.placeholder(tf.float32, [None, 88, 80, 1])
        self.r = tf.placeholder(tf.float32, [None, ])
        self.a = tf.placeholder(tf.int32, [None, ])

    def _init_nn(self):
        # Kernel init.
        w_init = tf.random_normal_initializer(mean=0.0, stddev=0.3)
        hidden0 = tf.nn.conv2d(self.s, self.filter1, strides=[1, 1, 1, 1], padding='VALID')
        relu1 = tf.nn.relu(hidden0)
        hidden1 = tf.nn.conv2d(relu1, self.filter2, strides=[1, 1, 1, 1], padding='VALID')
        relu2 = tf.nn.relu(hidden1)
        pool1 = tf.nn.max_pool(relu2, ksize=[1, 1, 1, 1], strides=[1, 1, 1, 1], padding='SAME')
        hidden11 = pool1.get_shape().as_list()
        # print(hidden11)
        reshaped = tf.reshape(pool1, [-1, hidden11[1] * hidden11[2] * hidden11[3]])
        # Dense 2.
        dense_2 = tf.layers.dense(reshaped, 32, tf.nn.relu, kernel_initializer=w_init)
        dense_3 = tf.layers.dense(dense_2, 200, tf.nn.relu, kernel_initializer=w_init)
        # Action logits.
        self.a_logits = tf.layers.dense(dense_3, self.a_space, kernel_initializer=w_init)
        # Action prob.
        self.a_prob = tf.nn.softmax(self.a_logits)

    def _init_op(self):
        # One-hot action.
        action_one_hot = tf.one_hot(self.a, self.a_space)
        # Calculate cross entropy.
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=action_one_hot, logits=self.a_logits)
        self.loss_func = tf.reduce_mean(cross_entropy * self.r)
        self.train_op = tf.train.AdamOptimizer(self.learning_rate).minimize(self.loss_func)
        self.session.run(tf.global_variables_initializer())

    def _init_options(self, options):
        try:
            self.learning_rate = options['learning_rate']
        except KeyError:
            self.learning_rate = 0.001
        try:
            self.gamma = options['gamma']
        except KeyError:
            self.gamma = 0.95

    def predict(self, state):
        action_prob = self.session.run(self.a_prob, feed_dict={self.s: state[np.newaxis, :]})
        return np.random.choice(range(action_prob.shape[1]), p=action_prob.ravel())

    def save_transition(self, state, action, reward):
        self.s_buffer.append(state)
        self.a_buffer.append(action)
        self.r_buffer.append(reward)

    def train(self):
        # Copy r_buffer
        r_buffer = self.r_buffer
        # Init r_tau
        r_tau = 0
        # Calculate r_tau
        for index in reversed(range(0, len(r_buffer))):
            r_tau = r_tau * self.gamma + r_buffer[index]
            self.r_buffer[index] = r_tau
        # Minimize loss.
        _, loss = self.session.run([self.train_op, self.loss_func], feed_dict={
            self.s: self.s_buffer,
            self.a: self.a_buffer,
            self.r: self.r_buffer
        })
        # self.s_buffer, self.a_buffer, self.r_buffer = [], [], []

import matplotlib.pyplot as plt
# %matplotlib inline

env = gym.make('MsPacman-v0')
env.seed(1)
env = env.unwrapped

model = Agent(env.action_space.n, env.observation_space.shape[0])
r_sum_list, r_episode_sum = [], None
max_reward = 1.0
mspacman_color = np.array([210, 164, 74]).mean()

for episode in range(10000):
    # Reset env.
    s, r_episode = env.reset(), 0
    # Start episode.
    while True:
        # if episode > 80:
        #     env.render()
        # Predict action.
        img = s[1:176:2, ::2]
        img = img.mean(axis=2)
        mspacman_color = np.array([210, 164, 74]).mean()
        img[img == mspacman_color] = 0
        img = (img - 128) / 128 - 1
        s = img.reshape(88, 80, 1)
        a = model.predict(s)
        # Iteration.
        s_n, r, done, _ = env.step(a)
        if done:
            r = -5
        r_episode += r
        # Save transition.
        model.save_transition(s, a, r)
        s = s_n
        if done:
            if r_episode_sum is None:
                r_episode_sum = sum(model.r_buffer)
            else:
                r_episode_sum = r_episode_sum * 0.99 + sum(model.r_buffer) * 0.01
                # r_episode_sum = r_episode_sum * 0.01 + sum(model.r_buffer) * 0.99
            r_sum_list.append(r_episode_sum)
            break
    # Start train.
    model.train()
    r_episode1 = r_episode
    if r_episode1 > max_reward:
        max_reward = r_episode1
        print('the max reward is:', r_episode1)
    if max_reward > 2000.0:
        obs = env.reset()
        while True:
            env.render()
            a = model.predict(obs)
            # Iteration.
            s_n, r, done, _ = env.step(a)
            obs = s_n
            if done:
                break
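One likely contributor, though not a verified diagnosis: train() feeds every stored frame through the conv net in a single session.run, and the buffer-reset line at the end of train() is commented out, so the batch grows with every episode until allocation fails. A framework-free sketch of splitting the episode buffers into bounded mini-batches (the names mirror the question's Agent; the session.run usage is shown only in comments):

```python
def iter_minibatches(s_buf, a_buf, r_buf, batch_size=32):
    """Yield aligned slices of the three episode buffers, at most batch_size long."""
    for start in range(0, len(s_buf), batch_size):
        end = start + batch_size
        yield s_buf[start:end], a_buf[start:end], r_buf[start:end]

# Intended usage inside Agent.train() (an assumption, not run here):
#     for s_mb, a_mb, r_mb in iter_minibatches(self.s_buffer,
#                                              self.a_buffer, self.r_buffer):
#         self.session.run(self.train_op,
#                          feed_dict={self.s: s_mb, self.a: a_mb, self.r: r_mb})
#     # and actually clear the buffers afterwards:
#     self.s_buffer, self.a_buffer, self.r_buffer = [], [], []
```

This keeps the per-step activations bounded at batch_size frames regardless of episode length.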

Keras model.evaluate() failing
I have created a little ConvNet which looks like this:
model = Sequential()
optimizer = Adam()
model.add(Lambda(lambda x: x / 127.5 - 1., input_shape=(28, 28, 1)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(NUM_CLASSES, activation='softmax'))
model.compile(optimizer=optimizer, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
I am training it with data of shape
X_train.shape = (48000, 28, 28, 1)
X_val.shape = (12000, 28, 28, 1)
And it works well.
However, I would now like to test the model using the model.evaluate() function:

score = trained_model.evaluate(X_test, y_test, batch_size=128)
# X_test.shape = (10000, 28, 28, 1)
# y_test.shape = (10000,)
Which results in the following error:
ValueError: Error when checking target: expected dense_2 to have shape (10,) but got array with shape (1,)
I don't quite understand this error, given that I use the same shape for my training, validation and test set.
Would you mind explaining what my error is, and how to fix it?
Many thanks!
Edit: Output of
trained_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lambda_1 (Lambda)            (None, 28, 28, 1)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 26, 26, 64)        640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 64)        0
_________________________________________________________________
dropout_1 (Dropout)          (None, 13, 13, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 10816)             0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1384576
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290
=================================================================
Total params: 1,386,506
Trainable params: 1,386,506
Non-trainable params: 0
Solution given in the comments
I forgot to one-hot encode my y_train, y_val and y_test data. Solved with:

from keras.utils.np_utils import to_categorical
y_train = to_categorical(y_train)

Is Keras 1.2.2 compatible with TensorFlow 1.10.0 or any other recent version of TensorFlow?
Is it required for Keras to match the latest version of TensorFlow, or will Keras 1.2.2 work fine with TensorFlow 1.10.0 as the backend?

Image Augmentation for Imbalanced Dataset - Keras
I am training a model on a dataset which is not balanced. We know that Keras's ImageDataGenerator() works on balanced data only. How do I make it work on my imbalanced data? It's a multi-class classification problem.
The data folder structure is as follows:
Training files: '../data/train'
Test files: '../data/test'

Do I need to separate each class into its own folder inside train & test?
Also, how do I split my training and test data?
Currently I have put 80% in train and 20% in test for each class respectively. Thanks
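For what it's worth, ImageDataGenerator itself is indifferent to class balance; with one sub-folder per class under '../data/train', imbalance is usually handled by passing per-class weights to fit. A sketch with made-up folder counts (the real counts and the generator call are assumptions, shown only in comments):

```python
# Sketch: inverse-frequency class weights, so rarer classes contribute
# proportionally more to the loss. The counts below are stand-ins.
class_counts = {0: 4000, 1: 1000, 2: 250}  # images per class folder (assumption)
total = sum(class_counts.values())
n_classes = len(class_counts)

class_weight = {c: total / (n_classes * n) for c, n in class_counts.items()}

# Intended usage (needs the real directories, so not run here):
# train_gen = ImageDataGenerator(rescale=1./255).flow_from_directory(
#     '../data/train', target_size=(224, 224), class_mode='categorical')
# model.fit_generator(train_gen, class_weight=class_weight, ...)
```

flow_from_directory infers one class per sub-folder, which also answers the folder-structure question: yes, each class needs its own folder inside train and test.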

Mobile phone recommendation system using natural language processing and RNN
I have been researching in the field of natural language processing. I have found an interesting dataset [https://www.kaggle.com/PromptCloudHQ/amazon-reviews-unlocked-mobile-phones]. As of now I have done exploratory data analysis, as explained in this repo, on the above dataset. I want help to build a recommendation system based on the reviews in the dataset. Only mobile-phone-related reviews should be considered for the recommendations. For example, the system should consider "Camera is good" and not "My dad's suggestions are good". Please share some ideas, suggestions and code.
Thanks in advance.

LSTM Training Input Versus Live Evaluation Input - Dynamic RNN?
I am having trouble wrapping my head around RNNs for this problem.
The problem: live binary classification of video using image sequences. Meaning I am receiving a video one image at a time and need to predict either Class A or Class B for the most recent image received.
Current Solution
Training - I use a CNN as a feature extractor on a full sequence of images. I then feed multiple images (lstm-len, cnn-feature-size) into the LSTM.
Live Evaluation - I receive 1 frame at a time and run it through the CNN. I add these new features to a queue of length lstm-len, then I take all the features from the queue and feed them into the LSTM.
What I don't understand
Why is it that I have to keep track of and feed all of the features into the LSTM at evaluation time? The point of an LSTM is to remember past inputs, so it seems redundant to input all the previous images at every time step. What I would like to do is simply calculate the features for the most recent image and feed only those new features into the LSTM, while the LSTM remembers the last lstm-len frames.
Am I using the RNN incorrectly in this case? Should I be able to simply use the previous LSTM state as input into the other LSTM cells and provide feature input for only the newest image?
I'm thinking something like TensorFlow's dynamic_rnn may be the solution to this problem.
Pretty confused about this. Thanks for the help!
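One way to get the carried-over state the question asks about is a stateful LSTM, which keeps its hidden state between predict() calls, so only the newest frame's CNN features need to be fed at evaluation time. A sketch under assumed sizes (cnn_feature_size and the binary head are placeholders, not taken from the question's actual model):

```python
# Sketch: a stateful LSTM carries h and c between predict() calls instead of
# resetting them, so live evaluation can feed one timestep at a time.
import numpy as np
from tensorflow.keras.layers import LSTM, Dense, Input
from tensorflow.keras.models import Model

cnn_feature_size = 128  # assumption: width of the CNN feature vector
latent = 64

# batch_shape pins batch=1 and one timestep per call; stateful=True makes the
# layer remember its state across calls.
inp = Input(batch_shape=(1, 1, cnn_feature_size))
x = LSTM(latent, stateful=True)(inp)
out = Dense(1, activation='sigmoid')(x)
live_model = Model(inp, out)

# Live loop: one frame's features at a time, no queue of past features needed.
for _ in range(5):
    feats = np.random.rand(1, 1, cnn_feature_size).astype('float32')
    p = live_model.predict(feats)  # uses the state remembered from earlier frames

live_model.reset_states()  # call between independent videos
```

For training, the usual pattern is to train an equivalent non-stateful model on full windows and copy its weights into the stateful evaluation model.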

Reinforcement learning - How to deal with a varying number of actions when doing number approximation
I am new to reinforcement learning, but I am trying to use RL for this task:
Given a function definition written e.g. in C with 1 to 10s of input arguments (only numerical ones - integer, float, etc.) and the body of the function (represented as an Abstract Syntax Tree / Abstract Decision Tree with data dependencies - how the internal variable values change), I would like to approximate the values of these input parameters so that e.g. a certain decision block is executed. For this I thought of a recurrent network with LSTM cells.
Now, to achieve this, I would traverse one path in the tree leading to the block and take note of any data changes and decision blocks on the path. These steps would influence my parameter input predictions - what values to insert into/change in the input parameters if I wish to have a certain decision block executed.
Action: Changing the value of one chosen input parameter of the function, OR changing the value of all input parameters individually (each with a different mathematical operation). After action execution, moving on to the next node in the tree.
Reward: How close I am to executing the given decision block (thus satisfying the condition) with given input parameter values.
Goal: Have a condition in the code satisfied and a decision block thus executed (e.g. an if condition is met).
State: Current position in the AST/ADT with data dependencies.
Assuming that I already have a way to evaluate how far I am from executing the wanted decision block given the current input parameter values, I came across two problems:
How would I deal with a varying number of function input parameters in RL? If I want to change their values to be closer to the execution of the wanted decision block, the number of available actions changes with the number of parameters defined for the given function.
If I have already chosen one parameter, what is the best way to do number approximation using RL? In the function body there could be numerous very complex mathematical operations, so should actions be defined as logarithm, exponentiation, division, multiplication, etc., or is there a better way, maybe just adding/subtracting from the current value?
If you find any mistakes in my definitions of the Actions, Reward, Goal or State, please do correct me, as I am still very much a learner in this field.
Thank you for your answers.

Creating One-Hot Encoder: CountVectorizer returns error with ArrayType(IntegerType, true)
I am trying to create a one-hot encoder for the following input data:
+------+----------------+
|userid| categoryIndexes|
+------+----------------+
| 24868|          [7276]|
| 35335|         [12825]|
| 42634|  [14550, 14550]|
| 51183|          [7570]|
| 61065|         [14782]|
| 70292|          [7282]|
| 72326|  [14883, 14877]|
| 96632|         [14902]|
| 99703|         [14889]|
|121994|   [16000, 7417]|
|144782|  [12139, 12139]|
|175886|    [7305, 7305]|
|221451|  [14889, 12139]|
|226945|         [18097]|
|250401|          [7278]|
|256892|    [7383, 5514]|
|270043|          [7442]|
|272338|          [7306]|
|284802|  [18310, 14898]|
+------+----------------+
Referring to "Aggregating a One-Hot Encoded feature in pyspark" and "Encode and assemble multiple features in PySpark", I tried to solve it with
from pyspark.ml.feature import CountVectorizer

df_user_catlist = df_order.groupBy("userid").agg(F.collect_list('level3_cat').alias('categoryIndexes'))
cv = CountVectorizer(inputCol='categoryIndexes', outputCol='categoryVec')
transformed_df = cv.fit(df_user_catlist).transform(df_user_catlist)
transformed_df.show()
But I caught the following error:
IllegalArgumentException: u'requirement failed: Column category must be of type equal to one of the following types: [ArrayType(StringType,true), ArrayType(StringType,false)] but was actually of type ArrayType(IntegerType,true).'
I notice the difference is that the input data is of IntegerType instead of StringType. May I know (a) how I can convert it to StringType, or (b) whether there is a better way to convert it to a one-hot encoding?
See also: How to avoid the dummy variable trap for multiple categories in one column.

One-hot encoded Keras CNN output not as expected
I have a simple problem to solve, in which there are 32 filters which are the same size as the image (1x2048). Therefore, the filter's weights will be multiplied one by one with the pixels rather than convolving over them.
The output for each image is a one-hot vector, for example [1,0,0,0]. When I sum two images and do the prediction, the output will be either [1,0,0,0] or [0,0,1,0].
However, since I have summed the two images, I expect to get [1,0,1,0] as the output, showing that I have both of the classes in the image. Yet I don't know what to do to get what I expect, or where the problem is.
input_shape = (1, 2048, 1)
model = Sequential()
model.add(Conv2D(32, kernel_size=(1, 2048),
                 strides=(1, 1),
                 activation='softmax',
                 input_shape=input_shape,
                 kernel_regularizer=keras.regularizers.l1(L1regularization),
                 kernel_constraint=keras.constraints.non_neg()))
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=optimizer, metrics=[metrics])
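A likely reason for the observed behaviour is the softmax activation: softmax forces the outputs to sum to 1, so two classes can never both be near 1. For a multi-label target like [1,0,1,0], the usual approach is sigmoid with binary_crossentropy, which treats each class as an independent probability. A minimal sketch (simplified to a 4-class head without the question's regularizer/constraint):

```python
# Sketch: sigmoid + binary_crossentropy allows several classes to be "on" at
# once, unlike softmax + categorical_crossentropy.
import numpy as np
from tensorflow.keras.layers import Conv2D, Flatten
from tensorflow.keras.models import Sequential

num_classes = 4  # assumption matching the [1,0,0,0]-style targets

model = Sequential([
    Conv2D(num_classes, kernel_size=(1, 2048), activation='sigmoid',
           input_shape=(1, 2048, 1)),
    Flatten(),
])
model.compile(loss='binary_crossentropy', optimizer='adam')

probs = model.predict(np.random.rand(1, 1, 2048, 1))
present = (probs > 0.5).astype(int)  # multi-hot decision, e.g. [1, 0, 1, 0]
```

Training targets would then be multi-hot vectors ([1,0,1,0] for a summed image) rather than one-hot vectors.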
Thanks.