Using RNN and LSTM for process control with available data measurements
I have 50 sample measurements of an input (temperature) with 600 time steps in each sample. I also have the outputs for these 50 samples in the form of 4 variables (saturation, concentration, lower-bound concentration, and upper-bound concentration).
So the input data is 50 samples, 600 time steps each, and 1 variable or property. The output is 50 samples, 600 time steps each, and 4 variables or properties (and variable 2, the concentration, must always lie between variables 3 and 4, which are its lower and upper bounds, at all times).
I am new to deep learning and LSTMs and am having trouble implementing the LSTM. From what I have read (kindly correct me if I am wrong), I need to use a many-to-many LSTM with batch_size = 50, input_shape = (600,), and seq_len = 1.
How many units of LSTM to use? I am very confused how to implement LSTM in this.
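For reference, here is a minimal sketch (my own, not a prescribed answer) of a many-to-many Keras LSTM for data shaped like this: inputs of shape (samples, 600, 1) mapped to outputs of shape (samples, 600, 4). The unit count of 64 is an arbitrary starting point to tune, not a rule.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, TimeDistributed, Dense

model = Sequential([
    # return_sequences=True emits one output per time step (many-to-many)
    LSTM(64, return_sequences=True, input_shape=(600, 1)),
    # TimeDistributed applies the same Dense(4) at every time step
    TimeDistributed(Dense(4)),
])
model.compile(optimizer='adam', loss='mse')

x = np.random.rand(50, 600, 1)  # 50 samples, 600 steps, 1 feature
y = model.predict(x)
print(y.shape)                  # (50, 600, 4)
```

Note that nothing in this sketch enforces the bound constraint between variables 2, 3, and 4; that would need a custom loss or output parameterization.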
See also questions close to this topic

How to type hint OpenGL objects in Python correctly?
I want to know how to add the right type hints for OpenGL objects in Python, so a reader knows they are OpenGL objects and not something else.
When I print the type of these OpenGL objects, what I get is np.uint32 or int, from which a reader may mistake them for simple numbers instead of OpenGL objects.
I understand that an OpenGL object is actually stored on the GPU; the variable on the CPU just stores its index (or something similar), so I get a number when I print its type.
However, this type hint is unhelpful because a reader cannot tell from it that the variable is an OpenGL object. Here is an example.
vao = glGenVertexArrays(1)
vbo = glGenBuffers(1)
texture = glGenTextures(1)
print(type(vao))      # np.uint32
print(type(vbo))      # np.uint32
print(type(texture))  # int
I want to add type hints like the following code.
vao = glGenVertexArrays(1)  # type: VertexArray
vbo = glGenBuffers(1)       # type: Buffer
texture = glGenTextures(1)  # type: Texture
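One possible sketch, using typing.NewType from the standard library: it gives each handle a distinct name for readers and static type checkers while remaining a plain int at runtime. The names VertexArray, Buffer, and Texture below are my own inventions, not part of PyOpenGL.

```python
from typing import NewType

# distinct names for readers and type checkers; plain ints at runtime
VertexArray = NewType('VertexArray', int)
Buffer = NewType('Buffer', int)
Texture = NewType('Texture', int)

def make_vao() -> VertexArray:
    # stand-in for the raw handle glGenVertexArrays(1) would return
    raw_handle = 1
    return VertexArray(int(raw_handle))

vao: VertexArray = make_vao()
print(type(vao))  # still <class 'int'> at runtime
```

A type checker like mypy will then flag, say, passing a Buffer where a VertexArray is expected, even though both are ints underneath.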

How to programmatically conduct keyword research? Looking for Python scripts to do keyword research for local listings
I'm looking for scripts to do keyword research programmatically.

Is it okay to use the barplot function without specifying `order`?
I'm working on the Titanic dataset based on this Kaggle kernel. On the part where I'm trying to use the barplot function, it gives me a warning message: "UserWarning: Using the barplot function without specifying `order` is likely to produce an incorrect plot." Should I be concerned?
I've tried to specify the order parameter and the hue_order as well.
'''
grid = sns.FacetGrid(train_df, col='Embarked', row='Survived',
                     height=2.2, aspect=1.6)
grid.map(sns.barplot, 'Sex', 'Fare', alpha=0.5, ci=None,
         order=[1, 2, 3], hue_order=['Embarked', 'Survived'])
grid.add_legend()
'''
When I specified order and hue_order, it gave me empty bar plots.
However, when I take out the order and hue_order, it does give me the plots with this warning message:
'''
C:\Users\user\Anaconda3\lib\site-packages\seaborn\axisgrid.py:715: UserWarning: Using the barplot function without specifying `order` is likely to produce an incorrect plot.
  warnings.warn(warning)
'''
Any thoughts or tips that I should know? Thanks in advance!

What is more accurate: an ensemble of CNNs or an ensemble of weak learners with a neural network meta-learner?
Would it be better to use an ensemble of CNNs for medical images, or several weak learners with a neural network meta-learner?
These medical images would be tissue sections or blood smears and I would want to classify the disease subtypes.
Thanks in advance.

How to specify the number of layers in Keras?
I'm trying to define a fully connected neural network in Keras using the TensorFlow backend. I have some sample code but I don't know what it means.
model = Sequential()
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(50, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(20, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
model.add(Dense(y.shape[1], activation='softmax'))
From the above code I want to know the number of inputs to my network, the number of outputs, and the number of hidden layers. And what is the number coming after model.add(Dense, assuming x.shape[1] = 60? What is the name of this network exactly? Should I call it a fully connected network or a convolutional network?
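For comparison, here is a sketch of the same idea written more conventionally (my cleanup, not the original author's code): input_dim only matters on the first layer, and one output layer is enough. With x.shape[1] == 60 this is a fully connected (dense) network, not a convolutional one; the number after Dense( is that layer's unit count.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

n_inputs, n_outputs = 60, 3  # stand-ins for x.shape[1] and y.shape[1]
model = Sequential([
    Dense(10, input_dim=n_inputs, activation='relu'),  # hidden layer 1
    Dense(50, activation='relu'),                      # hidden layer 2
    Dense(20, activation='relu'),                      # hidden layer 3
    Dense(10, activation='relu'),                      # hidden layer 4
    Dense(n_outputs, activation='softmax'),            # output layer
])
print(len(model.layers))   # 5
print(model.output_shape)  # (None, 3)
```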

How come a deep network trained on color images still works with grayscale images?
I have applied OpenPose to grayscale images and it still works. I saw this post, and I guess it is the same with OpenPose (OpenPose uses the first 10 layers of VGG19 to extract feature maps from the input image, and VGG19 is trained on color images). However, I am curious how a deep network pretrained on color images can still work on a grayscale image. How much do the features depend on color (in particular, in the case of OpenPose)? And how much might using grayscale images with such networks degrade the results?
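One mechanical piece of this (my own illustration, not specific to OpenPose) is that a grayscale image is usually fed to an RGB-trained network by replicating it across three channels, so the first convolution effectively sees the sum of its per-channel filters applied to the same image:

```python
import numpy as np

gray = np.random.rand(224, 224)  # single-channel image

# replicate to 3 identical channels so an RGB-expecting network still runs
rgb_like = np.repeat(gray[:, :, None], 3, axis=2)

print(rgb_like.shape)  # (224, 224, 3)
# every channel is the same image
print(np.array_equal(rgb_like[:, :, 0], rgb_like[:, :, 2]))  # True
```

How much accuracy this costs still depends on how color-sensitive the learned filters are, which the question is really asking about.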

Is there any paper about vanishing gradients of LSTM?
Some web pages mention that LSTMs cause vanishing or exploding gradients if the sequence is too long.
These are some of the pages:
 https://machinelearningmastery.com/handle-long-sequences-long-short-term-memory-recurrent-neural-networks/
 How to handle extremely long LSTM sequence length?
However, I couldn't find any paper or formulation for it.
Could you please tell me the references for this problem?

PyTorch LSTM vs LSTMCell
What is the difference between LSTM and LSTMCell in Pytorch (currently version 1.1)? It seems that LSTMCell is a special case of LSTM (i.e. with only one layer, unidirectional, no dropout).
Then what's the purpose of having both implementations? Unless I'm missing something, it's trivial to use an LSTM object as an LSTMCell (or, alternatively, pretty easy to use multiple LSTMCells to create an LSTM object).
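A small sketch of the practical difference: nn.LSTM consumes a whole sequence in one call, while nn.LSTMCell is stepped manually, which is what you want when each step's input depends on the previous output (e.g. autoregressive decoding). With shared weights the two compute the same thing:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=3, hidden_size=5)
cell = nn.LSTMCell(input_size=3, hidden_size=5)
# copy weights: LSTM names its layer-0 params weight_ih_l0 etc.,
# LSTMCell uses the same names without the _l0 suffix
cell.load_state_dict({k.replace('_l0', ''): v
                      for k, v in lstm.state_dict().items()})

x = torch.randn(7, 1, 3)  # (seq_len, batch, input_size)
with torch.no_grad():
    out_lstm, _ = lstm(x)  # whole sequence at once

    # manual unroll, one step at a time
    h = torch.zeros(1, 5)
    c = torch.zeros(1, 5)
    steps = []
    for t in range(7):
        h, c = cell(x[t], (h, c))
        steps.append(h)
    out_cell = torch.stack(steps)

print(torch.allclose(out_lstm, out_cell, atol=1e-6))  # True
```

So LSTMCell buys per-step control at the cost of the fused, faster whole-sequence kernels that nn.LSTM can use.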

How to use an attention layer on a sequence labeling task implemented using LSTMs in TensorFlow?
I've used LSTMCell in TensorFlow to implement a sequence labeling task. My code is based on this example code written by Aymeric Damien. Here are the important parts of the code (full code is here):
# tf Graph input
x = tf.placeholder("float", [None, seq_max_len, 1])
y = tf.placeholder("float", [None, n_classes])
# A placeholder for indicating each sequence length
seqlen = tf.placeholder(tf.int32, [None])

# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_classes]))
}

def dynamicRNN(x, seqlen, weights, biases):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)

    # Unstack to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    x = tf.unstack(x, seq_max_len, 1)

    # Define a lstm cell with tensorflow
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden)

    # Get lstm cell output, providing 'sequence_length' will perform dynamic
    # calculation.
    outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, x, dtype=tf.float32,
                                                sequence_length=seqlen)

    # When performing dynamic calculation, we must retrieve the last
    # dynamically computed output, i.e., if a sequence length is 10, we need
    # to retrieve the 10th output.
    # However TensorFlow doesn't support advanced indexing yet, so we build
    # a custom op that for each sample in batch size, get its length and
    # get the corresponding relevant output.

    # 'outputs' is a list of output at every timestep, we pack them in a Tensor
    # and change back dimension to [batch_size, n_step, n_input]
    outputs = tf.stack(outputs)
    outputs = tf.transpose(outputs, [1, 0, 2])

    # Hack to build the indexing and retrieve the right output.
    batch_size = tf.shape(outputs)[0]
    # Start indices for each sample
    index = tf.range(0, batch_size) * seq_max_len + (seqlen - 1)
    # Indexing
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)

    # Linear activation, using outputs computed above
    return tf.matmul(outputs, weights['out']) + biases['out']

pred = dynamicRNN(x, seqlen, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)

    for step in range(1, training_steps + 1):
        batch_x, batch_y, batch_seqlen = trainset.next(batch_size)
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       seqlen: batch_seqlen})
        if step % display_step == 0 or step == 1:
            # Calculate batch accuracy & loss
            acc, loss = sess.run([accuracy, cost],
                                 feed_dict={x: batch_x, y: batch_y,
                                            seqlen: batch_seqlen})
            print("Step " + str(step * batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))

    print("Optimization Finished!")

    # Calculate accuracy
    test_data = testset.data
    test_label = testset.labels
    test_seqlen = testset.seqlen
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: test_data, y: test_label,
                                        seqlen: test_seqlen}))
So it's a multi-input, single-output architecture. I'd like to add an attention layer to this model; in fact, I'd like to treat the vectors at different time steps differently (using different weights). Any idea on how to implement this in TF, or where to start, is appreciated. I completely understand this code, but I'm new to using attention, especially on a multi-input, single-output RNN.
Thanks!
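To make the mechanism concrete, here is a framework-agnostic NumPy sketch (not the asker's TF graph) of the additive-attention idea that fits this setup: score every time step's LSTM output, softmax the scores into weights, and use the weighted sum as the sequence representation instead of only the last step's output. In a real model W and v would be trainable; here they are random stand-ins.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n_steps, n_hidden = 10, 8
rng = np.random.default_rng(0)
outputs = rng.normal(size=(n_steps, n_hidden))  # LSTM outputs, one per step

# attention parameters (random stand-ins for trainable variables)
W = rng.normal(size=(n_hidden, n_hidden))
v = rng.normal(size=(n_hidden,))

scores = np.tanh(outputs @ W) @ v  # one scalar score per time step
weights = softmax(scores)          # attention weights, sum to 1
context = weights @ outputs        # weighted sum replaces "last output"

print(weights.sum())   # 1.0 (up to float error)
print(context.shape)   # (8,)
```

In the TF code above, the place this would slot in is where the last output is gathered with tf.gather: the context vector would feed the final matmul instead.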

Out of Memory issues with beam search decoder
I am trying to implement a beam search decoder for a project. Currently, I am following the TensorFlow tutorial for neural machine translation from:
I am a newbie to deep learning and TensorFlow and am not sure if I am doing this the correct way.
I have edited the code to accommodate the beam search decoder: I have replaced the GRU layer with an LSTM layer and added the beam decoder layer.
def call(self, x, hidden, hidden2, enc_output):
    # enc_output shape == (batch_size, max_length, hidden_size)
    # hidden shape == (batch_size, hidden size)
    # hidden_with_time_axis shape == (batch_size, 1, hidden size)
    # we are doing this to perform addition to calculate the score
    hidden_with_time_axis = tf.expand_dims(hidden, 1)
    hidden_with_time_axis2 = tf.expand_dims(hidden2, 1)

    # score shape == (batch_size, max_length, 1)
    # we get 1 at the last axis because we are applying tanh(FC(EO) + FC(H)) to self.V
    score = self.V(tf.nn.tanh(self.W1(enc_output) +
                              self.W2(hidden_with_time_axis) +
                              self.W3(hidden_with_time_axis2)))

    # attention_weights shape == (batch_size, max_length, 1)
    attention_weights = tf.nn.softmax(score, axis=1)

    # context_vector shape after sum == (batch_size, hidden_size)
    context_vector = attention_weights * enc_output
    context_vector = tf.reduce_sum(context_vector, axis=1)

    # x shape after passing through embedding == (batch_size, 1, embedding_dim)
    x = self.embedding(x)

    # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)
    x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=1)

    output, h, c = self.lstm(x)
    logits = self.fc(output)

    beam = tf.contrib.seq2seq.BeamSearchDecoder(
        self.cell, self.embedding, self.start_tokens, self.end_token,
        tf.contrib.rnn.LSTMStateTuple(
            tf.contrib.seq2seq.tile_batch(h, multiplier=self.beam_size),
            tf.contrib.seq2seq.tile_batch(c, multiplier=self.beam_size)),
        self.beam_size)
    output, beamOp, _ = tf.contrib.seq2seq.dynamic_decode(
        beam, output_time_major=True, maximum_iterations=MAX_SEQUENCE_LENGTH)

    predicted_ids = tf.transpose(tf.cast(output.predicted_ids[:, :, 0], tf.float32))
    beamOp_h = beamOp[0][0][:, 0]
    beamOp_c = beamOp[0][1][:, 0]

    return predicted_ids, logits, beamOp_h, beamOp_c, attention_weights
Now I am getting an out-of-memory error when I start training the model. I think I am making a fatal mistake here but am not able to figure out what it is. Can someone kindly help me out?