Multivariate time series forecasting with an LSTM
Assume I have 200 samples of time series data. The input has three features and I want to forecast one step ahead in time. Assume the first three rows of my series look like this:
1 2 3
4 5 6
7 8 9
Now I convert the problem into a supervised learning problem for one-step prediction:
Input Output
1 2 3 4 5 6
4 5 6 7 8 9
I would like to forecast the values of all three features in the output, not just one of them.

What should the dimension of the input to Keras be? I think it should be (100, 1, 3), i.e. (samples, timesteps, features), where 100 can be replaced by the batch size. Can someone verify this? The main question is: what should I change in the LSTM configuration so that it produces a multivariate output? For example, should I use Dense(3) to specify that I need 3 outputs? I would appreciate it if someone could help.
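For reference, a minimal sketch of this setup (assuming TensorFlow's Keras API; the layer size of 32 is illustrative): the series is reshaped to (samples, timesteps, features) and a Dense(3) head predicts all three features of the next step.

```python
import numpy as np
from tensorflow import keras

# Rebuild the toy series from the question: 200 rows, 3 features.
series = np.arange(1, 601, dtype="float32").reshape(200, 3)

# One-step supervised framing: row t predicts row t+1.
X = series[:-1].reshape(-1, 1, 3)   # (199, 1, 3) = (samples, timesteps, features)
y = series[1:]                      # (199, 3): all three features as targets

model = keras.Sequential([
    keras.Input(shape=(1, 3)),      # (timesteps, features)
    keras.layers.LSTM(32),          # hidden size 32 is an arbitrary choice
    keras.layers.Dense(3),          # one output unit per forecast feature
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=1, batch_size=32, verbose=0)
```

So the multivariate output comes only from the Dense(3) layer; the LSTM itself does not need to change.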
See also questions close to this topic

How to make the range limit work in a generator function
How do I make this range limit work in a function? Whenever I limit n, it returns every result without a limit.

    def generator(f, n, *args, **kwargs):
        return [f(*args, **kwargs) for __ in range(n)]
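For what it's worth, the list comprehension as written does respect n when n is passed as an integer; a quick check with a counting callable (so if results come back unlimited, n is likely being passed incorrectly at the call site):

```python
def generator(f, n, *args, **kwargs):
    # Same function as in the question: call f exactly n times.
    return [f(*args, **kwargs) for __ in range(n)]

calls = []
# The lambda appends a marker per call and returns the running call count.
result = generator(lambda: calls.append(1) or len(calls), 3)
```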

Assigning a value to an extended field when adding to cart in Odoo ecommerce
I am extending the sale.order model with a new boolean field.
Its default value is False.
The extension works, and I have already added the field to the tree and form views.
My goal is this: when the user presses the "Add to cart" button in the webshop, the new extended field should automatically be set to True. In Odoo, adding to the cart creates a new Quotation in the backend if one hasn't been created yet.
Here is my extended field:
    from odoo import api, fields, models

    class add_to_cart_extend(models.Model):
        _inherit = 'sale.order'

        is_value_over = fields.Boolean(string='value_over')
I believe I need to assign this value in two main functions of the website_sale module, shown below:
    @http.route(['/shop/cart'], type='http', auth="public", website=True)
    def cart(self, **post):

    @http.route(['/shop/cart/update'], type='http', auth="public",
                methods=['POST'], website=True, csrf=False)
    def cart_update(self, product_id, add_qty=1, set_qty=0, **kw):
How can I set is_value_over to True when the user adds an item to the cart?
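One possible approach (an untested sketch, assuming Odoo's website_sale module is installed): rather than patching the HTTP routes, override sale.order's _cart_update method, which the /shop/cart/update route ultimately calls, and set the flag there.

    # Untested sketch; method signature follows website_sale's sale.order
    # extension, and is_value_over is the boolean field from the question.
    from odoo import models

    class SaleOrderCartExtend(models.Model):
        _inherit = 'sale.order'

        def _cart_update(self, product_id=None, line_id=None,
                         add_qty=0, set_qty=0, **kwargs):
            res = super(SaleOrderCartExtend, self)._cart_update(
                product_id=product_id, line_id=line_id,
                add_qty=add_qty, set_qty=set_qty, **kwargs)
            # Any cart change on this quotation marks the flag.
            self.is_value_over = True
            return res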
How can I find out a game process in task manager?
I am trying to kill all the game processes running on my computer with the help of the psutil module. But the problem is: how am I supposed to distinguish a game process from all the other processes running on my computer? Even if I try to distinguish based on memory consumption, it won't work for all games, because not every game consumes a lot of memory.
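One common approach is to match process names against a known list of game executables, since memory use alone is not a reliable signal. A minimal sketch (the game names here are made-up examples, and psutil is only imported when actually scanning, so the matching logic is independent of it):

```python
def match_game_procs(procs, game_names):
    """procs: iterable of (pid, name) pairs; return pids whose name is a known game."""
    wanted = {n.lower() for n in game_names}
    return [pid for pid, name in procs if name.lower() in wanted]

def running_game_pids(game_names):
    import psutil  # deferred import: only needed for a live scan
    procs = [(p.pid, p.info.get('name') or '')
             for p in psutil.process_iter(['name'])]
    return match_game_procs(procs, game_names)
```

The pids returned by running_game_pids could then be passed to psutil.Process(pid).terminate(). Keeping the name list accurate is the hard part; there is no general-purpose flag that marks a process as a game.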

How to specify number of layers in keras?
I'm trying to define a fully connected neural network in Keras using the TensorFlow backend. I have some sample code but I don't know what it means.
    model = Sequential()
    model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
    model.add(Dense(50, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
    model.add(Dense(20, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
    model.add(Dense(10, input_dim=x.shape[1], kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.add(Dense(y.shape[1], activation='softmax'))
From the above code I want to know: what is the number of inputs to my network, the number of outputs, and the number of hidden layers? And what is the number coming after model.add(Dense(...)), assuming x.shape[1] = 60? What is this network called exactly: should I call it a fully connected network or a convolutional network?
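To sketch an answer: the number in Dense(k) is the number of units (outputs) of that layer; input_dim only matters on the first layer and is ignored afterwards, and stacking Dense(1) directly into Dense(y.shape[1], softmax) is almost certainly unintended. A cleaned-up version of the same fully connected (dense, not convolutional) network, assuming 60 inputs and a single regression output:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(60,)),               # 60 input features (x.shape[1])
    keras.layers.Dense(10, activation='relu'),  # hidden layer 1: 10 units
    keras.layers.Dense(50, activation='relu'),  # hidden layer 2: 50 units
    keras.layers.Dense(20, activation='relu'),  # hidden layer 3: 20 units
    keras.layers.Dense(10, activation='relu'),  # hidden layer 4: 10 units
    keras.layers.Dense(1),                      # output layer: 1 unit
])
```

So this sketch has 60 inputs, 4 hidden layers, and 1 output; for classification you would instead end with Dense(y.shape[1], activation='softmax') and drop the Dense(1).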

Keras: Multiple outputs, loss only a function of one?
I have a setup like this:
model = keras.Model(input,[output1,output2])
My loss function is only a function of output1. How do I tell Keras to ignore output2 for the purposes of computing loss? The best I have come up with is to generate a bogus loss function which always returns 0.0:
model.compile(optimizer=..., loss=[realLossFunction, zeroLossFunction])
I can live with this, but then I see the statistics and progress of this dummy loss all over the place, and I would like to know whether there is a more elegant way.
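Two alternatives to the zero-loss workaround, assuming tf.keras (the layer names and shapes below are illustrative): pass losses as a dict keyed by output name, so outputs without an entry get no loss and don't appear in the logs, or keep both losses but give the second a weight of 0 via loss_weights.

```python
import numpy as np
from tensorflow import keras

inp = keras.Input(shape=(4,))
output1 = keras.layers.Dense(1, name="out1")(inp)
output2 = keras.layers.Dense(1, name="out2")(inp)
model = keras.Model(inp, [output1, output2])

# Option 1: only declare a loss for out1; out2 is simply not trained on.
model.compile(optimizer="adam", loss={"out1": "mse"})
# Option 2 (commented): declare both but zero-weight the second.
# model.compile(optimizer="adam", loss=["mse", "mse"], loss_weights=[1.0, 0.0])

x = np.random.rand(8, 4).astype("float32")
y1 = np.random.rand(8, 1).astype("float32")
history = model.fit(x, {"out1": y1}, epochs=1, verbose=0)  # no targets for out2
```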

Iterating over arrays on disk similar to ImageDataGenerator
I have 70,000 2D numpy arrays on which I would like to train a CNN using Keras. Holding them all in memory would be an option but would consume a lot of memory, so I would like to save the matrices to disk and load them at runtime. One option would be ImageDataGenerator, but the problem is that it can only read images. I don't want to store the arrays as images, because saving them as (grayscale) images changes the values (normalization etc.), and in the end I want to feed the original matrices into the network, not values altered by saving as images.
Is it possible to store the arrays on disk and iterate over them in a similar way to ImageDataGenerator? Or, alternatively, can I save the arrays as images without changing their values?
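One minimal sketch: save each array as a .npy file (which preserves the exact values, unlike image formats) and stream batches with a plain Python generator, which Keras's fit can consume. The file layout and names here are made up for illustration:

```python
import numpy as np

def npy_batch_generator(paths, labels, batch_size):
    """Yield (batch_of_arrays, batch_of_labels) forever, loading from disk."""
    while True:
        order = np.random.permutation(len(paths))  # reshuffle each epoch
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            x = np.stack([np.load(paths[i]) for i in idx])
            y = np.asarray([labels[i] for i in idx])
            yield x, y
```

A keras.utils.Sequence subclass would do the same job with safer multiprocessing, but the idea is identical: only one batch of arrays lives in memory at a time.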

Calculating percent change over time with a longitudinal dataset
I'm trying to calculate the year-to-year change in some data I have. It is in panel/longitudinal form, in a dataframe that looks like this:
      NbrHood  TaxYear  median
    1    0106     2011   82100
    2    0106     2012   43000
    3    0106     2014   53000
    4    0106     2015   64100
    5    0106     2016   64100
    6    0106     2017   64100
I would like to get a dataframe that comes out in a form like this:
    Year Difference  Zipcode  % Change
    2011-2012          11411      100%
    2012-2013          11411      100%
    2011-2012          11345       16%
    2012-2013          11345       42%
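A pandas sketch of one way to do this (column names follow the first dataframe in the question; note the example output above uses a different dataset, so the grouping key would be Zipcode there): sort within each group by year, then apply pct_change per group.

```python
import pandas as pd

df = pd.DataFrame({
    "NbrHood": ["0106"] * 6,
    "TaxYear": [2011, 2012, 2014, 2015, 2016, 2017],
    "median":  [82100, 43000, 53000, 64100, 64100, 64100],
})

# Percent change relative to the previous observed year, per neighbourhood.
df = df.sort_values(["NbrHood", "TaxYear"])
df["pct_change"] = df.groupby("NbrHood")["median"].pct_change() * 100
```

The first row of each group is NaN since it has no previous year; with gap years (2013 is missing here) pct_change compares to the previous observed row, so the series should be reindexed to all years first if true year-over-year change is needed.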

Sampling from a regular pandas series using an irregular DatetimeIndex and a fill method
I have a regular time series and a DatetimeIndex of irregular dates. I would like to obtain a new irregular time series where the irregular times are filled from the regular one using one of the standard fill methods ("ffill" is the one I am after).
    import numpy as np
    import pandas as pd

    times = pd.date_range(start="20000101 00:00", periods=12, freq="H")
    df = pd.DataFrame({"val": np.arange(12)}, index=times)
    newtimes = pd.DatetimeIndex(["20000101 02:13", "20000101 03:00"])
    print(df.val.loc[newtimes])
This produces a warning and:

    2000-01-01 02:13:00    NaN
    2000-01-01 03:00:00    3.0
whereas the forward-filled answer I would like is:

    2000-01-01 02:13:00    2.0
    2000-01-01 03:00:00    3.0
This seems like a common use case, but I couldn't find the answer. Can anyone help?
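A sketch of the reindex-based approach: reindex accepts a fill method, so the irregular timestamps can be filled from the regular series directly.

```python
import numpy as np
import pandas as pd

times = pd.date_range(start="2000-01-01 00:00", periods=12, freq="h")
df = pd.DataFrame({"val": np.arange(12)}, index=times)
newtimes = pd.DatetimeIndex(["2000-01-01 02:13", "2000-01-01 03:00"])

# Forward-fill each irregular timestamp from the last regular observation.
filled = df["val"].reindex(newtimes, method="ffill")
```

.loc only looks up existing labels (hence the NaN at 02:13), while reindex with method="ffill" fills missing labels from the nearest earlier one; method="bfill" or "nearest" work the same way.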

How to check if a column ID exists in a column family in Bigtable
I am using the google.cloud.bigtable library to read data from Google Bigtable. The main table stores time series data. Below is an example of two rows:

    2:137:Power_End.Front.Damage_Accumulation_Score:RAW:1562864520
      meta:fleetId @ 2019/07/11 12:04:25.833000
        "1"
      meta:sensorDisplayName @ 2019/07/11 12:04:25.833000
        "Power End.Front.Damage Accumulation Score"
      sec:48 @ 2019/07/11 12:04:25.833000
        "0.0011697"

    2:137:Power_End.Front.Damage_Accumulation_Score:RAW:1562864640
      meta:fleetId @ 2019/07/11 12:06:25.401000
        "1"
      meta:sensorDisplayName @ 2019/07/11 12:06:25.401000
        "Power End.Front.Damage Accumulation Score"
      sec:41 @ 2019/07/11 12:06:25.401000
        "0.001215"
Every row has a column family sec, which is always associated with at least one column ID, a string representing seconds (between 1 and 60). I want to get one value from sec for each row. The problem is that the column ID differs from row to row. I tried to iterate through the 60 possible column IDs for each row, checking first whether the column ID exists, but I get a KeyError running the following code:

    rows = table.read_rows(start_key=key1, end_key=key2)
    df = pd.DataFrame({'timestamp': [], 'score': []})
    sec_list = list(range(1, 60))
    sec_str = ["%02d" % s for s in sec_list]
    for r in rows:
        for s in sec_str:
            if r.cells['sec'][s]:
                score = r.cells['sec'][s][0].value.decode('utf8')
                break
What is the correct way to check whether a particular column ID exists?
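The KeyError comes from indexing a missing qualifier directly instead of testing membership with `in`; note also that the client's r.cells dict maps family name to a dict keyed by qualifiers as bytes (not str), and range(1, 60) misses 60. A sketch of the fixed lookup, with a plain dict of bytes values standing in for a real Bigtable row (real cells are objects whose .value holds the bytes):

```python
def first_sec_value(cells):
    """Return the decoded value of the first present 'sec' qualifier 01..60."""
    sec_cols = cells.get('sec', {})
    for s in range(1, 61):                       # include 60
        qualifier = ("%02d" % s).encode('utf8')  # qualifiers are bytes
        if qualifier in sec_cols:                # membership test, no KeyError
            return sec_cols[qualifier][0].decode('utf8')
    return None
```

With the real client, the inner access would be r.cells['sec'][qualifier][0].value.decode('utf8'); iterating sec_cols.items() directly would avoid the 60-way loop entirely.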

Is there any paper about vanishing gradients of LSTM?
Some web pages mention that LSTMs still suffer from vanishing or exploding gradients if the sequence is too long.
Here are a couple of those pages:
- https://machinelearningmastery.com/handle-long-sequences-long-short-term-memory-recurrent-neural-networks/
- How to handle extremely long LSTM sequence length?
However, I couldn't find any paper or formal derivation for this.
Could you please point me to references on this problem?
Pytorch LSTM vs LSTMCell
What is the difference between LSTM and LSTMCell in PyTorch (currently version 1.1)? It seems that LSTMCell is a special case of LSTM (i.e. with only one layer, unidirectional, and no dropout).
Then what's the purpose of having both implementations? Unless I'm missing something, it's trivial to use an LSTM object as an LSTMCell (or, alternatively, it's pretty easy to use multiple LSTMCells to build the LSTM).
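A small sketch of the relationship (assuming PyTorch; sizes are illustrative): nn.LSTM consumes a whole sequence in one call and can stack layers and directions, while nn.LSTMCell advances one time step per call, which is what you want for custom per-step logic (sampling, attention, teacher forcing).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, inp, hid = 5, 2, 3, 4
x = torch.randn(seq_len, batch, inp)   # default layout: (seq, batch, input)

# Whole sequence at once.
lstm = nn.LSTM(inp, hid)
out, (h, c) = lstm(x)                  # out: (seq_len, batch, hid)

# Manual loop, one step per call.
cell = nn.LSTMCell(inp, hid)
hx = torch.zeros(batch, hid)
cx = torch.zeros(batch, hid)
steps = []
for t in range(seq_len):
    hx, cx = cell(x[t], (hx, cx))
    steps.append(hx)
out_cell = torch.stack(steps)          # same shape as `out`
```

So the cell exists precisely for cases where you need to intervene between time steps; nn.LSTM covers the common case and lets cuDNN fuse the whole loop for speed.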

How to use attention layer on a sequence labeling task implemented using LSTMs in tensorflow?
I've used LSTMCell in TensorFlow to implement a sequence labeling task. My code is based on this example code written by Aymeric Damien. Here are the important parts (the full code is here):
    # tf Graph input
    x = tf.placeholder("float", [None, seq_max_len, 1])
    y = tf.placeholder("float", [None, n_classes])
    # A placeholder for indicating each sequence length
    seqlen = tf.placeholder(tf.int32, [None])

    # Define weights
    weights = {
        'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
    }
    biases = {
        'out': tf.Variable(tf.random_normal([n_classes]))
    }

    def dynamicRNN(x, seqlen, weights, biases):
        # Prepare data shape to match `rnn` function requirements
        # Current data input shape: (batch_size, n_steps, n_input)
        # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)

        # Unstack to get a list of 'n_steps' tensors of shape (batch_size, n_input)
        x = tf.unstack(x, seq_max_len, 1)

        # Define a lstm cell with tensorflow
        lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden)

        # Get lstm cell output, providing 'sequence_length' will perform dynamic
        # calculation.
        outputs, states = tf.contrib.rnn.static_rnn(lstm_cell, x, dtype=tf.float32,
                                                    sequence_length=seqlen)

        # When performing dynamic calculation, we must retrieve the last
        # dynamically computed output, i.e., if a sequence length is 10, we need
        # to retrieve the 10th output.
        # However TensorFlow doesn't support advanced indexing yet, so we build
        # a custom op that for each sample in batch size, get its length and
        # get the corresponding relevant output.

        # 'outputs' is a list of output at every timestep, we pack them in a Tensor
        # and change back dimension to [batch_size, n_step, n_input]
        outputs = tf.stack(outputs)
        outputs = tf.transpose(outputs, [1, 0, 2])

        # Hack to build the indexing and retrieve the right output.
        batch_size = tf.shape(outputs)[0]
        # Start indices for each sample
        index = tf.range(0, batch_size) * seq_max_len + (seqlen - 1)
        # Indexing
        outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)

        # Linear activation, using outputs computed above
        return tf.matmul(outputs, weights['out']) + biases['out']

    pred = dynamicRNN(x, seqlen, weights, biases)

    # Define loss and optimizer
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

    # Evaluate model
    correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

    # Initialize the variables (i.e. assign their default value)
    init = tf.global_variables_initializer()

    # Start training
    with tf.Session() as sess:
        # Run the initializer
        sess.run(init)

        for step in range(1, training_steps + 1):
            batch_x, batch_y, batch_seqlen = trainset.next(batch_size)
            # Run optimization op (backprop)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                           seqlen: batch_seqlen})
            if step % display_step == 0 or step == 1:
                # Calculate batch accuracy & loss
                acc, loss = sess.run([accuracy, cost],
                                     feed_dict={x: batch_x, y: batch_y,
                                                seqlen: batch_seqlen})
                print("Step " + str(step * batch_size) + ", Minibatch Loss= " +
                      "{:.6f}".format(loss) + ", Training Accuracy= " +
                      "{:.5f}".format(acc))

        print("Optimization Finished!")

        # Calculate accuracy
        test_data = testset.data
        test_label = testset.labels
        test_seqlen = testset.seqlen
        print("Testing Accuracy:",
              sess.run(accuracy, feed_dict={x: test_data, y: test_label,
                                            seqlen: test_seqlen}))
So it's a multi-input, single-output architecture. I'd like to add an attention layer to this model. In fact, I'd like to treat the vectors at different time steps differently (i.e. weight them differently). Any idea on how to implement this in TF, or where to start, is appreciated. I understand this code completely, but I'm new to attention, especially when applied to a multi-input, single-output RNN.
Thanks!
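Not the author's code, but one common pattern sketched in plain numpy: instead of taking only the last LSTM output, score every timestep output, softmax the scores into attention weights, and return the weighted sum of outputs (a simple additive-attention pooling). The weight names w and v are illustrative; in TF they would be tf.Variables and the ops their tf equivalents.

```python
import numpy as np

def attention_pool(outputs, w, v):
    """outputs: (batch, steps, hidden); w: (hidden, hidden); v: (hidden,)."""
    scores = np.tanh(outputs @ w) @ v                    # (batch, steps)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    alphas = np.exp(scores)
    alphas = alphas / alphas.sum(axis=1, keepdims=True)  # softmax over steps
    context = (outputs * alphas[..., None]).sum(axis=1)  # (batch, hidden)
    return context, alphas
```

In the question's graph, `context` would replace the gathered last output before the final tf.matmul; padded timesteps beyond each seqlen should be masked (e.g. scores set to a large negative value) before the softmax.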

How to prepare data for multi-step time series forecasting with 3 inputs (timestamp included) and 4 outputs, in Python
I have a dataset with deviceID, deviceID type, timestamp, and the voltages of 4 batteries as attributes, with 40,000 samples. I want to forecast the battery voltages of a given (deviceID, deviceID type) pair one week ahead. I would like to know the Python code or syntax for preparing this data for training an LSTM model.
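A sketch of the usual windowing step (the column names deviceID, deviceType, timestamp, v1..v4 are assumptions about the dataset): per device, take n_in consecutive rows as the input window and the four voltages `horizon` rows after the window as the target. With evenly spaced readings, horizon is chosen so that it corresponds to one week.

```python
import numpy as np
import pandas as pd

def make_windows(df, feature_cols, target_cols, n_in, horizon):
    """Per device: n_in consecutive rows as input, targets `horizon` rows ahead."""
    X, y = [], []
    for _, g in df.sort_values("timestamp").groupby(["deviceID", "deviceType"]):
        feats = g[feature_cols].to_numpy()
        targs = g[target_cols].to_numpy()
        for i in range(len(g) - n_in - horizon + 1):
            X.append(feats[i:i + n_in])            # window rows i .. i+n_in-1
            y.append(targs[i + n_in + horizon - 1])  # horizon steps past the window
    return np.array(X), np.array(y)
```

X then has shape (samples, n_in, n_features), which is exactly what a Keras LSTM expects, and y has shape (samples, 4) for a Dense(4) output head.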

How to perform time series analysis on a categorical dataset using neural networks
I have a dataset with 2 columns: date and state (36 unique values). I want to do time series analysis on this dataset using neural networks (Keras is recommended). I searched a lot on the internet, but I only find answers for numerical data. Could someone please help me move forward with this dataset?
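The usual first step, sketched below, is to turn the categorical column into numbers: integer-encode the 36 states and then one-hot encode them (or feed the integer codes into a Keras Embedding layer). The state values here are stand-ins for the real column.

```python
import numpy as np
import pandas as pd

states = pd.Series(["NY", "CA", "NY", "TX"])   # stand-in for the state column
codes, uniques = pd.factorize(states)          # integer code per unique state
onehot = np.eye(len(uniques))[codes]           # (n_rows, n_unique_states)
```

Windows over `onehot` (ordered by date) can then feed an LSTM whose final layer is Dense(36, activation='softmax'), turning the task into next-state classification rather than numeric regression.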

Calculate sliding periods in forecasting problem
I have developed 2 functions: one takes an incomplete time series and fills in the gaps, and the other takes the completed series and creates a sliding-window dataset.
The purpose of this transformation is to turn a time series dataset into a supervised learning problem for multi-step forecasting.
As an example, a starting dataset would look something like:
    Jan 2017  Mar 2017  May 2017  Jul 2017  Sep 2017
          50        60        30        20        90
If all missing values are then imputed with zeros, the new dataset becomes:

    Jan 2017  Feb 2017  Mar 2017  Apr 2017  May 2017  Jun 2017  Jul 2017  Aug 2017  Sep 2017
          50         0        60         0        30         0        20         0        90
(It does not matter whether the format of the data is in wide or long format.)
In a multi-step supervised learning problem, the data needs to be turned into sliding-window inputs and outputs. Turning the above into a 2-input : 2-output dataset creates 6 combinations:

    [input],  [output]
    [50, 0],  [60, 0]
    [0, 60],  [0, 30]
    [60, 0],  [30, 0]
    [0, 30],  [0, 20]
    [30, 0],  [20, 0]
    [0, 20],  [0, 90]
Question: How can I determine the total number of combinations this process will generate, before actually creating them?
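The count follows directly from the window geometry: with a completed series of length n, input length n_in, and output length n_out, a window can start at positions 0 through n - n_in - n_out, giving n - n_in - n_out + 1 combinations (9 - 2 - 2 + 1 = 6 for the example above).

```python
def n_windows(n, n_in, n_out):
    """Number of (input, output) sliding-window pairs in a series of length n."""
    return max(0, n - n_in - n_out + 1)
```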