How to bring time series data in following format for sequence to sequence prediction
1 answer

Use shift in a loop with f-strings. A negative shift pulls the following values into the current row:

    # Python 3.6+
    for i in range(1, 5):
        df[f'demand_{i}'] = df['demand'].shift(-i)

    # Python below 3.6
    for i in range(1, 5):
        df['demand_{}'.format(i)] = df['demand'].shift(-i)
Sample:
    import pandas as pd

    df = pd.DataFrame({
        'demand': [4, 7, 8, 3, 5, 0],
    })

    for i in range(1, 5):
        df['demand_{}'.format(i)] = df['demand'].shift(-i)

    print(df)

       demand  demand_1  demand_2  demand_3  demand_4
    0       4       7.0       8.0       3.0       5.0
    1       7       8.0       3.0       5.0       0.0
    2       8       3.0       5.0       0.0       NaN
    3       3       5.0       0.0       NaN       NaN
    4       5       0.0       NaN       NaN       NaN
    5       0       NaN       NaN       NaN       NaN
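For sequence-to-sequence training, the trailing rows that contain NaN usually have to be dropped. The following is only a sketch, assuming the same demand column as in the sample and a hypothetical horizon of 4 future steps (the name horizon is not from the answer). It uses negative shifts, which match the sample output, builds all shifted columns in one pd.concat call, and keeps only the complete rows:

```python
import pandas as pd

df = pd.DataFrame({'demand': [4, 7, 8, 3, 5, 0]})
horizon = 4  # hypothetical number of future steps per row

# shift(-i) moves the value from i rows ahead into the current row
future = {f'demand_{i}': df['demand'].shift(-i) for i in range(1, horizon + 1)}
windows = pd.concat([df, pd.DataFrame(future)], axis=1)

# Keep only rows where every future step is available
complete = windows.dropna().astype(int)
print(complete)
```

Building the columns in a dict first avoids inserting into the frame once per iteration, which may matter for long horizons.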
See also questions close to this topic

Python code percentage number compare problem
When I compare two percentage numbers, I get the wrong answer:

    a = "{0:%}".format(85/100)
    b = "{0:%}".format(9/100)
    if b > a:
        print("done")

It should skip the if block, but it prints done instead.
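A likely explanation, sketched below: format returns strings, and strings compare character by character, so a string starting with "9" sorts after one starting with "8". The variable names strings_say and numbers_say are made up for this illustration.

```python
a = 85 / 100
b = 9 / 100

# Formatted values are strings, compared character by character
strings_say = "{0:%}".format(b) > "{0:%}".format(a)

# Comparing the underlying numbers gives the expected result
numbers_say = b > a

print(strings_say, numbers_say)

# Compare numbers first, format only for display
if b > a:
    print("done")
else:
    print("{0:%} is not greater than {1:%}".format(b, a))
```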

How to create sum of different kernel objects in TensorFlow Probability?
I have a question about specifying kernel functions in TensorFlow Probability. Usually, if I want to create a kernel object, I write

    import tensorflow as tf
    import tensorflow_probability as tfp

    tfp_kernels = tfp.positive_semidefinite_kernels
    kernel_obj = tfp_kernels.ExponentiatedQuadratic(*args, **kwargs)
I know that kernel objects support batch broadcasting. But what if I want to build a kernel object that is the sum of several different kernel objects, as in additive Gaussian processes?
I am not sure how to "sum" kernel objects in TensorFlow. What I am able to do is create several separate kernel objects K1, ..., KJ.
It seems that there is no similar question online. Thanks in advance for the help.

Counting words per sentence and sentences per paragraph in a text file
I'm having trouble normalising a dictionary. In my dictionary, I have a bunch of words we are meant to count in a text file. For each of these words/characters, "normalising", in the context of my project, means dividing its frequency/value by the total number of sentences in the given text. I then have to replace the old values of the dictionary with these new ones.
For example, my dictionary is named count, with keys and values like this:

    {'and': 5, ';': 3, '': 0...}

    def main(textfile, normalize=True):
        . . . .
        if normalize == True:
            for x in count:
                new_count[x] = count[x] / numSentence
                print(x, count[x])

Here's a sample file to try any code on: https://www.dropbox.com/s/7xph5pb9bdf551h/sample2.txt?dl=0
Also note that in the above code the normalize == True is there because in the top-level function
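One way to replace the old values is to build a new dictionary and rebind the name. This is a sketch only: the helper normalise and the sentence total of 4 are hypothetical, and the counts are the example values from the question.

```python
count = {'and': 5, ';': 3, '': 0}  # example values from the question
num_sentences = 4                  # hypothetical total, for illustration


def normalise(counts, total):
    """Return a new dict with every value divided by total."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return {key: value / total for key, value in counts.items()}


# Rebinding the name replaces the old values with the normalised ones
count = normalise(count, num_sentences)
print(count)
```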

How can I write dataset information into a text file?
I want to write dataset information into a text file, but the result is None.
Result:
Dataset information None
Code:
    import pandas as pd

    text_file = open("output_file.txt", "w")
    dataset = pd.read_csv("labelled_text.txt", delimiter="\t")
    text_file.write("Dataset information\n")
    text_file.write("%s\n" % (dataset.info()))
How can I write this information into a file?
Printing works fine, but writing to a text file does not.
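DataFrame.info prints its summary and returns None, which is why None ends up in the file. It accepts a buf argument, so the summary can be captured in a string buffer first. A minimal sketch, assuming a small stand-in frame instead of labelled_text.txt and a temporary directory for the output file:

```python
import io
import os
import tempfile

import pandas as pd

# Small stand-in frame; the question reads labelled_text.txt instead
dataset = pd.DataFrame({'text': ['a', 'b'], 'label': [0, 1]})

buffer = io.StringIO()
dataset.info(buf=buffer)       # write the summary into the buffer
info_text = buffer.getvalue()  # the summary as a plain string

# Output path in the temp directory, chosen only for this illustration
out_path = os.path.join(tempfile.gettempdir(), "output_file.txt")
with open(out_path, "w") as text_file:
    text_file.write("Dataset information\n")
    text_file.write(info_text)
```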

pandas reversal of numbers based on condition
I have a data frame that looks something like this:
    import pandas as pd

    d = {'name': ['edward', 'margaret'], 'sex': ['male', 'female'], 'amt': [100, 200]}
    df = pd.DataFrame(data=d)

I want to reverse the sign of the amt column if the sex is 'female'. So I need the amt to be -200 for the second record. Something like:

    df.loc[df['sex'] == 'female', 'amt'] = -200
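Assuming "reverse" here means flipping the sign (an interpretation, not something the question states outright), a boolean mask avoids hard-coding the value. A short sketch with the frame from the question; the name is_female is made up:

```python
import pandas as pd

df = pd.DataFrame({'name': ['edward', 'margaret'],
                   'sex': ['male', 'female'],
                   'amt': [100, 200]})

is_female = df['sex'] == 'female'

# Flip the sign only in the rows where the mask is True
df.loc[is_female, 'amt'] *= -1
print(df)
```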

How to get pandas to act like MS Excel with cell arithmetic?
Objective: Have a function placed at a given position within a Pandas dataframe that updates with adjustments in the dataframe
Description: I am trying to subtract 75,000 from 400,000 to get 325,000 and have it displayed in a Pandas DataFrame. Currently, the row 'End Cash' gives me all the answers I expect. However, these are hard-coded values and not dynamic.

    import pandas as pd

    data_2 = [['Init Cash', 400000, 325000, 335000, 355000, 275000, 225000, 240000],
              ['Matur CDs', 0, 0, 0, 0, 0, 0, 0],
              ['Interest', 0, 0, 0, 0, 0, 0, 0],
              ['1mo CDs', 0, 0, 0, 0, 0, 0, 0],
              ['3mo CDs', 0, 0, 0, 0, 0, 0, 0],
              ['6mo CDs', 0, 0, 0, 0, 0, 0, 0],
              ['Cash Uses', 75000, 10000, 20000, 80000, 50000, 15000, 60000],
              ['End Cash', 325000, 335000, 355000, 275000, 225000, 240000, 180000]]
    df_2 = pd.DataFrame(data_2, columns=['Month', 'Month 1', 'Month 2', 'Month 3',
                                         'Month 4', 'Month 5', 'Month 6', 'End'])
    df_2_copy = df_2.copy()

I thought I could get away with something like the following:

    df_2_copy.iloc[7]['Month 1'] == (df_2_copy.iloc[0]['Month 1'] - df_2_copy.iloc[6]['Month 1'])
But, unfortunately, this does not work for me.
Any help would be appreciated.
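pandas does not store live formulas in cells, so a derived row has to be recomputed whenever its inputs change. Note also that == compares rather than assigns, and that chained indexing such as iloc[7]['Month 1'] may not write back to the frame. A rough sketch, assuming a reduced frame indexed by row label; the helper with_end_cash and the Month 2 numbers are hypothetical:

```python
import pandas as pd

df = pd.DataFrame(
    [[400000, 325000],   # Init Cash
     [75000, 10000]],    # Cash Uses (Month 2 value is illustrative)
    index=['Init Cash', 'Cash Uses'],
    columns=['Month 1', 'Month 2'],
)


def with_end_cash(frame):
    """Return a copy with 'End Cash' recomputed from the other rows."""
    out = frame.copy()
    out.loc['End Cash'] = out.loc['Init Cash'] - out.loc['Cash Uses']
    return out


print(with_end_cash(df))

# After changing an input, call the helper again to refresh the row
df.loc['Cash Uses', 'Month 1'] = 50000
updated = with_end_cash(df)
```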

ordinary least squares coefficients calculation using np.einsum
Is there a way to calculate ordinary least squares coefficients β in one np.einsum call, given a vector of dependent variable Y and a matrix of predictors X?
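As far as I can tell, einsum only expresses sums of products, so the inverse of XᵀX cannot come out of einsum itself. Once that inverse is available, the remaining products fit in a single call. A sketch under that assumption, with made-up data and a hypothetical name gram_inv:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # hypothetical predictors
Y = X @ np.array([1.5, -2.0, 0.5]) + 0.01 * rng.normal(size=50)

# einsum cannot invert a matrix, so (X^T X)^{-1} is computed separately
gram_inv = np.linalg.inv(np.einsum('ki,kj->ij', X, X))

# beta_i = sum_j sum_k gram_inv[i, j] * X[k, j] * Y[k], in one einsum call
beta = np.einsum('ij,kj,k->i', gram_inv, X, Y)
print(beta)
```

For ill-conditioned X, np.linalg.lstsq is usually the safer choice than an explicit inverse.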

System of coupled ODEs - heat exchanger problem
I'm solving a heat exchanger problem and have to find the final temperature. It's a system of ODEs, and I've written the following code. It asks me to define T and t, but they're functions of x that I should find.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.integrate import odeint

    # triethylene glycol properties at the inlet
    T1 = 363  # K, inlet temperature
    Wt = 3.6  # kg/s, flow rate

    # water properties at the inlet
    t1 = 298  # K
    Ww = 1.0
    Ww = [0.1, 0.5, 1]  # kg/s, flow rates

    Di = 0.07792  # m, inner diameter
    L = 3.048  # m, tube length
    U = 283.72  # W/m²K
    pi = 3.1415
    cpt = 1901.09  #, 2.7683*T + 896.2] # J/kg*K, specific heat
    cpw = 4018.04  #, 0.00003*(t^3) + 0.0403*(t^2) - 16.277*t + 6083.7] # J/kg*K, specific heat
    cp = [cpt, cpw]

    def funct(y, x):
        y = T, t
        dTdx = (U * Di * pi / (Wt * cpt)) * (T - t)
        dtdx = (U * Di * pi / (Ww * cpw)) * (T - t)
        x = np.linspace(0, L, 100)
        return dTdx, dtdx

    # Space vector

    # Initial condition
    y0 = T1, t1
    sol = odeint(funct, y0 , x)

    # plot
    plt.plot(x, sol[:, 0], label='Trietilenoglicol')
    plt.plot(x, sol[:, 1], label='Água')
    plt.legend()
    plt.xlabel('posição')
but it gives me the following error message:

    NameError                                 Traceback (most recent call last)
    <ipython-input-47-2f2a81626e49> in <module>
         35 # Initial condition
         36 y0 = T1, t1
    ---> 37 sol = odeint(funct, y0 , x)
         38
         39 # plot

    c:\users\idril\appdata\local\programs\python\python36\lib\site-packages\scipy\integrate\odepack.py in odeint(func, y0, t, args, Dfun, col_deriv, full_output, ml, mu, rtol, atol, tcrit, h0, hmax, hmin, ixpr, mxstep, mxhnil, mxordn, mxords, printmessg, tfirst)
        242                              full_output, rtol, atol, tcrit, h0, hmax, hmin,
        243                              ixpr, mxstep, mxhnil, mxordn, mxords,
    --> 244                              int(bool(tfirst)))
        245     if output[1] < 0:
        246         warning_msg = _msgs[output[1]] + " Run with full_output = 1 to get quantitative information."

    <ipython-input-47-2f2a81626e49> in funct(y, x)
         23
         24 def funct(y, x):
    ---> 25     y = T, t
         26     dTdx = (U * Di * pi / (Wt * cpt)) * (T - t)
         27     dtdx = (U * Di * pi / (Ww * cpw)) * (T - t)

    NameError: name 'T' is not defined
I've already read many answered questions but still can't find the mistake.
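The traceback points at y = T, t, which tries to build y from names that do not exist yet; unpacking the other way round (T, t = y) reads the state that odeint passes in. A sketch of that fix, with several assumptions that are mine and not the asker's: a single water flow rate of 1.0 kg/s instead of the list, co-current flow for the signs of the derivatives, and the grid x defined outside the function.

```python
import numpy as np
from scipy.integrate import odeint

T1, t1 = 363.0, 298.0        # inlet temperatures, K
Wt, Ww = 3.6, 1.0            # mass flow rates, kg/s (single water flow assumed)
Di, L, U = 0.07792, 3.048, 283.72
cpt, cpw = 1901.09, 4018.04  # specific heats, J/(kg K)


def funct(y, x):
    T, t = y  # unpack the state instead of assigning to y
    # Signs assume co-current flow: hot stream cools, cold stream warms
    dTdx = -(U * Di * np.pi / (Wt * cpt)) * (T - t)
    dtdx = (U * Di * np.pi / (Ww * cpw)) * (T - t)
    return dTdx, dtdx


x = np.linspace(0, L, 100)   # grid defined outside the function
sol = odeint(funct, (T1, t1), x)
T_out, t_out = sol[-1]       # temperatures at the end of the tube
print(T_out, t_out)
```

To cover the three water flow rates from the question, one option is to loop over them and pass each value through odeint's args parameter.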

Is there a way to easily integrate a set of differential equations over a full grid of points?
The problem is that I would like to integrate the differential equations starting from every point of the grid at once, instead of having to loop over the scipy integrator for each coordinate. (I'm sure there's an easy way.) As background for the code: I'm trying to solve the trajectories of a Couette flow that alternates the direction of the velocity every certain period, which is a well-known dynamical system that produces chaos. I don't think the rest of the code matters as much as the integration part with scipy and my usage of the meshgrid function of numpy.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation, writers
    from scipy.integrate import solve_ivp

    start_T = 100
    L = 1
    V = 1
    total_run_time = 10*3
    grid_points = 10

    T_list = np.arange(start_T, 1, -1)
    x = np.linspace(0, L, grid_points)
    y = np.linspace(0, L, grid_points)
    X, Y = np.meshgrid(x, y)
    condition = True
    totals = np.zeros((start_T, total_run_time, 2))
    alphas = np.zeros(start_T)
    i = 0

    for T in T_list:
        alphas[i] = L / (V * T)
        solution = np.array([X, Y])
        for steps in range(int(total_run_time/T)):
            t = steps*T
            if condition:
                def eq(t, x):
                    return V * np.sin(2 * np.pi * x[1] / L), 0.0
                condition = False
            else:
                def eq(t, x):
                    return 0.0, V * np.sin(2 * np.pi * x[1] / L)
                condition = True
            time_steps = np.arange(t, t + T)
            xt = solve_ivp(eq, time_steps, solution)
            solution = np.array([xt.y[0], xt.y[1]])
            totals[i][t: t + T][0] = solution[0]
            totals[i][t: t + T][1] = solution[1]
        i += 1

    np.save('alphas.npy', alphas)
    np.save('totals.npy', totals)
The error given is:

    ValueError: y0 must be 1-dimensional.
And it comes from the solve_ivp function of scipy, because it doesn't accept the format of the numpy function meshgrid. I know I could run some loops and get around it, but I'm assuming there must be a 'good' way to do it using numpy and scipy. I accept advice for the rest of the code too.
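One common pattern is to flatten the grid into a single 1-D state vector, integrate all points together, and reshape afterwards. The sketch below covers only the first half-period (velocity along x); the integration time T = 2.0 and the names rhs, n, X_end and Y_end are hypothetical. Note that the second argument of solve_ivp is t_span, a (start, end) pair; intermediate output times go in t_eval.

```python
import numpy as np
from scipy.integrate import solve_ivp

L, V, T = 1.0, 1.0, 2.0      # T is a hypothetical integration time
grid_points = 10
X, Y = np.meshgrid(np.linspace(0, L, grid_points),
                   np.linspace(0, L, grid_points))
n = X.size


def rhs(t, state):
    # state holds all x coordinates followed by all y coordinates
    y = state[n:]
    dx = V * np.sin(2 * np.pi * y / L)
    dy = np.zeros(n)
    return np.concatenate([dx, dy])


y0 = np.concatenate([X.ravel(), Y.ravel()])  # 1-D, as solve_ivp requires
result = solve_ivp(rhs, (0.0, T), y0)

# Final positions, reshaped back to the grid layout
X_end = result.y[:n, -1].reshape(X.shape)
Y_end = result.y[n:, -1].reshape(Y.shape)
```

Because every grid point shares one adaptive step size here, a single stiff or fast-moving point slows down the whole batch; whether that is acceptable depends on the flow.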