Mapping a set of input variables to a single output variable; also optimization of inputs (e.g. car inputs to fuel economy)
I'm wondering what the state-of-the-art methods are for learning the mapping from a set of input variables to a single output. An example is mapping the inputs of a car to fuel efficiency, where the data set contains these variables:
 Time and Date
 Speed (Kph)
 Rpm
 Car tire pressure (psi)
 Fuel Economy (Liters/gallon)
(I understand the data set is lacking and flawed. But this is just an example.)
Most approaches use multiple linear regression for this.
A basic neural network with an input, a hidden, and an output layer could also be used to solve this problem. The input is an m x n array (where m is the number of data points and n is the number of variables; in the above case, 3), and the output is a single variable representing fuel economy. You then compare the predicted fuel economy to the actual value and optimize the neural network via backpropagation. However, I learned this from a relatively old source (2013), and I was wondering whether there have been new advancements in solving this kind of problem.
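To make the m x n → 1 setup above concrete, here is a minimal sketch of a one-hidden-layer network trained by backpropagation in plain NumPy. The data is synthetic and the layer size, learning rate, and coefficients are arbitrary choices for illustration, not recommendations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the car data: m data points, n = 3 inputs
# (say speed, rpm, tire pressure) and one output (fuel economy).
m, n = 200, 3
X = rng.normal(size=(m, n))
y = (0.5 * X[:, 0] - 1.2 * X[:, 1] + 0.8 * X[:, 2]).reshape(-1, 1)

# One hidden layer (16 tanh units, an arbitrary choice), linear output.
W1 = rng.normal(scale=0.5, size=(n, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

lr = 0.02
losses = []
for epoch in range(500):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y
    losses.append(float(np.mean(err ** 2)))

    # backpropagation: gradients of the mean squared error
    d_pred = 2 * err / m
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_h = (d_pred @ W2.T) * (1 - h ** 2)   # tanh' = 1 - tanh^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(losses[0], "->", losses[-1])   # training loss should drop
```

The same structure is what higher-level libraries automate; the loop above only shows where the predicted-vs-actual comparison and the weight updates happen.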
Can RNNs (LSTMs) be used to solve this? If the data is measured every minute, for example, I have reason to believe that the current fuel economy is affected by the previous car inputs (t-1, t-2, etc.).
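Whatever sequence model is used, the per-minute readings first have to be reshaped into overlapping time windows. A minimal sketch of that preprocessing step (the column layout and window length here are invented for illustration):

```python
import numpy as np

def make_windows(series, window):
    """Turn a (T, n_features) series into (T-window, window, n_features)
    samples, each paired with the target at the following time step."""
    X = np.stack([series[t:t + window] for t in range(len(series) - window)])
    y = series[window:, -1]   # e.g. fuel economy as the last column
    return X, y

# 10 minutes of fake readings with 4 columns (speed, rpm, pressure, economy)
data = np.arange(40, dtype=float).reshape(10, 4)
X, y = make_windows(data, window=3)
print(X.shape, y.shape)  # (7, 3, 4) (7,)
```

An LSTM would then consume `X` sample by sample, so each prediction can depend on the inputs at t-1, t-2, and so on within the window.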
I also learned that I can use evolutionary algorithms to optimize this sort of problem (e.g. what's the fastest I can go while keeping a fuel economy of 50 L/gal?). What are the most popular evolutionary algorithms for this kind of problem? Could anyone recommend papers or other material that could help me learn more about this?
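As a minimal illustration of the evolutionary idea (not a recommendation of any specific algorithm), here is a toy genetic algorithm that maximizes speed subject to an invented economy constraint; the economy model and all constants are made up:

```python
import random

random.seed(0)

# Fake economy model, invented purely for illustration: economy falls as
# speed rises, so the constraint economy(speed) >= 50 caps the speed.
def economy(speed):
    return 100 - 0.5 * speed

def fitness(speed):
    if economy(speed) < 50:   # constraint violated
        return -1.0
    return speed              # otherwise: faster is better

pop = [random.uniform(0, 200) for _ in range(30)]
for gen in range(100):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                        # truncation selection (elitist)
    children = [
        (random.choice(parents) + random.choice(parents)) / 2  # crossover
        + random.gauss(0, 1)                                   # mutation
        for _ in range(20)
    ]
    pop = parents + children

best = max(pop, key=fitness)
print(best)  # should approach 100, where economy(100) == 50
```

Real evolutionary libraries add tournament selection, adaptive mutation, and proper constraint handling, but the select/crossover/mutate loop is the same shape.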
See also questions close to this topic

Firestore: Running Complex Update Queries With Multiple Retrievals (ReactJS)
I have a grid of data whose endpoints are displayed from data stored in my firestore database. So for instance an outline could be as follows:
 Spent total: $150 
 Item 1: $80 
 Item 2: $70 
So the values for all of these costs (70, 80 and 150) are stored in my firestore database, with the sub-items being a separate collection from my total spent. Now, I want to be able to update the price of item 2 to, say, $90, which will then update Item 2's value in firestore, but I want this to then run a check against the table so that the "spent total" is also updated to say "$170". What would be the best way to accomplish something like this?
Especially if I were to add multiple rows and columns that all depend on one another, what is the best way to update one part of my grid so that afterwards all of the data endpoints on the grid are updated correctly? Should I be using cloud functions somehow?
Additionally, I am creating a ReactJS app, and previously I just had my grid endpoints stored in my Redux store state so that I could run complex methods that checked each row and column and did some math to update each endpoint correctly. What is the best way to do this now that I have migrated my data to firestore?

Error using * Disciplined convex programming error: Only scalar quadratic forms can be specified in CVX
I am using CVX to optimize the convex problem:
U is the variable, and here is my code:

    cvx_begin
        variable U(l,N)
        U >= 0
        ones(l,N)'*U == ones(N,N)';
        minimize( trace(D'*U) + lambda*norm(U, inf) + gamma*trace(U*L*U') );
    cvx_end
The error message is: Error using * Disciplined convex programming error: Only scalar quadratic forms can be specified in CVX
So, can anyone tell me what the problem(s) in my code are?

Out of memory issue with Pyomo/Gurobi
I am trying to solve an IP problem formulated in Pyomo calling Gurobi (academic license). The model works well for relatively small instances; however, computational times grow exponentially and my code crashes with a "Gurobi out of memory" error. Since the machine has a quite good GPU, I was wondering if there is anything I can do to leverage that and get around this problem. Thank you.

Contour plot for a regression prediction with fixed input variables
I want to create a contour plot for a prediction with multiple features. The remaining values should be fixed in order to plot the 2 interesting ones. Unfortunately, the resulting matrix has the same value at every position instead of the expected values.
I think something is wrong with my matrices, but I can't find the error.
    [...]
    f_learn = [x_1,x_2,x_3,x_4]
    r_learn = [r_1]
    clf = svm.MLPRegressor(...)
    clf.fit(f_learn, r_learn)
    [...]
    x_1 = np.linspace(1, 100, 100)
    x_2 = np.linspace(1, 100, 100)
    X_1, X_2 = np.meshgrid(x_1, x_2)
    x_3 = np.full((100, 1000), 5).ravel()
    x_4 = np.full((100, 1000), 15).ravel()
    predict_matrix = np.vstack([X_1, X_2, x_3, x_3])
    prediction = clf.predict(predict_matrix.T)
    prediction_plot = prediction.reshape(X_1.shape)
    plt.figure()
    cp = plt.contourf(X_1, X_2, prediction_plot, 10)
    plt.colorbar(cp)
    plt.show()
If I test the matrix line by line by hand I get the right results. However, it doesn't work if I put them together this way.

Different signs of Principal Components
I have implemented PCA in Python. I used MNIST data and reduced it to 2D. After that I used KNN to classify the data. I then repeated the same with scikit-learn. The result is that my own PCA gives a much lower accuracy. I compared the PCs and see that the signs of some components differ from the scikit-learn results. I have absolutely no idea how to fix this. I hope one of you can spot my mistake or misunderstanding.
    class Dimension_reduction:
        def PCA(self, X, dimensions):
            covariance_matrix = self.find_covariance(X)
            eigenvalues, eigenvectors = self.eigenvalue_decomposition(covariance_matrix)
            eigenpairs = self.sort_eigenvalues(eigenvalues, eigenvectors)
            projection_matrix = self.projection_matrix(eigenpairs, dimensions)
            new_featurespace = self.project_reduced_featurespace(X, projection_matrix)
            return new_featurespace

        def find_covariance(self, X):
            means = np.mean(X, axis=0)
            covariance_matrix = (X - means).T.dot((X - means)) / (len(X) - 1)
            return covariance_matrix

        def eigenvalue_decomposition(self, covariance_matrix):
            eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)
            return eigenvalues, eigenvectors

        def sort_eigenvalues(self, eigenvalues, eigenvectors):
            eigenpairs = [(np.abs(eigenvalues[i]), eigenvectors[:,i]) for i in range(len(eigenvalues))]
            eigenpairs.sort()
            eigenpairs.reverse()
            return eigenpairs

        def projection_matrix(self, eigenpairs, dimensions):
            projection_matrix = np.array(eigenpairs[0][1].reshape(len(eigenpairs[0][1]), 1))
            for i in range(1, dimensions, 1):
                projection_matrix = np.concatenate((projection_matrix, np.array(eigenpairs[i][1].reshape(len(eigenpairs[0][1]), 1))), axis=1)
            return projection_matrix

        def project_reduced_featurespace(self, X, projection_matrix):
            return X.dot(projection_matrix)

        def scatter_plot(self, new_X, label_number, labels):
            partial_X = []
            x = []
            y = []
            for i in range(label_number):
                partial_X.append([])
                x.append([])
                y.append([])
            for i in range(len(new_X)):
                label = labels[i]
                partial_X[int(label)].append(new_X[i])
            colors = plt.cm.rainbow(np.linspace(0, 1, label_number))
            for i in range(label_number):
                for j in range(len(partial_X[i])):
                    x[i].append(partial_X[i][j][0])
                    y[i].append(partial_X[i][j][1])
            i = 0
            for x, y, c in zip(x, y, colors):
                plt.scatter(x, y, c, label=str(i))
            return plt, colors

        def eigenfaces(self, X):
            covariance_matrix = self.find_covariance(X)
            eigenvalues, eigenvectors = self.eigenvalue_decomposition(covariance_matrix)
            eigenpairs = self.sort_eigenvalues(eigenvalues, eigenvectors)
            return eigenpairs
I also tried different implementations:
    def PCA(X):
        PC = []
        cov = np.cov(X.T)
        w, v = np.linalg.eig(cov)
        eig_pairs = [(np.abs(w[i]), v[:,i]) for i in range(len(w))]
        eig_pairs.sort()
        eig_pairs.reverse()
        PC.append(eig_pairs[0][1])
        PC.append(eig_pairs[1][1])
        Y = X.dot(np.array(PC).T)
        return Y

    def PCA(data, dims_rescaled_data=2):
        """
        returns: data transformed in 2 dims/columns + regenerated original data
        pass in: data as 2D NumPy array
        """
        import numpy as NP
        from scipy import linalg as LA
        m, n = data.shape
        # mean center the data
        data -= np.mean(data, axis=0)
        # calculate the covariance matrix
        R = NP.cov(data, rowvar=False)
        # calculate eigenvectors & eigenvalues of the covariance matrix
        # use 'eigh' rather than 'eig' since R is symmetric,
        # the performance gain is substantial
        evals, evecs = LA.eigh(R)
        # sort eigenvalue in decreasing order
        idx = NP.argsort(evals)[::-1]
        evecs = evecs[:, idx]
        # sort eigenvectors according to same index
        evals = evals[idx]
        # select the first n eigenvectors (n is desired dimension
        # of rescaled data array, or dims_rescaled_data)
        evecs = evecs[:, :dims_rescaled_data]
        # carry out the transformation on the data using eigenvectors
        # and return the rescaled data, eigenvalues, and eigenvectors
        return NP.dot(evecs.T, data.T).T, evals, evecs

    def pca_svd(X, num=2):
        X = X - np.mean(X, axis=0)
        [u, s, v] = np.linalg.svd(X)
        v = v.T[:, :num]
        return np.dot(X, v)
In sum there are 4 different implementations. The result of the first is:
    # Training
    array([[-1.01031446,  6.71282428],
           [-3.03212724,  1.64169381],
           [ 2.95288108,  1.44413258],
           ...,
           [-1.15329784,  2.83978701],
           [-8.02795144,  2.12452378],
           [ 9.83911408,  3.2389573 ]])

    # Testing
    array([[ 5.15053345,  6.79771421],
           [ 1.84247302,  0.58932415],
           [ 1.66957196,  3.89696398],
           ...,
           [ 5.22253275,  1.74628625],
           [-8.2209684 ,  0.32435677],
           [11.00041468,  4.62978653]])
For the second:
    # train
    array([[ 4.94267554,  6.22892054],
           [ 6.96448832,  1.15779007],
           [ 0.97948   ,  1.92803631],
           ...,
           [ 5.08565892,  3.32369075],
           [11.96031252,  1.64062004],
           [ 5.906753  ,  2.75505357]])

    # test
    array([[ 1.39046844,  7.2344142 ],
           [ 1.91759199,  0.15262416],
           [ 2.09049305,  4.33366397],
           ...,
           [ 1.46246774,  2.18298624],
           [11.98103342,  0.76105676],
           [ 7.24034966,  4.19308654]])
For the third:
    # train
    array([[ 4.94267554,  6.22892054],
           [ 6.96448832,  1.15779007],
           [ 0.97948   ,  1.92803631],
           ...,
           [ 5.08565892,  3.32369075],
           [11.96031252,  1.64062004],
           [ 5.906753  ,  2.75505357]])

    # test
    array([[-1.39046844,  7.2344142 ],
           [ 1.91759199,  0.15262416],
           [ 2.09049305,  4.33366397],
           ...,
           [-1.46246774,  2.18298624],
           [11.98103342,  0.76105676],
           [-7.24034966,  4.19308654]])
for the svd:
    # train
    array([[ 4.94267554,  6.22892054],
           [ 6.96448832,  1.15779007],
           [ 0.97948   ,  1.92803631],
           ...,
           [ 5.08565892,  3.32369075],
           [11.96031252,  1.64062004],
           [-5.906753  ,  2.75505357]])

    # test xt2
    array([[ 1.39046844,  7.2344142 ],
           [ 1.91759199,  0.15262416],
           [ 2.09049305,  4.33366397],
           ...,
           [ 1.46246774,  2.18298624],
           [11.98103342,  0.76105676],
           [ 7.24034966,  4.19308654]])
At last the PC's from scikit:
    # train
    array([[ 4.9426755 ,  6.22891427],
           [ 6.9644884 ,  1.15779638],
           [ 0.97948002,  1.92802868],
           ...,
           [ 5.08565916,  3.32364585],
           [11.96031245,  1.64060628],
           [-5.90675305,  2.7550444 ]])

    # test
    array([[-1.39046854,  7.23440228],
           [ 1.91759194,  0.1526275 ],
           [ 2.09049303,  4.33366458],
           ...,
           [-1.46246766,  2.18299325],
           [11.98103337,  0.76105147],
           [-7.24034972,  4.19309378]])
One can see that the results differ only in sign, but for classification this is a problem. Additionally, there must be a mistake in the first implementation, but I cannot find it.
Does anybody have an idea?
PS: If I increase the number of PCs in the projection, the sign problem gets worse.
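For what it's worth, eigenvectors are only defined up to sign (if v is an eigenvector, so is -v), so differing signs between implementations are expected; the usual remedy is to impose a sign convention. A minimal sketch on random data (the convention mirrors the svd_flip idea scikit-learn uses internally):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X = X - X.mean(axis=0)

# Two decompositions of the same covariance matrix can legitimately
# return eigenvectors that differ by a factor of -1.
evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(evals)[::-1]
evecs = evecs[:, order]

# Sign convention: flip each eigenvector so that its largest-magnitude
# entry is positive. Then v and -v normalize to the same vector.
signs = np.sign(evecs[np.abs(evecs).argmax(axis=0), range(evecs.shape[1])])
evecs_fixed = evecs * signs
print(evecs_fixed[:, 0])
```

Applying the same convention to both your projection matrix and scikit-learn's components should make the projected coordinates agree up to numerical noise.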

Sorting Images from particular folder with similar images
I have selected some images of a similar type and have thousands of images of all types in another folder. I need to move the images from folder A to folder B that match the data images I selected as training data. I tried using SIFT and SURF, but that is not working for me because it skips some images that do match. Please guide me or give me a hint on what I should do for this in Python.
    import numpy as np
    import cv2
    from matplotlib import pyplot as plt

    img1 = cv2.imread('fig_16125.jpg',0)  # queryImage
    img2 = cv2.imread('fig_36823.jpg',0)  # trainImage

    # Initiate SIFT detector
    sift = cv2.xfeatures2d.SIFT_create()

    # find the keypoints and descriptors with SIFT
    kp1, des1 = sift.detectAndCompute(img1,None)
    kp2, des2 = sift.detectAndCompute(img2,None)

    # BFMatcher with default params
    bf = cv2.BFMatcher()
    matches = bf.knnMatch(des1,des2, k=2)

    # Apply ratio test
    good = []
    for m,n in matches:
        if m.distance < 0.75*n.distance:
            good.append([m])

    # cv2.drawMatchesKnn expects list of lists as matches.
    img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)
    plt.imshow(img3),plt.show()
This is the code I am referring to, from "Sift, surf and ORB python output":

What's the loss function of this model?
Each text description is a sentence. For example, 1000 text descriptions form a dataset, and there are 3,000 non-repeating words in the dataset. I then encode each word in the dataset with an id from 0 to 3000:

    # a text description: I want to read books → 1,8,399,32,68

Each number is the id of a word.
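A minimal sketch of this kind of id encoding (the tiny corpus and the resulting ids are invented for illustration):

```python
# Every distinct word in the corpus gets an integer id, assigned in order
# of first appearance; a sentence then becomes a list of ids.
corpus = ["i want to read books", "i want to sleep"]
vocab = {}
for sentence in corpus:
    for word in sentence.split():
        vocab.setdefault(word, len(vocab))

encoded = [[vocab[w] for w in s.split()] for s in corpus]
print(vocab)     # {'i': 0, 'want': 1, 'to': 2, 'read': 3, 'books': 4, 'sleep': 5}
print(encoded)   # [[0, 1, 2, 3, 4], [0, 1, 2, 5]]
```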
As the picture shows, I have many text descriptions with 300 words in each of them, and batch_size is 100. I use a convolutional neural network to output a matrix P[100,300,50], where 50 is the dimensionality of each word in a text description. Finally, I use tf.reduce_mean(P, 1) to get a matrix M[100,50]. Each vector in M is the representation vector of a text description.

    self.Text_Description = tf.placeholder(tf.int32, [config.batch_size, 300], name='Ta')  # 100*300

    def conv(self):
        W2 = tf.Variable(tf.truncated_normal([2, int(config.embed_size), 1, 50], stddev=0.3))
        convA = tf.nn.conv2d(self.Text_Description, W2, strides=[1, 1, 1, 1], padding='VALID')  # (100*299*1*50)
        tf.squeeze(convA)  # (100*299*50)
        hA = tf.tanh(tf.squeeze(convA))
        att1 = tf.reduce_mean(hA, 1)  # (100*50)
        return att1
What's the loss function of this model?

Use both losses on a subnetwork of combined networks
I am trying to stack two networks together and want to calculate the loss of each network separately. For example, in the image below, the loss of LSTM1 should be (Loss1 + Loss2) and the loss of LSTM2 just (Loss2).
I implemented a network like the one below with that idea, but I have no idea how to compile and run it.
    def build_lstm1():
        x = Input(shape=(self.timesteps, self.input_dim,), name='input')
        h = LSTM(1024, return_sequences=True)(x)
        scores = TimeDistributed(Dense(self.input_dim, activation='sigmoid', name='dense'))(h)
        LSTM1 = Model(x, scores)
        return LSTM1

    def build_lstm2():
        x = Input(shape=(self.timesteps, self.input_dim,), name='input')
        h = LSTM(1024, return_sequences=True)(x)
        labels = TimeDistributed(Dense(self.input_dim, activation='sigmoid', name='dense'))(h)
        LSTM2 = Model(x, labels)
        return LSTM2

    lstm1 = build_lstm1()
    lstm2 = build_lstm2()
    combined = Model(inputs=lstm1.input, outputs=[lstm1.output, lstm2(lstm1.output)])

How to use trained neural network as function using MATLAB coder?
I've trained a simple neural network that just multiplies 4 numbers and gives 1 number as output.
(output(x0) = in1(x0)*in2(x0)*in3(x0)*in4(x0)).
My neural net has 4 inputs and 1 output; [10 10] is the hidden layer. I used 'genFunction' to generate a .m file from the network, then I used MATLAB Coder to generate a C++ function. I generated the code with the following input types:
My problem is that when I test the C++ code, it only gives the first 2 samples of the output.
I store my inputs in a std::vector which has a size of 400 (each input has size 100).
I've done the following so far (no desirable output though):
    std::vector<double> Multiplier(std::vector<double>& input)
    {
        double* X_data = new double[input.size()];
        X_data = vec2ar(input);
        int X_size[2];
        X_size[0] = 4;
        X_size[1] = 100;
        double* Y_data = new double[input.size()];
        int Y_size[100];
While the original was:
    void multiplier(const double X_data[], const int X_size[2], double Y_data[], int Y_size[2])
    {
        double Xp1_data[800];
        int Xp1_size[2];
        int j;
        double a1_data[2000];
        int coffset;
        int a1_size[2];
        int boffset;
        double tmp_data[2000];
        int k;
        double b_a1_data[2000];
And for getting the output:
    std::vector<double> output;
    output = ar2vec(Y_data);
    return output;
All I want is: given 4 vectors, the function should simply multiply the corresponding samples as shown in the description (in my case we have 4 vectors of size 100, and we expect an output of size 100).
And for ar2vec and vec2ar functions:
    std::vector<double> ar2vec(double arr[])
    {
        std::vector<double> vec;
        copy(&arr[0], &arr[100], std::back_inserter(vec));
        return vec;
    }

    double* vec2ar(std::vector<double> vec)
    {
        double* arr = new double[vec.size()];
        copy(vec.begin(), vec.end(), arr);
        return arr;
    }
How to fix my problem?

How to plot a graph from multiple independent variables and one dependent variable in Python [multiple linear regression]
I am new to machine learning and facing a situation in which I need to remove multiple independent variables in multiple linear regression. The steps I have gone through: 1) read the dataset; 2) separate it into X and Y; 3) encode the categorical data, as the dataset contains columns like prof rank, profession, etc.; 4) remove dummy variables; 5) inspect the OLS regression results.
I had 7 independent variables; after OLS I have 6, having removed those with P > 0.05, i.e. a p-value greater than the 0.05 significance level.
Can you suggest the steps to plot the graph after removing all unnecessary independent variables, as attached in the image? How do I get down to just ONE independent variable from all of these?
How do I check multicollinearity using Python? What is VIF and how do I use it to detect multicollinearity?
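For reference, the VIF of column i is 1/(1 - R²), where R² comes from regressing column i on the remaining columns; values well above roughly 5-10 are usually taken to flag multicollinearity. A minimal NumPy sketch on invented data (statsmodels has a ready-made `variance_inflation_factor`, but the computation is simple enough to show directly):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake design matrix: x2 is almost a copy of x0, so both should show a high VIF.
x0 = rng.normal(size=100)
x1 = rng.normal(size=100)
x2 = x0 + rng.normal(scale=0.05, size=100)
X = np.column_stack([x0, x1, x2])

def vif(X, i):
    """VIF_i = 1 / (1 - R^2) of regressing column i on the other columns."""
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([others, np.ones(len(y))])   # add an intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1 - resid.var() / y.var()
    return 1 / (1 - r2)

vifs = [vif(X, i) for i in range(X.shape[1])]
print(vifs)  # columns 0 and 2 should be large, column 1 near 1
```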
Thanks in advance. Sorry for any grammar mistakes.

Multiplying uneven datasets
I am trying to interact the log of distance with the tariff rates of different countries, but my data frames have slightly different dimensions. The first data frame is
'data.frame': 265 obs. of 32 variables:
and the second data frame is
'data.frame': 263 obs. of 32 variables:
I have been unable to find where they differ, but my professor says that in Stata there are workarounds that should also exist in R. What workarounds would you use?

Finding RSS and R-squared
I am pretty new to all this (so please be merciful).
I've imported a CSV file into Python as shown below:
    data = pd.read_csv("sales.csv")
    data.head(10)
I then fit a linear regression model on the sales variable, using the variables shown in the results as predictors. The results are summarised below:
    model_linear = smf.ols('sales ~ month + weekend + holiday + prod_function + prod_regime + prod_listprice + discount + stockout', data=data).fit()
    print(model_linear.summary())
Given the above, I want to find the RSS and R-squared of the model. This is what I did:
    RSS = np.sum(model_linear.resid**2)
    print(RSS)
    model_linear.rsquared
Is this the right approach to doing this?
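As a sanity check on the relationship between the two quantities, R² = 1 - RSS/TSS, which can be verified on invented data without statsmodels:

```python
import numpy as np

# Toy data standing in for the sales regression; the point is only the
# RSS / R-squared relationship, not the actual model.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 3 * x + rng.normal(scale=0.5, size=50)

A = np.column_stack([x, np.ones_like(x)])      # slope + intercept
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ coef                           # same role as model.resid

rss = np.sum(resid ** 2)                       # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)              # total sum of squares
r_squared = 1 - rss / tss
print(rss, r_squared)
```

If `np.sum(model_linear.resid**2)` and `model_linear.rsquared` satisfy the same identity against the total sum of squares of `sales`, the computation is consistent.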

Evolving a classifier from a simple data set
We have been given a dataset:
    00000 0
    00001 0
    00010 0
    00011 1
    00100 0
    00101 1
    00110 1
    00111 0
    01000 0
    01001 1
    01010 1
    01011 0
    01100 1
    01101 0
    01110 0
    01111 1
    10000 0
    10001 1
    10010 1
    10011 0
    10100 1
    10101 0
    10110 0
    10111 1
    11000 1
    11001 0
    11010 0
    11011 1
    11100 0
    11101 1
    11110 1
    11111 0
The left side shows the 5 input variables, followed by the classification of those variables.
The rule here is that if there is an even number of 1's, the classification is 1; with an odd number, 0.
We have to evolve a classifier from this data set.
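As a concrete starting point, the raw dataset above can be parsed into bit vectors plus labels before any evolution happens; a minimal sketch (only the first four rows are pasted in here):

```python
# First few rows of the dataset above; in practice paste in all 32.
raw = """\
00000 0
00001 0
00010 0
00011 1
"""

data = []
for line in raw.splitlines():
    bits, label = line.split()
    data.append(([int(b) for b in bits], int(label)))

print(data[3])  # ([0, 0, 0, 1, 1], 1)
```

With the data in this shape, a GA individual can be any structure that maps a 5-bit vector to 0 or 1, and its fitness is simply how many of the 32 rows it classifies correctly.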
I have an understanding of genetic algorithms but have no idea where to start with this. So far I've just created classes for the data and for the rules:
    class Rule:
        def __init__(self):
            self.conditions = []*condition_length
            self.out = 0;

    class Data:
        def __init__(self, var_string, classification):
            self.variables = list(var_string)
            self.classification = 0;
            print("data added: " + var_string + " class:" + classification)
I don't need code itself, just an idea of where to start with such a problem and how it ties in to evolutionary computing. Thanks in advance.

Is this language generic/mighty enough to be used for a generic game AI?
I want to develop a genetic program that can solve generic problems like surviving in a computer game. Since this is for fun/education I do not want to use existing libraries.
I came up with the following idea:
The input is an array of N integers. The genetic program consists of up to N ASTs, each of which takes input from some of the array elements and writes its output to a single specific array element.
The ASTs can be arbitrarily complex, consist only of the four arithmetic operators (+, -, *, /), and can operate on constants and fixed elements of the given array (no random access).
So for N=3, we have 3 ASTs, for example:
    a[0] = {a[0] + 1}
    a[1] = {a[0] + a[1]}
    a[2] = {a[0] * 123 + a[1]}
The N ASTs are executed one after another, and this is repeated indefinitely.
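A minimal sketch of how such a program could be interpreted, using the N = 3 example above (plain Python functions stand in for the ASTs; the starting array of zeros is an arbitrary choice):

```python
# Each AST is modelled as a function from the array to one new value,
# written back to its own slot; ASTs run in order, so later ones see
# the updates made earlier in the same pass.
asts = [
    lambda a: a[0] + 1,
    lambda a: a[0] + a[1],
    lambda a: a[0] * 123 + a[1],
]

def step(a):
    for i, f in enumerate(asts):
        a[i] = f(a)
    return a

a = [0, 0, 0]
for _ in range(2):   # two passes of the infinite loop
    step(a)
print(a)  # [2, 3, 249]
```

Note that as defined (only +, -, *, / and no conditionals or random access), each pass is a fixed arithmetic update of the array, which is relevant to the Turing-completeness question.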
Now my question: is this system "mighty" enough (Turing-complete?), or will it fail to solve some kinds of problems common to game AI?

Password cracking with evolution algorithm
In school we were given the task of cracking the Skipjack cipher. We got the source text, the encrypted text, and even a password. The only clue we got from the teacher was: "Try to use evolution and genetic algorithms for a start." I was thinking about it the whole day and I have no idea how I could use these for this task. Can someone give me a hint?