numpy inner product and ValueError
I have a problem with the inner product of two vectors. I define subtract and distance with these shapes:
subtract = np.zeros((3,1), dtype=int)
distance = np.zeros((7,))
Then, when I perform this operation:
subtract = np.subtract(pix[i,j],cluster[k])
distance[k] = np.inner(subtract,np.transpose(subtract))
I get this error:
distance[k] = np.inner(subtract,np.transpose(subtract))
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
There is no problem with subtract itself; I can print both it and its transpose.
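For reference, a minimal sketch of one fix, with made-up pixel and cluster values: np.inner contracts the last axis of each argument, so for the squared distance between two vectors you can flatten subtract and drop the transpose entirely.

```python
import numpy as np

# Hypothetical stand-ins for pix[i, j] and cluster[k], both shape (1, 3).
pix_ij = np.array([[10, 20, 30]])
cluster_k = np.array([[1, 2, 3]])

# np.inner contracts the LAST axis of each argument; transposing one side
# to (3, 1) makes the last axes 3 and 1, which is exactly the ValueError.
subtract = np.subtract(pix_ij, cluster_k).ravel()  # shape (3,)
distance_k = np.inner(subtract, subtract)          # scalar squared distance
print(distance_k)  # 9**2 + 18**2 + 27**2 = 1134
```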
See also questions close to this topic

How to detach Python child process on Windows (without setsid)?
I'm migrating some process code to Windows which worked well on POSIX. Very simply put: the code to launch a subprocess and immediately detach will not work, because setsid() is not available:

import os, subprocess, sys
p = subprocess.Popen([sys.executable, '-c', "print 'hello'"], preexec_fn=os.setsid)

I can remove the use of setsid, but then the child process ends when the parent ends. My question is: how do I achieve the same effect as setsid on Windows, so that the child process lifetime is independent of the parent's? I'd be willing to use a particular Python package if one exists for this sort of thing. I'm already using psutil, for example, but I didn't see anything in it that could help me.
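A hedged sketch of the usual Windows-side approach: pass creation flags to Popen so the child is detached from the parent's console and process group. The hex values below are the documented Win32 constants (on Windows, Python 3.3+ also exposes them as subprocess.DETACHED_PROCESS and subprocess.CREATE_NEW_PROCESS_GROUP).

```python
import subprocess
import sys

# Win32 creation flags; on Windows these also exist as
# subprocess.DETACHED_PROCESS / subprocess.CREATE_NEW_PROCESS_GROUP.
DETACHED_PROCESS = 0x00000008
CREATE_NEW_PROCESS_GROUP = 0x00000200

if sys.platform == "win32":
    # The child gets its own process group and no inherited console,
    # so its lifetime is independent of the parent's.
    p = subprocess.Popen(
        [sys.executable, "-c", "print('hello')"],
        creationflags=DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP,
        close_fds=True,
    )
```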
Recursively finding a base sequence
findStartRec(goal, count) recursively searches forward from an initial start value of 0 and returns the smallest integer start value whose sequence reaches or exceeds the goal. The preconditions are goal >= 0 and count > 0. If the double (x * 2) and add 5 (+ 5) sequence starting at 0 cannot reach the goal in count steps, then try starting at 1. Continue this process until the program finds a starting value N that does reach or exceed the goal in count steps, and return that start value.
Example:
findStartRec( 100, 3 ) returns '9'
Here is what I have come up with so far:
def findStartRec(goal, count, sequence=0, itter=0):
    if sequence == goal and count == 0:
        print("Sequence: ", sequence, "Itter: ", itter)
        return sequence, itter
    else:
        while count > 0:
            sequence = (itter * 2) + 5
            count = count + 1
            #return findStartRec(goal, count + 1, sequence, itter)
        else:
            return findStartRec(goal, count, sequence, itter + 1)
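For comparison, a hedged sketch of one way to meet the spec (names are my own): simulate count steps of the double-then-add-5 sequence from each candidate start, and recurse on start + 1 until the final value reaches or exceeds the goal.

```python
def find_start_rec(goal, count, start=0):
    # Apply x -> x * 2 + 5 exactly `steps` times, recursively.
    def step(value, steps):
        return value if steps == 0 else step(value * 2 + 5, steps - 1)

    if step(start, count) >= goal:
        return start
    return find_start_rec(goal, count, start + 1)

print(find_start_rec(100, 3))  # 9, matching the example in the question
```

Starting at 9 the sequence runs 9 -> 23 -> 51 -> 107, which is the first start whose third step reaches 100.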

Data preparation with python
I'm making my debut on this forum, so I apologize for any non-compliance with its conventions. I have a text file that I want to divide into four parts, as indicated in the code. It always generates errors, and I would really appreciate your help. Thank you.
# First import pandas and the regex module
import pandas as pd
import numpy as np
import re

data = open("Discussion.txt", encoding="utf8")
contenu = data.read()
data.close()
print(contenu)

# Read the .txt file into a string
data = open("Discussion.txt", encoding="utf8")
string = data.read()
data.close()

# Split separate lines into list of strings
splitstring = string.splitlines()

# For each list item find the data needed (with regex or indexing)
# and assign to a dictionary
df = {}
for i in range(len(splitstring)):
    match = re.search(r'(.* .*)  (.*): (.*)', splitstring[1])
    line = {
        'Date'   : splitstring[i][:10],
        'Time'   : match.group(1),
        'Number' : match.group(2),
        'Text'   : match.group(3)}
    df[i] = line

AttributeError                            Traceback (most recent call last)
<ipython-input-54-3a1f0fdf7c6> in <module>()
      8     line = {
      9         'Date' : splitstring[i][:10],
---> 10         'Time' : match.group(1),
     11         'Number' : match.group(2),
     12         'Text' : match.group(3)}
AttributeError: 'NoneType' object has no attribute 'group'

# Convert dictionary to pandas dataframe
dataframe = pd.DataFrame(df).T

# Finally send to csv
dataframe.to_csv(filepath)

  File "<ipython-input-6-2b1b4e00c433>", line 3
    Finally send to csv
    ^
IndentationError: unexpected indent
Here is a preview of the content printed by print(contenu), as an image:
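Since the file layout is only shown as an image, here is a hedged sketch with a made-up chat-style line (the real Discussion.txt format may differ): the key points are that re.search returns None for lines that do not match (so .group must be guarded), and that the loop should search splitstring[i], not splitstring[1].

```python
import re

# Hypothetical line format; adjust the pattern to the real file.
line = "01/02/2018, 10:15 - +33612345678: bonjour"
match = re.search(r'(\d{2}/\d{2}/\d{4}), (\d{2}:\d{2}) - (.*?): (.*)', line)

if match:  # guard: re.search returns None when the pattern does not match
    record = {'Date': match.group(1), 'Time': match.group(2),
              'Number': match.group(3), 'Text': match.group(4)}
else:
    record = None  # e.g. blank lines or system messages

print(record)
```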

How do I use numpy vectorize to iterate through a two-dimensional vector?
I am trying to use numpy.vectorize to iterate over a (2x5) matrix which contains two vectors representing the x- and y-values of coordinates. Each coordinate (x- and y-value) is to be fed to a function returning a (1x1) vector per iteration, so that in the end the result should be a (1x5) vector. My problem is that instead of iterating through each element, I want the algorithm to iterate through both vectors simultaneously, so that it picks up the x- and y-values of each coordinate in parallel and feeds them to the function.
data = np.transpose(np.array([[1, 2], [1, 3], [2, 1], [1, 1], [2, 1]]))
th_ = np.array([[1, 1]])
th0_ = 2

def positive(x, th=th_, th0=th0_):
    if signed_dist(x, th, th0)[0][0] > 0:
        return np.array([[1]])
    elif signed_dist(x, th, th0)[0][0] == 0:
        return np.array([[0]])
    else:
        return np.array([[-1]])

positive_numpy = np.vectorize(positive)
results = positive_numpy(data)
Reading the numpy documentation did not really help, and I want to avoid large workarounds for the sake of computation time. Thankful for any suggestions!
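A hedged sketch of one way to get per-coordinate calls: np.vectorize accepts a signature argument that feeds whole sub-arrays to the function instead of scalars. Note that signed_dist is not shown in the question, so a plausible version is assumed here, and th0 is given a negative sign purely so the sample output is not all ones.

```python
import numpy as np

th_ = np.array([[1, 1]])   # values taken from the question
th0_ = -2                  # sign assumed here, for illustration only

def signed_dist(x, th, th0):
    # Hypothetical helper (not shown in the question): th . x + th0.
    return th @ x.reshape(-1, 1) + th0

def positive(x, th=th_, th0=th0_):
    return np.sign(signed_dist(x, th, th0)[0][0])

data = np.array([[1, 2], [1, 3], [2, 1], [1, 1], [2, 1]]).T  # shape (2, 5)

# signature='(d)->()' hands positive() one length-2 coordinate per call,
# instead of scalarizing every element of the matrix.
vec = np.vectorize(positive, signature='(d)->()')
results = vec(data.T)  # one (x, y) pair per row of data.T
print(results)         # [1. 1. 1. 0. 1.]
```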

How to get data from python datatype returned in MATLAB?
I have a python script like so:
import numpy as np

def my_function(x):
    return np.array([x])
And I have a MATLAB script to call it:
clear all; clc;
if count(py.sys.path,'') == 0
    insert(py.sys.path,int32(0),'');
end
myfunction_results = py.python_matlab_test.my_function(8);
display(myfunction_results);
And it displays:
myfunction_results =

  Python ndarray with properties:

           T: [1×1 py.numpy.ndarray]
        base: [1×1 py.NoneType]
      ctypes: [1×1 py.numpy.core._internal._ctypes]
        data: [1×8 py.buffer]
       dtype: [1×1 py.numpy.dtype]
       flags: [1×1 py.numpy.flagsobj]
        flat: [1×1 py.numpy.flatiter]
        imag: [1×1 py.numpy.ndarray]
    itemsize: 8
      nbytes: 8
        ndim: 1
        real: [1×1 py.numpy.ndarray]
       shape: [1×1 py.tuple]
        size: 1
     strides: [1×1 py.tuple]

    [8.]
But I do not know how to actually get the data out of this object. The type is py.numpy.ndarray, but I obviously want to use it in MATLAB as an array, matrix, integer, or similar. How do I convert it to one of those types? I've been looking at these:
https://www.mathworks.com/help/matlab/examples/callpythonfrommatlab.html
https://www.mathworks.com/matlabcentral/answers/216498passingnumpyndarrayfrompythontomatlab
https://www.mathworks.com/help/matlab/matlab_external/usematlabhandleobjectsinpython.html
Some of the answers suggest writing to a .mat file. I DO NOT want to write to a file. This needs to be able to run in real time, and writing to a file would make it very slow for obvious reasons. There seems to be an answer here: "Converting" Numpy arrays to Matlab and vice versa, which shows
shape = cellfun(@int64, cell(myfunction_results.shape));
ls = py.array.array('d', myfunction_results.flatten('F').tolist());
p = double(ls);
But I must say that is very cumbersome... is there an easier way?

How to solve ValueError: cannot reindex from a duplicate axis in python
I have a dataset of all-categorical columns, from which I need to find the proportion of the target_class (i.e. 1) within each level of a categorical variable, and then append the correlation of each level with the target_class by dummying the categorical variable. Below is an example of the input data and the expected output:
# Input Data:
df_data = pd.DataFrame(
    {'production' : ['1101100000','1101100000','100100000','100100000','1101100000',
                     '1101100000','1001000000','1101100000','1101100000','1101100000'],
     'enc_svod' : ['Free','Free','Pay','','Pay','Free','Free','','','Pay'],
     'status' : [1,0,0,0,1,0,0,0,0,1]})
Code to find proportions and correlation with target_class:
cat_cols = ['production','enc_svod']

# Code to find proportions and correlation with target_class:
# Now traverse through each column and calculate correlation and generate metrics
cat_count = 0
cat_metrics_df = pd.DataFrame()
for each_col in cat_cols:
    df_temp = pd.DataFrame()
    df_single_col_data = df_data[[each_col]]
    cat_count += 1

    # Calculate uniques and nulls in each column to display in log file.
    uniques_in_column = len(df_single_col_data[each_col].unique())
    nulls_in_column = df_single_col_data.isnull().sum()

    print('Working on column %s, converting to dummies and finding correlation with target' % (each_col))
    df_categorical_attribute = pd.get_dummies(df_single_col_data[each_col].astype(str), dummy_na=True, prefix=each_col)
    df_categorical_attribute = df_categorical_attribute.loc[:, df_categorical_attribute.var() != 0.0]  # Drop columns with 0 variance.
    df_temp['correlation'] = df_categorical_attribute.corrwith(df_data['status'])

    try:
        # Calculate Index : Proportions of 1's within each CAT level
        frames = [df_single_col_data, df_data['status']]
        df_proportions = pd.concat(frames, axis=1)
        df_proportions = df_proportions.fillna('nan').groupby(each_col, as_index=True).mean()
        df_proportions.index = [str(df_proportions.index.name) + '_' + str(x) for x in df_proportions.index.values]
        df_temp['Index'] = df_temp.join(df_proportions)['status']
        df_temp['Attribute'] = str(each_col)
        cat_metrics_df = cat_metrics_df.append(df_temp)
    except ValueError:
        print("Error for column %s:" % (each_col))
        continue
The reason I use try/except here is that for some variables there is a ValueError, as below:
Traceback (most recent call last):
  File "/user/data_processing_functions.py", line 443, in metrics_categorical
    df_temp['Index'] = df_temp.join(df_proportions)['disco_status']
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2331, in __setitem__
    self._set_item(key, value)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2397, in _set_item
    value = self._sanitize_column(key, value)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2547, in _sanitize_column
    value = reindexer(value)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2539, in reindexer
    raise e
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 2534, in reindexer
    value = value.reindex(self.index)._values
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/series.py", line 2426, in reindex
    return super(Series, self).reindex(index=index, **kwargs)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2515, in reindex
    fill_value, copy).__finalize__(self)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2533, in _reindex_axes
    copy=copy, allow_dups=False)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 2627, in _reindex_with_indexers
    copy=copy)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/internals.py", line 3886, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/user/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2836, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis
For some columns there are more unique values (19) than the number of categories left after:
df_categorical_attribute = df_categorical_attribute.loc[:, df_categorical_attribute.var() != 0.0]# Drop columns with 0 variance.
This happens when I run it on the server, which has pandas version 0.20.3, whereas on my local machine it is the latest one, 0.23.4. I am not sure if this is the reason, or if there is some other cause for this error. I used try/except so that when such a ValueError occurs, the loop skips that column. I am guessing it is caused by spaces somewhere in the full data (2.5 million rows x 1200 columns; on my local machine I am using a 50000-row sample, which might not capture those cases).
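For what it's worth, the error can be reproduced in a few lines independent of this dataset: assigning a Series whose index contains duplicate labels into a DataFrame forces a reindex, which is exactly what fails. One plausible culprit in the loop above is df_proportions ending up with duplicate index labels after the string-prefixing step. A minimal sketch:

```python
import pandas as pd

s = pd.Series([1, 2], index=['a', 'a'])        # duplicate index labels
df = pd.DataFrame({'x': [10]}, index=['a'])

try:
    df['y'] = s    # alignment must reindex s onto df.index -> ValueError
    msg = ''
except ValueError as exc:
    msg = str(exc)

print(msg)  # mentions reindexing with duplicate labels
```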
PowerBI Transpose Table

Excel  transpose/reorganize a worksheet data
I have an excel file that looks like this:
         **T1 T2 T3 T4 T5**
1/2/2018 **A** 2  2  1  2  1
         **B** 0  0  0  0  0
         **C** 2  2  1  2  1
1/3/2018 **A** 2  2  1  2  1
         **B** 0  0  0  0  2
         **C** 2  2  1  2  1
And I want it to look like this: final required table format
Name  Time            col1       val  col2
A     12/1/2010 1:00  extra_col  2    other_col
A     12/1/2010 2:00  extra_col  2    other_col
A     12/1/2010 3:00  extra_col  1    other_col
A     12/1/2010 4:00  extra_col  2    other_col
A     12/1/2010 5:00  extra_col  1    other_col
B     12/1/2010 1:00  extra_col  0    other_col
B     12/1/2010 2:00  extra_col  0    other_col
B     12/1/2010 3:00  extra_col  0    other_col
B     12/1/2010 4:00  extra_col  0    other_col
B     12/1/2010 5:00  extra_col  0    other_col
C     12/1/2010 1:00  extra_col  2    other_col
C     12/1/2010 2:00  extra_col  2    other_col
C     12/1/2010 3:00  extra_col  1    other_col
C     12/1/2010 4:00  extra_col  2    other_col
C     12/1/2010 5:00  extra_col  1    other_col
The values T1 to T5 are times (the first to fifth hour of the day, for example). Paste Special > Transpose won't give me the required format. I could transpose and drag the data for one date at a time, but that is time-consuming when almost 90 days are involved, and there are thousands of columns to handle manually. Does anyone have a better idea for a bulk transpose into the said format? I am basically stuck because I don't know which keywords to research; it would be great to get some keywords to move further with, if not the method itself.
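If stepping outside Excel is acceptable, the keywords to research are "unpivot" (Power Query) and "melt" (pandas). A hedged sketch with a small stand-in for one date block of the sheet:

```python
import pandas as pd

# Stand-in for one date block; the real file has ~90 dates.
wide = pd.DataFrame(
    {'Date': ['1/2/2018'] * 3,
     'Name': ['A', 'B', 'C'],
     'T1': [2, 0, 2], 'T2': [2, 0, 2], 'T3': [1, 0, 1],
     'T4': [2, 0, 2], 'T5': [1, 0, 1]})

# melt turns each T-column into its own row, giving one
# (Date, Name, hour, value) record per cell: the required long format.
tidy = wide.melt(id_vars=['Date', 'Name'], var_name='Hour', value_name='val')
tidy = tidy.sort_values(['Name', 'Hour']).reset_index(drop=True)
print(tidy.head())
```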

Transpose Data from Long to Wide
I am trying to transpose data in SAS from a long format to a wide format. The problem I'm having is that I have multiple columns that I'm trying to transpose. I have a few example datasets below to demonstrate what I'm trying to do. One way to handle this could be to transpose individual columns and then merge at the end, but the actual dataset I'm going to be doing this on is going to be very large (tens of thousands of columns), so that approach is pretty unfeasible.
Below is the data I'm starting with:
data current_state;
input id $ att_1 $ att_2 $ att_3 $ att_4 $ att_5 $ Dollars;
datalines;
1 d234 d463 d213 d678 d435 50
2 d213 d690 d360 d145 d269 25
3 d409 d231 d463 d690 d609 10
;
Below is what I would want the outcome of the transpose to be:
data desired_state;
input id $ d145 $ d213 $ d231 $ d234 $ d269 $ d360 $ d409 $ d435 $ d463 $ d609 $ d678 $ d690;
datalines;
1 0 50 0 50 0 0 0 50 0 0 50 0
2 25 25 0 0 25 25 0 0 0 0 0 25
3 0 0 10 0 0 0 10 0 10 10 0 10
;
I have attempted the following, which isn't giving me the desired output.
proc transpose data=current_state out=test1;
    by id;
    id att_1 att_2 att_3 att_4 att_5;
    var Dollars;
run;
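Not SAS, but for comparison, the same long-to-wide reshape can be sketched in pandas (melt the attribute columns, then pivot the attribute values into columns carrying Dollars); this may help confirm what the desired output looks like:

```python
import pandas as pd

# The example data from current_state.
df = pd.DataFrame({'id': [1, 2, 3],
                   'att_1': ['d234', 'd213', 'd409'],
                   'att_2': ['d463', 'd690', 'd231'],
                   'att_3': ['d213', 'd360', 'd463'],
                   'att_4': ['d678', 'd145', 'd690'],
                   'att_5': ['d435', 'd269', 'd609'],
                   'Dollars': [50, 25, 10]})

# One row per (id, attribute value), then pivot values into columns,
# filling 0 where an id never carries that attribute value.
tall = df.melt(id_vars=['id', 'Dollars'], value_name='att')
wide = (tall.pivot_table(index='id', columns='att', values='Dollars',
                         aggfunc='sum', fill_value=0)
            .reset_index())
print(wide)
```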

ValueError: Found array with 0 sample(s) (shape=(0, 3072)) while a minimum of 1 is required
I have been working on an image classifier using k-NN on the cats and dogs dataset from Kaggle. The following snippet of the code produces the error given in the title of the question.
sp = SimplePreprocessor(32, 32)
sdl = SimpleDatasetLoader(preprocessors=[sp])
(data, labels) = sdl.load(imagePaths, verbose=500)
data = data.reshape((data.shape[0], 3072))
SimplePreprocessor and SimpleDatasetLoader are two helpers which compress each image to 32x32 and load the dataset, respectively. Now, the 32x32x3 images have to be flattened to an array with shape (3000, 3072). What change should be made in the fourth line of the code?
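The reshape line itself looks fine, as a quick check with stand-in data shows; shape (0, 3072) in the error suggests sdl.load received an empty imagePaths list (typically a wrong dataset path), so the fix is upstream of the fourth line:

```python
import numpy as np

# Stand-in for 3000 loaded 32x32 RGB images.
data = np.zeros((3000, 32, 32, 3), dtype=np.uint8)
flat = data.reshape((data.shape[0], 32 * 32 * 3))
print(flat.shape)   # (3000, 3072)

# An empty load reproduces the shape reported in the ValueError.
empty = np.empty((0, 32, 32, 3))
print(empty.reshape((0, 3072)).shape)   # (0, 3072)
```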

Repeated IndexError in terminal caused by exception
I am trying to solve a programming task, and I have run into some trouble. The task reads:

Consider the usual formula for computing solutions to the quadratic equation ax^2 + bx + c = 0, given by x = (-b ± sqrt(b^2 - 4ac)) / (2a). Write a program that reads values for a, b and c from the command line. Use exceptions to handle missing arguments, and handle invalid input where b^2 - 4ac < 0.
My program is as follows:
from math import sqrt
import sys

try:
    a = float(sys.argv[1])
    b = float(sys.argv[2])
    c = float(sys.argv[3])
    bac = b**2 - 4*a*c
    if bac < 0:
        raise ValueError
except IndexError:
    while True:
        input("No arguments read from command line!")
        a = float(input("a = ? "))
        b = float(input("b = ? "))
        c = float(input("c = ? "))
        bac = b**2 - 4*a*c
        if bac > 0:
            break
        if bac < 0:
            while True:
                print("Please choose values of a,b,c so\
 that b^2 - 4ac > 0")
                a = float(input("a = ? "))
                b = float(input("b = ? "))
                c = float(input("c = ? "))
                bac = b**2 - 4*a*c
                if bac > 0:
                    break
except ValueError:
    while True:
        input("Please choose values of a,b,c so that b^2 - 4ac > 0")
        a = float(input("a = ? "))
        b = float(input("b = ? "))
        c = float(input("c = ? "))
        if bac > 0:
            break

for i in range(-1, 2, 2):  # i=-1, next loop -> i=1
    x = (-b + i*sqrt(bac)) / (2*a)
    print("x = %.2f" % (x))
It seems to work fine, but in the case below it doesn't:
terminal > python quadratic_roots_error2.py
No arguments read from command line!
a = ? 1
b = ? 1
c = ? 1
Please choose values of a,b,c so that b^2 - 4ac > 0
a = ? 5
b = ? -2
c = ? -3
No arguments read from command line!
a = ? 5
b = ? -2
c = ? -3
x = -0.60
x = 1.00
Why does the program print the message "No arguments read from command line!" a second time? I want the program to print every solution where b^2 - 4ac > 0, and whenever b^2 - 4ac < 0 I want the message "Please choose values of a,b,c so that b^2 - 4ac > 0" to be printed, as it does.
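The repeated prompt comes from the outer while True inside the except IndexError block: after the inner "Please choose" loop breaks, control returns to the top of the outer loop, so input("No arguments read from command line!") runs again and the values must be re-entered. A hedged sketch of a restructured version (function names are my own) in which each message can only be printed once per attempt:

```python
from math import sqrt
import sys

def solve(a, b, c):
    # Raises ValueError when the discriminant is negative.
    disc = b**2 - 4*a*c
    if disc < 0:
        raise ValueError("b^2 - 4ac < 0")
    return tuple((-b + i * sqrt(disc)) / (2*a) for i in (-1, 1))

def read_coefficients():
    return tuple(float(input(name + " = ? ")) for name in "abc")

def main(argv):
    try:
        a, b, c = float(argv[1]), float(argv[2]), float(argv[3])
    except IndexError:
        print("No arguments read from command line!")
        a, b, c = read_coefficients()
    while True:
        try:
            roots = solve(a, b, c)
            break
        except ValueError:
            print("Please choose values of a,b,c so that b^2 - 4ac > 0")
            a, b, c = read_coefficients()
    for x in roots:
        print("x = %.2f" % x)

# Example: 5x^2 - 2x - 3 = 0 has roots -0.6 and 1.0.
print(solve(5, -2, -3))
```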

ValueError DataFrame constructor not properly called
billing_descrp = pd.DataFrame(customer.groupby('Invoice No')['Item No'])
print(billing_descrp)

raise ValueError('DataFrame constructor not properly called!')
ValueError: DataFrame constructor not properly called!
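The constructor fails because groupby returns a SeriesGroupBy object, not tabular data, so it must be aggregated before a DataFrame can be built from it. A hedged sketch with stand-in data (the real customer table is not shown), collecting the items per invoice:

```python
import pandas as pd

# Stand-in for the customer table; real column contents are not shown.
customer = pd.DataFrame({'Invoice No': [1, 1, 2],
                         'Item No': ['A', 'B', 'C']})

# Aggregate the grouped Series (here: collect items into a list);
# the result is then a regular DataFrame.
billing_descrp = (customer.groupby('Invoice No')['Item No']
                          .agg(list)
                          .reset_index())
print(billing_descrp)
```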

Matlab inner product between two vectors
How do I find, in MATLAB, the inner product of ["v1" "v2"] and ["f1" "f2" "f3"] that gives the result ["v1f1" "v1f2" "v1f3" "v2f1" "v2f2" "v2f3"]? Thanks.

How to compute the Frobenius inner product between two matrices in Python 2 and 3?
Is there any way to compute the Frobenius inner product in Python?
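A minimal sketch that works on both Python 2 and 3 (using .dot rather than the Python-3-only @ operator): the Frobenius inner product is the elementwise product summed, equivalently trace(A^T B).

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

# <A, B>_F = sum_ij A_ij * B_ij
fro = np.sum(A * B)

# Equivalent trace form: trace(A^T B)
same = np.trace(A.T.dot(B))

print(fro, same)  # 70.0 70.0
```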