How can I search for the first occurrence of a number less than a threshold in a 1D numpy array?
This question was incorrectly marked as a duplicate.
I have an n x 1 numpy array. I want to find the first occurrence of an entry in the array that is less than a threshold.
My code is as follows:
import numpy as np
aa = np.array([4,3,5,7])
print(aa)
np.argmin(aa<3)
output:
[ 4 3 5 7]
0
I expect argmin to return 2 but I'm getting 0. How can I make this work?
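One explanation: `aa < 3` is all `False` here, and `np.argmin` of an all-`False` mask returns 0 (the first minimum). For "first index where the condition holds", the usual idiom is `np.argmax` on the boolean mask, with a guard for the case where no entry qualifies. A minimal sketch (the threshold value 4 is just an illustration, not from the question):

```python
import numpy as np

aa = np.array([4, 3, 5, 7])
threshold = 4  # example threshold, chosen so at least one entry qualifies

# aa < threshold is a boolean mask; argmax returns the index of the
# first True, i.e. the first entry below the threshold.
mask = aa < threshold
first_idx = int(np.argmax(mask)) if mask.any() else -1
print(first_idx)  # 1, since aa[1] == 3 is the first entry < 4
```

The `mask.any()` guard matters: without it, an all-`False` mask silently yields index 0, which is exactly the confusing behaviour in the question.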
See also questions close to this topic

Function in python that will return a list of only truncated integers
So I am working on code in a class, and I have a problem where my teacher wants me to: "Write a Python function that will take a list of numbers as a parameter(list can be a mixed list), and returns a new list comprised of the integer values(truncated in the case of a float), of the original list. E.g. the list [5, 6.4, 7.5, 8.8, 2, 2.1] returns [5, 6, 7, 8, 2, 2]"
I've started the function already, but I am stuck on the part of deciphering whether a value in the list is an int or a float... This is what I have:
def int_list(a_list):
    for x in a_list:
        if x = int:
I don't think we can ask if x is an int or float without saying type(x), but I dont think my teacher wants us using any Python built in library functions.
Any help is appreciated. Thanks
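A minimal sketch that sidesteps the type check entirely: `int()` truncates a float toward zero and leaves an int unchanged, so the same call works for both cases.

```python
def int_list(a_list):
    # int() truncates a float toward zero and leaves an int unchanged,
    # so no explicit int-vs-float check is needed.
    return [int(x) for x in a_list]

print(int_list([5, 6.4, 7.5, 8.8, 2, 2.1]))  # [5, 6, 7, 8, 2, 2]
```

(Whether `int()` counts as a forbidden "built-in library function" for the assignment is a judgment call for the teacher; it is a built-in, not an imported library.)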

Sort a numpy 2d array by 1st row, maintaining columns
In python, I have a numpy array of the form:
[[4 8 2 0 5]
 [3 1 6 8 1]
 [2 2 6 0 3]
 [9 7 6 7 8]
 [5 8 1 1 4]]
I want to sort it by the value of the first row, from left to right in ascending order, while keeping each column intact as a whole. The actual arrays are of unspecified dimensions and pretty gigantic, so writing something myself with for loops gets prohibitively slow. The result should be:
[[0 2 4 5 8]
 [8 6 3 1 1]
 [0 6 2 3 2]
 [7 6 9 8 7]
 [1 1 5 4 8]]
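A vectorized sketch: `argsort` the first row, then use the resulting index order to permute every column of the array at once, with no Python-level loop.

```python
import numpy as np

a = np.array([[4, 8, 2, 0, 5],
              [3, 1, 6, 8, 1],
              [2, 2, 6, 0, 3],
              [9, 7, 6, 7, 8],
              [5, 8, 1, 1, 4]])

# argsort gives the column order that sorts the first row ascending;
# fancy indexing on the column axis applies it to the whole array.
order = a[0].argsort()
sorted_a = a[:, order]
print(sorted_a[0])  # [0 2 4 5 8]
```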

Running while loops in the background, and threading, python 2.7
dumb question I know, but I am having some major issues with this.
I have built a command line menu system that allows you to select options and execute different parts of the program.
Here is the issue I am running into.
When option 7 is pressed, it calls a function, that calls a def in a class that starts the program main loop and outputs data to an SQL database.
What I want is to execute the def in the class and then return to the main menu, allowing users to select a different option.
I have tried threading and have it working; however, it does not return the console to the menu, since the def in the class is a while loop.
How do I get it to execute the while loop in the background while returning to the main menu?
Thanks!
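A minimal sketch of the usual approach: run the looping method on a daemon thread so the call returns immediately and the menu regains control. The `main_loop` function here is a hypothetical stand-in for the class method that loops and writes to SQL.

```python
import threading
import time

def main_loop():
    # Stand-in for the class method with the while loop.
    while True:
        time.sleep(1)

# A daemon thread runs in the background and does not block interpreter
# exit, so quitting the menu also ends the loop.
worker = threading.Thread(target=main_loop)
worker.daemon = True  # attribute assignment works on both Python 2.7 and 3
worker.start()
# Control returns here immediately; the menu can keep accepting input.
```

One caveat: a daemon thread is killed abruptly at exit, so if the loop must finish an SQL write cleanly, it is safer to use a non-daemon thread plus a shared "stop" flag the loop checks each iteration.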

How can I create this 2D array with numpy?
I have a list of arrays, but it is a 3D array. What I need is a 2D array, and I am not sure how to do this. Here is what I have got:
[[[ 529.92       529.92       519.91300005]
  [ 531.         531.0000001  522.142     ]
  [ 536.8724     541.99999999 530.65200001]
  [ 533.601      537.34       530.1       ]
  [ 535.5        535.5        530.959595  ]]

 [[ 536.         536.9        532.8300001 ]
  [ 532.7        536.00000547 531.78309789]
  [ 534.9        534.9        532.7100011 ]
  [ 536.87249998 539.39931808 533.33      ]
  [ 536.8725     539.83129126 534.055     ]]

 [[7667.99233033 7760.         7530.        ]
  [7781.1494513  7815.0972     7537.49884148]
  [7371.08368857 7817.80861244 7321.        ]
  [7453.74590586 7566.11480152 7326.00000001]
  [8200.         8325.94657268 7453.74590764]]]
However what i am trying to get is the following shape:
[[ 529.92       529.92       519.91300005]
 [ 531.         531.0000001  522.142     ]
 [ 536.8724     541.99999999 530.65200001]
 [ 533.601      537.34       530.1       ]
 [ 535.5        535.5        530.959595  ]]
[[ 536.         536.9        532.8300001 ]
 [ 532.7        536.00000547 531.78309789]
 [ 534.9        534.9        532.7100011 ]
 [ 536.87249998 539.39931808 533.33      ]
 [ 536.8725     539.83129126 534.055     ]]
[[7667.99233033 7760.         7530.        ]
 [7781.1494513  7815.0972     7537.49884148]
 [7371.08368857 7817.80861244 7321.        ]
 [7453.74590586 7566.11480152 7326.00000001]
 [8200.         8325.94657268 7453.74590764]]
I am creating this array from pandas:
def get_y(ts, df):
    cl = len(ts)
    for i in range(cl - 1):
        start = ts[i]
        end = ts[i + 1]
        b = df[(df["Timestamp"] > start) & (df["Timestamp"] < end)].drop(
            ["Open", "Timestamp", "Volume"], 1).values
        try:
            d = np.append(d, [b], axis=0)
        except:
            print("Error...")
            d = np.array([b])
    return d
so each root element contains another array.
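Assuming every block has the same number of columns (as in the printed data above), collapsing the first two axes with `reshape` turns the 3D result into a 2D array where each inner row becomes one row. A sketch with a stand-in array of the same `(3, 5, 3)` shape:

```python
import numpy as np

# Stand-in for the (3, 5, 3) array produced by get_y.
arr = np.arange(45, dtype=float).reshape(3, 5, 3)

# -1 lets numpy infer the row count; the last axis (columns) is kept.
flat = arr.reshape(-1, arr.shape[-1])
print(flat.shape)  # (15, 3)
```

`reshape` here returns a view when possible, so it is essentially free even on large arrays.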

Numpy array in panda dataframe from json
I have a JSON file containing properties of some mathematical objects (CalabiYau manifolds). These objects are defined by a matrix and a vector, and two additional properties I am storing are the matrix size (such that it does not need to be computed again) and the Euler number of the manifold (some integer). In total there are roughly 1 million entries, the biggest matrix is 16 x 20.
I would like to convert the matrix and vectors to numpy arrays. Hence I was wondering if it is possible to do it directly when loading from json, or at least how to convert afterwards. The reason for converting is that I will need in any case some functions from numpy later, but I also hope (especially if the conversion is done on loading) that it will speed up my code: for the moment loading the complete dataset takes roughly 90 seconds (a previous loading using json module required only 20 s; I will open another thread on this question if using numpy does not improve).
Here is a minimal working code:
import pandas as pd
import numpy as np

json = '''
{"1":{"euler":2610,"matrix":[[6]],"size":[1,1],"vec":[5]},
 "2":{"euler":2190,"matrix":[[2,5]],"size":[1,2],"vec":[6]},
 "4":{"euler":1632,"matrix":[[2,2,4]],"size":[1,3],"vec":[7]},
 "6":{"euler":1152,"matrix":[[2,2,2,3]],"size":[1,4],"vec":[8]},
 "7":{"euler":960,"matrix":[[2,2,2,2,2]],"size":[1,5],"vec":[9]},
 "8":{"euler":2160,"matrix":[[2],[5]],"size":[2,1],"vec":[1,4]},
 "9":{"euler":1836,"matrix":[[0,2],[2,4]],"size":[2,2],"vec":[1,5]}}
'''
data = pd.read_json(json, orient="index")
data.sort_index(inplace=True)
My first guess was to use the numpy argument, but it fails with an error:

>>> data = pd.read_json(json, orient="index", numpy=True)
ValueError: cannot reshape array of size 51 into shape (7,4,2,2)
Then I tried giving the dtype argument, but it does not seem to change anything (my hope was that by using a numpy type it would convert the list to an array):

>>> dtype = {"euler": np.int16, "matrix": np.int8, "vector": np.int8,
...          "size": np.int8, "number": np.int32}
>>> data = pd.read_json(json, orient="index", dtype=dtype)
>>> type(data["matrix"][1])
list
For the conversion, I was wondering if there is a more subtle (and perhaps more efficient) way than this brute-force conversion:
data["matrix"] = data["matrix"].apply(lambda x: np.array(x, dtype=np.int8))
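One way to do the conversion at load time rather than in a second pass is to decode with the standard json module and an `object_hook`, which is called for every decoded JSON object. A sketch on one entry of the data above (the `to_arrays` hook name and the choice of fields to convert are illustrative):

```python
import json as jsonlib
import numpy as np

raw = '{"1": {"euler": 2610, "matrix": [[6]], "size": [1, 1], "vec": [5]}}'

def to_arrays(obj):
    # Convert the list-valued fields to numpy arrays as each JSON
    # object is decoded, instead of in a second pass over the frame.
    for key in ("matrix", "vec"):
        if key in obj:
            obj[key] = np.array(obj[key], dtype=np.int8)
    return obj

data = jsonlib.loads(raw, object_hook=to_arrays)
print(type(data["1"]["matrix"]))  # <class 'numpy.ndarray'>
```

The resulting dict of records can then be handed to `pd.DataFrame.from_dict(..., orient="index")`. Whether this beats `read_json` plus `apply` on a million entries would need measuring; the hook itself adds a Python-level call per object.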

Python: *args unpacks, how to repack from train_test_split?
I have multiple inputs (of various types) put inside a list x, and I'm doing the train test split using:

x = [some_matrix, scalar_value, something_else, ...]
x0_train, x0_test, x1_train, x1_test, ..., y_train, y_test = train_test_split(
    x[0], x[1], ..., y, test_size=0.2, random_state=np.random, shuffle=True)
I managed to change the input parameters x[0], x[1], ... to *x:

x0_train, x0_test, x1_train, x1_test, ..., y_train, y_test = train_test_split(
    *x, y, test_size=0.2, random_state=np.random, shuffle=True)
# But I have to manually repack
x_train = [x0_train, x1_train]
x_test = [x0_test, x1_test]
But is there a way to receive it without having to manually repack? What is the equivalent of:
*x_train, *x_test, y_train, y_test = train_test_split(*x, y, test_size=0.2, random_state=np.random, shuffle=True)
Or is there any other way to do this? For eg: constructing a dictionary and using ** to unpack, but I still have the same problem. What is the convention anyway (if one exists)?
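Assuming sklearn's documented ordering (the flat return list alternates train/test for each input, with y's pair last), the repack can be done with slicing instead of naming every piece. A sketch using placeholder strings for the split results:

```python
# Stand-in for the flat list train_test_split returns when called with
# two x inputs and y: [x0_train, x0_test, x1_train, x1_test, y_train, y_test]
res = ["x0_tr", "x0_te", "x1_tr", "x1_te", "y_tr", "y_te"]

*xs, y_train, y_test = res   # peel the y pair off the end
x_train = xs[0::2]           # even positions are the train splits
x_test = xs[1::2]            # odd positions are the test splits
print(x_train, x_test)  # ['x0_tr', 'x1_tr'] ['x0_te', 'x1_te']
```

So `x_train, x_test = res[:-2][0::2], res[:-2][1::2]` generalizes to any number of inputs; there is no built-in `*x_train, *x_test = ...` syntax because Python allows at most one starred target per assignment.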

Difference between cv2, scipy.misc and skimage
What is the main difference between cv2.imread / resize / imwrite, scipy.misc.imread / imresize / imsave, and skimage.io.imread / skimage.transform.resize / skimage.io.imsave, and how do I decide which one to use?
I know cv2 and skimage have different encoders, and cv2 uses 'BGR', not 'RGB', by default. But sometimes a script might use them together, for example main.py, where it uses scipy.misc.imread, cv2.imresize and cv2.imwrite. I am wondering about the reason to do so.
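One practical consequence of cv2's BGR default when mixing libraries: anything loaded with cv2 needs its channel axis reversed before handing it to an RGB-assuming library (and vice versa). The conversion itself is just a numpy slice, sketched here without cv2 so it is self-contained:

```python
import numpy as np

# A tiny 1x2 "image" with shape (height, width, channels) in BGR order:
# first pixel is pure blue, second is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red channels, which is
# all the BGR <-> RGB conversion amounts to.
rgb = bgr[..., ::-1]
print(rgb[0, 0])  # [  0   0 255]
```

This is why code that reads with scipy/skimage (RGB) but writes with cv2 (BGR) can silently produce color-swapped output unless a conversion like this sits in between.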
get_dummies does not have columns attribute
This line of code:
for j in range(0, len(names)):
    #fullSet = pandas.get_dummies(fullSet, columns=[names[j]])
    fullSet = pandas.get_dummies(fullSet, columns=[categoricalNames.columns[j]])
Is generating this error:
Traceback (most recent call last):
  File "noPrintsMachineLearnOptions.py", line 109, in <module>
    fullSet = pandas.get_dummies(fullSet, columns=[categoricalNames.columns[j]])
TypeError: get_dummies() got an unexpected keyword argument 'columns'
This code runs on my machine with Python 2.7.12 without issue, but on my work's server with Python 2.7.13 I get the above error. There are countless examples on the web where columns is used with get_dummies, so I do not understand what the problem is.
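The likely culprit is the pandas version, not Python: the `columns` keyword was added to `get_dummies` in a later pandas release, so an older pandas on the server would raise exactly this `TypeError`. A sketch that checks the version and shows a fallback that avoids the keyword by encoding one column at a time (the `dummies_for` helper is illustrative, not from the question):

```python
import pandas as pd

print(pd.__version__)  # if this is old on the server, columns= may not exist

df = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "M", "S"]})

def dummies_for(frame, col):
    # get_dummies on a single Series predates the columns= keyword;
    # drop the original column and join the dummy columns back on.
    dummies = pd.get_dummies(frame[col], prefix=col)
    return frame.drop(col, axis=1).join(dummies)

out = dummies_for(df, "color")
print(out.columns.tolist())  # ['size', 'color_blue', 'color_red']
```

Upgrading pandas on the server is the cleaner fix if that is an option.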
The same operation for many pairs in Python
In Python, I would like to evaluate a function for an array, but an array of pairs (or, more generally, of arrays).
I know I can do this operation for an array of scalars:
def f_test(scalar, pair):
    return scalar + pair[0] + pair[1]

result = f_test(numpy.linspace(0, 9, 10), [3, 4])
And get the desired result:
[ 7. 8. 9. 10. 11. 12. 13. 14. 15. 16.]
So the pair is fixed and the scalar is taken from an array.
The question is: can it be done the other way, with the scalar kept fixed? Can the pairs be taken from a vector, so that the result is again a vector of the same length?
That is for (something like, e.g. not necessarily a numpy.array)
scalar = 0
pair = numpy.array([[1, 2],
                    [3, 7],
                    [5, 8]])
obtain
[ 3, 10, 13 ]
instead of
[4, 9]
Note: I have simplified the operation I need to perform on the numbers a lot to keep the example simple.
If it can't be done or to be more general: What is the best practice (in Python!) to perform the same operation on a large number of arrays?
Note: I was searching this topic and even found some similar questions. However, I was not sure if they are really the same and more importantly did not found the answer. As it seems to me as a generally desirable operation, I asked a separate question.
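With the pairs in a numpy array, broadcasting handles this directly: summing along the pair axis (`axis=1`) produces one result per row, so the scalar stays fixed and the output has the same length as the list of pairs.

```python
import numpy as np

scalar = 0
pairs = np.array([[1, 2],
                  [3, 7],
                  [5, 8]])

# axis=1 reduces across each pair (each row), giving one entry per pair;
# adding the scalar then broadcasts over the whole result.
result = scalar + pairs.sum(axis=1)
print(result)  # [ 3 10 13]
```

For operations more involved than a sum, the same pattern applies: write the function in terms of `pairs[:, 0]` and `pairs[:, 1]` (whole columns) instead of `pair[0]` and `pair[1]`, and it vectorizes over all pairs at once.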

Is there a way to vectorize selection of columns (with repetition) from a matrix?
I have a matrix L of size n x k and a vector Z of size p. Z is composed of integers which represent the column indices of L. I want to create a matrix X of size n x p which is the aggregation of the corresponding columns of L, selected based on the values in Z.

Z = c(1, 3, 1, 2)
L = matrix(c(73,50,4,14,87,5,34,51,17,57,47,65), nrow=4)
> L
     [,1] [,2] [,3]
[1,]   73   87   17
[2,]   50    5   57
[3,]    4   34   47
[4,]   14   51   65
I want X to be
> X
     [,1] [,2] [,3] [,4]
[1,]   73   17   73   87
[2,]   50   57   50    5
[3,]    4   47    4   34
[4,]   14   65   14   51
In my original data, p, k and n are quite big (30K, 500 and 2K, respectively), and a loop over all Z values to select and combine the columns from L takes a very long time. Can there be a vectorized way (no loops) to do this task?
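In R this should be plain column indexing: `X = L[, Z]` selects columns (with repetition) in one vectorized operation. The same idea expressed in numpy terms for comparison, using the example data above with the indices shifted to 0-based:

```python
import numpy as np

L = np.array([[73, 87, 17],
              [50,  5, 57],
              [ 4, 34, 47],
              [14, 51, 65]])
Z = np.array([1, 3, 1, 2]) - 1  # 0-based column indices

# Indexing the column axis with an integer array selects (and repeats)
# whole columns at once, with no loop over Z.
X = L[:, Z]
print(X[0])  # [73 17 73 87]
```

At 2K x 500 with 30K selections this is a single gather over the array, which should be orders of magnitude faster than a per-index loop.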
Vectorization: how to build the array without a for loop
I have the following code:
x = range(100)
M = len(x)
sample = np.zeros((M, 41632))
for i in range(M):
    lista = np.load('sample' + str(i) + '.npy')
    for j in range(41632):
        sample[i, j] = np.array(lista[j])
    print i
to create an array made of sample_i numpy arrays.
sample0, sample1, sample3, etc. are numpy arrays and my expected output is a Mx41632 array like this:
sample = [[sample0],[sample1],[sample2],...]
How can I make this operation more compact and faster, without the for loops? M can also reach 1 million.
Or, how can I append my sample array if the starting point is, for example, 1000 instead of 0?
Thanks in advance
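A sketch of the usual fix: `np.load` already returns the whole row, so the inner element-by-element copy loop is unnecessary, and `np.stack` combines the loaded arrays in one pass. The demo files created here stand in for the real sample0.npy, sample1.npy, ... files; a different starting point is just a different range start.

```python
import numpy as np

# Create a few small demo files standing in for sample0.npy, sample1.npy, ...
for i in range(3):
    np.save('sample%d.npy' % i, np.full(5, i, dtype=float))

start, M = 0, 3  # e.g. start = 1000 would begin from sample1000.npy

# Load each file once and stack the 1-D arrays into an (M, n) array;
# no inner per-element loop is needed.
sample = np.stack([np.load('sample%d.npy' % i)
                   for i in range(start, start + M)])
print(sample.shape)  # (3, 5)
```

For M near 1 million, the file I/O will dominate regardless; preallocating with `np.zeros((M, 41632))` and assigning `sample[i] = np.load(...)` row by row avoids holding two copies during the stack, which may matter at that scale.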