Python: "Too many indices for array" happens when using sparse
I want to run get_dummies on a dataframe and turn the dummy columns into a sparse matrix.
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["a", "b", "c", "a"],
        "B": [1, 2, 3, 4]
    })
df['A'] = df['A'].astype('category')
one_hot = pd.get_dummies(df.to_sparse(), sparse=True)
print(one_hot)
one_hot.to_csv('test_sparse.csv', index=False)
one_hot:
   B  A_a  A_b  A_c
0  1    1    0    0
1  2    0    1    0
2  3    0    0    1
3  4    1    0    0
Error:
IndexError: too many indices for array
Any help would be appreciated!
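One possible direction, offered only as a sketch: `DataFrame.to_sparse()` was deprecated in pandas 0.25 and removed in 1.0, so the example below assumes a recent pandas and calls `get_dummies(..., sparse=True)` on the ordinary column instead. It does not claim to pinpoint the cause of the `IndexError`. Encoding only column `A`, the `dtype=np.uint8` choice, and writing to an in-memory buffer instead of `test_sparse.csv` are assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c", "a"], "B": [1, 2, 3, 4]})
df["A"] = df["A"].astype("category")

# Encode only the categorical column; sparse=True stores each dummy column
# as a SparseArray, and dtype=np.uint8 keeps the values as 0/1 integers.
dummies = pd.get_dummies(df["A"], prefix="A", sparse=True, dtype=np.uint8)

# Optional: convert the all-sparse dummy frame to a SciPy COO matrix.
try:
    coo = dummies.sparse.to_coo()
except ImportError:  # SciPy is not installed
    coo = None

# For CSV output, densify the dummies and put column B back in front.
one_hot = pd.concat([df[["B"]], dummies.sparse.to_dense()], axis=1)
buffer = io.StringIO()  # stands in for 'test_sparse.csv'
one_hot.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

The `.sparse` accessor only works when every column of the frame is sparse, which is why the dense integer column `B` is kept out of `dummies` and joined back afterwards.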
See also questions close to this topic

Method Not Allowed (GET): / in django
from django.views.generic import View
from django.http import HttpResponse

class home(View):
    def post(self, request):
        return HttpResponse('Class based view')
When I define the view above and open the page, Django reports Method Not Allowed (GET): /
Can anyone please help me with this issue?

How to auto login to jupyterhub in ubuntu machine
I am new to Python and JupyterHub. I have installed JupyterHub on an Ubuntu machine in the cloud, and I want to access it from anywhere without logging in. Can anyone please help me resolve this?
I tried the configuration below in jupyterhub_config.py:
c.Authenticator.auto_login = True
This did not resolve my problem.
Please help me solve it.

SDN pox controller Timer demo referred from pox wiki won't work
I am trying to use the POX controller's Timer for a small experiment, but an unexpected error has confused me. I just copied the demo code from the POX wiki, and I am sure that I am using the right Python version, 2.7.
pox wiki:
Error message:
demo code

Can anyone explain the whole meaning of statement
df.apply(lambda x: pd.to_numeric(x, errors='coerce'))
I understand that this statement converts the dataframe columns to integer values, but I could not properly understand how the lambda function is used or what the errors argument does.
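A small sketch may make the pieces concrete. `df.apply` calls the lambda once per column, passing that column as `x`, and `errors='coerce'` tells `pd.to_numeric` to replace anything it cannot parse with `NaN` instead of raising. The sample data here is made up for illustration. Note that the result is numeric rather than strictly integer: a column that ends up containing `NaN` becomes float.

```python
import pandas as pd

# Made-up sample: every column holds strings, some of them not numeric.
df = pd.DataFrame({"a": ["1", "2", "x"], "b": ["1.5", "oops", "3"]})

# apply() passes each column (a Series) to the lambda as x.
converted = df.apply(lambda x: pd.to_numeric(x, errors="coerce"))

# Equivalent without a lambda: extra keyword arguments are forwarded.
same = df.apply(pd.to_numeric, errors="coerce")

print(converted)
print(converted.dtypes)
```

With the default `errors='raise'`, the same call would stop with an exception at the first unparseable value such as `"x"`.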

Parsing large string values in Pandas
I have a .csv which I've generated a dataframe from. This csv has raw data outputs from a system that follows this format:
{"DataType1":"Value","DataType2":"Value","DataType3":"Value",.....}
Each row in the dataframe has just this in one column. I'm trying to break this out so that the data types become column headers and the values populate the rows. One other aspect is that not all rows have the same data types; some have additional data types that might not be present in other rows. For example, row 1 may have DataType1, DataType2, and DataType3, and row 2 may have DataType2, DataType4, and DataType5. Ideally I'd like the output's column headers to incorporate all data types, whether or not a given row has a value for them. So the final dataframe would have this structure:
DataType1  DataType2  DataType3  DataType4  DataType5
Value      Value      Value      NaN        NaN
NaN        Value      NaN        Value      Value
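One way this could be approached, assuming each cell holds a valid JSON object: parse every string with `json.loads` and build a new DataFrame from the resulting dictionaries, so that pandas takes the union of the keys as columns and fills the gaps with `NaN`. The column name `payload` and the sample values are hypothetical.

```python
import json

import pandas as pd

# Hypothetical input: one column of JSON strings with differing keys per row.
raw = pd.DataFrame({
    "payload": [
        '{"DataType1": "v1", "DataType2": "v2", "DataType3": "v3"}',
        '{"DataType2": "v4", "DataType4": "v5", "DataType5": "v6"}',
    ]
})

# Parse each string into a dict, then let pandas align the keys as columns.
records = raw["payload"].map(json.loads).tolist()
parsed = pd.DataFrame(records, index=raw.index)

# Sort the columns so the header order does not depend on row order.
parsed = parsed.reindex(sorted(parsed.columns), axis=1)
print(parsed)
```

If some rows are not valid JSON, `json.loads` raises an error, so a real pipeline would probably need a guarded parsing function.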

Appending strings after groupby in pandas
I am trying to join strings after groupby operation in pandas. But the output is not as expected. The code and the output are as follows.
import pandas as pd
import numpy as np
import warnings

warnings.filterwarnings('ignore')

top_1000_jobs_data = pd.read_excel('top_1000_jobs_excel.xlsx', names = ['text', 'subtext'])
top_1000_jobs_data.groupby('text')['subtext'].apply(', '.join)
Sample Output
text \tfashion,\tbeauty,\tecommerce,\thealthcare,\tpharmaceutical,\tfinancial,\t
text and subtext columns are string columns.
Sample text:
responsible for monitoring and supporting lucent critical optical network elements  add/drop multiplexers (adm’s), terminal multiplexers (tm’s) , line regenerator (lr’s) , phase local crossconnects (lxc’s) on lucent itm sc and itmnm (network management system) and alcatel network elements (1651sm, 1661sm, optinex 1650, 1660sm) on alcatel nms respectively by using real time alarming of network faults, analysis, notification, trending of faults and escalation.
Sample subtext:
lxc
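The expected output is not stated, so the following is only a guess at the intent: if the unexpected part is the tab characters in the joined string, stripping whitespace from each `subtext` value before joining may help. The sample rows are invented, and the assumption that the tabs come from the cell values themselves is unverified.

```python
import pandas as pd

# Invented sample rows; the leading tabs imitate the reported output.
jobs = pd.DataFrame({
    "text": ["job a", "job a", "job b"],
    "subtext": ["\tfashion", "\tbeauty", "lxc"],
})

# Cast to str, strip surrounding whitespace, then join per group.
cleaned = jobs.assign(subtext=jobs["subtext"].astype(str).str.strip())
joined = cleaned.groupby("text")["subtext"].agg(", ".join)
print(joined)
```

The `astype(str)` step also guards against non-string cells, which would make `', '.join` raise a `TypeError`.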

Speed up sparse matrix multiplication in R
I am trying to multiply a matrix (made up of a few 1's and mostly 0's) by a vector using the %*% operator in R, and this is taking a huge amount of time. Is there a way I can make this faster?
Thanks

How to solve equation Ax=b with tensorflow&GPU while A is a huge sparse matrix
I currently need to solve the equation Ax=b. First, TensorFlow and a GPU have to be involved. Second, A is a huge sparse matrix, and which elements are nonzero is unknown (which means it's impossible to use tf.SparseTensor). Third, this calculation will easily overwhelm my resources. Are there any other solutions to this problem? Can anyone post some demo code to give me some guidance or inspiration? Thanks in advance!

Efficient way to do cell multiplication and stacking of sparse matrices in MATLAB
What I have is a cell J of size 1xs filled with sparse matrices of size mxn (m>=n), and two full matrices r and l of size mxcxp and sxcxp, respectively. The dimensions are roughly:
s = 4; % or 9
m = 10000; % can increase up to 300k
n = 36; % can increase up to m
c = 3;
p = 25;
What I do so far is this (see code below), but it seems to be quite inefficient. I'm looking for a way to speed things up in a scalable way, as I need to do this operation many times (also, m can increase up to 300k), so having this step as fast as possible would be great, as it seems to be the bottleneck so far. Here is the code for what I need to do:
J = cell(s, 1);
% just fill J with sparse matrices of size mxn. Each sparse matrix is different for each cell, but all have the same nnz.
J(:) = {sprand(m,n,0.1)};
r = rand(m, c, p);
l = rand(s, c, p);

% preallocate vectors
row_vec = zeros(nnz(J{1}),c*p);
col_vec = zeros(nnz(J{1}),c*p);
val_vec = zeros(nnz(J{1}),c*p);

% do computation
for pi = 1:p
    for ci = 1:c
        J_ = 0;
        for si=1:s
            % multiply each sparse matrix in cell with scalar l(si,ci,pi) and sum them up
            J_ = J_ + J{si} * l(si,ci,pi);
        end
        % multiply resulting sparse matrix with diagonal matrix (resulting from vector r(:,ci,pi)) and get final indices for later
        [row_vec_temp, ...
         col_vec(:,(pi-1)*c+ci), ...
         val_vec(:,(pi-1)*c+ci)] = find(spdiags(r(:,ci,pi),0,m,m) * J_);
        row_vec(:,(pi-1)*c+ci) = row_vec_temp + row_vec(end,max(1,(pi-1)*c+ci-1));
    end
end

% build final stacked sparse matrix of size m*c*pxn using calculated indices.
J_final = sparse(row_vec, col_vec, val_vec);
Additionally, I found this way without nested for loops, but it seems to be even less efficient:
% create cell 1xcxp cell from r, where each cell is mx1 vector
R = mat2cell(r, m, ones(c,1), ones(p,1));
% make each cell a sparse diagonal matrix
R = repmat(cellfun(@(x) spdiags(x,0,m,m), R, 'UniformOutput', false), s, 1, 1);
% do pointwise computation
C = cellfun(@(x,y,z) z*(x.*y), repmat(J',1,c,p), mat2cell(l,ones(s,1),ones(c,1),ones(p,1)), R, 'UniformOutput',false);
% sum over each row of resulting cell C
J_ = cell(1,c,p);
J_(:) = {0};
for si=1:s
    J_(1,:,:) = cellfun(@(x,y) (x+y), J_(1,:,:), C(si,:,:), 'UniformOutput',false);
end
% stack final cell and convert it to sparse matrix
J_final = cell2mat(cat(1,J_(:)));
The first version takes roughly 0.27s and the second version takes about 0.3s.