python dataframe counter on a column
Column x in my dataframe contains only 0 and 1. I want to create a variable y that counts consecutive zeros and resets when a 1 appears in x. I'm getting the error "The truth value of a Series is ambiguous."
    count=1
    countList=[0]
    for x in df['x']:
        if df['x'] == 0:
            count = count + 1
            df['y']= count
        else:
            df['y'] = 1
            count = 1
1 answer
answered 2018-07-11 06:00
jezrael
First, don't loop in pandas when a vectorized solution exists, because loops are slow.
I think you need to count consecutive 0 values:

    import pandas as pd

    df = pd.DataFrame({'x':[1,0,0,1,1,0,1,0,0,0,1,1,0,0,0,0,1]})

    a = df['x'].eq(0)
    b = a.cumsum()
    df['y'] = (b-b.mask(a).ffill().fillna(0).astype(int))
    print (df)

        x  y
    0   1  0
    1   0  1
    2   0  2
    3   1  0
    4   1  0
    5   0  1
    6   1  0
    7   0  1
    8   0  2
    9   0  3
    10  1  0
    11  1  0
    12  0  1
    13  0  2
    14  0  3
    15  0  4
    16  1  0
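As a cross-check on the output above, the same counter can probably also be written with a groupby. This is only a sketch of an alternative formulation, not the method used in this answer; the column name `y_alt` and the variable `groups` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]})

# every 1 starts a new group, so a run of zeros shares its group id with the 1 before it
groups = df['x'].eq(1).cumsum()

# running count of zeros inside each group; the row holding the 1 adds nothing and stays 0
df['y_alt'] = df['x'].eq(0).astype(int).groupby(groups).cumsum()

print(df['y_alt'].tolist())
```

Zeros at the very start of the column end up in group 0, so they are counted as well without any special handling.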
Detail + explanation:
    # compare by zero
    a = df['x'].eq(0)
    # cumulative sum of mask
    b = a.cumsum()
    # replace Trues to NaNs
    c = b.mask(a)
    # forward fill NaNs
    d = b.mask(a).ffill()
    # First NaNs to 0 and cast to integers
    e = b.mask(a).ffill().fillna(0).astype(int)
    # subtract from cumulative sum Series
    y = b - e
    df = pd.concat([df['x'], a, b, c, d, e, y], axis=1, keys=('x','a','b','c','d','e', 'y'))
    print (df)

        x      a   b     c     d   e  y
    0   0   True   1   NaN   NaN   0  1
    1   0   True   2   NaN   NaN   0  2
    2   0   True   3   NaN   NaN   0  3
    3   1  False   3   3.0   3.0   3  0
    4   1  False   3   3.0   3.0   3  0
    5   0   True   4   NaN   3.0   3  1
    6   1  False   4   4.0   4.0   4  0
    7   0   True   5   NaN   4.0   4  1
    8   0   True   6   NaN   4.0   4  2
    9   0   True   7   NaN   4.0   4  3
    10  1  False   7   7.0   7.0   7  0
    11  1  False   7   7.0   7.0   7  0
    12  0   True   8   NaN   7.0   7  1
    13  0   True   9   NaN   7.0   7  2
    14  0   True  10   NaN   7.0   7  3
    15  0   True  11   NaN   7.0   7  4
    16  1  False  11  11.0  11.0  11  0
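About the error itself: in the question's loop, `if df['x'] == 0:` compares the whole column, which gives a boolean Series, and `if` cannot reduce a Series to a single True or False. The sketch below reproduces that and shows what a corrected loop might look like, in case a loop is still wanted for small data. It assumes the counter should be 0 on rows where x is 1, as in the vectorized output; the question's own code used 1 there. The shortened sample data and the names `value` and `y` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1]})

# the condition is a whole boolean Series, so `if` raises ValueError
try:
    if df['x'] == 0:
        pass
except ValueError as err:
    print(err)

# corrected loop: test each scalar value and collect the counts in a list
count = 0
y = []
for value in df['x']:
    # extend the current run of zeros, or reset it when a 1 appears
    count = count + 1 if value == 0 else 0
    y.append(count)

# assign the column once, after the loop, instead of overwriting it on every pass
df['y'] = y
print(df)
```

Assigning `df['y'] = count` inside the loop, as in the question, would set the entire column to a single number on each iteration, which is why the values are gathered in a list first.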