pandas groupby/apply: passing columns as function parameters
I have a function f(a, b), where a and b are pandas.Series, and it returns a pandas.Series c with the same length as a and b.
Now I have two Series, A and B, which have the same MultiIndex. A and B each consist of many small series: (a1, a2, a3, a4, a5, ...) and (b1, b2, b3, b4, b5, ...). I cannot use f(A, B) to calculate the result directly, so I want to use groupby to calculate f(a1, b1), f(a2, b2), f(a3, b3), ... and concatenate the results.
How should I do that?
Sample data, function, and expected output follow. (I know other pandas methods could handle this sample easily, but I only want to discuss the groupby approach. Thanks.)
A:
a1  0     0
    1     1
    2     2
    3     3
    4     4
    5     5
    6     6
    7     7
    8     8
    9     9
a2  0     1
    1     2
    2     3
    3     4
    4     5
    5     6
    6     7
    7     8
    8     9
    9    10
a3  0     2
    1     3
    2     4
    3     5
    4     6
    5     7
    6     8
    7     9
    8    10
    9    11
B:
b1  0    0.0
    1    0.0
    2    0.0
    3    0.0
    4    0.0
    5    1.0
    6    0.0
    7    0.0
    8    1.0
    9    0.0
b2  0    0.0
    1    1.0
    2    0.0
    3    0.0
    4    0.0
    5    0.0
    6    0.0
    7    0.0
    8    1.0
    9    0.0
b3  0    0.0
    1    0.0
    2    0.0
    3    0.0
    4    1.0
    5    0.0
    6    1.0
    7    0.0
    8    0.0
    9    0.0
Expected output C:
c1  0     0.0
    1     0.0
    2     0.0
    3     0.0
    4     0.0
    5     5.0
    6     6.0
    7     7.0
    8     8.0
    9     9.0
c2  0     0.0
    1     2.0
    2     3.0
    3     4.0
    4     5.0
    5     6.0
    6     7.0
    7     8.0
    8     9.0
    9    10.0
c3  0     0.0
    1     0.0
    2     0.0
    3     0.0
    4     6.0
    5     7.0
    6     8.0
    7     9.0
    8    10.0
    9    11.0
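import numpy as np

def f(a, b):
    # Work on plain arrays so a[i] and b[i] are positional, whatever the index.
    a, b = np.asarray(a), np.asarray(b)
    loc = 0  # most recent nonzero value seen in b
    res = np.zeros(len(a))
    for i in range(len(b)):  # len(b), not the global b1
        if b[i] != 0:
            if b[i] != loc:
                loc = b[i]
        res[i] = a[i] * loc
    return res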
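To make the question reproducible, here is a minimal sketch that rebuilds the sample inputs and checks f on a single group. The group labels g1, g2, g3 are hypothetical stand-ins: the listing shows a1.../b1... as labels, but the question states that A and B share the same MultiIndex, so one shared set of labels is assumed here. The version of f below uses len(b) and positional array access, which is also an assumption about the intended behavior.

```python
import numpy as np
import pandas as pd


def f(a, b):
    # Positional access on plain arrays (assumed intent of the original f).
    a, b = np.asarray(a), np.asarray(b)
    loc = 0  # most recent nonzero value seen in b
    res = np.zeros(len(a))
    for i in range(len(b)):
        if b[i] != 0 and b[i] != loc:
            loc = b[i]
        res[i] = a[i] * loc
    return res


# Hypothetical shared group labels; the inner level is the position 0..9.
groups = ["g1", "g2", "g3"]
index = pd.MultiIndex.from_product([groups, range(10)])

# A: group k holds k, k+1, ..., k+9, as in the a1/a2/a3 listing.
A = pd.Series(np.concatenate([np.arange(k, k + 10) for k in range(3)]), index=index)

# B: zeros, with 1.0 at the positions shown in the b1/b2/b3 listing.
b_values = np.zeros(30)
for k, positions in enumerate([[5, 8], [1, 8], [4, 6]]):
    b_values[[10 * k + p for p in positions]] = 1.0
B = pd.Series(b_values, index=index)

# Apply f to one group by hand, without groupby.
c1 = f(A.loc["g1"], B.loc["g1"])
print(c1)
```

With these inputs, c1 is 0 for the first five positions and then 5 through 9, which matches the c1 listing above.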
1 answer

You can solve the problem like this:
import itertools
import pandas as pd

# result is a Series of numpy arrays, one per group
result = (
    pd.DataFrame({'A': A, 'B': B})
    .groupby(level=0)
    .apply(lambda x: f(x['A'], x['B']))
)
# flatten into a single Series of float values (the original index is dropped)
result = pd.Series(list(itertools.chain(*result.values)))
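Flattening with itertools.chain gives back the values but not the MultiIndex. If you want the result aligned with A and B, one possible variant (a sketch, not the only way) is to wrap each group's output in a Series that reuses the group's own index and pass group_keys=False so groupby does not add the key a second time. The group labels g1, g2, g3 and the sample construction below are hypothetical, and f here assumes len(b) and positional access.

```python
import numpy as np
import pandas as pd


def f(a, b):
    # Positional access on plain arrays (assumed intent of the original f).
    a, b = np.asarray(a), np.asarray(b)
    loc = 0
    res = np.zeros(len(a))
    for i in range(len(b)):
        if b[i] != 0 and b[i] != loc:
            loc = b[i]
        res[i] = a[i] * loc
    return res


# Hypothetical sample data with a shared two-level index.
index = pd.MultiIndex.from_product([["g1", "g2", "g3"], range(10)])
A = pd.Series(np.concatenate([np.arange(k, k + 10) for k in range(3)]), index=index)
b_values = np.zeros(30)
for k, positions in enumerate([[5, 8], [1, 8], [4, 6]]):
    b_values[[10 * k + p for p in positions]] = 1.0
B = pd.Series(b_values, index=index)

# Each group returns a Series carrying its own slice of the index,
# so the concatenated result lines up with A and B.
result = (
    pd.DataFrame({"A": A, "B": B})
    .groupby(level=0, group_keys=False)
    .apply(lambda g: pd.Series(f(g["A"], g["B"]), index=g.index))
)
print(result)
```

Because the index survives, you can select one group's output with result.loc["g2"] or put the result next to A and B in the same DataFrame.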