pandas groupby apply: passing columns as function parameters

I have a function f(a, b), where a and b are pandas.Series, that returns a pandas.Series c with the same length as a and b.

Now I have two Series, A and B, which share the same MultiIndex. A and B each consist of many small Series: (a1, a2, a3, a4, a5, ...) and (b1, b2, b3, b4, b5, ...). I cannot use f(A, B) to calculate the result directly, so I want to use groupby to calculate f(a1, b1), f(a2, b2), f(a3, b3), ... and concatenate the results.

How should I do that?

Sample data (A, then B), expected output, and the function are below. (I know other pandas methods could handle this sample easily, but I only want to discuss the groupby approach. Thanks.)

a1  0     0
    1     1
    2     2
    3     3
    4     4
    5     5
    6     6
    7     7
    8     8
    9     9
a2  0     1
    1     2
    2     3
    3     4
    4     5
    5     6
    6     7
    7     8
    8     9
    9    10
a3  0     2
    1     3
    2     4
    3     5
    4     6
    5     7
    6     8
    7     9
    8    10
    9    11

b1  0    0.0
    1    0.0
    2    0.0
    3    0.0
    4    0.0
    5    1.0
    6    0.0
    7    0.0
    8   -1.0
    9    0.0
b2  0    0.0
    1    1.0
    2    0.0
    3    0.0
    4    0.0
    5    0.0
    6    0.0
    7    0.0
    8   -1.0
    9    0.0
b3  0    0.0
    1    0.0
    2    0.0
    3    0.0
    4   -1.0
    5    0.0
    6    1.0
    7    0.0
    8    0.0
    9    0.0


c1  0     0.0
    1     0.0
    2     0.0
    3     0.0
    4     0.0
    5     5.0
    6     6.0
    7     7.0
    8    -8.0
    9    -9.0
c2  0     0.0
    1     2.0
    2     3.0
    3     4.0
    4     5.0
    5     6.0
    6     7.0
    7     8.0
    8    -9.0
    9   -10.0
c3  0     0.0
    1     0.0
    2     0.0
    3     0.0
    4    -6.0
    5    -7.0
    6     8.0
    7     9.0
    8    10.0
    9    11.0


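import numpy as np

def f(a, b):
    # Use positional arrays so indexing does not depend on the (Multi)Index
    a = np.asarray(a)
    b = np.asarray(b)
    loc = 0
    res = np.zeros(len(a))
    for i in range(len(b)):  # was len(b1), which referred to a global
        if b[i] != 0:
            loc = b[i]  # remember the most recent nonzero value of b
        res[i] = a[i] * loc
    return res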

1 answer

  • answered 2018-11-08 10:48 Poolka

    You may solve the problem like this:

    import itertools
    import pandas as pd

    # result is a Series of numpy arrays, one per level-0 group
    result = (
        pd.DataFrame({'A': A, 'B': B})
        .groupby(level=0)
        .apply(lambda x: f(x['A'], x['B'])))

    # now result is a flat Series of float values (with a default integer index)
    result = pd.Series(list(itertools.chain(*result.values)))
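
The flattening step above discards the original MultiIndex. If the index matters, one possible variation is to have the function return a Series aligned to its input and to pass `group_keys=False` to `groupby`. The following is only a sketch under some assumptions: the shared group labels `g1`..`g3` are made up for illustration (the question's sample uses different labels for A and B), and `f_indexed` is a hypothetical index-preserving rewrite of the question's `f`, not the asker's original function.

```python
import numpy as np
import pandas as pd


def f_indexed(a, b):
    """Hypothetical variant of f: same logic, but returns a Series on a's index."""
    a_vals, b_vals = a.to_numpy(), b.to_numpy()
    res = np.zeros(len(a_vals))
    loc = 0
    for i in range(len(b_vals)):
        if b_vals[i] != 0:
            loc = b_vals[i]  # most recent nonzero value of b
        res[i] = a_vals[i] * loc
    return pd.Series(res, index=a.index)


# Sample data from the question; the labels g1..g3 are an assumption
idx = pd.MultiIndex.from_product([["g1", "g2", "g3"], range(10)])
A = pd.Series(np.concatenate([np.arange(10) + k for k in range(3)]), index=idx)

b_rows = np.zeros((3, 10))
b_rows[0, [5, 8]] = [1.0, -1.0]
b_rows[1, [1, 8]] = [1.0, -1.0]
b_rows[2, [4, 6]] = [-1.0, 1.0]
B = pd.Series(b_rows.ravel(), index=idx)

# group_keys=False keeps the group label from being prepended a second time,
# so the concatenated result carries the same MultiIndex as A and B
result = (
    pd.DataFrame({"A": A, "B": B})
    .groupby(level=0, group_keys=False)
    .apply(lambda g: f_indexed(g["A"], g["B"]))
)

print(result.loc["g3"].tolist())
# [0.0, 0.0, 0.0, 0.0, -6.0, -7.0, 8.0, 9.0, 10.0, 11.0]
```

Because each group's output is indexed like its input, `result` can be aligned directly with A and B, for example as a new column next to them in the same DataFrame.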