Pivoting with groupby?

Could you help me find a solution to the following problem? Given a DataFrame df1 like this:

import pandas as pd

d1={'L':['aaa','bbb','ccc','aaa','bbb','ddd'],
 'w':[1,5,9,13,17,21],
 'x':[2,6,10,14,18,22],
 'y':[3,7,11,15,19,23],
 'z':[4,8,12,16,20,24]}
df1=pd.DataFrame(d1)

Data

and two dictionaries that define the grouping over columns and rows:

dctRowGroups={'aaa':'A','bbb':'B','ccc':'A','ddd':'B'}
dctColGroups={'w':'ALPHA','x':'BETA','y':'ALPHA','z':'BETA'}

I wanted to aggregate over columns as a first step. Applying

g2=df1.groupby(dctColGroups,axis=1)
g2.sum()

results in

Result of grouping by column

but I wanted to keep the 'L' column for the next step, the row-wise aggregation, i.e. the result should be a DataFrame df2 more like this:

Result I wanted to get

What do I need to code to make this happen? As a next step, I want to aggregate df2 over the rows using the dctRowGroups dictionary:

g3=df2.groupby(dctRowGroups,axis=0)
g3.sum()

to get a final result like this:

Final result

How can I do all these steps in as few lines of code as possible? I would appreciate your advice on this.

Thanks a lot

Willfried.

1 answer

  • answered 2021-05-15 09:51 Anurag Dabas

    You can do:

    First, create df2 and add the 'L' column back, either with the insert() method or with assign():

    df2=df1.groupby(dctColGroups,axis=1).sum()
    
    df2.insert(0,'L',df1['L'])  # puts 'L' first; use this when the column order matters
    
    # OR (use either insert or assign, you do not need both)
    
    df2=df2.assign(L=df1['L'])  # otherwise use this; 'L' is appended as the last column
    

    Finally, map 'L' to its row group with assign() and map(), then aggregate with groupby():

    result=df2.assign(L=df2['L'].map(dctRowGroups)).groupby('L').sum()
    

    Outputs:

    df2:

         L  ALPHA  BETA
    0  aaa      4     6
    1  bbb     12    14
    2  ccc     20    22
    3  aaa     28    30
    4  bbb     36    38
    5  ddd     44    46
    

    result:

       ALPHA  BETA
    L
    A     52    58
    B     92    98
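
A note for readers on newer pandas: groupby(..., axis=1) was deprecated in pandas 2.1, and the release notes suggest grouping the transposed frame instead. The following is only a sketch of that alternative, not the method used in the answer above. It assumes the df1, dctColGroups and dctRowGroups from the question; the names indexed and by_col are made up for illustration. Moving 'L' into the index keeps it out of the column-wise sum, and the row dictionary can then be applied directly to the index labels.

```python
import pandas as pd

d1 = {'L': ['aaa', 'bbb', 'ccc', 'aaa', 'bbb', 'ddd'],
      'w': [1, 5, 9, 13, 17, 21],
      'x': [2, 6, 10, 14, 18, 22],
      'y': [3, 7, 11, 15, 19, 23],
      'z': [4, 8, 12, 16, 20, 24]}
df1 = pd.DataFrame(d1)

dctRowGroups = {'aaa': 'A', 'bbb': 'B', 'ccc': 'A', 'ddd': 'B'}
dctColGroups = {'w': 'ALPHA', 'x': 'BETA', 'y': 'ALPHA', 'z': 'BETA'}

# Put 'L' in the index so it is not treated as a data column.
indexed = df1.set_index('L')

# Column-wise aggregation without axis=1: transpose, group the
# former column labels through the dict, sum, and transpose back.
by_col = indexed.T.groupby(dctColGroups).sum().T

# Row-wise aggregation: a dict passed to groupby is applied to the
# index labels, which are now the 'L' values.
result = by_col.groupby(dctRowGroups).sum()

print(result)
```

If df2 with 'L' as an ordinary column is still needed as an intermediate step, by_col.reset_index() should give it.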