Count rows across certain columns in a dataframe if they are greater than another value and groupby another column
I have a dataframe:
import pandas as pd

df = pd.DataFrame({
    'BU': ['Total', 'Total', 'Total', 'CRS', 'CRS', 'CRS'],
    'Line_Item': ['Revenues', 'EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
    '1Q16': [100, 120, 0, 200, 190, 210],
    '2Q16': [100, 0, 130, 200, 190, 210],
    '3Q16': [200, 250, 0, 120, 0, 190]})
I want to count, for each "BU", the number of rows in 1Q16, 2Q16, and 3Q16 that are greater than zero. To count such rows across the whole dataframe, I was told I can use:
cols = ['1Q16','2Q16','3Q16']
df[cols].gt(0).sum()
In addition, I want to group these counts by BU.
1 answer

With your shown samples, please try the following:
cols = ['1Q16','2Q16','3Q16']
df[cols].gt(0).groupby(df['BU']).sum()
Output will be as follows:
       1Q16  2Q16  3Q16
BU
CRS     3.0   3.0   2.0
Total   2.0   2.0   2.0
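Depending on the pandas version, these counts may print as floats (as in the output shown) or as integers. If whole-number counts are wanted either way, one option is to cast the result explicitly. The following is a minimal sketch using the sample data from the question; the name counts and the astype(int) cast are additions for illustration, not part of the original one-liner.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'BU': ['Total', 'Total', 'Total', 'CRS', 'CRS', 'CRS'],
    'Line_Item': ['Revenues', 'EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
    '1Q16': [100, 120, 0, 200, 190, 210],
    '2Q16': [100, 0, 130, 200, 190, 210],
    '3Q16': [200, 250, 0, 120, 0, 190]})

cols = ['1Q16', '2Q16', '3Q16']

# Same chain as in the answer, with an explicit cast so the counts
# are integers regardless of how the grouped sum is typed.
counts = df[cols].gt(0).groupby(df['BU']).sum().astype(int)
print(counts)
```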
Explanation: the code above works as follows.
Create the list cols, which holds the names of the columns to check.
Call gt(0) on those columns to get a boolean frame that is True wherever a value is greater than 0.
Group that boolean frame with groupby, passing df['BU'] so the rows are grouped by the values in the BU column.
Apply sum to each group; since True counts as 1, this gives the number of values greater than 0 in each column for each BU.
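To see what each step in the explanation produces, the chain can be split into intermediate variables. This is only a sketch on the question's sample data; the names mask, grouped, and result are hypothetical and introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'BU': ['Total', 'Total', 'Total', 'CRS', 'CRS', 'CRS'],
    'Line_Item': ['Revenues', 'EBT', 'Expenses', 'Revenues', 'EBT', 'Expenses'],
    '1Q16': [100, 120, 0, 200, 190, 210],
    '2Q16': [100, 0, 130, 200, 190, 210],
    '3Q16': [200, 250, 0, 120, 0, 190]})

# Step 1: the columns to inspect.
cols = ['1Q16', '2Q16', '3Q16']

# Step 2: boolean frame, True where the value is greater than 0.
mask = df[cols].gt(0)
print(mask)

# Step 3: group the boolean rows by the BU value of the same row.
# Passing the Series df['BU'] works because it shares mask's index.
grouped = mask.groupby(df['BU'])

# Step 4: summing booleans counts the True values in each group.
result = grouped.sum()
print(result)
```

Grouping by the Series df['BU'] rather than by a column name is what allows BU to be left out of cols: the grouping key is aligned to the mask by index, so it does not need to be a column of the frame being grouped.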
