Using a for loop with a pandas DataFrame

I'm trying to make a loop like the following:

x_list = df['Column1'].unique()
for x in x_list:
    y = df.query('Column1 == "x" and Column2 == "No"')
    y_count = y['Column1'].count()
    print ('Total number of {} is {}.' .format(x, y_count))

However, y_count always comes out as zero!

e.g.,
Total number of x1 is 0.
Total number of x2 is 0.
Total number of x3 is 0.
etc.

What could be the problem?

Thanks in advance.

1 answer

  • answered 2022-01-23 02:46 Raymond Kwok

    I seldom use query, but my guess is that the "x" inside the query string is treated as the literal string "x", not as the value of your variable x. No row has Column1 equal to the text "x", so every count is zero.

    Try this instead:

    x_list = df['Column1'].unique()
    for x in x_list:
        # @x makes query() use the value of the Python variable x
        y = df.query('Column1 == @x and Column2 == "No"')
        y_count = y['Column1'].count()
        print('Total number of {} is {}.'.format(x, y_count))
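
As a quick check of the diagnosis, here is a minimal sketch on made-up data (the rows and the helper names `count_no_literal` and `count_no_for` are assumptions for illustration, not part of the question). It compares the quoted "x" with the `@` prefix that `DataFrame.query` accepts for referring to Python variables:

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df = pd.DataFrame({
    "Column1": ["x1", "x1", "x2", "x3", "x3", "x3"],
    "Column2": ["No", "Yes", "No", "No", "No", "Yes"],
})


def count_no_literal(frame):
    # "x" is a string literal here, so this looks for rows whose
    # Column1 is the text "x" and ignores any Python variable named x.
    return len(frame.query('Column1 == "x" and Column2 == "No"'))


def count_no_for(frame, x):
    # @x tells query() to look up the Python variable x in the calling scope.
    return len(frame.query('Column1 == @x and Column2 == "No"'))


# One count per distinct Column1 value.
counts = {x: count_no_for(df, x) for x in df["Column1"].unique()}

print(count_no_literal(df))
print(counts)
```

With this sample, the literal version matches no rows at all, while the `@x` version counts the "No" rows for each value.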
    
    Or consider this, which filters the rows once and then groups them by Column1:

    for x, sub_df in df[df['Column2'] == 'No'].groupby('Column1'):
        y_count = len(sub_df)  # number of "No" rows in this group
        print('Total number of {} is {}.'.format(x, y_count))
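
One thing to watch with filtering before grouping: any Column1 value that has no "No" rows disappears in the filter, so it is never printed. If zero counts matter, a possible variation is to count the matches per group instead of filtering first. This is only a sketch on made-up data, and `no_counts` is an illustrative name:

```python
import pandas as pd

# Made-up sample data; "x4" has no "No" rows at all.
df = pd.DataFrame({
    "Column1": ["x1", "x1", "x2", "x3", "x3", "x3", "x4"],
    "Column2": ["No", "Yes", "No", "No", "No", "Yes", "Yes"],
})

# Build a boolean Series (True where Column2 is "No"), then sum it
# within each Column1 group. Groups without any match sum to 0.
no_counts = (df["Column2"] == "No").groupby(df["Column1"]).sum().astype(int)

for x, y_count in no_counts.items():
    print("Total number of {} is {}.".format(x, y_count))
```

This avoids the Python-level loop over sub-frames and keeps every Column1 value in the result, including those whose count is zero.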
    
