How to count the number of rows that satisfy a condition across two columns in pandas using groupby

I have a DataFrame with multiple columns, four of which are car, company_name, id, and status. Each car has an associated company_name and status, and each company_name is linked to a unique ID. One possible status is Rented. I'm trying to count the number of Rented cars for each company, stored in a new column called # of Rented Cars, and I have been trying to use the companies' unique IDs to do so.

I have tried using groupby and apply, but without success.

df['# of Rented Cars'] = df.groupby('unique_id')['status'].apply(lambda x: (x=='Rented').sum())

Using the following table as an example, you can see the values I want in the # of Rented Cars Column:


But using the code above, I just get NaN for every value in the last column.

1 answer

  • answered 2019-05-15 03:13 YOBEN_S

    I think you are looking for transform, which returns a result with the same index as the original rows:

    df['# of Rented Cars'] = df.groupby('unique_id')['status'].transform(lambda x: (x=='Rented').sum())

    Or, without a lambda:

    df['# of Rented Cars'] = df['status'].eq('Rented').groupby(df['unique_id']).transform('sum')
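
To see why the original attempt likely produced NaN, here is a minimal sketch on made-up data (the rows below are hypothetical; only the column names come from the question). A per-group reduction returns one value per unique_id, indexed by unique_id, so assigning it back to df aligns on index labels that do not match the row labels. transform broadcasts each group's total back onto the original rows instead.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question's code.
df = pd.DataFrame({
    "car": ["A", "B", "C", "D", "E"],
    "company_name": ["X", "X", "X", "Y", "Y"],
    "unique_id": [101, 101, 101, 202, 202],
    "status": ["Rented", "Available", "Rented", "Rented", "Available"],
})

# Boolean Series: True where the car is rented.
is_rented = df["status"].eq("Rented")

# Reduction: one value per group, indexed by unique_id (101, 202).
# Assigning this to a df column aligns on index labels; df's rows are
# labelled 0..4, so nothing matches and the column would be all NaN.
per_group = is_rented.groupby(df["unique_id"]).sum()

# transform: same length and index as df, each row gets its group's total.
df["# of Rented Cars"] = is_rented.groupby(df["unique_id"]).transform("sum")

print(per_group)
print(df)
```

If only the per-company totals are needed rather than a value repeated on every row, per_group is already that summary.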