How to skip the null value of a column while creating the dense rank in pandas dataframe?
I have to create dense rank based on the value of a column. That column has some null value and some float values. I have to skip null value and concentrate on float values. I am not sure how to do this? Can you please help me for doing this?
Explain:
df:
comb_fld_order
0 NULL
1 NULL
2 NULL
3 1.1
4 1.1
5 NULL
I am using below code to creating dense rank
df['dense_rank'] = df.groupby('comb_fld_order', sort=True).ngroup()+1
It is giving me output:
comb_fld_order dense_rank
0 NULL 2
1 NULL 2
2 NULL 2
3 1.1 1
4 1.1 1
5 NULL 2
But I want below output:
comb_fld_order dense_rank
0 NULL
1 NULL
2 NULL
3 1.1 1
4 1.1 1
5 NULL
2 answers

 Filter out
NaN
values with.notnull()
prior to using.groupby
 Then, use
fillna('')
if you want blanks instead ofNaN
import pandas as pd df['dense_rank'] = df[df['comb_fld_order'].notnull()].groupby('comb_fld_order', sort=True).ngroup()+1 df['dense_rank'] = df['dense_rank'].fillna('') df Out[1]: comb_fld_order dense_rank 0 NaN 1 NaN 2 NaN 3 1.1 1 4 1.1 1 5 NaN
If
NULL
is a string, then you can do:df['dense_rank'] = df[df['comb_fld_order'] != 'NULL'].groupby('comb_fld_order', sort=True).ngroup()+1
 Filter out

You can just give dropna=True
Here's what I did.
>>> import pandas as pd >>> import numpy as np >>> df = pd.DataFrame({'comb_fld_order':[np.nan, np.nan, np.nan, 1.1, 1.1, np.nan]}) >>> df comb_fld_order 0 NaN 1 NaN 2 NaN 3 1.1 4 1.1 5 NaN >>> df['dense_rank'] = df.groupby('comb_fld_order', sort=True,dropna=True).ngroup()+1 >>> df comb_fld_order dense_rank 0 NaN 0 1 NaN 0 2 NaN 0 3 1.1 1 4 1.1 1 5 NaN 0