Select a column in a DataFrame and mask the duplicates

I have a DataFrame like this:

import pandas as pd

dict_data = {
    'Date': pd.Timestamp('20200720'),
    'Number': 123,
    'course': pd.Series(['Python', 'Quant', 'CFA', 'Finance', 'Python', 'Python', 'Finance', 'Finance']),
    'Company': ['AA', 'BB', 'CC', 'DD', 'BB', 'BB', 'DD', 'CC']
}

pd.DataFrame(dict_data)

I can select a column, for example dict_data['course'], and it outputs all the data in that column. Is there a method that masks the duplicate values, so that the output looks like this?

0     Python
1      Quant
2        CFA
3    Finance

1 answer

  • answered 2020-10-20 16:51 Mayank Porwal

    You can call drop_duplicates() on the selected column. It keeps the first occurrence of each value and drops the later repeats:

    df = pd.DataFrame(dict_data)
    
    In [1327]: df.course.drop_duplicates()
    Out[1327]: 
    0     Python
    1      Quant
    2        CFA
    3    Finance
    Name: course, dtype: object
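
Since the question asks about masking, here is a minimal sketch of a few related approaches built on the same data. The variable names (is_repeat, first_seen, masked, unique_values) are made up for illustration; duplicated(), mask() and unique() are standard pandas Series methods. Which one fits depends on whether the result should keep the original length.

```python
import pandas as pd

dict_data = {
    'Date': pd.Timestamp('20200720'),
    'Number': 123,
    'course': pd.Series(['Python', 'Quant', 'CFA', 'Finance',
                         'Python', 'Python', 'Finance', 'Finance']),
    'Company': ['AA', 'BB', 'CC', 'DD', 'BB', 'BB', 'DD', 'CC'],
}
df = pd.DataFrame(dict_data)

# Boolean Series: True for each value that already appeared earlier in the column
is_repeat = df['course'].duplicated()

# Keep only the first occurrences (same rows that drop_duplicates() keeps)
first_seen = df['course'][~is_repeat]

# Keep the original length, replacing repeated values with NaN
masked = df['course'].mask(is_repeat)

# Distinct values as an array, without the index
unique_values = df['course'].unique()

print(first_seen)
print(masked)
print(unique_values)
```

The boolean mask is useful when the same filter should be applied to whole rows, for example df[~is_repeat], while mask() is the option to reach for if the row positions have to stay aligned with the rest of the frame.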