How can I create a conditional text column in Pandas?

In a Pandas DataFrame vinhos I have a (quite long) text column regiao. I want to create a new column reg containing every value from the nome column of another DataFrame, local, that is found in regiao. Below are sample data, the desired result, and the code I am using:

local['nome']
0, Vinho Verde
1, Minho
...
4, Douro
5, Porto

vinhos['regiao']
...
232, Douro tinto 2014
...

vinhos['reg']
Douro

vinhos['reg'] = ','.join([r for r in local['nome'] if r in vinhos['regiao']])

but it returns an empty column even though there are matching elements.

Could you help me?

1 answer

  • answered 2018-01-14 10:50 jezrael

    I believe you need str.findall with a word-boundary pattern, followed by str.join:

    print (vinhos)
                        regiao
    232       Douro tinto 2014
    233  Vinho Verde Douro new
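
    # one alternation pattern; \b keeps each name from matching inside a longer word
    pat = '|'.join([r'\b{}\b'.format(x) for x in local['nome'].tolist()])
    # findall returns a list of matches per row; str.join turns it into a comma-separated string
    vinhos['reg'] = vinhos['regiao'].str.findall(pat).str.join(',')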
    print (vinhos)
                        regiao                reg
    232       Douro tinto 2014              Douro
    233  Vinho Verde Douro new  Vinho Verde,Douro
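
For reference, here is a self-contained sketch of the same approach that can be run as-is. It also shows why the original attempt likely came back empty: `in` applied to a Series tests the index labels rather than the values, so `r in vinhos['regiao']` is false for every name. The sample frames are rebuilt from the snippets in the question; the third row (index 234) and the use of `re.escape` are assumptions added for illustration, not part of the original answer.

```python
import re

import pandas as pd

# Sample data rebuilt from the question; row 234 is a made-up non-matching row.
local = pd.DataFrame({'nome': ['Vinho Verde', 'Minho', 'Douro', 'Porto']})
vinhos = pd.DataFrame(
    {'regiao': ['Douro tinto 2014', 'Vinho Verde Douro new', 'Alentejo branco']},
    index=[232, 233, 234],
)

# `in` on a Series checks index labels, not values,
# which is why the list comprehension in the question finds nothing.
print('Douro' in vinhos['regiao'])  # False: 'Douro' is not an index label
print(232 in vinhos['regiao'])      # True: 232 is an index label

# Build one alternation pattern. re.escape (an addition here) protects
# names that might contain regex metacharacters such as '.' or '('.
pat = '|'.join(r'\b{}\b'.format(re.escape(name)) for name in local['nome'])

# findall gives a list of matches per row; str.join collapses it to a string.
vinhos['reg'] = vinhos['regiao'].str.findall(pat).str.join(',')
print(vinhos)
```

Rows with no match end up with an empty string in reg, since joining an empty list yields ''.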