Creating IF statement for column based on NaN
Here's a sample of my data.
    df[['caption', 'mentions']].sample(7)

                                                  caption                   mentions
    42  b'Alexa is helping people of all abilities do ...                        NaN
    48                                                NaN                        NaN
    7   b'Introducing Amazon Pharmacy. :pill::clipboar...                        NaN
    25  b"When it's day:victory_hand_selector:and the ...       charliesmallsthedood
    58  b'We look at all angles when it comes to safet...                        NaN
    88  b'A night in with your favorite food + pup + e...  amazonfiretv,lissettecalv
    22  b'Get everyday essentials auto-delivered AND s...                        NaN
I want to create a column that counts the number of mentions for each caption. For the above sample it would return (0, 0, 0, 1, 0, 2, 0).
Here is what I've tried so far:
    mentions = df['mentions'].str.lower().str.split(',')
    for value in df['mentions']:
        if value != 'nan':
            df['mention_counts'] = mentions.apply(len)
        else:
            df['mention_counts'] = 0
The easiest thing to do would be to write your functionality out explicitly, like so:
    import numpy as np

    def count_thing(row):
        if isinstance(row.mentions, str):
            # comma-separated handles: count the pieces
            return len(row.mentions.split(','))
        elif np.isnan(row.mentions):
            # missing value: no mentions
            return 0
        else:
            pass  # not sure how you want to deal with this case...
and then use apply to get the required column:

    df['mention_counts'] = df.apply(count_thing, axis=1)
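To see the whole thing run end to end, here is a minimal sketch. The DataFrame below is a hypothetical stand-in that only mirrors the mentions column from the sample in the question, and the function is simplified on the assumption that anything that is not a string should count as zero mentions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the 'mentions' column of the question's sample.
df = pd.DataFrame(
    {"mentions": [np.nan, np.nan, np.nan, "charliesmallsthedood",
                  np.nan, "amazonfiretv,lissettecalv", np.nan]},
    index=[42, 48, 7, 25, 58, 88, 22],
)

def count_thing(row):
    if isinstance(row.mentions, str):
        # Comma-separated handles: count the pieces.
        return len(row.mentions.split(","))
    # Assumption: any non-string value (such as NaN) means zero mentions.
    return 0

# axis=1 passes each row to the function, one at a time.
df["mention_counts"] = df.apply(count_thing, axis=1)
counts = df["mention_counts"].tolist()
print(counts)  # [0, 0, 0, 1, 0, 2, 0]
```

If you need the third case to behave differently (say, raise on unexpected types), put that branch back in place of the final return.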
On a side note, I don't see any reason to use lower, seeing as you're splitting on ',', which is unaffected by case.
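A quick way to check that is sketched below, using made-up mixed-case values rather than your data. It also shows a vectorized alternative to the apply approach (a different technique, not the function above): the .str accessor leaves NaN rows as NaN, so they can be filled with 0 afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-case values, not taken from the question's data.
mentions = pd.Series(["AmazonFireTV,LissetteCalv", "charliesmallsthedood", np.nan])

# Count the pieces with and without lowercasing first.
with_lower = mentions.str.lower().str.split(",").str.len()
without_lower = mentions.str.split(",").str.len()
same = with_lower.equals(without_lower)

# NaN rows stay NaN through the .str accessor, so fill them with 0 afterwards.
counts = without_lower.fillna(0).astype(int)

print(same)             # True
print(counts.tolist())  # [2, 1, 0]
```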