Dropping column during for loop - Pandas

I have two basic DataFrames, and I combine them into a list called dfCombo:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(12).reshape(3,4), columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(np.arange(12,24).reshape(3,4), columns=['A', 'B', 'C', 'D'])
dfCombo = [df, df2]

Both are 3x4 DataFrames with columns A, B, C, and D.

I am able to use a for loop to add a column to both DataFrames with the following code:

for df3 in dfCombo:
    df3['E'] = df3['A'] + df3['B']

After this, both df and df2 have a new column E. However, when I try to drop a column the same way with the code below, no columns are dropped:

for df3 in dfCombo:
    df3 = df3.drop('B', axis = 1)

or

for df3 in dfCombo:
    df3 = df3.drop(columns = ['B'])

If I use the same code on a single DataFrame, the column is dropped:

df2 = df2.drop('B', axis = 1)

or

df2 = df2.drop(columns = ['B'])

If you could help me understand what is going on I would be most appreciative.

1 answer

  • answered 2018-10-11 19:55 rahlf23

    You need to use inplace=True:

    for df3 in dfCombo:
        df3.drop('B', axis = 1, inplace=True)
    

    Printing df and df2 afterwards (with column E already added by the earlier loop) gives:

       A   C   D   E
    0  0   2   3   1
    1  4   6   7   9
    2  8  10  11  17
    
        A   C   D   E
    0  12  14  15  25
    1  16  18  19  33
    2  20  22  23  41
    

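    In your loop, df3 = df3.drop(...) only rebinds the loop variable df3 to the new DataFrame that drop returns; the objects stored in dfCombo (and referred to by df and df2) are left unchanged. Adding a column worked because df3['E'] = ... modifies the existing object rather than creating a new one. With the default inplace=False, drop returns a new DataFrame, so the result has to be assigned back to the name you want updated, as in df2 = df2.drop('B', axis=1). With inplace=True, drop modifies the existing object and returns None, so there is nothing to assign back.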
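
A minimal sketch of the three behaviours side by side, reusing the frames from the question. The names after_rebinding, dropped, new_df, new_df2, and after_inplace are introduced here only for illustration. The list-comprehension variant is an alternative to inplace=True, not something the original answer proposed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=["A", "B", "C", "D"])
df2 = pd.DataFrame(np.arange(12, 24).reshape(3, 4), columns=["A", "B", "C", "D"])
dfCombo = [df, df2]

# 1) Rebinding the loop variable: drop() returns a new DataFrame and df3
#    is pointed at it, but dfCombo, df and df2 still refer to the old objects.
for df3 in dfCombo:
    df3 = df3.drop(columns=["B"])
after_rebinding = list(df.columns)  # 'B' is still present

# 2) Alternative without inplace: collect the new frames in a new list
#    and bind them to names explicitly.
dropped = [frame.drop(columns=["B"]) for frame in dfCombo]
new_df, new_df2 = dropped  # df and df2 themselves are still unchanged

# 3) In-place drop: the existing objects are modified, so every name
#    that refers to them (df, df2, dfCombo[0], ...) sees the change.
for df3 in dfCombo:
    df3.drop(columns=["B"], inplace=True)
after_inplace = list(df.columns)

print(after_rebinding)
print(list(new_df.columns))
print(after_inplace)
```

If you would rather avoid inplace=True, variant 2 works as long as you keep using the new list (or unpack it back into df and df2), since the original frames are not touched.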