join time series NA values using Pandas

How can I set the col1 values in input1 to NaN for every date listed in input2, using the date as the index?

input1

                 col1
2020-02-01 00:00:00 0
2020-02-01 00:01:00 0
2020-02-01 00:02:00 0
2020-02-01 00:03:00 0
2020-02-01 00:04:00 0
2020-02-03 00:02:00 0
2020-02-04 00:03:00 0
2020-02-05 00:04:00 0

input2

2020-02-03 NaN
2020-02-04 NaN

output

                 col1
2020-02-01 00:00:00 0
2020-02-01 00:01:00 0
2020-02-01 00:02:00 0
2020-02-01 00:03:00 0
2020-02-01 00:04:00 0
2020-02-03 00:02:00 NaN
2020-02-04 00:03:00 NaN
2020-02-05 00:04:00 0

1 answer

  • answered 2020-06-27 04:42 Psidom

    You can use merge if performance is a concern, but for simplicity you can also use boolean indexing for conditional assignment.

    Say df1 and df2 are created as follows:

    import pandas as pd
    import numpy as np
    
    # prepare the data
    data = [["2020-02-01 00:00:00", 0 ],
    ["2020-02-01 00:01:00", 0 ],
    ["2020-02-01 00:02:00", 0 ],
    ["2020-02-01 00:03:00", 0 ],
    ["2020-02-01 00:04:00", 0 ],
    ["2020-02-03 00:02:00", 0 ],
    ["2020-02-04 00:03:00", 0 ],
    ["2020-02-05 00:04:00", 0 ]]
    
    df1 = pd.DataFrame(data, columns = ['datetime', 'col'])
    df1['datetime'] = pd.to_datetime(df1.datetime)
    df1.set_index('datetime', inplace=True)
    
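    # df2 holds the dates whose rows should become NaN
    df2 = pd.DataFrame([
        ['2020-02-03', None],
        ['2020-02-04', None]],
        columns=['date', 'col'])
    # keep plain datetime.date objects so they compare against df1.index.date
    df2['date'] = pd.to_datetime(df2['date']).dt.date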
    

    Then the following should work:

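    # cast to float first: an integer column cannot hold NaN, and recent pandas
    # versions warn about (or reject) the implicit upcast
    df1['col'] = df1['col'].astype(float)
    # rows whose calendar date appears in df2 get NaN
    df1.loc[np.isin(df1.index.date, df2['date'].tolist()), 'col'] = np.nan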
    print(df1)
    
    #                      col
    #datetime                
    #2020-02-01 00:00:00  0.0
    #2020-02-01 00:01:00  0.0
    #2020-02-01 00:02:00  0.0
    #2020-02-01 00:03:00  0.0
    #2020-02-01 00:04:00  0.0
    #2020-02-03 00:02:00  NaN
    #2020-02-04 00:03:00  NaN
    #2020-02-05 00:04:00  0.0
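
    To see why the one-liner works, it may help to split it into steps. The sketch below uses a shortened copy of the data, and the names row_dates and mask are only illustrative:

```python
import numpy as np
import pandas as pd

# Shortened copy of the answer's data; "col" starts as float so it can hold NaN.
df1 = pd.DataFrame(
    {"col": [0.0, 0.0, 0.0, 0.0]},
    index=pd.to_datetime([
        "2020-02-01 00:00:00",
        "2020-02-03 00:02:00",
        "2020-02-04 00:03:00",
        "2020-02-05 00:04:00",
    ]),
)
df2 = pd.DataFrame({"date": pd.to_datetime(["2020-02-03", "2020-02-04"]).date})

# Step 1: drop the time of day, leaving one datetime.date per row of df1.
row_dates = df1.index.date

# Step 2: build a boolean array, True where the row's date appears in df2.
mask = np.isin(row_dates, df2["date"].tolist())

# Step 3: assign NaN only to the rows selected by the mask.
df1.loc[mask, "col"] = np.nan
print(df1)
```

    Only the rows dated 2020-02-03 and 2020-02-04 are selected by the mask, so the other rows keep their original values.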
    

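    The merge route mentioned at the top of this answer is not spelled out, so here is one possible sketch of it. It assumes a helper column named "date" holding each timestamp truncated to midnight, and it relies on the indicator=True option of merge to flag matching rows. The data is a shortened copy of the example, and the names left, merged and result are only illustrative:

```python
import pandas as pd

# Shortened copy of the example data, with a named DatetimeIndex.
df1 = pd.DataFrame(
    {"col": [0.0, 0.0, 0.0, 0.0]},
    index=pd.DatetimeIndex([
        "2020-02-01 00:00:00",
        "2020-02-03 00:02:00",
        "2020-02-04 00:03:00",
        "2020-02-05 00:04:00",
    ], name="datetime"),
)
df2 = pd.DataFrame({"date": pd.to_datetime(["2020-02-03", "2020-02-04"])})

# Move the index into a column and add the midnight-truncated join key.
left = df1.reset_index()
left["date"] = left["datetime"].dt.normalize()

# A left merge keeps every row of df1; indicator=True adds a "_merge" column
# that reads "both" for rows whose date also appears in df2.
# drop_duplicates() guards against repeated dates multiplying rows.
merged = left.merge(df2.drop_duplicates(), on="date", how="left", indicator=True)

# Blank out the matched rows, then restore the original index and column.
merged.loc[merged["_merge"] == "both", "col"] = float("nan")
result = merged.set_index("datetime")[["col"]]
print(result)
```

    Whether this is actually faster than the boolean mask depends on the data size, so it is worth timing both on your own frames.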