Calculate a date and time from a timedelta column in a data frame

Suppose I have two columns in a data frame: a probability and the remaining time to an event.

      prob          time         
0   0.975909   0 days 00:00:00   
1   0.957819   0 days 01:00:00   
2   0.937498   0 days 02:00:00   
3   0.912779   0 days 03:00:00   
4   0.894139   0 days 04:00:00   
5   0.873184   0 days 05:00:00   
6   0.847748   0 days 06:00:00   
7   0.828572   0 days 07:00:00   
8   0.807029   0 days 08:00:00   
9   0.780847   0 days 09:00:00   
10  0.761082   0 days 10:00:00   
11  0.738855   0 days 11:00:00   
12  0.711733   0 days 12:00:00   

I want to calculate an exact date and time from two additional inputs: a date and time, and an expected probability. For example:

# Type the date of input data 
i = datetime.datetime.now() #e.g. 2018-01-01 00:00:00

# Type the expected probability 
exprob = 0.80

The output I need is this: find the probability nearest to 'exprob' (0.80), which is 0.807029, then calculate 'i' + the time in that row, e.g. 2018-01-01 00:00:00 + 08:00:00 = 2018-01-01 08:00:00.

2 answers

  • answered 2018-11-08 06:39 Franco Piccolo

    You can use idxmin to find the index of the row with the smallest absolute difference between df['prob'] and exprob, then look up the Timedelta in that row and add it to the date i:

    import datetime

    i = datetime.datetime.now()
    exprob = 0.80

    # Label of the row whose prob is closest to exprob, then add its Timedelta to i
    df.loc[(df['prob'] - exprob).abs().idxmin(), 'time'] + i
    # Timestamp('2018-11-08 18:36:11.529609')
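
As a check against the numbers in the question, here is a minimal, self-contained sketch of the idxmin approach. It assumes a fixed start of 2018-01-01 00:00:00 instead of `now()`, and the helper name `nearest_event_time` is hypothetical, introduced only for illustration.

```python
import pandas as pd


def nearest_event_time(df, start, exprob):
    """Return start plus the 'time' of the row whose 'prob' is closest to exprob."""
    # idxmin() gives the index label of the smallest absolute difference
    nearest_label = (df["prob"] - exprob).abs().idxmin()
    return start + df.loc[nearest_label, "time"]


# Same data as in the question: one row per hour, 0 to 12 hours
df = pd.DataFrame({
    "prob": [0.975909, 0.957819, 0.937498, 0.912779, 0.894139,
             0.873184, 0.847748, 0.828572, 0.807029, 0.780847,
             0.761082, 0.738855, 0.711733],
    "time": pd.to_timedelta(list(range(13)), unit="h"),
})

# Assumption: a fixed reference date so the result is reproducible
start = pd.Timestamp("2018-01-01 00:00:00")

result = nearest_event_time(df, start, 0.80)
print(result)  # 2018-01-01 08:00:00
```

The nearest probability to 0.80 is 0.807029 (row 8), so the result is the start plus 8 hours.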
    

  • answered 2018-11-08 06:40 Xpeditions

    Using argsort(), we can get it as shown below.

    exprob = 0.80
    i = datetime.now()
    # argsort() returns positions, so index with .iloc; [:1] keeps only the nearest row
    next_time = i + df.iloc[(df['prob'] - exprob).abs().argsort()[:1]]['time']
    

    Complete example is

    import pandas as pd
    from datetime import datetime

    # Probabilities and the remaining time to the event, one row per hour
    df = pd.DataFrame({
        'prob': [0.975909, 0.957819, 0.937498, 0.912779, 0.894139,
                 0.873184, 0.847748, 0.828572, 0.807029, 0.780847,
                 0.761082, 0.738855, 0.711733],
        'time': pd.to_timedelta(list(range(13)), unit='h'),
    })

    exprob = 0.80       # expected probability
    i = datetime.now()  # date and time of the input data

    # argsort() returns positions, so index with .iloc; [:1] keeps only the nearest row
    next_time = i + df.iloc[(df['prob'] - exprob).abs().argsort()[:1]]['time']

    print(i)
    print(next_time)
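
One difference from the idxmin approach is that the argsort version returns a Series rather than a single value, so the slice can be widened to get several nearest candidates. The sketch below illustrates this under two assumptions: a fixed start of 2018-01-01 00:00:00 instead of `now()`, and a choice of two candidates, which is arbitrary.

```python
import pandas as pd

# Same data as in the question: one row per hour, 0 to 12 hours
df = pd.DataFrame({
    "prob": [0.975909, 0.957819, 0.937498, 0.912779, 0.894139,
             0.873184, 0.847748, 0.828572, 0.807029, 0.780847,
             0.761082, 0.738855, 0.711733],
    "time": pd.to_timedelta(list(range(13)), unit="h"),
})

# Assumption: a fixed reference date so the result is reproducible
start = pd.Timestamp("2018-01-01 00:00:00")
exprob = 0.80

# Row positions ordered from closest to farthest from exprob
order = (df["prob"] - exprob).abs().argsort()

# Two nearest candidates; the result is a Series indexed by the original rows
candidates = start + df["time"].iloc[order.iloc[:2].to_numpy()]
print(candidates)

# The first element is the single nearest match, as a scalar Timestamp
nearest = candidates.iloc[0]
print(nearest)  # 2018-01-01 08:00:00
```

With this data the two closest probabilities are 0.807029 (row 8) and 0.780847 (row 9), so the candidates are 08:00 and 09:00 on 2018-01-01.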