Efficiently find which range a value belongs to

I have some datetime ranges associated with values. I imagine the problem would be same for other ranges, such as integers.

ranges = [
    (datetime.datetime(2021, 6, 10, 10, 0), datetime.datetime(2021, 6, 10, 10, 30), 100),
    (datetime.datetime(2021, 6, 10, 10, 30), datetime.datetime(2021, 6, 10, 11, 0), 200),
    (datetime.datetime(2021, 6, 10, 11, 0), datetime.datetime(2021, 6, 10, 11, 30), 150),
    ...
]

This list is sorted and contains no gaps or overlaps. The lower bound is inclusive, and the upper bound is exclusive.

For a given datetime, I'd like to find the value of the range it belongs to:

def get_value_for_datetime(dt: datetime.datetime) -> int:
    ???

For example:

>>> get_value_for_datetime(datetime.datetime(2021, 6, 10, 10, 45)
>>> 200

My first thought was to look at the bisect module but it seems like there's nothing in here to help me with ranges and there's no way to provide a custom function for deciding to look to the left or the right, but maybe I'm missing something.

I'm also not averse to a numpy and/or pandas solution if it's significantly faster, and it's also possible to use a different structure for the ranges list if it helps.

1 answer

  • answered 2021-06-10 11:09 a_guest

    You can use bisection on the lower bounds and then check if the corresponding upper bound satisfies the condition upper_bound > value:

    import bisect
    
    lb, ub, values = zip(*ranges)
    
    def get_value_for_datetime(x):
        index = bisect.bisect_right(lb, x) - 1
        if index == -1 or ub[index] <= x:
            raise ValueError(x)
        return values[index]