maximum of minimum of difference in subsequence of k size

Given a sorted sequence of n elements. Find the maximum of all the minimums taken from all pairs differences of subsequences of length k.

Here 1<=n<=10^5
and 2<=k<=n

For eg: [2, 3, 5, 9] and k = 3
there are 4 subsequences:

[2, 3, 5] - min diff of all pairs = 1
[2, 3, 9] - min diff of all pairs = 1
[3, 5, 9] - min diff of all pairs = 2
[2, 5, 9] - min diff of all pairs = 3

So answer is max of all min diffs = 3

The naive way is to find all k length subsequences and then find mins in each of them and then max of all of them but that will time out because of the constraints.

Apart from that what I thought was to find the sequence which is optimally distanced so that the min becomes maximum.

Can someone give an optimal and better solution?

1 answer

  • answered 2018-11-08 01:18 btilly

    Suppose that your sequence of integers is a[i]. Then my solution will find the answer in time O(n log((a[n-1]-a[0])/n)). If your sequence is floating point numbers it will likely run in similar time, but could theoretically be as bad as O(n^3).

    The key observation is this. It is easy to construct the most compact sequence starting at the first element whose minimum gap is at least m. Just take the first element, and take each other one when it is at at least m bigger than the last one that you took. So we can write a function that constructs this sequence and tracks 3 numbers:

    1. How many elements we got
    2. The size of the smallest gap that we found.
    3. The next smallest m that would result in a more compact sequence. That is, the largest gap to an element that we didn't include.

    In the case of your sequence if we did this with a gap of 2 we'd find that we took 3 elements, the smallest gap is 3, and we'd get a different sequence if we had looked for a gap of 1.

    This is enough information to construct a binary search for the desired gap length. With the key logic looking like this:

    lower_bound = 0
    upper_bound = (a[n-1] - a[0])/(k-1)
    while lower_bound < upper_bound:
        # Whether int or float, we want float to land "between" guesses
        guess = (lower_bound + upper_bound + 0.0) / 2
        (size, gap_found, gap_different) = min_gap_stats(a, guess)
        if k < size:
            # We must pick a more compact sequence
            upper_bound = gap_different
        else:
            # We can get this big a gap, but maybe we can get bigger?
            lower_bound = gap_found
    

    If we ran this for your sequence we'd first set a lower_bound of 0 and an upper_bound of 7/2 = 3 (thanks to integer division). And we'd immediately find the answer.

    If you had a sequence of floats with the same values it would take longer. We'd first try 3.5, and get a sequence of 2 with a different decision at 3. We'd then try 1.5, and find our sequence of 3 with the gap that we want.

    The binary search will usually make this take a logarithmic number of passes. However each time we set either the upper or lower bound to the size of an actual pairwise gap. Since there are only O(n^2) gaps, we are guaranteed to need no more than that many passes.