Why is std::common_iterator just std::forward_iterator?

C++20 introduced a std::common_iterator that is capable of representing a non-common range of elements (where the types of the iterator and sentinel differ) as a common range (where they are the same), its synopsis defines as:

template<input_­or_­output_­iterator I, sentinel_­for<I> S>
    requires (!same_­as<I, S> && copyable<I>)
class common_iterator {
  // ...
  variant<I, S> v_;   // exposition only

And it is useful for interfacing with legacy code that expects the begin and end of a range to have the same type.

In [iterators.common#common.iter.types-1.1], its iterator_­concept is defined as:

iterator_­concept denotes forward_­iterator_­tag if I models forward_­iterator; otherwise it denotes input_­iterator_­tag.

Why common_iterator can only be a forward_iterator at most, and cannot completely define its iterator_concept based on I‘s iterator_category? For example, if I is random_asscess_iterator, then common_iterator<I, S> is random_asscess_iterator, and so on.

It seems that this is technically feasible since common_iterator just uses std::variant to type erase I and S.

Consider following (godbolt):

auto r = views::iota(0) | std::views::take(5);
static_assert( ranges::random_access_range<decltype(r)>);

auto cr = r | views::common;
static_assert( ranges::forward_range<decltype(cr)>);

r is random_access_range, so C++20 constrained algorithms such as ranges::binary_search can use this trait to perform more efficient operations on it, but in order to enable the old std algorithm to apply it, we need to use views::common to convert it to a common_range. However, it also degenerates to a forward_range, which reduces the efficiency of the algorithm.

Why does the standard only define common_iterator as forward_iterator at most? What are the considerations behind this?

3 answers

  • answered 2021-05-15 13:02 Philipp Claßen

    The concept of a sentinel is closely linked to a iterator as it is known from other languages, which support to advance and test whether you reached the end. A good example would be a zero-terminated string where you stop when you reached \0 but do not know the size in advance.

    My assumption is that modeling it as a std::forward_iterator is enough for the use cases where you would need to convert a C++20 iterator with a sentinel to call an older algorithm.

    I also think it should be possible to provide a generic implementation that could detect cases where the iterator provides more functionality. It would complicate the implementation in the standard library, maybe that was the argument against it. In generic code, you could still detect the special cases yourself to avoid wrapping a random access iterator.

    But to my understanding, if you deal with a performance critical code section, you should be careful with wrapping everything as a std::common_iterator unless it is needed. I would not be surprised if the underlying variant introduces some overhead.

  • answered 2021-05-15 13:58 Yakk - Adam Nevraumont

    As @daves mentioned, the end iterator is an iterator.

    If your iterator supports -- (as everything more powerful than forward does), then your common iterator has to be able to -- from the wrapped sentinel.

    Sentinels don't support --; you'd have fake it somehow. And in many cases this requires O(n) work; basically scan for the sentinel and turn it into an iterator. Which sucks.

  • answered 2021-05-15 14:07 Nicol Bolas

    If you have a non-common range of iterators, and you need to transform it into a common range, then you basically have two choices.

    You can either compute what the ending iterator would be by successively incrementing the beginning iterator until it is equal to the sentinel, or you can do trickery. common_iterator is for the latter.

    That's important because successively incrementing the begin iterator has two flaws. First, you can't do that if it's not at least a forward range. Second... what happens if the range is infinite? Because that's a thing in C++20 ranges. Indeed, that possibility is one of the most important reasons why we have a sentinel type.

    So trickery it is then. However, "trickery" in this case means creating a new iterator type that is used for both the begin and the sentinel. Therefore, it must have the limitations of both of these. The sentinel basically has to pretend that it's an iterator.

    Iterators can be incremented; sentinels can't. However, you're not allowed to increment the end iterator of a range anyway, so you can pretend that incrementing the end common_iterator is allowed. Similarly, you can't derefence a sentinel, but you can't dereference the end iterator either. So it can pretend that it could be dereferenced, even though no algorithm will do that.

    This means that for a forward range, you cannot do anything except test the ending iterator against some other iterator. In short, for a forward range, the end iterator may as well be a sentinel.

    But that's not true for higher levels of ranges. An algorithm that takes bidirectional ranges is explicitly permitted to move the end iterator backwards. But you can't do that to a sentinel that's pretending its an iterator.

    That's why a common_iterator range cannot be anything higher than a forward range.