Title
The past end issue for `lazy_split_view`
Status
new
Section
[range.lazy.split.outer]
Submitter
Hewill Kang

Created on 2025-04-26.00:00:00 last changed 1 week ago

Messages

Date: 2025-04-27.16:09:32

Proposed resolution:

This wording is relative to N5008.

  1. Modify [range.lazy.split.outer] as indicated:

    namespace std::ranges {
      template<input_range V, forward_range Pattern>
        requires view<V> && view<Pattern> &&
                 indirectly_comparable<iterator_t<V>, iterator_t<Pattern>, ranges::equal_to> &&
                 (forward_range<V> || tiny-range<Pattern>)
      template<bool Const>
      struct lazy_split_view<V, Pattern>::outer-iterator {
      private:
        using Parent = maybe-const<Const, lazy_split_view>;     // exposition only
        using Base = maybe-const<Const, V>;                     // exposition only
        Parent* parent_ = nullptr;                              // exposition only
    
        iterator_t<Base> current_ = iterator_t<Base>();         // exposition only, present only
                                                                // if V models forward_range
    
        bool trailing_empty_ = false;                           // exposition only
        bool has_next_ = false;                                 // exposition only, present only
                                                                // if forward_range<V> is false
      public:
        […]
      };
    }
    
    […]
    constexpr explicit outer-iterator(Parent& parent)
      requires (!forward_range<Base>);
    

    -2- Effects: Initializes parent_ with `addressof(parent)` and has_next_ with current != ranges::end(parent_->base_).

    […]
    constexpr outer-iterator& operator++();
    

    -6- Effects: Equivalent to:

    const auto end = ranges::end(parent_->base_);
    if (current == end) {
      trailing_empty_ = false;
      if constexpr (!forward_range<V>)
        has_next_ = false;
      return *this;
    }
    const auto [pbegin, pend] = subrange{parent_->pattern_};
    if (pbegin == pend) ++current;
    else if constexpr (tiny-range<Pattern>) {
      current = ranges::find(std::move(current), end, *pbegin);
      if (current != end) {
        ++current;
        if (current == end)
          trailing_empty_ = true;
      }
    }
    else {
      do {
        auto [b, p] = ranges::mismatch(current, end, pbegin, pend);
        if (p == pend) {
          current = b;
          if (current == end)
            trailing_empty_ = true;
          break;            // The pattern matched; skip it
        }
      } while (++current != end);
    }
    if constexpr (!forward_range<V>)
      if (current == end)
        has_next_ = false;
    return *this;
    
    […]
    friend constexpr bool operator==(const outer-iterator& x, default_sentinel_t);
    

    -8- Effects: Equivalent to:

    if constexpr (!forward_range<V>)
      return !x.has_next_ && !x.trailing_empty_;
    else
      return x.current == ranges::end(x.parent_->base_) && !x.trailing_empty_;
    
Date: 2025-04-26.00:00:00

Consider (demo):

#include <print>
#include <ranges>
#include <sstream>

int main() {
  std::istringstream is{"1 0 2 0 3"};
  auto r = std::views::istream<int>(is)
         | std::views::lazy_split(0)
         | std::views::stride(2);
  std::println("{}", r); // should print [[1], [3]]
}

The above leads to a SIGSEGV in libstdc++. The reason is that the nested range is iterated as:

for (auto&& inner : r) {
  for (auto&& elem : inner) {
    // […]
  }
}

which desugars to:

auto outer_it = r.begin();
std::default_sentinel_t out_end = r.end();
for(; outer_it != out_end; ++outer_it) {
  auto&& inner_r = *outer_it;
  auto inner_it = inner_r.begin();
  std::default_sentinel_t inner_end = inner_r.end();
  for(; inner_it != inner_end; ++inner_it) {
    auto&& elem = *inner_it;
    // […]
  }
}

Since `inner_it` and `outer_it` actually advance the same underlying iterator, by the time we return to the outer loop the lazy_split_view::outer-iterator already compares equal to `default_sentinel`. That makes `outer_it` appear to have reached the end, so `++outer_it` increments the iterator past the end, triggering the assertion.

Note that this also happens in MSVC-STL when `_ITERATOR_DEBUG_LEVEL` is turned on.

It seems that extra flags are needed to fix this issue, because `outer_it` should not be considered to have reached the end when we return to the outer loop.
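
The shared-cursor effect and the flag-based end check can be illustrated with a small toy model. This is only a sketch, not library code: `ToySplitter`, `Trace`, `walk` and all member names are hypothetical, a vector index stands in for the single-pass iterator, and only a single-element pattern is modelled. `at_end_by_cursor` mirrors the comparison that looks at `current`, while `at_end_by_flag` mirrors the proposed `has_next_`-based comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: every name here is hypothetical. A vector index stands in
// for the single-pass iterator shared by the outer and inner iterators.
struct ToySplitter {
  const std::vector<int>* data;
  int delim;
  std::size_t pos = 0;          // shared cursor (models current)
  bool trailing_empty = false;  // models trailing_empty_
  bool has_next;                // models the proposed has_next_

  ToySplitter(const std::vector<int>& d, int delimiter)
    : data(&d), delim(delimiter), has_next(!d.empty()) {}

  // Inner loop: read the current segment, advancing the shared cursor.
  std::vector<int> read_segment() {
    std::vector<int> out;
    while (pos != data->size() && (*data)[pos] != delim)
      out.push_back((*data)[pos++]);
    return out;
  }

  // End check that looks only at the shared cursor.
  bool at_end_by_cursor() const {
    return pos == data->size() && !trailing_empty;
  }

  // End check that consults the flag, as proposed for non-forward ranges.
  bool at_end_by_flag() const { return !has_next && !trailing_empty; }

  // Models outer operator++ for a single-element pattern.
  void next() {
    assert(!at_end_by_flag());  // incrementing at the end would be a bug
    const std::size_t end = data->size();
    if (pos == end) {           // the inner loop already consumed everything
      trailing_empty = false;
      has_next = false;
      return;
    }
    while (pos != end && (*data)[pos] != delim) ++pos;      // find delimiter
    if (pos != end && ++pos == end) trailing_empty = true;  // skip it
    if (pos == end) has_next = false;
  }
};

struct Trace {
  std::vector<std::vector<int>> segments;
  bool cursor_said_end_early = false;  // cursor check true before the last ++
};

// Nested iteration in the shape of the desugared loops, driven by the flag.
Trace walk(const std::vector<int>& input, int delim) {
  ToySplitter s(input, delim);
  Trace t;
  while (!s.at_end_by_flag()) {
    t.segments.push_back(s.read_segment());
    if (s.at_end_by_cursor()) t.cursor_said_end_early = true;
    s.next();
  }
  return t;
}
```

In this model, `walk({1, 0, 2, 0, 3}, 0)` collects three segments, and `cursor_said_end_early` is set because the cursor-based check already reports the end once the inner loop has read the last segment, before the outer increment runs. The flag-based check only reports the end after that increment.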

History
Date User Action Args
2025-04-27 16:09:32 admin set messages: + msg14735
2025-04-26 00:00:00 admin create