Created on 2023-11-19.00:00:00 last changed 4 months ago
Proposed resolution:
This wording is relative to N4964.
Modify [range.split.iterator] as indicated:
constexpr iterator(split_view& parent, iterator_t<V> current, subrange<iterator_t<V>> next);
-1- Effects: Initializes parent_ with addressof(parent), cur_ with std::move(current), next_ with std::move(next), and trailing_empty_ with cur_ == next_.begin().
Modify [range.lazy.split.outer] as indicated:
constexpr outer-iterator(Parent& parent, iterator_t<Base> current) requires forward_range<Base>;
-3- Effects: Initializes parent_ with addressof(parent), current_ with std::move(current), and trailing_empty_ with current_ == ranges::end(parent.base_).
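The effect of the [range.split.iterator] change can be sketched with a small index-based model. This is only an illustration under stated assumptions, not the library implementation: the names split_model, split_with_model and proposed_init are hypothetical, the base is restricted to a std::string_view with a single-character pattern, and the end of iteration is modeled on the sentinel comparison (cur_ == end && !trailing_empty_). The common_range end() case and the lazy_split overloads are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical, simplified model of split_view's iterator for a
// std::string_view base and a single-character pattern. Positions are indices.
class split_model {
public:
    split_model(std::string_view base, char delim, bool proposed_init)
        : base_(base), delim_(delim), cur_(0) {
        find_next(0);
        // Current wording: false. Proposed wording: cur_ == next_.begin().
        trailing_empty_ = proposed_init && cur_ == next_begin_;
    }

    // Models comparison with the sentinel: cur_ == end && !trailing_empty_.
    bool done() const { return cur_ == base_.size() && !trailing_empty_; }

    // Models operator*: the subrange [cur_, next_.begin()).
    std::string_view current() const {
        assert(!done());
        return base_.substr(cur_, next_begin_ - cur_);
    }

    // Models operator++ following the structure in [range.split.iterator].
    void advance() {
        cur_ = next_begin_;
        if (cur_ != base_.size()) {
            cur_ = next_end_;
            if (cur_ == base_.size()) {
                trailing_empty_ = true;
                next_begin_ = next_end_ = cur_;
            } else {
                find_next(cur_);
            }
        } else {
            trailing_empty_ = false;
        }
    }

private:
    // Models find-next: locates the next delimiter at or after `from`.
    void find_next(std::size_t from) {
        const std::size_t pos = base_.find(delim_, from);
        if (pos == std::string_view::npos) {
            next_begin_ = next_end_ = base_.size();
        } else {
            next_begin_ = pos;
            next_end_ = pos + 1;
        }
    }

    std::string_view base_;
    char delim_;
    std::size_t cur_;
    std::size_t next_begin_ = 0;
    std::size_t next_end_ = 0;
    bool trailing_empty_ = false;
};

// Collects every piece the model yields.
std::vector<std::string> split_with_model(std::string_view s, char delim,
                                          bool proposed_init) {
    std::vector<std::string> out;
    for (split_model it(s, delim, proposed_init); !it.done(); it.advance()) {
        out.emplace_back(it.current());
    }
    return out;
}
```

In this model, split_with_model("", ' ', false) yields no pieces, while split_with_model("", ' ', true) yields a single empty piece. The results for " x ", " " and "x" are the same with either initialization.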
[ 2024-03; Reflector comments ]
"For `split`, we need to adjust the definition of `end()` for the `common_range` case (which may require introducing a new constructor to the iterator); right now it would compare `ranges::end(base_)` against a value-initialized iterator, which is not in the domain of `==`. For `lazy_split`, we need to also change the non-forward overload."
"What should splitting an empty range on an empty pattern produce? Right now the behavior is that splitting a range of N > 0 elements with an empty pattern produces a range of N single-element ranges. I suppose you can argue that an empty pattern matches between adjacent elements but not at the start or end, so that an empty range, like a single-element range, contains 0 delimiters so should produce a range of one empty range. But it's also at least arguable that this should produce an empty range instead, so that we maintain the N element <-> N subrange and 1 element per subrange invariant."
[ 2024-03-11; Reflector poll ]
Set priority to 3 after reflector poll.
Consider the following example (which uses fmt::println instead of std::println, but they do the same thing in C++23):
    #include <iostream>
    #include <string>
    #include <ranges>
    #include <fmt/ranges.h>

    int main() {
        fmt::println("{}", std::views::split(std::string(" x "), ' '));
        fmt::println("{}", std::views::split(std::string(" "), ' '));
        fmt::println("{}", std::views::split(std::string("x"), ' '));
        fmt::println("{}", std::views::split(std::string(""), ' '));
    }
The output of this program (as specified today) is
    [[], ['x'], []]
    [[], []]
    [['x']]
    []
The principle set out in LWG 3478 is that splitting a sequence containing N delimiters should lead to N+1 subranges. That principle was broken if the N-th delimiter was at the end of the sequence, which was fixed by P2210.
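The N+1 principle can be written down as a small eager reference model. The sketch below is not std::views::split; the function name split_reference is hypothetical, and it assumes a std::string_view input with a single-character delimiter. Its closing assertion states the principle directly: N delimiters give N + 1 pieces, including the case N == 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical eager reference model (not std::views::split): cuts `s` at
// every occurrence of `delim` and returns the pieces in order.
std::vector<std::string> split_reference(std::string_view s, char delim) {
    std::vector<std::string> pieces;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = s.find(delim, start);
        if (pos == std::string_view::npos) {
            // No further delimiter: the remainder (possibly empty) is the
            // last piece.
            pieces.emplace_back(s.substr(start));
            break;
        }
        pieces.emplace_back(s.substr(start, pos - start));
        start = pos + 1;  // continue just past the delimiter
    }
    // The principle: N delimiters always give N + 1 pieces, also for N == 0.
    const auto n =
        static_cast<std::size_t>(std::count(s.begin(), s.end(), delim));
    assert(pieces.size() == n + 1);
    return pieces;
}
```

Under this model, " x " gives three pieces, " " gives two, "x" gives one, and the empty string gives one empty piece.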
However, the principle is still broken if the sequence contains zero delimiters. A non-empty sequence will split into one range, but an empty sequence will split into zero ranges. The last line of the output is therefore incorrect: splitting an empty range on a delimiter should yield a range containing one empty range, not simply an empty range.
Proposed Resolution: Currently, split_view::iterator's constructor unconditionally initializes trailing_empty_ to false. Instead, change [range.split.iterator]/1 to initialize trailing_empty_ to cur_ == next_.begin() (i.e. trailing_empty_ is typically false, but if the range is empty on initialization then there has to be a trailing empty range).
The following demo shows Barry Revzin's implementation from P2210, adjusted to fix this: godbolt.org/z/axWb64j9f
History
Date | User | Action | Args |
---|---|---|---|
2024-06-24 19:30:31 | admin | set | messages: + msg14211 |
2024-03-11 22:16:42 | admin | set | messages: + msg13989 |
2023-11-25 13:20:33 | admin | set | messages: + msg13874 |
2023-11-19 00:00:00 | admin | create |