`string_view(Iter, Iter)` constructor breaks existing code
Derek Zhang

Created on 2024-05-14.00:00:00 last changed 1 month ago


Date: 2024-05-20.10:49:19

[ Jonathan comments ]

At the very least, we should have an Annex C entry documenting the change. Making the new `string_view(Iter, Iter)` constructor `explicit` would prevent the runtime behaviour change for the second example, but GCC thinks the first example would still be ambiguous (it seems to depend on how list-initialization handles explicit constructors, which has implementation divergence).

Maybe we should have a deleted constructor matching string literals:

template<size_t N1, size_t N2>
basic_string_view(const charT(&)[N1], const charT(&)[N2]) = delete;
Or to handle both `const char[N]` and `char[N]`:

template<class A1, class A2>
requires (rank_v<A1> == 1) && (rank_v<A2> == 1)
basic_string_view(A1&, A2&) = delete;
Both options would prevent this currently valid (but weird) code:

const char arr[] = "str";
std::string_view s(arr, arr); // s.size() == 0 and s.data() == arr
That seems acceptable, because `std::string_view s(arr, 0)` is simpler and clearer anyway.
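
As a rough illustration of the first option, the sketch below uses a hypothetical `mini_view` class as a simplified stand-in for `basic_string_view<char>`. The class name, its members, and the restriction of the iterator-pair constructor to `const char*` are assumptions made to keep the example short and usable in C++17; it is not the library's implementation. It shows that the deleted overload rejects two array arguments while pointer pairs and the pointer-plus-length form remain usable.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical, simplified stand-in for basic_string_view<char>.
class mini_view {
public:
    // Pointer-plus-length constructor.
    mini_view(const char* s, std::size_t n) : data_(s), size_(n) {}

    // Models the (It, End) constructor; limited to const char* here (assumption).
    template<class It, class End,
             class = std::enable_if_t<std::is_same_v<It, const char*> &&
                                      std::is_same_v<End, const char*>>>
    mini_view(It first, End last)
        : data_(first), size_(static_cast<std::size_t>(last - first)) {}

    // The suggested guard: two array arguments select this deleted overload.
    template<std::size_t N1, std::size_t N2>
    mini_view(const char (&)[N1], const char (&)[N2]) = delete;

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// Two string literals (arrays) are rejected at compile time.
static_assert(!std::is_constructible_v<mini_view, const char (&)[2], const char (&)[2]>);
// A genuine pointer pair is still accepted.
static_assert(std::is_constructible_v<mini_view, const char*, const char*>);
// The pointer-plus-length form is unaffected.
static_assert(std::is_constructible_v<mini_view, const char (&)[4], int>);

// The simpler spelling recommended above: an empty view at the start of arr.
std::size_t empty_view_size() {
    static const char arr[] = "str";
    mini_view s(arr, 0);
    assert(s.data() == arr);
    return s.size();
}
```

The deleted overload is expected to win for array arguments because it is the more specialized template, which is also why `mini_view s(arr, arr)` would no longer compile in this sketch.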

Date: 2024-05-14.11:16:01

As a result of the new constructor added by P1391, this stopped working in C++20:

void fun(string_view);
void fun(vector<string_view>);
fun({"a", "b"});

Previously the first `fun` wasn't viable, so the call constructed a `vector<string_view>` of two elements using its initializer-list constructor and then called the second `fun`. Now `{"a", "b"}` could also initialize a `string_view` through the new `string_view(Iter, Iter)` constructor, so the call is ambiguous and fails to compile.
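
One way affected code can sidestep the ambiguity is to name the vector type at the call site. The sketch below is only an illustration: the two `fun` overloads return a tag string (an assumption, the originals return `void`) so that the selected overload can be observed, and `call_with_named_vector` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Stand-ins for the two overloads; the return value is added for illustration.
std::string fun(std::string_view) { return "view"; }
std::string fun(std::vector<std::string_view> v) {
    return "vector:" + std::to_string(v.size());
}

// fun({"a", "b"});  // C++17: calls the vector overload; C++20: ambiguous

// Naming the type leaves only the initializer-list constructor of vector in play,
// so each literal converts to its own string_view.
std::string call_with_named_vector() {
    return fun(std::vector<std::string_view>{"a", "b"});
}

void check_named_vector_call() {
    assert(call_with_named_vector() == "vector:2");
}
```

This compiles the same way in C++17 and C++20, at the cost of spelling out the type that the braces previously implied.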

The following case is arguably worse, because it doesn't become ill-formed in C++20. It still compiles but now has undefined behaviour:

fun({{"a", "b"}});

Previously the first `fun` wasn't viable, so this constructed a `vector<string_view>` of two elements (via somewhat bizarre syntax, but using the same initializer-list constructor as above). Now it constructs a `vector` from an `initializer_list` with one element, where that element is constructed from the two `const char*` using `string_view(Iter, Iter)`. But those two pointers are unrelated and do not form a valid range, so this violates the constructor's precondition and has undefined behaviour. If you're lucky it crashes at runtime when trying to reach `"b"` from `"a"`, but it could also form a `string_view` that reads arbitrary secrets from the memory between the two pointers.
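
To make the precondition concrete, the sketch below mirrors the effect of the iterator-pair constructor (data is the first pointer, size is `last - first`) with a hypothetical helper, `view_from_range`, written against the C++17 pointer-plus-length constructor. The helper and `valid_range_example` are illustrative names, not library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: same effect as string_view(It, End) for const char* arguments.
// Precondition: [first, last) is a valid range within one array.
std::string_view view_from_range(const char* first, const char* last) {
    return std::string_view(first, static_cast<std::size_t>(last - first));
}

// Valid use: both pointers refer to the same array.
std::string_view valid_range_example() {
    static const char text[] = "ab";
    return view_from_range(text, text + 2);
}

// view_from_range("a", "b");  // undefined: the literals are unrelated objects,
//                             // so last - first is not a meaningful length

void check_valid_range() {
    assert(valid_range_example().size() == 2);
}
```

The commented-out call is what `{{"a", "b"}}` amounts to in C++20: the subtraction of two unrelated pointers yields the size, which is why the result can be arbitrary.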

Date                 User   Action  Args
2024-05-14 11:16:01  admin  set     messages: + msg14143
2024-05-14 00:00:00  admin  create