Title
§[simd] conversions (constructor, load, stores, gather, and scatter) are incorrectly constrained for <stdfloat> types
Status
wp
Section
[simd]
Submitter
Matthias Kretz

Created on 2025-10-15.00:00:00 last changed 1 month ago

Messages

Date: 2025-11-11.10:48:55

Proposed resolution:

This wording is relative to N5014.

  1. Modify [simd.expos] as indicated:

    […]
    
    template<class T>
      concept constexpr-wrapper-like =                   // exposition only
        […]
        bool_constant<static_cast<decltype(T::value)>(T()) == T::value>::value;
        
    template<class From, class To>
      concept explicitly-convertible-to =                // exposition-only
        requires {
          static_cast<To>(declval<From>());
        };
    
    template<class T> using deduced-vec-t = see below; // exposition only
    […]
    
  2. Modify [simd.syn] as indicated:

    […]
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r,
                                     flags<Flags...> f = {});
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r,
                                     const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first,
                                     iter_difference_t<I> n, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first,
                                     iter_difference_t<I> n, const typename basic_vec<T, Abi>::mask_type& mask,
                                     flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
                                     flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
                                     const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r,
                                   flags<Flags...> f = {});
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(
        const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(
        const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
                                   flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    […]
    
  3. Modify [simd.ctor] as indicated:

    template<class U> constexpr explicit(see below) basic_vec(U&& value) noexcept;
    

    -1- Let `From` denote the type remove_cvref_t<U>.

    -2- Constraints: `U` satisfies explicitly-convertible-to<value_type>.

    […]
    template<class U, class UAbi>
      constexpr explicit(see below) basic_vec(const basic_vec<U, UAbi>& x) noexcept;
    

    -5- Constraints:

    1. (5.1) — simd-size-v<U, UAbi> == size() is `true`, and

    2. (5.2) — `U` satisfies explicitly-convertible-to<T>.

    […]
    template<class R, class... Flags>
      constexpr basic_vec(R&& r, flags<Flags...> = {});
    template<class R, class... Flags>
      constexpr basic_vec(R&& r, const mask_type& mask, flags<Flags...> = {});
    

    -12- Let `mask` be `mask_type(true)` for the overload with no `mask` parameter.

    -13- Constraints:

    1. (13.1) — `R` models `ranges::contiguous_range` and `ranges::sized_range`,

    2. (13.2) — `ranges::size(r)` is a constant expression, and

    3. (13.3) — `ranges::size(r)` is equal to `size()`, and

    4. (13.?) — ranges::range_value_t<R> is a vectorizable type and satisfies explicitly-convertible-to<T>.

    -14- Mandates:

    1. (14.1) — ranges::range_value_t<R> is a vectorizable type, and

    2. (14.2) — If the template parameter pack `Flags` does not contain convert-flag, then the conversion from ranges::range_value_t<R> to `value_type` is value-preserving.

  4. Modify [simd.loadstore] as indicated:

    template<class V = see below , ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R>
      constexpr V partial_load(R&& r, flags<Flags...> f = {});
    template<class V = see below , ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R>
      constexpr V partial_load(R&& r, const typename V::mask_type& mask, flags<Flags...> f = {});
    template<class V = see below , contiguous_iterator I, class... Flags>
      constexpr V partial_load(I first, iter_difference_t<I> n, flags<Flags...> f = {});
    template<class V = see below , contiguous_iterator I, class... Flags>
      constexpr V partial_load(I first, iter_difference_t<I> n, const typename V::mask_type& mask,
                               flags<Flags...> f = {});
    template<class V = see below , contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      constexpr V partial_load(I first, S last, flags<Flags...> f = {});
    template<class V = see below , contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      constexpr V partial_load(I first, S last, const typename V::mask_type& mask,
                               flags<Flags...> f = {});
    

    -6- Let

    1. (6.1) — mask be V::mask_type(true) for the overloads with no mask parameter;

    2. (6.2) — R be span<const iter_value_t<I>> for the overloads with no template parameter R;

    3. (6.3) — r be R(first, n) for the overloads with an n parameter and R(first, last) for the overloads with a last parameter;

    4. (6.?) — T be typename V::value_type.

    -7- Mandates:

    1. (7.1) — ranges::range_value_t<R> is a vectorizable type and satisfies explicitly-convertible-to<T>,

    2. (7.2) — same_as<remove_cvref_t<V>, V> is `true`,

    3. (7.3) — `V` is an enabled specialization of `basic_vec`, and

    4. (7.4) — if the template parameter pack `Flags` does not contain convert-flag, then the conversion from ranges::range_value_t<R> to `V::value_type` is value-preserving.

    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r,
                                     flags<Flags...> f = {});
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r,
                                     const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first,
                                     iter_difference_t<I> n, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first,
                                     iter_difference_t<I> n, const typename basic_vec<T, Abi>::mask_type& mask,
                                     flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
                                     flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
                                     const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    

    -11- Let […]

    […]

    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r,
                                   flags<Flags...> f = {});
    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(
        const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(
        const basic_vec<T, Abi>& v, I first, iter_difference_t<I> n,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
                                   flags<Flags...> f = {});
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    

    -15- Let […]

    -?- Constraints:

    1. (?.1) — ranges::iterator_t<R> models indirectly_writable<ranges::range_value_t<R>>, and

    2. (?.2) — `T` satisfies explicitly-convertible-to<ranges::range_value_t<R>>.

    -16- Mandates: […]

    -17- Preconditions: […]

    -18- Effects: For all i in the range of [0, basic_vec<T, Abi>::size()), if mask[i] && i < ranges::size(r) is `true`, evaluates ranges::data(r)[i] = static_cast<ranges::range_value_t<R>>(v[i]).

  5. Modify [simd.permute.memory] as indicated:

    template<class V = see below, ranges::contiguous_range R, simd-integral I, class... Flags>
      requires ranges::sized_range<R>
      constexpr V partial_gather_from(R&& in, const I& indices, flags<Flags...> f = {});
    template<class V = see below, ranges::contiguous_range R, simd-integral I, class... Flags>
      requires ranges::sized_range<R>
      constexpr V partial_gather_from(R&& in, const typename I::mask_type& mask,
                                      const I& indices, flags<Flags...> f = {});
    

    -5- Let: […]

    -?- Constraints: ranges::range_value_t<R> is a vectorizable type and satisfies explicitly-convertible-to<T>.

    -6- Mandates: […]

    […]

    template<simd-vec-type V, ranges::contiguous_range R, simd-integral I, class... Flags>
      requires ranges::sized_range<R>
      constexpr void
      partial_scatter_to(const V& v, R&& out, const I& indices, flags<Flags...> f = {});
    template<simd-vec-type V, ranges::contiguous_range R, simd-integral I, class... Flags>
      requires ranges::sized_range<R>
      constexpr void partial_scatter_to(const V& v, R&& out, const typename I::mask_type& mask,
                                        const I& indices, flags<Flags...> f = {});
    

    -13- Let `mask` be `I::mask_type(true)` for the overload with no `mask` parameter.

    -14- Constraints:

    1. (14.1) — `V::size() == I::size()` is `true`,

    2. (14.2) — ranges::iterator_t<R> models indirectly_writable<ranges::range_value_t<R>>, and

    3. (14.3) — `typename V::value_type` satisfies explicitly-convertible-to<ranges::range_value_t<R>>.

    […]

    -17- Effects: For all i in the range [0, I::size()), if mask[i] && (indices[i] < ranges::size(out)) is true, evaluates ranges::data(out)[indices[i]] = static_cast<ranges::range_value_t<R>>(v[i]).

Date: 2025-11-11.10:48:55

[ Kona 2025-11-08; Status changed: Immediate → WP. ]

Date: 2025-11-04.20:35:15

[ Kona 2025-11-04; approved by LWG. Status changed: New → Immediate. ]

Date: 2025-11-04.20:35:15

[ Kona 2025-11-04; Also resolves LWG 4393. ]

Date: 2025-10-15.00:00:00

[ 2025-10-22; Matthias Kretz improves discussion and provides new wording ]

Date: 2025-10-15.00:00:00

[ 2025-10-22; Reflector poll. ]

Set priority to 1 after reflector poll.

We also need to update Effects. There are more places in [simd] where float to float16_t and similar conversions are not supported.

It was pointed out that similar issues happen for complex<float16_t>. There seems to be a mismatch between the language initialization rules and the intended usage based on the library API.

This wording is relative to N5014.

  1. In [simd.syn] and [simd.loadstore] replace all occurrences of

    indirectly_writable<ranges::iterator_t<R>, T>
    

    with

    indirectly_writable<ranges::iterator_t<R>, ranges::range_value_t<R>>
    

    and all occurrences of

    indirectly_writable<I, T>
    

    with

    indirectly_writable<I, iter_value_t<I>>
    
  2. Modify [simd.loadstore] as indicated:

    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, R&& r, flags<Flags...> f = {});
    […]
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void unchecked_store(const basic_vec<T, Abi>& v, I first, S last,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    

    -11- Let […]

    -?- Constraints: The expression static_cast<ranges::range_value_t<R>>(x) where `x` is an object of type `T` is well-formed.

    -12- Mandates: If `ranges::size(r)` is a constant expression then ranges::size(r) ≥ simd-size-v<T, Abi>.

    […]

    template<class T, class Abi, ranges::contiguous_range R, class... Flags>
      requires ranges::sized_range<R> && indirectly_writable<ranges::iterator_t<R>, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, R&& r, flags<Flags...> f = {});
    […]
    template<class T, class Abi, contiguous_iterator I, sized_sentinel_for<I> S, class... Flags>
      requires indirectly_writable<I, T>
      constexpr void partial_store(const basic_vec<T, Abi>& v, I first, S last,
        const typename basic_vec<T, Abi>::mask_type& mask, flags<Flags...> f = {});
    

    -15- Let […]

    -?- Constraints: The expression static_cast<iter_value_t<I>>(x) where `x` is an object of type `T` is well-formed.

    -16- Mandates: […]

Date: 2025-10-23.14:57:49

Addresses DE-288 and DE-285

[simd.loadstore] `unchecked_store` and `partial_store` are constrained with `indirectly_writable` in such a way that `basic_vec`'s `value_type` must satisfy convertible_to<range value-type>. But that is not the case, e.g. for float → float16_t or double → float32_t. However, if `simd::flag_convert` is passed, these conversions were intended to work. The implementation thus must `static_cast` the `basic_vec` values to the range's value-type before storing to the range.
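The difference between implicit and explicit convertibility described above can be sketched without `<stdfloat>` or a `std::simd` implementation. The following is a minimal illustration under stated assumptions: `half_like` is a hypothetical stand-in for a type such as `float16_t` (a user-defined type cannot reproduce every language rule for extended floating-point types), `is_explicitly_convertible` is a C++17 trait approximating the exposition-only explicitly-convertible-to concept, and `partial_store_model` is a hypothetical scalar model of the `partial_store` Effects, not the library function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a narrower floating-point type such as float16_t:
// a float converts to it only through an explicit conversion.
struct half_like {
    float stored = 0.0f;
    half_like() = default;
    explicit half_like(float f) : stored(f) {}
};

// Trait approximating the exposition-only explicitly-convertible-to:
// true when static_cast<To>(declval<From>()) is well-formed.
template<class From, class To, class = void>
struct is_explicitly_convertible : std::false_type {};
template<class From, class To>
struct is_explicitly_convertible<From, To,
    std::void_t<decltype(static_cast<To>(std::declval<From>()))>>
    : std::true_type {};

// Implicit conversion and plain assignment are both unavailable...
static_assert(!std::is_convertible_v<float, half_like>);
static_assert(!std::is_assignable_v<half_like&, float>);
// ...but the explicit conversion is well-formed.
static_assert(is_explicitly_convertible<float, half_like>::value);

// Scalar model of the partial_store Effects: lane i is written only if
// mask[i] is set and i lies inside the output range. Each lane is
// static_cast to the range's value type before it is stored.
template<class T, std::size_t N, class Out>
void partial_store_model(const std::array<T, N>& v, std::vector<Out>& r,
                         const std::array<bool, N>& mask) {
    static_assert(is_explicitly_convertible<T, Out>::value,
                  "lane type must be explicitly convertible to the range value type");
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i] && i < r.size())
            r[i] = static_cast<Out>(v[i]);
}
```

A constraint phrased in terms of assignability from the lane type would reject this store, whereas a constraint phrased in terms of `static_cast` accepts it.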

unchecked_store(vec<float>, span<complex<float16_t>>, flag_convert) does not work for a different reason. The complex(const float16_t&, const float16_t&) constructor simply does not allow construction from `float`, irrespective of using implicit or explicit conversion. The only valid conversion from float → complex<float16_t> is via an extra step through complex<float16_t>::value_type. This issue considers it a defect of `complex` that an explicit conversion from float → complex<float16_t> is ill-formed and therefore no workaround/special case is introduced.

Conversely, the conversion constructor in [simd.ctor] does not reject conversion from vec<complex<float>, 4> to vec<float, 4>. I.e. convertible_to<vec<complex<float>, 4>, vec<float, 4>> is `true`, which is a lie. This is NB comment DE-288. However, the NB comment's proposed resolution is too strict, in that it would disallow conversion from `float` to `float16_t`.
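Why an unconstrained converting constructor makes the trait report `true` can be shown with a toy class template. This is only a sketch: `loose_vec` and `strict_vec` are hypothetical types, not `basic_vec`, and `is_explicitly_convertible` is a C++17 trait standing in for the exposition-only explicitly-convertible-to concept. The traits inspect only the constructor declaration, so the ill-formed body of the unconstrained constructor is never instantiated.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <type_traits>
#include <utility>

// True when static_cast<To>(declval<From>()) is well-formed.
template<class From, class To, class = void>
struct is_explicitly_convertible : std::false_type {};
template<class From, class To>
struct is_explicitly_convertible<From, To,
    std::void_t<decltype(static_cast<To>(std::declval<From>()))>>
    : std::true_type {};

// Hypothetical vector type with an unconstrained converting constructor.
template<class T, std::size_t N>
struct loose_vec {
    std::array<T, N> lanes{};
    loose_vec() = default;
    template<class U>
    loose_vec(const loose_vec<U, N>& other) {
        for (std::size_t i = 0; i < N; ++i)
            lanes[i] = static_cast<T>(other.lanes[i]);
    }
};

// Same type, but the constructor participates in overload resolution
// only if the element conversion is well-formed as a static_cast.
template<class T, std::size_t N>
struct strict_vec {
    std::array<T, N> lanes{};
    strict_vec() = default;
    template<class U,
             std::enable_if_t<is_explicitly_convertible<U, T>::value, int> = 0>
    explicit strict_vec(const strict_vec<U, N>& other) {
        for (std::size_t i = 0; i < N; ++i)
            lanes[i] = static_cast<T>(other.lanes[i]);
    }
};

// The unconstrained constructor claims a conversion that cannot compile.
static_assert(std::is_convertible_v<loose_vec<std::complex<float>, 4>,
                                    loose_vec<float, 4>>);
// The constrained constructor rejects complex<float> -> float...
static_assert(!std::is_constructible_v<strict_vec<float, 4>,
                                       const strict_vec<std::complex<float>, 4>&>);
// ...while a narrowing element conversion such as double -> float stays valid.
static_assert(std::is_constructible_v<strict_vec<float, 4>,
                                      const strict_vec<double, 4>&>);
```

The choice of `explicit` versus implicit construction is deliberately simplified here; in the wording it is governed by `explicit(see below)`.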

The conversion/load from static-sized range constructor in [simd.ctor] has a similar problem:

convertible_to<array<std::string, 4>, vec<int, 4>> is `true`

but when fixing this

vec<float16_t, 4>(array<float, 4>, flag_convert)

must continue to be valid.

`unchecked_load` and `partial_load` in [simd.loadstore] currently Mandate the range's value-type to be vectorizable, but converting loads from complex<float> to `float` are not covered. It is unclear what a conversion from complex<float> to `float` should do, so a requirement rejecting such loads needs to be added (again without breaking float → float16_t).

[simd.permute.memory] is analogous to [simd.loadstore] and needs equivalent constraints.
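The scatter Effects in the proposed resolution (a masked, bounds-checked write of `static_cast<ranges::range_value_t<R>>(v[i])`) can be modeled in scalar code. This is a rough sketch only: `partial_scatter_model` is a hypothetical helper, and plain `std::array` and `std::vector` with unsigned indices stand in for the vector, mask, index, and range types.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of the partial_scatter_to Effects: for each lane i, write
// out[indices[i]] only if mask[i] is set and the index is inside the range.
// The lane is converted with static_cast to the range's value type.
template<class T, std::size_t N, class Out>
void partial_scatter_model(const std::array<T, N>& v, std::vector<Out>& out,
                           const std::array<bool, N>& mask,
                           const std::array<std::size_t, N>& indices) {
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i] && indices[i] < out.size())
            out[indices[i]] = static_cast<Out>(v[i]);
}
```

With `double` lanes and a `float` output range, the `static_cast` makes the conversion explicit, and lanes that are masked out or whose index is out of range leave the output untouched.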

[simd.ctor] p2 requires constructible_from<U>, which makes explicit construction of vec<float16_t> from `float` ill-formed. For consistency this should also be constrained with explicitly-convertible-to.

History
Date                 User   Action  Args
2025-11-11 10:48:55  admin  set     messages: + msg15680
2025-11-11 10:48:55  admin  set     status: immediate -> wp
2025-11-04 20:35:15  admin  set     messages: + msg15501
2025-11-04 20:11:16  admin  set     messages: + msg15500
2025-11-04 20:11:16  admin  set     status: new -> immediate
2025-10-22 17:50:45  admin  set     messages: + msg15364
2025-10-22 17:04:49  admin  set     messages: + msg15355
2025-10-19 11:10:16  admin  set     messages: + msg15262
2025-10-15 00:00:00  admin  create