Title
§[locale.codecvt.virtuals] `do_in` and `do_out` could do with better specification
Status
new
Section
[locale.codecvt.virtuals]
Submitter
S. B. Tam

Created on 2025-06-18.00:00:00 last changed 1 week ago

Messages

Date: 2025-07-05.16:52:19

Proposed resolution:

This wording is relative to N5008.

[Drafting note: This is modified from Jonathan Wakely's suggestion in https://github.com/cplusplus/draft/pull/7347#issuecomment]

  1. In [locale.codecvt.virtuals] remove Table 91 [tab:locale.codecvt.inout] in its entirety:

    Table 91 — `do_in`/`do_out` result values [tab:locale.codecvt.inout]
    Value      Meaning
    `ok`       completed the conversion
    `partial`  not all source characters converted
    `error`    encountered a character in `[from, from_end)` that cannot be converted
    `noconv`   `internT` and `externT` are the same type, and input sequence is identical to converted sequence
  2. Modify [locale.codecvt.virtuals] as indicated:

    result do_out(
      stateT& state,
      const internT* from, const internT* from_end, const internT*& from_next,
      externT* to, externT* to_end, externT*& to_next) const;
    
    result do_in(
      stateT& state,
      const externT* from, const externT* from_end, const externT*& from_next,
      internT* to, internT* to_end, internT*& to_next) const;
    

    -1- Preconditions: […]

    -2- Effects: Translates characters in the source range `[from, from_end)`, placing the results in sequential positions starting at destination `to`. Converts no more than `(from_end - from)` source elements, and stores no more than `(to_end - to)` destination elements.

    -3- [Struck text:] Stops if it encounters a character it cannot convert. It always leaves the `from_next` and `to_next` pointers pointing one beyond the last element successfully converted. If it returns `noconv`, `internT` and `externT` are the same type, and the converted sequence is identical to the input sequence `[from, from_next)`, `to_next` is set equal to `to`, the value of `state` is unchanged, and there are no changes to the values in `[to, to_end)`.

    [Inserted text:] If `internT` and `externT` are the same type and the converted sequence would be identical to the input sequence `[from, from_next)`, then no elements are converted, the value of `state` is unchanged, there are no changes to the values in `[to, to_end)`, and the result is `noconv`. Otherwise, if a character in `[from, from_end)` cannot be converted, conversion stops at that character and the result is `error`. Otherwise, if all input characters are successfully converted and placed in the output range, the result is `ok`. Otherwise, the result is `partial`. In all cases, `from_next` is set to point to the first element of the input that was not converted, `to_next` is set to point to the first unchanged element in the output. [Note: When the result is `noconv`, `from_next` points to `from` and `to_next` points to `to`. — end note]

    -4- A `codecvt` facet that is used by `basic_filebuf` […]

    -5- Returns: [Struck text:] An enumeration value, as summarized in Table 91. [Inserted text:] The result as described above.
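As an illustration of the `error` / `ok` / `partial` ordering in the proposed paragraph 3, here is a minimal sketch of a user-defined facet. The names `ascii_out_cvt`, `out_summary` and `run_out` are hypothetical and appear nowhere in the issue or in the standard. The facet converts `wchar_t` to 7-bit ASCII and treats any other value as unconvertible. It reflects one reading of the wording: an unconvertible character is only noticed when conversion reaches it, so a full output buffer earlier in the input yields `partial` rather than `error`.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical facet for illustration only: wchar_t -> 7-bit ASCII.
class ascii_out_cvt : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_out(std::mbstate_t&,
                  const wchar_t* from, const wchar_t* from_end,
                  const wchar_t*& from_next,
                  char* to, char* to_end, char*& to_next) const override {
        result r = ok;  // stays ok only if every input element is converted
        for (; from != from_end; ++from, ++to) {
            // A value outside ASCII cannot be converted: stop at it.
            if (static_cast<unsigned long>(*from) > 0x7F) { r = error; break; }
            // Convertible input remains but there is no room for it.
            if (to == to_end) { r = partial; break; }
            *to = static_cast<char>(*from);
        }
        // In all cases: first unconverted input element, first unchanged output element.
        from_next = from;
        to_next = to;
        return r;
    }
};

struct out_summary {
    std::codecvt_base::result res;
    std::ptrdiff_t consumed;  // from_next - from
    std::ptrdiff_t produced;  // to_next - to
};

// Hypothetical helper: runs the public out() on the facet above.
out_summary run_out(const wchar_t* src, std::size_t n, char* dst, std::size_t cap) {
    assert(src != nullptr && dst != nullptr);
    ascii_out_cvt cvt;
    std::mbstate_t st{};
    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;
    std::codecvt_base::result r =
        cvt.out(st, src, src + n, from_next, dst, dst + cap, to_next);
    return {r, from_next - src, to_next - dst};
}
```

With this sketch, `run_out(L"abc", 3, buf, 2)` on a two-element buffer gives `partial` with two elements consumed and two produced, while an input whose second element is U+00E9 gives `error` with one element consumed and one produced.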

Date: 2025-06-18.00:00:00

Background: https://github.com/cplusplus/draft/pull/7347

The specification of `codecvt::do_in` and `codecvt::do_out` is unclear, and possibly incorrect:

  1. the meaning of `noconv` is specified twice (once in paragraph 3, once in Table 91 [tab:locale.codecvt.inout]);

  2. the effect on `from_next` is not specified;

  3. the specification talks about "the input sequence [from, from_next)", but `from_next` is supposed to be an out parameter. I think it should say "[from, from_end)" instead.
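Point 2 can be observed with the degenerate `codecvt<char, char, mbstate_t>` facet, which performs no conversion. The sketch below is only a probe, not part of the issue: `noconv_probe` and `probe_degenerate_in` are hypothetical names. It records where `from_next` ends up instead of assuming it, because that is exactly the value the current wording leaves unspecified.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>

struct noconv_probe {
    std::codecvt_base::result res;
    bool from_next_is_from;  // not guaranteed by the current wording
    bool to_next_is_to;      // stated by the current paragraph 3
    bool output_untouched;   // stated by the current paragraph 3
};

// Hypothetical helper: calls in() on the classic locale's char/char facet.
noconv_probe probe_degenerate_in() {
    using cvt_t = std::codecvt<char, char, std::mbstate_t>;
    const cvt_t& cvt = std::use_facet<cvt_t>(std::locale::classic());

    const char src[3] = {'a', 'b', 'c'};
    char dst[3] = {'x', 'y', 'z'};
    std::mbstate_t st{};
    const char* from_next = nullptr;  // stays null if the facet never sets it
    char* to_next = nullptr;

    std::codecvt_base::result r =
        cvt.in(st, src, src + 3, from_next, dst, dst + 3, to_next);

    return {r,
            from_next == src,
            to_next == dst,
            dst[0] == 'x' && dst[1] == 'y' && dst[2] == 'z'};
}
```

Under the proposed note, `from_next_is_from` would be required to be true whenever the result is `noconv`; under the current wording only the other two flags follow from the text.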

History
Date User Action Args
2025-07-05 16:52:19 admin set messages: + msg14879
2025-06-18 00:00:00 admin create