Title
Handling of encodings in localized formatting of chrono types is underspecified
Status
resolved
Section
[time.format]
Submitter
Victor Zverovich

Created on 2021-05-31.00:00:00 last changed 20 months ago

Messages

Date: 2023-03-22.00:00:00

[ 2023-03-22 Resolved by the adoption of P2419R2 in the July 2022 virtual plenary. Status changed: SG16 → Resolved. ]

Date: 2021-06-15.00:00:00

[ 2021-06-14; Reflector poll ]

Set priority to 2 after reflector poll. Send to SG16.

This wording is relative to N4885.

  1. Modify [time.format] as indicated:

    -2- Each conversion specifier conversion-spec is replaced by appropriate characters as described in Table [tab:time.format.spec]; the formats specified in ISO 8601:2004 shall be used where so described. Some of the conversion specifiers depend on the locale that is passed to the formatting function if the latter takes one, or the global locale otherwise. If the string literal encoding is UTF-8 the replacement of a conversion specifier that depends on the locale is transcoded to UTF-8 for narrow strings, otherwise the replacement is taken as is. If the formatted object does not contain the information the conversion specifier refers to, an exception of type format_error is thrown.
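As an illustration only, and not part of the proposed wording, the rule above could be sketched as follows. The compile-time probe inspects how the compiler encoded U+00A7 in an ordinary string literal. The names `literal_encoding_is_utf8` and `finalize_replacement`, and the `to_utf8` callback, are hypothetical and stand in for whatever an implementation would actually use.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical probe: ordinary literals are UTF-8 if U+00A7 (SECTION SIGN)
// was encoded as the two bytes C2 A7 (plus the terminating null).
constexpr bool literal_encoding_is_utf8() {
    constexpr unsigned char section[] = "\u00A7";
    return sizeof(section) == 3 && section[0] == 0xC2 && section[1] == 0xA7;
}

// Hypothetical helper applying the proposed rule to one locale-dependent
// replacement for a narrow string: transcode it when the literal encoding
// is UTF-8, otherwise take it as is. `to_utf8` is an assumed transcoder
// from the locale's encoding to UTF-8.
std::string finalize_replacement(
    const std::string& locale_text, bool literal_is_utf8,
    const std::function<std::string(const std::string&)>& to_utf8) {
    assert(to_utf8 && "a transcoder must be supplied");
    return literal_is_utf8 ? to_utf8(locale_text) : locale_text;
}
```

A caller would typically pass `literal_encoding_is_utf8()` as the second argument; the flag is kept as a parameter here so that both branches can be exercised regardless of how the translation unit is compiled.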

Date: 2021-05-31.00:00:00

When formatting chrono types using a locale, the encoding of the result is underspecified and may be a mix of the literal and locale encodings. For example:

std::locale::global(std::locale("Russian.1251"));
auto s = std::format("День недели: {}", std::chrono::Monday);

(Note that "{}" should be replaced with "{:L}" if P2372 is adopted, but that is not essential to this issue.)

If the literal encoding is UTF-8 and the "Russian.1251" locale exists, the format string is encoded in UTF-8 while the locale-provided weekday name is encoded in CP1251, so the two encodings do not match. As far as I can see, the standard doesn't specify what happens in this case.

One possible and undesirable result is

"День недели: \xcf\xed"

where "\xcf\xed" is "Пн" (Mon in Russian) in CP1251 and is not valid UTF-8.

Another possible and desirable result is

"День недели: Пн"

where everything is in one encoding (UTF-8).
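The byte-level difference between the two results can be checked with a small sketch. This is not how any particular standard library implements the transcoding. The functions `is_structurally_valid_utf8` and `cp1251_to_utf8` are hypothetical helpers written for this illustration, and the transcoder only handles ASCII and the CP1251 range 0xC0..0xFF (А..я).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Structural UTF-8 check: validates lead bytes and continuation bytes.
// It does not reject every overlong form or surrogate code point.
bool is_structurally_valid_utf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const auto b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (b < 0x80) extra = 0;
        else if (b >= 0xC2 && b <= 0xDF) extra = 1;
        else if (b >= 0xE0 && b <= 0xEF) extra = 2;
        else if (b >= 0xF0 && b <= 0xF4) extra = 3;
        else return false;  // stray continuation byte or invalid lead byte
        if (s.size() - i - 1 < extra) return false;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}

// Partial CP1251 -> UTF-8 transcoder (sketch): ASCII passes through,
// 0xC0..0xFF map linearly to U+0410..U+044F, anything else becomes '?'.
std::string cp1251_to_utf8(std::string_view in) {
    std::string out;
    for (char ch : in) {
        const auto b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            out += ch;
        } else if (b >= 0xC0) {
            const std::uint32_t cp = 0x0410u + (b - 0xC0u);
            out += static_cast<char>(0xC0 | (cp >> 6));    // two-byte lead
            out += static_cast<char>(0x80 | (cp & 0x3F));  // continuation
        } else {
            out += '?';  // 0x80..0xBF is not handled in this sketch
        }
    }
    return out;
}

// Self-check using the bytes from the example above.
void check_issue_example() {
    const std::string cp1251_mon = "\xcf\xed";  // "Пн" in CP1251
    assert(!is_structurally_valid_utf8(cp1251_mon));
    assert(cp1251_to_utf8(cp1251_mon) == "\xd0\x9f\xd0\xbd");  // "Пн" in UTF-8
}
```

Calling `check_issue_example()` confirms that the CP1251 bytes are rejected as UTF-8, and that transcoding them yields the four-byte UTF-8 form that matches the encoding of the surrounding literal.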

This issue is not resolved by LWG 3547 / P2372, but the resolution proposed here is compatible with P2372 and can be rebased onto its wording if the paper is adopted.

History
Date                 User   Action  Args
2023-03-23 11:42:08  admin  set     status: open -> resolved
2021-06-14 14:09:34  admin  set     messages: + msg11933
2021-06-14 14:09:34  admin  set     status: new -> open
2021-06-06 16:30:06  admin  set     messages: + msg11879
2021-05-31 00:00:00  admin  create