Underspecified use of locale facets for locale-dependent std::format
Jens Maurer

Created on 2024-04-30.00:00:00 last changed 1 month ago


Date: 2024-05-15.00:00:00

[ 2024-05-08; Reflector poll ]

Set priority to 3 after reflector poll.

Date: 2024-04-30.00:00:00

std::format has overloads that take an explicit std::locale argument. The "L" format specifier uses that locale (or, when none is passed, some environment locale) for formatting, according to [format.string.std] p17:

"For integral types, the locale-specific form causes the context's locale to be used to insert the appropriate digit group separator characters."

It is unclear which specific facets are used to make this happen. This is important, because users can install their own facets into a given locale. Specific questions include:

  • Is num_put<> being used? Or just numpunct<>?

  • Are any of the _byname facets being used?

Assuming the encoding for char is UTF-8, the use of a user-provided num_put<> facet (as opposed to std::format creating the output based on numpunct<>) would allow digit separators that are not expressible as a single UTF-8 code unit.
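
The difference can be sketched as follows. numpunct<char>::thousands_sep() returns a single char, so a separator such as U+202F NARROW NO-BREAK SPACE (three code units in UTF-8) cannot be expressed through it, whereas a user-provided num_put<> writes arbitrary code units. The facet nnbsp_num_put and the helper stream_with_nnbsp are hypothetical, the facet deliberately ignores width, fill and format flags, and the example goes through an ostream because that is the path known to consult num_put<>.

```cpp
#include <algorithm>
#include <cstddef>
#include <locale>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// numpunct<char> can only hand back one code unit as the separator.
static_assert(std::is_same_v<
    decltype(std::declval<const std::numpunct<char>&>().thousands_sep()), char>);

// Hypothetical user facet: groups in threes with U+202F encoded as UTF-8.
// Simplification: width, fill and format flags are ignored.
struct nnbsp_num_put : std::num_put<char> {
protected:
    iter_type do_put(iter_type out, std::ios_base&, char, long v) const override {
        const std::string sep = "\xE2\x80\xAF";
        const std::string digits = std::to_string(v);
        const std::size_t first = (v < 0) ? 1 : 0;  // keep a leading '-' as is
        std::string text = digits.substr(0, first);
        for (std::size_t i = first; i < digits.size(); ++i) {
            // Insert the separator before every group of three remaining digits.
            if (i != first && (digits.size() - i) % 3 == 0) text += sep;
            text += digits[i];
        }
        return std::copy(text.begin(), text.end(), out);
    }
};

// Streams `value` through a locale carrying the facet above.
std::string stream_with_nnbsp(long value) {
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new nnbsp_num_put));
    os << value;
    return os.str();
}
```

For 1234567 the stream output contains two three-byte separators, something no numpunct<char> facet could produce on its own.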

