Title
Underspecified use of locale facets for locale-dependent std::format
Status
open
Section
[format.string.std]
Submitter
Jens Maurer

Created on 2024-04-30.00:00:00 last changed 2 weeks ago

Messages

Date: 2025-06-12.10:39:03

Proposed resolution:

This wording is relative to N5008.

  1. Modify [format.string.std] as indicated:

    -17- When the `L` option is used, the form used for the conversion is called the locale-specific form. The `L` option is only valid for arithmetic types, and its effect depends upon the type.
    1. (17.1) — For integral types, the locale-specific form causes the context’s locale to be used to insert the appropriate digit group separator characters as if obtained with numpunct<charT>::grouping and numpunct<charT>::thousands_sep.
    2. (17.2) — For floating-point types, the locale-specific form causes the context’s locale to be used to insert the appropriate digit group and radix separator characters as if obtained with numpunct<charT>::grouping, numpunct<charT>::thousands_sep, and numpunct<charT>::decimal_point.
    3. (17.3) — For the textual representation of `bool`, the locale-specific form causes the context’s locale to be used to insert the appropriate string as if obtained with numpunct<charT>::truename or numpunct<charT>::falsename.
Date: 2025-06-15.00:00:00

[ 2025-06-12; Jonathan provides wording ]

Date: 2024-06-15.00:00:00

[ 2024-06-12; SG16 meeting ]

The three major implementations all use `numpunct` but not `num_put`. Clarify that this is the intended behaviour.

Date: 2024-05-15.00:00:00

[ 2024-05-08; Reflector poll ]

Set priority to 3 after reflector poll.

Date: 2024-04-30.00:00:00

Some std::format overloads take an explicit std::locale parameter, and the "L" format specifier uses that locale (or some environment locale) for formatting, according to [format.string.std] p17:

"For integral types, the locale-specific form causes the context's locale to be used to insert the appropriate digit group separator characters."

It is unclear which specific facets are used to make this happen. This is important, because users can install their own facets into a given locale. Specific questions include:

  • Is num_put<> being used? Or just numpunct<>?

  • Are any of the _byname facets being used?

Assuming the encoding for char is UTF-8, the use of a user-provided num_put<> facet (as opposed to std::format creating the output based on numpunct<>) would allow digit separators that are not expressible as a single UTF-8 code unit.
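
The limitation comes from the `numpunct` interface itself, which returns the separator as one `charT`. The sketch below checks that at compile time and contrasts it with one separator that needs more than one code unit. U+202F NARROW NO-BREAK SPACE is an assumption chosen only as an example, and the variable name is hypothetical.

```cpp
#include <locale>
#include <string_view>
#include <type_traits>
#include <utility>

// numpunct<char>::thousands_sep() returns exactly one char.
static_assert(std::is_same_v<
    decltype(std::declval<const std::numpunct<char>&>().thousands_sep()),
    char>);

// U+202F NARROW NO-BREAK SPACE, picked here only as an example separator,
// encodes to three UTF-8 code units and so cannot be returned as one char.
inline constexpr std::string_view narrow_nbsp_utf8 = "\xE2\x80\xAF";
static_assert(narrow_nbsp_utf8.size() == 3);
```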

History
Date                 User   Action  Args
2025-06-12 10:39:03  admin  set     messages: + msg14789
2025-06-12 10:39:03  admin  set     messages: + msg14788
2025-06-12 10:39:03  admin  set     messages: + msg14787
2024-05-08 10:05:54  admin  set     messages: + msg14116
2024-05-08 10:05:54  admin  set     status: new -> open
2024-04-30 00:00:00  admin  create