Issue 3644: std::format does not define "integer presentation type"

Title: std::format does not define "integer presentation type"
Status: new
Section: [format.string.std]
Submitter: Charlie Barto

Created on 2021-11-23.00:00:00 last changed 32 months ago

Messages

Date: 2022-11-01.16:49:20

Proposed resolution:

This wording is relative to N4901.

Modify [format.string.std] as indicated:

-6- The # option causes the alternate form to be used for the conversion. This option is only valid for arithmetic types other than charT and bool or when an integer presentation type is specified~~, and not otherwise~~. For integral types, the alternate form inserts the base prefix (if any) specified in Table 65 into the output after the sign character (possibly space) if there is one, or before the output of to_chars otherwise. For floating-point types, the alternate form causes the result of the conversion of finite values to always contain a decimal-point character, even if no digits follow it. Normally, a decimal-point character appears in the result of these conversions only if a digit follows it. In addition, for g and G conversions, trailing zeros are not removed from the result.
[…]

[Drafting note: This modification is a simple cleanup given the other changes further below, to bring the wording for # in line with the wording for the other modifiers, in the interest of preventing confusion.]

[…]
-16- The type determines how the data should be presented.
-?- An integer presentation type is one of the following type specifiers in Table [tab:format.type.integer_presentation], or none, if none is defined to have the same behavior as one of the type specifiers in Table [tab:format.type.integer_presentation].

Table ? — Meaning of type options for integer representations [tab:format.type.integer_presentation]
Type	Meaning
b	to_chars(first, last, value, 2); the base prefix is 0b.
B	The same as b, except that the base prefix is 0B.
d	to_chars(first, last, value).
o	to_chars(first, last, value, 8); the base prefix is 0 if value is nonzero and is empty otherwise.
x	to_chars(first, last, value, 16); the base prefix is 0x.
X	The same as x, except that it uses uppercase letters for digits above 9 and the base prefix is 0X.

[Drafting note: This is the same as [tab:format.type.int] with "none" and 'c' removed]

-17- The available string presentation types are specified in Table 64 ([tab:format.type.string]).
[…]

Table 65 — Meaning of type options for integer types [tab:format.type.int]
Type	Meaning
b, B, d, o, x, X	As specified in Table [tab:format.type.integer_presentation]~~to_chars(first, last, value, 2); the base prefix is 0b~~.
~~B~~	~~The same as b, except that the base prefix is 0B.~~
c	Copies the character static_cast<charT>(value) to the output. Throws format_error if value is not in the range of representable values for charT.
~~d~~	~~to_chars(first, last, value).~~
~~o~~	~~to_chars(first, last, value, 8); the base prefix is 0 if value is nonzero and is empty otherwise.~~
~~x~~	~~to_chars(first, last, value, 16); the base prefix is 0x.~~
~~X~~	~~The same as x, except that it uses uppercase letters for digits above 9 and the base prefix is 0X.~~
none	The same as d. ~~[Note 8: If the formatting argument type is charT or bool, the default is instead c or s, respectively. — end note]~~

Table 66 — Meaning of type options for charT [tab:format.type.char]
Type	Meaning
none, c	Copies the character to the output.
b, B, d, o, x, X	As specified in Table ~~[tab:format.type.int]~~[tab:format.type.integer_presentation].

Table 67 — Meaning of type options for bool [tab:format.type.bool]
Type	Meaning
none, s	Copies textual representation, either true or false, to the output.
b, B, ~~c,~~ d, o, x, X	As specified in Table ~~[tab:format.type.int]~~[tab:format.type.integer_presentation] for the value static_cast<unsigned char>(value).
c	Copies the character static_cast<unsigned char>(value) to the output.

[Drafting note: allowing the 'c' specifier for bool is pretty bizarre behavior, but that's very clearly what the standard says now, so I'm preserving it. I would suggest keeping discussion of changing that behavior to a separate issue or defect report (the reworking of the tables in this issue makes addressing that easier anyway).
The inconsistency with respect to using static_cast<unsigned char> here and static_cast<charT> in [tab:format.type.int] is pre-existing and should be addressed in a separate issue if needed. ]

msg12320

Date: 2021-11-15.00:00:00

[ 2022-11-01; Jonathan comments ]

LWG 3648 removed 'c' as a valid presentation type for bool. The last change in the resolution below (and the drafting note) can be dropped.

LWG 3586 could be resolved as part of this issue by using "this is the default unless formatting a floating-point type or using an integer presentation type" for '<' and by using "this is the default when formatting a floating-point type or using an integer presentation type" for '>'.

msg12237

Date: 2022-01-15.00:00:00

[ 2022-01-30; Reflector poll ]

Set priority to 2 after reflector poll.

msg12234

Date: 2021-11-15.00:00:00

[ 2021-11-29; Tim comments ]

This issue touches the same wording area as LWG 3586 does.

msg12233

Date: 2021-11-23.00:00:00

[format.string.std] specifies the behavior of several format specifiers in terms of "integer presentation types"; for example [format.string.std]/4 states:

"The sign option is only valid for arithmetic types other than charT and bool or when an integer presentation type is specified".

Unfortunately nowhere does the standard actually define the term "integer presentation type". The closest it comes is in [format.string.std]/19 and [tab:format.type.int], but that explicitly excludes charT and bool. [tab:format.type.char] and [tab:format.type.bool] then refer to [tab:format.type.int].

I can come up with many interpretations for what could happen when 'c' is used with charT or bool, but the following table is what MSVC does right now (after P2216, "throws" is the same as "does not compile" in all these cases, although not in general for 'c'):

Argument type	Specifiers	Throws?
bool	#	Yes
bool	#c	No
bool	:+	Yes
bool	+c	Yes
bool	^	No
bool	^c	No
bool	0	Yes
bool	0c	Yes
bool	c	No
charT	#	Yes
charT	#c	Yes
charT	+	Yes
charT	+c	Yes
charT	^	No
charT	^c	No
charT	0	Yes
charT	0c	Yes

As you can see we don't interpret 'c' as an "integer type specifier", except when explicitly specified for bool with #. I think this is because for # the standard states

"This option is valid for arithmetic types other than charT and bool or when an integer presentation type is specified, and not otherwise",

and [tab:format.type.bool] puts 'c' in the same category as all the other "integer type specifiers", whereas [tab:format.type.char] separates it out into the char-specific types. If this issue's proposed resolution is adopted our behavior would become non-conforming (arguably it already is) and "#c" with bools would become invalid.

History
Date	User	Action	Args
2022-11-01 16:49:20	admin	set	messages: + msg12906
2022-01-30 17:05:36	admin	set	messages: + msg12320
2021-11-29 17:25:00	admin	set	messages: + msg12237
2021-11-27 17:43:28	admin	set	messages: + msg12234
2021-11-23 00:00:00	admin	create