Created on 2024-01-23.00:00:00 last changed 3 weeks ago
Proposed resolution:
This wording is relative to N4971.
Modify [text.encoding.general] as indicated:
-1- A registered character encoding is a character encoding scheme in the IANA Character Sets registry.
[Note 1: The IANA Character Sets registry uses the term “character sets” to refer to character encodings. — end note]
The primary name of a registered character encoding is the name of that encoding specified in the IANA Character Sets registry.
-2- The set of known registered character encodings contains every registered character encoding specified in the IANA Character Sets registry except for the following:
- (2.1) – NATS-DANO (33)
- (2.2) – NATS-DANO-ADD (34)
-3- Each known registered character encoding is identified by an enumerator in text_encoding::id, and has a set of zero or more aliases.
-4- The set of aliases of a known registered character encoding is an implementation-defined superset of the aliases specified in the IANA Character Sets registry. The set of aliases for US-ASCII includes "ASCII". No two aliases or primary names of distinct registered character encodings are equivalent when compared by text_encoding::comp-name.
[ Tokyo 2024-03-23; Status changed: Voting → WP. ]
[ 2024-03-12; Reflector poll ]
SG16 approved the proposed resolution. Set status to Tentatively Ready after seven votes in favour during reflector poll.
The IANA Character Sets registry does not contain "ASCII" as an alias of the "US-ASCII" encoding. This is apparently for historical reasons, because there used to be some ambiguity about exactly what "ASCII" meant. I don't think those historical reasons are relevant to C++26, but the absence of "ASCII" in the IANA registry means that it's not a registered character encoding as defined by [text.encoding.general].
This means that the encoding referred to by notes in the C++ standard ([fs.path.generic], [facet.numpunct.virtuals]) and by an example in the std::text_encoding proposal (P1885) isn't actually usable in portable code. So std::text_encoding("ASCII") creates an object with mib() == std::text_encoding::other, which is not the same encoding as std::text_encoding("US-ASCII"). This seems surprising.
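To illustrate why "ASCII" is not picked up automatically, here is a minimal sketch of loose name matching in the style of the charset alias matching rule from Unicode Technical Standard #22, which, as I understand it, is what text_encoding::comp-name is modelled on. It is not the normative wording and not any implementation's actual code; the names Cursor and names_match are hypothetical, and C++17 is assumed. The comparison ignores case, non-alphanumeric characters, and zeros that are not preceded by a digit.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helpers: ASCII-only classification, usable in constant expressions.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_alpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
constexpr char to_lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Walks a name and yields only the characters that matter for comparison.
struct Cursor {
    std::string_view s;
    std::size_t pos = 0;
    bool prev_digit = false;  // was the last yielded character a digit?

    // Returns the next significant character, or '\0' once the name is exhausted.
    constexpr char next() {
        while (pos < s.size()) {
            const char c = s[pos++];
            if (is_digit(c)) {
                if (c == '0' && !prev_digit) continue;  // drop a leading zero
                prev_digit = true;
                return c;
            }
            if (is_alpha(c)) {
                prev_digit = false;
                return to_lower(c);
            }
            // Anything else ('-', '_', ' ', ...) is skipped.
        }
        return '\0';
    }
};

// Sketch of the loose comparison; an assumption, not the standard's definition.
constexpr bool names_match(std::string_view a, std::string_view b) {
    Cursor ca{a};
    Cursor cb{b};
    for (;;) {
        const char x = ca.next();
        const char y = cb.next();
        if (x != y) return false;
        if (x == '\0') return true;  // both names ended together
    }
}

static_assert(names_match("US-ASCII", "us_ascii"));  // case and punctuation ignored
static_assert(names_match("UTF-08", "utf8"));        // leading zero ignored
static_assert(!names_match("ASCII", "US-ASCII"));    // the "US" prefix is significant
```

Under this kind of matching, "ASCII" differs from "US-ASCII" in its letters, so it can only identify the same encoding if it is listed as an alias explicitly, which is what the proposed resolution requires.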
History
Date | User | Action | Args |
---|---|---|---|
2024-04-02 10:29:12 | admin | set | messages: + msg14047 |
2024-04-02 10:29:12 | admin | set | status: voting -> wp |
2024-03-18 09:32:04 | admin | set | status: ready -> voting |
2024-03-12 01:10:06 | admin | set | messages: + msg13998 |
2024-03-12 01:10:06 | admin | set | status: new -> ready |
2024-01-23 13:57:27 | admin | set | messages: + msg13929 |
2024-01-23 00:00:00 | admin | create |