Issue 2779: Restrictions on the ordinary literal encoding

Title: Restrictions on the ordinary literal encoding
Status: open
Section: 5.3.1 [lex.charset]
Submitter: Jim X

Created on 2023-03-28.00:00:00 last changed 27 months ago

Messages

Date: 2025-02-24.21:26:04

(From submission #285.)

There are no restrictions on the implementation's choice of ordinary literal encoding. However, there is an implicit assumption that a code unit value must fit into a char.

Tangentially related to that, the phrase "cannot be encoded as a single code unit" could be read as a statement about the values of the code units rather than about the number of code units needed.
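As an illustration of the assumption described above, the following minimal sketch shows that the elements of an ordinary string literal are objects of type char, so each code unit has to be representable there. The helper `code_unit_at`, the array `two_units`, and the function `check_ordinary_code_units` are hypothetical names introduced for this example only. The sketch assumes an 8-bit char and uses hexadecimal escapes so that the result does not depend on the implementation's ordinary literal encoding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the issue): view the i-th element of an
// ordinary string literal as an unsigned code unit value. The conversion to
// unsigned char makes the result independent of whether char is signed.
constexpr unsigned code_unit_at(const char* s, std::size_t i) {
    return static_cast<unsigned char>(s[i]);
}

// Hexadecimal escapes specify code unit values directly, so this literal
// holds the two code units 0xC3 and 0xA9 whatever the ordinary literal
// encoding is (assumption: char has at least 8 bits, which the standard
// guarantees, and the values fit in unsigned char).
constexpr char two_units[] = "\xC3\xA9";

static_assert(sizeof(two_units) == 3, "two code units plus the terminating null");
static_assert(code_unit_at(two_units, 0) == 0xC3, "first code unit");
static_assert(code_unit_at(two_units, 1) == 0xA9, "second code unit");

// Runtime version of the same checks, callable from a test or from main().
inline void check_ordinary_code_units() {
    assert(code_unit_at(two_units, 0) == 0xC3);
    assert(code_unit_at(two_units, 1) == 0xA9);
    assert(code_unit_at(two_units, 2) == 0);  // terminating null
}
```

If char is signed, the stored elements are negative on typical implementations; reading them through unsigned char recovers the code unit values, which matches the "of type unsigned char" wording in the possible resolution.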

Possible resolution:

Change in 5.3.1 [lex.charset] paragraph 8 as follows and add to the index of implementation-defined behavior:

A code unit is an integer value of character type (6.8.2 [basic.fundamental]). Characters in a character-literal other than a multicharacter ~~or non-encodable character~~ literal or in a string-literal are encoded as a sequence of one or more code units, as determined by the encoding-prefix (5.13.3 [lex.ccon], 5.13.5 [lex.string]); this is termed the respective literal encoding. The ordinary literal encoding is the implementation-defined encoding applied to an ordinary character or string literal; its code units are of type unsigned char. The wide literal encoding is the implementation-defined encoding applied to a wide character or string literal; its code units are of type wchar_t.

Change in 5.13.3 [lex.ccon] bullet 3.1 as follows:

- A character-literal with a c-char-sequence consisting of a single basic-c-char, simple-escape-sequence, or universal-character-name is the code unit value of the specified character as encoded in the literal's associated character encoding. If the specified character lacks representation in the literal's associated character encoding or if it ~~cannot be encoded as a single code unit~~ is encoded with multiple code units, then the program is ill-formed.
- ...
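
The "encoded with multiple code units" reading of the bullet above can be sketched with the UTF-16 and UTF-32 literal encodings, which are fixed rather than implementation-defined. This is an illustrative example, not text from the issue; the names `e_acute`, `grin`, `grin_utf16`, and `check_code_unit_counts` are hypothetical. The ill-formed case is left as a comment because it would not compile.

```cpp
#include <cassert>

// U+00E9 is a single UTF-16 code unit, so the character-literal is
// well-formed and its value is that code unit.
constexpr char16_t e_acute = u'\u00E9';
static_assert(e_acute == 0x00E9, "single UTF-16 code unit");

// U+1F600 is a single UTF-32 code unit.
constexpr char32_t grin = U'\U0001F600';
static_assert(grin == 0x1F600, "single UTF-32 code unit");

// In UTF-16 the same character is encoded with two code units (a surrogate
// pair), so a character-literal would be ill-formed under the bullet above:
// constexpr char16_t bad = u'\U0001F600';

// A string-literal may hold the multi-unit sequence: two code units plus
// the terminating null.
constexpr char16_t grin_utf16[] = u"\U0001F600";
static_assert(sizeof(grin_utf16) / sizeof(char16_t) == 3, "surrogate pair plus null");
static_assert(grin_utf16[0] == 0xD83D && grin_utf16[1] == 0xDE00, "surrogate pair values");

// Runtime version of the same checks, callable from a test or from main().
inline void check_code_unit_counts() {
    assert(e_acute == 0x00E9);
    assert(grin == 0x1F600);
    assert(grin_utf16[0] == 0xD83D && grin_utf16[1] == 0xDE00);
}
```

Each code unit value here fits in char16_t. What makes the commented-out literal ill-formed is the number of code units, which is the distinction the rewording aims to make explicit.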

History
Date	User	Action	Args
2023-03-28 00:00:00	admin	create