Title
Encoding of numerically-escaped characters
Status
cd6
Section
5.13.3 [lex.ccon]
Submitter
Mike Miller

Created on 2013-04-30.00:00:00, last changed 20 months ago

Messages

Date: 2013-09-15.00:00:00

Notes from the September, 2013 meeting:

The second interpretation (that the escape sequence specifies the execution-time code unit) is intended.

Date: 2020-11-15.00:00:00

[Accepted at the November, 2020 meeting as part of paper P2029R4.]

According to 5.13.3 [lex.ccon] paragraph 4,

The escape \ooo consists of the backslash followed by one, two, or three octal digits that are taken to specify the value of the desired character. The escape \xhhh consists of the backslash followed by x followed by one or more hexadecimal digits that are taken to specify the value of the desired character. There is no limit to the number of digits in a hexadecimal sequence. A sequence of octal or hexadecimal digits is terminated by the first character that is not an octal digit or a hexadecimal digit, respectively. The value of a character literal is implementation-defined if it falls outside of the implementation-defined range defined for char (for literals with no prefix), char16_t (for literals prefixed by 'u'), char32_t (for literals prefixed by 'U'), or wchar_t (for literals prefixed by 'L').
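
The termination rules quoted above can be illustrated with a short sketch. The variable names are hypothetical and chosen only for illustration; the checks compare against numeric values, so they do not assume a particular execution character set.

```cpp
#include <cassert>
#include <cstddef>

// An octal escape takes at most three digits: "\1234" is the code unit 0123
// followed by the ordinary character '4' and the terminating null.
constexpr auto& octal_then_digit = "\1234";
static_assert(sizeof(octal_then_digit) == 3, "two characters plus null");
static_assert(octal_then_digit[0] == 0123, "first element is octal 123");
static_assert(octal_then_digit[1] == '4', "fourth digit is a separate character");

// An octal escape also ends at the first non-octal digit: "\18" is \1 then '8'.
constexpr auto& octal_cut_short = "\18";
static_assert(sizeof(octal_cut_short) == 3, "two characters plus null");
static_assert(octal_cut_short[0] == 1, "escape stops before '8'");
static_assert(octal_cut_short[1] == '8', "'8' is an ordinary character");

// A hexadecimal escape has no digit limit, so "\x41B" would be read as the
// single value 0x41B. Splitting the literal ends the escape after "41";
// adjacent literals are concatenated after escapes have been processed.
constexpr auto& hex_split = "\x41" "B";
static_assert(sizeof(hex_split) == 3, "two characters plus null");
static_assert(hex_split[0] == 0x41, "escape value is 0x41");
static_assert(hex_split[1] == 'B', "'B' comes from the second literal");
```

Splitting the literal is the usual way to follow a hexadecimal escape with a character that happens to be a hexadecimal digit.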

It is not clearly stated whether the “desired character” being specified reflects the source or the target encoding. This particularly affects UTF-8 string literals (5.13.5 [lex.string] paragraph 7):

A string literal that begins with u8, such as u8"asdf", is a UTF-8 string literal and is initialized with the given characters as encoded in UTF-8.

For example, assuming the source encoding is Latin-1, is u8"\xff" supposed to specify a three-byte string whose first two bytes are 0xc3 0xbf (the UTF-8 encoding of \u00ff) or a two-byte string whose first byte has the value 0xff? (At least some current implementations assume the latter interpretation.)
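
The two readings can be contrasted in a minimal sketch, assuming a compiler that follows the interpretation recorded in the September, 2013 notes (the numeric escape specifies the code unit itself). The names numeric, ucn, and byte_at are hypothetical helpers, not part of the issue text. The byte counts include the terminating null, and the code is written to compile whether u8 literals have element type char or char8_t.

```cpp
#include <cassert>
#include <cstddef>

// Numeric escape: the hexadecimal digits give the value of one code unit.
constexpr auto& numeric = u8"\xff";

// Universal-character-name: designates code point U+00FF, which is then
// encoded as UTF-8 (two code units, 0xc3 0xbf).
constexpr auto& ucn = u8"\u00ff";

static_assert(sizeof(numeric) == 2, "one code unit plus terminating null");
static_assert(sizeof(ucn) == 3, "two code units plus terminating null");

// Hypothetical helper: read element i as an unsigned value so that the
// comparison does not depend on the signedness of the element type.
template <typename CharT, std::size_t N>
constexpr unsigned byte_at(const CharT (&s)[N], std::size_t i) {
    return static_cast<unsigned char>(s[i]);
}

static_assert(byte_at(numeric, 0) == 0xffu, "escape value used directly");
static_assert(byte_at(ucn, 0) == 0xc3u, "first UTF-8 code unit of U+00FF");
static_assert(byte_at(ucn, 1) == 0xbfu, "second UTF-8 code unit of U+00FF");
```

Under the first interpretation described above, numeric would instead have the same contents as ucn; note that a lone 0xff code unit is not a valid UTF-8 sequence, so the escape lets the author store a value that no encoded character would produce.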

History
Date                 User   Action  Args
2022-08-19 07:54:33  admin  set     status: wp -> cd6
2021-02-24 00:00:00  admin  set     status: accepted -> wp
2020-12-15 00:00:00  admin  set     status: drafting -> accepted
2013-10-14 00:00:00  admin  set     messages: + msg4617
2013-10-14 00:00:00  admin  set     status: open -> drafting
2013-04-30 00:00:00  admin  create