Title: Escape sequences in UTF-8 character literals
Status: cd6
Section: 5.13.3 [lex.ccon]
Submitter: Mike Miller

Created on 2017-01-05.00:00:00, last changed 27 months ago

Messages

Date: 2017-08-15.00:00:00

Notes from the August, 2017 teleconference:

An escape sequence in a UTF-8 character literal should be ill-formed.

Date: 2020-11-15.00:00:00

[Accepted at the November, 2020 meeting as part of paper P2029R4.]

The meaning of a numeric escape appearing in a UTF-8 character literal is not clear. 5.13.3 [lex.ccon] paragraph 3 assumes that the content of the quoted string is a character with an ISO 10646 code point value, which is not necessarily the case with a numeric escape. Paragraph 8 could be read to indicate that a numeric escape specifies the actual runtime value of the object rather than a Unicode code point. In addition, paragraph 8 specifies the result only for unprefixed and wide-character literals, not for UTF-8 literals, so it could be read as indicating that a numeric escape in a UTF-8 character literal is undefined behavior (i.e., not defined by the Standard).
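
The question can be made concrete with a small sketch. It assumes the reading on which a numeric escape in a UTF-8 character literal names the value of the single code unit directly, which is believed to be the direction taken by P2029R4 and is what current compilers appear to implement; it is not a quotation of the adopted wording. The helper `code_unit_value` and the constant names are hypothetical, introduced only for illustration. Only values in the ASCII range are used, so the code point and the code unit coincide and the two readings give the same result.

```cpp
// Returns the value of a UTF-8 character literal as an unsigned integer.
// The literal has type char in C++17 and char8_t in C++20, so the helper
// is a template; converting through unsigned char avoids sign extension.
// (code_unit_value is a hypothetical helper, not a standard facility.)
template <typename CharT>
constexpr unsigned code_unit_value(CharT c) {
    return static_cast<unsigned char>(c);
}

// The same character written three ways: hexadecimal escape, octal escape,
// and an ordinary character.
constexpr unsigned from_hex   = code_unit_value(u8'\x41');
constexpr unsigned from_octal = code_unit_value(u8'\101');
constexpr unsigned from_plain = code_unit_value(u8'A');

// Assumption under test: the numeric escape supplies the code unit value,
// so all three literals denote 0x41 (U+0041 LATIN CAPITAL LETTER A).
static_assert(from_hex == 0x41, "hex escape gives code unit 0x41");
static_assert(from_octal == 0x41, "octal escape gives code unit 0x41");
static_assert(from_plain == 0x41, "plain literal is U+0041");
```

The ambiguity described above matters mainly for values of 0x80 and greater, where a single UTF-8 code unit no longer corresponds to a complete code point; the sketch deliberately avoids that range.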

History
Date                 User   Action  Args
2022-08-19 07:54:33  admin  set     status: wp -> cd6
2021-02-24 00:00:00  admin  set     status: accepted -> wp
2020-12-15 00:00:00  admin  set     status: drafting -> accepted
2018-02-27 00:00:00  admin  set     messages: + msg5890
2017-01-05 00:00:00  admin  create