Issue 2333: Escape sequences in UTF-8 character literals

Title: Escape sequences in UTF-8 character literals
Status: cd6
Section: 5.13.3 [lex.ccon]
Submitter: Mike Miller

Created on 2017-01-05.00:00:00, last changed 35 months ago

Messages

msg5890

Date: 2017-08-15.00:00:00

Notes from the August, 2017 teleconference:

An escape sequence in a UTF-8 character literal should be ill-formed.

msg5889

Date: 2020-11-15.00:00:00

[Accepted at the November, 2020 meeting as part of paper P2029R4.]

The meaning of a numeric escape appearing in a UTF-8 character literal is not clear. 5.13.3 [lex.ccon] paragraph 3 assumes that the contents of the quoted string are a character with an ISO 10646 code point value, which is not necessarily the case with a numeric escape, and paragraph 8 could be read to indicate that a numeric escape specifies the actual runtime value of the object rather than a Unicode code point. In addition, paragraph 8 only specifies the result for unprefixed and wide-character literals, not for UTF-8 literals, so that could be read as indicating that a numeric escape in a UTF-8 character literal is undefined behavior (i.e., not defined by the Standard).
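For illustration, the following is a minimal sketch of the kind of literal at issue. It assumes a compiler that supports `u8` character literals (C++17 or later) and that implements the resolution adopted via P2029R4, under which a numeric escape that fits in a single code unit yields that value. The helper `code_unit_value` is hypothetical and exists only for this example.

```cpp
// Hypothetical helper (not from the Standard): returns the numeric value
// of one UTF-8 code unit. decltype(u8'a') is char in C++17 and char8_t
// in C++20, so the same code compiles in both modes.
constexpr unsigned code_unit_value(decltype(u8'a') c) {
    return static_cast<unsigned char>(c);
}

// A plain character: its value is its ISO 10646 code point.
static_assert(code_unit_value(u8'A') == 0x41);

// A numeric escape naming the same value; no ambiguity arises here.
static_assert(code_unit_value(u8'\x41') == 0x41);

// A numeric escape that is not the UTF-8 encoding of any character.
// Assumption: the compiler treats the escape as the code unit value
// itself rather than as a code point to be encoded.
static_assert(code_unit_value(u8'\xFF') == 0xFF);
```

The last assertion is the case the issue describes: 0xFF never occurs as a code unit in well-formed UTF-8, so the literal cannot be read as "a character with an ISO 10646 code point value" and only makes sense if the escape denotes the stored value directly.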

History
Date	User	Action	Args
2022-08-19 07:54:33	admin	set	status: wp -> cd6
2021-02-24 00:00:00	admin	set	status: accepted -> wp
2020-12-15 00:00:00	admin	set	status: drafting -> accepted
2018-02-27 00:00:00	admin	set	messages: + msg5890
2017-01-05 00:00:00	admin	create