Issue 2640: Allow more characters in an n-char sequence

Title: Allow more characters in an n-char sequence
Status: c++23
Section: 5.3.1 [lex.charset]
Submitter: US

Created on 2022-11-03.00:00:00, last changed 24 months ago

Messages

Date: 2022-11-07.21:52:14

Proposed resolution (approved by CWG 2022-11-07):

Change the grammar in 5.3.1 [lex.charset] paragraph 3 as follows:

n-char:
     [deleted] A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
     [deleted] 0 1 2 3 4 5 6 7 8 9
     [deleted] U+002D HYPHEN-MINUS
     [deleted] U+0020 SPACE
     [added] any member of the translation character set except the U+007D RIGHT CURLY BRACKET or new-line character
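
The effect of this grammar change on individual characters can be sketched as a pair of predicates. This is only an illustration, not wording from the standard: the names is_n_char_before and is_n_char_after are hypothetical, the argument is assumed to already be a member of the translation character set, and "new-line character" is modeled as U+000A only.

```cpp
// Hypothetical helpers illustrating the n-char grammar before and after
// this issue. Assumptions: `c` is already a member of the translation
// character set, and "new-line character" is modeled as U+000A only.

// Before: uppercase Latin letters, digits, hyphen-minus, and space.
constexpr bool is_n_char_before(char32_t c) {
    return (c >= U'A' && c <= U'Z') ||
           (c >= U'0' && c <= U'9') ||
           c == U'-' || c == U' ';
}

// After: anything except U+007D RIGHT CURLY BRACKET or new-line.
constexpr bool is_n_char_after(char32_t c) {
    return c != U'}' && c != U'\n';
}

// Uppercase letters are accepted by both grammars.
static_assert(is_n_char_before(U'A') && is_n_char_after(U'A'));
// Lowercase letters are accepted only by the relaxed grammar.
static_assert(!is_n_char_before(U'a') && is_n_char_after(U'a'));
// The closing brace still terminates the sequence.
static_assert(!is_n_char_after(U'}'));
```

The relaxed predicate says only which characters may appear between the braces. Whether the resulting name designates a character is a separate check.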

msg6962

Date: 2022-11-27.21:00:25

P2720R0 comment US 1-028

[Accepted at the November, 2022 meeting.]

The n-char grammar term is defined to match only uppercase Latin letters, digits, hyphen-minus, and space. As a result, \N{ABC} matches named-universal-character while \N{abc} does not. Programs like the following are therefore unexpectedly well-formed, because the \N{abc} sequence is lexed as the preprocessing token sequence \, N, {, abc, }. The expansion of macro a then causes that token sequence to be passed as an argument to macro z, where it is discarded.

  #define z(x) 0
  #define a z(
  int x = a\N{abc});

Changes to make the above program ill-formed would provide two benefits:

Implementations could diagnose the \N{abc} sequence as an ill-formed named-universal-character regardless of where it appears in a program.
The \N{...} syntax space would be reserved for expansion (e.g., for extensions or future support of UAX44-LM2 loose matching schemes).
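
The lexing difference described in this issue can be sketched with a small matcher for the \N{ n-char-sequence } form. This is a hedged illustration rather than an actual lexer: match_named_ucn and its relaxed parameter are hypothetical names, plain char values stand in for members of the translation character set, and no lookup of the character name is attempted.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical sketch: returns the length of a named-universal-character
// (\N{ n-char-sequence }) matched at the start of `s`, or 0 if none matches.
// `relaxed` selects the grammar adopted by this issue; otherwise the
// original grammar (A-Z, 0-9, hyphen-minus, space) is used.
// Assumption: plain chars stand in for translation-character-set members.
constexpr std::size_t match_named_ucn(std::string_view s, bool relaxed) {
    if (s.size() < 3 || s[0] != '\\' || s[1] != 'N' || s[2] != '{') return 0;
    std::size_t i = 3;
    for (; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '}') break;  // end of the n-char-sequence
        const bool ok = relaxed
            ? (c != '\n')
            : ((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') ||
               c == '-' || c == ' ');
        if (!ok) return 0;
    }
    // Require a closing brace and at least one n-char.
    if (i == s.size() || i == 3) return 0;
    return i + 1;
}

// \N{ABC} matches under both grammars.
static_assert(match_named_ucn("\\N{ABC}", false) == 7);
// \N{abc} does not match under the original grammar, so a lexer would
// fall back to the separate tokens \, N, {, abc, }.
static_assert(match_named_ucn("\\N{abc}", false) == 0);
// Under the relaxed grammar it matches, so it can be diagnosed as an
// ill-formed named-universal-character once the name lookup fails.
static_assert(match_named_ucn("\\N{abc}", true) == 7);
```

A match here means only that the sequence has the shape of a named-universal-character. Rejecting abc as an unknown character name would be a later step that this sketch does not model.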

History
Date	User	Action	Args
2023-07-16 13:00:43	admin	set	status: open -> c++23
2023-07-16 13:00:43	admin	set	status: wp -> open
2023-02-18 18:43:04	admin	set	status: accepted -> wp
2022-11-25 05:14:04	admin	set	status: nb -> accepted
2022-11-07 21:52:14	admin	set	status: open -> nb
2022-11-07 21:52:14	admin	set	status: open -> open
2022-11-07 21:52:14	admin	set	status: open -> open
2022-11-07 21:52:14	admin	set	status: open -> open
2022-11-07 07:49:23	admin	set	status: nb -> open
2022-11-07 07:49:23	admin	set	status: nb -> nb
2022-11-07 07:49:23	admin	set	status: nb -> nb
2022-11-07 07:49:23	admin	set	messages: + msg6963
2022-11-03 00:00:00	admin	create