Title
Phase 1 replacement of characters with universal-character-names
Status
cd6
Section
5.2 [lex.phases]
Submitter
Martin Vejnár

Created on 2006-05-07.00:00:00; last changed 28 months ago.

Messages

Date: 2022-02-15.00:00:00

Additional note (February, 2022):

Paper P2314R4, "Character sets and encodings" (approved in October, 2021), changed the specification so that extended characters are no longer translated to universal-character-names in phase 1.

Date: 2021-10-15.00:00:00

[Accepted at the October, 2021 meeting as part of paper P2314R4.]

According to 5.2 [lex.phases] paragraph 1, in translation phase 1,

Any source file character not in the basic source character set (5.3.1 [lex.charset]) is replaced by the universal-character-name that designates that character.

If a character that is not in the basic source character set is preceded by a backslash character, for example

    "\á"

the result is equivalent to

    "\\u00e1"

that is, a backslash character followed by the spelling of the universal-character-name. This is different from the result in C99, which accepts characters from the extended source character set without replacing them with universal-character-names.
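The replacement described above can be modeled with a small sketch. This is only an illustration of the pre-P2314R4 wording, not how any compiler implements phase 1. The names `in_basic_source_set` and `phase1_replace` are hypothetical, the input is assumed to be already decoded to code points, and the basic source character set is assumed to be the 96-character set of C++20 and earlier.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: membership test for the basic source character set
// as specified before P2314R4 (5 whitespace characters + 91 graphic ones).
bool in_basic_source_set(char32_t c) {
    static const std::u32string whitespace = U" \t\v\f\n";
    static const std::u32string graphic =
        U"abcdefghijklmnopqrstuvwxyz"
        U"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        U"0123456789"
        U"_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
    return whitespace.find(c) != std::u32string::npos ||
           graphic.find(c) != std::u32string::npos;
}

// Hypothetical model of the old phase 1 rule: every character outside the
// basic source character set is replaced by the spelling of a
// universal-character-name. Nothing here looks at a preceding backslash,
// which is exactly the behavior the issue describes.
std::string phase1_replace(const std::u32string& source) {
    std::string out;
    for (char32_t c : source) {
        assert(c <= 0x10FFFF);  // must be a valid code point
        if (in_basic_source_set(c)) {
            out += static_cast<char>(c);
        } else {
            char buf[11];  // "\U" + 8 hex digits + terminator
            if (c <= 0xFFFF) {
                std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
            } else {
                std::snprintf(buf, sizeof buf, "\\U%08x", static_cast<unsigned>(c));
            }
            out += buf;
        }
    }
    return out;
}
```

Under these assumptions, passing the four characters `"`, `\`, `á`, `"` to `phase1_replace` yields the nine characters `"\\u00e1"`: the backslash from the source survives unchanged and is followed by the spelling of the universal-character-name, so the later phases see an escaped backslash followed by `u00e1`.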

See also issue 1335.

History
Date                 User   Action  Args
2022-08-19 07:54:33  admin  set     status: review -> cd6
2022-02-18 07:47:23  admin  set     messages: + msg6695
2022-02-18 07:47:23  admin  set     status: open -> review
2006-05-07 00:00:00  admin  create