Title
Line endings in raw string literals
Status
drafting
Section
5.4 [lex.pptoken]
Submitter
Mike Miller

Created on 2013-04-26.00:00:00, last changed 128 months ago

Messages

Date: 2015-04-13.00:00:00

According to 5.4 [lex.pptoken] paragraph 3,

If the input stream has been parsed into preprocessing tokens up to a given character:

  • If the next character begins a sequence of characters that could be the prefix and initial double quote of a raw string literal, such as R", the next preprocessing token shall be a raw string literal. Between the initial and final double quote characters of the raw string, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted; this reversion shall apply before any d-char, r-char, or delimiting parenthesis is identified.

However, phase 1 is defined as:

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. The set of physical source file characters accepted is implementation-defined. Trigraph sequences (_N4140_.2.4 [lex.trigraph]) are replaced by corresponding single-character internal representations. Any source file character not in the basic source character set (5.3 [lex.charset]) is replaced by the universal-character-name that designates that character.

The reversion described in 5.4 [lex.pptoken] paragraph 3 lists trigraphs, universal-character-names, and line splicing, but does not mention the replacement of physical end-of-line indicators with new-line characters. Is it intended that, for example, a CRLF inside a raw string literal in the source file is represented in the resulting string as a single new-line character, or as the original characters?
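
The question can be made concrete with a small probe. The following is only a sketch, not part of the issue text: the helper names `two_line_raw` and `hex_bytes` are hypothetical, and the hex values in the comments assume an ASCII-compatible execution character set. The raw string literal deliberately spans two physical source lines, so the characters between `a` and `b` in the resulting string are whatever the implementation produces for that line break.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns a raw string literal that spans two
// physical source lines. The line break between 'a' and 'b' is the
// end-of-line indicator whose treatment the issue asks about.
std::string two_line_raw() {
    return R"(a
b)";
}

// Hypothetical helper: renders each character of s as two hex digits,
// separated by spaces, so that 0x0d (CR) and 0x0a (LF) are visible.
std::string hex_bytes(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x",
                      static_cast<unsigned>(static_cast<unsigned char>(s[i])));
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Calling `hex_bytes(two_line_raw())` shows what the literal contains. If the line break is represented as a single new-line character, the result is `61 0a 62`. Whether a source file saved with CRLF line endings may instead yield `61 0d 0a 62` is exactly what the quoted wording leaves open.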

History
Date User Action Args
2013-10-14 00:00:00 admin set status: open -> drafting
2013-04-26 00:00:00 admin create