Title
Different treatment of #include pp-tokens and header-name-tokens
Status
review
Section
15.3 [cpp.include]
Submitter
Hubert Tong

Created on 2025-09-22.00:00:00 last changed 2 weeks ago

Messages

Date: 2025-09-15.00:00:00

Proposed resolution (September, 2025):

  1. Change in 5.9 [lex.digraph] paragraph 2 as follows:

    In all respects of the language, each alternative token behaves the same, respectively, as its primary token, except for its spelling. [ Footnote: Thus the Note: The “stringized” values (15.7.3 [cpp.stringize]) of [ and <: will be are different, maintaining the source spelling, but the tokens can otherwise be freely interchanged. ] The set of alternative tokens is defined in Table 3.
  2. Change in 15.2 [cpp.cond] paragraph 1 as follows:

      header-name-tokens:
          string-literal plain-string-literal
          < h-pp-tokens >
    
  3. Change in 15.3 [cpp.include] paragraph 7 (supersedes the change to that paragraph from issue 3076):

    A preprocessing directive of the form
      # include pp-tokens new-line
    
    (that does not match the previous form) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text (i.e., each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens). The resulting sequence of preprocessing tokens shall be of the form
      header-name-tokens
    
    Then, an An attempt is then made to form a header-name preprocessing token (5.6 [lex.header]) from the whitespace and the characters of the spellings of the resulting sequence of preprocessing tokens header-name-tokens; the treatment of whitespace is implementation-defined. If the attempt succeeds, the directive with the so-formed header-name is processed as specified for the previous form. Otherwise, the program is ill-formed, no diagnostic required.

Note: The third change of the resolution supersedes the second change in the resolution of issue 3076.
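
For illustration only (not part of the proposed wording), the second form of #include described in the third change can be sketched as follows. The macro name STRING_HEADER and the function length_via_computed_include are hypothetical, and a hosted implementation is assumed. The macro expands to the preprocessing tokens <, cstring, and >, which match the < h-pp-tokens > shape and from which the header-name <cstring> is formed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro: expands to the three preprocessing tokens  <  cstring  >
#define STRING_HEADER <cstring>

// Second form of #include: the tokens after "include" are macro-expanded,
// then a header-name is formed from the spellings of the resulting tokens.
#include STRING_HEADER

// Uses std::strlen, which is only declared if <cstring> was really included.
inline std::size_t length_via_computed_include(const char* s) {
    return std::strlen(s);
}

// Small self-check that the computed include made std::strlen available.
inline void check_computed_include() {
    assert(length_via_computed_include("abc") == 3);
}
```

No whitespace appears between the tokens of the replacement list here, so the implementation-defined treatment of whitespace does not come into play.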

Date: 2025-09-22.00:00:00

(From submission #770.)

Consider:

  #define X >
  #include <<X

As further clarified by issue 3015, this performs the same inclusion as

  #include <<>

Implementations diverge: Clang accepts this; GCC, EDG, and MSVC reject it.

There are related concerns when the character sequence of a digraph appears in a prospective header-name. The following is ill-formed because <% is a digraph:

  #define X >
  #if __has_include(<%X)
  #endif

However, the same character sequence is valid in #include:

  #define X >
  #include <%X    // valid, includes %

Thus the footnote in 5.9 [lex.digraph] paragraph 2 is overly broad.
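
The part of the footnote that does hold outside header-name formation can be checked directly. The following is a minimal sketch, assuming a conforming implementation; STRINGIZE, second_element, and spellings_preserved are hypothetical names introduced for the illustration. It shows <: and [ being interchangeable as tokens while their stringized spellings differ.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: stringizes its argument without macro-expanding it.
#define STRINGIZE(x) #x

// As tokens, digraphs are interchangeable with their primary tokens:
// the declaration below means  int a[2] = {1, 2};
inline int second_element() {
    int a<:2:> = <%1, 2%>;
    return a<:1:>;
}

// The stringized value keeps the source spelling, so <: and [ differ there.
inline bool spellings_preserved() {
    return std::strcmp(STRINGIZE(<:), "<:") == 0
        && std::strcmp(STRINGIZE([), "[") == 0
        && std::strcmp(STRINGIZE(<:), STRINGIZE([)) != 0;
}

// Small self-check covering both observations.
inline void check_digraph_behaviour() {
    assert(second_element() == 2);
    assert(spellings_preserved());
}
```

The #include <%X example above is a second place where spelling matters: the header-name is formed from the spellings of the tokens, so <% is not treated as {.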

History
Date                 User   Action  Args
2025-09-26 20:51:59  admin  set     status: open -> review
2025-09-22 20:49:34  admin  set     messages: + msg8123
2025-09-22 00:00:00  admin  create