Created on 2016-01-13.00:00:00 last changed 101 months ago
Proposed resolution:
This wording is relative to N4567.
Change [re.grammar]/3 as indicated:
-3- The following productions within the ECMAScript grammar are modified as follows:
ClassAtom ::
  -
  ClassAtomNoDash
  ClassAtomExClass
  ClassAtomCollatingElement
  ClassAtomEquivalence

IdentityEscape ::
  SourceCharacter but not c
[ 2016-08, Chicago ]
Monday PM: Move to tentatively ready
Stephan and I are seeing differences in implementation for how non-special characters should be handled in the IdentityEscape part of the ECMAScript grammar. For example:
#include <stdio.h>
#include <iostream>
#ifdef USE_BOOST
#include <boost/regex.hpp>
using namespace boost;
#else
#include <regex>
#endif
using namespace std;
int main() {
    try {
        const regex r("\\z");
        cout << "Constructed \\z." << endl;
        if (regex_match("z", r))
            cout << "Matches z" << endl;
    } catch (const regex_error& e) {
        cout << e.what() << endl;
    }
}
libstdc++, boost, and browsers I tested with (Microsoft Edge, Google Chrome) all happily interpret \z, which otherwise has no meaning, as an identity character escape for the letter z. libc++ and msvc++ say that this is invalid, and throw regex_error with error_escape.
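Until the implementations agree, a pattern built from literal text can sidestep the question by never escaping an ordinary letter. The following is a minimal sketch of that idea, not something taken from the issue itself: `escape_literal` is a hypothetical helper name, and the set of characters it escapes is an assumption, namely the ECMAScript SyntaxCharacter set.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the standard library): backslash-escapes
// only the ECMAScript SyntaxCharacters. Letters and digits are copied as-is,
// so the result should not depend on how an implementation treats escapes
// such as \z.
std::string escape_literal(const std::string& text) {
    static const std::string syntax_chars = "^$\\.*+?()[]{}|";
    std::string out;
    out.reserve(text.size() * 2);
    for (char ch : text) {
        if (syntax_chars.find(ch) != std::string::npos)
            out.push_back('\\');  // escape a character with special meaning
        out.push_back(ch);
    }
    return out;
}
```

For example, `std::regex(escape_literal("a.b$"))` is constructed from the pattern `a\.b\$`, which contains only escapes of punctuation characters.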
ECMAScript 3 (which is what C++ currently points to) seems to agree with libc++ and msvc++:

IdentityEscape ::
  SourceCharacter but not IdentifierPart
IdentifierPart ::
  IdentifierStart
  UnicodeCombiningMark
  UnicodeDigit
  UnicodeConnectorPunctuation
  \ UnicodeEscapeSequence
IdentifierStart ::
  UnicodeLetter
  $
  _
  \ UnicodeEscapeSequence

But this doesn't make any sense: it prohibits things like \$, which users absolutely need to be able to escape. So let's look at ECMAScript 6. I believe this says much the same thing, but updates the spec to better handle Unicode by referencing what the Unicode standard says is an identifier character:

IdentityEscape ::
  SyntaxCharacter
  /
  SourceCharacter but not UnicodeIDContinue
UnicodeIDContinue ::
  any Unicode code point with the Unicode property "ID_Continue", "Other_ID_Continue", or "Other_ID_Start"
However, ECMAScript 6 has an appendix B defining "additional features for web browsers" which says:
IdentityEscape :: SourceCharacter but not c
which appears to agree with what libstdc++, boost, and browsers are doing.
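The three IdentityEscape rules quoted above can be compared side by side with a small classifier. This is only an illustrative sketch under stated assumptions: it handles ASCII characters only (approximating IdentifierPart as letters, digits, `$` and `_`, and UnicodeIDContinue as letters, digits and `_`), it ignores every other escape production such as `\d` or `\b`, and the names `EscapeRule` and `is_identity_escape` are hypothetical.

```cpp
// Which quoted rule to apply; names are hypothetical, for illustration only.
enum class EscapeRule { Es3, Es6, AnnexB };

// ASCII-only letter-or-digit test, written out to avoid locale dependence.
bool is_ascii_alnum(char ch) {
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
           (ch >= '0' && ch <= '9');
}

// Returns true if a backslash followed by ch would be an IdentityEscape
// under the given rule. Other escape productions are not considered.
bool is_identity_escape(char ch, EscapeRule rule) {
    switch (rule) {
    case EscapeRule::Es3:
        // SourceCharacter but not IdentifierPart
        return !(is_ascii_alnum(ch) || ch == '$' || ch == '_');
    case EscapeRule::Es6:
        // SyntaxCharacter, '/', or SourceCharacter but not UnicodeIDContinue
        return !(is_ascii_alnum(ch) || ch == '_');
    case EscapeRule::AnnexB:
        // SourceCharacter but not c
        return ch != 'c';
    }
    return false;
}
```

Under this approximation, `z` is rejected by the ES3 and ES6 rules and accepted by the Annex B rule, while `$` is rejected only by the ES3 rule.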
What should be the correct behavior here?

History

| Date | User | Action | Args |
|---|---|---|---|
| 2017-07-30 20:15:43 | admin | set | status: wp -> c++17 |
| 2016-11-14 03:59:28 | admin | set | status: pending -> wp |
| 2016-11-14 03:55:22 | admin | set | status: ready -> pending |
| 2016-08-02 17:19:11 | admin | set | messages: + msg8332 |
| 2016-08-02 17:19:11 | admin | set | status: new -> ready |
| 2016-01-16 21:32:44 | admin | set | messages: + msg7686 |
| 2016-01-13 00:00:00 | admin | create | |