Issue 119: N4197 Adding u8 character literals, [tiny] Why no u8 character literals?

Title: N4197 Adding u8 character literals, [tiny] Why no u8 character literals?
Status: wp
Section: [lex.ccon]
Submitter: Richard Smith

Created on 2014-04-14.00:00:00, last changed 2014-11-21.17:12:23

Messages

msg131

Date: 2014-11-21.17:12:23

http://open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4197.html

The discussion thread started at [c++std-ext-14798].

We have five encoding-prefixes for string-literals (none, L, u8, u, U) but only four for character literals -- the missing one is u8 for character literals.

This matters for implementations where the narrow execution character set is not ASCII. In such a case, u8 character literals would provide an ideal way to write character literals with guaranteed ASCII encoding (the single-code-unit u8 encodings are exactly ASCII), but... we don't provide them. Instead, the best one can do is something like this:

  char x_ascii = { u'x' };

... where we'll get a narrowing error if the codepoint doesn't fit in a 'char'. (Note that this is not quite the same as u8'x', which would give us an error if the codepoint were not representable as a single code unit in UTF-8.)

Is there a good reason for omitting this (useful and natural) functionality?

Discussed in Rapperswil 2014. EWG considers this to be an improvement, and encourages the author to take a proposal with wording to CWG. It's expected that Smith can do so without a separate EWG review.

Voted into the working draft in Urbana, as N4267.
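
For comparison, here is a minimal sketch that puts the workaround from the message next to the u8 form that N4267 added. The names x_utf8 and is_ascii_x are hypothetical and used only for illustration. The u8 part is guarded by the __cpp_unicode_characters feature-test macro, on the assumption that the compiler defines it; older compilers simply skip that part.

```cpp
// Workaround from the message: list-initialization rejects narrowing,
// so this compiles only if the value of u'x' fits in char.
constexpr char x_ascii = { u'x' };
static_assert(x_ascii == 0x78, "U+0078 has the value 0x78 in ASCII");

#if defined(__cpp_unicode_characters) && __cpp_unicode_characters >= 201411L
// With u8 character literals the intent is spelled directly.
// The literal has type char in C++17 and char8_t from C++20 on;
// 'auto' keeps this line valid in both, and the value is 0x78 either way.
constexpr auto x_utf8 = u8'x';
static_assert(x_utf8 == 0x78, "a single UTF-8 code unit is an ASCII value");
// A literal such as u8'\u00E9' is ill-formed, because U+00E9 needs
// two UTF-8 code units.
#endif

// Hypothetical helper: true if c holds the ASCII code for 'x',
// regardless of the narrow execution character set.
constexpr bool is_ascii_x(char c) { return c == x_ascii; }
```

On an implementation whose narrow execution character set is not ASCII, is_ascii_x('x') could return false while is_ascii_x(x_ascii) remains true, which is the distinction the issue is about.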

History
Date	User	Action	Args
2014-11-21 17:12:23	admin	set	status: ready -> wp
2014-07-01 21:57:43	admin	set	status: open -> ready
2014-04-14 00:00:00	admin	create