Title
§[time.duration.io]p4 makes surprising claims about encoding
Status
c++20
Section
[time.duration.io]
Submitter
Richard Smith

Created on 2018-04-02.00:00:00 last changed 46 months ago

Messages

Date: 2018-06-12.01:05:16

Proposed resolution:

This wording is relative to N4741.

  1. Edit [time.duration.io] as indicated:

    template<class charT, class traits, class Rep, class Period>
      basic_ostream<charT, traits>&
        operator<<(basic_ostream<charT, traits>& os, const duration<Rep, Period>& d);
    

    -1- Requires: […]

    -2- Effects: […]

    -3- The units suffix depends on the type Period::type as follows:

    1. […]

    2. (3.5) — Otherwise, if Period::type is micro, the suffix is "µs" ("\u00b5\u0073").

    3. […]

    4. (3.21) — Otherwise, the suffix is "[num/den]s".

    […]

    -4- [Deleted:] For streams where charT has an 8-bit representation, "µs" should be encoded as UTF-8. Otherwise UTF-16 or UTF-32 is encouraged. The implementation may substitute other encodings, including "us". [Inserted:] If Period::type is micro, but the character U+00B5 cannot be represented in the encoding used for charT, the unit suffix "us" is used instead of "µs".

    -5- Returns: os.

Date: 2018-06-12.01:05:16

[ 2018-06 Rapperswil: Adopted ]

Date: 2018-04-23.00:00:00

[ 2018-04-23 Moved to Tentatively Ready after 6 positive votes on c++std-lib. ]

Date: 2018-04-08.14:20:44

[time.duration.io]p4 says:

For streams where charT has an 8-bit representation, "µs" should be encoded as UTF-8. Otherwise UTF-16 or UTF-32 is encouraged. The implementation may substitute other encodings, including "us".

This choice of encoding is not up to the <chrono> library to decide or encourage. The basic execution character set determines how a mu should be encoded in type char, for instance, and it would be truly bizarre to use a UTF-8 encoding if that character set is, say, Latin-1 or EBCDIC.
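To make the mismatch concrete, here is a minimal sketch of how U+00B5 comes out under two different narrow encodings. The helper names `encode_utf8_small` and `encode_latin1` are hypothetical and exist only for this illustration; they are not part of <chrono> or of any implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: UTF-8 encoding for code points below U+0800 only,
// which is enough to cover U+00B5 (MICRO SIGN).
std::string encode_utf8_small(char32_t cp) {
    assert(cp < 0x800 && "sketch only handles one- and two-byte sequences");
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));                  // single byte
    } else {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));    // lead byte
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));  // continuation byte
    }
    return out;
}

// Hypothetical helper: Latin-1 maps U+0000..U+00FF to the same byte value.
std::string encode_latin1(char32_t cp) {
    assert(cp < 0x100 && "not representable in Latin-1");
    return std::string(1, static_cast<char>(cp));
}
```

Under these assumptions the micro sign is the single byte 0xB5 in Latin-1 but the two bytes 0xC2 0xB5 in UTF-8, so writing the UTF-8 form to a Latin-1 stream would show up as two unrelated characters.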

I suggest we strike at least the first two sentences of this paragraph, as the meaning of the prior wording is unambiguous without them and confusing with them, and they do not provide any normative requirements (although they do provide recommendations). The third sentence appears to have a normative impact, but it's hard to see how it's legitimate to call "us" an "encoding" of "µs"; it's really just an alternative unit suffix. So how about replacing that paragraph with this:

If Period::type is micro, but the character U+00B5 cannot be represented in the encoding used for charT, the unit suffix "us" is used instead of "µs".

(This also removes the permission for an implementation to choose an arbitrary alternative "encoding", which seems undesirable.)
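As a rough sketch of the suggested rule, the suffix choice could look like the following. The function `units_suffix` and its `mu_representable` parameter are hypothetical: the flag stands in for "U+00B5 can be represented in the encoding used for charT", only a few Period types are shown, and the narrow string is assumed to be UTF-8 purely for illustration.

```cpp
#include <ratio>
#include <string>
#include <type_traits>

// Hypothetical helper (not a library API): picks the units suffix for a
// handful of Period types. Requires C++17 for `if constexpr`.
template <class Period>
std::string units_suffix(bool mu_representable) {
    using P = typename Period::type;
    if constexpr (std::is_same_v<P, std::micro>) {
        // "us" is an alternative suffix, not an alternative encoding of "µs".
        // Assumption: the narrow encoding is UTF-8, so "µs" is 0xC2 0xB5 's'.
        return mu_representable ? "\xC2\xB5s" : "us";
    } else if constexpr (std::is_same_v<P, std::milli>) {
        return "ms";
    } else if constexpr (std::is_same_v<P, std::ratio<1>>) {
        return "s";
    } else {
        // Fallback form "[num/den]s" from the last bullet of paragraph 3.
        return "[" + std::to_string(P::num) + "/" + std::to_string(P::den) + "]s";
    }
}
```

With this shape, `units_suffix<std::micro>(false)` yields "us", and no other substitution is available to the implementation, which matches the removal of the open-ended "other encodings" permission.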

History
Date User Action Args
2021-02-25 10:48:01 admin set status: wp -> c++20
2018-06-12 01:05:16 admin set messages: + msg9890
2018-06-12 01:05:16 admin set status: voting -> wp
2018-05-06 19:23:13 admin set status: ready -> voting
2018-05-05 12:06:07 admin set messages: + msg9830
2018-05-05 12:06:07 admin set status: new -> ready
2018-04-08 14:18:08 admin set messages: + msg9805
2018-04-02 00:00:00 admin create