Title
error_category messages have unspecified encoding
Status
new
Section
[syserr.errcat.virtuals]
Submitter
Victor Zverovich

Created on 2024-09-18.00:00:00

Messages

Date: 2024-09-18.21:00:19

Proposed resolution:

This wording is relative to N4988.

  1. Modify [syserr.errcat.virtuals] as indicated:

    virtual string message(int ev) const = 0;

    -5- Returns: A string in the ordinary literal encoding that describes the error condition denoted by `ev`.
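For illustration only, here is a minimal sketch of a user-defined category that would satisfy the proposed wording. The names `parse_errc`, `parse_category_impl`, `parse_category` and the message texts are hypothetical and not part of the issue. Every message comes from an ordinary string literal, so it is in the ordinary literal encoding by construction and does not depend on any locale.

```cpp
#include <string>
#include <system_error>

// Hypothetical error codes, invented for this sketch.
enum class parse_errc { ok = 0, unexpected_eof = 1, bad_token = 2 };

class parse_category_impl final : public std::error_category {
public:
    const char* name() const noexcept override { return "parse"; }

    // Each message is an ordinary string literal, so the returned string
    // is in the ordinary literal encoding regardless of the C or C++ locale.
    std::string message(int ev) const override {
        switch (static_cast<parse_errc>(ev)) {
        case parse_errc::ok:             return "no error";
        case parse_errc::unexpected_eof: return "unexpected end of input";
        case parse_errc::bad_token:      return "bad token";
        }
        return "unknown parse error";
    }
};

// Single category instance, as is conventional for error categories.
inline const std::error_category& parse_category() noexcept {
    static const parse_category_impl instance;
    return instance;
}

inline std::error_code make_error_code(parse_errc e) noexcept {
    return {static_cast<int>(e), parse_category()};
}
```

With this sketch, `make_error_code(parse_errc::bad_token).message()` yields the same bytes no matter which locale is active.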

Date: 2024-09-15.00:00:00

[ 2024-09-18; Jonathan comments ]

It might make sense to stop using the word "encoding" in [syserr.errcat.overview].

Date: 2024-09-18.21:00:19

[syserr.errcat.overview] says:

The class `error_category` serves as a base class for types used to identify the source and encoding of a particular category of error code.

However, this doesn't seem to refer to a character encoding, but rather to how an error is encoded into an integer value. The definition of `error_category::message` ([syserr.errcat.virtuals] p5) just says:

virtual string message(int ev) const = 0;

Returns: A string that describes the error condition denoted by `ev`.

This says nothing about character encoding either.

There is also implementation divergence: some implementations use variants of `strerror` which return messages in the current C locale encoding, but at least one major implementation doesn't use the current C locale: MSVC STL issue 4711.
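One way to observe the divergence on a given implementation is to compare the two sources of text side by side. The following is only a sketch: `message_probe` and `probe_message` are hypothetical helper names, and the actual strings and their encodings are implementation-defined, so no particular output is assumed.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <system_error>

// Hypothetical helper type holding both messages for one errno value.
struct message_probe {
    std::string from_category;  // std::generic_category().message(ev)
    std::string from_strerror;  // std::strerror(ev), which follows the current C locale
};

// Collects both messages. Whether they match, and in which encoding each
// one is expressed, depends on the implementation and the active C locale.
inline message_probe probe_message(int ev) {
    return {std::generic_category().message(ev), std::strerror(ev)};
}
```

Calling `probe_message(ENOENT)` before and after a `std::setlocale` call to a non-English locale is one way to check whether a particular standard library follows the C locale.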

Using the current C locale is problematic for two reasons. First, it is inconsistent with other C++ APIs, which normally use C++ locales. Second, because the C locale is global state, it may change (possibly from another thread) between the time the message is obtained and the time it is consumed, which may lead to mojibake. If the locale encoding is used at all, there should at the very least be a mechanism that captures the encoding information in a race-free manner and communicates it to the caller, although it is better not to use the locale encoding in the first place.
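To make the race concrete, the sketch below shows what a caller has to do today to keep a message together with the locale it was produced under. It is not a proposed API: `tagged_message`, `locale_mutex` and `message_with_locale` are hypothetical names, and the lock is only effective under the assumption that every `setlocale` call in the program takes the same mutex, which the library cannot enforce.

```cpp
#include <clocale>
#include <mutex>
#include <string>
#include <system_error>

// Hypothetical pairing of a message with the LC_CTYPE locale name that
// was active when the message was produced.
struct tagged_message {
    std::string text;
    std::string ctype_locale;
};

// Hypothetical process-wide lock. It prevents the race only if all code
// that calls setlocale also acquires it (an assumption of this sketch).
inline std::mutex& locale_mutex() {
    static std::mutex m;
    return m;
}

inline tagged_message message_with_locale(const std::error_code& ec) {
    std::lock_guard<std::mutex> lock(locale_mutex());
    // Passing nullptr queries the current locale name without changing it.
    const char* name = std::setlocale(LC_CTYPE, nullptr);
    return {ec.message(), name ? name : ""};
}
```

Without such cooperation, the locale can change between the `setlocale` query and the `message` call, which is the hazard described above.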

This is somewhat related to LWG 4087 but should probably be addressed first because it may affect how some exceptions are defined.

The proposed resolution is similar to that of LWG 4087.

History
Date User Action Args
2024-09-18 21:00:19 admin set messages: + msg14378
2024-09-18 20:24:10 admin set messages: + msg14375
2024-09-18 00:00:00 admin create