Created on 2007-04-05.00:00:00 last changed 171 months ago
Rationale:
post-Toronto: Changed from New to NAD at the request of the author. The preferred solution of N2327 makes this resolution obsolete.
Proposed resolution:
Change [facet.num.get.virtuals]:
Stage 3: The result of stage 2 processing can be one of
- A sequence of chars has been accumulated in stage 2 that is converted (according to the rules of scanf) to a value of the type of val. This value is stored in val and ios_base::goodbit is stored in err.
- The sequence of chars accumulated in stage 2 would have caused scanf to report an input failure. ios_base::failbit is assigned to err.
In the first case, digit grouping is checked. That is, the positions of discarded separators is examined for consistency with use_facet<numpunct<charT> >(loc).grouping(). If they are not consistent then ios_base::failbit is assigned to err. Otherwise, the value that was converted in stage 2 is stored in val and ios_base::goodbit is stored in err.
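The control flow of the proposed Stage 3 can be sketched as follows. This is a minimal model, not library code: it reads the wording as storing the value only once the grouping check has passed (the closing "Otherwise" sentence), and the names proposed_stage3, converted_ok and grouping_ok are hypothetical stand-ins for the results of the scanf-style conversion and of the grouping() consistency check.

```cpp
#include <cassert>
#include <ios>

// Hypothetical model of the proposed Stage 3. The two flags stand in for the
// outcome of the scanf-style conversion and of the grouping() consistency check.
template <class T>
void proposed_stage3(bool converted_ok, bool grouping_ok, T converted,
                     T& val, std::ios_base::iostate& err)
{
    if (!converted_ok || !grouping_ok) {
        err = std::ios_base::failbit;  // failure: val is left untouched
        return;
    }
    val = converted;                   // stored only after both checks pass
    err = std::ios_base::goodbit;
}

void proposed_stage3_demo()
{
    long val = 7;  // sentinel
    std::ios_base::iostate err = std::ios_base::goodbit;

    proposed_stage3(true, false, 123456L, val, err);  // grouping mismatch
    assert(err == std::ios_base::failbit && val == 7);

    proposed_stage3(true, true, 123456L, val, err);   // well-formed input
    assert(err == std::ios_base::goodbit && val == 123456L);
}
```

Under this reading, failbit and "val unchanged" always go together, which is the consistency the issue asks for.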
From Section [facet.num.get.virtuals], paragraphs 11 and 12, it is implied
that the value read from a stream must be stored even if the placement of
thousands separators does not conform to the grouping() specification from
the numpunct facet.
Since incorrectly-placed thousands separators are flagged as an extraction
failure (by means of failbit), we believe it is better not to store the
value. A consistent strategy, in which any kind of extraction failure leaves
the input item intact, is conceptually cleaner, avoids corner-case traps, and
is also more understandable from the programmer's point of view.
Here is a quote from "The C++ Programming Language (Special Edition)" by B. Stroustrup (Section D.4.2.3, pg. 897):
"If a value of the desired type could not be read, failbit is set in r. [...] An input operator will use r to determine how to set the state of its stream. If no error was encountered, the value read is assigned through v; otherwise, v is left unchanged."
This statement implies that rdstate() alone is sufficient to determine
whether an extracted value is to be assigned to the input item val passed to
do_get. However, this is in disagreement with the current C++ Standard. The
above-mentioned assumption is true in all cases, except when there are
mismatches in digit grouping. In the latter case, the parsed value is assigned
to val, and, at the same time, ios_base::failbit is assigned to err
(essentially "lying" about the success of the operation). Is this intentional?
The current behavior raises both consistency and usability concerns.
Although digit grouping is outside the scope of scanf (on which the virtual
methods of num_get are based), handling of grouping should be consistent with
the overall behavior of scanf. The specification of scanf makes a distinction
between input failures and matching failures, and yet both kinds of failures
have no effect on the input items passed to scanf. A mismatch in digit
grouping logically falls in the category of matching failures, and it would be
more consistent, and less surprising to the user, to leave the input item
intact whenever a failure is being signaled.
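The scanf behavior referred to above can be observed with sscanf. The sketch below is only an illustration; scan_int and scanf_failure_demo are hypothetical helper names, and -1 is an arbitrary sentinel.

```cpp
#include <cassert>
#include <cstdio>

// Thin wrapper: returns what sscanf reports for a single "%d" conversion.
int scan_int(const char* text, int& item)
{
    return std::sscanf(text, "%d", &item);
}

void scanf_failure_demo()
{
    int item = -1;                       // sentinel

    assert(scan_int("abc", item) == 0);  // matching failure: nothing assigned
    assert(item == -1);

    assert(scan_int("", item) == EOF);   // input failure: input ends before any conversion
    assert(item == -1);

    assert(scan_int("42", item) == 1);   // success: exactly one item assigned
    assert(item == 42);
}
```

In both failure cases the return value alone tells the caller that the input item was not written.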
The extraction of bool is another example outside the scope of scanf, and yet
it is handled consistently: the input item is left intact even in the event of
a successful extraction of a long but a failed conversion from long to bool.
The inconsistency is further aggravated by the fact that, when failbit is set,
subsequent extraction operations are no-ops until failbit is explicitly
cleared. Assuming that there is no explicit handling of rdstate() (as in
cin >> i >> j), it is counter-intuitive to be able to extract an integer with
mismatched digit grouping, but to be unable to extract another,
properly-formatted integer that immediately follows.
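The chained-extraction scenario can be reproduced with a small test facet. This is a sketch under stated assumptions: grouped_punct, extraction_result and extract_two are hypothetical names, the facet uses ',' as the separator with groups of three digits, and -1 is an arbitrary sentinel. The value left in the first item after a grouping mismatch is exactly what this issue is about, so the sketch deliberately does not rely on it.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: ',' as thousands separator, groups of three digits.
struct grouped_punct : std::numpunct<char> {
    char do_thousands_sep() const override { return ','; }
    std::string do_grouping() const override { return "\3"; }
};

struct extraction_result { bool failed; int first; int second; };

// Extracts two integers in one chained expression, as in cin >> i >> j.
extraction_result extract_two(const std::string& text)
{
    std::istringstream in(text);
    in.imbue(std::locale(in.getloc(), new grouped_punct));  // locale owns the facet
    extraction_result r = { false, -1, -1 };                // -1 is a sentinel
    in >> r.first >> r.second;
    r.failed = in.fail();
    return r;
}

void chained_extraction_demo()
{
    extraction_result ok = extract_two("123,456 789");
    assert(!ok.failed && ok.first == 123456 && ok.second == 789);

    extraction_result bad = extract_two("1,23,456 789");
    assert(bad.failed);        // grouping mismatch sets failbit
    assert(bad.second == -1);  // the well-formed 789 is never extracted
}
```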
Moreover, setting failbit, and selectively assigning a value to the input
item, raises usability problems. Both the strategy of scanf (where there is no
extracted value in case of failure) and the strategy of the strtol family
(where there is always an extracted value, and there are well-defined defaults
in case of a failure) are easy to understand and easy to use. On the other
hand, if failbit alone cannot consistently distinguish between a failed
extraction, and a successful but not-quite-correct extraction whose output
happens to be the same as the previous value, the programmer must resort to
implementation tricks.
Consider the following example:
    int i = old_i;
    cin >> i;
    if (cin.fail())
        // can the value of i be trusted?
        // what does it mean if i == old_i?
        // ...
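For comparison, the strtol strategy mentioned above always produces a value and reports separately whether a conversion took place. The following is a minimal sketch; parse_result and parse_long are hypothetical names, and range errors (ERANGE) are deliberately ignored.

```cpp
#include <cassert>
#include <cstdlib>

struct parse_result { long value; bool converted; };

// strtol always yields a value; the end pointer tells whether any
// characters were actually consumed by the conversion.
parse_result parse_long(const char* text)
{
    char* end = nullptr;
    long value = std::strtol(text, &end, 10);
    return { value, end != text };
}

void strtol_demo()
{
    parse_result none = parse_long("abc");
    assert(!none.converted && none.value == 0);  // well-defined default on failure

    parse_result some = parse_long("42abc");
    assert(some.converted && some.value == 42);  // leading digits are converted
}
```

Either convention lets the caller decide from a single indicator whether the result can be trusted, which failbit alone cannot do in the grouping-mismatch case.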
Last but not least, the current behavior is not only confusing to the casual
reader, but it has also been confusing to some book authors. Besides
Stroustrup's book, other books (e.g. "Standard C++ IOStreams and Locales" by
Langer and Kreft) describe the same mistaken assumption. Although books are
not to be used instead of the standard reference, the readers of these books,
as well as the people who are generally familiar with scanf, are even more
likely to misinterpret the standard, and expect the input items to remain
intact when a failure occurs.
History:
Date | User | Action | Args
---|---|---|---
2010-10-21 18:28:33 | admin | set | messages: + msg3365
2010-10-21 18:28:33 | admin | set | messages: + msg3364
2007-04-05 00:00:00 | admin | create |