Issue 459: Requirement for widening in stage 2 is overspecification

Title: Requirement for widening in stage 2 is overspecification
Status: nad
Section: [facet.num.get.virtuals]
Submitter: Martin Sebor

Created on 2004-03-16.00:00:00 last changed 191 months ago

Messages

msg2714

Date: 2010-10-21.18:28:33

Proposed resolution:

Change stage 2 so that implementations are permitted to use either technique to perform the comparison:

1. call widen on the atoms and compare (either by using operator== or char_traits<charT>::eq) the input with the widened atoms, or
2. call narrow on the input and compare the narrow input with the atoms
3. do (1) or (2) only if charT is not char or wchar_t, respectively; i.e., avoid calling widen or narrow if the source and destination types are the same
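
As an illustration only, techniques (1) and (2) could be sketched as below for charT = wchar_t. The function names find_by_widen and find_by_narrow are hypothetical, and the atom string is the one Stage 2 is assumed to use; this is not taken from any existing implementation.

```cpp
#include <cstddef>
#include <locale>

// Atoms assumed for Stage 2 of num_get (digits, hex letters, 'x'/'X', signs).
static const char atoms[] = "0123456789abcdefxABCDEFX+-";
static const std::size_t atom_count = sizeof atoms - 1;  // exclude the '\0'

// Technique (1): widen each atom and compare it with the input character.
// Returns the index of the matching atom, or -1 if no atom matches.
int find_by_widen(const std::ctype<wchar_t>& ct, wchar_t in) {
    for (std::size_t i = 0; i < atom_count; ++i)
        if (ct.widen(atoms[i]) == in)
            return static_cast<int>(i);
    return -1;
}

// Technique (2): narrow the input once and compare it with the atoms.
// '\0' is used as the "cannot be narrowed" default; it is not an atom.
int find_by_narrow(const std::ctype<wchar_t>& ct, wchar_t in) {
    const char c = ct.narrow(in, '\0');
    if (c == '\0')
        return -1;
    for (std::size_t i = 0; i < atom_count; ++i)
        if (atoms[i] == c)
            return static_cast<int>(i);
    return -1;
}
```

With the ctype<wchar_t> facet of the classic locale, both functions are expected to agree for characters of the basic source character set; they can differ only when a user-defined facet overrides do_widen or do_narrow inconsistently.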

msg2713

Date: 2010-10-21.18:28:33

[ 2009-07 Frankfurt ]

NAD. The standard is clear enough as written.

msg2712

Date: 2010-10-21.18:28:33

[ Sydney: To a large extent this is a nonproblem. As long as you're only trafficking in char and wchar_t we're only dealing with a stable character set, so you don't really need either 'widen' or 'narrow': can just use literals. Finally, it's not even clear whether widen-vs-narrow is the right question; arguably we should be using codecvt instead. ]
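
A minimal sketch of the "just use literals" idea from the Sydney note follows. The names atom_table and atom_index are hypothetical. Note that this bypasses the locale's ctype facet entirely, so it behaves differently from the wording in the standard when a user-defined facet overrides do_widen.

```cpp
// Hypothetical helper: selects a literal atom table for the character type,
// so neither widen() nor narrow() is needed for char and wchar_t.
template <class CharT> struct atom_table;

template <> struct atom_table<char> {
    static const char* get() { return "0123456789abcdefxABCDEFX+-"; }
};

template <> struct atom_table<wchar_t> {
    static const wchar_t* get() { return L"0123456789abcdefxABCDEFX+-"; }
};

// Returns the index of ch in the atom table, or -1 if ch is not an atom.
template <class CharT>
int atom_index(CharT ch) {
    const CharT* atoms = atom_table<CharT>::get();
    for (int i = 0; atoms[i] != CharT(); ++i)
        if (atoms[i] == ch)
            return i;
    return -1;
}
```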

msg2711

Date: 2004-03-16.00:00:00

When parsing strings of wide-character digits, the standard requires the library to widen narrow-character "atoms" and compare the widened atoms against the characters that are being parsed. Simply narrowing the wide characters would be far simpler, and probably more efficient. The two choices are equivalent except in convoluted test cases, and many implementations already ignore the standard and use narrow instead of widen.

First, I disagree that using narrow() instead of widen() would necessarily have unfortunate performance implications. The following possible implementation of narrow() would let num_get be implemented much more simply, and arguably about as efficiently as with widen(), because it avoids a virtual call to do_narrow on every invocation:

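  // Assumes <climits> and <typeinfo> are included, and that narrow_ is a
  // mutable member array of UCHAR_MAX + 1 entries, each initialized to a
  // negative value meaning "not yet cached".
  inline char ctype<wchar_t>::narrow (wchar_t wc, char dflt) const
  {
      const unsigned wi = unsigned (wc);

      // Outside the cached range: the base facet yields dflt; a derived
      // facet gets the virtual call.
      if (wi > UCHAR_MAX)
          return typeid (*this) == typeid (ctype<wchar_t>) ?
                 dflt : do_narrow (wc, dflt);

      // First request for this character: make the virtual call once.
      if (narrow_ [wi] < 0) {
         const char nc = do_narrow (wc, dflt);
         if (nc == dflt)
             return dflt;   // dflt may differ per call, so do not cache it
         narrow_ [wi] = nc;
      }

      return char (narrow_ [wi]);
  }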
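
The caching idea above can also be tried outside the library. The following is a sketch under stated assumptions, not the library code: narrow_cache and counting_ctype are hypothetical names, the cache wraps an existing facet rather than living inside ctype<wchar_t>, and characters above UCHAR_MAX are simply forwarded instead of using the typeid test.

```cpp
#include <algorithm>
#include <climits>
#include <iterator>
#include <locale>

// Hypothetical facet that counts virtual do_narrow calls, to make the
// effect of the cache observable.
struct counting_ctype : std::ctype<wchar_t> {
    mutable int calls;
    counting_ctype() : std::ctype<wchar_t>(1), calls(0) {}  // refs = 1: we own it
protected:
    char do_narrow(wchar_t wc, char dflt) const override {
        ++calls;
        return std::ctype<wchar_t>::do_narrow(wc, dflt);
    }
};

// Hypothetical wrapper that caches successful narrow() results for
// characters in the range [0, UCHAR_MAX].
class narrow_cache {
    const std::ctype<wchar_t>& ct_;
    short cache_[UCHAR_MAX + 1];  // -1 means "not cached yet"
public:
    explicit narrow_cache(const std::ctype<wchar_t>& ct) : ct_(ct) {
        std::fill(std::begin(cache_), std::end(cache_), short(-1));
    }

    char narrow(wchar_t wc, char dflt) {
        const unsigned long wi = static_cast<unsigned long>(wc);
        if (wi > UCHAR_MAX)
            return ct_.narrow(wc, dflt);       // not cacheable here
        if (cache_[wi] < 0) {
            const char nc = ct_.narrow(wc, dflt);
            if (nc == dflt)
                return dflt;                   // dflt varies per call: skip
            // Store as unsigned char so a cached value is never negative.
            cache_[wi] = static_cast<unsigned char>(nc);
        }
        return static_cast<char>(cache_[wi]);
    }
};
```

Narrowing the same character twice through narrow_cache should reach do_narrow only once, which is the property the member function above relies on.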

Second, I don't think the change proposed in the issue (i.e., to use narrow() instead of widen() during Stage 2) would be at all drastic. Existing implementations, with the exception of libstdc++, already use narrow(), so the impact of the change on programs would presumably be isolated to a single implementation. Further, since narrow() is not required to translate alternate wide digit representations, such as those mentioned in issue 303, to their narrow equivalents (i.e., the portable source characters '0' through '9'), the change does not necessarily imply that these alternate digits would be treated as ordinary digits and accepted as part of numbers during parsing. In fact, the requirement in [locale.ctype.virtuals], p13 forbids narrow() from translating an alternate digit character, wc, to an ordinary digit in the basic source character set unless the expression (ctype<charT>::is(ctype_base::digit, wc) == true) holds. This in turn is prohibited by the C standard (7.25.2.1.5, 7.25.2.1.5, and 5.2.1, respectively) for charT of either char or wchar_t.
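
The condition described above can be expressed as a small check against a given facet. This is a sketch only; the function name narrows_non_digit_to_digit is hypothetical, and '\0' is assumed as the default passed to narrow().

```cpp
#include <locale>

// Returns true if the facet narrows wc to one of '0'..'9' even though it
// does not classify wc as a digit, i.e. the combination that the
// requirement on narrow() rules out.
bool narrows_non_digit_to_digit(const std::ctype<wchar_t>& ct, wchar_t wc) {
    const char c = ct.narrow(wc, '\0');
    const bool is_ordinary_digit = c >= '0' && c <= '9';
    return is_ordinary_digit && !ct.is(std::ctype_base::digit, wc);
}
```

For a conforming facet this is expected to return false for every wc, including alternate digits such as U+0663 (ARABIC-INDIC DIGIT THREE).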

History
Date	User	Action	Args
2010-10-21 18:28:33	admin	set	messages: + msg2714
2010-10-21 18:28:33	admin	set	messages: + msg2713
2010-10-21 18:28:33	admin	set	messages: + msg2712
2004-03-16 00:00:00	admin	create