Title
operator==(sub_match, string) slices on embedded '\0's
Status
c++17
Section
[re.submatch.op]
Submitter
Jeffrey Yasskin

Created on 2012-11-26.00:00:00 last changed 89 months ago

Messages

Date: 2014-03-03.13:52:20

Proposed resolution:

This wording is relative to N3691.

  1. Change [re.submatch.op] as indicated:

    template <class BiIter, class ST, class SA>
      bool operator==(
        const basic_string<
          typename iterator_traits<BiIter>::value_type, ST, SA>& lhs,
        const sub_match<BiIter>& rhs);
    

    -7- Returns: rhs.compare(lhs.c_str()typename sub_match<BiIter>::string_type(lhs.data(), lhs.size())) == 0.

    […]

    template <class BiIter, class ST, class SA>
      bool operator<(
        const basic_string<
          typename iterator_traits<BiIter>::value_type, ST, SA>& lhs,
        const sub_match<BiIter>& rhs);
    

    -9- Returns: rhs.compare(lhs.c_str()typename sub_match<BiIter>::string_type(lhs.data(), lhs.size())) > 0.

    […]

    template <class BiIter, class ST, class SA>
      bool operator==(const sub_match<BiIter>& lhs,
                      const basic_string<
                        typename iterator_traits<BiIter>::value_type, ST, SA>& rhs);
    

    -13- Returns: lhs.compare(rhs.c_str()typename sub_match<BiIter>::string_type(rhs.data(), rhs.size())) == 0.

    […]

    template <class BiIter, class ST, class SA>
      bool operator<(const sub_match<BiIter>& lhs,
                     const basic_string<
                       typename iterator_traits<BiIter>::value_type, ST, SA>& rhs);
    

    -15- Returns: lhs.compare(rhs.c_str()typename sub_match<BiIter>::string_type(rhs.data(), rhs.size())) < 0.

Date: 2014-02-15.00:00:00

[ 2014-02-15 post-Issaquah session : move to Tentatively Ready ]

Date: 2013-10-17.00:00:00

[ 2013-10-17: Daniel provides concrete wording ]

The original wording was suggested to reduce the need to allocate memory during comparisons. The specification would be very much easier, if sub_match would provide an additional compare overload of the form:

int compare(const value_type* s, size_t n) const;

But given the fact that currently all of basic_string's compare overloads are defined in terms of temporary string constructions, the following proposed wording does follow the same string-construction route as basic_string does (where needed to fix the embedded zeros issue) and to hope that existing implementations ignore to interpret this semantics in the literal sense.

I decided to use the second replacement form

Returns: rhs.compare(sub_match<BiIter>::string_type(lhs.data(), lhs.size())) == 0.

because it already reflects the existing style used in [re.submatch.op] p31.

Date: 2012-11-28.21:54:22

[ Daniel: ]

This wording change was done intentionally as of LWG 1181, but the here mentioned slicing effect was not considered at that time. It seems best to use another overload of compare to fix this problem:

Returns: rhs.str().compare(0, rhs.length(), lhs.data(), lhs.size()) == 0.

or

Returns: rhs.compare(sub_match<BiIter>::string_type(lhs.data(), lhs.size())) == 0.

Date: 2012-11-26.00:00:00
template <class BiIter, class ST, class SA>
  bool operator==(
    const basic_string<
      typename iterator_traits<BiIter>::value_type, ST, SA>& lhs,
    const sub_match<BiIter>& rhs);

is specified as:

Returns: rhs.compare(lhs.c_str()) == 0.

This is odd because sub_match::compare(basic_string) is defined to honor embedded '\0' characters. This could allow a sub_match to == or != a std::string unexpectedly.

History
Date User Action Args
2017-07-30 20:15:43adminsetstatus: wp -> c++17
2014-11-08 19:44:42adminsetstatus: voting -> wp
2014-11-04 10:26:50adminsetstatus: ready -> voting
2014-03-03 13:52:20adminsetmessages: + msg6891
2014-03-03 13:52:20adminsetstatus: new -> ready
2013-10-17 21:42:45adminsetmessages: + msg6752
2013-10-17 21:42:45adminsetmessages: + msg6751
2012-11-28 21:54:22adminsetmessages: + msg6286
2012-11-26 00:00:00admincreate