Title
Transparent lookups in unordered containers are inconsistent
Status
nad
Section
[unord.req]
Submitter
Marshall Clow

Created on 2020-07-23.00:00:00 last changed 44 months ago

Messages

Date: 2020-08-21.00:00:00

[ 2020-08-21 Issue processing telecon: NAD based on reflector discussion. Status changed: New → NAD. ]

Date: 2020-07-23.00:00:00

In C++14, we added "transparent lookups" into the ordered associative containers. This was sold as an efficiency concern, as removing the need to create temporary objects just to compare against.

However, people found clever ways to use this. One of them, in fact, was in the original paper, and was the subject of a question on Stack Overflow.

This all works because the elements in the ordered associative containers are, well, ordered.

For C++20, we added this facility to the unordered containers.

Consider the following code:

#include <unordered_set>
#include <string>
#include <iostream>

struct DumbHash // put everything in the same bucket
{
  using is_transparent = void;

  template<typename T>
  size_t operator()(const T&) const { return 0; }
};

struct CompareEQ 
{
  using is_transparent = void;

  bool operator()(const std::string& lhs, const std::string& rhs) const
  { return lhs == rhs; }

  bool operator()(const std::string& lhs, char rhs) const
  { return !lhs.empty() && (lhs[0] == rhs); }

  bool operator()(char lhs, const std::string& rhs) const
  { return !rhs.empty() && (lhs == rhs[0]); }
};

int main () 
{
  const char* one[] = {"a", "b",  "c", "d",  "e", "bb"};
  const char* two[] = {"b", "e",  "d", "bb", "c", "a"};
  const char* thr[] = {"b", "bb", "a", "c",  "d", "e"};

  typedef std::unordered_set<std::string, DumbHash, CompareEQ> MS;
  MS m1{std::begin(one), std::end(one)};
  MS m2{std::begin(two), std::end(two)};
  MS m3{std::begin(thr), std::end(thr)};

  for (const auto& s: m1) 
    std::cout << s << ' '; 
  std::cout << std::endl;
  for (const auto& s: m2) 
    std::cout << s << ' '; 
  std::cout << std::endl;
  for (const auto& s: m3) 
    std::cout << s << ' '; 
  std::cout << std::endl;

  std::cout << "m1:" << m1.count('b') << ' ';
  std::cout << "m2:" << m2.count('b') << ' ';
  std::cout << "m3:" << m3.count('b');
}

When I run this program on my Mac, I get the following output:

bb e d c b a
a c bb d e b
e d c a bb b
m1:1 m2:1 m3:2

(This is using unreleased code, but I have confirmed this with VS2019's unordered_multiset.)

This is clearly bad; three containers, containing the same elements, doing the same lookups, giving different results. This also applies to the transparent versions of find, equal_range, and contains.

The problem is that the elements in the unordered containers are only partially ordered; i.e, all the elements that are equal (according to the non-transparent version of the comparison predicate) are adjacent in the container, but are unordered relative to the other elements in the container.

My recommendation is to declare this all UB.

Suggested resolution:

Add a precondition to all of the transparent lookup functions for the unordered containers forbidding stuff like this. This should probably go in [unord.req], maybe at the end of Table [tab:container.hash.req].

Precondition: The value being searched matches at most one unique key in the container.

History
Date User Action Args
2020-08-21 17:46:07adminsetmessages: + msg11442
2020-08-21 17:46:07adminsetstatus: new -> nad
2020-07-23 00:00:00admincreate