Title
Awkward interface for changing the number of buckets in an unordered associative container
Status
c++11
Section
[unord.req][unord]
Submitter
Matt Austern

Created on 2009-08-10.00:00:00 last changed 153 months ago

Messages

Date: 2010-10-21.18:28:33

Proposed resolution:

In [unord.req]/9, table 98, append a new row after the last one:

Table 87 — Unordered associative container requirements (in addition to container)
ExpressionReturn typeAssertion/note pre-/post-condition Complexity
a.rehash(n) void Post: a.bucket_count > a.size() / a.max_load_factor() and a.bucket_count() >= n. Average case linear in a.size(), worst case quadratic.
a.reserve(n) void Same as a.rehash(ceil(n / a.max_load_factor())) Average case linear in a.size(), worst case quadratic.

In [unord.map]/3 in the definition of class template unordered_map, in [unord.multimap]/3 in the definition of class template unordered_multimap, in [unord.set]/3 in the definition of class template unordered_set and in [unord.multiset]/3 in the definition of class template unordered_multiset, add the following line after member function rehash():

void reserve(size_type n);
Date: 2009-10-28.00:00:00

[ 2009-10-28 Howard: ]

Moved to Tentatively Ready after 5 votes in favor of Ganesh's option 2 above. The original proposed wording now appears here:

Informally: instead of providing rehash(n) provide resize(n), with the semantics "make the container a good size for n elements".

In the unordered associative container requirements ([unord.req]), remove the row for rehash and replace it with:

Table 87 — Unordered associative container requirements (in addition to container)
ExpressionReturn typeAssertion/note pre-/post-condition Complexity
a.rehashresize(n) void Post: a.bucket_count > max(a.size(), n) / a.max_load_factor() and a.bucket_count() >= n. Average case linear in a.size(), worst case quadratic.

Make the corresponding change in the class synopses in [unord.map], [unord.multimap], [unord.set], and [unord.multiset].

Date: 2009-10-28.00:00:00

[ 2009-10-28 Ganesh summarizes alternative resolutions and expresses a strong preference for the second (and opposition to the first): ]

  1. In the unordered associative container requirements ([unord.req]), remove the row for rehash and replace it with:

    Table 87 — Unordered associative container requirements (in addition to container)
    ExpressionReturn typeAssertion/note pre-/post-condition Complexity
    a.rehashreserve(n) void Post: a.bucket_count > max(a.size(), n) / a.max_load_factor() and a.bucket_count() >= n. Average case linear in a.size(), worst case quadratic.

    Make the corresponding change in the class synopses in [unord.map], [unord.multimap], [unord.set], and [unord.multiset].

  2. In [unord.req]/9, table 98, append a new row after the last one:

    Table 87 — Unordered associative container requirements (in addition to container)
    ExpressionReturn typeAssertion/note pre-/post-condition Complexity
    a.rehash(n) void Post: a.bucket_count > a.size() / a.max_load_factor() and a.bucket_count() >= n. Average case linear in a.size(), worst case quadratic.
    a.reserve(n) void Same as a.rehash(ceil(n / a.max_load_factor())) Average case linear in a.size(), worst case quadratic.

    In [unord.map]/3 in the definition of class template unordered_map, in [unord.multimap]/3 in the definition of class template unordered_multimap, in [unord.set]/3 in the definition of class template unordered_set and in [unord.multiset]/3 in the definition of class template unordered_multiset, add the following line after member function rehash():

    void reserve(size_type n);
    
Date: 2009-09-30.00:00:00

[ 2009-09-30 Daniel adds: ]

I recommend to replace "resize" by a different name like "reserve", because that would better match the intended use-case. Rational: Any existing resize function has the on-success post-condition that the provided size is equal to size(), which is not satisfied for the proposal. Reserve seems to fit the purpose of the actual renaming suggestion.

Date: 2009-08-10.00:00:00

Consider a typical use case: I create an unordered_map and then start adding elements to it one at a time. I know that it will eventually need to store a few million elements, so, for performance reasons, I would like to reserve enough capacity that none of the calls to insert will trigger a rehash.

Unfortunately, the existing interface makes this awkward. The user naturally sees the problem in terms of the number of elements, but the interface presents it as buckets. If m is the map and n is the expected number of elements, this operation is written m.rehash(n / m.max_load_factor()) — not very novice friendly.

History
Date User Action Args
2011-08-23 20:07:26adminsetstatus: wp -> c++11
2010-10-21 18:28:33adminsetmessages: + msg1073
2010-10-21 18:28:33adminsetmessages: + msg1072
2010-10-21 18:28:33adminsetmessages: + msg1071
2010-10-21 18:28:33adminsetmessages: + msg1070
2009-08-10 00:00:00admincreate