Title
Simplification of seed_seq::seq_seq
Status
resolved
Section
[rand.util.seedseq]
Submitter
Charles Karney

Created on 2008-02-22.00:00:00 last changed 171 months ago

Messages

Date: 2011-04-27.21:19:46

Rationale:

Addressed by N2836 "Wording Tweaks for Concept-enabled Random Number Generation in C++0X".

Date: 2010-10-21.18:28:33

Proposed resolution:

Change [rand.util.seedseq]:

template<class InputIterator, 
  size_t u = numeric_limits<iterator_traits<InputIterator>::value_type>::digits>
  seed_seq(InputIterator begin, InputIterator end);

-5- Requires: InputIterator shall satisfy the requirements of an input iterator (24.1.1) such that iterator_traits<InputIterator>::value_type shall denote an integral type.

-6- Constructs a seed_seq object by the following algorithm rearranging some or all of the bits of the supplied sequence [begin,end) of w-bit quantities into 32-bit units, as if by the following:

First extract the rightmost u bits from each of the n = end - begin elements of the supplied sequence and concatenate all the extracted bits to initialize a single (possibly very large) unsigned binary number, b = ∑n-1i=0 (begin[i] mod 2u) · 2w·i (in which the bits of each begin[i] are treated as denoting an unsigned quantity). Then carry out the following algorithm:

v.clear(); if ($w$ < 32) v.push_back($n$); for( ; $n$ > 0; --$n$) v.push_back(b mod 232), b /= 232;

for (InputIterator s = begin; s != end; ++s) v.push_back((*s) mod 232);

Date: 2010-10-21.18:28:33

[ Sophia Antipolis: ]

Marc Paterno wants portable behavior between 32bit and 64bit machines; we've gone to significant trouble to support portability of engines and their values.

Jens: the new algorithm looks perfectly portable

Marc Paterno to review off-line.

Modify the proposed resolution to read "Constructs a seed_seq object by the following algorithm ..."

Disposition: move to review; unanimous consent.

(moots 782 and 800)

Date: 2010-10-21.18:28:33

[ Bellevue: ]

Walter needs to ask Fermilab for guidance. Defer till tomorrow. Bill likes the proposed resolution.

Date: 2008-02-22.00:00:00

seed_seq(InputIterator begin, InputIterator end); constructs a seed_seq object repacking the bits of supplied sequence [begin, end) into a 32-bit vector.

This repacking triggers several problems:

  1. Distinctness of the output of seed_seq::generate required the introduction of the initial "if (w < 32) v.push_back(n);" (Otherwise the unsigned short vectors [1, 0] and [1] generate the same sequence.)
  2. Portability demanded the introduction of the template parameter u. (Otherwise some sequences could not be obtained on computers where no integer types are exactly 32-bits wide.)
  3. The description and algorithm have become unduly complicated.

I propose simplifying this seed_seq constructor to be "32-bit only". Despite it's being simpler, there is NO loss of functionality (see below).

Here's how the description would read

[rand.util.seedseq] Class seed_seq

template<class InputIterator>
  seed_seq(InputIterator begin, InputIterator end);

5 Requires: NO CHANGE

6 Effects: Constructs a seed_seq object by

for (InputIterator s = begin; s != end; ++s) v.push_back((*s) mod 232);

Discussion:

The chief virtues here are simplicity, portability, and generality.

  • Simplicity — compare the above specification with the n2461 proposal.
  • Portability — with iterator_traits<InputIterator>::value_type = uint_least32_t the user is guaranteed to get the same behavior across platforms.
  • Generality — any behavior that the n2461 proposal can achieve can be obtained with this simpler proposal (albeit with a shuffling of bits in the input sequence).

Arguments (and counter-arguments) against making this change (and retaining the n2461 behavior) are:

  • The user can pass an array of unsigned char and seed_seq will nicely repack it.

    Response: So what? Consider the seed string "ABC". The n2461 proposal results in

    v = { 0x3, 0x434241 };
    

    while the simplified proposal yields

    v = { 0x41, 0x42, 0x43 };
    

    The results produced by seed_seq::generate with the two inputs are different but nevertheless equivalently "mixed up" and this remains true even if the seed string is long.

  • With long strings (e.g., with bit-length comparable to the number of bits in the state), v is longer (by a factor of 4) with the simplified proposal and seed_seq::generate will be slower.

    Response: It's unlikely that the efficiency of seed_seq::generate will be a big issue. If it is, the user is free to repack the seed vector before constructing seed_seq.

  • A user can pass an array of 64-bit integers and all the bits will be used.

    Response: Indeed. However, there are many instances in the n2461 where integers are silently coerced to a narrower width and this should just be a case of the user needing to read the documentation. The user can of course get equivalent behavior by repacking his seed into 32-bit pieces. Furthermore, the unportability of the n2461 proposal with

    unsigned long s[] = {1, 2, 3, 4};
    seed_seq q(s, s+4);
    

    which typically results in v = {1, 2, 3, 4} on 32-bit machines and in v = {1, 0, 2, 0, 3, 0, 4, 0} on 64-bit machines is a major pitfall for unsuspecting users.

Note: this proposal renders moot issues 782 and 800.

History
Date User Action Args
2010-12-05 14:14:49adminsetstatus: nad editorial -> resolved
2010-10-21 18:28:33adminsetmessages: + msg3832
2010-10-21 18:28:33adminsetmessages: + msg3831
2010-10-21 18:28:33adminsetmessages: + msg3830
2010-10-21 18:28:33adminsetmessages: + msg3829
2008-02-22 00:00:00admincreate