Title
Issue with internal counter in discard_block_engine
Status
ready
Section
[rand.adapt.disc]
Submitter
Ilya Burylov

Created on 2021-06-03.00:00:00 last changed 2 days ago

Messages

Date: 2021-06-14.14:09:34

Proposed resolution:

This wording is relative to N4885.

  1. Modify [rand.adapt.disc], class template discard_block_engine synopsis, as indicated:

      […]
      // generating functions
      result_type operator()();
      void discard(unsigned long long z);
      
      // property functions
      const Engine& base() const noexcept { return e; };
      
    private:
      Engine e; // exposition only
      intsize_t n; // exposition only
    };
    
Date: 2021-06-15.00:00:00

[ 2021-06-14; Reflector poll ]

Set status to Tentatively Ready after six votes in favour during reflector poll.

Date: 2021-06-03.00:00:00

A discard_block_engine engine adaptor is described in [rand.adapt.disc]. It produces random numbers from the underlying engine discarding (p - r) last values of p - size block, where r and p are template parameters of std::size_t type:

template<class Engine, size_t p, size_t r>
class discard_block_engine;

The transition algorithm for discard_block_engine is described as follows (paragraph 2):

The transition algorithm discards all but r > 0 values from each block of p = r values delivered by &escr;. The state transition is performed as follows: If n = r, advance the state of e from ei to ei+p-r and set n to 0. In any case, then increment n and advance e's then-current state ej to ej+1.

Where n is of integer type. In the API of discard block engine, n is represented in the following way:

[…]
int n; // exposition only

In cases where int is equal to int32_t, overflow is possible for n that leads to undefined behavior. Such situation can happen when the p and r template parameters exceed INT_MAX.

This misleading exposition block leads to differences in implementations:

  • GNU Libstdc++ uses size_t for n — in this case no overflow happened even if template parameters exceed INT_MAX

  • LLVM Libcxx uses int for n together with a static_assert that is checking that p and r values are <= INT_MAX

Such difference in implementation makes code not portable and may potentially breaks random number sequence consistency between different implementors of C++ std lib.

The problematic use case is the following one:

#include <iostream>
#include <random>
#include <limits>

int main() {
  std::minstd_rand0 generator;

  constexpr std::size_t skipped_size = static_cast<std::size_t>(std::numeric_limits<int>::max());
  constexpr std::size_t block_size = skipped_size + 5;
  constexpr std::size_t used_block = skipped_size;

  std::cout << "block_size = " << block_size << std::endl;
  std::cout << "used_block = " << used_block << "\n" << std::endl;

  std::discard_block_engine<std::minstd_rand0, block_size, used_block> discard_generator;

  // Call discard procedures
  discard_generator.discard(used_block);
  generator.discard(block_size);

  // Call generation. Results should be equal
  for (std::int32_t i = 0; i < 10; ++i)
  {
    std::cout << discard_generator() << " should be equal " << generator() << std::endl;
  }
}

We see no solid reason for n to be an int, given that the relevant template parameters are std::size_t. It seems like a perfectly fine use case to generate random numbers in amounts larger than INT_MAX.

The proposal is to change exposition text to std::size_t:

size_t n; // exposition only

It will not mandate the existing libraries to break ABI, but at least guide for better implementation.

History
Date User Action Args
2021-06-14 14:09:34adminsetmessages: + msg11929
2021-06-14 14:09:34adminsetstatus: new -> ready
2021-06-06 12:27:07adminsetmessages: + msg11870
2021-06-03 00:00:00admincreate