Issue 2524: generate_canonical can occasionally return 1.0

Title: generate_canonical can occasionally return 1.0
Status: resolved
Section: [rand.dist.pois.exp]
Submitter: Michael Prähofer

Created on 2015-08-20.00:00:00 last changed 2 weeks ago

Messages

msg9640 (view)

Date: 2024-04-02.00:00:00

[ 2024-04-02 Kona 2023; Resolved by P0952R2. Status changed: Open → Resolved. ]

msg9639 (view)

Date: 2018-08-22.12:55:05

[ 2018-08 Batavia Monday issue discussion ]

Davis has a paper P0952 which resolves this.

msg9530 (view)

Date: 2017-11-10.03:05:33

[ 2017-11 Albuquerque Wednesday issue processing ]

Choice: Rerun the algorithm if it gets 1.0.

Thomas K to provide wording; Marshall and STL to review.

msg7614 (view)

Date: 2015-11-04.16:49:21

[ 2015-10, Kona Saturday afternoon ]

Options:

1) NAD, make it work. E.g. nudge slightly, making the hightest result return the highest allowable value (next_after). This might change the distribution slightly.
2) Re-run the algorithm when you get a 1. This changes the specification in that it makes extra calls to the random engine.
3) Something else

WEB: The one thing we cannot tolerate is any output range other than [0, 1).

WEB: I believe there may be a documented algorithm for the generator, and perhaps it's possible to discover en-route that the algorithm produces the wrong result and fix it.

MC: No. I analyzed this once, and here it is: the algorithm is in [rand.util.canonical], and it's all fine until p5. The expression S/R^k is mathematically less than one, but it may round to one.

GR: Could we change the rounding mode for the computation?

HH: No, because the rounding mode is global, not thread-local.

AM: SG1 wants to get rid of the floating point environment.

STL: The problem is that the standard specifies the implementation, and the implementation doesn't work.

MC: I'm not sure if nudging it down will introduce a subtle bias.

EF: I worry about how the user's choice of floating point environment affects the behaviour.

MS offers to run the topic past colleagues.

MC: Will set the status to open. STL wants to rename the issue. WEB wants to be able to find the issue by its original name still.

Mike Spertus to run the options past his mathematical colleagues, and report back.

msg7498 (view)

Date: 2015-11-04.16:49:21

Original title was: exponential_distribution<float> sometimes returns inf.

The random number distribution class template exponential_distribution<float> may return "inf" as can be seen from the following example program:

// compiled with
// g++ -std=c++11 Error_exp_distr.cpp

#include <iostream>
#include <random>
#include <bitset>

int main(){
  unsigned long long h;
  std::mt19937_64 mt1(1);
  std::mt19937_64 mt2(1);
  mt1.discard(517517);
  mt2.discard(517517);
  std::exponential_distribution<float> dis(1.0);
  h = mt2();
  std::cout << std::bitset<64>(h) << " " << (float) -log(1 - h/pow(2, 64)) << " " 
            << -log(1 - (float) h/pow(2, 64)) << " " << dis(mt1) << std::endl;
  h = mt2();
  std::cout << std::bitset<64>(h) << " " << (float) -log(1 - h/pow(2, 64)) << " " 
            << -log(1 - (float) h/pow(2, 64)) << " " << dis(mt1) << std::endl;
}

output:

0110010110001001010011000111000101001100111110100001110011100001 0.505218 0.505218 0.505218
1111111111111111111111111101010011000110011110011000110101100110 18.4143 inf inf

The reason seems to be that converting a double x in the range [0, 1) to float may result in 1.0f if x is close enough to 1. I see two possibilities to fix that:

use internally double (or long double?) and then convert the result at the very end to float.
take only 24 random bits and convert them to a float x in the range [0, 1) and then return -log(1 - x).

I have not checked if std::exponential_distribution<double> has the same problem: For float on the average 1 out of 2²⁴ (~10⁷) draws returns "inf", which is easily confirmed. For double on the average 1 out of 2⁵³ (~10¹⁶) draws might return "inf", which I have not tested.

Marshall:
I don't think the problem is in std::exponential_distribution; but rather in generate_canonical.

Consider:

std::mt19937_64 mt2(1);
mt2.discard(517517);
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;
std::cout << std::hexfloat << std::generate_canonical<float, std::numeric_limits<float>::digits>(mt2) << std::endl;

which outputs:

0x1.962532p-2
0x1p+0
0x1.20d0cap-3

but generate_canonical is defined to return a result in the range [0, 1).

History
Date	User	Action	Args
2024-04-02 10:29:23	admin	set	status: open -> resolved
2018-01-25 22:57:09	admin	set	messages: + msg9640
2018-01-25 22:57:09	admin	set	messages: + msg9639
2017-11-10 03:05:33	admin	set	messages: + msg9530
2015-11-04 16:49:21	admin	set	messages: + msg7614
2015-11-04 16:49:21	admin	set	status: new -> open
2015-08-20 00:00:00	admin	create