Title
Wording for negative_binomial_distribution is unclear as a consequence of LWG 2406 resolution
Status
new
Section
[rand.dist.bern.negbin]
Submitter
Ahti Leppänen

Created on 2019-02-17.00:00:00 last changed 49 months ago

Messages

Date: 2020-03-11.00:00:00

[ 2020-03-11 Issue Prioritization ]

Priority to 3 and hand over to SG6 after reflector discussion.

Date: 2019-02-17.00:00:00

This issue has been created because a corresponding editorial change request had been rejected.

The resolution of LWG 2406 added a note to the definition of negative_binomial_distribution:

[Note: This implies that P(i|k,p) is undefined when p == 1. — end note]

This issue argues that the note is invalid, as are the premises on which LWG 2406 was based. It also argues that the current normative text allowing p == 1 is valid both conceptually and mathematically, and that it follows existing conventions in other software.

Problems with the added note:

  • Why does p == 1 imply that P(i|k,p) is undefined? The only questionable factor in the definition of P(i|k,p) seems to be that, in the case of p == 1, the factor (1 - p)^i leads to 0^0 when i == 0. While it is true that there is no generally accepted convention for what this means, std::binomial_distribution already uses the common convention 0^0 == 1 (e.g. with p == 1 && t == i, P(i|t,p) leads to 0^0).

  • Even if the term were mathematically undefined, does a non-normative note saying that a mathematical term is undefined mean that the behaviour of the program is undefined (instead of e.g. resulting in NaN) even when no preconditions are violated?

  • The note has led to an unclear situation in which a distribution object can be constructed, but calling operator() might lead to undefined behaviour even though no preconditions are violated. For example, cppreference.com notes that

    If p == 1, subsequent calls to the operator() overload that does not accept a param_type object will cause undefined behavior.

Invalidity of premises of LWG 2406:

  • For p == 1, this is "* 1^k * 0^i", so every integer i >= 0 is produced with zero probability. (Let's avoid thinking about 0^0.)

    • This is contradictory: it first assumes that 0^i == 0 for all i >= 0 (implying that 0^0 == 0), but then says not to think about 0^0. The very essence of the issue is the interpretation of 0^0, and given the definition of binomial_distribution, where 0^0 == 1, the claim "so every integer i >= 0 is produced with zero probability" can be considered faulty.

  • Wikipedia states that p must be within (0, 1), exclusive on both sides.

    • I cannot find any mention of this in the Wikipedia article's version as of 2014-06-02 (i.e. around the time when LWG 2406 was opened). Note that Wikipedia's formulation is not the same as the one in the C++ standard: in Wikipedia, the p parameter is the same (i.e. the probability of success), but the integer parameter (> 0) is the number of failures, while in C++ it is the number of successes. In the failure formulation, p == 1 is indeed invalid for essentially the same reason that p == 0 is invalid for the C++ definition (i.e. it leads to P(i|k,p) == 0 for all i).
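
The difference between the two formulations can be written out explicitly. The first formula below is the definition from [rand.dist.bern.negbin]; the second is only an assumed form of the failure formulation described above, with a hypothetical symbol r > 0 for the number of failures and i counting successes:

```latex
\[
P_{\text{C++}}(i \mid k, p) = \binom{k+i-1}{i}\, p^{k}\, (1-p)^{i},
\qquad
P_{\text{fail}}(i \mid r, p) = \binom{r+i-1}{i}\, (1-p)^{r}\, p^{i}
\]
\[
p = 1,\ r > 0:\quad (1-p)^{r} = 0 \;\Rightarrow\; P_{\text{fail}}(i \mid r, 1) = 0 \ \text{for all } i \ge 0
\]
\[
p = 0,\ k > 0:\quad p^{k} = 0 \;\Rightarrow\; P_{\text{C++}}(i \mid k, 0) = 0 \ \text{for all } i \ge 0
\]
```

In both degenerate cases the factor with the fixed positive exponent vanishes, so no 0^0 term is involved and the probabilities cannot sum to 1.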

Validity of p == 1:

  • […] distribution of the number of failures in a sequence of trials with success probability p before n successes occur.

    (from the Wolfram documentation). When p == 1, every trial succeeds, so the probability of getting 0 failures is 1, and the probability of i > 0 failures is 0. This is exactly what the mathematical definition in [rand.dist.bern.negbin] gives for p == 1 with the convention 0^0 = 1.

  • Software such as Mathematica, Matlab and R all accept p == 1 for the negative binomial distribution, and they use the integer parameter as the number of successes, like the C++ standard.
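
As an illustration of the p == 1 case, the following is a minimal sketch that evaluates the formula from [rand.dist.bern.negbin] directly. The function name negbin_pmf is hypothetical and not part of any library. The sketch relies on std::pow(0.0, 0) returning 1, which holds on implementations that follow IEC 60559 (C Annex F); that is an assumption about the platform, not something the distribution wording guarantees.

```cpp
#include <cmath>

// Hypothetical helper (illustration only, not a standard library function):
// evaluates P(i | k, p) = C(k + i - 1, i) * p^k * (1 - p)^i.
double negbin_pmf(int i, int k, double p) {
    // Binomial coefficient C(k + i - 1, i), built up as a running product.
    double coeff = 1.0;
    for (int j = 1; j <= i; ++j) {
        coeff *= static_cast<double>(k - 1 + j) / j;
    }
    // For p == 1 and i == 0 the last factor is pow(0.0, 0), which is 1
    // under IEC 60559 (assumed here), i.e. the convention 0^0 == 1.
    return coeff * std::pow(p, k) * std::pow(1.0 - p, i);
}
```

Under that assumption, negbin_pmf(0, k, 1.0) evaluates to 1 and negbin_pmf(i, k, 1.0) to 0 for every i > 0, which matches the "always succeeds" interpretation above.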

As for the reasons why p == 1 could have been considered invalid, it seems that the major implementations (namely libstdc++, libc++ and the MSVC standard library) use std::gamma_distribution inside std::negative_binomial_distribution and pass (1 - p)/p as the second argument of std::gamma_distribution. The case p == 1 is not checked, which leads to a violation of the precondition of std::gamma_distribution, which requires that argument to be > 0.
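
The precondition problem can be sketched as follows. This is not the code of any of the implementations named above; gamma_scale, satisfies_gamma_precondition and sample_negbin are hypothetical names, and the gamma-Poisson sampling shown is only an assumed simplification of that approach. The sketch also shows one possible way to special-case p == 1.

```cpp
#include <random>

// Hypothetical helper: the value passed as the second (beta) argument
// of std::gamma_distribution in the approach described above.
double gamma_scale(double p) { return (1.0 - p) / p; }

// std::gamma_distribution requires beta > 0; for p == 1 the scale is 0.
bool satisfies_gamma_precondition(double p) { return gamma_scale(p) > 0.0; }

// Hypothetical sampler (a sketch, not a library implementation):
// gamma-Poisson mixture with an explicit branch for p == 1.
template <class URBG>
int sample_negbin(URBG& g, int k, double p) {
    if (p == 1.0) {
        return 0;  // every trial succeeds, so there are always 0 failures
    }
    std::gamma_distribution<double> gd(k, gamma_scale(p));
    std::poisson_distribution<int> pd(gd(g));
    return pd(g);
}
```

With the extra branch, the gamma distribution is never constructed with a zero scale, and p == 1 yields 0 on every call.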

For these reasons, the note added by the resolution of LWG 2406 seems invalid and could be considered for removal. However, given the current status and history of how the case p == 1 has been handled, removing the note might not be the only option to consider.

History
Date User Action Args
2020-03-11 19:07:23 admin set messages: + msg11164
2019-02-17 00:00:00 admin create