Issue 3402: Wording for negative_binomial_distribution is unclear as a consequence of LWG 2406 resolution

Title: Wording for negative_binomial_distribution is unclear as a consequence of LWG 2406 resolution
Status: new
Section: [rand.dist.bern.negbin]
Submitter: Ahti Leppänen

Created on 2019-02-17.00:00:00 last changed 73 months ago

Messages

msg11164

Date: 2020-03-11.00:00:00

[ 2020-03-11 Issue Prioritization ]

Priority to 3 and hand over to SG6 after reflector discussion.

msg11137

Date: 2019-02-17.00:00:00


This issue has been created because a corresponding editorial change request had been rejected.

The resolution of LWG 2406 added a note to the definition of negative_binomial_distribution:

[Note: This implies that P(i|k,p) is undefined when p == 1. — end note]

This issue argues that the note is invalid, as are the premises on which LWG 2406 was based. It also argues that the current normative standard text allowing p == 1 is valid both conceptually and mathematically, and that it follows existing conventions in other software.

Problems with the added note:

Why does p == 1 imply that P(i|k,p) is undefined? The only questionable factor in the definition of P(i|k,p) seems to be (1 - p)ⁱ, which for p == 1 becomes 0⁰ when i == 0. While there is no universally accepted convention for the value of 0⁰, std::binomial_distribution already relies on the common convention 0⁰ == 1 (e.g. with p == 1 && t == i, its P(i|t,p) contains the factor 0⁰).
Even if the term were mathematically undefined, does a non-normative note saying that a mathematical term is undefined mean that the behaviour of the program is undefined (instead of e.g. producing NaN), even when no preconditions are violated?
The note has led to an unclear situation in which a distribution object can be constructed, but calling operator() might lead to undefined behaviour even though no preconditions are violated. For example, cppreference.com notes:

If p == 1, subsequent calls to the operator() overload that does not accept a param_type object will cause undefined behavior.

Invalidity of premises of LWG 2406:

For p == 1, this is "* 1^k * 0^i", so every integer i >= 0 is produced with zero probability. (Let's avoid thinking about 0^0.)
- This is contradictory: it first assumes that 0^i == 0 for all i >= 0 (implying 0^0 == 0), but then suggests not thinking about 0^0. The very essence of the issue is the interpretation of 0^0, and given the definition of binomial_distribution, where 0^0 == 1, the claim "so every integer i >= 0 is produced with zero probability" can be considered faulty.
Wikipedia states that p must be within (0, 1), exclusive on both sides.
- I cannot find any mention of this in the Wikipedia article as of 2014-06-02 (i.e. around the time LWG 2406 was opened). Note that Wikipedia's formulation is not the same as the one in the C++ standard: in Wikipedia the parameter p is the same, i.e. the probability of success, but the integer parameter (> 0) is the number of failures, while in C++ it is the number of successes. In the failure formulation p == 1 is indeed invalid, for essentially the same reason that p == 0 is invalid in the C++ definition (i.e. it leads to P(i|k,p) == 0 for all i).

Validity of p == 1:

[…] distribution of the number of failures in a sequence of trials with success probability p before n successes occur.

(from the Wolfram documentation). When p == 1, every trial succeeds, so the probability of getting 0 failures is 1 and the probability of i > 0 failures is 0. This is exactly what the mathematical definition in [rand.dist.bern.negbin] gives for p == 1 under the convention 0⁰ = 1.
Software such as Mathematica, Matlab and R all accept p == 1 for the negative binomial distribution, and like the C++ standard they treat the integer parameter as the number of successes.
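
The p == 1 case can be checked numerically. The following is a minimal sketch, not code from any standard library: negbin_pmf is a hypothetical helper that evaluates P(i|k,p) = C(k+i-1, i) · pᵏ · (1 - p)ⁱ term by term. It assumes an implementation whose std::pow follows IEC 60559 (C Annex F), where pow(0.0, 0.0) returns 1, which is the convention 0⁰ == 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of any library): evaluates
// P(i | k, p) = C(k + i - 1, i) * p^k * (1 - p)^i
// as written in [rand.dist.bern.negbin].
inline double negbin_pmf(unsigned i, unsigned k, double p) {
    // Preconditions stated by the standard: 0 < p <= 1 and 0 < k.
    assert(k > 0 && p > 0.0 && p <= 1.0);

    // Binomial coefficient C(k + i - 1, i) as a running product.
    double coeff = 1.0;
    for (unsigned j = 1; j <= i; ++j) {
        coeff *= static_cast<double>(k - 1 + j) / static_cast<double>(j);
    }

    // Assumption: std::pow(0.0, 0.0) == 1 (IEC 60559 / C Annex F),
    // which corresponds to the convention 0^0 == 1.
    return coeff * std::pow(p, static_cast<double>(k))
                 * std::pow(1.0 - p, static_cast<double>(i));
}
```

Under that assumption, negbin_pmf(0, k, 1.0) evaluates to 1 and negbin_pmf(i, k, 1.0) evaluates to 0 for every i > 0, so the probabilities sum to 1 and all mass sits on "zero failures".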

As for why p == 1 could have been considered invalid, it seems that the major implementations, namely libstdc++, libc++ and the MSVC standard library, use std::gamma_distribution inside std::negative_binomial_distribution and pass (1 - p)/p as its second argument. The case p == 1 is not checked, which violates the precondition of std::gamma_distribution that this argument be > 0.
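
To illustrate where the precondition violation comes from, here is a rough sketch of the gamma-Poisson construction with an explicit guard for p == 1. It is not the actual code of libstdc++, libc++ or MSVC. The names gamma_scale and sample_negbin are hypothetical, and returning 0 for p == 1 is an assumption that follows the "every trial succeeds" interpretation argued for above.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: the value passed as the second argument of
// std::gamma_distribution. It is 0 for p == 1, which the precondition
// of std::gamma_distribution (argument > 0) does not allow.
inline double gamma_scale(double p) { return (1.0 - p) / p; }

// Hypothetical sampler: draws the number of failures before k successes
// by mixing a Poisson distribution over a gamma-distributed mean.
template <class Engine>
int sample_negbin(int k, double p, Engine& eng) {
    assert(k > 0 && p > 0.0 && p <= 1.0);

    const double beta = gamma_scale(p);
    if (!(beta > 0.0)) {
        // p == 1: avoid constructing gamma_distribution with beta == 0.
        // Assumed result: every trial succeeds, so there are no failures.
        return 0;
    }

    std::gamma_distribution<double> gamma(static_cast<double>(k), beta);
    const double mean = gamma(eng);
    if (!(mean > 0.0)) {
        // poisson_distribution requires a mean > 0; treat a zero draw as 0 failures.
        return 0;
    }

    std::poisson_distribution<int> poisson(mean);
    return poisson(eng);
}
```

Without the first guard, a call with p == 1 would construct std::gamma_distribution with a second argument of 0, which is the unchecked case described above.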

For these reasons the note added by the resolution of LWG 2406 seems invalid and could be considered for removal. However, given the current status and the history of how the case p == 1 has been handled, removing the note might not be the only option to consider.

History
Date	User	Action	Args
2020-03-11 19:07:23	admin	set	messages: + msg11164
2019-02-17 00:00:00	admin	create