User Documentation
Probability Functions
About:

The Probability Functions module provides cumulative distribution, density/mass, and quantile functions for a wide range of probability distributions.

Unless otherwise documented, all of these functions are wrappers around functionality provided by the Boost C++ library [1, “Statistical Distributions and Functions”].

For convenience, all cumulative distribution and density/mass functions (CDFs and PDFs/PMFs for short) are defined over the range of all floating-point numbers, including infinity. An input that is NULL or NaN (not a number) always produces a NULL or NaN result, respectively. For an input of plus or minus infinity, the function returns the respective limit.
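The input conventions above can be illustrated with a minimal Python sketch. It is not the module's implementation: the function name `sketch_normal_cdf` is hypothetical, SQL NULL is modelled as Python `None`, and only the standard normal CDF is covered.

```python
import math

def sketch_normal_cdf(x):
    """Standard normal CDF, extended to NULL, NaN, and infinite inputs.

    Hypothetical illustration only; SQL NULL is modelled as None.
    """
    if x is None:
        return None                      # NULL in, NULL out
    if math.isnan(x):
        return math.nan                  # NaN in, NaN out
    if math.isinf(x):
        return 1.0 if x > 0 else 0.0     # limits of the CDF at +inf and -inf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(sketch_normal_cdf(0.0))        # 0.5
print(sketch_normal_cdf(math.inf))   # 1.0
print(sketch_normal_cdf(-math.inf))  # 0.0
print(sketch_normal_cdf(None))       # None
```

For a density function the same pattern would apply, with the limit at plus or minus infinity being 0 for the normal distribution.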

A quantile function for a probability distribution with CDF \( F \) takes a probability argument \( p \in [0,1] \) and returns the value \( x \) so that \( F(x) = p \), provided such an \( x \) exists and is unique. Otherwise, the result is \( \sup \{ x \in D \mid F(x) \leq p \} \) (interpreted as 0 if the supremum is over an empty set) if \( p < 0.5 \), and \( \inf \{ x \in D \mid F(x) \geq p \} \) if \( p \geq 0.5 \). Here \( D \) denotes the domain of the distribution, which is the set of reals \( \mathbb R \) for continuous and the set of nonnegative integers \( \mathbb N_0 \) for discrete distributions.

Intuitively, the formulas in the previous paragraph deal with the following special cases. The 0-quantile will always be the “left end” of the support, and the 1-quantile will be the “right end” of the support of the distribution. For discrete distributions, most values of \( p \in [0,1] \) do not admit an \( x \) with \( F(x) = p \). Instead, there is usually an \( x \in \mathbb N_0 \) so that \( F(x) < p < F(x + 1) \). The above formulas mean that the value returned as \( p \)-quantile is \( x \) if \( p < 0.5 \), and it is \( x + 1 \) if \( p \geq 0.5 \). (As a special case, in order to ensure that quantiles are always within the support, the \( p \)-quantile will be 0 if \( p < F(0) \).)
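The rounding rule for discrete distributions can be sketched in plain Python. This is an illustration under stated assumptions, not the code behind the SQL functions: the names `sketch_binomial_cdf` and `sketch_binomial_quantile` are hypothetical, and a binomial distribution is used only because its support is finite, so that both ends of the support are easy to check.

```python
import math

def sketch_binomial_cdf(x, n, prob):
    """P(X <= x) for X ~ Binomial(n, prob) and an integer 0 <= x <= n."""
    return sum(math.comb(n, k) * prob**k * (1 - prob)**(n - k)
               for k in range(x + 1))

def sketch_binomial_quantile(p, n, prob):
    """p-quantile following the sup/inf convention described above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Find the smallest x in {0, ..., n} with F(x) >= p (the infimum branch).
    x = 0
    while x < n and sketch_binomial_cdf(x, n, prob) < p:
        x += 1
    # For p >= 0.5, or when F(x) = p holds exactly, x is the result.
    if p >= 0.5 or sketch_binomial_cdf(x, n, prob) == p:
        return x
    # For p < 0.5 with F(x - 1) < p < F(x), round down, but never below 0.
    return max(x - 1, 0)

# Number of heads in three fair coin tosses: F(0)=0.125, F(1)=0.5, F(2)=0.875.
print(sketch_binomial_quantile(0.25, 3, 0.5))  # 0
print(sketch_binomial_quantile(0.75, 3, 0.5))  # 2
print(sketch_binomial_quantile(0.0, 3, 0.5))   # 0
print(sketch_binomial_quantile(1.0, 3, 0.5))   # 3
```

The exact comparison `== p` is reliable here only because the example probabilities are exactly representable in floating point; a production implementation would need a more careful treatment.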

The rationale for this behavior is that \(p\)-quantiles for \( p < 0.5 \) are typically requested when one is interested in the value \( x \) such that, with confidence level at least \( 1 - p \), a random variable will be \( > x \) (or equivalently, with probability at most \( p \), it will be \( \leq x \)). Likewise, \(p\)-quantiles for \( p \geq 0.5 \) are typically requested when one is interested in the value \( x \) such that, with confidence level at least \( p \), a random variable will be \( \leq x \). See also [1, “Understanding Quantiles of Discrete Distributions”].
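
As a worked illustration of this rationale (the numbers are computed here from the definition, not taken from the referenced documentation), consider a Poisson-distributed random variable \( X \) with mean 2, whose CDF satisfies

\[ F(0) = e^{-2} \approx 0.135, \qquad F(1) = 3 e^{-2} \approx 0.406, \qquad F(2) = 5 e^{-2} \approx 0.677. \]

For \( p = 0.3 \) we have \( F(0) < p < F(1) \) and \( p < 0.5 \), so the rule rounds down to \( x = 0 \), and indeed

\[ \Pr[X \leq 0] \approx 0.135 \leq 0.3, \qquad \Pr[X > 0] \approx 0.865 \geq 1 - 0.3. \]

For \( p = 0.6 \) we have \( F(1) < p < F(2) \) and \( p \geq 0.5 \), so the rule rounds up to \( x = 2 \), and

\[ \Pr[X \leq 2] \approx 0.677 \geq 0.6. \]

Rounding in the opposite direction would break the guarantee in both cases: \( \Pr[X \leq 1] \approx 0.406 \) exceeds 0.3 and falls short of 0.6.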

Usage:
  • Cumulative distribution functions:
    SELECT distribution_cdf(random variate[, parameter1 [, parameter2 [, parameter3] ] ])
  • Probability density/mass functions:
    SELECT distribution_{pdf|pmf}(random variate[, parameter1 [, parameter2 [, parameter3] ] ])
  • Quantile functions:
    SELECT distribution_quantile(probability[, parameter1 [, parameter2 [, parameter3] ] ])

For concrete function signatures, see prob.sql_in.

Examples:
sql> SELECT normal_cdf(0);
 normal_cdf
------------
        0.5
(1 row)

sql> SELECT normal_quantile(0.5, 0, 1);
 normal_quantile
-----------------
               0
(1 row)

Literature:

[1] John Maddock, Paul A. Bristow, Hubert Holin, Xiaogang Zhang, Bruno Lalande, Johan Råde, Gautam Sewani and Thijs van den Berg: Boost Math Toolkit, Version 1.49, available at: http://www.boost.org/doc/libs/1_49_0/libs/math/doc/sf_and_dist/html/index.html

See Also
File prob.sql_in documenting the SQL functions.