stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

Logarithm of the Cumulative Distribution Function for a Student's t distribution.

Open soegaard opened this issue 1 year ago • 5 comments

  1. The formula for F(x;nu) in the documentation is only valid for x>0.
image

Wikipedia uses t (not x), but gives the same formula, but for t>0 only:

image

From "Sampling Student’s T distribution – use of the inverse cumulative distribution function" by William T. Shaw: image

And the same formula is returned by wolframscript: image

  1. There is a problem in the implementation of logcdf. The implementation basically computes the normal cdf-probability and then takes the logarithm. This defeats the purpose of having a logcdf, which is meant to be used when cdf(x) is so small it is rounded to 0.

    The formula for cdf uses the incomplete beta function, which in turn uses beta. Your project includes an betaln which computes the logarithm of a beta value. A solution must be to use betaln to implement an "logarithm of a regularized, incomplete beta function" and use that to implement logcdf.

soegaard avatar Jul 06 '24 21:07 soegaard

:wave: Hi there! :wave:

And thank you for opening your first issue! We will get back to you shortly. :runner: :dash:

stdlib-bot avatar Jul 06 '24 21:07 stdlib-bot

@Planeshifter Would you mind taking a look?

kgryte avatar Jul 06 '24 21:07 kgryte

Hi @soegaard,

Thanks for opening this issue!

You are correct that the displayed formula is only valid for x > 0, which is missing from the respective README.mds. It's probably best to change them to instead display the formula you referenced. I will take care of this shortly and post an update once it is done.

Your second point with regards to the logcdf is fair as well; it would be good to have a dedicated implementation that avoids rounding errors. However, this may require some R&D as betainc's underlying implementation is rather complex (see kernel-betainc.)

Planeshifter avatar Jul 07 '24 21:07 Planeshifter

Just thinking out loud:

In Mathematica the regularized incomplete Beta function is called BetaRegularized.

https://reference.wolfram.com/language/ref/BetaRegularized.html

We need the logarithm, so we can use a combination of FunctionExpand and PowerExpand to get a formula (using the free wolframscript):

In[7]:= PowerExpand[FunctionExpand[Log[BetaRegularized[z,a,b]]]]                                                                                                              
Out[7]= Log[Beta[z, a, b]] - Log[Gamma[a]] - Log[Gamma[b]] + Log[Gamma[a + b]]

At first sight, I thougt this meant that betaln could be used. However, Beta[z, a, b] is the incomplete beta function, so to compute Log[Beta[z, a, b]] we need (like you concluded) an betaincln in stdlib-js.

That is indeed tricky to implement.

The reason I noticed logcdf in the first place is that I am implementing logcdf for the Racket math library and wanted to see how other libraries did it.

FWIW the implementation of "betaincln" in the Racket library is here (the flag log? switches between betaincln and betainc):

https://github.com/racket/math/blob/master/math-lib/math/private/functions/incomplete-beta.rkt#L192

I do not know the details of the algorithm used. But the strategy used it to give all functions involved in computing "betainc" a flag log?. Then (very oversimplified) the flag is used to determine whether to compute a sum or a product.

soegaard avatar Jul 07 '24 23:07 soegaard

Related discussion happening on Boost: https://github.com/boostorg/math/issues/1173

kgryte avatar Aug 08 '24 20:08 kgryte