stdlib icon indicating copy to clipboard operation
stdlib copied to clipboard

[RFC]: skew-normal distribution

Open quinn-dougherty opened this issue 3 years ago • 8 comments

Description

This RFC proposes the skew-normal distribution

Related Issues

I didn't see any.

Questions

No.

Other

No.

Checklist

  • [X] I have read and understood the Code of Conduct.
  • [X] Searched for existing issues and pull requests.
  • [X] The issue name begins with RFC:.

quinn-dougherty avatar May 07 '22 15:05 quinn-dougherty

:tada: Welcome! :tada:

And thank you for opening your first issue! We will get back to you shortly. :runner: :dash:

github-actions[bot] avatar May 07 '22 15:05 github-actions[bot]

Hey, @quinn-dougherty! Thanks for filing this RFC.

This would definitely be a nice distribution to have. Does entail, however, a more involved implementation. Namely, we'd need integration facilities (see SciPy) and Owen's T function (see https://www.jstatsoft.org/article/view/v005i05 (GPL)).

@Planeshifter May have a better sense of how difficult this distribution may be to implement.

kgryte avatar May 07 '22 21:05 kgryte

For now, this is getting the job done for me:

skew_normal(params) = {
    loc = params.loc
    scale = params.scale
    shape = params.shape

    {|x| (2 / scale) * pdf(normal(loc, scale), x) * cdf(normal(loc, scale), shape * x)}
}

(when a function suffices)

Or this incredibly janky thing:


skew_normal(params) = {
    _skew_normal(params) = {
        loc = params.loc
        scale = params.scale
        shape = params.shape

        {|x| (2 / scale) * pdf(normal(loc, scale), x) * cdf(normal(loc, scale), shape * x)}
    }

    range = List.upTo((params.loc - params.scale * 3) * 100, (params.loc + params.scale * 3) * 100)
    PointSet.makeContinuous(range |> map({|x| {x: x/ 100, y: _skew_normal(params)(x/100)}}))
}

(when I need a proper distribution)

jqhoogland avatar Sep 03 '22 07:09 jqhoogland

Hey, @kgryte!

I would like to work on that. I checked the SciPy source code you shared; they are using direct implementation of the formula here. On the contrary, in the paper, they are utilizing Owen's T function properties to bring down the $\text{shape } (a)$ to $0 \le a \le1$. Then, they use one of six approximation methods depending on the value of h and a. One of these methods is the Gaussian quadrature, which I think is sufficient for now and wouldn't require integration facilities. Also, Owen's T function will be needed in the Multivariate Normal Distributions (GSoC listed ideas), and this can be a good starter for whom will work on it.

Prog-Jacob avatar Mar 29 '24 22:03 Prog-Jacob

@Prog-Jacob As a start, feel free to submit an initial PR which adds the pdf to @stdlib/stats/base/dists/skew-normal/pdf.

kgryte avatar Mar 29 '24 23:03 kgryte

@Prog-Jacob Or perhaps take an initial pass at adding Owen's T to @stdlib/math/base/special/owens-t, based on Boost: https://github.com/boostorg/math/blob/15c40fae1610fd83728d75e96d7377fe55f976f7/include/boost/math/special_functions/owens_t.hpp#L1030

kgryte avatar Mar 29 '24 23:03 kgryte

@kgryte Boost is implementing the paper you referenced – Fast and Accurate Calculation of Owen's T Function. It looks hefty; should I initially focus on supporting Skewed Normal Distributions and work on Owen's T function later on? what do you think?

Prog-Jacob avatar Mar 30 '24 00:03 Prog-Jacob

@Prog-Jacob Sure. Sounds good.

kgryte avatar Mar 30 '24 00:03 kgryte