
Add docstring for `(Half)CauchyRV`, `InvGammaRV` and `WaldRV`

rlouf opened this issue 2 years ago • 1 comment

Also, for some reason, the __call__ method of InvGammaRV required a rate parameter; I checked the SciPy documentation, compared their definition to Wikipedia's, and the distribution is indeed parametrized in terms of the scale.

  • [x] There is an informative high-level description of the changes.
  • [x] The description and/or commit message(s) references the relevant GitHub issue(s).
  • [x] pre-commit is installed and set up.
  • [x] The commit messages follow these guidelines.
  • [x] The commits correspond to relevant logical changes, and there are no commits that fix changes introduced by other commits in the same branch/PR.
  • [ ] There are tests covering the changes introduced in the PR.

rlouf avatar Aug 08 '22 16:08 rlouf

Codecov Report

Merging #1105 (d50fc01) into main (2c9f692) will increase coverage by 0.00%. The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1105   +/-   ##
=======================================
  Coverage   79.30%   79.31%           
=======================================
  Files         162      162           
  Lines       48162    48180   +18     
  Branches    10938    10938           
=======================================
+ Hits        38197    38215   +18     
  Misses       7453     7453           
  Partials     2512     2512           
Impacted Files Coverage Δ
aesara/tensor/random/basic.py 98.99% <100.00%> (+0.03%) :arrow_up:

codecov[bot] avatar Aug 08 '22 17:08 codecov[bot]

I think I have found a small API inconsistency. While PermutationRV behaves identically to its NumPy equivalent by interpreting permutation(a) as permutation(np.arange(a)) when a is an int:

import aesara.tensor as at
import numpy as np

srng = at.random.RandomStream(0)

b = at.iscalar('b')
c_rv = srng.permutation(b)

print(c_rv.eval({b: 20}))
# [10  8 19 16  0 17 18  7  9 12  1 15  4  6  3  5 11  2 13 14]

rng = np.random.default_rng()
print(rng.permutation(20))
# [ 6  2 18  9  3 16  5  4 14 12 19 17  0  1  7 15 13  8 10 11]

ChoiceRV does not:

import aesara.tensor as at
import numpy as np

srng = at.random.RandomStream(0)

b = at.iscalar('b')
c_rv = srng.choice(b)

print(c_rv.eval({b: 100}))
# 100; does not change if we change seed

rng = np.random.default_rng()
print(rng.choice(100))
# 30; changes with seed
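For reference, NumPy documents choice(n) for an integer n as shorthand for choice(np.arange(n)), which is the behavior ChoiceRV should presumably match. A quick NumPy-only check (seed chosen arbitrarily):

```python
import numpy as np

# NumPy documents choice(n) as equivalent to choice(np.arange(n));
# two identically seeded generators therefore draw the same value.
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)
assert rng1.choice(100) == rng2.choice(np.arange(100))
```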

Should I go ahead and make ChoiceRV behave similarly to PermutationRV or is there a reason for this inconsistency?

rlouf avatar Aug 17 '22 01:08 rlouf

Should I go ahead and make ChoiceRV behave similarly to PermutationRV or is there a reason for this inconsistency?

If it's not behaving like it does in NumPy, then it most likely needs to be fixed.

brandonwillard avatar Aug 17 '22 02:08 brandonwillard

Interestingly, ChoiceRV's behavior also differs from NumPy's in that it does not accept input tensors with more than one dimension, while NumPy does. The following raises an error:

import aesara.tensor as at
import numpy as np

arr = np.vstack([[1,2], [3,4]])
srng = at.random.RandomStream(0)
c_rv = srng.choice(at.as_tensor(arr))
print(c_rv.eval())

while the NumPy equivalent returns a random row:

import numpy as np

arr = np.vstack([[1, 2], [3, 4]])
rng = np.random.default_rng(0)
print(rng.choice(arr))
# a random row, e.g. [1 2]

I could not find an explanation for this anywhere in the codebase, although it seems that this was intentional as the tests explicitly test for the above example raising an exception. If you can think of a specific reason for this behavior I will document it, otherwise I will change it to follow NumPy's.

Besides this I have no docstring left to add.

rlouf avatar Aug 17 '22 18:08 rlouf

Thank you for the thorough review. I included all suggestions that were not related to the behavior when size=None and rebased the changes.

I think we may define "sample" differently? For instance for the normal distribution:

import aesara.tensor as at

srng = at.random.RandomStream(0)
loc = at.vector()
scale = at.vector()
Y_rv = srng.normal(loc, scale, size=None)

Y_rv.eval()
# An array of np.broadcast(loc, scale).size samples

And not a single sample. Perhaps I should also check these broadcasting behaviors while we're at it.
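NumPy's generators behave the same way when size=None and the parameters are arrays: the output takes the broadcast shape of the parameters rather than being a single scalar draw. A NumPy-only illustration (shapes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

# With size=None the draw has the broadcast shape of loc and scale,
# not a single scalar sample.
loc = np.zeros((3, 1))
scale = np.ones((1, 4))
sample = rng.normal(loc, scale)  # size defaults to None
assert sample.shape == np.broadcast(loc, scale).shape  # (3, 4)
```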

Also, what about my comment about ChoiceRV behaving differently from its NumPy equivalent?

rlouf avatar Aug 24 '22 03:08 rlouf

Also, what about my comment about ChoiceRV behaving differently from its NumPy equivalent?

Looks like that might not have been implemented yet, so we need an issue for it.

brandonwillard avatar Aug 24 '22 03:08 brandonwillard

FYI: I've been waiting to see the rendered documentation, but the job doesn't seem to finish.

brandonwillard avatar Aug 24 '22 19:08 brandonwillard

Regarding the docstring for the size argument. I think that the following:

size
   Sample shape. If the given size is `(m, n, k)`, then `m * n * k`
   independent, identically distributed samples are
   returned. Default is `None` in which case a single sample
   is returned.

is true (most of the time) when (1) the arguments are scalars and (2) the distribution's support is 0-dimensional.

When a parameter must be of dimension > 0 (DirichletRV, for instance) or the distribution's support is at least a vector, then we need to specify the sampling shape in terms of these parameters' shapes. NumPy could be a little more explicit in some of these docstrings.

I've also prepared a small write-up on shape semantics and broadcasting that we can add later to complement the content of these docstrings.
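To make the Dirichlet case concrete, here is how NumPy's own dirichlet behaves (parameter values chosen arbitrarily): the support is itself a vector, so the trailing output dimension comes from the parameter, and size only prepends batch dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

alphas = np.array([1.0, 2.0, 3.0])  # 1-D parameter; support is a 3-vector

# size=None: a single draw with the support's shape, not a scalar
single = rng.dirichlet(alphas)
assert single.shape == (3,)

# size=(2, 5): batch dimensions are prepended to the support shape
batch = rng.dirichlet(alphas, size=(2, 5))
assert batch.shape == (2, 5, 3)
```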

rlouf avatar Aug 24 '22 19:08 rlouf

The binomial is pretty clear:

size: int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.

https://numpy.org/doc/stable/reference/random/generated/numpy.random.binomial.html
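The quoted behavior is easy to verify against NumPy directly (values chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar n and p with size=None: a single (0-dimensional) value
assert np.ndim(rng.binomial(10, 0.5)) == 0

# Array parameter with size=None: np.broadcast(n, p).size samples
n = np.array([5, 10, 20])
assert rng.binomial(n, 0.5).shape == (3,)

# Explicit size=(m, n, k): m * n * k samples are drawn
assert rng.binomial(10, 0.5, size=(2, 3, 4)).shape == (2, 3, 4)
```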

ricardoV94 avatar Aug 24 '22 19:08 ricardoV94

Regarding the docstring for the size argument. I think that the following:

size
   Sample shape. If the given size is `(m, n, k)`, then `m * n * k`
   independent, identically distributed samples are
   returned. Default is `None` in which case a single sample
   is returned.

is true (most of the time) when (1) the arguments are scalars and (2) the distribution's support is 0-dimensional.

When a parameter must be of dimension > 0 (DirichletRV, for instance) or the distribution's support is at least a vector, then we need to specify the sampling shape in terms of these parameters' shapes. NumPy could be a little more explicit in some of these docstrings.

I've also prepared a small write-up on shape semantics and broadcasting that we can add later to complement the content of these docstrings.

Yeah, that works. The basic idea is that size replicates the dimensions determined by the parameters and the support dimensions of the underlying distribution.
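That rule can be illustrated with NumPy's multinomial, whose support is a vector (values chosen arbitrarily): size dimensions replicate, i.e. are prepended to, the support dimensions determined by the parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

pvals = [0.2, 0.3, 0.5]  # support is a length-3 count vector

# size=None: one draw with the support's shape
assert rng.multinomial(10, pvals).shape == (3,)

# size=(4, 2): the size dims are prepended to the support dims
assert rng.multinomial(10, pvals, size=(4, 2)).shape == (4, 2, 3)
```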

brandonwillard avatar Aug 24 '22 19:08 brandonwillard

The binomial is pretty clear:

size: int or tuple of ints, optional Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.

https://numpy.org/doc/stable/reference/random/generated/numpy.random.binomial.html

It is. It would be nice to have one explanation that covers everything, though.

brandonwillard avatar Aug 24 '22 19:08 brandonwillard

Documentation for the binomial is very clear in NumPy, but I've found that this isn't always the case. IMO it's fine to document the simplest case in these docstrings and have a separate tutorial on shape semantics and broadcasting behavior.

rlouf avatar Aug 24 '22 19:08 rlouf

Documentation for the binomial is very clear in NumPy, but I've found that this isn't always the case. IMO it's fine to document the simplest case in these docstrings and have a separate tutorial on shape semantics and broadcasting behavior.

Certainly. The "gaps" in their docs may come from the fact that they do not support batched parameters in multivariate distributions, which is in itself worth mentioning.

ricardoV94 avatar Aug 24 '22 20:08 ricardoV94

I have updated all the docstrings to only explain the non-broadcasted cases (e.g. two scalar parameters for the normal distribution, a scalar and a vector parameter for the multinomial distribution, etc.) instead of enumerating the different possibilities. I think this should suffice alongside a general explanation of how broadcasting works with RandomVariables and what the size argument represents.

Broadcasting rules are not always trivial; we should make them explicit in the distributions' docstrings, for instance in the form of gufunc signatures.
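As a hypothetical sketch of what such a signature would convey: a multivariate normal sampler could be annotated "(n),(n,n)->(n)" (mean vector and covariance matrix in, vector draw out). np.vectorize already interprets signatures of this form, so the batching behavior the signature implies can be demonstrated directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical gufunc-style signature for a multivariate normal sampler:
# "(n),(n,n)->(n)" -- mean vector and covariance matrix map to a vector draw.
sample_mvn = np.vectorize(
    lambda mean, cov: rng.multivariate_normal(mean, cov),
    signature="(n),(n,n)->(n)",
)

# Batched means broadcast over the leading (loop) dimension; the core
# (support) dimension n=2 is fixed by the signature.
means = np.zeros((5, 2))
cov = np.eye(2)
draws = sample_mvn(means, cov)
assert draws.shape == (5, 2)
```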

rlouf avatar Aug 25 '22 03:08 rlouf