Do not force lower precision in intX
Description of your problem
Please provide a minimal, self-contained, and reproducible example.

See the implementation here: https://github.com/pymc-devs/pymc3/blob/c1efb7ad51e34e6bd8b292ef226a7e191adbdb82/pymc3/theanof.py#L92

- For float32 precision, it converts values to int16.
- For float64 precision, it converts values to int32.

But the implementations of the Binomial distributions

https://github.com/pymc-devs/pymc3/blob/c1efb7ad51e34e6bd8b292ef226a7e191adbdb82/pymc3/distributions/multivariate.py#L714
https://github.com/pymc-devs/pymc3/blob/c1efb7ad51e34e6bd8b292ef226a7e191adbdb82/pymc3/distributions/discrete.py#L103

do not assume a float is converted to an int, so converting ints in this way may overflow (I ran into exactly this issue).
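To illustrate the overflow, here is a minimal sketch of the dtype choice described above (`intX_sketch` is a hypothetical stand-in, not the actual pymc3 code), showing how a perfectly valid Binomial `n` can silently wrap:

```python
import numpy as np

# Minimal sketch (not the actual pymc3 code) of how intX picks a narrow
# integer dtype based on the configured float precision.
def intX_sketch(x, floatX="float32"):
    # float32 precision -> int16, float64 precision -> int32
    int_dtype = "int16" if floatX == "float32" else "int32"
    return np.asarray(x).astype(int_dtype)

# A perfectly valid Binomial `n`, but larger than the int16 max (32767):
n_trials = 70_000
print(intX_sketch(n_trials, floatX="float32"))  # silently wraps around
```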
Please provide the full traceback.
The referenced code is enough to spot the problem.

Please provide any additional information below.
The issue is also relevant to PyMC v4.
Versions and main components
- PyMC3 Version: 3.11.2
- Aesara Version: None
- Python Version: 3.8.9
- Operating system: Archlinux
- How did you install PyMC3: pip
> Does not assume a float is converted to int, thus converting ints in this way may overflow (I ran exactly into this issue)

What do you mean by "does not assume"?
It converts parameters that are presumably ints, but it takes the float precision into account.
The correct behaviour would be to convert int parameters to RV.dtype, which the user has control over.
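As a sketch of that proposal (the helper `cast_to_rv_dtype` is hypothetical, not pymc3 API), casting to the RV's own user-controlled dtype avoids the narrowing:

```python
import numpy as np

# Hypothetical helper (not in pymc3) sketching the proposed behaviour:
# cast integer parameters to the RV's own dtype, which the user controls,
# instead of a precision-dependent intX dtype.
def cast_to_rv_dtype(param, rv_dtype="int64"):
    return np.asarray(param).astype(rv_dtype)

print(cast_to_rv_dtype(70_000))  # int64 easily holds the value: no overflow
```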
I think @lucianopaz introduced these methods.
He will probably have a better idea of the rationale, but IIRC it had to do with ensuring int * float operations downstream worked predictably?
https://github.com/pymc-devs/pymc3/issues/4279#issuecomment-737044970
But are you saying there is no point in converting to intX at all in these specific distributions?
If that's what you mean, I agree.
@ricardoV94, just to give some context: this was introduced here, and it was intended to solve problems that could happen when a value that should have been a float was an int, and vice versa.
I agree with @ferrine that we could use a distribution's dtype instead of forcing intX to all integer parameters, but we need to be aware that if we drop the intX or floatX casting for the distribution parameters, we may resurface bugs like #3223.
Thanks for the context.
What would be the disadvantage of always converting to float instead of some to float and some to int?
The output of logp computations is always a float, so int precision would be lost somewhere anyway. Also, most custom c_code in Aesara only works with floats.
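For illustration (a naive hand-written binomial log-pmf, not pymc's implementation), integer inputs still produce a float log-probability:

```python
import numpy as np

def log_factorial(m):
    # naive log(m!) as a sum of logs; illustrative only, not pymc's code
    return np.sum(np.log(np.arange(1, m + 1, dtype=np.float64)))

# Even with integer n and k, the log-probability itself is a float,
# so integer precision is lost at the logp stage regardless.
n, k, p = 10, 3, 0.5
logp = (log_factorial(n) - log_factorial(k) - log_factorial(n - k)
        + k * np.log(p) + (n - k) * np.log1p(-p))
print(type(logp), logp)
```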
At least behavior would be a bit more predictable across the board. Am I missing a big disadvantage?
Very few distributions actually work with integers, and some of those provide alternative parametrizations (e.g. the negative binomial), of which only one is a classical integer input.
I can only think of one computational disadvantage. If you have a discrete random variable, you should be able to use it to index into an array. Imagine something like:

```python
...
label = pm.Categorical("label", some_vec)
other_vec[label]
...
```

If we cast the random variable to a float, then the above lines would raise an exception.
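That failure mode can be reproduced with plain NumPy indexing (a standalone sketch, not pymc code):

```python
import numpy as np

# Indexing with an integer-typed value works; a float index raises.
other_vec = np.array([0.1, 0.2, 0.7])
print(other_vec[np.int64(2)])  # integer indexing is fine

try:
    other_vec[np.float64(2.0)]  # what would happen if the RV became a float
except IndexError as err:
    print("float index rejected:", err)
```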
On the other hand, there are two things we need to be careful with:
- Passing observations to the distribution
- That the random number generators for every distribution return results that have the distribution's dtype.
Point 1 caused problems in v3 because the observations were converted to a tensor that mimicked the values dtype, and that could cause a clash with the distribution's dtype. And point 2 could lead to inconsistencies when doing things like model calibration based on prior samples.
We would only cast the inputs to the random variable, not the variable itself? In your example, some_vec would be converted to floatX for the purpose of computing logp / generating random values via scipy/numpy methods (only inside Categorical?). The random output of Categorical would still be its dtype (int for random, float for logp).
I think point 2 is already taken care of in the aesara random ops. But I could be mistaken.
Point 1 hadn't crossed my mind.
Well, I think I should open a PR to fix the issue I encountered. There are two ways to do that:
- Touch only these specific distributions and fix the type there
- Change the logic in intX to be a no-op if an input is already an int
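A sketch of the second option (assumed behaviour; `intX_noop_sketch` is hypothetical, not a real patch):

```python
import numpy as np

# Sketch of option 2: make intX a no-op for inputs that already have an
# integer dtype, so the user's wider dtype survives instead of being
# narrowed to a precision-dependent one.
def intX_noop_sketch(x, floatX="float32"):
    arr = np.asarray(x)
    if np.issubdtype(arr.dtype, np.integer):
        return arr  # already an int: keep its dtype, no overflow risk
    int_dtype = "int16" if floatX == "float32" else "int32"
    return arr.astype(int_dtype)

print(intX_noop_sketch(np.int64(70_000)))  # stays int64, value preserved
```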
I'd prefer to change the intX logic, since the current behaviour is misleading. What do you think, @lucianopaz?
On a related note, the TensorType of the Distribution (i.e. Distribution.type) isn't used when the ObservationRV is created (see here). Instead, the type of the observed data is used, which leads to conflicts when the value of the resulting observation (i.e. a View Op on the data) is generated by, or according to, the associated Distribution object.
@ferrine This change hasn't been made in V4. Should we?
Yes, that change was made because of a very annoying bug, so I had to monkey-patch pymc3.
Maybe we can create a better way to support that,
but I think the approach I used totally made sense.
@ferrine, do you want to open a PR for v4?