pymc
pymc copied to clipboard
ENH: <Support CAR distribution with tau as a vector>
Before
# Code snippet from pymc v5.10.3:
class CARRV(RandomVariable):
name = "car"
ndim_supp = 1
ndims_params = [1, 2, 0, 0]
# Workaround with Monkeypatching:
UPM_CAR = pm.distributions.multivariate.CARRV(ndims_params = [1, 2, 0, 1]) #The dimensionality of tau changed from 0 to 1
pm.CAR.rv_op = UPM_CAR
After
# Allow the dimensionality of tau to be a vector with the new class definition in version 5.16.2
class CARRV(RandomVariable):
name = "car"
signature = "(m),(m,m),(),(),()->(m)"
Context for the issue:
I use CAR model with tau as a vector in uncertainty projected mapping (UPM), which is a special case of CAR. In CAR, smoothing is global, however in UPM a local smoothing is employed based on the standard deviations of the sites. UPM is a published model and has been applied to many earthquake engineering case-studies (Chakraborty and Goto, 2018; 2019, Chakraborty et al.,2024).
My transition from WinBUGS to pymc was successful until the pymc version 5.10.3 with monkey-patching I shared above. Given that ndims_params has been depreciated in the signature of CARRV class, I am no longer able to specify tau to be a vector.
I plan to actively use pymc for the UPM model development going forward, and hence would request a feature that allows for tau to be a vector. It used to be specified with the last element in ndims_params. This will remove the need for me to do monkey patching in the code and make my code more readable.
You can find the working code of UPM with monkey-patching in my Github repository: https://github.com/anirban1990/UPM-Spatial-Interpolation/blob/main/Codes/3_generate_UPM.ipynb
I also attach a screenshot of the code to help you navigate to the monkey-patched line.
I would really appreciate your support in this matter.
]
:tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.
@anirban1990 that would correspond to a distinct core RV, similarly to how a GaussianRandomWalk can be parametrized with a scalar sigma for all timesteps, or a vector sigma for each timesteps. The PyMC implementation is currently the first, and if you pass a vector you get a batch of independent GRW variates. There is some discussion here: https://github.com/pymc-devs/pymc/discussions/5972
To allow and distinguish both cases here as well we would need to parametrize the signature of CARRV so that it is either: "(m),(m,m),(),(),()->(m)" or "(m),(m,m),(),(m),()->(m)", depending on whether the core case has a single sigma or a vector of sigma.
That would require some keyword argument like tau_ndim: Literal[0, 1] = 0 to know which signature to use it. The Solve operator in PyTensor has a similar argument to know if we are solving with a vector or a matrix on the rhs: https://github.com/pymc-devs/pytensor/blob/23427a0a375bfca82bd85e35a2f247759fbdcff1/pytensor/tensor/slinalg.py#L441-L443
@ricardoV94 Thank you so much for responding to my question.
Let me try monkey patching with the new signature "(m),(m,m),(),(m),()->(m)" and see if my code runs successfully.
Since this monkey patching may not be compatible with your future versions, I was wondering what a long-term solution could be. Are you suggesting having a variant of the CARRV class with the new signature OR adding a conditional in the signature of the CARRV itself, depending on tau_ndim? If you could expand on your idea here, I can try editing the source code myself and make a pull request.
I am new to this codebase. Thanks for being patient with my questions.