pymc ENH: <Support CAR distribution with tau as a vector>

Before

# Code snippet from pymc v5.10.3: 

class CARRV(RandomVariable):
    name = "car"
    ndim_supp = 1
    ndims_params = [1, 2, 0, 0]
   

# Workaround with Monkeypatching: 

UPM_CAR = pm.distributions.multivariate.CARRV(ndims_params = [1, 2, 0, 1])  #The dimensionality of tau changed from 0 to 1
pm.CAR.rv_op = UPM_CAR

After

# Allow the dimensionality of tau to be a vector with the new class definition in version 5.16.2
class CARRV(RandomVariable):
    name = "car"
    signature = "(m),(m,m),(),(),()->(m)"

Context for the issue:

I use CAR model with tau as a vector in uncertainty projected mapping (UPM), which is a special case of CAR. In CAR, smoothing is global, however in UPM a local smoothing is employed based on the standard deviations of the sites. UPM is a published model and has been applied to many earthquake engineering case-studies (Chakraborty and Goto, 2018; 2019, Chakraborty et al.,2024).

My transition from WinBUGS to pymc was successful until the pymc version 5.10.3 with monkey-patching I shared above. Given that ndims_params has been depreciated in the signature of CARRV class, I am no longer able to specify tau to be a vector.

I plan to actively use pymc for the UPM model development going forward, and hence would request a feature that allows for tau to be a vector. It used to be specified with the last element in ndims_params. This will remove the need for me to do monkey patching in the code and make my code more readable.

You can find the working code of UPM with monkey-patching in my Github repository: https://github.com/anirban1990/UPM-Spatial-Interpolation/blob/main/Codes/3_generate_UPM.ipynb

I also attach a screenshot of the code to help you navigate to the monkey-patched line. Code

I would really appreciate your support in this matter.

Jul 29 '24 09:07 anirban1990

] :tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

Jul 29 '24 09:07 welcome[bot]

@anirban1990 that would correspond to a distinct core RV, similarly to how a GaussianRandomWalk can be parametrized with a scalar sigma for all timesteps, or a vector sigma for each timesteps. The PyMC implementation is currently the first, and if you pass a vector you get a batch of independent GRW variates. There is some discussion here: https://github.com/pymc-devs/pymc/discussions/5972

To allow and distinguish both cases here as well we would need to parametrize the signature of CARRV so that it is either: "(m),(m,m),(),(),()->(m)" or "(m),(m,m),(),(m),()->(m)", depending on whether the core case has a single sigma or a vector of sigma.

That would require some keyword argument like tau_ndim: Literal[0, 1] = 0 to know which signature to use it. The Solve operator in PyTensor has a similar argument to know if we are solving with a vector or a matrix on the rhs: https://github.com/pymc-devs/pytensor/blob/23427a0a375bfca82bd85e35a2f247759fbdcff1/pytensor/tensor/slinalg.py#L441-L443

Aug 03 '24 14:08 ricardoV94

@ricardoV94 Thank you so much for responding to my question.

Let me try monkey patching with the new signature "(m),(m,m),(),(m),()->(m)" and see if my code runs successfully.

Since this monkey patching may not be compatible with your future versions, I was wondering what a long-term solution could be. Are you suggesting having a variant of the CARRV class with the new signature OR adding a conditional in the signature of the CARRV itself, depending on tau_ndim? If you could expand on your idea here, I can try editing the source code myself and make a pull request.

I am new to this codebase. Thanks for being patient with my questions.

Aug 09 '24 09:08 anirban1990

pymc pymc copied to clipboard

ENH: <Support CAR distribution with tau as a vector>

Before

After

Context for the issue:

pymc
pymc copied to clipboard