BUG: constant_data is missing from idata when sampling with nutpie

Open Y0dler opened this issue 2 years ago • 2 comments

Describe the issue:

I recently encountered an issue when trying to add constant data to some of my models.
When sampling with nuts_sampler="pymc", idata contains a group named constant_data, where x and y from this toy example are stored. When using nutpie, however, the constant data is nowhere to be found.
Is this actually a bug or is there something wrong with my definitions? Or are pm.ConstantData() and such perhaps just not supported when using nutpie?
Any help would be much appreciated.

Reproducible code example:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

x = np.array([1,2,3,4])
y = np.array([100, 190, 310, 405])

with pm.Model() as pmodel:
    # add data to the pmodel as ConstantData
    pm.ConstantData("x", x)
    pm.ConstantData("y", y)

    var = pm.Normal("var", 100, 5)
    
    # likelihood
    pm.Normal("L", mu=y, sigma=0.1, observed=y)
    # posterior
    pst = x * var
    pst = pm.Deterministic("posterior", pst)
    # sampling
    idata = pm.sample(nuts_sampler="nutpie", tune=50, draws=50)

print(idata)

Error message:

Output of print(idata):

Inference data with groups:
	> posterior
	> sample_stats

Warmup iterations saved (warmup_*).

PyMC version information:

PyMC v5.8.1 (pypi) nutpie v0.9.1 (pypi) PyTensor v2.16.2 (pypi)

Context for the issue:

No response

Y0dler · Sep 25 '23 10:09

:tada: Welcome to PyMC! :tada: We're really excited to have your input into the project! :sparkling_heart:
If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

welcome[bot] · Sep 25 '23 10:09

Might be because this information is not even passed to _trace_to_arviz here?

https://github.com/pymc-devs/nutpie/blob/2938d5a0f04a8797792d1f7746f8a24de250db82/python/nutpie/sample.py#L250C25-L250C25

We should probably adopt the approach from here: https://github.com/pymc-devs/mcbackend/blob/96e4248d1d1d1be8b73b28d702bb1a3012ef98f6/mcbackend/core.py#L265

These pieces are all already implemented in the McBackend code. For example a find_data function: https://github.com/pymc-devs/pymc/blob/15fbf0e2f0892b556c8b59446347dd4691a476e6/pymc/backends/mcbackend.py#L44

@aseyboldt @ferrine do I see it correctly that the solution here is to change this line to pass return_raw_trace=True and do the conversion to InferenceData in the PyMC codebase?
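
Until that is settled, the missing group can probably be attached by hand after sampling. The following is only a rough sketch, not the PyMC or nutpie implementation: attach_constant_data is a hypothetical helper name, the dimension names are made up, and the InferenceData built with az.from_dict is a stand-in for the object returned by pm.sample(nuts_sampler="nutpie").

```python
import arviz as az
import numpy as np
import xarray as xr


def attach_constant_data(idata, data):
    """Add a constant_data group built from a {name: array} mapping.

    Hypothetical helper for illustration; not part of PyMC or nutpie.
    """
    variables = {}
    for name, values in data.items():
        arr = np.asarray(values)
        # Assumed naming scheme for dimensions: "<name>_dim_<axis>".
        dims = [f"{name}_dim_{axis}" for axis in range(arr.ndim)]
        variables[name] = xr.DataArray(arr, dims=dims)
    # Passing an xarray.Dataset avoids ArviZ adding chain/draw dimensions.
    idata.add_groups({"constant_data": xr.Dataset(variables)})
    return idata


# Stand-in for the result of pm.sample(nuts_sampler="nutpie", ...):
# 2 chains with 50 draws each of a single variable "var".
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={"var": rng.normal(100, 5, size=(2, 50))})

x = np.array([1, 2, 3, 4])
y = np.array([100, 190, 310, 405])
idata = attach_constant_data(idata, {"x": x, "y": y})

print(idata.groups())
```

With the toy example from the report, the same call would be made on the idata returned by pm.sample, passing the x and y arrays that were given to pm.ConstantData.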

michaelosthege · Sep 25 '23 10:09