UpdatingPriors
Notebook URL: https://docs.pymc.io/en/v3/pymc-examples/examples/pymc3_howto/updating_priors.html
For the following code section, to access the posterior samples I had to change the indexing to `trace.posterior.alpha.mean(axis=0)`, etc. I am not sure whether this is correct. Here is the code:
```python
for _ in range(10):
    # generate more data
    X1 = np.random.randn(size)
    X2 = np.random.randn(size) * 0.2
    Y = alpha_true + beta0_true * X1 + beta1_true * X2 + np.random.randn(size)

    model = Model()
    with model:
        # Priors are posteriors from previous iteration
        alpha = from_posterior("alpha", trace.posterior.alpha.mean(axis=0))  # CA
        beta0 = from_posterior("beta0", trace.posterior.beta0.mean(axis=0))  # CA
        beta1 = from_posterior("beta1", trace.posterior.beta1.mean(axis=0))  # CA

        # Expected value of outcome
        mu = alpha + beta0 * X1 + beta1 * X2

        # Likelihood (sampling distribution) of observations
        Y_obs = Normal("Y_obs", mu=mu, sigma=1, observed=Y)

        # draw 1000 posterior samples
        trace = sample(1000)

    traces.append(trace)
```
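For context, the notebook's `from_posterior` helper fits a KDE to the posterior samples and wraps the result in an `Interpolated` prior (the traceback further down shows its `gaussian_kde` call). Here is a scipy-only sketch of the grid/density part; the function name `kde_support` is mine, and the real helper returns the `pm.Interpolated` prior rather than the arrays:

```python
import numpy as np
from scipy import stats


def kde_support(samples, n=100):
    """Grid and density a from_posterior-style helper would feed to
    pm.Interpolated (sketch only, not the notebook's exact code)."""
    smin, smax = np.min(samples), np.max(samples)
    width = smax - smin
    x = np.linspace(smin, smax, n)
    y = stats.gaussian_kde(samples)(x)
    # what was never sampled should have a small probability but not 0
    x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
    y = np.concatenate([[0], y, [0]])
    return x, y


x, y = kde_support(np.random.randn(500))
print(x.shape, y.shape)  # (102,) (102,)
```

The key point for what follows: `gaussian_kde` needs an array of samples, not a summary statistic.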
But I got the following error on `trace = sample(1000)`:
```
SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha_interval__': array(nan), 'beta0_interval__': array(nan), 'beta1_interval__': array(nan)}

Initial evaluation results:
{'alpha': nan, 'beta0': nan, 'beta1': nan, 'Y_obs': nan}
```
I would do:

```python
...
post_means = trace.posterior.mean(("chain", "draw"))
...
alpha = from_posterior("alpha", post_means.alpha)
# or
alpha = from_posterior("alpha", post_means.alpha.values.flatten())
```
Potentially useful references:
- https://oriolabrilpla.cat/python/arviz/pymc/xarray/2022/06/07/pymc-arviz.html
- https://cluhmann.github.io/inferencedata/
- https://python.arviz.org/en/latest/getting_started/WorkingWithInferenceData.html
@OriolAbril, I got the following error when testing your suggestion:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10784/1212275851.py in <module>
     10 # Priors are posteriors from previous iteration
     11 post_means = trace.posterior.mean(("chain", "draw"))
---> 12 alpha = from_posterior("alpha", post_means.alpha)
     13 beta0 = from_posterior("beta0", post_means.beta0)
     14 beta1 = from_posterior("beta1", post_means.beta1)

~\AppData\Local\Temp/ipykernel_10784/45955940.py in from_posterior(param, samples)
      3     width = smax - smin
      4     x = np.linspace(smin, smax, 100)
----> 5     y = stats.gaussian_kde(samples)(x)
      6
      7     # what was never sampled should have a small probability but not 0,

~\anaconda3\lib\site-packages\scipy\stats\kde.py in __init__(self, dataset, bw_method, weights)
    191         self.dataset = atleast_2d(asarray(dataset))
    192         if not self.dataset.size > 1:
--> 193             raise ValueError("`dataset` input should have multiple elements.")
    194
    195         self.d, self.n = self.dataset.shape

ValueError: `dataset` input should have multiple elements.
```
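That error is easy to reproduce in isolation, without PyMC: `scipy.stats.gaussian_kde` accepts an array of samples but refuses an input with a single element, which is exactly what a mean over both `chain` and `draw` produces. A minimal repro (my own toy data, not the notebook's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

stats.gaussian_kde(samples)  # fine: a full array of samples

try:
    # a single summary value, like post_means.alpha above
    stats.gaussian_kde(np.array([samples.mean()]))
    raised = False
except ValueError as err:
    raised = True
    print(err)  # `dataset` input should have multiple elements.
```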
Oh, I see. I was only trying to provide a working alternative to the indexing and averaging code, without looking at what it was doing.
You are taking the mean, which reduces your posterior samples to a single summary value, and then trying to fit a KDE to it, which is impossible. The original code doesn't take the mean. It looks like you want a 1-d array with the chain and draw dimensions stacked together. For that you can use:
```python
post = idata.posterior.stack(sample=("chain", "draw"))
...
alpha = from_posterior("alpha", post.alpha)
```