pymc-examples stochastic volatility

stochastic volatility

Open OriolAbril opened this issue 3 years ago • 12 comments

File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/case_studies/stochastic_volatility.ipynb Reviewers:

Known changes needed

Changes listed in this section should all be done at some point in order to get this notebook to a "Best Practices" state. However, these are probably not enough! Make sure to thoroughly review the notebook and search for other updates.

ArviZ related

Use arviz-darkgrid style
Use return_inferencedata=True

Notes

Exotic dependencies

None

Computing requirements

Model takes roughly 15 mins to sample

Mar 29 '21 03:03 OriolAbril

Hi @OriolAbril ! Can I work on this issue?

May 05 '21 11:05 chiral-carbon

Assigned :)

May 05 '21 12:05 OriolAbril

thanks!

I tried running it first, and I get a KeyError in cell 12:

KeyError                                  Traceback (most recent call last)
<ipython-input-12-3bac4038cc19> in <module>
      1 fig, ax = plt.subplots(figsize=(14, 4))
      2 
----> 3 y_vals = np.exp(trace["volatility"])[::5].T
      4 x_vals = np.vstack([returns.index for _ in y_vals.T]).T.astype(np.datetime64)
      5 

~/.local/lib/python3.8/site-packages/arviz/data/inference_data.py in __getitem__(self, key)
    234         """Get item by key."""
    235         if key not in self._groups_all:
--> 236             raise KeyError(key)
    237         return getattr(self, key)
    238 

KeyError: 'volatility'

if trace should have a 'volatility' key, then I'm not sure how to verify this. could you help me out?

May 05 '21 13:05 chiral-carbon

I do have 'volatility' under Data Variables, so I'm wondering if this is a syntax issue and there is a different way to access it than trace["volatility"]?

[Image: Screenshot from 2021-05-05 18-57-33]

May 05 '21 13:05 chiral-carbon

update: I figured out how to access the xarray variables.

I think using trace.posterior.data_vars['volatility'] works, but then in cell 12 if I run y_vals = np.exp(trace.posterior.data_vars['volatility'])[::5].T instead of y_vals = np.exp(trace["volatility"])[::5].T I encounter an error again, so I am still a little unclear with how to work around xarray. could use some help here.

May 05 '21 14:05 chiral-carbon

Hi, sorry about that, the documentation on InferenceData is still in very active development and quite scattered.

> I think using trace.posterior.data_vars['volatility'] works

I'd recommend using trace.posterior["volatility"] directly, which should return the same result. I think going over https://docs.pymc.io/notebooks/multilevel_modeling.html will help, as you'll see InferenceData in action. Then you need to use the fact that InferenceData groups (i.e. idata.posterior, idata.sample_stats...) are xarray Datasets.
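A minimal sketch of the two access styles, assuming a hand-built xarray Dataset as a stand-in for trace.posterior (the shapes and random values are made up for illustration, and only numpy and xarray are needed to run it):

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for `trace.posterior`: 2 chains, 10 draws, 4 time points.
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {
        "volatility": (
            ("chain", "draw", "volatility_dim_0"),
            rng.normal(size=(2, 10, 4)),
        )
    },
    coords={
        "chain": np.arange(2),
        "draw": np.arange(10),
        "volatility_dim_0": np.arange(4),
    },
)

# Both expressions return the same DataArray; the second is shorter.
via_data_vars = posterior.data_vars["volatility"]
via_getitem = posterior["volatility"]

same = via_data_vars.identical(via_getitem)
dims = via_getitem.dims  # chain and draw stay separate, they are not flattened
```

Because the result keeps separate chain and draw dimensions, positional tricks such as [::5].T that worked on a flattened array no longer mean the same thing.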

> y_vals = np.exp(trace.posterior.data_vars['volatility'])[::5].T

This touches deeper and more important changes. InferenceData uses label-based indexing, not positional indexing, and in addition it doesn't flatten the chain and draw dimensions but keeps them separate. You should give a name to volatility_dim_0 (as per the first point in the ArviZ section of https://github.com/pymc-devs/pymc-examples/wiki/Notebook-updates-overview), and then you'll be able to do this ::5 subsetting to get one out of five points as idata.posterior["volatility"].sel(dim_name=slice(None, None, 5)) if using integer coordinate values (otherwise use isel). Note that Python's slice takes positional arguments only. I have a blogpost on arviz-pymc interaction and about cool things one can do with named dims and coords: https://oriolabril.github.io/oriol_unraveled/python/arviz/pymc3/xarray/2020/09/22/pymc3-arviz.html. You may also want to combine chain and draw dims as shown in https://arviz-devs.github.io/arviz/getting_started/WorkingWithInferenceData.html#combine-chains-and-draws
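As a rough sketch of the thinning and stacking described above: the DataArray below is synthetic, and the dimension names "time" and "sample" are hypothetical choices for illustration, not names taken from the notebook.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for idata.posterior["volatility"] with a named time dim.
rng = np.random.default_rng(0)
vol = xr.DataArray(
    rng.normal(size=(4, 500, 30)),
    dims=("chain", "draw", "time"),
    coords={
        "chain": np.arange(4),
        "draw": np.arange(500),
        "time": np.arange(30),
    },
    name="volatility",
)

# Keep one draw out of five within each chain, selecting by position.
thinned = vol.isel(draw=slice(None, None, 5))

# Combine chain and draw into a single dimension, then thin by position.
# The stacked dimension is appended last, so the result is (time, sample).
stacked = vol.stack(sample=("chain", "draw"))
y_vals = np.exp(stacked.isel(sample=slice(None, None, 5)))
```

Here y_vals plays the role of the array plotted in the notebook: one row per time point and one column per retained posterior sample.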

May 05 '21 16:05 OriolAbril

this really helps, thanks a lot!

May 05 '21 16:05 chiral-carbon

I'm not sure how to choose one in five points using sel. What I have now is y = trace.posterior.rename_dims({'volatility_dim_0':'vol'}).stack(pooled_chain=("chain","draw"))['volatility'], which has shape (2905, 8000). If I want one in 5 points, the resulting shape for y_vals should be (2905, 1600); however, I'm not able to get that if I apply sel(vol=slice(5)) on y, and applying sel(pooled_chain=slice(5)) on y throws an error.

May 06 '21 23:05 chiral-carbon

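I think you'll want to do .isel(pooled_chain=slice(None, None, 5)). After stacking, the labels are tuples, so we'll want to use positional indexing. Note also that slice(None, None, 5) (every fifth element, like [::5]) is completely different from slice(5), which is equivalent to slice(None, 5) (the first five elements, like [:5]); see https://docs.python.org/3.9/library/functions.html#slice. The built-in slice does not accept keyword arguments, so the step has to be passed positionally.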
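A small plain-Python sketch of the difference, using a made-up list of draw indices. Note that the built-in slice only takes positional arguments, so a step of 5 is written slice(None, None, 5):

```python
draws = list(range(20))

# slice(5) sets only the stop: the first five elements, same as draws[:5].
first_five = draws[slice(5)]

# slice(None, None, 5) sets only the step: every fifth element, same as draws[::5].
every_fifth = draws[slice(None, None, 5)]

# Keyword arguments are rejected by the built-in slice.
try:
    slice(step=5)
    keyword_form_works = True
except TypeError:
    keyword_form_works = False
```

The same slice objects can be passed to xarray's isel, which is why the positional form matters there as well.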

Now that you are already discussing specific changes, can you open a PR? Even if the code doesn't run, it will be easier to discuss the changes and go over the feedback. We'll all see the cell and have the comments attached there, so there will be no need to share cell numbers or screenshots. We'll also have better context on what exactly we are slicing, without needing to open the notebook in a different window/tab.

May 07 '21 00:05 OriolAbril

right, thanks! very silly of me to mistake how to use slice in python, I assumed it was some different functionality when used in xarray, don't know why! Yup, will open a PR. That seems more logical than discussing like this.

May 07 '21 08:05 chiral-carbon

> very silly of me to mistake how to use slice in python, I assumed it was some different functionality when used in xarray, don't know why!

No need to apologize! We all make mistakes :), and I think all of us were somewhere between very confused and completely lost when starting with xarray. We even created https://github.com/arviz-devs/xarray_examples to help ourselves with specific xarray questions.

May 07 '21 08:05 OriolAbril

looks like a very useful tracker/resource to me :smile:

May 07 '21 09:05 chiral-carbon
