pymc-examples Reinforcement Learning Notebook

Closes https://github.com/pymc-devs/pymc-examples/issues/272

[x] Notebook follows style guide https://docs.pymc.io/en/latest/contributing/jupyter_style.html
[x] PR description contains a link to the relevant issue: a tracker one for existing notebooks or a proposal one for new notebooks
[x] Check the notebook is not excluded from any pre-commit check: https://github.com/pymc-devs/pymc-examples/blob/main/.pre-commit-config.yaml

Helpful links

https://github.com/pymc-devs/pymc-examples/blob/main/CONTRIBUTING.md

Aug 04 '22 13:08 juanitorduz

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Aug 04 '22 13:08 review-notebook-app[bot]

Ok! I think the style is in good shape for a first review iteration (thanks for the comments!) in case someone wants to add or suggest more comments on the content. I have done no changes (besides style).

Remark: Seems the title is a bit long? In the gallery id does not render nicely. Maybe is better if we remove the "two action" from the title?

Aug 04 '22 19:08 juanitorduz

Remark: Seems the title is a bit long? In the gallery id does not render nicely. Maybe is better if we remove the "two action" from the title?

Agree

Aug 04 '22 20:08 ricardoV94

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-05T22:21:04Z ----------------------------------------------------------------

We could extend this bonus section and give it a less lazy name. One reason why it's useful to use the bernoulli likelihood is that one can then do prior and posterior predictive sampling as well as model comparison. Even if we don't show it, it's useful to mention it.

juanitorduz commented on 2022-08-17T20:53:23Z ----------------------------------------------------------------

Following suggestion I added some model comparison and a posterior predictive check for the Bernoulli model, see https://github.com/pymc-devs/pymc-examples/pull/410/commits/7d9dd5681a33096cae8dac05f587b2e5305c8c12

juanitorduz commented on 2022-08-18T07:23:17Z ----------------------------------------------------------------

Better comment added in https://github.com/pymc-devs/pymc-examples/pull/410/commits/e9035c7611e59cd831e5a70255251dbb3f2f28d0

Aug 05 '22 22:08 review-notebook-app[bot]

Following suggestion I added some model comparison and a posterior predictive check for the Bernoulli model, see https://github.com/pymc-devs/pymc-examples/pull/410/commits/7d9dd5681a33096cae8dac05f587b2e5305c8c12

View entire conversation on ReviewNB

Aug 17 '22 20:08 juanitorduz

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-17T21:10:18Z ----------------------------------------------------------------

By model comparison I meant comparison between different models using stuff like LOO approximations. With Potential you cannot do it, because PyMC does not know what is likelihood and what is prior. With a Bernoulli likelihood you can.

Otherwise, here you compared the very same models written slightly differently, which is not very interesting.

Aug 17 '22 21:08 review-notebook-app[bot]

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-17T21:10:19Z ----------------------------------------------------------------

This plot is a bit difficult to read, what are the x/y axis showing?

OriolAbril commented on 2022-08-17T21:25:33Z ----------------------------------------------------------------

I would recommend using https://python.arviz.org/en/latest/api/generated/arviz.plot_separation.html which is designed for binary outcomes.

For completeness, here is what the plot above is showing. The black line is the histogram of the observations. As the dtype is integer, ArviZ will not use bins with width smaller than 1. There are therefore two bins: [0, 1) and [1, 2) that show it is more probable to observe a 1 than a 0. The blue lines represent the same thing, but for a specific chain+draw combination, if the model is explaining the data generating process, the black line should be undistinguishable from these blue ones. The orange one is the histogram but over all chains an draws instead of a single one which can sometimes be informative too.

Aug 17 '22 21:08 review-notebook-app[bot]

I would recommend using https://python.arviz.org/en/latest/api/generated/arviz.plot_separation.html which is designed for binary outcomes.

For completeness, here is what the plot above is showing. The black line is the histogram of the observations. As the dtype is integer, ArviZ will not use bins with width smaller than 1. There are therefore two bins: [0, 1) and [1, 2) that show it is more probable to observe a 1 than a 0. The blue lines represent the same thing, but for a specific chain+draw combination, if the model is explaining the data generating process, the black line should be undistinguishable from these blue ones. The orange one is the histogram but over all chains an draws instead of a single one which can sometimes be informative too.

View entire conversation on ReviewNB

Aug 17 '22 21:08 OriolAbril

Thanks for your comments! I misunderstood the suggestion for this section 🤦 (apologies). I will remove these confusing plots and add @ricardoV94 's comments as part of the last section so that we do not make this notebook unnecessary long. I think this could be a good first iteration 😅

Aug 17 '22 21:08 juanitorduz

Better comment added in https://github.com/pymc-devs/pymc-examples/pull/410/commits/e9035c7611e59cd831e5a70255251dbb3f2f28d0

View entire conversation on ReviewNB

Aug 18 '22 07:08 juanitorduz

View / edit / reply to this conversation on ReviewNB

ricardoV94 commented on 2022-08-18T07:44:08Z ----------------------------------------------------------------

Suggestion:

"With pm.Potential you cannot do it, because PyMC does not know what is likelihood and what is prior... NOR HOW TO GENERATE RANDOM DRAWS. NEITHER OF THIS IS A PROBLEM WHEN USING PM.BERNOULLI"

juanitorduz commented on 2022-08-18T08:56:33Z ----------------------------------------------------------------

Agree! Thanks! But probably not with the UPPER CASE right? Seems a bit to aggressive XD.

ricardoV94 commented on 2022-08-18T09:03:14Z ----------------------------------------------------------------

Hehe maybe not (couldn't think of a better way to emphasize where the suggestion connected with the existing text)

juanitorduz commented on 2022-08-18T09:15:43Z ----------------------------------------------------------------

added in https://github.com/pymc-devs/pymc-examples/pull/410/commits/f72af486f44452e877e835caa5965a480f4affba

Aug 18 '22 07:08 review-notebook-app[bot]

Agree! Thanks! But probably not with the UPPER CASE right? Seems a bit to aggressive XD.

View entire conversation on ReviewNB

Aug 18 '22 08:08 juanitorduz

Hehe maybe not (couldn't think of a better way to emphasize where the suggestion connected with the existing text)

View entire conversation on ReviewNB

Aug 18 '22 09:08 ricardoV94

added in https://github.com/pymc-devs/pymc-examples/pull/410/commits/f72af486f44452e877e835caa5965a480f4affba

View entire conversation on ReviewNB

Aug 18 '22 09:08 juanitorduz

pymc-examples pymc-examples copied to clipboard

Reinforcement Learning Notebook

Helpful links

pymc-examples
pymc-examples copied to clipboard