Broad checklist of docs bits to work on
The documentation, in particular the documentation in Turing.jl itself, is in dire need of an update given the number of features and improvements we've made over the past year. In particular, the tutorials have lots and lots of room for improvement.
A few things that come to mind immediately are the following.
User-facing side:
- [x] `Turing.predict` for predicting based on a given `chain`.
- [x] `DynamicPPL.generated_quantities`, similar to Stan's generated quantities block, which allows you to, effectively, capture the return values of the model (i.e. the stuff in `return ...`) conditioned on a `chain`.
- [ ] `condition` and `decondition`. There are now two ways to indicate whether a variable is to be considered an observation: passing the variable as an argument (the "old" way), or using `condition`/`|` (the "new" way). The latter is, arguably, more intuitive, in addition to being much easier to work with programmatically (see the sketch after this list).
- [x] `@submodel`. A macro that allows you to use models within models. Makes it very easy to write modular models.
- [x] `logprior`, `loglikelihood`, and `logjoint`. Easy-to-use methods for evaluating the model in different ways.
- [x] `fix` and `condition`
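As a quick illustration of the `condition`/`|` and log-density evaluators above, here is a minimal sketch (not part of the original checklist; it assumes the NamedTuple methods of `logprior`/`loglikelihood`/`logjoint` provided by DynamicPPL and re-exported by Turing):

```julia
using Turing, DynamicPPL

@model function demo()
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
end

# The "new" way: condition on x programmatically instead of passing it as an argument.
conditioned = demo() | (x = 1.5,)

# Evaluate the model at a particular parameter value.
θ = (μ = 0.3,)
logprior(conditioned, θ)       # log p(μ = 0.3)
loglikelihood(conditioned, θ)  # log p(x = 1.5 | μ = 0.3)
logjoint(conditioned, θ)       # sum of the two
```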
Developer-side:
- [ ] Implementation of the LogDensityProblems.jl interface for a `@model` (see the sketch after this list).
- [ ] `DynamicPPL.TestUtils`. This is a sub-module of DynamicPPL that can be quite useful if one is developing features for Turing.
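A rough sketch of what the LogDensityProblems.jl interface looks like for a model, assuming `DynamicPPL.LogDensityFunction` (exact names and behaviour may differ between versions):

```julia
using Turing, DynamicPPL, LogDensityProblems

@model function demo()
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
end

f = DynamicPPL.LogDensityFunction(demo() | (x = 1.5,))
LogDensityProblems.dimension(f)          # number of (possibly transformed) parameters, here 1
LogDensityProblems.logdensity(f, [0.3])  # log joint evaluated at μ = 0.3
```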
We will add more to the list as we go on, but for now this is a good starting point.
Hi there. I'm really interested in helping to improve the documentation. I pretty much reach for Turing as my first (and usually only) choice for Bayesian inference, and I've developed a bunch of tutorials and things that use Turing which are sort of semi-publicly available. I even occasionally get some time in which I could imagine doing the work :sweat_smile:
In terms of topics I really like the list so far. I can't remember, are there macros that allow you to tell the model to treat something like an observation? Something like:
```julia
foo = myfunction(my_observations)
foo ~ MyDistribution(my_parameter)
```
that will currently (I think) replace foo with draws from MyDistribution rather than conditioning the model on the transformed observation.
> I can't remember, are there macros that allow you to tell the model to treat something like an observation? Something like:
I've probably made some at some point, which is maybe where you've come across it. I'll see if I can dig it out. But that is not "officially" supported so I don't think that should go in the docs for now :confused:
Hi @torfjelde, let me try to get just the basics down first :-) Which repo has the docs for the website turing.ml? It looks like it's the Turing.jl repo, but I want to make sure.
And then, how does that get built into the website? And what's a good workflow for "testing" the docs? If I wanted to write text, or add sections, and then build and check it on my local Linux workstation (Debian) what do I need to install, and how do I make it happen?
Thanks.
The source of the docs is found here: https://github.com/TuringLang/turinglang.github.io. You can find instructions on how to get it up and running locally in the README :) Let me know if something isn't clear!
And then the library docs, i.e. everything under https://turinglang.org/library/, are found in the corresponding package.
Haven't forgotten this project! Though I'm not excited about installing Jekyll and such on my desktop machine, I'm still up for working on some of the documentation. I've been working on writing some other stuff, and just got to the point where I wanted to USE TURING again, so now it's fresh in my mind.
Some other thoughts: there's not an easy way to get from the Turing documentation website to a place where you can find out all the different sampler algorithms that are available, what their constructors are, and a little about how they work.
I've got a problem which isn't playing nice with autodiff, so I decided to try samplers that don't require derivatives, and it was frustrating to try to pick an algorithm and figure out what arguments were required. For example, MH() is basically never going to work in practice because proposing from the prior is very rarely going to give a good proposal. What you want is diffusive (random-walk) MH, which I guess at some point was called RWMH() but that no longer exists? Anyway, that whole ball of wax could use some attention.
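For what it's worth, something along these lines might be what I was after, though I'm not sure the proposal-passing API is exactly right; treat this as a hedged, version-dependent sketch:

```julia
using Turing, AdvancedMH

@model function demo(y)
    μ ~ Normal(0, 10)
    y ~ Normal(μ, 1)
end

# Gaussian random-walk proposal with step size 0.1 for μ, rather than proposing from the prior.
spl = MH(:μ => AdvancedMH.RandomWalkProposal(Normal(0, 0.1)))
chain = sample(demo(2.3), spl, 2_000)
```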
Some other thoughts I've had:
- How do you get a list of model variables? How can you determine if a variable is discrete or continuous?
- How does @addlogprob! interact with models that use condition/decondition? I guess we could use priors for things that are intended to be data, but then condition on them having a particular value? This is actually a bigger ball of wax than just addlogprob!
- How does someone write a new sampler?
- Is there a system for Tempering? I'd like to be able to run in parallel two separate chains for two different but related models, and have them occasionally try to swap states between them.
- We need more examples of how to work with MCMCChains objects: extracting certain sub-variables, extracting only certain samples. Is there a way to sample randomly from a chain to get a single "row", like sample(mychain, 1)? (See the sketch after this list.)
- In the tutorials, pointers to good diagnostic plots and diagnostic stats packages, etc. For example maybe ArviZ, or something else, with some simple examples of how to use them.
- Possibly break up the documentation into sections relevant to the four types of documentation as described here: https://www.writethedocs.org/videos/eu/2017/the-four-kinds-of-documentation-and-why-you-need-to-understand-what-they-are-daniele-procida/ Right now we have along the top of the website: "Get Started", "Library API", "Tutorials". Then if you click through, the main headings are "Using Turing", "For Developers", "Tutorials", and "Contributing".
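Regarding the MCMCChains point above, here's a hedged sketch of the kinds of operations I mean (the parameter names and the stand-in chain are placeholders, and I'm not certain every call is supported in the current version):

```julia
using MCMCChains, StatsBase

# Stand-in chain: 1000 iterations, 2 parameters, 1 chain (normally this comes from `sample`).
chain = Chains(randn(1000, 2, 1), [:μ, :σ])

chain[:μ]              # values of one parameter across iterations and chains
chain[[:μ, :σ]]        # sub-chain containing only some parameters
chain[201:1000, :, :]  # keep iterations 201 to 1000, i.e. drop burn-in
Array(chain)           # raw draws as an (iterations*chains) × parameters matrix
sample(chain, 1)       # a single randomly chosen "row", if this method is supported
```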
I'd like to see...
Get Started (a goal-oriented how-to guide to installing Turing, writing a simple model, sampling from the model, plotting results from the sample)
Learning-oriented tutorials: a big list of examples; we've pretty much got this.
Understanding-oriented discussion: discuss how Turing relates to some of its related packages (DynamicPPL, AdvancedHMC, AdvancedMH, MCMCChains, Distributions) and what functionality comes from which pieces of this puzzle. How to extend things, like writing your own sampler, writing your own specialized Distribution, writing your own specialized diagnostics for Chains, writing plot routines that take a chain... Also, what is a Turing model? What "fields" does it have, and what could you legitimately do with them (for example, suppose you wanted to write a "pretty printer" for a model)?
I realize this is more than just "let's shove some new material into the current documentation structure" but I do think it's what's needed to make hacking with Turing itself more accessible outside the core developer team. If I knew more about this stuff for example I probably would be experimenting with new sampling methods, such as tempering, and piecewise deterministic processes.
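On the "what is a Turing model / what fields does it have" question above, here is roughly what I've gleaned by poking around; these field names are DynamicPPL internals, so this is a sketch that may go stale:

```julia
using Turing, DynamicPPL

@model function demo(y)
    μ ~ Normal(0, 1)
    y ~ Normal(μ, 1)
end

m = demo(1.5)
m.f               # the evaluator function generated by @model
m.args            # NamedTuple of the arguments the model was constructed with, here (y = 1.5,)
m.context         # the evaluation context the model will run with
keys(VarInfo(m))  # the random variables (VarNames) appearing in one execution, here just μ
```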
Copying from @BenjaminJCox's excellent comment here https://github.com/TuringLang/docs/issues/512#issue-2485521285:
(Note by me: much of this is outdated now, and some will become more outdated in the future too; for example, I fully expect that DynamicPPL.initialstep will be removed.)
As requested on the Slack I am posting my perspective as to how the docs could be improved.
The below is not complete, but these are the issues that I have come across when trying to implement an adaptive multiple importance sampler along the lines of https://arxiv.org/abs/0907.1254
All of the below comes from the perspective of a guy who designs samplers and parameter inference methods, so some things are probably obvious, but perhaps named or laid out differently than I expect, as I am educated in statistics only.
Overview of potential improvements to documentation for DynamicPPL (and associated):
- Document expected return values of functions, e.g. what do DynamicPPL.assume and DynamicPPL.observe return, what do AbstractMCMC.step and DynamicPPL.initialstep return? How are these used in the sampling loop?
AbstractMCMC.step is quite well documented now. initialstep, assume and observe are gone, so what remains is to
- [ ] document `tilde_assume!!` and `tilde_observe!!`. I think this is probably best done by demonstrating with a new context, since these methods are mostly tied to contexts rather than varinfos.
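For reference, the shape of the AbstractMCMC.step interface is roughly the following; this is a minimal sketch with made-up sampler and state types, not a real sampler:

```julia
using AbstractMCMC, Random

struct MySampler <: AbstractMCMC.AbstractSampler end

struct MyState
    θ::Vector{Float64}
end

# Initial step: no previous state is passed in. Returns (sample, state).
function AbstractMCMC.step(rng::Random.AbstractRNG, model::AbstractMCMC.AbstractModel, ::MySampler; kwargs...)
    θ = randn(rng, 2)
    return θ, MyState(θ)
end

# Subsequent steps: receive the state from the previous iteration.
function AbstractMCMC.step(rng::Random.AbstractRNG, model::AbstractMCMC.AbstractModel, ::MySampler, state::MyState; kwargs...)
    θ = state.θ .+ 0.1 .* randn(rng, length(state.θ))
    return θ, MyState(θ)
end
```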
- What does DynamicPPL.updategid! do? It is used in both the MH and HMC implementations yet it is completely unexplained what a gid is.
gids are gone now.
- I believe that https://turinglang.org/v0.24/docs/for-developers/interface uses an outdated version of the AbstractMCMC API, as AbstractMCMC.step! seems to be deprecated in favour of AbstractMCMC.step, which has a different return structure.
This page has been removed from the docs.
- I do not see any way by which multiple samples can be saved per iteration. Whilst this is never really done for MCMC, it is very common for importance samplers and related methods. Indication as to whether this is possible within the existing Turing framework would be useful.
This is still an open question but I think the answer is pretty much no, it's not possible with the existing interface. You will have to do something outside of AbstractMCMC (i.e. using a callback to push to a global object) to get this to work.
- https://turinglang.org/v0.24/docs/for-developers/interface seems to be entirely outdated, and is also the only resource for implementing a sampler within Turing.
This page is gone now.
- A flowchart of what happens at each sampling step with required inputs and outputs would be invaluable.
I think step_warmup could be better documented, but the sampler documentation is now in a better state.
- [ ] The external sampler docs could do with a more direct statement of what the necessary interface is.
- [ ] We could have a page on implementing a sampler directly for `DynamicPPL.Model` (the same as how the ones in Turing work).
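As a placeholder until such a page exists, here is a hedged sketch of running an external AbstractMCMC-compatible sampler (AdvancedMH here) against a Turing model via `externalsampler`; the exact constructor arguments may vary between versions:

```julia
using Turing, AdvancedMH, Distributions, LinearAlgebra

@model function demo(y)
    μ ~ Normal(0, 10)
    y ~ Normal(μ, 1)
end

# A random-walk Metropolis-Hastings sampler from AdvancedMH, wrapped for use with Turing.
rwmh = AdvancedMH.RWMH(MvNormal(zeros(1), 0.01 * I))
chain = sample(demo(2.3), externalsampler(rwmh), 2_000)
```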
- It is unclear how to access conditional posterior likelihoods (e.g. sample the mean and variance separately, as these may require different sampler designs). It seems to be done via DynamicPPL.condition and DynamicPPL.decondition, but the examples for these are bizarre and not representative of real-world usage.
- [ ] Page on conditioning and fixing
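My rough understanding of the distinction that page would need to spell out; this is a sketch, and the NamedTuple forms below are what I believe the API accepts:

```julia
using Turing, DynamicPPL

@model function demo()
    σ ~ Exponential(1)
    μ ~ Normal(0, σ)
    x ~ Normal(μ, σ)
end

m = demo()
conditioned = m | (x = 1.5,)     # x becomes an observation and contributes to the likelihood
fixed = fix(m, (σ = 1.0,))       # σ is pinned to a value and no longer contributes to the log density
back = decondition(conditioned)  # undo the conditioning; x is random again
```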
- I think that having a standard example model would be extremely useful, and I believe that this model should take in a dataset as an argument as this is how it would be in usage.
I think there are enough docs examples of this.
- It is not clear how to take gradients of the posterior likelihood within DynamicPPL. It may be as simple as calling gradient on DynamicPPL.logjoint or DynamicPPL.getlogp, but it is possible that there may be issues with this. A small tutorial could ease this.
- [ ] LogDensityFunction page
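A rough sketch of the kind of example that page could contain, assuming LogDensityProblemsAD and ForwardDiff for the gradient (the exact AD wiring may differ):

```julia
using Turing, DynamicPPL, LogDensityProblems, LogDensityProblemsAD, ForwardDiff

@model function demo(y)
    μ ~ Normal(0, 1)
    y ~ Normal(μ, 1)
end

f = DynamicPPL.LogDensityFunction(demo(1.5))
∇f = LogDensityProblemsAD.ADgradient(:ForwardDiff, f)
LogDensityProblems.logdensity_and_gradient(∇f, [0.3])  # (log joint, gradient) at μ = 0.3
```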
- There is no documentation (or at least I cannot find any) for DynamicPPL.assume or DynamicPPL.observe or derived functions, but these are extremely important to implementing a sampler.
Same as above, needs a page on contexts.
- I cannot find documentation for VarInfo or how it is used, although it seems to be used in all samplers.
This has been written now
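For completeness, a minimal sketch of VarInfo usage as I understand it (these are internals, so subject to change):

```julia
using Turing, DynamicPPL

@model function demo(y)
    μ ~ Normal(0, 1)
    y ~ Normal(μ, 1)
end

vi = DynamicPPL.VarInfo(demo(1.5))  # runs the model once, recording variables and log density
keys(vi)                            # the VarNames, here just μ
vi[@varname(μ)]                     # the sampled value of μ from that evaluation
DynamicPPL.getlogp(vi)              # the accumulated log density
```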
- I believe that the importance sampler is considered the example of implementing a sampler using the Turing api. I believe that it could do with extensive documentation as to what everything does, as the tutorial that references it is out of date. Also I think that the use of push!! is legacy and should be updated, but I am unsure.
IS is probably not the best example of a sampler. RWMH or basic HMC would be more meaningful IMO.
- https://turinglang.org/v0.24/docs/for-developers/how_turing_implements_abstractmcmc is no longer correct, as nearly all of the functions called therein have been updated within DynamicPPL. This tutorial represents a valuable resource for potential contributors, and I believe that it could be updated to 'tutorialise' the entire implemented importance sampler over a few hours by someone who knows what they are doing.
This one is gone.
Suggestion:
Denote by θ the model parameters and by x the data. Most samplers can be implemented using (a subset of) the following:
- p(θ|x), the posterior likelihood
- p(θ), the parameter prior
- p(x|θ), the data likelihood
- A way to evaluate the above at arbitrary parameter values
- A way to evaluate conditionals of the above for subsets of the sampled variables (e.g. let θ = [μ, σ], get p(μ|σ, x))
- First- and second-order derivatives of the posterior likelihood
- Transforming the parameter space to an unconstrained space
- A place to store samples from each iteration (potentially multiple samples per iteration)
- A place to store weights and other (meta)data associated with samples at each iteration
- A way to accumulate probabilities from each iteration
To this end I think that it would be invaluable to have a tutorial that implements a very basic HMC algorithm (literally just a textbook method) within Turing (i.e. using DynamicPPL and AbstractMCMC), as it will cover the majority of these in a way that allows extension.
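For concreteness, the core of such a textbook HMC is just the leapfrog integrator; a hedged sketch, assuming `f` implements the LogDensityProblems interface with gradients (e.g. a `DynamicPPL.LogDensityFunction` wrapped in an AD gradient as sketched above):

```julia
using LogDensityProblems

# One leapfrog step for position θ and momentum p with step size ϵ.
function leapfrog(f, θ, p, ϵ)
    _, g = LogDensityProblems.logdensity_and_gradient(f, θ)
    p = p .+ (ϵ / 2) .* g   # half step for momentum
    θ = θ .+ ϵ .* p         # full step for position
    _, g = LogDensityProblems.logdensity_and_gradient(f, θ)
    p = p .+ (ϵ / 2) .* g   # second half step for momentum
    return θ, p
end
```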
Addendum:
I believe that with some tweaking the Turing ecosystem has the potential to be an excellent tool for prototyping inference algorithms on complex models; however, it is currently rather opaque how to implement samplers. Furthermore, it seems to be built around the implicit assumptions of MCMC, two obvious ones being one sample per iteration and equal weighting of samples. Whilst these assumptions can be circumvented (e.g. by writing your method using only the DynamicPPL modelling interface), it would be useful to have native interop, although I appreciate this is a big ask.
I understand that improving this API is secondary to improving the user-facing part of Turing, as many more people use Turing without implementing their own samplers. However, I believe that improving the documentation will attract more contributors and allow the use of the Turing ecosystem in researching sampler design in addition to statistical studies.
I am of course happy to help to the best of my ability, but I do not understand the design of DynamicPPL or AbstractMCMC well enough that I think I could do this myself.
(apologies for the formatting, I copied this over from notepad)