
Request: FAQ section

Open torfjelde opened this issue 1 year ago • 3 comments

IMO we should have an FAQ section for typical issues people face, so that we can easily point them to this resource rather than repeating answers constantly. This matters in particular because we're increasingly getting new users who are not familiar with Julia itself, and so they might not know "trivial" details like "AD comes in a separate package", "Distributions are from Distributions.jl", etc.

Here's a list of a few questions I have in mind:

  1. [ ] "predict isn't working with my missing values"
    • E.g. https://discourse.julialang.org/t/turing-posterior-prediction/111072
    • Typically it's a matter of using fill(missing, n) instead of missing when y ~ MvNormal(...) is in the model, or something like that.
  2. [ ] "Why is my model slow?"
    • We should of course just update the performance section quite drastically, but for the time being it would be useful to have a few pointers + tell people to use TuringBenchmarking.jl to benchmark different approaches.
    • There are two aspects to "slow sampling": computational and sampling complexity.
    • Computational:
      1. ForwardDiff.jl is good for low-dimensional models, e.g. dimension < 100.
      2. ReverseDiff.jl with compilation is recommended for higher-dimensional models, e.g. dimension >= 100.
      3. Zygote.jl is usually quite slow and can take ages to compile (link to my issue on Zygote.jl compilation times).
      4. Use TuringBenchmarking.jl to benchmark different models.
      5. Avoid indexing, e.g. replace y[i] ~ Normal(...) with y ~ MvNormal(...). NOTE: y is now treated differently and so it might have undesirable effects, see FAQ on predict (the one above).
    • Sampling complexity (probably don't want to say too much about this here though; too broad of a topic):
      1. First thing to try: NUTS.
      2. Try using Gibbs.
  3. [x] "How do I use a custom distribution in Turing.jl?"
    • Point them to the existing docs we have (which I believe need to be updated).
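
As a rough illustration of the AD-backend pointers in item 2, here is a minimal sketch of how the backend can be switched on a sampler. It assumes a reasonably recent Turing.jl release in which samplers accept an `adtype` keyword and `AutoForwardDiff` / `AutoReverseDiff` are exported; the model `demo` and its priors are hypothetical and only there to make the snippet self-contained.

```julia
using Turing
using LinearAlgebra   # for the identity `I` used in the covariance
using ReverseDiff     # has to be loaded before AutoReverseDiff can be used

# Hypothetical toy model, used for illustration only.
@model function demo(y)
    μ ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1); lower=0)
    # One multivariate statement instead of a loop over y[i].
    y ~ MvNormal(fill(μ, length(y)), σ^2 * I)
end

model = demo(randn(20))

# Forward mode: usually a good default for low-dimensional models.
chain_fd = sample(model, NUTS(; adtype=AutoForwardDiff()), 500)

# Reverse mode with a compiled tape: usually better for higher-dimensional models.
chain_rd = sample(model, NUTS(; adtype=AutoReverseDiff(; compile=true)), 500)
```

Which backend is actually faster depends on the model, so it is worth timing both rather than relying on the dimension rule of thumb alone.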

There are probably many more; please suggest some:)

torfjelde, Mar 29 '24 13:03

@devmotion @yebai you might have some to add here

torfjelde, Mar 29 '24 13:03

I think it is a great idea; a simple way to start is to introduce a new "tutorial" so we can gradually add more stuff.

yebai, Mar 30 '24 15:03

Unless this has changed, one thing that confused me initially was the inability to unpack model arguments, e.g. defining model(data) and then doing:

    using Turing

    @model function model(data)
        x = data[:, 1]
        y = data[:, 2]
    end

Turing will then no longer recognize x, y as observations. I actually spent many hours trying to debug what I thought was a sampler issue.
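
For comparison, a minimal sketch of the pattern that does treat the data as observed: pass the columns as separate model arguments instead of unpacking them inside the model. The model name `regression` and the priors on `β` and `σ` are hypothetical, and the behaviour described may depend on the Turing.jl version in use.

```julia
using Turing

# Hypothetical model: `x` and `y` are model arguments, so `y[i] ~ ...`
# is an observation statement rather than a new random variable.
@model function regression(x, y)
    β ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1); lower=0)
    for i in eachindex(y)
        y[i] ~ Normal(β * x[i], σ)
    end
end

data = randn(50, 2)

# Unpack the columns outside the model and pass them in explicitly.
model = regression(data[:, 1], data[:, 2])
chain = sample(model, NUTS(), 200)
```

If `y` were being sampled rather than observed, entries such as `y[1]` would show up as parameters in `chain`; here only `β` and `σ` should appear.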

JasonPekos, Apr 06 '24 03:04

Another one I'm running into that could probably use a convenient FAQ response is "How do I interpret the summary statistics?", e.g. mcse, ess_bulk, ess_tail, rhat and so on. As someone with a maths background that is NOT in stats/probability, having that additional context is useful.

Upon some initial search, some people have done a bit of digging already:

  • https://discourse.julialang.org/t/what-is-the-interpretation-of-turings-std-naive-se-mcse/52252
  • https://stats.stackexchange.com/questions/348984/stan-hatr-versus-gelman-rubin-hatr-definition

In an ideal scenario I'll probably dig through everything and then draft a summary to submit as a pull request, but it's likely I'll forget, hence my writing it here as a suggestion.

matthras, Aug 13 '25 01:08