MCMCChains.jl

Rewriting MCMCChains with Makie.jl+AlgebraOfGraphics

Open ParadaCarleton opened this issue 3 years ago • 15 comments

I've been discussing this with a couple of people, and I think you could get MCMCChains to feature parity with Bayesplot within a few months if you reimplemented the plotting code using Makie.jl+AlgebraOfGraphics. Makie's recipe system is almost the same as StatsPlots', so you could reuse most of the current code pretty easily. AlgebraOfGraphics is an extension for Makie that adds ggplot2-style syntax, so my thought is that you could also port most of Bayesplot's code, which relies on ggplot2. The bigger bonus is that feature parity would then be very easy to maintain, since any plot added to Bayesplot could quickly be added to Turing. In addition, AlgebraOfGraphics gives users a lot of tools that make it easier to modify plots themselves. I think a package that can do these things would be extremely helpful in getting people to switch from R, since even ArviZ tends to lag a bit behind Bayesplot in terms of what it can do.

I'd be interested in helping with this project if there's any interest in it.

ParadaCarleton avatar Jun 06 '21 15:06 ParadaCarleton

I think it would be great if there would be support for Makie for Chains. However, as long as there is no lightweight recipe support for Makie, similar to RecipesBase for Plots.jl, it would be better to have a separate package for Makie support, similar to https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl that adds Makie support to https://github.com/JuliaGaussianProcesses/AbstractGPs.jl (which already natively supports Plots.jl through RecipesBase). The main reason would be that plotting functionality is not the main focus of MCMCChains (yet) and that we want to avoid the really heavy Makie.jl dependency.

In general, I think an implementation should not be based on AlgebraOfGraphics but on regular Makie and only define @recipes (if needed) and the convert_arguments pipeline for Chains. While AlgebraOfGraphics is nice, it seems limiting if Makie users have to use it if they want to plot a Chains object. And all regular Makie recipes and plotting functionality are supported automatically by AlgebraOfGraphics, so you could still use the ggplot-like syntax if you want to.

devmotion avatar Jun 06 '21 16:06 devmotion

> I think it would be great if there would be support for Makie for Chains. However, as long as there is no lightweight recipe support for Makie, similar to RecipesBase for Plots.jl, it would be better to have a separate package for Makie support, similar to https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl that adds Makie support to https://github.com/JuliaGaussianProcesses/AbstractGPs.jl (which already natively supports Plots.jl through RecipesBase). The main reason would be that plotting functionality is not the main focus of MCMCChains (yet) and that we want to avoid the really heavy Makie.jl dependency.

MakieCore is the equivalent of RecipesBase for Makie, but I don't see any problems with splitting off the Makie functionality.

> In general, I think an implementation should not be based on AlgebraOfGraphics but on regular Makie and only define @recipes (if needed) and the convert_arguments pipeline for Chains. While AlgebraOfGraphics is nice, it seems limiting if Makie users have to use it if they want to plot a Chains object. And all regular Makie recipes and plotting functionality are supported automatically by AlgebraOfGraphics, so you could still use the ggplot-like syntax if you want to.

I'm not sure I understand. You're right that you can mix any Makie recipes you'd like with AlgebraOfGraphics, so there's not really much cost involved with installing it. On the other hand, there's a huge upside in terms of time saved by using ggplot syntax that would let you carry over a lot of code from Bayesplot.

ParadaCarleton avatar Jun 10 '21 20:06 ParadaCarleton

> MakieCore is the equivalent of RecipesBase for Makie, but I don't see any problems with splitting off the Makie functionality.

It's not usable yet (requires https://github.com/JuliaPlots/Makie.jl/pull/998), and it will only work for truly lightweight recipes (see https://github.com/JuliaPlots/Makie.jl/issues/996). So it is still unclear when and how an equivalent/similar approach to RecipesBase would be available.

> On the other hand, there's a huge upside in terms of time saved by using ggplot syntax that would let you carry over a lot of code from Bayesplot.

It would be strange to require users to commit to AlgebraOfGraphics and the ggplot-like syntax if you can implement things as generic Makie recipes that can be used with both standard Makie and AlgebraOfGraphics. I don't think this should be guided by similarities with bayesplot.

devmotion avatar Jun 10 '21 20:06 devmotion

Maybe a section in the documentation, like the one for Gadfly, would be a good compromise? Like Gadfly, AlgebraOfGraphics is very high level, so there is less need for special glue code between Makie and MCMCChains.

For example, based on the MCMCChains docs:

using AlgebraOfGraphics
using DataFrames
using CategoricalArrays
using MCMCChains
using CairoMakie
using Random

n_iter = 400
n_name = 3
n_chain = 2

val = randn(n_iter, n_name, n_chain) .+ [1, 2, 3]'
val = hcat(val, rand(1:2, n_iter, 1, n_chain))

chn = Chains(randn(100, 2, 3), [:A, :B])
df = DataFrame(chn)
df[!, :chain] = categorical(df.chain)

layers = data(df) * mapping(:A; color=:chain) * AlgebraOfGraphics.density()
axis = (; ylabel="Density")
AlgebraOfGraphics.draw(layers; axis)

image

EDIT: And for the equivalent of MCMCChains.plot(chn):

using Makie

chn = Chains(val, [:A, :B, :C, :D])
df = DataFrame(chn)
df[!, :chain] = categorical(df.chain)
sdf = stack(df, names(chn), variable_name=:parameter)

layer = data(sdf) * mapping(:value; color=:chain, row=:parameter)
scat = layer * visual(Lines)
dens = layer * AlgebraOfGraphics.density()

fig = Figure(; resolution=(800, 600))
axis = (xlabel="Iteration", ylabel="Sample value")
draw!(fig[1, 1], scat; axis)
axis = (xlabel="Sample value", ylabel="Density")
draw!(fig[1, 2], dens; axis)

image

(Note that the axis ranges are linked though, which isn't the case for MCMCChains.plot.)

rikhuijzer avatar Jun 14 '21 13:06 rikhuijzer

I kinda love it and would be happy to see this in the docs. I think also a glue-on package that handles some common interactions between MCMCChains and Makie/AlgebraOfGraphics would be warranted, but I am generally opposed at this time to rewriting the internals to use Makie/AlgebraOfGraphics. For now -- though I may revise my opinion later because DAMN those are some good-looking plots.

cpfiffer avatar Jun 14 '21 15:06 cpfiffer

It's nice to see how much functionality is already provided for free by the Tables interface. Probably a separate MCMCChains-Makie package would be helpful to define custom plots or specific conversion rules (e.g., for default plot types).

devmotion avatar Jun 14 '21 15:06 devmotion

> specific conversion rules (e.g., for default plot types).

@devmotion, what do you mean by this exactly? I'm afraid I don't understand.

rikhuijzer avatar Jun 14 '21 15:06 rikhuijzer

Rules like https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl/blob/9f25ba0a563b3dad33667d7f476d80f5d220edf0/src/conversions.jl#L18-L26 (it defines how to plot a specific type with a specific plot type) or https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl/blob/9f25ba0a563b3dad33667d7f476d80f5d220edf0/src/AbstractGPsMakie.jl#L16-L19 (it defines what type of plot is created if you just call plot).
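To illustrate the two kinds of rules, here is a minimal sketch that is not taken from AbstractGPsMakie or MCMCChains. `ParameterTrace` is a hypothetical stand-in type for the draws of one parameter in one chain; only `Makie.convert_arguments` and `Makie.plottype` are Makie's own extension points, and the sketch assumes a recent Makie release loaded through CairoMakie.

```julia
using CairoMakie

# Hypothetical stand-in for the draws of a single parameter in a single chain.
struct ParameterTrace
    iterations::Vector{Int}
    samples::Vector{Float64}
end

# Conversion rule: how to plot a ParameterTrace with the Lines plot type.
# It delegates to Makie's existing conversion for (x, y) vectors.
function Makie.convert_arguments(P::Type{<:Lines}, t::ParameterTrace)
    return Makie.convert_arguments(P, t.iterations, t.samples)
end

# Default plot type: what a bare `plot(trace)` should create.
Makie.plottype(::ParameterTrace) = Lines

trace = ParameterTrace(collect(1:200), randn(200))
fig = plot(trace)  # dispatches to Lines through the two rules above
```

Because both rules are plain Makie methods, a plot defined this way should remain usable from AlgebraOfGraphics as well as from regular Makie code.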

devmotion avatar Jun 14 '21 15:06 devmotion

It shouldn't be necessary to construct a DataFrame. Chains supports the Tables.jl interface and AlgebraOfGraphics can deal with any Tables.jl input.
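A quick way to check this is to query a chain through the generic Tables.jl accessors. This is a minimal sketch; it assumes Tables.jl is installed as a direct dependency, and the column layout it prints depends on the MCMCChains version.

```julia
using MCMCChains
using Tables

chn = Chains(randn(100, 2, 3), [:A, :B])

# Chains advertises itself as a table, so no DataFrame is needed.
println(Tables.istable(chn))

# Column names include bookkeeping columns in addition to the parameters.
println(Tables.columnnames(chn))

# Materialize the chain as a NamedTuple of column vectors.
cols = Tables.columntable(chn)
println(length(cols.A))
```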

devmotion avatar Jun 28 '21 09:06 devmotion

There is a package related to this discussion at https://github.com/theogf/Turkie.jl.

rikhuijzer avatar Jul 08 '21 18:07 rikhuijzer

I wrote some code for MCMCChains + AlgebraOfGraphics a while back. https://github.com/adkabo/BayesPlots.jl/blob/main/src/plots.jl

adkabo avatar Jul 09 '21 18:07 adkabo

> It shouldn't be necessary to construct a DataFrame. Chains supports the Tables.jl interface and AlgebraOfGraphics can deal with any Tables.jl input.

More concretely, the plots above can be generated with the following code without DataFrames:

using AlgebraOfGraphics
using CairoMakie
using MCMCChains

using AlgebraOfGraphics: density

chain = Chains(randn(100, 2, 3), [:A, :B])

plt = data(chain) * mapping(:A; color=:chain => nonnumeric) * density()
draw(plt; axis=(ylabel="density",))

density

using AlgebraOfGraphics
using CairoMakie
using MCMCChains

using AlgebraOfGraphics: density

val = hcat(randn(400, 3, 2), rand(1:2, 400, 1, 2))
val .+= [1 2 3 0]
chain = Chains(val, [:A, :B, :C, :D])

# exclude additional information such as log probability
params = names(chain, :parameters) 
chain_mapping = mapping(params .=> "sample value") *
    mapping(; color=:chain => nonnumeric, row=dims(1) => renamer(params))
plt1 = data(chain) * mapping(:iteration) * chain_mapping * visual(Lines)
plt2 = data(chain) * chain_mapping * density()
fig = Figure(; resolution=(800, 600))
draw!(fig[1, 1], plt1)
draw!(fig[1, 2], plt2; axis=(ylabel="density",))

values

devmotion avatar Jul 09 '21 22:07 devmotion

Looks like @kskyten is doing something with Makie.

Look at: https://github.com/kskyten/BayesPlot.jl

storopoli avatar Jul 27 '21 20:07 storopoli

There's also a plan to have ArviZ's plots in Plots.jl or Makie.jl: https://github.com/arviz-devs/ArviZ.jl/issues/108. My current thinking is that the steps are:

  1. Split diagnostics and statistics from ArviZ into smaller, modular packages
  2. Add atomic recipes for various uncertainty visualizations for uni- and bivariate draws with a consistent interface to StatsPlots and Makie
  3. Create a package full of "plotting data" functions. These are functions that compute some statistics from the inference data, to be used for specific plots, and store them in structs.
  4. Create packages for Makie.jl and/or Plots.jl that simply implement the plotting functions for the structs using the atomic uncertainty recipes.

This is PPL-agnostic. Any PPL can hook into this by overloading the "plotting data" functions for types owned by the PPL. Alternatively, one can define a single converter from the types of the PPL to a common structure, which is the ArviZ.InferenceData type.
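As a rough sketch of the "plotting data" idea in steps 3 and 4, the following uses only Base Julia and the Statistics standard library. All names here (`IntervalPlotData`, `interval_plot_data`, `MyPPLResult`) are hypothetical and do not come from ArviZ.jl or any existing package.

```julia
using Statistics

# Hypothetical struct holding the statistics an interval plot would need.
struct IntervalPlotData
    parameter::Symbol
    center::Float64
    lower::Float64
    upper::Float64
end

# Generic "plotting data" function: computes the statistics from raw draws.
function interval_plot_data(name::Symbol, draws::AbstractVector{<:Real}; prob=0.9)
    tail = (1 - prob) / 2
    lower, upper = quantile(draws, [tail, 1 - tail])
    return IntervalPlotData(name, median(draws), lower, upper)
end

# Hypothetical result type owned by some PPL.
struct MyPPLResult
    draws::Dict{Symbol,Vector{Float64}}
end

# The PPL hooks in by overloading the function for its own type.
function interval_plot_data(result::MyPPLResult; kwargs...)
    return [interval_plot_data(name, draws; kwargs...) for (name, draws) in result.draws]
end

result = MyPPLResult(Dict(:mu => randn(1_000), :sigma => abs.(randn(1_000))))
plot_data = interval_plot_data(result; prob=0.8)
```

A Makie or Plots.jl backend package would then only need a recipe for `IntervalPlotData`, without knowing which PPL produced the draws.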

Another idea is to have something like a Tables interface for collections of MCMC draws produced in a Bayesian workflow. This would give a unified interface for retrieving prior, prior-predictive, posterior, posterior-predictive, etc. draws, and then iterating over them chain-wise, iteration-wise, or parameter-wise. If we as a community could develop such a unified interface, then any plotting or diagnostics package could hook into the interface to provide plots for any PPL.
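One possible shape for such an interface, sketched in Base Julia only. Everything here is an assumption made for illustration: the `DrawCollection` type, the `iteration × parameter × chain` array layout, and the accessor names are hypothetical and not part of any existing package.

```julia
# Hypothetical container: one 3-D array per group of draws,
# laid out as draws[iteration, parameter, chain].
struct DrawCollection
    groups::Dict{Symbol,Array{Float64,3}}
    parameters::Vector{Symbol}
end

# Retrieve one group, e.g. :prior or :posterior.
draws(c::DrawCollection, group::Symbol) = c.groups[group]

# Chain-wise iteration: one iteration × parameter matrix per chain.
eachchain(c::DrawCollection, group::Symbol) = eachslice(draws(c, group); dims=3)

# Iteration-wise iteration: one parameter × chain matrix per iteration.
eachiteration(c::DrawCollection, group::Symbol) = eachslice(draws(c, group); dims=1)

# Parameter-wise iteration: name => iteration × chain view for each parameter.
function eachparameter(c::DrawCollection, group::Symbol)
    d = draws(c, group)
    return (name => view(d, :, i, :) for (i, name) in enumerate(c.parameters))
end

c = DrawCollection(
    Dict(:prior => randn(50, 2, 4), :posterior => randn(50, 2, 4)),
    [:a, :b],
)

for (name, values) in eachparameter(c, :posterior)
    println(name, ": ", size(values))
end
```

A plotting or diagnostics package written against these accessors would not need to know which PPL produced the `DrawCollection`.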

sethaxen avatar Jul 27 '21 22:07 sethaxen

> Another idea is to have something like a Tables interface for collections of MCMC draws produced in a Bayesian workflow. This would give a unified interface for retrieving prior, prior-predictive, posterior, posterior-predictive, etc. draws, and then iterating over them chain-wise, iteration-wise, or parameter-wise.

Oh man, I love this! I think it's a great idea -- it would have saved me so much time/trouble with ParetoSmooth.jl.

ParadaCarleton avatar Jul 28 '21 22:07 ParadaCarleton