rstan icon indicating copy to clipboard operation
rstan copied to clipboard

Deprecate ggplot functions in future rstan?

Open jgabry opened this issue 2 years ago • 53 comments

@bgoodri @andrjohns @hsbadr Should we deprecate all the ggplot functions in rstan? Maybe in conjunction with the release of 2.26 or another future version? The ggplot functions in rstan were added before bayesplot existed and I'm not sure if there's any reason to keep supporting both. We could deprecate and point users to bayesplot. Thoughts?

jgabry avatar Mar 16 '23 17:03 jgabry

Yes

bgoodri avatar Mar 16 '23 17:03 bgoodri

Agreed. We can remove them from both v2.26 & the experimental branch. Though, not sure if any dependency is using them right now. This won't affect the release of StanHeaders v2.26.

hsbadr avatar Mar 16 '23 17:03 hsbadr

I think there may be a few dependencies using them, but not many. How about just a deprecation warning at first and then removing them entirely at some point in the future? If that sounds good I can make a PR against the experimental branch adding the deprecation warnings.

jgabry avatar Mar 16 '23 18:03 jgabry

@andrjohns Can you work on this for the experimental branch? I'll backport to v2.26 and run reverse dependency checks.

hsbadr avatar Mar 16 '23 18:03 hsbadr

I think there may be a few dependencies using them, but not many. How about just a deprecation warning at first and then removing them entirely at some point in the future? If that sounds good I can make a PR against the experimental branch adding the deprecation warnings.

That works, but we could also depend on bayesplot or create aliases (if bayesplot is installed use it; otherwise, print a deprecation warning).

I think there may be a few dependencies using them

Alternatively, we could remove them completely and contact the maintainers to switch to bayesplot.

hsbadr avatar Mar 16 '23 18:03 hsbadr

I vote for removing them completely, if the number of dependencies is small. We can easily fix it for them.

hsbadr avatar Mar 16 '23 18:03 hsbadr

That's true, although I think there are a lot of users who got accustomed to using the plotting functions in RStan and giving them one release with a message pointing them to bayesplot would be nice (instead of them just getting an error from R that the function doesn't exist). @bgoodri and @andrjohns what do you guys think?

jgabry avatar Mar 16 '23 18:03 jgabry

If it's going to require fixing other package dependencies, I'd prefer to wait until 2.31 is on CRAN before removing. At the moment the submission path for 2.31 is relatively clear, with all dependencies having patches either submitted or released to CRAN. I'd rather not add an additional source of delay/breakage on that side of things

andrjohns avatar Mar 16 '23 18:03 andrjohns

@andrjohns That makes sense. Are you ok if I add deprecation warnings? That way everything continues to work (no breaking dependencies) but we warn them of the change coming in the future?

jgabry avatar Mar 16 '23 18:03 jgabry

I'm on board with deprecation warnings

andrjohns avatar Mar 16 '23 18:03 andrjohns

I can make a PR. Is the right place to do that the experimental branch?

jgabry avatar Mar 16 '23 18:03 jgabry

I can make a PR. Is the right place to do that the experimental branch?

Yes. That's the target for v2.32 after releasing StanHeaders v2.26; we can jump directly to the latest version of rstan since StanHeaders now includes a compatible version of stanc3.

hsbadr avatar Mar 16 '23 18:03 hsbadr

It occurs to me that the one thing we lose by removing these functions from rstan is the ability to automatically unconstrain parameters if the user wants (I think this is why we haven't already deprecated them, if I remember previous discussion accurately). bayesplot can't unconstrain the parameters automatically, so the user would have to unconstrain and then pass to bayesplot if they want plots of unconstrained parameters.

Are we ok with that or is that sufficient reason to keep plots in rstan itself?

jgabry avatar Mar 16 '23 18:03 jgabry

Could the rstan functions instead just be a wrapper that performs the unconstraining and then calls bayesplot? It could also be something that gets built into bayesplot

andrjohns avatar Mar 16 '23 18:03 andrjohns

The idea being not to remove existing functionality, in case it's needed in a workflow somewhere

andrjohns avatar Mar 16 '23 18:03 andrjohns

Could the rstan functions instead just be a wrapper that performs the unconstraining and then calls bayesplot? It could also be something that gets built into bayesplot

Maybe the first option. I think if we can keep bayesplot as separate as possible and not call rstan::unconstrain_pars inside bayesplot that would be ideal.

This is only really an issue for complicated transformations. For a standard deviation, for example, it's easy to get bayesplot to do the transformation:

bayesplot::mcmc_hist(x, transformations = list(sigma = "log")) 

but you have to know the function to specify. For more complicated transformations it would be tough for the user to do this without relying on unconstrain_pars. So yeah maybe we should do it in rstan and then call bayesplot instead of deprecating the rstan functions. I'll look into that.

jgabry avatar Mar 16 '23 18:03 jgabry

@andrjohns What's the best way for me to build the experimental branch locally? Seems like I need StanHeaders 2.31.0, right?

jgabry avatar Mar 16 '23 18:03 jgabry

I either just devtools::install() the StanHeaders subdir and then the rstan/rstan subdir, or use the pre-built package source from any of the recent actions runs on the experimental branch

andrjohns avatar Mar 16 '23 18:03 andrjohns

(and yeah I normally use StanHeaders 2.31 with it as well)

andrjohns avatar Mar 16 '23 19:03 andrjohns

Ok thanks. I’ll give that a try later (got to hop on a few zoom calls first)

On Thu, Mar 16, 2023 at 1:00 PM Andrew Johnson @.***> wrote:

(and yeah I normally use StanHeaders 2.31 with it as well)

— Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstan/issues/1051#issuecomment-1472586858, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3PQQ5AMRVBP3JPKCQELBDW4NPNPANCNFSM6AAAAAAV5RFEJQ . You are receiving this because you authored the thread.Message ID: @.***>

jgabry avatar Mar 16 '23 19:03 jgabry

@andrjohns I successfully built the experimental branch, and it's a good thing I tried it because I think the existing code for plotting unconstrained parameters breaks, potentially due to changes in the generated C++ code in newer versions of Stan.

In the past this (admittedly hacky) helper function was able to get the names of the variables declared in the parameters block:

https://github.com/stan-dev/rstan/blob/d9a7521ab6d6930d8a459bb67425a03e5b1de242/rstan/rstan/R/stan_plot_helpers.R#L198-L204

(I think it would get data + parameters and then drop the data by intersecting with model_pars)

But now it seems that grepping for context__.vals_r in the C++ code only returns variables in the data block not the parameters block. This basically breaks the plotting code for unconstrained parameters (we were using this to provide a way to unconstrained parameters automatically for the user, for which we needed to know if the parameter the user asked to plot was in the parameters block -- if it was in generated quantities, for example, we couldn't unconstrain it).

Do you know of a way to get the names of variables just in the parameters block that I could use instead of this?

jgabry avatar Mar 16 '23 20:03 jgabry

Ooh good question. Wouldn't it be easier to grep the Stan code in the stanfit object for the entries of the parameters block? Then it wouldn't be dependent on the C++

andrjohns avatar Mar 16 '23 21:03 andrjohns

I think there was some reason we didn’t do that before but I can’t remember and now I can’t think of a good reason not to do that, so maybe I’ll give that a try.

On Thu, Mar 16, 2023 at 3:15 PM Andrew Johnson @.***> wrote:

Ooh good question. Wouldn't it be easier to grep the Stan code in the stanfit object for the entries of the parameters block? Then it wouldn't be dependent on the C++

— Reply to this email directly, view it on GitHub https://github.com/stan-dev/rstan/issues/1051#issuecomment-1472750998, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3PQQ5UBOAHJKOQEPT6QPDW4N7IPANCNFSM6AAAAAAV5RFEJQ . You are receiving this because you authored the thread.Message ID: @.***>

jgabry avatar Mar 16 '23 21:03 jgabry

Is there a way to run stanc3's auto formatter (I don't see it as an option to stanc)? I think one reason we didn't grep the Stan code is that we'd need to find the parameters block but the user could have the word "parameters" in comments. The block doesn't even have to start on a new line due to Stan's whitespace rules, so we can't just search for "parameters" starting a line. There may be other issues along those lines too. (I think the auto formatter removes comments if I recall correctly).

jgabry avatar Mar 16 '23 22:03 jgabry

Is there a way to run stanc3's auto formatter (I don't see it as an option to stanc)?

The auto-formatting option generates Stan code while stanc() generates C++ code. To run the formatter internally, use auto_format = TRUE in stanc_process or set stanc.auto_format option to TRUE.

hsbadr avatar Mar 16 '23 22:03 hsbadr

Is there a way to run stanc3's auto formatter (I don't see it as an option to stanc)?

The auto-formatting option generates Stan code while stanc() generates C++ code. To run the formatter internally, use auto_format = TRUE in stanc_process or set stanc.auto_format option to TRUE.

Thanks! I'll try stanc_process. I think I'll need to take the user's stanfit object that they pass to the plotting function, grab the stan code, and feed it into stanc_process with auto_format = TRUE. That seems like the only way to get an auto formatted version of the user's existing Stan program?

jgabry avatar Mar 16 '23 22:03 jgabry

The easiest way is to set the option:

options(stanc.auto_format = TRUE)

We could enable it by default (which would help brms; see https://github.com/paul-buerkner/brms/issues/1376) or pass auto_format argument to stanc(), if needed.

hsbadr avatar Mar 16 '23 22:03 hsbadr

But actually I guess auto formatting doesn't remove user code comments (so the word "parameters" can appear anywhere in the Stan file even after auto formatting). I need a way to figure out which variables are declared in the parameters block (as opposed to transformed parameters and generated quantities). @andrjohns suggested grepping the Stan code itself but that seems potentially error prone if the word "parameters" can appear anywhere in user comments.

jgabry avatar Mar 16 '23 22:03 jgabry

You could use a C++ preprocessor to remove the comments; something like $CXX -fpreprocessed -dD -E <filename.cpp>.

hsbadr avatar Mar 16 '23 22:03 hsbadr

For Stan code, could we request this in stanc3?

hsbadr avatar Mar 16 '23 22:03 hsbadr