posterior
posterior copied to clipboard
.args in summarise draws incompatible with summary functions without ellipses
The .args argument in summarise draws is passed to all summary functions as additional arguments. This fails if the summary function does not handle ... or the keyword arguments specified in .args. Common functions that don't handle ... include stats::sd and stats::var.
I'd like to be able to do something like the following:
x <- example_draws()
summarise_draws(x, mean, sd, quantile2, mcse_quantile) #q5 and q95 by default
#> [Output as expected]
# change the quantiles of interest to q1 and q99
additional_args <- list(probs = c(0.01, 0.99))
summarise_draws(x, mean, sd, quantile2, mcse_quantile, .args = additional_args)
#> Error in stats::sd(x, ...) : unused argument (probs = c(0.01, 0.99))
One option would be to make aliases of these common summary functions that handle ..., but perhaps there is a more sensible way to only pass the keywords to functions that handle them?
I think since those args don't apply to all the functions anyway I'd be inclined to wrap the relevant functions in a lambda, like define p = c(.01,. 99) then pass \(...) quantile2(..., probs = p). Feels more precise than using a blanket .args.
If we want to get fancy we could probably inspect the formals of functions to find functions that don't take dots and then not pass .args down. Not sure if there would be unintended side effects...
Good points @mjskay, thanks! Another idea I just had is to extend .args to be a named nested list, by which one could supply additional arguments to specific summary functions, rather than having them be passed to all. I think this would greatly improve the utility of the .args argument, but would either be a breaking change or we would somehow need to detect whether .args should be passed to all functions or split and passed to only corresponding ones.
I managed to get this working with a small change to create_summary_list and it would work like this:
summarise_draws(
example_draws(),
quantile2,
median,
mcse_quantile,
.args = list(
quantile2 = list(probs = 0.1), # args for quantile2 function
mcse_quantile = list(probs = c(0.05, 0.7)) # args for mcse_quantile function
)
)
## # A tibble: 10 × 5
## variable q10 median mcse_q5 mcse_q70
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 mu -0.116 4.16 0.551 0.196
## 2 tau 0.575 3.07 0.114 0.426
## 3 theta[1] 0.0380 5.97 0.820 0.400
## 4 theta[2] -0.255 5.13 0.676 0.392
## 5 theta[3] -3.94 3.99 2.18 0.349
## 6 theta[4] -1.10 4.99 0.956 0.187
## 7 theta[5] -2.25 3.72 1.62 0.245
## 8 theta[6] -2.15 4.14 1.16 0.349
## 9 theta[7] 0.558 5.90 0.458 0.362
## 10 theta[8] -1.74 4.64 0.997 0.368
Personally I think anonymous functions make the code clearer as they move the args closer to the function they apply to:
summarise_draws(
example_draws(),
\(x) quantile2(x, probs = 0.1),
median,
\(x) mcse_quantile(x, probs = c(0.05, 0.7))
)
#> # A tibble: 10 × 5
#> variable q10 median mcse_q5 mcse_q70
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 mu -0.116 4.16 0.551 0.196
#> 2 tau 0.575 3.07 0.114 0.426
#> 3 theta[1] 0.0380 5.97 0.820 0.400
#> 4 theta[2] -0.255 5.13 0.676 0.392
#> 5 theta[3] -3.94 3.99 2.18 0.349
#> 6 theta[4] -1.10 4.99 0.956 0.187
#> 7 theta[5] -2.25 3.72 1.62 0.245
#> 8 theta[6] -2.15 4.14 1.16 0.349
#> 9 theta[7] 0.558 5.90 0.458 0.362
#> 10 theta[8] -1.74 4.64 0.997 0.368
But I'll admit I have a bit of a bias against the .args argument as it feels like a bit of a hack in the first place. But maybe I'm missing a use case where .args makes things easier?
I don't like .args either to be honest. if we can get rid of it, I would be happy I think.
Matthew Kay @.***> schrieb am Di., 30. Jan. 2024, 09:45:
Personally I think anonymous functions make the code clearer as they move the args closer to the function they apply to:
summarise_draws( example_draws(), (x) quantile2(x, probs = 0.1), median, (x) mcse_quantile(x, probs = c(0.05, 0.7)) )#> # A tibble: 10 × 5#> variable q10 median mcse_q5 mcse_q70#>
#> 1 mu -0.116 4.16 0.551 0.196#> 2 tau 0.575 3.07 0.114 0.426#> 3 theta[1] 0.0380 5.97 0.820 0.400#> 4 theta[2] -0.255 5.13 0.676 0.392#> 5 theta[3] -3.94 3.99 2.18 0.349#> 6 theta[4] -1.10 4.99 0.956 0.187#> 7 theta[5] -2.25 3.72 1.62 0.245#> 8 theta[6] -2.15 4.14 1.16 0.349#> 9 theta[7] 0.558 5.90 0.458 0.362#> 10 theta[8] -1.74 4.64 0.997 0.368 But I'll admit I have a bit of a bias against the .args argument as it feels like a bit of a hack in the first place. But maybe I'm missing a use case where .args makes things easier?
— Reply to this email directly, view it on GitHub https://github.com/stan-dev/posterior/issues/341#issuecomment-1916338955, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2AEHBOFZMEKHVKTHZHTYRCXKBAVCNFSM6AAAAABCPGS7RKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJWGMZTQOJVGU . You are receiving this because you are subscribed to this thread.Message ID: @.***>
I agree, the anonymous functions end up with much neater looking code. My current use-case is not using summarise_draws directly, but inside priorsense, and I currently only support summary functions given as strings, as they are actually first converted to *_weighted versions of the functions. (This is all a bit messy and perhaps will be fixed by #331 and using the corresponding weight-aware rvar methods)
If I remember correctly, .args was added in the first place to handle passing weights to all summary functions, so perhaps when the summary functions directly support weights, we won't really need it
is there anything we need to add to make .args obsolete? if everything works already now using anonymous functions, I suggest we go ahead and deprecate .args with the intention of removing it soonish. would anybody have time to make the necessary changes?
n-kall @.***> schrieb am Di., 30. Jan. 2024, 09:59:
I agree, the anonymous functions end up with much neater looking code. My current use-case is not using summarise_draws directly, but inside priorsense https://github.com/n-kall/priorsense, and I currently only support summary functions given as strings, as they are actually first converted to *_weighted versions of the functions. (This is all a bit messy and perhaps will be fixed by #331 https://github.com/stan-dev/posterior/pull/331 and using the corresponding weight-aware rvar methods)
If I remember correctly, .args was added in the first place to handle passing weights to all summary functions, so perhaps when the summary functions directly support weights, we won't really need it
— Reply to this email directly, view it on GitHub https://github.com/stan-dev/posterior/issues/341#issuecomment-1916361893, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADCW2AFWLIU3VBY5IXD3B3DYRCY6LAVCNFSM6AAAAABCPGS7RKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJWGM3DCOBZGM . You are receiving this because you commented.Message ID: @.***>
I'm fine with deprecation of .args but I would prefer it not to be removed until the posterior-provided summary functions support using the weights that are in the draws (or rvar) objects. For now summarise_draws(x, mean_weighted, sd_weighted, .args = list(weights = weights(x))) seems simpler (to me) than using anonymous functions for all summary functions to add the weights.
Yes, agreed. We still need a not so ugly solution to passing weights.
I'm trying to remove any usage of .args in my code. Currently, I have been using it to pass weights (as discussed above), but also to pass the probs vector for quantile2 and mcse_quantile.
I have something like this (actual function does other things, but this is the relevant structure):
my_summary <- function(x, quantity, quantity_args) {
posterior::summarise_draws(x, quantity, .args = quantity_args)
}
Using this code makes it possible for the user to specify quantiles in a vector, and have it apply to both mcse_quantile and quantile2, for example:
my_summary(x, quantity = c("quantile2", "mcse_quantile"), quantity_args = list(probs = c(0.2, 0.8))
@mjskay Do you know of a way to do something similar without .args, without the user needing to write each anonymous function directly? Perhaps there is a way to programmatically create the anonymous functions that I am not aware of. i.e. result in
posterior::summarise_draws(x,
\(x) quantile2(.x, probs = c(0.2, 0.8)),
\(x) mcse_quantile(x, probs = c(0.2, 0.8))
)
Because summarise_draws() can take a named list of functions, you can construct partially-applied versions of your desired functions and pass them as lists. This allows you to apply the arguments to just the functions they apply to; e.g. something like this:
fs = lapply(c(q = quantile2, mcse = mcse_quantile), \(f) \(...) f(..., probs = c(0.2, 0.8)))
summarise_draws(x, fs)
Or using a library that provide partial function application, like {purrr}, you could do:
fs = lapply(c(q = quantile2, mcse = mcse_quantile), purrr::partial, probs = c(0.2, 0.8))
summarise_draws(x, fs)
Then if you need different arguments to go to different functions, you could pass multiple function lists.