posterior
posterior copied to clipboard
modify summarise_draws internals not to coerce values to same type
Summary
This would fix #355.
As it is affecting summarise_draws, which is one of the main functions in posterior, I think this change needs to be considered carefully. It might be less efficient than currently implemented, so I am opening this PR now to check how the benchmarks for summarise_draws perform.
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This is how benchmark results would change (along with a 95% confidence interval in relative change) if eca0fae786e0769202281c83044ba6cbb3372662 is merged into master:
- :ballot_box_with_check:as_draws_array: 136ms -> 136ms [-0.51%, +0.83%]
- :rocket:as_draws_df: 64.6ms -> 61ms [-6.83%, -4.58%]
- :ballot_box_with_check:as_draws_list: 157ms -> 156ms [-1.13%, +0.17%]
- :ballot_box_with_check:as_draws_matrix: 58.8ms -> 58.7ms [-0.97%, +0.57%]
- :ballot_box_with_check:as_draws_rvars: 80ms -> 80.4ms [-0.49%, +1.55%]
- :exclamation::snail:summarise_draws_100_variables: 706ms -> 715ms [+0.92%, +1.51%]
- :exclamation::snail:summarise_draws_10_variables: 105ms -> 107ms [+0.93%, +2.1%] Further explanation regarding interpretation and methodology can be found in the documentation.
Looks like a >1% increase to summarise_draws times, which I don't think is acceptable
How much increase exactly? 1% would barely matter I think
Codecov Report
Attention: Patch coverage is 93.75000% with 1 lines in your changes are missing coverage. Please review.
Project coverage is 95.33%. Comparing base (
9de9857) to head (a28483f).
:exclamation: Current head a28483f differs from pull request most recent head ded8b04. Consider uploading reports for the commit ded8b04 to get more accurate results
| Files | Patch % | Lines |
|---|---|---|
| R/misc.R | 90.90% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #356 +/- ##
=======================================
Coverage 95.33% 95.33%
=======================================
Files 50 50
Lines 3855 3860 +5
=======================================
+ Hits 3675 3680 +5
Misses 180 180
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
❗🐌summarise_draws_100_variables: 706ms -> 715ms [+0.92%, +1.51%]
❗🐌summarise_draws_10_variables: 105ms -> 107ms [+0.93%, +2.1%]
Based on these ^
This is how benchmark results would change (along with a 95% confidence interval in relative change) if a28483f84ffd276cba402929ff3a5e976a12046a is merged into master:
- :ballot_box_with_check:as_draws_array: 137ms -> 138ms [-0.57%, +1.33%]
- :exclamation::snail:as_draws_df: 64.7ms -> 66.8ms [+1.45%, +5.05%]
- :ballot_box_with_check:as_draws_list: 163ms -> 164ms [-0.39%, +2%]
- :exclamation::snail:as_draws_matrix: 60.5ms -> 63.6ms [+3.63%, +6.84%]
- :ballot_box_with_check:as_draws_rvars: 81.6ms -> 81.1ms [-1.61%, +0.46%]
- :exclamation::snail:summarise_draws_100_variables: 709ms -> 719ms [+0.98%, +1.77%]
- :exclamation::snail:summarise_draws_10_variables: 108ms -> 110ms [+0.75%, +2.83%] Further explanation regarding interpretation and methodology can be found in the documentation.
I've now added tests checking that the column types are as expected, and if the speed is ok, this is ready for review
FWIW, I think the unnesting / flattening might be doable as a cbind on row vectors (or maybe vec_cbind / vec_rbind for safety) to create a 1-row data frame. Dunno if that helps at all.
This is how benchmark results would change (along with a 95% confidence interval in relative change) if e175ba88de29e2cbaa1ce74dad77763411003118 is merged into master:
- :ballot_box_with_check:as_draws_array: 156ms -> 155ms [-1.35%, +0.57%]
- :rocket:as_draws_df: 71.3ms -> 68.4ms [-5.59%, -2.54%]
- :ballot_box_with_check:as_draws_list: 176ms -> 175ms [-1.7%, +0.47%]
- :ballot_box_with_check:as_draws_matrix: 71ms -> 70.5ms [-1.68%, +0.53%]
- :ballot_box_with_check:as_draws_rvars: 84ms -> 84.3ms [-0.89%, +1.65%]
- :exclamation::snail:summarise_draws_100_variables: 724ms -> 733ms [+0.35%, +1.98%]
- :exclamation::snail:summarise_draws_10_variables: 112ms -> 114ms [+1.05%, +2.91%] Further explanation regarding interpretation and methodology can be found in the documentation.
Speed is okay. Still, is this already the best way to implement the flattening?
unnesting / flattening might be doable as a cbind on row vectors (or maybe vec_cbind / vec_rbind for safety) to create a 1-row data frame.
Thanks for the tip! Simplifying this unnesting/flattening would, I think, be beneficial. However, I can't quite get the right structure with rbind or vec_rbind as the quantile2 column ends up with vectors of length 2 (as opposed to separate columns as in unlist). Looks like it would still need some kind of column splitting (e.g. tidyr::unnest_wider)
create_summary_list(example_draws(), 3, list(rhat = rhat_basic, quantile2 = quantile2), NULL) |>
vec_rbind()
## rhat quantile2
## 1 1.014967 -1.231353, 18.874003
create_summary_list(example_draws(), 3, list(rhat = rhat_basic, quantile2 = quantile2), NULL) |>
unlist()
## rhat quantile2.q5 quantile2.q95
## 1.014967 -1.231353 18.874003
Ah right... you might be able to use cbind if you conditionally transpose (or rbind) vectors of length > 1 to turn them into row vectors (not on a computer atm so I can't try it)
is this already the best way to implement the flattening
I don't think the issue is so critical that the fix is needed right away, so perhaps it's better to leave this open in case some other ideas come up (e.g. if @mjskay's method works out). However, I won't spend more time on this now