Additional loo utilities
Hi!
This PR adds functions to compute additional LOO utilities: mean absolute error, mean squared error, and root mean squared error for regression problems and accuracy and balanced accuracy for binary classification. I have already discussed this with @avehtari and opened a PR for more discussion and feedback.
Codecov Report
Merging #202 (4df55e8) into master (0284928) will decrease coverage by
1.68%. The diff coverage is100.00%.
:exclamation: Current head 4df55e8 differs from pull request most recent head ad74be2. Consider uploading reports for the commit ad74be2 to get more accurate results
@@ Coverage Diff @@
## master #202 +/- ##
==========================================
- Coverage 95.15% 93.47% -1.69%
==========================================
Files 28 29 +1
Lines 2663 2712 +49
==========================================
+ Hits 2534 2535 +1
- Misses 129 177 +48
| Impacted Files | Coverage Δ | |
|---|---|---|
| R/loo_predictive_error.R | 100.00% <100.00%> (ø) |
|
| R/zzz.R | 76.92% <0.00%> (-23.08%) |
:arrow_down: |
| R/effective_sample_sizes.R | 84.25% <0.00%> (-13.39%) |
:arrow_down: |
| R/importance_sampling.R | 84.04% <0.00%> (-8.52%) |
:arrow_down: |
| R/loo.R | 93.00% <0.00%> (-5.35%) |
:arrow_down: |
| R/loo_moment_matching.R | 97.63% <0.00%> (-1.58%) |
:arrow_down: |
| R/loo_subsample.R | 94.80% <0.00%> (-0.51%) |
:arrow_down: |
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
Looks good! I hope @jgabry has also time to have a look? PR is small and clear, so I hope it's not taking much time
I think this PR could be merged adding just the loo_predictive_error function, but it would be useful to think the bigger picture and how we could support loo_compare with different criteria
Looking at this now. Sorry for the delay!
@LeeviLindgren Thanks for the PR! Sorry for the delay in reviewing it. I'll make a few review comments now, but can you also edits from maintainers? You can do that by following the instructions here: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork
@LeeviLindgren Thanks for the PR! Sorry for the delay in reviewing it. I'll make a few review comments now, but can you also edits from maintainers? You can do that by following the instructions here: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork
Thanks for the comments, I will take a look! Edits from maintainers should be allowed.
@avehtari also suggested that it would be nice to support model comparison using these additional metrics. I suggest we try to merge this PR which implements the metrics, and I can start working on the functionality to support comparison. The initial idea I had was to write a method for loo_compare for loo_predictive_error-class (we can also come up with another name). Users could then pass output from the calls to loo_predictive_error to loo_compare. What do you think @jgabry?
Thanks for the updates. Waiting to hear from @avehtari on the couple of questions in my review comments (about the argument names) but after that we can merge this (assuming the GitHub Actions checks pass).
@avehtari also suggested that it would be nice to support model comparison using these additional metrics. I suggest we try to merge this PR which implements the metrics, and I can start working on the functionality to support comparison. The initial idea I had was to write a method for
loo_compareforloo_predictive_error-class (we can also come up with another name). Users could then pass output from the calls toloo_predictive_errortoloo_compare. What do you think @jgabry?
Yeah I agree it would be nice to be able to do model comparison with these. My initial thought is that your suggestion of using a class and a loo_compare method seems like a good approach.
I agree with @jgabry's suggestions
- I changed the function to
loo_predictive_metric - Updated the wording in docs to match the updates
llargument is now calledlog_likpred_errorargument is now calledmetric
Looks good, thanks! Merging now.