pymc-marketing
pymc-marketing copied to clipboard
Replace scan in BetaGeoBetaBinom
Closes https://github.com/pymc-labs/pymc-marketing/issues/703
📚 Documentation preview 📚: https://pymc-marketing--707.org.readthedocs.build/en/707/
@ColtAllen Thanks for the nice issue description. I removed the pt.ge(t_x, i) from the unnorm_logp_died_at_tx_plus_i expression and this solved the scalar case, but something is missing in the vectorized one:
tests/clv/test_distributions.py::TestBetaGeoBetaBinom::test_logp_matches_lifetimes[scalar] PASSED [ 50%]
tests/clv/test_distributions.py::TestBetaGeoBetaBinom::test_logp_matches_lifetimes[vector] FAILED [100%]
with
E AssertionError:
E Not equal to tolerance rtol=1e-07, atol=0
E
E Mismatched elements: 4 / 5 (80%)
E Max absolute difference: 0.0505898
E Max relative difference: 0.18415497
E x: array([ -0.360687, -21.684232, -6.955498, -0.224123, -0.360694])
E y: array([ -0.36069 , -21.696809, -6.9558 , -0.274713, -0.360694])
Any ideas? The difference is very small ....
Codecov Report
Attention: Patch coverage is 91.05582% with 133 lines in your changes missing coverage. Please review.
Project coverage is 94.25%. Comparing base (
c6be6a9) to head (00efd07). Report is 514 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #707 +/- ##
==========================================
+ Coverage 91.57% 94.25% +2.68%
==========================================
Files 21 29 +8
Lines 2172 3101 +929
==========================================
+ Hits 1989 2923 +934
+ Misses 183 178 -5
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
CC'ing @ricardoV94 (he wrote this code).
@ricardoV94 I think I got it and the tests are passing :) . Anything else I am missing?
Changes look fine, although I would suggest a more careful benchmark before deciding on it.
Specifically other backends and cases that should be more favourable to Scan (large variation between T and tx across rows). If there's some reproducible code I don't mind taking a look.
I gave some more details on the original issue.
And more generally when we do performance driven changes I would add a benchmark as part of the CI (pytest-benchmark is pretty nice). That's the conceptual equivalent to a regression test for bugfixes.
Despite the changes, do what we want. We wanna do more benchmarking to decide on the implementation to choose. See https://github.com/pymc-labs/pymc-marketing/issues/703#issuecomment-2141495580