make default_optimizer thread safe
Fixes #248 for me, I believe; I haven't actually run or added any tests yet.
Also, maybe this shouldn't be just merged into main? How does this go, @sethaxen?
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 76.19%. Comparing base (20dd77e) to head (6971c21).
:exclamation: There is a different number of reports uploaded between BASE (20dd77e) and HEAD (6971c21): HEAD has 148 fewer uploads than BASE (154 vs. 6).
```diff
@@            Coverage Diff             @@
##             main     #249      +/-   ##
==========================================
- Coverage   82.08%   76.19%    -5.89%
==========================================
  Files          13       13
  Lines         586      584        -2
==========================================
- Hits          481      445       -36
- Misses        105      139       +34
```
I'm guessing this makes a bunch of tests fail which rely on the previous behavior?
> Also, maybe this shouldn't be just merged into main? How does this go, @sethaxen?
Pathfinder follows a continuous deployment model, so yes, we'll merge this directly into main and immediately register a release. It's a non-breaking bug-fix PR, so just add a patch version bump.
> I'm guessing this makes a bunch of tests fail which rely on the previous behavior?
Seems to only be 2 tests: https://github.com/mlcolab/Pathfinder.jl/blob/20dd77e5514c2d5d65a2a5edfb2cf72e1dfb9758/test/singlepath.jl#L34-L35 https://github.com/mlcolab/Pathfinder.jl/blob/20dd77e5514c2d5d65a2a5edfb2cf72e1dfb9758/test/multipath.jl#L38-L39
Not certain an equality check would work here, but if not, just checking that the optimizer is an `LBFGS` with the same `m`, linesearch, and linesearch-init types would be fine.
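A property-based check could look something like this (a sketch only: the `Pathfinder.default_optimizer()` accessor is hypothetical, and the `LBFGS` field names `m`, `linesearch!`, and `alphaguess!` are assumptions about Optim.jl internals that may differ across versions):

```julia
using Optim, Pathfinder, Test

# Hypothetical: however the default optimizer is obtained after this PR.
actual = Pathfinder.default_optimizer()
expected = Optim.LBFGS()

@test actual isa Optim.LBFGS
@test actual.m == expected.m                                      # history length
@test typeof(actual.linesearch!) == typeof(expected.linesearch!)  # linesearch type
@test typeof(actual.alphaguess!) == typeof(expected.alphaguess!)  # linesearch init type
```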
The docs build failure is unrelated; I'll fix it in a separate PR.
Ah, okay! Will do. The new test would probably have to check that constructing a `default_optimizer` pre and post a `pathfinder` run results in identical-in-value but different-in-memory objects, if that makes sense. Will probably add this later today :+1:
> The new test would probably have to check that constructing a `default_optimizer` pre and post a `pathfinder` run results in identical-in-value but different-in-memory objects, if that makes sense.
I think that makes sense for testing that this particular approach we're using now works, but it would be even better to have a test that was independent of our `default_optimizer` and instead directly tested our invariants (things that we should be able to guarantee are true). Here our invariants would be:
- If you didn't pass a stateful optimizer/log-density function and your RNG is thread-safe, then `pathfinder` should be thread-safe.
- `multipathfinder` with a thread-safe RNG and a multithreading executor should be thread-safe.
The only way I can think of to test thread-safety is with reproducibility.
I'm thinking 2 tests:
- Call `pathfinder` multiple times in a multi-threaded loop with identically seeded thread-safe RNGs for a nontrivial model (maybe the banana) without specifying the `optimizer`. Verify that results (e.g. trace, trace gradient, log-density, and draws) are numerically identical (with `==`).
- Call `multipathfinder` with a user-constructed `LBFGS` (so the runs have a shared state), a thread-safe RNG, and a `PreferParallel` executor. Reseed identically and re-run. Results should be identical.
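The first of these could be sketched roughly as follows (the keyword names `init` and `rng` are assumptions about `pathfinder`'s signature, and the banana density is just a placeholder for any nontrivial target):

```julia
using Pathfinder, Random, Test

# Placeholder nontrivial target: a banana-shaped log-density.
logp_banana(x) = -0.5 * (x[1]^2 / 10 + 10 * (x[2] - x[1]^2)^2)

# Run pathfinder on several tasks, each with its own identically seeded
# thread-safe RNG and no user-supplied optimizer, so every task must
# construct its own default LBFGS.
results = Vector{Any}(undef, Threads.nthreads())
Threads.@threads for i in eachindex(results)
    rng = Random.Xoshiro(42)
    results[i] = pathfinder(logp_banana; init=[1.0, 1.0], rng=rng)
end

# With identical seeds, all runs should agree exactly; any mismatch
# suggests the runs interfered through shared optimizer state.
@test all(r -> r.draws == first(results).draws, results)
```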
We do have reproducibility tests here: https://github.com/mlcolab/Pathfinder.jl/blob/20dd77e5514c2d5d65a2a5edfb2cf72e1dfb9758/test/multipath.jl#L67-L82. My guess is that these are currently passing because the log-density is so trivial that very little time is spent in the linesearch, so the runs don't interfere with each other often.
> Will probably add this later today 👍
I really appreciate it! Let me know if you'd like help with any of this.
> My guess is that these are currently passing because the log-density is so trivial that very little time is spent in the linesearch, so the runs don't interfere with each other often.
Right, there was also no issue with the parallel standard-normal run in my example in the GitHub issue.
@nsiccha anything I can do to help out with this PR?
@sethaxen, right, so sorry, I've been a lot busier than expected. I'll try to do it now :)
Ah, I've been trying to construct a test that fails for `multipathfinder`, but was unable to. The reason is that you already catch the `multipathfinder` + stateful-optimizer case, see https://github.com/mlcolab/Pathfinder.jl/blob/f4ca90dc3d91f077f479d13904a2b6bf99e8ee25/src/multipath.jl#L181-L182
Which is probably why few if any other people have run into this issue...
I'll still finish the changes, but will only add a new test for a parallel `pathfinder` call without an explicitly passed optimizer, alright, @sethaxen?
> Ah, I've been trying to construct a test that fails for `multipathfinder`, but was unable to. The reason is that you already catch the `multipathfinder` + stateful-optimizer case, see
Ah, yes, it seems we do catch that case! Okay good, this bug wasn't as severe as it initially looked.
> I'll still finish the changes, but will only add a new test for a parallel `pathfinder` call without an explicitly passed optimizer, alright, @sethaxen?
Yes, that sounds like the right approach. Thanks!
> this bug wasn't as severe as it initially looked.
Yeah, indeed. In retrospect, if it had affected everyone, I guess it would have been discovered earlier.
BTW, the reason why I was even using the parallel (single) `pathfinder` thing was because I wanted to initialize several chains for the same posterior in parallel, but I did not want all eventual initialization points to come from a single approximation, which AFAICT inevitably happens for high dimensions and importance resampling. I wanted to do something slightly more clever, but for that I'd have needed to be able to match each draw to the approximation that generated it, and AFAICT that wasn't possible with `multipathfinder`. I guess in the end I could have used the `fit_distributions` from the `PathfinderResult`s; I'm unsure why I didn't. Or actually, IIRC the `multipathfinder` method was for some reason much slower than my parallel `pathfinder` implementation?
I'm unsure, I eventually just stuck with the simple parallel pathfinder approach, which worked well for me (except for the race condition).
> BTW, the reason why I was even using the parallel (single) `pathfinder` thing was because I wanted to initialize several chains for the same posterior in parallel, but I did not want all eventual initialization points to come from a single approximation, which AFAICT inevitably happens for high dimensions and importance resampling. I wanted to do something slightly more clever, but for that I'd have needed to be able to match each draw to the approximation that generated it, and AFAICT that wasn't possible with `multipathfinder`.
The multi-pathfinder result object stores the individual single-path results, each of which stores the draws its chain generated, so you can always access those or, as you said, use the fit distribution for each path. You can also disable importance resampling with `importance=false`; it still resamples with replacement but does not use importance weights to do so. The `draw_component_ids` field then stores, for each draw in `draws`, the index of the single-path run that generated that specific draw.
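For example (a sketch; the exact positional/keyword arguments to `multipathfinder` are assumptions, but the result fields are the ones described above):

```julia
using Pathfinder

logp(x) = -sum(abs2, x) / 2  # standard-normal target, just for illustration
result = multipathfinder(logp, 100; nruns=4, init=[randn(2) for _ in 1:4],
                         importance=false)

# Map each returned draw back to the single-path run that produced it:
for (j, path_id) in enumerate(result.draw_component_ids)
    fit = result.pathfinder_results[path_id].fit_distribution
    # `fit` is the normal approximation that generated result.draws[:, j]
end
```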
Makes sense! I was in the end also affected by https://github.com/julia-vscode/julia-vscode/issues/3853, but wasn't aware at the time.
I also wrapped the (single) `pathfinder` calls in another loop, retrying `pathfinder` until NUTS initialization worked for the returned (initialization) draw, which mainly means until the gradient evaluation does not error.
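Roughly like this (a sketch; `logp_grad` stands in for whatever gradient call the NUTS initialization performs, and `pathfinder`'s keywords are assumptions):

```julia
using Pathfinder

# Retry pathfinder until the gradient can be evaluated at the returned
# draw, i.e. until NUTS initialization would succeed.
function robust_init(logp, logp_grad, init; max_tries=10)
    for _ in 1:max_tries
        result = pathfinder(logp; init=init)
        draw = result.draws[:, 1]
        try
            logp_grad(draw)  # an error here means NUTS init would fail
            return draw
        catch
            continue  # retry with a fresh pathfinder run
        end
    end
    error("no valid initialization draw found after $max_tries tries")
end
```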
Maybe this is actually something that Pathfinder should (optionally) check? That the log density (gradient) can be evaluated for the returned draws?
> Maybe this is actually something that Pathfinder should (optionally) check? That the log density (gradient) can be evaluated for the returned draws?
Oh that's interesting, can you open an issue for this feature?