DistributionsAD.jl
DistributionsAD.jl copied to clipboard
Add Enzyme tests
I had tried to run Enzyme tests, and it seems currently quite a few tests fail/error.
The fact that it doesn't segfault in so many tests is a good sign.
Related: https://github.com/TuringLang/Turing.jl/pull/1887
@devmotion can you rerun on latest Enzyme main?
fyi the test failures appear to be due to currently unimplemented derivatives of lapack functions. probably should disable those tests.
@devmotion @yebai at this point, everything in this PR works except:
- differentiating through Lapack functions is not yet supported by Enzyme
- your test code uses an active return for a function which returns a vector (this must be duplicated).
Moreover all the errors are relatively nice.
at this point there aren't Enzyme bugs here and merging this is now on your end (presumably marking the code using lapack as unsupported atm, and fixing your test handler to not use an active vector for that input)
Thank you, I'll try to have a look at and fix the remaining issues this week.
The latest commit added a way to mark some Enzyme tests as broken (and I also introduced a bug in the test setup :smile:) but it also reveals that there are many more issues - so far only a tiny initial testset was run but no tests of the log densities of the uni-, multi- and matrixvariate distributions we're actually interested in.
@wsmoses Is this an enzyme bug or did I call autodiff incorrectly? https://github.com/TuringLang/DistributionsAD.jl/actions/runs/5872312798/job/15923610267?pr=254#step:6:303
Definitely seems like a bug (that we need a minimal reproducer for)
On Wed, Aug 16, 2023 at 4:26 PM David Widmann @.***> wrote:
@wsmoses https://github.com/wsmoses Is this an enzyme bug or did I call autodiff incorrectly? https://github.com/TuringLang/DistributionsAD.jl/actions/runs/5872312798/job/15923610267?pr=254#step:6:303
— Reply to this email directly, view it on GitHub https://github.com/TuringLang/DistributionsAD.jl/pull/254#issuecomment-1680100118, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTUXBUFJ5TP4AJUKMV3ADXVRY3HANCNFSM6AAAAAA2LPVOW4 . You are receiving this because you were mentioned.Message ID: @.***>
One slightly annoying thing I found about the Enzyme error message is printing a large amount of LLVM code when compilation fails (see here for an example). This is useful when developing Enzyme (at least in the early stages) but not very helpful for users of Enzyme. Very few people can read such LLVM code and figure out why Enzyme couldn't compile. They also add clutter to CI logs. I suggest disabling such LLVM messages by default but providing an API for developers to activate them when needed.
@devmotion did you have a chance to isolate that 1.6 error that crashes Julia (so we can ensure it does not)?
@devmotion can you repush (its not letting me rerun tests for some reason)
@devmotion hm I still can't retrigger this, if you can have the tests run again.
@wsmoses you should have the priveledges to trigger CI (re-)runs -- can you try again?
I reran CI against the latest Enzyme.jl v0.12.19. They are now failing with either (mac os)
Assertion failed: (vec.back() == currentBlock), function addReverseBlock, file /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp, line 8952.
or (ubuntu)
julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.cpp:8952: llvm::BasicBlock* GradientUtils::addReverseBlock(llvm::BasicBlock*, const llvm::Twine&, bool, bool): Assertion `vec.back() == currentBlock' failed.
See the failed CI runs for details.
@wsmoses, is this an issue you are aware of?
Nope not aware of please open MWE
Gathering links to some related Enzyme issues: https://github.com/EnzymeAD/Enzyme.jl/issues/1585 https://github.com/EnzymeAD/Enzyme.jl/issues/1593 https://github.com/EnzymeAD/Enzyme.jl/issues/1594 https://github.com/EnzymeAD/Enzyme.jl/issues/1595 https://github.com/EnzymeAD/Enzyme.jl/issues/1596
@yebai Looks like the issues here are that Enzyme actually passes tests it was marked as failing?
yes, most tests are passing with most recent Enzyme.
@devmotion do you know why we are testing Enzyme only on Others tests?
I think Enzyme might still struggle with a few MvNomal tests, which was revealed after 8b9fa77
open issues and we'll get those sorted!
Oh yeah runtime activity isn't supported yet for forward mode blas, so my recommendation here is to turn on runtime activity more selectively for which tests need it
Some open issues that are causing test failures here. Some might just need marking as broken, others seem more in need of fixing:
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1620 (marked as broken)
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1603 (marked as broken)
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1601
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1599
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1649
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1629
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1626
- ~~https://github.com/EnzymeAD/Enzyme.jl/issues/1623~~
- Likely a Julia compiler bug
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1621
- [ ] https://github.com/EnzymeAD/Enzyme.jl/issues/1696
See also:
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1625
- [x] https://github.com/EnzymeAD/Enzyme.jl/issues/1634
I was in the process of checking if I could find more issues in this test suite, but got distracted by https://github.com/TuringLang/DynamicPPL.jl/issues/643 and related issues. Will get back to this once I'm done minimising what I've found there.
I marked distributions that involve ccalls without rules as broken so CI won't fail. Enzyme seem to run in most cases but could return incorrect gradients some distribution density functions.
Cool, open an issue for whatever is wrong and we’ll get it fixed
Draft PR for doing the reverse, namely adding Distributions.jl integration tests into Enzyme: https://github.com/EnzymeAD/Enzyme.jl/pull/1819
We're going to need individual issues for any failures. The PR above doesn't presently work as it hits a version issue with QuadGK.
Closed in favour of https://github.com/EnzymeAD/Enzyme.jl/pull/1819