More parameters for `AutoEnzyme`
I think it would make sense to add more parameters to `AutoEnzyme`, like:
- `deferred = true / false` to use `autodiff_deferred` instead of `autodiff`
- `chunksize` to imitate the `AutoForwardDiff` API and allow selecting a chunk size for Jacobians and Hessians
@wsmoses @vchuravy what do you think?
Related:
- https://github.com/JuliaDiff/DifferentiationInterface.jl/issues/819#issuecomment-3070516150
`deferred` should not be needed. Enzyme does this automatically now, and it is only needed when using it directly in GPU kernels.
gentle bump on this @gdalle since it came up in the linked thread. would be great to have
I would love some input on the options we'd like there. ForwardDiff has two:
- `chunksize=nothing` for automatic selection, upper-bounded by 12 for bandwidth reasons
- `chunksize::Integer` for fixed manual selection
Does Enzyme need a third option which always takes the batch size to be the length of the vector (aka computing the whole Jacobian in a single batch for example)?
Related:
- https://github.com/EnzymeAD/Enzyme.jl/issues/1542
- https://github.com/EnzymeAD/Enzyme.jl/pull/1545
yeah the "same size as vector" does have performance benefits, so it would be nice to represent
I was thinking we could use chunksize=Inf for that one. Sure, Inf is technically a Float but typemax(Int) is much less readable and Inf gets the point across rather well. How does that sound?
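To make the three options concrete, here is a minimal sketch of how a `chunksize` value could be turned into an actual batch size for an input of length `n`. `resolve_batchsize` and `AUTO_CHUNK_MAX` are hypothetical names (not part of ADTypes, ForwardDiff, or Enzyme), and the automatic rule is simplified to `min(n, 12)`; ForwardDiff's real heuristic may differ.

```julia
# Hypothetical helper, for illustration only.
const AUTO_CHUNK_MAX = 12  # upper bound mentioned above for automatic selection

# chunksize = nothing: automatic selection, capped (simplified rule)
resolve_batchsize(::Nothing, n::Integer) = min(n, AUTO_CHUNK_MAX)

# chunksize::Integer: fixed manual selection, never larger than the input
function resolve_batchsize(chunksize::Integer, n::Integer)
    chunksize > 0 || throw(ArgumentError("chunksize must be positive"))
    return min(chunksize, n)
end

# chunksize = Inf: a single batch covering the whole input
function resolve_batchsize(chunksize::AbstractFloat, n::Integer)
    chunksize == Inf || throw(ArgumentError("the only valid float chunksize is Inf"))
    return n
end
```

With dispatch like this, any float other than `Inf` is rejected with an error instead of being silently accepted.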
I guess the question becomes what is the meaning if someone gives a float != Inf. Ideally we would restrict the type to only contain valid inputs.
Maybe a new type class?
We can check type parameters at construction with an inner constructor:
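```julia
struct AutoEnzyme{M,A,chunksize}
    mode::M
    function AutoEnzyme{M,A,chunksize}(mode::M) where {M,A,chunksize}
        # Validate the chunksize type parameter at construction time
        if chunksize isa Integer
            @assert chunksize > 0
        elseif chunksize isa AbstractFloat
            # Inf is the only float accepted (whole input in one batch)
            @assert chunksize == Inf
        else
            @assert isnothing(chunksize)
        end
        return new{M,A,chunksize}(mode)
    end
end
```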
Is that cleaner than say AutoEnzyme(ForwardMode, FullBatch), where FullBatch is a new struct? (I'm not sure, genuinely asking this as a design question)
I'd say that with ForwardDiff, ADTypes has already started a tradition of using built-in language values to represent specific batching behaviors. For instance, chunksize=nothing really means chunksize=AutoChunk(), in the same way that chunksize=Inf would mean chunksize=FullChunk(). Unfortunately, we can't change the ForwardDiff API without a breaking release, so I would suggest we continue down that path for the time being, but keep this change in mind for a future ADTypes v2.0?
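For comparison, the correspondence between built-in values and dedicated types could be sketched as follows. All the types here are hypothetical: `AutoChunk` and `FullChunk` are only the names floated in this thread, while `FixedChunk`, `AbstractChunking`, and `normalize_chunksize` are made up for the illustration. Nothing below exists in ADTypes or EnzymeCore today.

```julia
# Hypothetical types; none of these exist in ADTypes or EnzymeCore today.
abstract type AbstractChunking end
struct AutoChunk <: AbstractChunking end      # what chunksize=nothing means
struct FullChunk <: AbstractChunking end      # what chunksize=Inf would mean
struct FixedChunk{N} <: AbstractChunking end  # what chunksize=N means

# Map the built-in values onto the dedicated types.
normalize_chunksize(::Nothing) = AutoChunk()

function normalize_chunksize(c::Integer)
    c > 0 || throw(ArgumentError("chunksize must be positive"))
    return FixedChunk{Int(c)}()
end

function normalize_chunksize(c::AbstractFloat)
    c == Inf || throw(ArgumentError("the only valid float chunksize is Inf"))
    return FullChunk()
end

# Dedicated types pass through unchanged.
normalize_chunksize(c::AbstractChunking) = c
```

A normalization layer of this kind would let the keyword keep accepting `nothing`, integers, and `Inf` now, while leaving room to accept the dedicated types directly in a later release.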
I mean we could equally put FullChunk/etc in EnzymeCore proper [like mode]
Down the road we'll need the same chunking options for Mooncake too (https://github.com/chalk-lab/Mooncake.jl/discussions/533), and we already have them for ForwardDiff. I think it would be nice for users if these three were handled in the same way?
perhaps, though per https://github.com/SciML/ADTypes.jl/issues/123 users already need to import EnzymeCore.Forward anyways atm
I'm not sure how to get around that one, because Enzyme's mode objects are rather sophisticated. For Mooncake I'll probably do AutoMooncake(; mode=ADTypes.ForwardMode()) and call it a day. Maybe we could add such a shortcut for Enzyme as well, but it wouldn't incorporate things like runtime activity, holomorphy, etc.
honestly beyond mode, and batch, I think the only one that users really care about is runtime activity (x/ref https://github.com/SciML/ADTypes.jl/issues/85#issuecomment-2374897752 / https://github.com/SciML/ADTypes.jl/issues/113 ), so if one needs more than that they can use mode, if not ADTypes can construct the mode itself
Would a constructor like this one look good to you?
```julia
AutoEnzyme(;
    mode::Union{Enzyme.Mode,ADTypes.AbstractMode},
    function_annotation=nothing,
    runtime_activity=false,
    chunksize::Union{Nothing,Int,AbstractFloat}=nothing,
)
```
When ADTypes.ForwardMode() or ADTypes.ReverseMode() is provided, then the custom runtime_activity kwarg is used to construct the Enzyme object. The remaining question would be: what if an Enzyme mode is passed with runtime activity set to true, but the kwarg is set to false?
maybe have runtime_activity::Union{Nothing,Bool}=nothing and if non-nothing it overrides the mode?
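A minimal sketch of that override rule, using a stand-in type rather than a real Enzyme mode. `ToyMode` and `resolve_runtime_activity` are hypothetical names for illustration; an actual implementation would go through whatever mechanism Enzyme provides for toggling runtime activity on its mode objects.

```julia
# Stand-in for an Enzyme mode object; hypothetical, for illustration only.
struct ToyMode
    runtime_activity::Bool
end

# `nothing` keeps whatever the mode already says.
resolve_runtime_activity(mode::ToyMode, ::Nothing) = mode

# A Bool overrides the setting carried by the mode.
resolve_runtime_activity(mode::ToyMode, flag::Bool) = ToyMode(flag)
```

Under this rule, passing a mode with runtime activity enabled and leaving the keyword at `nothing` preserves the mode as given, so there is no conflict to resolve unless the user explicitly asks for one.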
I'd be okay with that. PR incoming in the next few days
Following up on this, should we add deferred to AutoEnzyme or not? Otherwise #124 is ready, except for the corresponding implementation in Enzyme (https://github.com/EnzymeAD/Enzyme.jl/pull/2659)