Define parent_job_for_tape_type
As noted in https://github.com/JuliaGPU/CUDA.jl/pull/2260#discussion_r1508550355 it is a bit akward to add a specific interface to KernelAbstraction so instead we add it to EnzymeCore.
But what should the argument type be here? That’s similarly weird to have a function that takes a KA type in EnzymeCore
On Tue, Mar 5, 2024 at 12:38 PM Valentin Churavy @.***> wrote:
@.**** commented on this pull request.
In lib/EnzymeCore/src/EnzymeCore.jl https://github.com/EnzymeAD/Enzyme.jl/pull/1328#discussion_r1513324869:
@@ -239,6 +239,8 @@ end
function tape_type end
+function parent_job_for_tape_type end
Its pretty much https://github.com/JuliaGPU/CUDA.jl/blob/8e25fc1ada7f63e2e9b93785b451a751e58a20e1/ext/EnzymeCoreExt.jl#L9
for the extended version of tape_type introduced in #1104 https://github.com/EnzymeAD/Enzyme.jl/pull/1104 we need to be able to construct a parent_job explicitly. There is a question who should own that API. Originally I had it in KA, but it makes more sense in EnzymeCore, since otherwise we can never call the new tape_type function.
— Reply to this email directly, view it on GitHub https://github.com/EnzymeAD/Enzyme.jl/pull/1328#discussion_r1513324869, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTUXEZ7Z4MUZEICG6TH5TYWYGJTAVCNFSM6AAAAABEHULMSSVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSMJXHA2DKOJRGM . You are receiving this because you commented.Message ID: <EnzymeAD/Enzyme. @.***>
It doesn't need to be a KA type, we could imagine a backend (IPU?) that doesn't use KA.
We are simply declaring a name for this function here. For packages that use KA and want to use the generic rule in KA they can use the KA backend types, for others it is just a name and they could use their own type to dispatch on.
Copying here for persistence:
I guess my issue is that I don't understand what this function is supposed to do.
Eg a kernel or function could have a type tape of float64 (eg squared something), but it's ambiguous what parent job that corresponds to.
Also the KA backend arg that you pass in isn't a tape type that enzyme would generate so thst confuses me too.
I guess my issue is that I don't understand what this function is supposed to do.
It returns the parent_job to pass to tape_type. tape_type(parent_jov_for_tape_type(...), ...)
Ah so this is really parent job for use in tape type, not generate a parent job from a tape type.
The name feels too specific, maybe parent_job_for_context ? But similarly then I'd want to undertand what args are legal for the context and what they mean
But similarly then I'd want to undertand what args are legal for the context and what they mean
Sadly we can't dispatch on modules, we could use Val(:CUDA) or something similar.
What we want to express is for some descriptor of "compilation target" we need to obtain a suitable GPUCompiler job. GPUCompiler doesn't define what targets are, that is left up to the users of GPUCompiler. KernelAbstraction has a notion of compute target that is a viable standing for compilation target. But IPU.jl doesn't use KernelAbstractions (so the generic rule in KA doesn't apply).
So options for arguments are:
- Leave it unspecified (as I have here), maybe add a default (that returns NativeJob)
- Do dispatch on symbol, but then KA needs to define backend -> symbol conversion and symbol dispatch is ugly.
Apologies: realized this is a somewhat rambly chain of thought as I think this through:
I guess a clarifying question that may help answer this, in what context would we need to call to get this information?
E.g. inside of CUDA.jl, for making the tape, we would presumably know that we need a CUDA parent job so hypothetically it could side-step this (or call it with any constant input, like :CUDA).
For KA, writing the generic form of the rule requires converting whatever KA context there is into a context here.
Are there other contexts we would want or need this?
At the moment the only conditional dispatch I can foresee is the KA one, which is great since then we could use the KA type. But in that case, why have this here and not KA.
So I wonder if there's another type we would want (and could apply conversions to). Since we'd only want this for GPUCompiler jobs I suppose that's relatively small (and perhaps we could consider some enum type there? but also that then begs the question of why not have that in GPUCompiler then).
But in that case, why have this here and not KA.
The question is indeed who owns the extension. For me the extension is owned by Enzyme since AD
is not part of the core functionality of KA. We host the extension in KA since it needs to be kept more in sync with the internals of KA, but Enzyme should provide all
the necessary interfaces to make it feasible to write a rule. Right now using the new form of tape_type that requires a CompilerJob is only use-able directly by GPUCompiler users like CUDA.jl or AMDGPU.jl. KA doesn't need to know what a CompilerJob is.
So the question is if we want KA to have a generic rule it needs some way to query the correct tape type. From the extension perspective the natural object to query with is the backend, which is owned by CUDA.jl or AMDGPU.jl, we could of course add KA as a dependency to EnzymeCore, so that we can access the abstract backend type, but we can use monkey patching to avoid the load time impact.
but also that then begs the question of why not have that in GPUCompiler then
GPUCompiler does intentionally not specify what the possible set of users is, there is a limited set of targets, but these are not quite meant as interface types.
What about making a subpackage in KA like KernelGradiwnts in the past.
I’m somewhat skeptical of having an interface function whose arguments aren’t owned by the package or depended on (KA shouldn’t be a dependency of EnzymeCore since we already got complaints about and since removed the Adapt dependency).
It’s also a bit weird since while now needed for tape type it’s not really related to AD, but the compiler job. Eg I see this more like a similar GPUCompiler job for this package function.
Maybe we make a new compilerjobs package?
On Tue, Mar 5, 2024 at 3:47 PM Valentin Churavy @.***> wrote:
But in that case, why have this here and not KA.
The question is indeed who owns the extension. For me the extension is owned by Enzyme since AD is not part of the core functionality of KA. We host the extension in KA since it needs to be kept more in sync with the internals of KA, but Enzyme should provide all the necessary interfaces to make it feasible to write a rule. Right now using the new form of tape_type that requires a CompilerJob is only use-able directly by GPUCompiler users like CUDA.jl or AMDGPU.jl. KA doesn't need to know what a CompilerJob is.
So the question is if we want KA to have a generic rule it needs some way to query the correct tape type. From the extension perspective the natural object to query with is the backend, which is owned by CUDA.jl or AMDGPU.jl, we could of course add KA as a dependency to EnzymeCore, so that we can access the abstract backend type, but we can use monkey patching to avoid the load time impact.
but also that then begs the question of why not have that in GPUCompiler then
GPUCompiler does intentionally not specify what the possible set of users is, there is a limited set of targets, but these are not quite meant as interface types.
— Reply to this email directly, view it on GitHub https://github.com/EnzymeAD/Enzyme.jl/pull/1328#issuecomment-1979691839, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTUXDUED5FYLDIGTFRQ7TYWY4PTAVCNFSM6AAAAABEHULMSSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZZGY4TCOBTHE . You are receiving this because you commented.Message ID: <EnzymeAD/Enzyme. @.***>
I agree with you that this name does belong into KernelAbstractionsEnzymeExt and there is some weirdness putting it into either.
The benefit of extensions is that they are automatically loaded, creating a third package like KernelGradients would require the user to load it manually. Extensions in Julia currently can't have their own dependencies.
Ready from my side. The EnzymeCore extension in KernelAbstractions doesn't have access to GPUCompiler. So I kept the API of returning the job object.