
Enzyme integration likely causes divergent kernels

Open vchuravy opened this issue 1 year ago • 0 comments

While looking at #583 I noticed that the aug_fwd kernel looks like:

function aug_fwd(
        ctx,
        f::FT,
        ::Val{ModifiedBetween},
        subtape,
        ::Val{TapeType},
        args...,
    ) where {ModifiedBetween, FT, TapeType}
    # A2 = Const{Nothing} -- since f->Nothing
    forward, _ = EnzymeCore.autodiff_deferred_thunk(
        ReverseSplitModified(ReverseSplitWithPrimal, Val(ModifiedBetween)),
        TapeType,
        Const{Core.Typeof(f)},
        Const{Nothing},
        Const{Core.Typeof(ctx)},
        map(Core.Typeof, args)...,
    )

    # On the GPU: F is a per thread function
    # On the GPU: subtape::Vector
    if __validindex(ctx)
        I = __index_Global_Linear(ctx)
        subtape[I] = forward(Const(f), Const(ctx), args...)[1]
    end
    return nothing
end

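This will create divergent execution of barrier operations; see https://github.com/JuliaGPU/KernelAbstractions.jl/pull/558#issue-2815921036. The call to `forward` sits inside the `if __validindex(ctx)` branch, so any barrier in the differentiated kernel (e.g. `@synchronize`) is only reached by the work-items that have a valid index. Likely this is also broken with `@kernel unsafe_indices=true`.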
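To make the concern concrete, here is a minimal plain-Julia sketch (no GPU or KernelAbstractions needed) that counts how many work-items per workgroup would pass a guard like `__validindex(ctx)`. It assumes a 1-D launch whose last workgroup is padded to the full group size. `guarded_counts` and `divergent_groups` are hypothetical helper names for illustration only, not part of KernelAbstractions or Enzyme.

```julia
# Hypothetical helper, not part of KernelAbstractions: for a 1-D launch, count
# how many work-items in each workgroup pass an `if valid` guard like the
# `__validindex(ctx)` check quoted above.
function guarded_counts(ndrange::Int, groupsize::Int)
    ngroups = cld(ndrange, groupsize)  # groups launched, rounding up
    return map(1:ngroups) do g
        first_index = (g - 1) * groupsize + 1
        # Work-items whose global linear index lies inside the ndrange.
        count(i -> i <= ndrange, first_index:(first_index + groupsize - 1))
    end
end

# A group is divergent at a guarded barrier when only some of its work-items
# pass the guard: the rest never reach the barrier.
function divergent_groups(ndrange::Int, groupsize::Int)
    counts = guarded_counts(ndrange, groupsize)
    return findall(c -> 0 < c < groupsize, counts)
end

divergent_groups(10, 4)  # group 3 covers indices 9:12, but only 9 and 10 are valid
divergent_groups(8, 4)   # no partial group, so the guard causes no divergence
```

Whenever the ndrange is not a multiple of the group size, the last group is partially masked, and a barrier placed under the guard would be reached by only part of that group.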

cc: @michel2323

vchuravy, Mar 12 '25 12:03