AMDMIGraphX
Scatter elements simplification
Example series of instructions found in Longformer:
@145 = hip::copy(@121,@144) -> half_type, {4, 4, 256, 513}, {525312, 131328, 513, 1}: 0.0193204ms, 1%
@146 = load[offset=8404992,end=12607488](@1) -> half_type, {4, 4, 256, 513}, {525312, 131328, 513, 1}: 0.00071044ms, 1%
@147 = hip::copy(@121,@146) -> half_type, {4, 4, 256, 513}, {525312, 131328, 513, 1}: 0.0183114ms, 1%
....
@151 = gpu::code_object[code_object=5072,symbol_name=scatter_elements_kernel,global=622592,local=1024,](@143,@150,@147) -> half_type, {4, 4, 256, 513}, {525312, 131328, 513, 1}: 0.0539156ms, 1%
@152 = gpu::code_object[code_object=5072,symbol_name=scatter_elements_kernel,global=622592,local=1024,](@55,@151,@145) -> half_type, {4, 4, 256, 513}, {525312, 131328, 513, 1}: 0.054144ms, 1%
In cases where both scatters use the same axis, we might be able to simplify this to a single scatter op.
Shiv to check whether this is possible.
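As a rough illustration of the idea, here is a minimal pure-Python sketch. It assumes the simplest case: two scatters chained so that the output of the first is the data input of the second, both on the same axis, with no reduction. Whether the Longformer trace above actually has that shape has not been confirmed. `scatter_elements` and `concat` are hypothetical reference helpers written for this sketch, not MIGraphX APIs.

```python
import copy

def scatter_elements(data, indices, updates, axis):
    """Reference ScatterElements for 2-D nested lists (no reduction).

    Hypothetical helper for illustration only; updates are applied in
    iteration order, so a later duplicate index overwrites an earlier one.
    """
    out = copy.deepcopy(data)
    for i, row in enumerate(indices):
        for j, idx in enumerate(row):
            if axis == 0:
                out[idx][j] = updates[i][j]
            else:
                out[i][idx] = updates[i][j]
    return out

def concat(a, b, axis):
    """Concatenate two 2-D nested lists along the given axis."""
    if axis == 0:
        return a + b
    return [ra + rb for ra, rb in zip(a, b)]

axis = 1
data = [[0.0] * 4 for _ in range(3)]

# First scatter: indices and updates of shape 3x2.
idx1 = [[0, 1], [2, 3], [0, 2]]
upd1 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Second scatter: indices and updates of shape 3x1, same axis.
idx2 = [[2], [0], [3]]
upd2 = [[7.0], [8.0], [9.0]]

# Two chained scatters, as in the assumed pattern.
two_step = scatter_elements(
    scatter_elements(data, idx1, upd1, axis), idx2, upd2, axis
)

# Candidate simplification: one scatter over concatenated indices/updates.
one_step = scatter_elements(
    data, concat(idx1, idx2, axis), concat(upd1, upd2, axis), axis
)

assert two_step == one_step
```

The merge relies on the two index tensors agreeing on every dimension except the scatter axis, so that they can be concatenated. It is also only obviously safe when the two scatters never write the same location, or when the kernel is known to apply updates in order; a parallel kernel may not guarantee last-write-wins for duplicate indices.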