[ROCm] Declare gated_act_fn() as a device function

Open rraminen opened this issue 1 year ago • 1 comments

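The declaration of gated_act_fn() must include __device__ for AMD compilers. Without it, the primary template is treated as a __host__ function while its specializations are declared with DS_D_INLINE, so the target attributes do not match and the inference_core_ops extension build fails with the errors below.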

/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:25:19: error: no function template matches function template specialization 'gated_act_fn'
DS_D_INLINE float gated_act_fn<ActivationType::GEGLU>(float x, float y)
                  ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:22:7: note: candidate template ignored: target attributes do not match
float gated_act_fn(float x, float y);
      ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:33:19: error: no function template matches function template specialization 'gated_act_fn'
DS_D_INLINE float gated_act_fn<ActivationType::ReGLU>(float x, float y)
                  ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:22:7: note: candidate template ignored: target attributes do not match
float gated_act_fn(float x, float y);
      ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:39:19: error: no function template matches function template specialization 'gated_act_fn'
DS_D_INLINE float gated_act_fn<ActivationType::SiGLU>(float x, float y)
                  ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:22:7: note: candidate template ignored: target attributes do not match
float gated_act_fn(float x, float y);
      ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:82:33: error: no matching function for call to 'gated_act_fn'
                float act_val = gated_act::gated_act_fn<ActType>(g_val, a_val);
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:113:9: note: in instantiation of function template specialization 'gated_activation_kernel<float, GEGLU, 1>' requested here
        DISPATCH_UNROLL(1);
        ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:92:25: note: expanded from macro 'DISPATCH_UNROLL'
   hipLaunchKernelGGL(( gated_activation_kernel<T, ActType, unroll_val>) \
                        ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:142:13: note: in instantiation of function template specialization 'launch_gated_activation_impl<float, GEGLU>' requested here
            launch_gated_activation_impl<T, ActivationType::GEGLU>(
            ^
/opt/conda/envs/py_3.9/lib/python3.9/site-packages/deepspeed/inference/v2/kernels/core_ops/gated_activations/gated_activation_kernels.hip:22:7: note: candidate function not viable: call to __host__ function from __global__ function
float gated_act_fn(float x, float y);
      ^
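The mismatch reported in the log above can be sketched as follows. This is a minimal illustration, not the actual DeepSpeed source: the macros `EX_DEVICE` and `EX_D_INLINE` are hypothetical stand-ins for `__device__` and `DS_D_INLINE`, the enum is a reduced copy, and the ReGLU body is an assumed definition used only for demonstration. The macros fall back to plain host code when no CUDA/HIP compiler is used, so the snippet can also be checked with an ordinary C++ compiler.

```cpp
#include <cassert>  // only needed for the host-side checks in the usage note

// Hypothetical stand-ins for the real macros (assumption, not DeepSpeed code).
#if defined(__CUDACC__) || defined(__HIPCC__)
#define EX_DEVICE __device__
#define EX_D_INLINE __device__ __forceinline__
#else
#define EX_DEVICE
#define EX_D_INLINE inline
#endif

// Reduced, illustrative copy of the activation enum.
enum class ActivationType { GEGLU, ReGLU, SiGLU };

namespace gated_act {

// Primary template. Declaring it with the device attribute gives it the
// same target as the specializations below; without the attribute, the
// AMD compiler reports "target attributes do not match".
template <ActivationType ActType>
EX_DEVICE float gated_act_fn(float x, float y);

// Example specialization (assumed ReGLU definition: y * max(x, 0)).
template <>
EX_D_INLINE float gated_act_fn<ActivationType::ReGLU>(float x, float y)
{
    return y * (x > 0.0f ? x : 0.0f);
}

}  // namespace gated_act
```

When built as host code, `gated_act::gated_act_fn<ActivationType::ReGLU>(2.0f, 3.0f)` evaluates to `6.0f`, and a negative `x` yields `0.0f`. Under hipcc or nvcc the same declarations resolve as device functions, so a `__global__` kernel can call them.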

rraminen, Nov 30 '23 22:11

@mrwyattii and @cmikeh2 - do we expect these to work on AMD, and do you have any thoughts on the change?

loadams, Dec 13 '23 21:12

@rraminen - AMD isn't currently supported in FastGen, so would it make sense to merge this PR together with AMD support when that comes in? For now, this change won't have any impact.

loadams, Jan 10 '24 18:01