[BUG] Failed to compile example 47_ampere_gemm_universal_streamk
Describe the bug
I tried to replace half_t with bfloat16_t in examples/47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk.cu, but encountered compilation errors.
Steps/Code to reproduce bug
Here is the diff:
Here is part of the error output:
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
Have you tried changing the accumulation type to fp32? See https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html for details on which data type configurations are supported.
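For illustration, the suggested change might look roughly like the fragment below. This is only a sketch: the alias names (ElementA, ElementB, ElementC, ElementAccumulator) are assumed to match the ones declared near the top of the example source and should be checked against your checkout. As far as I know, bfloat16 tensor core operations on Ampere accumulate in fp32 only, which is why the accumulator alias becomes float rather than bfloat16_t.

```cpp
// Sketch only: alias names are assumptions and may differ in your checkout
// of examples/47_ampere_gemm_universal_streamk/ampere_gemm_universal_streamk.cu.
#include "cutlass/numeric_types.h"  // provides cutlass::bfloat16_t

using ElementA           = cutlass::bfloat16_t;  // operand A, previously cutlass::half_t
using ElementB           = cutlass::bfloat16_t;  // operand B, previously cutlass::half_t
using ElementC           = cutlass::bfloat16_t;  // output (assumed to follow the inputs)
using ElementAccumulator = float;                // fp32 accumulation, as suggested above
```

If the epilogue in the example takes its compute type from the accumulator alias, it should pick up float automatically; otherwise that template argument may need the same change.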
This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.