Jeff Daily
Change labels from linux.rocm.gpu to linux.rocm.gpu.2. cc @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang
Hi scikit-bio team, I have developed a SIMD C/C++/Python library similar to the SSW library. Unlike SSW, it does not provide the traceback, but it does implement global as well...
- composable_kernel as a third_party submodule
- "ck" as a `torch.backends.cuda.preferred_linalg_library()`
- reference CK gemm implementations for float, bfloat16, and half types

cc @XilunWu @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj...
This ports (copies) FBGEMM's implementation from @jwfromm. https://github.com/pytorch/FBGEMM/tree/main/fbgemm_gpu/experimental/gen_ai/src/quantize/ck_extensions/fp8_rowwise cc @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd @yanbing-j @vkuzo @albanD @kadeng @penguinwu