Shaojie WANG issues


Results: 4 issues of Shaojie WANG

[HIPIFY][question] How to convert cublasLt's API to hipblasLt's

We use cublasLt to compute linear operations in modern language models such as transformers. We know that ROCm has a similar library called hipblasLt. But hipify does not...

question

BLAS

Add mini weight kernel

tensor transpose kernels

req:
- [x] input/output: nchw->nchw-vecc, nchw-vecc->nchw
- [ ] weight: nchw->chwn-vecc
- [ ] padding transpose: for cases where c % vecc != 0, pad with 0 at vecc's tail
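The input/output and padding items in the requirement list above can be illustrated with a small host-side reference. This is only a sketch under assumptions, not the kernels from the issue: it assumes "nchw-vecc" means a packed layout of shape [N, ceil(C/vecc), H, W, vecc], and the function names `nchw_to_nchw_vecc` and `nchw_vecc_to_nchw` are hypothetical. It works on flat Python lists so the index arithmetic is visible.

```python
def nchw_to_nchw_vecc(src, n, c, h, w, vecc):
    """Repack a flat NCHW buffer into an assumed [N, ceil(C/vecc), H, W, vecc] layout."""
    c_blocks = (c + vecc - 1) // vecc  # channel groups, rounded up
    # Zero-filled destination: when c % vecc != 0 the tail of the last
    # vecc group is never written, so it stays padded with 0.
    dst = [0] * (n * c_blocks * h * w * vecc)
    for i_n in range(n):
        for i_c in range(c):
            cb, cv = divmod(i_c, vecc)  # group index, lane inside the group
            for i_h in range(h):
                for i_w in range(w):
                    src_idx = ((i_n * c + i_c) * h + i_h) * w + i_w
                    dst_idx = (((i_n * c_blocks + cb) * h + i_h) * w + i_w) * vecc + cv
                    dst[dst_idx] = src[src_idx]
    return dst


def nchw_vecc_to_nchw(src, n, c, h, w, vecc):
    """Inverse repack: drop the zero padding and restore flat NCHW order."""
    c_blocks = (c + vecc - 1) // vecc
    dst = [0] * (n * c * h * w)
    for i_n in range(n):
        for i_c in range(c):
            cb, cv = divmod(i_c, vecc)
            for i_h in range(h):
                for i_w in range(w):
                    src_idx = (((i_n * c_blocks + cb) * h + i_h) * w + i_w) * vecc + cv
                    dst_idx = ((i_n * c + i_c) * h + i_h) * w + i_w
                    dst[dst_idx] = src[src_idx]
    return dst


if __name__ == "__main__":
    # N=1, C=3, H=1, W=2 with vecc=2, so the last group has one padded lane.
    x = [1, 2, 3, 4, 5, 6]
    packed = nchw_to_nchw_vecc(x, 1, 3, 1, 2, 2)
    print(packed)
    print(nchw_vecc_to_nchw(packed, 1, 3, 1, 2, 2))
```

A reference like this is mainly useful for checking GPU kernel output element by element; a real kernel would assign the same index mapping to threads rather than looping on the host.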

frequently merge new asm file to miopen

As we continue to optimize the performance and stability of igemmGen, and since this tool can generate more efficient kernels for igemm or direct conv, we may think about how to merge...