ankutalev

Results 4 issues of ankutalev

**What is your question?** Hello! I want to implement an elementwise epilogue that depends on the output matrix coordinates, i.e. ``` d_ij = F(alpha * sum_k(a_ik * b_kj) + c_ij, i, j)...

question
? - Needs Triage
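
For reference, the formula quoted in the question above can be written as a plain, unoptimized Python sketch. This is not the library's epilogue API, only an illustration of the semantics being asked for. The function names `gemm_with_coordinate_epilogue` and `zero_below_diagonal` are hypothetical, and the example `F` (masking entries below the diagonal) is an assumption chosen only to show a function that really depends on `i` and `j`.

```python
def gemm_with_coordinate_epilogue(a, b, c, alpha, f):
    """Reference semantics: d[i][j] = f(alpha * sum_k(a[i][k] * b[k][j]) + c[i][j], i, j).

    a is an M x K matrix, b is K x N, c is M x N, all given as lists of rows.
    f is the elementwise epilogue; it receives the value and its coordinates.
    """
    m, k_dim, n = len(a), len(b), len(b[0])
    d = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Accumulate the dot product of row i of a and column j of b.
            acc = sum(a[i][k] * b[k][j] for k in range(k_dim))
            # Apply the epilogue with the output coordinates (i, j).
            d[i][j] = f(alpha * acc + c[i][j], i, j)
    return d


def zero_below_diagonal(value, i, j):
    # Hypothetical F: keep entries on or above the diagonal, zero the rest.
    return value if j >= i else 0.0


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm_with_coordinate_epilogue(a, identity, c, 2.0, zero_below_diagonal))
```

A fused GPU epilogue would need the same information per element: the accumulator value, the matching `c_ij`, and the global row and column indices of the output element.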

I understand that this project is abandoned, but maybe there is still a chance for this to be merged.

**Describe the bug** See title: I expected GroupedGemm to work when the workspace pointer is 16-bit aligned, but it fails with `Got bad cuda status: misaligned address at line: 596` for 16...

bug
? - Needs Triage

Hello! This MR provides two things: 1) zero points for the default mode, and 2) GPT-Q [semantics](https://pytorch.org/blog/accelerating-triton/). Closes #2261