Daniël de Kok
This change adds support for quantizing Phi 3.5 MoE models.
Suppose that we have a file `add.py` in a package:

```
import cutlass.cute as cute

from . import hello


@cute.jit
def add(a, b):
    return a + b
```

and then...
# What does this PR do?

Update quantization kernels to end-of-June vLLM/Hub.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss...
# What does this PR do?

Fixes # (issue)

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...