Casper

Results: 293 comments by Casper

@younesbelkada any plans to finish this or can we close it for now?

I would love some help here for implementing the tests. T4 has compute capability 7.5, so it is not compatible with the AWQ CUDA kernel for running the quantized layers...

`s^-1 * x` is not explicitly computed; according to the authors, it is fused. I added the comment in the quantization code because I understand it as being fused during the creation...
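To illustrate why `s^-1 * x` need not be computed as its own step, here is a minimal sketch of the underlying algebra. It assumes the input `x` comes from a preceding linear operation (called `V` here), so the inverse scale can be folded into that operation's weights ahead of time. The names `V`, `W`, `h`, and `matvec` are hypothetical, quantization itself is left out, and this is not taken from the AutoAWQ code.

```python
# Sketch only: shows that W @ x == (W * s) @ (s^-1 * x), and that s^-1 can be
# folded into the weights of the operation that produces x (assumed linear here).

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

s = [2.0, 4.0]                   # per-input-channel scales (illustrative values)
V = [[1.0, 2.0], [3.0, 4.0]]     # hypothetical preceding layer, x = V @ h
W = [[0.5, -1.0], [2.0, 0.25]]   # hypothetical layer whose weights get scaled
h = [1.0, -2.0]                  # input to the preceding layer

# Reference: no scaling anywhere.
reference = matvec(W, matvec(V, h))

# Scale the columns of W by s (W * diag(s)).
W_scaled = [[w * s_j for w, s_j in zip(row, s)] for row in W]

# Fold s^-1 into the rows of V (diag(s)^-1 * V), so s^-1 * x is never
# computed as a separate step at run time.
V_folded = [[v / s_i for v in row] for row, s_i in zip(V, s)]

fused = matvec(W_scaled, matvec(V_folded, h))
print(reference, fused)
```

Both paths give the same output up to floating-point error, which is why the inverse scaling can live in precomputed weights rather than in the forward pass.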

AMD is not supported at the moment. I am hoping the open-source community can come together to support AMD GPUs for AWQ at some point.

I am open to PRs that add support. If someone could contribute, it would be awesome.

No, this is not supported yet.

DeepSpeed is not supported with AutoAWQ. We use accelerate.

Hi @abhinavkulkarni, thanks for posting this. I talked with the Striped Hyena team and I am looking to implement it. I have already started on a branch below, but needs...

This kind of transient issue has been popping up ever since transformers 4.36 was released. Unfortunately, the code since transformers 4.36 is poorly suited to handling these issues around input arguments...

This is good work @qwopqwop200. I was working on the same thing on the exllama branch. It seems there could be a modest boost in speed of around 10% from...