Casper

Results: 291 comments by Casper

Excellent work @SuibhneOFoighil

This is with Qwen 2 7B

Today with 4x nodes and Qwen 2.5 7B, I logged 6.7 minutes until step 1. @vermouth1992 Do you have any idea of which process in the init is taking so...

@winglian @djsaunde This would be a super handy datasets feature! +1 from me

If you have a normal FP16/BF16 model, this does not happen. As a first step, I would suggest checking whether the model can run inference with the Hugging Face libraries.

I want to make it more flexible, but PyPI only allows one wheel, so I cannot upload multiple versions. One fix could be to implement a flag in...

@qinxuye would it help with a flag like this? https://github.com/casper-hansen/AutoAWQ/pull/582

@WoosukKwon I have used the same shapes as referenced in the original implementation, yet it does not load in vLLM for reasons I am unsure how to fix. If I...

@shiqingzhangCSU Currently there is no progress. If you have suggestions or fixes, please open a PR to my fork. I am hoping to have this feature in vLLM soon, but...

> @robertgshaw2-neuralmagic any luck with this patch? I benchmarked and those kernels are really something. Great boost on my internal tests!

@bratao I believe Rob has a branch over in...