Michael Goin

270 comments by Michael Goin

@Tmn07 @schung-amd would either of you want to revive this PR? Sorry for losing track of this.

This seems like it might be an issue with transformers' GGUF support, since the error is in `transformers/modeling_gguf_pytorch_utils.py`. Do you have any idea, @Isotr0py? Per this dictionary in transformers,...

Okay, thank you for clarifying! @alllexx88 I would recommend opening an issue on the transformers repo to resolve this: https://github.com/huggingface/transformers/issues?q=is%3Aissue+is%3Aopen+gguf

Could Triton support conversions between fp8 and fp16? I understand the lack of compute support, but it would be nice to be able to cast and work with the type,...

@houseroad it seems worse at small M but better at large M compared to our CUTLASS kernels; however, this is only true for specific shapes. I need to do more...

> I wonder whether the next effort could be on pre-compiling just the fixed-sized encoders

Yes exactly, this is the stated plan. I just wanted to pull this complexity into...

Personally, I would appreciate a tad more line width given we already have imports that are longer than 80 characters, but this decision should come down to broad consensus. Maybe a...

@LeiWang1999 thanks for the WIP, very cool interface with bitblas as a package. Can you explain whether the GPTQ benchmarking results in vLLM were run with the base "gptq" kernels...

Thanks for all the work @LeiWang1999! I have a few high-level thoughts first on how to make landing this more straightforward:

1. Make bitblas an optional dependency and remove from...

@LeiWang1999 thanks for the ping and updates, excited to review!