Nicolas Patry comments

Results: 978 comments by Nicolas Patry

--quantize bitsandbytes or --quantize gptq does not work.

@solarslurpi it seems more likely to me that the GPU is not detected in the Docker image, and that the error message is a bogus one stemming from that. (I can run fine with...

--quantize bitsandbytes or --quantize gptq does not work.

A10G, but the choice of GPU doesn't matter, TGI works on a 3090 for sure. But we've seen people having issues with RunPod before. Something about shm not being properly set...

RWKV

This would be interesting, however it is pretty far away from how this repo operates (the transformer assumption is pretty strong). But since there's no past key values, no attention, it...

Add support for Speculative Decoding

Hey, thanks for proposing to contribute. Disclaimer: It's August, so a lot of the team members are off on vacation and I myself am handling various (too many perhaps) projects....

Add support for Speculative Decoding

@calvintwr we're not using `transformers` here. CPU bottlenecking is really something, but we have a longer-term solution for it: https://github.com/huggingface/candle/

"text2text-generation" pipeline fails when setting return_dict_in_generate=True

> want to use all in one tokenizer, feature extractor and model but still post process

Feels a bit power usery to me. Two options:
- Subclass pipeline and...

add intel xpu support for TGI

Overall looks pretty good! I think we'll move to a proper enum `SYSTEM` instead of `IS_XXXX` since there's no way a user could be running 2 devices simultaneously. But we'll...

add intel xpu support for TGI

> @Narsil @OlivierDehaene I add xpu smi in env runtime, do you think is it proper to add this? this is to dump intel XPU version. It's OK in the...

add intel xpu support for TGI

Feel free to open up the draft whenever you're OK so we can run the tests + merge.

@cwfitzgerald (Sorry for the ping if you're not the correct person). We're iterating quite a bit on this here -> https://github.com/ivarflakstad/metal-rs We would love some feedback if upstreaming here makes...