stefanobranco comments

Results: 8 comments by stefanobranco

Issue with custom dataset after updating to flair 0.11

Hi @alanakbik! Thanks for the feedback. I completely uninstalled the flair package and then reinstalled it, and now I can no longer reproduce the problem either. It seems something must...

Issue with custom dataset after updating to flair 0.11

Hey @alanakbik! Sorry to dig this out again, but turns out the issue is not quite resolved after all, and I think I figured out the root cause. We are...

Configurable NCCL timeout

I'm having the same issue, and I can't quite figure it out. `docker run --gpus '"device=0,1"' --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN -p 8000:8000 -v /mnt/machinelearning/:/data ghcr.io/huggingface/text-generation-inference:1.0.3 --model-id meta-llama/Llama-2-7b-chat-hf --sharded true` I'm...

TransformerEngine FP8 speedup

Has there been any development on this? From what I understand FP8 support is still quite limited in TGI (the docs mention this is not the fastest due to unpacking...

Webserver crashing with GPTQ model `Server error: transport error Error: Warmup(Generation("transport error"))`

Since 2.1 is not gonna be out for another two months, I assume the easiest workaround for now is probably gonna be to completely rebuild the docker container with the...

TGI keeps crashing with 'device-side assert triggered'

Sometimes this also just causes the server to hang indefinitely it seems. I'll get a debug entry for generate, but nothing further happens: ``` DEBUG generate{parameters=GenerateParameters { best_of: None, temperature:...

[Bug]: vllm.engine.async_llm_engine.AsyncEngineDeadError

Does this also happen without multi-step scheduling?

Default index is not used in query with 8.13.15 client

Is there any development on this? Am I correct in understanding that if I've previously been relying on a defaultIndex setting I will now have to specifically set the index...