stefanobranco

8 comments by stefanobranco

Hi @alanakbik! Thanks for the feedback. I completely uninstalled the flair package and then reinstalled it, and now I can no longer reproduce the problem either. It seems something must...

Hey @alanakbik! Sorry to dig this out again, but it turns out the issue is not quite resolved after all, and I think I figured out the root cause. We are...

I'm having the same issue, and I can't quite figure it out. `docker run --gpus '"device=0,1"' --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$HUGGING_FACE_HUB_TOKEN -p 8000:8000 -v /mnt/machinelearning/:/data ghcr.io/huggingface/text-generation-inference:1.0.3 --model-id meta-llama/Llama-2-7b-chat-hf --sharded true` I'm...

Has there been any development on this? From what I understand, FP8 support is still quite limited in TGI (the docs mention this is not the fastest due to unpacking...

Since 2.1 is not gonna be out for another two months, I assume the easiest workaround for now is probably gonna be to completely rebuild the docker container with the...

Sometimes this also just causes the server to hang indefinitely, it seems. I'll get a debug entry for generate, but nothing further happens: ``` DEBUG generate{parameters=GenerateParameters { best_of: None, temperature:...

Does this also happen without multi-step scheduling?

Is there any development on this? Am I correct in understanding that, if I've previously been relying on a `defaultIndex` setting, I will now have to specifically set the index...