Nicolas Patry comments

Results: 977 comments of Nicolas Patry

TGI metrics don't have model_name label to indicate which model the metrics belong to

It is not up to a single TGI deployment to declare its served_model_name; you should do that at the aggregation level (during aggregation on your Prometheus probes start gathering information...

Add 'json_schema' alias to GrammarType.Json

Hi @aW3st, Thanks a lot for the PR. We're unlikely to adhere to anything `beta` from OpenAI, or even everything that OpenAI supports. The OpenAI adaptation layer exists for simple,...

Support xccl distributed backend

What's the benefit over the IPEX backend? If this allows suboptimal deployments compared to the IPEX image, I think we'd rather not merge this at all (and error out...

Support xccl distributed backend

> The change I propose in this PR is being done with above background in mind. It introduces "xccl" distributed support into TGI which can be tried out if someone...

tauri dev crashes whenever a key is pressed

@jagzmz it's probably linked to the listener not being on the main thread. I'm not sure how to fix it easily. Also please, pin `rdev` to a specific revision in...

SafetensorError does not appear in `init.pyi`

Fixed in: https://github.com/huggingface/safetensors/pull/554

Build failed due to `half` and `rand` issue

Yes, some subdependencies broke semver, which makes `cargo install` fail. `cargo install` will always attempt to use the latest patches if they exist, which would break because of the semver breaking...

Support for Golang now or support a cli for other languages?

I think other projects are maintaining their own: https://pkg.go.dev/github.com/gomlx/tokenizers#section-readme We are currently not going to support it, due to the low demand (compared to Python).

How to change the model weights in safetensors?

Load the file, modify the tensors, resave the file. This operation is destructive (you have less data than before), so a full rewrite of the file is necessary (there...
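
The load, modify, resave cycle described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the safetensors library's own code. It assumes the usual safetensors layout: an 8-byte little-endian header length, a JSON header mapping each tensor name to `dtype`, `shape` and `data_offsets`, then the raw tensor bytes. The helper names `read_safetensors`, `write_safetensors` and `drop_tensors` are hypothetical. In practice you would more likely use the library's `load_file` / `save_file` helpers.

```python
import json
import struct


def read_safetensors(path):
    """Return {name: (dtype, shape, raw_bytes)} for every tensor in the file."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len))
        data = f.read()  # the byte buffer that follows the header
    header.pop("__metadata__", None)  # optional metadata entry, ignored in this sketch
    tensors = {}
    for name, info in header.items():
        begin, end = info["data_offsets"]  # offsets are relative to the byte buffer
        tensors[name] = (info["dtype"], info["shape"], data[begin:end])
    return tensors


def write_safetensors(path, tensors):
    """Write a fresh file, recomputing every offset from scratch."""
    header, chunks, offset = {}, [], 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        for chunk in chunks:
            f.write(chunk)


def drop_tensors(src, dst, names_to_drop):
    """Load src, remove the named tensors, and save what remains to dst."""
    tensors = read_safetensors(src)
    kept = {n: t for n, t in tensors.items() if n not in names_to_drop}
    write_safetensors(dst, kept)
    return sorted(kept)
```

Because removing a tensor shifts the offsets of the tensors after it, the sketch rebuilds the header and the byte buffer together instead of patching the file in place.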

Could not start backend: cannot find tensor embeddings.word_embeddings.weight

I'm not super familiar with vLLM's recent work, but if it's anything like TGI, which is very similar, it will attempt to use ALL possible memory when loading up. Therefore...