OlivierDehaene
No, sending an API request to check your token count is not something we want. This compute needs to happen client side.
> Alternatively one could also consider adding additional settings for truncation side.

No. Server side truncation should be seen as a last resort. We will never offer enough flexibility on...

> This would allow downstream applications to better handle load and token counting of requests.

How?
Use the same tokenizer client side, either with WASM or something else.
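For illustration, a minimal sketch of counting tokens client side with the `transformers` tokenizer; the model id and the 1024-token limit below are placeholders, not actual server defaults:

```python
from transformers import AutoTokenizer

# Load the same tokenizer the server is running (model id is a placeholder)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")

def count_tokens(prompt: str) -> int:
    # Tokenize locally, no API round trip needed
    return len(tokenizer(prompt)["input_ids"])

prompt = "What is Deep Learning?"
if count_tokens(prompt) > 1024:  # placeholder for the server's max input length
    raise ValueError("Prompt is too long; truncate it client side before sending")
```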
I'm a bit confused. Do you want a wrapper or do you want the --no-tui option to exist?
> The last line of text_generation_launcher: Args seems not correct

What do you mean? The logprob should always have a value. If it does not, something is going wrong, hence...
If you set a temperature of 10e-4, why not simply use greedy decoding?
> I'm new on text generation tasks but I want to lower the "creativity" of the model and stick to stable outputs

You should not use any temperature then and...
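As a sketch with the Python client (the endpoint below is a placeholder), not sampling at all gives you greedy decoding, which is the most stable output you can get:

```python
from text_generation import Client

client = Client("http://127.0.0.1:8080")  # assumed local TGI endpoint

# Greedy decoding: do not sample and leave temperature unset
response = client.generate(
    "What is Deep Learning?",
    max_new_tokens=64,
    do_sample=False,
)
print(response.generated_text)
```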
To complete what @Narsil just said, what you would usually do instead is to add a rate limiter on the client side to avoid overloading the server (for example, limit the...
https://github.com/huggingface/text-generation-inference/blob/main/clients/python/text_generation/client.py#L285
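For example, a minimal client-side rate limiter sketched with an `asyncio.Semaphore` around the `AsyncClient`; the endpoint and the concurrency limit of 8 are assumed placeholder values:

```python
import asyncio
from text_generation import AsyncClient

MAX_CONCURRENT_REQUESTS = 8  # placeholder limit, tune to your server capacity
semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
client = AsyncClient("http://127.0.0.1:8080")  # assumed local TGI endpoint

async def generate_limited(prompt: str) -> str:
    # The semaphore caps how many requests are in flight at the same time
    async with semaphore:
        response = await client.generate(prompt, max_new_tokens=64)
        return response.generated_text

async def main():
    prompts = [f"Question {i}: what is Deep Learning?" for i in range(100)]
    answers = await asyncio.gather(*(generate_limited(p) for p in prompts))
    print(f"Received {len(answers)} responses")

asyncio.run(main())
```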