
Remove max_stop_sequences by default

Open sestinj opened this issue 1 year ago • 0 comments

System Info

https://github.com/huggingface/text-generation-inference/blob/1028996fb380f07ebb2a9de1d2795e176f845c59/launcher/src/main.rs#L427-L428

I think it would be best for this limit to be non-existent by default, rather than 4, or at least something higher like 16. Although client applications can detect that TGI is being used and encode the limit, it causes poorer behavior in autocomplete scenarios where more than 4 stop words are actually necessary. Yes, users of TGI can technically change this value, but many do not know to do so and will have a lower-quality first experience with whatever tools they are using.

All this said, I appreciate that this limit was originally defined by OpenAI, and following their patterns is something I generally support.

For additional background: https://github.com/continuedev/continue/issues/2380
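As a workaround until any default changes, the limit can be raised at launch time. A minimal sketch, assuming the field linked above is exposed by the launcher as a `--max-stop-sequences` flag (the usual clap derive behavior); the model ID and image tag are placeholders:

```shell
# Start TGI with a higher stop-sequence limit (sketch, not a full launch command)
docker run ghcr.io/huggingface/text-generation-inference:latest \
    --model-id <model> \
    --max-stop-sequences 16
```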

Information

  • [X] Docker
  • [X] The CLI directly

Tasks

  • [X] An officially supported command
  • [ ] My own modifications

Reproduction

Send a request with >4 stop tokens
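A minimal sketch of such a request, assuming a TGI instance on `localhost:8080` and the `stop` list in the `/generate` route's `parameters`; the HTTP call is commented out since only the payload shape matters here, and the stop sequences themselves are arbitrary autocomplete-style examples:

```python
import json

# Payload with five stop sequences, one over TGI's default
# max_stop_sequences of 4, so the server rejects it by default.
payload = {
    "inputs": "def fibonacci(n):",
    "parameters": {
        "stop": ["\ndef ", "\nclass ", "\nif ", "\n#", "\nprint"],
        "max_new_tokens": 64,
    },
}

body = json.dumps(payload)

# To actually send it (requires a running TGI instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8080/generate",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# urllib.request.urlopen(req)  # fails with a validation error by default

print(len(payload["parameters"]["stop"]))  # count of stop sequences sent
```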

Expected behavior

Ideally, a request with >4 stop tokens does not throw an error unless the person running TGI has explicitly set a limit.

sestinj avatar Sep 29 '24 21:09 sestinj