Introduce guardrails around long prompts
It is currently easy to trigger an out-of-memory condition by prompting a model with a very long prompt. This is an expected consequence of how certain tokenizers and transformer attention are implemented. Experienced users may intentionally want to use long prompts, but less experienced users can hit this by accident and run into confusing OOM errors (#31) or extremely slow runtime performance.
It may be helpful to explore a mechanism that limits prompt length by default in order to help users avoid these friction points.
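A minimal sketch of what such a guardrail could look like, purely for discussion: the names `MAX_PROMPT_TOKENS`, `count_tokens`, and `check_prompt_length` are hypothetical and not part of the package's current API, and a real implementation would count tokens with the model's own tokenizer rather than a whitespace split.

```python
# Hypothetical guardrail sketch; names and the default limit are assumptions,
# not the package's actual API.

MAX_PROMPT_TOKENS = 2048  # assumed default limit, would be configurable


def count_tokens(prompt: str) -> int:
    """Rough token estimate; a real implementation would use the model's tokenizer."""
    return len(prompt.split())


def check_prompt_length(prompt: str, max_tokens: int = MAX_PROMPT_TOKENS) -> None:
    """Raise a clear error before inference instead of letting the model OOM."""
    tokens = count_tokens(prompt)
    if tokens > max_tokens:
        raise ValueError(
            f"Prompt is roughly {tokens} tokens, which exceeds the default limit "
            f"of {max_tokens}. Very long prompts can exhaust memory or run very "
            f"slowly; raise the limit explicitly if this is intentional."
        )
```

Raising an explicit, documented error at call time (with an override for users who genuinely want long prompts) would surface the limitation up front rather than as an opaque OOM partway through inference.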