llm
[FR] Add `--max-input-tokens`, `--input-limit-mode`
To avoid unintentionally sending oversized requests to the new 128k-context models and incurring unnecessary costs, it's crucial to put some safeguards in place.
The `--max-input-tokens` option should establish a maximum limit on the number of input tokens accepted, and `--input-limit-mode` should specify what action is taken when that limit is exceeded. Possible values for `--input-limit-mode` include:

- `exit`: halt the process;
- `truncate`: reduce the input to the permissible size;
- `summarize`: shorten the input, potentially using a less expensive model;
- `ask-user`: prompt the user to decide the next step.
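A minimal sketch of how the mode dispatch could work, assuming the proposed flags (none of these names exist in `llm` today; `summarize_with_cheap_model` is a hypothetical helper):

```python
from enum import Enum


class InputLimitMode(Enum):
    EXIT = "exit"
    TRUNCATE = "truncate"
    SUMMARIZE = "summarize"
    ASK_USER = "ask-user"


def summarize_with_cheap_model(tokens, budget):
    # Placeholder: a real implementation would call a cheaper model
    # to condense the input under the token budget.
    return tokens[:budget]


def apply_input_limit(tokens, max_input_tokens, mode):
    """Enforce a token budget before sending a request.

    `tokens` is the tokenized prompt; returns the (possibly reduced)
    token list, or exits / asks the user depending on `mode`.
    """
    if len(tokens) <= max_input_tokens:
        return tokens  # under budget, send as-is
    if mode is InputLimitMode.EXIT:
        raise SystemExit(
            f"Input has {len(tokens)} tokens; limit is {max_input_tokens}"
        )
    if mode is InputLimitMode.TRUNCATE:
        return tokens[:max_input_tokens]
    if mode is InputLimitMode.SUMMARIZE:
        return summarize_with_cheap_model(tokens, max_input_tokens)
    if mode is InputLimitMode.ASK_USER:
        choice = input("Input exceeds limit; [t]runcate or [a]bort? ")
        if choice.lower().startswith("t"):
            return tokens[:max_input_tokens]
        raise SystemExit("Aborted by user")
```

Routing everything through one function like this keeps the limit check in a single place, regardless of which mode the user picked on the command line.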