rename `max_tokens` throughout to `max_generation_tokens`
So it is distinct from max input tokens and the max context window.
See discussions in #1112
https://github.com/NVIDIA/garak/pull/1112#discussion_r2259893786
consider garak values:
- `ctx_len` - total context window size in tokens
- `max_input_tokens` - maximum input/prompt size for a generator
- `max_output_tokens` - maximum available/requested output size (less is fine, this is a cap not a demand)
- `input_overhead_tokens` - fixed costs on inputs (see OpenAI chat modality)
- `max_prompt_tokens` - max prompt length this turn, given overhead, system prompt/conv history, `ctx_len`, `max_input_len`, and "some output" (suggest min 150 tokens)
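To make the relationship between these values concrete, here is a minimal sketch of how `max_prompt_tokens` could be derived from the others. This is a hypothetical illustration, not garak's actual implementation; the function name, `history_tokens` parameter, and the 150-token output floor are assumptions following the list above.

```python
from typing import Optional

MIN_OUTPUT_TOKENS = 150  # suggested minimum reserved for "some output"


def compute_max_prompt_tokens(
    ctx_len: int,
    input_overhead_tokens: int = 0,
    history_tokens: int = 0,
    max_input_tokens: Optional[int] = None,
) -> int:
    """Max prompt length this turn: what remains of the context window
    after fixed input overhead, conversation history, and a reserved
    minimum output budget, further capped by max_input_tokens if set."""
    budget = ctx_len - input_overhead_tokens - history_tokens - MIN_OUTPUT_TOKENS
    if max_input_tokens is not None:
        budget = min(budget, max_input_tokens)
    return max(budget, 0)  # never return a negative budget
```

Note that `max_output_tokens` does not appear directly: since it is a cap rather than a demand, only the minimum output reservation is subtracted from the prompt budget in this sketch.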
we assume tiktoken by default, and I guess all of these params are optional
I would expand this issue to contextualize that removal of `max_tokens` is not the primary goal. Accounting for and enforcing token budgets needs to be more straightforward and consistent across all generators.
Oh, is it not? I see accounting and naming as parallel, independent efforts
To me, the naming is intrinsically linked to how the value is used. Accounting, as a feature, is independent to some extent, but core usage and name are coupled in my mind at this time.
OK. I am mindful of oai-specific maths affecting how we think of token counting in general.