Subselect probes by input length
Summary
Some targets will have artificial limits on input length that are independent of the model (e.g. a web frontend that allows only n characters/words of input)
Motivation
Running full sets of probes against these targets is necessarily wasteful and will not tell us anything meaningful about robustness. If we subselect probes by input length, we can reduce load and improve accuracy.
Interesting feature. Are there concrete examples of this?
Do we have a discrete list of such targets that can have their input lengths capped?
Not really - some are tracked manually in the openai module
Some targets will have artificial limits on input length that are independent of the model (e.g. a web frontend that allows only n characters/words of input)
This sounds like it requires three ingredients
- Knowledge of the max length, maybe set by config or a generator attrib
- Knowledge of prompt length, available after prompt is composed, requiring a tokenizer / estimation. A pattern will emerge with #1112
- Orchestration-level intervention to not pose the prompt. This could be represented as prompt: whatever, output: None, which will come back as a skip - that seems appropriate, since the prompt is skipped.
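The three ingredients above could be sketched roughly as follows. This is a hypothetical illustration, not garak's actual API: `max_input_len`, `estimate_tokens`, and `pose_or_skip` are invented names, and the character-based token estimate is a crude placeholder for the tokenizer pattern expected to emerge with #1112.

```python
from typing import Optional, Tuple


def estimate_tokens(prompt: str) -> int:
    # Placeholder estimate until a real tokenizer is wired in:
    # roughly 4 characters per token for English text.
    return max(1, len(prompt) // 4)


def call_target(prompt: str) -> str:
    # Stand-in for the real generator call.
    return "response to: " + prompt


def pose_or_skip(prompt: str, max_input_len: Optional[int]) -> Tuple[Optional[str], bool]:
    """Return (output, skipped).

    max_input_len would come from config or a generator attribute
    (ingredient 1); prompts whose estimated length exceeds it are not
    posed, and the None output is recorded as a skip (ingredient 3).
    """
    if max_input_len is not None and estimate_tokens(prompt) > max_input_len:
        return None, True
    return call_target(prompt), False
```

The skip is surfaced by returning `None` as the output rather than raising, so downstream evaluation can count the attempt as skipped instead of failed.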