garak try to enable `parallel_attempts

It's sad (and slow) when parallelisation is off but a generator can support it.

Proposal:

benchmarks show parallel_attempts has more impact than parallel_requests (over a whole default run) - try to set this to be used by default
not all generators support it; we should have a flag in each generators that sets whether it should be run in parallel by default
- let's repurpose parallel_capable
- it's True by default
- [ ] needs to be set to False in ggml
- [ ] needs to be sensibly set to False or True for each Hugging Face class
- it should be possible to override parallel_capable somehow. options:
  1. ignore parallel_(requests|attempts) if generator's parallel_capable == False
  2. use parallel_capable to determine default parallel configuration defaults, which can be overridden in CLI/config options (print/log a warning when this happens)
  3. setting parallel_(requests|attempts) to 1 should (effectively? actually?) override parallel_capable; print/log a notice when this happens
- what's a sensible default for the rest class?
- what's a sensible default for parallel_attempts? what's polite? I like 200 with NIMs, the SREs have comments about why this isn't good; 2 is too low. powers of two are popular - how about 8?

This will expose more people to #308

Aug 14 '24 12:08 leondz

@jmartin-tech opinions welcome!

Aug 14 '24 13:08 leondz

I like the idea of expanding parallel_capable to limiting both parallel_requests & parallel_attempts for all plugins. I think there could be an argument for wanting to allow concurrent requests while not allowing concurrent attempts, the simplicity of a single control though may outweigh the value.

it should be possible to override parallel_capable somehow. options

In theory, parallel_capable can already be overridden via config on a per generator class basis. The impact would be to all instance of the generator class, and could have impacts for generators within other plugins like probes and detectors.

what's a sensible default for the rest class? what's a sensible default ...

Using a single default for all types seems viable, I don't see a strong reason to limit rest differently. 8 sounds reasonable, powers of 2 are nice from a cores/threads available perspective. We could also consider an auto-detect value based on a CPU core count check and scale to 2x the CPU cores possibly with a max of 16 or something to keep it from over saturating on a high core count CPU.

Aug 16 '24 13:08 jmartin-tech

A note to add here, this will add constraints on integrators using the function generator as it puts a requirement on the package providing the function to ensure the package provides inference with no configuration of setup required on import by a sub-process.

Oct 30 '24 15:10 jmartin-tech

Good point, thanks for tracking it here. This generator is weird and somewhat unlike the others, so I'm OK with hardcoding a case around it alone.

Oct 30 '24 15:10 leondz

This issue has been automatically marked as stale because it has not had recent activity. If you are still interested in this issue, please respond to keep it open. Thank you!

Jul 01 '25 00:07 github-actions[bot]

This issue has been automatically marked as stale because it has not had recent activity. If you are still interested in this issue, please respond to keep it open. Thank you!

Sep 30 '25 00:09 github-actions[bot]

try to enable `parallel_attempts > 1` by default