[CLI] Option to batch run on evaluation samples

Open ssbushi opened this issue 11 months ago • 1 comments

Is your feature request related to a problem? Please describe. The current evals implementation is serial and takes a long time when using a complex evaluator or a large dataset.

Describe the solution you'd like Add a batch option to the CLI that groups multiple samples together when running evaluation (not necessarily inference, since batching there may interfere with the trace collection process). This has already been implemented as a proof of concept (POC) in https://github.com/firebase/genkit/commit/5bded06a4885fc136ce43738b5779059299ac423
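
For illustration only, the batching idea could be sketched roughly as follows. This is not the code from the linked commit: `chunk`, `evaluateInBatches`, and the `Evaluate` type are hypothetical names, and the sketch assumes each sample can be scored independently by an async function. Samples within a batch run concurrently, while batches run one after another.

```typescript
// Hypothetical sketch; not Genkit's implementation or API.
// An evaluator is assumed to be an async function that scores one sample.
type Evaluate<S, R> = (sample: S) => Promise<R>;

// Split the dataset into consecutive groups of at most `batchSize` samples.
function chunk<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Evaluate the samples of each batch concurrently, one batch at a time.
// Results are returned in the same order as the input samples.
async function evaluateInBatches<S, R>(
  samples: S[],
  batchSize: number,
  evaluate: Evaluate<S, R>
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(samples, batchSize)) {
    const batchResults = await Promise.all(batch.map((s) => evaluate(s)));
    results.push(...batchResults);
  }
  return results;
}
```

With a batch size of 1 this degenerates to the current serial behavior, so the batch size acts as an upper bound on how many evaluator calls are in flight at once.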

Additional Requirements:

  • have better span names,
  • link a metric to the corresponding span in the evaluator trace.

Additional context https://github.com/firebase/genkit/commit/5bded06a4885fc136ce43738b5779059299ac423

ssbushi avatar Jan 22 '25 16:01 ssbushi

Any update on this? We also have a dataset where each item takes 8 minutes to run, and batch size support for running evals concurrently would help significantly.

anilgulecha avatar May 22 '25 07:05 anilgulecha

Hi @anilgulecha

I've just merged support for batching on the Genkit tooling side (JS only for now). Once the next release is out, you can use it by passing the --batchSize=<number> flag to the eval:flow or eval:run commands.

I'm also adding support in the Dev UI and that should be out very soon too!

Thanks!

ssbushi avatar May 29 '25 15:05 ssbushi