Control concurrency of evals

Open filip-michalsky opened this issue 4 months ago • 0 comments

Running 10+ evals in parallel on separate local browsers can become computationally expensive.

Braintrust offers a concurrency semaphore through the maxConcurrency option of Eval: https://www.braintrust.dev/docs/guides/evals/run

import { Factuality, Levenshtein } from "autoevals";
import { Eval } from "braintrust";
 
Eval("Say Hi Bot", {
  data: () =>
    Array.from({ length: 100 }, (_, i) => ({
      input: `${i}`,
      expected: `${i + 1}`,
    })),
  task: (input) => {
    return input + 1;
  },
  scores: [Factuality, Levenshtein],
  maxConcurrency: 5, // Run 5 tests concurrently
});
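
For anyone unfamiliar with what a concurrency cap does, here is a minimal sketch of the idea in plain TypeScript. It is not how Braintrust implements maxConcurrency, and `runWithLimit` is a hypothetical helper name that exists in neither Braintrust nor Stagehand. It starts at most `limit` workers, and each worker pulls the next pending item once its current one finishes.

```typescript
// Hypothetical helper (an assumption for illustration, not a Braintrust or Stagehand API).
// Runs `worker` over `items` with at most `limit` calls in flight at once.
async function runWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) {
    throw new Error("limit must be at least 1");
  }

  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item nobody has claimed yet

  // Each runner claims items one at a time until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  // Start no more runners than there are items.
  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));

  return results; // same order as `items`
}
```

In our case each item would be one eval and the worker would launch its browser, so a limit of 5 would keep at most five browsers open locally at any time.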

filip-michalsky, Oct 07 '24 00:10