Add abstraction for batch jobs
We've been embedding a lot more text lately (soon the Vega-Lite docs), and the previous approach of looping over chunks one at a time could be costly, so this PR introduces batch processing for chunk situating. In the future this could also be useful for batch-generating multiple reports or running evals.
This PR adds multi-provider batch support (OpenAI, Anthropic, Mistral) that processes chunks in parallel rather than sequentially and can auto-resume existing batches.
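As a rough illustration of the auto-resume idea, here is a minimal sketch with made-up names; `submit_batch`, `situate_in_batch`, and the resume file are placeholders for illustration, not the code in this PR:

```python
# Illustrative sketch only: these names are placeholders, not this PR's API.
import json
import pathlib
import uuid

RESUME_FILE = pathlib.Path(".lumen_batch_resume.json")  # hypothetical resume cache

def submit_batch(provider: str, chunks: list[str]) -> str:
    """Stub standing in for the provider-specific batch submission call."""
    return f"{provider}-batch-{uuid.uuid4().hex[:8]}"

def situate_in_batch(chunks: list[str], provider: str = "openai") -> str:
    """Submit all chunks as one batch job, resuming a prior job if one exists."""
    if RESUME_FILE.exists():
        # Auto-resume: reuse the previously submitted batch instead of
        # re-submitting (and re-paying for) the same chunks.
        return json.loads(RESUME_FILE.read_text())["batch_id"]
    batch_id = submit_batch(provider, chunks)
    RESUME_FILE.write_text(json.dumps({"provider": provider, "batch_id": batch_id}))
    return batch_id
```

Persisting the provider-side job id is what makes resumption cheap: batch jobs can take hours, so a restarted process should pick up the pending job rather than paying for the same chunks twice.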
It falls back to sequential processing if the provider's batch API isn't supported or the number of chunks is below batch_situate_threshold (default: 1000 chunks; set to 0 to disable batching).
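The dispatch decision looks roughly like the sketch below; only `batch_situate_threshold` is taken from this PR's description, and the helper functions are illustrative stubs:

```python
# Sketch of the dispatch decision; only `batch_situate_threshold` is a name
# from this PR, the helpers below are illustrative stubs.
BATCH_PROVIDERS = {"openai", "anthropic", "mistral"}

def situate_one(chunk: str, provider: str) -> str:
    """Stub for the pre-existing per-chunk (sequential) situating call."""
    return f"[{provider}] situated: {chunk[:20]}"

def situate_via_batch(chunks: list[str], provider: str) -> list[str]:
    """Stub for the new batch-API path."""
    return [situate_one(chunk, provider) for chunk in chunks]

def situate(chunks: list[str], provider: str,
            batch_situate_threshold: int = 1000) -> list[str]:
    use_batch = (
        batch_situate_threshold > 0                 # 0 disables batching
        and len(chunks) >= batch_situate_threshold  # enough chunks to pay off
        and provider in BATCH_PROVIDERS             # provider has a batch API
    )
    if use_batch:
        return situate_via_batch(chunks, provider)
    # Sequential fallback: one call per chunk, as before this PR
    return [situate_one(chunk, provider) for chunk in chunks]
```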
Batch API docs:
- OpenAI: https://platform.openai.com/docs/guides/batch
- Mistral: https://docs.mistral.ai/capabilities/batch/
- Anthropic: https://docs.anthropic.com/en/docs/build-with-claude/batch-processing