
Running batch predictions using DSPy compiled prompts on a large dataset

Open sarora-roivant opened this issue 1 year ago • 4 comments

Hi,

I've implemented a clinical entity extraction pipeline using DSPy for processing patient notes. The pipeline extracts various entities (drugs, diseases, procedures, lab tests) and performs condition assessments. Currently, I'm facing challenges in scaling this pipeline to process a large dataset of approximately 500,000 notes efficiently.

Current Implementation:

  1. Data loading
  2. Note aggregation and preprocessing
  3. Entity extraction using DSPy signatures and predictors
  4. Condition assessment using custom DSPy modules
  5. Result processing and export

Challenges:

  1. Processing time: Currently, it takes about 5-6 minutes to process a single note.
  2. Lack of native batch processing: Each note is processed individually, leading to inefficient use of API calls and resources.
  3. Scaling difficulties: The current approach is not feasible for processing 500,000 notes in a reasonable timeframe.

Questions for Scaling:

  1. What is the recommended approach for using DSPy with large datasets (~500,000 notes)?
  2. Are there any best practices for compiling DSPy prompts for batch processing?
  3. How can we optimize the use of compiled DSPy prompts in a distributed computing environment?

Any guidance on efficiently scaling DSPy for large-scale entity extraction tasks would be greatly appreciated. I'm open to restructuring my pipeline or adopting new approaches to achieve better performance.

sarora-roivant avatar Jun 25 '24 20:06 sarora-roivant

Use dspy.evaluate.Evaluate and pass num_threads. For the metric, just pass a metric that always returns True or False

okhat avatar Jun 27 '24 02:06 okhat

@okhat Why would I need Evaluate for an inference pipeline?

Are there any updates on this, or ideas on where to look?

mancunian1792 avatar Mar 03 '25 12:03 mancunian1792

@mancunian1792 You can now do:

program.batch([dspy.Example(key1=value, ...).with_inputs('key1', ...)])

okhat avatar Mar 28 '25 22:03 okhat

Where is the source code for the batch method? I see it tested in test_parallel.py but nothing beyond that. Is it a LiteLLM utility?

I am hoping for something that somehow conforms to OpenAI's batch inference: https://platform.openai.com/docs/guides/batch
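Worth noting: DSPy's `batch` is thread-level parallelism over ordinary synchronous API calls, not a wrapper for OpenAI's asynchronous Batch API. If you want the latter (50% cheaper, 24-hour turnaround), one stdlib-only approach is to render your prompts yourself and write them in the JSONL request format that endpoint expects. The helper below is hypothetical, not part of DSPy or the OpenAI SDK.

```python
import json

def build_batch_file(prompts, model="gpt-4o-mini", path="batch_input.jsonl"):
    """Write one JSONL line per prompt in the OpenAI Batch API request format.

    Hypothetical helper: each line carries a custom_id for matching results
    back to the source note after the batch completes.
    """
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts):
            line = {
                "custom_id": f"note-{i}",
                "method": "POST",
                "url": "/v1/chat/completions",
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(line) + "\n")
    return path
```

The resulting file is then uploaded via the Files API with `purpose="batch"` and submitted with a `completion_window` of 24h; results come back as a JSONL file keyed by `custom_id`.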

chrico-bu-uab avatar May 16 '25 20:05 chrico-bu-uab