dspy HFClientVLLM Multithreading not working

Open tom-doerr opened this issue 9 months ago • 0 comments

I'm using HFClientVLLM with num_threads=32, but the time evaluate takes to finish increases linearly with the number of samples. That shouldn't happen: vLLM is running on 2x A100 GPUs and I'm only using a few samples. For example, four samples take roughly 4x longer than evaluating one sample.
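
One way to tell whether the calls overlap at all is to time the same batch with different thread counts. The sketch below is not DSPy code: `fake_request` and `timed_run` are hypothetical names, and `time.sleep` stands in for one blocking request to the vLLM server. It uses only the standard library and shows the pattern one would expect if threading works: wall time stays near one request's latency instead of growing with the sample count.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(sample, latency=0.2):
    # Hypothetical stand-in for one blocking call to the model server.
    time.sleep(latency)
    return sample


def timed_run(samples, num_threads):
    # Run every sample through the pool and return (results, wall time in seconds).
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(fake_request, samples))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for threads in (1, 32):
        _, elapsed = timed_run(list(range(4)), num_threads=threads)
        print(f"num_threads={threads}: {elapsed:.2f}s for 4 samples")
```

If swapping the real client call in for `fake_request` still gives roughly 4x the time for four samples at 32 threads, the requests are probably being serialized somewhere between the caller and the server, though this sketch cannot say where.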

May 21 '24 15:05 tom-doerr
