
HFClientVLLM Multithreading not working

Open tom-doerr opened this issue 9 months ago • 0 comments

I'm using HFClientVLLM with num_threads=32, but the time evaluate takes to finish grows linearly with the number of samples. This shouldn't be the case, since vLLM is running on 2x A100 GPUs and I'm only using a few samples: for example, four samples take roughly 4x longer to evaluate than one sample.
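One way to check whether requests actually overlap is to time the same batch with one worker and with several. The sketch below is only an illustration of that measurement: `fake_request` and `time_batch` are hypothetical names, and `fake_request` is a stand-in that sleeps instead of calling the model, so it would need to be replaced with a real blocking client call to reproduce the behavior described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(prompt, latency=0.2):
    """Hypothetical stand-in for one blocking LM call (assumption, not a dspy API)."""
    time.sleep(latency)  # simulates waiting on the server
    return f"completion for {prompt}"


def time_batch(call, prompts, num_threads):
    """Run `call` over `prompts` in a thread pool; return (elapsed_seconds, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(call, prompts))  # preserves input order
    return time.perf_counter() - start, results


if __name__ == "__main__":
    prompts = [f"sample {i}" for i in range(4)]
    serial, _ = time_batch(fake_request, prompts, num_threads=1)
    threaded, _ = time_batch(fake_request, prompts, num_threads=4)
    # A ratio near len(prompts) means the calls overlap;
    # a ratio near 1 means they are effectively serialized.
    print(f"serial={serial:.2f}s threaded={threaded:.2f}s ratio={serial / threaded:.1f}")
```

If the ratio stays close to 1 once the stand-in is swapped for the real client call, the requests are being serialized somewhere between the thread pool and the server, which would match the linear scaling I'm seeing.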

tom-doerr, May 21 '24 15:05