
P2-Batch Inference

Open ronakice opened this issue 1 year ago • 3 comments

Currently, we only support inference over a single (query, document subset) pair. Technically, we could batch over the query dimension fairly easily. Batching over document subsets, however, is not feasible: sliding windows is a sequential process, so the next call cannot be inferred until the previous window's output is available.
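A minimal sketch of what query-dimension batching could look like, under stated assumptions: `llm_rerank_window` is a hypothetical callable that scores one window's worth of prompts for many queries in a single batched model call, and all queries are assumed to have the same number of candidates. None of these names are rank_llm's actual API.

```python
# Sketch: sliding-window reranking batched over the query dimension.
# `llm_rerank_window`, `window`, and `stride` are illustrative
# placeholders, not the actual rank_llm interface.
from typing import Callable

def sliding_window_batch(
    llm_rerank_window: Callable[[list[tuple[str, list[str]]]], list[list[str]]],
    queries: list[str],
    candidates: list[list[str]],  # assumed equal length per query
    window: int = 20,
    stride: int = 10,
) -> list[list[str]]:
    """Rerank each query's candidates with sliding windows.

    Windows for a single query are sequential (each window consumes the
    previous window's output), but the same window position across
    different queries is independent, so those prompts can be combined
    into one batched LLM call.
    """
    results = [docs[:] for docs in candidates]
    n = len(candidates[0])
    # Walk windows from the bottom of each ranked list upward; at each
    # position, batch all queries' windows into a single call.
    start = max(n - window, 0)
    while True:
        batch = [
            (q, results[i][start:start + window])
            for i, q in enumerate(queries)
        ]
        reordered = llm_rerank_window(batch)  # one batched LLM call
        for i, new_order in enumerate(reordered):
            results[i][start:start + window] = new_order
        if start == 0:
            break
        start = max(start - stride, 0)
    return results
```

The key observation is that the sequential dependency only exists *within* a query's window chain; across queries, the windows at the same position are independent and can share a forward pass.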

ronakice · Jan 22 '24 19:01