
P2-Batch Inference

Open ronakice opened this issue 1 year ago • 3 comments

Currently, we only support inference over a single (query, document subset) pair. Technically, we could batch over the query dimension fairly easily. Batching over document subsets, however, is not feasible: sliding windows is a sequential process, so the next call cannot be inferred until the previous window's output is available.
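A minimal sketch of what query-dimension batching could look like, under stated assumptions: `llm_rerank_window` is a hypothetical callable that scores one window's worth of prompts for many queries in a single batched model call, and all queries are assumed to have the same number of candidates. None of these names are rank_llm's actual API.

```python
# Sketch: sliding-window reranking batched over the query dimension.
# `llm_rerank_window`, `window`, and `stride` are illustrative
# placeholders, not the actual rank_llm interface.
from typing import Callable

def sliding_window_batch(
    llm_rerank_window: Callable[[list[tuple[str, list[str]]]], list[list[str]]],
    queries: list[str],
    candidates: list[list[str]],  # assumed equal length per query
    window: int = 20,
    stride: int = 10,
) -> list[list[str]]:
    """Rerank each query's candidates with sliding windows.

    Windows for a single query are sequential (each window consumes the
    previous window's output), but the same window position across
    different queries is independent, so those prompts can be combined
    into one batched LLM call.
    """
    results = [docs[:] for docs in candidates]
    n = len(candidates[0])
    # Walk windows from the bottom of each ranked list upward; at each
    # position, batch all queries' windows into a single call.
    start = max(n - window, 0)
    while True:
        batch = [
            (q, results[i][start:start + window])
            for i, q in enumerate(queries)
        ]
        reordered = llm_rerank_window(batch)  # one batched LLM call
        for i, new_order in enumerate(reordered):
            results[i][start:start + window] = new_order
        if start == 0:
            break
        start = max(start - stride, 0)
    return results
```

The key observation is that the sequential dependency only exists *within* a query's window chain; across queries, the windows at the same position are independent and can share a forward pass.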

ronakice · Jan 22 '24 19:01