Queue batch prediction tasks collectively
This PR allows the inputs of a batch query (submitted via input_batch) to be sent to the model container together, up to the maximum batch size, rather than individually.
It does this by adding all of the batch query's subqueries to the model queue in a single operation, so that the queue lock is acquired and the waiters are notified once per batch, instead of locking the queue and notifying the waiters once for each subquery.
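A minimal sketch of the queueing change, assuming a mutex-and-condition-variable task queue; `BatchTaskQueue`, `add_task`, `add_tasks`, and `get_batch` are illustrative names, not Clipper's actual API, and the real per-model queue carries more state (deadlines, metrics) than shown here:

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <vector>

// Illustrative stand-in for a per-model task queue.
template <typename Task>
class BatchTaskQueue {
 public:
  // Before this change: one lock acquisition and one notify per subquery.
  void add_task(Task task) {
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(std::move(task));
    not_empty_.notify_one();
  }

  // After this change: the whole batch is appended under a single
  // lock acquisition, and the waiters are notified only once.
  void add_tasks(std::vector<Task> tasks) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      for (auto& task : tasks) {
        queue_.push_back(std::move(task));
      }
    }
    not_empty_.notify_one();
  }

  // Consumer side: pop up to max_batch_size tasks so the batch can be
  // sent to the model container in one round trip.
  std::vector<Task> get_batch(size_t max_batch_size) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [this] { return !queue_.empty(); });
    std::vector<Task> batch;
    while (!queue_.empty() && batch.size() < max_batch_size) {
      batch.push_back(std::move(queue_.front()));
      queue_.pop_front();
    }
    return batch;
  }

 private:
  std::mutex mutex_;
  std::condition_variable not_empty_;
  std::deque<Task> queue_;
};
```

Taking the lock once per batch avoids repeated contention with the consumer thread, and a single notify wakes the consumer only after the entire batch is visible in the queue.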