
[FEATURE] Support batch inference

Open · ylwu-amzn opened this issue on Jan 5, 2024 · 0 comments

Most model services impose throttling limits; Amazon Bedrock is one example.

Under such limits, ingesting a large amount of data takes a long time. One way to increase throughput is to use batch inference. For example, Bedrock supports batch inference: https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html
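For context, a minimal sketch of the Bedrock batch-inference flow the issue refers to, using boto3. The job name, role ARN, model ID, and S3 URIs are placeholders; how ml-commons would wrap this API is exactly what this feature request is about.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Submit a batch job: input records are read from S3 (JSONL, one
# model-invocation payload per line) and results are written back to S3.
response = bedrock.create_model_invocation_job(
    jobName="embedding-batch-job",  # placeholder name
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # placeholder
    modelId="amazon.titan-embed-text-v1",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://my-bucket/batch-input/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-output/"}
    },
)
job_arn = response["jobArn"]

# The job runs asynchronously; poll its status until it completes,
# then read the results from the S3 output location.
status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
print(job_arn, status)
```

Because the job is asynchronous, an ml-commons integration would also need a way to track job status and fetch results once the job completes, rather than returning inference output inline.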
