
[FEATURE] Support batch inference

Open · ylwu-amzn opened this issue on Jan 5, 2024 · 0 comments

Most model services impose throttling limits; Amazon Bedrock is one example.

Under such limits, ingesting a large amount of data takes a long time. One way to increase throughput is to use batch inference. For example, Bedrock supports batch inference: https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html
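For context, a minimal sketch of the Bedrock batch-inference flow the issue refers to, using boto3. The job name, role ARN, model ID, and S3 URIs are placeholders; how ml-commons would wrap this API is exactly what this feature request is about.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Submit a batch job: input records are read from S3 (JSONL, one
# model-invocation payload per line) and results are written back to S3.
response = bedrock.create_model_invocation_job(
    jobName="embedding-batch-job",  # placeholder name
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # placeholder
    modelId="amazon.titan-embed-text-v1",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://my-bucket/batch-input/"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-output/"}
    },
)
job_arn = response["jobArn"]

# The job runs asynchronously; poll its status until it completes,
# then read the results from the S3 output location.
status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
print(job_arn, status)
```

Because the job is asynchronous, an ml-commons integration would also need a way to track job status and fetch results once the job completes, rather than returning inference output inline.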
