ml-commons
[FEATURE] Support batch inference
Most model services enforce a throttling limit; Amazon Bedrock is one example. Under such a limit, ingesting a large amount of data takes a long time. One way to increase throughput is batch inference. Bedrock, for example, supports batch inference: https://docs.aws.amazon.com/bedrock/latest/userguide/batch-inference.html
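For context, here is a minimal sketch of the underlying provider call this feature would wrap, using the Bedrock `CreateModelInvocationJob` API via boto3. The bucket URIs, role ARN, job name, and model ID are placeholders, and how ml-commons would expose this (presumably through its connector framework) is an open design question for this issue.

```python
# Sketch: submitting a Bedrock batch inference job with boto3.
# All names, ARNs, and S3 URIs below are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# The input is a JSONL file in S3; each line carries a recordId and the
# modelInput payload that would otherwise be sent to InvokeModel, e.g.:
#   {"recordId": "doc-001", "modelInput": {"inputText": "hello world"}}
response = bedrock.create_model_invocation_job(
    jobName="embedding-batch-job",
    modelId="amazon.titan-embed-text-v1",
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://my-bucket/batch-input/records.jsonl"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-output/"}
    },
)

# The job runs asynchronously; poll its status until it completes,
# then read the results from the output S3 location.
job_arn = response["jobArn"]
status = bedrock.get_model_invocation_job(jobIdentifier=job_arn)["status"]
print(job_arn, status)
```

Because the job is asynchronous and results land in S3 rather than in the response, supporting this in ml-commons would likely also require job-status tracking and a way to ingest the output files once the job finishes.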