model_analyzer
Run models concurrently
We know that the command below (with the -n flag) benchmarks the models in sequence: `model-analyzer -m /models/ -n model1,model2 --batch-sizes 1,2,4,8,16 -c 1,2,3`
My requirement is to benchmark multiple models concurrently on a single GPU. Is there a flag that accepts multiple models as arguments and makes Model Analyzer run them concurrently instead of in sequence?
Thanks for your feature request. This is on our roadmap for this project. We'll update this issue whenever this feature is available.
would love to have this feature too
Are there any updates on the timeline for completing this feature?
The feature is on our roadmap, but we do not have an ETA at this time.
This won't be in 22.04, but an early-access version should be available within the next few releases.
FYI This feature is now available: https://github.com/triton-inference-server/model_analyzer/blob/main/docs/config_search.md#multi-model-search-mode
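For anyone landing here later, multi-model search is driven by a YAML config file rather than a new CLI flag on the old command. The sketch below is only an illustration under assumptions: the model names (`model1`, `model2`) are placeholders, and the `run_config_profile_models_concurrently_enable` option name should be verified against the linked multi-model search docs, as it may differ across Model Analyzer versions.

```yaml
# Hypothetical config.yaml for profiling two models concurrently on one GPU.
# Verify option names against the multi-model search documentation linked above.
model_repository: /models/
profile_models:
  - model1
  - model2
# Assumed option enabling concurrent (multi-model) profiling
run_config_profile_models_concurrently_enable: true
```

This would then be invoked with something like `model-analyzer profile -f config.yaml`; check the current CLI reference for the exact subcommand and flag spelling in your installed version.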