model_analyzer
Run models concurrently
We know that the command below (with the -n flag) benchmarks the models in sequence: `model-analyzer -m /models/ -n model1,model2 --batch-sizes 1,2,4,8,16 -c 1,2,3`
My requirement is to benchmark multiple models concurrently on a single GPU. Is there a flag that accepts multiple models as arguments and makes Model Analyzer run them concurrently instead of in sequence?
Thanks for your feature request. This is on our roadmap for this project. We'll update this issue whenever this feature is available.
would love to have this feature too
Are there any updates on the timeline for completing this feature?
The feature is on our roadmap, but we do not have an ETA at this time.
This won't be in 22.04, but an early-access version should be available within the next few releases.
FYI This feature is now available: https://github.com/triton-inference-server/model_analyzer/blob/main/docs/config_search.md#multi-model-search-mode
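For anyone landing here later, multi-model search is driven by a YAML config file rather than a new CLI flag on the old command. The sketch below is only an illustration under assumptions: the model names (`model1`, `model2`) are placeholders, and the `run_config_profile_models_concurrently_enable` option name should be verified against the linked multi-model search docs, as it may differ across Model Analyzer versions.

```yaml
# Hypothetical config.yaml for profiling two models concurrently on one GPU.
# Verify option names against the multi-model search documentation linked above.
model_repository: /models/
profile_models:
  - model1
  - model2
# Assumed option enabling concurrent (multi-model) profiling
run_config_profile_models_concurrently_enable: true
```

This would then be invoked with something like `model-analyzer profile -f config.yaml`; check the current CLI reference for the exact subcommand and flag spelling in your installed version.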