
Run models concurrently

Open · naveengogineni opened this issue · 5 comments

We know that the command below (with the `-n` flag) benchmarks the models in sequence: `model-analyzer -m /models/ -n model1,model2 --batch-sizes 1,2,4,8,16 -c 1,2,3`

My requirement is to benchmark multiple models concurrently on a single GPU. Is there a flag that accepts multiple models as arguments and makes Model Analyzer benchmark them concurrently instead of in sequence?

naveengogineni · Jan 28 '21

Thanks for your feature request. This is on our roadmap for this project. We'll update this issue whenever this feature is available.

Tabrizian · Feb 01 '21

Would love to have this feature too.

ydzhang12345 · Mar 11 '21

Are there any updates on the timeline for completing this feature?

jishminor · Nov 16 '21

The feature is on our roadmap, but we do not have an ETA at this time.

matthewkotila · Dec 13 '21

This won't be in 22.04, but at least an early-access version should be available in the next few releases.

tgerdesnv · Apr 13 '22

FYI: this feature is now available as multi-model search mode: https://github.com/triton-inference-server/model_analyzer/blob/main/docs/config_search.md#multi-model-search-mode

tgerdesnv · Nov 16 '22
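For readers arriving at this thread later, a minimal sketch of a YAML profile config for concurrent multi-model profiling, based on the multi-model search docs linked above. The option names (in particular `run_config_profile_models_concurrently_enable`) and the exact model names are assumptions here; verify them against the docs for your Model Analyzer version:

```yaml
# Hedged sketch: profile two models concurrently on a single GPU.
# Option names taken from the multi-model search docs linked above;
# check your installed Model Analyzer version before relying on them.
model_repository: /models

# Enables concurrent (multi-model) profiling instead of sequential runs.
run_config_profile_models_concurrently_enable: true

profile_models:
  - model1
  - model2
```

This would then be passed to the tool with something along the lines of `model-analyzer profile -f config.yaml` (the `profile` subcommand with a `-f`/config-file flag), rather than listing models with `-n` on the command line.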