
Heterogeneous pool of workers

Open adriangonz opened this issue 2 years ago • 4 comments

Support a heterogeneous pool of workers, with a variable number of model replicas per worker.

adriangonz avatar Jan 31 '23 10:01 adriangonz

Heterogeneous workers would also be beneficial for the recently added support for online drift detectors (https://github.com/SeldonIO/MLServer/pull/1108), since these detectors must currently be run with parallel_workers = 0.

ascillitoe avatar Apr 20 '23 13:04 ascillitoe

Hello, is there any estimate of when this issue will be addressed? Is there any intention to resolve it for version 1.4? Thanks.

cristiancl25 avatar Jan 11 '24 10:01 cristiancl25

As it stands, it is unlikely that this will be addressed in the next release. Do you have a particular use case that requires it that you could share?

sakoush avatar Jan 11 '24 12:01 sakoush

Hello, the main problem with the current parallel workers is the memory consumption when loading all models. If we could redistribute the models across the workers, this could be improved.

cristiancl25 avatar Jan 16 '24 08:01 cristiancl25
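
To illustrate the memory argument in the last comment, here is a toy calculation. It is only a sketch and not MLServer code: the function names, model names, sizes and replica counts are all hypothetical. It also assumes that in a homogeneous pool every worker loads every model, and that a heterogeneous pool would let each model have its own replica count.

```python
# Toy comparison of pool memory usage. Nothing here is part of MLServer's API;
# all names and numbers are made up for illustration.

def homogeneous_memory(model_sizes_mb, num_workers):
    """Total memory if every worker loads every model (assumed behaviour)."""
    return num_workers * sum(model_sizes_mb.values())


def heterogeneous_placement(model_sizes_mb, replicas, num_workers):
    """Greedily place each model's replicas on the least-loaded workers.

    Returns (placement, loads), where placement[w] is the set of models on
    worker w and loads[w] is the memory used by worker w in MB.
    """
    loads = [0] * num_workers
    placement = [set() for _ in range(num_workers)]
    # Place the largest models first so they end up on the emptiest workers.
    for name in sorted(model_sizes_mb, key=model_sizes_mb.get, reverse=True):
        count = replicas[name]
        if count > num_workers:
            raise ValueError(f"{name}: more replicas than workers")
        # Pick the `count` workers with the lowest current load.
        chosen = sorted(range(num_workers), key=lambda w: loads[w])[:count]
        for w in chosen:
            placement[w].add(name)
            loads[w] += model_sizes_mb[name]
    return placement, loads


# Hypothetical models (sizes in MB) and desired replica counts.
sizes = {"large": 4000, "medium": 1000, "small": 200}
replicas = {"large": 1, "medium": 2, "small": 4}
workers = 4

total_homogeneous = homogeneous_memory(sizes, workers)
placement, loads = heterogeneous_placement(sizes, replicas, workers)
total_heterogeneous = sum(loads)

print(total_homogeneous)    # 20800
print(total_heterogeneous)  # 6800
```

With these made-up numbers, giving the large model a single replica instead of one per worker accounts for most of the saving.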