MLServer Heterogeneous pool of workers

Heterogeneous pool of workers

Open adriangonz opened this issue 2 years ago • 4 comments


Support a heterogeneous pool of workers, with a variable number of model replicas per worker.

Jan 31 '23 10:01 adriangonz

Heterogeneous workers would also be beneficial for the recently added support for online drift detectors (https://github.com/SeldonIO/MLServer/pull/1108), since these detectors must currently be run with parallel_workers = 0.

Apr 20 '23 13:04 ascillitoe

Hello, is there any estimate of when this issue will be addressed? Is there any intention to resolve it for version 1.4? Thanks.

Jan 11 '24 10:01 cristiancl25

It is unlikely that this will be addressed in the next release as it stands. Do you have a particular use case that requires it and that you could share?

Jan 11 '24 12:01 sakoush

Hello, the main problem with the current parallel workers is the memory consumption when every model is loaded on every worker. If we could distribute the models across the workers, this could be improved.

Jan 16 '24 08:01 cristiancl25
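To make the memory concern in the last comment concrete, here is a rough sketch comparing a homogeneous pool (every worker loads every model) with a heterogeneous one (each model has its own replica count). Everything in it is an assumption for illustration only: the function names, the model sizes, and the round-robin placement are hypothetical and are not part of MLServer's API or its actual scheduling logic.

```python
from typing import Dict, List


def homogeneous_memory(model_sizes_mb: Dict[str, int], n_workers: int) -> int:
    """Total memory (MB) when every worker loads a replica of every model."""
    return n_workers * sum(model_sizes_mb.values())


def heterogeneous_placement(
    model_replicas: Dict[str, int], n_workers: int
) -> List[List[str]]:
    """Assign replicas to workers round-robin (hypothetical strategy).

    Because a model's replicas occupy consecutive slots and there are at most
    n_workers of them, no worker receives the same model twice.
    """
    workers: List[List[str]] = [[] for _ in range(n_workers)]
    slot = 0
    for name, replicas in model_replicas.items():
        if replicas > n_workers:
            raise ValueError(f"{name}: more replicas than workers")
        for _ in range(replicas):
            workers[slot % n_workers].append(name)
            slot += 1
    return workers


def heterogeneous_memory(
    model_sizes_mb: Dict[str, int], placement: List[List[str]]
) -> int:
    """Total memory (MB) for a given per-worker placement."""
    return sum(model_sizes_mb[name] for worker in placement for name in worker)


# Made-up model sizes (MB) and replica counts, purely for illustration.
sizes = {"large": 500, "medium": 200, "small": 100}
replicas = {"large": 1, "medium": 2, "small": 4}
n_workers = 4

placement = heterogeneous_placement(replicas, n_workers)
print("homogeneous total MB:  ", homogeneous_memory(sizes, n_workers))
print("heterogeneous total MB:", heterogeneous_memory(sizes, placement))
print("placement:", placement)
```

With these made-up numbers, the homogeneous pool holds four copies of every model, while the heterogeneous pool only pays for the replicas each model actually needs. A real implementation would also have to route each request to a worker that holds the target model, which this sketch does not attempt.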
