feat: Expose warmup thread pool to model server
- Parallel warm-up options were added quite a while ago, but AFAIK there is no way to set them from the model server side.
- https://github.com/tensorflow/serving/commit/79ac1fd809ce4b09189dd4bf64da42cac9696801
- https://github.com/tensorflow/serving/commit/79f9d2842223faf43db840b5def8e27274a38027
- So it would be great if we could set this from the model server. Even if only one thread is used, it would be beneficial to have a separate, dedicated thread pool ("Warmup_ThreadPool") whenever the user explicitly specifies the number of threads.
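For illustration only, here is a minimal sketch (in Python, not TensorFlow Serving's actual C++ internals) of the idea in the last bullet: running warmup requests on a dedicated, explicitly named thread pool, which keeps warmup work identifiable even when only one thread is configured. The function and names below are hypothetical, not part of any TF Serving API.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def process(request):
    # Placeholder for replaying one warmup record against the model.
    return (threading.current_thread().name, request)

def run_warmup(requests, num_threads=1):
    # A dedicated pool whose threads are clearly labeled as warmup
    # threads -- useful for debugging/profiling even with one thread.
    with ThreadPoolExecutor(
            max_workers=num_threads,
            thread_name_prefix="Warmup_ThreadPool") as pool:
        futures = [pool.submit(process, r) for r in requests]
        return [f.result() for f in futures]
```

With this setup, every thread that executes a warmup record carries a name starting with `Warmup_ThreadPool`, so warmup activity is easy to distinguish from regular inference threads.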
Please let me know if I missed something or misunderstood the original intention. I am just trying to understand TensorFlow Serving's internals.