Mehran Salmani
Thanks @singhniraj08. For the third question: there are multiple threads for accepting inference requests. Is the inter_op_parallelism thread pool shared between them (between all the requests to the server) or...
@peiniliu I think intra=1 does not mean we have an intra_op_parallelism thread pool of size 1; it likely means that the intra_op_parallelism thread pool is disabled, so processing of ops is...
@rolandwang19 One approach I have tested is to use a Kubernetes ConfigMap for your models_config.config; you then need to define a Kubernetes volume of type configMap and mount it inside your...