seldon-core
Inference layer running on Seldon Core microservices hangs after a certain number of requests.
The container deployed as a Seldon Core microservice hangs after a certain number of requests. We found that a large number of Seldon Core microservice processes are spawned, and they occupy increasing amounts of memory.
We have the following configuration for running the inference layer in the Seldon Core microservice:
Gunicorn server with 6 workers, each worker running a single thread.
Request CPU is 2000m, request memory is 1.2Gi.
Limit CPU is 4000m, limit memory is 6.0Gi. Min.Replica and Max.Replica are both set to 1 to test single-container performance.
After requests are sent to the container, memory usage increases very quickly from about 500Mi to 6Gi.
We also found that 52 Seldon Core microservice processes were running inside the container, even though we configured 6 workers with 1 thread per worker.
Could anyone please help us resolve this issue?
Can you clarify which version of Seldon Core you are using? If it is not the latest, can you try with the latest? Also, what model is being run? Have you tried a similar test on a dummy model?
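For reference, a minimal no-op model for the Seldon Core Python wrapper could look like the sketch below (the class and file name MyDummyModel are illustrative). Running the same load test against a model like this helps show whether the memory growth comes from the model code or from the serving layer itself.

```python
# MyDummyModel.py -- minimal sketch of a Seldon Core Python wrapper model.
# The class/file name is illustrative; no model artefact is loaded, so
# memory usage should stay roughly flat under load.

class MyDummyModel:
    def __init__(self):
        print("Initializing dummy model")

    def predict(self, X, features_names=None):
        # Echo the input back unchanged.
        return X
```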
We are using Seldon Core 1.13
It's most likely some detail of your model: how it's loaded and used. Can you provide some more details?
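One thing worth checking (a sketch only, assuming a joblib-serialized model at a hypothetical path; adapt to your framework) is that the model artefact is loaded once in the wrapper's constructor rather than inside predict, so each Gunicorn worker holds a single copy in memory instead of allocating on every request:

```python
# Sketch only: the path and joblib usage are assumptions, not the actual setup.
import joblib

class InferenceModel:
    def __init__(self):
        # Load the artefact once per worker process at startup,
        # not inside predict(), so memory does not grow per request.
        self._model = joblib.load("/mnt/models/model.joblib")

    def predict(self, X, features_names=None):
        return self._model.predict(X)
```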
This issue is stale because it has been open 10 days with no activity. Remove stale label or comment or this will be closed in 5 days.
closing