cog
Health check endpoint unresponsive during predictions
Hi, I'm trying to use cog to build images that I can run in a Kubernetes cluster. I'm seeing an issue where I can't query the state of each pod (e.g. using the /health-check endpoint that cog defines) while a prediction is running.
I have a simple demonstration here: https://github.com/ggilder/cog-health-check-test
You can run the above on Docker Desktop. It's basically the cog "hello world" example with a sleep added to simulate a long-running prediction. The test script sends a request to the prediction endpoint and then, while that request is still in flight, hits the health check endpoint. The server doesn't answer the health check until the prediction has completed, which doesn't seem to match the intended behavior, since the health state is defined to include a "busy" state.
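For anyone who wants to reproduce this without cloning the repo, the check can be sketched roughly as below. This is a minimal sketch, not the actual script from the repository: the helper names (`post_prediction`, `health_check_latency`), the `POST /predictions` request shape, and the payload are assumptions for illustration; only the /health-check path is taken from the description above.

```python
import json
import threading
import time
import urllib.request


def post_prediction(base_url, payload, timeout=120):
    """POST a prediction request and block until the server answers.

    Assumes the server accepts JSON on POST /predictions.
    """
    req = urllib.request.Request(
        f"{base_url}/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def health_check_latency(base_url, payload, head_start=0.5, timeout=120):
    """Start a prediction in the background, then time GET /health-check.

    Returns (seconds until the health check answered, response body).
    """
    worker = threading.Thread(
        target=post_prediction, args=(base_url, payload, timeout), daemon=True
    )
    worker.start()
    # Give the prediction request a head start so it is in flight
    # before the health check is sent.
    time.sleep(head_start)

    started = time.monotonic()
    with urllib.request.urlopen(f"{base_url}/health-check", timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    elapsed = time.monotonic() - started

    worker.join(timeout)
    return elapsed, body
```

Calling something like `health_check_latency("http://localhost:5000", {"input": {"text": "world"}})` against the running container (the port and the `text` input are assumptions about the demo setup) should return almost immediately if health checks are served concurrently. If the elapsed time is close to the length of the sleep instead, the health check was queued behind the prediction.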
Am I missing something that would allow the server to respond to health checks during predictions?