Health check endpoint unresponsive during predictions

Open ggilder opened this issue 8 months ago • 3 comments

Hi, I'm trying to use cog to build images that I can run in a Kubernetes cluster. The problem is that I can't query the state of each pod (e.g. via the /health-check endpoint that cog defines) while a prediction is running.

I have a simple demonstration here: https://github.com/ggilder/cog-health-check-test

You can run the above on Docker Desktop. It's basically the cog "hello world" example with a sleep added to simulate a long-running prediction. The test script hits the prediction endpoint and then, while that request is still in flight, tries to hit the health check endpoint. The server waits for the prediction to complete before it handles the health check. That doesn't seem to match the intended behavior, since the health state is defined to allow a "busy" state, which would only be useful if the endpoint could answer during a prediction.
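To make the symptom concrete without needing Docker, here is a minimal sketch that reproduces the same request pattern against a stand-in server built from Python's standard library. This is not cog's actual server code and it is not a diagnosis of the cause in cog; the handler, the `health_check_latency` helper, and the response bodies are all hypothetical. Only the `/predictions` and `/health-check` paths are taken from cog. It fires a slow prediction request and then measures how long a health check takes, once with a server that handles one request at a time and once with a server that handles each request in its own thread.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

# Bypass any proxy configured in the environment; we only talk to localhost.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def make_handler(predict_seconds):
    """Build a stand-in handler (hypothetical, not cog's implementation)."""

    class Handler(BaseHTTPRequestHandler):
        def _send_json(self, payload):
            body = json.dumps(payload).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_POST(self):
            # Drain the request body, then sleep to simulate a long prediction.
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)
            time.sleep(predict_seconds)
            self._send_json({"status": "succeeded"})

        def do_GET(self):
            # Stand-in health check; the response body is an assumption.
            self._send_json({"status": "READY"})

        def log_message(self, *args):
            pass  # keep the output quiet

    return Handler


def health_check_latency(server_cls, predict_seconds=1.0):
    """Start a prediction, then time a health check issued while it runs."""
    server = server_cls(("127.0.0.1", 0), make_handler(predict_seconds))
    base = f"http://127.0.0.1:{server.server_address[1]}"
    threading.Thread(target=server.serve_forever, daemon=True).start()

    def predict():
        request = urllib.request.Request(
            base + "/predictions",
            data=b"{}",
            method="POST",
            headers={"Content-Type": "application/json"},
        )
        OPENER.open(request).read()

    worker = threading.Thread(target=predict)
    worker.start()
    time.sleep(0.2)  # give the prediction request time to reach the server

    start = time.monotonic()
    OPENER.open(base + "/health-check").read()
    elapsed = time.monotonic() - start

    worker.join()
    server.shutdown()
    server.server_close()
    return elapsed


if __name__ == "__main__":
    blocking = health_check_latency(HTTPServer)
    threaded = health_check_latency(ThreadingHTTPServer)
    print(f"one request at a time: health check took {blocking:.2f}s")
    print(f"thread per request:    health check took {threaded:.2f}s")
```

With the one-request-at-a-time server, the health check should only return once the simulated prediction has finished, which is the behavior described above. With the threaded server it should return almost immediately.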

Am I missing something that would allow the server to respond to health checks during predictions?

ggilder, Jun 06 '24 17:06