Run predictions off main thread to avoid blocking health check
Fixes https://github.com/replicate/cog/issues/1719
Defining the prediction endpoints with `async def` runs them on the main thread, per the FastAPI docs. This is problematic because a running prediction blocks the server from responding to the health check endpoint. Converting them to plain `def`, which FastAPI runs in a worker thread pool, allows health checks to run and fixes the problem I described in the above issue.
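
To illustrate the mechanism, here is a minimal sketch using only the standard library's `asyncio`. It is not cog's actual code. `blocking_predict`, `predict_on_loop`, `predict_in_thread`, and `health_check_latency` are hypothetical names, and the 0.3 second duration is an arbitrary stand-in for a real prediction. The first coroutine makes a blocking call directly on the event loop, as an `async def` endpoint would; the second hands the same call to a worker thread, which is roughly what FastAPI does for a plain `def` endpoint.

```python
import asyncio
import time


def blocking_predict(duration: float) -> str:
    # Hypothetical stand-in for a long, synchronous prediction.
    time.sleep(duration)
    return "done"


async def predict_on_loop() -> str:
    # Blocking call made directly inside a coroutine: the event loop
    # cannot run anything else until it returns.
    return blocking_predict(0.3)


async def predict_in_thread() -> str:
    # Same blocking call, handed off to a worker thread so the event
    # loop stays free to serve other requests.
    return await asyncio.to_thread(blocking_predict, 0.3)


async def health_check_latency(run_prediction) -> float:
    # Start a prediction, then measure how long a "health check"
    # (this coroutine) waits before it gets to run again.
    loop = asyncio.get_running_loop()
    started = loop.time()
    prediction = asyncio.create_task(run_prediction())
    await asyncio.sleep(0)  # yield so the prediction task starts first
    latency = loop.time() - started
    await prediction
    return latency


blocked = asyncio.run(health_check_latency(predict_on_loop))
free = asyncio.run(health_check_latency(predict_in_thread))
print(f"health check delay, prediction on event loop: {blocked:.3f}s")
print(f"health check delay, prediction in worker thread: {free:.3f}s")
```

In the first case the health check has to wait for the whole prediction to finish; in the second it runs almost immediately while the prediction continues in the background.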