Return better error code when setup is running #978
If the server is still running setup when a new predict call arrives, the call raises an exception because the workers aren't ready yet:
cog.server.exceptions.InvalidStateException: Invalid operation: state is WorkerState.NEW (must be WorkerState.READY)
This commit catches that exception and returns a 503 (a retryable server error), asking the client to retry the prediction once the server is ready.
Note: The runner doesn't seem to raise a RunnerBusyError while it's doing setup. Instead it raises an InvalidStateException.
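For readers following along, here is a minimal sketch of the behavior described above, not the actual change in this PR. `StubRunner`, `handle_predict`, and the local `WorkerState` and `InvalidStateException` definitions are hypothetical stand-ins, not cog's real classes. The 200 returned on success is also an assumption. The 503 body matches the message asserted in the tests.

```python
from enum import Enum


class WorkerState(Enum):
    # Hypothetical stand-in for cog's worker states; only two are modeled here.
    NEW = "NEW"
    READY = "READY"


class InvalidStateException(Exception):
    """Hypothetical stand-in for cog.server.exceptions.InvalidStateException."""


class StubRunner:
    """Hypothetical runner that refuses predictions until setup has finished."""

    def __init__(self):
        self.state = WorkerState.NEW

    def predict(self, payload):
        if self.state != WorkerState.READY:
            raise InvalidStateException(
                f"Invalid operation: state is {self.state} "
                f"(must be {WorkerState.READY})"
            )
        return {"status": "processing"}


def handle_predict(runner, payload):
    """Return (status_code, body), mapping 'not ready' to a retryable 503."""
    try:
        # Assumption: a successful call maps to 200 in this sketch.
        return 200, runner.predict(payload)
    except InvalidStateException:
        # Setup is still running, so tell the client to retry later.
        return 503, {"detail": "Server not ready. Try again later"}
```

With this shape, a client can treat 503 as "retry later" rather than as a hard failure, which is the intent of the change.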
Hi, @ruravi. Thanks for contributing this PR. Apologies for not responding sooner.
I just pushed a merge commit to get this up to date with the latest origin/main. Unfortunately, I did this through GitHub's web UI and missed a few details, and wasn't able to push to your downstream repo with a fix. Could you please apply the following diff when you have the chance?
Diff
diff --git a/python/tests/server/test_http.py b/python/tests/server/test_http.py
index 35ce034..83b8f68 100644
--- a/python/tests/server/test_http.py
+++ b/python/tests/server/test_http.py
@@ -419,15 +419,15 @@ def test_prediction_idempotent_endpoint_conflict(client, >
     assert resp1.json() == match({"id": "abcd1234", "status": "processing"})
     assert resp2.status_code == 409
-
+
 @uses_predictor("sleep")
-def test_predict_before_setup_complete():
+def test_predict_before_setup_complete(client):
     resp = client.post("/predictions")
     assert resp.status_code == 503
     assert resp.json() == {"detail": "Server not ready. Try again later"}
 @uses_predictor("sleep")
-def test_shutdown_before_setup_complete():
+def test_shutdown_before_setup_complete(client):
     resp = client.post("/shutdown")
     assert resp.status_code == 200
I'm still trying to understand the timing of when the server raises InvalidStateException vs. RunnerBusyError, which is also discussed in https://github.com/replicate/cog/issues/966. Could you share the steps you took to run the model and issue a new predict call?