rippled RPC 'tooBusy' response should have 50x (503?) status code

Currently, if rippled (over RPC) returns a tooBusy response, like below, it does so with a HTTP 200 status code. This doesn't deserve a 200 status code, all is not OK, and one is definitely not getting the expected response.

I propose returning a HTTP 503 status code along with the tooBusy error message. Any error for that matter should have a non-200 HTTP status code.

By not assigning a non-200 status code:

Clients are mislead on the validity of the response
Reverse proxies / load balancers can't auto failover to another node as they get a 200 response

{
  "result": {
    "error": "tooBusy",
    "error_code": 9,
    "error_message": "The server is too busy to help you now.",
    "request": {
      "account": "rTooLkitCksh5mQa67eaa2JaWHDBnHkpy",
      "command": "account_tx",
      "ledger_index_max": 61974273,
      "limit": 20
    },
    "status": "error"
  }
}

Nov 28 '21 16:11 WietseWind

I agree that tooBusy should have a 50x status code. I don't know that every error should though. I don't think an account not found or transaction not found error means anything is actually wrong with the server, without knowing whether that data should have been found.

Nov 29 '21 22:11 cjcobb23

If we want to follow the HTTP status codes:

200 OK means the request was successfully received, understood, and accepted. The response includes the requested resource
503 Service Unavailable is correct for a server that is too busy
404 Not Found may be used for account not found or transaction not found
204 No Content may be used for e.g. account_tx with no txs in the requested range of ledgers

Nov 30 '21 04:11 intelliot

I agree that tooBusy should have a 50x status code. I don't know that every error should though. I don't think an account not found or transaction not found error means anything is actually wrong with the server, without knowing whether that data should have been found.

Just had the same thought. There's a distinction between "unexpected", infra-like error (must have an error code) and 'logic' error (200, but with an error).

@intelliot I think the 404 for e.g. account not found is a bad idea as that would trigger auto-retry/failover load balancers to retry with another uplink.

@intelliot @cjcobb23 I think this could be a good thing to keep in mind deciding if something needs a non-200 status code: "would the same call on another node potentially yield different (better, expected) results?"? If so: non-200 (error, so next uplink can be queried). If not (e.g. account doesn't exist, address encoding wrong, ledger sequence in the future, name it...): 200 status code but potentially with an error explaining what was wrong.

So good candidates for a non-200 status code are, for example:

server tooBusy
server not synced to the network
history (ledgers) missing for the queried range
...

Nov 30 '21 07:11 WietseWind

Is this related to the recent scalability issues the XRPL has been having with Trustlines?

Nov 30 '21 16:11 syeduali93

Is this related to the recent scalability issues the XRPL has been having with Trustlines?

No.

Nov 30 '21 19:11 WietseWind

rippled rippled copied to clipboard

RPC 'tooBusy' response should have 50x (503?) status code

rippled
rippled copied to clipboard