scylladb icon indicating copy to clipboard operation
scylladb copied to clipboard

Generic server closes connections on shutdown before a successful reply is sent to the client.

Open bitpathfinder opened this issue 5 months ago • 4 comments

The current implementation of generic_server shuts down by closing all connections without considering the processing state of ongoing requests. This can lead to delays or errors in the execution of active requests.

Example

  1. A node receives a write request and begins processing it: it creates a write response handler and sends write requests to other replicas.

  2. At the same time, the node receives a shutdown signal and begins shutting down its services.

  3. When the CQL server shuts down, it cancels all connections (both inbound and outbound) and then waits for client connections to finish.

  4. Client connections are considered "finished" when the write request either:

  • completes successfully (receives the required number of write acknowledgments to satisfy the requested consistency level),
  • or fails due to a timeout or other errors.
  1. Although the CQL server waits for request processing to complete, the client connections have already been closed, so the successful response cannot be delivered to the client.

Proposed Fix: Two-step shutdown for generic_server

  1. Initial Shutdown Phase
  • Close the accept gate to block new incoming connections.

  • Abort any in-progress accept() calls.

  • For all active connections:

    • Close only the input side of the connection to prevent new requests.
    • Keep the output side open to allow responses to be sent.
  1. Drain Phase
  • Wait for all in-progress requests to either complete or fail.
  1. Final Shutdown Phase
  • Fully close all connections.

bitpathfinder avatar Jun 11 '25 13:06 bitpathfinder

copy from a slack tread:

Rust Driver

In the current implementation, all connection's workers: writer reader keepaliver orphaner (stream-id related thing) are tied together: once any of them gets an error, all of them are terminated. In a scenario when the writer gets a new task (a new request to send) and the writing part of the socket returns an error (because the socket has been closed for writing), all workers terminate, including the reader. It would require quite deep rework of the connection layer, as well as the connection pooling layer, to introduce a new notion of half-working connections (only for reading). It may be feasible, though. (edited)

The two-step shutdown won't bring any user-visible gains, unless we invest heavily in driver refactorings. So I don't think we should work on this issue, unless it's super easy to implement and provides some other benefits.

gusev-p avatar Jun 11 '25 14:06 gusev-p

copy from a slack tread:

Rust Driver

In the current implementation, all connection's workers: writer reader keepaliver orphaner (stream-id related thing) are tied together: once any of them gets an error, all of them are terminated. In a scenario when the writer gets a new task (a new request to send) and the writing part of the socket returns an error (because the socket has been closed for writing), all workers terminate, including the reader. It would require quite deep rework of the connection layer, as well as the connection pooling layer, to introduce a new notion of half-working connections (only for reading). It may be feasible, though. (edited)

The two-step shutdown won't bring any user-visible gains, unless we invest heavily in driver refactorings. So I don't think we should work on this issue, unless it's super easy to implement and provides some other benefits.

Current implementation of TCP connection does not actually close local side when we call shutdown_input, it just sends eof to the reading side, the other side will not get any errors, just new requests will be ignored.

bitpathfinder avatar Jun 16 '25 11:06 bitpathfinder

Current implementation of TCP connection does not actually close local side when we call shutdown_input, it just sends eof to the reading side, the other side will not get any errors, just new requests will be ignored.

posix_connected_socket_impl:

virtual void shutdown_input() override {
     shutdown_socket_fd(_fd, SHUT_RD);
 }

posix stack seems to be the default, no?

gusev-p avatar Jun 16 '25 13:06 gusev-p

Current implementation of TCP connection does not actually close local side when we call shutdown_input, it just sends eof to the reading side, the other side will not get any errors, just new requests will be ignored.

posix_connected_socket_impl:

virtual void shutdown_input() override {
     shutdown_socket_fd(_fd, SHUT_RD);
 }

posix stack seems to be the default, no?

Yes, posix is default. I was looking in the wrong place. Then if we do a proper shutdown, we send fin to the other side, and according to the https://github.com/scylladb/scylladb/issues/24481#issuecomment-2963081280 any new request would shutdown all the reads as well. So my fix probably works because there are no further writes are attempted.

bitpathfinder avatar Jun 16 '25 13:06 bitpathfinder