
Discuss SAU_CLTOCS_READ and other issues in the protocol

Open aNeutrino opened this issue 8 months ago • 0 comments

Issue: Concurrent SAU_CLTOCS_READ Handling Results in Lost STATUS Responses

Description:
When a client sends a SAU_CLTOCS_READ request to a chunkserver while another SAU_CLTOCS_READ request is still being processed, the chunkserver currently breaks the data stream and fails to send a STATUS response for either request. This leaves the client without clear feedback on what went wrong. Additionally, the chunkserver process logs an error: sfschunkserver[3143954]: Got invalid message (type:1200).

The current behavior violates expectations because the protocol already defines a SAU_CSTOCL_READ_STATUS response that should indicate when non-standard behavior occurs, or at least provide relevant error or informational details.

Client Details:

  • The client in this scenario is not a standard SaunaFS client (such as sfsmount) but a custom benchmark application that speaks the protocol directly to the chunkserver.

Expected Behavior:
When the chunkserver detects concurrent SAU_CLTOCS_READ requests that it cannot handle simultaneously, it should respond explicitly with a SAU_CSTOCL_READ_STATUS indicating a protocol violation or resource contention issue. The response should clearly inform the client about the exact nature of the problem to facilitate debugging and client-side error handling.
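The expected behavior above can be illustrated with a minimal per-connection sketch. It is not the chunkserver's actual code: the `Connection` class, the `outbox` list, and the `STATUS_*` values are hypothetical names invented for illustration, and the real status codes and wire format are not given in this issue.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional

# Hypothetical status values; the real SaunaFS codes are not stated in this issue.
STATUS_OK = 0
STATUS_CONCURRENT_READ = 1


class ConnState(Enum):
    IDLE = auto()
    READING = auto()


@dataclass
class Connection:
    """Models one client stream on the chunkserver side (illustrative only)."""
    state: ConnState = ConnState.IDLE
    active_chunk: Optional[int] = None
    outbox: list = field(default_factory=list)  # replies queued for the client

    def on_read_request(self, chunk_id: int) -> None:
        if self.state is ConnState.READING:
            # Reply with an explicit status instead of breaking the stream.
            self.outbox.append(
                ("SAU_CSTOCL_READ_STATUS", chunk_id, STATUS_CONCURRENT_READ)
            )
            return
        self.state = ConnState.READING
        self.active_chunk = chunk_id

    def on_read_finished(self) -> None:
        # The read that was already in progress still completes normally.
        self.outbox.append(("SAU_CSTOCL_READ_STATUS", self.active_chunk, STATUS_OK))
        self.active_chunk = None
        self.state = ConnState.IDLE
```

In this sketch the second request is rejected with its own status while the first read still finishes, so the client receives feedback for both requests.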

Current Behavior:

  • Chunkserver breaks the data stream.
  • No SAU_CSTOCL_READ_STATUS response is sent to the client.
  • Clients receive no indication of error or reason for interruption.
  • Chunkserver logs the error: sfschunkserver[3143954]: Got invalid message (type:1200).

Proposed Solution:
Modify the chunkserver's handling logic so that it:

  1. Always sends a clear SAU_CSTOCL_READ_STATUS response to the client when concurrent reads conflict.
  2. Includes an informative status code or message stating the cause of the interruption (e.g., "Concurrent SAU_CLTOCS_READ not allowed").
  3. Or, even better, allows concurrent SAU_CLTOCS_READ requests on the same stream.
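One possible reading of the last option is to queue overlapping reads and serve them in arrival order on the same stream. The sketch below assumes that approach; `QueuedReadConnection`, `outbox`, and `STATUS_OK` are hypothetical names, and whether the client can match each status to its request depends on wire-format details that this issue does not describe.

```python
from collections import deque

STATUS_OK = 0  # hypothetical value, not taken from the SaunaFS sources


class QueuedReadConnection:
    """Serializes overlapping read requests on one stream (illustrative only)."""

    def __init__(self):
        self.pending = deque()  # chunk IDs waiting to be served, in arrival order
        self.active = None      # chunk ID currently being read, if any
        self.outbox = []        # replies queued for the client

    def on_read_request(self, chunk_id):
        # Never reject: remember the request and start it when the stream is free.
        self.pending.append(chunk_id)
        self._start_next()

    def on_read_finished(self):
        # Every request gets exactly one status reply, in arrival order.
        self.outbox.append(("SAU_CSTOCL_READ_STATUS", self.active, STATUS_OK))
        self.active = None
        self._start_next()

    def _start_next(self):
        if self.active is None and self.pending:
            self.active = self.pending.popleft()
```

Replies leave in the same order the requests arrived, so a client that tracks its outstanding requests in a FIFO can pair each status with the request that caused it.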

Impact:

  • Improved clarity of protocol interactions.
  • Easier debugging and error handling for clients.

Additional Context:

  • Consider updating the protocol specification if concurrent reads are fundamentally unsupported or should be handled differently.
  • Please clarify whether comprehensive documentation of the protocol's details and constraints already exists.
  • This issue also serves as an invitation to discuss potential problems in the current protocol design that may lead to impossible or buggy client-server interactions.


aNeutrino avatar Apr 27 '25 00:04 aNeutrino