thrift icon indicating copy to clipboard operation
thrift copied to clipboard

THRIFT-5909: Fixed '#to_io gives NilClass' error

Open kpumuk opened this issue 1 month ago • 1 comments

In a race condition in Thrift::NonblockingServer, the following sequence could occur:

  1. server transport checked for closure (@server_transport.closed? returns false)
  2. another thread calls server.shutdown
  3. yet another thread, scheduled from #shutdown, calls #close on the server transport, which also sets @handle to nil
  4. back in the first thread, #to_io is called on the server transport, which now returns @handle as nil, causing an error:
  1) NonblockingServer Thrift::NonblockingServer should shut down when asked
     Failure/Error: rd, = select([@server_transport], nil, nil, 0.1)

     TypeError:
       can't convert SpecServerSocket to IO (SpecServerSocket#to_io gives NilClass)
     # ./lib/thrift/server/nonblocking_server.rb:48:in `select'
     # ./lib/thrift/server/nonblocking_server.rb:48:in `block in serve'
     # ./lib/thrift/server/nonblocking_server.rb:45:in `loop'
     # ./lib/thrift/server/nonblocking_server.rb:45:in `serve'
     # ./spec/nonblocking_server_spec.rb:116:in `block (4 levels) in <top (required)>'

This patch changes Thrift::ServerSocket#to_io to raise IOError if @handle is nil, which is the expected behavior for closed streams in Ruby:

  require 'socket'
  server = TCPServer.new('localhost', 55554)
  server.close
  select([server], nil, nil, 0.1)
  # => IOError: closed stream  

Before the change

sequenceDiagram
    participant T1 as Thread 1<br/>(NonblockingServer serve)
    participant T2 as Thread 2<br/>(server.shutdown)
    participant T3 as Thread 3<br/>(shutdown helper)
    participant S as ServerTransport<br/>(@server_transport)

    T1->>S: listen
    activate S

    loop accept socket
    Note over T1: 1. Check if server transport is closed
    T1->>S: closed?
    S-->>T1: returns false

    Note over T2: 2. Server is asked to shut down, spawns helper thread
    activate T2
    T2->>T3: &shutdown_proc
    deactivate T2
    activate T3

    Note over T3: 3. Closes server transport 
    T3->>S: close()
    S->>S: @handle = nil
    deactivate T3
    deactivate S

    Note over T1: 4. Back in serve loop<br/>select([@server_transport], nil, nil, 0.1)
    T1->>S: @server_transport.to_io<br/>(@handle is now nil)
    S-->>T1: returns nil

    Note over T1: 5. IO.select throws an error
    T1-->>T1: TypeError:<br/>"can't convert<br/>SpecServerSocket to IO<br/>(to_io gives NilClass)"
    end

After the change

sequenceDiagram
    participant T1 as Thread 1<br/>(NonblockingServer serve)
    participant T2 as Thread 2<br/>(server.shutdown)
    participant T3 as Thread 3<br/>(shutdown helper)
    participant S as ServerTransport<br/>(@server_transport)

    T1->>S: listen
    activate S

    loop accept socket
    Note over T1: 1. Check if server transport is closed
    T1->>S: closed?
    S-->>T1: returns false

    Note over T2: 2. Server is asked to shut down, spawns helper thread
    activate T2
    T2->>T3: &shutdown_proc
    deactivate T2
    activate T3

    Note over T3: 3. Closes server transport 
    T3->>S: close()
    S->>S: @handle = nil
    deactivate T3
    deactivate S

    Note over T1: 4. Back in serve loop<br/>select([@server_transport], nil, nil, 0.1)
    T1->>S: @server_transport.to_io<br/>(@handle is now nil)
    S-->>T1: throws IOError nil

    end

    Note over T1: 5. rescues IOError
    T1->>T1: shutting down
  • [x] Did you create an Apache Jira ticket? THRIFT-5909
  • [x] If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • [x] Did you squash your changes to a single commit? (not required, but preferred)
  • [x] Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • [ ] If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

kpumuk avatar Dec 03 '25 00:12 kpumuk

Build with all the remaining test fixes incorporated: https://github.com/kpumuk/thrift/actions/runs/19878193283

CleanShot 2025-12-02 at 20 12 38@2x

kpumuk avatar Dec 03 '25 01:12 kpumuk