thrift
thrift copied to clipboard
THRIFT-5909: Fixed '#to_io gives NilClass' error
In a race condition in Thrift::NonblockingServer, the following sequence could occur:
- server transport checked for closure (@server_transport.closed? returns false)
- another thread calls server.shutdown
- yet another thread, scheduled from #shutdown, calls #close on the server transport, which also sets @handle to nil
- back in the first thread, #to_io is called on the server transport, which now returns @handle as nil, causing an error:
1) NonblockingServer Thrift::NonblockingServer should shut down when asked
Failure/Error: rd, = select([@server_transport], nil, nil, 0.1)
TypeError:
can't convert SpecServerSocket to IO (SpecServerSocket#to_io gives NilClass)
# ./lib/thrift/server/nonblocking_server.rb:48:in `select'
# ./lib/thrift/server/nonblocking_server.rb:48:in `block in serve'
# ./lib/thrift/server/nonblocking_server.rb:45:in `loop'
# ./lib/thrift/server/nonblocking_server.rb:45:in `serve'
# ./spec/nonblocking_server_spec.rb:116:in `block (4 levels) in <top (required)>'
This patch changes Thrift::ServerSocket#to_io to raise IOError if @handle is nil, which is the expected behavior for closed streams in Ruby:
require 'socket'
server = TCPServer.new('localhost', 55554)
server.close
select([server], nil, nil, 0.1)
# => IOError: closed stream
Before the change
sequenceDiagram
participant T1 as Thread 1<br/>(NonblockingServer serve)
participant T2 as Thread 2<br/>(server.shutdown)
participant T3 as Thread 3<br/>(shutdown helper)
participant S as ServerTransport<br/>(@server_transport)
T1->>S: listen
activate S
loop accept socket
Note over T1: 1. Check if server transport is closed
T1->>S: closed?
S-->>T1: returns false
Note over T2: 2. Server is asked to shut down, spawns helper thread
activate T2
T2->>T3: &shutdown_proc
deactivate T2
activate T3
Note over T3: 3. Closes server transport
T3->>S: close()
S->>S: @handle = nil
deactivate T3
deactivate S
Note over T1: 4. Back in serve loop<br/>select([@server_transport], nil, nil, 0.1)
T1->>S: @server_transport.to_io<br/>(@handle is now nil)
S-->>T1: returns nil
Note over T1: 5. IO.select throws an error
T1-->>T1: TypeError:<br/>"can't convert<br/>SpecServerSocket to IO<br/>(to_io gives NilClass)"
end
After the change
sequenceDiagram
participant T1 as Thread 1<br/>(NonblockingServer serve)
participant T2 as Thread 2<br/>(server.shutdown)
participant T3 as Thread 3<br/>(shutdown helper)
participant S as ServerTransport<br/>(@server_transport)
T1->>S: listen
activate S
loop accept socket
Note over T1: 1. Check if server transport is closed
T1->>S: closed?
S-->>T1: returns false
Note over T2: 2. Server is asked to shut down, spawns helper thread
activate T2
T2->>T3: &shutdown_proc
deactivate T2
activate T3
Note over T3: 3. Closes server transport
T3->>S: close()
S->>S: @handle = nil
deactivate T3
deactivate S
Note over T1: 4. Back in serve loop<br/>select([@server_transport], nil, nil, 0.1)
T1->>S: @server_transport.to_io<br/>(@handle is now nil)
S-->>T1: throws IOError nil
end
Note over T1: 5. rescues IOError
T1->>T1: shutting down
- [x] Did you create an Apache Jira ticket? THRIFT-5909
- [x] If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
- [x] Did you squash your changes to a single commit? (not required, but preferred)
- [x] Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
- [ ] If your change does not involve any code, include
[skip ci]anywhere in the commit message to free up build resources.
Build with all the remaining test fixes incorporated: https://github.com/kpumuk/thrift/actions/runs/19878193283