Race condition between MsQuicStreamOpen and MsQuicConnectionShutdown causes stream leaks
Describe the bug
A stream can still be opened successfully with MsQuicStreamOpen after MsQuicConnectionShutdown has been called.
MsQuicConnectionShutdown is asynchronous; it only queues the shutdown operation:
https://github.com/microsoft/msquic/blob/9cab5bf0c340e038ed4003b6d8c99c2dd473a3f6/src/core/api.c#L252
MsQuicStreamOpen is a synchronous call, but it checks the connection state without any synchronization:
https://github.com/microsoft/msquic/blob/9cab5bf0c340e038ed4003b6d8c99c2dd473a3f6/src/core/api.c#L653
This can leak the stream when the newly opened stream is not removed here (when processing ): https://github.com/microsoft/msquic/blob/9cab5bf0c340e038ed4003b6d8c99c2dd473a3f6/src/core/connection.c#L1631
Our application reference-counts the connection handle: it only calls CONNECTION_CLOSE once all stream handles belonging to the connection are closed.
trouble connection: 0xffff42428100 trouble stream: 0xffff5388e880
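To make the suspected sequence concrete, here is a minimal toy model of the race as I understand it. It is not MsQuic code: every name (`ToyConnection`, `ToyStreamOpen`, `ToyWorkerDrain`, and so on) is hypothetical, and the worker queue is reduced to a single flag so the interleaving is deterministic. It only illustrates why an app-side refcount can get stuck when a stream is opened between the shutdown call and the moment the worker processes it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are MsQuic symbols. */
#define MAX_STREAMS 8

typedef struct ToyStream {
    bool Started;          /* true once the connection tracks the stream */
    bool ShutdownNotified; /* true once the app was told the stream is done */
} ToyStream;

typedef struct ToyConnection {
    bool ShutdownQueued;   /* set by the API call, not checked by open */
    bool Closed;           /* set only when the worker processes the shutdown */
    ToyStream* Tracked[MAX_STREAMS];
    size_t TrackedCount;
    int AppRefCount;       /* app side: one reference per open stream handle */
} ToyConnection;

/* Asynchronous in spirit: only queues the work and returns immediately. */
void ToyConnectionShutdown(ToyConnection* Conn) {
    Conn->ShutdownQueued = true;
}

/* Synchronous: looks at state the worker has not updated yet. */
bool ToyStreamOpen(ToyConnection* Conn, ToyStream* Stream) {
    if (Conn->Closed) {
        return false;
    }
    Stream->Started = false;
    Stream->ShutdownNotified = false;
    Conn->AppRefCount++;
    return true;
}

/* Registers the stream so the worker can notify it later. */
bool ToyStreamStart(ToyConnection* Conn, ToyStream* Stream) {
    if (Conn->Closed || Conn->TrackedCount == MAX_STREAMS) {
        return false;
    }
    Conn->Tracked[Conn->TrackedCount++] = Stream;
    Stream->Started = true;
    return true;
}

/* Worker: processes the queued shutdown and notifies tracked streams only. */
void ToyWorkerDrain(ToyConnection* Conn) {
    if (!Conn->ShutdownQueued) {
        return;
    }
    Conn->ShutdownQueued = false;
    Conn->Closed = true;
    for (size_t i = 0; i < Conn->TrackedCount; i++) {
        Conn->Tracked[i]->ShutdownNotified = true;
        Conn->AppRefCount--; /* app closes the handle in its callback */
    }
    Conn->TrackedCount = 0;
}

/* Replays the interleaving and returns the references left behind. */
int ToyLeakedRefsAfterRace(void) {
    ToyConnection Conn = {0};
    ToyStream Early, Late;

    ToyStreamOpen(&Conn, &Early);
    ToyStreamStart(&Conn, &Early);  /* tracked by the connection */
    ToyConnectionShutdown(&Conn);   /* queued, not yet processed */
    ToyStreamOpen(&Conn, &Late);    /* still succeeds: Closed is false */
    ToyWorkerDrain(&Conn);          /* notifies Early, never sees Late */
    return Conn.AppRefCount;
}
```

In this model `ToyLeakedRefsAfterRace()` leaves one reference behind, which corresponds to the stream handle that never receives a shutdown notification and therefore keeps the connection from being closed.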
Affected OS
- [ ] Windows
- [X] Linux
- [ ] macOS
- [ ] Other (specify below)
Additional OS information
ubuntu22.04 arm64
MsQuic version
v2.3
Steps taken to reproduce bug
- Client establishes a connection to the server.
- Client shuts down the connection.
- Client opens a stream on the connection.
Expected behavior
For step 3, it should either:
a) fail to open the stream, or
b) open successfully but deliver a "Stream_SHUTDOWN_COMPLETE" event.
Actual outcome
The open succeeds but NO "Stream_SHUTDOWN_COMPLETE" event is delivered; the callback stays completely quiet.
Additional details
I would appreciate it if you could take a look and give me some hints on how to fix it or work around it.
I found a workaround: set the flag QUIC_STREAM_START_FLAG_SHUTDOWN_ON_FAIL so that the app gets the callback after the MsQuicStreamStart call.
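For anyone hitting the same problem, the workaround above looks roughly like the sketch below. It assumes the application already holds a `QUIC_API_TABLE` pointer named `MsQuic` (obtained elsewhere, for example via `MsQuicOpen2`); `AppReleaseStreamRef`, `AppStreamCallback`, and `AppOpenAndStartStream` are hypothetical application names, not MsQuic symbols. It is untested against this exact race and only shows where the flag goes and where the cleanup would happen.

```c
#include <msquic.h>

/* Assumption: the API table was obtained elsewhere by the application. */
extern const QUIC_API_TABLE* MsQuic;

/* Hypothetical app hook: drops the app's reference on the owning connection. */
void AppReleaseStreamRef(void* Context);

static QUIC_STATUS QUIC_API
AppStreamCallback(HQUIC Stream, void* Context, QUIC_STREAM_EVENT* Event)
{
    switch (Event->Type) {
    case QUIC_STREAM_EVENT_START_COMPLETE:
        /* A failed start is reported here. With SHUTDOWN_ON_FAIL set, a
           SHUTDOWN_COMPLETE event is expected to follow, so no cleanup yet. */
        break;
    case QUIC_STREAM_EVENT_SHUTDOWN_COMPLETE:
        /* Single cleanup point: release the app reference, close the handle. */
        AppReleaseStreamRef(Context);
        if (!Event->SHUTDOWN_COMPLETE.AppCloseInProgress) {
            MsQuic->StreamClose(Stream);
        }
        break;
    default:
        break;
    }
    return QUIC_STATUS_SUCCESS;
}

QUIC_STATUS
AppOpenAndStartStream(HQUIC Connection, void* Context, HQUIC* Stream)
{
    QUIC_STATUS Status =
        MsQuic->StreamOpen(
            Connection, QUIC_STREAM_OPEN_FLAG_NONE,
            AppStreamCallback, Context, Stream);
    if (QUIC_FAILED(Status)) {
        return Status;
    }

    /* Ask MsQuic to shut the stream down if the start fails later on. */
    Status = MsQuic->StreamStart(*Stream, QUIC_STREAM_START_FLAG_SHUTDOWN_ON_FAIL);
    if (QUIC_FAILED(Status)) {
        /* Synchronous failure: the caller still owns its reference. */
        MsQuic->StreamClose(*Stream);
        *Stream = NULL;
    }
    return Status;
}
```

The point of the design is that the app treats the shutdown-complete event as the only place where a stream reference is released, so a stream that fails to start after the connection shutdown still ends up there.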
So what happened in the log was that the app tried to:
- shut down the connection
- open the stream
- start the stream
- send some data
All calls return success, but I do get a callback for the event QUIC_STREAM_EVENT_START_COMPLETE. I missed it in the logs because the ID looks abnormal: 18446744073709551615 (UINT64 max?).
Indicating QUIC_STREAM_EVENT_START_COMPLETE [Status=0x1 ID=18446744073709551615
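On the ID question: 18446744073709551615 is exactly `UINT64_MAX`, i.e. all 64 bits set. A small sketch of how an app-side logger could make that value stand out follows; `IsAllOnesId` and `FormatStreamId` are hypothetical helpers, and reading the value as "no stream ID was assigned" is my assumption, not something confirmed from the MsQuic source.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: true when a logged ID is the all-ones 64-bit value. */
bool IsAllOnesId(uint64_t Id) {
    return Id == UINT64_MAX;
}

/* Hypothetical helper: writes a readable form of the ID into Buf.
   Returns the snprintf result (characters that would have been written). */
int FormatStreamId(uint64_t Id, char* Buf, size_t Len) {
    if (IsAllOnesId(Id)) {
        /* Assumption: all-ones is a sentinel rather than a real stream ID. */
        return snprintf(Buf, Len, "UINT64_MAX");
    }
    return snprintf(Buf, Len, "%" PRIu64, Id);
}
```

With something like this in the logging path, the failed START_COMPLETE event would be easier to spot than a 20-digit number.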
I think this issue also causes #4307: without a stream shutdown, the send contexts leak.
Any plan on this issue?
I am seeing the same effect on StreamSend (#4913), which causes contexts to leak over time.