neqo
neqo copied to clipboard
Crash in [@ neqo_transport::connection::Connection::build_packet_header]
See https://bugzilla.mozilla.org/show_bug.cgi?id=1829989.
The crash reason is called
Option::unwrap()on a
None value
.
Crash stack:
8 | libxul.so | core::panicking::panic | library/core/src/panicking.rs:114 | cfi |
---|---|---|---|---|
9 | libxul.so | neqo_transport::connection::Connection::build_packet_header | third_party/rust/neqo-transport/src/packet/mod.rs:0 | cfi |
10 | libxul.so | neqo_transport::connection::Connection::output_close | third_party/rust/neqo-transport/src/connection/mod.rs:1850 | inlined |
10 | libxul.so | neqo_transport::connection::Connection::output | third_party/rust/neqo-transport/src/connection/mod.rs:1760 | inlined |
10 | libxul.so | neqo_transport::connection::Connection::process_output | third_party/rust/neqo-transport/src/connection/mod.rs:1004 | cfi |
11 | libxul.so | neqo_http3::connection_client::Http3Client::process_output | third_party/rust/neqo-http3/src/connection_client.rs:850 |
It's not clear to me where unwrap
is called inside function build_packet_header
. Maybe @martinthomson knows?
The likely source of the unwrap
call is in Path::remote_cid
or Path::local_cid
. If we don't have a connection ID for the path, we could be in a position where this panics. Those functions will likely be inlined.
The remote connection ID is set when a path becomes permanent. Temporary paths is a concept we use for incoming packets on new paths. We should not be sending CONNECTION_CLOSE
on a temporary path, so this is not likely to be the cause of the problem.
The local connection ID is set only for handshake packets. That carries a risk. If we are in a position where we don't have a local connection ID on the path, but we have to send CONNECTION_CLOSE
in a long header, we might hit this path. I would look at calls to ensure_error_path
to see if we are creating a path without a local connection ID from a short-header packet.
A simple test you might write to evaluate that is one where you haven't completed the handshake, but then a packet containing a garbage frame comes in on a different path. That packet would have a short header (or not, check both). If that crashes, then my theory might be solid.
That's what I'd do to understand the problem, which is always the first thing to try. We might get a better and more comprehensive fix if we do that. A simple patch to prevent the panic might be to check in output_close
, whether the path has a local connection ID, before trying to send a close for any PacketNumberSpace
that is not PacketNumberSpace::ApplicationData
.