False Positive in Malformed Packet Detection with UTF-8 Emojis
Describe the bug
The stream buffer's malformed packet detection incorrectly flags valid packets as malformed when UTF-8 encoded data (particularly emojis) contains the packet magic bytes 0x94 0xc3. This causes the parser to purge valid packet data, lose buffer synchronization, and spam "Incomplete packet data" errors with incorrect expected byte counts.
To Reproduce
- Connect to a Meshtastic device via TCP using
StreamApi - Receive a NodeInfo packet where the UTF-8 encoded username, long_name, or short_name contains byte sequences that include
0x94 0xc3 - Common triggers: emojis in node names (e.g., "🐝 hive76.org 🧙♀️")
- Observe error logs showing "Detected malformed packet" followed by repeated "Incomplete packet data, expected [large number] bytes"
Expected behavior
The stream buffer should correctly parse NodeInfo packets containing arbitrary UTF-8 text, including emojis, without triggering false positive malformed packet detection. The validation logic should only flag truly malformed packets, not packets where protobuf-encoded data legitimately contains the magic byte sequence.
Additional context
Trace Logs
2025-10-28T17:26:58.341177Z TRACE Packet buffer with length 156: [148, 195, 0, 152, 34, 149, 1, 8, 130, 168, 226, 127, 18, 92, 10, 9, 33, 48, 102, 102, 56, 57, 52, 48, 50, 18, 29, 240, 159, 144, 157, 32, 104, 105, 118, 101, 55, 54, 46, 111, 114, 103, 32, 240, 159, 167, 153, 226, 128, 141, 226, 153, 128, 239, 184, 143, 26, 4, 104, 105, 118, 101, 34, 6, 236, 200, 15, 248, 148, 2, 40, 9, 66, 32, 108, 95, 253, 31, 200, 184, 30, 217, 229, 58, 169, 84, 212, 159, 179, 202, 39, 89, 25, 113, 1, 24, 134, 95, 84, 121, 204, 63, 148, 195, 70, 83, 26, 12, 13, 130, 109, 213, 23, 21, 8, 21, 56, 211, 40, 1, 37, 0, 0, 152, 64, 45, 207, 251, 0, 105, 50, 22, 8, 100, 21, 68, 139, 136, 64, 29, 140, 37, 164, 65, 37, 38, 229, 101, 63, 40, 148, 137, 151, 4, 72, 1]
2025-10-28T17:26:58.341212Z ERROR Detected malformed packet with next packet starting at index 102, purged malformed packet
2025-10-28T17:26:58.341218Z TRACE Packet buffer with length 54: [148, 195, 70, 83, 26, 12, 13, 130, 109, 213, 23, 21, 8, 21, 56, 211, 40, 1, 37, 0, 0, 152, 64, 45, 207, 251, 0, 105, 50, 22, 8, 100, 21, 68, 139, 136, 64, 29, 140, 37, 164, 65, 37, 38, 229, 101, 63, 40, 148, 137, 151, 4, 72, 1]
2025-10-28T17:26:58.341229Z ERROR Incomplete packet data, expected 18007 bytes, found 54 bytes
Note: The magic bytes [148, 195] appear at position 102-103 in the original buffer as legitimate UTF-8 data.
Successfully Decoded Packet (before error)
FromRadio {
id: 0,
payload_variant: Some(NodeInfo(NodeInfo {
num: 4201268104,
user: Some(User {
id: "!fa6a4388",
long_name: "R2D2", // No emojis - parses fine
...
})
}))
}
Problematic Packet (triggers false positive)
The packet with long_name: "🐝 hive76.org 🧙♀️" contains UTF-8 sequences that include the magic bytes, causing false positive detection.
Environment
- meshtastic/rust version: main branch (commit 947a2b0)
- protobufs version: v2.7.12
- Connection type: TCP
- Rust version: 1.84
I'm new to Rust and used Claude Code to help me dig into this. Here is some additional information in case it can be of value:
AI debugging
Root Cause
In src/connections/stream_buffer.rs:289-299, the validate_packet_in_buffer() function searches for the [0x94, 0xc3] sequence within packet data to detect corruption. However, this creates false positives when legitimate protobuf-encoded data happens to contain these bytes.
In the captured trace, a NodeInfo packet with long_name: "🐝 hive76.org 🧙♀️" contains the UTF-8 sequence that includes 0x94 0xc3 at position 102-103 within the packet payload. The validator incorrectly identifies this as a malformed packet header, purges data to that position, and leaves [148, 195, 70, 83, ...] in the buffer, which gets misinterpreted as a packet header with size u16::from_le_bytes([70, 83]) = 21318 bytes.
Related Code
There is an unimplemented test case acknowledging this issue at src/connections/stream_buffer.rs:687:
/// Test for detecting malformed packets when the packet data contains 0x94 bytes followed by 0x94 0xC3 sequence.
#[tokio::test]
async fn detect_malformed_packets_with_internal_header_sequence() {}
Potential Approaches to Fix
Possible solutions:
- Only check for magic bytes at boundaries determined by the packet length header
- Implement proper protobuf length-prefix validation
- Add a confidence threshold (e.g., only flag as malformed if magic bytes appear AND subsequent decoding fails)
- Use protobuf's built-in length-delimited message framing instead of custom header detection
The current implementation at lines 260-303 in stream_buffer.rs cannot distinguish between legitimate data and actual framing corruption.
This bug sounds very plausible, the parsing code is not very resilient. I'll have a look.
In my app, using this crate, I detect multiple nodes locally that have emojis in the name without any errors.