[Bug]: UDP enabled prevents node from transmitting on channels over RF. DMs still work on RF.
Category
Other
Hardware
Other
Is this bug report about any UI component firmware like InkHUD or Meshtastic UI (MUI)?
- [ ] Meshtastic UI aka MUI colorTFT
- [ ] InkHUD ePaper
- [ ] OLED slide UI on any display
Firmware Version
2.6.7
Description
Setup: Portduino node on roof
- Client mode, LongFast
- Busy mesh (20-30% channel utilization)
- Interface via Android app and web interface
Wi-Fi nodes on network
- Heltec Tracker on roof - RF MediumFast - router_late
- SenseCAP Indicator - RF MediumFast - client
- T-Beam - RF LongFast - client_mute
LongFast RF-only nodes
- T1000e - iOS app - client_mute
- T-Deck - standalone - client_mute
MediumFast RF-only nodes
- SenseCAP Indicator - client
All nodes within ~50 m of each other
Issue: When UDP is enabled on the roof Portduino node, it transmits channel messages over RF only 5-10% of the time. All channel messages are sent via UDP and show up on the Wi-Fi-connected nodes. This happens on both the default channel and custom channels.
DMs from the Portduino node work 100% of the time over RF or UDP.
Trace routes over RF are also unreliable (more so than from other RF-only nodes).
It receives everything over RF just fine.
Disabling UDP results in the node transmitting over RF 100% of the time, but then the Wi-Fi and MediumFast nodes can no longer see the messages.
Relevant log output
I did some additional testing today.
The "router_late" node on the network is dead.
So: Pi in the box on the roof, and 4 ESP32 nodes on the network. Two nodes on ShortSlow, two nodes (including the Pi) on LongFast.
UDP enabled, Pi as client: 50-70% channel message failure, 5% DM failure
UDP disabled, Pi as client: 100% channel message success, 0% DM failure
UDP enabled, Pi as router_late: 100% channel message success, 0% DM failure
Sorry, wrong button. New to github
Based on this, I'm wondering if it is default behavior for clients not to retransmit all packets. I know the "infrastructure" device roles (router, router_late, and repeater) prioritize retransmitting, so this may be a factor.
Late update to this: router_late and router do not actually resolve this issue. Even in those modes I'm having issues getting the node to transmit over RF. The only reliable fix has been to either shut down all other nodes on the network or disable UDP, both of which prevent bridging from working.
I think I am also seeing this issue.
Using UDP bridge decreases the chance of LoRa messages arriving at destination.
Once UDP ack is received no retransmits take place over LoRa... I would prefer retries on LoRa even if UDP ack received to avoid interference issues reducing LoRa message reliability when using bridge.
Unless we fixed it while I was not looking, this is probably the bug John was describing a month or so ago.
Basically we don't differentiate where a packet comes from based on routing strategies. That means if we want to relay a packet, but we observe the packet on UDP, this cancels all retransmissions including LoRa.
Assuming I am not wrong (big ask I know) multiple queues with fairly fancy strategies would help.
Thinking far into the future, after looking at https://github.com/meshtastic/protobufs/issues/691 , and thinking about the potential of dual radio nodes ... is it worth tracking more than just UDP vs RF? Maybe UDP vs 900 vs 433 vs 2.4?
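To make the "track more than one transport" idea concrete, here is a minimal Python sketch of a rebroadcast queue that cancels per transport instead of globally. This is not the firmware's code or API: `Transport`, `RebroadcastQueue`, and all method names are hypothetical, and the real router logic is C++ with different structures. It only illustrates the behavior discussed above, where hearing a duplicate on UDP should not cancel the pending LoRa relay.

```python
from enum import Enum, auto


class Transport(Enum):
    # Hypothetical transport tags; more bands could be added later.
    LORA = auto()
    UDP = auto()


class RebroadcastQueue:
    """Tracks, per packet ID, which transports still need a rebroadcast."""

    def __init__(self):
        self._pending = {}  # packet_id -> set of Transport

    def schedule(self, packet_id, transports=(Transport.LORA, Transport.UDP)):
        # Plan to relay the packet on every listed transport.
        self._pending[packet_id] = set(transports)

    def observe_duplicate(self, packet_id, heard_on):
        # Another node already relayed this packet on `heard_on`,
        # so cancel only that transport and keep the others pending.
        remaining = self._pending.get(packet_id)
        if remaining is None:
            return
        remaining.discard(heard_on)
        if not remaining:
            del self._pending[packet_id]

    def transports_to_send(self, packet_id):
        # Return a copy so callers cannot mutate internal state.
        return set(self._pending.get(packet_id, ()))


queue = RebroadcastQueue()
queue.schedule(packet_id=42)
queue.observe_duplicate(42, Transport.UDP)  # seen on the LAN only
print(queue.transports_to_send(42))         # LoRa relay is still pending
```

With a single shared flag, the UDP duplicate would have cleared the whole entry, which matches the symptom reported in this issue.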
Can we get this one reopened?
It's a critical bug for the success of UDP. Measuring is more unreliable without it.
Once my PR lands in protobufs, I have some code to address this. Should happen right after 2.7.4 is released as alpha
Is this feature available in the latest alpha? At the moment, I only get one chance to TX to the mesh, but then I receive an acknowledgment from my own UDP repeater node (with LoRa disabled). Essentially, the packet is being acknowledged by myself, which isn’t useful.
What I’d like instead is the ability to only accept acknowledgments from LoRa, while still using my second node as RX-only and forwarding traffic over UDP to the primary.
My setup looks like this:
Primary node: strong TX to the mesh, but weak RX. Secondary node: excellent RX, but poor TX to the mesh.
Having two RX nodes really improves signal pickup, and being indoors in an apartment leaves me with limited placement options. But I found this setup really improves my ability to hear other people's nodes.
It should be fixed with >=2.7.6 (it needs #7634 also).
I’m on 2.7.8 for both nodes, so I assume this is the intended behavior. At the moment I get one transmission over LoRa and UDP, but the acknowledgment comes back via UDP. Ideally, I’d like to be able to ignore UDP acknowledgments until the third retry, so the ACKs only come from LoRa. I am checking the LoRa broadcast with my SDR and see just one burst.
If that’s outside the scope of this bug, that’s fine. Another possible approach would be to have the node ignore acknowledgments from specific nodes or have the repeater node ignore my node. Perhaps I could raise an enhancement request for that?
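As a rough illustration of the policy I have in mind, here is a small Python sketch. Nothing in it is an existing firmware setting: `AckSource`, `ack_stops_retries`, and the `udp_ack_after_retries` threshold are made-up names, and I am assuming "third retry" means three retransmissions after the initial send.

```python
from enum import Enum, auto


class AckSource(Enum):
    # Hypothetical tag for the transport an ACK arrived on.
    LORA = auto()
    UDP = auto()


def ack_stops_retries(source, retries_done, udp_ack_after_retries=3):
    """Decide whether an ACK should stop further LoRa retransmissions.

    retries_done counts retransmissions after the initial send (0 = first TX).
    """
    if source is AckSource.LORA:
        # An ACK heard over the air always counts.
        return True
    # A UDP ACK only counts once enough LoRa retries have gone out.
    return retries_done >= udp_ack_after_retries


print(ack_stops_retries(AckSource.UDP, retries_done=0))   # keep retrying on LoRa
print(ack_stops_retries(AckSource.LORA, retries_done=0))  # stop, mesh confirmed it
```

That way the local UDP repeater cannot end the LoRa retries early, but a packet that never gets a LoRa ACK still ends up acknowledged.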
The original issue was about rebroadcasts being canceled when the ACK came via UDP; this is fixed with #7589 (and #7634).
Instead of not acknowledging when receiving via UDP, we could also indicate how the ACK arrived, e.g. LoRa, UDP, MQTT, etc. That would be a nice feature request.
Hmm, upon a second look, it looks like #7589 only prevents canceling a rebroadcast (if the packet originated from someone else), but when a packet from yourself gets ACK-ed via UDP, it will still stop the retransmissions.
I'll have a look to fix this.