[Bug]: UDP enabled prevents node from transmitting on channels over RF. DM's still work on RF.

Open wehooper4 opened this issue 7 months ago • 7 comments

Category

Other

Hardware

Other

Is this bug report about any UI component firmware like InkHUD or Meshtastic UI (MUI)?

  • [ ] Meshtastic UI aka MUI colorTFT
  • [ ] InkHUD ePaper
  • [ ] OLED slide UI on any display

Firmware Version

2.6.7

Description

Setup: Portduino node on roof

  • Client mode, Longfast
  • Busy mesh (20-30% channel util.)
  • Interface via Android app and web interface

Wifi nodes on network

  • Heltec Tracker on roof - RF mediumfast - router_late
  • SenseCAP Indicator - RF mediumfast - client
  • T-Beam - RF Longfast - client_mute

Longfast RF only nodes

  • T1000e - ios app - client_mute
  • T-Deck - Standalone - client_mute

Mediumfast RF only nodes

  • SenseCAP Indicator - client

All nodes within ~50 m of each other

Issue: When UDP is enabled on the roof Portduino node, it transmits channel messages over RF only 5-10% of the time. All channel messages are sent via UDP and show up on Wi-Fi-connected nodes. This happens on both the default channel and custom channels.

DMs from the Portduino node work 100% of the time over RF or UDP.

Traceroutes over RF are also unreliable (more so than from other RF-only nodes).

It receives everything over RF just fine.

Disabling UDP results in the node transmitting over RF 100% of the time, but obviously wifi and mediumfast connected nodes can no longer see the messages.

Relevant log output


wehooper4 avatar May 09 '25 22:05 wehooper4

I did some additional testing today.

The "router_late" node on the network is dead.

So Pi in the box on the roof, and 4 esp32 nodes on the network. Two nodes on ShortSlow, two nodes (including pi) on longfast.

UDP enabled, pi as client: 50-70% channel message failure. 5% DM failure

UDP disabled, pi as client: 100% channel message success, 0% DM failure

UDP enabled, pi as router_late: 100% channel message success, 0% DM failure

wehooper4 avatar May 13 '25 02:05 wehooper4

Sorry, wrong button. New to github

wehooper4 avatar May 13 '25 02:05 wehooper4

I did some additional testing today.

The "router_late" node on the network is dead.

So Pi in the box on the roof, and 4 esp32 nodes on the network. Two nodes on ShortSlow, two nodes (including pi) on longfast.

UDP enabled, pi as client: 50-70% channel message failure. 5% DM failure

UDP disabled, pi as client: 100% channel message success, 0% DM failure

UDP enabled, pi as router_late: 100% channel message success, 0% DM failure

Based on this, I'm wondering if it is default behavior for clients not to retransmit all packets. I know the "infrastructure" device roles (router, router_late, and repeater) prioritize retransmitting, so this may be a factor.

madeofstown avatar May 14 '25 13:05 madeofstown

Late update to this: router_late and router do not actually resolve this issue. Even in those modes I'm having issues getting the node to transmit over RF. The only reliable fix has been to either shut down all other nodes on the network or disable UDP, both of which prevent bridging from working.

wehooper4 avatar May 27 '25 15:05 wehooper4

I think I am also seeing this issue.

Using UDP bridge decreases the chance of LoRa messages arriving at destination.

Once a UDP ACK is received, no retransmits take place over LoRa. I would prefer that LoRa retries continue even after a UDP ACK is received, so that interference does not reduce LoRa message reliability when using the bridge.

Andrew-a-g avatar Jun 03 '25 14:06 Andrew-a-g

Unless we fixed it while I was not looking, this is probably the bug John was describing a month or so ago.

Basically we don't differentiate where a packet comes from based on routing strategies. That means if we want to relay a packet, but we observe the packet on UDP, this cancels all retransmissions including LoRa.

Assuming I am not wrong (big ask I know) multiple queues with fairly fancy strategies would help.
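To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is a toy model, not the firmware's code: `Transport`, `RelayQueue`, and every method name are hypothetical. With `transport_aware=False`, any second sighting of a packet cancels the pending LoRa rebroadcast, which is the behavior reported in this issue. With `transport_aware=True`, only a copy heard on LoRa cancels it.

```python
from enum import Enum, auto


class Transport(Enum):
    LORA = auto()
    UDP = auto()


class RelayQueue:
    """Toy model of a node's pending-rebroadcast table (hypothetical, not firmware code)."""

    def __init__(self, transport_aware: bool):
        self.transport_aware = transport_aware
        # Keys are (sender, packet_id) tuples waiting for LoRa airtime.
        self.pending = set()

    def schedule(self, key):
        """Queue a packet for rebroadcast over LoRa."""
        self.pending.add(key)

    def observe_duplicate(self, key, via: Transport):
        """Called when the same packet is seen again before we rebroadcast it."""
        if self.transport_aware and via is not Transport.LORA:
            # A copy seen on UDP says nothing about LoRa coverage, so keep the rebroadcast.
            return
        self.pending.discard(key)

    def will_transmit(self, key) -> bool:
        return key in self.pending


key = ("node-a", 1234)

naive = RelayQueue(transport_aware=False)
naive.schedule(key)
naive.observe_duplicate(key, Transport.UDP)

aware = RelayQueue(transport_aware=True)
aware.schedule(key)
aware.observe_duplicate(key, Transport.UDP)
```

In this model the naive queue drops the LoRa rebroadcast as soon as the UDP copy shows up, while the transport-aware queue keeps it until a duplicate is actually heard on LoRa.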

Jorropo avatar Jun 03 '25 16:06 Jorropo

Thinking far into the future, after looking at https://github.com/meshtastic/protobufs/issues/691 , and thinking about the potential of dual radio nodes ... is it worth tracking more than just UDP vs RF? Maybe UDP vs 900 vs 433 vs 2.4?
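As a rough illustration of what tracking per interface could look like, the Python sketch below records which interfaces a packet has been seen on and reports where it still needs to go. Everything here is an assumption made for illustration: the `SeenTable` class and the interface labels (`"udp"`, `"lora-900"`, `"lora-433"`, `"lora-2.4"`) do not come from the firmware or the linked protobufs issue.

```python
from collections import defaultdict


class SeenTable:
    """Hypothetical per-interface duplicate tracking (not firmware code)."""

    def __init__(self):
        # Maps a packet key, e.g. (sender, packet_id), to the interfaces it was seen on.
        self._seen = defaultdict(set)

    def record(self, key, interface: str) -> None:
        self._seen[key].add(interface)

    def heard_on(self, key, interface: str) -> bool:
        # .get avoids creating an empty entry for packets never recorded.
        return interface in self._seen.get(key, set())

    def pending_interfaces(self, key, all_interfaces):
        """Interfaces on which this packet has not been observed yet."""
        seen = self._seen.get(key, set())
        return [i for i in all_interfaces if i not in seen]


interfaces = ["udp", "lora-900", "lora-433", "lora-2.4"]
table = SeenTable()
packet = ("node-a", 1234)
table.record(packet, "udp")
table.record(packet, "lora-433")
still_needed = table.pending_interfaces(packet, interfaces)
```

Keying the bookkeeping by interface label rather than by a UDP/RF boolean means a dual-radio node would not need a different data structure, only more labels.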

fifieldt avatar Jun 08 '25 23:06 fifieldt

Can we get this one reopened?

It's a critical bug for the success of UDP. Measuring is more unreliable without it.

Andrew-a-g avatar Aug 01 '25 06:08 Andrew-a-g

Once my PR lands in protobufs, I have some code to address this. Should happen right after 2.7.4 is released as alpha

jp-bennett avatar Aug 01 '25 16:08 jp-bennett

Is this feature available in the latest alpha? At the moment, I only get one chance to TX to the mesh, but then I receive an acknowledgment from my own UDP repeater node (with LoRa disabled). Essentially, the packet is being acknowledged by myself, which isn’t useful.

What I’d like instead is the ability to only accept acknowledgments from LoRa, while still using my second node as RX-only and forwarding traffic over UDP to the primary.

My setup looks like this:

Primary node: strong TX to the mesh, but weak RX.

Secondary node: excellent RX, but poor TX to the mesh.

Having two RX nodes really improves signal pickup, and being indoors in an apartment leaves me with limited placement options. But I found this setup really improves my ability to hear other people's nodes.

wtermini avatar Sep 05 '25 10:09 wtermini

It should be fixed with >=2.7.6 (it needs #7634 also).

GUVWAF avatar Sep 05 '25 10:09 GUVWAF

I’m on 2.7.8 for both nodes, so I assume this is the intended behavior. At the moment I get one transmission over LoRa and UDP, but the acknowledgment comes back via UDP. Ideally, I’d like to be able to ignore UDP acknowledgments until the third retry, so the ACKs only come from LoRa. I am checking the LoRa broadcast with my SDR and see just one burst.

If that’s outside the scope of this bug, that’s fine. Another possible approach would be to have the node ignore acknowledgments from specific nodes, or have the repeater node ignore my node. Perhaps I could raise an enhancement request for that?

wtermini avatar Sep 05 '25 11:09 wtermini

The original issue was about rebroadcasts being canceled when the ACK came via UDP. This is fixed with #7589 (and #7634).

Instead of not acknowledging when receiving via UDP, we could also indicate how the ACK arrived, e.g. LoRa, UDP, MQTT, etc. That would be a nice feature request.
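A small Python sketch of how an ACK tagged with its arrival path might feed a retry policy, combining the idea above with the earlier request to ignore UDP ACKs until a later attempt. This is only an illustration under stated assumptions: `AckVia`, `PendingSend`, and the `max_retries` and `udp_ack_after` values are hypothetical and are not the firmware's types or defaults.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AckVia(Enum):
    LORA = auto()
    UDP = auto()
    MQTT = auto()


@dataclass
class PendingSend:
    """Retransmission state for one packet this node originated (toy model)."""

    max_retries: int = 3    # hypothetical cap on total LoRa transmissions
    udp_ack_after: int = 3  # hypothetical: trust non-LoRa ACKs only from this attempt on
    attempts: int = 1       # the first transmission has already happened
    done: bool = False

    def on_ack(self, via: AckVia) -> None:
        """Stop retrying on a LoRa ACK, or on any ACK once enough attempts were made."""
        if via is AckVia.LORA or self.attempts >= self.udp_ack_after:
            self.done = True

    def on_timeout(self) -> bool:
        """Return True if another LoRa transmission should be made."""
        if self.done or self.attempts >= self.max_retries:
            return False
        self.attempts += 1
        return True


send = PendingSend()
send.on_ack(AckVia.UDP)        # early UDP ACK: ignored, LoRa retries continue
retry_again = send.on_timeout()
```

With these placeholder values, a UDP ACK that arrives after the first transmission does not stop the LoRa retries, while a LoRa ACK at any point does.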

GUVWAF avatar Sep 05 '25 11:09 GUVWAF

Hmm, upon a second look, it looks like #7589 only prevents canceling a rebroadcast (if the packet originated from someone else), but when a packet from yourself gets ACK-ed via UDP, it will still stop the retransmissions.

I'll have a look to fix this.

GUVWAF avatar Sep 05 '25 12:09 GUVWAF