[Feature Request]: Do not reply to any broadcast packets with want ack
Platform
other
Description
This should not be a supported feature. Nodes should not be able to ask for a response on a broadcast packet as this can cause tons of traffic. Want response should only be allowed on directed messages. @GUVWAF
What about when you are deliberately sending a ping like when double tapping the button on screenless node to send your position and want a response?
This should not be allowed. That position packet should be sent without a want response.
But the purpose of sending that position packet is to have a response. It's to ping the network with your position and receive everyone else's back.
Can we keep this for portnums 72 and 257? I use this feature in the ATAK Plugin.
I will say I'm not in favor of this change as it limits the overall discovery of a mesh in a meaningful way.
If it's deemed to be the only way, then I would say shrink the minimum Node Info Broadcast interval to something like an hour or 90 minutes, and expose that setting in the apps. The current three-hour interval is far too infrequent if this feature is removed.
There are more uses for Meshtastic than just hyper dense cities on the coast. If "traffic" is a constant issue in places like SFO or NYC, a faster preset solves this. There are so many people there already that the lower link budget isn't an issue.
Perhaps the default preset in the USA should just be short_turbo so we can stop removing important features.
Sample code
if (p->want_ack && p->to == NODENUM_BROADCAST &&
    !IS_ONE_OF(p->decoded.portnum, meshtastic_PortNum_ATAK_PLUGIN, meshtastic_PortNum_ATAK_FORWARDER,
               meshtastic_PortNum_TRACEROUTE_APP)) {
    LOG_WARN("Broadcast packet requested ACK, ignoring!");
    p->want_ack = false; // Can't ACK broadcast packets
}
Wait, are we talking about want_ack or want_response now?
About broadcasts with want_ack:
Current firmware should already refuse to send broadcasts with want_ack=true. I'm not sure how the broadcasts with want_ack=true from the ATAK plugin even work (?) Does it somehow circumvent this? @niccellular
https://github.com/meshtastic/firmware/blob/8fe98db5dd6738546db0d27c6823e3380df322d4/src/mesh/Router.cpp#L326-L328
In case we still receive such a packet, the logic for deciding whether to respond with an ACK is fairly complicated, maybe that needs to be reviewed. But as I read it, the firmware already doesn't reply to broadcasts with want_ack=true. https://github.com/meshtastic/firmware/blob/8fe98db5dd6738546db0d27c6823e3380df322d4/src/mesh/ReliableRouter.cpp#L97-L165
-> Therefore, the Feature Request as per title ("Do not reply to any broadcast packets with want ack") is already fulfilled, IMO.
@RCGV1 Can you elaborate what's missing?
About broadcasts with want_response:
Some features rely on it, e.g.:
- Initial node discovery when joining a mesh works by sending our own NodeInfo to broadcast with want_response = True, so the nodes around respond with their NodeInfo
- "deliberately sending a ping like when double tapping the button on screenless node to send your position and want a response" does the same thing, but with position packets instead.
Yes, that does cause a lot of traffic, but for a reasonable cause. Especially for initial discovery, I can't think of a good alternative. The cooldown periods for responding to these should make sure it doesn't overload the mesh constantly, no matter how often this is requested.
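The cooldown idea mentioned above could look roughly like the sketch below. This is not the firmware's actual code: `ResponseCooldown`, `shouldRespond` and the cooldown length are hypothetical names and values chosen for illustration, assuming a millisecond uptime counter is available.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical helper, NOT taken from the Meshtastic firmware: remembers when
// we last answered a broadcast want_response request on a given port number.
class ResponseCooldown {
  public:
    explicit ResponseCooldown(uint32_t cooldownMs) : cooldownMs_(cooldownMs) {}

    // Returns true if a reply may be sent now, and records the time of that reply.
    bool shouldRespond(uint32_t portnum, uint32_t nowMs) {
        auto it = lastReplyMs_.find(portnum);
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (it != lastReplyMs_.end() && (nowMs - it->second) < cooldownMs_) {
            return false; // still cooling down, stay silent
        }
        lastReplyMs_[portnum] = nowMs;
        return true;
    }

  private:
    uint32_t cooldownMs_;                                // assumed value, set by the caller
    std::unordered_map<uint32_t, uint32_t> lastReplyMs_; // portnum -> time of last reply
};
```

With a limiter like this, the number of replies a single node sends is bounded by the cooldown no matter how often a broadcast with want_response arrives.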
Perhaps the default preset in the USA should just be short_turbo so we can stop removing important features.
Not just in the US.
I'd advocate for having the fastest available preset as the default. Either ShortFast everywhere for consistency/compatibility across regional borders, or ShortFast/ShortTurbo depending on regional laws for maximum resistance to congestion.
If someone really needs increased range/penetration, doing so by using more airtime might as well come at the cost of moving to a different mesh.
IMO that's more reasonable than having to move to a different, incompatible preset because the default preset is unusable due to congestion.
Meshtastic has grown too large for SF11 as a default, and the combination of many nodes, a feature-richness that causes a lot of traffic and a slow default preset is a strategic weakness compared to other LoRa Mesh Projects.
But that's probably not an option before 3.0
Wait, are we talking about want_ack or want_response now?
It seems indeed this issue is about want_response, not want_ack.
I agree that it is of good use in some cases. Maybe we can disallow it for specific portnums, but we should be careful.
Sorry I meant want response. I don't believe that want response broadcast packets should be sent at all.
So what do you say about the listed use cases for it?
Nodes should decide when to reply to an unknown node's NodeInfo without a want response. In large meshes these sorts of packets create tons of traffic. I think that want response packets should only be used as directed packets.
Nodes should decide when to reply to an unknown node's NodeInfo without a want response.
If we just change the flag to false and an arbitrary number of nodes respond anyways, what problem did we solve?
Nodes already decide whether they should respond based on when they sent their NodeInfo last time afaik.
I think that want response packets should only be used as directed packets.
So how does a new Node discover other nodes when joining?
They send out their own node info and other nodes will automatically initiate exchange since it's unknown
They send out their own node info and other nodes will automatically initiate exchange since it's unknown
A node that is known to them doesn't automatically also have them in their NodeDB.
E.g. if you switched to a different preset, reset your NodeDB to see who's there and then join the original Mesh again, everyone still "knows" you and will not respond. You also can't request it individually, since you don't know who to talk to
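To make the asymmetry concrete, here is a toy model of the proposed rule ("reply only if the sender is unknown"). The `Node` struct and `onNodeInfo` are hypothetical and heavily simplified, with a NodeDB reduced to a set of node IDs; they do not mirror the firmware.

```cpp
#include <cstdint>
#include <set>

// Hypothetical, simplified model: a NodeDB is just a set of known node IDs.
struct Node {
    uint32_t id;
    std::set<uint32_t> known;

    // Handles a NodeInfo broadcast that was sent WITHOUT want_response.
    // Under the proposed rule the node replies only if the sender was unknown.
    // Returns true if a reply would be sent.
    bool onNodeInfo(uint32_t from) {
        bool wasUnknown = known.insert(from).second;
        return wasUnknown;
    }
};
```

If node 1 clears its own `known` set while node 2 still has ID 1 stored, `onNodeInfo(1)` on node 2 returns false, so node 1 receives nothing back and has nobody to address a directed request to.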
I don't think that this is a common thing. Someone who resets their node db should understand that.
Resetting a nodeDB is actually fairly common practice just in general even if you aren't switching presets. I think this idea has too many shortcomings at this time.
I don't think that this is a common thing
It is, since currently there is no way of filtering your Node List by what preset a Node is on. The only way to know which Nodes are actually present on the current preset is resetting your NodeDB and then checking who shows up.
Someone who resets their node db should understand that.
I do, but now what? What's the solution to this situation?
Wait for the nodes to advertise their node info
That can take multiple days.
I don't see what problem this is supposed to solve.
If it's about congestion, I believe rate limiting (that applies both to requesting and responding) is a better solution than removing functionality altogether.
We are seeing nodes reboot (likely bugs, or corruption, solar drain, flaky cables... whatever) and then they send Node ID with want-response, and then 200 nodes respond... this happens many times per day. Perhaps we can have the boot-triggered packet NOT request a response? I'm less worried about a user asking for something... it's the unsolicited, automatically generated packets that blast the mesh with a 1-2 minute storm when a node reboots. Also, why do we need to "learn the whole mesh"? Won't fit in db anyhow. I'd rather only learn who's 1 hop away or somesuch. I'd rather have dynamic hop counts too... chat 7, info auto 2, telemetry 1, manual ID 7, etc.
Rate limiting requests/responses is fine too... but need to consider the few nodes that are rebooting often for various reasons (which is hard to detect EXCEPT for these flood storms tipping us off).
We have 900 nodes on a well connected mesh...and I see this is the reality or the goal in many areas, not just ours. Unfortunately other metros gave up, we're trying not to.
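The per-traffic-type hop counts suggested above could be expressed as a simple lookup. This is only a sketch: the `TrafficClass` enum and `hopLimitFor` are hypothetical names that do not exist in the firmware, and the numbers are simply the ones from the suggestion.

```cpp
#include <cstdint>

// Hypothetical traffic classes; the firmware does not define this enum.
enum class TrafficClass { Chat, AutoNodeInfo, Telemetry, ManualNodeInfo };

// Hop budgets copied from the suggestion: chat 7, info auto 2, telemetry 1, manual ID 7.
inline uint8_t hopLimitFor(TrafficClass c) {
    switch (c) {
    case TrafficClass::Chat:
        return 7;
    case TrafficClass::AutoNodeInfo:
        return 2;
    case TrafficClass::Telemetry:
        return 1;
    case TrafficClass::ManualNodeInfo:
        return 7;
    }
    return 0; // only reached for an invalid enum value
}
```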
Looking at this from the most basic approach, I see a couple of things listed here, in no particular order:
- Booted or rebooted nodes have a startup action of sending NodeInfo with want_response, which results in large bursts of activity on busy meshes
- Nodes can boot or reboot unexpectedly due to a multitude of reasons
- Changing presets or nuking nodeDB requires discovery of nodes again
- Manually created Node Info (i.e. Send Position or Send Node Info) is part but not the whole discussion
I'm sure I've missed something but I think we could tease out a possible solution here:
- If you create or reset NodeDB, the first boot/reboot should send the Node Info with want_response to promote initial mesh discovery
- Rebooted nodes should issue a Node Info WITHOUT want_response; I think it's fair that it could send its Node Info out for clarity
- Manually created Node Infos should be throttled similar to Trace Route, I could even recommend something like once a minute.
Does this seem like something agreeable?
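A rough sketch of the three rules proposed above, to show how small the decision logic could be. Everything here is hypothetical (`NodeInfoTrigger`, `NodeInfoPolicy`, `kManualIntervalMs`), and it assumes that a manually triggered Node Info still asks for a response, which the proposal does not state explicitly.

```cpp
#include <cstdint>

// Hypothetical reasons for sending a NodeInfo broadcast.
enum class NodeInfoTrigger { FreshNodeDb, Reboot, Manual };

// Assumed throttle for manual requests: once a minute, as suggested.
constexpr uint32_t kManualIntervalMs = 60 * 1000;

struct NodeInfoPlan {
    bool send;         // whether to transmit at all
    bool wantResponse; // value of the want_response flag
};

class NodeInfoPolicy {
  public:
    NodeInfoPlan plan(NodeInfoTrigger trigger, uint32_t nowMs) {
        switch (trigger) {
        case NodeInfoTrigger::FreshNodeDb:
            return {true, true}; // empty NodeDB: ask the mesh to introduce itself
        case NodeInfoTrigger::Reboot:
            return {true, false}; // announce ourselves, but do not trigger replies
        case NodeInfoTrigger::Manual:
            if (hasManual_ && (nowMs - lastManualMs_) < kManualIntervalMs) {
                return {false, false}; // throttled
            }
            hasManual_ = true;
            lastManualMs_ = nowMs;
            return {true, true}; // ASSUMPTION: manual requests keep want_response
        }
        return {false, false}; // only reached for an invalid enum value
    }

  private:
    bool hasManual_ = false;
    uint32_t lastManualMs_ = 0;
};
```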
The idea of being able to manage hops per port num is an interesting one. It would really have to be done in the routing logic to make it make sense with 0-hopping. With the default hop limit of 4, I can still hit hundreds of nodes in 500sq miles in our mesh because of the roll out of 0-hopping to most important routers.
It all brings up a bigger question for me. What's the point in having a sprawling mesh if you don't know who all is on it or where they are located? Would the idea be to functionally have smaller meshes that are connected to common infra nodes? Or to be able to communicate over long distances but only if you manually exchange node info? Or would it be to have an even more anonymous large chat room?
The tension between ad hoc, small mesh, and large mesh use cases sure is real though. A lot of the not-fleshed-out ideas I can imagine happen at the router level, since that's really where a small mesh becomes a big mesh (rate limits, per-port hop logic, etc.).
Rebooted nodes should issue a Node Info WITHOUT want_response
I agree on that, but I don't think current firmware does send it with want_response=true at boot (?). @nullrouten0 Can you check what firmware the nodes that send these regularly are running on?
Also, why do we need to "learn the whole mesh"? Won't fit in db anyhow. I'd rather only learn who's 1 hop away or somesuch.
That again is a very "dense city-mesh" centric view on the project... Where I am the "whole mesh" is maybe 20 nodes, and only 2 of them are within 1 hop: My own Router_Late solar nodes. If node discovery was limited to 1 hop, it simply wouldn't retrieve any messageable nodes here.
Manually created Node Infos should be throttled similar to Trace Route
Manually created NodeInfos through "Exchange User Info" in the App are always directed, not broadcasts, IMO.
The only exception I'm aware of is the "Mesh Detector" in MUI, which sends and then listens indefinitely until you press "Stop". Maybe that could use a cooldown, though it doesn't really motivate the user to keep restarting it anyways.
There are more uses for Meshtastic than just hyper dense cities on the coast
Definitely. Linking the selected default preset, overridable of course, to the number of discovered nodes on the mesh would be good. Starting at SHORT_FAST, maybe.
Meshtastic has grown too large for SF11 as a default, and the combination of many nodes, a feature-richness that causes a lot of traffic
Well, an alternative is to keep that feature-richness, but to get a lot of users to use resources, i.e. airtime, responsibly of their own free will. I know this is not how people in the U.S. think in general (mindful use of resources), but it is something we are successful with here in Switzerland: a country-wide MEDIUM_FAST network, and making people aware that sending their position or sensor data to the mesh as a whole is disingenuous to people who really couldn't care less.
Same for 0-cost hops, an overuse of ROUTER / ROUTER_LATE instead of Client leading to duplicate packets whirling around, etc ...
Just my two cents.
By the way, we have just one frequency slot for most presets (all except LONG_SLOW, LONG_MODERATE and SHORT_TURBO), at 869.525 MHz in the center of the EU_868 ISM band, so being mindful of what one does is even more important here. See EU_868 vs. US:
https://meshtastic.org/docs/overview/radio-settings/#frequency-slot-calculator
The tension between ad hoc, small mesh, and large mesh use cases sure is real though
100% agreement
What's the point in having a sprawling mesh if you don't know who all is on it or where they are located
Exactly. Good riddance on finally removing role REPEATER in firmware 2.7; at least it's a start.
And by the way: Really, truly, think of re-naming the node roles in v3. Their naming and behavior in the code have nothing to do with what people think of first when reading the names. Also, please consider using different flag names than is_favorite for features such as CLIENT_BASE preferred early rebroadcast window nodes and 0-cost-hop from-Nodes. Naming is of extreme importance.
Finally, in case I come across harsh: thank you for this amazing project, and I hope our inputs as well as competition from MeshCore make Meshtastic better and better.