Mesh-wide, coordinated zero-hop 'virtual node' broadcast with multi-mode & multi-frequency capability
I have recently created a firmware feature that is intended to help new users on our mesh, and wanted to check if there's any interest in having this (or some subset of it) merged upstream - is there? If so, I'll tidy it up into a more general form suitable for merging (e.g. no static config). If not, I'll just maintain it as a separate patch. It's currently installed on a number of our high sites on a 2.6.11 base for testing.
Please note that the code in this PR as it exists currently is not intended to be directly merged; rather it's a starting point for testing & discussion about what aspects of this work, if any, are desirable upstream.
The Feature:
- Nodes with this firmware advertise a second, 'virtual node', with a common node ID (i.e. all sites share the same secondary ID).
- This virtual node is advertised on both the configured primary channel and on LF20.
- If the node has a channel named "Tips" configured, any incoming text messages on that channel will be re-originated using the virtual node as the origin.
- All re-originated packets are sent with a hop limit of zero.
- If the message contains a radio setting prefix (e.g. #SF20 for SHORT_FAST slot 20), the node will reconfigure the LoRa settings & channel name before re-originating the packet. It will then immediately switch back to its usual config.
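To make the prefix handling concrete, here is a minimal Python sketch of how a message such as `#SF20 ...` might be split into a preset, a frequency slot and the remaining text. The two-letter codes (`SF`, `LF`) are the only ones mentioned above; the exact grammar, the `parse_radio_prefix` name, and whether the prefix is stripped before re-origination are all assumptions for illustration, not a description of the actual patch.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical abbreviation table: only the two codes mentioned in this PR.
PRESET_CODES = {"SF": "SHORT_FAST", "LF": "LONG_FAST"}


class RadioPrefix(NamedTuple):
    preset: str  # modem preset name, e.g. "SHORT_FAST"
    slot: int    # frequency slot number
    body: str    # message text with the prefix removed


# '#', two capital letters, a slot number, optional whitespace, then the text.
_PREFIX_RE = re.compile(r"^#([A-Z]{2})(\d{1,3})\s*(.*)$", re.DOTALL)


def parse_radio_prefix(text: str) -> Optional[RadioPrefix]:
    """Return the parsed prefix, or None if the text has no recognised prefix."""
    match = _PREFIX_RE.match(text)
    if match is None:
        return None
    preset = PRESET_CODES.get(match.group(1))
    if preset is None:
        return None  # unknown preset code: treat as an ordinary message
    return RadioPrefix(preset, int(match.group(2)), match.group(3))
```

A message without a recognised prefix returns `None`, which a caller could treat as "send on the node's usual config".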
The Reasons:
- To advise new users in our area that they need to change their radio settings to SF16, where the bulk of our mesh lives, and leverage our existing infrastructure to ensure that message is received across our coverage footprint.
- To provide a different set of tips to users on LF20 vs users on our main SF16 mesh.
- To quickly announce or test region-wide on radio modes other than our usual one.
- To prevent tips from leaking outside of the radio mode & geographic areas they are intended for.
- To allow multiple users to send messages using the "Tips Robot" identity, without needing to share a single device.
Sending messages as zero-hop prevents them from leaking - either to other neighbouring meshes, or to another radio mode across our SF/LF bridges. It means we can catch new users, and users from out of town, without spamming the rest of the mesh about stuff that they already know.
Is there any appetite to have this merged as a mainline feature, either in part or in full? Based on the earlier discord discussion, it seems like at least the mode switching part may be independently useful.
See also the discord thread here
Not currently implemented in this PR, but could be enabled by the mode-switch stuff: there's the potential to build an "I am over here" feature on top of it, so that nodes could e.g. every few hours briefly jump over to the default LONG_FAST slot, send an "I am using these settings" packet, and then jump back. This would be a different, 'discovery' packet type, not a normal nodeinfo or text message. The upshot would be that, assuming it's enabled by default, new users could then automatically discover nodes in their area even if those are using different modem settings, and would know where to go without any central coordination needed to send them there.
🤝 Attestations
- [x] I have tested that my proposed changes behave as described.
- [x] I have tested that my proposed changes do not cause any obvious regressions on the following devices:
- [ ] Heltec (Lora32) V3
- [ ] LilyGo T-Deck
- [ ] LilyGo T-Beam
- [x] RAK WisBlock 4631
- [x] Seeed Studio T-1000E tracker card
- [ ] Other (please specify below)
Please note that this PR requires the following protobuf changes in order to work:
diff --git a/meshtastic/mesh.proto b/meshtastic/mesh.proto
index 03162d8..ec54c99 100644
--- a/meshtastic/mesh.proto
+++ b/meshtastic/mesh.proto
@@ -1393,6 +1393,21 @@ message MeshPacket {
* Set by the firmware internally, clients are not supposed to set this.
*/
uint32 tx_after = 20;
+
+ /*
+ * The modem preset to use for this packet
+ */
+ uint32 modem_preset = 21;
+
+ /*
+ * The frequency slot to use for this packet
+ */
+ uint32 frequency_slot = 22;
+
+ /*
+ * Whether the packet has a nonstandard radio config
+ */
+ bool nonstandard_radio_config = 23;
}
/*
If this were generalised for merge, I can see it logically splitting into the following:
Preset / Frequency / Channel Switching
Integrated into the core, mostly in RadioInterface.cpp. Handles switching the radio config & channel name / key as needed, and deals with any queuing complications that may result from some packets being intended for a different mode.
May require some changes to the queuing behaviour for optimal performance (as opposed to my current version, which just treats any queuing delay as acceptable).
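As a rough illustration of the "treat any queuing delay as acceptable" approach, the Python sketch below drains a FIFO queue, switches the radio only when a packet's config differs from the current one, and always returns to the home config afterwards. `FakeRadio`, `RadioConfig`, `Packet` and `drain_queue` are hypothetical stand-ins for illustration; they are not the firmware's actual classes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RadioConfig:
    preset: str
    slot: int


@dataclass
class Packet:
    payload: str
    override: Optional[RadioConfig] = None  # None = use the node's usual config


class FakeRadio:
    """Stand-in for a radio driver; it only records what it was asked to do."""

    def __init__(self, home: RadioConfig) -> None:
        self.home = home
        self.current = home
        self.log: List[Tuple[str, str]] = []

    def apply(self, cfg: RadioConfig) -> None:
        if cfg != self.current:  # skip redundant reconfiguration
            self.current = cfg
            self.log.append(("switch", f"{cfg.preset}/{cfg.slot}"))

    def send(self, pkt: Packet) -> None:
        self.log.append(("tx", pkt.payload))


def drain_queue(radio: FakeRadio, queue: List[Packet]) -> None:
    """Send packets in FIFO order, then make sure the radio is back home."""
    try:
        for pkt in queue:
            target = pkt.override if pkt.override is not None else radio.home
            radio.apply(target)
            radio.send(pkt)
    finally:
        radio.apply(radio.home)  # never stay parked on a foreign config
    queue.clear()
```

A smarter scheduler could batch packets by config to reduce the number of switches, at the cost of reordering; this sketch deliberately keeps strict FIFO order.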
Virtual Node
Independent module, with separate protobuf for config. Config options would be:
- Whether the feature is enabled (default=no)
- List of (modem config, channel name, channel key, TX interval) vectors on which NodeInfo should be sent.
- Local channel number from which inbound text messages should be re-originated
- Virtual node short name
- Virtual node long name
- Virtual node public key (virtual node ID to be automatically derived from this)
Should there be a configurable hop limit for this (perhaps in the vector list)? Or should it be forced to always zero?
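For discussion purposes, the option list above could be modelled roughly as follows in Python. Every field name here is hypothetical, the per-target `hop_limit` reflects the open question above rather than a decision, and the real config would live in a protobuf rather than a dataclass.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AdvertTarget:
    """One (modem config, channel name, channel key, TX interval) vector."""
    modem_preset: str
    frequency_slot: int
    channel_name: str
    channel_key: bytes
    tx_interval_s: int
    hop_limit: int = 0  # assumption: default to zero-hop, as in the current patch


@dataclass
class VirtualNodeConfig:
    enabled: bool = False                  # default = no
    advert_targets: List[AdvertTarget] = field(default_factory=list)
    source_channel_index: int = 0          # local channel to re-originate from
    short_name: str = ""
    long_name: str = ""
    public_key: bytes = b""                # node ID would be derived from this

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems: List[str] = []
        if not self.enabled:
            return problems  # nothing to check while the feature is off
        if not self.advert_targets:
            problems.append("enabled, but no advert targets configured")
        for target in self.advert_targets:
            if target.tx_interval_s <= 0:
                problems.append(f"{target.channel_name}: TX interval must be positive")
            if not 0 <= target.hop_limit <= 7:
                problems.append(f"{target.channel_name}: hop limit must be 0-7")
        return problems
```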
Config Beacon
Independent module, with a new app type & associated protobuf. At a configured interval, would switch to the default LONG_FAST slot, broadcast a zero-hop packet, and then return immediately to its normal radio settings. The packet would contain the following:
- Version [2 bits] // beacon packet version, increment on non-backwards-compatible format changes
- Reserved [2 bits] // in case we need them later, should default to 0
- Role hint: [2 bits]
  - MIGHT forward traffic (e.g. `CLIENT` type roles); or
  - WILL forward traffic (e.g. `REPEATER`, `ROUTER`, `ROUTER_LATE`); or
  - WILL NOT forward traffic (always this if rebroadcast mode is set to `NONE`)
- Forwarding hint: [2 bits]
  - Will forward ALL traffic (rebroadcast mode `ALL` or `ALL_SKIP_DECODING`); or
  - Will forward only CORE traffic (rebroadcast mode `CORE_PORTNUMS_ONLY`); or
  - Will forward only KNOWN traffic (rebroadcast mode `LOCAL_ONLY` or `KNOWN_ONLY`); or
  - Will forward NO traffic
- Modem settings [32 bits (F=24 + BW=3 + SF=3 + CR=2)]
  - Frequency: uint24 (kHz)
  - Bandwidth: uint3 enum
  - Spreading factor: uint3 enum
  - Coding Rate: uint2 (offset so 0=5)
- Node ID [32 bits]
- Primary channel hash (allows distinguishing channels with the same name, but different keys) [8 bits]
- Primary channel name [96 bits]
Without overhead, the example packet above would fit into 22 bytes (176 bits). I'm unsure how much overhead protobuf encoding would incur for the above - I'm not familiar enough with its wire format - but for this kind of payload raw encoding would be trivial if that encoding overhead is a concern (is it?). IMO this is small enough that with a sensible interval, it could reasonably scale to a very high number of nodes, even in a dense event mesh - especially as it's a zero-hop packet. Keeping it small does matter, because we don't have a good picture of what the default channel util looks like (because we may be sending beacons to a modem config that we aren't normally resident on), and therefore cannot auto-throttle.
Given that these are all fixed-length fields, backwards-compatibility can be maintained for raw encoding without a version bump in the event of future field changes by simply appending any new field data to the end of the existing packet (or using the reserved bits).
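To show how trivial the raw encoding would be, here is a Python sketch that packs the fields in the order listed, using the individual widths given for each field (frequency 24 bits, bandwidth 3, spreading factor 3, coding rate 2). With those widths the fixed header is 80 bits and the whole packet is 22 bytes. The field order, big-endian layout, NUL-padding of the channel name and the example enum values are all assumptions; no enum tables have been defined yet.

```python
from typing import Dict, Tuple

# (name, width in bits), in the order the fields are listed above.
FIELDS = [
    ("version", 2), ("reserved", 2), ("role_hint", 2), ("forwarding_hint", 2),
    ("frequency_khz", 24), ("bandwidth", 3), ("spreading_factor", 3),
    ("coding_rate", 2), ("node_id", 32), ("channel_hash", 8),
]
HEADER_BITS = sum(width for _, width in FIELDS)   # 80 bits
HEADER_BYTES = HEADER_BITS // 8                   # 10 bytes
CHANNEL_NAME_BYTES = 12                           # 96 bits
BEACON_BYTES = HEADER_BYTES + CHANNEL_NAME_BYTES  # 22 bytes


def pack_beacon(values: Dict[str, int], channel_name: str) -> bytes:
    """Pack the fixed-width fields MSB-first, then the NUL-padded name."""
    acc = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        acc = (acc << width) | value
    name_bytes = channel_name.encode("utf-8")
    if len(name_bytes) > CHANNEL_NAME_BYTES:
        raise ValueError("channel name too long")
    return acc.to_bytes(HEADER_BYTES, "big") + name_bytes.ljust(CHANNEL_NAME_BYTES, b"\x00")


def unpack_beacon(data: bytes) -> Tuple[Dict[str, int], str]:
    """Decode a beacon; trailing bytes from newer senders are ignored."""
    if len(data) < BEACON_BYTES:
        raise ValueError("beacon too short")
    acc = int.from_bytes(data[:HEADER_BYTES], "big")
    values: Dict[str, int] = {}
    shift = HEADER_BITS
    for name, width in FIELDS:
        shift -= width
        values[name] = (acc >> shift) & ((1 << width) - 1)
    raw_name = data[HEADER_BYTES:BEACON_BYTES].rstrip(b"\x00")
    return values, raw_name.decode("utf-8")
```

Because `unpack_beacon` only reads the first 22 bytes, a later revision could append extra fields without breaking older decoders, which is the append-only compatibility approach described above.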
Assuming it were enabled by default, which IMO would be a good idea, this would allow new users to automatically discover which settings are in use by other nearby nodes, without any separate coordination mechanism being required. This would require app support, but could allow a simple autoconfiguration mechanism that presents a list like the following, and allows users to tap one to join that particular mesh. If the hash of the tapped channel indicates a non-default key, the app could also prompt the user to enter the key.
- LongFast (919.875MHz 250kHz SF11 CR5) - 4 nodes // default meshtastic config
- ShortFast (915.000MHz 250kHz SF7 CR5) - 19 nodes // default SHORT_FAST preset on a nonstandard frequency
- PrivateMesh (915.000MHz 250kHz SF7 CR5) - 2 nodes // private channel piggybacking on the above SF infrastructure
- JoesSensors (925.375MHz 250kHz SF9 CR5) - 106 nodes // Joe has put a GPS tracker on every cow in his herd
Sending only the info above allows these packets to remain very small (and therefore impose minimal airtime overhead on the default out-of-the-box LONG_FAST setup), and sending them zero-hop ensures that the user is presented with only the nodes that are legitimately within direct range.
Including the device role & rebroadcast hints means that the user can also see at a glance whether there is a nearby device that will forward traffic. In the example above, the user might see that the ShortFast item that is displayed includes three nodes that will forward traffic (one router, two clients), but JoesSensors does not have any.
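On the app side, building that list is mostly a grouping exercise. The Python sketch below groups received beacons by channel and modem settings, counts distinct node IDs, and counts how many of them may forward traffic. The `Beacon` fields and the output format are hypothetical, loosely mirroring the example list above.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Beacon:
    node_id: int
    channel_name: str
    channel_hash: int
    frequency_khz: int
    bandwidth_khz: int
    spreading_factor: int
    coding_rate: int
    role_hint: str  # "MIGHT", "WILL" or "WILL_NOT"


def summarise(beacons: List[Beacon]) -> List[str]:
    """Group beacons into one display line per distinct mesh, busiest first."""
    nodes: Dict[Tuple, Set[int]] = {}
    forwarders: Dict[Tuple, Set[int]] = {}
    for b in beacons:
        key = (b.channel_name, b.channel_hash, b.frequency_khz,
               b.bandwidth_khz, b.spreading_factor, b.coding_rate)
        nodes.setdefault(key, set()).add(b.node_id)  # sets dedupe repeat beacons
        forwarders.setdefault(key, set())
        if b.role_hint != "WILL_NOT":
            forwarders[key].add(b.node_id)

    lines: List[str] = []
    for key in sorted(nodes, key=lambda k: (-len(nodes[k]), k[0])):
        name, _hash, freq, bw, sf, cr = key
        count = len(nodes[key])
        noun = "node" if count == 1 else "nodes"
        lines.append(
            f"{name} ({freq / 1000:.3f}MHz {bw}kHz SF{sf} CR{cr}) - "
            f"{count} {noun}, {len(forwarders[key])} may forward"
        )
    return lines
```

Including the channel hash in the grouping key keeps two channels with the same name but different keys on separate lines.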
Definitely seems like a significant lack of feedback here, other than from @GUVWAF.
Is there any objection if I clean this up and generalise it for proper inclusion in the firmware? Or is it preferred for me to maintain this as an independent patchset for just our local mesh?
This is fascinating and interesting to me, I like it.
Curious why a virtual node. If using the proper ID then recipients can tell where they received the message from. Does it make the packets appear identical so they aren’t duplicates?
We have 3 presets in the Bay Area and I could see us using this feature.
I’m also pleased to see switching of presets and frequencies without a reboot, which has always frustrated me
> Curious why a virtual node. If using the proper ID then recipients can tell where they received the message from.
Three reasons:
- So that more than one user can send messages using that identity; and
- Because that way we only need to advertise nodeinfo for one node on a different frequency, rather than advertising many there; and
- Because by its very nature it cannot receive replies (due to listening on a different frequency / modem preset), and therefore it sends nodeinfo packets with the 'unmessageable' flag set to true. This avoids the misleading situation where a message may be received from a node identity that looks messageable (i.e. a normal personal node), but isn't.
> Does it make the packets appear identical so they aren’t duplicates?
Yes. When a packet is broadcast using this mechanism, it is identical from every site that sends it.
@erayd have you had chance to progress with this at all?
> @erayd have you had chance to progress with this at all?
I'm waiting for feedback currently. It's still unclear to me whether the devs responsible for making the call actually want this functionality, and I don't want to sink a significant amount of work into generalising this feature for merge if it will never be accepted - hence why I'm asking that question here first. @GUVWAF's comments are useful from a technical standpoint, but don't answer that core query.
If there's interest in accepting this feature set (or parts of it), then I'll go ahead and implement it properly. Comments on it generally have been limited in number, but positive.
I raised this again in the discord (you can see what was written).
I think there's appetite for this, or at least parts of it, but it's a big PR as it stands, and each step requires some design choices. As I see it (and bear in mind I'm not a dev for this, and don't know all of the implications) it is made up of the following discrete parts:
- frequency shifting without reboot - has wider application, but needs work on message queue on changeover
- beacon messaging (motd) on a timescale as a function - needs tie-ins with the client apps, and some thought on how to prevent abuse and spam.
- remote control with the announcements on a PSK channel (which feels like the old `Admin` channel, tbh, and may come under the same criticism).
Have I got that right?
> I raised this again in the discord (you can see what was written).
I can't find the relevant post sorry - if you were wanting a response, could you link it please?
> Have I got that right?
Not quite. The logical parts are:
1. Frequency switching on a per-packet basis (fast, no reboot needed, has queueing implications)
2. Virtual node (sends nodeinfo, re-originates messages received via a defined PSK channel)
3. Fixed LONG_FAST beacon to advertise modem settings (to allow auto-discovery by users w/ default settings)
1 & 2 are implemented currently, and have been running on top of 2.6.11 on a number of our high sites for several weeks now. However, the current implementation is not suitable for merge upstream. If the interest is there, then I'm happy to do the work to implement it as a more general feature that can be upstreamed.
3 is not yet implemented, but I intend to implement it for upstreaming if (1) is accepted. Will require coordination with the client apps to be useful. I don't see it as a spam vector, as we can enforce a long interval on it, and it doesn't contain any user-configurable text (these packets need to be tiny in order to minimise airtime).
It is pretty complex so hard to just drop gut feedback, we often wait for @GUVWAF to go first on routing related stuff. I added @oseiler2 as a reviewer as there are potentially security issues, and there is a proposal from @geeksville that is somewhat related https://github.com/meshtastic/firmware/discussions/7440
@Jorropo would be interested in your thoughts as well.
Probably hard for most of the core team to really dig in until after defcon and the 2.7 beta release.
I agree with @garthvh that this is very similar to #7440
Hey, @erayd how's this coming along? Have you had chance to look at where to draw the split lines?
> Hey, @erayd how's this coming along? Have you had chance to look at where to draw the split lines?
I'm still waiting for feedback - @garthvh indicated above that this would likely not be forthcoming until some point after defcon / 2.7 beta, so I'm in something of a holding pattern until I get a clear indication whether this kind of thing actually stands a chance of being merged.
Re the split lines - yes; I'm thinking the logical divisions are as per my first comment (so frequency switching, config beacon, and virtual node).
I've been working on a different feature in the meantime to address transport reliability - should have a (quite large unfortunately) PR for that this week sometime.
Still all silent on the feedback front, but I've had enough people say that they want the radio switching to warrant me doing at least that part of it regardless.
I will implement the radio switching stuff independently as soon as I have the replay thing out the door - so likely in a couple of weeks from now. Will drop progress notes in this PR until I have a separate one ready to go for that.
I personally have no use for the virtual node concept. Or more accurately: The complexity of it and the amount of things that couldn't be taken for granted anymore feels overwhelming, to be honest.
However I'm convinced being able to change the radio settings (not only frequency, but also bandwidth and SF) on the fly is a great feature by itself. It's also a prerequisite for some other good ideas that are already around.
@thebentern does this need a 2.8 tag?