firmware icon indicating copy to clipboard operation
firmware copied to clipboard

Investigate QMesh, Reticulum and doing FEC in software

Open geeksville opened this issue 4 years ago • 18 comments

Lots of good ideas in https://github.com/faydr/QMesh/wiki/Theory (and associated project) by @faydr

Thanks @CycloMies

geeksville avatar Jun 17 '20 22:06 geeksville

In addition to this investigation @Anelito reminds us to more rigorously investigate/consider https://github.com/markqvist/Reticulum by @markqvist.

geeksville avatar Jun 21 '20 20:06 geeksville

Feel free to ask me any questions you might have about Reticulum, and I'll do my best to answer :)

markqvist avatar Jun 21 '20 21:06 markqvist

QMesh guy here -- I'm happy to answer any questions you may have about QMesh and the ideas underlying it.

One piece of advice: you should strongly consider doing your own forward error correction on top of/instead of the Hamming codes built into the LoRa PHY. Using a better FEC does seem to improve the Rx sensitivity by at least a few dB, and isn't that hard to implement.

faydr avatar Jun 22 '20 15:06 faydr

Feel free to ask me any questions you might have about Reticulum, and I'll do my best to answer :)

@markqvist Is Reticulum intended to be a Delay-Tolerant Network (DTN) protocol? I was looking at the repo, and besides the reference to "high latency", I see a filename called "bundle.py". Since the most famous DTN is called the Bundle Protocol (BP), I was wondering if you were using ideas/algorithms/code from it.

faydr avatar Jun 22 '20 15:06 faydr

thanks ya'll. will do!

geeksville avatar Jun 22 '20 16:06 geeksville

wow - good point about doing FEC in software! I would have never thought of that if you hadn't mentioned it @faydr .

geeksville avatar Jun 22 '20 16:06 geeksville

This issue has been mentioned on Meshtastic. There might be relevant details there:

https://meshtastic.discourse.group/t/ideas-for-future-developments-a-recap/494/2

geeksville avatar Jun 22 '20 16:06 geeksville

@faydr Yes, Reticulum can operate in both real-time and delay tolerant modes. As you guessed, the Bundle class naming is inspired from RFC 5050, since it nicely conveys the functionality. I’m currently implementing the DTN features, and I expect to push them to the repo in July.

You will then be able to use delay-tolerant transport, both for single packets, and for entire bundles of data, where custody for the bundle is transferred hop-for-hop through the network.

markqvist avatar Jun 22 '20 17:06 markqvist

Thank you @markqvist and @faydr for joining the discussion!

@geeksville, as the Meshtastic gets more complex, I do suggest following the OSI model.

While the cryptographic "engine" is well separated from the naive flooding protocol, there might be some parts of the protocol code to be separated more clearly?

Especially, when the layer 1 (PHY) and FEC handlings are implemented. Meshtastic could be run, not only on LoRa, but also on other types of PHY interfaces in the future (if the code layers could support a clear way to implement those).

cyclomies avatar Jun 23 '20 10:06 cyclomies

Just wondering if all 3 of you are still chatting? Meshtastic, Reticulum and Qmesh.

Steve

stevewa2066 avatar Sep 16 '21 22:09 stevewa2066

Just wondering if all 3 of you are still chatting? Meshtastic, Reticulum and Qmesh.

Steve

stevewa2066 avatar Sep 16 '21 22:09 stevewa2066

I'm still around, if anyone has any questions about QMesh.

faydr avatar Sep 21 '21 02:09 faydr

I'm tacking this onto the https://github.com/meshtastic/Meshtastic-device/projects/1 would be good to implement before it becomes too much work.

sachaw avatar Dec 29 '21 03:12 sachaw

As of now, FEC is being used (implemented in hardware)

sachaw avatar Dec 29 '21 04:12 sachaw

I believe with the work @mc-hamster has done on the new routing, this can be closed?

sachaw avatar May 07 '22 15:05 sachaw

This should be kept open. It’s a different routing proposal.

mc-hamster avatar May 07 '22 15:05 mc-hamster

This should be kept open. It’s a different routing proposal.

mc-hamster avatar May 07 '22 15:05 mc-hamster

If I understand correctly, the hardware FEC would be the hardware FEC in the LoRa chipsets. This algorithm is inferior to other algorithms that could be done in software, like Golay, BCH, Reed-Solomon, convolutional coding, LDPC, etc.

faydr avatar May 07 '22 15:05 faydr

Alas, no one has picked this up.

garthvh avatar Sep 12 '22 13:09 garthvh

I'm now starting to play around a bit with Meshtastic. My brief perusing of the firmware source makes me think that it wouldn't be too difficult to add in some Reed-Solomon FEC into LoRa packet payload. Doing so should help lower PER in the presence of interference.

I'm willing and able to work on this if there's still some interest.

faydr avatar Jan 13 '23 21:01 faydr

We take advantage of the FEC afforded by the lora protocol in the modem configuration. No need to do it again in software. By the time the packet gets to us, it's already well qualified.

mc-hamster avatar Jan 14 '23 00:01 mc-hamster

Thing is, the FEC built into the LoRa modem is not very good, it's just a simplistic Hamming code. Better FEC can do a better job (more bandwidth-efficient, better interference resistance, etc.). See below for an example of how using a better FEC algorithm helps improve range:

https://www.slideshare.net/haystacktech/how-to-triple-the-range-of-lora

faydr avatar Jan 14 '23 18:01 faydr

Hi @faydr

Interesting stuff!

I was actually under the impression that there was not really a way to do this with the common LoRa chips, but after reading the slides you linked, and going a bit deeper into the SX1276 datasheet, I see that this is a very interesting prospect actually!

Do you have any experience with implementing LDPC schemes? Maybe you would be interested in collaborating on implementing such a mode for the RNode Firmware? It would be very, very useful for use with Reticulum and physically mobile applications like Sideband (when combined with an RNode).

Maybe we can beat their "Haystack XR Mode" performance :D

markqvist avatar Jan 14 '23 21:01 markqvist

A while ago I looked at Arduino-compatible FEC implementations in software. There're not that many that can handle variable payload lengths. Eventually I found one: https://github.com/Warchant/reed-solomon_syndrome_gf256 Though, these algorithms require heavy computations and you would need to configure the radio to even pass on packets that have a failed CRC, so it would also cost quite some clock cycles. Not sure if a link budget improvement of 2-3dB is worth it, but we can only find out how much it actually is when we try it :)

GUVWAF avatar Jan 14 '23 21:01 GUVWAF

A while ago I looked at Arduino-compatible FEC implementations in software. There're not that many that can handle variable payload lengths. Eventually I found one: https://github.com/Warchant/reed-solomon_syndrome_gf256

Though, these algorithms require heavy computations and you would need to configure the radio to even pass on packets that have a failed CRC, so it would also cost quite some clock cycles. Not sure if a link budget improvement of 2-3dB is worth it, but we can only find out how much it actually is when we try it :)

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

mc-hamster avatar Jan 14 '23 23:01 mc-hamster

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

It's hard to give exact numbers. I've read 2-3dB improvement somewhere, and the above slideshow claims even tripling of range. Usually the algorithms are flexible with the amount of error correcting bits (i.e. redundancy) you add. The more, the higher the link budget improvement, but it eats the throughput.

Right now Meshtastic uses a LoRa Coding Rate of 4/8, but if that is really a bad FEC, it might be better to lower that in order to increase throughput, and add more error correcting bits to a better FEC implementation.

GUVWAF avatar Jan 15 '23 00:01 GUVWAF

@GUVWAV -- My WAG is that using RS FEC would improve sensitivity by 1-3dB, perhaps more if there's a lot of busty inference in the band.

As for CPU usage -- the fairly fast MCUs (64MHz M4 for the nRF52 seems to be the minimum hardware used for Meshtastic) should have enough CPU for decoding to not be that big of a deal. The M17 project, for example, uses convolutional coding as one of the FEC algorithms. What can be tricky with using convolutional coding is properly interleaving the bits, which can be made more challenging with variable-length packets. LDPC, Turbo, and Polar codes are more CPU (and memory) intensive than convolutional or RS, but there's probably something that we can make work for the low datarates supported by Meshtastic.

I've been looking at the Meshtastic code, and it seems pretty well-structured and easy to work with. As a result, I might try to build my own QMesh router into Meshtastic and see how well it works out. @markqvist In the process of doing this, I might give another stab at writing/porting and MCU-friendly LDPC encoder/decoder.

faydr avatar Jan 15 '23 03:01 faydr

Compared to what we are using now, what sort of improvement in error correction, sensitivity and throughput would we realize?

It's hard to give exact numbers. I've read 2-3dB improvement somewhere, and the above slideshow claims even tripling of range. Usually the algorithms are flexible with the amount of error correcting bits (i.e. redundancy) you add. The more, the higher the link budget improvement, but it eats the throughput.

Right now Meshtastic uses a LoRa Coding Rate of 4/8, but if that is really a bad FEC, it might be better to lower that in order to increase throughput, and add more error correcting bits to a better FEC implementation.

Good intel!

mc-hamster avatar Jan 15 '23 08:01 mc-hamster