RIOT icon indicating copy to clipboard operation
RIOT copied to clipboard

drivers/slipmux: Add to RIOT

Open Teufelchen1 opened this issue 8 months ago • 4 comments

Contribution description

Hey 🪼,

this adds an (incomplete) implementation of Slipmux to RIOT. It extends the current slip driver and is backwards compatible.

The second commit adds an usage example, where the shell commands are exposed to coap and through it to slipmux. That commit will be dropped prio merging.

It is incomplete as frame abort is missing.

Testing procedure

Grab your favorite slipmux tool and connect RIOTs stdio to it. Make sure to build RIOT with USEMODULE += slipdev_stdio and optionally set CFLAGS += "-DCONFIG_NANOCOAP_BLOCK_SIZE_EXP_MAX=10" to be able to handle coap packets that are longer than 64 bytes. See the changes I made to the default example.

Feedback wanted: coap server

My current approach spawns a new thread with a nanocoap server to handle the incoming and outgoing configuration frames. This might cause issues when another coap server is in use (one that is connected to the network instead of just the uart). For example, the NANOCOAP_RESOURCE(..)s will be shared between them.

I also experimented with sending the coap messages with a fake ip6+udp header into gnrc. One can then only run one single coap server (and it is no longer bound to nanocoap) who listens for such packets. The outgoing way is a bit more annoying, here a new dummy interface must be created, to which gnrc can direct those fake ip6+udp headers. Very similar to how it is already done with the current slip implementation.

What alternative approaches do you see? Any idea is appreciated.

Teufelchen1 avatar Apr 17 '25 09:04 Teufelchen1

I don't quite understand the purpose if this. When you already have a SLIP connection, you might as well do CoAP over UDP.

Sure, with this approach you can omit the IP and UDP header, but is that really worth the hassle?

benpicco avatar Apr 17 '25 13:04 benpicco

From the byte-wise overhead, it is not that much of a difference yes. I don't think it is that much of a hassle tho. Beside, as soon as you use IP+UDP you have endpoints in a network and those need to be configured (I need to know the IP I want to talk to). Doing slipmux configuration frames you have a guaranteed point to point channel with no configuration nor hassle ;)

Teufelchen1 avatar Apr 17 '25 14:04 Teufelchen1

I can't comment on the code right now, but maybe for the open aspects:

  • "Why?" Teufelchen already mentioned the configuration aspect; without precise knowledge of the concrete application (and a point of using SLIP is that this knowledge is not required) the host may not even know that it reaches the board this way. In particular,

    • this connection may be more trusted than a network connection: While I generally advocate no-TOFU / all-links-are-untrusted / cryptographically-secure-or-bust policy, TOFU policies do have their places, and TOFU over a link that is known not to be networked is preferable over a link where one would need to apply heuristics to know whether a peer is local. (Like, sure, might do link-local-only, but maybe there's NDP-proxying involved?).
    • This might be a good fit also for devices that don't do network at all, or not on that interface. On those, actual SLIP packets would be blackhole'd, but CoAP provides an interface that is easier to automate than shell interactions, and moreover easily ported to devices where no serial line but network is available.
    • Where we do have Ethernet-over-USB, I think that a dedicated CoAP-over-USB protocol would be preferred. However, I'm not aware of any such attempt, and even then, there are relatively new devices around still that still run through the moral equivalent of an FTDI chip.
  • "Which CoAP server to run?" I think we should run the same single CoAP server on all transports as a matter of simplicity; where a UART connection is more trusted than a network peer, which transport it is should be discoverable by inspecting the socket-ish thing it comes in over. I don't have an opinion yet on which implementation strategy I'd advocate for.

chrysn avatar Apr 17 '25 21:04 chrysn

Murdock results

:heavy_check_mark: PASSED

166ffcb027828078c5536df0a171435180caeb45 driver/slip: Add slipmux frames

Success Failures Total Runtime
10504 0 10504 20m:50s

Artifacts

riot-ci avatar May 16 '25 09:05 riot-ci

@benpicco I think this is ready for another review. If that is positive, I will drop the shell/example related commit and squash the others.

Teufelchen1 avatar Jun 17 '25 09:06 Teufelchen1

@benpicco ping :)

Teufelchen1 avatar Jun 23 '25 11:06 Teufelchen1

Some high-level comments from my side:

  • incomplete implementation

this adds an (incomplete) implementation of Slipmux to RIOT. It extends the current slip driver and is backwards compatible. (...) It is incomplete as frame abort is missing.

Isn't it also missing diagnostic transfer? Just had a quick glance at the code, but could only find references to configuration frames.

  • documentation: before merging this should definitely get documentation, probably on the defined pseudomodule slipdev_config. Not sure if this may even warrant its own doc.md somewhere? It should definitely mention slipmux and clearly list what is currently supported. Also some notes on the implementation (currently spawning an additional thread, sharing NANOCOAP_RESOURCE with any other nanocoap server) should be in the documentation.

  • coap server with different backends: I would hope unicoap could help with that. I'm pretty sure it supports defining resources only for a subset of driver backends (e.g. just slipmux and not udp), which means that a single unicoap server could be used for several backends. Maybe @carl-tud has some more comments on this?

  • nanocoap/nanocoap_sock: could you maybe explain in a bit more detail how this actually works right now? I see the "server thread" just receives chunks from the ringbuffer, tries to parse and answer them (using nanocoap). Every chunk is supposed to be a CoAP packet as received within slipdev, right? Where do the NANOCOAP_RESOURCEs (which are part of nanocoap_sock afaict) actually come into play?

mguetschow avatar Jun 30 '25 15:06 mguetschow

Isn't it also missing diagnostic transfer? Just had a quick glance at the code, but could only find references to configuration frames.

That's because those were already implemented in RIOT and are just called stdio via slip. Blame @miri64 for that :p If you enable this, every printf() will be send slipmux encoded.

[..needs doc...] Not sure if this may even warrant its own doc.md somewhere?

Eventually, yes. There is still work to be done on the coap-handling/server side. If I can get away with minimal documentation this time, I would appreciate. I fear it will be duplicated work if I re-implement it later (for example with unicoap). Plus, as mentioned in a comment with benpicco, the handling of the pseudomodule is not ideal in the moment. For example, it is not possible to enable configuration transfer without packet transfer. And even if you could, nanocoap depends on udp, which means you would still pull in a lot of wasted code & bytes. Both issues are scheduled for follow-up work.

I see the "server thread" just receives chunks from the ringbuffer, tries to parse and answer them (using nanocoap). Every chunk is supposed to be a CoAP packet as received within slipdev, right?

That is correct.

Where do the NANOCOAP_RESOURCEs (which are part of nanocoap_sock afaict) actually come into play?

That's nanocoap specific, I do not interact with them myself (beside the shell stuff but thats just an example of what one could do and will be dropped before merging). Afaik it is just a XFA defining names and callbacks. The nanocoap server just magically knows that XFA and looks up the endpoints on the fly.

Teufelchen1 avatar Jun 30 '25 15:06 Teufelchen1

If I can get away with minimal documentation this time, I would appreciate. (...) Both issues are scheduled for follow-up work.

Okay, I get your point and also don't want you to spend time inefficiently. However, I would appreciate some very short documentation for now (probably in the pseudomodule file then) just mentioning:

  • this is slipmux configuration with a link to the draft
  • @warning stating this is WIP / experimental work
  • briefly list the current shortcomings you mention above (or, alternatively link to this PR)

Just trying to safeguard against follow-up work being stalled at some point, where we would end up with yet another feature not documented at all.

The nanocoap server just magically knows that XFA and looks up the endpoints on the fly.

Ah, so coap_handle_req (https://github.com/RIOT-OS/RIOT/pull/21418/files#diff-b7bda8876512e2948e36a07d7926bd6d15c4b7bb1f043e832daaae113aadbb32R462) is actually a nanocoap_sock function making use of the XFA? I somehow assumed coap_*-prefixed functions where only part of the nanocoap parser. What a mess :'D

mguetschow avatar Jul 01 '25 08:07 mguetschow

Just trying to safeguard against follow-up work being stalled at some point, where we would end up with yet another feature not documented at all.

Yes, that's your job :p Speaking of non documentation, I am either overlooking something or slip has already been poorly documented :/ Either way, I roughly agree with you but there are remarks.

[Add] very short documentation for now (probably in the pseudomodule file then) just mentioning:

I opted against that and followed the example of the already existing documentation for slipdev_stdio.

this is slipmux configuration with a link to the draft

Yes, I added the links. Fun fact, I also fixed the broken link of SLIP dev itself :D

@warning stating this is WIP / experimental work

Not sure tbh. So far, it not claimed anywhere this is "rfc conform", as such, it can not be incomplete. It might be considered WIP? But what would that even mean for a pseudomodule that already works exactly like one would imagine? (still, you could persuade me to add that)

briefly list the current shortcomings you mention above (or, alternatively link to this PR)

There are no shortcomings for the user as "slipmux conform" is not claimed. Things would look different if one would enable a slipmux module but you don't. e.g. If one just uses slip (without stdio/configuration), the code is conform to SLIP rfc but not to Slipmux draft. This mess is part of the idea I outlined a bit before:

In future work, one will have to flip it on it's head. There will be one module called 'slipmux' that will enable all three frame types and be slipmux-draft-conform. There will be three sub-modules each guarding one of the three frame types, enabling you to only have partial slipmux. There will be a module called slip that is a alias for the submodule that enables the packet transfer and that will be slip-rfc conform. (Just how I imagine it, nothing concrete yet)

Teufelchen1 avatar Jul 01 '25 11:07 Teufelchen1

Thanks for adding some documentation!

There are no shortcomings for the user as "slipmux conform" is not claimed.

Maybe not shortcomings, but some things users should be aware of when using your module: there is an extra thread spawned, incoming coap requests over the wire are handled according to any NANOCOAP_RESOURCEs present in the binary. Basically a very short "how to use"/"how is this useful".

In future work, one will have to flip it on it's head. There will be one module called 'slipmux' that will enable all three frame types and be slipmux-draft-conform.

Sounds reasonable to me. Then why not briefly mention this in the documentation as well, so that users are aware this is probably only a temporary API (as in "module names" etc.)

mguetschow avatar Jul 01 '25 12:07 mguetschow

Looks good from my side now, haven't tested it though. Do you have suggestions for slipmux implementations to test this with?

mguetschow avatar Jul 02 '25 08:07 mguetschow

I have been testing against this one: https://crates.io/crates/slipmux (disclaimer, I am the author). There really aren't other viable implementations out there. There are debris by lobaro but I didn't bothered understanding that ~mess~ code base (I don't think it is intended as a library but just their own utility for their purpose which is fine). I also asked on the core-wg mailing list but no one knows about working implementations.

tl;dr: I've been actively using this for weeks now in my SeCrEt PrOjEcT. Works well for config+diagnostic. Never used networking but I didn't touch that code meaningful so I am confident that didn't degrade.

Teufelchen1 avatar Jul 02 '25 10:07 Teufelchen1

@mguetschow

  • coap server with different backends: I would hope unicoap could help with that. I'm pretty sure it supports defining resources only for a subset of driver backends (e.g. just slipmux and not udp), which means that a single unicoap server could be used for several backends. Maybe @carl-tud has some more comments on this?

a, The notion of a CoAP server is loosely defined. A driver can open any number of sockets or for that matter 'concepts' similar to sockets. Each driver then forwards all packets to 'the unicoap server' after having removed the outer frame (UDP headers, etc). Information on what transport has been used is retained and optionally used to restrict resources to certain transports. I you really want to, you can handle this manually. There is an API for that. Right, now you can only have one 'unicoap server`.

b, Is this also what Lasse wants?: multiple ports and thus multiple sockets? I yes, I talked to Lasse off channel and we agreed on solution I have yet to implement. But, adding this to unicoap is almost trivial.

carl-tud avatar Jul 02 '25 11:07 carl-tud

Awesome, unicoap will fit perfectly into this.

Teufelchen1 avatar Jul 02 '25 13:07 Teufelchen1

@miri64 you are still requesting change

Teufelchen1 avatar Jul 04 '25 13:07 Teufelchen1