embassy FDCAN peripheral refactor

Contains a (mostly) full ~rewrite~ refactor of the FDCAN peripheral driver.

STM32 controllers contain two variants of this peripheral as far as I can tell:

The H7 series contains the full M_CAN peripheral IP licensed from Bosch. This driver targets this, and therefore the H7 as its primary target.
Other series (like G4) contain a much simplified version of the M_CAN, but the base capabilities are the same. Some simplifications include non configurable message ram, small FIFOs/Queues, etc. These things are mainly just #[cfg(...)]ed out.

The goal of this rewrite is to:

Expose more capabilities of the underlying hardware
Make the driver more usable in a production CAN system
Improve error handling
Improve configurability and flexibility

Feel free to ask follow up questions if I should go more into depth on anything.

TODOs:

[ ] Mostly written against H7 but with the G4 kept in mind. Likely doesn't compile against G4 yet.
[ ] Update test suite.

Some major points:

Message RAM

Configuration structs

At the base of the changes is a new abstraction over the Message RAM. In the full M_CAN peripheral, this memory region is fully configuable, and this driver exposes all of that to the user. This includes full configurability of the different Tx and Rx buffers and queues.

In the G4, message ram is not configurable to the same degree. The same Message RAM abstraction is kept, but is instantiated with the sizing parameters that are hardwired in the hardware.

Code structure

The driver is more or less split in two layers:

A high level interface which the library users would interact with. This deals with maintaining internal state in the driver.
A low level, stateless interface. This is a fairly thin abstraction over the registers, and implements a lot of procedures which are described in the datasheets.

Error handling

The different options available for error handling are described in the config.

Buffered interface removed

I have removed the buffered interface, as I am unsure how useful it actually is in the real world.

For the full H7 interface, the different queues can be configured to be fairly long (up to 32 entries in a lot of cases), so this should in many cases be preferred.

I am not fully opposed to adding a buffered interface for Rx for use in the G4 variant of the peripheral, but I would like to hear from someone who actually needs it before investing in it.

I don't believe a buffered interface for Tx is a good idea at all.

Sep 27 '24 15:09 hansihe

You mention your rewrite exposes more hardware capabilities, and is more usable/configurable/flexible. Can you show examples? i.e. "with the previous API doing X was annoying/impossible, with the new API it's easy thanks to Y"
Why does fixing the above require a ground-up rewrite? Why can't it be done with incremental fixes or additions to the existing driver?

It'd have been nice if you reached out first before dropping a 3000-line "rewrite everything" PR. It'd have allowed you to discuss your issues and maybe find solutions with the existing API, and get some confirmation whether maintainers agree these issues are worth rewriting everything or not.

Same about the buffered CAN. @cschuhen added it, i'm sure they had a reason to need it. It's nice to ask first instead of removing it because you don't know that reason.

Sep 27 '24 17:09 Dirbaio

Thanks for the quick feedback!

I definitely see what you are saying, I realize this was bad form of me. I got a bit too wrapped up in the technical aspects, and in retrospect could have communicated ahead of time instead of dropping this massive PR out of nowhere.

When I started the work on this, it was never meant to be as big of a thing as it turned into, and it grew more out of my own requirements than anything else.

As for why I think this warrants a larger set of changes:

From what I can tell this was written to target the FDCAN in the G4. This is really a simplified and cut down version of the M_CAN peripheral IP from Bosch.
This driver was later extended to work with the full M_CAN, but kept the same limitations of the cut down version.
In order to properly expose a lot of the features of the full version, some pretty deep changes were required (the Message RAM changes)
This snowballed into exposing those capabilities through the higher levels of the driver.

Will provide a more in depth message where I try to expand on some of the limitations I found in the current version of the driver :slightly_smiling_face:

Sep 27 '24 18:09 hansihe

Let me try to write out some limitations I found with the original driver, and how they could be handled with these changes:

Handling bus errors

Previously the Bus_Off recovery procedure was automatically restarted any time. There was no easy way to detect this and react to it.

If you want to write a device that exists on the bus in say a car, you need some way to detect when your device goes into Bus_Off, and react appropriately. In some devices this might include resetting itself, in some disabling itself permanently, and in others rejoining the bus immediately.

With these changes you have the ability to choose an error handling mode. You can have the device rejoin the bus automatically as today, or you can choose to bubble up the error to all senders/receivers or handle the errors centrally in a task.

Using dedicated Rx buffers

The peripheral has the capability to allocate dedicated buffers, and receive messages with certain IDs into those buffers. This functionality is really useful for CAN because you often have devices transmitting status updates at regular intervals on the bus, and using these dedicated slots could enable tasks to only wait for the messages they need without having to drain the FIFO.

With these changes, the dedicated Rx buffers are supported and usable.

Using the two FIFOs independently

The peripheral has two Rx FIFOs. The hardware allows configuring these differently, and routing different message IDs to different ones of them.

With these changes, these two are exposed independently and can be used by, for instance, different tasks.

Sharability between tasks

As is today, you are required to have a mutable reference to the Can struct to be able to transmit and receive. This is not really all that bad in itself, but that limitation really begins to show once support for the multiple Tx/Rx mechanisms the hardware supports are exposed (including the dedicated Rx buffers mentioned above).

Tracking finished transmissions

Before there was a flush method exposed that could be used to wait for a certain Tx slot to finish. This had some limitations:

You were required to provide the slot index, but there was no way to know which slot your message got queued into with the API.
If the struct is made sharable, this API is racy.

A TxRef type is added as the return value from the tx_* functions, which enables you to cancel the transmission and wait for it to be finished.

Message RAM configuration

The H7 variant of the peripheral is really configurable in terms of how its different buffers and queues are configured. You can allocate different slices of the memory to different things in a really flexible way.

Today the configuration is fully static, and limited to the lowest common denominator (the G4 variant), which has a static memory layout and really limited buffer sizes.

The G4 variant is limited to:

2x Rx FIFOs with 3 slots
0 Dedicated Rx buffers
Tx FIFO/Priority queue with 3 slots
0 Dedicated Tx buffers
28 Standard ID filters
8 Extended ID filters

In the full (H7/M_CAN) variant, everything is configurable (limiting factor usually being FDCANRAM size), with up to:

2x Rx FIFOs with 64 slots
64 Dedicated Rx buffers
32 Tx slots (configurable as Tx/FIFO/Dedicated buffers)
128 Standard ID filters
64 Extended ID filters

As you can see this is quite a massive difference, which these changes enable usage of where the HW supports it.

Sep 27 '24 19:09 hansihe

@Dirbaio, thanks for the heads up. I'm on vacation (in Europe actually) at the moment and I'm not really in a position to even look at these patches for at least a couple of weeks.

I certainly would prefer not to see the buffered RX to go.

Even I think buffered TX can have some benefit in some use cases, especially as it allows multiple tasks to easily send. For example, with the Mutex approach, it seems more complicated to first have to (a)wait/unlock the mutex and then await again to send. Though, there are probably other solutions to this also.

Sep 28 '24 20:09 cschuhen

Thanks for chiming in @cschuhen! Hope you are enjoying visiting Europe

For sure, if there is a want to have buffered Rx, I can pull that back in :slightly_smiling_face:

I would also love to hear more about what you are using Buffered Rx for if you are willing to share, that might make it easier for me to understand the use case you have for it.

With the changes in this PR, the driver is designed to be shared across tasks, the tx_* methods do not require a mutable reference to send messages. Do you think this would cover the use cases you were thinking about for Tx?

Sep 28 '24 21:09 hansihe

i'm going to close this since the FDCAN driver has moved forward quite a bit so I don't see a path to getting this merged. Feel free to open issues or (smaller!) PRs to discuss improvements if you still want to.

Jun 22 '25 21:06 Dirbaio