CAN examples for stm32h7 not working

Open FranciscoAmaro24 opened this issue 1 year ago • 24 comments

I've been trying to run the can example provided in the stm32h7 folder but the program seems to constantly get stuck on the CAN initialization line. What could be the issue?

FranciscoAmaro24 avatar Feb 23 '24 13:02 FranciscoAmaro24

I am having similar issues. Will documentation be released at some point for the FDCAN HAL module to aid debugging?

maiaherringfish avatar Feb 27 '24 01:02 maiaherringfish

@maiaherringfish If you get it to work it would be amazing if you could share the solution :D. We are in desperate need of this module in our application.

FranciscoAmaro24 avatar Feb 27 '24 11:02 FranciscoAmaro24

I think I am closer to getting it working today. Have you seen the new example they released last week? Unlike the one they had before, I'm able to configure this one (rather than it hanging on initialization), so you might have some luck too. However, it seems to only configure without issues if I have it set up with an HSE--otherwise the code hardfaults. Working on diagnosing this issue further

maiaherringfish avatar Feb 27 '24 16:02 maiaherringfish

Hey, we kind of got it to work in internal loopback mode, but with the normal CAN configuration we are still having issues. We have been messing around with it a lot and hope it works by tomorrow or the day after. We're also in contact with the guy who made CAN on embassy, but he is Australian so the time zones are quite different.

FranciscoAmaro24 avatar Feb 27 '24 16:02 FranciscoAmaro24

That is great to hear! Can I ask if you are using the HSE or a different clock? This seems to be the cause of some issues for me

maiaherringfish avatar Feb 27 '24 16:02 maiaherringfish

We tried, but ended up falling back on the example's HSE setup.

FranciscoAmaro24 avatar Feb 27 '24 16:02 FranciscoAmaro24

I think my teammate and I found the issue. We noticed there is no logic in rcc/h.rs to set the fdcansel field of the d2ccip1r register, which prevents a clock from being selected correctly and causes a default to HSE.

I still have to verify whether or not the FDCAN frequency is then being identified correctly based on the clock but this should be the start of a solid fix to this issue. When I tested it I no longer received hardfaults from accessing FDCAN::frequency() on config. PR has been started above to address this.

maiaherringfish avatar Feb 27 '24 19:02 maiaherringfish

Please let me know how you both get on with this. Also, by the way, exactly which chips are you using? And which boards, and in particular, what is connected to HSE? Keep in mind that it is recommended that CAN always operate with a crystal or similar source for timing, so you probably want to use a clock that is either HSE or derived from it.

FYI: This is the board that I am using for my H7 testing: https://www.aliexpress.com/item/1005005872938104.html?spm=a2g0o.order_list.order_list_main.31.351e1802DtMyWn

cschuhen avatar Feb 28 '24 08:02 cschuhen

We are using a Nucleo board with the stm32h743zi6, but we also want to make it work on our custom PCB that uses the same chip. Do you think that would be a problem?

FranciscoAmaro24 avatar Feb 28 '24 08:02 FranciscoAmaro24

I wouldn't have thought it to be a problem. Can you describe the rest of your setup? Are you using it alone with some kind of (USB) CAN adaptor connected to a Linux PC? You have terminators in place etc? I usually run with 2 CAN adaptors and make one send packets and the other monitor to ensure the bus is good. If all that is good... what are the options left? I think the core of the FDCAN module is no different, I think that leaves:

  1. Clocks - Does the clock frequency being sent to the FDCAN match the frequency that it thinks it has when computing the timing values?
  2. Pin setup - Is there perhaps something wrong with the alternate function selection for that chip, for example?

cschuhen avatar Feb 28 '24 10:02 cschuhen

I was able to get CAN working using a non-HSE clock (PLL2_Q using HSI w div/mul) after the above PR was merged in. I'm using an STM32H725ZG.

For some reason my visual debugger in vscode doesn't show the frequency correctly (all Hertz values appear as _0), but when I change the bitrate, reflash, and probe with an oscilloscope, I see the desired bitrate reflected in the signal.

maiaherringfish avatar Feb 28 '24 17:02 maiaherringfish

But did you manage to get it to work in normal mode? Loopback mode runs smoothly, but in normal mode we get stuck on transmitting the frame. For some reason the debugger info says that there was a successful transmission, and then it gets stuck the second time it tries. I would really like to know how you did it using PLL2_Q. I had gotten CAN to work with another HAL from Richard Meadows, so I'm trying to find where the error could be and what the major difference is between loopback and normal mode in terms of register settings. I'm still quite new to the field of embedded, so it could be something clear or simple to anyone else.

FranciscoAmaro24 avatar Feb 28 '24 18:02 FranciscoAmaro24

Do you have another device on the CAN bus to receive the message when in normal mode? If not, you will probably get stuck auto retransmitting the message. I am not sure if you're using the example, but if you are, I also commented out the receiving portion so this MCU wouldn't wait to receive a message.

I simply set up PLL2_Q in the peripheral config, and set the fdcan_clock_source to PLL2_Q there as well. I can show how I did this if you need help configuring these things.

The difference between internal loopback and normal mode is indeed just a difference of setting registers. For internal loopback, you set FDCAN_TEST.LBCK to 1 to enable loopback, and FDCAN_CCCR.MON to 1. Otherwise, I believe that in both modes the registers operate pretty much the same way to transmit and receive messages. You may have already seen this, but the reference manual (different from the datasheet) for your chip should have an FDCAN section that outlines how to configure the registers correctly if you want to learn more.
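To make the register difference concrete, here is a minimal sketch that only models the bits involved. The bit positions (MON at bit 5 and TEST at bit 7 of FDCAN_CCCR, LBCK at bit 4 of FDCAN_TEST) are how I read the FDCAN chapter of the reference manual and should be treated as assumptions to verify for your chip. `CanMode` and `mode_bits` are hypothetical names and are not part of embassy.

```rust
// Bit positions as I read them in the FDCAN chapter of the reference manual
// (assumption: verify against the manual for your exact chip).
const CCCR_MON: u32 = 1 << 5; // bus monitoring mode
const CCCR_TEST: u32 = 1 << 7; // unlocks write access to FDCAN_TEST
const TEST_LBCK: u32 = 1 << 4; // loopback enable

#[derive(Clone, Copy, Debug, PartialEq)]
enum CanMode {
    Normal,
    InternalLoopback,
}

/// Hypothetical helper: returns the (FDCAN_CCCR, FDCAN_TEST) bits that
/// differ between the two modes. All other bits are left at zero here.
fn mode_bits(mode: CanMode) -> (u32, u32) {
    match mode {
        CanMode::Normal => (0, 0),
        CanMode::InternalLoopback => (CCCR_TEST | CCCR_MON, TEST_LBCK),
    }
}
```

On real hardware these bits can only be changed while the peripheral is in its initialization/configuration state, which the HAL handles for you.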

Edit: you may also find debugging in STM32CubeIDE helpful to get an initial sense of how the registers are working, since they have debug tools that are excellent for monitoring them and manually programming them. I am using VSCode & probe-rs to debug and finding it very lacking in comparison, not sure if others have the same issues lol.

maiaherringfish avatar Feb 28 '24 18:02 maiaherringfish

let mut config = Config::default();
config.rcc.pll1 = Some(Pll {
    source: PllSource::HSI,
    prediv: PllPreDiv::DIV4,
    mul: PllMul::MUL50,
    divp: Some(PllDiv::DIV8),
    divq: None,
    divr: None,
});

config.rcc.fdcan_clock_source = rcc::FdCanClockSource::PLL1_Q;

let peripherals = embassy_stm32::init(config);

let mut can = can::FdcanConfigurator::new(peripherals.FDCAN1, peripherals.PD0, peripherals.PD1, Irqs);

This might be a big beginner mistake, but why would the hardfault always occur at the CAN configuration line? I've been trying a lot of things but I cannot get it to disappear.

FranciscoAmaro24 avatar Feb 28 '24 21:02 FranciscoAmaro24

I tried your code and got a panic too. Replacing the "None"s with non-None values stopped the panic, but CAN still did not work. Can you try this? It works for me:

let mut config = Config::default();
config.rcc.pll1 = Some(Pll {
    source: PllSource::HSI,
    prediv: PllPreDiv::DIV4,
    mul: PllMul::MUL50,
    divp: Some(PllDiv::DIV8),
    divq: Some(PllDiv::DIV8),
    divr: Some(PllDiv::DIV8),
});

config.rcc.fdcan_clock_source = rcc::FdCanClockSource::PLL1_Q;

cschuhen avatar Feb 28 '24 22:02 cschuhen

@FranciscoAmaro24 it looks like you are not setting PLL1_Q to anything? You have to set config.rcc.pll1.divq to a value or it will not populate a frequency for this clock, resulting in the hard fault

@cschuhen you don't have to assign something to all the div values for it to work, just divq since FDCANSEL (clock selection for FDCAN) only selects between HSE, PLL1_Q, and PLL2_Q

maiaherringfish avatar Feb 28 '24 22:02 maiaherringfish

So, hopefully the above will 'work' for you however I don't expect it to be fully reliable comms as you should expect from CAN because it is not derived from a crystal source. This works on my 25MHz xtal:

let mut config = Config::default();
config.rcc.hse = Some(rcc::Hse {
    freq: embassy_stm32::time::Hertz(25_000_000),
    mode: rcc::HseMode::Oscillator,
});
config.rcc.pll1 = Some(Pll {
    source: PllSource::HSE,
    prediv: PllPreDiv::DIV5,
    mul: PllMul::MUL160,
    divp: Some(PllDiv::DIV8),
    divq: Some(PllDiv::DIV8),
    divr: Some(PllDiv::DIV8),
});

config.rcc.fdcan_clock_source = rcc::FdCanClockSource::PLL1_Q;

With an 8MHz external clock source, something like this might work for you:

let mut config = Config::default();
config.rcc.hse = Some(rcc::Hse {
    freq: embassy_stm32::time::Hertz(8_000_000),
    mode: rcc::HseMode::Bypass,
});
config.rcc.pll1 = Some(Pll {
    source: PllSource::HSE,
    prediv: PllPreDiv::DIV2,
    mul: PllMul::MUL50,
    divp: Some(PllDiv::DIV8),
    divq: Some(PllDiv::DIV8),
    divr: Some(PllDiv::DIV8),
});

config.rcc.fdcan_clock_source = rcc::FdCanClockSource::PLL1_Q;
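For reference, the clock that FDCAN ends up with in these configurations can be worked out by hand. The sketch below uses a hypothetical helper, `pll_q_hz`, that is not part of embassy; it just does the source / prediv * mul / divq arithmetic and returns None when divq is unset, mirroring the case where no PLL1_Q frequency gets populated.

```rust
/// Hypothetical helper, not part of embassy: frequency of a PLL's Q output
/// in Hz, or None when the Q divider is left unset (as with `divq: None`).
fn pll_q_hz(source_hz: u32, prediv: u32, mul: u32, divq: Option<u32>) -> Option<u32> {
    // No Q divider configured means no Q output clock at all.
    let divq = divq?;
    if prediv == 0 || divq == 0 {
        return None;
    }
    // VCO frequency = (source / prediv) * mul, guarded against u32 overflow.
    let vco_hz = (source_hz / prediv).checked_mul(mul)?;
    Some(vco_hz / divq)
}
```

With the values above, the 25MHz crystal gives 25 MHz / 5 * 160 / 8 = 100 MHz and the 8MHz bypass clock gives 8 MHz / 2 * 50 / 8 = 25 MHz. Assuming the HSI runs at 64 MHz with no prescaler, the HSI configuration earlier in the thread also gives 64 MHz / 4 * 50 / 8 = 100 MHz.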

If, when playing with values, you get a panic in the FDCAN code on an unwrap, it is probably because you didn't give FDCAN a clock that could create the exact bitrate that you asked for. Adding debug like this in util.rs can help you see what is going on:

// Check if final bitrate matches the requested
if can_bitrate != (periph_clock / (prescaler * (1 + bs1 + bs2) as u32)) {
    info!("Can bitrate mismatch {} {}",
            can_bitrate,
            (periph_clock / (prescaler * (1 + bs1 + bs2) as u32)));
    return None;
}

I should improve the API to propagate these errors up.
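To check offline whether a given peripheral clock can hit a bitrate exactly, something like the following sketch may help. It is not the embassy util.rs code: `find_exact_timing` is a hypothetical helper, and the limits it uses (8 to 25 time quanta per bit, prescaler up to 512, sample point near 87.5%) are assumptions chosen for illustration.

```rust
/// Hypothetical helper, not embassy's util.rs: search for a
/// (prescaler, bs1, bs2) triple such that
/// periph_clock / (prescaler * (1 + bs1 + bs2)) == bitrate exactly.
fn find_exact_timing(periph_clock: u32, bitrate: u32) -> Option<(u32, u32, u32)> {
    if bitrate == 0 || periph_clock % bitrate != 0 {
        // The bitrate does not divide the clock, so no exact match exists.
        return None;
    }
    // Total number of peripheral clock cycles per bit.
    let cycles_per_bit = periph_clock / bitrate;
    // Prefer more time quanta per bit (assumed range: 8..=25).
    for tq in (8..=25u32).rev() {
        if cycles_per_bit % tq != 0 {
            continue;
        }
        let prescaler = cycles_per_bit / tq;
        if prescaler == 0 || prescaler > 512 {
            continue; // assumed prescaler range
        }
        // Place the sample point at roughly 87.5% of the bit.
        let bs2 = (tq / 8).max(1);
        let bs1 = tq - 1 - bs2;
        return Some((prescaler, bs1, bs2));
    }
    None
}
```

For example, a 100 MHz clock can produce 250 kbit/s exactly (100 MHz / 250 kHz = 400 cycles per bit), whereas 300 kbit/s does not divide 100 MHz evenly, so the function returns None, which corresponds to the situation where the unwrap would panic.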

cschuhen avatar Feb 28 '24 22:02 cschuhen

Please let me know how you go with the above, once we know it works I can improve the documentation/examples to help people understand what might be required for them.

cschuhen avatar Feb 28 '24 22:02 cschuhen

@cschuhen good to know about HSE for CAN. I am wondering, are your concerns about clock tolerance or something else? If it is timing related, I imagine that both loopback and normal mode tests would fail with a less reliable clock, unless you think parasitics on the TX/RX PCB traces are enough to mess up the timing once you switch to normal mode.

maiaherringfish avatar Feb 28 '24 23:02 maiaherringfish

@maiaherringfish, yes, clock tolerance. For example, here is a quote from ISO 11783-2: "Bit time selection generally demands the use of crystal oscillators at all nodes so that the clock tolerance given in Table 1 can be achieved."

Like I'm sure it will work - mostly but you throw away some of the reliability of CAN and one day you might just plug it in and Node A and Node B have drifted enough in frequency to start getting packet errors. I'm sure these oscillators have improved somewhat over the years and sure you can improve things if you implement some procedure to set the trimming but why would you bother unless you are making 10000+ of something a year?

I work in a company that heavily uses CAN and have been doing this for over 20 years. We wouldn't consider dropping the crystal from a CAN node.

I imagine that if you consider using CAN-FD this only gets more important.

This is not about PCB traces on RX/TX. It's about Node A and Node B having a different idea of what 250k is. Node A might be transmitting at 249K and Node B is trying to receive it and ACK at 251K. Loopback will be fine because you only have one node. And again, in most cases things will be fine; it depends on bitrate and it depends on bus length. Common practice by far is to use a crystal.
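To put rough numbers on the 249K versus 251K example, the sketch below estimates how far the receiver's sample point slides between resynchronisation edges. `drift_bit_times` is a hypothetical helper, and the figure of 10 bit times is an assumption about the worst-case gap between recessive-to-dominant edges under classic CAN bit stuffing.

```rust
/// Hypothetical helper: sample-point drift, in receiver bit times,
/// accumulated over `bits` bit periods with no resynchronisation edge.
fn drift_bit_times(tx_bitrate: f64, rx_bitrate: f64, bits: u32) -> f64 {
    // A transmitted bit lasts 1/tx seconds while the receiver expects
    // 1/rx seconds, so the mismatch per bit in receiver bit times is
    // |rx/tx - 1|.
    let per_bit = (rx_bitrate / tx_bitrate - 1.0).abs();
    per_bit * f64::from(bits)
}
```

Calling `drift_bit_times(249_000.0, 251_000.0, 10)` gives roughly 0.08 of a bit time. Whether that is tolerable depends on the synchronisation jump width, the sample point position and the propagation delay on the bus, which is why the margin shrinks with higher bitrates and longer buses.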

cschuhen avatar Feb 28 '24 23:02 cschuhen

Thanks for the explanation! That makes sense to me.

maiaherringfish avatar Feb 28 '24 23:02 maiaherringfish

Great explanation! Thank you so much for all the help and attention given to this issue :D

FranciscoAmaro24 avatar Feb 29 '24 10:02 FranciscoAmaro24

It works!!! Thank you so much for all the help it was really incredible for our project!

FranciscoAmaro24 avatar Feb 29 '24 11:02 FranciscoAmaro24

No worries. Great to hear. Is that with the HSE option I gave? Did you try both?

cschuhen avatar Feb 29 '24 11:02 cschuhen

Yes, sorry for the late reply.

FranciscoAmaro24 avatar Mar 08 '24 14:03 FranciscoAmaro24