stm32f4xx-hal
stm32f4xx-hal copied to clipboard
Incorrect clock speed selected for SPI results in overclocking and non-functional code.
The code here selects incorrect baud rates:
https://github.com/stm32-rs/stm32f4xx-hal/blob/master/src/spi.rs#L600-L610
Datasheet:

Walkthough:
- clock.0 / freq.0 = 4 (4.5, rounded down because the result is an integer)
- 3..=5 => 0b001 because 4 >=3 and 4 <= 5.
- 0b001 = fPCLK/4
- fPCLK = 90,000,000
- fPCLK/4 = 22,500,000
so, given 90,000,000 clock speed, and a desired SPI baud rate of 20,000,000 it calculates a divider of 4. and if you take 90,000,000 and ACTUALLY divide it by 4 you get 22,500,000 which will result in the device being overclocked.
Using a high baud rate will mean that code that looks correct but may not function, with the user of the code no idea why. i.e. if the specified baud rate is equal or less than the maximum speed rated in manufacturers datasheet for a slave device but the code chooses an out-of-spec baud rate the user will be left wondering why.
Also, clock drift due to temperature rise is real, so a device that may initially work (due to lower temperature) may stop working as it heats up and the clock drifts. Overclocking exasperates this further as the slave device may already be clocked higher than it's specification.
The code should round downwards and use a lower baud rate if the exact one cannot be calculated due to lack of fractional baud rate generator.
Link to relevant code: https://github.com/stm32-rs/stm32f4xx-hal/blob/a8dd1fdc13c6084965ce61fc3cfa312e97598eef/src/spi.rs#L291-L301
(Commits changed where those lines are on master, this is a fixed permalink)
The code should round downwards and use a lower baud rate if the exact one cannot be calculated due to lack of fractional baud rate generator.
What exact ranges for each level do you suggest?
The only requirement is that the final baud rate must be EQUAL or LOWER than the desired baud rate. In my project I used an iterative approach instead of the match approach.
I wrote a fairly generic divider calculator, the main part goes like this:
let mut divider = min_divider;
let mut shift = 0;
while shift < divider_count {
let actual_frequency = input_frequency / divider;
if actual_frequency <= desired_frequency {
return Ok((divider, actual_frequency, shift));
}
divider *= 2;
shift += 1;
}
and tested like this:
#[test]
fn use_8bit_divider_real_world() {
// when
let result= super::DividerCalculator::calculate(2, 8, 90_000_000, 20_000_000);
// then
assert!(result.is_ok());
// and
if let Ok((divider, resulting_frequency, shift)) = result {
assert_eq!((divider, resulting_frequency, shift), (8, 11_250_000, 2)); // can't get exactly 20Mhz from 90Mhz clock, use the next higher divider to avoid overclocking
}
}
I had a similar problem when using the SPI peripheral to send data to WS2811 LED controllers ("neopixel" style devices) - these have relatively tight tolerances for SPI data rates, since they derive clocking from the data line . For this purpose it might be useful for the caller to be able to specify min/max permissible data, or alternatively to provide a means of querying the actual SPI data rate?
For this purpose it might be useful for the caller to be able to specify min/max permissible data
Something like this?
pub fn with_frequency_range(..., freq: core::ops::Range<Hertz>, clocks: &Clocks) -> Result<Self, ConfigError> {
...
}
you need the actual frequency in order to be able to determine how long it will take to send data. imho, any flexible api should return the 3 calculated values to the caller as per my example above; the divider, the resulting frequency and the 'shift' value which can be used directly for the br bits.
Adding with_frequency_range might be useful, but the signature proposed above doesn't give the caller a way to know the actual frequency. so being able to query the resulting frequency would then also be needed too.
my example above is very generic and can likely be used for many other dividers.
I suppose you can either look at the clock setup narrowly for the SPI peripheral itself, in which case probably an API which allows the caller to request a maximum clock speed, and returns the actual attained clock speed would be a good idea.
Because the attainable range of SPI clocks is dependent on the setup of the the rest of the MCU, the clocks could also be considered more globally (and I probably hadn't disentangled the two in my previous comment).
If considering clock setup globally instead, then the ultimate nice-to-have would be a compile time computation (constant evaluation, or a build.rs etc.) which takes a set of constraints (e.g. min/max values, and perhaps hints to minimise or maximise a particular value), and generates one or more clock configurations for the system overall (since clocks for different peripherals, and the mcu core itself are co-dependant), or otherwise fails at compile-time with a constraint violation error. This should have the minimum runtime code size and computation time overhead.
I believe this is similar to the ST Cube clock configuration (but I've not used it myself).
yes, I'm doing a lot of stuff at run-time, hence my approach.
For low-power applications you drop the cpu clock speed, at run-time, and then you need to reduce the divider to bump up the SPI clock again.