Single cycle delay / shortest deterministic delay (time consuming NOP)

Open kjsdf7wr3 opened this issue 3 years ago • 4 comments

Hi,

Sometimes one wants to insert a delay of a few nanoseconds into the code. This is usually achieved by inserting multiple single_cycle_delay() calls; however, CMSIS provides no such function. I think for M0, M3/M4, etc. this could be __asm("mov r0, r0");.

Because such a function is not available, you see other people misusing __NOP, which is only supposed to be used for code alignment.

Can we add such a function to the CMSIS Core, which will give the shortest deterministic delay without side effects?
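
A minimal sketch of what the requested helper might look like with GCC/Clang inline assembly. The name single_cycle_delay comes from the request above and is not part of CMSIS; the non-Arm fallback branch and the "cc" clobber are assumptions added so the sketch builds on any host. Whether the instruction really costs exactly one cycle still depends on the core and the memory system, so it would have to be verified per device.

```c
/* Hypothetical helper; CMSIS-Core does not provide this function. */
static inline void single_cycle_delay(void)
{
#if defined(__arm__) || defined(__thumb__)
    /* Register-to-itself move, as proposed in the request. "volatile"
       keeps the compiler from deleting or merging the statement. Some
       assemblers pick a flag-setting encoding for a low-register MOV,
       so the condition flags are declared as clobbered (assumption). */
    __asm volatile ("mov r0, r0" ::: "cc");
#else
    /* Non-Arm build: compiler barrier only, so the sketch still compiles.
       It emits no instruction and therefore gives no delay at all. */
    __asm volatile ("" ::: "memory");
#endif
}
```

Calling it several times in a row, e.g. single_cycle_delay(); single_cycle_delay();, would be the intended way to build up a short delay.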

kjsdf7wr3 avatar Aug 13 '21 09:08 kjsdf7wr3

Deterministic single cycle instruction timing seems like it might be difficult to guarantee on systems without constant access time instruction memory - you might have to guarantee no DMA accesses to the same memory, no Flash erases, function preloaded and locked into the cache, things like that.

If a delay of no fewer than N cycles is enough, it's easier.
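
A rough sketch of the "no fewer than N cycles" variant, assuming the core clock frequency is known. Both function names (cycles_for_ns, delay_at_least) are hypothetical and not part of CMSIS. The conversion rounds up so the result never undershoots, and the loop only promises a lower bound: every iteration has to load, decrement, store and branch, so it takes at least one cycle, usually several.

```c
#include <stdint.h>

/* Hypothetical helper: smallest number of core cycles that covers `ns`
   nanoseconds at `core_hz`, rounded up. 64-bit math avoids overflow of
   the intermediate product; the result is truncated to 32 bits. */
static inline uint32_t cycles_for_ns(uint32_t ns, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)ns * core_hz + 999999999ull) / 1000000000ull);
}

/* Hypothetical helper: busy-wait for at least `cycles` core cycles.
   The volatile counter forces the compiler to keep every iteration. */
static inline void delay_at_least(uint32_t cycles)
{
    volatile uint32_t remaining = cycles;
    while (remaining != 0u) {
        remaining = remaining - 1u;
    }
}
```

For example, delay_at_least(cycles_for_ns(100u, 48000000u)) asks for 5 iterations, since 100 ns at 48 MHz is 4.8 cycles. The real delay will be longer than that, which is acceptable when only a minimum matters.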

rsaxvc avatar Aug 13 '21 20:08 rsaxvc

That makes sense. Basically we need a NOP that is guaranteed to be time consuming.

kjsdf7wr3 avatar Aug 16 '21 12:08 kjsdf7wr3

Hi all,

After discussing this with the team, I think this is not easy to solve in a generic way. If I am not mistaken, __NOP is guaranteed to consume at least a single processor cycle. But as already stated above, it can be more depending on memory access, cache, or even interrupts kicking in.

You may want to take a look at the DAP reference implementation, which contains two sorts of delays (fast and slow). To make them reasonably accurate, they need to be tuned at compile time for the hardware in use. Overall, we think it might be challenging to give a generic solution as part of CMSIS.
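
To illustrate the idea of compile-time tuned fast and slow delays, here is a simplified sketch. It is not the DAP code itself: the macro and function names (BOARD_FAST_DELAY_STEPS, BOARD_SLOW_DELAY_COUNT, pin_delay_fast, pin_delay_slow) and the default values are made up for this example, and the right values would have to be found by measuring on the target hardware.

```c
#include <stdint.h>

/* Hypothetical per-board tuning knobs, overridable at compile time,
   e.g. with -DBOARD_FAST_DELAY_STEPS=3. Defaults are placeholders. */
#ifndef BOARD_FAST_DELAY_STEPS
#define BOARD_FAST_DELAY_STEPS 2
#endif
#ifndef BOARD_SLOW_DELAY_COUNT
#define BOARD_SLOW_DELAY_COUNT 3u
#endif

/* Fast delay: zero to three NOPs selected by the preprocessor, so there
   is no loop overhead. NOP timing is not guaranteed, hence the tuning. */
static inline void pin_delay_fast(void)
{
#if (BOARD_FAST_DELAY_STEPS >= 1)
    __asm volatile ("nop");
#endif
#if (BOARD_FAST_DELAY_STEPS >= 2)
    __asm volatile ("nop");
#endif
#if (BOARD_FAST_DELAY_STEPS >= 3)
    __asm volatile ("nop");
#endif
}

/* Slow delay: counted busy loop; the volatile counter keeps the
   compiler from optimizing the loop away. */
static inline void pin_delay_slow(uint32_t count)
{
    volatile uint32_t n = count;
    while (n != 0u) {
        n = n - 1u;
    }
}
```

A caller would use pin_delay_fast() between closely spaced pin toggles and pin_delay_slow(BOARD_SLOW_DELAY_COUNT) for longer gaps, then adjust both knobs after checking the result with a scope or logic analyzer.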

I am very happy to accept your contribution if you can come up with a good approach.

Thanks, Jonatan

JonatanAntoni avatar Sep 08 '21 11:09 JonatanAntoni

Quote from the Cortex-M0 Devices Generic User Guide:

NOP performs no operation and is not guaranteed to be time consuming. The processor might remove it from the pipeline before it reaches the execution stage. Use NOP for padding, for example to place the subsequent instructions on a 64-bit boundary.

They give no guarantee that a __NOP consumes at least a single cycle. Assuming a single __NOP consumes at least one cycle, what happens to consecutive __NOP calls?

This page gives some interesting timing measurements between __NOP and a mov instruction without side effects.

So it seems one has to assume that __NOP consumes at least one cycle, but ARM does not guarantee it.

kjsdf7wr3 avatar Sep 09 '21 11:09 kjsdf7wr3