Teacup_Firmware
less_atomics branches
I remember some discussion about them, but can't find where it took place any longer.
Anyways. So far I picked the first commit on less_atomics and less_atomics3 (appears to be the same commit). Then the one commit on less_atomics2 on top of that and removed less_atomics2. These two appear to make a point.
Will try to understand the sense of the other commits next.
Nice :+1:
For the atomic read section:
We don't need to save current->live. Below is a simple timeline:
time from now to future events ->
------------------------------------>
now -> previous -> current -> future
As you can see, when we are at now, previous and current are not live. In next step, the lookahead is too slow. Now previous is live and in that case the lookahead will abort by checking previous->live. But current->live can never ever be live when previous is not live. So we don't need to check current anymore.
And the same goes for the ID.
> In next step, the lookahead is too slow. Now previous is live
Not reliably. previous could already be done by the time lookahead finishes, setting dda->live back to zero. Think of very short movements.
There's also the dda->done flag. I forgot why I introduced this, but maybe that's the flag to check.
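The race discussed above can be sketched in plain C. This is only an illustration under assumptions: the struct and function names are hypothetical stand-ins, not Teacup's actual DDA definition, and only the live and done flags are taken from this thread. It shows why a check on previous->live alone can miss a very short move, and how additionally looking at done might close that gap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for a movement queue entry.
   Only the two flags named in this thread are modelled. */
typedef struct {
  volatile uint8_t live;  /* set while the step interrupt executes the move */
  volatile uint8_t done;  /* set once the move has finished */
} dda_sketch_t;

/* Replays the short-move case: previous starts and finishes
   before lookahead gets around to its final check. */
static void short_move_finishes(dda_sketch_t *prev) {
  assert(!prev->done);  /* a move only runs once */
  prev->live = 1;       /* step interrupt picks the move up ... */
  prev->live = 0;       /* ... and completes it right away */
  prev->done = 1;
}

/* Checks only live: passes again as soon as a short move has completed. */
static bool join_is_safe_live_only(const dda_sketch_t *prev) {
  return !prev->live;
}

/* Checks live and done: passes only if previous has neither started
   nor finished, which is the assumption behind "the flag to check". */
static bool join_is_safe_live_and_done(const dda_sketch_t *prev) {
  return !prev->live && !prev->done;
}
```

After short_move_finishes() has run on an entry, join_is_safe_live_only() still returns true, while join_is_safe_live_and_done() returns false.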
You are right. The dda->done flag is a bit hidden. Never saw this.
The flto flag is devil and angel in one. A very small change can make a huge difference in code size. Where the size goes up with the flag at first, it can save a lot later.
less_atomics4 to less_atomics4~2 are only examples. The current order differs.
without flto:
experimental: 22378 bytes
less_atomics4~2: 22312 bytes
less_atomics4~1: 22290 bytes
less_atomics4: 22284 bytes
with flto:
experimental: 22340 bytes
less_atomics4~2: 22344 bytes
less_atomics4~1: 22340 bytes
less_atomics4: 22226 bytes
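To make the numbers above easier to compare, here is a small sketch that computes the bytes saved relative to experimental. The table values are copied from this comment; the type and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Binary sizes in bytes, as listed above. */
typedef struct {
  const char *branch;
  int without_flto;
  int with_flto;
} size_row_t;

static const size_row_t sizes[] = {
  { "experimental",    22378, 22340 },
  { "less_atomics4~2", 22312, 22344 },
  { "less_atomics4~1", 22290, 22340 },
  { "less_atomics4",   22284, 22226 },
};
#define SIZE_ROWS (sizeof(sizes) / sizeof(sizes[0]))

/* Bytes saved by `row` relative to experimental (row 0);
   a negative result means the binary grew. */
static int saved_without_flto(size_t row) {
  assert(row < SIZE_ROWS);
  return sizes[0].without_flto - sizes[row].without_flto;
}

static int saved_with_flto(size_t row) {
  assert(row < SIZE_ROWS);
  return sizes[0].with_flto - sizes[row].with_flto;
}
```

With these numbers, less_atomics4 saves 94 bytes without flto and 114 bytes with flto, while less_atomics4~2 grows by 4 bytes with flto.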
The only difference between less_atomics4~1 and less_atomics4 is the ATOMIC for one 8-bit read.
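As background for why that ATOMIC can go: on an 8-bit AVR a one-byte load is a single instruction, so an interrupt cannot tear it, whereas a wider value takes several loads and does need protection. The sketch below runs on a host, so the SKETCH_ATOMIC_* macros are hypothetical stand-ins that only count critical sections; real AVR code would disable interrupts there, for example with avr-libc's ATOMIC_BLOCK from <util/atomic.h>. The variable names are assumptions too.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-ins for an interrupt-disabling critical section.
   Hypothetical names; they only track nesting and count entries. */
static unsigned atomic_sections_entered = 0;
static int atomic_depth = 0;
#define SKETCH_ATOMIC_START() \
  do { atomic_depth++; atomic_sections_entered++; } while (0)
#define SKETCH_ATOMIC_END() \
  do { assert(atomic_depth > 0); atomic_depth--; } while (0)

static volatile uint8_t  flag_shared;     /* written by the "interrupt" */
static volatile uint32_t counter_shared;  /* written by the "interrupt" */

/* One byte: a single load on an 8-bit CPU, so no critical section. */
static uint8_t read_flag(void) {
  return flag_shared;
}

/* Four bytes: several loads on an 8-bit CPU, so take a consistent copy
   inside a critical section. */
static uint32_t read_counter(void) {
  uint32_t copy;
  SKETCH_ATOMIC_START();
  copy = counter_shared;
  SKETCH_ATOMIC_END();
  return copy;
}
```

Here read_flag() never enters a critical section, while each call to read_counter() enters exactly one.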
One extra side note. Please play with the order of these three calculations and take a look at the code size: https://github.com/Traumflug/Teacup_Firmware/blob/experimental/dda_lookahead.c#L218-L220
To some extent I try to not "play" with the code, but to understand the required algorithm first, then implement that with as few instructions as possible :-)
I don't want to hold you back from playing, but saving a few clock ticks in the lookahead area doesn't give much advantage. It just allows to fill the movement queue 0.1% faster.
If you want something to play with, how about issue #255?
Depending on the order of these three calculations, you can save up to 114 bytes.
Anyhow, I often think about that part of the code. Just yesterday I tried to remove the total_step counter. But the way I did it, this would increase the code size by 500 bytes.