less_atomics branches

Open Traumflug opened this issue 7 years ago • 7 comments

I remember some discussion about them, but can't find its place any longer.

Anyways. So far I picked the first commit on less_atomics and less_atomics3 (appears to be the same commit). Then the one commit on less_atomics2 on top of that and removed less_atomics2. These two appear to make a point.

Will try to understand the sense of the other commits next.

Traumflug avatar Dec 09 '16 21:12 Traumflug

Nice :+1:

For the atomic read section: We don't need to save the current->live. Below a simple timeline:

time from now to future events ->
------------------------------------>
now -> previous -> current -> future

As you can see, when we are at now, previous and current are not live. In the next step, suppose the lookahead is too slow. Now previous is live, and in that case the lookahead will abort by checking previous->live. But current can never be live while previous is not live, so we don't need to check current->live anymore.

The same goes for the ID.

Wurstnase avatar Dec 10 '16 11:12 Wurstnase

> In next step, the lookahead is too slow. Now previous is live

Not reliably. previous could already be done by the time the lookahead finishes, which sets dda->live back to zero. Think of very short movements.

There's also the dda->done flag. I forgot why I introduced this, but maybe that's the flag to check.
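
To make the short-movement case concrete, here is a minimal sketch. The struct and both helper functions are hypothetical stand-ins, not Teacup's actual DDA code; it only assumes that `live` is set while a move is being stepped and `done` is set once it has completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a queue entry. It holds only the
   two flags discussed here; in real firmware they would be written by
   the stepper interrupt. */
typedef struct {
  volatile uint8_t live;  /* 1 while the move is being stepped */
  volatile uint8_t done;  /* 1 once the move has completed */
} move_flags_t;

/* Checks `live` only. This also returns true for a move that started
   AND finished while the lookahead was still calculating. */
static bool untouched_live_only(const move_flags_t *prev) {
  assert(prev != NULL);
  return !prev->live;
}

/* Checks both flags. Returns true only if the move has not started. */
static bool untouched_live_and_done(const move_flags_t *prev) {
  assert(prev != NULL);
  return !prev->live && !prev->done;
}
```

With a finished move (`live == 0`, `done == 1`), the first helper reports the move as untouched, while the second one does not. Whether reading the two flags together needs its own atomic section depends on how they are stored, which this sketch leaves open.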

Traumflug avatar Dec 10 '16 11:12 Traumflug

You are right. The dda->done flag is a bit hidden. Never saw this.

Wurstnase avatar Dec 10 '16 11:12 Wurstnase

The -flto flag is devil and angel in one person. A very small change can make a huge difference in code size. Where the size goes up with the flag at first, it can save a lot later.

less_atomics4 to less_atomics4~2 are only examples. The current order differs.

without flto:
experimental:    22378 bytes
less_atomics4~2: 22312 bytes
less_atomics4~1: 22290 bytes
less_atomics4:   22284 bytes

with flto:
experimental:    22340 bytes
less_atomics4~2: 22344 bytes
less_atomics4~1: 22340 bytes
less_atomics4:   22226 bytes

The only difference between less_atomics4~1 and less_atomics4 is the ATOMIC for one 8bit read.
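
For readers unfamiliar with the pattern, a rough sketch of what dropping the guard around a single-byte read looks like. The `SKETCH_ATOMIC_*` macros, the variables and the functions are host-side stand-ins invented for illustration; they only count critical sections, whereas on an AVR the real guard would save the status register and disable interrupts. The sketch relies on the fact that an 8-bit CPU loads one byte with a single instruction, so an interrupt cannot tear that load, while a 32-bit value takes several loads.

```c
#include <stdint.h>

/* Host-side stand-ins (assumptions): count how often a critical
   section is entered instead of touching interrupt state. */
static unsigned critical_sections = 0;
#define SKETCH_ATOMIC_START do { critical_sections++; } while (0)
#define SKETCH_ATOMIC_END   do { } while (0)

static volatile uint8_t  flag8;      /* hypothetical 8-bit flag */
static volatile uint32_t counter32;  /* hypothetical 32-bit counter */

/* One byte: a single load on an 8-bit CPU, so no guard is used. */
uint8_t read_flag(void) {
  return flag8;
}

/* Four bytes: several loads on an 8-bit CPU, so the read is guarded
   to keep an interrupt from changing the value halfway through. */
uint32_t read_counter(void) {
  uint32_t copy;
  SKETCH_ATOMIC_START;
  copy = counter32;
  SKETCH_ATOMIC_END;
  return copy;
}
```

This only shows the shape of the change. It does not explain why the size difference is so much larger with -flto than without.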

Wurstnase avatar Dec 10 '16 20:12 Wurstnase

One extra side note. Please play with the order of these three calculations and take a look at the code size: https://github.com/Traumflug/Teacup_Firmware/blob/experimental/dda_lookahead.c#L218-L220

Wurstnase avatar Dec 10 '16 20:12 Wurstnase

To some extent I try to not "play" with the code, but to understand the required algorithm first, then implement that with as few instructions as possible :-)

I don't want to hold you back from playing, but saving a few clock ticks in the lookahead area doesn't give much advantage. It just allows filling the movement queue 0.1% faster.

If you want something to play with, how about issue #255?

Traumflug avatar Dec 10 '16 22:12 Traumflug

Depending on the order of these three calculations, you will save up to 114 bytes.

Anyhow, I often think about that part of the code. Just yesterday I tried to remove the total_step counter. But the way I did it, this would increase the code size by 500 bytes.

Wurstnase avatar Dec 11 '16 06:12 Wurstnase