lbForth Undefined behaviour in C target

@pipcet reported a case of undefined behaviour in the C target. This may affect SP@ and SP!.

Oct 30 '16 06:10 larsbrinkhoff

There is no way in lbForth to push the value of the return stack pointer to the return stack, so RP@ and RP! should not have the same problem.

Oct 30 '16 06:10 larsbrinkhoff

Well, the good thing is that things blow up right away if the "wrong" definition is chosen by the compiler—dup doesn't work (or at least it didn't for me).

If I'm reading the code correctly, rp@ is equivalent to RP @, so it would be nice and symmetric if sp@ were equivalent to SP @. However, I know very little about Forth and I don't know whether it's standardized the other way.

Of course the geek solution is to produce "push %esp" assembly code, which many CPUs (used to) get "wrong" ;)

Note that

?: sp@   SP @ cell + ;
?: rp@   RP @ cell + ;

looks symmetric, but really isn't: in the rp@ case, we're adjusting the stack for the exit implied by ;, but in the sp case, there's no real reason for the adjustment.

Oct 30 '16 06:10 pipcet

SP@ and RP@ aren't standardized. But they do exist in many Forths, so there are expectations on how they should work.

My x86 target does use push %esp. (Also 68000, ARM, PDP-11.) I haven't seen any problem, but then I haven't ran the cold on any old CPU. Which get it wrong?

Good insight about the difference between the SP and RP adjustment. It's expected that the address returned by SP@ is the stack pointer before the call. Inside the definition, pushing SP increases stack depth. Then the stack pointer is fetched by @. So that value must be adjusted.

Oct 30 '16 07:10 larsbrinkhoff

Apparently it was longer ago than I thought—8086s apparently used the other result for push %sp. I thought some non-Intel CPUs competing with the 80486 got it wrong.

My mistake was to think of fetching the stack pointer with @ as a three-step process: pop the TOS, dereference it, push the new TOS. The first step decreases stack depth again, so no adjustment would be necessary in my version. It sounds like you're thinking of @ as an atomic operation that doesn't modify the stack pointer?

(OT: it's quite a hack, but I finally got the asm.js version to make it to the "ok" prompt. It's not as fast as I'd hoped, but about twice as fast as basic JavaScript in the SpiderMonkey shell.)

Oct 30 '16 08:10 pipcet

Yes, I was thinking as @ as an atomic operations. My assembler versions don't touch the stack pointer. But now that I look more closely, my C @ does update the stack pointer.

I started a JavaScript target some time ago, but I ran into a dead end. It's still in the js-disaster branch. Feel free to submit a pull request if you like!

Oct 30 '16 09:10 larsbrinkhoff

I made a tentative fix in the fix-undef branch.

Oct 30 '16 19:10 larsbrinkhoff

That fix makes it obvious that side effects happen before the deref, so that works for me.

I still have a very slight preference for : sp@ SP @ ;, as in https://github.com/pipcet/lbForth/compare/master...pipcet:change-sp-fetch?expand=1, but that would require all ports to change, and it'd make the code uglier for ports that don't touch the stack pointer...

Nov 01 '16 17:11 pipcet

I agree that : sp@ sp @ ; looks cute, but I'd like to stick with the Forth philosophy: keep it simple. Exactly what is more simple may be a matter of taste, of course.

Nov 01 '16 18:11 larsbrinkhoff

I see now there are more cases of undefined behaviour. There are warnings that dereferencing type-punned pointers will break aliasing rules. I guess I have been too liberal casting pointers.

Apr 21 '17 13:04 larsbrinkhoff