craftinginterpreters icon indicating copy to clipboard operation
craftinginterpreters copied to clipboard

Design Discussion: Long Jumps

Open chrisjbreisch opened this issue 2 years ago • 3 comments

From the text:

A 16-bit offset lets us jump over up to 65,535 bytes of code, which should be plenty for our needs.

and

Some instruction sets have separate “long” jump instructions that take larger operands for when you need to jump a greater distance.

To be clear on the first quote, we can jump forward 32,767 bytes and backwards 32,768.

I've been wondering how I could implement a long jump with a single pass compiler. It seems as if by the time I realize that I've exceeded the 16-bit range, I've already set aside two bytes for my jump, and I can't go back and change it.

So, here's my thought. I'll limit myself a bit more on the size of my short jumps. Exactly a bit, actually. I'll only use the last fifteen bits, so I can only go forward 16,383 and backwards 16,384. I'll use the first bit to indicate a long jump, and in that case, the last 15 bits represent an index into a long jump table. Yes, I'll only have 32,768 possible long jumps, but that seems like plenty. :) I can give my entries in the long jump table a full 32-bits, which hopefully will be enough for any sized jump I want to make.

I suppose theoretically, I could repeat the process here, with another signal bit that would indicate a long long jump, and another table with 64-bit jumps.

But let's not get crazy. :)

Why not just use the long jump table for everything? Because it's an extra level of indirection, that frankly, I'll rarely need. Make the general case as fast as possible.

This seems like a reasonable approach, and I can't think of anything better that I can do in a single pass. The one downside is that I've cut my short jump size in half. Which is rather depressing. But I can't think of any way around it. Or have I missed something?

chrisjbreisch avatar Jul 06 '22 11:07 chrisjbreisch

I just stopped doing "short" jumps entirely, tbh. Fetching one extra byte from the stack has a neglibible performance impact (from my testing) at this level of abstraction so I deemed the reduced complexity a worthy enough reason to only use 24-bit offsets

nocturn9x avatar Jul 06 '22 11:07 nocturn9x

Thanks, @nocturn9x. I should point out that I misspoke earlier. I had forgotten that we use OP_LOOP for negative jumps, so we do get the full 65,535 bytes in both directions, and with my approach I would get 32,767 in both directions.

But I think I will look at your approach. I assume that you did something similar then with OP_CONSTANT to not limit yourself to 255 constants, rather than implementing an OP_CONSTANT_LONG.

chrisjbreisch avatar Jul 06 '22 13:07 chrisjbreisch

If you want to check a more lox-like implementation, try japl. For a much different implementation of a statically-typed language (still taking some inspiration from lox), I have peon.

I assume that you did something similar then with OP_CONSTANT to not limit yourself to 255 constants, rather than implementing an OP_CONSTANT_LONG.

Correct.

nocturn9x avatar Jul 06 '22 14:07 nocturn9x