inkcpp Optimisation: Replace linear searches with binary searches for more consistent performance.

In a middle-sized story in a release build, I was seeing good performance at one end (100us or so per getline) vs. bad performance at the other end (300us). This change puts getline under 100us for the entire story.

Define file format with C/C++ structs rather than code.
Describe file sections in header so we don't need to scan the whole thing on load.
Remove vestiges of Endian-swapping and make read_list_flag a free function.
Bump format version.
Fix UTF-8 test (needed to use prefix)
Rename compiler's _containers stream to _instructions since that's what it stores.
Remove iterate_containers, add find_container_for, find_container_id, container_data and container_offset.
Implement find_container_for and find_offset_for with upper_bound.
Store expanded information about containers (container_data) including tree structure.
Rewrite jump using new toolkit - update ip, unwind stack, then generate new stack with search/tree walk.
Move little bit of container entry logic out of globals_impl::visit.

NB: This might be too much change? My 2p is that the speed improvements are worthwhile and the changes to the file format get it into a better shape which is more clearly defined and easier to extend.

I spotted more avenues for optimisation but trying not to get too distracted by them :)

Dec 23 '25 08:12 willvale

At first thanks for your PR and 2p. Run-time optimisation was low priority. But the project has reached a state where I appreciate work in this direction.

Luckily we have tests so I'm quite confident in your work.

I will take a deeper look soon.

Feel free to open an issue if you like to discuss further optimisation.

Please keep an eye on the feature/migration branch.

One change in it is that every comment now has a 32bit payload. This allows easier code navigation for migration. (Also the node reconstruction will profit from your optimisation)

Dec 23 '25 21:12 JBenda

Completely missed this message, sorry. I had a look at your branch - which is aimed at supporting version migration of save games? It might be good to cherry-pick the code quality changes (signed/unsigned consistency etc.) into main so they don't make the branch hard to compare.

I see the instruction payload - which isn't used at runtime but is used as some kind of instruction ID (container ID?) so you can better migrate the instruction pointer between versions?

I did wonder about this approach as it means you can map instantly from instruction to container, but didn't want to make the instruction stream wider.

If you had binary data from old and new stories, you might be able to migrate by generating the container ID from the old story and applying it to the new? That would avoid the need to have larger instruction data at runtime.

Jan 08 '26 21:01 willvale