r216-forth
r216-forth copied to clipboard
Use a hash table for word lookup
Specifically, a hash table with 32 buckets and the hash function name => ((length(name) << 2) ^ name[0] ^ name[1] ^ name[2]) & 0x1F
, where name[k]
evaluates to 0
if length(name) < k
. 32 buckets seemed like a sane choice because the 5 LSB of human ASCII input tends to be the most varied.
The buckets reside at find_hashtable
and hold heads to linked lists that work the exact same way var_latest
worked, except find
traverses the list in the appropriate bucket and create
appends to the list in the appropriate bucket. var_latest
is thus reduced in function to pointing to the most recently defined word, rather than to a linked list of all existing words.
This means the relatively easy to manage static linked list of words is now a set of 32 such linked lists, which is slightly more difficult to manage, but still not unbearably so: a Lua (5.1 + bit module = luajit) script is provided to check the consistency of the hash table and its linked lists, intended to be used as follows:
$ path/to/tptasm/main.lua model=R2... target=prog.bin export_labels=prog.labels prog.asm
$ path/to/htcheck.lua prog.bin prog.labels
all 112 links are good
The script either succeeds as above, or fails and provides advice on how to achieve consistency, such as:
star_link is in bucket 0x00, should be in bucket 0x16
in which case star_link
would have to be unlinked from the list that starts at find_hashtable.b00
and appended to the list that starts at find_hashtable.b16
.
The original linked list was converted to the set of new linked lists simply by initializing all of find_hashtable
to 0
, copying the original value of var_latest
to find_hashtable.b00
, and repeatedly running htcheck.lua and fixing links until it reported success.
This speeds up word lookup considerably, though testing has shown that not as much time is spent looking up words as I had expected, so the overall speedup is not earth-shattering: compiling factorial
from the README takes 50% less time than it used to.
Also migrate to tptasm (amounts to adding an %include "common"
at the top) and remove references to the unused word_ptr
, one of which would produce a warning.