DawgSharp
Use shorts instead of ints if the number of nodes is < 32K
My main use case involved 2.5M words that were represented using 150K nodes. Other users, however, might have fewer than 32K nodes, in which case they could benefit from a 2x cut in memory footprint (as if we haven't done enough!). It might be worth raising this limit to 64K-1 by using an unsigned type, since we only need to reserve one special value.
If we are doing this, we may also want to think of users who might have more than 2G nodes and add a ulong version. Unlikely, but future-ready. This would also require adding a long version of GetNodeCount().
I.e. add a second type parameter, TIndex, to YaleDawg<TPayload>. TIndex can be byte, ushort, uint or ulong, and the required specialization is instantiated based on the total number of nodes.
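A minimal sketch of how the index width could be picked from the node count, assuming the one special value is each type's MaxValue. The `IndexWidth` class and its `Choose` method are hypothetical names used for illustration only; they are not part of DawgSharp.

```csharp
using System;

// Hypothetical helper, not part of DawgSharp.
public static class IndexWidth
{
    // Returns the narrowest unsigned integer type whose range can hold
    // node indices 0..nodeCount-1 while keeping MaxValue free as the
    // single reserved special value (an assumption of this sketch).
    public static Type Choose(ulong nodeCount)
    {
        if (nodeCount <= byte.MaxValue) return typeof(byte);      // up to 255 nodes
        if (nodeCount <= ushort.MaxValue) return typeof(ushort);  // up to 64K-1 nodes
        if (nodeCount <= uint.MaxValue) return typeof(uint);      // up to 4G-1 nodes
        return typeof(ulong);
    }
}
```

With this rule, the 150K-node case above would still map to `uint`, while anything up to 64K-1 nodes would fit in `ushort`. The returned `Type` could then be used to decide which specialization to construct.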