icu4x
icu4x copied to clipboard
ZeroTrie types simplification
Currently we have 4 types:
ZeroTrieSimpleAscii- Supports ASCII only (so that it can be const constructed)
- Does not use the perfect hash function (better for small data)
- Does not support span nodes (because ASCII does not need them)
- Maximum size is 2^32
ZeroTriePerfectHash- Supports arbitrary bytes, including UTF-8
- Uses the perfect hash function for branch nodes with at least 16 children
- Uses span nodes
- Maximum size is 2^32 (slight performance win)
ZeroTrieExtendedCapacity- Same as
ZeroTriePerfectHashbut there is no size limit
- Same as
ZeroTrie- An enum over the three above types
I think we should consolidate these down to two types:
ZeroAsciiTrie==ZeroTrieSimpleAsciiZeroBytesTrie==ZeroTrieExtendedCapacity- Remove
ZeroTriePerfectHashbecause the ~5% perf benefit isn't worth keeping around a separate type - Remove
ZeroTriebecause the type of trie should be determined by the programmer based on the nature of the data, not via a runtime switch
OK?
- [x] @Manishearth
- [x] @robertbastian
Worth noting that the byte pattern of a ZeroAsciiTrie can be consumed by a ZeroBytesTrie which means it is forwards-compatible to switch between them.