ogonek
Enable carrying validation and normalization info on types
Right now an instance of text carries information about its encoding in the type, in order to enforce conversions at the appropriate places. It would be neat if text could optionally also enforce a normalization form. This would allow for optimisations in the equivalence and hashing functions, since sequences already known to share a normal form can be compared or hashed directly without normalizing first. That would translate into improved performance when using text as a key in maps, for example.
To allow similar optimisation, make all ranges based on normalizing iterators carry information about their normal form in the type.
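As a rough illustration of the idea, here is a minimal sketch of a type that carries its normal form as a type parameter, so that an equivalence check can skip normalization when both operands are statically known to be in the same form. All names here (tagged_text, nfc, nfd, unnormalized, canonically_equivalent, normalize_placeholder) are hypothetical and are not ogonek's actual API. The normalization step is a placeholder that only counts calls and returns its input unchanged; it does not implement Unicode normalization.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tags naming the normal form a sequence is known to be in.
struct nfc {};
struct nfd {};
struct unnormalized {};

// Counts how often the (expensive) normalization step runs; illustration only.
inline int& normalization_calls() {
    static int count = 0;
    return count;
}

// Placeholder for a real normalization routine. It records the call and
// returns the input unchanged; it does NOT perform Unicode normalization.
inline std::u32string normalize_placeholder(std::u32string const& s) {
    ++normalization_calls();
    return s;
}

// Hypothetical text type carrying its normal form as a type parameter.
template <typename Form>
class tagged_text {
public:
    explicit tagged_text(std::u32string s) : storage_(std::move(s)) {}
    std::u32string const& storage() const { return storage_; }
private:
    std::u32string storage_;
};

// True when both forms are the same *known* normal form.
template <typename F1, typename F2>
struct same_known_form
    : std::integral_constant<bool,
          std::is_same<F1, F2>::value &&
          !std::is_same<F1, unnormalized>::value> {};

// Fast path: both sides share a normal form, so comparing code points suffices.
template <typename F1, typename F2>
bool equivalent_impl(tagged_text<F1> const& a, tagged_text<F2> const& b,
                     std::true_type) {
    return a.storage() == b.storage();
}

// Slow path: bring both sides to a common form before comparing.
template <typename F1, typename F2>
bool equivalent_impl(tagged_text<F1> const& a, tagged_text<F2> const& b,
                     std::false_type) {
    return normalize_placeholder(a.storage()) ==
           normalize_placeholder(b.storage());
}

// Dispatches at compile time based on the forms carried in the types.
template <typename F1, typename F2>
bool canonically_equivalent(tagged_text<F1> const& a,
                            tagged_text<F2> const& b) {
    return equivalent_impl(a, b, same_known_form<F1, F2>());
}
```

Comparing two tagged_text<nfc> values takes the fast path and never calls the normalizer, while comparing a tagged_text<nfc> with a tagged_text<unnormalized> falls back to normalizing both sides. A hashing function could be dispatched the same way.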
The text types also carry information about the validity of the sequence. This property is already used in various places to optimise away redundant validation. There are, however, more ranges that have the same characteristics and would enable the same optimisations. Make all ranges based on decoding or normalizing iterators be considered validated.
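One possible shape for this, sketched under assumptions: a trait marks range types that are known to contain only valid Unicode scalar values, and the validation entry point skips its scan for those types. The names is_validated, raw_range, decoded_range, ensure_valid and validation_runs are hypothetical and do not come from ogonek. The decoded_range type merely stands in for the output of a decoding range and performs no decoding itself.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical trait: is this range type statically known to be valid?
template <typename Range>
struct is_validated : std::false_type {};

// Stand-in for arbitrary, unchecked input.
struct raw_range {
    std::vector<char32_t> code_points;
};

// Stand-in for the output of a decoding range (no decoding is done here).
struct decoded_range {
    std::vector<char32_t> code_points;
};

// Decoding only ever yields valid scalar values, so mark the type as validated.
template <>
struct is_validated<decoded_range> : std::true_type {};

// Counts how many full validation scans actually run; illustration only.
inline int& validation_runs() {
    static int count = 0;
    return count;
}

// A Unicode scalar value is any code point up to U+10FFFF except surrogates.
inline bool is_scalar_value(char32_t u) {
    return u <= 0x10FFFF && !(u >= 0xD800 && u <= 0xDFFF);
}

// Unvalidated ranges: scan every element.
template <typename Range>
bool check_valid(Range const& r, std::false_type) {
    ++validation_runs();
    for (char32_t u : r.code_points) {
        if (!is_scalar_value(u)) return false;
    }
    return true;
}

// Validated ranges: nothing to do, the type already guarantees validity.
template <typename Range>
bool check_valid(Range const&, std::true_type) {
    return true;
}

// Entry point: picks the scan or the no-op at compile time.
template <typename Range>
bool ensure_valid(Range const& r) {
    return check_valid(r, is_validated<Range>());
}
```

A range built on a normalizing iterator could specialise the same trait, so that passing it to ensure_valid costs nothing at run time.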
This will require a full rewrite of the iterator bits, which will become range objects instead of clunky iterator pairs. Work is currently happening on the iterators-must-go branch.