charset-normalizer-rs
Improvements : Speed
As per our discussion in #2, the following speed improvements have been suggested:
- calc coherence & mess in threads
- or calc mess for plugins in threads (or some async?)
- or something else...
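
As a rough sketch of the first suggestion, the two measures could run side by side with scoped threads from the standard library. The functions `mess_ratio` and `coherence_ratio` are hypothetical stand-ins, not the crate's actual scoring code; only the threading pattern is the point here.

```rust
use std::thread;

// Hypothetical stand-in for a "mess" measure: fraction of non-ASCII characters.
fn mess_ratio(text: &str) -> f32 {
    let total = text.chars().count();
    if total == 0 {
        return 0.0;
    }
    let suspicious = text.chars().filter(|c| !c.is_ascii()).count();
    suspicious as f32 / total as f32
}

// Hypothetical stand-in for a "coherence" measure: fraction of alphabetic characters.
fn coherence_ratio(text: &str) -> f32 {
    let total = text.chars().count();
    if total == 0 {
        return 0.0;
    }
    let letters = text.chars().filter(|c| c.is_alphabetic()).count();
    letters as f32 / total as f32
}

// Compute both measures concurrently. Scoped threads may borrow `text`
// because the scope joins every thread before it returns.
fn score_in_threads(text: &str) -> (f32, f32) {
    thread::scope(|s| {
        let mess = s.spawn(|| mess_ratio(text));
        let coherence = s.spawn(|| coherence_ratio(text));
        (mess.join().unwrap(), coherence.join().unwrap())
    })
}

fn main() {
    let (mess, coherence) = score_in_threads("hello wörld");
    println!("mess = {mess:.3}, coherence = {coherence:.3}");
}
```

Spawning threads has a fixed cost, so this probably only pays off when each measure does a meaningful amount of work per call.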
The paths I had in mind were these:
- Related to the threads idea: use `Rayon`
- Replace `HashMap` with the concurrent `DashMap`
  - (The current `std` `HashMap` already works with Rayon's parallel iterators, so this is not strictly necessary, but it might be worth looking into regardless.)
- ~Replace the hashing algorithm used in `HashMap`~ with `FxHash`, `AHash`, or `HighwayHash`
  - aHash implemented in #14
- ~Replace `sort()` with `sort_unstable()`~ #6
- ~Identify preallocation opportunities~. For instance, replace `Vec::new()` with `Vec::with_capacity()`
  - It seems most of the current `new()` calls cannot really preallocate because the required capacity is uncertain. The default allocation strategy is optimized well enough that, unless we have a strong idea of the memory access pattern, allocating up front is not helpful.
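
For reference, the hasher swap in the list above boils down to supplying a different `BuildHasher` as the third type parameter of `HashMap`. Below is a minimal std-only sketch. The `Fnv1a` hasher, the `FastMap` alias, and `count_bytes` are hypothetical names for illustration; FNV-1a is only a stand-in here, and a real change would plug in the hasher from a crate such as aHash instead.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Toy FNV-1a hasher, used only as a stand-in for FxHash/aHash/HighwayHash.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

// A HashMap that uses the custom hasher instead of the default SipHash.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

// Hypothetical helper: count how often each byte value occurs.
fn count_bytes(data: &[u8]) -> FastMap<u8, usize> {
    let mut counts = FastMap::default();
    for &b in data {
        *counts.entry(b).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_bytes(b"abca");
    println!("'a' occurs {:?} times", counts.get(&b'a'));
}
```

Because only the type alias changes, call sites that use `entry`, `get`, and `insert` stay the same. Note that `HashMap::new()` is only defined for the default hasher, so such call sites would need `default()` instead.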
Many of these are low-hanging fruit and relate to refactoring the code into idiomatic Rust.
For example, there are many `for` loops in this code. Iterator-based code is more idiomatic, easier to parallelize with `rayon`, and interacts better with allocation. (Pushing items from within a `for` loop can cause multiple allocations and copies, while collecting an iterator can allow fewer allocations.)
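
To make the loop versus iterator point concrete, here is a small sketch. The functions `lengths_loop` and `lengths_iter` are made up for illustration and are not taken from the crate.

```rust
// Loop version: the Vec starts empty and may reallocate as it grows.
fn lengths_loop(words: &[&str]) -> Vec<usize> {
    let mut out = Vec::new();
    for w in words {
        out.push(w.len());
    }
    out
}

// Iterator version: collect() can use the iterator's size hint
// to reserve space before any element is written.
fn lengths_iter(words: &[&str]) -> Vec<usize> {
    words.iter().map(|w| w.len()).collect()
}

fn main() {
    let words = ["charset", "normalizer", "rs"];
    println!("{:?}", lengths_loop(&words));
    println!("{:?}", lengths_iter(&words));
}
```

With the `rayon` crate and its prelude in scope, the iterator version can usually be parallelized by changing `iter()` to `par_iter()`, which is much harder to do with the hand-written loop.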