charset-normalizer-rs

Improvements : Idiomatic code

Open chris-ha458 opened this issue 9 months ago • 11 comments

Re : #3

This codebase was ported from Python, and many of its design patterns could be rewritten as more idiomatic Rust. Doing so would make it easier to improve speed and maintainability, and to ensure the code behaves correctly from a Rust point of view.

Some examples would be replacing manual for loops with iterator chains, using match instead of if chains, etc.
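As a rough illustration of those two patterns (the function names below are made up for this example and do not come from the codebase), each pair computes the same result, first in a Python-port style and then in a more idiomatic style:

```rust
// Hypothetical helpers for illustration only; not taken from charset-normalizer-rs.

/// Index-based loop, typical of a direct port from Python.
fn count_ascii_loop(bytes: &[u8]) -> usize {
    let mut count = 0;
    for i in 0..bytes.len() {
        if bytes[i] < 0x80 {
            count += 1;
        }
    }
    count
}

/// Same result with an iterator chain: no manual indexing, no mutable counter.
fn count_ascii_iter(bytes: &[u8]) -> usize {
    bytes.iter().filter(|b| b.is_ascii()).count()
}

/// Classification written as an if chain.
fn classify_if(b: u8) -> &'static str {
    if b < 0x20 || b == 0x7F {
        "control"
    } else if b < 0x7F {
        "printable"
    } else {
        "non-ascii"
    }
}

/// Same classification with match on ranges; the compiler checks that
/// every possible u8 value is covered.
fn classify_match(b: u8) -> &'static str {
    match b {
        0x00..=0x1F | 0x7F => "control",
        0x20..=0x7E => "printable",
        _ => "non-ascii",
    }
}
```

The iterator and match versions are not automatically faster, but they remove bounds-checked indexing and make the covered cases explicit, which tends to help both review and optimization.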

Many require deeper consideration.

For example, this codebase makes extensive use of f32. Unless intrinsics are used, f64 is as fast as or faster than f32 in Rust. Moreover, casting back and forth between f32 and f64 can harm performance and make it harder to ensure the code is correct. For instance, there are exact comparisons between f32 and f64 variables, and these are very unlikely to behave as intended. If they are intended, it would be valuable to document that and suppress the relevant lints. However, if there is a need to maintain ABI compatibility or follow a specification, f32 might be unavoidable. On-disk size could also be a consideration. In summary, f32 vs f64 handling could improve both how idiomatic the code is and how fast it runs, but only if done right.
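To make the exact-comparison concern concrete, here is a minimal sketch. The function names and the tolerance are assumptions chosen for the example, not code from this repository. Rust does not allow comparing an f32 with an f64 directly, so a cast has to appear somewhere, and the widened f32 still carries its original rounding error:

```rust
// Hypothetical helpers for illustration only; not taken from charset-normalizer-rs.

/// Exact comparison after widening. The f32 -> f64 conversion is lossless,
/// but the f32 was already rounded to f32 precision, so values such as 0.1
/// will not compare equal to the f64 literal of the same decimal.
fn exact_eq_mixed(a: f32, b: f64) -> bool {
    f64::from(a) == b
}

/// Tolerance-based comparison done entirely in f64.
/// The caller chooses the tolerance; there is no universally correct value.
fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() <= tol
}
```

Values that are exactly representable in both types (0.5, for example) still compare equal, which is why this kind of check can pass in some tests and fail on other inputs.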

I will try to prepare some PRs that change some of these things. Despite my best efforts, some of my changes or views may well be based on a flawed understanding of the code, so feel free to explain why things were done the way they were. In such cases I will help with documentation.

chris-ha458 · Sep 24 '23 23:09