quick-xml icon indicating copy to clipboard operation
quick-xml copied to clipboard

Faster escape routines

Open dralley opened this issue 3 years ago • 2 comments

dralley avatar Jun 29 '22 19:06 dralley

Figuring out how much of an improvement this is (or even if it is one, on average) might be challenging because I get different results on my AMD based desktop and my Intel based laptop (although laptop benchmarks are inconsistent due to power management).

Escaping long inputs is much faster (50-75% faster) on both systems regardless of whether it does any replacements, escaping short inputs on both systems is slower if there are replacements, but my desktop is also slower on short inputs without any replacements whereas my laptop is still faster (just not as much as with the long inputs).

Since most attributes are short and don't contain escapable chars, a regression there concerns me a bit.

Ironically, these instructions seem to have been added explicitly for the benefit of XML parsing, but they were never widely enough used for hardware manufacturers to prioritize the performance of the instructions, so now it's harder to benefit from them. It also seems like they put more effort into the one that works on null-terminated C strings, and less into the one that looks at strings of an explicit length, which is the variant Rust uses... with the result that it is (apparently) measurably much slower than the one designed for C strings. Frustrating.

Supposedly you can get better results with AVX2. Haven't looked into it yet.

https://news.ycombinator.com/item?id=14422098

http://lists.xml.org/archives/xml-dev/202108/msg00000.html

https://web.archive.org/web/20180617042918/https://software.intel.com/en-us/articles/xml-parsing-accelerator-with-intel-streaming-simd-extensions-4-intel-sse4/

https://stackoverflow.com/questions/58901232/why-is-sse4-2-cmpstr-slower-than-regular-code

https://stackoverflow.com/questions/20935769/sse42-sttni-pcmpestrm-is-twice-slower-than-pcmpistrm-is-it-true

dralley avatar Jun 29 '22 20:06 dralley

Codecov Report

Merging #408 (c1d7a06) into master (5bed370) will increase coverage by 0.19%. The diff coverage is 80.00%.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

@@            Coverage Diff             @@
##           master     #408      +/-   ##
==========================================
+ Coverage   64.43%   64.62%   +0.19%     
==========================================
  Files          37       37              
  Lines       17487    17529      +42     
==========================================
+ Hits        11267    11328      +61     
+ Misses       6220     6201      -19     
Flag Coverage Δ
unittests 64.62% <80.00%> (+0.19%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/escapei.rs 14.51% <80.00%> (+1.26%) :arrow_up:

... and 1 file with indirect coverage changes

codecov-commenter avatar Jun 29 '22 20:06 codecov-commenter