[stdlib] Fix `String.split()` implementations

Open martinvuyk opened this issue 5 months ago • 4 comments

Fix the String.split() implementations to use StringSlice and to stop assuming that indexing is by byte offset. Some important optimizations were also added.
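For context on why the byte-offset assumption matters, here is a rough Python sketch (not the Mojo code in this PR). `split_by_bytes` is a hypothetical helper written only for illustration. It scans the UTF-8 bytes and slices by byte offset, which is only safe because every offset is used on the byte buffer and never as a character index.

```python
def split_by_bytes(text: str, sep: str) -> list[str]:
    """Split `text` on `sep` by scanning UTF-8 bytes (hypothetical helper)."""
    data = text.encode("utf-8")
    needle = sep.encode("utf-8")
    if not needle:
        raise ValueError("empty separator")
    parts = []
    start = 0  # byte offset into `data`, never a character index
    while True:
        pos = data.find(needle, start)
        if pos == -1:
            # No more separators: the rest of the buffer is the last part.
            parts.append(data[start:].decode("utf-8"))
            return parts
        parts.append(data[start:pos].decode("utf-8"))
        start = pos + len(needle)


text = "héllo,wörld,🔥"
char_index = text.find(",")                  # index in code points
byte_index = text.encode("utf-8").find(b",")  # index in bytes

# The two indices differ as soon as a multi-byte character precedes the separator.
print(char_index, byte_index)
print(split_by_bytes(text, ","))
```

Because UTF-8 is self-synchronizing, a byte-level search for a valid separator only matches on character boundaries, so the byte slices always decode cleanly. The bug class this PR addresses shows up when a byte offset is later treated as a character index, or the other way around.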

Using the code in issue #3460 as a benchmark, this new implementation is now around 1.6x slower than Python, down from 1.8x to 2.6x. Following the trail of potential culprits, ~I still come to the same conclusion as in PR #3577: memset needs to be usable for lists with trivial types such as UInt8 (ideally without needing the hint_trivial_reg_type param) so that they can use SIMD copies, which will be faster.~ Edit: Even after refactoring and avoiding List.resize(), I can't get below 1.5x slower than Python.

Edit2: Even after removing some overhead from List.append(), the effect is negligible. There is still something to fix elsewhere, and I'm already extending this PR too much. I will do a follow-up to hunt the problem down.

martinvuyk, Sep 22 '24 23:09