rust-csv
rust-csv copied to clipboard
Non-allocating trim method
Hey, thanks for the library.
When I was using it I saw that a lot of time was spent in the trim method and stumbled over the comment about making it non-allocating.
I took a stab at it and cargo bench shows a big improvement. Let me know if that is something that you would be interested in , than I can keep working on it a bit more.
Below is the output of the included bench functions for trimming. I think they are showing the best case though since the test file seems to contain no trimmable white-space. Also what is confusing to me is that the StringRecord benchmark also benefits from improvements to the ByteRecord trim method, not sure where this is coming from.
I cleaned up the code a bit and added a bench case that actually has to remove whitespace. There the difference is less stark, but still pretty big.
data:image/s3,"s3://crabby-images/cd83b/cd83b67685138d3fc7dff8b8a1ba971a8ccd510f" alt="Screen Shot 2021-10-02 at 12 36 10"
data:image/s3,"s3://crabby-images/07c05/07c05e2f91837fbf7abc0930d0fa07de673bcda4" alt="Screen Shot 2021-10-02 at 12 37 02"
Thank you for this pull request. This yields a significant performance improvement in our project.
@BurntSushi what would be the next steps to consider this PR?
I need to review it. From a quick glance, it looks like there's some clean up required here. For example, there shouldn't be a "trim original" routine in the public API.
CI also obviously needs to be fixed.
@do4gr hello, thank you at lot for this. Do you plan to work on this PR again? Otherwise I can try to polish what you did in order to get it merged.
I'll remove the trim_original and then we'll see whether the PR run succeed. I think it was just flakey CI.
Apart from that from my side the PR would be done. There is nothing more I was thinking of including.
That's it from my side. Let me know if there is more that needs to be done.