coreutils

wc: Use SIMD for performance

Open ArniDagur opened this issue 4 years ago • 5 comments

wc would benefit from using SIMD.

See:

  • https://github.com/expr-fi/fastlwc
  • https://github.com/Freaky/cw

Both projects are MIT licensed and the latter is written in Rust. Perhaps one could copy-paste some of the code from there.

ArniDagur, Mar 25 '21 05:03

@ArniDagur did you contact the upstream authors? If we do merge one of these projects, it should be done in coordination with them!

sylvestre, Mar 25 '21 07:03

> @ArniDagur did you contact the upstream authors?

No.

> If we do merge one of these projects, it should be done in coordination with them!

Sure. It's not a legal requirement given the license compatibility, but I agree it's good practice. However, I think we'd most probably just borrow some routines -- not entire programs.

ArniDagur, Mar 25 '21 14:03

> Sure. It's not a legal requirement given the license compatibility, but I agree it's good practice. However, I think we'd most probably just borrow some routines -- not entire programs.

IMO, it would still be inappropriate. If you want to merge someone else's code in, it is best to contact them before doing it.

ycd, Mar 25 '21 21:03

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Jan 13 '23 12:01

Hasn't been fixed yet

tertsdiepraam, Jan 13 '23 22:01

Seemed like a free win so I looked into it...

The line count is easily "simd-able", and we might already be doing something like that with the bytecount library.
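To illustrate why the line count is the easy part, here is a minimal std-only sketch in Rust. It is not the code wc or the bytecount crate uses; the function name `count_newlines` and the chunk size of 64 are assumptions made for this example. The inner loop is branch-free with a fixed trip count, which is the shape compilers can often (but are not guaranteed to) turn into SIMD compares.

```rust
/// Count b'\n' bytes in `buf`.
/// Hypothetical helper for illustration; not taken from wc or bytecount.
fn count_newlines(buf: &[u8]) -> usize {
    const CHUNK: usize = 64; // assumed block size, small enough for a u8 counter
    let mut total = 0usize;
    let mut chunks = buf.chunks_exact(CHUNK);
    for chunk in &mut chunks {
        // Branch-free accumulation: each comparison yields 0 or 1.
        // At most 64 matches per chunk, so the u8 counter cannot overflow.
        let mut n = 0u8;
        for &b in chunk {
            n += (b == b'\n') as u8;
        }
        total += n as usize;
    }
    // Handle the tail that did not fill a whole chunk.
    for &b in chunks.remainder() {
        total += (b == b'\n') as usize;
    }
    total
}
```

Calling `count_newlines(b"a\nb\nc")` counts the two newline bytes in that input. Whether this actually vectorizes depends on the compiler and target, so it would need to be checked with a benchmark.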

Once that is out of the way, the runtime is dominated by the word counting.

Unfortunately, all the fast wc implementations use SIMD bit tricks that only work with ASCII text.
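As a rough sketch of the kind of bit trick meant here (not the actual code from fastlwc or cw), the idea is to build a bitmask of whitespace positions for a block of bytes and count the positions where a non-whitespace byte follows a whitespace byte. The Rust below builds the mask with a scalar loop where a SIMD version would use vector compares. The names `is_ascii_space` and `count_words_ascii`, the 64-byte block size, and the exact whitespace set are assumptions for this example.

```rust
/// ASCII whitespace set assumed for this sketch:
/// space, tab, newline, carriage return, vertical tab, form feed.
fn is_ascii_space(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n' | b'\r' | 0x0b | 0x0c)
}

/// Count words in ASCII text by counting word starts with a bitmask.
/// Hypothetical helper for illustration only.
fn count_words_ascii(buf: &[u8]) -> usize {
    let mut words = 0usize;
    // The start of input behaves as if it were preceded by whitespace.
    let mut prev_ws = true;
    for block in buf.chunks(64) {
        // Bit i is set iff block[i] is whitespace.
        let mut ws: u64 = 0;
        for (i, &b) in block.iter().enumerate() {
            ws |= (is_ascii_space(b) as u64) << i;
        }
        // Only the low block.len() bits are meaningful in a short final block.
        let valid: u64 = if block.len() == 64 {
            u64::MAX
        } else {
            (1u64 << block.len()) - 1
        };
        // Bit i is set iff the byte before position i was whitespace;
        // bit 0 is carried over from the previous block.
        let prev = (ws << 1) | (prev_ws as u64);
        // A word starts at a non-whitespace byte that follows whitespace.
        let starts = !ws & prev & valid;
        words += starts.count_ones() as usize;
        // Remember whether this block ended in whitespace.
        prev_ws = (ws >> (block.len() - 1)) & 1 == 1;
    }
    words
}
```

For example, `count_words_ascii(b"hello world")` finds two word starts. The whole approach rests on classifying each byte independently, which is why it assumes ASCII input.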

I read a proof of concept for SIMD with UTF-8 parsing, but it's very complicated and only gets a 2x speedup instead of a 10x speedup.
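One way to see the ASCII limitation is that Unicode has whitespace characters encoded as several bytes, none of which is an ASCII whitespace byte. The Rust sketch below compares a byte-wise ASCII split with the standard library's Unicode-aware `str::split_whitespace`. The function names are made up for this example, and which characters wc itself treats as whitespace depends on the implementation and locale, so this only illustrates the mismatch.

```rust
/// Count words by splitting on ASCII whitespace bytes only.
/// Hypothetical helper for illustration.
fn count_words_ascii_only(text: &str) -> usize {
    text.as_bytes()
        .split(|b| b.is_ascii_whitespace())
        .filter(|w| !w.is_empty())
        .count()
}

/// Count words by splitting on Unicode whitespace, using the standard library.
/// Hypothetical helper for illustration.
fn count_words_unicode(text: &str) -> usize {
    text.split_whitespace().count()
}
```

With the input `"a\u{2003}b"` (two letters separated by U+2003 EM SPACE, which is three bytes in UTF-8), the byte-wise version sees one word and the Unicode-aware version sees two. On plain ASCII input such as `"a b"` both agree.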

benkwokcy, Mar 03 '23 04:03