arkouda

Use `computeOnSegments` for operations on string bytes

Open reuster986 opened this issue 3 years ago • 0 comments

PR #1060 introduced a way to do computations on strings that follow the locality of the bytes, rather than the segment pointers. For hashing strings, this approach has a little more overhead but significantly improves worst-case behavior, enough that we decided to switch. There are several similar cases where it might make sense to switch, so I am tracking those here. It would be good to have benchmarks like #1095 in place beforehand to establish a baseline.

  • [X] sipHash128 (#1060)
  • [ ] Regex
    • [ ] findMatchLocations
    • [ ] sub
    • [x] #1241
    • [ ] peelRegex
  • [X] compare (#1116)
    • [X] with scalar (while we're at it, fix compare with empty string)
    • [X] between vectors?
  • [X] castStringToSymEntry (#1107)
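
For anyone picking up one of the unchecked items, here is a toy model of the trade-off described above. It is not the `computeOnSegments` implementation, only a sketch under simplifying assumptions: both the offsets array and the bytes array are split into equal contiguous blocks across locales, and "following the bytes" is modeled as assigning each string to the locale that owns its first byte. The names `block_owner` and `simulate` are hypothetical, and the example data is chosen so that string boundaries line up with block boundaries; in general a string can straddle two locales and some remote reads remain.

```python
def block_owner(index, size, num_locales):
    """Locale owning `index` in a block-distributed array (simplified model)."""
    return index * num_locales // size


def simulate(offsets, num_bytes, num_locales, follow_bytes):
    """Return (bytes processed per locale, total bytes read remotely).

    follow_bytes=False: each string is handled by the locale owning its offset entry.
    follow_bytes=True:  each string is handled by the locale owning its first byte.
    """
    n = len(offsets)
    work = [0] * num_locales
    remote = 0
    for i, start in enumerate(offsets):
        end = offsets[i + 1] if i + 1 < n else num_bytes
        if follow_bytes:
            # Clamp so a trailing empty string (start == num_bytes) stays in range.
            worker = min(block_owner(start, num_bytes, num_locales), num_locales - 1)
        else:
            worker = block_owner(i, n, num_locales)
        work[worker] += end - start
        # Count bytes that live on a different locale than the one doing the work.
        remote += sum(
            1 for b in range(start, end)
            if block_owner(b, num_bytes, num_locales) != worker
        )
    return work, remote


# Skewed lengths: four long strings followed by four 1-byte strings (100 bytes total).
lengths = [25, 25, 25, 21, 1, 1, 1, 1]
offsets = [sum(lengths[:i]) for i in range(len(lengths))]

print(simulate(offsets, 100, 4, follow_bytes=False))  # ([50, 46, 2, 2], 73)
print(simulate(offsets, 100, 4, follow_bytes=True))   # ([25, 25, 25, 25], 0)
```

In this model the segment-based assignment leaves two locales doing almost all the work while reading most of their bytes remotely, whereas the byte-based assignment is balanced and fully local. Real benchmarks, along the lines of #1095, are still needed to see how each operation in the list behaves.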

reuster986, Feb 14 '22 16:02