StringZilla icon indicating copy to clipboard operation
StringZilla copied to clipboard

Initial Go support

Open MarkReedZ opened this issue 1 year ago • 7 comments

Added a few functions for go. If we merge this I'll finish it up.

$ go run scripts/bench.go
Contains
   38.798µs 	strings.Contains
   37.872µs 	sz.Contains
Index
   26.29µs 	strings.Index
   37.712µs 	sz.Index
IndexAny
   588.567µs 	strings.IndexAny
   45.911µs 	sz.IndexAny

MarkReedZ avatar May 16 '24 15:05 MarkReedZ

Wow, I am surprised the latency is comparable. How long is the input?

I believe for Go, it would be great to transpile our C code.

ashvardanian avatar May 16 '24 19:05 ashvardanian

I'll look at transpiling. I don't expect there to be a significant improvement on these benchmarks - its probably likely to slow things down considerably from what I've read.

The strings are 1mb. Below are 10kb strings. Short strings show the same performance except for IndexAny where the go time eventually matches stringzilla.

$ go run bench.go
Contains
   607ns 	strings.Contains
   804ns 	sz.Contains
Index
   324ns 	strings.Index
   492ns 	sz.Index
IndexAny
   5.917µs 	strings.IndexAny
   733ns 	sz.IndexAny

MarkReedZ avatar May 17 '24 00:05 MarkReedZ

How about super short inputs in the range of 10-100 bytes?

ashvardanian avatar May 17 '24 02:05 ashvardanian

None of the transpilers work on the stringzilla code. I don't see much information on them.

Build flags are the following by the way. -O3 slows us down.

// #cgo CFLAGS: -g -mavx2
// #include <stdlib.h>
// #include <../../include/stringzilla/stringzilla.h>
import "C"

100b

Contains
   245ns 	strings.Contains
   425ns 	sz.Contains
Index
   139ns 	strings.Index
   137ns 	sz.Index
IndexAny
   181ns 	strings.IndexAny
   275ns 	sz.IndexAny

16b

Contains
   138ns 	strings.Contains
   452ns 	sz.Contains
Index
   56ns 	strings.Index
   198ns 	sz.Index
IndexAny
   205ns 	strings.IndexAny
   262ns 	sz.IndexAny

MarkReedZ avatar May 17 '24 05:05 MarkReedZ

Others benchmarking cgo state its 40ns per C call, but I don't see it being that slow. I'll get more of the funcs in as I'm using this in some of my go projects. We can also try manually writing a func or two in go to see how that goes.

MarkReedZ avatar May 17 '24 06:05 MarkReedZ

@ashvardanian We do have an issue with repeated c calls. Count is faster until we match too many times. It is ~50ns per call to sz_find. We can?

A) Write Count in C so its one function call to C instead of N calls B) Write sz_find in go

Let me know and I'll go ahead. The below numbers are for a 9 byte substring to match - strings.Count is faster if the substring to search for is 4 bytes or less.

Count 1  
   55.06µs 	strings.Count
   40.692µs 	sz.Count
Count 100,000 
   1.344558ms 	strings.Count
   6.815231ms 	sz.Count

MarkReedZ avatar May 18 '24 06:05 MarkReedZ

How about we implement the count in the cGo header?

ashvardanian avatar May 18 '24 06:05 ashvardanian