Plain old C, to be called via ctypes.
We can format this as a PR later, but this is the code:
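    /* Count positions where a byte equals the byte right before it. */
    unsigned long long count_double(const char *str) {
        const char *i;
        char c1, c2;
        unsigned long long count = 0;

        /* Empty input: str + 1 would point past the terminator. */
        if (*str == '\0')
            return 0;
        c1 = *str;
        for (i = str + 1; (c2 = *i) != '\0'; i++) {
            count += (c1 == c2);
            c1 = c2;
        }
        return count;
    }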
Build with: gcc -shared -o libblacount.so -fPIC blacount.c
And prepare to use from Python with:
    import ctypes
    blacount = ctypes.cdll.LoadLibrary("./libblacount.so")
    blacount.count_double.argtypes = [ctypes.c_char_p]
    blacount.count_double.restype = ctypes.c_uint64
(it counts byte-strings only)
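Since the C function only sees bytes, a `str` has to be encoded before the call. Here is a rough sketch of how the result could be cross-checked from Python. `count_double_ref` and `load_blacount` are hypothetical helper names made up for this example, and the library path assumes the build command above was run in the current directory; if the `.so` is missing, the check is simply skipped.

```python
import ctypes
import os


def count_double_ref(data: bytes) -> int:
    """Pure-Python reference: count bytes equal to the byte before them.

    Like the C loop, it stops at the first NUL byte.
    """
    data = data.split(b"\0", 1)[0]
    return sum(a == b for a, b in zip(data, data[1:]))


def load_blacount(path="./libblacount.so"):
    """Hypothetical helper: load the library if it was built, else None."""
    if not os.path.exists(path):
        return None
    lib = ctypes.cdll.LoadLibrary(path)
    lib.count_double.argtypes = [ctypes.c_char_p]
    lib.count_double.restype = ctypes.c_uint64
    return lib


text = "aabbcdd"
payload = text.encode("utf-8")  # c_char_p wants bytes, not str
expected = count_double_ref(payload)

lib = load_blacount()
if lib is not None:
    # Only runs when libblacount.so exists next to this script.
    assert lib.count_double(payload) == expected
```

Note that for non-ASCII text the byte count and the character count can differ, because one character may encode to several bytes.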
@jsbueno I also saw this https://github.com/martinxyz/rust-python-example/commit/f8e36ab5f9c
Good comparison, but since the C version only counts bytes, I'd say Rust is doing well: the difference is not that significant.
I added a version using SWIG. If you think ctypes would perform better, please send a PR :)
--------------------------------------------------------------------------------------------
Name (time in us)                            Min                   Max                  Mean
--------------------------------------------------------------------------------------------
test_rust_bytes_once              476.7920 (1.0)        830.5610 (1.0)        486.6116 (1.0)
test_c_swig_bytes_once           795.3460 (1.67)     1,504.3380 (1.81)       827.3898 (1.70)
test_rust_once                   985.9520 (2.07)     1,483.8120 (1.79)     1,017.4251 (2.09)
test_numpy                     1,001.3880 (2.10)     2,461.1200 (2.96)     1,274.8132 (2.62)
test_rust                      2,555.0810 (5.36)     3,066.0430 (3.69)     2,609.7403 (5.36)
test_regex                   24,787.0670 (51.99)   26,513.1520 (31.92)   25,333.8143 (52.06)
test_pure_python_once        36,447.0790 (76.44)   48,596.5340 (58.51)   38,074.5863 (78.24)
test_python_comprehension   49,166.0560 (103.12)   50,832.1220 (61.20)  49,699.2122 (102.13)
test_pure_python            49,586.3750 (104.00)   50,697.3780 (61.04)  50,148.6596 (103.06)
test_itertools              56,762.8920 (119.05)   69,660.0200 (83.87)  58,402.9442 (120.02)
--------------------------------------------------------------------------------------------
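For anyone reading the table: the number in parentheses is each value divided by the fastest value in the same column. A quick sanity check in Python, using a few of the Mean values copied from the table (the dictionary name `mean_us` is just for this example):

```python
# Mean times in microseconds, copied from the benchmark table above.
mean_us = {
    "test_rust_bytes_once": 486.6116,
    "test_c_swig_bytes_once": 827.3898,
    "test_rust_once": 1017.4251,
    "test_pure_python": 50148.6596,
}

# The fastest entry is the baseline (shown as 1.0 in the table).
fastest = min(mean_us.values())

# Ratio of each mean to the baseline, rounded like the table.
ratios = {name: round(value / fastest, 2) for name, value in mean_us.items()}

for name, ratio in ratios.items():
    print(f"{name:<26} {ratio:>8.2f}x")
```

So on the mean, the SWIG C version is about 1.7 times slower than the Rust bytes version, and plain Python is about 100 times slower.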