corrset-benchmark
possible performance improvement
Your current performance bottleneck seems to be iterating bitsets.
Maybe you can shortcut the iteration via an efficient calculation of the number of trailing zeros.
I just pasted the 64-bit version below (the table needs to be initialized at program start, but the ntz
routine itself is O(1) ;)
```c
#define DEBRUIJN 0x22fdd63cc95386dULL

static uint32_t table[64];

/* Build the lookup table once at program start. */
void inittable(void) {
    uint64_t db = DEBRUIJN;
    for (int i = 0; i < 64; ++i) {
        table[db >> 58] = i;
        db <<= 1;
    }
}

/* Number of trailing zeros of n, in O(1). */
uint32_t ntz(uint64_t n) {
    n &= -n;        /* isolate the lowest set bit */
    n *= DEBRUIJN;  /* de Bruijn multiplication */
    n >>= 58;       /* top 6 bits index the table */
    return table[n];
}
```
The explanation is presented here. Of course, you can do the same for 128 or 256 bits or whatever width you need. It will speed up the iterator a lot if the set is sparse.
Have fun!
Hi, thanks for the recommendation. I am already using the number of trailing zeros as an optimization for the bit-set iterator, as you can see here: https://github.com/willcrichton/indexical/blob/main/src/bitset/simd.rs#L210-L219
This is just implemented with the trailing_zeros
primitive provided by the Rust standard library: https://doc.rust-lang.org/stable/std/primitive.usize.html#method.trailing_zeros
Thanks for your article about performance optimization with Rust! From a pure data-science viewpoint, the problem can be solved the following way: for each user, create a feature vector F where F_X -> {0,1} indicates whether question X was answered correctly, then apply linear/logistic regression with heavy L1 regularization. For linear decision functions, L1 regularization has the property of "zeroing out" less powerful/useful features, depending on the value of the regularization parameter (so you can easily select the k most powerful ones). It would be interesting to see how much this coincides with the brute-force approach.