add a new compare func CompareValueONeil for bsi, which faster than CompareValue
According to Java's compare function , I code a new function CompareValueONeil for go version ,and it only support number which is >= 0 , it is faster than CompareValue. When bsi include 200 million columnIds, using same range query,CompareValueONeil is cost about 1s when CompareValue cost about 12s(https://github.com/RoaringBitmap/RoaringBitmap/blob/cca90c986d5c0096bbeabb5f968833bf12c28c0e/bsi/src/main/java/org/roaringbitmap/bsi/longlong/Roaring64BitmapSliceIndex.java#L456)
Test code read my file "total.bin"
file, _ := os.OpenFile(path+"/total.bin", os.O_RDWR, 0755)
defer file.Close()
total := roaring64.NewDefaultBSI()
total.ReadFrom(file)
t := time.Now().UnixNano()
res := total.CompareValue(0, roaring64.RANGE, 25, 30, nil)
fmt.Printf("totalCount[%v], count[%v], CompareValue cost %v [ms]\n", total.GetCardinality(), res.GetCardinality(), (time.Now().UnixNano()-t)/1000000)
t1 := time.Now().UnixNano()
res = total.CompareValueONeil(roaring64.RANGE, 25, 30, nil)
fmt.Printf("totalCount[%v], count[%v], CompareValueONeil cost %v [ms]\n", total.GetCardinality(), res.GetCardinality(), (time.Now().UnixNano()-t1)/1000000)
result
totalCount[207701480], count[234], CompareValue cost 14051 [ms]
totalCount[207701480], count[234], CompareValueONeil cost 882 [ms]
I will run some tests in my environment. In the meantime, could we come up with a better more descriptive name for this function?
Also (and better yet), when constructing the BSI we could potentially set min/max value providing some context about the nature of the data being stored in the BSI. Then we could choose to apply the optimization without creating an entirely new API.
I will run some tests in my environment. In the meantime, could we come up with a better more descriptive name for this function?
yes, more descriptive name is better
Also (and better yet), when constructing the BSI we could potentially set min/max value providing some context about the nature of the data being stored in the BSI. Then we could choose to apply the optimization without creating an entirely new API.
Yes, util now, BSI's min/max doesn't be refreshed when setValues or deserialization. If min/max exist in BSI, new API can cancel.
Is there some news on this PR?
I'm also interested in the progress of this.
Are you still working on this @ChicChip?
I'm also interested in the progress of this.
Are you still working on this @ChicChip?
bsi32 and bsi64 are supported in commit, but this request(https://github.com/RoaringBitmap/roaring/pull/461#issuecomment-2500974153) about set min/max i haven't done yet.