roaring icon indicating copy to clipboard operation
roaring copied to clipboard

WriteTo() optimization ideas?

Open steveyen opened this issue 6 years ago • 3 comments

It appears the current implementation of Bitmap.WriteTo() isn't very much different than the user instead calling ToBytes() and invoking io.Writer.Write() itself... https://github.com/RoaringBitmap/roaring/blob/master/roaringarray.go#L526

Was wondering if there are approaches or ideas that might help avoid the "extra" toBytes() allocation?

I'd be willing to give an improvement an attempt if it's pretty straightforward. In my usage of roaring in a golang full-text engine (blevesearch), the io.Writer that we're using is one of those bufio buffered writers, so the guess would be that a bunch of Write()'s ought to be ok.

cheers, steve

steveyen avatar Mar 30 '18 18:03 steveyen

The bulk of the time in the toBytes function is going to be spent there:

	stream := &bytes.Buffer{}
        (...)
	for i, c := range ra.containers {
		_ = i
		_, err := c.writeTo(stream)
		if err != nil {
			return nil, err
		}
	}

https://github.com/RoaringBitmap/roaring/blob/master/roaringarray.go#L512-L518

(Of course, one should do actual profiling... results could vary depending on your application.)

I don't think it is hard to pull out this code from the toBytes function and do something more direct with it.

It is less obvious that it is a good path to speed things up. Buffering is not necessarily a negative performance wise. It can be an optimization too!

A better strategy might be to make it possible to reuse the same memory buffer several times. Or something.

Anyhow, writing the code is not hard (I expect you will agree). What is hard is to benchmark and tune.

lemire avatar Mar 30 '18 18:03 lemire

A better strategy might be to make it possible to reuse the same memory buffer several times.

Indeed -- that's the pattern we've been following in blevesearch, and it's been quite a beneficial trick to play again and again.

What is hard is to benchmark and tune.

So true!!

steveyen avatar Mar 30 '18 20:03 steveyen

I marked this "help wanted". I think that this is an accessible problem for contributions.

lemire avatar Apr 01 '18 02:04 lemire