roaring icon indicating copy to clipboard operation
roaring copied to clipboard

Expand on "Goroutine safety" section of README

Open bored-engineer opened this issue 5 years ago • 11 comments

It would be great if the Goroutine Safety section of the README.md about what operations are currently goroutine-safe could be expanded on, even/especially if that's not a guarantee that they will remain thread-safe moving forward.

For example, if bitmap rb1 is created, optimized, and never written to again (or has appropriate locking), can it be safely used in concurrent read operations by another bitmap rb2.And(rb1) or rb2.Or(rb1)?

Testing this with go test -race (which I know isn't a anywhere near complete test) doesn't report any issues:

func TestConcurrentBitmapAccess(t *testing.T) {
	rb1 := roaring.BitmapOf(1, 20, 40, 90)

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			rb2 := roaring.NewBitmap()
			rb2.And(rb1)
			wg.Done()
		}()
	}
	wg.Wait()
}

bored-engineer avatar Oct 01 '19 23:10 bored-engineer

I took a look at the (*Bitmap).Or method for example and any operations that occur on x2 to see if they are unsafe:

length2 := x2.highlowcontainer.size()

The call to size is just a len operation which should be safe:

func (ra *roaringArray) size() int {
	return len(ra.keys)
}

Next up is:

s2 := x2.highlowcontainer.getKeyAtIndex(pos2)

getKeyAtIndex just reads a value from the keys slice, should be safe for concurrent access:

func (ra *roaringArray) getKeyAtIndex(i int) uint16 {
	return ra.keys[i]
}

That call happens a few more times with different indexes, they should still be safe. Finally, there is an call to:

rb.highlowcontainer.insertNewKeyValueAt(pos1, s2, x2.highlowcontainer.getContainerAtIndex(pos2).clone())

The getContainerAtIndex call again is just reading from a slice, should be safe:

func (ra *roaringArray) getContainerAtIndex(i int) container {
	return ra.containers[i]
}

The clone method is called on the returned container interface which has a few different implementations:

func (bc *bitmapContainer) clone() container {
	ptr := bitmapContainer{bc.cardinality, make([]uint64, len(bc.bitmap))}
	copy(ptr.bitmap, bc.bitmap[:])
	return &ptr
}
func (ac *arrayContainer) clone() container {
	ptr := arrayContainer{make([]uint16, len(ac.content))}
	copy(ptr.content, ac.content[:])
	return &ptr
}
func newRunContainer16CopyIv(iv []interval16) *runContainer16 {
	rc := &runContainer16{
		iv: make([]interval16, len(iv)),
	}
	copy(rc.iv, iv)
	return rc
}
func (rc *runContainer16) clone() container {
	return newRunContainer16CopyIv(rc.iv)
}

All of these operations are just calls to copy and len which should be safe for concurrent calls (again assuming no write actions)

bored-engineer avatar Oct 02 '19 00:10 bored-engineer

Perhaps the easiest solution here would be to verify which of the package level functions are safe for concurrent use on a bitmap:

func And(x1, x2 *Bitmap) *Bitmap
func AndNot(x1, x2 *Bitmap) *Bitmap
func FastAnd(bitmaps ...*Bitmap) *Bitmap
func FastOr(bitmaps ...*Bitmap) *Bitmap
func Flip(bm *Bitmap, rangeStart, rangeEnd uint64) *Bitmap
func FlipInt(bm *Bitmap, rangeStart, rangeEnd int) *Bitmap
func HeapOr(bitmaps ...*Bitmap) *Bitmap
func HeapXor(bitmaps ...*Bitmap) *Bitmap
func Or(x1, x2 *Bitmap) *Bitmap
func ParAnd(parallelism int, bitmaps ...*Bitmap) *Bitmap
func ParHeapOr(parallelism int, bitmaps ...*Bitmap) *Bitmap
func ParOr(parallelism int, bitmaps ...*Bitmap) *Bitmap
func Xor(x1, x2 *Bitmap) *Bitmap

Then we can extrapolate that the receiver versions of them are safe as well for the bitmap passed in as a parameter (obviously they are not safe for the receiver itself)

bored-engineer avatar Oct 02 '19 01:10 bored-engineer

@bored-engineer

This is a good idea. I would expect that all operations that generate a new bitmap do not modify the input bitmaps.

We need to check the effect of copy-and-write.

How do you propose we go forward?

lemire avatar Oct 02 '19 14:10 lemire

@lemire do you expect that to be guaranteed moving forward (excluding major version bumps)? If so I think it probably makes more sense to add it to the godoc for each method explicitly mentioning which arguments are safe for concurrent reads. If not, I think the README is a more appropriate place with a nice disclaimer that this won't always be true but is as of today. Once we have a list of verified safe methods I can send over a PR for either option.

As far as copy-on-write I was thinking about that as well as a good starting point. Does this sounds reasonable:

  • Assume that copy-on-write is implemented as of today without any bugs (i.e. each method the performs a write action correctly checks copyOnWrite before performing any write actions)
  • Assume the result of rb2 := rb1.Clone() on a copy-on-write bitmap (rb1) is safe for concurrent reads (i.e. rb1 and rb2 can be read at the same time)

Given those, we can build a blacklist of "unsafe" methods based on any method that checks the value of copyOnWrite, anything else can be assumed safe/without side effects. The only side-effect we'd still have to worry about is .Clone() always copies ra.keys ([]uint16) as-is so anything that manipulates ra.keys should also be added to the blacklist.

bored-engineer avatar Oct 02 '19 15:10 bored-engineer

@bored-engineer The godoc approach sounds good. The README should be modified to point the reader to the godoc.

I would qualify safe/unsafe with respect to the value of copyOnWrite. A call to Clone, if you do not use copy-on-write, is just a full copy and thus entirely thread safe/concurrent.

It seems to me that there are two very distinct use cases. There are people relying on copy-on-write and people who do not. The people who do use copy-on-write need extra care.

In some sense, though copy-on-write has some benefits, it does incur some extra complexity, and I think that the users should be made aware of this if they want performance.

lemire avatar Oct 02 '19 15:10 lemire

The godoc approach sounds good. The README should be modified to point the reader to the godoc.

Sounds good, will do that once we have a finalized list of methods.

It seems to me that there are two very distinct use cases. There are people relying on copy-on-write and people who do not. The people who do use copy-on-write need extra care.

Let me restate this just to make sure I understand it: We should update the docs to indicate which methods are safe for concurrent reads, except with a special note that if GetCopyOnWrite() is true the user must take extra care to prevent any write actions to the originally cloned bitmap as well as the current bitmap for the operations to be concurrent read safe.

I would expect that all operations that generate a new bitmap do not modify the input bitmaps.

Finally, it wasn't clear to me if this was a statement that the methods are already safe and we don't need to do any validation, or if you're saying that's how you would expect them to work and we should still verify before updating any docs saying they are.

bored-engineer avatar Oct 02 '19 18:10 bored-engineer

I think we are in agreement.

I was just stating my expectation: we should verify. However, there is only so much state we can change when taking bitmaps as inputs and returning a new bitmap. I can’t think of any except for issues with copy-on-write.

lemire avatar Oct 02 '19 18:10 lemire

Another option would be to create a dedicated type or new side-effect free functions.

Anyhow, documentation would be a step forward.

lemire avatar Oct 03 '19 07:10 lemire

My general understanding is that the safety of a Bitmap basically operates under Rowlock semantics. Non-mutating reads can happen concurrently, but any updates need exclusive access, as they can corrupt assumptions. For instance, if you remove a container after length1 is calculated in and/or/not it will panic

jacksonrnewhouse avatar Jun 22 '21 18:06 jacksonrnewhouse

Is sync.rwmutex safe to use? Contains will not mutate the underlying data?

exherb avatar Jun 19 '22 11:06 exherb

Contains does not mutate the content.

lemire avatar Jun 19 '22 15:06 lemire