CRoaring icon indicating copy to clipboard operation
CRoaring copied to clipboard

Error reporting should be better

Open andreigudkov opened this issue 7 years ago • 5 comments

Currently error handling is poor:

  • failure to allocate memory results in error message sent to stderr and segfault (e.g. in roaring_bitmap_add())
  • there is no way to distinguish between memory allocation errors and serialization format errors (portable_deserialize())

Both problems can be solved with integer error codes, e.g. 0 -- operation successful, -1 -- memory allocation error, -2 -- serialization format error, etc. Error codes should be returned for all operations which modify bitmaps.

There is, of course, an issue of atomicity. If error occurs inside some modification operation, bitmap can enter half-broken state. Atomicity guarantee would be too expensive from performance perspective. But we need to at least guarantee that bitmaps can be safely roaring_bitmap_free()d, even after an error. All other operations are UB.

andreigudkov avatar Jul 31 '18 09:07 andreigudkov

I agree...

lemire avatar Jul 31 '18 13:07 lemire

Marked as help wanted.

lemire avatar Jul 31 '18 13:07 lemire

Article covering some issues and what various codebases do:

https://eli.thegreenplace.net/2009/10/30/handling-out-of-memory-conditions-in-c

Backwards-Compatible Improvement Proposal: add new names for APIs that return error results, but have static inline wrappers for the old names that delegate to a macro for an error function. If the error macro isn't defined, define it to some provided API function. Along these lines:

 // error implementation that prints a message and exits.
 // or takes an error code number and looks it up in a table.
 // (I don't care as long as builds can exclude its dependencies.
 // e.g. embedded platforms that won't/can't link to stdio)
 //
 #if !defined(ROARING_NO_ERROR_HANDLER)
 roaring_bitmap_t *roaring_bitmap_error(const char *msg); 
 #endif

 #if !defined(ROARING_ERROR_FUNC)
      #define ROARING_ERROR_FUNC roaring_bitmap_error
 #endif

 roaring_bitmap_t *roaring_bitmap_try_xor(const roaring_bitmap_t *x1,
                                          const roaring_bitmap_t *x2);

 static inline roaring_bitmap_t *roaring_bitmap_xor(
     const roaring_bitmap_t *x1,
     const roaring_bitmap_t *x2
 ){
     roaring_bitmap_t *result = roaring_bitmap_try_xor(x1, x2);
     if (result == NULL)
          ROARING_ERROR_FUNC("out of memory");
     return result;
 }

(Currently I have to patch out those error printfs locally due to not linking to stdio. So this would be much better.)

If error occurs inside some modification operation, bitmap can enter half-broken state. Atomicity guarantee would be too expensive from performance perspective.

The tradeoff for performance makes sense...to say that those who need atomicity should not use add_many(), and not use the inplace versions of things like and/or/xor.

But fundamentals like add_range() would seem to belong in the "able to be done atomically" set. Should be negligible cost in most all cases to use an algorithm that walks through and does all the allocations it needs first...using the allocated space during the first pass to store links that it could use to backtrack over and know what to fill in or free.

AE1020 avatar Nov 03 '20 22:11 AE1020

@AE1020 Your proposal is not bad at all.

It is not always so simple to know where a memory allocation might occur. There are branches. You might follow a branch, do some allocation, then follow another branch and do more allocation.

Though atomicity is always a great thing to have, I do not think it is needed at the operation level. There are robust, distributed bitmap-based systems relying on Roaring right now in product. The way they work is that they have serialized (sometimes immutable) bitmaps. Should the operations fail, it is fine, as long as does so prior to a new serialization and not in the middle of it.

Of course, what we have currently is not great (hence this open issue) because if things fail... your system might just go crazy. So having better error handling is definitively important.

lemire avatar Nov 03 '20 23:11 lemire

Of course, what we have currently is not great (hence this open issue) because if things fail... your system might just go crazy.

If the core inplace operations return boolean (e.g. bool roaring_bitmap_xor_inplace_completed(...)), these appear to be the options for when it returns false:

  1. The input is in an undefined-but-usable/freeable state, which will give subsequent answers based on however far the partial operation got.
  2. The input is cleared and immediately usable as an empty set
  3. The input containers are freed and the bitmap bits turned into an invalid state that will crash if used. So needs to be roaring_bitmap_init()'d before being used again. But it would not free the memory of the roaring_array (whether that be allocated through roaring_bitmap_create() or not). If it had been allocated, then roaring_bitmap_free() would also work during the invalid state.

To me, 3 seems the best. Then:

static inline void roaring_bitmap_xor_inplace(
    roaring_bitmap_t *x1,
    const roaring_bitmap_t *x2
){
    if (!roaring_bitmap_xor_inplace_completed(x1, x2))
        ROARING_ERROR_FUNC("out of memory");
}

2 doesn't seem useful enough to warrant making it harder to find bugs in unwanted usages after the failure. Calling roaring_bitmap_init() if that's what you want after a failure is cheap and easy.

What is bad about 1 is that if there were going to be guarantees (for instance: "a partially completed union will not contain any bits that weren't in either set originally") then those would need to be articulated, implemented and tested. That seems like an unnecessary tax to pay for a not very useful feature.

AE1020 avatar Nov 04 '20 05:11 AE1020