
Add protection to our BinaryOperators to prevent objects growing too large

Open p013570 opened this issue 7 years ago • 1 comments

Objects like FreqMap should only be used with a small number of keys; otherwise we could end up hitting the row size limits in Accumulo/HBase.

In our BinaryOperators (like FreqMapAggregator), we should check before merging the objects whether the result will be too big. The aggregator should be configured with the key size limit. It should also have a flag to control what happens when the objects cannot be safely merged: either throw an exception, or log a warning and truncate the objects. Throwing an exception will have severe consequences for Accumulo's tablet servers (and similarly in HBase).
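As a rough illustration of the behaviour described above, here is a minimal sketch that merges plain `Map<String, Long>` frequency maps under a key limit. It deliberately does not use the real FreqMap or Koryphe classes: the class name `BoundedFreqMapMerger`, the `maxKeys` limit, and the `failOnOverflow` flag are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.logging.Logger;

/**
 * Hypothetical sketch only: not part of Gaffer or Koryphe.
 * Sums the counts of two frequency maps without letting the
 * result grow beyond a configured number of keys.
 */
class BoundedFreqMapMerger implements BinaryOperator<Map<String, Long>> {
    private static final Logger LOGGER =
            Logger.getLogger(BoundedFreqMapMerger.class.getName());

    private final int maxKeys;            // assumed key size limit
    private final boolean failOnOverflow; // true: throw, false: warn and truncate

    BoundedFreqMapMerger(final int maxKeys, final boolean failOnOverflow) {
        if (maxKeys < 0) {
            throw new IllegalArgumentException("maxKeys must not be negative");
        }
        this.maxKeys = maxKeys;
        this.failOnOverflow = failOnOverflow;
    }

    @Override
    public Map<String, Long> apply(final Map<String, Long> a, final Map<String, Long> b) {
        // Work on a copy so neither input is modified.
        final Map<String, Long> result = new HashMap<>(a);
        int dropped = 0;
        for (final Map.Entry<String, Long> entry : b.entrySet()) {
            // Existing keys never grow the map, so they are always merged.
            if (result.containsKey(entry.getKey()) || result.size() < maxKeys) {
                result.merge(entry.getKey(), entry.getValue(), Long::sum);
            } else if (failOnOverflow) {
                throw new IllegalStateException(
                        "Merged map would exceed " + maxKeys + " keys");
            } else {
                dropped++; // truncate: skip keys that would exceed the limit
            }
        }
        if (dropped > 0) {
            LOGGER.warning("Dropped " + dropped + " keys to stay within " + maxKeys);
        }
        return result;
    }
}
```

With a limit of 2 and truncation enabled, merging `{x=1, y=2}` with `{x=3, z=4}` keeps `x` (summed to 4) and `y`, and drops `z` with a warning. Which keys get dropped depends on the iteration order of the second map, so a real implementation would probably want a deterministic rule for what to truncate.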

This will require changes to the binary operators in Koryphe too.

p013570 avatar Aug 10 '17 12:08 p013570

Are the key size limits defined anywhere for the stores? I have just set it to an arbitrary integer for now, but that can easily be changed.

Also, is this protection to be implemented for every single subclass of KorypheBinaryOperator? I've looked through most of them, and it seems that they either won't need it or that implementing the truncation would be non-trivial.

m607123 avatar Aug 23 '17 12:08 m607123