Gaffer
Add protection to our BinaryOperators to prevent objects growing too large
Objects like FreqMap should only be used with a small number of keys; otherwise we could hit the row size limits in Accumulo/HBase.
In our BinaryOperators (like FreqMapAggregator), we should check whether the result will be too big before merging the objects. The aggregator should be configured with the key size limit. It should also have a flag to control what happens when the objects cannot be safely merged: either throw an exception, or log a warning and truncate the objects. Note that throwing an exception will have severe consequences for Accumulo's tablet servers (and similarly in HBase).
This will require changes to the binary operators in Koryphe too.
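To make the proposal concrete, here is a rough sketch of what a size-limited merge could look like. It is not Gaffer or Koryphe code: `BoundedFreqMapMerger`, `maxKeys` and `failOnOverflow` are hypothetical names, a plain `Map<String, Long>` stands in for FreqMap, and the class implements `java.util.function.BinaryOperator` rather than extending `KorypheBinaryOperator`. It also assumes the limit is a count of keys and that truncation means dropping keys that would be new to the result; both are open design choices.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.logging.Logger;

/** Hypothetical sketch only; not part of Gaffer or Koryphe. */
final class BoundedFreqMapMerger implements BinaryOperator<Map<String, Long>> {
    private static final Logger LOGGER =
            Logger.getLogger(BoundedFreqMapMerger.class.getName());

    private final int maxKeys;            // assumed: limit is a number of keys
    private final boolean failOnOverflow; // true: throw; false: warn and truncate

    BoundedFreqMapMerger(final int maxKeys, final boolean failOnOverflow) {
        if (maxKeys < 0) {
            throw new IllegalArgumentException("maxKeys must not be negative");
        }
        this.maxKeys = maxKeys;
        this.failOnOverflow = failOnOverflow;
    }

    @Override
    public Map<String, Long> apply(final Map<String, Long> a, final Map<String, Long> b) {
        // Work on a copy so neither input is modified, even if we throw part way.
        final Map<String, Long> result = new HashMap<>(a);
        int dropped = 0;
        for (final Map.Entry<String, Long> entry : b.entrySet()) {
            if (result.containsKey(entry.getKey())) {
                // Existing key: summing counts never grows the map.
                result.merge(entry.getKey(), entry.getValue(), Long::sum);
            } else if (result.size() < maxKeys) {
                result.put(entry.getKey(), entry.getValue());
            } else if (failOnOverflow) {
                throw new IllegalStateException(
                        "Merged map would exceed " + maxKeys + " keys");
            } else {
                dropped++; // truncate: skip keys that would exceed the limit
            }
        }
        if (dropped > 0) {
            LOGGER.warning("Dropped " + dropped + " keys to stay within " + maxKeys);
        }
        return result;
    }
}
```

One thing the sketch makes visible: in truncate mode the result depends on the order in which maps are merged, so the operator is no longer strictly associative or commutative. It also does not shrink a first argument that is already over the limit.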
Are the key size limits defined anywhere for the stores? I have just set the limit to an arbitrary integer for now, but that can easily be changed.
Also, does this protection need to be implemented for every subclass of KorypheBinaryOperator? I've looked through most of them, and for each one it seems that either it won't need the protection or implementing the truncation would be non-trivial.