New Accumulo key-package optimised for `GetRDDOfAllElements`
The `GetRDDOfAllElements` operation applies a filter to avoid returning every `Edge` twice. This means that twice as many `Edge`s as necessary are read. This could be avoided by storing the "forward" version of an `Edge` of group G with one column family (e.g. "G-F") and the "backwards" version with a different column family (e.g. "G-B"). As long as there is one locality group for each of these, a `GetRDDOfAllElements` operation could read only the forward column families, which would reduce the amount of data it reads by a factor of 2.
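The saving can be illustrated with a toy model. The sketch below is plain Python and does not use the Accumulo or Gaffer APIs; the `Entry` tuple, the function names, and the way reads are counted are all assumptions made for illustration. It compares a single-column-family layout, where every entry is read and then filtered, with the proposed "G-F"/"G-B" layout, where only the forward family is read.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a stored entry: row key, column family, edge.
Entry = namedtuple("Entry", ["row", "family", "edge"])


def write_single_family(edges, group="G"):
    """Single-family layout: both copies of an edge share one column family."""
    entries = []
    for src, dst in edges:
        entries.append(Entry(src, group, (src, dst)))  # copy keyed by source
        if src != dst:
            entries.append(Entry(dst, group, (src, dst)))  # copy keyed by destination
    return entries


def write_split_families(edges, group="G"):
    """Proposed layout: forward copy in "G-F", backward copy in "G-B"."""
    entries = []
    for src, dst in edges:
        entries.append(Entry(src, group + "-F", (src, dst)))
        if src != dst:
            entries.append(Entry(dst, group + "-B", (src, dst)))
    return entries


def scan_with_filter(entries):
    """Read every entry, then keep only the copy keyed by the edge's source."""
    read = 0
    result = []
    for entry in entries:
        read += 1
        if entry.row == entry.edge[0]:
            result.append(entry.edge)
    return result, read


def scan_families(entries, families):
    """Count only entries in the requested families as read.

    This models a store that can skip the other families entirely,
    which is what one locality group per family is meant to allow.
    """
    read = 0
    result = []
    for entry in entries:
        if entry.family in families:
            read += 1
            result.append(entry.edge)
    return result, read


edges = [("A", "B"), ("B", "C"), ("A", "C")]
old_result, old_read = scan_with_filter(write_single_family(edges))
new_result, new_read = scan_families(write_split_families(edges), {"G-F"})
print(old_read, new_read)  # 6 3
```

Both scans return the same three edges, but the split layout reads three entries instead of six. In a real store the saving depends on the locality groups actually being configured, so that the backward column families sit in separate files from the forward ones.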
This could also improve the efficiency of some other queries and possibly slightly slow down others.
This needs a new key-package.
This new key-package should be significantly better than either of the two current key-packages. It should be the default key-package in version 2.0.
Following a discussion between @gaffer01 and @d21211122, it has been decided that this will not be included in Gaffer 2.0. Although it will not be implemented against the Accumulo store, it may still be possible to support full scans in a future, yet-to-be-determined cloud-native store for Gaffer v2 that implements `GetAllElements` without returning edges both ways round.