Gaffer
Gaffer copied to clipboard
FederatedStore-FederatedAccess: Why are Read Write Predicates Strings and not AccessPredicate objects?
FederatedStore-FederatedAccess: Why are Read Write Predicates Strings and not AccessPredicate objects?
This requires deserialisation on every usage of FederatedAccess on graphs, This seems wasteful.
The Read/Write Access predicates are stored as strings because the parent object be serialisable (so that they may be stored in a Disk Cache). It's the same story for Named Operations / Named Views. I've not experienced performance issues with this but some caching could be added.
I'm not understanding, I might be missing something fundamental here.
Is the JSON serialiser not being used in this situation for storing to Disk Cache?
In the examples you mentioned. NamedOperation
and NamedViews
they do not show this behaviour. Instead POJO's NamedOperationDetail
and NamedViewsDetail
show this String usage of AccessPredicates
.
Would it be more in keeping/correct to have a FederatedAccessDetail
POJO to be created with the Strings and have FederatedAccess
with AccessPredicates
?
To answer your first question, no. For the JCS cache, they currently they use Java serialisation. Maybe that's something we could (and probably should) change.
And for the FederatedAccessDetail question, maybe but the FederatedAccess object is already fulfilling that role
Ah, so with FederatedStore a cache was implemented FederatedStoreCache
it is a wrapper for CacheServiceLoader
and serialises Pair<GraphSerialisable, FederatedAccess>
. String manipulation was handled their for graphs (which at the time wasn't serialisable.)
FederatedAccess
is being serialised there in addGraphToCache
If this is cache that had or provided issues then the FederatedStoreCache could be amend to handle the AccessPredicates
via strings to and from the cache. dropping the need to serialise strings each time FederatedAccess
object is used.
As I understand it. We are in agreement that FederatedStoreCache
can be modified to handle the manipulation of unserialisable AccessPredicate
into cached strings and then retrieved back into AccessPredicate
objects.
This will clean up the code and be "logically" a more performant FederatedAccess
object (real world difference will likely be minimal).
Requiring Further Discussion.
It is likely that NamedOperations
and NamedViews
should follow the same changes. This likely should be achieved by a default caching mechanism (similar to FederatedStoreCache
) which serialises as default using JSONserialiser
or uk.gov.gchq.gaffer.serialisation.implementation.*
. A lot of effort has been made across gaffer development to use these serialisers, only to find issues with the JavaSerialiser via JCSCache
.
After discussion, this item won't be handled as part of the [ now alpha-4] fedstore merging, but will be reviewed as part of alpha-5 and a decision taken then as to whether to handle it there or after the later NamedOperation changes
We may not in the end include any NamedOperation changes in 2.0. The question here is whether this item is worth doing and, if it is, whether it has to be included in the 2.0 release or can be be left to the post-2.0 backlog
Due to test FederatedStoreCacheBackwardCompatibilityTest requiring backwards compatibility of the Cache I can't do smarter changes with the cache as required.
Have since decided the only break and migration required is the cache gets deleted and Graphs are re-added. the underlying data is of graphs will still persist.
repost: Requiring Further Discussion. It is likely that NamedOperations and NamedViews should follow the same changes. This likely should be achieved by a default caching mechanism (similar to FederatedStoreCache) which serialises as default using JSONserialiser or uk.gov.gchq.gaffer.serialisation.implementation.*. A lot of effort has been made across gaffer development to use these serialisers, only to find issues with the JavaSerialiser via JCSCache.
Closed by https://github.com/gchq/Gaffer/pull/2862