Gaffer icon indicating copy to clipboard operation
Gaffer copied to clipboard

FederatedStore-FederatedAccess: Why are Read Write Predicates Strings and not AccessPredicate objects?

Open GCHQDev404 opened this issue 4 years ago • 7 comments

FederatedStore-FederatedAccess: Why are Read Write Predicates Strings and not AccessPredicate objects?

This requires deserialisation on every usage of FederatedAccess on graphs, This seems wasteful.

GCHQDev404 avatar Oct 08 '20 15:10 GCHQDev404

The Read/Write Access predicates are stored as strings because the parent object be serialisable (so that they may be stored in a Disk Cache). It's the same story for Named Operations / Named Views. I've not experienced performance issues with this but some caching could be added.

d47853 avatar Oct 09 '20 10:10 d47853

I'm not understanding, I might be missing something fundamental here.

Is the JSON serialiser not being used in this situation for storing to Disk Cache?

In the examples you mentioned. NamedOperation and NamedViews they do not show this behaviour. Instead POJO's NamedOperationDetail and NamedViewsDetail show this String usage of AccessPredicates.

Would it be more in keeping/correct to have a FederatedAccessDetail POJO to be created with the Strings and have FederatedAccess with AccessPredicates?

GCHQDev404 avatar Oct 09 '20 14:10 GCHQDev404

To answer your first question, no. For the JCS cache, they currently they use Java serialisation. Maybe that's something we could (and probably should) change.

d47853 avatar Oct 09 '20 14:10 d47853

And for the FederatedAccessDetail question, maybe but the FederatedAccess object is already fulfilling that role

d47853 avatar Oct 09 '20 14:10 d47853

Ah, so with FederatedStore a cache was implemented FederatedStoreCache it is a wrapper for CacheServiceLoader and serialises Pair<GraphSerialisable, FederatedAccess>. String manipulation was handled their for graphs (which at the time wasn't serialisable.)

FederatedAccess is being serialised there in addGraphToCache If this is cache that had or provided issues then the FederatedStoreCache could be amend to handle the AccessPredicates via strings to and from the cache. dropping the need to serialise strings each time FederatedAccess object is used.

GCHQDev404 avatar Oct 09 '20 15:10 GCHQDev404

As I understand it. We are in agreement that FederatedStoreCache can be modified to handle the manipulation of unserialisable AccessPredicate into cached strings and then retrieved back into AccessPredicate objects.

This will clean up the code and be "logically" a more performant FederatedAccess object (real world difference will likely be minimal).

Requiring Further Discussion. It is likely that NamedOperations and NamedViews should follow the same changes. This likely should be achieved by a default caching mechanism (similar to FederatedStoreCache) which serialises as default using JSONserialiser or uk.gov.gchq.gaffer.serialisation.implementation.*. A lot of effort has been made across gaffer development to use these serialisers, only to find issues with the JavaSerialiser via JCSCache.

GCHQDev404 avatar Oct 09 '20 16:10 GCHQDev404

After discussion, this item won't be handled as part of the [ now alpha-4] fedstore merging, but will be reviewed as part of alpha-5 and a decision taken then as to whether to handle it there or after the later NamedOperation changes

n3101 avatar Jan 19 '22 17:01 n3101

We may not in the end include any NamedOperation changes in 2.0. The question here is whether this item is worth doing and, if it is, whether it has to be included in the 2.0 release or can be be left to the post-2.0 backlog

n3101 avatar Oct 26 '22 09:10 n3101

Due to test FederatedStoreCacheBackwardCompatibilityTest requiring backwards compatibility of the Cache I can't do smarter changes with the cache as required.

GCHQDev404 avatar Jan 09 '23 18:01 GCHQDev404

Have since decided the only break and migration required is the cache gets deleted and Graphs are re-added. the underlying data is of graphs will still persist.

GCHQDev404 avatar Jan 11 '23 13:01 GCHQDev404

repost: Requiring Further Discussion. It is likely that NamedOperations and NamedViews should follow the same changes. This likely should be achieved by a default caching mechanism (similar to FederatedStoreCache) which serialises as default using JSONserialiser or uk.gov.gchq.gaffer.serialisation.implementation.*. A lot of effort has been made across gaffer development to use these serialisers, only to find issues with the JavaSerialiser via JCSCache.

GCHQDev404 avatar Jan 17 '23 01:01 GCHQDev404

Closed by https://github.com/gchq/Gaffer/pull/2862

t92549 avatar Feb 06 '23 12:02 t92549