Gaffer icon indicating copy to clipboard operation
Gaffer copied to clipboard

Allow NamedOperations to reference each other

Open sw96411 opened this issue 3 years ago • 2 comments

At present, you cannot call one NamedOperation from inside another. This presents a barrier to the re-use of operation chains, requires moderately complex operation chains to be longer and less readable, makes it difficult to propagate changes to commonly used code through a community, and all the other drawbacks that come with a lack of composability.

This restriction is intentional, as it allows us to avoid dealing with thorny issues such as version control of NamedOperations, the impact of propagating changes to an included operation onto the operations which include it, what happens if a user has permissions to only a subset of NamedOperations which comprise a single nested NamedOperation and so on.

However, given the benefits that could be delivered by allowing the nesting of NamedOperations, is it time to revisit this decision?

sw96411 avatar Apr 13 '21 10:04 sw96411

@sw96411 We don't believe our code should impose this restriction, but right now do not have the resources to consider all the consequences you outline. As one of the breaking changes for Gaffer 2.0 we will take an initial step of producing a config setting to impose the block locally in place of the current hard-coded restriction. This will come with a 50 page legal document absolving us of all liability for any consequences of a failure to install the config block :)

n3101 avatar Jul 28 '21 14:07 n3101

@n3101 When I raised this ticket, I was really trying to have a discussion with the community about a way of allowing named ops to reference each other in a way that people can use in production systems.

I'm not really sure that just allowing it technically without thinking through the consequences and then disclaiming the impact of it to users in the docs really helps anyone. Any user technically sophisticated and familiar enough with Gaffer to work out the impacts for themselves would already be able to fork/mod Gaffer to achieve this, and anyone without that knowledge probably shouldn't be enabling the option.

From my perspective, I think it would be better to either acknowledge that the Gaffer maintainers aren't going to implement this ticket and close it accordingly, or acknowledge that we don't have the time/priority to look at this right now and leave it open. Neither of those options is a big killer for me, this is something some of my stakeholders have asked for, but I don't think it's fundamental to the viability of the platform or anything.

I would certainly view the approach you outline above as more akin to a "won't fix" closure rather than anything else. Of course, this is just my 2p, it's your backlog :-)

sw96411 avatar Aug 04 '21 08:08 sw96411

By default Gaffer should continue working how it does currently. There should be a store property added that dictates whether NamedOperations should be able to be called from other NamedOperations which would be false by default. Tests should be added to ensure this works as intended and does not cause infinite loops.

t92549 avatar Jun 27 '23 12:06 t92549

So when resolving namedOperation its possible to have infinite loop of self referencing. Rather than writing complex logic to defeat I suggest a timeout.

GCHQDev404 avatar Aug 31 '23 16:08 GCHQDev404

I agree with @sw96411 that there should be further discussion surrounding how this should be implemented. A fix which just enables this without testing isn't acceptable as this would introduce various risks. The issue of permissions should be addressed and not left as something to be bypassed by this if it were added as an option.

There at least two ways to create a loop. A named operation calling itself and a named operation calling another operation which then calls that same operation again (loop). A timeout could work to prevent this, but I think a limit of the number of total number of operations would be safer. Otherwise through multiple layers of nesting a huge operation chain could be constructed and this would cause instability.

A more robust solution could be improvements to named operations such that:

  • Gaffer refuses to add a named operation referring to another named operation which doesn't exist yet (including itself).
  • Gaffer refuses to delete a named operation which is used by another operation.

There are use cases where nested named operations would be useful and wouldn't cause infinite loops. Using recursive programming, it would be possible to construct a named operation which called itself. This could be used with algorithms for processing nested results/datatypes to create data-specific workflows. However, as operation chains are sequential this cannot be implemented in Gaffer without major changes to operation chains.

This issue raises wider questions around named operations and how they are currently processed.

GCHQDeveloper314 avatar Sep 04 '23 14:09 GCHQDeveloper314

Gaffer refuses to add a named operation referring to another named operation which doesn't exist yet (including itself). Gaffer refuses to delete a named operation which is used by another operation.

This logic is complex and very recursively. To exhaustively test this will require lots more code. I believe let anything be added. The only risk and issue I currently see is infinite loops and crashing.

GCHQDev404 avatar Sep 05 '23 15:09 GCHQDev404

To answer a concern raised by the OP.

See NamedOperation#getNamedOperation In here there is predicate logic used to examine if the user has access to the cached namedOperations. This is guaranteed by the caches usage of the NamedOperationDetail. It is not possible for a user to access a namedOperations nested inside another, if they do not have the correct access.

GCHQDev404 avatar Sep 08 '23 13:09 GCHQDev404