Handling a many to many map

Open JonathanFraser opened this issue 6 years ago • 11 comments

We have a use case that requires a fan out, fan in style flow and we're trying to figure out if it's possible via the metacontroller. We have a set of compute graphs which utilise a set of compute workers. There are many such graphs. However, they have the same compute workers and we would like to treat the workers as a shared resource amongst the graphs.

The scheme we've concocted, which may be an XY issue, is for each graph to create a series of compute claims. We need a controller to then consume the entire set of compute claims to figure out the set of resources to deploy and this is the sticking point. I suppose we could probably implement this via a fully custom controller but we've had some success with the metacontroller and we are looking for some input on whether this is a supported workflow or not. Perhaps there's a better way to accomplish this?

JonathanFraser avatar Jan 29 '19 00:01 JonathanFraser

To be honest, I haven't figured out a way for Metacontroller to be of much assistance in these scenarios. We had a similar discussion in #107. The common denominator is that your hook is going to need a presumably large number of objects to answer a single query: as you said, you need to see the set of all claims in order to compute the desired set of workers and/or the binding to workers.

It's certainly possible to set up such a controller with Metacontroller, but we'd have to call your hook every time any of a large number of objects changes, and we'd have to send you the entire list of objects every time. At that point, you would reap considerable efficiency gains by running your own local cache instead of sharing Metacontroller's cache, as was decided for the scenario in #107.

enisoc avatar Jan 29 '19 20:01 enisoc

Well, in some sense it would just invert the parent/child ownership relation: each child has multiple parents, but a selectable number of them. So perhaps it's not many-to-many but many-to-one. I don't think it's much more data than the current parent/children relationship, particularly if we can leverage selectors for input filtering.

JonathanFraser avatar Jan 29 '19 20:01 JonathanFraser

That's a good point. There could be ways to massage your match-making algorithm so there's not one big step of "look at everything and decide everything". For example, you could have a hook that gets called once per "worker with available resources", per "unbound compute claim", where the filtering is done with selectors on the Metacontroller side. Each worker would then have the option to bind to the claim or not. Is that the kind of thing you were thinking about?

enisoc avatar Jan 29 '19 20:01 enisoc

That could maybe work; I can see that being useful if your workers are already seeded and running. I can also see that back-flow style being a general use case.

In our case the children won't be created until there is at least one claim. Ultimately what we want is to ensure that there is one child if there are one or more parents, and to clean up the children if there are no matching parents. Even something as simple as returning all parents and children that match selector 'x' would be sufficient. It's basically an "ensure 1" use case.
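The "ensure 1" rule can be sketched as a small pure function. This is only an illustration under stated assumptions, not Metacontroller's API: `matches` and `ensure_one` are hypothetical helper names, the selector is assumed to be a simple label equality map, and objects are assumed to be plain dicts shaped like Kubernetes manifests.

```python
def matches(obj, selector):
    """Return True if every key/value pair in `selector` appears in the object's labels."""
    labels = obj.get("metadata", {}).get("labels", {})
    return all(labels.get(key) == value for key, value in selector.items())


def ensure_one(parents, children, selector):
    """Decide how to keep a child present only while a matching parent exists.

    Returns "create", "delete", or "keep" (hypothetical action names).
    """
    has_parent = any(matches(p, selector) for p in parents)
    has_child = any(matches(c, selector) for c in children)
    if has_parent and not has_child:
        return "create"   # first matching claim appeared
    if has_child and not has_parent:
        return "delete"   # last matching claim went away
    return "keep"         # already in the desired state
```

The sketch does not address the race mentioned in this comment; it only shows the decision that would have to be made against a consistent view of parents and children.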

I think the composite controller might handle this, so long as I set up the child match correctly and then use finalizers to tear it all down. However I'm worried that it might be a bit racy.

JonathanFraser avatar Jan 29 '19 21:01 JonathanFraser

Hm, let me see if I can rephrase it to make sure I understand:

It sounds like each claim asks for a worker of a particular type, and what you want is to automatically create a single worker of each type as needed, and delete the worker when it's no longer needed. Is that right?

If so, the main blocker I see is that Metacontroller currently only supports watching children that you own. Since Graph creates the Claims, it should own them, but that means something like "WorkerAutoscaler" would ignore those Claims since it sees that someone else owns them already. I explored some patterns for watching children that you don't own in a design proposal, but I never got the concepts to fit together cleanly enough to feel like it's ready to implement. We discussed some possible alternatives in #98.

enisoc avatar Jan 29 '19 22:01 enisoc

Yeah, that sounds about right. So the graphs create the claims and own them. That's easy enough: it's the fan-out pattern of the composite controller. Then the question is what to do with the claims. My first thought is to have a controller watch the claims and, for each unique type of claim, ensure a worker deployment for that type is running. The workers would be responsible for auto-scaling themselves based on queue size, but that's a separate detail. I can set up a composite controller where I use a type selector to pick the worker children for the claim, but it runs into the problem that two different claims attempt to select and create the same worker child. A decorator doesn't quite work because it requires the workers to already exist before I can attach them to a claim.
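To make the "one worker deployment per unique claim type" idea concrete, here is a minimal sketch of the computation such a controller would perform when handed the full set of claims. The claim field `spec.workerType`, the `worker-type` label, the `worker-<type>` naming scheme, and the container image are all assumptions made up for illustration; only the `apps/v1` Deployment shape is standard Kubernetes.

```python
def desired_workers(claims):
    """Return one Deployment manifest per distinct claim type.

    Assumes each claim carries its type in spec.workerType (hypothetical field).
    """
    worker_types = sorted({claim["spec"]["workerType"] for claim in claims})
    return [
        {
            "apiVersion": "apps/v1",
            "kind": "Deployment",
            "metadata": {"name": f"worker-{t}", "labels": {"worker-type": t}},
            "spec": {
                "replicas": 1,
                "selector": {"matchLabels": {"worker-type": t}},
                "template": {
                    "metadata": {"labels": {"worker-type": t}},
                    "spec": {
                        "containers": [
                            # Placeholder image name, not a real registry path.
                            {"name": "worker", "image": f"example.invalid/worker-{t}"}
                        ]
                    },
                },
            },
        }
        for t in worker_types
    ]
```

Because the output is derived from the set of types rather than from each claim individually, two claims of the same type collapse into a single desired worker. The sketch assumes the caller can see every claim at once, which is exactly the part the ownership rules discussed in this thread make difficult.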

I might be able to work around this by having a WorkerManager object and using a decorator to attach the claims to it, then having a composite controller deploy workers based on the manager. Though something about having a singleton seems like a bit of a code smell.

JonathanFraser avatar Jan 29 '19 23:01 JonathanFraser

Hmm, never mind, I seem to have misunderstood the decorator. It seems both controllers require every child to have exactly one parent, and only that parent can control it.

JonathanFraser avatar Jan 31 '19 18:01 JonathanFraser

FYI if we push forward on a proposal like #98 (prototyping possible API in #142), I think it would make it easy to solve this problem. You could define a composite controller that watches claims that match a certain label selector, and then emits 0 or 1 desired child workers as appropriate.

enisoc avatar Feb 01 '19 18:02 enisoc

I can see getting all objects that match a selector, but how would you get set-like behaviour with a regular selector so that they are batched in the backend? Let's say I have claims of types foo, frak and bar, with 10 of each. Ideally I would get one callback from Metacontroller for each of the subtypes, but AFAICT I would need to forward-declare each of those types and define an object for each. There isn't a way via selectors to say "batch on this label". While not strictly necessary, that would allow for the greatest use of the Metacontroller cache in this scenario.

JonathanFraser avatar Feb 01 '19 19:02 JonathanFraser

Ah I think I see what you mean. So, at least if #98 happens, this use case will become feasible: you could have a controller that watches all claims as related objects, and creates various types of workers as children. That's not even possible today because of the "one parent rule" as you described above.

However, it might be wasteful of resources if you have many different types of claims, because that one hook would be called with all claims whenever any claim changes.

It's certainly possible that we could define "batching" behavior (essentially GROUP BY in relational terms). It would work and be efficient and elegant (in a mathematical sense), but the risk is that the API becomes complex and esoteric.
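As a rough sketch of what GROUP BY style batching would mean, the function below partitions a list of objects by the value of one label, so that each batch could be handed to a separate hook call. Nothing here exists in Metacontroller: `group_by_label` and the `worker-type` label key are hypothetical, and objects are assumed to be plain dicts shaped like Kubernetes manifests.

```python
from collections import defaultdict


def group_by_label(objects, key):
    """Partition objects into batches keyed by the value of label `key`.

    Objects that do not carry the label are left out, mirroring an
    "Exists" style match followed by grouping on the value.
    """
    batches = defaultdict(list)
    for obj in objects:
        labels = obj.get("metadata", {}).get("labels", {})
        if key in labels:
            batches[labels[key]].append(obj)
    return dict(batches)
```

With 10 claims each of types foo, frak and bar, this would yield three batches of 10, and a change to one foo claim would only require re-running the hook for the foo batch.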

enisoc avatar Feb 01 '19 19:02 enisoc

Yeah, I can see that. A simple shift from:

    annotationSelector:
      matchExpressions:
      - {key: service-per-pod-label, operator: Exists}

to

    annotationSelector:
      matchExpressions:
      - {key: service-per-pod-label, operator: Group}

would probably do the trick, but I suspect you are passing that directly to Kubernetes and not parsing it at all.

JonathanFraser avatar Feb 01 '19 19:02 JonathanFraser