go-patch
go-patch copied to clipboard
Copy/Move operation
There are some pretty valuable use cases for have a move/copy functionality, primarily when you want to re-arrange nested objects.
For example, suppose you have a yaml file like a BOSH manifest with the following structure:
instance_groups:
- name: api
jobs:
- name: api-server
properties:
db_string: mysql://mysql.example.com:3306
- name: api-worker
properties:
db_string: mysql://mysql.example.com:3306
<lots and lots of configuration...>
In BOSH land, if you wanted to scale the workers without scaling the servers, you'd have place the api-worker
job in a different instance_group and then scale that separately. That would mean you'd want to end up with the following yaml:
instance_groups:
- name: api
jobs:
- name: api-server
properties:
db_string: mysql://mysql.example.com:3306
- name: worker
jobs:
- name: api-worker
properties:
db_string: mysql://mysql.example.com:3306
<lots and lots of configuration...>
It should be easy, except that your operation definition would need to know the internal data of the yaml you're operating on:
- type: replace
path: /instance_groups/-
value:
name: worker
jobs:
- name: api-worker
properties:
db_string: mysql://mysql.example.com:3306
<lots and lots of configuration...>
You have to copy all of the data that already exists in the yaml being operated on. With a move
functionality, on the other hand, you'd be able to write something like the following operation instead:
- type: replace
path: /instance_groups/-
value:
name: worker
jobs: ~
- type: move
from: /instance_groups/name=api/jobs/name=worker
path: /instance_groups/name=worker/jobs/-
Or, if you're super cool:
- type: move
from: /instance_groups/name=api/jobs/name=worker
path: /instance_groups?/name=worker/jobs/-
No knowledge of the internals of api-worker
required. A copy
could function the same way.
Would this be a reasonable interface for move
/copy
? Is something that could get added? Asking for a friend.
@dsabeti move/copy are not implemented at this point on purpose. if viewed as a generic library, this proposed functionality makes a lot of sense; however, one of my goals though is to expose missing abstractions in bosh's deployment configuration. there are several scenarios that possibly need to be discussed further to see if they require bosh core changes:
-
decoupling of jobs & instance groups (to allow further rearrangement of jobs; eg instance group one is jobs [a,b,c] and instance group two is [x,y,z])
-
decoupling intent of colocation (to allow placement of one job next to another even if its location changes; eg metron_agent everywhere within deployment, cc_clock next to cc_worker)
Make sense. I'd actually prefer a BOSH primitive for changing colocation strategies. Do you know how far out that work is? I think we were mostly thinking about this feature as a stop-gap until "topologies" or "roles" are implemented in BOSH.
We had another break today due to the duplication the current constraints require.
We have opsfiles that add instance groups that need configuration common to other jobs, but can't copy the necessary configuration in from anywhere, so have to duplicate it. Specifically, we had changes to how we accessed name variables that didn't make it into the ops files.
For something like, say, every metron agent needing mutual tls stuff to talk to loggregator, it would be really nice to be able to copy this stuff until it's made available via link.
I know that links are the correct solution. In the mean time, we're feeling pain from incorrectness we're not positioned to do much about.
@cppforlife why isn't this being viewed as a generic library? I'd love to see go patch interpolation added as a feature to fly CLI, for example, and this library being generic as possible should facilitate that.
@cppforlife: what's the status of this feature request? I see that it's marked with v2
, so I have hope that it'll happen one day. For my team, it would be useful for configuring diego isolation segments, concourse tagged workers, etc., where we want to copy the configuration of a base diego cell, concourse worker, etc., and apply a few changes on top. Or is there a cleaner way to handle that use case?
@jmcarp one of the unanswered questions for copy op is once content is copied, how does one reference new content or original content (aside from some kind of numeric index) since both content pieces will be same?
Is that really a problem though? They'll have unique paths, right? Since one must copy to an addressable location?
@cppforlife Better support for changing colocation strategies would be super useful. We're trying to start working on a small-footprint deployment manifest for CF, but we're not sure how we'll keep properties in sync between our main manifest and our small-footprint manifest, so the feature request is becoming more timely.
@cppforlife: I see what you mean. If we have a manifest like this:
instance_groups:
- name: foo
and an operation like this:
- type: copy
from: /instance_groups/name=foo
path: /instance_groups/-
then we'll get a manifest like this:
instance_groups:
- name: foo
- name: foo
and won't be able to distinguish the two identical instance groups, except by index. We could always immediately modify the last instance group:
- type: copy
from: /instance_groups/name=foo
path: /instance_groups/-
- type: replace
path: /instance_groups/-1/name
value: bar
but that feels like a hack. What if we allowed key-value pairs from the path to override values in the copied object, like this:
- type: copy
from: /instance_groups/name=foo
path: /instance_groups?/name=bar
so that we would copy instance group foo
, but override its name
to bar
. That would give the original and copied objects different key-value paths. Would that make sense?
You could also work around this issue by inserting the new instance group name with a replace, and then copying its properties, etc, into place with copy operations.
The "make another instance group, but with a different name and one or two changed properties" use case suffers from this, though it has workarounds. The "copy a job from its usual location to a colocation instance group and then delete the original to reduce footprint" use case doesn't run into this at all.
More brainstorming syntax...
A new type
which changes context and whose value are ops to be applied to it.
- path: /instance_groups/name=foo
type: apply # change context
value:
- path: /
type: copy
value: //instance_groups/- # xquery-style double slash for root
- path: /name
type: replace
value: bar
A new ops
field [only on copy
?] which supports relative paths.
- path: /instance_groups/name=foo
type: copy
value: /instance_groups/-
ops:
- path: name
type: replace
value: bar
@dpb587-pivotal: I like the second option that you proposed--it's like the syntax I was thinking of, but more explicit. @cppforlife: wdyt? I'd be happy to write a patch if you think the interface makes sense.
The post-copy local operations are elegant. I would only suggest that the sub-key for defining them is named in such way that clearly states when the sub-operations are going to happen.
Indeed, writing this would be perfectly legal, but hard to catch:
- ops: # it's not obvious here that this is happening after the copy below has taken place
- path: name
type: replace
value: bar
path: /instance_groups/name=foo
type: copy
value: /instance_groups/-
Something like post-ops
, after
or then
(which reads nicely) would fit there:
- path: /instance_groups/name=foo
type: copy
value: /instance_groups/-
then:
- path: /name # root context is set to the root of the copied node
type: replace
value: bar
Whereas post-ops
and after
would allow to later introduce pre-ops
or before
, if at some point such concept turns out to be relevant for some operations.
As far as path is concerned, I would recommend for consistency that we don't introduce any new kind of path expressions, and just state that sub-operations are local to the targeted YAML node.