nulecule
nulecule copied to clipboard
[epic] Rework params to support xpath-like replacing and non-scalar values
Problem statement
Currently params work with simple substitution. You need to specify the param name prefixed with $
in the artifact and it's then replaced with the value given either by default, answers file or by the user in case of interactive mode. This is a problem because a developer of nulecule application has to modify (upstream) kube files and replace real values with something like $image
. This brings debuging problems right now (you cannot use artifacts shipped in nulecule app directly) and will bring a lot of maintainance problems in the future (you need to mirror all changes in upstream artifacts in your modified files).
Based on the description in previous paragraph it is obvious (almost) all artifacts has to be modified from upstream and thus maintained directly in the nulecule app repo - i.e. it does not make much sense to reference artifacts by remote URL (https://...) and deffer their download to the point of installation.
Also there already were some discussions about supporting non-scalar types, thus part of this proposal is addition of type
key to the params definition.
Proposal
JSON Pointer
Based on an idea of @markllama we came to the conclusion we need to be able to replace param values in artifacts not based on the placeholder, but based on the position in the json/yaml object. This will enable developers of nulecule applications to exactly specify what should be replaced without a need to modify the artifacts.
This concept is described in a JSON Pointer (https://tools.ietf.org/html/rfc6901)
For this kubernetes pod definition
{
"apiVersion": "v1beta3",
"kind": "Pod",
"metadata": {
"labels": {
"app": "helloapache"
},
"name": "helloapache"
},
"spec": {
"containers": [
{
"image": "centos/httpd",
"name": "helloapache",
"ports": [
{
"containerPort": 80,
"hostPort": 8080,
"protocol": "TCP"
}
]
}
]
}
}
The param image
should look like
params:
- name: image
pointer: [/spec/containers/0/image]
description: The webserver image
default: centos/httpd
As you can see, pointer
is defined as an array which means you can specify more places in an artifact(s) where the param should be used. Also the default
is optional and if not specified (i.e. if the key default
is omitted) the actual value in artifact can be used. Existing key default
set to null
means that a user needs to provide the value in answers.
Although using the pointer
will be strongly recommended, current substitution of $
prefixed variables will continue to work as there might be some use cases for it.
To preserve the simplicity of answers file, params will be refferenced by name there. On the other hand extension to support JSON Pointer notaition as keys in answers file should be simple (and would potentially enable user to replace not only the given params, but, for debugging purposes, any value in any artifact)
Problems
-
Same JSON Pointer path might exist in multiple artifacts although user might want to change the value only in a single specific artifact
- This might trigger a discussion about adding artifacts specific params again.
-
Exact location specified by JSON Pointer might lead to failing runs in case of unexpected changes in artifact. f.e.
value
of environment variablePASS
specified in kubernetes pod definition:... "containers": [ { "name": "mariadb", "image": "$image", "env": [ {"name": "USER", "value": "$db_user"}, {"name": "PASS", "value": "$db_pass"}, {"name": "NAME", "value": "$db_name"} ], "cpu": 1, ...
would have path
.../containers/env/1/value
. If upstream decides to add environment variableFOO
above thePASS
variable, the path toPASS
changes to.../containers/env/2/value
(note the change1 -> 2
).- This is probably a smaller issue then current state of mirroring all artifacts as a part of the app
Allow URL and git description for artifacts
Although specification does not block anybody from specifying remote location, current implementation and the way how we substitute params basically supports only artifacts stored locally. As a part of this proposal we should document ability of the spec to accept remote locations for artifacts - i.e.
artifacts:
- kubernetes:
- https://raw.githubusercontent.com/projectatomic/nulecule/master/examples/skydns-atomicapp/artifacts/kubernetes/skydns-replicationcontroller.yaml
- ftp://example.com/nulecule/my-app/artifact.json
- file:artifacts/artifact2.json
- ...
I'd also like to propose a special type of artifact which would be specified by object in a format similar to:
artifacts:
- kubernetes:
- git_url: https://github.com/projectatomic/nulecule/
checkout: fda3fb8c48a7f8e6b73f70f55035c21fa22d7941
path: examples/skydns-atomicapp/
Where
-
git_url: is mandatory and represents a string which can be used in
git clone
command -
checkout: is optional and represents a string which can be used in
git checkout
command -
path: is optional and represent a string which is
- a directory and can be used in
cd
command (all files found in that directory are then expected to be artifacts) - a file and can be read
- a directory and can be used in
Artifacts can be then specified by a combination by URL like strings and above specified objects describing contents of git repository.
Problems
The only problem I can think of regarding this change is potentially conflicting file names. This is not really a spec problem, but mostly an implementation problem thus at least some naming suggestions might be provided to developers.
Add type
to the params definition
Currently we use INI
format to store answers.conf(.sample)
. Although this is nice from user experience point of view, it blocks us from using non-scalar values for params as is.
Let's assume we have an application which expects a list of IP addresses as a parameter. With current params structure and INI
file it's not possible to parse and pass such value to the Nulecule app without assumptions being done about the param - i.e. mathing the value to some regular expressions or something similar. In case of adding a type
key to params definition we would be able to recognize the value type easily and by parsing it also validate it against the given type
.
For yaml
and json
types are implicit, for INI
we would need to introduce some conventions - f.e.:
nulecule:
params:
- name: public_ips
pointer: /spec/containers/0/public_ip
type: array
default:
- 10.0.0.3
- 10.0.0.4
-----
answers.conf:
[my-app]
public_ips = '["192.168.1.2", "192.168.1.3"]'
The type
key might be optional and default to current behaviour - i.e. type string
.
Problems
This would obviously bring more complexity to both implementation of the spec and creation of nulecule apps.
Related https://github.com/projectatomic/nulecule/issues/64
do you guys think we should put that in the Summit release? I think so...
I think so too
If possible. I sure would like it as soon as is reasonable.
On Wed, May 20, 2015 at 2:28 PM, Vaclav Pavlin [email protected] wrote:
I think so too
— Reply to this email directly or view it on GitHub https://github.com/projectatomic/nulecule/issues/70#issuecomment-103987085 .
Mark Lamourine [email protected] Dad, Hubbie, Software Developer, System Administrator, Road Cyclist
Just wondering - why does the artifacts
object contain an array instead of another object? shouldn't it be like this?
artifacts:
kubernetes:
- https://raw.githubusercontent.com/projectatomic/nulecule/master/examples/skydns-atomicapp/artifacts/kubernetes/skydns-replicationcontroller.yaml
- ...
I am wondering the very same thing atm as I was looking at rework of atomicapp to comply with 0.0.2 - we changed components and params to be a list to get rid of magic keys..shouldn't we do the same for providers? Although that would have to be
artifacts:
- provider: kubernetes
content:
- file:...
@aweiteka @goern Are you ok with using provider names as keys here? Or should we also change it to a list?
using names as keys breaks schema validation, doesn't it?
array of things with one attribute "name: foo" or .... .not sure
On Thu, May 21, 2015 at 12:49 PM, Vaclav Pavlin [email protected] wrote:
I am wondering the very same thing atm as I was looking at rework of atomicapp to comply with 0.0.2 - we changed components and params to be a list to get rid of magic keys..shouldn't we do the same for providers? Although that would have to be
artifacts:
- provider: kubernetes content:
- file:...
@aweiteka https://github.com/aweiteka @goern https://github.com/goern Are you ok with using provider names as keys here? Or should we also change it to a list?
— Reply to this email directly or view it on GitHub https://github.com/projectatomic/nulecule/issues/70#issuecomment-104352656 .
Mark Lamourine [email protected] Dad, Hubbie, Software Developer, System Administrator, Road Cyclist
ja, there shall be no magic key as of #35
Python implementation of json pointers - https://pypi.python.org/pypi/jsonpointer
since rfc6901 is a fragment identifier specification, I would really suggest for standard and potential future uses to use the syntax # for specifying the path: pointer: #/spec/containers/0/public_ip
This means in URI terms, take the current document and extract the following fragment identifier. Potentially we could use this to expand by fetching from other resources via URI-References
This has the potential to break multi-provider support. Perhaps we should consider a two-level process. The Nulecule file remains a key/value structure like today. Each provider needs to provide a transformation matrix that translates the keys from the Nulecule file to their pointers (or other method) for the specific provider. It creates more work for application providers, but retains easy use by end-users with single and multiple providers.
This means we may need to define built-in transformers (probably xslt or something similar for json, and $XX substitution for docker) and the ability to load an external transformer from a container for more esoteric types.
@bexelbie I think you nailed it.
- We can use provider artifacts unaltered
- It addresses the provider-specific params issue #109
- It allows for huge flexibility in using params while keeping the developer cost low
- It provides a way to map unstructure provider files such as docker using the
$<var>
substitution method - Parameter re-use
Fleshing this out from our conversation...
Given Nulecule file
...
graph:
- name: my-app
params:
- name: foo
description: My param
default: bar
- name: bee
description: Your param
default: buzz
artifacts:
kubernetes:
- file://artifacts/kubernetes/template.json
docker:
- file://artifacts/docker/template.run
...
artifacts/kubernetes/params.yaml
maps the Nulecule param names to the provider file path.
foo: #/spec/template/metadata/name
bee: #/spec/some/other/json/pointer/path
artifacts/docker/params.yaml
maps the Nulecule param names to the provider file using $<var>
substitution. The bee
param is not used for this provider since there is no mapping.
foo: $value
File artifacts/docker/template.run
might look like this:
docker run -e FOO=$value ...
Above I proposed putting the provider param translation in another file. This is problematic since we would probably need to agree on a naming convention and it abstracts the parameters making it difficult to develop since the params are referenced in two different files.
A more compact way would be to refactor the artifacts object so it has more structure than just a list of files. This would allow us to embed the params mapping into the artifacts object. Something like this:
...
artifacts:
kubernetes:
sources:
- file://artifacts/kubernetes/template.json
params:
- foo: #/spec/template/metadata/name
- bee: #/spec/some/other/json/pointer/path
docker:
sources:
- file://artifacts/docker/template.run
params:
- foo: $value
...
cc @veillard
Seeing sources: - file://artifacts/kubernetes/template.json params: - foo: #/spec/template/metadata/name
urges me to simplify it to params: - foo: file://artifacts/kubernetes/template.json#/spec/template/metadata/name
And you have something which is a proper URI
Daniel
urges me to simplify it to params:
- foo: file://artifacts/kubernetes/template.json#/spec/template/metadata/name
That does seem appealing and I think we should support that so params can be targeted to an exact thing using fully qualified URI. The trouble I see with this is...
-
sources
is a list of artifacts to deploy. - one may have many params per source file potentially resulting in a lot of redundancy in specifying the file.
- some sources are in git remotes. See artifacts object. We would need to think through how these are referenced.
One of the use cases I'm considering is there may be multiple places to reference a single param value. This is not possible with the list of key:value pairs suggested here. A way to approach this could be to us a list of pointers under each param name key. For example...
...
artifacts:
kubernetes:
sources:
- file://artifacts/kubernetes/template.json
params:
db_pass:
- #/spec/template/metadata/database/passwd
- #/spec/template/metadata/frontend/database_passwd
foo:
- #/spec/some/other/json/pointer/path
docker:
sources:
- file://artifacts/docker/template.run
params:
db_pass:
- $value
...
One nasty thing is that referencing local files is always a bit of a mess, file:// scheme works fine for absolute patch but in practice it's rarely where the files sits. The RFC for file:// is (was) a big mess for anything not an absolute path, so you end up forgetting the scheme and doing things like
../config/templates.json which is an URI references to the base which is the URI of the current file. and then you hit Windows with C:\MyStuff\myapp.json for the base and all bets are off :-
So that 'Let's use URIs' is great up to a point for referencing files. Where it would shine is very dynamic
environment where app templates are just kept online and parameters are fetched from databases
or from the config parameter exported by existing applications or run-times.
Since we need to keep $ based substitution anyway, we could just make source optional for pointers. That's the equivalent of a base (html or XML) and a commonly accepted concept in URI world :-)
@veillard I am not sure f I understand what you are proposing/suggesting in your comment - file://
refs are bad, but not using them is worse...so what is the solution?:-)
no solution, just a warning. for example file:/../foo vs. file:///../foo will get probably more confusion and errors than simply accepting ../foo for relative file paths
Hum, params should really be associated to a given source. If you start giving multiple sources in the current suggested syntax, how do we know on which source the param should be evaluated ? single source and associated params need to be bound together. Either we do:
params:
foo: artifacts/kubernetes/template.json#/spec/template/metadata/foo
bar: artifacts/kubernetes/template.json#/spec/template/metadata/bar
and then we need to check the resource is in the provided list of sources, or when we define the sources we do a liste like
artifacts:
kubernetes:
sources:
- resource: artifacts/kubernetes/template.json
params:
- foo:
- /spec/template/metadata/foo
- bar:
- /spec/template/metadata/bar
- resource: artifacts/kubernetes/extra.json
@veillard You are right, @aweiteka's proposal will not work in case pointers are not same for all artifacts in a provider..so it seems we will have to go your way - params mapping will need to be specified for each artifact (i.e. source)
Maybe don't use array for params as the keys are not really "magic keys" we want to avoid but has to exist in component.params
section
artifacts:
kubernetes:
- resource: artifacts/kubernetes/template.json
params:
foo:
- /spec/template/metadata/foo
bar:
- /spec/template/metadata/bar
- resource: artifacts/kubernetes/extra.json
Yeah that works
artifacts/kubernetes/template.json#/spec/template/metadata/foo
is more web like but it can get repetitive and we already need to provide the URI for the resource so ACK on vavlin suggested syntax
We need to update the spec.
To be honest, I haven't had much experience in regards to spec writing.
@aweiteka @bexelbie Where in particular should I be adding the changes of xpathing into the spec?
@charliedrage I think the only change is to the params object: https://github.com/projectatomic/nulecule/blob/master/spec/README.md#paramsObject and https://github.com/projectatomic/nulecule/blob/master/spec/param.json
Awesome, I'll send in a PR :) thanks @aweiteka !