playground
playground copied to clipboard
Remove internal registry
We are not sure if the internal registry is something we want or if it is just a complication.
We enabled #41. So now the workers have access to the internet. It means that all the registry setup and the certificate to have it secure is not strictly required anymore.
BUT we will have to figure out by ourselves how to get our actions to run in the worker. This is not a problem when the action is public, but for private repositories, it is a bit more complicated and the operating system installation environment (osie or tinkie) will have to give us a way to inject authentication (or we have to make tink-worker good enough to authenticate I suppose)
Yep private repo/images is something we should think about but I think the way to do it is to force it? I'm in favor of dropping the registry from sandbox.
I think the way to do it is to force it?
What do you mean?
Drop the registry so that we can have a discussion about this. Force the discussion.
I agree with @mmlb, just drop it for now from the sandbox. It is easy to add again later and most people/orgs that need a private repo already have a policy and implementation or know how to set it up.... Just my 2 cents... :-)
need a private repo already have a policy and implementation or know how to set it up.... Just my 2 cents... :-)
Yeah, the problem is not how you set it up for yourself, but how you can use it in the installation environment if it has passwords or things like that. Anyway, I am always up for removing things 👍 I will leave time for more people to leave a comment, and at some point, I will move forward accordingly
Yeah, the problem is not how you set it up for yourself, but how you can use it in the installation environment if it has passwords or things like that
If you are running a private repo you always have that problem... :-( But this problem is also related to the encryption of the metadata path from provisioner to local executor. Thats something we need to make easy and save so all secrets can be transported in a save way.
I think I'd like to keep it for now for consistency with k8s-sandbox. @detiber how do you push freshly built dev images to k8s with tilt? Is the registry required/useful?
Beware i use a private repo registry for Docker images with the Ansible Playbook ...
@fransvanberckel can you tell me a bit more?! Do you rely on the private registry that sandbox provides or you just provision a registry as part of toolchain ansible-role-tinkerbell ships?
If it's the second no problem, you can keep shipping it. If it's the first removing it from the sandbox will require some work for the ansible role.
@gianarb That's right, it's more or less the second. For more details, take a look at the role ... https://github.com/fransvanberckel/ansible-role-tinkerbell/tree/master/roles/tinkerbell
I think removing it is a good idea, because it simplifies things. If you seek to do private images or work in an airgapped environment, there are other ways to do it, like a private Docker registry with credentials or pre-pulling images.
I have started to work at this and the problem is in OSIE (and tinkie @thebsdbox ).
OSIE: https://github.com/tinkerbell/osie/blob/master/apps/workflow-helper.sh#L81 Tinkie: https://github.com/gianarb/tinkie/blob/master/bootkit/main.go#L64
We use the internal registry as a way to select the version of tink-worker we want to run as part of the installation environment.
Currently, it works this way:
- Sandbox has a file called
./current_versions.sh
it contains the publicly available tink-worker image - It gets moved to
.env
as part of the./generate-envrc.sh
script - setup.sh pick it up and mirror it to the private registry always tagged as latest
- In this way the various OSIEs implementation can pull and run it.
We can make it more general and we can write a new RULE to OSIE:
"OSIE use a expect a kernel command line called tink-worker-image
or whatever and it uses its content to run the tink-worker. In this way, we can even override it via metadata"
if we all agree I can open PR against osie and tinkie and sandbox to have all of them inline
(At some point we should write OSIE RULES as documentation, but we don' know them all yet! :P )
We can easily do this now, default to lates and override in the facilities
metadata!
"facility": {
"facility_code": "onprem tink-worker-tag=abc123",
How does that sound?
@thebsdbox I am not too much into the details, but wouldn't it be better to create a new field in the facility object, because this could break applications, that use the facility_code
and build on top of the metadata.
Maybe something like facility.cmdline
?
ah so this is an undocumented thing :| things in the facility code are passed to the cmdline. They're then presented as environment variables or in /proc/cmdline
inside of OSIE and we can process them anyway we like. It won't break any existing functionality.
Yep @thebsdbox I love that. Let's see what @mmlb says and I can proceed when we agree
@nicklasfrahm yeah it is unfriendly and undocumented :) I think at some point we will figure out a better metadata structure for all of that. I think something is part of a proposal @mmlb started but not totally sure https://github.com/tinkerbell/proposals/pull/25/files
Yep I think this should be done in a metadata fashion but not like that @thebsdbox, boots already supports this in "cacher mode", we have "custom services version" that we use to boot different versions of osie (https://github.com/tinkerbell/boots/blob/master/packet/models_cacher.go#L319). Custom services is meant for this in the legacy EM stack and we should probably make use of that in tinkerbell too I think. I don't like the idea of injecting it through the facility
param that is just layering bad UX on top of the bad stuff already :D.
I don't think this should be part of the Instance proposal @gianarb thats the wrong layer to do this I think.
The other option is to just pin it in the osie "package". Which is kind of the other side of doing things. I'm not too keen on this actually.
There are some runtime params that we should have a way to pass down to tink-worker. Its version seems like one, centralized logging config is another. I think maybe hardware.metadata.services
is probably a better place? We can do something like:
{
"tink-worker": {
"log": {
"driver": "..."
},
"tag": "some-tag",
"repo": [
{ "url":"...", "user":"someuser", "password":"s3cr3t"}
]
}
}
edited to add:
and boots adds them to command line with some transformation (maybe gron
like)? some base64 json is also an option but I don't quite like this :D. That does seem like it can get a bit out of hand and we could end up running up against the kernel's cmdline limit. Maybe we just embed grpccurl or similar binary fetch the services data from hegel?
Let's summarise:
- One option is to build the operating system installation environment with the right binary/container. I think we do not have to discuss this scenario because it requires very low coordination. Whoever wants to do it, can do it I understand the benefit (self-contained) but I would like to offer an easy to swap mechanism because I see the benefit of swapping the version from the outside without having to recompile e distribute an OS.
- I proposed
cmdline
because right now it is the best structure and well-known way we have to inject something but I understand what you are saying @mmlb and yes we can get that information from the metadata,tinkie
can extend bootkit to fetch values from there, for osie you have to tell me. Metadata are so unstructured and obscure that I try to avoid that as much as I can xD
Let's evaluate option 2. When you describe services do you mean this proposal https://github.com/tinkerbell/proposals/blob/master/proposals/0014/README.md ? Probably not
At this point we have to figure out a structure and where we want to place that configuration (not in cmdline)
Hmm yeah we can go with option 2 cmdline
, maybe actually (append_cmdline
?). We already have the osie section in
network.interfaces[].netboot.osie | OSIE details
where we can add append_cmdline
. https://docs.tinkerbell.org/hardware-data/
I am confused, you say that cmdline were a bad idea right? :D
The services proposal (workflow services actually) is not what I meant when I mentioned services... naming x)
cmdline is definitely an option, but feels hacky (though its my favorite of the hacks). Having a proper way to get some hardware data from tink-server (through hegel probably?) is I think the best option, but not as quick :D
Ok, we (@mmlb) moved our conversation over prv Slack because here it was too hard and we find a common proposal!
We are gonna teach Hegel a new endpoint /worker
that will expose the content of hardware.metadata.worker
. That field specifies how the worker that OSIE will start looks like.
The format is not yet defied but ideally something like:
{
"worker": {
"log": {
"driver": "..."
},
"image": "quay.io/tinkerbell/tink-worker:something",
....
}
}
Unknown: tink-worker today relays on environment variables and I am not sure how and if we can make it work with this proposal.