Get rid of docker pause containers with a custom runtime. Closes #15086
Hi @tgross, I see you added a needs-rebase label. I'd happily rebase and work with you on that if there is any consensus on how to move this forward. Personally I think it would be a massive stability win for nomad.
Ah yeah... I marked this (and all other open PRs at the time) as needs-rebase just because of a CI change around backports and our new LTS workflow that wouldn't work on any open PR that wasn't rebased on main after those changes landed. Sorry, I should have posted a note too. :grinning:
As far as this proposal goes, I really like the idea of dropping the pause containers but I'm still fuzzy on whether this particular implementation is viable (especially with having to deal with the group-level networks). There's not really enough for me to go on here and unfortunately I haven't had time to dig in further to make sure we understand all the implications of the design.
@tgross Any chance that the team looks at this and we end up with some sort of plan? Even if this is not supported for nvidia runtimes, it would be a massive win for everyone else.
As I mentioned above, I'm still a little fuzzy as to the viability of this plan. The PR doesn't have a working implementation in place, so it's hard to reason about without going thru it from scratch. Unfortunately we haven't had time to do so.
Mkay, combined with https://github.com/hashicorp/nomad/issues/15086#issuecomment-1954863893 it is a working implementation (or at least it was the last time I checked). I'd rather not put work into it if there is no chance of getting this in, so I would also like to keep the python script for now and not rewrite it in Go :)
@tgross this PR now contains a fully working implementation. Start nomad with this config:
plugin "docker" {
  config {
    new_networking = true
  }
}
and configure docker like this (/etc/docker/daemon.json):
{
  "runtimes": {"nomad": {"path": "/home/florian/sources/nomad/bin/nomad", "runtimeArgs": ["runc"]}}
}
Adjust the path to the runtime executable as needed. Let me know what you think!
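For anyone whose /etc/docker/daemon.json already has other settings, here is a minimal sketch of merging the runtime entry in rather than overwriting the file. The helper name `add_runtime` and the pre-existing `log-driver` setting are assumptions made up for illustration; only the `runtimes` entry itself comes from the comment above.

```python
import json


def add_runtime(daemon_config, name, path, runtime_args):
    """Return a copy of daemon_config with one extra entry under "runtimes".

    Hypothetical helper for illustration; it is not part of nomad or docker.
    """
    updated = dict(daemon_config)
    # Copy the nested dict too, so the caller's config is left untouched.
    runtimes = dict(updated.get("runtimes", {}))
    runtimes[name] = {"path": path, "runtimeArgs": list(runtime_args)}
    updated["runtimes"] = runtimes
    return updated


# Assumed pre-existing settings; replace with the parsed contents of your file.
existing = {"log-driver": "json-file"}
merged = add_runtime(
    existing,
    "nomad",
    "/home/florian/sources/nomad/bin/nomad",  # adjust to your nomad binary
    ["runc"],
)
print(json.dumps(merged, indent=2))
```

The result still has to be written back to daemon.json, and the docker daemon restarted, for the new runtime to be picked up.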
Thanks @apollo13! I'll take a detailed look thru in the next couple days.
@tgross Thanks, no rush though. I probably cannot finish it anyways this year. That said I think it would be really valuable to have and maybe it can act as a starting point. Don't hesitate to ask if anything is unclear though!
If you can't get back to it because of outside forces (totally understandable), just let us know and we'll look into carrying the PR forward (with credit to you, of course!).
Will see what I can do, but it will most likely have to be in my free time, since at work we are probably moving to k8s after the recent license changes and the general move to hide more and more behind the enterprise license (and other stuff, like good CNI plugins with network policy basically not existing for nomad). Would love you folks to talk me out of it though :þ
What timeframe are you thinking about for getting that in? If you want it this year, it might be easier if you continue it. If you are not in a rush, then I might be able to get something done, no promises.
No real rush. As you might imagine things slow down a bit going into the end of the year anyways. Our next major release 1.10 LTS isn't until the spring, but this seems like the sort of thing we could land in a minor release no problem.
Found some time today to clean up the PR, made the cmd hidden and renamed it to runcshim. Now you also need to pass the "next runc" binary via runtimeArgs in the docker daemon.json:
{
  "runtimes": {"nomad": {"path": "/home/florian/sources/nomad/bin/nomad", "runtimeArgs": ["runcshim", "runc"]}}
}
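To make the role of the two runtimeArgs concrete, here is a rough sketch of how the command line could be taken apart on the shim side. It assumes docker places the runtimeArgs ahead of its usual runc arguments; the function name `split_shim_argv` and the example `create` invocation are illustrative only, and this is not the Go code from the PR.

```python
def split_shim_argv(argv):
    """Split an argv of the form [binary, subcommand, next_runtime, *runc_args].

    Hypothetical helper: it only illustrates the argument layout implied by
    "runtimeArgs": ["runcshim", "runc"], not the PR's actual implementation.
    """
    if len(argv) < 3:
        raise ValueError("expected at least: <binary> <subcommand> <next runtime>")
    _binary, subcommand, next_runtime, *forwarded = argv
    return subcommand, next_runtime, forwarded


# Illustrative invocation; the bundle path and container id are made up.
example = [
    "/home/florian/sources/nomad/bin/nomad",
    "runcshim",
    "runc",
    "create",
    "--bundle",
    "/run/bundle",
    "abc123",
]
subcommand, next_runtime, forwarded = split_shim_argv(example)
print(subcommand, next_runtime, forwarded)
```

Under that assumption, everything after the "next runc" binary is passed along to it unchanged once the shim has done its own work.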
So I have a deal for you @tgross. Would you mind taking this over and writing the tests? I'll adjust everything else you want (or you can do it yourself if you are faster), but even getting a basic test setup running will take me a fair amount of time. I'd happily improve the tests, time permitting, if you can show me an example of how to write them (i.e. how docker's daemon.json should get configured etc).
So I have a deal for you @tgross. Would you mind taking this over and writing the tests?
Sure thing, no problem. I'll look to getting it landed sometime in this major cycle.
I haven't forgotten this and the issue is still mocking my attempts at inbox zero. Obviously we missed 1.10.0 for this with all the recent excitement in our org. But I'm going to chat with @arodd sometime next week about seeing if we can find a place for this in the roadmap.
Did any clarity ensue on this since?