container-linux-update-operator icon indicating copy to clipboard operation
container-linux-update-operator copied to clipboard

MOTD is not updated

Open crawford opened this issue 7 years ago • 6 comments

The MOTD on Container Linux claims that the reboot strategy is off. When CLUO is running, it should update the message to be accurate.

crawford avatar Jun 21 '17 17:06 crawford

Dupe of https://github.com/coreos/bugs/issues/1968, I'll leave this here for tracking any work needed here.

euank avatar Jun 21 '17 18:06 euank

I think what we need here is for the agent to indicate that it's managing updates.

I also think the indication should be fine-grained to the point of including:

  1. Running status for the agent
  2. Whether updates are paused/disabled

There are a couple options for this that make sense.

  1. Bindmounted file (with update paused/disabled status) + a flock-x
  2. Register a dbus handler that can be queried for update status

I'm going to propose going with the file approach since it's simpler.

The format can be a simple bash-sourcable KEY=value file similar to update.conf.

I'll suggest idly /var/lib/update-engine/coordinator.conf with the only keys being NAME="K8s Update Operator" and STATUS="running|paused", though maybe the latter is unneeded. In addition, I'll suggest that it must keep a flock on the file as often as possible.

euank avatar Jul 11 '17 22:07 euank

This may be obvious to everybody reading this except me, but the update strategy message is not part of MOTD. It is instead generated for every login shell as as part of profile.d just by probing is-active locksmithd. Script is at /usr/share/baselayout/coreos-profile.sh.

~~Looking at node-agent deployment it seems to already have a bind-mounted /var/run/.~~

@euank what about:

  • dropping a file in run instead of lib, to make it reflect status only for the current boot
  • tweak current baselayout profile.sh to s/No Reboots/locksmithd - disabled/
  • augment profile.sh to read k-v from the coordinator.conf and print Update Strategy: $NAME - $STATUS defaulting to the labels above

Also, are we concerned about sourcing a file written from a container on every login shell (including root)? What is the recommended safe way in bash to read&print data without intermediate evaluation?

EDIT: I was looking at the wrong agent manifest.

lucab avatar Jul 12 '17 10:07 lucab

@lucab I should have mentioned that's how it's updated, yes. My above comment was so that said profile.sh could read from it.

dropping a file in run instead of lib, to make it reflect status only for the current boot

That does sound better, good idea!

tweak current baselayout profile.sh to s/No Reboots/locksmithd - disabled/ augment profile.sh to read k-v from the coordinator.conf and print Update Strategy: $NAME - $STATUS defaulting to the labels above

That's close to what I was thinking, yeah. That's why I chose those two pieces of information.

Also, are we concerned about sourcing a file written from a container on every login shell (including root)? What is the recommended safe way in bash to read&print data without intermediate evaluation?

I'm not concerned about the performance. I thought about the security aspects a little, but since both run and lib would be only writeable by root I don't think it's so scary. Anyone that can write files as root at an arbitrary host location can already achieve root/DoS fairly trivially, so adding another path from "write to file as root -> code runs" doesn't seem like it introduces any new security issues.

If we really want to avoid sourcing it, we could suitably constrain it to be parseable by awk/sed.

Did you have anything particular in mind?

euank avatar Jul 12 '17 19:07 euank

If we really want to avoid sourcing it, we could suitably constrain it to be parseable by awk/sed. Did you have anything particular in mind?

No, I like you plan, +1 on that. I was just wondering about having a real data-only .conf instead of a potential script to source and I wanted to learn about best-practices (my bash-fu is weak).

lucab avatar Jul 13 '17 08:07 lucab

@dghubble if you're interested in tackling the CLUO side of things, the changes made to locksmithctl/daemon.go here are pretty close to what cluo will need as well, I'd guess.

Sprinkling UpdateState calls in the appropriate locations should be enough to resolve this issue on Container Linux versions of at least 1520. There's no harm in this on older container linux instances, so rolling it out doesn't need to gate on the CL changes hitting stable.

euank avatar Aug 28 '17 23:08 euank