runtime-spec
Add vTPM specification
Add the vTPM specification to the documentation, config.go, and schema description. The following is an example of a vTPM description that is found under the path /linux/resources/vtpms:
"vtpms": [
{
"VTPMVersion": "1.2",
"CreateCertificates" : false
}
]
Signed-off-by: Stefan Berger [email protected]
@stefanberger is this interface stable from the kernel POV?
@crosbymichael I updated the patches to reflect the requested changes.
I added 3 fields to the VTPM struct in config.go. In runc I have a lot more fields. I suppose this is ok.
@crosbymichael The kernel has a vTPM proxy driver that we would be using in runc. Its interface is stable. New ioctls may be added, but that shouldn't be a problem.
The interface of the TPM emulator implementation runc is using, swtpm, is also stable.
@stefanberger why does runc have more fields than the spec?
@crosbymichael It's keeping data such as the created major/minor numbers of the character device, file descriptor, etc. Here's where it resides then. https://github.com/stefanberger/runc/blob/vtpm/libcontainer/vtpm/vtpm.go#L21
@stefanberger ok, so those are implementation details, not input from the user
@stefanberger
Is this needed at the runtime level because of interactions with the tpm and user namespaces? Sorry, I'm just trying to understand this more and hope I'm not asking stupid questions.
@crosbymichael I am not sure what your latest comment ('Is this needed at the runtime level...') is referring to.
@stefanberger why couldn't this be done outside of the runtime (runc), with the spec simply configured to add the new device to the container?
Looking over this PR and opencontainers/runc#1591, I have some slight concerns about making the vtpms object so opaque. As it stands in the runc PR, these parameters will be backed by swtpm, and the API we're exposing is basically “things swtpm needs the runtime-caller to decide”. But the swtpm config is different from the kernel API. We could have a thinner runtime-spec interface if we stayed closer to the kernel API. You could just configure the number of vTPMs you want ("vtpms": 2) and have the runtime create the appropriate character devices for the container and pipe out the file descriptors to the caller (like we already do for pseudoterminal masters with --console-socket, although we have yet to land a spec for that). We could re-use the existing console socket (with a more generic name?) and just define a new request-message type for the vTPM descriptors. Then runtime callers could pass that file descriptor to their emulator of choice and manage the emulator's lifecycle independently.
On the other hand, with that increased flexibility, you'd have more moving parts. If we're ok claiming that swtpm or another runtime-chosen emulator is good enough for anyone using vtpms, then that's fine too. I guess folks who were not satisfied with their runtime's choice could always create the vTPM and push the appropriate device into their rootfs before create-ing the container if they needed more flexibility.
Is this needed at the runtime level because of interactions with the tpm and user namespaces?
Is there a tpm namespace? I agree that handling this outside of the runtime seems cleanest.
@wking So the kernel API does take a VTPMVersion as a parameter of an ioctl() but doesn't care about creation of certificates, which is a configuration parameter to the emulator. So in the case of vTPM we may not just have kernel parameters but also emulator parameters that need to be passed. Another possible parameter to the emulator would be whether the emulator is supposed to encrypt the files the vTPM writes out. On input this could be a boolean and runc creates a random key or one could pass an (AES) key directly. I suppose passing emulator parameters via the JSON is needed and allowed in this case, even if these parameters are not directly kernel parameters.
@wking There's no tpm namespace.
On Mon, Sep 11, 2017 at 09:20:57PM +0000, Stefan Berger wrote:
So in the case of vTPM we may not just have kernel parameters but also emulator parameters that need to be passed.
What do we gain by making the runtime a middleman between the emulator and the runtime-caller? It seems simpler to have the caller:
- Setup whichever emulator they want however they want.
- Put the resulting character device under the container's root.path (or add it to linux.devices).
- Call ‘create’ to launch the container.
Then there's no need to tell the runtime about vTPM at all, it's just passing a device through to the container like all the other devices it passes through.
What do we gain by making the runtime a middleman between the emulator and the runtime-caller?
As an example of this in another context, making the runtime a middleman for pseudoterminal creation allows you to create the pseudoterminal pair from a newinstance devpts mount. If you're an unprivileged user, you need the container mount namespace to be created before you have permission for the devpts mount, and while you could create those namespaces outside of the runtime and pass them in via file descriptor paths, that duplicates a lot of the work that the runtime is supposed to be helping you do. So having the runtime handle the newinstance devpts mount and then create a new pseudoterminal pair from that devpts makes sense. It's not clear to me (yet?) what efficiencies we get by similarly pushing vTPM into the runtime.
@wking @crosbymichael dumb question: if we take the runtime, which I suppose you are referring to is represented by this code base here, out, does that mean we wouldn't have anything vTPM related in this repository? Maybe I am not following the discussion correctly, but I suppose that at least the config.json would have to contain the vtpms array and at least the version of TPM to emulate (and possibly the other parameters necessary for the emulator)?
… if we take the runtime, which I suppose you are referring to is represented by this code base here, out, does that mean we wouldn't have anything vTPM related in this repository?
Well, I could see you having:
{
    "linux": {
        "devices": [
            {
                "path": "/dev/tpm0", (or whatever)
                "type": "c",
                "major": …, (whatever you got back from the ioctl)
                "minor": …, (whatever you got back from the ioctl)
                "fileMode": 384, (0o600, or whatever you like)
                "uid": 0, (or whatever you like)
                "gid": 0 (or whatever you like)
            },
            …
        ]
    }
}
But you can also accomplish the same thing by mknod-ing the device yourself at any point before you call start, in which case you wouldn't need a config entry at all. Or you could mknod the device in a pre-start hook. Lots of options.
… but I suppose that at least the config.json would have to contain the vtpms array and at least the version of TPM to emulate (and possibly the other parameters necessary for the emulator)?
But you wouldn't need that if the caller was setting up the emulator.
@wking If we express a vTPM instance as shown above with a device on the level of the runtime-spec, then how do we represent it at the level of runc? Can runc extend the JSON?
If we express a vTPM instance as shown above with a device on the level of the runtime-spec, then how do we represent it at the level of runc? Can runc extend the JSON?
Runc can store additional information in its libcontainer config. But why would it need to? If the caller is managing swtpm and the devices outside the runtime, there's nothing for the runtime to do.
If you want a tighter binding, you could reroll opencontainers/runc#1591 into a third-party hook tool, with a setup command for a pre-start hook and a teardown command for a post-stop hook.
But I don't see anything in a quick skim of the runc PR, except maybe checkpoint blocking, that cannot be handled by hooks or other external-to-the-runtime code. Am I missing something? Can you provide more details on the checkpoint breakage?
@wking Who is the 'caller' in this case? ('If the caller is managing swtpm and the device...'). Do you want to support vTPM on the runc level or push to even higher levels?
I assumed that checkpointing includes migration as well. With an attached vTPM its state would have to be migrated along with the state of the container, which isn't implemented.
@wking FYI: I am also working on namespacing of IMA. There I am hooking up an IMA namespace with a virtual TPM instance and the vTPM receives the TPM commands (PCR Extends) from IMA that would normally go to the hardware TPM. If we wanted to support that on the runc level some day, then there should not only be support for an IMA namespace on that level but also to hook up a vTPM to it.
Who is the 'caller' in this case?
The runtime-caller (e.g., see the steps I floated here).
Do you want to support vTPM on the runc level or push to even higher levels?
I want to push it up to higher levels, unless we have a reason for embedding it in the runtime (like we already do for network setup).
With an attached vTPM its state would have to be migrated along with the state of the container, which isn't implemented.
That makes sense. But I don't see a reason why emulator checkpointing couldn't happen at higher levels too, so it's not a reason for putting this in the runtime vs. higher levels.
I am also working on namespacing of IMA.
Depending on the details, this could end up being like network-namespace setup (which we punt to higher levels) or mount-namespace setup (which we handle in the runtime via mount). Which do you think it seems closer to?
@wking To resume this discussion: I integrated vTPM (with IMA namespacing) into Docker-CE 17.12. As part of that I found it necessary to support the following runtime spec for a vTPM:
+// VTPM definition
+type VTPM struct {
+ Statepath string `json:"statepath,omitempty"`
+ StatepathIsManaged bool `json:"statepathismanaged,omitempty"`
+ TPMVersion string `json:"vtpmversion,omitempty"`
+ CreateCertificates bool `json:"createcerts,omitempty"`
+}
I extended RunC with vTPM. When running RunC with vTPM support in Docker-CE, one surprise was that on 'docker restart' the vTPM state path was deleted and thus all the persistent state of the vTPM got lost. This, for example, required the addition of the 'StatePathIsManaged' field in this patch here, which basically lets us restart a container on the Docker level without erasing the state, as we do when we delete the container on the Docker level or run runc delete xyz on the RunC level.
Further, I find it necessary to integrate the vTPM into RunC also because of RunC spawning an IMA namespace (namespacing of IMA is not upstreamed), and to hook a vTPM to the IMA namespace I need to be able to transfer the vTPM device to the IMA namespace by calling an ioctl on the vTPM device's file descriptor. I doubt that this would be possible if the vTPM were created in a hook, the way we can set up networking using netns.
I extended RunC with vTPM. When running RunC with vTPM support in Docker-CE, one surprise was that on 'docker restart' the vTPM state path was deleted and thus all the persistent state of the vTPM got lost. This, for example, required the addition of the 'StatePathIsManaged' field in this patch here, which basically lets us restart a container on the Docker level without erasing the state, as we do when we delete the container on the Docker level or run runc delete xyz on the RunC level.
That patch is making runc's removal of the vTPM state path conditional. But if all vTPM handling happens in higher layers, runc would never be deleting the vTPM state path, so I don't see an issue there.
Further, I find it necessary to integrate the vTPM into RunC also because of RunC spawning an IMA namespace (namespacing of IMA is not upstreamed)…
I can't find code for this. Can you link to it?
… and to hook a vTPM to the IMA namespace I need to be able to transfer the vTPM device to the IMA namespace by calling an ioctl on the vTPM device's file descriptor.
I see no reason why this couldn't happen between create and start (or in a prestart hook). You have the container PID from the state, so you can find the IMA namespace under /proc/{pid}/ns/. And you can configure your hook with the path to the vTPM device, so the hook can open that when it needs the file descriptor.
house keeping: TPMs seem like a fine thing to support, but this conversation died off. @stefanberger perhaps a fresh PR would be useful, and https://github.com/opencontainers/runtime-spec/pull/920#issuecomment-328660319 asked if there had been any methods used to accomplish this functionality outside of the runtime-spec. (i'm inclined to close this for now)
@crosbymichael @vbatts
I think I had tried to answer some of the issues here.
I haven't looked at this in a while but my arguments for supporting vTPM on the runc level rather than higher layers doing it would be:
- vTPM is abstracted to a JSON object which makes it easier for any higher layer wanting to use vTPM; one can start vTPM on the runc level as just another device
- vTPM is stateful, and for CRIU it would either have to block checkpointing or have its state blobs written out
- back then I was experimenting with IMA namespaces and attaching a vTPM to an IMA namespace required the file descriptor of the vTPM to be passed to the IMA namespace using an ioctl
needs rebase
The update I pushed today adds all those fields that are needed to support the vTPM features implemented in runc PR https://github.com/opencontainers/runc/pull/1591
@crosbymichael / @wking are you able to recommence / continue your reviews, please? As @vbatts points out, vTPM support would be immensely useful for measuring trust of containers.
@stefanberger are you still working on this PR?