Unable to unlock root partition with tpm2 key

Open Prototik opened this issue 1 year ago • 18 comments

booster 0.11, both with and without universal:true. Dmesg and luksdump

Main part of the log is

[    4.730062] booster: no tpm devices found after 3 seconds.
[    5.048481] booster: recovering systemd-tpm2 token #0 failed: clevis.go/tpm2: unable to load data: parameter 1, error code 0x1f : integrity check failed

Not sure why booster doesn't see tpm2, dracut works just fine with this setup.

Prototik avatar Jul 27 '23 17:07 Prototik

Thanks for reporting it.

Is it a regression from 0.10 booster version?

Also, how did you enroll your systemd-tpm2 partition? What parameters did you use? And what version of systemd do you have?

anatol avatar Jul 27 '23 17:07 anatol

systemd 254~rc3-1-arch

systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7+12 --wipe-slot=tpm2 /dev/nvme0n1p2

I'm not sure whether it is a regression from 0.10; I have only just started trying booster. I can downgrade and report the result later if you wish.

Prototik avatar Jul 28 '23 02:07 Prototik

The same issue with booster 0.10

Prototik avatar Jul 28 '23 13:07 Prototik

Bisecting points me to c1b666775e9ac2bad4246078c8904c6a20976842

My assumption: since the tpm modules are built into the kernel in the Arch Linux packages, booster never gets a chance to capture the associated udev event, because the tpm devices are initialized even before booster enters its main method. So we should check for the presence of /dev/tpmrm0 and skip the waiters if the tpm is already there.

Something like this:

--- a/init/udev.go
+++ b/init/udev.go
@@ -2,6 +2,9 @@ package main
 
 import (
 	"fmt"
+	"github.com/google/go-tpm/legacy/tpm2"
+	"io"
+	"net"
 	"os"
 	"path/filepath"
 	"regexp"
@@ -73,8 +76,19 @@ var (
 )
 
 func udevListener() error {
-	// Initialize tpmReadyWg
-	tpmReadyWg.Add(1)
+	var dev io.ReadWriteCloser
+
+	if enableSwEmulator {
+		dev, _ = net.Dial("tcp", ":2321") // swtpm emulator is listening at port 2321
+	} else {
+		dev, _ = tpm2.OpenTPM("/dev/tpmrm0")
+	}
+	if dev != nil {
+		dev.Close()
+	} else {
+		// Initialize tpmReadyWg
+		tpmReadyWg.Add(1)
+	}
 
 	udevConn = new(netlink.UEventConn)
 	if err := udevConn.Connect(netlink.KernelEvent); err != nil {

It fixes the "no tpm devices found" error, but I still have no luck unlocking the partition...

Prototik avatar Jul 30 '23 15:07 Prototik

I believe I found the cause: https://github.com/systemd/systemd/commit/d9b5841d40996d42a05b7d6f1adf7a7517966262

Since systemd 252, systemd-cryptenroll generates tpm2 pins with tpm2-hash-pcrs (instead of tpm2-pcrs, as it used to) so that verification is done against signed PCRs instead of literal PCRs. It seems booster doesn't handle this yet, so it is unable to open the luks volume with the tpm2 token.

Prototik avatar Jul 31 '23 12:07 Prototik

@Prototik thank you for discovering the root of the problem. The booster code needs to be adjusted accordingly then https://github.com/anatol/booster/blob/30d1a2ec2856a35dbf76b6968cfa0bc2b19be853/init/luks.go#L239

anatol avatar Jul 31 '23 14:07 anatol

> Bisecting points me to c1b6667
>
> [...] So we should check presence of /dev/tpmrm0 and don't use waiters if tpm already here.

I think it is better to check the file existence at https://github.com/anatol/booster/blob/30d1a2ec2856a35dbf76b6968cfa0bc2b19be853/init/tpm.go#L54 and if the tpm file does not exist then wait for its udev event.

anatol avatar Jul 31 '23 14:07 anatol

Is this issue being worked on by someone? dracut handles the new systemd-cryptenroll setup just fine, FYI.

lvlgl avatar Sep 14 '23 17:09 lvlgl

> Is this issue being worked on by someone? dracut handles the new systemd-cryptenroll setup just fine, FYI.

I don't know about anatol, but I have only skimmed through the relevant code because I was also interested in getting tpm2 unlocking to work. This doesn't appear to be a simple fix.

For tpm2 unlocking, rather than systemd, users can use the Clevis functionality in Booster. I believe this should be fine as it doesn't use the same code path as systemd-tpm2. Or you can just use mkinitcpio or dracut. For mkinitcpio, just be sure to configure it with the relevant hooks.

c3Ls1US avatar Sep 16 '23 21:09 c3Ls1US

has this been solved yet? i can't get either TPM2 or FIDO2 decryption to work on boot, but they both work with cryptsetup open. this is a massive issue. if it's not going to work any time soon please update the arch wiki page to reflect that.

Arbel-arad avatar Oct 21 '23 15:10 Arbel-arad

I see that the systemd-cryptenroll tool adds a new field named tpm2_srk to the LUKS metadata. This field is used as a reference to the storage key. The functionality was added in https://github.com/systemd/systemd/commit/acbb504eaf1

Here is the systemd code to reconstruct SRK handle from the data:

                r = tpm2_handle_new(c, &primary);
                if (r < 0)
                        return r;

                primary->keep = true;

                log_debug("Found existing SRK key to use, deserializing ESYS_TR");
                rc = sym_Esys_TR_Deserialize(
                                c->esys_context,
                                srk_buf,
                                srk_buf_size,
                                &primary->esys_handle);
                if (rc != TSS2_RC_SUCCESS)
                        return log_error_errno(SYNTHETIC_ERRNO(ENOTRECOVERABLE),
                                               "Failed to deserialize primary key: %s", sym_Tss2_RC_Decode(rc));

Booster needs to do the same using the go-tpm library, though it is not clear which go-tpm API is the right one for that: tpm2.* does not have any deserialize functions.

cc @chrisfenner @Foxboron who recently worked with go-tpm if they know the answer to this question.

anatol avatar Oct 26 '23 21:10 anatol

You could try Unmarshal[whatever type you're trying to unmarshal]: https://github.com/google/go-tpm/blob/ee6cbcd136f878df2c2f36b4a085d2115330f379/tpm2/marshalling.go#L55

chrisfenner avatar Oct 26 '23 21:10 chrisfenner

Booster still uses the legacy go-tpm API https://github.com/anatol/booster/blob/master/init/tpm.go

@chrisfenner Is there a way to perform sym_Esys_TR_Deserialize with the legacy API instead?

anatol avatar Oct 26 '23 21:10 anatol

The bad news is that in the legacy API, per-type unmarshalling support is more spotty. It might be there somewhere depending on what you want to unmarshal.

The good news is you don't have to switch all your code to the new API :) you could update the version of go-tpm to 0.9.0 and import github.com/google/go-tpm/legacy/tpm2 everywhere you were importing tpm2 before. Then, in the code that wants to call the new API, you can use github.com/google/go-tpm/tpm2.

chrisfenner avatar Oct 26 '23 21:10 chrisfenner

Thank you @chrisfenner I will try to mix the APIs

So I looked at the tpm2-tss code, and sym_Esys_TR_Deserialize essentially calls this function:

TSS2_RC
iesys_MU_IESYS_RESOURCE_Unmarshal(
    const uint8_t *buffer,
    size_t size,
    size_t *offset,
    IESYS_RESOURCE *dst)
{
    LOG_TRACE("called: buffer=%p size=%zu offset=%p dst=%p",
        buffer, size, offset, dst);
    if (buffer == NULL) {
        LOG_ERROR("buffer=NULL");
        return TSS2_ESYS_RC_BAD_REFERENCE;
    }
    TSS2_RC ret;
    size_t offset_loc = (offset != NULL)? *offset : 0;
    if (dst != NULL)
        memset(dst, 0, sizeof(*dst));
    TPM2_HANDLE out_handle;
    ret = Tss2_MU_TPM2_HANDLE_Unmarshal(buffer, size, &offset_loc,
            (dst == NULL)? &out_handle : &dst->handle);
    return_if_error(ret, "Error unmarshaling subfield handle");

    ret = Tss2_MU_TPM2B_NAME_Unmarshal(buffer, size, &offset_loc,
            (dst == NULL)? NULL : &dst->name);
    return_if_error(ret, "Error unmarshaling subfield name");

    IESYSC_RESOURCE_TYPE out_rsrcType;
    ret = iesys_MU_IESYSC_RESOURCE_TYPE_Unmarshal(buffer, size, &offset_loc,
            (dst == NULL)? &out_rsrcType : &dst->rsrcType);
    return_if_error(ret, "Error unmarshaling subfield rsrcType");

    ret = iesys_MU_IESYS_RSRC_UNION_Unmarshal(buffer, size, &offset_loc,
            (dst == NULL)? out_rsrcType : dst->rsrcType,
            (dst == NULL)? NULL : &dst->misc);
    return_if_error(ret, "Error unmarshaling subfield misc");

    if (offset != NULL)
        *offset = offset_loc;
    return TSS2_RC_SUCCESS;
}

What go-tpm type would correspond to IESYS_RESOURCE from the tpm2-tss codebase?

anatol avatar Oct 26 '23 22:10 anatol

Hmm, IESYS_RESOURCE sounds like a TSS type, not a TPM type. You might not get much help from go-tpm for that structure, since it's not a real TPM structure.

chrisfenner avatar Oct 26 '23 22:10 chrisfenner

talos implements all of this using their own libraries.

https://github.com/siderolabs/talos/blob/main/internal/pkg/encryption/keys/tpm2.go

Foxboron avatar Oct 30 '23 12:10 Foxboron