Release the lock on the storage backend when migrating with shared storage

Open stefanberger opened this issue 1 year ago • 2 comments

This PR adds a --migration command line option with the parameters incoming and release-lock-outgoing. It allows migrating a TPM whose swtpm state directory lives on storage shared between source and destination: the incoming side defers locking the storage, and the outgoing side releases its lock once the 'savestate' of the TPM is retrieved. To allow a fallback in case of migration failure (QEMU migration failed), re-locking of the storage is attempted several times on the source side until swtpm on the migration destination side has terminated and released its lock.

stefanberger avatar Aug 11 '22 19:08 stefanberger

UPDATE: LIBVIRT WILL NEED TO BE MODIFIED. YOU CANNOT USE THIS PATCH.

This is the patch to apply to this branch to simulate --migration incoming,release-lock-outgoing on the command line, so that migration with shared storage can be tried out with existing libvirt. Later on, libvirt will need to be patched to pass these options.

From 8241c2250eef90088e1126365953745bd919e395 Mon Sep 17 00:00:00 2001
From: Stefan Berger <[email protected]>
Date: Wed, 10 Aug 2022 19:35:41 -0400
Subject: [PATCH] temp: Simulate '--migration incoming,release-lock-outgoing'
 by setting flags

This breaks the test case in this series!
---
 src/swtpm/swtpm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/swtpm/swtpm.c b/src/swtpm/swtpm.c
index aa14066..47b11ee 100644
--- a/src/swtpm/swtpm.c
+++ b/src/swtpm/swtpm.c
@@ -486,6 +486,8 @@ int swtpm_main(int argc, char **argv, const char *prgname, const char *iface)
         goto exit_failure;
     }

+    mlp.incoming_migration = mlp.release_lock_outgoing = true;
+
     if (server) {
         if (server_get_fd(server) >= 0) {
             mlp.fd = server_set_fd(server, -1);
--
2.37.1

stefanberger avatar Aug 11 '22 19:08 stefanberger

This is a bit of a concern -- very difficult/almost impossible to test and trigger and verify in a real-world scenario:

When migrating a VM the migration may fail and execution will then resume
on the originating side. In this fallback case the swtpm on the
destination side may need some time to terminate and release the lock.
Therefore, add a loop to the code that attempts to re-lock the storage
directory on the source side a few times until swtpm on the destination
side has released the lock. Retry the locking up to 100 times
with 10ms in between. The retries will only ever be necessary if a TPM 2
command is executed immediately upon resume, and this may be difficult
to test. The negative side effects of this could be that the loop is not
long enough to grab the lock or that a short command sent by the TPM
driver will time out due to the retries.
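
For illustration, the retry strategy described above could look roughly like the following. This is only a sketch and not the code from this series: the names `try_lock` and `lock_with_retries` are hypothetical, and it assumes the storage lock is a POSIX advisory lock taken with `fcntl(F_SETLK)` on an already opened lock file. With 100 retries and 10ms in between, the worst case adds about one second before giving up.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

#define LOCK_RETRIES        100 /* number of attempts, as in the commit message */
#define LOCK_RETRY_DELAY_MS 10  /* pause between attempts */

/* Hypothetical helper: one non-blocking attempt to take an exclusive
 * advisory lock on the whole file referred to by fd. */
static int try_lock(int fd)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 0, /* 0 means "until end of file" */
    };

    return fcntl(fd, F_SETLK, &fl);
}

/* Hypothetical helper: retry the lock while another process holds it.
 * Returns 0 once the lock is held, -1 on a hard error or when all
 * attempts are used up (errno is EAGAIN in the latter case). */
static int lock_with_retries(int fd, unsigned int retries,
                             unsigned int delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (long)(delay_ms % 1000) * 1000000L,
    };
    unsigned int i;

    for (i = 0; i < retries; i++) {
        if (try_lock(fd) == 0)
            return 0;
        /* EACCES/EAGAIN: lock is held elsewhere, worth retrying.
         * Anything else (e.g. EBADF) will not fix itself. */
        if (errno != EACCES && errno != EAGAIN)
            return -1;
        nanosleep(&delay, NULL);
    }

    errno = EAGAIN;
    return -1;
}
```

A caller on the source side would invoke something like `lock_with_retries(fd, LOCK_RETRIES, LOCK_RETRY_DELAY_MS)` before touching the state again after a failed migration. The trade-off mentioned above is visible here: a longer loop is more likely to get the lock, but also more likely to let a short TPM command time out.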

stefanberger avatar Aug 11 '22 19:08 stefanberger

Pull Request Test Coverage Report for Build 4053

  • 108 of 136 (79.41%) changed or added relevant lines in 10 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.07%) to 74.961%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/swtpm/ctrlchannel.c | 14 | 15 | 93.33% |
| src/swtpm/swtpm_nvstore.c | 10 | 12 | 83.33% |
| src/swtpm/swtpm_chardev.c | 5 | 8 | 62.5% |
| src/swtpm/common.c | 14 | 19 | 73.68% |
| src/swtpm/swtpm_nvstore_dir.c | 2 | 7 | 28.57% |
| src/swtpm/cuse_tpm.c | 30 | 36 | 83.33% |
| src/swtpm_ioctl/tpm_ioctl.c | 8 | 14 | 57.14% |
| Total | 108 | 136 | 79.41% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/swtpm/swtpm_nvstore_dir.c | 1 | 65.91% |
| Total | 1 | |

| Totals | Coverage Status |
|---|---|
| Change from base Build 4051 | 0.07% |
| Covered Lines | 6311 |
| Relevant Lines | 8419 |

💛 - Coveralls

coveralls avatar Aug 20 '22 20:08 coveralls

libvirt support for shared storage is here: https://github.com/stefanberger/libvirt-tpm/tree/master%2Bswtpm_shared_storage.v1

I'll post these patches on the libvirt mailing list soon.

stefanberger avatar Aug 21 '22 19:08 stefanberger