rbd: make block encryption cipher configurable
Describe what this PR does
The PR adds three new block encryption (RBD encryption) configuration options to customize the encryptions means used.
The option "encryptionCipher" configures the cipher with which the block device is encrypted. The option "encryptionKeySize" is about configuring the key size.
Lastly the option "integrityMode" configures the algorithms the block encryption should use to verify the integrity of consumed data. When this option is not used then no integrity verification happens. This is implemented via the kernel module dmIntegrity. Essentially when cryptsetup receives the “integrityMode” flag then it will set up a dmcrypt mapper device and a dmIntegrity mapper device.
When the integrity option is not used IO stack looks like this
App (Pod) —> dmcrypt mapper —> rbd device
With the integrity option it will look like this
App (Pod) —> dmcrypt mapper —> dmIntegrity mapper —> rbd device
In this setup the dmcrypt device would encrypt data and compute authentication tags (for example hmacs). Meanwhile dmIntegrity would be tasked with storing the encrypted data and corresponding authentication tags on the block device.
Motivation
I want to be able to customize means to employ encryption to match a required security level. Since I added an allowlist for cipher strings not any algorithm can be deployed (which hinders configurability) but it is the most easiest and secure means to validate user input.
Additionally I want to integrate means of verifying the integrity of consumed data because it’s becoming increasingly required from a compliance standpoint. For example the recently released Digital Operational Resilience Act (DORA) legislation from the EU demands that data confidentiality and integrity need to be established. I see the usage of dmIntegrity as the easiest means to achieve this.
Is there anything that requires special attention
Do you have any questions?
- In the PR a resize operation is not possible when integrity protection is enabled. This is because in older kernel releases (older than 5.7) it was not possible
- I made an allowlist for the “encryptionCipher” and “integrityMode” configurations due to security concerns. Do you also think that this allowlist makes sense?
Are there concerns around backward compatibility?
- The biggest hurdle for backwards compatibility will be the dmIntegrity. For example in this PR I forbid the usage of resize when the integrity option is used because on older linux kernel versions the resize key word was not supported.
Provide any external context for the change, if any.
These sources illustrate why integrating an integrity mode makes sense:
-
Digital Operational Resilience Act (DORA)
- Article 5 (2) (b) “[The management body shall] put in place policies that aim to ensure the maintenance of high standards of availability, authenticity, integrity and confidentiality, of data;”
-
BSI TR-02102-1 "Kryptographische Verfahren: Empfehlungen und Schlüssellängen" Version: 2025-01
- Germany's Federal Office for Information Security
- Documents talks about when encryption is used means to establish that the ciphertext was not altered is crucial
- “Die Verwendung einer Volume-Verschlüsselungslösung allein ist nur empfehlenswert, wenn diese einen wirksamen kryptographischen Schutz vor Manipulation der Daten beinhaltet [...]”
- Translation: The use of a volume encryption solution alone is only recommended if it includes effective cryptographic protection against data manipulation.
-
Here is useful dmintegrity documentation
Related issues
Resolves ceph#5686
Future concerns
- Resize is currently not supported when an integrity mode is enabled. However since a resize is possible in dmIntegrity with newer linux kernel versions (larger than 5.7) we could support it
- The new configuration options could also be used to make file system encryption more customizable
Checklist:
- [x] Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
- [x] Reviewed the developer guide on Submitting a Pull Request
- [ ] Pending release notes updated with breaking and/or notable changes for the next major release.
- [x] Documentation has been updated, if necessary.
- [x] Unit tests have been added, if necessary.
- [ ] Integration tests have been added, if necessary.
Show available bot commands
These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>: retest the<job-name>after unrelated failure (please report the failure too!)
Hi @nixpanic or @Madhu-1 , since you are both very familiar with this repository, I would appreciate your feedback on my recent changes.
I know you're busy, so if it would be helpful or save time, I am happy to schedule a quick meeting to walk you through the modifications. Maybe we can use the next Dev standup slot. Please let me know what works best for you.
Hi @nixpanic or @Madhu-1 , since you are both very familiar with this repository, I would appreciate your feedback on my recent changes.
I know you're busy, so if it would be helpful or save time, I am happy to schedule a quick meeting to walk you through the modifications. Maybe we can use the next Dev standup slot. Please let me know what works best for you.
@Greenpepper15 Thanks for the work 🎉 , requested for reviews on this one. Lets continue the discussion in the PR itself, if required we can have a this as a topic in one of the standup.
Yes I am working on adding the e2e tests and unit tests.
Hey I got a question concerning the e2e tests because I ran into a cyclic dependency.
For the e2e test I want to expose a function in /internal/util/cryptsetup/cryptsetup.go that returns all the allowed Ciphers to the e2e tests.
Then I can do something like
err = createRBDStorageClass(
...
map[string]string{"encrypted": "true",
"encryptionType": crypto.EncryptionTypeBlock.String(),
"encryptionCipher": cryptsetup.GetAllowedCiphers()[0], <-- Return an allowed cipher that I definitely know is valid because it comes from the source
},
deletePolicy)
But the e2e tests get the cryptsetup functionality through the file /e2e/vendor/github.com/ceph/ceph-csi/internal/until/crypsetup/cryptsetup.go does not have the new GetAllowedCiphers() defined because it references an old version of crypetup.go.
Should I go something like this for now
err = createRBDStorageClass(
...
map[string]string{"encrypted": "true",
"encryptionType": crypto.EncryptionTypeBlock.String(),
"encryptionCipher": "aes-xts-plain64", <-- Insert a static string that I know will work now but may change in a later version
},
deletePolicy)
And then later open another PR to replace the static assignments with a GetAllowedCiphers() call?
@Greenpepper15 have you tried running go mod tidy in the e2e folder and see if that works?
Yes I just tired and it works. Thanks for the advice.
But go mod tidy changes a lot of files in the e2e/vendor directory not just the cryptsetup.go file. Is that acceptable?
Yes I just tired and it works. Thanks for the advice. But
go mod tidychanges a lot of files in the e2e/vendor directory not just the cryptsetup.go file. Is that acceptable?
Yes that should be fine
This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏
Can somebody trigger the e2e tests? I tested them on my Cluster but I want to see if they work in the CI environment too.
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
@Greenpepper15 , the tests pass, but I do not see the test-cases you added being executed :thinking:
$ curl 'https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/rest/organizations/jenkins/pipelines/mini-e2e-helm_k8s-1.34-rbd/runs/1/nodes/100/log/?start=0&download=true' > pr5711-e2e.log
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4236k 0 4236k 0 0 2818k 0 --:--:-- 0:00:01 --:--:-- 2818k
$ grep STEP pr5711-e2e.log
...
Yes when I look at the test output I also cannot find the output for my tests. At least I saw that the previous encryption tests succeeded.
Does anyone know the reason? On my debug cluster I tested with this test configuration and it worked for me.
# inside the /e2e directory
go test . --test-cephfs=false --test-rbd=true --upgrade-testing=false --deploy-rbd=true --deploy-cephfs=false --timeout=20m --deploy-timeout=20
Or it is because of the merge conflict that is yet to be resolved?
Does anyone know the reason? On my debug cluster I tested with this test configuration and it worked for me.
Maybe the CI job fails to checkout your change because there is a merge conflict?
From https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/organizations/jenkins/mini-e2e-helm_k8s-1.34-rbd/detail/mini-e2e-helm_k8s-1.34-rbd/1/pipeline/26 :
+ cd /home/jenkins/build/ceph-csi
+ git fetch origin pull/5711/merge
From https://github.com/ceph/ceph-csi
* branch refs/pull/5711/merge -> FETCH_HEAD
+ git checkout -b pull/5711/merge FETCH_HEAD
Switched to a new branch 'pull/5711/merge'
When I execute the same steps, I also do not have your changes to the ./e2e/ files when I go through git log --stat.
<<<<<<< feature/5686/rbd-configurable-cipher
golang.org/x/net v0.45.0 // indirect
=======
golang.org/x/net v0.46.0 // indirect
>>>>>>> devel
This is only conflict in the merge conflict. I guess if I choose my version it will run without any issues. But we don't want a downgrade version on devel.
How about I choose my changes for now and later somebody else calls go mod tidy and updates the corresponding files to resolve this issue?
I think the issue is that there is a merge conflict. It probably does not matter how you address it. Most practical would be to rebase on top of the latest devel branch, and use versions from there. That will be needed later on in any case too.
Once the conflict is resolved, try to leave the /test ... message in a comment (only that single line in the comment), maybe you are allowed to start the CI job yourself too.
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
The test failed with this error
E1110 20:00:02.940630 35903 crypto.go:238] ID: 244 Req-ID: 0001-0024-3674dcb1-48bf-45a6-8638-8fbb36b033b4-0000000000000004-c95798e0-5f4e-4956-9b64-da1138bba3e3 failed to encrypt device "/dev/rbd0" with LUKS (an error (exit status 1) occurred while running cryptsetup args: [-q luksFormat --type luks2 --hash sha256 --cipher aegis128-random --integrity aead --key-size 128 --luks2-metadata-size 4096k --luks2-keyslots-size 8192k --pbkdf-memory 32768 /dev/rbd0 -d -]): Cipher aegis128-random (key size 128 bits) is not available.
This happens when the aegis128 kernel module is not loaded. I will temporarily remove this test for tonight and run the tests again to see the other tests do not fail.
Could someone do a sudo modprobe aegis128 on the worker node(s)?
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
Could someone do a
sudo modprobe aegis128on the worker node(s)?
There is a Modprobe() helper at internal/util/kmod/modprobe.go, the node-plugin should be able to load the module that way. Ideally cryptsetup loads the module automatically... But that seems not be be done, so you have to think of a way to detect what modules are required for a cipher :thinking:
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
From the logs.
This is unfortunate:
E1114 13:14:34.421166 37619 crypto.go:242] ID: 331 Req-ID: 0001-0024-b7e4c8eb-90e2-4e53-8130-485b4b948c6b-0000000000000004-7a021e5e-78bd-4878-8968-670ce9f077a8 failed to encrypt device "/dev/rbd0" with LUKS (an error (exit status 1) occurred while running cryptsetup args: [-q luksFormat --type luks2 --hash sha256 --cipher aegis128-random --integrity aead --key-size 128 --luks2-metadata-size 4096k --luks2-keyslots-size 8192k --pbkdf-memory 32768 /dev/rbd0 -d -]): Cipher aegis128-random (key size 128 bits) is not available.
The e2e tests run on a CentOS Stream 9 system (minikube with podman backend, not it's own VM). Is the kernel module not available on this CentOS version?
W1114 12:48:18.857770 37619 modprobe.go:49] modprobe of kernel module "aegis128" failed (an error (exit status 1) occurred while running modprobe args: [aegis128]): modprobe: FATAL: Module aegis128 not found in directory /lib/modules/5.14.0-612.el9.x86_64
E1114 12:48:18.857796 37619 encryption.go:633] modprobe of kernel module "aegis128" failed (an error (exit status 1) occurred while running modprobe args: [aegis128]): modprobe: FATAL: Module aegis128 not found in directory /lib/modules/5.14.0-612.el9.x86_64
I1114 12:48:18.857805 37619 encryption.go:635] AEGIS-128 kernel module loaded.
I1114 12:48:18.857812 37619 encryption.go:637] AEGIS-128 kernel module already loaded.
The e2e tests run on a CentOS Stream 9 system (minikube with podman backend, not it's own VM). Is the kernel module not available on this CentOS version?
To answer my own question: No, the aegis128 kernel module is not available on CentOS Stream 9 or even on CentOS Stream 10 (not in the kernel-modules-core and kernel-modules-extras packages).
It won't be trivial to add e2e testing for this :disappointed:
Okay that is good to know. Then I will drop the support for the aegis cipher. As far as know from my research using this mode is experimental anyway in DmIntegrity and not recommend.
The problem with using aegis or something like AES-GCM is that it is not possible stop verifying the authentication tags. Once you have this cipher to MUST verify the integrity when decrypting (because the decryption process is tied to integrity verification). When using something like aes-xts with an HMAC the HMAC verification can be skipped in case of an error. This is why these kind of compound methods are recommended.
The "only" upside to aegis is that it is typically faster than doing something like an aes-xts with an HMAC. Or should I leave the aegis option but we do not test for it?
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd
Hey guys please give me some comment if we should drop aegis128 support or not. That would be at last thing (at least from my side) that I need to implement.
And please give some feedback if the existing e2e test suites are enough for you. If not please specify a test scenario you would like to see then I will add it.
Hey guys please give me some comment if we should drop aegis128 support or not. That would be at last thing (at least from my side) that I need to implement.
@Greenpepper15 yes lets drop it, we can introduce it in new PR if required.
And please give some feedback if the existing e2e test suites are enough for you. If not please specify a test scenario you would like to see then I will add it.
Will take sometime and review it soon. If its using existing test and doing all the tests like data consistency/snapshot/clone/resize etc we are good.
@Greenpepper15 it looks there are commints in this PR not related to this feature
Yes which one do you mean?
As far as I know it should only be this one. Should I drop this and open a new PR or should I edit the PR?
/test ci/centos/mini-e2e-helm/k8s-1.34/rbd