ceph-csi icon indicating copy to clipboard operation
ceph-csi copied to clipboard

rbd: make block encryption cipher configurable

Open Greenpepper15 opened this issue 2 months ago • 62 comments

Describe what this PR does

The PR adds three new block encryption (RBD encryption) configuration options to customize the encryptions means used.

The option "encryptionCipher" configures the cipher with which the block device is encrypted. The option "encryptionKeySize" is about configuring the key size.

Lastly the option "integrityMode" configures the algorithms the block encryption should use to verify the integrity of consumed data. When this option is not used then no integrity verification happens. This is implemented via the kernel module dmIntegrity. Essentially when cryptsetup receives the “integrityMode” flag then it will set up a dmcrypt mapper device and a dmIntegrity mapper device.

When the integrity option is not used IO stack looks like this

App (Pod) —> dmcrypt mapper —> rbd device

With the integrity option it will look like this

App (Pod) —> dmcrypt mapper —> dmIntegrity mapper —> rbd device

In this setup the dmcrypt device would encrypt data and compute authentication tags (for example hmacs). Meanwhile dmIntegrity would be tasked with storing the encrypted data and corresponding authentication tags on the block device.

Motivation

I want to be able to customize means to employ encryption to match a required security level. Since I added an allowlist for cipher strings not any algorithm can be deployed (which hinders configurability) but it is the most easiest and secure means to validate user input.

Additionally I want to integrate means of verifying the integrity of consumed data because it’s becoming increasingly required from a compliance standpoint. For example the recently released Digital Operational Resilience Act (DORA) legislation from the EU demands that data confidentiality and integrity need to be established. I see the usage of dmIntegrity as the easiest means to achieve this.

Is there anything that requires special attention

Do you have any questions?

  • In the PR a resize operation is not possible when integrity protection is enabled. This is because in older kernel releases (older than 5.7) it was not possible
  • I made an allowlist for the “encryptionCipher” and “integrityMode” configurations due to security concerns. Do you also think that this allowlist makes sense?

Are there concerns around backward compatibility?

  • The biggest hurdle for backwards compatibility will be the dmIntegrity. For example in this PR I forbid the usage of resize when the integrity option is used because on older linux kernel versions the resize key word was not supported.

Provide any external context for the change, if any.

These sources illustrate why integrating an integrity mode makes sense:

  • Digital Operational Resilience Act (DORA)

    • Article 5 (2) (b) “[The management body shall] put in place policies that aim to ensure the maintenance of high standards of availability, authenticity, integrity and confidentiality, of data;”
  • BSI TR-02102-1 "Kryptographische Verfahren: Empfehlungen und Schlüssellängen" Version: 2025-01

    • Germany's Federal Office for Information Security
    • Documents talks about when encryption is used means to establish that the ciphertext was not altered is crucial
    • “Die Verwendung einer Volume-Verschlüsselungslösung allein ist nur empfehlenswert, wenn diese einen wirksamen kryptographischen Schutz vor Manipulation der Daten beinhaltet [...]”
    • Translation: The use of a volume encryption solution alone is only recommended if it includes effective cryptographic protection against data manipulation.
  • Here is useful dmintegrity documentation

Related issues

Resolves ceph#5686

Future concerns

  • Resize is currently not supported when an integrity mode is enabled. However since a resize is possible in dmIntegrity with newer linux kernel versions (larger than 5.7) we could support it
  • The new configuration options could also be used to make file system encryption more customizable

Checklist:

  • [x] Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • [x] Reviewed the developer guide on Submitting a Pull Request
  • [ ] Pending release notes updated with breaking and/or notable changes for the next major release.
  • [x] Documentation has been updated, if necessary.
  • [x] Unit tests have been added, if necessary.
  • [ ] Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated failure (please report the failure too!)

Greenpepper15 avatar Oct 27 '25 20:10 Greenpepper15

Hi @nixpanic or @Madhu-1 , since you are both very familiar with this repository, I would appreciate your feedback on my recent changes.

I know you're busy, so if it would be helpful or save time, I am happy to schedule a quick meeting to walk you through the modifications. Maybe we can use the next Dev standup slot. Please let me know what works best for you.

Greenpepper15 avatar Nov 03 '25 13:11 Greenpepper15

Hi @nixpanic or @Madhu-1 , since you are both very familiar with this repository, I would appreciate your feedback on my recent changes.

I know you're busy, so if it would be helpful or save time, I am happy to schedule a quick meeting to walk you through the modifications. Maybe we can use the next Dev standup slot. Please let me know what works best for you.

@Greenpepper15 Thanks for the work 🎉 , requested for reviews on this one. Lets continue the discussion in the PR itself, if required we can have a this as a topic in one of the standup.

Madhu-1 avatar Nov 04 '25 09:11 Madhu-1

Yes I am working on adding the e2e tests and unit tests.

Greenpepper15 avatar Nov 05 '25 08:11 Greenpepper15

Hey I got a question concerning the e2e tests because I ran into a cyclic dependency.

For the e2e test I want to expose a function in /internal/util/cryptsetup/cryptsetup.go that returns all the allowed Ciphers to the e2e tests.

Then I can do something like

err = createRBDStorageClass(
       ...
	map[string]string{"encrypted": "true",
		"encryptionType":   crypto.EncryptionTypeBlock.String(),
		"encryptionCipher":  cryptsetup.GetAllowedCiphers()[0],   <-- Return an allowed cipher that I definitely know is valid because it comes from the source
	},
	deletePolicy)

But the e2e tests get the cryptsetup functionality through the file /e2e/vendor/github.com/ceph/ceph-csi/internal/until/crypsetup/cryptsetup.go does not have the new GetAllowedCiphers() defined because it references an old version of crypetup.go.

Should I go something like this for now

err = createRBDStorageClass(
       ...
	map[string]string{"encrypted": "true",
		"encryptionType":   crypto.EncryptionTypeBlock.String(),
		"encryptionCipher":  "aes-xts-plain64",   <-- Insert a static string that I know will work now but may change in a later version
	},
	deletePolicy)

And then later open another PR to replace the static assignments with a GetAllowedCiphers() call?

Greenpepper15 avatar Nov 06 '25 11:11 Greenpepper15

@Greenpepper15 have you tried running go mod tidy in the e2e folder and see if that works?

Madhu-1 avatar Nov 06 '25 12:11 Madhu-1

Yes I just tired and it works. Thanks for the advice. But go mod tidy changes a lot of files in the e2e/vendor directory not just the cryptsetup.go file. Is that acceptable?

Greenpepper15 avatar Nov 06 '25 13:11 Greenpepper15

Yes I just tired and it works. Thanks for the advice. But go mod tidy changes a lot of files in the e2e/vendor directory not just the cryptsetup.go file. Is that acceptable?

Yes that should be fine

Madhu-1 avatar Nov 06 '25 13:11 Madhu-1

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

mergify[bot] avatar Nov 07 '25 19:11 mergify[bot]

Can somebody trigger the e2e tests? I tested them on my Cluster but I want to see if they work in the CI environment too.

Greenpepper15 avatar Nov 10 '25 12:11 Greenpepper15

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

nixpanic avatar Nov 10 '25 13:11 nixpanic

@Greenpepper15 , the tests pass, but I do not see the test-cases you added being executed :thinking:

$ curl 'https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/rest/organizations/jenkins/pipelines/mini-e2e-helm_k8s-1.34-rbd/runs/1/nodes/100/log/?start=0&download=true' > pr5711-e2e.log
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 4236k    0 4236k    0     0  2818k      0 --:--:--  0:00:01 --:--:-- 2818k
$ grep STEP pr5711-e2e.log
...

nixpanic avatar Nov 10 '25 15:11 nixpanic

Yes when I look at the test output I also cannot find the output for my tests. At least I saw that the previous encryption tests succeeded.

Does anyone know the reason? On my debug cluster I tested with this test configuration and it worked for me.

# inside the /e2e directory 
go test . --test-cephfs=false --test-rbd=true --upgrade-testing=false --deploy-rbd=true --deploy-cephfs=false  --timeout=20m --deploy-timeout=20

Or it is because of the merge conflict that is yet to be resolved?

Greenpepper15 avatar Nov 10 '25 15:11 Greenpepper15

Does anyone know the reason? On my debug cluster I tested with this test configuration and it worked for me.

Maybe the CI job fails to checkout your change because there is a merge conflict?

From https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/organizations/jenkins/mini-e2e-helm_k8s-1.34-rbd/detail/mini-e2e-helm_k8s-1.34-rbd/1/pipeline/26 :

+ cd /home/jenkins/build/ceph-csi
+ git fetch origin pull/5711/merge
From https://github.com/ceph/ceph-csi
 * branch                refs/pull/5711/merge -> FETCH_HEAD
+ git checkout -b pull/5711/merge FETCH_HEAD
Switched to a new branch 'pull/5711/merge'

When I execute the same steps, I also do not have your changes to the ./e2e/ files when I go through git log --stat.

nixpanic avatar Nov 10 '25 16:11 nixpanic

<<<<<<< feature/5686/rbd-configurable-cipher
	golang.org/x/net v0.45.0 // indirect
=======
	golang.org/x/net v0.46.0 // indirect
>>>>>>> devel

This is only conflict in the merge conflict. I guess if I choose my version it will run without any issues. But we don't want a downgrade version on devel.

How about I choose my changes for now and later somebody else calls go mod tidy and updates the corresponding files to resolve this issue?

Greenpepper15 avatar Nov 10 '25 16:11 Greenpepper15

I think the issue is that there is a merge conflict. It probably does not matter how you address it. Most practical would be to rebase on top of the latest devel branch, and use versions from there. That will be needed later on in any case too.

Once the conflict is resolved, try to leave the /test ... message in a comment (only that single line in the comment), maybe you are allowed to start the CI job yourself too.

nixpanic avatar Nov 10 '25 17:11 nixpanic

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 10 '25 19:11 Greenpepper15

The test failed with this error

E1110 20:00:02.940630 35903 crypto.go:238] ID: 244 Req-ID: 0001-0024-3674dcb1-48bf-45a6-8638-8fbb36b033b4-0000000000000004-c95798e0-5f4e-4956-9b64-da1138bba3e3 failed to encrypt device "/dev/rbd0" with LUKS (an error (exit status 1) occurred while running cryptsetup args: [-q luksFormat --type luks2 --hash sha256 --cipher aegis128-random --integrity aead --key-size 128 --luks2-metadata-size 4096k --luks2-keyslots-size 8192k --pbkdf-memory 32768 /dev/rbd0 -d -]): Cipher aegis128-random (key size 128 bits) is not available.

This happens when the aegis128 kernel module is not loaded. I will temporarily remove this test for tonight and run the tests again to see the other tests do not fail.

Could someone do a sudo modprobe aegis128 on the worker node(s)?

Greenpepper15 avatar Nov 10 '25 20:11 Greenpepper15

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 10 '25 20:11 Greenpepper15

Could someone do a sudo modprobe aegis128 on the worker node(s)?

There is a Modprobe() helper at internal/util/kmod/modprobe.go, the node-plugin should be able to load the module that way. Ideally cryptsetup loads the module automatically... But that seems not be be done, so you have to think of a way to detect what modules are required for a cipher :thinking:

nixpanic avatar Nov 11 '25 07:11 nixpanic

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 12 '25 20:11 Greenpepper15

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 14 '25 12:11 Greenpepper15

From the logs.

This is unfortunate:

  E1114 13:14:34.421166   37619 crypto.go:242] ID: 331 Req-ID: 0001-0024-b7e4c8eb-90e2-4e53-8130-485b4b948c6b-0000000000000004-7a021e5e-78bd-4878-8968-670ce9f077a8 failed to encrypt device "/dev/rbd0" with LUKS (an error (exit status 1) occurred while running cryptsetup args: [-q luksFormat --type luks2 --hash sha256 --cipher aegis128-random --integrity aead --key-size 128 --luks2-metadata-size 4096k --luks2-keyslots-size 8192k --pbkdf-memory 32768 /dev/rbd0 -d -]): Cipher aegis128-random (key size 128 bits) is not available.

The e2e tests run on a CentOS Stream 9 system (minikube with podman backend, not it's own VM). Is the kernel module not available on this CentOS version?

  W1114 12:48:18.857770   37619 modprobe.go:49] modprobe of kernel module "aegis128" failed (an error (exit status 1) occurred while running modprobe args: [aegis128]): modprobe: FATAL: Module aegis128 not found in directory /lib/modules/5.14.0-612.el9.x86_64
  E1114 12:48:18.857796   37619 encryption.go:633] modprobe of kernel module "aegis128" failed (an error (exit status 1) occurred while running modprobe args: [aegis128]): modprobe: FATAL: Module aegis128 not found in directory /lib/modules/5.14.0-612.el9.x86_64
  I1114 12:48:18.857805   37619 encryption.go:635] AEGIS-128 kernel module loaded.
  I1114 12:48:18.857812   37619 encryption.go:637] AEGIS-128 kernel module already loaded.

nixpanic avatar Nov 14 '25 16:11 nixpanic

The e2e tests run on a CentOS Stream 9 system (minikube with podman backend, not it's own VM). Is the kernel module not available on this CentOS version?

To answer my own question: No, the aegis128 kernel module is not available on CentOS Stream 9 or even on CentOS Stream 10 (not in the kernel-modules-core and kernel-modules-extras packages).

It won't be trivial to add e2e testing for this :disappointed:

nixpanic avatar Nov 14 '25 17:11 nixpanic

Okay that is good to know. Then I will drop the support for the aegis cipher. As far as know from my research using this mode is experimental anyway in DmIntegrity and not recommend.

The problem with using aegis or something like AES-GCM is that it is not possible stop verifying the authentication tags. Once you have this cipher to MUST verify the integrity when decrypting (because the decryption process is tied to integrity verification). When using something like aes-xts with an HMAC the HMAC verification can be skipped in case of an error. This is why these kind of compound methods are recommended.

The "only" upside to aegis is that it is typically faster than doing something like an aes-xts with an HMAC. Or should I leave the aegis option but we do not test for it?

Greenpepper15 avatar Nov 14 '25 17:11 Greenpepper15

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 17 '25 11:11 Greenpepper15

Hey guys please give me some comment if we should drop aegis128 support or not. That would be at last thing (at least from my side) that I need to implement.

And please give some feedback if the existing e2e test suites are enough for you. If not please specify a test scenario you would like to see then I will add it.

Greenpepper15 avatar Nov 18 '25 07:11 Greenpepper15

Hey guys please give me some comment if we should drop aegis128 support or not. That would be at last thing (at least from my side) that I need to implement.

@Greenpepper15 yes lets drop it, we can introduce it in new PR if required.

And please give some feedback if the existing e2e test suites are enough for you. If not please specify a test scenario you would like to see then I will add it.

Will take sometime and review it soon. If its using existing test and doing all the tests like data consistency/snapshot/clone/resize etc we are good.

Madhu-1 avatar Nov 18 '25 08:11 Madhu-1

@Greenpepper15 it looks there are commints in this PR not related to this feature

Madhu-1 avatar Nov 18 '25 08:11 Madhu-1

Yes which one do you mean?

As far as I know it should only be this one. Should I drop this and open a new PR or should I edit the PR?

Greenpepper15 avatar Nov 18 '25 10:11 Greenpepper15

/test ci/centos/mini-e2e-helm/k8s-1.34/rbd

Greenpepper15 avatar Nov 18 '25 16:11 Greenpepper15