
Nonce reuse in bcachefs is more dangerous than other filesystems

Open hashbrowncipher opened this issue 5 months ago • 1 comments

As discussed in the encryption documentation, bcachefs uses deterministic nonces in encrypting its data. This leads to concerns about nonce reuse, which are discussed in the same documentation. The current design leaves users of encryption at risk from threats that are well-addressed by other disk encryption implementations, particularly AES-XTS.

In the scenario I am considering, the attacker receives two fully encrypted filesystem images. The first image (call it "prod") contains an encrypted filesystem. The second image contains the same filesystem, which has been copied to another machine (call it "testing", because it is intended to represent a machine in a testing environment). After the copy, the owner of the drives continued writing to both images. The newly-written data in the prod image is secret, and unpredictable to the attacker. The newly-written data in the testing image is known to the attacker. At no point did the attacker have online access to the disks or to the kernels which mounted them: the attacker only has their knowledge of the "testing" plaintext and access to the disks themselves.

Given this setting, the attacker may recover the plaintext of the prod disk as prod ciphertext XOR testing ciphertext XOR testing known plaintext, wherever the two images encrypted different data under the same key and nonce. This attack is largely mitigated in other disk encryption implementations through the use of a block cipher mode like AES-XTS, which encrypts data in 128-bit blocks: reusing a key and tweak there reveals only whether two blocks are equal. But since bcachefs uses a cipher that XORs the plaintext with a keystream (ChaCha20, or potentially AES-GCM in the future), each plaintext bit is encrypted independently and is therefore revealed bit by bit.
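To make the XOR relation concrete, here is a minimal sketch of the attack. It is not bcachefs code: the `keystream` function is a stand-in built from SHA-256 in counter mode rather than ChaCha20, and the key, nonce, and plaintexts are made-up values. The only property it relies on is that both images encrypt different data with the same key and nonce.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream generator (SHA-256 in counter mode), NOT ChaCha20."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

# Hypothetical values: after the copy, both images share the key and
# deterministically arrive at the same nonce for the same extent.
key = b"k" * 32
nonce = b"extent-0001-v7"

prod_plain = b"prod secret: rotate the API token"   # unknown to the attacker
testing_plain = b"A" * 64                             # known to the attacker

prod_ct = encrypt(key, nonce, prod_plain)
testing_ct = encrypt(key, nonce, testing_plain)

# The attacker never sees the key: the keystream cancels out.
recovered = xor(xor(prod_ct, testing_ct), testing_plain)
print(recovered)
```

The recovery works for as many bytes as the attacker knows of the testing plaintext at the same offset.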

I believe that mitigating this issue will require injection of external entropy. In practice, the easiest way to do this would be to derive the extent data key as KDF(superblock key, random seed) (ChaCha20 could likely be used as the KDF). The randomly chosen seed would be generated once per potential "fork event", and would need to be stored on the filesystem for as long as the data encrypted under it remains live. This storage requirement is significantly lower than that of a per-extent data key, while still addressing the nonce reuse concern.
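A rough sketch of the proposed derivation, under stated assumptions: `derive_extent_key` is a hypothetical name, HMAC-SHA256 stands in for the KDF (the proposal above suggests ChaCha20 instead), and the keystream is again a SHA-256 counter-mode stand-in. It only illustrates that two forks with independent seeds no longer share a keystream, even when the nonce repeats.

```python
import hashlib
import hmac
import secrets

def derive_extent_key(superblock_key: bytes, fork_seed: bytes) -> bytes:
    """Hypothetical KDF(superblock key, random seed); HMAC-SHA256 is a stand-in."""
    return hmac.new(superblock_key, b"extent-data-key" + fork_seed,
                    hashlib.sha256).digest()

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream generator (SHA-256 in counter mode), NOT ChaCha20."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

superblock_key = b"k" * 32       # made-up value, shared by both images
nonce = b"extent-0001-v7"        # same deterministic nonce on both images

# Each fork event draws a fresh seed, stored alongside the data it protects.
prod_key = derive_extent_key(superblock_key, secrets.token_bytes(32))
testing_key = derive_extent_key(superblock_key, secrets.token_bytes(32))

prod_plain = b"prod secret: rotate the API token"
testing_plain = b"A" * 64

prod_ct = xor(prod_plain, keystream(prod_key, nonce, len(prod_plain)))
testing_ct = xor(testing_plain, keystream(testing_key, nonce, len(testing_plain)))

# The same XOR trick now leaves the difference of two unrelated keystreams.
attempt = xor(xor(prod_ct, testing_ct), testing_plain)
print(attempt == prod_plain)
```

The seed is not secret on its own; it only needs to be unpredictable enough that two forks never pick the same one.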

hashbrowncipher, Jan 22 '24 00:01