
Encrypt files that are uploaded to cloud object storage

Open damianmoore opened this issue 5 years ago • 3 comments

Related to: #12, #28

Background

Files stored on expandable object-based storage such as S3 should probably be publicly accessible with their contents encrypted. Decryption would then be performed on the client side (JS). The alternative would be to use private "buckets", but this is not desirable as we would have to proxy all the traffic through the host servers.

The primary purpose of this encryption is to prevent access to image files at rest, either by API/console access or by brute-forced URL requests. As a secondary proposition, we will try to limit the ability for the host to access file contents unless they have been granted it for a particular image processing task (object recognition, colour detection etc.)

Initial proposal

This highly rated Stack Overflow answer seems to be a good starting point using NaCl/Sodium. PyNaCl seems to be well maintained and documented. Symmetric encryption would use XSalsa20 stream cipher with Poly1305 for authentication by default.

It's interesting to look at how Mozilla Send implemented client-side encryption using the Web Crypto API.

Another option would be to look at the encryption primitives used by WireGuard, a modern VPN protocol (source https://mullvad.net/en/help/why-wireguard/):

  • ChaCha20 for symmetric encryption, authenticated with Poly1305, using RFC7539's AEAD construction
  • Curve25519 for ECDH
  • BLAKE2s for hashing and keyed hashing, as described in RFC7693
  • SipHash24 for hashtable keys
  • HKDF for key derivation, as described in RFC5869
  • Noise_IK handshake from Noise, building on the work of CurveCP, NaCL, KEA+, SIGMA, FHMQV, and HOMQV.

There are examples of this in the Python library PyCryptodome but it looks like PyNaCl can also use ChaCha20.

AES256 in GCM mode or PGP encryption might also be valid options. There is a JavaScript implementation OpenPGP.js that is maintained by ProtonMail.

Each Photo instance will have a symmetric key generated on creation and each PhotoFile will have a separate nonce/IV. The Photo symmetric key is encrypted with a user-level key, which is in turn encrypted with a key derived from the user's password by a password-hashing function. This gives us a couple of levels of indirection in case the user wants to give access to an image processor or another user, or changes their password.

  1. User's public/private key pair is generated. Both parts are stored in the User DB table, but the private part is encrypted with a key derived from the user's password (Argon2/Scrypt/PBKDF2).
  2. Each photo processor has a public/private key pair. This includes tasks such as thumbnail generation, location tagging, object detection, style classification, colour detection etc.
  3. Photo gets uploaded, symmetric key gets generated for it, header + nonce + ciphertext gets saved to object storage.
  4. A new table PhotoAccessKey links users, public galleries or classification processors to Photos. Accessors all have public/private key pairs and the entry in this table is encrypted for the user with their public key.
  5. There should be at least 1 entry in the PhotoAccessKey table per Photo, linking it to the owner user's asymmetric key pair.
  6. For short periods of time there will be an additional entry in the PhotoAccessKey table for each photo processing task that the user has allowed the server side to perform.
  7. There should be an attribute on the PhotoAccessKey table which instructs the processor to delete the row on success.
  8. Provision for the thumbnailing service needs to be made. A PhotoAccessKey entry gets created on the server and passed to the thumbnail service?
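
Step 1 above relies on turning the user's password into a key that can encrypt the private key. A minimal sketch of that derivation, using only Python's standard library, might look like the following. PBKDF2 is used here simply because it ships with `hashlib`; Argon2 or scrypt would probably be preferable where available. The function name, salt size and iteration count are assumptions for illustration, not decided values.

```python
import hashlib
import os

SALT_BYTES = 16       # assumed salt length, stored in the clear next to the user record
ITERATIONS = 200_000  # assumed work factor; would need tuning for the host hardware
KEY_BYTES = 32        # 256-bit key, the size both XSalsa20 and ChaCha20 expect


def derive_key_encryption_key(password, salt=None):
    """Derive a key from a password (hypothetical helper, not Photonix code).

    Returns (key, salt). Pass the stored salt back in to re-derive the
    same key at login time.
    """
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )
    return key, salt


if __name__ == "__main__":
    key, salt = derive_key_encryption_key("correct horse battery staple")
    same_key, _ = derive_key_encryption_key("correct horse battery staple", salt)
    print(key == same_key)
```

The derived key would then be used with an authenticated cipher (for example PyNaCl's `SecretBox`) to wrap the user's private key. On a password change only that one wrapped private key has to be re-encrypted; the per-photo keys stay untouched.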

Simpler proposal

As a first step we might not need to worry about processing jobs on the server accessing the files as the biggest benefit is around storing on cloud storage securely.

  • Library data model gains 2 new boolean fields - something like encrypt_photos and encrypt_thumbnails.
  • Symmetric key (32 bytes) gets generated for each PhotoFile and stored in that table. Each file has a different key as files may need to be shared individually.
  • PhotoFile gets encrypted (ChaCha20 or XSalsa20). The encrypted version contains a nonce (auto generated) and a MAC (Poly1305).
  • JSON encoding combined with Base64 seems to be a common way to store nonce, ciphertext and MAC/tag together. However, Base64 encoding increases file size by about a third, and storage is likely to be the biggest cost for the user, so we'll probably need to come up with our own binary format here.
  • There is a JSON Web Key (JWK) format used for storing keys, ciphers etc. in a standard format. This could be added to the PhotoFile model.
  • Encrypted file gets uploaded to cloud storage with PhotoFile.id as file name.
  • Users of the system have their access checked against the DB and any PhotoFiles they are allowed to access are returned with their symmetric key in the API response.
  • Web client needs to be able to decrypt the file if PhotoFile contains an encryption_key.
  • Thumbnailer service would use the symmetric key if it needed to generate a new thumbnail.
  • Thumbnails need to be encryptable as the larger dimensions can still be quite big. They are safe to encrypt with the same symmetric key as they will have separate nonces.
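
To make the size trade-off mentioned in the list above concrete, here is a rough sketch comparing a JSON + Base64 envelope with a simple binary layout. The layout (magic bytes, a version byte, a 24-byte nonce, then the ciphertext with its tag) and every name in it are hypothetical. Random bytes stand in for real ciphertext, so nothing here actually encrypts anything.

```python
import base64
import json
import os
import struct

MAGIC = b"PHNX"    # hypothetical file signature
VERSION = 1        # hypothetical format version
NONCE_BYTES = 24   # XSalsa20 nonce size; ChaCha20 per RFC 7539 would use 12
HEADER = struct.Struct(">4sB24s")  # magic, version, nonce (29 bytes, no padding)


def pack_binary(nonce, ciphertext_and_tag):
    """Prefix the ciphertext with a fixed-size binary header."""
    return HEADER.pack(MAGIC, VERSION, nonce) + ciphertext_and_tag


def unpack_binary(blob):
    """Split a packed blob back into (nonce, ciphertext_and_tag)."""
    magic, version, nonce = HEADER.unpack_from(blob)
    if magic != MAGIC or version != VERSION:
        raise ValueError("not a recognised encrypted file")
    return nonce, blob[HEADER.size:]


def pack_json(nonce, ciphertext_and_tag):
    """The JSON + Base64 alternative, for size comparison only."""
    return json.dumps({
        "nonce": base64.b64encode(nonce).decode("ascii"),
        "ciphertext": base64.b64encode(ciphertext_and_tag).decode("ascii"),
    }).encode("ascii")


if __name__ == "__main__":
    nonce = os.urandom(NONCE_BYTES)
    body = os.urandom(3_000_000 + 16)  # stand-in for a 3 MB photo plus a 16-byte tag
    print(len(pack_binary(nonce, body)) - len(body))          # fixed header overhead
    print(round(len(pack_json(nonce, body)) / len(body), 2))  # relative size of JSON form
```

The binary form adds a fixed 29 bytes regardless of photo size, whereas the JSON form grows with the file by roughly a third. A version byte in the header leaves room to change cipher or layout later without guessing from file contents.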

Jan 17 '19 18:01 damianmoore

Investigate Tahoe-LAFS for storage encryption and their zfec library for erasure coding/forward error correction.

See this paper for a performance comparison of several open source erasure code libraries.

Mar 14 '19 18:03 damianmoore

Sorry for asking, but I couldn't find any related documentation and the issues referenced here still seem to be open: has cloud object storage been implemented?

Is it possible to configure Photonix to work with a service like AWS S3, Google Cloud Storage or Backblaze B2?

Nov 20 '19 10:11 alikuru

Hi @alikuru. Thanks for enquiring. Support for object storage providers is still in the works, I'm afraid. Ideas for how to integrate this are becoming clearer though, and it appears to be something that more and more people want. If you'd like to know when the first version is released, please add your email address to the form on https://photonix.org

Nov 20 '19 18:11 damianmoore