zips Post-quantum privacy for Zcash

Split from zcash/zips#1134, which considers a fully post-quantum Zcash, i.e. one that would be post-quantum secure for balance, spendability, etc.

This issue focusses on post-quantum privacy, and in particular, changing the note encryption to be post-quantum private.

If addresses are kept secret then Zcash is already intended to be post-quantum private, as discussed in zcash/zips#1134 and in my talk for Zcon3 (slides, video). So the goal for this issue is to extend that property to the case where addresses are known to the adversary. This should be done in such a way that we still do not rely on any new plausibly post-quantum encryption algorithm in the case where addresses are secret, so that we ensure that there is no privacy regression even if that algorithm were to be completely broken classically. (The recent classical break of SIDH emphasizes the importance of that.)

Aug 18 '22 17:08 daira

Copying relevant discussion from zcash/zips#1134:

NIST have selected some of the algorithms to be standardized in their post-quantum cryptography project: the key encapsulation mechanism CRYSTALS-Kyber (which can be used for encryption), and the signature algorithms CRYSTALS-Dilithium, FALCON, and SPHINCS+ (not considered here).

CRYSTALS-Kyber (I'll just call it Kyber below) is a strong candidate because it is likely to attract much more cryptanalysis now that NIST have promised to standardize it; also, using a standardized algorithm may avoid some security or interoperability pitfalls due to inconsistent implementations.

Kyber has several sets of security parameters, and we would need to decide which provides adequate security. That page says:

For users who are interested in using Kyber, we recommend the following:

Use Kyber in a so-called hybrid mode in combination with established "pre-quantum" security; for example in combination with elliptic-curve Diffie-Hellman.

We recommend using the Kyber-768 parameter set, which—according to a very conservative analysis—achieves more than 128 bits of security against all known classical and quantum attacks.

I also definitely recommend that we use a hybrid between our current Diffie–Hellman-based encryption and Kyber.

Kyber-768 has secret key size 2400 bytes, public key size 1184 bytes, and ciphertext overhead (for the Kyber-specific part) 1088 bytes.

The secret key size is not a significant problem. The public key size might be, since that would have to go into an address in the current design. The ciphertext overhead would be per-recipient (it need not be duplicated for more than one output to the same recipient, but this is at the expense of leaking the number of recipients).

The key generation and encryption performance of Kyber is more than adequate. The decryption performance is 53156 cycles on Haswell using AVX2; at 3.5 GHz this would be ~15 microseconds per potential transaction recipient, which is unlikely to be a bottleneck (actually the Diffie–Hellman part of the hybrid-mode encryption is likely to be more expensive).

Note that a sensible stepping stone would be to switch to Kyber first, leaving a switch to a post-quantum SNARK for later. That would be sufficient for post-quantum privacy (because Halo 2 already has post-quantum zk, even though it does not have post-quantum knowledge soundness).

Aug 18 '22 17:08 daira

We need to think about the interaction of this idea with scaling approaches such as transaction aggregation: zcash/zcash#4946. Quoting from that issue:

On-chain bandwidth [updated]

The size of a nullifier is 32 bytes, a note commitment is also 32 bytes, and the total size of a note+outgoing ciphertext including epk and memo is 692 bytes (for both Sapling and Orchard). If the memo is split into a different ciphertext, then the non-memo+outgoing part would be ~180 bytes.

The size of a fully shielded Orchard transaction with two "actions" (up to two spends and two outputs) is 9160 bytes. The corresponding Sapling transaction would be 2756 bytes.

Define the total footprint of a transaction to be the size of its fields that are included on chain. Define non-trimmable footprint and trimmable footprint to be the sizes of the corresponding non-trimmable and trimmable fields. A Sapling or Orchard transaction included in an aggregate would have a non-trimmable footprint, for the two nullifiers, of 32*2 = 64 bytes. It would have a trimmable footprint of 32*2 + 692*2 = 1448 bytes, if both outputs are sent in-band, or 32*2 = 64 bytes if both are sent out-of-band. It would have a total footprint of between 128 and 1512 bytes.

So, the potential reduction of total transaction bandwidth relative to Sapling is in the range roughly ~45.1% (all outputs in-band) to ~95.3% (all outputs out-of-band). Relative to Orchard, it is in the range roughly ~83.5% (all outputs in-band) to ~98.6% (all outputs out-of-band). The "all outputs in-band" savings would be improved if we implement the fewer-memos-than-outputs optimization, at the cost of additional information leakage of the number of memos.

The potential reduction in chain size after trimming is 97.7% relative to Sapling, and ~99.3% relative to Orchard.

Now let's recalculate this for a hypothetical shielded protocol, let's call it "Cyber-Orchard", that uses Kyber-768 + DH for note ciphertexts:

The size of a nullifier is 32 bytes, a note commitment is also 32 bytes, and the total size of a hybrid Kyber + DH note and outgoing ciphertext including epk and memo is 1088 + 692 = 1780 bytes. This can be split into 1632 bytes per recipient (the Kyber.CPAPKE.Enc output, epk, and memo), and 148 bytes per output.

So, the size of a fully shielded Cyber-Orchard transaction with two "actions" (up to two spends and two outputs) but only one recipient (i.e. leaking that one output is change), is 9704 bytes. If we have two recipients (i.e. not leaking whether one output is change) then it would be 11336 bytes. I'll assume the 9704-byte estimate below.

Define the total footprint of a transaction to be the size of its fields that are included on chain. Define non-trimmable footprint and trimmable footprint to be the sizes of the corresponding non-trimmable and trimmable fields. A Cyber-Orchard transaction included in an aggregate would have a non-trimmable footprint, for the two nullifiers, of 32*2 = 64 bytes. It would have a trimmable footprint of 32*2 + 1632 + 148*2 = 1992 bytes, if both outputs are sent in-band, or 32*2 = 64 bytes if both are sent out-of-band. It would have a total footprint of between 128 and 2056 bytes.

So, the reduction in on-chain bandwidth for a 2-in 2-out aggregated Cyber-Orchard transaction relative to Sapling would be in the range roughly ~25.4% (all outputs in-band) to ~95.3% (all outputs out-of-band). Relative to Orchard, it would be in the range roughly ~77.6% (all outputs in-band) to ~98.6% (all outputs out-of-band). Relative to unaggregated Cyber-Orchard, it would be in the range roughly ~78.8% (all outputs in-band) to ~98.7% (all outputs out-of-band).

The potential reduction in chain size after trimming is 97.7% relative to Sapling, ~99.3% relative to Orchard, and still ~99.3% relative to Cyber-Orchard.

As you can see, a switch to post-quantum note encryption would make the bandwidth saving of aggregation no longer quite as spectacular for transactions with in-band outputs, even though it still ends up being a ~25.4% saving over Sapling. The savings of aggregation for out-of-band outputs are unaffected.

Aug 18 '22 22:08 daira

It would be possible to support both Cyber-Orchard and plain Orchard addresses at the same time; in that case you would get post-quantum privacy for payments to the Cyber-Orchard ones (and to Orchard ones that the adversary does not know). The ciphertexts could I think be made indistinguishable, so that a Cyber-Orchard-sized ciphertext could encrypt either one output to a Cyber-Orchard recipient, or several outputs to plain Orchard recipients.

We would need to carefully consider how many plain Orchard recipients should be supported per Cyber-Orchard recipient, because each recipient slot requires a trial decryption per IVK when scanning. There's also some information leakage if that number is more than one, because then a transaction will be larger when it sends to multiple Cyber-Orchard recipients than when it sends to the same number of plain Orchard recipients.

Aug 19 '22 00:08 daira