progress of ciphercore?

Open okdistribute opened this issue 4 years ago • 19 comments

Just curious how it's going with this module :) really looking forward to using it

Sep 18 '19 22:09 okdistribute

Hey @okdistribute ! I'm sorry for the lack of updates!!

I ran into an issue, even though the theory behind ciphercore is sound and i see no problem using the original publicKey as a 'replicationKey', and then further using the concatenation of contentKey + publicKey as a distributable key for read-access.

Effectively i got stuck in implementation with a chicken-and-egg scenario: in order to store/initialize the new contentKey we'd need a backdoor into hypercore's initialization process (hooks into the key-loading part; I remember spending hours stepping through the storage interaction looking for a safe spot to monkeypatch).

So what's broken right now is contentKey management: writing/reading the key to/from storage, and the ability to detect when to generate a new contentKey.

I think it would be a lot easier to accomplish this by adding cold-store encryption as a direct feature to hypercore, or at least by asking politely if we can extend hypercore's options with a contentEncryption: { encrypt(Buffer), decrypt(Buffer) } handler to complement the contentEncoding handler.

And/or have some more hooks into the initialization process, because by the time core.ready() has fired other applications might already have started reading/appending entries while ciphercore is just about to start figuring out if it's supposed to have a contentKey or not... :(
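A minimal sketch of what such a handler pair could look like. The contentEncryption option and the createContentEncryption helper are hypothetical (hypercore does not expose them), and AES-256-GCM from Node's built-in crypto module is only a stand-in for the sodium primitives ciphercore uses:

```javascript
const crypto = require('crypto')

// Hypothetical factory for the proposed `contentEncryption` option.
// Neither the option nor this helper exists in hypercore; this only
// illustrates the { encrypt(Buffer), decrypt(Buffer) } shape.
function createContentEncryption (key) {
  return {
    // Returns nonce (12 bytes) | auth tag (16 bytes) | ciphertext
    encrypt (plain) {
      const nonce = crypto.randomBytes(12)
      const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
      const body = Buffer.concat([cipher.update(plain), cipher.final()])
      return Buffer.concat([nonce, cipher.getAuthTag(), body])
    },
    // Throws if the key is wrong or the data was tampered with
    decrypt (sealed) {
      const nonce = sealed.subarray(0, 12)
      const tag = sealed.subarray(12, 28)
      const decipher = crypto.createDecipheriv('aes-256-gcm', key, nonce)
      decipher.setAuthTag(tag)
      return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()])
    }
  }
}
```

With a handler like this, hypercore would call encrypt() just before a block hits storage and decrypt() just after it is read back, so the application never sees ciphertext and the disk never sees plaintext.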

My memory is not exact but i remember some of my frustrations.

Sep 18 '19 23:09 telamon

I think @jwerle might have some insight since he was looking into hypercore encryption at Ara. What did your solution look like compared to this one?

Sep 20 '19 05:09 RangerMauve

@RangerMauve Sorry my rant was kinda long. TL;DR; Transparent encryption: no problemo! Hypercore constructor is messy: blocker.

I wish hypercore exported some kind of "load state" variable after initializing:

const feed = hypercore(store, key)

feed.ready(() => {
  const state = feed.initializedFrom
  if (state === 'exist_storage') console.log('Existing core loaded from storage')
  else if (state === 'init_storage') console.log('Initialized new storage with provided key')
  else if (state === 'generated') console.log('Generated new identity pair and initialized storage')
})

Sep 20 '19 06:09 telamon

@RangerMauve HI!

Yes I put a little thing here using the XSalsa20 stream cipher that leverages the hypercore onwrite hook to modify the data after verification. You can check it out here

The example uses the valueEncoding to make use of a xsalsa20 codec.

const { keyPair } = require('hypercore-crypto')
const replicate = require('hypercore-replicate')
const hypercore = require('hypercore')
const xsalsa20 = require('xsalsa20-encoding')
const crypto = require('crypto')
const hook = require('./')
const ram = require('random-access-memory')

const key = crypto.randomBytes(32)
const nonce = crypto.randomBytes(24) // XSalsa20 takes a 24-byte nonce
const onwrite = hook({ nonce, key })
const { publicKey, secretKey } = keyPair()
const valueEncoding = xsalsa20(nonce, key)

const feed = hypercore(ram, publicKey, { secretKey, valueEncoding })

feed.ready(() => {
  const copy = hypercore(ram, publicKey, { onwrite })
  const other = hypercore(ram, publicKey, { valueEncoding })

  feed.append('hello')

  replicate(feed, replicate(copy, { live: true }), replicate(other, { live: true }), {
    live: true
  })

  copy.update(() => {
    copy.head((err, buf) => {
      console.log('%s', buf) // 'hello'
    })
  })

  other.update(() => {
    other.head((err, buf) => {
      console.log('%s', buf) // 'hello'
    })
  })
})

Sep 20 '19 14:09 jwerle

@jwerle um the on-write hook, does that mean that the content needs to be decrypted in order to replicate?

Sep 20 '19 14:09 telamon

@telamon nope!

EDIT:

If const other = hypercore(ram, publicKey, { valueEncoding }) dropped the valueEncoding, it would still replicate, but the contents of its data storage would stay encrypted.

Sep 20 '19 14:09 jwerle

Ah then we're using pretty similar designs, this might be a silly question but why does "after-verification" matter? Is it to avoid wasting cycles on decrypting potentially invalid data when replicating?

As far as algorithms go, ciphercore just uses the sodium/secret_box api if i remember correctly.

Sep 20 '19 15:09 telamon

"after verification" means that just after hypercore has verified the checksum of the merkle tree, it calls a "write hook" as seen here, here, and here

It is a powerful feature because all writers get to encrypt, and read-only replicators get to move already verified data around a swarm without actually worrying about whether the contents are encrypted or decrypted. This works because the read-only nodes will still have the correct tree, signatures, etc.

The moment you need to decrypt and use the data, you should employ an onwrite hook for a reader or use a value encoding that will modify the data before returning to the caller (get(), head(), etc...)
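As a rough illustration of the value-encoding route, here is a sketch of a codec that encrypts on encode and decrypts on decode. The encryptedEncoding name is made up for this example, AES-256-CTR from Node's crypto stands in for XSalsa20, and passing such an object as hypercore's valueEncoding is an assumption about the custom-codec interface rather than something this thread confirms:

```javascript
const crypto = require('crypto')

// Hypothetical codec in the { encode, decode } shape used by value encodings.
// AES-256-CTR is a stand-in stream cipher; it provides no authentication,
// so this sketch assumes integrity comes from hypercore's own verification.
function encryptedEncoding (key) {
  return {
    // Stored form: random IV (16 bytes) | ciphertext
    encode (value) {
      const iv = crypto.randomBytes(16)
      const cipher = crypto.createCipheriv('aes-256-ctr', key, iv)
      return Buffer.concat([iv, cipher.update(Buffer.from(value)), cipher.final()])
    },
    // Runs when the caller reads an entry, e.g. through get() or head()
    decode (buf) {
      const decipher = crypto.createDecipheriv('aes-256-ctr', key, buf.subarray(0, 16))
      return Buffer.concat([decipher.update(buf.subarray(16)), decipher.final()])
    }
  }
}
```

A reader constructed without the codec would still replicate and verify the same blocks, but would only ever see the IV-prefixed ciphertext.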

I hope this makes sense!

Sep 20 '19 15:09 jwerle

@jwerle sorry for being a bit thick on the subject, I'm starting to believe that we have slightly different use-cases.

If i understand you correctly, when receiving you're decrypting the data so it gets stored in non-encrypted format on disk if your peer has the decryption keys (and that's where the onwrite hook comes into play, because the signatures get generated from the encrypted data and thus must be verified against the encrypted data).

phew am i close? :)

Sep 20 '19 19:09 telamon

@telamon yep. One could also just use a regular hypercore for the writer, and when swarming the write hook encrypts the data sent to peers, so readers are working with an encrypted data storage.

Sep 20 '19 20:09 jwerle

@jwerle Ah i see! Alright then I'm happy to have discovered your writehook, I definitely see some use-cases for it.

So the usecase i had in mind when i wrote ciphercore was to also store the data encrypted on disk, and only ever decrypt it in memory given that the decryption keys were provided.

There's some drawbacks and benefits to both approaches, @jwerle please correct me if my assumptions are wrong:

| feature | xsalsa-write-hook | ciphercore |
| --- | --- | --- |
| transparent encryption/decryption (from the application's perspective) | yes | yes |
| blind replication | yes | yes |
| reader decrypts | once (onwrite to RandomAccess) | every access (feed.get(n)) |
| writer encrypts | every replication stream? | once (feed.append(data, cb)) |

I hope i'm not spreading any misinformation.

@okdistribute you should probably consider your use-case: if the data you store is not highly sensitive, or you want to avoid a performance penalty on data reads, or the data needs to be readable outside of the javascript process (using dat-storage), then i would suggest using the writehook instead of ciphercore.

Edit: @okdistribute actually I realized that i'm quite curious about your usecase, would it be ok to ask what you're building? :smile:

Sep 21 '19 11:09 telamon

Snap thank you for the details @jwerle @telamon ! That chart will be especially helpful for people trying to choose a method of encryption. :O

Sep 23 '19 17:09 RangerMauve

@telamon this is my first time reading this and i tried to figure out how it works, but i find it hard to follow.

Does ciphercore encrypt each chunk so i can request a random set and verify it using its merkle tree, or do i decrypt the chunks and use the original hypercore merkle tree?
Also, do i use a private key to decrypt chunks or does it need to use a symmetric key?

I am curious because in order to build a more p2p-ish alternative to hashbase.io which would deal with hyper- or ciphercores rather than hyperdrives or dat's, there is a problem of sybil and/or outsourcing attacks, where peers pretend to store many duplicates of hypercore chunks, but instead:

- have many accounts but only share one copy of the hypercore
- have only one account, but if anyone requests chunks, they quickly download them from peers who actually store them to then answer the requests...

Oct 16 '19 09:10 serapath

Hey @serapath happy to see the show of interest! Just a heads up, as I wrote earlier, this project is still in an alpha stage.

The idea is to use a plain hypercore and wrap it up in an es6 proxy that injects a non-intrusive content-encoder. In other words, this method should provide the exact same results as encoding your feed-entries with some unknown encoding that a remote end might not know how to decode. So the merkle tree and general behaviour of hypercore is untouched and works as usual.
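A rough sketch of the proxy idea, not ciphercore's actual code. MemoryFeed is a made-up stand-in exposing only append() and get(), wrapFeed, seal and open are illustrative names, and AES-256-GCM from Node's crypto replaces the sodium AEAD:

```javascript
const crypto = require('crypto')

// Stand-in for a hypercore feed (assumption: only append/get/length matter here)
class MemoryFeed {
  constructor () { this.blocks = [] }
  get length () { return this.blocks.length }
  append (data, cb) { this.blocks.push(Buffer.from(data)); cb(null) }
  get (index, cb) { cb(null, this.blocks[index]) }
}

// Sealed form: nonce (12 bytes) | auth tag (16 bytes) | ciphertext
function seal (key, plain) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
  const body = Buffer.concat([cipher.update(plain), cipher.final()])
  return Buffer.concat([nonce, cipher.getAuthTag(), body])
}

function open (key, sealed) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, sealed.subarray(0, 12))
  decipher.setAuthTag(sealed.subarray(12, 28))
  return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()])
}

// Intercept append/get on the feed; everything else passes through untouched
function wrapFeed (feed, key) {
  return new Proxy(feed, {
    get (target, prop) {
      if (prop === 'append') {
        return (data, cb) => target.append(seal(key, Buffer.from(data)), cb)
      }
      if (prop === 'get') {
        return (index, cb) => target.get(index, (err, block) => {
          if (err) return cb(err)
          let plain
          try { plain = open(key, block) } catch (e) { return cb(e) }
          cb(null, plain)
        })
      }
      const value = target[prop]
      return typeof value === 'function' ? value.bind(target) : value
    }
  })
}
```

Calling wrapFeed(new MemoryFeed(), key) gives an object that behaves like the feed it wraps, while the underlying blocks (and therefore anything hashed or replicated from them) only ever contain ciphertext.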

The content encryption uses a single shared secret key (symmetric, not an asymmetric key pair) as described by: https://download.libsodium.org/doc/secret-key_cryptography/aead So only the peers who have access to the secret "content" key can decrypt the data. I don't see any reason why you couldn't use some other encryption; take a look at the encrypt and decrypt methods defined within index.js.

As for the attacks you described, I'm a little bit unsure if i possess the wisdom to answer them correctly. But they both sound like MITM patterns? The newly released Hypercore:v8 includes hypercore-protocol:v7, which might address some of your concerns by introducing a noise-handshake that can help verify that the peer you're talking to truly has the key they've advertised. (There shouldn't be any issues with bumping ciphercore's hypercore dependency to v8.)

If you'd like to help out, please fork and send a PR back or just poke me with some updates. I want to release a 1.0 of this project but i'm a bit occupied with trying to release a 1.0 for decentstack this month.

Oct 16 '19 10:10 telamon

Thx for your explanation and also the link :-)

The idea is that:

- each peer participating in the p2p-alternative to hashbase receives a replica of the same hypercore encrypted with a different private key which those peers do not have access to.
- and all the public keys for all the peers and their replicas are public so anyone can still receive the original hypercore data from a combination of those peers and re-assemble the original hypercore

This is in order to make sure peers actually store full copies of the hypercore they replicate so they can be challenged and need to respond with randomly requested chunks from that uniquely encrypted hypercore they store.

They can not get the requested chunks "just-in-time" from other peers, because they have a unique copy and the only way to respond to the challenges is to actually keep the data around.
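One way to picture the challenge step is the sketch below. It is only an illustration under several assumptions: each peer's replica is encrypted with its own symmetric key (AES-256-CTR from Node's crypto as a stand-in for whatever scheme would really be used), the challenger keeps a SHA-256 hash of every encrypted chunk per peer, and all function names are invented for this example:

```javascript
const crypto = require('crypto')

const sha256 = (buf) => crypto.createHash('sha256').update(buf).digest()

// Encrypt one chunk for one peer. The IV is derived from the chunk index,
// which is only acceptable here because every peer has its own key.
function encryptChunk (peerKey, index, chunk) {
  const iv = Buffer.alloc(16)
  iv.writeUInt32BE(index, 0)
  const cipher = crypto.createCipheriv('aes-256-ctr', peerKey, iv)
  return Buffer.concat([cipher.update(chunk), cipher.final()])
}

// The uniquely encrypted copy a given peer is asked to store
function makeReplica (chunks, peerKey) {
  return chunks.map((chunk, i) => encryptChunk(peerKey, i, chunk))
}

// What the challenger keeps per peer: one hash per encrypted chunk
function commit (replica) {
  return replica.map(sha256)
}

// Pick a random chunk index to ask the peer for
function pickChallenge (length) {
  return crypto.randomInt(length)
}

// The peer passes only if it returns its own encrypted chunk
function verifyResponse (commitments, index, response) {
  return crypto.timingSafeEqual(commitments[index], sha256(response))
}
```

A peer that fetches the chunk just-in-time from someone else's replica would return different ciphertext and fail verifyResponse, which is the property the unique per-peer encryption is meant to provide.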

Oct 16 '19 16:10 serapath

@serapath Sorry for the late reply, that's an interesting idea. I'm not entirely familiar with the use-case but i'm intrigued :).

I too was looking into making a more useful hashbase, except once i started looking into the details i realized that we were missing quite a few other parts as well, which led me to start working on decentstack.

You might wanna take a look at the middleware-interface/exchange-protocol for exchanging the challenges. If you feel like something is missing please give me some feedback. :)

Oct 18 '19 22:10 telamon

thx, i'll take a look :-)

Oct 21 '19 16:10 serapath

This might also be relevant to compare against: https://github.com/ameba23/crypto-encoder

(thanks @frando)

Dec 24 '19 23:12 RangerMauve

@RangerMauve actually i think @ameba23 might have derived that code from ciphercore :) I had a quick chat with him since we both accidentally raced for the crypto-encoder name on npm. Here's my version: https://github.com/telamon/crypto-encoder (unpublished) Will try and merge our efforts in the coming days.

I strongly recommend that anyone in need of content-encryption uses one of the mentioned crypto-encoders for now. The other goal of ciphercore, providing an intuitive and transparent API, is gonna take some more time I'm afraid (january-ish).

Dec 24 '19 23:12 telamon
