Content Addressable Storage Instead
I'm wondering if a better approach would be to offer a content addressable store instead. There are a couple of reasons why you can't effectively build a content addressable store on top of the API you're proposing, and also why it is probably a better primitive for what you're trying to do.
Put simply, in a content addressable store the user doesn't set a value under a key; the user gives a value and is given the key back. Setting the same value twice will return the same key.
Proposal
let key = await byteStorage(value) // value is a File, Blob, Stream, whatever
let value = await byteStorage(key) // could return a promise, or a stream, whatever you wanna go for
While it may seem like this is something that can be built on top of your proposal, you actually can't do it effectively. You don't know the hash of the value until the stream completes, so the underlying implementation needs to use a tmpfile and then do a filesystem level rename to the hash.
I sketched out an implementation and an example of a friendlier store built on top in userland. https://gist.github.com/mikeal/70daaf34ab39db6f979b8cf36fa5ac56
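For a sense of what that write path could look like, here's a minimal sketch (not the gist's code; it assumes Node-style fs/crypto APIs, and the paths, hash choice and function name are purely illustrative):

```js
const crypto = require('crypto')
const fs = require('fs')
const path = require('path')

// Stream the value into a tmpfile while hashing it, then rename the tmpfile
// to its hash once the stream ends, so the key only appears when complete.
async function put (dir, stream) {
  const hash = crypto.createHash('sha256')
  const tmp = path.join(dir, `.tmp-${process.pid}-${Date.now()}`)
  await new Promise((resolve, reject) => {
    stream.on('data', chunk => hash.update(chunk))
    stream.pipe(fs.createWriteStream(tmp))
      .on('finish', resolve)
      .on('error', reject)
  })
  const key = hash.digest('hex')
  await fs.promises.rename(tmp, path.join(dir, key))
  return key
}
```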
I think this is a better primitive as well. Databases have been using tricks like this for years in order to avoid locking. On top of this primitive you can build any number of schemes for locking (first write wins, last write wins, a singleton mutex on key update, or you can use the hash as an atomic identifier of what is being updated and make the user pass the old key in order to update).
If you don't provide this as a primitive, many of these alternate schemes don't work. At the very least, you're pushing people to drastically reduce the lock time they are dealing with, since they are only storing and updating metadata, and you're having them do it in the existing storage APIs where those constraints are already known.
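As an illustration of that last scheme, here is a rough sketch layered on the proposed primitive; the use of localStorage for the named pointer and the function name are assumptions of mine, and the read-check-write is not truly atomic across tabs:

```js
// The named pointer lives in ordinary key/value storage; an update must
// present the key it believes is current, so the hash acts as the atomic
// identifier of what is being updated. (Not truly atomic across tabs.)
async function updatePointer (name, expectedKey, newValue) {
  const newKey = await byteStorage(newValue)   // store the new content first
  if (localStorage.getItem(name) !== expectedKey) {
    throw new Error('conflict: the pointer moved since it was read')
  }
  localStorage.setItem(name, newKey)           // swap the pointer to the new hash
  return newKey
}
```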
There are also many other benefits to content addressability. It solves all your locking issues (this API never locks), shortens any locking in existing APIs to far less than the time a stream takes to end, and you get syncing on top of this API for free :)
Thoughts?
+1
Blake2B seems like a good hash for now. But you could allow specifying the hash and its parameters in the write call. (BLAKE2, for example, allows different-sized digests, and therefore keys.)
What about garbage collection? How would you provide a set of hashes to keep? Also, what about trees of values (where one node points to another by hash)? Git can do this because it has fixed types, but generally you need some structure.
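Something like the following hypothetical extension of the write call, where the option names are invented for illustration:

```js
// Hypothetical options object on the write call; names are illustrative.
let key = await byteStorage(value, { hash: 'blake2b', digestLength: 32 })
```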
My design was to encode objects as either raw binary data or serialized objects with a pointer primitive value type. (I used msgpack with pointer being one of the extended types).
It doesn't have to be this complex, though; you could simply allow space for an object to list an array of the objects it references, for GC purposes. Then you only need the root nodes to do a GC sweep instead of all nodes.
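For example, a stored object could carry an explicit list of the keys it points to; the field names below (refs, payload) are made up and not part of any proposal:

```js
// Store a leaf, then a root that lists the keys it references.
const childKey = await byteStorage(new Blob(['leaf data']))
const rootKey = await byteStorage(new Blob([JSON.stringify({
  refs: [childKey],        // keys this object points to, for the GC sweep
  payload: 'root data'
})]))
```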
+1
I recommend using multihash so that different users can pick their own hash functions for their needs and/or upgrade them over time without having to break the keys to achieve a full migration.
Decisions made on present-day assumptions (i.e. "for now") can lead to hazardous futures (e.g. git and SHA-1). A little more on this idea can be found in the "Future Proofing Systems" talk at BPASE17.
What about garbage collection? How would you provide a set of hashes to keep? Also what about trees of values (where one node points to another by hash)? Git can do this because it has fixed types, but generally, you need some structure.
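For illustration, a multihash-style key simply prefixes the digest with a code for the hash function and the digest length, so keys stay self-describing if the hash is ever upgraded; the wrapper function below is hypothetical, though 0x12/0x20 are the real multihash code and length for sha2-256:

```js
// 0x12 is the multihash code for sha2-256 and 0x20 (32) its digest length;
// a different hash function would just carry a different prefix.
function multihashKey (digest /* Uint8Array of 32 sha2-256 bytes */) {
  return Uint8Array.from([0x12, 0x20, ...digest])
}
```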
Good point. If byte-storage enables recursive pins, then we need to understand how objects are linked, and if there isn't a fixed scheme for all the objects, then we need a way to identify their types.
Initially IPFS got away with this because every MerkleDAG node was a protobuf. Today, with support added for git, bitcoin, eth, zcash, cbor and others, we manage to understand the data type, and these objects are linked together through IPLD and the Content Identifier (CID), which gives you the hash + the data type. That is what enables recursive pins of linked heterogeneous objects.
A nit: Decouple setting data from addressing data. You want to be able to come up with the key for data you have without writing to the store, say to do a set-membership check. You can still hide your choice of hash function by exposing it as an API whose return values are defined in terms of valid arguments to get and set.
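A hypothetical shape for that, with keyFor and has being invented names:

```js
// Compute the key without writing, then only upload if the store lacks it.
const key = await byteStorage.keyFor(value)
if (!(await byteStorage.has(key))) {
  await byteStorage(value)
}
```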
@diasdavid I don't think you need to know the type; just make it part of the storage format, where the payload is arbitrary:
pointers*
data
Though this does mean that it's now up to whoever is uploading the data to extract any pointers and include them in the store command.
@kemitchell I don't think this is what you meant, but you gave me a great idea.
What if the storage engine decoupled storing data from setting the key for that data?
So I think this would fit @mikeal's case fine, and the storage engine wouldn't have to calculate any hashes or know anything about them.
Suppose I want to store a large file or simply a stream of information.
- I call a storeAPI which lets me stream data in somehow.
- When done, it gives me an anonymous file descriptor.
- I then call a set or something to "close" the file descriptor and give it a name.
Pretty much all hashes allow calculating on streams as you go, so it's easy for the client to know the hash when it's done streaming.
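Putting the pieces together, a hedged sketch of that flow; storeAPI.open/write/set are invented names, and Node's crypto API is used only to show hashing chunk-by-chunk while streaming:

```js
const crypto = require('crypto')

async function storeThenName (storeAPI, stream) {
  const hash = crypto.createHash('sha256')
  const descriptor = await storeAPI.open()   // anonymous file descriptor
  for await (const chunk of stream) {
    hash.update(chunk)                       // hash as we go
    await descriptor.write(chunk)
  }
  const key = hash.digest('hex')             // known as soon as streaming ends
  await storeAPI.set(key, descriptor)        // "closes" the descriptor under a name
  return key
}
```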
This also means that the previous discussion about GC is now an application concern and not part of the storage engine. This can cause some performance hit and pretty much prevents cross-site sharing of datasets with duplicate values. (The engine can no longer assume that two sites with the same hash for a value hold the same value, or that data is immutable.)
Looking back at the top post, I don't think this idea provides all the benefits. Having the storage engine be able to assume it's content addressable really enables all kinds of nice properties.
RE: garbage collection.
As long as you can list the hashes in the store, the user can do that. We don't have anything like this for any other storage (once you put stuff in localStorage or IndexedDB you have to clean it up yourself).
Caching what is "active" in storage can be done through any of the other persistent APIs that are addressable by user-specified keys.
I don't think it would be appropriate for the spec to define this, because it would limit the kinds of storage mechanisms you can build on top of it. For instance, I have a content addressable file structure and all I ever need to know is the hash of the root node in order to figure out what is "active", but other use cases may be much more complex.
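A userland GC along those lines might look roughly like this; byteStorage.list(), byteStorage.delete() and childKeysOf() are hypothetical, and the root hash is assumed to live in ordinary key/value storage:

```js
async function collectGarbage () {
  const live = new Set()
  const queue = [localStorage.getItem('root')].filter(Boolean)
  while (queue.length) {
    const key = queue.pop()
    if (live.has(key)) continue
    live.add(key)
    const node = await byteStorage(key)   // read the value back by its hash
    queue.push(...childKeysOf(node))      // application-defined traversal
  }
  for (const key of await byteStorage.list()) {
    if (!live.has(key)) await byteStorage.delete(key)
  }
}
```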
@mikeal is right:
While it may seem like this is something that can be built on top of your proposal, you actually can't do it effectively. You don't know the hash of the value until the stream completes, so the underlying implementation needs to use a tmpfile and then do a filesystem level rename to the hash.
But instead of implementing a CAS, what about just exposing rename? rename is useful for other things too, for example: http://npm.im/atomic-file
I like the rename idea. It's lower-level and so a little harder to use, but it reduces the complexity of the browser implementation a lot (compared to a built-in CAS). Then it's up to userland libraries to make these decisions.
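Roughly, a userland CAS on top of such a rename primitive could look like this sketch, where writeAndHash and storage.rename are invented names:

```js
const tmp = `tmp-${crypto.randomUUID()}`
const key = await writeAndHash(tmp, stream)   // stream to tmp, hashing as you go
await storage.rename(tmp, key)                // the hash key appears only once complete
```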
I'm generally a fan of browsers giving us powerful, but simple primitives and letting us build on top.