Add Blob Cadence type to store metadata onchain

Open mrakus7 opened this issue 2 years ago • 10 comments

When a project chooses to store NFT metadata on-chain, it has to use the String type to hold binary data. The data must therefore be converted from binary to a text encoding such as Base64 before it is stored, to comply with Cadence's String format.

Issue To Be Solved

  1. Make it possible to use and store binary data in Cadence
  2. Remove the need to convert binary data to Base64 before storing it
  3. Storing binary data as a String increases on-chain storage size by 30-40%
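As a rough check of the overhead figure in point 3, here is a minimal Python sketch (not Cadence, and not part of the proposal). The helper name `base64_overhead` and the 3000-byte random payload are assumptions made for illustration. Base64 maps every 3 input bytes to 4 output characters, so the size increase is about one third, plus a few padding characters when the input length is not a multiple of 3.

```python
import base64
import os


def base64_overhead(payload: bytes) -> float:
    """Return the relative size increase caused by Base64-encoding payload."""
    encoded = base64.b64encode(payload)
    return (len(encoded) - len(payload)) / len(payload)


# 3000 random bytes stand in for arbitrary binary metadata (an assumption).
payload = os.urandom(3000)
overhead = base64_overhead(payload)
print(f"Base64 overhead: {overhead:.1%}")
```

Hex encoding, the other common text encoding for binary data, doubles the size instead, since each byte becomes two characters.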

Suggested Solution

Create and provide a Blob data type for Cadence to use and store large binary data

mrakus7 avatar Feb 07 '22 18:02 mrakus7

We had discussed this before, as a more efficient and immutable alternative to [UInt8]. Could be named e.g. Data, Bytes, etc.

Should be easy to add by basically duplicating most of the implementation for String.

turbolent avatar Feb 08 '22 01:02 turbolent

I forgot to mention that, before raising this request, I discussed it with TheOneSock#1135 on Discord after Office Hours. He asked me to raise it and said that an array of type [UInt8] is worse for the chain and in fact takes more space to store than a String (Base64, with its 30-40% increase).

@turbolent, please discuss it again and come back with a final solution

mrakus7 avatar Feb 08 '22 07:02 mrakus7

@mrakus7

I have mixed feelings about this. Questions in my head:

  • What is the reference point here for "huge" data?
  • Do we want people to use Flow as data storage? (I remember there were some separate data storage ideas mentioned on the forum.) Currently this would increase node memory requirements.
  • I don't like the Base64 solution either, but tbh it is not that bad. (Not sure, but we can at least cover the 30-40% storage size increase from storing binary data as a String, which I see as the only real problem here, with simple gzip on storage or similar.)
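The gzip idea in the last bullet can be tried out in isolation. The Python sketch below is only an illustration under stated assumptions: it uses `zlib` (the DEFLATE algorithm that gzip is built on) and random bytes as a stand-in for binary metadata, and it says nothing about how Flow or Cadence actually store values. Base64 text carries only 6 bits of information per 8-bit character, so a general-purpose compressor can usually win back most of the encoding overhead. It cannot shrink incompressible data below its raw size.

```python
import base64
import os
import zlib

# Random bytes model already-compressed media such as JPEG or PNG (an assumption).
raw = os.urandom(100_000)

# Text encoding needed to fit binary data into a UTF-8 string.
encoded = base64.b64encode(raw)

# Compress the Base64 text at the highest compression level.
compressed = zlib.compress(encoded, 9)

print("raw bytes:        ", len(raw))
print("Base64 characters:", len(encoded))
print("compressed Base64:", len(compressed))

# Round trip: decompress, then decode, to get the original bytes back.
restored = base64.b64decode(zlib.decompress(compressed))
```

Note that this adds a compress/decompress step on every write and read, which a dedicated binary type would not need.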

My opinion: Let's not dig a hole and then plan to build a ladder.

But I am always open to discussion in case I am missing something. Maybe this is a good candidate for a FLIP though, what do you think @turbolent?

bluesign avatar Feb 08 '22 08:02 bluesign

@bluesign ,

  1. Huge data: this is when a project doesn't want to use IPFS or something similar and wants to store metadata (more than 2 MB) on-chain, without any dependencies on other providers
  2. "Do we want people to use Flow as data storage": this question isn't for me, but as a Flow blockchain user I want to store all data on-chain, e.g. NFTs and metadata. Why not, if I can pay the fee for it? This approach was already discussed with some people from the Flow team
  3. Maybe it would be better to have this supported natively by the chain? String (Base64) and [UInt8] are not good solutions. Why can't we create a new Blob Cadence type for binary data and provide built-in gzip support, for example?

mrakus7 avatar Feb 08 '22 12:02 mrakus7

@mrakus7 I see, thanks for the context (that's @janezpodhostnik here on GitHub).

The current options for storing plain binary data are all suboptimal:

  • [UInt8] is stored as an atree array, which has a storage overhead. Also, it is mutable
  • String is immutable, but must be UTF-8 encoded, so to store arbitrary binary data, it must be encoded, e.g. using Base64, which also causes an overhead

We already have a need for an immutable byte array in a few places of the language itself, e.g. for PublicKeys, so a byte array would be helpful.

Adding a dedicated byte array type would be helpful and increase safety (immutable; no "type confusion" caused by strings actually containing encoded data, which could lead to forgetting to decode or using the wrong decoding).
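To make the "type confusion" point concrete, here is a small Python sketch (an analogy only, not Cadence). The variable names are hypothetical. The string `"deadbeef"` is valid as both hex and Base64, so decoding it with the wrong scheme succeeds silently and yields different bytes. A dedicated immutable byte type avoids both this ambiguity and accidental mutation, which Python's `bytes` (immutable) versus `bytearray` (mutable) illustrates.

```python
import base64

# A string that happens to be valid in two different encodings.
stored = "deadbeef"

as_hex = bytes.fromhex(stored)      # 4 bytes: de ad be ef
as_b64 = base64.b64decode(stored)   # 6 bytes, entirely different content

print(as_hex.hex(), len(as_hex))
print(as_b64.hex(), len(as_b64))

# Immutable byte values reject in-place modification.
blob = bytes(as_hex)
try:
    blob[0] = 0  # not allowed on an immutable bytes object
    mutated = True
except TypeError:
    mutated = False

# A mutable array (comparable to [UInt8]) accepts the same write.
array = bytearray(as_hex)
array[0] = 0
```

With a string, nothing in the type says which decoding is the right one. With a byte type, there is nothing to decode.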

A discussion about data storage on Flow, potentially large data, is out of scope for this feature proposal, so let us focus on the language feature itself.

turbolent avatar Feb 08 '22 17:02 turbolent

Yes, correct, @turbolent, it was discussed with @janezpodhostnik.

I just created a new FLIP proposing this new binary Cadence type.

If you agree overall that this binary type can be supported in Cadence in the future, what are the next steps to move it forward?

mrakus7 avatar Feb 15 '22 09:02 mrakus7

Related: https://github.com/onflow/cadence/issues/452

robert-e-davidson3 avatar May 13 '22 23:05 robert-e-davidson3

@AlexHentschel This proposal is for optimizing the storage of blobs. Currently types such as [UInt8] or a hex/base64-encoded string can be used, but they both have an overhead that could be reduced.

This proposal is not arguing if or how much data should be stored.

turbolent avatar Sep 23 '22 23:09 turbolent

would make exec state smaller, purely effort on Cadence side, non-breaking

j1010001 avatar Oct 19 '22 17:10 j1010001

@turbolent thanks for the note. I might have accidentally commented on the wrong issue. I think my comments should have gone to https://github.com/onflow/developer-grants/issues/23 and https://github.com/onflow/flow/issues/831. Moved my comments there. Sorry for the noise.

AlexHentschel avatar Oct 19 '22 19:10 AlexHentschel