cadence icon indicating copy to clipboard operation
cadence copied to clipboard

Storage Querying API

Open cybercent opened this issue 5 years ago β€’ 30 comments

Issue To Be Solved

A way to iterate over the storage to be able to list the resources owned by an account.

(Optional): Suggest A Solution

  • using a Cadence script: for entry in account.storageEntries { log(entry) }
  • using RPC: getAccount(β€œ0x01") The response result can have a storage field with data in the cadence format (json with types)
  • Details of the technical implementation
  • Tradeoffs made in design decisions
  • Caveats and considerations for the future

If there are multiple solutions, please present each one separately. Save comparisons for the very end.)

(Optional): Context

I’m working on a marketplace and I was looking for a way to list all NFTs owned by an account.

cybercent avatar Jul 04 '20 00:07 cybercent

Hi @turbolent is there an ETA for the Storage Querying API ?

cybercent avatar Sep 30 '20 09:09 cybercent

@cybercent There's no ETA, but it's definitely one of the things we'll work on soon, along with other storage API improvements like #369 and #376.

I've added a roadmap for Cadence yesterday, and improving the storage API is a high-priority item.

If you're interested in helping out, let me know!

turbolent avatar Sep 30 '20 17:09 turbolent

Bump!

10thfloor avatar Feb 12 '21 03:02 10thfloor

Is this issue resolved? -> https://docs.onflow.org/faq/developers/#how-can-i-see-what-fungible-tokens-an-account-has

dryruner avatar Jul 20 '21 07:07 dryruner

@testrain No, this issue is still open. The documentation you linked requires knowledge about the fact that the account has certain values stored. The idea for this issue is to allow developers to query what storage paths store values, and what they are, without having to prior knowledge what is stored (i.e. what storage paths are used).

We're currently working on a new dictionary data structure and will eventually use is for account storage, effectively giving us the functionality to list storage paths which store values for free.

turbolent avatar Jul 20 '21 14:07 turbolent

@turbolent any updates for this issue?

atapin avatar Sep 16 '21 12:09 atapin

This is the one thing that is highly painful and annoying when working with Flow. I think this should be prioritized. Any other blockchain allows any dev to generically get all NFTs for a given account. In the case of flow, because an NFT can be stored into any storage path, it is impossible to generically show all NFTs owned by the an account. Another problem that will need to be solved after the generic storage querying is figured out is generically loading metadata for the NFTs. The INFT interface should have the signature for a the getMetadata function so developers can generically pull it for any NFT in a collection.

D10100111001 avatar Sep 29 '21 18:09 D10100111001

This is the one thing that is highly painful and annoying when working with Flow.

Agreed!

Any other blockchain allows any dev to generically get all NFTs for a given account.

Actually, I would disagree with this. Looking at Ethereum, it's easy to find out the list of NFTs of a specific type owned by an account, but that's also easy on Flow (by assuming the Collection is in the standard path). But on Ethereum, you have to know the contract address of all possible NFTs to know the list of all NFTs owned by a single account. That's barely tractable on Ethereum today, but not especially scalable for the future. Once Cadence has storage iteration, this will be something that is a MASSIVE improvement on Flow than on Ethereum.

This feature is taking a long time because it depends on a pretty significant rewrite of the underlying storage layer of Cadence. That rewrite is rolling out soon (code is complete and tested!), and so we hope to have storage iteration available to scripts and contracts very soon. Thanks to y'all for your patience!

dete avatar Nov 09 '21 21:11 dete

@dete thank you for the update! Thank you to the team for the continued effort you guys put in. Highly appreciated! πŸ––

cybercent avatar Nov 10 '21 22:11 cybercent

Once #1248 is merged we finally have the infrastructure in place to iterate over account storage. The next step would be to come up with an API.

Some challenges that need to be addressed:

  • Storage may contain resources
  • Storage may contain many objects
  • Cadence does not (yet) provide generics to Cadence programs (only built-in types may be generic)
  • Cadence does not (yet) provide means to dereference value-kinded objects (e.g. there is no way to get an Int from an &Int)

The first point makes defining a "storage entry" type for both resource-kinded and value-kinded (e.g. structs) objects difficult. For resource-kinded values, the entry should probably provide an authorized reference (auth &AnyResource). For value-kinded objects it would be more ergonomic to provide the value itself, instead of a reference to it (AnyStruct).

The second point makes adding a field let storageEntries: [StorageEntry] inefficient: If there are many objects in storage, then a large array would have to be temporarily allocated. A more efficient alternative could be an iterator.

Also, it might make sense to provide the run-time type of the object in the storage entry (e.g. a field let type: Type)

turbolent avatar Nov 24 '21 00:11 turbolent

Maybe it is enough if the storage entries only contain the path and the type, and do not contain a value – we already have the storage API which can be used to load/copy/borrow the value for a given path.

struct StorageEntry {
  let path: StoragePath
  let type: Type
}

let account = getAccount(0x1)
account.forEachStored(fun (entry: StorageEntry): Bool {
  if entry.type == Type<FlowToken.Vault>() {
    log(entry.path)
    // stop iteration
    return false    
  }


  // continue iteration
  return true
})

Or without introducing a new type, which is not very useful on its own and making the information parameters of the iteration function:

let account = getAccount(0x1)
account.forEachStored(fun (path: StoragePath, type: Type): Bool {
  if type == Type<FlowToken.Vault>() {
    log(path)
    // stop iteration
    return false    
  }


  // continue iteration
  return true
})

turbolent avatar Nov 30 '21 22:11 turbolent

In addition to iteration over the account storage, we can also add iteration over public and private links/capabilities, and the question is also, what should the API look like

turbolent avatar Dec 02 '21 22:12 turbolent

I think iteration is nice for sure, but also additionally some kind of query functionality can be useful. ( to limit search area and return array of struct )

If we will use the struct below.

struct StorageEntry {
  let path: StoragePath
  let type: Type
}

I think we can also use query based ( returning [StorageEntry] ) functions.

Something like account.storageManager can be useful. ( later even maybe this can be used to move storage setup for DApps into contracts, without passing AuthAccount but passing storageManager )

For example:

account.storageManager.query(_ type: Type) [StorageEntry]
account.storageManager.queryFirst(_ type: Type) StorageEntry?

or if iterator:

account.storageManager.query(_ type: Type) StorageEntryIterator

for storageEntry in account.storageManager.query(Type<Topshot.Collection>()){
}

Also I think we can extend StorageEntry with load and borrow like below, so we can slowly get rid of Paths little bit.

struct StorageEntry {
  let path: StoragePath
  let type: Type
  fun borrow()
  fun load()
}

If we cover also links and capabilities with StorageEntry, can be useful I guess.

Other then query here also:

  • keep track of capabilities ( given to that path to other accounts )
  • if possible option to revoke

I am a bit torn between if needed to add some helper booleans like canBorrow canLoad etc. ( like capability check method )

But by moving a lot of functionality into StorageEntry can make it really powerful.

Also something like queryAndBorrowFirst can be useful.

Some typical Transaction can look like this:

let provider = acct.strorageManager.queryAndBorrowFirst(Type<DapperUtilityCoin.Vault{FungibleToken.Provider}>()) 
    ?? panic("Could not borrow a reference to the buyers FlowToken Vault")
self.purchaseTokens <- provider.withdraw(amount: UFix64(6)) as! @DapperUtilityCoin.Vault

bluesign avatar Dec 02 '21 22:12 bluesign

I think it would be very nice to now only search in storage but also in public paths. Storage are right now not available from scripts, so the usefullness of that is limited IMHO.

Questions I want to have answered is:

  • Give me all capablities that can be borrowed as &{NonFungibleToken.CollectionPublic} or give med all wallets that is &{FungibleToken.Receiver}.

I really like @bluesign api from above. But for me allowing it to also search in what you can get a capability to this resource as would be awesome.

It would also be useful if we could combine this with the Metadata standard and be able to say. Give me all resources that have this View. Or even "give me all items in storage" that have this view, but that is a little bit heavy maybe?

bjartek avatar Dec 03 '21 12:12 bjartek

I think best option is to expose storageManager ( or whatever we call it ) to both PublicAccount and AuthAccount. ( Immutability can work here very good )

It would also be useful if we could combine this with the Metadata standard and be able to say. Give me all resources that have this View. Or even "give me all items in storage" that have this view, but that is a little bit heavy maybe?

This would be very nice +1

Actually we can maybe introduce some Filter function to iterator/array etc, which will get a Cadence Any, then you will return True/False for some condition etc.

bluesign avatar Dec 03 '21 12:12 bluesign

Are there any concrete plans for when this will be done? It is very important imho.

bjartek avatar Dec 20 '21 08:12 bjartek

Are there any plans for this to span addresses? One other common Β«issueΒ» on flow is get a listing of all nfts of a type across addresses.

Or even get a list of all account with the given thing stored/linked.

bjartek avatar Jan 10 '22 06:01 bjartek

Thank you for the great ideas!

I agree, a type-based storage querying API would be great to have. Unfortunately it is currently not possible, as the FVM / state trie does not provide an index of type to storage locations.

As for iteration to other domains, like public and private, in addition to storage: That would be possible and sounds like a good idea πŸ‘

turbolent avatar Jan 19 '22 19:01 turbolent

@turbolent wouldn't it possible to handle this on cadence side ? Eventually it will be just a dictionary, {String : [Path]} where key is a type identifier. I don't think we need ability to deep query there.

bluesign avatar Jan 21 '22 20:01 bluesign

@bluesign Yes, if we wanted to limit it to just root/top-level values, we could maintain the index from the Cadence side. I wonder though how useful it would be, as some developers may assume it returns all values of the type.

We had also discussed exposing the value traversal functionality that already exists in the implementation to Cadence, so one could use the querying function and the traversal function to find all values of a type (though it would be very expensive, as all intermediate data would need to get loaded)

turbolent avatar Jan 31 '22 17:01 turbolent

@turbolent this is very good question actually.

In the context of containers, multi level storage, I see those options:

  • array / dictionary
  • composite resource

For composite resources, I think it is not not only feasible, but also not possible to query their members reliably.

I think even NFTCollection should not be queried for ownedNFTs. (Even id is key there we would need to use uuid which will not provide info)

For array and dictionaries: I think here is also little tricky, It would need some addressing like mentioned before, but it would be expensive.

On the other hand, I don't see much usage of dictionary/array in storage, they are rarely used in top level storage.

For me it seems like acceptable trade off to query top level by type, and possibly get rid of Paths a bit.

But more importantly, if we make Query storage by Type for collections and vaults, it would be super easy for wallets to say something like : 'this transaction will access your FlowTokenVault' etc so basic human readable transactions can be a thing in the future without simulating the transaction.

bluesign avatar Jan 31 '22 21:01 bluesign

I totally agree that it would be nice to add a type-based querying API eventually, but I would also like to see an iteration querying API before that, as it could be implemented now.

What do people think about the proposal above (https://github.com/onflow/cadence/issues/208#issuecomment-983086381)? Or maybe someone has another good idea for an iteration API?

turbolent avatar Mar 03 '22 17:03 turbolent

I think that proposal is a good place to start. Then you can build higher level abstractions on top of it later.

bjartek avatar Mar 03 '22 18:03 bjartek

I totally agree that it would be nice to add a type-based querying API eventually, but I would also like to see an iteration querying API before that, as it could be implemented now.

What do people think about the proposal above (#208 (comment))? Or maybe someone has another good idea for an iteration API?

I like the second API example.

devbugging avatar Mar 21 '22 16:03 devbugging

Discussed iteration / pagingation with @fxamacker and @ramtinms.

It won't be possible to create a "cursor" that will allow starting iteration over an atree data structure (e.g. ordered map, as used for account storage) in one script or transaction, then continue iteration in a subsequent script transaction against the state the cursor was created for.

If we loosen the requirement to effectively query previous state and always query the current state, the next problem are mutations to the container: If the container is mutated, e.g. a new key is inserted into an ordered map, then the iteration order changes, invalidating the cursor and leading to the iterator e.g. not reporting certain elements or reporting them multiple times.

If we loosen the requirement further and assume that the iterated container is not mutated, then @fxamacker thinks it might be possible to provide a cursor that can be serialized, so e.g. it can be returned from a script, then provided to a subsequent script, or it could be stored on-chain.

To help with the limitation of mutations invalidating the cursor, we could ensure the iterated data structure is not mutated, for example by:

  • Recommending developers to prevent mutations, e.g. by locking, and/or detecting mutations
  • Add built-in support for locking data structure for mutations, and/or provide a way to detect mutations by e.g. generating a "state identifier" for the data structure

turbolent avatar Jul 08 '22 23:07 turbolent

I think solution here is detecting mutation ( not avoiding ). Technically nothing can iterate on mutating object. So mutation should just invalidate the cursor. I don't know much about atree structure, but probably ordered map can use some state identifier + key as cursor.

But if you limit discussion to path keys only. As we have ordered map, seek with prefix can work too. with seekWithPrefix and next we get rid of cursor totally, mutation responsibility falls on user of the api. But I still think state identifier is very nice idea in my opinion, I think it can be useful in other areas too. [0]

[0] https://github.com/onflow/cadence/issues/1777

bluesign avatar Jul 09 '22 10:07 bluesign

This should probably implemented in three stages of functionality:

  1. Add support for retrieving all paths of an account, e.g. a computed field Auth/PublicAccount.paths: [Path]. This would not support pagination, but would be useful as an MVP.
  2. Add support for pagination, without handling mutation. This could be a function-based API, like above, or reified as an object as an iterator/enumerator/cursors.
  3. Add support for handling mutation during mutation, e.g. by locking or detecting mutation as described above

As for the implementation:

  • In the checker, AuthAccountType and PublicAccountType, add a new member, a constant field typed let paths: [Path]
  • In the interpreter, in NewAuthAccountValue add a computed field path. Implement it by getting the storage maps for all domains via Interpreter.Storage.GetStorageMap, and use StorageMap.Iterator and StorageMapIterator.NextKey to iterate over all keys of the storage map, the path identifiers. Create paths from the domain and the identifier

turbolent avatar Jul 22 '22 19:07 turbolent

@janezpodhostnik Do you think we will need to add dedicated metering for this?

turbolent avatar Jul 22 '22 19:07 turbolent

After seeing the implementation I have a suggestion for a slightly modified version of the proposed API:

Cross-posting from https://github.com/onflow/cadence/pull/1845#discussion_r933515481:

Would it be better to group these 'paths' related operations (this one and the iterator functions) together? I am thinking of something similar to keys field (of AuthAccount).

e.g:

let paths: AuthAccount.Paths

and AuthAccount.Paths to be something like:

struct Paths {
   pub allPaths: [Path]

   pub fun forEachPublic(_ function: ((PublicPath, Type): Bool))

   pub fun forEachPrivate(_ function: ((PrivatePath, Type): Bool))

   pub fun forEachStored(_ function: ((StoragePath, Type): Bool))
}

The advantage is:

  • It groups similar operations together
  • The usage is more verbose (specially for the iterator functions):
    account.Paths.forEachPublic(...)
    

SupunS avatar Jul 29 '22 18:07 SupunS

@SupunS good idea, maybe we can do something like:

With built in enum:

enum StorageDomain{
   ANY
   PUBLIC
   PRIVATE
   STORAGE
}

and

in AuthAccount:

   pub fun getPaths(_ StorageDomain): [Path]
   pub fun forEachPath(_ StorageDomain, _ function: ((Path, Type): Bool))

Do we have any benefit from getting PublicPath, StoragePath instead of Path ?

bluesign avatar Jul 30 '22 16:07 bluesign