
Metadata with output

everestada opened this issue on Aug 08 '22 · 10 comments

Describe your idea, in simple words.

Get metadata info included with transaction info if possible.

Why is it a good idea?

Get more complete info.

Are you willing to work on it yourself?

I don't have the skills to work on this.

everestada avatar Aug 08 '22 04:08 everestada

Hey, this is something I have so far always leaned against introducing into Kupo. There are two reasons for it:

  1. Kupo is output-centric; elements are indexed per output and the entire application architecture revolves around that. Metadata, however, relate to a transaction, not to individual outputs. Keeping Kupo true to its original design goal matters to avoid "feature creep" and to keep the software from turning into a bloated application. So, unless there's a clear use-case arguing in that direction, also storing metadata seems redundant; especially given the next point.

  2. Transaction metadata are already readily available from the node, and Kupo provides all the information needed to fetch them. For example, take the following transaction: e590625a2a560917d44ad4b23c931dfbae24287084a6945702da215f8bbd11bd. It can be found by querying GET /matches/addr_test1vq4qk0qruac7lnha8rkmgz4sxwxnh2levcuwq85qlfl5ttgu6s3r9 on the testnet. Kupo tells me that it was created at the following point:

    "created_at": {
      "slot_no": 65748442,
      "header_hash": "cb074af5b9a48bfa0a9c626b5b589a5aa6b9d5ef10f4e75dc5f3c44e0f5d711a"
    }
    

    From there, I can look up the immediately preceding point using GET /checkpoints/65748441 (a scripted version of these two Kupo lookups is sketched a bit further down), which yields:

    {
      "slot_no": 65748317,
      "header_hash": "4a635a060a39aa1530751b928c785df6b56f88248cc8eb4f28f907f816796ce0"
    }
    

    I can then use that point to look up the whole transaction in its block using the chain-sync protocol (e.g. through Ogmios):

    // `client` is an open WebSocket connection to Ogmios exposing the
    // findIntersect / requestNext chain-sync calls used below.
    client.on('open', () => {
      client.on('message', onMessage)
      client.findIntersect([{
        "slot": 65748317,
        "hash": "4a635a060a39aa1530751b928c785df6b56f88248cc8eb4f28f907f816796ce0"
      }]);
      client.requestNext() // Roll-backward to the requested point
      client.requestNext() // Roll-forward to the next block
    })

    function onMessage(e) {
      const rollForward = JSON.parse(e).result.RollForward
      if (rollForward) {
        const era = Object.keys(rollForward.block)[0]
        const block = rollForward.block[era]
        // Keep only the metadata of the transaction we are after.
        const metadata = block.body.flatMap(tx => {
          return tx.id === "e590625a2a560917d44ad4b23c931dfbae24287084a6945702da215f8bbd11bd" ? tx.metadata : [];
        });
        console.log(JSON.stringify(metadata, null, 2));
        client.close();
      }
    }
    

    which spits out:

    [
      {
        "hash": "ea61c7aebf6cd748b8d68dc68ccd14c86fcf8f3ae804a181ebd197387d63035c",
        "body": {
          "blob": {
            "1337": {
              "string": "greetings from ppbl summer 2022"
            }
          },
          "scripts": []
        }
      }
    ]
    

So, all-in-all, this whole dance took about 100ms, because accessing blocks directly from the node isn't a very expensive operation. Since the information is already available in the node's database, it seems counter-intuitive to duplicate it in the indexer as well.
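
For completeness, the Kupo half of this dance can be scripted in a few lines. Here is a minimal sketch, assuming a local Kupo instance on its default port 1442 and Node.js 18+ run as an ES module (for the global `fetch` and top-level await); names are illustrative only:

    // Fetch the match for the address, then the checkpoint right before its
    // creation point; that checkpoint is the intersection handed over to the
    // chain-sync client shown above.
    const KUPO = "http://localhost:1442";
    const pattern = "addr_test1vq4qk0qruac7lnha8rkmgz4sxwxnh2levcuwq85qlfl5ttgu6s3r9";

    const [match] = await (await fetch(`${KUPO}/matches/${pattern}`)).json();
    const point = await (
      await fetch(`${KUPO}/checkpoints/${match.created_at.slot_no - 1}`)
    ).json();

    console.log(point); // { "slot_no": 65748317, "header_hash": "4a63..." }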

Bear with me a second.


Having said that... you are the second person to request this, so it seems like there's a use-case for it -- even if only for convenience. Since Kupo already has privileged access to the node (or Ogmios), those steps could be combined into one API query to make it easier for people to do this sort of operation.

Also, since Kupo is already processing all transactions one by one anyway, it would be relatively easy to add metadata to the batch as well. Given the amount of metadata, this would certainly slow down a few other things and increase the size of the database, but I can imagine making this available only behind some (compilation or runtime) flags.

KtorZ avatar Aug 10 '22 08:08 KtorZ

Having been chasing a good solution other than Blockfrost to get metadata for the last few months myself, I would like to share my metadata adventures from working on a few projects, including the most recent ones.

Currently I am working on a project, https://enterthemandala.app, and it relies pretty heavily on NFT metadata for the user experience, especially going forward as we build out our gameplay.

I use Kupo and Ogmios throughout the whole project to aggregate UTxO data, asset data, pool delegation information, and pool information.

AND recently, in version 2.0.0-beta, the most beautiful query: being able to search by policy id, e.g. http://localhost:4200/v1/matches/*?policy_id=a484ff902de682d1c05158daf246b40b533a7a2b2e9f81a41ad48fe4

We use it to see which player holds which Mandala NFTs for extra rewards.
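
For anyone curious, here is a minimal sketch of how that query can be used to group holders, assuming the same local endpoint as in the URL above and Node.js 18+ run as an ES module; the variable names are illustrative only:

    // Group every match for the policy id by the address currently holding it.
    const BASE = "http://localhost:4200/v1";
    const policyId = "a484ff902de682d1c05158daf246b40b533a7a2b2e9f81a41ad48fe4";

    const matches = await (
      await fetch(`${BASE}/matches/*?policy_id=${policyId}`)
    ).json();

    const holders = {};
    for (const m of matches) {
      (holders[m.address] ??= []).push(m.transaction_id);
    }
    console.log(holders);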

THIS IS AMAZING MAN!!!!

However, for metadata I have been chasing this devil with different solutions:

First I started with Oura:

where I set up an Oura filter to store only txs/metadata with the CIP-25 label (label 721), dump them into a JSON file, and then load them into MongoDB through a script.

This solution worked OK; it wasn't the most efficient, and it was hard to index in MongoDB since, per CIP-25, the policy ID and asset name are object keys rather than values. However, it made for a nice small DB of only about 4.5GB, holding all the metadata since the Shelley era that meets the CIP-25 (label 721) criteria.

Soon after, dcSpark released Carp, which uses Oura under the hood but adds a PostgreSQL indexer where you can set up certain jobs/workflows as well, and it saves the data in CBOR. This is THE fastest solution so far for me to obtain metadata and query it with really nice speed. The downside is that Carp will save only the CIP-25 metadata, yet it will also save every single tx, even ones without any metadata in them, which bloats the DB to 45GB.

So currently my stack consists of: Cardano node, Ogmios, Kupo, and Carp. It's a very nice working solution, and even very feasible for a home enthusiast to spin up and connect the Mandala app to their own setup, which is the whole idea for us.

My apologies that this got somewhat lengthy, but I think my adventure story, hehe, is crucial to explaining how absolutely amazing it would be if Kupo, even if it took a little longer to sync, had an option (through a flag or whatever) to store tx metadata that could be searched by policy ID / asset.

Not to mention how much friendlier the development stack for any Cardano dev would become: literally Cardano node, Ogmios (optional), and Kupo.

With that said, Kupo is already an amazing tool and I would not be one bit disappointed if this option didn't make it in.

Thanks Again, Mike

bakon11 avatar Aug 10 '22 14:08 bakon11

If size, complexity, and staying lightweight are a concern, another approach may be to have Kupo post the info to an outside webhook endpoint and let the user process/save the output in their own database when using this metadata flag or any other optional flags.
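
Purely to illustrate the idea (Kupo has no webhook feature today, so the payload shape here is hypothetical), the receiving side could be as small as:

    // Tiny HTTP endpoint that accepts a POSTed match and lets the user store it
    // wherever they want; Node.js 18+, no external dependencies.
    import { createServer } from "node:http";

    createServer((req, res) => {
      let body = "";
      req.on("data", (chunk) => (body += chunk));
      req.on("end", () => {
        const match = JSON.parse(body); // e.g. a matched output plus its metadata
        console.log("received match for", match.transaction_id);
        // ...persist into your own database here...
        res.writeHead(204).end();
      });
    }).listen(8080);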

Thanks for making Kupo, this will be a very useful tool for a lot of people.

everestada avatar Aug 10 '22 14:08 everestada

@KtorZ: those steps could be combined into one API query to make it easier for people to do this sort of operation.

I think this would already be much better than having to use a second solution (mentioned in my comment above) just to sync metadata.

bakon11 avatar Aug 10 '22 14:08 bakon11

So, I am curious whether the solution I proposed above could work, because it is a relatively low-hanging fruit with high impact.

The question really being: would an endpoint of the form

GET /auxiliary-data/{slot-no}[?transaction_id={transaction-id}]

work? In prose, I propose to query auxiliary data by slot number, possibly filtered by transaction id. This would return a (possibly empty) list of associated auxiliary data. The slot number and transaction id can be obtained from the /matches endpoint results, or simply be completely arbitrary (coming from other sources).
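
For illustration, here is how a client could combine that proposed endpoint with /matches. The /auxiliary-data route does not exist yet; its name and parameters are only the proposal above, and the helper below is a hypothetical sketch (Node.js 18+):

    // Hypothetical: for each match of a pattern, fetch the auxiliary data of the
    // transaction that created it, using the slot and id taken from the match.
    const KUPO = "http://localhost:1442";

    async function auxiliaryDataFor(pattern) {
      const matches = await (await fetch(`${KUPO}/matches/${pattern}`)).json();
      return Promise.all(
        matches.map(async ({ transaction_id, created_at }) => {
          const res = await fetch(
            `${KUPO}/auxiliary-data/${created_at.slot_no}?transaction_id=${transaction_id}`
          );
          return { transaction_id, auxiliary_data: await res.json() };
        })
      );
    }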

Would that solve everyone's problem by any chance 😅?

KtorZ avatar Aug 10 '22 16:08 KtorZ

Additionally, it should be relatively easy to also add an optional auxiliary_data_hash to results if that's useful. That would make it possible to know, before querying, whether a match's transaction has associated metadata.
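
As a sketch only (the field name and placement are just the proposal, and the other match fields are elided), a match result could then look like this, reusing the hash reported by Ogmios above:

    {
      "transaction_id": "e590625a2a560917d44ad4b23c931dfbae24287084a6945702da215f8bbd11bd",
      "auxiliary_data_hash": "ea61c7aebf6cd748b8d68dc68ccd14c86fcf8f3ae804a181ebd197387d63035c",
      "created_at": {
        "slot_no": 65748442,
        "header_hash": "cb074af5b9a48bfa0a9c626b5b589a5aa6b9d5ef10f4e75dc5f3c44e0f5d711a"
      }
    }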

KtorZ avatar Aug 10 '22 16:08 KtorZ

@bakon11: This is THE fastest solution so far for me to obtain metadata and query it with really nice speed

Now you've triggered me. How fast 😏? I'd love to get a high-level baseline to see whether what I am proposing is unusable or somewhat in the realm of what already exists.

KtorZ avatar Aug 10 '22 16:08 KtorZ

This would work for me. However, could the webhook also be implemented? Something where matches would dump everything available to an endpoint?

everestada avatar Aug 10 '22 19:08 everestada

Let's make a separate ticket for the webhook because that's a whole different conversation. This is clearly not sustainable when synchronizing from scratch; or at least, the target server would need to be able to sustain 10k+ req/s, which is not impossible but demands some thinking. Plus, if a client needs this level of control, why not use the chain-sync protocol directly 🤔?

KtorZ avatar Aug 10 '22 19:08 KtorZ

I was thinking of Kupo doing the match and then posting the matched transaction to a webhook. But what we have here is good for me.

everestada avatar Aug 10 '22 21:08 everestada