gitcoin-grants-data-portal

Pull data directly from chain

Open davidgasquez opened this issue 1 year ago • 9 comments

Currently, we rely on the Allo Indexer API data. We should add an option to pull data straight from the chains using something like cryo or Subsquid. This way, we don't need to trust the Allo API data, if that's what we want.

davidgasquez avatar Oct 25 '23 13:10 davidgasquez

Can Gitcoin Data Portal rely on Indexed data?

davidgasquez avatar Dec 11 '23 09:12 davidgasquez

Can Gitcoin Data Portal rely on Indexed data?

Probably not because Indexed is missing many chains in which GC rounds are running.

We need something like cryo.

davidgasquez avatar Jan 03 '24 10:01 davidgasquez

This works!

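import cryo

cryo.collect(
    "transactions",  # dataset to collect
    blocks=["18.9M"],  # block spec; "18.9M" is block 18,900,000
    rpc="https://eth.merkle.io",  # Ethereum mainnet RPC endpoint
    reorg_buffer=1000,  # skip the newest 1000 blocks, which may still reorg
    max_concurrent_chunks=15,  # chunks processed in parallel
    inner_request_size=10000,  # blocks per log request, per cryo's CLI help
    output_dir="data",
    contract=["0x03506eD3f57892C85DB20C36846e9c808aFe9ef4"],  # only this contract
    hex=True,  # hex strings instead of raw bytes
)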

Don't forget to pip install cryo-python polars though!

davidgasquez avatar Jan 04 '24 16:01 davidgasquez

Made a small Colab notebook for people to play around.

From a quick test, it'll take around 52 hours to fully index that contract (0x03506eD3f57892C85DB20C36846e9c808aFe9ef4) on Ethereum mainnet.

davidgasquez avatar Jan 05 '24 17:01 davidgasquez

  • got a low-effort 4x speedup while fetching events by raising max_concurrent_chunks to 100.
  • inside Colab, fetching all (undecoded) logs from the Project Registry took 14 seconds (from deployment 400 days ago till now).
  • while TXs need some thinking, if performance inside CI-runner is comparable, event-based assets seem feasible now

import cryo

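cryo.freeze(  # freeze writes the collected data to files in output_dir
    "events",
    blocks=["16071515:"],  # open-ended range: block 16071515 through latest
    rpc="https://eth.merkle.io",
    reorg_buffer=1000,
    max_concurrent_chunks=100,  # raised from 15 for the speedup
    inner_request_size=10_000,
    output_dir="data_fast",
    contract=["0x03506eD3f57892C85DB20C36846e9c808aFe9ef4"],
    hex=True,
)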

DistributedDoge avatar Jan 06 '24 06:01 DistributedDoge

Woah! I did try with higher max_concurrent_chunks but didn't get any speedup locally... interesting!

while TXs need some thinking, if performance inside CI-runner is comparable, event-based assets seem feasible now

:rocket:

davidgasquez avatar Jan 06 '24 14:01 davidgasquez

Just leaving a note that tx data from Covalent is quite neat for analyzing the cost side, as it already has dollarized amounts for the actual gas cost.

  • I think total gas cost of mainnet transactions dealing with grants stack project profiles was $23k for about 2.3k operations.

Unfortunately, the fetch is a bit on the longer side. Figuring out the incremental part could help save a lot of time and API credits (that we still have aplenty).

  • Free API key request limit of 4/second => need to limit parallel runs for assets of that type
  • 3 minutes to pull 2.3k events in pages of 100 isn't that impressive
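
A minimal sketch of how a paged fetch could stay under the 4 requests/second limit mentioned above. The `fetch_page` callable is hypothetical (it stands in for whatever calls the Covalent API and returns one page of items), and "an empty page means the end" is an assumption, not documented Covalent behavior.

```python
import time
from typing import Callable, Iterator, List


def fetch_all_pages(
    fetch_page: Callable[[int], List[dict]],  # hypothetical: page number -> items
    max_requests_per_second: float = 4.0,     # free API key limit from the thread
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[dict]:
    """Yield items page by page, spacing requests to respect the rate limit."""
    min_interval = 1.0 / max_requests_per_second
    last_request = None
    page = 0
    while True:
        # Wait until at least min_interval has passed since the previous request.
        if last_request is not None:
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        items = fetch_page(page)
        if not items:  # assumption: an empty page marks the end of the data
            break
        yield from items
        page += 1
```

For 2.3k events in pages of 100, that is 23 data pages, so the throttle itself only adds about 6 seconds. If the run takes 3 minutes, most of the time (roughly 8 seconds per page) is probably request latency rather than the rate limit.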

DistributedDoge avatar Jan 11 '24 07:01 DistributedDoge

I think total gas cost of mainnet transactions dealing with grants stack project profiles was $23k for about 2.3k operations.

Nice! Would be awesome to publish a report inside Quarto analyzing the new data and showing the process to derive these numbers.

Unfortunately, the fetch is a bit on the longer side. Figuring out the incremental part could help save a lot of time and API credits (that we still have aplenty).

  • Free API key request limit of 4/second => need to limit parallel runs for assets of that type
  • 3 minutes to pull 2.3k events in pages of 100 isn't that impressive

Understandable. Really need to think harder about #28. Meanwhile, we can always do it slowly. GitHub Actions errors out after... 6 hours, I think. :man_shrugging:
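
One way "doing it slow" could coexist with a job time limit is to stop starting new work once a time budget is used up and hand back whatever is left for the next run. This is only a sketch: `run_within_budget` and `run_task` are hypothetical names, and the budget value is something to pick below the actual CI limit (the 6 hours above is from memory).

```python
import time
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def run_within_budget(
    tasks: Iterable[T],
    run_task: Callable[[T], None],  # hypothetical: fetches one unit of work
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[T], List[T]]:
    """Run tasks in order until the budget is spent; return (done, remaining)."""
    deadline = clock() + budget_seconds
    pending = list(tasks)
    done: List[T] = []
    for index, task in enumerate(pending):
        # Do not start a new task once the deadline has passed.
        if clock() >= deadline:
            return done, pending[index:]
        run_task(task)
        done.append(task)
    return done, []
```

A task that starts just before the deadline can still overrun it, so the budget should leave some headroom below the real limit.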

davidgasquez avatar Jan 11 '24 09:01 davidgasquez

I'm keeping an eye on mesc and its integration with Cryo. I think there might be a simple approach to get data from multiple chains easily. Probably slower than Covalent, except if we do partitions + incremental!
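
A rough sketch of what "partitions + incremental" could look like on the block-range side. The helper names are hypothetical, and the sketch assumes cryo accepts "start:end" strings in `blocks` with an exclusive end; that detail should be checked against cryo's block specification docs before relying on it.

```python
from typing import List, Optional


def block_partitions(start_block: int, end_block: int, partition_size: int) -> List[str]:
    """Split [start_block, end_block) into "start:end" range strings."""
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    ranges = []
    for low in range(start_block, end_block, partition_size):
        high = min(low + partition_size, end_block)
        ranges.append(f"{low}:{high}")
    return ranges


def pending_partitions(
    deployment_block: int,
    chain_head: int,
    partition_size: int,
    next_block: Optional[int] = None,  # first block not yet fetched, if any
) -> List[str]:
    """Return only the ranges that still need fetching (the incremental part)."""
    start = deployment_block if next_block is None else next_block
    return block_partitions(start, chain_head, partition_size)
```

Each returned string could then be passed as one `blocks=[...]` entry to a separate fetch, so a rerun only touches ranges past the stored `next_block`.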

davidgasquez avatar Jan 11 '24 09:01 davidgasquez