
Provide access to full history through GCS buckets

Open bryzgaloff opened this issue 3 years ago • 2 comments

The BigQuery datasets are publicly available. It would also be awesome to unload the full ETH history to, let's say, AWS S3 and make it queryable through Athena.

The request is: could you please provide access to raw data in GCS buckets? Parquet format would be awesome, but JSON/CSV is also ok. Unloading from GCS should be more efficient than from BigQuery, I believe. Please correct me if I am wrong.

If you can provide me with access to the buckets, I can prepare the data for public use in Parquet.

bryzgaloff avatar Jun 15 '22 10:06 bryzgaloff

We don't expose exported files in GCS at this point. An alternative is to use the scripts here to export data from BigQuery to GCS: https://github.com/blockchain-etl/ethereum-etl-postgres
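
For illustration only, the BigQuery-to-GCS route could look roughly like the sketch below. It is not taken from the linked repository. It assumes the `google-cloud-bigquery` Python client, the public `bigquery-public-data.crypto_ethereum` dataset, a destination bucket you own, and credentials that are allowed to run extract jobs. The helper names `destination_uri` and `export_table`, the `ethereum` prefix, and the `US` location are assumptions made for this example.

```python
"""Sketch: export a public Ethereum table from BigQuery to GCS as Parquet."""

# Assumption: the public dataset referenced in this thread.
DATASET = "bigquery-public-data.crypto_ethereum"


def destination_uri(bucket: str, table: str, prefix: str = "ethereum") -> str:
    """Build a wildcard GCS URI so BigQuery can split a large table into many files."""
    return f"gs://{bucket}/{prefix}/{table}/{table}-*.parquet"


def export_table(bucket: str, table: str) -> None:
    """Start a BigQuery extract job for one table and wait for it to finish."""
    # Imported lazily so the URI helper works without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application default credentials
    job_config = bigquery.ExtractJobConfig(
        destination_format=bigquery.DestinationFormat.PARQUET,
    )
    job = client.extract_table(
        f"{DATASET}.{table}",
        destination_uri(bucket, table),
        job_config=job_config,
        location="US",  # assumption: the dataset is in the US multi-region
    )
    job.result()  # block until the export completes
```

Calling `export_table("my-export-bucket", "blocks")` (with a hypothetical bucket name) would write Parquet shards under `gs://my-export-bucket/ethereum/blocks/`. From there the files could be copied to S3 and registered as an external table in Athena.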

medvedev1088 avatar Jun 17 '22 14:06 medvedev1088

Can this be planned for implementation? I can contribute if you can provide me with sufficient access to your GCS buckets / GCP account. I could set up read-only credentials for the GCS bucket with the data, or maybe make it public.

Once I dig into it a little, I can share alternatives for you to choose from, but I would like to see the data itself first to decide whether it needs any reformatting.

bryzgaloff avatar Jun 20 '22 05:06 bryzgaloff