
Missing logs_by_topic table

Open johnayoung opened this issue 3 years ago • 3 comments

Hi all,

When attempting to run the DAGs for the first time, we are unable to see where the "logs_by_topic" table is getting populated for our log event.

WITH parsed_logs AS (
  SELECT
      logs.block_timestamp AS block_timestamp
      ,logs.block_number AS block_number
      ,logs.transaction_hash AS transaction_hash
      ,logs.log_index AS log_index
      ,logs.address AS contract_address
      ,`<project-id>-internal.ethereum_<entity>_blockchain_etl.parse_<smart-contract>_event_<event-name>`(logs.data, logs.topics) AS parsed
  FROM `<project-id>-internal.crypto_ethereum_partitioned.logs_by_topic_0x8c5` AS logs
  WHERE
      address IN (lower('<address>'))
      AND topics[SAFE_OFFSET(0)] = '<topic>'
      -- live
)
SELECT
    block_timestamp
    ,block_number
    ,transaction_hash
    ,log_index
    ,contract_address
    ,parsed.owner AS `owner`
    ,parsed.spender AS `spender`
    ,parsed.value AS `value`
FROM parsed_logs
WHERE parsed IS NOT NULL

The section in question is this guy:

...
FROM `<project-id>-internal.crypto_ethereum_partitioned.logs_by_topic_0x8c5` AS logs
...

We know that this is part of the "LIVE" real-time update section, but what is actually populating the table with the topics we specify? Is this being done in a different repo?

johnayoung avatar Feb 02 '22 14:02 johnayoung

It's done in this repo https://github.com/blockchain-etl/blockchain-etl-dataflow/blob/master/partitioned_tables.md
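
In case it helps others finding this issue: judging from the table name, the logs_by_topic_0x... tables hold logs pre-filtered by the leading hex characters of topics[0] and date-partitioned, so the parse queries only scan a small slice of the full logs table. Below is a minimal sketch of what such a table could look like if you materialized it yourself; the project/dataset names, the use of the public crypto_ethereum dataset as the source, and the DDL are illustrative assumptions, since the real tables are created and kept up to date by the Dataflow jobs described in that doc.

-- Illustrative sketch only; not the actual job definition.
CREATE TABLE IF NOT EXISTS `<project-id>-internal.crypto_ethereum_partitioned.logs_by_topic_0x8c5`
PARTITION BY DATE(block_timestamp)
AS
SELECT *
FROM `bigquery-public-data.crypto_ethereum.logs`
-- keep only logs whose first topic starts with the prefix in the table name
WHERE STARTS_WITH(topics[SAFE_OFFSET(0)], '0x8c5');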

medvedev1088 avatar Feb 02 '22 15:02 medvedev1088

Thanks a ton @medvedev1088 for the quick response.

So am I correct in assuming:

  • the live pipeline streams entities (blocks, transactions, etc.) to Pub/Sub using https://github.com/blockchain-etl/blockchain-etl-streaming
  • https://github.com/blockchain-etl/blockchain-etl-dataflow then connects Pub/Sub to BigQuery

Are both of these repos set up enough that we can plug in our own implementation? We plan on contributing to the dataset ecosystem, but need a custom implementation for some edge cases.

johnayoung avatar Feb 02 '22 15:02 johnayoung

@johnayoung yes those assumptions are correct. The code in the repos is sufficient to set the system up.

On your last point, what datasets are you planning on contributing?

medvedev1088 avatar Feb 02 '22 16:02 medvedev1088