op-analytics
Pull Contract Names
Issue
We currently use a Hex component to combine the following contract labelling data sources:
- `goldsky_clickhouse.labels_top50k_growthepie`
- `goldsky_clickhouse.labels_projects_growthepie`
- `oplabs-tools-data.views.oso_op_atlas_contract_labels`
- `opensource-observer.openlabelsinitiative.oli_tag_mapping`
Workflows we currently run:
- In goldsky_clickhouse: https://github.com/ethereum-optimism/op-analytics/blob/main/other_chains_tracking/get_contract_labels.ipynb
- oso_op_atlas_contract_labels: https://console.cloud.google.com/bigquery?ws=!1m5!1m4!4m3!1soplabs-tools-data!2sviews!3soso_op_atlas_contract_labels
```sql
SELECT
  pc.chain_id,
  pc.project_id,
  lower(pc.contract_address) AS contract_address,
  p.name,
  p.category
FROM `opensource-observer.op_atlas.project_contract` pc
LEFT JOIN `opensource-observer.op_atlas.project` p
  ON pc.project_id = p.id
  AND p.deleted_at IS NULL
GROUP BY 1, 2, 3, 4, 5
```
Solution
Eventually, we should consolidate all of these scattered data sources and ad hoc pipelines into OP ClickHouse for better scalability going forward. @lithium323 suggested we can:
- Set up a ReplacingMergeTree dim table in long form with three columns only: contract_address, mapping_source, metadata (JSON), ordered by the first two of those
- From there, multiple jobs can write data to the table from many different places; that is where we would accumulate all of our mappings
- A transforms job would then clean all that raw info and produce a single best mapping
- That best-mapping final table is what we would use in all dashboards
- For external sources, we can set up daily jobs to pull them into ClickHouse
- We'd also pull verified contract names from Blockscout for top unmapped contracts. See the API docs: https://soneium.blockscout.com/api-docs
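As a rough sketch of what the raw long-form dim table could look like (all table and column names below are hypothetical placeholders, not an agreed schema):

```sql
-- Hypothetical sketch only: names are placeholders.
-- ReplacingMergeTree deduplicates rows sharing the same ORDER BY key at
-- merge time, keeping the row with the highest version column (updated_at).
CREATE TABLE IF NOT EXISTS dim_contract_labels_raw
(
    contract_address String,
    mapping_source   String,
    metadata         String,   -- JSON stored as a string; could use the JSON type if enabled
    updated_at       DateTime DEFAULT now()
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY (contract_address, mapping_source);
```

Note that ReplacingMergeTree deduplication is eventual (it happens during background merges), so readers that need exact dedup would query with `FINAL`.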
The final outcome should be a single dim table that contains all of the above, plus any manual mappings we maintain.
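The cleanup transform that produces the single best mapping could collapse to one row per contract via a source-priority rule. A sketch, assuming a hypothetical raw table `dim_contract_labels_raw(contract_address, mapping_source, metadata)` with illustrative source names and priorities:

```sql
-- Illustrative only: picks the metadata from the highest-priority source
-- present for each contract. Source names and priorities are assumptions.
SELECT
    contract_address,
    argMin(metadata, priority)       AS best_metadata,
    argMin(mapping_source, priority) AS best_source
FROM
(
    SELECT
        contract_address,
        mapping_source,
        metadata,
        multiIf(
            mapping_source = 'manual',     1,
            mapping_source = 'op_atlas',   2,
            mapping_source = 'growthepie', 3,
            100) AS priority
    FROM dim_contract_labels_raw FINAL   -- FINAL forces dedup at read time
)
GROUP BY contract_address;
```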
A step further would be bringing back our deployer logic.