data-governance topic
document-processing-pipeline-for-regulated-industries
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
Data-Stash
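Data-Stash is a data warehouse component built on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, which lets the node separate hot and cold data and prune data.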
Data-Reconcile
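Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution built on blockchain smart contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.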
datacatalog-util
A Python package that centralizes some Google Cloud Data Catalog scripts. This repo contains commands, such as bulk CSV operations, that help leverage Data Catalog features.
datacatalog-tag-manager
Python package to manage Google Cloud Data Catalog tags by loading metadata from external sources; currently supports the CSV file format.
Data-Export
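Data-Export exports on-chain data to storage media suited to big data processing, such as MySQL and ES, addressing the problems of complex querying, analysis, visualization, and processing of blockchain data.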
mara-metabase
Configuration and schema sync for Metabase from Python
data-detective
Data catalog for everything in your company
conduktor-poc-kafka-protocol
POC to demonstrate how to alter incoming/outgoing records in Kafka. It's a toy, don't use it in production.
datachecks
Open Source Data Quality Monitoring.