DataOps topic

DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.

DataOps repositories

raccoon

186 Stars

Raccoon is a high-throughput, low-latency service to collect events in real-time from your web, mobile apps, and services using multiple network protocols.

raystack

clickstream

dataops

eventsourcing

kafka

versatile-data-kit

413 Stars

One framework to develop, deploy and operate data workflows with Python and SQL.

vmware

analytics

data-engineer

data-engineering

data-lineage

squirrel-core

279 Stars

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰

merantix-momentum

cloud-computing

collaboration

computer-vision

dagger

257 Stars

Dagger is an easy-to-use, configuration over code, cloud-native framework built on top of Apache Flink for stateful processing of real-time streaming data.

raystack

apache-flink

apache-kafka

dataops

framework

awesome-data-catalogs

604 Stars

📙 Awesome Data Catalogs and Observability Platforms.

opendatadiscovery

awesome

awesome-list

big-data

data-catalog

whylogs

2.6k Stars

118 Forks

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collecti...

whylabs

ai-pipelines

analytics

approximate-statistics

calculate-statistics

meteor

172 Stars

Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.

raystack

bigdata

collector

data-catalog

data-management

firehose

314 Stars

Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

raystack

apache-kafka

bigquery

dataops

firehose

console

3.6k Stars

332 Forks

Redpanda Console is a developer-friendly UI for managing your Kafka/Redpanda workloads. Console gives you a simple, interactive approach for gaining visibility into your topics, masking data, managing...

redpanda-data

apache-kafka

dataops

kafka-gui