data-cleaning topic

List data-cleaning repositories

optimus

1.4k Stars · 233 Forks

🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

data-forge-ts

1.3k Stars · 76 Forks

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

bumblebee

137 Stars · 35 Forks

🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)

voicebook

370 Stars · 82 Forks

🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).

schema-inspector

504 Stars · 45 Forks

Schema-Inspector is a simple JavaScript object sanitization and validation module.

janitor

1.4k Stars · 130 Forks

Simple tools for data cleaning in R

skrub

1.0k Stars · 90 Forks

Prepping tables for machine learning

mage-ai

7.2k Stars · 660 Forks · 43 Watchers

🧙 Build, run, and manage data pipelines for integrating and transforming data.

validate

403 Stars · 37 Forks

Professional data validation for the R environment

klib

479 Stars · 51 Forks

Easy-to-use Python library of customized functions for cleaning and analyzing data.