data-cleansing topic

List data-cleansing repositories

optimus

1.4k
Stars
233
Forks

:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark

data-forge-ts

1.3k
Stars
76
Forks

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

PClean

215
Stars
31
Forks

A domain-specific probabilistic programming language for scalable Bayesian data cleaning

data-analysis-using-python

213
Stars
89
Forks

Exploratory data analysis 📊 using python 🐍 of used car 🚘 database taken from ⓚ𝖆𝖌𝖌𝖑𝖊

wrangler

83
Stars
56
Forks

Wrangler Transform: A DMD system for transforming Big Data

dedupe

19
Stars
2
Forks

Java DSL for (online) deduplication

Quizzes & Assignment Solutions for the Google Data Analytics Professional Certificate on Coursera. Also includes a few resources on the side that I found helpful.

desbordante-core

361
Stars
61
Forks

Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algor...

Autism-Detection-in-Adults

20
Stars
21
Forks

This is a binary classification problem related to Autistic Spectrum Disorder (ASD) screening in adult individuals. Given some attributes of a person, my model can predict whether the person would ha...

Zillow-Home-Value-Prediction

31
Stars
9
Forks

XGBoost, LightGBM, LSTM, Linear Regression, Exploratory Data Analysis