zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Results 147 zingg issues

We have to figure out an error code and information reporting framework so that jobs orchestrated through DAG schedulers such as Airflow can be handled gracefully. Check best practices here.

Explore whether we can provide an easy way to build the JSON args. Could we use something like https://github.com/json-editor/json-editor ?

enhancement

Descriptive text and word-vector/n-gram kinds of data may need different kinds of blocking. Evaluate this.

Now that the documents are in a better consumable shape, we should add the case studies

In some cases an SVM may be better, so provide a way to plug one in, with logistic regression remaining the default.

enhancement

This is an umbrella feature request to make Zingg smarter at understanding column types - how cool would it be to understand that a particular column denotes email...

enhancement

One dimension of similarity is column similarity - predicting which columns are the same and can be joined/matched is a problem in itself.

enhancement

We should be able to resolve unstructured entities to structured records in a database. That will enable a host of applications for ekg, fraud, etc.

enhancement