zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Results 223 zingg issues

so that time taken is less and we do not do redundant processing.

Add info on the model, e.g. https://docs.deepchecks.com/en/stable/examples/guides/quickstart_in_5_minutes.html

Let us expose our deterministic matching parts so that more use cases can be solved with Zingg

Is there a way for us to support or build technologies like Datavant tokenization, which can then be used to match tokens in a privacy-preserving way?

question
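
For readers unfamiliar with the idea in the tokenization question above, here is a minimal sketch of one common privacy-preserving approach: a keyed hash (HMAC) over normalized fields, so that two parties holding the same secret can compare tokens instead of raw values. This is an assumption-laden illustration only. It is not Datavant's method and not a Zingg feature, and the helper names `normalize` and `tokenize` and the demo key are hypothetical.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variations yield the same token."""
    return " ".join(value.lower().split())


def tokenize(fields, secret_key: bytes) -> str:
    """Return a hex HMAC-SHA256 token over the normalized, '|'-joined fields."""
    message = "|".join(normalize(f) for f in fields).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Hypothetical shared secret; in practice it would be managed outside the code.
key = b"demo-shared-secret"

token_a = tokenize(["Jane  Doe", "1980-01-02"], key)
token_b = tokenize(["jane doe", "1980-01-02"], key)
token_c = tokenize(["John Doe", "1980-01-02"], key)

print(hmac.compare_digest(token_a, token_b))  # True: same person after normalization
print(hmac.compare_digest(token_a, token_c))  # False: different name
```

Tokens built this way only support exact matching after normalization; a typo in the name produces an unrelated token, so fuzzy matching would need a different scheme.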

**Z Columns** Zingg uses a few internal columns to store internal and intermediate data. **Describe the solution you'd like** As of now, things work as expected. However, it is...

**Describe the question** Some of my entities have PO boxes, some have street addresses, some have both. I'm trying to understand the inner workings so I can use the tool smarter...

question

Hello! As you know I'm working with the NC 5M dataset. I am frequently restarting from scratch to test the framework I'm building. Each time, I'll run the findTrainingData +...

question

In the doc you recommend setting numPartitions to ~20-30x the number of worker cores. Is that a good rule of thumb for all job types? (e.g. findTrainingData, trainMatch, link, etc)

question
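
The rule of thumb quoted in the numPartitions question above is simple arithmetic, sketched here for convenience. The function name `suggested_num_partitions` is hypothetical, and the 20x and 30x multipliers are taken from the question itself; whether the same range suits every phase is exactly what the question asks, so treat the output as a starting point rather than a recommendation.

```python
from typing import Tuple


def suggested_num_partitions(worker_cores: int, low: int = 20, high: int = 30) -> Tuple[int, int]:
    """Return the (min, max) partition counts implied by the 20-30x rule of thumb."""
    if worker_cores < 1:
        raise ValueError("worker_cores must be a positive integer")
    return worker_cores * low, worker_cores * high


# Example: a cluster with 16 worker cores in total.
minimum, maximum = suggested_num_partitions(16)
print(minimum, maximum)  # 320 480
```

Any value in the printed range could then be tried as `numPartitions` in the job configuration.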

What configuration should we use for generateDocs? I tried the one I use for findTrainingData but ended up getting an error (stderr attached): `java.io.FileNotFoundException: /home/[email protected]/NCVoter360/zinggModels/April19_voters/model.html (No such file or directory)`...

question

Having a prebuilt simple model can be helpful to get people started.

enhancement