zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Results 147 zingg issues

**Is your feature request related to a problem? Please describe.** We do not have a static code analyzer. Let us see what PMD gives us: https://maven.apache.org/plugins/maven-pmd-plugin/check-mojo.html **Describe the solution...

Hi @sonalgoyal, I tried this a month ago using Docker on my local machine and was able to run it and get results. But now I have data in Azure Blob...

question

A user reported an error when the input had a column named source. Renaming it to source_in fixed the problem. The error was AnalysisException: 'z_source' is ambiguous. Needs investigation.

Running match for febrl on the 0.3.4 release gives an error: z_sim18 not found. I suspect that the Python model configuration differs from the one in config.json, leading to this...

**Describe the bug** The febrl example takes a long time. **Expected behavior** The same running time on Docker as on the local machine.

add febrl example test

The current pull request introduces several powerful techniques to improve code quality and separate concerns: - Domain-driven design is adopted, trying to abstract away from external systems (Standard Input/Standard Output)...

The documentation needs to clearly spell out the possible values and structure of the training data, and a header should be added to the sample training file.

Knowing when to stop labeling is important to building the right models. To help the user, here are a few ideas: - expose model metrics like accuracy and the confusion matrix...

Write a Python script which will expose the model stats: the confusion matrix and the number of records marked, unmarked, matches, non-matches, and not sure. We will use the Labeller class...
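
As a rough illustration of the stats this issue asks for, here is a minimal pure-Python sketch. It does not use the Labeller class or any Zingg API. The label encoding (0 = non-match, 1 = match, 2 = not sure, -1 = unmarked) and the function names `label_stats`, `confusion_matrix`, and `accuracy` are assumptions made for the example and should be adjusted to the actual training data.

```python
from collections import Counter

# Assumed label encoding (not stated in the issue); adjust to your data.
NON_MATCH, MATCH, NOT_SURE, UNMARKED = 0, 1, 2, -1


def label_stats(labels):
    """Count marked, unmarked, match, non-match and not-sure records."""
    counts = Counter(labels)
    unmarked = counts.get(UNMARKED, 0)
    return {
        "marked": len(labels) - unmarked,
        "unmarked": unmarked,
        "matches": counts.get(MATCH, 0),
        "non_matches": counts.get(NON_MATCH, 0),
        "not_sure": counts.get(NOT_SURE, 0),
    }


def confusion_matrix(labels, predictions):
    """2x2 counts over records the user marked as match or non-match.

    Records labelled not sure or left unmarked are skipped, since they
    carry no ground truth. Any prediction other than MATCH is treated
    as a predicted non-match.
    """
    cm = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for label, pred in zip(labels, predictions):
        if label not in (MATCH, NON_MATCH):
            continue
        predicted_match = pred == MATCH
        if label == MATCH:
            cm["tp" if predicted_match else "fn"] += 1
        else:
            cm["fp" if predicted_match else "tn"] += 1
    return cm


def accuracy(cm):
    """Fraction of correct predictions, or None if nothing was scored."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else None


# Hypothetical user labels and model predictions for six record pairs.
labels = [1, 0, 1, 2, -1, 0]
predictions = [1, 0, 0, 1, 0, 1]

stats = label_stats(labels)
cm = confusion_matrix(labels, predictions)
print(stats)
print(cm, accuracy(cm))
```

Watching how these numbers change between labeling rounds is one possible signal for when to stop labeling; the right threshold depends on the dataset.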