zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Results 147 zingg issues

`protected Dataset getBlocks(Dataset blocked) throws Exception { return DSUtil.joinWithItself(blocked, ColName.HASH_COL, true).cache(); }` This method is not called from the code; will it lead to issues with the linker and resolver?
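For readers unfamiliar with blocking, here is a rough sketch of what joining a blocked dataset with itself on a hash column generally means. It is plain Python rather than Spark, and it is not Zingg's `DSUtil.joinWithItself`. The function name `join_with_itself`, the `hash` key and the sample rows are hypothetical. Keeping each unordered pair only once is an assumption; the issue does not say what the `true` argument does.

```python
from collections import defaultdict
from itertools import combinations


def join_with_itself(rows, key):
    """Pair up rows that share the same value for `key` (a block hash)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row)

    pairs = []
    for bucket in buckets.values():
        # Assumption: each unordered pair is emitted once, with no self-pairs.
        pairs.extend(combinations(bucket, 2))
    return pairs


# Hypothetical blocked records: ids 1 and 2 fall in the same block.
blocked = [
    {"id": 1, "hash": "a"},
    {"id": 2, "hash": "a"},
    {"id": 3, "hash": "b"},
]
pairs = join_with_itself(blocked, "hash")
print([(left["id"], right["id"]) for left, right in pairs])  # [(1, 2)]
```

Only records in the same block become candidate pairs, which is why a self-join on the hash column is the usual step before pairwise comparison.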

update Dockerfile and Readme.md to 0.4.1-SNAPSHOT from 0.4.0 in enterprise branch

The issue is due to a case-sensitive comparison of the column name between the input and what's in the config. E.g. the following works fine: ./scripts/zingg.sh --phase recommend --conf examples/febrl/config.json --column XYZ but not ./scripts/zingg.sh...
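One common way to avoid the case-sensitive comparison described above is to match the requested column against the configured names after case folding. This is a minimal sketch, not Zingg's actual fix. The helper `resolve_column` and the field names are hypothetical.

```python
def resolve_column(requested, available):
    """Return the configured name that matches `requested`, ignoring case."""
    # Map the case-folded form of each configured name back to its original spelling.
    by_folded = {name.casefold(): name for name in available}
    try:
        return by_folded[requested.casefold()]
    except KeyError:
        raise ValueError(
            f"column {requested!r} not found in {sorted(available)}"
        ) from None


# Hypothetical field names, standing in for what a config file might declare.
config_fields = ["fname", "lname", "XYZ"]
print(resolve_column("xyz", config_fields))  # XYZ
```

Returning the configured spelling, rather than the user's input, means later lookups against the dataset use the exact column name.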

this makes readthedoc build fail ` sonal@sonal-mac docs % rm -rf _build/html; make html Running Sphinx v7.2.6 path is /Users/sonal/zingg/python/zingg making output directory... done loading pickled environment... done building [mo]:...

This PR is for issue [651](https://github.com/zinggAI/zingg/issues/651). I have used this meta.yaml for the recipe. Commands used: `conda update conda`, `conda install conda-build`, `conda info --envs`, `conda activate base`, `conda build condaRecipe`

(.venv) vikasgupta@Vikass-MacBook-Air /tmp % databricks-connect test * PySpark is installed at /opt/homebrew/lib/python3.10/site-packages/pyspark * Checking SPARK_HOME * Checking java version java version "1.8.0_351" Java(TM) SE Runtime Environment (build 1.8.0_351-b10) Java HotSpot(TM)...

Some of the methods are tied to a phase; some are probably invoked directly while Jackson sets the property? Need to see what's really needed here and if the code...

Reopening this topic in a new issue, I'm not able to get this working. ``` from zingg.client import * options = ClientOptions([ClientOptions.COLLECT_METRICS, "false"]) > AttributeError: type object 'ClientOptions' has no...

2023-11-24 17:49:58,143 [main] WARN org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry - The function affinegapsimilarityfunction replaced a previously registered function. 2023-11-24 17:49:58,143 [main] WARN org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry - The function jarowinklerfunction replaced a previously registered function. 2023-11-24 17:49:58,143...

good first issue