zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
This PR addresses issue [784](https://github.com/zinggAI/zingg/issues/784). The code was generated using: ``` mvn clean; mvn package -Dspark=3.5 -Dmaven.test.skip=true cd python; pip uninstall zingg -y; pip install .; cd .. ./scripts/zingg.sh...
Need to improve code review and coverage
``` public boolean process(Set
we should validate that a model has been created _Originally posted by @sonalgoyal in https://github.com/zinggAI/zingg/pull/778#discussion_r1492093265_
commit number 7d9567c89a1f9e0006abbed8ea6884e4d6138512 mvn clean compile package [INFO] ------------------------------------------------------- [INFO] T E S T S [INFO] ------------------------------------------------------- [INFO] Running zingg.client.TestSparkFrame SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation...
**Describe the bug** Running the --phase exportModel command with usual command structure like the other phases (which all work) throws an error. **To Reproduce** Steps to reproduce the behavior: 1....
 
This PR addresses issue [729](https://github.com/zinggAI/zingg/issues/729). Tested by running the phases. " [main] WARN org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry - The function affinegapsimilarityfunction replaced a previously registered function. [main] WARN org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry - The...
{project.parent.basedir} etc. are getting created. Please check pom.xml and fix.
spark block and sparkfeaturefactory both have this method. Move it to one common place.