zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
#615 Improving code coverage for zingg/common/core: /hash to 97% coverage; model and sink both to 100% coverage
Check https://gist.github.com/UsAndRufus/7b46bb349f5016b5995f8feb786597b3 and look for places to improve the Python API experience.
To support multiple backends like Spark, Snowflake, etc., as well as to test the base algorithms and concrete implementations, we need to build a unit testing framework. Broadly, we need...
**Describe the bug** The output data produced by the link phase changes each time the model is run. **To Reproduce** Steps to reproduce the behavior: 1. Follow quick start instructions:...
A lot of JUnit tests can be generalized so that they can be used for Spark as well as other frameworks. Create a list here of the JUnit tests which use SparkFrame and...
This is a debug message and should not be printed, as it will confuse the user: java.lang.NullPointerException at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at zingg.spark.core.executor.SparkZFactory.get(SparkZFactory.java:40) at zingg.common.client.Client.setZingg(Client.java:58) at zingg.common.client.Client.<init>(Client.java:43) at zingg.spark.client.SparkClient.<init>(SparkClient.java:22) at zingg.spark.client.SparkClient.getClient(SparkClient.java:47)...
Should reflect the changed package structure we now have.
For users who are running Zingg through Python, assessmodel should be able to give stats based on zinggDir and model ID instead of expecting a JSON file.
The current feature is: `if (first+second != 0) score = 2.0*Math.abs(first - second)/(first + second);` Do we really need the 2?
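To make the question concrete, here is a minimal sketch of the quoted expression with the multiplier pulled out as a parameter. The class name `RelativeDifference`, the `factor` parameter, and the fallback of `0.0` when the sum is zero are assumptions for illustration and are not taken from the Zingg source. Algebraically, `2*|first - second|/(first + second)` equals `|first - second|` divided by the mean of the two values, so for non-negative inputs the score ranges over [0, 2] with the 2 and over [0, 1] without it.

```java
// Hypothetical helper, not Zingg's actual class: it only mirrors the expression quoted above.
final class RelativeDifference {

    // factor = 2.0 reproduces the quoted feature; factor = 1.0 drops the multiplier.
    static double score(double first, double second, double factor) {
        if (first + second != 0) {
            return factor * Math.abs(first - second) / (first + second);
        }
        // Assumption: the quoted snippet does not show this branch, so 0.0 is a placeholder.
        return 0.0;
    }

    public static void main(String[] args) {
        double[][] pairs = {{10, 10}, {10, 30}, {0, 5}};
        for (double[] p : pairs) {
            double withTwo = score(p[0], p[1], 2.0);
            double withoutTwo = score(p[0], p[1], 1.0);
            System.out.println(p[0] + " vs " + p[1]
                    + ": with 2 = " + withTwo + ", without 2 = " + withoutTwo);
        }
    }
}
```

Dropping the 2 only rescales the feature by a constant, so the ordering of pairs by score is unchanged. Whether that matters depends on how the downstream model treats feature scale.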
We want one test that checks whether we have covered every method change in Java. It goes over every Java file that has a Python API. There is a...