Sonal
It would be nice to have a native Snowflake version of Zingg that can run without Spark. From a first look at Snowpark, here are some initial thoughts: Need to...
py4j.protocol.Py4JJavaError: An error occurred while calling o148.execute. : zingg.client.ZinggClientException: zingg.client.FieldDefinition; local class incompatible: stream classdesc serialVersionUID = -2785755064886098506, local class serialVersionUID = 607273313522895964 at zingg.Matcher.execute(Matcher.java:159) at zingg.client.Client.execute(Client.java:215) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)...
Configure the fieldDefinition section for two sources with different schemas; see #204.
Try https://cloud.google.com/solutions/spark and see if we can build something out with Zingg.
Decouple the model from the view so that the looping can be exposed through the Python API.
User reported that only 7-8 of the 150 columns were being used for matching. The second iteration of find_training_data has been running for the last 2 hours. The first iteration took 32 minutes...
We should rename all the model ids to model names, the same as in the example, and remove the README that had preconfigured model ids.
We need to assess if performance has been impacted by adding stop words.
Need to make sure that updateLabels, etc. work on Databricks.