Sonal
What if we could take SQL from, say, a dbt model or otherwise and use that for our model training, for blocking as well as similarity? Then non-Java programmers...
Right now there is a lot of repeated code in the labeller and update labeller classes. For example, the execute method could be in one place. Code is pretty hard...
Readability would improve quite a bit if headers were bold and options (yes/no, etc.) were color coded.
We can build a CLI that filters and shows results to the user, much like the labeller.
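A rough sketch of the idea, assuming the output goes to a terminal that understands ANSI escape sequences. The helper names (`bold`, `colorize_option`, `render_prompt`) and the color choices are made up for illustration and are not part of the existing labeller code.

```python
# ANSI escape sequences understood by most modern terminals.
BOLD = "\033[1m"
GREEN = "\033[32m"
RED = "\033[31m"
YELLOW = "\033[33m"
RESET = "\033[0m"


def bold(text: str) -> str:
    """Wrap text so a terminal renders it in bold."""
    return f"{BOLD}{text}{RESET}"


def colorize_option(option: str) -> str:
    """Color an option: green for yes, red for no, yellow for anything else.

    The mapping is a hypothetical choice for this sketch.
    """
    colors = {"yes": GREEN, "no": RED}
    color = colors.get(option.lower(), YELLOW)
    return f"{color}{option}{RESET}"


def render_prompt(header: str, options: list[str]) -> str:
    """Build a two-line prompt: a bold header, then the colored options."""
    colored = " / ".join(colorize_option(o) for o in options)
    return f"{bold(header)}\n{colored}"
```

Calling `print(render_prompt("Do these records match?", ["yes", "no", "not sure"]))` would show the header in bold with each option in its own color. Output redirected to a file would contain the raw escape codes, so a real implementation would probably want a switch to turn coloring off.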
Reported by Luke from Databricks [zingg_Dec21_0823_log4j-active (1).txt](https://github.com/zinggAI/zingg/files/7761163/zingg_Dec21_0823_log4j-active.1.txt) [zingg_Dec21_0823_sdtderr.txt](https://github.com/zinggAI/zingg/files/7761167/zingg_Dec21_0823_sdtderr.txt)
We have to figure out an error code and an error/information reporting framework so that jobs orchestrated through DAGs like Airflow etc. can be handled gracefully. Check best practices here.
Explore if we can provide an easy way to build the JSON args - can we use something like https://github.com/json-editor/json-editor ?
Descriptive text and word vector/n-gram kind of data may need different kinds of blocking. Evaluate this.
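One possible shape for this, as a minimal sketch only: map each failure class to a distinct process exit code and write one machine-readable line to stderr. An orchestrator can then treat any non-zero exit as a task failure and log the reason. The code values, `ExitCode`, `JobError`, `report` and `run_job` are all hypothetical names for illustration, not an existing API.

```python
import json
import sys
from enum import IntEnum


class ExitCode(IntEnum):
    """Hypothetical exit codes; the actual values would need to be agreed on."""
    OK = 0
    INVALID_CONFIG = 2
    DATA_ERROR = 3
    INTERNAL_ERROR = 70


class JobError(Exception):
    """An expected failure that carries the exit code to report."""

    def __init__(self, code: ExitCode, message: str):
        super().__init__(message)
        self.code = code


def report(code: ExitCode, message: str) -> None:
    """Write a single JSON line to stderr so log scrapers can parse it."""
    payload = {"code": int(code), "error": code.name, "message": message}
    print(json.dumps(payload), file=sys.stderr)


def run_job(job) -> int:
    """Run a zero-argument callable and translate its outcome to an exit code."""
    try:
        job()
        return int(ExitCode.OK)
    except JobError as err:
        report(err.code, str(err))
        return int(err.code)
    except Exception as err:  # anything unexpected is an internal error
        report(ExitCode.INTERNAL_ERROR, str(err))
        return int(ExitCode.INTERNAL_ERROR)
```

An entry point could end with `sys.exit(run_job(main))`, where `main` is whatever function launches the job. A scheduler that launches the process only needs the exit status to decide whether the task failed, and the JSON line gives it the reason.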
putting this out there as a thought,
Now that the documents are in a more consumable shape, we should add the case studies.