spark-matcher
Record matching and entity resolution at scale in Spark
I started to get errors during the `matcher.fit` call after updating to pyspark==3.4.0: `Caused by: java.lang.IllegalArgumentException: requirement failed: Index 0 follows 0 and is not strictly increasing`. Still digging to...
Currently, when one of the input dataframes for a Matcher object contains missing values, an unclear and seemingly unrelated error is thrown. Before starting to fit or predict, the dataframe(s)...
Fixes ambiguous warning messages for `table_checkpoint` and `checkpoint_dir`. Adds ValueError when both `col_names` and `field_info` are given.
```
def if_else(a, b):
    """"""
    if not a and b:
        pass
    else:
        print(f"warning....
```
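The check for mutually exclusive arguments described above could look roughly like the sketch below. It is not the code from this pull request: the helper name `check_exclusive_args` is hypothetical, and the shapes of `col_names` (a list of column names) and `field_info` (a dict keyed by column name) are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def check_exclusive_args(
    col_names: Optional[Sequence[str]] = None,
    field_info: Optional[dict] = None,
) -> None:
    """Hypothetical helper: reject calls that pass both arguments.

    Assumption: `col_names` and `field_info` are two alternative ways to
    describe the columns to match on, so supplying both is ambiguous.
    """
    if col_names is not None and field_info is not None:
        raise ValueError(
            "Both `col_names` and `field_info` were given; pass only one of them."
        )


# Passing a single argument is accepted silently.
check_exclusive_args(col_names=["name", "address"])
```

Comparing against `None` rather than relying on truthiness means that an explicitly passed empty list or dict still counts as "given", which keeps the error independent of the argument's contents.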
- Implementing diverse mini-batch active learning based on cardinal package.
- Adjusting matching_base and deduplicator to accept the new method as an argument.
- Updated test_active_learning to include the new...