Robin Linacre
Interesting. It hadn't occurred to me that rather than using an 'array_transform'/'array_reduce' type function, you could spread over multiple columns, and more importantly, I hadn't thought about the possibility that...
Worth noting for people reading this thread that Python UDFs are coming to DuckDB very soon: https://twitter.com/holanda_pe/status/1655602267083489282
@OlivierBinette nice, I hadn't seen that in DuckDB. Yes, a showcase would be great! We're actually thinking about refactoring some of the comparisons because it's currently a bit tricky to...
On my phone, but the place to start is by looking at the SQL generated by the EMBEDDING_COMPARISON. There should be a method on that object to get the SQL...
I think it's the `=`. If you look at the function signature, there's an argument like 'higher is better' or something. Maybe there's a bug, but try changing that option.
I've had a go at a solution to this [here](https://github.com/RobinL/comparison_test/blob/98309645f080723bc0ad64184b3a3ea987fdf068/splink/__init__.py#L1). It's similar to your 'backend agnostic functions' but possibly even simpler. There is a demo script [here](https://github.com/RobinL/comparison_test/blob/98309645f080723bc0ad64184b3a3ea987fdf068/demo.py#L1) showing how it...
I've had a second go at this using a class-based approach to see if it can eliminate boilerplate, and I don't think it helps. Solution [here](https://github.com/RobinL/comparison_test/pull/3) for reference.
I think we're now in a good place: We have several different proposed solutions, all of which can successfully solve the core problem of getting rid of backend-specific comparison levels...
Here's another class-based approach which tries to combine @ADBond's approach and mine: https://github.com/RobinL/comparison_test/blob/e556a4b04e492daa7ff993428c27f3f693e75b45/splink/__init__.py (In a real implementation, I think Andy's ideas around having further classes e.g. a...
[This](https://github.com/RobinL/comparison_test/blob/19ef1c68711a7944ae2f1ad3ce9c17346bec4f9f/splink/__init__.py) is a slightly more fleshed out `ComparisonLevelCreator` that pulls in some of your ideas and implements `.configure`, just to get a sense of what it would look like.
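For anyone skimming the thread without opening the links, here is a rough sketch of the general shape being discussed: a level is described once, and the dialect is only supplied when the SQL is rendered. This is not the code in the linked repo. The method names `create_sql` and `get_comparison_level`, the `LevenshteinLevel` subclass, the dictionary keys, and the per-dialect function table are all assumptions made up for illustration; only `ComparisonLevelCreator` and `.configure` are named above.

```python
from typing import Optional

# Illustrative mapping of dialect -> Levenshtein function name.
# Treat these names as assumptions and check each backend's docs before relying on them.
_LEVENSHTEIN_FN = {
    "duckdb": "levenshtein",
    "spark": "levenshtein",
    "athena": "levenshtein_distance",
}


class ComparisonLevelCreator:
    """Backend-agnostic description of a comparison level (hypothetical sketch)."""

    def __init__(self, col_name: str):
        self.col_name = col_name
        self.m_probability: Optional[float] = None
        self.label: Optional[str] = None

    def configure(self, *, m_probability: Optional[float] = None,
                  label: Optional[str] = None) -> "ComparisonLevelCreator":
        # Only overwrite the settings the caller actually passed in.
        if m_probability is not None:
            self.m_probability = m_probability
        if label is not None:
            self.label = label
        return self  # returning self allows chaining

    def create_sql(self, dialect: str) -> str:
        raise NotImplementedError

    def get_comparison_level(self, dialect: str) -> dict:
        # The dialect is resolved here, at render time, not at construction time.
        level = {"sql_condition": self.create_sql(dialect), "label": self.label}
        if self.m_probability is not None:
            level["m_probability"] = self.m_probability
        return level


class LevenshteinLevel(ComparisonLevelCreator):
    """Hypothetical level: edit distance at or below a threshold."""

    def __init__(self, col_name: str, distance_threshold: int):
        super().__init__(col_name)
        self.distance_threshold = distance_threshold
        self.label = f"Levenshtein distance <= {distance_threshold}"

    def create_sql(self, dialect: str) -> str:
        fn = _LEVENSHTEIN_FN[dialect]  # KeyError for an unsupported dialect
        col = self.col_name
        return f"{fn}({col}_l, {col}_r) <= {self.distance_threshold}"


level = LevenshteinLevel("first_name", 2).configure(m_probability=0.8)
duckdb_level = level.get_comparison_level("duckdb")
athena_level = level.get_comparison_level("athena")
```

The point of the design is that the user never imports a backend-specific level: the same `level` object renders different SQL for `"duckdb"` and `"athena"`, and `.configure` keeps optional settings out of every subclass constructor.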