Robin Linacre

234 comments by Robin Linacre

One thing (probably separate from that specific issue) is that I think we should be outputting the intermediate tables with a unique name (e.g. with asci_uuid), and then passing them...

This seems to give the right answer but doesn't have a correct termination condition; at the moment it just iterates 20 times. ``` edges = pairwise_predictions.as_pandas_dataframe() duckdb.register("edges", edges) sql = """...
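For illustration, here is a minimal sketch of what a proper termination condition could look like. It is plain Python rather than the DuckDB SQL from the comment, and it is not Splink's implementation: the function name `connected_components` and the sample edges are hypothetical. Instead of looping a fixed 20 times, it stops as soon as a full pass changes no labels.

```python
from collections import defaultdict


def connected_components(edges):
    """Label every node with the smallest node id in its component.

    Hypothetical helper for illustration only; not part of Splink.
    """
    # Build an undirected adjacency map from the edge list.
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Every node starts as its own representative.
    label = {node: node for node in neighbours}

    iterations = 0
    changed = True
    # Terminate when a whole pass leaves every label unchanged,
    # rather than after a fixed number of iterations.
    while changed:
        changed = False
        iterations += 1
        new_label = {}
        for node, adjacent in neighbours.items():
            best = min([label[node]] + [label[n] for n in adjacent])
            if best != label[node]:
                changed = True
            new_label[node] = best
        label = new_label
    return label, iterations


# Made-up edges: one chain 1-2-3 and one separate pair 4-5.
labels, n_iter = connected_components([(1, 2), (2, 3), (4, 5)])
```

The same idea carries over to an iterative SQL approach: compare the label table before and after each pass and stop when the two are identical.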

Sadly this appears to be slower, at least in this use case. Might be worth trying on a much larger dataset. ``` Time taken to cluster using Splink: 0.815 seconds...

Didn't seem to help performance, closing

Could this make it faster? https://duckdb.org/2025/05/23/using-key

I also had this problem a while back. I'm struggling to remember exactly what the issue was, but I was able to solve it with the following. Apologies, this code is a bit mixed in with...

Thanks for raising this. Not sure I totally follow what you're after. Can you achieve the same thing by setting `set_up_basic_logging` to False and then configuring logging how you want?...
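As a rough sketch of the "configure logging how you want" half of that suggestion, the snippet below uses only the standard library. It assumes the library's log records are emitted under a logger named `splink` (or a child of it); that name and the format string are assumptions for illustration, and how `set_up_basic_logging` is passed is not shown here.

```python
import logging

# Assumption: records come from a logger named "splink" or one of its children.
logger = logging.getLogger("splink")
logger.setLevel(logging.INFO)

# Send records to stderr with a custom format of your choosing.
handler = logging.StreamHandler()
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
logger.addHandler(handler)

# Avoid duplicate lines if the root logger also has handlers attached.
logger.propagate = False
```

With the library's own basic logging setup switched off, a configuration like this would be the only thing deciding where messages go and how they look.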

Apologies for not replying sooner. This is probably the most complex custom comparison I've ever seen! I think it's unlikely the term frequency configuration 'plays' nicely with some of the...

Sorry for the delay. When you use the comparison/comparison level libraries, these are really just using helper functions that generate the underlying settings dictionary for you. i.e. Splink always uses...

This just jogged a memory: https://github.com/moj-analytical-services/splink/discussions/2316 I think it might be the same issue. If so, the fix is:
• Create a fresh virtual environment
• Install the latest splink
• (Using...