
threshold_selection_tool_from_labels_table does not work using spark

Open guylissak opened this issue 1 year ago • 1 comments

What happens?

https://moj-analytical-services.github.io/splink/charts/threshold_selection_tool_from_labels_table.html

Hi, threshold_selection_tool_from_labels_table does not work when using the Spark linker. The same code works for me when using DuckDB.

error: ParseError: Invalid expression / Unexpected token. Line 1, Col: 14. l.first_name[4m[0m = r.first_name

To Reproduce

Run this code using spark linker

https://moj-analytical-services.github.io/splink/charts/threshold_selection_tool_from_labels_table.html

OS:

Databricks

Splink version:

3.9.14

Have you tried this on the latest master branch?

  • [X] I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • [X] I agree

guylissak avatar May 18 '24 21:05 guylissak

Thanks for the report

The characters ESC[4m and ESC[0m (where ESC is the escape character, 0x1B) are ANSI escape codes for terminal text formatting. Specifically:

  • ESC[4m is the ANSI escape code for enabling underline.
  • ESC[0m is the ANSI escape code for resetting all text attributes (including underline).

I think this is therefore likely to be a copy and paste problem. It could be related to this: https://github.com/moj-analytical-services/splink/issues/2018
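One way to rule this out is to strip any escape sequences from the pasted text before running it. Here is a minimal sketch in plain Python, assuming the stray characters are ANSI text-formatting (SGR) sequences like the two described above. `strip_ansi` and the sample string are illustrative only and are not part of Splink.

```python
import re

# Matches ANSI SGR (text-formatting) sequences such as ESC[4m and ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Return text with ANSI SGR escape sequences removed."""
    return ANSI_SGR.sub("", text)


# Hypothetical pasted expression carrying the two codes from the error message.
pasted = "l.first_name\x1b[4m\x1b[0m = r.first_name"
cleaned = strip_ansi(pasted)

print(repr(pasted))   # repr() makes the hidden escape characters visible
print(repr(cleaned))
```

Printing with `repr()` is the useful part: the escape characters are invisible in most editors and notebooks, but show up as `\x1b` in the repr.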

I've tried copying and pasting the DuckDB code, which works fine at my end.

Are you able to post a reproducible example using the Spark linker where this error occurs?

RobinL avatar Jun 07 '24 12:06 RobinL

Closing on the basis that I think this may have been a copy and paste error. Please reopen with a reprex if you're still having issues.

RobinL avatar Jul 18 '24 13:07 RobinL