splink
[FEAT] Settings validation for single column input datasets
Is your proposal related to a problem?
Some users attempt to use splink on single-column datasets and are met with unhelpful error messages, rather than a warning that this is not supported (see https://github.com/moj-analytical-services/splink/issues/1362).
The error also only surfaces further into the linkage process, which makes it harder for users to recognise that the issue is not a bug but a constraint within splink.
Describe the solution you'd like
An additional check within the settings validator that identifies and flags this issue before any linkage methods are run.
There's a question about whether this should be a warning or an error. I'd lean towards having this sit in the settings validator as an error, prompting users to turn off settings validation should they wish to use methods such as `profile_columns` on their data.
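As a rough illustration of the kind of check I have in mind, here is a minimal standalone sketch. It does not use splink's actual settings validator internals: the function names, the `SingleColumnInputError` class, the `min_columns` threshold and the `as_error` switch are all hypothetical, and the input is assumed to be a plain mapping of table name to column names.

```python
import warnings


class SingleColumnInputError(ValueError):
    """Hypothetical error type for input tables with too few columns."""


def find_single_column_tables(input_columns, min_columns=2):
    """Return the names of tables that have fewer than `min_columns` columns.

    `input_columns` maps a table name to its list of column names.
    The threshold of 2 is an assumption made for this sketch.
    """
    return [
        table_name
        for table_name, columns in input_columns.items()
        if len(columns) < min_columns
    ]


def validate_input_columns(input_columns, as_error=True, min_columns=2):
    """Flag tables with too few columns before any linkage method runs.

    Raises SingleColumnInputError when `as_error` is True, otherwise emits
    a UserWarning. Returns the list of offending table names.
    """
    offending = find_single_column_tables(input_columns, min_columns)
    if not offending:
        return offending

    message = (
        f"Input table(s) {offending} have fewer than {min_columns} columns. "
        "This is a constraint of the linkage process rather than a bug. "
        "Turn off settings validation if you only want to run methods "
        "such as profile_columns on this data."
    )
    if as_error:
        raise SingleColumnInputError(message)
    warnings.warn(message, UserWarning, stacklevel=2)
    return offending
```

The `as_error` flag is only there to show both options under discussion. If we settle on an error, the warning branch can be dropped.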