splink
FYI - Settings dict must be defined with double quotes, as if it were valid JSON, but booleans must be defined as in Python.
What happens?
I've opened a PR to update the documentation but I wanted to open an issue to highlight the behavior.
To Reproduce
If you input the following into the settings editor, it will highlight the errors I encountered:
{
"link_type": "dedupe_only",
"probability_two_random_records_match": 0.0001,
"blocking_rules_to_generate_predictions": [
"l.first_name = r.first_name",
"l.surname = r.surname"
],
"comparisons": [
{
"output_column_name": "First name",
"comparison_levels": [
{
'sql_condition': "first_name_l IS NULL OR first_name_r IS NULL",
"label": "Null",
"is_null_level": true
},
{
"sql_condition": "first_name_l = first_name_r",
"label_for_charts": "Exact match",
"tf_adjustment_column": "first_name",
"tf_minimum_u_value": 0.001
},
{
"sql_condition": "levenshtein(first_name_l, first_name_r) <= 2",
"label_for_charts": "Levenstein <= 2",
"tf_adjustment_column": "first_name",
"tf_minimum_u_value": 0.003,
"tf_adjustment_weight": 0.5
},
{
"sql_condition": "ELSE",
"label_for_charts": "All other comparisons"
}
]
},
{
"output_column_name": "Surname"
},
{
"output_column_name": "Date of birth"
}
],
"additional_columns_to_retain": [
"cluster"
]
}
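For anyone who wants to locate this kind of problem outside the editor, here is a minimal sketch using Python's standard `json` module. It assumes the editor applies ordinary strict JSON syntax rules; the helper name `first_json_error` is hypothetical and is not part of Splink.

```python
import json

# Fragment of the settings above, keeping the single-quoted key from the report.
fragment = """{
    'sql_condition': "first_name_l IS NULL OR first_name_r IS NULL",
    "label": "Null",
    "is_null_level": true
}"""


def first_json_error(text):
    """Return (line, column, message) for the first JSON syntax error, or None.

    Hypothetical helper for illustration only; not a Splink function.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


# The single-quoted key on the second line is rejected by a strict JSON parser.
print(first_json_error(fragment))

# Swapping the single quotes for double quotes makes the fragment parse.
fixed = fragment.replace("'sql_condition'", '"sql_condition"')
print(first_json_error(fixed))
```

This only checks JSON syntax. It does not check the settings against Splink's schema, so an unexpected key name would still pass.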
OS:
iOS
Splink version:
3.8.1
Have you tried this on the latest master branch?
- [X] I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- [X] I agree
@RobinL when you're back, would you mind adding a quick note to say that the settings editor requires double quotes rather than single quotes?
Add warning to docs page
Just to note, this is because the settings editor is editing a JSON document rather than a Python dictionary.
This is partly because it would be much harder to get validation against the JSON schema working properly if the editor accepted Python syntax.
Just mentioning this so it's clear what the issue is. I will try to get around to putting a note on the editor itself too.
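To illustrate the difference between the two syntaxes, here is a small sketch that writes one comparison level as a Python dictionary and serialises it with the standard `json` module. It assumes you have the settings as a Python dict and want text you can paste into a JSON editor; the variable names are illustrative only.

```python
import json

# A comparison level written as a Python dict: single quotes and True are valid here.
level = {
    'sql_condition': "first_name_l IS NULL OR first_name_r IS NULL",
    'label_for_charts': "Null",
    'is_null_level': True,
}

# json.dumps emits double-quoted strings and lowercase true/false,
# which is the form a JSON document requires.
as_json = json.dumps(level, indent=2)
print(as_json)

# Parsing the JSON text back gives an equal Python dict.
assert json.loads(as_json) == level
```

Going through `json.dumps` avoids converting quotes and booleans by hand when moving settings between Python code and a JSON document.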
In Splink 4, settings are provided as a Python object and the settings editor has been decommissioned. Closing.