
FYI - Settings dict must be defined with double quotes, as if it were valid JSON, but booleans must be defined as in Python.

Open mastratton3 opened this issue 1 year ago • 3 comments

What happens?

I've opened a PR to update the documentation but I wanted to open an issue to highlight the behavior.

To Reproduce

If you input the following into the settings editor, it will highlight the errors I encountered:

{
    "link_type": "dedupe_only",
    "probability_two_random_records_match": 0.0001,
    "blocking_rules_to_generate_predictions": [
        "l.first_name = r.first_name",
        "l.surname = r.surname"
    ],
    "comparisons": [
        {
            "output_column_name": "First name",
            "comparison_levels": [
                {
                    'sql_condition': "first_name_l IS NULL OR first_name_r IS NULL",
                    "label": "Null",
                    "is_null_level": true
                },
                {
                    "sql_condition": "first_name_l = first_name_r",
                    "label_for_charts": "Exact match",
                    "tf_adjustment_column": "first_name",
                    "tf_minimum_u_value": 0.001
                },
                {
                    "sql_condition": "levenshtein(first_name_l, first_name_r) <= 2",
                    "label_for_charts": "Levenstein <= 2",
                    "tf_adjustment_column": "first_name",
                    "tf_minimum_u_value": 0.003,
                    "tf_adjustment_weight": 0.5
                },
                {
                    "sql_condition": "ELSE",
                    "label_for_charts": "All other comparisons"
                }
            ]
        },
        {
            "output_column_name": "Surname"
        },
        {
            "output_column_name": "Date of birth"
        }
    ],
    "additional_columns_to_retain": [
        "cluster"
    ]
}

OS:

iOS

Splink version:

3.8.1

Have you tried this on the latest master branch?

  • [X] I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • [X] I agree

mastratton3 avatar May 13 '23 14:05 mastratton3

@RobinL when you're back, would you mind adding a quick note to say that the settings editor requires double quotes, rather than singles?

ThomasHepworth avatar May 17 '23 11:05 ThomasHepworth

Add warning to docs page

RossKen avatar Sep 19 '23 13:09 RossKen

Just to note, this is because the settings editor edits a JSON document rather than a Python dictionary.

This is partly because it would be much harder to get validation against the JSON schema working properly if the editor used Python syntax.

Just mentioning so it's clear what the issue is. I will try to get around to putting a note on the editor itself too.
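
For anyone hitting the same thing, here is a minimal sketch of the JSON versus Python difference using only the standard-library `json` module. It does not use Splink or the settings editor itself; the small fragment below is a cut-down, illustrative version of the comparison level from the reproduction.

```python
import json

# Valid JSON: double-quoted strings and lowercase booleans.
valid_json = '{"label": "Null", "is_null_level": true}'
parsed = json.loads(valid_json)  # JSON true is parsed as Python True

# The same fragment with a single-quoted key, as in the reproduction.
single_quoted = """{'label': "Null", "is_null_level": true}"""
try:
    json.loads(single_quoted)
    single_quotes_accepted = True
except json.JSONDecodeError:
    # JSON only allows double-quoted strings, so parsing fails here.
    single_quotes_accepted = False

# As a Python dict literal, either quote style is fine,
# but the boolean has to be written as True.
python_dict = {'label': "Null", "is_null_level": True}

# Serialising the Python dict yields JSON with double quotes and true.
as_json = json.dumps(python_dict)
print(as_json)
```

So a settings dictionary written in Python can be passed through `json.dumps` to get text that a JSON editor should accept, rather than converting quotes and booleans by hand.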

RobinL avatar Sep 20 '23 19:09 RobinL

In Splink 4, settings are provided as a Python object and the settings editor has been decommissioned. Closing.

RobinL avatar Jul 24 '24 18:07 RobinL