[FEAT] Date comparison without datediffs

Open samnlindsay opened this issue 1 year ago • 0 comments

Is your proposal related to a problem?

I am trying to build a comparison for a date column without any datediff levels (i.e. exact match, Damerau-Levenshtein, else). Each of the following attempts fails:

cl.damerau_levenshtein_at_thresholds("date_col", [1,2]) 

:x: date type column, not string

ctl.date_comparison("date_col", damerau_levenshtein_thresholds=[1,2], datediff_thresholds=[], datediff_metrics=[])

:x: datediff_error_logger requires datediff_thresholds for cl.datediff_at_thresholds even though it’s not strictly necessary for ctl.date_comparison

ctl.date_comparison("date_col", damerau_levenshtein_thresholds=[1,2], datediff_thresholds=None, datediff_metrics=None)

:x: datediff_error_logger fails due to len(None)

Describe the solution you'd like

I think the last form (passing None for both datediff arguments) should be sufficient to say “I don’t want any datediff levels”, and should turn off datediff_error_logger here:

if datediff_thresholds is not None:
    # Validate user inputs
    datediff_error_logger(thresholds=datediff_thresholds, metrics=datediff_metrics)
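
To illustrate the intended behaviour, here is a minimal, self-contained sketch of the guard. It does not import Splink: `datediff_error_logger_stub` is a hypothetical stand-in for the real validator (whose actual checks may differ), and `plan_date_levels` is an invented helper that only lists which levels would be built.

```python
def datediff_error_logger_stub(thresholds, metrics):
    # Hypothetical stand-in for Splink's validator, NOT its real implementation.
    # Like the real one, it calls len() on its inputs, so None would raise TypeError.
    if len(thresholds) == 0 or len(metrics) == 0:
        raise ValueError("datediff thresholds and metrics must be non-empty")
    if len(thresholds) != len(metrics):
        raise ValueError("datediff thresholds and metrics must have equal length")


def plan_date_levels(
    damerau_levenshtein_thresholds=(1, 2),
    datediff_thresholds=None,
    datediff_metrics=None,
):
    # Invented helper: returns labels for the levels that would be created.
    levels = ["null", "exact match"]
    levels += [f"damerau_levenshtein <= {t}" for t in damerau_levenshtein_thresholds]

    # Proposed guard: None means "no datediff levels", so skip validation entirely.
    if datediff_thresholds is not None:
        datediff_error_logger_stub(datediff_thresholds, datediff_metrics)
        levels += [
            f"datediff <= {t} {m}"
            for t, m in zip(datediff_thresholds, datediff_metrics)
        ]

    levels.append("else")
    return levels
```

With this guard, calling `plan_date_levels()` with the datediff arguments left as None returns only the null, exact, Damerau-Levenshtein and else levels, while an explicit empty list still reaches the validator and is rejected.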

Describe alternatives you've considered

The alternative is to write a custom version of ctl.date_comparison that leaves out the datediff levels.
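
For reference, a rough sketch of what that workaround might look like as a plain comparison dictionary. This assumes the Splink 3 settings-dictionary keys (`output_column_name`, `comparison_levels`, `sql_condition`, `label_for_charts`, `is_null_level`) and a backend such as DuckDB that provides a `damerau_levenshtein` SQL function; the helper name `date_comparison_without_datediff` is made up for illustration. The date column is cast to a string so the string-distance function can be applied.

```python
def date_comparison_without_datediff(col, thresholds=(1, 2)):
    # Hypothetical helper: builds a comparison dict with no datediff levels.
    # Cast both sides to text, since the column is a date type, not a string.
    left = f"cast({col}_l as varchar)"
    right = f"cast({col}_r as varchar)"

    levels = [
        {
            "sql_condition": f"{col}_l IS NULL OR {col}_r IS NULL",
            "label_for_charts": "Null",
            "is_null_level": True,
        },
        {
            "sql_condition": f"{col}_l = {col}_r",
            "label_for_charts": "Exact match",
        },
    ]

    # One level per Damerau-Levenshtein threshold, strictest first.
    for t in sorted(thresholds):
        levels.append(
            {
                # Assumes the SQL backend exposes damerau_levenshtein().
                "sql_condition": f"damerau_levenshtein({left}, {right}) <= {t}",
                "label_for_charts": f"Damerau-Levenshtein <= {t}",
            }
        )

    levels.append({"sql_condition": "ELSE", "label_for_charts": "All other comparisons"})
    return {"output_column_name": col, "comparison_levels": levels}


comparison = date_comparison_without_datediff("date_col")
```

The resulting dictionary would then go into the `comparisons` list of the settings, which is exactly the boilerplate this feature request is trying to avoid.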

samnlindsay, Oct 10 '23 10:10