abydos

Fuzzy intersection using Levenshtein alignment

Open chrislit opened this issue 4 years ago • 0 comments

Add another intersection type: a fuzzy intersection based on ordered tokens, using Levenshtein alignments to parcel out the similarity weights. Given two aligned strings, each pair of aligned tokens is handled as follows:

  • if the two tokens are equal, the weight for the token is divided by 2 and added to the intersection
  • if the two tokens are unequal, the weight for each token is added to its own side's non-intersection, except for '-' tokens, which accrue no weight because they only mark an insertion or deletion
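
A minimal sketch of the rules above, in plain Python and independent of the abydos API. Everything named here (`GAP`, `align`, `fuzzy_intersection`, the `weight` callback) is hypothetical and not part of the library. It assumes unit-cost Levenshtein alignment over token lists, a default weight of 1.0 per token, that no real token equals '-', and it reads "divided by 2" literally, so an equal pair contributes half the token's weight to the intersection.

```python
from collections import defaultdict

GAP = "-"  # gap marker from the issue text; assumes no real token equals "-"


def align(src, tar):
    """Return one optimal Levenshtein alignment as (src_token, tar_token) pairs."""
    n, m = len(src), len(tar)
    # dist[i][j] = unit-cost edit distance between src[:i] and tar[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (src[i - 1] != tar[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Walk back from the bottom-right cell, preferring match/substitution.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dist[i][j] == dist[i - 1][j - 1] + (src[i - 1] != tar[j - 1])):
            pairs.append((src[i - 1], tar[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((src[i - 1], GAP))  # token deleted from src
            i -= 1
        else:
            pairs.append((GAP, tar[j - 1]))  # token inserted from tar
            j -= 1
    pairs.reverse()
    return pairs


def fuzzy_intersection(src, tar, weight=lambda token: 1.0):
    """Split token weights into intersection, src-only, and tar-only parts."""
    inter = defaultdict(float)
    src_only = defaultdict(float)
    tar_only = defaultdict(float)
    for a, b in align(src, tar):
        if a == b:
            # Equal tokens: half the token's weight goes to the intersection.
            inter[a] += weight(a) / 2
        else:
            # Unequal tokens: each real token counts toward its own side;
            # gap markers accrue nothing.
            if a != GAP:
                src_only[a] += weight(a)
            if b != GAP:
                tar_only[b] += weight(b)
    return dict(inter), dict(src_only), dict(tar_only)
```

For example, `fuzzy_intersection("the quick brown fox".split(), "the quick red fox jumps".split())` puts 0.5 each for `the`, `quick`, and `fox` into the intersection, 1.0 for `brown` into the source-only part, and 1.0 each for `red` and `jumps` into the target-only part. When several optimal alignments exist, this sketch picks one arbitrarily (match or substitution first); how ties should be broken is left open by the issue text.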

chrislit, Jul 17 '19 23:07