matchr
Difference between libraries
Hi.
I tried this library and compared it with https://github.com/adrg
In some cases the two libraries give different results:
package main

import (
	"fmt"

	"github.com/adrg/strutil/metrics"
	"github.com/antzucaro/matchr"
)

func main() {
	r2 := "wilson kjell"
	r1 := "wilson mathias"
	fmt.Printf("matchr long distance:%f\n", matchr.JaroWinkler(r1, r2, true))
	fmt.Printf("matchr short distance:%f\n", matchr.JaroWinkler(r1, r2, false))
	m := metrics.NewJaroWinkler()
	fmt.Printf("adrg:%f\n", m.Compare(r2, r1))
}

// matchr long distance:0.694444
// matchr short distance:0.694444
// adrg:0.816667
https://go.dev/play/p/z2IQsqYjIDQ
What is the correct distance between these strings?
The original implementation (strcmp95), called from Perl, gives us 0.83523.
Thank you.
Hi.
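It looks like the difference between adrg/strutil and matchr is the 0.7 limit, which is not implemented in strutil. matchr, like strcmp95, only adds the Winkler common-prefix bonus when the plain Jaro score is above 0.7. For these two strings the Jaro score is 0.694444, just below that limit, so matchr returns it unchanged while strutil still adds the bonus for the shared "wils" prefix.
The difference between matchr and strcmp95/Perl is probably because the phonetic part is not implemented in matchr.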