closestmatch
What is bagsize and how to use it?
I don't understand what bagsize is, or why it's []int{2} or []int{2,3,4}.
@schollz: Any input here? I'm also a bit confused about how best to tweak this. The unit tests use []int{4} and []int{5} but don't explain what that value controls.
I'll dig through the source, but an official explanation for dummies would be great. :smile:
Looks like, for each "sentence", the bag sizes control the size of the substring chunks it gets split into.
// "foo bar baz" @ []int{1,5}
[]string{"f", "o", "o", " ", "b", "a", "r", " ", "b", "a", "z"}
[]string{"foo b", "ar ba", "z"}
I think. But I'm also not sure how best to tweak it for different datasets.