closestmatch

What is bagsize and how to use it?

Open dzpt opened this issue 7 years ago • 2 comments

I don't understand what bagsize is, or why it's []int{2} or []int{2,3,4}.

dzpt avatar Sep 03 '18 09:09 dzpt

@schollz: Any input here? I'm also a bit confused about how best to tweak this. The unit tests use []int{4} and []int{5}, but don't explain what that value controls.

I'll dig through the source, but an official explanation for dummies would be great. :smile:

coxley avatar Feb 09 '21 17:02 coxley

Looks like for each "sentence", the bagsizes control the size of sub-string chunks.

// "foo bar baz" @ []int{1,5}
[]string{"f", "o", "o", " ", "b", "a", "r", " ", "b", "a", "z"}
[]string{"foo b", "ar ba", "z"}

I think. But I'm also not sure how best to tweak it for different datasets.
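
To make that reading concrete, here is a minimal sketch that reproduces the example above. It is not taken from the closestmatch source: `chunk` is a hypothetical helper, and it assumes consecutive, non-overlapping chunks, which is only a guess. The library may well use overlapping (sliding-window) substrings instead, in which case size 5 would start with "foo b", "oo ba", and so on.

```go
package main

import "fmt"

// chunk is a hypothetical helper, NOT part of closestmatch.
// Assumption: each bag size splits the input into consecutive,
// non-overlapping pieces of at most `size` bytes.
func chunk(s string, size int) []string {
	if size <= 0 {
		return nil
	}
	var out []string
	for i := 0; i < len(s); i += size {
		end := i + size
		if end > len(s) {
			end = len(s) // last piece may be shorter
		}
		out = append(out, s[i:end])
	}
	return out
}

func main() {
	// One pass per bag size, as in []int{1, 5}.
	for _, size := range []int{1, 5} {
		fmt.Printf("%d: %q\n", size, chunk("foo bar baz", size))
	}
}
```

If this reading is right, smaller sizes would produce many short, less distinctive pieces, and larger sizes would produce fewer, more specific ones.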

coxley avatar Feb 09 '21 17:02 coxley