simhash

About Chinese

Open mejinke opened this issue 10 years ago • 1 comments

Does this library support Chinese text?

mejinke, Mar 31 '15 09:03

English-speaking people are not very concerned with Unicode :) In the code sample provided, you can see that the author uses an ASCII string:

[]byte("this is a test phrase"),

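However, the library can support Unicode by using go.text (the text normalization package) and building the feature set with a Unicode normalization form: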

simhash.Simhash(simhash.NewUnicodeWordFeatureSet(content, norm.NFKC))

See this repo for a complete example: https://github.com/bbalet/gorelated
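To illustrate the idea without pulling in the library, here is a minimal standard-library sketch of a simhash-style fingerprint over Unicode words. It is not the library's implementation: `unicodeWords`, `sketchSimhash` and `hamming` are hypothetical names, every word gets a weight of 1, FNV-1a is an arbitrary choice of hash, and the NFKC normalization step is left out because it needs go.text.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
	"strings"
	"unicode"
)

// unicodeWords splits s into lowercase runs of Unicode letters and digits.
// Hypothetical helper: it does NOT apply NFKC normalization.
func unicodeWords(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
}

// sketchSimhash builds a 64-bit simhash-style fingerprint.
// Each word votes +1 or -1 on every bit position, based on its FNV-1a hash.
func sketchSimhash(s string) uint64 {
	var counts [64]int
	for _, w := range unicodeWords(s) {
		h := fnv.New64a()
		h.Write([]byte(w))
		sum := h.Sum64()
		for i := 0; i < 64; i++ {
			if sum&(1<<uint(i)) != 0 {
				counts[i]++
			} else {
				counts[i]--
			}
		}
	}
	// A bit is set in the fingerprint when its vote total is positive.
	var fp uint64
	for i, c := range counts {
		if c > 0 {
			fp |= 1 << uint(i)
		}
	}
	return fp
}

// hamming counts the bits that differ between two fingerprints.
func hamming(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

func main() {
	a := sketchSimhash("这 是 一个 测试 短语")
	b := sketchSimhash("这 是 一个 测试 句子")
	fmt.Printf("%016x %016x distance=%d\n", a, b, hamming(a, b))
}
```

Note that this sketch only splits on non-letter characters, so Chinese text written without spaces ends up as a single word per punctuation-delimited run. Real Chinese input would probably need a word segmenter or character n-grams before hashing.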

bbalet, Mar 31 '15 09:03