
Can't get it to work with Russian

Open houshuang opened this issue 10 years ago • 3 comments

Can't get this to work, not sure if it's a UTF8 issue or what.

require 'ffi/hunspell'
c = FFI::Hunspell.dict('ru_RU')
p c.stem("рассчитывал") #-> []

Command line using the hunspell binary:

$ echo рассчитывал | hunspell -d ru_RU -s
рассчитывал рассчитывать

houshuang, Oct 28 '13 18:10

Works for me (my locale is UTF-8)

require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')

dict.valid? "рассчитывал"
#=> true 

dict.encoding
#=> #<Encoding:UTF-8> 

dict.stem "рассчитывал"
#=> ["рассчитывать"]

nkrot, Aug 04 '16 11:08

@houshuang what does __ENCODING__ return in irb? What is the output of the locale command?

postmodern, Dec 04 '16 06:12
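
For anyone wanting to answer both questions in one go, the following is a minimal sketch that prints the encoding-related settings Ruby sees. The method name `encoding_report` is made up for illustration; the calls it makes are standard Ruby.

```ruby
# Hypothetical helper (not part of ffi-hunspell): gathers the settings
# that determine how Ruby tags string literals and external data.
def encoding_report
  {
    source_encoding:  __ENCODING__,              # encoding of literals in this file
    default_external: Encoding.default_external, # derived from the locale by default
    default_internal: Encoding.default_internal, # usually nil
    locale_charmap:   Encoding.locale_charmap,   # charmap name reported by the locale
    lang:             ENV['LANG'],
    lc_all:           ENV['LC_ALL']
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If `source_encoding` and `default_external` both report UTF-8, the locale is probably not the culprit and the dictionary's own encoding is worth checking next.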

Yeah, this is an encoding problem:

On Ubuntu 17.04 (hunspell 1.4.1-2build1):

dict = FFI::Hunspell.dict('ru_RU')
dict.encoding
# => #<Encoding:KOI8-R (autoload)>

dict.suggest('ощибка')
# => []

dict.suggest('ощибка'.encode(dict.encoding)).map { |s| s.encode(__ENCODING__) }
# => ["ощипка", "ошибка"]

Envek, Jul 27 '17 11:07
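
The transcoding workaround from the last comment could be wrapped in a small helper. This is only a sketch under assumptions: `in_dict_encoding` is a hypothetical name (not part of ffi-hunspell), and it assumes the dictionary object responds to `encoding` and returns strings tagged with that encoding, as the output above suggests.

```ruby
# Hypothetical helper, not provided by ffi-hunspell.
# Transcodes `word` into the dictionary's encoding, yields it to the block,
# then transcodes every string the block returns back to the caller's encoding.
# String#encode raises if the word has characters the dictionary encoding lacks.
def in_dict_encoding(dict, word)
  native  = word.encode(dict.encoding)
  results = yield(native)
  results.map { |s| s.encode(word.encoding) }
end

# Usage sketch (requires the ffi-hunspell gem and a ru_RU dictionary):
#
#   require 'ffi/hunspell'
#   dict = FFI::Hunspell.dict('ru_RU')
#   in_dict_encoding(dict, 'ощибка') { |w| dict.suggest(w) }
#   in_dict_encoding(dict, 'рассчитывал') { |w| dict.stem(w) }
```

Passing the operation as a block keeps the helper independent of which dictionary method is called, so the same wrapper can cover `suggest` and `stem`.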