
Stemmer Not Working as Expected

Open thedamnedrhino opened this issue 7 years ago • 2 comments

I am running the script with perl perstem.pl < input > output, where input is a file containing the single line:

کتابهاصفحه ها صفحهی صفحه ی کار کن کارکن

I would expect the stemmer to change this line in many places. However, the output file contains exactly the same data as the input (the diff shell command shows no difference between the two files). I have also tried all the relevant options (-s, --irreg-stem, -t 1), to no avail. Running the script as perl perstem.pl < input | cat > output gave the same result.

thedamnedrhino, Sep 02 '17 08:09

I have the same problem, and I also get these messages: Use of the encoding pragma is deprecated at perstem.pl line 133. Use of the encoding pragma is deprecated at perstem.pl line 137.

abb4s, Oct 21 '18 13:10

Just replace use encoding "utf8" with use utf8 and it should work :)

SSBakh, Dec 03 '19 17:12
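
For anyone who wants to apply the replacement suggested above without editing the file by hand, here is a minimal shell sketch. The function name patch_encoding_pragma is made up for illustration, and it assumes the pragma appears in the script exactly as use encoding "utf8" (double quotes, single space); a differently written line would be left untouched. It keeps a .bak copy of the original next to the patched file.

```shell
# Hypothetical helper (not part of perstem): swap the deprecated pragma
# for `use utf8` in the given file, keeping a backup as <file>.bak.
# Assumes the pragma is spelled exactly: use encoding "utf8"
patch_encoding_pragma() {
    file=$1
    # Keep the original so the change can be undone.
    cp "$file" "$file.bak" || return 1
    # Rewrite every matching line; all other lines pass through unchanged.
    sed 's/use encoding "utf8"/use utf8/' "$file.bak" > "$file"
}
```

Usage would be patch_encoding_pragma perstem.pl, followed by the same perl perstem.pl < input > output command as before. One caveat: use utf8 only declares that the script's own source is UTF-8, whereas the encoding pragma also set the encoding of STDIN and STDOUT, so if the output still looks wrong the I/O layers may be worth checking as well.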