
improved Hebrew stopwords

Open TonySlon opened this issue 1 year ago • 6 comments

improved and corrected Hebrew stopwords

TonySlon avatar Feb 01 '24 12:02 TonySlon

Hi @TonySlon, can you tell me a little about the changes? What was wrong, and what was it changed to?

eklem avatar Feb 01 '24 12:02 eklem

Hi @eklem some words were written wrong, and I also added more stopwords which I think could be used

TonySlon avatar Feb 01 '24 13:02 TonySlon

Thanks! I was mostly wondering about the words that were written wrong. It seems they are two words (merged wrongly). But if that is true, I don't think they would ever be removed from a text, because words are added to the stopword library one at a time through an array, and a word is removed only if it equals a stopword.

eklem avatar Feb 01 '24 13:02 eklem

@eklem OK, I changed it to be one word in a new commit.

TonySlon avatar Feb 01 '24 13:02 TonySlon

Do you have a translation/meaning for the words (what they are changed to)?

eklem avatar Feb 02 '24 16:02 eklem

'באיזומידה' is never used like that; it has to be two separate words: 'באיזו', 'מידה'. This is one example, but there are several words that are not written correctly.

TonySlon avatar Feb 05 '24 09:02 TonySlon
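
The point discussed above, that a merged two-word entry can never match when words are compared one at a time, can be illustrated with a minimal sketch. This is not the `stopword` library's code or API: `remove_stopwords` is a hypothetical helper, the two stopword lists are made up for the illustration, and whitespace tokenization is an assumption.

```python
def remove_stopwords(tokens, stopwords):
    """Keep only tokens that are not exactly equal to a stopword entry.

    Hypothetical helper for illustration; not the stopword library's API.
    """
    stopword_set = set(stopwords)
    return [token for token in tokens if token not in stopword_set]


# Hypothetical stopword lists: one with the merged entry, one with it split.
merged_list = ["באיזומידה"]
split_list = ["באיזו", "מידה"]

# Assumption: the text is tokenized on whitespace, so the two words
# arrive as separate tokens and are checked one at a time.
tokens = "באיזו מידה זה נכון".split()

# The merged entry equals neither token, so nothing is removed.
print(remove_stopwords(tokens, merged_list))

# The split entries each equal one token, so both are removed.
print(remove_stopwords(tokens, split_list))
```

Under these assumptions, the merged entry leaves all four tokens in place, while the two separate entries remove the first two tokens.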