
Greek language stemmer?

Open MaghSamana opened this issue 4 years ago • 10 comments

Hello, this isn't an issue but a request. There is a stemming algorithm for the Greek language: https://snowballstem.org/algorithms/greek/stemmer.html

Can it be added here?

MaghSamana avatar Sep 18 '20 18:09 MaghSamana

@msaari you wrote in issue #21 that you may implement other languages if a Snowball algorithm exists. Can I motivate you to give the Greek stemmer a chance?

HLeithner avatar Jul 09 '21 07:07 HLeithner

Yes. There’s an obscene amount of steps in it and the additional complication of using Greek alphabet, but I can look at this. Not before my vacation’s over, so at some point in August or September, depending on how much development time Relevanssi’s next version takes.

msaari avatar Jul 09 '21 08:07 msaari

I can help with anything regarding the Greek language, alphabet, characters, etc. Just message me if anything is needed.

MaghSamana avatar Jul 14 '21 17:07 MaghSamana

I've started work on the Greek stemmer in the greek branch. It seems complicated, but we'll see how it turns out once I get to the actual stemming steps.

msaari avatar Aug 19 '21 08:08 msaari

I’m interested in the addition of the Greek language too. If you need any help with the Greek alphabet, let me know.

tarasiadis avatar Feb 13 '22 07:02 tarasiadis

I'm stuck with this. The Greek alphabet is not the problem; understanding the stemmer is. If someone can help me understand the stemmer details, that'd be helpful.

msaari avatar Feb 13 '22 07:02 msaari

I'm trying to understand the algorithm too, but I'm not a specialist on this! If someone can help, it would be very useful to include the Greek language in the stemmer.

tarasiadis avatar Feb 14 '22 17:02 tarasiadis

I have found some code for the Greek stemmer algorithm that may help you. I've linked it here: https://u.pcloud.link/publink/show?code=XZSAo7VZBgQQ9HSGre03ltv3ll2wuzaTlBby

tarasiadis avatar Feb 17 '22 07:02 tarasiadis