KoParadigm
feature request - reverse conjugation
hi folks,
i've been looking for a library that performs the following:
먹었습니다 # => 먹다
할게요 # => 하다
are there plans to support something like this? alternatively, do you think leveraging this library would be useful to build a reverse conjugator myself?
cheers, Ryan
You could generate every candidate verb root from the conjugated form and check whether each one is in Paradigm.verb2verb_classes. There is nothing special about using KoParadigm for this, though: any dictionary of verb roots would work the same way. Once you have found the verb root, just append "다" to get the dictionary form.
For example (using graphemes):
from koparadigm import Paradigm
from jamo import h2j, j2h
from grapheme import graphemes

p = Paradigm()
verb_dict = p.verb2verb_classes


def words(word: str):
    """Generate all possible sub-words by removing jamo from right to left."""
    syllables = list(graphemes(word))
    while syllables:
        end = syllables.pop()
        base_word = "".join(syllables)
        yield base_word + end
        jamo = h2j(end)
        if len(jamo) == 3:
            yield base_word + j2h(*jamo[:-1])


def unconjugate(verb: str):
    if verb in verb_dict:
        return [verb]
    verbs = []
    for v in words(verb):
        if v in verb_dict:
            verbs.append(v)
    return verbs

>>> list(words("먹었습니다"))
['먹었습니다', '먹었습니', '먹었습', '먹었스', '먹었', '먹어', '먹', '머']
>>> unconjugate("먹었습니다")
['먹']
>>> list(words("할게요"))
['할게요', '할게', '할', '하']
>>> unconjugate("할게요")
['하']
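
If you want to experiment without installing jamo and grapheme, the same candidate generation can be sketched with plain Unicode arithmetic: for a precomposed Hangul syllable, (code point - 0xAC00) % 28 is the index of its final consonant, and 0 means there is none. This is only a rough sketch under assumptions: the input is NFC-precomposed Hangul, and `stems` is a hypothetical stand-in for a real root dictionary such as p.verb2verb_classes. The names drop_final, candidates and dictionary_forms are made up for illustration and are not part of KoParadigm.

```python
HANGUL_BASE = 0xAC00      # code point of the first precomposed syllable
SYLLABLE_COUNT = 11172    # number of precomposed Hangul syllables
FINAL_COUNT = 28          # final-consonant slots per syllable (0 = none)


def drop_final(syllable: str):
    """Return the syllable without its final consonant, or None if it has none."""
    index = ord(syllable) - HANGUL_BASE
    if 0 <= index < SYLLABLE_COUNT and index % FINAL_COUNT != 0:
        return chr(ord(syllable) - index % FINAL_COUNT)
    return None


def candidates(word: str):
    """Yield sub-words by stripping syllables and final consonants from the right."""
    chars = list(word)  # assumes one code point per syllable (NFC input)
    while chars:
        yield "".join(chars)
        stripped = drop_final(chars[-1])
        if stripped is not None:
            yield "".join(chars[:-1]) + stripped
        chars.pop()


def dictionary_forms(word: str, stems):
    """Append "다" to every candidate found in `stems` (a hypothetical root set)."""
    return [c + "다" for c in candidates(word) if c in stems]


stems = {"먹", "하"}  # toy data for illustration only
print(dictionary_forms("먹었습니다", stems))
print(dictionary_forms("할게요", stems))
```

Like the version above, this only strips trailing syllables and final consonants, so contracted forms such as 해요 (from 하다) will not be matched.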
@CLIDragon I'll give it a try, thank you!