django-softhyphen

Prevent hyphenation of short words in Russian

Open Walkeryr opened this issue 10 years ago • 2 comments

I've seen this example in the source code:

Short words are not hyphenated

>>> hyphenate("<p>The brave men, living and dead.</p>")
u'<p>The brave men, liv&shy;ing and dead.</p>'

This doesn't hold for Russian, where five-letter words get hyphenated. How can I control this behavior?

Walkeryr, Apr 09 '14 12:04

Interesting question.

I'm not sure I have the answer, being ignorant of Russian hyphenation rules. This library uses a Russian dictionary by Peter Novodvorsky in the dicts directory. You can read more about it here and the dictionary itself is here.

It might be possible to add some option to the hyphenator that ignores word tokens below a certain size, but my recollection is there's nothing in the code that approaches that right now.

palewire, Apr 09 '14 16:04

Perhaps adding a character limit greater than zero here might do it?

palewire, Apr 09 '14 16:04
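
For illustration, the idea of skipping word tokens below a certain size could be sketched roughly as follows. This is not code from django-softhyphen: `hyphenate_long_words`, `min_length`, and `toy_hyphenate_word` are hypothetical names, the threshold of 6 is an arbitrary assumption, and the sketch works on plain text only (it does not parse HTML the way the library's `hyphenate` does). Any callable that hyphenates a single word could be passed in place of the stand-in.

```python
import re

SOFT_HYPHEN = "\u00ad"  # the character that &shy; stands for

# Runs of letters in any script, including Cyrillic (no digits, no underscore).
WORD_RE = re.compile(r"[^\W\d_]+")


def hyphenate_long_words(text, hyphenate_word, min_length=6):
    """Apply hyphenate_word only to words of at least min_length letters.

    hyphenate_word is any callable taking one word and returning it with
    soft hyphens inserted. Shorter words are returned unchanged.
    """
    def replace(match):
        word = match.group(0)
        if len(word) < min_length:
            return word  # too short: leave the word untouched
        return hyphenate_word(word)

    return WORD_RE.sub(replace, text)


def toy_hyphenate_word(word):
    # Hypothetical stand-in for a real dictionary-based hyphenator:
    # it simply inserts a soft hyphen after the third letter.
    return word[:3] + SOFT_HYPHEN + word[3:]


if __name__ == "__main__":
    # repr() makes the otherwise invisible soft hyphens show up as \xad.
    print(repr(hyphenate_long_words("The brave men, living and dead.", toy_hyphenate_word)))
    print(repr(hyphenate_long_words("слово переносы", toy_hyphenate_word)))
```

With this threshold, a five-letter word such as "слово" passes through unchanged while longer words are still handed to the hyphenator. Counting letters rather than bytes matters for Russian, so the sketch assumes Python 3 strings.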