Glance-Bookmarklet
Long words disappearing
Long words are not shown; the reader jumps straight to the next word.
E.g., on this page http://www.nrk.no/nordland/en-av-fire-disponert-for-narkolepsi-1.11585097 the words "forskningsartikkel", "svineinfluensaviruset", "Pandemrix-vaksinen" and "årsakssammenhengen" are not shown.
spritz.js:
173 var tail = 22 - (word.length + 7);
174 word = '.......' + word + ('.'.repeat(tail));
310 String.prototype.repeat = function( num ){
311 return new Array( num + 1 ).join( this );
312 }
forskningsartikkel length is 18: 22-(18+7) = -3
Uncaught RangeError: Invalid array length (on line 311)
Max. word size is 16. I think we need to split long words. The question is where.
Any suggestions?
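To make the arithmetic above concrete, here is a minimal sketch of the padding logic with a clamp added. The names `WIDTH`, `LEAD`, `tailLength`, `repeatChar` and `padWord` are hypothetical and not from spritz.js; only the constants 22 and 7 come from the excerpt. The unclamped version fails because `new Array(num + 1)` throws once `num + 1` is negative, which happens for words of 17 or more characters. Clamping only avoids the exception; the word still overflows the 22-character frame.

```javascript
// Hypothetical names; 22 and 7 mirror the constants on line 173.
var WIDTH = 22;
var LEAD = 7;

// Same arithmetic as line 173: negative once the word is longer than 15.
function tailLength(word) {
  return WIDTH - (word.length + LEAD);
}

// Same Array/join approach as lines 310-312, but a negative count is
// clamped to 0 so the Array constructor never sees a negative length.
function repeatChar(ch, num) {
  return new Array(Math.max(0, num) + 1).join(ch);
}

// Pads a word the way line 174 does, without throwing on long words.
function padWord(word) {
  return repeatChar('.', LEAD) + word + repeatChar('.', tailLength(word));
}
```

With this, `padWord('forskningsartikkel')` returns the word with its leading dots and no tail instead of raising the RangeError, so the word is at least displayed while a proper splitting strategy is worked out.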
Ah! Great catch. I guess it's not going to work all that well for German, Norwegian, etc. right now.
22 is an arbitrary number. We could just raise that up. Would that solve your problem?
We can use soft hyphens (&shy;, U+00AD) to split long words. For German and English there is, for example, hyphenator.js https://code.google.com/p/hyphenator/ which can be used as a bookmarklet, too. But I have no solution for Norwegian or other languages.
On the official spritz example, I think they actually split words in the middle, and use dashes to show parts of the word over multiple frames.
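A rough sketch of that kind of splitting, assuming fixed-size chunks with a trailing dash on every frame except the last. `splitIntoFrames` and its `maxLen` parameter are hypothetical names, and this is not how the official Spritz reader is known to work; it ignores syllable boundaries entirely.

```javascript
// Hypothetical helper: split a word into frames of at most maxLen
// characters, marking every frame except the last with a trailing dash.
function splitIntoFrames(word, maxLen) {
  if (maxLen < 2) {
    throw new RangeError('maxLen must leave room for a character and a dash');
  }
  if (word.length <= maxLen) {
    return [word];
  }
  var frames = [];
  var body = maxLen - 1; // reserve one character for the dash
  var rest = word;
  while (rest.length > maxLen) {
    frames.push(rest.slice(0, body) + '-');
    rest = rest.slice(body);
  }
  frames.push(rest);
  return frames;
}
```

Each frame then fits the existing padding code, at the cost of breaking words at arbitrary positions.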
This is really annoying and currently makes OpenSpritz hardly usable for German texts. Hyphenation is the way to go in my opinion.
@F30, how long are (typical) German words, please?
I use the hyphenator that @smielke mentions; it exposes a hyphenateWord method. I only hyphenate words that are too long (presently by character length, but I'll be upgrading to base this on rendered width in ens). See: https://github.com/kukulski/readifry/blob/master/main.js
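As an illustration of the "only hyphenate words that are too long" idea, here is a minimal sketch that assumes the word has already been run through some hyphenation step and contains soft hyphens (U+00AD) at the allowed break points. `framesFromSoftHyphens` is a hypothetical helper, not part of hyphenator.js or readifry, and the break point in the usage example is hand-placed.

```javascript
var SHY = '\u00AD'; // soft hyphen

// Hypothetical helper: pack the soft-hyphen-separated parts of a word
// into frames of at most maxLen characters. Words that already fit are
// returned whole with the soft hyphens stripped.
// Limitation: a single part longer than maxLen - 1 is not split further.
function framesFromSoftHyphens(hyphenated, maxLen) {
  var parts = hyphenated.split(SHY);
  var plain = parts.join('');
  if (plain.length <= maxLen) {
    return [plain];
  }
  var frames = [];
  var current = '';
  parts.forEach(function (part) {
    // The +1 reserves room for the dash shown on non-final frames.
    if (current !== '' && current.length + part.length + 1 > maxLen) {
      frames.push(current + '-');
      current = part;
    } else {
      current += part;
    }
  });
  frames.push(current);
  return frames;
}
```

For example, `framesFromSoftHyphens('forsknings\u00ADartikkel', 15)` breaks the word at the soft hyphen and shows it over two frames, while a short word such as `'read\u00ADing'` is shown in one piece.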
@tomByrer Hmm, hard to estimate. We do have words like „Vermögenszuordnungszuständigkeitsübertragungsverordnung“ [1], but such are of course rather the exception than the rule. As the graphic in [2] is 404'ing, I unfortunately couldn't find a source for the word length distribution, but the article states an average length of 10.6 characters.
I agree, however, that it would be sensible to add hyphenation for long words instead of just increasing the maximum to some arbitrary value. From my knowledge, there exist both algorithm- and word list-based approaches to hyphenation. The one @smielke mentioned seems to be algorithmic, which of course appears preferable for a JS solution. It also promises to support more languages than he mentioned [3].
[1] http://www.sprachlog.de/2013/06/05/das-neue-laengste-wort-des-deutschen/
[2] http://www.duden.de/sprachwissen/sprachratgeber/durchschnittliche-laenge-eines-deutschen-wortes
[3] http://code.google.com/p/hyphenator/wiki/en_AddNewLanguage
average length of 10.6 characters
With a max length of 18 characters, hyphenating does sound best.
Hyphenator is a good idea here. Is there a preferred method, or should we just rip out the readifry one?
I broke out my hyphenation wrapper to make it easy to pick up: https://github.com/kukulski/readifry/blob/gh-pages/HyphenHelper.js