
Numbers are not handled correctly

Open DaveChild opened this issue 14 years ago • 2 comments

From Google Code:

Numbers written as digits within text (1, 20, 100, etc.) may not be handled correctly.

This is currently an unknown: should "20" be counted as two syllables ("twen-ty") or as one syllable? Or should it be excluded from the calculations?

DaveChild avatar Dec 02 '10 18:12 DaveChild

I don't know if you were still wondering about this, or even if you wanted to follow the original calculations as specified by J. Peter Kincaid and his team from 1975, but in the original paper[1] numbers are handled as follows:

For the Flesch-Kincaid Reading Ease score: each number is counted as one word, and its syllables are counted as the number is pronounced. So "20" is "twen-ty" (2 syllables) and "1918" is "nineteen eighteen" (4 syllables).[2]

For the Gunning Fog Index: numbers are considered "easy" words and get a score of one.[3]
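As a rough illustration of the two rules above, here is a minimal Python sketch. It is not code from this library, and every name in it (`spoken_parts`, `flesch_counts`, `fog_number_score`) is hypothetical. The syllable counts are entered by hand, and reading four-digit numbers as two pairs ("nineteen eighteen") is an assumption generalized from the single "1918" example; the paper may specify other cases differently.

```python
# (word, syllables) for 0-19, with syllable counts entered by hand.
UNITS = [("zero", 2), ("one", 1), ("two", 1), ("three", 1), ("four", 1),
         ("five", 1), ("six", 1), ("seven", 2), ("eight", 1), ("nine", 1),
         ("ten", 1), ("eleven", 3), ("twelve", 1), ("thirteen", 2),
         ("fourteen", 2), ("fifteen", 2), ("sixteen", 2), ("seventeen", 3),
         ("eighteen", 2), ("nineteen", 2)]
# (word, syllables) for the tens, keyed by the tens digit.
TENS = {2: ("twenty", 2), 3: ("thirty", 2), 4: ("forty", 2), 5: ("fifty", 2),
        6: ("sixty", 2), 7: ("seventy", 3), 8: ("eighty", 2), 9: ("ninety", 2)}


def spoken_parts(n):
    """Return (word, syllables) pairs for the spoken form of integer n."""
    if n < 0:
        raise ValueError("negative numbers are not covered by this sketch")
    if n < 20:
        return [UNITS[n]]
    if n < 100:
        tens, unit = divmod(n, 10)
        return [TENS[tens]] + ([UNITS[unit]] if unit else [])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        parts = [UNITS[hundreds], ("hundred", 2)]
        return parts + (spoken_parts(rest) if rest else [])
    high, low = divmod(n, 100)
    if n < 10000 and low >= 10:
        # Assumption: four-digit numbers are read as two pairs,
        # e.g. 1918 -> "nineteen eighteen".
        return spoken_parts(high) + spoken_parts(low)
    raise ValueError("this number form is not covered by this sketch")


def flesch_counts(token):
    """Return (words, syllables): one word, syllables as pronounced."""
    return 1, sum(syllables for _, syllables in spoken_parts(int(token)))


def fog_number_score(token):
    """Numbers count as "easy" words for Fog and score one."""
    if not token.isdigit():
        raise ValueError("expected a token made only of digits")
    return 1
```

With these definitions, `flesch_counts("20")` gives one word and two syllables, and `flesch_counts("1918")` gives one word and four syllables, matching the examples quoted from the paper. Currency symbols, percent signs, and the other cases the paper covers are left out.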

The paper covers a few more details for the calculations, like currency symbols, percent signs, etc.

[1] Kincaid, J.P., Fishburne, R.P., Rogers, R.L., & Chissom, B.S. (1975). Derivation of New Readability Formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease formula) for Navy Enlisted Personnel. Research Branch Report 8-75. Chief of Naval Technical Training: Naval Air Station Memphis.
[2] Kincaid 1975, p. 50.
[3] Kincaid 1975, p. 48.

getconor avatar Feb 16 '13 06:02 getconor

Thanks, this is great info. A little more than I have time to incorporate at the moment, but interesting for future development.

DaveChild avatar Jan 14 '14 15:01 DaveChild