I compared the Dale-Chall results on one of Robert Munsch's stories ('Love You Forever') from this with other scoring software (e.g. the one you cite). The results were drastically different. I don't know much about the topic, but looking at Wikipedia I see that the 1995 revision of Dale-Chall (which expanded the word list to 3000 words) completely changed the formula:

"In 1995, Dale and Chall published a new version of their formula with an upgraded word list, the New Dale–Chall readability formula.[45] Its formula is:

Raw score = 64 - 0.95 *(PDW) - 0.69 *(ASL) "

vs. what is in the code:

    def _score(self):
        stats = self._stats
        words_per_sent = stats.num_words / stats.num_sentences
        percent_difficult_words = stats.num_dale_chall_complex / stats.num_words * 100
        raw_score = 0.1579 * percent_difficult_words + 0.0496 * words_per_sent
        adjusted_score = raw_score + 3.6365 if percent_difficult_words > .05 else raw_score
        return adjusted_score

Wikipedia shows that this is the formula from the earlier version of Dale-Chall: "this equation from 1948:

Raw score = 0.1579*(PDW) + 0.0496*(ASL) if the percentage of PDW is less than 5 %, otherwise compute
Raw score = 0.1579*(PDW) + 0.0496*(ASL) + 3.6365"
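To make the difference concrete, here is a minimal sketch of the two formulas exactly as quoted above from Wikipedia. The function names and the sample inputs (10% difficult words, 20 words per sentence) are made up for illustration and are not part of py-readability-metrics.

```python
def dale_chall_1948(pdw: float, asl: float) -> float:
    """1948 formula as quoted above.

    pdw: percentage of difficult words on a 0-100 scale
    asl: average sentence length in words
    """
    raw = 0.1579 * pdw + 0.0496 * asl
    # The 3.6365 adjustment applies only above 5% difficult words.
    return raw + 3.6365 if pdw > 5 else raw


def new_dale_chall_1995_raw(pdw: float, asl: float) -> float:
    """1995 raw score as quoted above from Wikipedia."""
    return 64 - 0.95 * pdw - 0.69 * asl


# Hypothetical sample: 10% difficult words, 20 words per sentence.
old = dale_chall_1948(10.0, 20.0)
new = new_dale_chall_1995_raw(10.0, 20.0)
print(round(old, 4), round(new, 4))  # 6.2075 40.7
```

The two scores move in opposite directions as a text gets harder (the 1948 score rises with PDW and ASL, the 1995 raw score falls), so the 1948 grade-level table cannot simply be applied to the 1995 value.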

Aug 19 '20 17:08 jrraines

https://en.wikipedia.org/wiki/Readability

That article differs from their article on Dale-Chall, which just gives the old formula. Another issue I noticed while googling around is whether the formula counts unique unfamiliar words or every occurrence of an unfamiliar word. For the text I used this is a big deal. Do we count 27 instances of 'forth' or one?

It doesn't seem like the grade level equivalence can use the old formula but I haven't googled up a new one.

Aug 19 '20 17:08 jrraines

we count each occurrence of the word. words are captured using lists, not sets, so it's not unique words but all words. for the case above, it would be 27 instances of forth.

If you find that sets of words should be used in some instances, the change would be fairly straightforward. all statistics input to each scorer are computed here.
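As a rough illustration of the list-versus-set distinction, here is a small sketch. It is not the library's code: the tokenizer, the `difficult_counts` helper, and the tiny familiar-word set are all hypothetical, and 'forth' is treated as unfamiliar only for the sake of the example.

```python
import re


def difficult_counts(text, familiar_words):
    """Return (total tokens, difficult occurrences, unique difficult words)."""
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # A list keeps every occurrence; a set collapses repeats.
    difficult = [t for t in tokens if t not in familiar_words]
    return len(tokens), len(difficult), len(set(difficult))


# Hypothetical familiar-word set, NOT the real Dale-Chall list.
familiar = {"she", "rocked", "him", "back", "and"}
sample = "She rocked him back and forth, back and forth, back and forth."

total, occurrences, unique = difficult_counts(sample, familiar)
print(total, occurrences, unique)  # 12 3 1
```

With list counting the sample is 3/12 = 25% difficult words; with set counting it is 1/12, about 8%. That gap is why the choice matters so much for a repetitive text.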

feel free to experiment, i'm happy to accept PRs. also, happy to make the change as well, if you find something definitive

Aug 20 '20 00:08 cdimascio

@jrraines fyi, u may want to update to the latest version.

It corrects if percent_difficult_words > .05 to if percent_difficult_words > 5
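A quick sketch of why that threshold matters, assuming (as in the snippet quoted earlier in this thread) that percent_difficult_words is on a 0-100 scale. The helper name and the sample counts are hypothetical.

```python
def gets_adjustment(num_difficult: int, num_words: int, threshold: float) -> bool:
    """True if the +3.6365 adjustment would be applied for this threshold."""
    percent_difficult_words = num_difficult / num_words * 100
    return percent_difficult_words > threshold


# 20 difficult words out of 1000 is 2%, below the intended 5% cutoff.
print(gets_adjustment(20, 1000, threshold=.05))  # True: old check adds 3.6365 anyway
print(gets_adjustment(20, 1000, threshold=5))    # False: corrected check does not
```

With the old comparison, almost any text containing a difficult word picked up the extra 3.6365 points.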

Aug 20 '20 01:08 cdimascio

Thanks for your reply! I will try that later.

I think the deal is that 1. all the online tools are using the 1995 3000 word list with the 1948 formula (that was intended and validated with a 1000 word list) and 2. the algorithm is supposed to be used on a text sample of about 1000 words and to count the number of unique words (i.e. the set). Again, I don’t know a lot about it, yet. I did order the 1995 book to try to understand what is going on better.

Here’s what I wrote out to clarify what I saw for myself:

I used https://www.interventioncentral.org/teacher-resources/oral-reading-fluency-passages-generator on the text of Robert Munsch’s Love You Forever story and got these readability scores:

Then I took the same text and used py-readability-metrics library and got these scores:

Flesch-Kincaid: score 6.200658777768648, grade level 6
Flesch reading ease: score 84.35790204618294, ease "easy", grade level ['6']
Dale-Chall: score 6.940605503054209, grade level ['7', '8']
Automated Readability Index: score 4.968700797107406, grade level ['5'], age [10, 11]
Gunning Fog: score 8.792845207768375, grade level 9
Coleman Liau: score 3.332767962308594, grade level 3
Smog: score 7.485028197883463, grade level 7
Spache: score 4.6358866518749835, grade level 5
Linsear Write: score 10.453488372093023, grade level 10

Statistics: {'num_letters': 2979, 'num_words': 849, 'num_sentences': 43, 'num_polysyllabic_words': 25, 'avg_words_per_sentence': 19.74418604651163, 'avg_syllables_per_word': 1.210836277974087}
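As a sanity check, three of these scores can be recomputed by hand from the statistics. The sketch below uses the commonly published coefficients for ARI, Flesch reading ease, and Coleman-Liau. That the library uses these same coefficients is an assumption, not something checked against its source. The variable names are my own.

```python
# Statistics copied from the library output listed above.
stats = {
    "num_letters": 2979,
    "num_words": 849,
    "num_sentences": 43,
    "avg_syllables_per_word": 1.210836277974087,
}

letters_per_word = stats["num_letters"] / stats["num_words"]
words_per_sentence = stats["num_words"] / stats["num_sentences"]
syllables_per_word = stats["avg_syllables_per_word"]

# Automated Readability Index
ari = 4.71 * letters_per_word + 0.5 * words_per_sentence - 21.43

# Flesch reading ease
flesch_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

# Coleman-Liau: letters and sentences per 100 words
letters_per_100 = letters_per_word * 100
sentences_per_100 = stats["num_sentences"] / stats["num_words"] * 100
coleman_liau = 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

print(round(ari, 4), round(flesch_ease, 4), round(coleman_liau, 4))
# 4.9687 84.3579 3.3328
```

All three agree with the values the library printed, so for these measures any disagreement with another tool most likely comes from how letters, syllables, words, and sentences are counted, not from the formula itself.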

The open source python library gives higher scores for most of the measures. ARI is almost identical and Dale-Chall is very different. Coleman Liau is lower in the open source python tool.

It seems to me that 3rd graders can virtually all handle the text (which does have some long sentences). Munsch uses repetition at many levels in the story and that helps the students build speed and may not be accounted for by any of the scoring systems. I’m bothered by the lack of agreement between the two implementations.

Here is the text of the story:

A mother held her new baby and very slowly rocked him back and forth, back and forth, back and forth, back and forth. And while she held him, she sang: