py-readability-metrics

How to modify the code to measure the readability of sentences with fewer than 100 words?

Open MathrewLing opened this issue 5 years ago • 3 comments

Thank you very much!!!

MathrewLing avatar Jun 03 '20 10:06 MathrewLing

@MathrewLing each scorer explicitly checks that the text has at least 100 words. You could remove this limitation by adding an option to skip the check. Each scorer would need to honor the new option.

See the following: https://github.com/cdimascio/py-readability-metrics/blob/master/readability/scorers/dale_chall.py#L17

If you are up for making the code changes, I'll certainly review them and work with you to merge them in.
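A rough sketch of what such an option could look like, using a stand-in class rather than the library's real scorers. The names `SketchScorer` and `min_words`, and the locally defined `ReadabilityException`, are assumptions made for illustration; the actual change would have to be applied in each scorer module.

```python
class ReadabilityException(Exception):
    """Stand-in exception defined here so the sketch is self-contained."""


class SketchScorer:
    """Hypothetical scorer illustrating an optional word-count check."""

    def __init__(self, num_words, min_words=100):
        # num_words: word count of the text being scored.
        # min_words: assumed option; pass None to skip the check entirely.
        self.num_words = num_words
        self.min_words = min_words

    def score(self):
        # Enforce the minimum only when the option is set.
        if self.min_words is not None and self.num_words < self.min_words:
            raise ReadabilityException(f"{self.min_words} words required.")
        # Placeholder for the real metric computation.
        return "score computed"
```

With the default, short texts are still rejected; passing `min_words=None` opts out of the check for callers who accept the reduced accuracy.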

cdimascio avatar Jun 03 '20 12:06 cdimascio

Note that fewer than 100 words may not provide a sufficient signal, so the accuracy of the result will suffer.

cdimascio avatar Jul 19 '20 22:07 cdimascio

Thanks for raising the question of how the code could measure the readability of text with fewer than 100 words. Here is what I would do: for a sentence or document with fewer than 25 words, raise an error and return no result, following your reasoning that the accuracy would suffer severely. For a document with between 25 and 100 words, how about returning the result while warning the user that "Document with fewer than 100 words may not provide sufficient signal to the algorithm and hence the accuracy of the result will suffer"?
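A minimal sketch of this two-threshold idea, assuming a helper that each scorer would call before computing its metric. The function name `check_word_count`, the constant names, and the choice of `ValueError` are hypothetical and not part of the library.

```python
import warnings

HARD_MIN_WORDS = 25   # below this, refuse to score (assumed threshold)
SOFT_MIN_WORDS = 100  # below this, score but warn (assumed threshold)


def check_word_count(num_words):
    """Return True if the text is long enough to score without a warning.

    Raises ValueError below the hard minimum; emits a UserWarning and
    returns False between the hard and soft minimums.
    """
    if num_words < HARD_MIN_WORDS:
        raise ValueError(
            f"At least {HARD_MIN_WORDS} words are required; got {num_words}."
        )
    if num_words < SOFT_MIN_WORDS:
        warnings.warn(
            "Document with fewer than 100 words may not provide sufficient "
            "signal to the algorithm and hence the accuracy of the result "
            "will suffer",
            UserWarning,
        )
        return False
    return True
```

Using the `warnings` module rather than printing lets callers silence the message or turn it into an error with the standard warning filters.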

son520804 avatar Oct 06 '20 14:10 son520804