"Sentences must be at least two words long, unless a linebreak or end-of-text."
Hey, why this logic?
What if someone writes:
See it. Report it. Sorted.
Will "Sorted." get ignored?
Because it does not know that "it" isn't an abbreviation.
Without the ability to actually understand the words (which would be the holy grail of artificial intelligence, so slightly outside scope), the algorithm might as well be parsing "Bla bo. Bubble bi. Booboo." There is no hint at all whether a period marks an abbreviation or the end of a sentence.
It might just as well be parsing "See eg. Boston mr. Garcia", which has all the same character counts, positions of caps and periods, yet is only one sentence (albeit a bad one, but still something that should be detected as a single sentence).
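To make that concrete, here is a small Python sketch (purely illustrative, not part of PHP-Sentence; `shape` is a made-up helper) that reduces each text to its pattern of capitals, lowercase letters and punctuation. I added a trailing period to the second example so the strings line up exactly.

```python
def shape(text: str) -> str:
    """Reduce text to its surface pattern: 'A' for uppercase, 'a' for lowercase."""
    out = []
    for ch in text:
        if ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        else:
            out.append(ch)  # keep spaces and punctuation untouched
    return "".join(out)


three_sentences = "See it. Report it. Sorted."
one_sentence = "See eg. Boston mr. Garcia."  # trailing period added for the comparison
nonsense = "Bla bo. Bubble bi. Booboo."

for text in (three_sentences, one_sentence, nonsense):
    print(shape(text))  # each prints: Aaa aa. Aaaaaa aa. Aaaaaa.
```

All three texts collapse to the same pattern, so anything that only looks at capitals, word lengths and periods cannot tell them apart.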
As such, PHP-Sentence (and basically every other sentence boundary disambiguation algorithm) makes a best-effort attempt at detecting whether a period is the end of a sentence, part of an abbreviation or just... an ellipsis.
I feel you, but how about loading a list of abbreviations for English (and DE and NL, if you wanna maintain those languages), and then checking whether the single word is an abbreviation?
I think that simply ignoring. all. single. worded. sentences. isn't. very. good.
This is a really good lib for this: https://github.com/bigwhoop/sentence-breaker
I think a list of abbreviations could be added without too much effort. I'll look into it, though no promises as to a date.
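Roughly, an abbreviation check along those lines could look like the sketch below. This is Python for illustration only, not PHP-Sentence's actual code: `split_sentences` and the `ABBREVIATIONS` set are hypothetical, the list is a tiny made-up sample, and only periods are treated as sentence endings.

```python
import re

# Hypothetical sample list; a real one would be loaded per language.
ABBREVIATIONS = {"eg", "ie", "mr", "mrs", "dr", "etc"}


def split_sentences(text: str) -> list[str]:
    """Split on periods, unless the word before the period is a known abbreviation."""
    sentences = []
    start = 0
    # Candidate boundaries: a period followed by whitespace or the end of the text.
    for match in re.finditer(r"\.(?=\s|$)", text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        if last_word in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, keep scanning
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    rest = text[start:].strip()
    if rest:
        sentences.append(rest)  # trailing text without a closing period
    return sentences


print(split_sentences("See it. Report it. Sorted."))
print(split_sentences("See eg. Boston mr. Garcia"))
```

With this sample list the first text comes out as three sentences and the second as one. The obvious weakness is a sentence that genuinely ends in an abbreviation ("Bring pens, paper, etc. Then sit down."), which gets glued to the next one, so a list narrows the ambiguity rather than removing it.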
I know about sentence-breaker, but I don't know about its quality. The test cases don't seem particularly challenging, and at first sight it seems to ignore things like colons, question marks, exclamation marks and imperfect use of punctuation. It'll probably perform better on professional, well-written texts and worse on real-world texts. For instance, they seem to deal with "... word", but not with "...word", "..word" or "....word". Their rule system is easily extendable, but does require individual rules for each exceptional case.
As far as I know, there is no scientific corpus or set of tests to compare these types of algorithms.