php-markdown
Emphasis not rendered when followed by a comma, period or semicolon + space
When trying to parse something like `hello *, this is emphasized* etc.`, the part between the asterisks doesn't get rendered. It happens at least when the first asterisk is followed by a comma, period, or semicolon and a space.
Interestingly, PHP Markdown is the only parser with that behavior, so I really ought to fix this. Thank you for the report.
I've tracked the problem down to this line https://github.com/michelf/php-markdown/blob/lib/Michelf/Markdown.php#L1252 where `,.;:` are not desired after a `*` or a `_`. I wonder why that is?
I'm trying to remember, but can't find the reason. A simple way to find out would be to remove them, make a pull request and look at the failing tests from the auto tester.
Are the tests run here on GitHub (via Travis), or shall I run them locally?
Here, using Travis. You can run them locally too.
I think I found the reason. It's to avoid misreading asterisks that are meant to be literal asterisks in situations like this:
This is an asterisk*. That is *emphasis*.
Many implementations get it right, including GitHub:
This is an asterisk*. That is emphasis.
While this works in PHP Markdown too, the way it's implemented breaks other cases.
The basic idea to make that work is that the opening asterisk can be anywhere but at the end of a word, and so we check for whitespace after the asterisk. But since words are often followed by punctuation, punctuation counts as whitespace in this situation. That makes sense in the general case, but not in your example where there is no word preceding the punctuation. So I think the fix would be to not count punctuation as whitespace if the asterisk is preceded by whitespace.
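The rule described above, and the proposed fix, can be sketched roughly as follows. This is only an illustration in Python, not PHP Markdown's actual code: the function name `can_open_emphasis`, the `proposed_fix` flag, and the `PUNCTUATION` constant are all made up for the example, and the check is reduced to a character-level test instead of a regex.

```python
# Hypothetical, simplified model of the "can this asterisk open emphasis?" check.
# It is NOT PHP Markdown's implementation; names and structure are assumptions.
PUNCTUATION = ",.;:"


def can_open_emphasis(text: str, i: int, proposed_fix: bool = False) -> bool:
    """Return True if the marker at text[i] may open an emphasis span."""
    rest = text[i + 1:]
    # Treat the start of the text like whitespace before the marker.
    prev = text[i - 1] if i > 0 else " "

    # A marker followed by whitespace (or nothing) never opens emphasis.
    if not rest or rest[0].isspace():
        return False

    # Punctuation followed by whitespace is treated like whitespace.
    punct_then_space = rest[0] in PUNCTUATION and (
        len(rest) == 1 or rest[1].isspace()
    )
    if punct_then_space:
        # Current behavior: always rejected.
        # Proposed fix: rejected only when the marker ends a word,
        # i.e. accepted when the marker is preceded by whitespace.
        return proposed_fix and prev.isspace()

    return True


reported = "hello *, this is emphasized* etc."
literal = "This is an asterisk*. That is *emphasis*."

print(can_open_emphasis(reported, reported.index("*")))                     # False
print(can_open_emphasis(reported, reported.index("*"), proposed_fix=True))  # True
print(can_open_emphasis(literal, literal.index("*"), proposed_fix=True))    # False
```

Under these assumptions the reported case starts opening emphasis once the fix is enabled, while the literal asterisk at the end of "asterisk" is still left alone because it is preceded by a word character.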
Sounds about right. In our case, though, it was enough to remove the punctuation from the regexes: our Markdown is derived from a rich text editor, so literal asterisks are already escaped.
I can confirm this behavior with the Markdown text `**; )**`, which doesn't get transformed into `<strong>; )</strong>`.
Other unsuccessful tries:
- `** ;)**`
- `**;) **`

Successful try:
- `**;)**`
Hope it helps.
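For anyone who wants to reproduce the four cases above outside PHP, here is a small Python sketch. The two patterns are simplified stand-ins and not the regexes from `Markdown.php`: `STRICT` assumes the opening `**` is rejected when followed by optional `,.;:` punctuation plus whitespace, `RELAXED` drops the punctuation part as suggested earlier in this thread, and both assume the closing `**` must be preceded by a non-whitespace character. The names `STRICT`, `RELAXED` and `render_strong` are made up for this example.

```python
import re

# Hypothetical simplified patterns, NOT PHP Markdown's actual regexes.
# Opening "**" must not be followed by optional punctuation plus whitespace.
STRICT = re.compile(r"\*\*(?![.,:;]?\s)(.+?)(?<=\S)\*\*")
# Same pattern with the punctuation removed from the lookahead.
RELAXED = re.compile(r"\*\*(?!\s)(.+?)(?<=\S)\*\*")


def render_strong(text: str, pattern: re.Pattern = STRICT) -> str:
    """Replace **...** spans accepted by `pattern` with <strong> tags."""
    return pattern.sub(r"<strong>\1</strong>", text)


for sample in ["**; )**", "** ;)**", "**;) **", "**;)**"]:
    print(repr(sample), "->", repr(render_strong(sample)))

# With the punctuation removed from the lookahead, the first case renders.
print(render_strong("**; )**", RELAXED))
```

In this model only `**;)**` is converted by `STRICT`, which matches the results listed above, and `RELAXED` additionally converts `**; )**`. The two tries with a space directly inside the markers stay unconverted in both variants.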
Related to https://github.com/friendica/friendica/issues/6938