organice
organice copied to clipboard
Too eager parsing of italic markup
*** TODO organice parser error: Foo/Bar/Baz
Foo/Bar/Baz
Is shown as:
I think that's a known bug and applies for all kind of markup. I think I remember that, once I worked on the regex parser, I decided that it's too hard to fix and postponed it to the future clojure parser :) But maybe a fresh look on the regex parser allows you to fix it.
See also #94 and #198
I don’t think it’s a duplicate. The other issues are about the parser choking on lines containing multiple inline markup statements.
This issue is much more simple: Foo/Bar/Baz
is just text. It shouldn’t be italicized at all^^
Yep, the issues linked above are only related. It's probably the same regex.
Anyway, the problem applies to all kind of markup, e.g. Foo+Bar+Baz
You are correct, it does apply to all kind of inline markup. I've written a test and looked at the Regexp. But I don't understand why it doesn't work as written at the moment(;
Worse example with only one slash:
“organice supports parsing and preserving the minimum/maximum range timestamps.” becomes:
I did not look at the proper Org syntax, but what if we just required a whitespace before inline markup?
Thus, replace [*/~=_+]
with \s[*/~=_+]
?
@lechten Your proposal fixes some terms like foo/bar
and foo/bar/baz
, but it breaks terms like *bold*
. I tried adding to your proposal with [^\s][*/~=_+]
, but that's not working as expected. I must make a trivial mistake here^^
Partial update: The ^
negates the \s
, of course(; But [\s^]
and [\^\s]
also don't work.
The ^
does not work inside square brackets (where it is just another character). We might add (^|\s)
but this adds another group, which seems to ask for lots of subsequent changes...
hihi, we're looking at the same options currently. I'm checking if there's a way for a match group to basically be ignored. But that's probably against the whole idea of match groups.
There is a way! ((?:\s|^)?[*/~=_+])
looks promising.
Looked promising and passes all tests, but it breaks foo/bar/baz
, again-_-
((?:^|\s+)[*/~=_+])
is most promising atm, but it fails *bold*
. (There's blanks before the *bold*
term, GH just opts to not render them).
I think ((?:^|\s+)[*/~=_+])
would actually be a good solution, however the match group just direct before (([\s({'"]?)((?:^|\s+)[*/~=_+])
also has a \s
which interferes...
Welcome to regexp based parsing hell(;
I hope that PR #910 fixes this now (one more commit not mentioning this, sorry).
Closed by the amazing work of @lechten in #910 :fireworks: :confetti_ball: