
Codeblocks should not be parsed

Open anakojm opened this issue 2 years ago • 5 comments

obsidian-to-hugo wrongly converts the following:

```python
if foo==bar and foo==baz:
    L = [[12,42],[13,90]]
```

to

```python
if foo<mark>bar and foo</mark>baz:
    L = [12,42],[13,90]({{< ref "12,42],[13,90" >}})
```

Codeblocks should instead be skipped to prevent such false positives (I have no idea how to implement this).

anakojm avatar Jan 26 '23 02:01 anakojm

Hey @anakojm

I guess regex lookarounds should work for this use case. Ideally, they would narrow the regex down to only those matches that are not written between triple backticks.

If you would like to give it a shot, feel free to add a test case for this in the md marks test suite.

devidw avatar Jan 28 '23 19:01 devidw

I am willing to try, but one problem I am facing is that I can't use something like r"(?<!^```.*?$).*?==([^=\n]+)==.*?(?!^```$)" with the g, s and m flags, because Python's re module does not support it: re.error: look-behind requires fixed-width pattern.
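
The limitation can be reproduced in isolation. Below is a minimal check using a shortened form of the pattern above, with the backtick run written as a `{3}` quantifier (it matches the same text). The helper name `compile_error` is made up for this illustration.

```python
import re

# Shortened form of the pattern from this comment.
# The look-behind contains .*?, so its width is not fixed.
VARIABLE_WIDTH = r"(?<!^`{3}.*?$)==([^=\n]+)=="

# A look-behind over exactly three backticks has a fixed width.
FIXED_WIDTH = r"(?<!`{3})==([^=\n]+)=="


def compile_error(pattern: str):
    """Return the re.error message for pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


print(compile_error(VARIABLE_WIDTH))
print(compile_error(FIXED_WIDTH))
```

The fixed-width variant compiles, but it only inspects the three characters directly before the match, so it cannot tell whether a line sits inside a fenced block.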

I think you would be better off dealing with this issue as I lack experience in the matter.

Also, why did you restrict the issue to the marks processor? The issue affects the wikilinks parser too, as shown by my example.

In the meantime, I have written test cases, should I PR them? Maybe in another branch?

anakojm avatar Jan 29 '23 00:01 anakojm

Alright I see

> Also, why did you restrict the issue to the marks processor? The issue affects the wikilinks parser too, as shown by my example.

Good point, I overlooked the change in the second line of the example 🙈

If we want to point out the issue clearly and avoid misunderstandings, we can use the diff block on GH 😉

```diff
- if foo==bar and foo==baz:
+ if foo<mark>bar and foo</mark>baz:
-    L = [[12,42],[13,90]]
+    L = [12,42],[13,90]({{< ref "12,42],[13,90" >}})
```

> In the meantime, I have written test cases, should I PR them? Maybe in another branch?

Cool, yes that would be awesome, maybe an extra branch like bug-codeblocks

devidw avatar Jan 29 '23 06:01 devidw

This might do the trick (the scoped inline flags need Python 3.6 or later):

    wiki_link_regex = r"(?ms:```.*?```)|\[\[(.*?)\]\]"
    for match in re.finditer(wiki_link_regex, text):
        if not match.group(1):
            continue
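
For anyone who wants to try the idea, here is a self-contained sketch of the same alternation used with `re.sub`. The function name `replace_wiki_links` is hypothetical, and the output shape is inferred from the example at the top of this issue, not taken from the obsidian-to-hugo source. The backtick run is written as a `{3}` quantifier, which matches the same text as three literal backticks.

```python
import re

# First alternative: a whole fenced block (no capture group).
# Second alternative: a wikilink, with its target captured in group 1.
WIKI_LINK_REGEX = re.compile(r"(?ms:`{3}.*?`{3})|\[\[(.*?)\]\]")


def replace_wiki_links(text: str) -> str:
    """Rewrite [[target]] outside fenced code; leave fenced code untouched."""

    def _replace(match):
        target = match.group(1)
        if not target:
            # Either a fenced block matched (group 1 is None) or the link
            # was empty; in both cases keep the matched text verbatim.
            return match.group(0)
        # Output shape inferred from the example in this issue (assumption).
        return f'[{target}]({{{{< ref "{target}" >}}}})'

    return WIKI_LINK_REGEX.sub(_replace, text)


FENCE = "`" * 3
sample = "See [[Note]].\n" + FENCE + "python\nL = [[12,42],[13,90]]\n" + FENCE + "\n"
print(replace_wiki_links(sample))
```

This only protects fenced blocks that are properly closed. Inline code spans and indented code blocks would still be rewritten.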

vonloxley avatar Jan 27 '24 15:01 vonloxley

It might work, but I believe the problem is more fundamental: we can't reliably parse Markdown with regex, since Markdown is not a regular language.
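
One possible alternative, short of a full Markdown parser, is to walk the text line by line, track whether we are inside a fenced block, and only run the processors on lines outside one. The sketch below assumes that approach. The names `apply_outside_fences` and `highlight_marks` are hypothetical and not part of obsidian-to-hugo; the mark pattern is the core of the regex quoted earlier in this thread.

```python
import re
from typing import Callable

# An opening or closing fence: up to three spaces, then 3+ backticks or tildes.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")
MARK_RE = re.compile(r"==([^=\n]+)==")


def highlight_marks(line: str) -> str:
    """Turn ==text== into <mark>text</mark> on a single line."""
    return MARK_RE.sub(r"<mark>\1</mark>", line)


def apply_outside_fences(text: str, transform: Callable[[str], str]) -> str:
    """Apply transform to every line that is not inside a fenced code block."""
    out = []
    open_fence = None  # marker that opened the current block, if any
    for line in text.splitlines(keepends=True):
        match = FENCE_RE.match(line)
        if open_fence is None:
            if match:
                open_fence = match.group(1)  # entering a fenced block
                out.append(line)
            else:
                out.append(transform(line))
        else:
            marker = match.group(1) if match else ""
            # A closing fence uses the same character, is at least as long
            # as the opener, and has nothing else on the line.
            if (marker and marker[0] == open_fence[0]
                    and len(marker) >= len(open_fence)
                    and line.strip() == marker):
                open_fence = None
            out.append(line)
    return "".join(out)
```

This still ignores inline code spans and indented code blocks, and it only suits processors that work on one line at a time, so it narrows the problem without solving it in general.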

anakojm avatar Jan 29 '24 02:01 anakojm