Twig icon indicating copy to clipboard operation
Twig copied to clipboard

markdown_to_html different/incorrect behavior with leading line-break

Open Sarke opened this issue 2 years ago • 4 comments

I thought I was going mad at one point. I've simplified the issue to the following code:

$loader = new \Twig\Loader\ArrayLoader([
    'index' => '{{ text|markdown_to_html }}',
]);
$twig = new \Twig\Environment($loader);
$twig->addExtension(new MarkdownExtension());

$twig->addRuntimeLoader(new class implements RuntimeLoaderInterface {
    public function load($class) {
        if (MarkdownRuntime::class === $class) {
            return new MarkdownRuntime(new DefaultMarkdown());
        }
    }
});

$markdown = <<<MD
Paragraph 1

Paragraph 2

1. First

2. Second
MD;

echo $twig->render('index', ['text' => $markdown]);

echo "\n------\n\n";

// only difference is the leading line-break
$markdown = "\n" . $markdown;

echo $twig->render('index', ['text' => $markdown]);

Output

<p>Paragraph 1</p>

<p>Paragraph 2</p>

<ol>
<li><p>First</p></li>
<li><p>Second</p></li>
</ol>

------

<p>Paragraph 1<br />
Paragraph 2<br />
1. First<br />
2. Second</p>

I originally noticed it when I was getting the second (incorrect) results with this, which seems harmless enough.

{% apply markdown_to_html %}

Title

Similar effects with other markdown libraries. Note that passing a leading line-break to the markdown library directly (without Twig) works as expected.

michelf/php-markdown               1.9.1
twig/markdown-extra                v3.3.8
twig/twig                          v3.3.9

Sarke avatar Apr 14 '22 13:04 Sarke

The filter has a behavior to remove the indentation: https://github.com/twigphp/Twig/blob/b4d6723715da57667cca851051eba3786714290d/extra/markdown-extra/MarkdownRuntime.php#L25-L28

This seems to be what breaks when the markdown start with an empty line, because it will then remove the empty line between Paragraph 1 and Paragraph 2

stof avatar Apr 14 '22 13:04 stof

Well that doesn't seem like a good way of doing it. Not only for my issue of no indentation, but markdown can legitimately start with an indented code block, followed by non-indented paragraphs, etc.

A workaround for now would be to do {{ text | trim | markdown_to_html }}, which solves my problem, but it wouldn't work with the mentioned indented code block.

Sarke avatar Apr 15 '22 07:04 Sarke

Maybe Twig could borrow the approach I implemented in my colinodell/indentation library? It basically:

  1. Checks the indentation line-by-line to see if all lines have a common amount and type of indentation.
  2. We abort if any line is found without indentation, or if there's a mix of tabs for some lines and spaces for others
  3. Once we know the common indentation we strip that from all lines

Here are some test cases showing the results.

I don't know if this approach is the most efficient but it seems fairly reliable.

colinodell avatar Apr 16 '22 15:04 colinodell

Maybe Twig could borrow the approach I implemented in my colinodell/indentation library?

Maybe. I'm not sure such an intricate solution is required though.

  1. Checks the indentation line-by-line to see if all lines have a common amount and type of indentation.

There's a couple of differences here. I ignore empty lines, since many IDEs tend to remove whitespace on empty lines. I also treat tabs as 4 spaces, since this is how Markdown sees them.

  1. We abort if any line is found without indentation, or if there's a mix of tabs for some lines and spaces for others

Thanks, I added the early return.

Here are some test cases showing the results.

Thanks. We could add some tests for the indentation detection, but I think the Markdown tests cover it. I added a comparison between the Twig output (after the indentation has been removed and the Markdown converted to HTML) and the direct Markdown output from the Markdown library in use.

Sarke avatar Apr 19 '22 17:04 Sarke