kramdown
kramdown copied to clipboard
Should kramdown treat `dd` elements like other list entries?
Not sure whether this is intended behavior. It certainly irritates me:
$ kramdown <<EOF
$ item
$ : text
$
$ text
$ EOF
<dl>
<dt>item</dt>
<dd>text
<p>text</p>
</dd>
</dl>
A multi-paragraph definition does not wrap <p>
around the first paragraph. As a consequence, the first paragraph does not get the margin-bottom
that <p>
is often styled with, as e.g. in the minima
theme used in a default Jekyll setup. Which is how I noticed the issue.
Fortunately, the documentation specifies a way to explicitly request paragraph wrapping, by preceding the definition with a blank line:
$ kramdown <<EOF
$ item
$
$ : test
$
$ test
$ EOF
<dl>
<dt>item</dt>
<dd>
<p>test</p>
<p>test</p>
</dd>
</dl>
But for other lists that handling is automatic. To achieve the same for definition lists, I added the following:
--- a/lib/kramdown/parser/kramdown/list.rb
+++ b/lib/kramdown/parser/kramdown/list.rb
@@ -201,6 +201,7 @@ module Kramdown
elsif @src.check(EOB_MARKER)
break
elsif (result = @src.scan(content_re)) || (!last_is_blank && (result = @src.scan(lazy_re)))
+ item.options[:first_as_para] ||= last_is_blank
result.sub!(/^(\t+)/) { " "*($1 ? 4*$1.length : 0) }
result.sub!(indent_re, '')
item.value << result
If I understand it correctly, the patched elsif
branch is entered only when a dd
entry is continued with a non-blank source line.
This patch only changes the output in cases where it previously contained a mixture of non-wrapped inline text adjacent to block elements, which supposedly should be avoided. The explicit wrapping form is still possible, but only necessary for single-paragraph items with this patch.
The kramdown documentation features 84 places where this change has an effect. I have made a diff of the resulting htmldoc
.
Unfortunately, the htmldoc
CSS uses margin-top
for <p>
which makes multi-paragraph definitions look bad: The definition term is spaced closer to the preceding text than to its own definition entry. But that is due to the unconditional style rule. Here is a quick fix:
--- a/doc/_design.scss
+++ b/doc/_design.scss
@@ -157,6 +157,10 @@ body {
font-weight: bold;
}
+ dd > p:first-child {
+ margin-top: 0;
+ }
+
a {
color: $link-color;
text-decoration: underline;
But note that this style change also applies to explitly paragraphed definition lists. Compare the explicitly paragraphed example in the documentation with the output after the style change. I consider this an improvement.
Yet I cannot rule out the possibility that I am trying to fix something which is not broken. Perhaps the previous behavior has all been intended to provide users with a way to mix inline with block content in <dd>
elements? If so, please point me to some rationale. Thanks.
It should be noted that this change breaks previous CSS-based workarounds. If users never put empty lines before definitions and put the following kludge in their (S)CSS:
/* Fix vertical spacing in <dd>text<p>More text</p></dd> */
dd > p:first-child { margin-top: $parskip; }
Then they would have to remove that now (or change $parskip
to 0
explicitly) because that rule also applies when there is no text before the first <p>
child.
I find that this behavior is actually specified (source). However, my impression is that this behavior might be the reason for style rules like the margin-top
of blocks and countermeasures like negative margin-bottoms
of headers. Apparently, the CSS has been made such that authors can avoid putting empty lines before their definitions. When playing with the CSS, I find that one can get along nicely without margin-top
and negative margin-bottom
just by using the proposed parser change. This would also render the CSS quickfix obsolete.
But since it is documented, the proposed (one-line) parser change should probably be made optional.
I have introduced a boolean option :auto_dd_para
for the proposed parser change so that you can experiment a bit. Since it is off by default and not activated for the webgen
-based documentation, I have reverted the CSS quickfix. Nevertheless, to get rid of the following webgen
warning, I have replaced a color expression in doc/_design.scss
.
DEPRECATION WARNING on line 166 of /_design.scss:
The operation `#1666A3 div 2` is deprecated and will be an error in future versions.
Consider using Sass's color functions instead.
http://sass-lang.com/documentation/Sass/Script/Functions.html#other_color_functions
By the way, the option handling framework is really convenient. No duplicated descriptions! Thanks for that.
I'd like your input on whether the proposed option semantics should, or should not, alternatively focus on that line and semantic criteria like it.children.size
rather than on lexical criteria as currently.
By the way, I get 2838 test runs without MathjaxNode tests whereas the CI only does 2818 test runs with MathjaxNode tests included. The numbers are for Ruby 2.4. No obvious clue what the CI skips.
Thanks for all the input - however, I need some time to think through all of it.
Merged changes from master, then rebased and squashed.
Changed :auto_dd_para
criterion to a semantic version (several block children) instead of the lexical one (contains blank line followed by something else). This also handles examples like
item
: text
^
text
While I was at it, I noticed that a :dd
item's .options[:first_as_para]
is only conditionally cleaned up. I have changed that to clean it up unconditionally.
This needs review. I have assumed that the children (created by parse_blocks
) are all nontrivial blocks, so it.children.size > 1
tells me all that is needed. Maybe that is wrong.
(Actually, counting initial blank blocks is OK with me, since these can reasonably be interpreted as request for paragraph wrappings. That's the idea behind :auto_dd_para
.)
I have verified that the generated htmldoc
matches that of master
except for the additional documentation. No changes to the tag layout because the kramdown documentation does not use :auto_dd_para
.
I have also removed the color function change to doc/_design.scss
. Firstly because it was unrelated (goes to a separate PR), secondly because it was wrong (the resulting color changed to black with that change).
Added more tests for the :auto_dd_para option. Take a look at test/testcases/block/13_definition_list/auto_dd_para.text
to get a glimpse of the current semantics:
- Plain
para
is wrapped in<p>
regardless of the state of the:auto_dd_para
option. -
para-iff-auto
is wrapped in<p>
if and only if the:auto_dd_para
option is set. -
inline
is never wrapped in<p>
.
You might notice that an item8
is missing. That was an interesting case causing a test failure in the text-to-kramdown-to-html path. That's probably a bug worth a separate issue.
I have looked at your changes and I agree that such an option might be useful. However, I'm always cautious when the parser is involved since the generated AST should be the same for all kramdown documents regardless of set options (except the for the few parser options there are). And this option would generate a different AST.
Yes, it is a parser option affecting the AST. It deliberately changes the grammar in order to get rid of (presumably unintented) mixed block/inline sequences. It does not affect how output converters process the AST. Therefore, identical ASTs produce the same output as before.
I'll have to update the wording in the specification to more closely match the implementation. The current (semantic) implementation is said to take effect when there are several blocks. In more visual terms, it actually takes effect when there are block boundaries before the end, which I think is the most intuitive or least surprising behavior, and perhaps the only one that thoroughly gets rid of mixed block/inline sequences. If you can think of any counterintuitive syntax cases enabled by the current implementation, I'd like to hear about that.
Just noticed the following:
$ kramdown <<EOF
$ * text
$ ^
$ text
$ EOF
<ul>
<li>text
<p>text</p>
</li>
</ul>
That is, other list types actually use the lexical criterion (need blank line) instead of the semantic one (contains block boundaries).
For uniformity, :auto_dd_para
should then switch back to the lexical criterion.
But if the primary purpose is to eliminate inline/block sequences, then all list types should be switched to the semantic criterion. Then :auto_dd_para
should be :auto_para
, and the parser would have to be patched at several places. (Well... two places.)
An alternative implementation of :auto_para
would post-process the AST and remove the :transparent
option from paragraphs that are followed by block elements. That is thorough but adds overhead.
I have time this weekend, will take a closer look and then decide. New release will also probably be done.
Thanks for your patience!
No need to hurry this one. This auto-para branch still needs work. The current state solves most inline+block issues with dd
elements, but as explained above, there are still cases left with other list types, so the option introduced here is subject to change. Work here will also need coordination with #486. I have worked on neither one recently. You can close this PR if you do not like it lingering around, I'll reopen it when it looks complete again.