debiman
Bullet list rendered as multiple <dl>s
Each bullet in the lists in capabilities(7) is rendered as a separate definition list, like this:
<dl>
<dt>*</dt>
<dd>…</dd>
</dl>
Instead, each list should be a single unordered list:
<ul>
<li>…</li>
…
</ul>
This seems to be an upstream issue with mandoc:
$ mandoc -Thtml /usr/share/man/man7/capabilities.7.gz
[…]
<dl class="Bl-tag">
<dt class="It-tag"><b>CAP_AUDIT_CONTROL</b> (since Linux 2.6.11)</dt>
<dd class="It-tag">Enable and disable kernel auditing; change auditing filter
rules; retrieve auditing status and filtering rules.</dd>
</dl>
Could you report it at http://mdocml.bsd.lv/contact.html please, or would you prefer if I relayed the report?
This is an inherent problem of converting the old man(7) language to HTML. The snippet from capabilities(7) is written with the .IP macro as follows:
.IP * 2
Bypass file read permission checks and
directory read and execute permission checks;
.IP *
Invoke
.BR open_by_handle_at (2).
Literally speaking, the mandoc output is correct, because the .IP macro is intended for definition lists. The thing is that with * as a header, it looks exactly like a bullet-point list in plain-text output, where the * is flushed into the left margin. The semantically correct solution is provided by the mdoc(7) language and its .Bl and .It macros.
To make existing manuals written in the man language more visually pleasing, I think it would be best to modify mandoc's HTML formatting to treat .IP *, .IP - etc. as unordered lists and produce <ul> tags instead of <dl> tags. Alternatively, you could do it in a post-processing phase or even style the dt and dd tags to appear on the same line, but you'd still need to distinguish the bullet definitions from other definitions.
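For illustration, the post-processing alternative could look roughly like the Python sketch below. It is only a sketch under assumptions: it presumes that each bullet arrives as its own single-item <dl> with the bare bullet character in the <dt>, as described in the original report, and it uses a regular expression rather than a real HTML parser. The names bullets_to_ul and BULLET_HEADS are made up for this example and are not part of mandoc or debiman.

```python
import re

# Heads assumed to mark a bullet item (illustrative set, not exhaustive).
BULLET_HEADS = {"*", "-", "\u2022"}

# A single-item <dl> whose <dt> is plain text. The body may not contain
# another <dt> or <dl>, so multi-item and nested lists are never matched.
ITEM = re.compile(
    r"<dl[^>]*>\s*<dt[^>]*>(?P<head>[^<]*)</dt>\s*"
    r"<dd[^>]*>(?P<body>(?:(?!<dt\b|</?dl\b).)*?)</dd>\s*</dl>",
    re.DOTALL,
)


def bullets_to_ul(html: str) -> str:
    """Merge runs of adjacent bullet-style <dl> items into one <ul>."""
    out = []      # pieces of the rewritten document
    pending = []  # bodies of the bullet run currently being collected
    pos = 0

    def flush():
        if pending:
            items = "\n".join(f"<li>{body}</li>" for body in pending)
            out.append(f"<ul>\n{items}\n</ul>")
            pending.clear()

    for m in ITEM.finditer(html):
        gap = html[pos:m.start()]
        is_bullet = m.group("head").strip() in BULLET_HEADS
        if gap.strip() or not is_bullet:
            # Anything other than whitespace between items ends the run.
            flush()
            out.append(gap)
        if is_bullet:
            pending.append(m.group("body").strip())
        else:
            out.append(m.group(0))  # ordinary definition: keep verbatim
        pos = m.end()
    flush()
    out.append(html[pos:])
    return "".join(out)
```

Definitions whose tag contains markup, such as the <b>CAP_AUDIT_CONTROL</b> entry quoted earlier, do not match the pattern at all and pass through unchanged.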
(As for contacting mandoc upstream, their contact page says that messages on all three mailing lists are publicly visible, but there is no link to a viewer. Do you know of one? I'd like to read some existing bug reports or discussions.)
cc @ischwarze
I agree there is room for improvement in mandoc, so i added an entry to my TODO list:
format ".IP *" etc. as <ul> rather than <dl>
I suspect that is feasible with a bit of heuristic inspection, but it's not completely trivial, so i'm not doing it right away, but i did mark it as relatively high priority: the impact is cosmetic, but the resulting ugliness is above average for a cosmetic issue.
In general, man(7) HTML formatting is less refined than mdoc(7) HTML formatting, and harder to implement nicely, but that's no excuse for not trying.
That said, i see a cosmetic issue with debiman as well. The upstream mandoc.css contains detailed CSS code to nicely format class "Bl-tag" lists, in particular to make sure that tags appear left of bodies if they fit, or above bodies otherwise - in fact, that's the part of mandoc.css that was hardest to tune. While debiman produces large amounts of CSS code - more than i would deem reasonable - this specific detail seems to be missing, resulting in ugly display of "Bl-tag" lists in general. In particular, the tags never seem to appear to the left of the respective body, not even if they are short.
Very similar case is in systemd.environment-generator(7) where the list is written as
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are executed sequentially in the alphanumerical order of the final component of their name\&. The output of each generator output is immediately parsed and used to update the environment for generators that run after that\&. Thus, later generators can use and/or modify the output of earlier generators\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are run by every manager instance, their output can be different for each user\&.
.RE
.PP
and the HTML version is
<div style="margin-left: 4.00ex;">•Generators are executed sequentially in
the alphanumerical order of the final component of their name. The output of
each generator output is immediately parsed and used to update the environment
for generators that run after that. Thus, later generators can use and/or
modify the output of earlier generators.</div>
<div style="height: 1.00em;"> </div>
<div style="margin-left: 4.00ex;">•Generators are run by every manager
instance, their output can be different for each user.</div>
<div class="Pp"></div>
which still looks rather ugly.
I finally implemented this feature request in: http://mandoc.bsd.lv/cgi-bin/cvsweb/man_html.c#rev1.173 The change will be contained in the next release, which will likely be called mandoc-1.14.5.
Here is an example with mandoc(1) from CVS HEAD:
$ mandoc -Thtml /co/linux-man-pages/man7/capabilities.7
<div class="Bd-indent">
<ul class="Bl-bullet">
<li>Bypass file read permission checks and directory read and execute
permission checks;</li>
<li>invoke <b>open_by_handle_at</b>(2);</li>
<li>use the <b>linkat</b>(2) <b>AT_EMPTY_PATH</b> flag to create a link to a
file referred to by a file descriptor.</li>
</ul>
</div>
Very similar case is in systemd.environment-generator(7)
That isn't similar at all and i think putting it into the same bugtracking ticket is very misleading.
where the list is written as
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are executed sequentially in the alphanumerical order of the final component of their name\&. The output of each generator output is immediately parsed and used to update the environment for generators that run after that\&. Thus, later generators can use and/or modify the output of earlier generators\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
Generators are run by every manager instance, their output can be different for each user\&.
.RE
.PP
That is man(7) code of such low quality that it is kind of a stretch to even call it "man(7)"; calling it "low-level roff(7) trickery" would be more to the point. Such low-level stuff definitely has no place in a manual page. People can't really expect to get input semantically translated to HTML when they rely on manual horizontal movements, moving left and right on the printing paper. HTML simply contains no facilities to represent such manual printing head movements, and a formatter has very little chance to guess what the semantic intention of the author might be.
Please report the manual page as broken upstream and tell upstream to properly use .IP macros and to not use \h escapes.
Mandoc rendering still is:
<div class="Bd-indent">•Generators are executed sequentially in the
alphanumerical order of the final component of their name. The output of each
generator output is immediately parsed and used to update the environment for
generators that run after that. Thus, later generators can use and/or modify
the output of earlier generators.</div>
I don't see any reasonable way how this could be improved.
That isn't similar at all
There is .IP \(bu, a sequence about as common as .IP *, which can be found even in GNU's roff(7) itself. If you handle .IP * specially, you might as well handle .IP \(bu. The other macros and escapes, which are ignored in the HTML conversion, don't make this case dissimilar.
Hi @lahwaacz ,
i agree that ".IP *" and ".IP \(bu" are similar, and that there is nothing wrong with using "\(bu" in manual pages, and indeed the patch i recently committed - see the "bob-beck pushed a commit to openbsd/src" right above - handles both.
What i meant with "isn't similar at all" was this horrible code from systemd.environment-generator(7):
\h'-04'\(bu\h'+03'\c
If you look closely, you will see that the ".IP \(bu 2.3" in that manual page is in an inactive .el clause: "ie n" is always true for manual pages (except when formatting with a real typesetter for PostScript or PDF output), so the .el clause is never entered.
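To make the branch selection concrete, here is a toy Python sketch of how such a conditional resolves. It is not how mandoc or any roff implementation is written. It only understands the block form ".ie n \{\ ... .\}" followed by ".el \{\ ... .\}" with the conditions n (nroff mode) and t (troff mode), and the names active_lines and SAMPLE are invented for this example.

```python
# The conditional from the manual page, kept verbatim in a raw string so
# that the trailing backslashes survive.
SAMPLE = r"""
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
""".strip().splitlines()


def active_lines(lines, nroff=True):
    """Return the lines from the branch that is actually taken."""
    out = []
    keep = True       # True outside any block or inside the active branch
    ie_result = None  # outcome of the latest .ie, consumed by the next .el
    for line in lines:
        s = line.strip()
        if s.startswith(".ie ") and s.endswith("\\{\\"):
            cond = s.split()[1]
            if cond not in ("n", "t"):
                raise ValueError(f"unsupported condition: {cond!r}")
            # "n" holds in nroff mode, "t" holds in troff mode.
            ie_result = (cond == "n") == nroff
            keep = ie_result
        elif s.startswith(".el ") and s.endswith("\\{\\"):
            if ie_result is None:
                raise ValueError(".el without a preceding .ie")
            keep = not ie_result
            ie_result = None
        elif s == ".\\}":
            keep = True  # block closed, back to unconditional text
        elif keep:
            out.append(line)
    return out
```

With nroff=True only the \h line is returned, and the .IP \(bu 2.3 request is returned only with nroff=False, which matches the observation that the formatter never sees the .IP macro here.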
Oh, in that case you're right. On closer look, systemd seems to use xsltproc to generate their man pages from XML.
Hi @lahwaacz ,
On closer look, systemd seems to use xsltproc to generate their man pages from XML
... and more specifically, from DocBook 4.2:
https://github.com/systemd/systemd/blob/master/man/systemd-environment-d-generator.xml
So no wonder the output is crap. DocBook is by far the worst and lowest quality file format you can pick for documentation. It is absolutely notorious for generating abysmal man(7) output as well as for being full of bugs and almost unmaintained.