Custom text style inside of a table cell is overridden with "Compact" style
While converting .html to .docx, I'm trying to customize the text style. It seems that the text is not styled properly when it is inside a table cell.
This HTML renders correctly:
<!-- example1.html -->
<div custom-style="Index 5">
Hello, world!
</div>
pandoc --print-default-data-file reference.docx > reference.docx
pandoc --reference-doc reference.docx --standalone --output output_$(date +%s).docx example1.html
This HTML does not render correctly:
<!-- example2.html -->
<table>
<tr>
<td>
<div custom-style="Index 5">
Hello, cruel world!
</div>
</td>
</tr>
</table>
pandoc --print-default-data-file reference.docx > reference.docx
pandoc --reference-doc reference.docx --standalone --output output_$(date +%s).docx example2.html
The expected output would preserve the "Index 5" style inside the table cell.
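To compare the two outputs without opening Word, one option is to list the paragraph styles that actually ended up in the .docx. The sketch below is only an illustration: the helper name `paragraph_styles` is made up here, and it assumes the file follows the usual layout with the body in `word/document.xml`. Note that it reports style IDs, which can differ from the display name shown in Word (for example, an ID without spaces).

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by elements such as w:pStyle.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_styles(docx_path):
    """Count the paragraph style IDs (w:pStyle) used in a .docx body.

    `docx_path` may be a filesystem path or a binary file-like object.
    Hypothetical helper for inspection only; not part of pandoc.
    """
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    counts = Counter()
    for p_style in root.iter(f"{{{W_NS}}}pStyle"):
        counts[p_style.get(f"{{{W_NS}}}val")] += 1
    return counts
```

Calling `paragraph_styles("output.docx")` (with whatever file name the conversion produced) on both results should make it visible whether the paragraph in the table cell carries the custom style or "Compact".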
Not sure if it's of any help, but I've located 2 things that might be related to this:
- A commit description had this to say:
  ...This change also allows nesting of custom styles; in order to do so, it removes the default "Compact" style applied to Plain blocks, except when inside a table
- A comment had this to say:
  ...It would be helpful to generate some tests (e.g. test/tables.native rendered to docx) and do a before/after, removing the Compact style inside the tables.
We'd have to understand why originally the exception was made inside tables. There must have been a reason for it. Perhaps @bensteinberg or @jkr would know more.
I think the discussion at #6072 may shed some light. When I started working on #5670, I believe I removed the compact style inside tables, as I thought it was incorrect/not necessary, but that turned out to be wrong. I'm afraid I've lost the thread on this a bit.
FWIW this also happens when a custom Table style is defined: the Compact style applied to text inside tables overrides the text formatting set by the Table style (e.g. the text color of the header row).
So even if we modify the "Table" style in the reference doc, its text formatting won't apply, because the "Compact" style is applied to the text in the cells.
To reproduce:
- Create a reference document where Table style is modified (e.g set header row to red)
- Use this reference document to convert a simple md doc
---
title: demo table
---
Table: Cars Means

 n()  mean(speed)  mean(dist)
---- ------------ -----------
  50         15.4       42.98
In the resulting doc, you can also try changing the Table style to one of Word's built-in styles: you'll see that the text formatting doesn't change, even with styles that should change the text color.
I had a specific case of this issue here https://github.com/rstudio/rmarkdown/issues/2242, thanks to @cderv. His main conclusion:
From the doc you sent, it seems that the correct table style is applied, but the text inside the table get the Compact Paragraph style that overwrites the style you set in the Table style
I see the problem, but I'm still not sure of the solution.
Here's what would happen if we did not use the Compact style on Plain elements inside table cells: (released pandoc on right, without-Compact on left)
[Image: Screen Shot 2021-11-03 at 9 02 24 AM]
A workaround might be to use a Lua filter that converts all Plain elements inside a table to Para.
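A minimal sketch of that workaround follows. It is written as a pandoc JSON filter in Python rather than as a Lua filter, purely so it can run with the standard library alone; the function names are made up for illustration. It assumes the usual JSON AST shape in which every element is an object with a `"t"` tag and an optional `"c"` payload, and it also rewrites Plain blocks in the table caption, which may or may not be wanted. Whether the resulting paragraphs then pick up the intended style in the .docx has not been verified here.

```python
import json


def plain_to_para(node, in_table=False):
    """Walk a pandoc JSON AST and turn Plain into Para inside tables.

    Plain blocks outside of any Table are left untouched.
    """
    if isinstance(node, list):
        return [plain_to_para(child, in_table) for child in node]
    if isinstance(node, dict):
        tag = node.get("t")
        if tag == "Plain" and in_table:
            # Same inline content, different block type.
            return {"t": "Para", "c": node["c"]}
        inside = in_table or tag == "Table"
        return {key: plain_to_para(value, inside) for key, value in node.items()}
    return node  # strings, numbers, None


def run_filter(source, sink):
    """Read a JSON document from `source`, rewrite it, write it to `sink`."""
    doc = json.load(source)
    doc["blocks"] = plain_to_para(doc["blocks"])
    json.dump(doc, sink)
```

To try it, save this in a script that ends with `import sys; run_filter(sys.stdin, sys.stdout)` and pass the script to pandoc with `--filter`. Keep in mind the caveat raised in this thread: without Compact, the normal paragraph spacing applies inside cells.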
IMHO having the compact style for tables is needed as a sane default; otherwise, the paragraph spacing of normal text styles will mess up the table visuals.
Then I understand this is a limitation: it won't be possible to change the Table style's text formatting using a reference doc as long as we need to impose this Compact style in cells. 🤔 It's too bad that this doesn't work when Word allows it. I am not entirely sure how all that works though.
I didn't mean to imply this is a limitation. I only expressed (not too clearly) that a default with a compact style is needed, either as it is today or with an improved pandoc that allows for custom styles in table cells. I don't know the implementation details so I have no idea if it's feasible. I see that Word lets you define the paragraph style for different table parts, but with explicit styling, not with existing named paragraph styles.
I am running into the same issue (using pandoc for HTML to DOCX conversion with a custom-reference.docx). I do not understand why we cannot have both: allow overriding with a custom table-cell style, with "Compact" as the default fallback. Is there a technical limitation preventing this?
I also wonder why we need to set the style to "Compact" at all. The table generated for ODT doesn't have any style information at all, which makes the editor use the default style.
A workaround might be to use a Lua filter that converts all Plain elements inside a table to Para.
Taking John's advice to heart, I cooked up a little Lua filter to set custom-style on the text in tables. Not the most sophisticated piece of code, but it might be a starting point for someone.
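As a rough illustration of the same idea (this is not the Lua filter mentioned above, and it is a Python JSON filter instead), the sketch below wraps each Plain or Para found inside a table in a Div carrying a `custom-style` attribute, converting Plain to Para on the way. The style name `Table Text` is a placeholder and would have to exist in the reference doc; the function names are likewise made up. Blocks already inside a Div with its own `custom-style` are converted but not wrapped again, and the table caption is treated like any other table content. How the docx writer renders the result is an assumption that should be checked against a real conversion.

```python
# Hypothetical style name; define a matching paragraph style in reference.docx.
TABLE_TEXT_STYLE = "Table Text"


def has_custom_style(div):
    """True if a Div node's attributes include a custom-style key."""
    key_values = div["c"][0][2]  # Attr is [identifier, classes, key-value pairs]
    return any(key == "custom-style" for key, _ in key_values)


def style_table_text(node, in_table=False, styled=False):
    """Walk a pandoc JSON AST and give table text a custom paragraph style."""
    if isinstance(node, list):
        return [style_table_text(child, in_table, styled) for child in node]
    if not isinstance(node, dict):
        return node
    tag = node.get("t")
    if tag in ("Plain", "Para") and in_table:
        para = {"t": "Para", "c": node["c"]}
        if styled:
            return para  # an enclosing Div already sets custom-style
        attr = ["", [], [["custom-style", TABLE_TEXT_STYLE]]]
        return {"t": "Div", "c": [attr, [para]]}
    in_table = in_table or tag == "Table"
    styled = styled or (tag == "Div" and has_custom_style(node))
    return {
        key: style_table_text(value, in_table, styled)
        for key, value in node.items()
    }
```

Applied to the `blocks` list of a document read from pandoc's JSON output, this leaves everything outside tables unchanged.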