pandoc
Insert RawBlock in LaTeX without forcing blank lines around the block
Consider the following Lua filter:
function Div(el)
  if el.classes[1] == "special" then
    -- insert element in front
    table.insert(
      el.content, 1,
      pandoc.RawBlock("latex", "\\begin{Special}"))
    -- insert element at the back
    table.insert(
      el.content,
      pandoc.RawBlock("latex", "\\end{Special}"))
  end
  return el
end
When running this filter on the following fenced div block:
::: special
some text
:::
the output for latex is:
\begin{Special}

some text

\end{Special}
However, the expected output would not contain the empty lines before and after "some text".
pandoc 2.9.2.1
Compiled with pandoc-types 1.20, texmath 0.12.0.2, skylighting 0.8.5
The empty lines can alter the output of some environments and should be avoided. If needed, the user can include the empty lines in the fenced div block.
The LaTeX writer inserts blank lines between all block-level elements. In principle we could change this so that raw blocks wouldn't get blank lines, but this change in behavior could break some current filters.
I just ran into the exact same issue (with a Python filter), and I do think this should be changed. Maybe it could be made an optional change with a configurable switch, to avoid breaking existing filters?
I've run into the same issue using a Lua filter. In some cases you can work around it by using pandoc.utils.blocks_to_inlines() to convert the blocks to inlines first and then append the LaTeX commands. Unfortunately this breaks if you try to convert a multi-line CodeBlock to a Code inline object.
In my case we have custom LaTeX environments for certain divs, and LaTeX is sensitive to blank lines there. I don't know a way to remove them from inside the LaTeX environment.
I agree with @gabindu that it would be nice to have a flag for RawBlock to not include blank lines above or below the block.
Any news on this topic?
I have been struggling with this issue almost since my first Lua filter.
When you deal with figures, minipages, etc. what you need is no blank lines before or after \begin{Special}. Same with \end{Special}
@chrisaga - I have used a workaround for \end by injecting a \setlength{\parskip}{-1em} before the \end environment, which "gobbles" the extra paragraph space and is sufficient for my use cases.
Something like this in your Lua script:
-- \setlength{\parskip}{-1em} to gobble the extra newline before End
pandoc.RawBlock("latex", "\\setlength{\\parskip}{-1em}\n\\End{" .. adm .. "block}")
Maybe it helps.
Thanks @priiduonu. Your workaround should work in some cases, when the issue is only "visual" (i.e. you need to avoid a blank line in the rendered output). In other cases it won't, because LaTeX will process the code differently. For instance
\begin{minipage}{0.3\linewidth}
Ceci est le contenu du premier \texttt{minipage}.
Il est positionné à gauche.
\end{minipage}\hfill
\begin{minipage}{0.3\linewidth}
Ceci est le contenu du deuxième \texttt{minipage}.
Il est positionné à droite.
\end{minipage}
Is totally different from
\begin{minipage}{0.3\linewidth}
Ceci est le contenu du premier \texttt{minipage}.
Il est positionné à gauche.
\end{minipage}\hfill

\begin{minipage}{0.3\linewidth}
Ceci est le contenu du deuxième \texttt{minipage}.
Il est positionné à droite.
\end{minipage}
The first code renders as two text blocks side by side, pushed against the margins, with empty space in the center of the page.
The second code renders the two blocks one above the other on the left of the page.
There is no blank line to gobble here.
I found a workaround based on redefining \par to do nothing. It's a little bit tricky in complex use cases. So this code
\let\savepar\par
\let\par\relax
\begin{minipage}{0.3\linewidth}
Ceci est le contenu du premier \texttt{minipage}.
Il est positionné à gauche.
\end{minipage}\hfill
\begin{minipage}{0.3\linewidth}
Ceci est le contenu du deuxième \texttt{minipage}.
Il est positionné à droite.
\end{minipage}
\let\par\savepar
renders like the first one. You are even allowed to have multiple paragraphs in the minipages, since \par seems to be reset to its original definition inside the environment.
Anyway, reading your post made me rethink the whole issue ... and I am quite convinced that we are addressing it the wrong way!
Let me clarify. I thought of it as if I were dealing with HTML: I want two blocks side by side.
But in my first code (with no blank line before the second minipage), LaTeX actually treats the minipages as INLINES, not BLOCKS, so they are on the same line, exactly like 'A' and 'B' are on the same line in A\hfill B.
So why not play by the rules and feed LaTeX a RawInline instead of a RawBlock when needed?
This filter - https://github.com/chdemko/pandoc-latex-environment - seems to use RawInline instead of RawBlock to avoid blank lines inside the LaTeX environment. But it still leaves a blank line between consecutive environments (which is probably the right thing to do in 99% of cases).
That would be the idea.
I just had a look and noticed that this filter uses either RawInline or RawBlock. Maybe it's the reason why you experience a blank line between consecutive environments.
I'll try to figure out a use case with minipages in a few days.
Well. It's tougher than I thought. Using RawInline is not enough.
I figured out that this minimal filter (note the '%' at the end of the RawInlines):
function Div (div)
  print(div.t)
  env = div.classes[1]
  return {pandoc.RawInline('latex', '\\begin{' .. env .. '}%')} ..
    div.content ..
    {pandoc.RawInline('latex', '\\end{' .. env .. '}%')}
end
would produce the following AST document structure (assuming basicenv is the name of the fenced div in the Markdown source):
Plain [ RawInline (Format "latex") "\\begin{.basicenv}%" ]
~ some content ~
Plain [ RawInline (Format "latex") "\\end{.basicenv}%" ]
That's right. We created inlines instead of blocks, so Pandoc automatically put them in Plain blocks. It makes sense, since inlines should be (I think) inside a block.
The issue is that the LaTeX writer handles them as if they were Para and outputs blank lines before and after.
In LaTeX paragraphs are delimited by blank lines before and after them. So "Blank line" + "Plain text" + "Blank line" = A paragraph.
According to Pandoc's documentation Plain is "Plain text, not a paragraph." https://pandoc.org/lua-filters.html#type-plain
So I would argue that Plain [ RawInline (Format "latex") "\\begin{.basicenv}%" ] should produce exactly \begin{.basicenv}% with no blank line before or after.
I think this is fixable with a custom LaTeX writer. A filter could do it too, but it would be messy, since it would have to do most of the job of a writer.
@jgm maybe you have an opinion on this?
Technically this is right: a Plain ideally shouldn't end with a blank line.
But currently we have this in the code:
https://github.com/jgm/pandoc/blob/main/src/Text/Pandoc/Writers/LaTeX.hs#L674-L677
The vsep creates a blank line between all blocks. Changing it to vcat would avoid the blank line, but then we'd have to make other changes to the code to make sure we get all the blank lines that are actually needed.
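To make the vsep/vcat difference concrete, here is a toy model in Python. It is only a sketch: the two functions below are hypothetical stand-ins that mimic the separators on plain strings, not pandoc's actual Haskell code, and the list of rendered blocks is taken from the example at the top of this thread.

```python
# Toy model only: these helpers imitate the two separators on plain strings.
# They are NOT pandoc's implementation; the names are borrowed for illustration.

def vsep(blocks):
    """Join rendered blocks with one blank line between each pair."""
    return "\n\n".join(blocks)


def vcat(blocks):
    """Join rendered blocks with a single newline, i.e. no blank line."""
    return "\n".join(blocks)


# The three blocks produced by the filter from the original report.
rendered = [r"\begin{Special}", "some text", r"\end{Special}"]

print(vsep(rendered))  # blank lines around "some text", as reported
print("---")
print(vcat(rendered))  # no blank lines at all
```

The second form is what the original report asks for, but it also shows why the change is not free: with a plain vcat, two ordinary paragraphs would run together as well, so the writer would have to add blank lines back wherever LaTeX really needs a paragraph break.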
Thanks for the clarification @jgm. I understand that there is no hope of having this fixed in Pandoc's LaTeX writer.
I have spent quite some time today trying to figure out a workaround. It's important because line endings and blank lines are meaningful in LaTeX code.
The next best thing would probably be a custom LaTeX writer where we could redefine just Writer.Block.Plain and use the standard writer for everything else. But this does not exist. As far as I understand custom writers, we have to define a writer function for every block and inline that can occur in the document.
The filter-based alternative is not very promising either, because we would have to walk the AST and render big chunks of LaTeX as RawBlocks around the places where we don't want blank lines. Too messy!
Post-processing with sed doesn't look that bad at this point :-D
So I guess I'll stick to pure ad-hoc TeX/LaTeX hacks based on \let\par\relax (see https://github.com/jgm/pandoc/issues/7111#issuecomment-2917284934) :-(
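For anyone who prefers the post-processing route mentioned above, here is a minimal sketch in Python rather than sed. The function name `tighten` and its parameters are made up for this example. It assumes the environment's \begin and \end each sit on their own line, and it only removes the blank lines directly after \begin{env} and directly before \end{env}; blank lines between paragraphs inside the environment, or between consecutive environments, are left alone.

```python
import re


def tighten(latex, env):
    """Drop blank lines just after the begin line and just before the end line of env.

    Hypothetical helper for post-processing pandoc's LaTeX output.
    """
    name = re.escape(env)
    # The begin line followed by one or more blank lines: keep a single newline.
    latex = re.sub(
        r"(\\begin\{" + name + r"\}[^\n]*)\n(?:[ \t]*\n)+",
        r"\1\n",
        latex,
    )
    # One or more blank lines before the end line: keep a single newline.
    latex = re.sub(
        r"\n(?:[ \t]*\n)+([ \t]*\\end\{" + name + r"\})",
        r"\n\1",
        latex,
    )
    return latex


source = "\\begin{Special}\n\nsome text\n\n\\end{Special}\n"
print(tighten(source, "Special"))
```

This is a text-level hack, so it shares the usual weakness of regex post-processing: it knows nothing about verbatim material or comments that happen to contain the same strings.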
I wouldn't say "no hope." It might not be too hard to fix it.
Good news! I would rather have Pandoc choose carefully where to output blank lines in LaTeX code, since they are often unnecessary and in many cases harmful. Lines containing only an empty comment (%) are perfect for code readability.
@chrisaga - Your idea of using RawInline instead of RawBlock helps to avoid blank lines inside the LaTeX environment (and is probably more correct semantically?).
A working solution can be found here: https://github.com/rstudio/rmarkdown/blob/main/inst/rmarkdown/lua/latex-div.lua#L50-L59
@jgm - maybe the LaTeX writer could be altered based on an optional class (e.g. .tight): if this class is applied to a Div, do not output blank lines around RawBlocks?
This would solve the problem of consecutive minipage environments (and possibly others) while not breaking current filters.
So it looks like @jgm worked on it 👍
I was investigating something too. Since I can neither write nor read Haskell code, I worked on a custom Lua LaTeX writer, hk_latex_basic_writer.lua, to fix the Plain issue.
Two more points:
- For Plain not to introduce a new paragraph in the LaTeX code, it needs to have no blank line above and below.
- I wouldn't be shocked if RawBlocks stayed as they are. After all, they are "blocks" and we have RawInlines as an alternative. Removing the blank lines above and below is not an issue either, since we can add them in the text if needed.
I built this little project to test my custom writer