Displaying a code block in multiple pages in PDF
Hello, the code block for one of my functions takes up more than one page, and the code is not shown in the PDF.
Hello @CarduCaldeira,
Did you manage to solve the issue? If not, could you give me more details about your configuration (theme, plugins used...), please?
Thank you
I encountered the same issue using Material for MkDocs. It seems that <pre> stops rendering newlines after the page break. PyMdown Extensions code blocks use a two-cell table to render line numbers (all line numbers in the first cell and all the content in the second); the issue is likely due to the PDF page break not carrying over all the CSS properties. The problem also affects code blocks without line numbers, which are rendered as only <div><pre><code>....
Below is my workaround. It hides the line numbers table cell and uses CSS and a custom attribute to add them back as ::before pseudo-elements. It also replaces all the newlines in the <code> node with <br> tags to compensate for them breaking after a page break. The padding-left: 1.2rem; CSS is to adjust for the amount of space allowed for the line numbers.
CSS:
@media screen {
    .highlight a[data-line-number]::before {
        display: none;
    }
}
@media print {
    /* Add the line number as a ::before pseudo element when using parse_code_linenums() */
    .highlight code a[data-line-number]::before {
        content: attr(data-line-number) !important;
        position: absolute;
        display: inline;
        right: -1em;
        visibility: visible;
        white-space: pre-line;
    }
    .highlight code a[data-line-number] + span:not(.hll),
    .highlight code a[data-line-number] + span.hll > span:first-child {
        padding-left: 1.2rem;
    }
    /* Reinforce pre behaviors in case child elements are separated from `<pre>` by a page break */
    .highlight pre * {
        white-space-collapse: preserve !important;
        word-break: normal !important;
    }
}
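The print rules above depend on every code-line anchor carrying a data-line-number attribute. As a rough, DOM-free sketch of how that value is derived, the snippet below pairs gutter links with line anchors using plain objects. The function name lineNumberAttributes and the sample ids are hypothetical and only illustrate the idea; the real work is done by parse_code_linenums().

```javascript
// Hypothetical, DOM-free model of the pairing step: plain objects stand in for
// the gutter links (`.linenos a`) and the per-line anchors (`a[id^="__codelineno"]`).
function lineNumberAttributes(gutterLinks, anchorIds) {
    // Map each gutter href (e.g. "#__codelineno-0-1") to its displayed number.
    const numberByHref = new Map(
        gutterLinks.map(({ href, text }) => [href, text.trim()])
    );
    // No gutter links means the block has no line numbers, so nothing is annotated.
    if (numberByHref.size === 0) {
        return {};
    }
    const attributes = {};
    for (const id of anchorIds) {
        // An empty string marks "numbered block, but no number found for this line".
        attributes[id] = numberByHref.get(`#${id}`) ?? '';
    }
    return attributes;
}

// Sample ids are made up for illustration.
const sample = lineNumberAttributes(
    [
        { href: '#__codelineno-0-1', text: ' 1 ' },
        { href: '#__codelineno-0-2', text: '2' },
    ],
    ['__codelineno-0-1', '__codelineno-0-2', '__codelineno-0-3']
);
console.log(sample);
```

The third anchor has no matching gutter link, so it gets an empty value. It still matches the a[data-line-number] selector, which keeps the padding consistent across the block.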
JavaScript:
/**
 * An interface with the MkDocs Exporter plugin.
 */
window.MkDocsExporter = {
    /**
     * Render the page...
     */
    render: async () => {
        parse_code_linenums();
    }
};
// Copies the line numbers from the `linenos` table cell into a `data-line-number` attribute
// on each code line anchor, so CSS can display them using ::before instead
// (the `linenos` cell itself is hidden by the print CSS)
function parse_code_linenums() {
    const code_blocks = document.querySelectorAll('.highlight');
    code_blocks.forEach(code_block => {
        const codeLinks = code_block.querySelectorAll('a[id^="__codelineno"]');
        const linenoLinks = code_block.querySelectorAll('.linenos a');
        // Create a map of href to link text for the .linenos links within the current table
        const linenoMap = {};
        linenoLinks.forEach(link => {
            const href = link.getAttribute('href');
            const text = link.textContent.trim();
            linenoMap[href] = text;
        });
        const lines_cell = code_block.querySelector('.linenos');
        // replace newlines in code block with `<br>` to fix newlines breaking after a page break
        const code = lines_cell
            ? code_block.querySelector('.code code')
            : code_block.querySelector('code');
        if (code) {
            replaceNewlinesWithBr(code);
        }
        let maxLength = 1; // currently unused
        codeLinks.forEach(link => {
            const href = `#${link.id}`; // Create href to match the link id
            if (linenoLinks.length) {
                if (linenoMap[href]) { // Check if corresponding href exists in the same table's linenos
                    link.style.position = 'relative'; // Positioning for the ::before element
                    link.setAttribute('data-line-number', linenoMap[href]); // Set data attribute for line number
                    maxLength = linenoMap[href].length; // save the length of the last number
                } else {
                    link.setAttribute('data-line-number', ''); // Add a blank attribute to let CSS know when there are line numbers
                }
            }
        });
        codeLinks.forEach(link => {
            // find the span element after the line number link; no style is set here,
            // the spacing for the added numbers comes from the `padding-left` rule in the print CSS
            let sibling = link.nextElementSibling;
            if (sibling) {
                if (sibling.classList.contains('hll')) {
                    sibling = sibling.firstElementChild; // target the content span, not the highlighting
                }
            }
        });
    });
}
// replace all newline characters in `element`'s text nodes with `<br>` tags
function replaceNewlinesWithBr(element) {
    // Get all child nodes of the element
    const childNodes = Array.from(element.childNodes);
    // Iterate over each child node
    childNodes.forEach(node => {
        if (node.nodeType === Node.TEXT_NODE) {
            // Split the text content by newline characters
            const textParts = node.nodeValue.split('\n');
            // Create a document fragment to hold new nodes
            const fragment = document.createDocumentFragment();
            textParts.forEach((part, index) => {
                // Create a text node for the part
                const textNode = document.createTextNode(part);
                fragment.appendChild(textNode); // Append the text node
                // If this is not the last part, add a <br>
                if (index < textParts.length - 1) {
                    const br = document.createElement('br');
                    fragment.appendChild(br);
                }
            });
            // Replace the original text node with the new content
            node.parentNode.replaceChild(fragment, node);
        }
    });
}
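To make the splitting step easier to follow, here is a small DOM-free sketch of what replaceNewlinesWithBr() does to a single text node. The helper name splitAtNewlines is hypothetical, and the string '<br>' stands in for a real <br> element. Note that the real function only visits direct child text nodes of the element it is given.

```javascript
// Hypothetical helper: returns the sequence of pieces that would replace one
// text node, with the string '<br>' standing in for a real <br> element.
function splitAtNewlines(text) {
    const pieces = [];
    text.split('\n').forEach((part, index, parts) => {
        // Every part becomes a text piece, even when it is empty.
        pieces.push(part);
        // A <br> goes between parts, but not after the last one.
        if (index < parts.length - 1) {
            pieces.push('<br>');
        }
    });
    return pieces;
}

console.log(splitAtNewlines('x = 1\ny = 2\n'));
// prints [ 'x = 1', '<br>', 'y = 2', '<br>', '' ]
```

A trailing newline leaves an empty final piece, which corresponds to an empty text node in the real fragment and renders as nothing.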
Another related issue with code blocks that cross page breaks is that the code content will always try to start at the top of the next page, leaving the title (filename) element floating alone at the end of the previous page. Strangely, this element resists changes from JavaScript, so a Python hook is needed to implement a workaround.
import copy
from typing import Union

from bs4 import BeautifulSoup
from mkdocs.config.defaults import MkDocsConfig
from mkdocs.structure.files import Files
from mkdocs.structure.pages import Page


def _moveCodeFilename(parsed_html: BeautifulSoup):
    """
    Copies the title (filename) of code blocks into the <code> section so it may be rendered closer to the code block

    :param parsed_html: The parsed HTML to process.
    :type parsed_html: BeautifulSoup
    :return: The altered HTML
    :rtype: BeautifulSoup
    """
    code_blocks = parsed_html.find_all(class_='highlight')
    for code_block in code_blocks:
        header_span = code_block.select_one('span.filename')
        # Despite the name, this is the <code> element inside <pre>
        pre_tag = code_block.select_one('.highlight > pre code')
        if not pre_tag:
            # Fall back to the table layout used when line numbers are enabled
            pre_tag = code_block.select_one('.highlight .code pre code')
        if header_span and pre_tag:
            # Insert a copy of the header at the start of the code;
            # the original header stays in place and is hidden by the print CSS
            header_copy = copy.deepcopy(header_span)
            pre_tag.insert(0, header_copy)
    return parsed_html


def on_page_content(html: str,
                    page: Page,
                    config: MkDocsConfig,
                    files: Files) -> Union[str, None]:
    """
    The `page_content` event is called after the Markdown text is rendered to
    HTML (but before being passed to a template) and can be used to alter the
    HTML body of the page.

    Args:
        html: HTML rendered from Markdown source as string
        page: `mkdocs.structure.pages.Page` instance
        config: global configuration object
        files: global files collection

    Returns:
        HTML rendered from Markdown source as string
    """
    # Parse the HTML content
    parsed_html = BeautifulSoup(html, 'html.parser')
    parsed_html = _moveCodeFilename(parsed_html)
    # Return the modified HTML
    return str(parsed_html)
Since this affects the HTML of the actual website, we need CSS to control the rendering of the added filename element:
@media screen {
    .highlight code .filename {
        display: none !important;
    }
}
@media print {
    .highlight > .filename,
    .highlight th {
        display: none !important;
    }
    .highlight .linenos {
        display: none;
    }
    .md-typeset *:not(h3) + h4 {
        margin-top: 1.5rem;
    }
    .highlight code .filename {
        font-family: 'Classico URW T OT', 'Noto Serif JP', serif;
        font-size: .6rem;
        margin: 0;
    }
}
JavaScript to invoke parse_code_linenums (from my previous comment) on the website as well, so HTML, print and PDF can be handled equally.
$(document).ready(function() {
    parse_code_linenums();
});
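The snippet above needs jQuery to be loaded on the site. If it is not (that depends on your setup), a plain DOM version along these lines should behave the same way. This is only a sketch: onReady is a hypothetical helper name, and it assumes parse_code_linenums is defined as in my previous comment.

```javascript
// Hypothetical helper: `doc` is any object with `readyState` and
// `addEventListener`, which lets the logic be exercised outside a browser.
function onReady(doc, callback) {
    if (doc.readyState === 'loading') {
        // The DOM is still being parsed: wait until it is ready.
        doc.addEventListener('DOMContentLoaded', () => callback(), { once: true });
    } else {
        // The DOM is already available: run immediately.
        callback();
    }
}

// In the browser, assuming parse_code_linenums is defined as in the earlier comment.
if (typeof document !== 'undefined' && typeof parse_code_linenums === 'function') {
    onReady(document, parse_code_linenums);
}
```

Checking readyState first covers the case where the script is loaded after the page has already been parsed, in which case DOMContentLoaded would never fire again.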