
Creating PDF from HTML text does not work at all.

Open SimplyProgrammer opened this issue 8 months ago • 2 comments

Hello.

I am trying to use this library to convert simple markdown files (HTML in this case) into .pdf files.

But first I tried something simple that was mentioned in the examples section, with slight simplifications:

	let html = r#"
	<!DOCTYPE html>
	<html>
		<head>
			<title>HTML to PDF Conversion</title>
			<style>
				body {
					font-family: 'Helvetica', sans-serif;
					margin: 20px;
					color: #333;
				}
				h1 {
					color: #004080;
					border-bottom: 1px solid #ccc;
					padding-bottom: 5px;
				}
				p {
					margin: 10px 0;
					line-height: 1.5;
				}
				.highlight {
					background-color: #ffffcc;
					padding: 5px;
					border-radius: 3px;
				}
				ul {
					margin: 15px 0;
				}
				li {
					margin: 5px 0;
				}
				.footer {
					margin-top: 30px;
					font-size: 0.8em;
					text-align: center;
					color: #668;
				}
			</style>
		</head>
		<body>
			<h1>HTML to PDF Conversion Example</h1>
			
			<p>This document demonstrates converting HTML to PDF using the <span class="highlight">printpdf</span> library. 
			HTML conversion allows you to leverage familiar web styling techniques for PDF generation.</p>
			
			<h2>Features Demonstrated</h2>
			
			<ul>
				<li>Basic text formatting with paragraphs and headings</li>
				<li>Custom styling with CSS</li>
				<li>Lists (ordered and unordered)</li>
				<li>Tables with borders and styling</li>
				<li>Image embedding</li>
			</ul>
			
			<h2>Ordered List Example</h2>
			
			<ol>
				<li>First, create your HTML content</li>
				<li>Configure the HTML conversion options</li>
				<li>Call the converter function</li>
				<li>Save the resulting PDF document</li>
			</ol>
			
			<div class="footer">
				Generated with printpdf HTML conversion - Page 1
			</div>
		</body>
	</html>
	"#;

I used this code, which is pretty much an exact copy/paste of the example (I have also tried converting it to a string first and writing it to the file afterwards; the results were the same):

	let mut doc = PdfDocument::new("HTML to PDF Example");

	let mut warnings = Vec::new();
	let mut pages = Vec::new();

	let newpages = PdfDocument::from_html(
		&html,
		&BTreeMap::new(),
		&BTreeMap::new(),
		&GeneratePdfOptions::default(),
		&mut Vec::new(),

	);
	pages.append(&mut newpages.unwrap_or_default().pages);

	// Add the pages to the document
	doc.with_pages(pages);

	// Save the PDF to a file
	let bytes = doc.save(&PdfSaveOptions::default(), &mut warnings);

	for warning in warnings {
		println!("{}", warning.msg);
	}

	std::fs::write("./html_example.pdf", bytes).unwrap();

The HTML itself should look something like this:

[Image: expected rendering of the HTML]

But the .pdf that I received looks like this; it's gibberish, to say the least...

[Images: the generated PDF output]

It almost looks like it would work, but for some reason it ignores leading completely and just puts everything on one line.

Am I missing something? If it helps, I am using Windows and Rust/Cargo 1.86.0.

SimplyProgrammer • Apr 30 '25 10:04

No, I noted in the README that it's very experimental. By "very experimental" I basically mean, it parses the (X)HTML (at least that works), but it's not really usable. But at least the API stub is there. I didn't have time for the 0.8 release to implement this, so I delayed it to 0.9.

It is that way because the underlying HTML layout solver is a massive piece of work, i.e. in order to put the HTML into the PDF, I first need to solve the positions, fonts, images, etc. and then translate it to PDF operations: that is done by azul-layout, which has reftests here: https://azul.rs/reftest - HTML layout solving is very complex.
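As a rough illustration of the two stages described above (solve positions first, then translate them to PDF operations), here is a toy sketch using only the Rust standard library. It is not azul-layout's or printpdf's actual code: the `Block` and `Placed` types, the `solve_layout` and `to_pdf_ops` functions, the `/F1` font name, and the one-block-per-line rule are all assumptions made for illustration. It also shows why ignored leading puts everything on one line: if the cursor never moves down, every block gets the same `y`.

```rust
/// A block of text to place on the page (hypothetical type for this sketch).
struct Block {
    text: &'static str,
    font_size: f32,
}

/// A block with a solved position, in PDF points from the bottom-left corner.
struct Placed {
    text: &'static str,
    font_size: f32,
    x: f32,
    y: f32,
}

/// Stage 1: "solve" the layout. Each block goes on its own line, and the
/// cursor moves down by `font_size * line_height` (the leading) per block.
/// Real HTML layout (wrapping, floats, tables, ...) is far more involved.
fn solve_layout(blocks: &[Block], page_height: f32, margin: f32, line_height: f32) -> Vec<Placed> {
    let mut cursor_y = page_height - margin;
    let mut placed = Vec::new();
    for b in blocks {
        cursor_y -= b.font_size * line_height;
        placed.push(Placed { text: b.text, font_size: b.font_size, x: margin, y: cursor_y });
    }
    placed
}

/// Stage 2: translate solved positions into PDF content-stream text operators
/// (BT/ET delimit a text object, Tf sets the font, Td moves, Tj shows text).
/// Note: parentheses and backslashes in `text` would need escaping.
fn to_pdf_ops(placed: &[Placed]) -> Vec<String> {
    placed
        .iter()
        .map(|p| format!("BT /F1 {} Tf {} {} Td ({}) Tj ET", p.font_size, p.x, p.y, p.text))
        .collect()
}

fn main() {
    let blocks = [
        Block { text: "HTML to PDF Conversion Example", font_size: 24.0 },
        Block { text: "This document demonstrates converting HTML to PDF.", font_size: 12.0 },
    ];
    // A4 height in points, 20pt margin, line-height 1.5 as in the CSS above.
    let placed = solve_layout(&blocks, 842.0, 20.0, 1.5);
    for op in to_pdf_ops(&placed) {
        println!("{op}");
    }
}
```

Passing a `line_height` of `0.0` to `solve_layout` gives every block the same `y`, which resembles the overlapping single-line output in the report.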

You are not missing anything, it's just broken. What you're seeing is a "start", my goal was to get it to be "usable for invoices" and that's about it. Real HTML layout solving would be a massive effort.

fschutt • Apr 30 '25 12:04

Aha, Okay I see... I was working with the assumption that what is included in the examples is already in somewhat working condition, but perhaps I got ahead of myself...

I will see what I can do for my use case until then. But thanks for the fast response!

SimplyProgrammer • May 01 '25 11:05