Creating PDF from HTML text does not work at all.
Hello.
I am trying to use this library to convert simple markdown files (HTML in this case) into .pdf files.
But first I tried something simple from the examples section, with slight simplifications:
let html = r#"
<!DOCTYPE html>
<html>
<head>
<title>HTML to PDF Conversion</title>
<style>
body {
font-family: 'Helvetica', sans-serif;
margin: 20px;
color: #333;
}
h1 {
color: #004080;
border-bottom: 1px solid #ccc;
padding-bottom: 5px;
}
p {
margin: 10px 0;
line-height: 1.5;
}
.highlight {
background-color: #ffffcc;
padding: 5px;
border-radius: 3px;
}
ul {
margin: 15px 0;
}
li {
margin: 5px 0;
}
.footer {
margin-top: 30px;
font-size: 0.8em;
text-align: center;
color: #668;
}
</style>
</head>
<body>
<h1>HTML to PDF Conversion Example</h1>
<p>This document demonstrates converting HTML to PDF using the <span class="highlight">printpdf</span> library.
HTML conversion allows you to leverage familiar web styling techniques for PDF generation.</p>
<h2>Features Demonstrated</h2>
<ul>
<li>Basic text formatting with paragraphs and headings</li>
<li>Custom styling with CSS</li>
<li>Lists (ordered and unordered)</li>
<li>Tables with borders and styling</li>
<li>Image embedding</li>
</ul>
<h2>Ordered List Example</h2>
<ol>
<li>First, create your HTML content</li>
<li>Configure the HTML conversion options</li>
<li>Call the converter function</li>
<li>Save the resulting PDF document</li>
</ol>
<div class="footer">
Generated with printpdf HTML conversion - Page 1
</div>
</body>
</html>
"#;
I used this code, which is pretty much an exact copy/paste of the example (I also tried converting it to a string first and writing that to the file afterwards; the results were the same):
let mut doc = PdfDocument::new("HTML to PDF Example");
let mut warnings = Vec::new();
let mut pages = Vec::new();

let newpages = PdfDocument::from_html(
    &html,
    &BTreeMap::new(),
    &BTreeMap::new(),
    &GeneratePdfOptions::default(),
    &mut Vec::new(),
);
pages.append(&mut newpages.unwrap_or_default().pages);

// Add the pages to the document
doc.with_pages(pages);

// Save the PDF to a file
let bytes = doc.save(&PdfSaveOptions::default(), &mut warnings);
for warning in warnings {
    println!("{}", warning.msg);
}
std::fs::write("./html_example.pdf", bytes).unwrap();
The HTML itself should look something like this:
But the .pdf that I got looks like this; it's gibberish, to say the least...
It almost looks like it would work, but for some reason it ignores leading completely and puts everything on one line.
Am I missing something? If it helps, I am using Windows and Rust/Cargo 1.86.0.
No, I noted in the README that it's very experimental. By "very experimental" I basically mean that it parses the (X)HTML (at least that works), but it's not really usable. At least the API stub is there. I didn't have time to implement this for the 0.8 release, so I delayed it to 0.9.
It is that way because the underlying HTML layout solver is a massive piece of work: in order to put the HTML into the PDF, I first need to solve the positions, fonts, images, etc. and then translate the result to PDF operations. That is done by azul-layout (reftests here: https://azul.rs/reftest). HTML layout solving is very complex.
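As a rough illustration of those two stages, here is a toy sketch using only the standard library. It is not azul-layout's or printpdf's actual code: the names `PositionedLine`, `layout_lines` and `to_pdf_ops` are hypothetical, the font resource name `/F1` and the 842 pt page height are assumptions, and it skips everything that makes real layout hard (text shaping, line wrapping, the CSS cascade, images). It only shows why leading has to be solved before any PDF text operators can be written.

```rust
/// A laid-out line of text: the output of the toy layout stage.
#[derive(Debug, Clone, PartialEq)]
struct PositionedLine {
    text: String,
    x: f32,
    y: f32, // baseline, in points, measured from the bottom of the page
    font_size: f32,
}

/// Stage 1: "solve" the layout. Each baseline is placed one leading
/// (font_size * line_height) below the previous one, starting at the top margin.
fn layout_lines(
    lines: &[&str],
    page_height: f32,
    margin: f32,
    font_size: f32,
    line_height: f32,
) -> Vec<PositionedLine> {
    let leading = font_size * line_height;
    lines
        .iter()
        .enumerate()
        .map(|(i, text)| PositionedLine {
            text: text.to_string(),
            x: margin,
            y: page_height - margin - leading * (i as f32 + 1.0),
            font_size,
        })
        .collect()
}

/// Escape the characters that are special inside a PDF literal string.
fn escape_pdf_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(c, '(' | ')' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Stage 2: translate positioned lines into PDF content-stream text operators.
/// `/F1` is an assumed font resource name.
fn to_pdf_ops(lines: &[PositionedLine]) -> String {
    let mut ops = String::new();
    for line in lines {
        ops.push_str(&format!(
            "BT /F1 {} Tf {} {} Td ({}) Tj ET\n",
            line.font_size,
            line.x,
            line.y,
            escape_pdf_string(&line.text)
        ));
    }
    ops
}

fn main() {
    let lines = ["HTML to PDF Conversion Example", "Features Demonstrated"];
    // 842 pt page height, 20 pt margin, 12 pt font, line-height 1.5
    let laid_out = layout_lines(&lines, 842.0, 20.0, 12.0, 1.5);
    print!("{}", to_pdf_ops(&laid_out));
}
```

With a 12 pt font and a line-height of 1.5, each baseline lands 18 pt below the previous one. If the first stage gave every line the same `y`, the second stage would stack all the text on one line, which is the symptom described in the report.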
You are not missing anything, it's just broken. What you're seeing is a "start"; my goal was to get it to be "usable for invoices" and that's about it. Real HTML layout solving would be a massive effort.
Aha, okay, I see... I was working on the assumption that what is included in the examples is already in somewhat working condition, but perhaps I got ahead of myself...
I will see what I can do for my use case until then. But thanks for the fast response!