
puppeteerSharp failed to display image in pdf

Open kirticism opened this issue 1 year ago • 1 comments

Hi, I am encountering an issue while converting HTML to PDF using PuppeteerSharp. Specifically, the images are not displaying in the generated PDF. Despite following various solutions suggested on StackOverflow, the problem persists. Below is the method I am using to perform the HTML to PDF conversion. I would appreciate any guidance or solutions you could provide.

public async Task<IFormFile> HtmlToPdf(string htmlContent, string fileName)
{
    var startTime = DateTimeOffset.UtcNow;
    _logger.LogInformation("Executing GeneratePdf: {fileName}...", fileName);

    var launchOptions = new LaunchOptions { Headless = true };
    await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultChromiumRevision);
    using (var browser = await Puppeteer.LaunchAsync(launchOptions))
    using (var page = await browser.NewPageAsync())
    {
        await page.SetContentAsync(htmlContent);

        await page.AddStyleTagAsync(new AddTagOptions { Content = "body { font-size: 20px; margin: 50px; }" });
        await page.WaitForSelectorAsync("img");
        var pdfBytes = await page.PdfDataAsync(new PdfOptions { Format = PaperFormat.A4 });

        var filePath = Path.Combine(Path.GetTempPath(), fileName);
        File.WriteAllBytes(filePath, pdfBytes);

        var formFile = await CreateFormFile(filePath, fileName);

        File.Delete(filePath);

        var endTime = DateTimeOffset.UtcNow;
        var executionTime = endTime - startTime;
        _logger.LogInformation("Time taken to execute GeneratePdf: {fileName}. Time: {ExecutionTime} ms",
                               fileName, executionTime.TotalMilliseconds);

        return formFile;
    }
}

kirticism • May 28 '24 10:05

await page.WaitForSelectorAsync("img");

This is not enough. You should wait until the image loads too.

Perhaps this will do the trick.

await page.WaitForNetworkIdleAsync();
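
If waiting for network idle is not reliable enough, another option is to wait explicitly until every image on the page reports that it has finished loading. The following is only a minimal sketch: the class and method names (`ImageWait`, `WaitForImagesAsync`) are hypothetical, and it assumes a PuppeteerSharp version that exposes `IPage`.

```csharp
using System.Threading.Tasks;
using PuppeteerSharp;

public static class ImageWait
{
    // Hypothetical helper: polls in the page until every <img> reports complete.
    // Note that img.complete is also true for images that failed to load,
    // so this only guarantees that loading has finished, not that it succeeded.
    public static async Task WaitForImagesAsync(IPage page)
    {
        await page.WaitForFunctionAsync(
            "() => Array.from(document.images).every(img => img.complete)");
    }
}
```

It would be called after `SetContentAsync` and before `PdfDataAsync`, for example `await ImageWait.WaitForImagesAsync(page);`. If an image never starts loading, the wait ends with PuppeteerSharp's timeout exception instead of producing a PDF without the image.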

mstijak • Jun 04 '24 21:06

I am facing the same issue

        var pdfOptions = new PdfOptions
        {
            PrintBackground = true, // otherwise background of tables won't be printed
            Format = request.Options.Format.ToPuppeteerPaperFormat(),
            Landscape = request.Options.PrintOrientation == PrintOrientation.Landscape,
            MarginOptions =
            {
                Bottom = $"{request.Options.MarginOptions.Bottom}mm",
                Left = $"{request.Options.MarginOptions.Left}mm",
                Right = $"{request.Options.MarginOptions.Right}mm",
                Top = $"{request.Options.MarginOptions.Top + (request.Options.HeaderHeight != 0 ? HeaderContentGap + request.Options.HeaderHeight : 0)}mm"
            },
            Scale = new decimal(request.PrintScale),
            HeaderTemplate = headerTemplate,
            FooterTemplate = "<div></div>", // we want empty footer
            DisplayHeaderFooter = !string.IsNullOrEmpty(headerTemplate)
        };        
        await using var stream = await contentPage.PdfStreamAsync(pdfOptions);

To my surprise, the images are visible in the HTML right before PDF generation: [image]

But this is what happens when the PDF is generated: [image]

I tried many methods of waiting:

    private async Task WaitForPageToBeLoaded(IPage page, CancellationToken ct)
    {
        _logger.LogDebug("Waiting for all fonts to be loaded");
        ct.ThrowIfCancellationRequested();
        await page.EvaluateExpressionHandleAsync("document.fonts.ready");
        _logger.LogDebug("Page fonts loaded");
        
        _logger.LogDebug("Emulating Screen media type");
        ct.ThrowIfCancellationRequested();
        await page.EmulateMediaTypeAsync(MediaType.Screen);
        _logger.LogDebug("Screen media type emulated");

        await page.SetJavaScriptEnabledAsync(true);
        
        _logger.LogDebug("Waiting for images to be loaded");
        ct.ThrowIfCancellationRequested();
        await page.WaitForSelectorAsync("img");
        _logger.LogDebug("Images loaded");
        
        await page.WaitForNetworkIdleAsync();
    }

skalahonza • Nov 04 '24 13:11

@skalahonza what happens when you print to PDF in Chrome?

kblok • Nov 04 '24 14:11

@kblok that works well: [image]

skalahonza • Nov 04 '24 14:11

This can happen if you're loading images over HTTPS while the page is served over HTTP.

mstijak • Nov 04 '24 14:11

As others suggested in other issues, I enabled ignoring HTTPS errors for local development.
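
For readers wondering what that can look like, one possible way is to pass Chromium's `--ignore-certificate-errors` switch through the launch arguments. This is a sketch of one option, not necessarily the setting used in this thread, and it should stay limited to local development.

```csharp
using PuppeteerSharp;

var launchOptions = new LaunchOptions
{
    Headless = true,
    // Chromium command-line switch that skips certificate validation.
    // Assumption: acceptable only for local development with self-signed certificates.
    Args = new[] { "--ignore-certificate-errors" }
};
```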

skalahonza • Nov 04 '24 14:11

For those encountering similar issues, the problem was lazy loading. Images not initially visible on the page with lazy loading activated failed to render.

<img loading="lazy" src="...">

Removing the lazy loading attribute from the image element resolved the issue, and PDF generation functioned properly once more.
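
When the HTML comes from a template that cannot easily be changed, the attribute could also be stripped from the markup just before it is handed to PuppeteerSharp. The helper below is a hypothetical sketch (`LazyImageFix` and `StripLazyLoading` are made-up names). It uses a regular expression, not a real HTML parser, so it assumes reasonably well-formed `<img>` tags.

```csharp
using System.Text.RegularExpressions;

public static class LazyImageFix
{
    // Matches a loading=lazy attribute (double-quoted, single-quoted or unquoted)
    // inside an <img ...> tag and captures the rest of the tag around it.
    private static readonly Regex LazyAttribute = new Regex(
        @"(<img\b[^>]*?)\s+loading\s*=\s*(?:""lazy""|'lazy'|lazy\b)([^>]*>)",
        RegexOptions.IgnoreCase);

    // Returns the HTML with loading=lazy removed from every <img> tag.
    public static string StripLazyLoading(string html) =>
        LazyAttribute.Replace(html, "$1$2");
}
```

With this in place, the content call would become `await page.SetContentAsync(LazyImageFix.StripLazyLoading(htmlContent));`. Images without the attribute are left untouched.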

skalahonza • Nov 04 '24 19:11