PdfPig
Transparency is lost when extracting images from PDFs
I am trying to pull out all of the images from a PDF that I have. It is a Pathfinder Adventure Module that I purchased a while ago. Everything works great, except that the images lose their transparency when I save them, so they end up with an awful, blocky black background.
#pragma warning disable CA1416 // System.Drawing is Windows-only
using System.Drawing.Imaging;
using System.Drawing;
using UglyToad.PdfPig;

var filePath = args[0];
var destinationFolder = args[1];

using var file = File.Open(filePath, FileMode.Open);
using var pdf = PdfDocument.Open(file);

// Look up the PNG encoder (GetImageEncoders rather than GetImageDecoders).
var encoders = ImageCodecInfo.GetImageEncoders();
var pngEncoder = encoders.First(enc => enc.FormatID == ImageFormat.Png.Guid);

Console.WriteLine("Exporting images...");
foreach (var page in pdf.GetPages())
{
    var images = page.GetImages().ToArray();
    for (int i = 0; i < images.Length; i++)
    {
        // Prefer PdfPig's PNG conversion; fall back to the raw image bytes.
        images[i].TryGetPng(out var pngBytes);
        using var stream = new MemoryStream(pngBytes ?? images[i].RawBytes.ToArray());
        using var image = Image.FromStream(stream, false, false);
        image.Save($"{destinationFolder}/{page.Number}-{i}.png", pngEncoder, null);
    }
}
Console.WriteLine($"Extracted images from {filePath}");
#pragma warning restore CA1416
Are you able to share the document at all? In general I don't think images in PDFs have a transparency layer, but I haven't looked at the spec recently and don't recall.
I have used other online tools for the exact same process, and they preserve the transparency properly. I'd like to be able to use my own tool, and this package works perfectly for that, except for the transparency.
My only concern with uploading the PDF here is that it is a paid product which is watermarked with my account information. I will temporarily upload it here, but I will need to remove it after a day or so.
EDIT: I can't seem to upload the PDF from my phone right now, I can try again on Monday when I return home.
One possible area to look into is
https://github.com/UglyToad/PdfPig/blob/c25368e5ab7c3add2bd771d940a31dc2e87f3d34/src/UglyToad.PdfPig/Images/Png/PngFromPdfImageFactory.cs#L31C17-L31C101
where the hasAlphaChannel value passed to PngBuilder.Create is always false.
Another possible area to look into is "soft masks".
I doubt I will have time soon to look into that though
I've cloned the project and built it with PngBuilder.Create's hasAlphaChannel set to true, but that does not appear to change the resulting images.
@tcm151 okay, thanks for checking. Sad it's not an easy fix
It must be related to soft masks then... which is trickier, since I don't think they're fully implemented yet
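For anyone following along: in the PDF spec a soft mask (the /SMask entry on an image) is a separate grayscale image whose samples act as per-pixel alpha for the base image. That would also explain why flipping hasAlphaChannel alone changes nothing, since there is still no alpha data being fed in. Below is a rough sketch of the compositing step only, not how PdfPig does or will do it. SoftMaskSketch and ApplySoftMask are hypothetical names, and it assumes both images are already decoded to 8 bits per sample, the base image is RGB, the mask has the same dimensions, and there is no /Matte pre-blending to undo.

```csharp
using System;

public static class SoftMaskSketch
{
    // Hypothetical helper, not part of PdfPig: interleaves 8-bit RGB samples
    // with an 8-bit grayscale soft mask to produce RGBA pixels.
    // Assumes the mask has exactly the same width and height as the image.
    public static byte[] ApplySoftMask(byte[] rgb, byte[] mask, int width, int height)
    {
        int pixelCount = checked(width * height);
        if (rgb.Length != pixelCount * 3)
            throw new ArgumentException("Expected 3 bytes per pixel.", nameof(rgb));
        if (mask.Length != pixelCount)
            throw new ArgumentException("Expected 1 mask byte per pixel.", nameof(mask));

        var rgba = new byte[pixelCount * 4];
        for (int p = 0; p < pixelCount; p++)
        {
            rgba[p * 4] = rgb[p * 3];         // red
            rgba[p * 4 + 1] = rgb[p * 3 + 1]; // green
            rgba[p * 4 + 2] = rgb[p * 3 + 2]; // blue
            rgba[p * 4 + 3] = mask[p];        // alpha: 0 = transparent, 255 = opaque
        }
        return rgba;
    }
}
```

The resulting RGBA buffer could then be handed to whatever PNG writer is in use, with its alpha channel enabled. Real documents may need more than this, for example rescaling a mask whose size differs from the image.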