
Transparency is lost when extracting images from PDFs

Open tcm151 opened this issue 1 year ago • 6 comments

I am trying to extract all of the images from a PDF that I have. It is a Pathfinder Adventure Module that I purchased a while ago. Everything works great, except that the images lose their transparency when I save them, so they end up looking awful with a blocky black background.

#pragma warning disable CA1416
using System.Drawing.Imaging;
using System.Drawing;
using UglyToad.PdfPig;

var filePath = args[0];
var destinationFolder = args[1];

using var file = File.Open(filePath, FileMode.Open);
using var pdf = PdfDocument.Open(file);

var encoders = ImageCodecInfo.GetImageEncoders(); // Image.Save expects an encoder, not a decoder
var pngEncoder = encoders.First(enc => enc.FormatID == ImageFormat.Png.Guid);

Console.WriteLine("Exporting images...");
foreach (var page in pdf.GetPages())
{
	var images = page.GetImages().ToArray();
	for (int i = 0; i < images.Length; i++)
	{
		images[i].TryGetPng(out var pngBytes);
		using var stream = new MemoryStream(pngBytes ?? images[i].RawBytes.ToArray());
		using var image = Image.FromStream(stream, false, false);
		image.Save($"{destinationFolder}/{page.Number}-{i}.png", pngEncoder, null);
	}
}

Console.WriteLine($"Extracted images from {filePath}");
#pragma warning restore CA1416

tcm151 avatar Feb 06 '24 22:02 tcm151

Are you able to share the document at all? In general I don't think images in PDFs have a transparency layer, but I haven't looked at the spec recently and don't recall.

EliotJones avatar Feb 18 '24 15:02 EliotJones

I have used other online tools for the exact same process, and they work and include the transparency properly. I'd like to be able to use my own tool, and this package works perfectly for that, except for the transparency.

My only concern with uploading the PDF here is that it is a paid product which is watermarked with my account information. I will temporarily upload it here, but I will need to remove it after a day or so.

EDIT: I can't seem to upload the PDF from my phone right now, I can try again on Monday when I return home.

tcm151 avatar Feb 18 '24 17:02 tcm151

One possible area to look into is https://github.com/UglyToad/PdfPig/blob/c25368e5ab7c3add2bd771d940a31dc2e87f3d34/src/UglyToad.PdfPig/Images/Png/PngFromPdfImageFactory.cs#L31C17-L31C101 where the hasAlphaChannel value passed to PngBuilder.Create is always false.

Another possible area to look into is "soft masks": [image]

I doubt I will have time to look into that soon, though.

BobLd avatar Feb 20 '24 19:02 BobLd

I've cloned the project and built it with PngBuilder.Create's hasAlphaChannel set to true, but that does not appear to change the resulting images.

tcm151 avatar Feb 20 '24 20:02 tcm151

@tcm151 okay, thanks for checking. Sad it's not an easy fix.

It must be related to soft masks then... which is trickier, since I don't think they are fully implemented yet.

BobLd avatar Feb 21 '24 20:02 BobLd
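
For readers following the soft mask lead: in a PDF, an image's transparency is typically stored as a separate grayscale image (the soft mask) rather than as a fourth channel of the image itself. That would explain why flipping hasAlphaChannel alone changes nothing, since there is still no alpha data to write. The sketch below shows only the merge step, under several assumptions: the color samples are already decoded to 8-bit RGB, the mask is already decoded to 8-bit grayscale at the same dimensions, and no Decode array or Matte entry applies. ApplySoftMask is a hypothetical helper, not part of PdfPig.

```csharp
using System;

// Hypothetical helper (not a PdfPig API): interleaves 8-bit RGB samples with an
// 8-bit grayscale soft mask of the same dimensions to produce RGBA samples.
static byte[] ApplySoftMask(byte[] rgb, byte[] mask, int width, int height)
{
    int pixelCount = width * height;
    if (rgb.Length != pixelCount * 3)
        throw new ArgumentException("Expected 3 bytes per pixel.", nameof(rgb));
    if (mask.Length != pixelCount)
        throw new ArgumentException("Expected 1 mask byte per pixel.", nameof(mask));

    var rgba = new byte[pixelCount * 4];
    for (int i = 0; i < pixelCount; i++)
    {
        rgba[i * 4] = rgb[i * 3];         // red
        rgba[i * 4 + 1] = rgb[i * 3 + 1]; // green
        rgba[i * 4 + 2] = rgb[i * 3 + 2]; // blue
        rgba[i * 4 + 3] = mask[i];        // alpha: 0 = transparent, 255 = opaque
    }
    return rgba;
}

// Usage: a 2x1 image with one opaque red pixel and one transparent blue pixel.
byte[] rgb = { 255, 0, 0, 0, 0, 255 };
byte[] mask = { 255, 0 };
byte[] rgba = ApplySoftMask(rgb, mask, 2, 1);
Console.WriteLine(string.Join(",", rgba)); // prints 255,0,0,255,0,0,255,0
```

In a real document the mask may have a different resolution or bit depth than the base image, so it would need to be resampled before a merge like this. The resulting RGBA buffer would then have to be written with an encoder that is told an alpha channel is present.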