pyfpdf

Speed and memory issues

Open jordanlui opened this issue 7 years ago • 7 comments

Hi there, I am processing a list of .PNG files (max size about 200 KB) from my local computer, and I notice that the FPDF.output() call is very slow and that processing time increases nonlinearly with the number of pages. Has anyone else seen this?

Processing 100 files takes 9 seconds, 200 files takes 42 s, and 300 files takes 96 s. These represent write times of 99, 210, and 320 ms per image file.
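For a rough sense of how steeply this grows, the totals above can be turned into a scaling exponent. This is only a back-of-the-envelope sketch based on the three rounded totals quoted here; the helper name `scaling_exponent` is made up for illustration.

```python
import math

# (number of files, total seconds), taken from the rounded totals above
timings = [(100, 9.0), (200, 42.0), (300, 96.0)]

def scaling_exponent(a, b):
    """Estimate k in time ~ n**k from two (n, seconds) measurements."""
    (n1, t1), (n2, t2) = a, b
    return math.log(t2 / t1) / math.log(float(n2) / n1)

# One exponent per consecutive pair of measurements
exponents = [scaling_exponent(a, b) for a, b in zip(timings, timings[1:])]

for (n1, _), (n2, _), k in zip(timings, timings[1:], exponents):
    print("%d -> %d files: exponent ~ %.2f" % (n1, n2, k))
```

An exponent near 1 would mean linear growth. Both steps come out at roughly 2, so on these numbers the total time grows about quadratically with the number of pages.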

I have been testing this a bit further and assume the issue is due to memory limitations from opening so many image files and saving them into the PDF. Is there a recommended way to make this operation more memory efficient?

Any ideas on how to fix this would be appreciated! :)

I'm using a fairly simple function, on Python 2.7:

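from fpdf import FPDF
from PIL import Image

def makePDF(pdfFileName, listPages, dir = ''):
	# Optional output directory; add a trailing slash if one is given
	if (dir):
		dir += "/"

	# Size every page from the first image (pixel dimensions used as points)
	cover = Image.open(str(listPages[0]))
	width, height = cover.size

	pdf = FPDF(unit = "pt", format = [width, height]) # FPDF Class constructor

	for i, page in enumerate(listPages):
		if i%20 == 0:
			print(i) # progress indicator; valid on Python 2.7 and 3
		pdf.add_page()
		pdf.image(str(page), 0, 0)

	pdf.output(dir + pdfFileName + ".pdf", "F") # Create PDF File
	return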
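To see whether the time goes into the image() calls or into output(), each phase could be timed separately. A minimal sketch, assuming a hypothetical helper `timed` (it is not part of FPDF); the stand-in workload at the end only exists so the block runs without any image files.

```python
from timeit import default_timer

def timed(label, func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed seconds)."""
    start = default_timer()
    result = func(*args, **kwargs)
    elapsed = default_timer() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed

# Stand-in workload so the sketch runs on its own.
# Inside makePDF the real calls would be wrapped instead, for example:
#   timed("image", pdf.image, str(page), 0, 0)
#   timed("output", pdf.output, dir + pdfFileName + ".pdf", "F")
total, seconds = timed("sum", sum, range(100000))
```

Summing the "image" timings and comparing them with the single "output" timing would show which phase is responsible for the nonlinear growth.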

I also noticed that some of my files vary in size, but they are generally all very close to 1200x1600 pixels. I had been wondering whether the different pixel sizes slow down the output() function due to a scaling operation, but forcing a static dimension does not improve speed.

I did notice that writing the same image repeatedly can be incredibly fast (writing a 1000-page PDF of the same image takes 0.28 seconds). Is there a way to load my other images more efficiently as I build my PDF?
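One guess at why the repeated-image case is so fast: if the library caches each image by file name, the file is parsed only once and every later page just references the cached data. I have not checked this against the pyfpdf source, so the following is a toy model of that assumption with made-up names (`ToyDoc`, `parse_image`), not FPDF code.

```python
parse_calls = []  # records every (simulated) expensive parse

def parse_image(name):
    """Stand-in for the expensive step of reading and encoding one file."""
    parse_calls.append(name)
    return {"name": name}

class ToyDoc(object):
    """Toy document that caches parsed images by file name."""
    def __init__(self):
        self.images = {}  # file name -> parsed image data
        self.pages = []   # one entry per page, pointing at cached data

    def add_image_page(self, name):
        # Parse only on first sight of a file name; reuse afterwards
        if name not in self.images:
            self.images[name] = parse_image(name)
        self.pages.append(self.images[name])

# Case 1: the same image on 1000 pages
same = ToyDoc()
for _ in range(1000):
    same.add_image_page("cover.png")
same_parses = len(parse_calls)

# Case 2: 300 distinct images, one per page
del parse_calls[:]
distinct = ToyDoc()
for i in range(300):
    distinct.add_image_page("page%03d.png" % i)
distinct_parses = len(parse_calls)

print("same image x1000: %d parse(s)" % same_parses)
print("300 distinct images: %d parse(s)" % distinct_parses)
```

If that is what happens, the 0.28 s result says little about the multi-image case, where every file has to be processed individually.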

jordanlui • Nov 19 '17 07:11

Hi! A long time ago, I proposed a patch for the performance issues: #60. In my case the PNG images are 16 MB, and with my patch the results are impressive.

marcelotduarte • Apr 17 '18 18:04

I'll pull it into my fork and publish.

alexanderankin • Apr 17 '18 19:04

> Hi! A long time ago, I proposed a patch for the performance issues: #60. In my case the PNG images are 16 MB, and with my patch the results are impressive.

I am currently facing the same problem. The PDF I am generating is fairly simple: it contains only 5 pages with 9 images ranging from 200 KB to 17 MB. With only the small images, generating the PDF takes up to 4 minutes, which is a very long time for such a simple setup. Adding the two 17 MB images increases the time drastically. Unfortunately, your optimized code didn't make any difference for me and even made things worse, since blank pages are now being generated. Are you familiar with this behavior? Thanks for your help!

Alilino • Feb 01 '19 08:02

For the last 3 years I have had an app using pyfpdf with my patch, with no problems. Sometimes it produces 150 pages with 8 MB images, other times 20 pages with 16 MB images. Another comment about this: https://github.com/reingart/pyfpdf/pull/60#issuecomment-453019558

marcelotduarte • Feb 01 '19 14:02

> For the last 3 years I have had an app using pyfpdf with my patch, with no problems. Sometimes it produces 150 pages with 8 MB images, other times 20 pages with 16 MB images. Another comment about this: #60 (comment)

I didn't mean that your patch is not working. I meant that I am running into some issues and hoped that somebody had hit the same thing and solved it.

Alilino • Feb 01 '19 16:02

How are you using pyfpdf? What version of Python?

marcelotduarte • Feb 01 '19 17:02

> How are you using pyfpdf? What version of Python?

I am using pyfpdf within Jupyter on Windows with Python 3.6. I tried your version one more time and noticed an improvement from 296 s to 232 s.

Alilino • Feb 01 '19 20:02