Thread number getting added to output filenames

Open zweissman opened this issue 3 years ago • 3 comments

Describe the bug First off, thank you so much for putting this together. So much better than using subprocess and hoping that you can understand what is going on with the results that come back. The multithreading is incredible and just cut my processing time by 3-4x!

I am specifying an output_file in convert_from_path() that looks like 'C:\folder\file_prefix'

What comes out is formatted like 'file_prefix000n-xx.jpg', where n is which thread it was run on and xx is the page number.

To Reproduce Steps to reproduce the behavior:

    from pdf2image import convert_from_path

    # Raw strings keep the backslashes in Windows paths from being read as escape sequences.
    pdf_path = r'C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1.pdf'
    pdf_basepath = r'C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1'
    file_base = 'WO1978000010A1'

    files = convert_from_path(pdf_path, output_folder=pdf_basepath, output_file=file_base,
                              fmt='jpg', dpi=200, paths_only=True, thread_count=4)

Resulting in a folder with files like

    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10001-01.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10001-02.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10002-03.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10002-04.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10003-05.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10003-06.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10004-07.jpg
    C:\Users\ZW\AppData\Local\Temp\IE\WO1978000010A1\WO1978000010A10004-08.jpg

Expected behavior Any chance there is a way to get rid of the 0001, 0002, 0003, 0004 that is getting injected in there before the -page number? I was looking in the code to see where this is coming from, and I have to assume that it is a by-product of multithreading.

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: None
  • Version: 1.14.0

zweissman • Feb 04 '21 02:02

I think this is a reasonable request; I will have to reproduce the issue on my side. I would be willing to take a PR that fixes this.

Belval • Feb 06 '21 21:02

PR #186 sent

zweissman • Feb 08 '21 18:02

This is not fixed; the behaviour persists in v1.16.0. I think PR #186 is still necessary.

edumotya • Jun 28 '21 10:06
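
Until a fix lands, one possible workaround is to rename the files after conversion. The sketch below is not part of pdf2image: the helper names strip_thread_counter and rename_outputs are hypothetical, and it assumes the pattern seen in the listing above (prefix, then a 4-digit counter, then -page.ext), with page numbers that are unique across threads so the renamed files do not collide.

```python
import os
import re


def strip_thread_counter(filename, prefix):
    """Drop the 4-digit counter that follows `prefix` in a file name.

    Assumes names shaped like '<prefix>NNNN-<page>.<ext>', as in the
    listing reported in this issue. Other names are returned unchanged.
    """
    pattern = re.compile(r"^" + re.escape(prefix) + r"\d{4}(-\d+\.\w+)$")
    match = pattern.match(filename)
    if match is None:
        return filename
    return prefix + match.group(1)


def rename_outputs(paths, prefix):
    """Rename each file in `paths` on disk and return the new paths."""
    renamed = []
    for path in paths:
        folder, name = os.path.split(path)
        target = os.path.join(folder, strip_thread_counter(name, prefix))
        if target != path:
            # os.replace overwrites an existing target, so this relies on
            # the page numbers being unique (an assumption, see above).
            os.replace(path, target)
        renamed.append(target)
    return renamed
```

With paths_only=True, the list returned by convert_from_path could be passed straight in, for example rename_outputs(files, file_base).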