
cpdf in.pdf -gs gs -gs-malformed-force -o out.pdf [-gs-quiet] appears to scrub text from the resulting pdf

Open greggarson opened this issue 5 years ago • 12 comments

Seeing this occur with cpdf 2.3 after I upgraded and tried to improve my malformed pdf handling.

greggarson avatar Dec 10 '19 23:12 greggarson

Are you able to supply the file in question?

Could cpdf previously handle it on its own?

johnwhitington avatar Dec 10 '19 23:12 johnwhitington

I can supply the offending file; where would be the best place to send it? cpdf handled the file correctly previously, but I don't believe the cpdf in.pdf -gs gs -gs-malformed-force -o out.pdf option was available before 2.3? Or, if it was, we were unaware of it and not using it.

greggarson avatar Dec 11 '19 18:12 greggarson

You can send it to me at john at coherentgraphics.co.uk.

Normally, you would only want to use -gs-malformed, which tells cpdf to try mending a file with gs if it cannot mend it itself upon loading.

The only time you would want to use -gs-malformed-force is when cpdf /can/ load the file ok, but the malformation causes cpdf to fail halfway through some operation. In that case, we need to force the file to be repaired by gs beforehand.
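As a rough illustration of the two cases described above, here is a dry-run sketch that only prints the command lines rather than running them. The -gs-malformed-force form is the one quoted in this issue's title; the argument order shown for plain -gs-malformed is an assumption and should be checked against the cpdf manual for your version. The function names and file names are placeholders, not part of cpdf.

```shell
#!/bin/sh
# Dry-run sketch: prints the two invocations instead of executing them.
# File names and helper names are placeholders, not part of cpdf.

GS_PATH="gs"   # assumption: gs is on PATH

# Case 1: cpdf cannot load the file by itself, so gs is tried on loading.
# Assumed argument order; verify against the cpdf manual.
repair_on_load_cmd() {
    printf 'cpdf -gs %s -gs-malformed %s -o %s\n' "$GS_PATH" "$1" "$2"
}

# Case 2: cpdf loads the file but fails partway through an operation,
# so the file is forced through gs first. Form taken from this thread.
force_repair_cmd() {
    printf 'cpdf %s -gs %s -gs-malformed-force -o %s\n' "$1" "$GS_PATH" "$2"
}

repair_on_load_cmd in.pdf out.pdf
force_repair_cmd in.pdf repaired.pdf
```

In the second case, the repaired output would then be used as the input to whatever operation was failing.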

johnwhitington avatar Dec 11 '19 19:12 johnwhitington

I see, let me test with that approach. I had assumed that -gs-malformed-force was an all-powerful fixer, given that it took into account page-level issues where -gs-malformed may not?

greggarson avatar Dec 11 '19 19:12 greggarson

No, the actual operation is exactly the same. In some future version, we might be able to enable cpdf to fix the file with gs even if the problem is discovered after loading the file, but that's not possible yet -- hence the existence of -gs-malformed-force.

johnwhitington avatar Dec 11 '19 19:12 johnwhitington

Update on this: I was running the command above as part of a script, not alone on the command line, so I wasn't looking at the output of that one command but rather at the end result.

When actually running the command by itself I end up with:

bash-4.2$ cpdf test.pdf -gs /usr/bin/gs -gs-malformed-force -o out.pdf
Command line must be of exactly the form
cpdf <infile> -gs <path> -gs-malformed-force -o <outfile>

So it seems that the command doesn't even run when the invocation is as requested?
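One way to narrow this down from the shell side is to look at exactly what arguments are being passed. The sketch below is a hypothetical helper, not cpdf's actual recognition logic (which is not shown in this thread): it prints each argument in brackets so stray whitespace or hidden characters become visible, and checks the argument list against the shape given in the error message. It ignores the optional -gs-quiet flag.

```shell
#!/bin/sh
# Hypothetical helpers, not part of cpdf.

# True if the arguments have the shape:
#   <infile> -gs <path> -gs-malformed-force -o <outfile>
matches_force_form() {
    [ "$#" -eq 6 ] &&
    [ "$2" = "-gs" ] &&
    [ "$4" = "-gs-malformed-force" ] &&
    [ "$5" = "-o" ]
}

# Print each argument in brackets, one per line, to expose hidden characters.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

show_args test.pdf -gs /usr/bin/gs -gs-malformed-force -o out.pdf
if matches_force_form test.pdf -gs /usr/bin/gs -gs-malformed-force -o out.pdf; then
    echo "form ok"
else
    echo "form mismatch"
fi
```

If the arguments look correct here, the problem is more likely inside the cpdf binary than in how the shell passes them.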

greggarson avatar Dec 11 '19 20:12 greggarson

It works here. Try just 'gs' instead of /usr/bin/gs?

johnwhitington avatar Dec 12 '19 12:12 johnwhitington

Seems to have the same result:

bash-4.2$ cpdf test.pdf -gs gs -gs-malformed-force -o out.pdf
Command line must be of exactly the form
cpdf <infile> -gs <path> -gs-malformed-force -o <outfile>
bash-4.2$ which gs
/usr/bin/gs

greggarson avatar Dec 12 '19 19:12 greggarson

Fascinating. I'll send you a new executable with some debugging output so we can see what's actually happening.

You are using Linux/x86-64, right?

johnwhitington avatar Dec 13 '19 13:12 johnwhitington

That I am.

greggarson avatar Dec 13 '19 18:12 greggarson

Please try

http://www.coherentpdf.com/14thDecember2019.tar.bz2

This should print some debug output to help me understand why the special command line recognition for -gs-malformed-force is not working.

johnwhitington avatar Dec 14 '19 11:12 johnwhitington

Will do, thanks again for the help.

greggarson avatar Dec 17 '19 18:12 greggarson