
Get Page Size from Ghostscript

Open tavinus opened this issue 5 years ago • 1 comments

Time to kill all dependencies

We finally have a way to get the page size with Ghostscript itself.
Thank you Stefan Dragnev!
https://stackoverflow.com/a/52644056/1273636

PROS

  • No need for external dependencies anymore (pdfinfo, identify, etc.)
  • Should never fail on any PDF
  • We have the sizes of ALL pages

CONS

  • Seems a bit slower than the other methods

I THINK it is slower partly because it traverses every page and reads each one's size.
A version that only reads the size of the first page could be a lot faster
(need to try and test).
For most operations we only use the size of the first page anyway.
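
A first-page-only variant might look like the sketch below. It has not been timed, `first_page_mediabox` is a hypothetical helper name, and it assumes `pdfgetpage` accepts a literal page number the same way it receives the loop index in the full command. On newer Ghostscript releases where SAFER mode is the default, the file may also have to be allowed explicitly (for example with --permit-file-read), which is not shown here.

```shell
# Hypothetical helper: print the MediaBox of page 1 only.
# $1 is the PDF path; -sFileName keeps paths with spaces intact.
first_page_mediabox() {
    gs -q -dNODISPLAY -dBATCH -sFileName="$1" \
       -c 'FileName (r) file runpdfbegin 1 pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print'
}
```

Called as `first_page_mediabox "../mixsync manual v1-2-3.pdf"`, it should print a single `llx lly urx ury` line instead of one line per page.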

I do want to offer the option for the user to choose the page size though (optional call).

I also want to have the option of listing ALL page sizes (on --info or similar parameter).

Example calls

$ gs -dNODISPLAY -dQUIET -sFileName=../mixsync\ manual\ v1-2-3.A0.SCALED.pdf -c "FileName (r) file runpdfbegin 1 1 pdfpagecount {pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print} for quit"
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384
0 0 3370 2384

Without -sFileName

$ gs -q -dNODISPLAY -c "(../mixsync\ manual\ v1-2-3.pdf) (r) file runpdfbegin 1 1 pdfpagecount {pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print} for quit"
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29
0 0 841.89 595.29

-sFileName looks like a good idea, since it handles spaces in file names.
We may also want to add -dBATCH and remove quit from the PS script.
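
Putting those two ideas together, a rough sketch could look like this. It passes the name through -sFileName, relies on -dBATCH instead of `quit`, and converts each `llx lly urx ury` line into a width and height in points. The helper names `mediabox_to_size` and `all_page_sizes` are made up for illustration, and the awk step is an addition that is not part of the original command.

```shell
# Hypothetical helper: turn "llx lly urx ury" lines into "width height" (points).
mediabox_to_size() {
    awk 'NF >= 4 { printf "%.2f %.2f\n", $3 - $1, $4 - $2 }'
}

# Hypothetical helper: one "width height" line per page of the PDF in $1.
# -dBATCH makes gs exit after the -c program, so no trailing "quit" is needed.
all_page_sizes() {
    gs -q -dNODISPLAY -dBATCH -sFileName="$1" \
       -c 'FileName (r) file runpdfbegin 1 1 pdfpagecount {pdfgetpage /MediaBox get {=print ( ) print} forall (\n) print} for' |
        mediabox_to_size
}
```

For an input line of `0 0 841.89 595.29`, `mediabox_to_size` prints `841.89 595.29`.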

This needs more testing and adapting, and it still has to be implemented in the adaptive method.
I will probably use it as the second option, if grep fails (assuming grep is indeed a lot faster).

I will also probably keep the option to force the external modes (e.g. -m pdfinfo).
It COULD be useful in specific cases.

tavinus avatar Oct 04 '18 16:10 tavinus

Quoting his reply

Here's a breakdown of the command:

FileName (r) file  % open file given by -sFileName
runpdfbegin        % open file as pdf
1 1 pdfpagecount { % for each page index
  pdfgetpage       % get pdf page properties (pushes a dict)
  /MediaBox get    % get MediaBox value from dict (pushes an array of numbers)
  {                % for every array element
    =print         % print element value
    ( ) print      % print single space
  } forall
  (\n) print       % print new line
} for
quit               % quit interpreter. Not necessary if you pass -dBATCH to gs

Replace /MediaBox with /CropBox to get the crop box.
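
One caveat when swapping in /CropBox: it is an optional entry in a PDF page, so a plain `get` can fail on pages that do not define it. A possible workaround, sketched below and not taken from the quoted answer, is to test with `known` and fall back to /MediaBox. `page_boxes_prefer_crop` is a hypothetical helper name.

```shell
# Hypothetical helper: print the CropBox of every page, or the MediaBox
# for pages whose dictionary has no /CropBox entry.
page_boxes_prefer_crop() {
    gs -q -dNODISPLAY -dBATCH -sFileName="$1" -c '
        FileName (r) file runpdfbegin
        1 1 pdfpagecount {
            pdfgetpage
            dup /CropBox known { /CropBox get } { /MediaBox get } ifelse
            { =print ( ) print } forall
            (\n) print
        } for'
}
```

The `dup` keeps a copy of the page dict on the stack so that `known` can consume one and `get` the other.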

tavinus avatar Oct 04 '18 16:10 tavinus