set slice from bounding box using all pages

Open tpapp opened this issue 5 months ago • 8 comments

pdf-view-set-slice-from-bounding-box is great, but it does not work well for documents whose bounding box differs from page to page (left/right pages, chapter headings, etc.).

I wonder if it would be possible to have a function that

  1. traverses all pages,
  2. calculates their bounding box,
  3. combines that into a single one,
  4. sets that.

tpapp avatar Aug 12 '25 08:08 tpapp

I am happy to contribute a PR, but I need a bit of help: specifically, what is the recommended way to traverse all pages so I can call pdf-info-boundingbox on each of them?

tpapp avatar Aug 12 '25 08:08 tpapp

It is not exactly what you are asking for, but does pdf-view-auto-slice-minor-mode not work for you? It slices each page according to its own bounding box.

aikrahguzar avatar Aug 12 '25 09:08 aikrahguzar

Thanks, I am aware of pdf-view-auto-slice-minor-mode, but it is not ideal because it overmagnifies pages with little content. I would like a consistent but tight crop while I am reading the document.

tpapp avatar Aug 12 '25 09:08 tpapp

There is nothing built in that does what you want (as far as I know). It should be possible to write it by looping over all the pages and calling pdf-cache-boundingbox on each one.

aikrahguzar avatar Aug 12 '25 10:08 aikrahguzar

Yes, what I need help with is

> looping over all the pages

the rest I can do and make a PR.

tpapp avatar Aug 12 '25 10:08 tpapp

> Yes, what I need help with is
>
> looping over all the pages

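I think it can be done with dotimes, e.g. something like:

(let (bb)
  ;; dotimes counts from 0, but page numbers in pdf-tools start at 1.
  (dotimes (i (pdf-info-number-of-pages))
    (let ((page (1+ i)))
      ;; do-stuff-to-update-bb is a placeholder for the merge step.
      (setq bb (do-stuff-to-update-bb page bb))))
  bb)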

> the rest I can do and make a PR.

aikrahguzar avatar Aug 12 '25 11:08 aikrahguzar

Just to clarify: I need help not with the logic of loops or map-reduce in Emacs Lisp, but with how to retrieve a particular page. Is it pdf-cache-get-image?

tpapp avatar Aug 12 '25 13:08 tpapp

I am sorry if I am missing something obvious, but a "page" in most of pdf-tools is just a page number. You don't need an image to get the bounding box; you pass the page number to pdf-cache-boundingbox or pdf-info-boundingbox.
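
Putting the pieces from this thread together, a minimal sketch might look like the following. It assumes that pdf-cache-boundingbox returns a list (LEFT TOP RIGHT BOTTOM) in relative coordinates and that pdf-view-set-slice takes X, Y, WIDTH and HEIGHT. The names my/bb-union and my/pdf-view-set-slice-from-all-pages are hypothetical and not part of pdf-tools. It ignores pdf-view-bounding-box-margin and does not special-case blank pages, so treat it as a starting point rather than a finished implementation.

```elisp
;; Sketch only: the my/ names are hypothetical, not part of pdf-tools.
;; The NOERROR argument lets the helper load even without pdf-tools.
(require 'pdf-view nil t)

(defun my/bb-union (a b)
  "Return the smallest box containing boxes A and B.
Each box is a list (LEFT TOP RIGHT BOTTOM).  A may be nil, in
which case B is returned unchanged."
  (if (null a)
      b
    (list (min (nth 0 a) (nth 0 b))
          (min (nth 1 a) (nth 1 b))
          (max (nth 2 a) (nth 2 b))
          (max (nth 3 a) (nth 3 b)))))

(defun my/pdf-view-set-slice-from-all-pages ()
  "Slice the current PDF to the union of all page bounding boxes.
Assumes the current buffer is in `pdf-view-mode'."
  (interactive)
  (let (bb)
    ;; dotimes counts from 0; pdf-tools page numbers start at 1.
    (dotimes (i (pdf-info-number-of-pages))
      (setq bb (my/bb-union bb (pdf-cache-boundingbox (1+ i)))))
    (when bb
      ;; Convert (LEFT TOP RIGHT BOTTOM) to (X Y WIDTH HEIGHT).
      (pdf-view-set-slice (nth 0 bb)
                          (nth 1 bb)
                          (- (nth 2 bb) (nth 0 bb))
                          (- (nth 3 bb) (nth 1 bb))))))
```

With this loaded, M-x my/pdf-view-set-slice-from-all-pages in a PDF buffer should apply one crop to every page. Computing the bounding box of every page may take a while on long documents.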

aikrahguzar avatar Aug 12 '25 14:08 aikrahguzar