pdfannots
Feature: Add option to display the number of the annotation
Hello! First off, I love the script! It produces beautiful output with context, etc. Great job!
For scientific reviews, you need to provide a rebuttal to each of the reviewers' notes, so it would be convenient to be able to respond to, e.g., 'page 3, note 12' or 'note 32'. I would suggest labeling each annotation not only with the page number, but also with the number of the note on that page and/or a global number (counting notes from the beginning of the document).
What do you think?
Thanks for the feedback! It sounds like a reasonable idea. My main concern is that the "order" of notes on a page is unspecified in the PDF file. The order that pdfannots outputs annotations is just based on an ugly pile of heuristics in pdfminer that try to infer the page reading order, and another pile of heuristics in pdfannots that try to determine the text "nearest" an annotation in order to sort them. So, even though "page 4, note 3" may seem logical to you, there are plenty of instances where it won't match what a human would consider the third note on the page (this happens most often on multi-column documents). Until now, these discrepancies didn't matter too much -- sometimes notes show up out of logical order, which is a bit annoying, but not the end of the world. With this feature, the bug would be much more glaring.
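To make the concern concrete, here is a minimal sketch of per-page and global numbering. Everything in it is hypothetical: the `Note` class, the `label_notes` function, the coordinates, and the 300-unit column boundary are made up for illustration and are not taken from pdfannots or pdfminer. It only shows that the same two notes get different numbers depending on which ordering heuristic is used.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Note:
    """Hypothetical stand-in for an extracted annotation (not the pdfannots class)."""
    page: int    # 1-based page number
    x: float     # horizontal position, 0 at the left edge of the page
    y: float     # vertical position, measured from the top of the page in this sketch
    text: str


def label_notes(notes, sort_key):
    """Return (page, note-on-page, global number, text) tuples, ordered by page then sort_key."""
    ordered = sorted(notes, key=lambda n: (n.page, sort_key(n)))
    labelled = []
    global_no = 0
    for page, group in groupby(ordered, key=lambda n: n.page):
        for page_no, note in enumerate(group, start=1):
            global_no += 1
            labelled.append((page, page_no, global_no, note.text))
    return labelled


# Two-column page: "A" sits at the bottom of the left column,
# "B" at the top of the right column.
notes = [
    Note(page=4, x=50, y=700, text="A"),
    Note(page=4, x=350, y=100, text="B"),
]

# Naive heuristic: top to bottom, ignoring columns.
top_to_bottom = label_notes(notes, sort_key=lambda n: (n.y, n.x))

# Column-aware heuristic (assumes a column boundary at x = 300).
column_first = label_notes(notes, sort_key=lambda n: (n.x > 300, n.y))

print(top_to_bottom)
print(column_first)
```

With the naive ordering, "B" becomes "page 4, note 1"; with the column-aware ordering, "A" does. A reviewer and an author whose tools disagree on the heuristic would then be referring to different notes by the same number.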
Indeed, the ordering problem seems hard to solve. I recently ran into it while using Acrobat Reader's "Comment summary" feature, which labels the annotations on a page with numbers and then inserts a page summarizing each annotation right after that page. An example is in the attachments. The example is very simple and does not show the problem you mentioned, but a more densely annotated PDF I recently worked with really did: the order of the comments did not exactly match the reading direction. Acrobat works around this by labelling each annotation.
The question is: is it feasible (in your opinion) to work on a feature that would produce two output files: a PDF with the annotations labelled, and the Markdown comment summary?
Additionally, I can code some Python, so if you could roughly point me to the part of the code that needs enhancing, I could probably help and treat this feature as a kind of weekend project. :)