Diff for other kinds of changes
Besides text changes in text PDFs, this would be even more helpful if it could produce an image with a visual indication of:
- Changes to non-text PDFs. Much needed for textual changes to scanned, textless PDFs (like the FISA Court's, and many others).
- New pages added.
- Pages removed. (The FISA Court had one of these: they chopped off like 30 pages once.)
Out of scope! :)
But it could be so beautiful and helpful! I could also break these out if you thought any were more in-scope than others.
I just don't expect to do any more work in this repo. I'd be glad to accept PRs or turn over the repo to another account if there are things you want to add.
Hi Josh,
I have seen your code for diffing PDFs and found it useful for my current task, which is the following:
Task: Implement a python script that does the following:
- Checks the availability of the menu card (PDF) on this website: http://bantschowundbantschow.de/fraunhofer_sit
- Checks if the PDF has changed since the last time it was downloaded (do NOT consider the file name for that!)
- Downloads the PDF only if it has changed since last download.
So my question is: if I use your code, can you advise me on how to accomplish the task above in a simple way? I think that instead of the box-plotting function, I need a function that downloads the PDF. Would you be willing to share how to write code that downloads the PDF file when a difference is detected?
Thank you very much. Sunitha
@sunithak Hi. Unfortunately I don't have the time to advise you on that.
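For anyone landing here with the same question: the task above does not really need a visual diff. A minimal sketch, assuming the URL serves the PDF bytes directly (the address in the task may be an HTML page that only links to the PDF, in which case the real link has to be found first), is to hash the fetched content and compare it with the hash stored from the previous run. The function names, file names, and the choice of SHA-256 are illustrative assumptions, not part of this repo.

```python
import hashlib
import urllib.error
import urllib.request
from pathlib import Path

# Assumption: this address returns the PDF itself. It is the page named in the
# task; the actual PDF link may differ and would need to be substituted here.
PDF_URL = "http://bantschowundbantschow.de/fraunhofer_sit"


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Return the response body; raises urllib.error.URLError if unavailable."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def save_if_changed(data: bytes, pdf_path: Path, digest_path: Path) -> bool:
    """Write data to pdf_path only if its SHA-256 differs from the stored one.

    The comparison uses the content hash only, never the file name.
    Returns True if the file was (re)written, False if nothing changed.
    """
    new_digest = hashlib.sha256(data).hexdigest()
    old_digest = digest_path.read_text().strip() if digest_path.exists() else None
    if new_digest == old_digest:
        return False
    pdf_path.write_bytes(data)
    digest_path.write_text(new_digest)
    return True


def check_and_download(url: str = PDF_URL,
                       pdf_path: Path = Path("menu.pdf"),
                       digest_path: Path = Path("menu.sha256")) -> str:
    """Run one availability check and one change check; return a status string."""
    try:
        data = fetch(url)
    except urllib.error.URLError as exc:
        return f"unavailable: {exc}"
    return "updated" if save_if_changed(data, pdf_path, digest_path) else "unchanged"
```

Calling `check_and_download()` once per run (for example from cron) covers the three steps in the task. Two caveats: the bytes still have to be fetched in order to hash them, so "download only if changed" here means "save only if changed", and a byte-level hash will also report a change when only PDF metadata differs, even if the visible menu is identical.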