pdfboxing
Nice wrapper of PDFBox in Clojure
Hi, This code snippet: ``` (-> (split/split-pdf :input "test/pdfs/multi-page.pdf") (nth ) text/extract) ``` often throws: `Execution error (IOException) at org.apache.pdfbox.cos.COSStream/checkClosed (COSStream.java:83). COSStream has been closed and cannot be read. Perhaps...
Updating dependencies to the latest stable versions
## Description of your pull request (Feel free to squash & merge and use this as a commit message!) Add functionality in `pdfboxing.text` to extract pdf text from specific regions...
Implement a namespace for exporting a PDF or PDDocument to a BufferedImage [Re #55]. This PR also slightly changes the prerequisites for the split function. Previously it only allowed strings as...
## Description of your pull request Added the ability to overlay one PDF over another ## Pull request checklist Before submitting the PR make sure the following things have been...
For text extraction, pdfboxing currently uses [org.apache.pdfbox.text.PDFTextStripper](https://pdfbox.apache.org/docs/2.0.13/javadocs/org/apache/pdfbox/text/PDFTextStripper.html) which works on the entire document. However, any document structure is removed during text extraction, so the more data the pdf contains, the...
First of all - kudos for this library! It proves to be very useful to our project in Magnet. However, we need export-to-image functionality that Apache's PDFBox...
Opening a new issue after discussion in #26 ```clojure (pdf-split/merge-pddocuments :docs (pdf-split/split-pdf :input path :start 1 :end 4) :output "test.pdf") Unhandled java.io.IOException COSStream has been closed and cannot be read....
The idea is to remove the external dependency on `org.apache.pdfbox/preflight`. With one fewer dependency, there is one less thing to worry about when the...
A few months ago I used the PDFBox command-line tool to make simple HTML from a PDF document, so I wrote a function that does exactly that. Provide a PDF to the root folder...
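The two `COSStream has been closed and cannot be read` reports above both read pages produced by a split. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is that the source `PDDocument` is closed or garbage-collected while the split pages still share its streams. The sketch below bypasses pdfboxing and calls PDFBox 2.x directly through Java interop, keeping the source document open until the text has been read. The namespace `example.split-text` and the function `page-text` are hypothetical names introduced for illustration, and PDFBox 2.x is assumed to be on the classpath.

```clojure
(ns example.split-text
  (:import (java.io File)
           (java.util List)
           (org.apache.pdfbox.multipdf Splitter)
           (org.apache.pdfbox.pdmodel PDDocument)
           (org.apache.pdfbox.text PDFTextStripper)))

(defn page-text
  "Hypothetical helper: returns the text of the page at zero-based index n
  of the PDF at path. Assumes the PDFBox 2.x API (PDDocument/load)."
  [^String path n]
  ;; with-open keeps the source document alive until the body returns,
  ;; so the split pages can still read the streams they share with it.
  (with-open [source (PDDocument/load (File. path))]
    (let [^List pages (.split (Splitter.) source)]
      (try
        ;; getText runs eagerly, while source is still open.
        (.getText (PDFTextStripper.) ^PDDocument (.get pages (int n)))
        (finally
          ;; Each split page is its own PDDocument and must be closed too.
          (doseq [^PDDocument page pages]
            (.close page)))))))
```

With the test file from the first report, a call might look like `(page-text "test/pdfs/multi-page.pdf" 0)`. The design choice is to finish all reads inside the `with-open` scope instead of returning the split documents to the caller.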