pdftitle

a utility to extract the title from a PDF file

Results 12 pdftitle issues

The algorithm only considers the case where the title text is at the top level, whereas in many PDF files the title is actually inside an XForm or a multi-level...

Currently, the program may output a digraph for certain PDFs, for example https://arxiv.org/pdf/1506.02640.pdf.

```bash
$ pdftitle -p 1506.02640.pdf
You Only Look Once: Unified, Real-Time Object Detection
```

Note the `fi`...
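A minimal sketch of one possible post-processing step, assuming the digraph in question is a typographic ligature such as U+FB01 (the single-character "fi"). It uses Python's standard `unicodedata.normalize` with the NFKC form; the helper name `expand_ligatures` is hypothetical and not part of pdftitle.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Replace compatibility characters (e.g. the U+FB01 ligature) with plain letters."""
    # NFKC applies compatibility decomposition, then canonical composition.
    return unicodedata.normalize("NFKC", text)


# \ufb01 is the single-character "fi" ligature.
raw_title = "You Only Look Once: Uni\ufb01ed, Real-Time Object Detection"
print(expand_ligatures(raw_title))
```

NFKC also folds other compatibility characters (superscripts, full-width forms), so it may change more than ligatures; whether that is acceptable for titles is a judgment call.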

In the implementation of the "eliot" algorithm, the y coordinates are sorted low-to-high: https://github.com/metebalci/pdftitle/blob/5ebc1a0ec3f347e5a257485bc6ce43a9f12798ba/pdftitle.py#L543-L548 Since the origin of a PDF page is the bottom-left corner, the y coordinates should be sorted...
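To illustrate the point about the coordinate origin, here is a small sketch, not the actual "eliot" code. With the origin at the bottom-left, larger y values are closer to the top of the page, so reading order needs a descending sort. `TextBlock` and `top_to_bottom` are hypothetical names, and the coordinates are made up.

```python
from typing import List, NamedTuple


class TextBlock(NamedTuple):
    y: float   # height above the bottom edge of the page
    text: str


def top_to_bottom(blocks: List[TextBlock]) -> List[TextBlock]:
    """Order blocks as a reader sees them: largest y (top of page) first."""
    return sorted(blocks, key=lambda b: b.y, reverse=True)


blocks = [
    TextBlock(120.0, "Abstract"),
    TextBlock(700.0, "You Only Look Once:"),
    TextBlock(680.0, "Unified, Real-Time Object Detection"),
]
ordered = top_to_bottom(blocks)
title = " ".join(b.text for b in ordered[:2])
print(title)
```

Sorting low-to-high on the same data would put "Abstract" first, which is the symptom the issue describes.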

bug

Is it possible to show an informative error message instead of crashing the application?

```
Traceback (most recent call last):
  File "/home/zk/.local/lib/python3.9/site-packages/pdftitle.py", line 701, in run
    title = get_title_from_file(args.pdf)
  File "/home/zk/.local/lib/python3.9/site-packages/pdftitle.py",...
```
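One way the request could be met is to catch the exception at the command-line boundary and print a single line to stderr. This is a sketch under assumptions: `run_cli` is a hypothetical wrapper, and `extract` stands in for whatever function does the extraction (for instance `get_title_from_file` from the traceback).

```python
import sys
from typing import Callable


def run_cli(extract: Callable[[str], str], pdf_path: str) -> int:
    """Call extract(pdf_path); on failure, report one line on stderr and return 1."""
    try:
        title = extract(pdf_path)
    except Exception as exc:  # broad on purpose: this is the last line of defence
        print(f"pdftitle: could not extract a title from {pdf_path!r}: {exc}",
              file=sys.stderr)
        return 1
    print(title)
    return 0


def broken_extractor(path: str) -> str:
    """Stand-in for an extractor that fails on a malformed file."""
    raise ValueError("unexpected end of stream")


exit_code = run_cli(broken_extractor, "example.pdf")
```

Catching `Exception` hides the traceback from the user, so a verbose flag that re-raises would keep debugging possible.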

I think this would help the user discover the option. Currently the error message just says "PDF contains a unicode char that does not exist in the font", maybe...

I might have overlooked something, but it seems there is no way to adjust the parameters from API calls, e.g. you can't call `get_title_from_file(path, algo='max2')`.
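As a rough illustration of what a keyword-parameter API could look like, the sketch below dispatches on an `algo` argument. Everything here is hypothetical: `get_title`, the `ALGORITHMS` table, and the two picker functions are stand-ins, not pdftitle's real algorithms or signatures.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for title-picking strategies.
def _first_candidate(candidates: List[str]) -> str:
    return candidates[0]


def _longest_candidate(candidates: List[str]) -> str:
    return max(candidates, key=len)


ALGORITHMS: Dict[str, Callable[[List[str]], str]] = {
    "first": _first_candidate,
    "longest": _longest_candidate,
}


def get_title(candidates: List[str], algo: str = "first") -> str:
    """Pick a title using the named strategy; reject unknown names early."""
    try:
        picker = ALGORITHMS[algo]
    except KeyError:
        raise ValueError(
            f"unknown algo {algo!r}; choose from {sorted(ALGORITHMS)}"
        ) from None
    return picker(candidates)


print(get_title(["YOLO", "You Only Look Once"], algo="longest"))
```

A default value keeps existing callers working, while the explicit `ValueError` surfaces typos in the algorithm name.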

enhancement

Fixes #33. This should be the least invasive way of passing arguments to pdftitle when calling it from another Python module. It allows the module to be used in conjunction...

- Improve fixing spaces when seeing similar consecutive characters
- Add argument to force fixing spaces
- Strip possible newlines from end result

Text in the PDF file might not contain a space character; instead, the space might be indicated by an actual (additional) horizontal position difference between the glyphs before and after the...
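The idea can be sketched as follows: insert a space whenever the horizontal gap between one glyph's right edge and the next glyph's left edge exceeds a threshold. This is an illustration only. The `Glyph` tuple layout, `join_with_spaces`, and the threshold value are assumptions, not pdftitle's implementation.

```python
from typing import List, Optional, Tuple

Glyph = Tuple[str, float, float]  # (character, left x, right x)


def join_with_spaces(glyphs: List[Glyph], gap_threshold: float = 1.0) -> str:
    """Rebuild text from positioned glyphs, adding a space at each wide gap."""
    parts: List[str] = []
    prev_right: Optional[float] = None
    for char, left, right in glyphs:
        if prev_right is not None and left - prev_right > gap_threshold:
            parts.append(" ")
        parts.append(char)
        prev_right = right
    return "".join(parts)


# Made-up positions: the jump from x=13.0 to x=16.0 marks a word boundary.
glyphs = [
    ("Y", 0.0, 5.0), ("o", 5.2, 9.0), ("u", 9.1, 13.0),
    ("O", 16.0, 21.0), ("n", 21.1, 25.0),
]
text = join_with_spaces(glyphs)
print(text)
```

In practice a fixed threshold is fragile; scaling it by font size or by the typical glyph width is a common refinement.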

enhancement

To make this repository more contribution-friendly, it should imho be structured in a "standard way", i.e. with the top-level directory containing only `setup.py` and not the actual source code....