unipdf
Support for Type3 fonts
Text extraction currently fails on some text that uses Type 3 fonts. Support for this font type needs to be added for extraction to work properly.
@peterwilliams97 What is the current status of this in extraction? Are we able to get any information from Type 3 fonts?
I believe this should be doable. I've seen this error (type3 fonts are currently not supported) for a PDF that Zotfile (a Zotero plugin) has no trouble extracting. Is there anything to learn from there?
In the above case, the text in the document is all in Type 1 fonts. However, a few Type 3 fonts are embedded inside figures. This shouldn't affect extraction most of the time, which probably explains why Zotfile (which uses pdf.js) is still able to extract annotations.
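One way to picture the behaviour described above is an extractor that skips runs set in a Type 3 font instead of failing the whole page. The following is only a minimal sketch of that idea: `textRun` and `extractLenient` are hypothetical names invented for illustration and are not part of unipdf's API, and the run data is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// textRun is a hypothetical stand-in for one run of text drawn with a single
// font. It is NOT a unipdf type.
type textRun struct {
	FontSubtype string // e.g. "Type1", "TrueType", "Type3"
	Text        string // decoded text; empty when the font could not be decoded
}

// extractLenient joins the text of every run whose font is not Type 3 and
// returns how many runs were skipped because they use a Type 3 font.
func extractLenient(runs []textRun) (string, int) {
	var sb strings.Builder
	skipped := 0
	for _, r := range runs {
		if r.FontSubtype == "Type3" {
			// Skip the unsupported run rather than aborting extraction.
			skipped++
			continue
		}
		sb.WriteString(r.Text)
	}
	return sb.String(), skipped
}

func main() {
	// Illustrative data: body text in Type 1, one figure label in Type 3.
	runs := []textRun{
		{FontSubtype: "Type1", Text: "Body text. "},
		{FontSubtype: "Type3", Text: ""},
		{FontSubtype: "Type1", Text: "More body text."},
	}
	text, skipped := extractLenient(runs)
	fmt.Printf("%q (skipped %d Type 3 runs)\n", text, skipped)
}
```

Returning the skip count alongside the text lets a caller warn that some glyphs were dropped while still getting the Type 1 body text.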