[BUG] Extracting bullet points

Open Elikrag opened this issue 3 years ago • 2 comments

Description

UniPDF v3.12.1. I get pageText from ExtractPageText() and use pageText.Marks() to extract the words on a page. Bullet points show up as the letter x in the extracted text.

The problem can be seen in this example from page 4 of the attached PDF (screenshot from 2020-10-06 13-11-14). The extracted words, one per line, with the bullet point coming out as x:

x
All
combustion
devices
installed
on
or
after
May
1,
2014,
must
be
equipped
with
an
operational
auto-igniter
upon
installation
of
the
combustion
device;

Attachments

Full PDF: Speer_Permit.pdf

Elikrag, Oct 06 '20 21:10

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/

github-actions[bot], Oct 06 '20 21:10

I'm not sure how to flag this as a "customer issue".

Elikrag, Oct 06 '20 22:10