AdvancedLiterateMachinery
Question about extracting labels of PDF Elements
First of all, thank you for making these models available—great work!
I have tried several AI models that extract content from PDFs and identify its type—e.g.,
- text
- title
- list
- table
- figure.
The problem is that I haven’t yet found a model that correctly recognizes the hierarchy of headings, such as H1, H2, and H3. Can any of your models do that? What I am looking for is a way to detect:
- text
- title
- list
- table
- figure
- H1
- H2
- H3
- H4
Is this possible with one of your models?