AdvancedLiterateMachinery Question about extracting labels of PDF Elements

Open Samyssmile opened this issue 5 months ago • 0 comments

First of all, thank you for making these models available—great work!

I have tried several AI models that extract content from PDFs and identify the type of each element, e.g.:

text
title
list
table
figure.

The problem is that I haven’t yet found a model that correctly recognizes the hierarchy of headings, such as H1, H2, and H3. Can any of your models do that? What I am looking for is a way to detect:

text
title
list
table
figure
H1
H2
H3
H4

Is this possible with one of your models?

Jun 20 '25 15:06 Samyssmile
