unilm
Extracting Outlines with LayoutLMv3
Model I am using: LayoutLMv3
I am working on a project that involves extracting outlines from PDF files. Some of the PDF files I am handling do not contain a structured "Contents" or "Outline" chapter at the beginning, making it challenging to extract the document’s structure through traditional methods.
I am considering using LayoutLMv3 to extract the document's outline by analyzing each page.
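To make the idea concrete, here is a rough sketch of the page-by-page approach I have in mind. It assumes that some model (for example, a LayoutLMv3 token-classification model fine-tuned for this purpose, which I have not verified exists) assigns each word a label such as "HEADING" or "O". The `Word` class, the label names, and `build_outline` are all hypothetical names of my own; the only part I am fairly sure about is that LayoutLM-style models expect bounding boxes scaled to the 0-1000 range.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    box: tuple  # (x0, y0, x1, y1) in page pixels
    label: str  # hypothetical per-word label, e.g. "HEADING" or "O"


def normalize_box(box, page_width, page_height):
    """Scale a pixel box to the 0-1000 range used by LayoutLM-style models."""
    x0, y0, x1, y1 = box
    return (
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    )


def build_outline(pages):
    """Join consecutive HEADING words on each page into outline entries.

    `pages` is a list of word lists, one per page, in reading order.
    Returns a list of (page_number, heading_text) tuples.
    """
    outline = []
    for page_number, words in enumerate(pages, start=1):
        current = []
        for word in words:
            if word.label == "HEADING":
                current.append(word.text)
            elif current:
                # A non-heading word ends the current heading run.
                outline.append((page_number, " ".join(current)))
                current = []
        if current:
            outline.append((page_number, " ".join(current)))
    return outline
```

The labels here would have to come from the model; this sketch only covers the pre- and post-processing around it, and it does not attempt to infer heading levels.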
Could you please confirm whether LayoutLMv3 is capable of performing such a task? I would also appreciate any guidance or example code on how to implement it. Finally, should the input data be PDF files or page images?
Many thanks!