gerev
PDF Parser, GoogleDrive support for PDF, README.md minor fix
@bary12 do we want pdf -> html -> text here, so we can detect titles, bold, etc., like we do for docx?
Yes, just for the titles.
@d4yz so we need pdf_to_html, and then use html_to_text, like we do for .docx
- [ ] convert pdf to html, then to text, to preserve title information
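A rough sketch of the html -> text half of that TODO, just to make the idea concrete. This is not the repo's existing `html_to_text`; the name `html_to_text_with_titles` and the `# ` title marker are made up for illustration. It also assumes the PDF-to-HTML step emits real heading tags (`<h1>`..`<h6>`), which many converters do not do (they often emit styled spans instead), so the pdf -> html side would still need its own title detection.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_TAGS = HEADING_TAGS | {"p", "div", "li", "br"}


class _TitleAwareExtractor(HTMLParser):
    """Collects text block by block and remembers which blocks were headings."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []        # finished (is_title, text) pairs
        self._buf = []         # text pieces of the block being read
        self._in_title = False

    def flush(self):
        # Collapse whitespace and store the current block, if it has any text.
        text = " ".join("".join(self._buf).split())
        if text:
            self.lines.append((self._in_title, text))
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()
        if tag in HEADING_TAGS:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()
        if tag in HEADING_TAGS:
            self._in_title = False

    def handle_data(self, data):
        self._buf.append(data)


def html_to_text_with_titles(html: str) -> str:
    """Hypothetical helper: plain text, one block per line, titles prefixed with '# '."""
    parser = _TitleAwareExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # pick up any trailing text outside a block tag
    return "\n".join(("# " + text) if is_title else text
                     for is_title, text in parser.lines)


if __name__ == "__main__":
    print(html_to_text_with_titles("<h1>Intro</h1><p>Hello <b>world</b></p>"))
```

Inline tags like `<b>` are ignored on purpose, since the review only asks for titles.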