marker
Can it process images in pdf?
It will skip the images currently. It's possible to save the images separately and embed them into the markdown, though - what's your use case for the images?
I also think it would be better to process images and tables, like Mathpix.
It will extract tables, just not images. Can you tell me more about what you're using the images for?
Sometimes you need the images to illustrate the points made in the PDF. For example, if you are working on computer graphics, papers contain many graphs and images, and you want to collect them in your own notes.
I would find it very helpful to have images extracted so that I can convert a homework assignment pdf into a complete markdown file. How would you recommend extracting images?
This could be a good issue to work on!
For me, I would use Obsidian to read books and record my comments, so I would like the images included for better reading. Maybe the final file tree could be:
-Book name
--book name.md
--image
---image 1
---image 2
We really need this feature. 🔥 Does anyone know if there are any alternatives that can replace this project?
I just tried GroundingDINO to draw bounding boxes around the figures and then extract them. Bad results. I wonder how far you could get by fine-tuning GroundingDINO for this task.
It is possible to use fitz/PyMuPDF to extract the images on each page (just not at their exact positions, as you can with docx files), save them to a folder, and reference them in the markdown using the saved path.
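A minimal sketch of that approach, for anyone who wants to try it. It assumes PyMuPDF is installed, and the function names, file naming scheme, and output folder are made up for illustration; this is not how marker does it. It only finds embedded raster images, so figures drawn as vector graphics would not show up.

```python
from pathlib import Path
from typing import Dict, List


def image_markdown(page_number: int, image_paths: List[str]) -> str:
    """Build markdown image references for one page (pure helper, no PDF needed)."""
    return "\n".join(
        f"![page {page_number} image {i}]({path})"
        for i, path in enumerate(image_paths, start=1)
    )


def extract_page_images(pdf_path: str, out_dir: str) -> Dict[int, List[str]]:
    """Save every embedded image, grouped by 1-based page number."""
    import fitz  # PyMuPDF; imported lazily so image_markdown works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved: Dict[int, List[str]] = {}
    doc = fitz.open(pdf_path)
    for page_index, page in enumerate(doc):
        for img_index, img in enumerate(page.get_images(full=True), start=1):
            xref = img[0]  # first item of each entry is the image's xref
            info = doc.extract_image(xref)  # dict with raw bytes and extension
            name = f"page{page_index + 1}_img{img_index}.{info['ext']}"
            target = out / name  # hypothetical naming scheme
            target.write_bytes(info["image"])
            saved.setdefault(page_index + 1, []).append(target.as_posix())
    doc.close()
    return saved
```

Calling `extract_page_images("book.pdf", "image")` and then `image_markdown(page, paths)` for each entry gives one block of image links per page, which could be appended after that page's text since the exact position is not known.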
If anyone wants to contribute this with a PR, I'd be very excited to review. I'm working on improving some of the base models and making marker fully open (it's noncommercial right now due to nougat and layoutlmv3 licensing), so I don't have bandwidth to take the image project on at the moment.
The segmentation model identifies image positions, so it may be possible to extract images using that, and embed them in the right spot.
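As a rough illustration of that idea (not marker's actual implementation), the sketch below crops a region of a page given a bounding box. It assumes the box comes from some layout model that ran on a page image rendered at a known DPI; the function names and parameters are hypothetical, and PyMuPDF is used only for rendering.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def pixels_to_points(bbox: BBox, dpi: float) -> BBox:
    """Convert a box on a page rendered at `dpi` into PDF points (72 per inch)."""
    scale = 72.0 / dpi
    x0, y0, x1, y1 = bbox
    return (x0 * scale, y0 * scale, x1 * scale, y1 * scale)


def crop_region(pdf_path: str, page_index: int, bbox_points: BBox,
                out_path: str, dpi: int = 150) -> None:
    """Render only the given region of a page and save it as an image file."""
    import fitz  # PyMuPDF; imported lazily so pixels_to_points works without it

    doc = fitz.open(pdf_path)
    page = doc[page_index]
    zoom = dpi / 72.0  # PDF pages are measured in points, 72 per inch
    pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom),
                          clip=fitz.Rect(*bbox_points))
    pix.save(out_path)
    doc.close()
```

Because the crop is rendered from the page rather than pulled from embedded image objects, it would also capture vector figures, and the box position tells you where in the text to place the image link.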
@tungsten106 No, you can't really - at least the results aren't good. fitz/PyMuPDF doesn't get all the images.
Yes please. My use case: pdf => markdown => html => png images => canvas
I'm training a model to extract images - this will be integrated into marker
Image extraction will be coming in the next version (should be shipped in the next 2 weeks).
Just added this into the dev branch - https://github.com/VikParuchuri/marker/pull/111 . I'm going to close this issue, since the feature will land in master soon (next few days).