marker Can it process images in pdf?

Dec 01 '23 06:12 biandh

It currently skips images. It's possible to save the images separately and embed them into the markdown, though - what's your use case for the images?

Dec 01 '23 06:12 VikParuchuri

I also think it would be better to process images and tables, like Mathpix.

Dec 01 '23 06:12 DamonsJ

It will extract tables, just not images. Can you tell me more about what you're using the images for?

Dec 01 '23 06:12 VikParuchuri

Sometimes you need images to illustrate a point made in your PDF. For example, if you work in computer graphics, papers contain many graphs and images, and you want to collect them in your own notes.

Dec 01 '23 06:12 DamonsJ

I would find it very helpful to have images extracted so that I can convert a homework assignment pdf into a complete markdown file. How would you recommend extracting images?

Dec 01 '23 16:12 kshitijsachan

This could be a good issue to work on!

Dec 02 '23 06:12 mahimairaja

> It will skip the images currently. It's possible to save the images separately and embed them into the markdown, though - what's your usecase for the images?

For me, I would use Obsidian to read books and record my comments, so I would like the images included for better reading. Maybe the final file structure could be:

-Book name
--book name.md
--image
---image 1
---image 2

Dec 03 '23 15:12 keno-log

We really need this feature. 🔥 Does anyone know of any alternatives that could replace this project?

Dec 12 '23 08:12 Hambaobao

I just tried GroundingDINO to draw bounding boxes around the figures and then extract them, but the results were bad. I wonder how far you could get by fine-tuning GroundingDINO for this task.

Dec 12 '23 17:12 yachty66

It is possible to use fitz/PyMuPDF to extract the images on each page (just not at their exact positions, as with docx files), save them to a folder, and reference them in markdown format using the saved path.
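
A minimal sketch of the approach described above, assuming PyMuPDF is installed. The function names and the `images` subfolder are hypothetical choices for illustration, not part of marker. Note that `get_images` only reports embedded raster images, so figures drawn as vector graphics will not be returned.

```python
from pathlib import Path


def markdown_image_link(image_path: Path, md_dir: Path, alt: str = "") -> str:
    """Return a Markdown image tag with a path relative to the Markdown file's folder."""
    rel = image_path.relative_to(md_dir).as_posix()
    return f"![{alt}]({rel})"


def extract_embedded_images(pdf_path: str, out_dir: str) -> list[str]:
    """Save each embedded raster image once and return one Markdown tag per saved image."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    md_dir = Path(out_dir)
    image_dir = md_dir / "images"  # hypothetical layout, not marker's
    image_dir.mkdir(parents=True, exist_ok=True)

    links = []
    seen = set()  # the same image object can be referenced from several pages
    with fitz.open(pdf_path) as doc:
        for page_index, page in enumerate(doc):
            for img in page.get_images(full=True):
                xref = img[0]  # first field is the image's cross-reference number
                if xref in seen:
                    continue
                seen.add(xref)
                info = doc.extract_image(xref)  # dict with raw bytes and file extension
                target = image_dir / f"page{page_index + 1}_xref{xref}.{info['ext']}"
                target.write_bytes(info["image"])
                links.append(markdown_image_link(target, md_dir, alt=target.stem))
    return links
```

The returned tags could be appended to the end of each page's markdown, since this method does not say where on the page an image belongs.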

Dec 13 '23 03:12 tungsten106

If anyone wants to contribute this with a PR, I'd be very excited to review. I'm working on improving some of the base models and making marker fully open (it's noncommercial right now due to nougat and layoutlmv3 licensing), so I don't have bandwidth to take the image project on at the moment.

The segmentation model identifies image positions, so it may be possible to extract images using that, and embed them in the right spot.
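
A rough sketch of what position-based extraction might look like, assuming the detector returns boxes as `(x0, y0, x1, y1)` in pixels of a rendered page image. That box format, the function names, and the `zoom` default are assumptions for illustration. This is not marker's actual pipeline, and it uses PyMuPDF only for rendering.

```python
def scale_bbox(bbox, image_size, page_size):
    """Map (x0, y0, x1, y1) from detector pixel coordinates to PDF points."""
    x0, y0, x1, y1 = bbox
    sx = page_size[0] / image_size[0]  # horizontal points per pixel
    sy = page_size[1] / image_size[1]  # vertical points per pixel
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)


def crop_region(pdf_path, page_number, bbox_points, out_path, zoom=2.0):
    """Render only the given region of a page (in PDF points) and save it as an image."""
    import fitz  # PyMuPDF; imported here so scale_bbox works without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_number]  # zero-based page index
        pix = page.get_pixmap(
            matrix=fitz.Matrix(zoom, zoom),  # render above 72 dpi for sharper crops
            clip=fitz.Rect(*bbox_points),
        )
        pix.save(out_path)
```

Because the region is rendered rather than pulled out of the file, this would also capture figures drawn as vector graphics, and the box position indicates where to place the image tag in the markdown.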

Dec 13 '23 03:12 VikParuchuri

> It is possible to use fitz/PyMuPDF to extract image at each page (just not at the exact position like docx files), save it to a position and label it as markdown format with the safe path.

@tungsten106 No, you can't really, or at least the results aren't good. fitz/PyMuPDF doesn't get all the images.

Dec 13 '23 21:12 yachty66

Yes please. My use case: pdf => markdown => html => png images => canvas

Feb 27 '24 21:02 7flash

I'm training a model to extract images - this will be integrated into marker

Feb 28 '24 06:02 VikParuchuri

Image extraction will be coming in the next version (should be shipped in the next 2 weeks).

May 03 '24 05:05 VikParuchuri

Just added this into the dev branch - https://github.com/VikParuchuri/marker/pull/111 . I'm going to close this issue, since the feature will land in master soon (next few days).

May 07 '24 18:05 VikParuchuri
