
how to save image in the markdown

Open charliedream1 opened this issue 1 year ago • 11 comments

The image in the markdown is represented as this: . How can I save images?

charliedream1 avatar Dec 20 '24 06:12 charliedream1

Can you show an example?

Misnad avatar Dec 21 '24 09:12 Misnad

from markitdown import MarkItDown

# Convert the document to Markdown
md = MarkItDown()
result = md.convert("test.pdf")
print(result.text_content)

# Save the Markdown text to a file
md_file = 'text.md'
with open(md_file, 'w', encoding='utf-8') as f:
    f.write(result.text_content)

This is the code I wrote. The file text.md contains only the text; each image appears as [](.data/image:base64...), and I can't find the image itself.

  • How can I get the base64 image directly from md.convert("test.xlsx")?
  • How can I save the images?

charliedream1 avatar Dec 24 '24 02:12 charliedream1

up

corolair-hazem avatar Dec 24 '24 10:12 corolair-hazem

I have documents with embedded images that I would like to convert to markdown, so this feature would be helpful for me.

Currently when exporting it exports like this (exact quote):

![LABEL - TMG.jpg](data:image/jpeg;base64...)

I could see two resolutions:

  1. Complete the embedded image data ![](data:image/png;base64...). I think this would display correctly when the markdown is rendered.
  2. Add an option to export embedded images as files in a particular directory and create relative links to them.
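The second resolution could be sketched as a post-processing step, assuming the Markdown already contains complete data URIs (the truncated output quoted above does not, so it is left untouched). The function name `extract_data_uri_images`, the default `images` directory, and the `image_N` file naming are illustrative assumptions, not part of markitdown.

```python
import base64
import re
from pathlib import Path

# Matches Markdown images whose target is a complete base64 data URI:
# ![alt](data:image/<subtype>;base64,<payload>)
DATA_URI_IMAGE = re.compile(
    r"!\[(?P<alt>[^\]]*)\]"
    r"\(data:image/(?P<subtype>[a-zA-Z0-9.+-]+);base64,(?P<data>[A-Za-z0-9+/=\s]+)\)"
)


def extract_data_uri_images(markdown: str, out_dir: str = "images") -> str:
    """Write each embedded image to out_dir and return Markdown that links to the files."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    counter = 0

    def replace(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        # Use the MIME subtype as the extension ("png", "jpeg"); "svg+xml" becomes "svg".
        ext = match.group("subtype").split("+")[0]
        file_path = out_path / f"image_{counter}.{ext}"
        file_path.write_bytes(base64.b64decode(match.group("data")))
        return f"![{match.group('alt')}]({file_path.as_posix()})"

    return DATA_URI_IMAGE.sub(replace, markdown)
```

The links are relative whenever `out_dir` is a relative path, so the Markdown file and the image directory can be moved together.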

infogulch avatar Jan 02 '25 19:01 infogulch

Same problem. The base64 data should either be in the data URL or be saved to a file, but neither happens. The output is just ![](data:image/png;base64...), i.e. a literal "..." where the base64 image data should be.

oliviermills avatar Jan 06 '25 03:01 oliviermills

Same problem. When I convert a Word document to Markdown, every image embedded in the document is converted to just ![](data:image/png;base64...)

tjyangsg avatar Jan 17 '25 07:01 tjyangsg

Refer to issue 51.

Modify the file packages/markitdown/src/markitdown/converters/_markdownify.py, comment out the following code.

# if src.startswith("data:"):
#     src = src.split(",")[0] + "..."

Reinstall it and you're done.

pip install -e packages/markitdown

Now it works!
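For context, the effect of the two lines being commented out can be reproduced in isolation. This is a standalone sketch of that string operation, not markitdown's actual converter code; the helper name `truncate_data_uri` is hypothetical.

```python
def truncate_data_uri(src: str) -> str:
    """Mimic the quoted check: keep the data URI header and drop the payload."""
    if src.startswith("data:"):
        # Everything after the first comma is the base64 payload.
        src = src.split(",")[0] + "..."
    return src


print(truncate_data_uri("data:image/png;base64,iVBORw0KGgo="))
# data:image/png;base64...
print(truncate_data_uri("images/photo.png"))
# images/photo.png
```

With those lines disabled, the full payload stays in the image link, which is why the images become recoverable after reinstalling.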

SnakeHacker avatar Feb 17 '25 08:02 SnakeHacker

@.***> Thank you!

tjyangsg avatar Feb 17 '25 08:02 tjyangsg

What does "view on web" mean? Can't the image be viewed locally?

charliedream1 avatar Feb 17 '25 08:02 charliedream1

What does "view on web" mean? Can't the image be viewed locally?

The image data is simply omitted; the new version does this to conserve tokens for the LLM.

SnakeHacker avatar Feb 17 '25 09:02 SnakeHacker

If I want to get and save the raw image, how do I do that?

charliedream1 avatar Feb 17 '25 09:02 charliedream1

Base 64 images are now optionally maintained (in #1140)

afourney avatar Mar 21 '25 02:03 afourney

@.***> Thank you!

tjyangsg avatar Mar 21 '25 02:03 tjyangsg