
Images cannot be converted

Open hgmsq opened this issue 11 months ago • 14 comments

In my tests, images cannot be converted into the Markdown document.

hgmsq avatar Dec 18 '24 05:12 hgmsq

Make sure that you are using the latest version.

SigireddyBalasai avatar Dec 18 '24 07:12 SigireddyBalasai

I tested today, installing the dependency packages directly with pip.

hgmsq avatar Dec 18 '24 07:12 hgmsq

I tested today, installing the dependency packages directly with pip.

So is there still a problem?

SigireddyBalasai avatar Dec 18 '24 08:12 SigireddyBalasai

If possible, include the full context: your OS, the version of Python you are using, and the error message.

SigireddyBalasai avatar Dec 18 '24 08:12 SigireddyBalasai

I also cannot output the image content; I can only get the main idea of the images. Oops~

BlackPool888 avatar Dec 18 '24 08:12 BlackPool888

Python: 3.12

e.g.:

Image

BlackPool888 avatar Dec 18 '24 08:12 BlackPool888

@BlackPool888 I think you are getting the intended behavior: when you give an image as input, you get a description of it.

SigireddyBalasai avatar Dec 18 '24 08:12 SigireddyBalasai

@SigireddyBalasai Right, I get it now. I had expected to receive the main content, not just a description. Thx

BlackPool888 avatar Dec 18 '24 08:12 BlackPool888

See https://github.com/microsoft/markitdown/issues/51; you can comment out the code.

KmBase avatar Dec 18 '24 08:12 KmBase

md = MarkItDown(llm_client=client, llm_model="gpt-4o"). Converting an image to Markdown requires providing an LLM interface so that the image can be recognized.

happy-xlf avatar Dec 19 '24 09:12 happy-xlf
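For readers who want to try the constructor call quoted above, here is a minimal sketch. It assumes the `markitdown` and `openai` packages are installed and that an OpenAI API key is available in the environment. The helper name `describe_image` and the file name `example.jpg` are hypothetical and used only for illustration.

```python
from pathlib import Path


def describe_image(image_path, llm_client, llm_model="gpt-4o"):
    """Convert one image with MarkItDown and return the resulting text.

    `llm_client` is assumed to be an OpenAI-compatible client object.
    """
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported lazily so the helper can be defined even where
    # markitdown is not installed.
    from markitdown import MarkItDown

    md = MarkItDown(llm_client=llm_client, llm_model=llm_model)
    result = md.convert(str(path))
    return result.text_content


if __name__ == "__main__":
    # Assumption: the `openai` package is installed and OPENAI_API_KEY is set.
    from openai import OpenAI

    print(describe_image("example.jpg", OpenAI()))
```

As discussed earlier in the thread, the output for an image is a description of the picture rather than its full content.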

@SigireddyBalasai Right, I get it now. I had expected to receive the main content, not just a description. Thx

This may require an OCR approach.

neverlatetolearn0 avatar Dec 25 '24 06:12 neverlatetolearn0

md = MarkItDown(llm_client=client, llm_model="gpt-4o"). Converting an image to Markdown requires providing an LLM interface so that the image can be recognized.

This method can summarize the content of the picture, but it cannot recognize the text in the picture. How can the text in the picture be extracted?

neverlatetolearn0 avatar Dec 25 '24 06:12 neverlatetolearn0
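One way to approach the OCR idea raised above is to run a separate OCR tool and wrap its output as Markdown. The sketch below is not part of MarkItDown. It assumes the third-party `pytesseract` and `Pillow` packages plus a local Tesseract installation, and the function names `ocr_image_to_markdown` and `format_ocr_text` are hypothetical.

```python
from pathlib import Path


def format_ocr_text(title, text):
    """Turn raw OCR text into a small Markdown document.

    Runs of blank lines are collapsed so that each block of text
    becomes one Markdown paragraph under a single heading.
    """
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            paragraphs.append("\n".join(current))
            current = []
    if current:
        paragraphs.append("\n".join(current))
    body = "\n\n".join(paragraphs)
    return f"# {title}\n\n{body}\n"


def ocr_image_to_markdown(image_path, lang="eng"):
    """Extract the text in an image with Tesseract and return Markdown."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such image: {path}")

    # Imported lazily: both packages are optional third-party dependencies.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        text = pytesseract.image_to_string(img, lang=lang)
    return format_ocr_text(path.name, text)
```

For example, `ocr_image_to_markdown("scan.png")` would return the recognized text under a `# scan.png` heading, where `scan.png` is a placeholder file name. Recognition quality depends on the image and on the Tesseract language data installed for `lang`.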

Inserting the pictures from a PDF document into the corresponding positions in the Markdown document cannot be achieved.

wking2014 avatar Mar 15 '25 02:03 wking2014

md = MarkItDown(llm_client=client, llm_model="gpt-4o"). Converting an image to Markdown requires providing an LLM interface so that the image can be recognized.

This method can summarize the content of the picture, but it cannot recognize the text in the picture. How can the text in the picture be extracted?

I also need this feature!

zenoda avatar May 19 '25 02:05 zenoda