Marker is not working
I installed marker locally on an M4 Mac. The install succeeds, but I am unable to use the library.
Using the CLI, the parameters are rejected as unrecognized:
marker_single input_pdfs/10K_sample_1.pdf --output_dir . --output_format markdown
usage: marker_single [-h] [--max_pages MAX_PAGES] [--langs LANGS] [--batch_multiplier BATCH_MULTIPLIER] filename output
marker_single: error: unrecognized arguments: --output_dir --output_format markdown
Using python I see an import error:
Traceback (most recent call last):
  File "/Users/ashwini/Desktop/ParsingEval/marker_eval.py", line 4, in <module>
    from marker.converters.pdf import PdfConverter
ModuleNotFoundError: No module named 'marker.converters'
I got this error when I installed the library with pip install marker-pdf.
I fixed it by installing the latest version from git: pip install git+https://github.com/VikParuchuri/marker@master
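Both the CLI usage message and the missing marker.converters module may mean that an older build is what actually got installed. Below is a minimal sketch for checking that before reinstalling. It assumes the distribution name is marker-pdf (as in the pip command above) and uses only the standard library; describe_install is a hypothetical helper name, not part of marker.

```python
from importlib import metadata, util


def describe_install(dist_name="marker-pdf", module_name="marker.converters"):
    """Return (installed_version_or_None, module_is_importable).

    Hypothetical helper for illustration; not part of marker itself.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed in this environment

    try:
        # find_spec imports the parent package first, so a missing or
        # broken parent raises ImportError instead of returning None.
        found = util.find_spec(module_name) is not None
    except ImportError:
        found = False
    return version, found


if __name__ == "__main__":
    version, found = describe_install()
    print(f"marker-pdf version: {version}")
    print(f"marker.converters present: {found}")
```

If a version is reported but marker.converters is not found, the environment most likely holds a build that predates that module, and upgrading or installing from git as described above is worth trying. Run it with the same interpreter that runs marker_eval.py, since a different virtual environment can give a different answer.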
Similar issue here: ModuleNotFoundError: No module named 'marker.converters.docx'
Installing from git didn't help.
Same issue
Facing the same issue. Has this been fixed yet?
Any update on this issue?
Similar issue here: ModuleNotFoundError: No module named 'marker.converters.docx'
Installing from git didn't help.
You are getting this error because there is no file or directory named docx inside marker/converters.
If you want to extract text from a docx file, I think you can do it with the same code you would use for a PDF:
(untested code below)
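# Same imports as for PDF conversion
from marker.converters.pdf import PdfConverter
from marker.models import create_model_dict
from marker.output import text_from_rendered

# Build the converter with the model dictionary
converter = PdfConverter(
    artifact_dict=create_model_dict(),
)
# Untested: pass the docx path where a PDF path would normally go
rendered = converter("/path/to/docx")
text, _, images = text_from_rendered(rendered)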
A couple of months ago, the install command changed: https://github.com/VikParuchuri/marker/commit/9f5e5f7d1483bf5a1a9bb8525ff6319025215c26#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R72
To support more document types, use:
pip install marker-pdf[full]