olmocr
Dependency installation fails on Mac
🐛 Describe the bug
I am trying to install olmOCR on a Mac and I get this error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
chromadb 0.5.23 requires tokenizers<=0.20.3,>=0.13.2, but you have tokenizers 0.21.0 which is incompatible.
I tried uninstalling tokenizers 0.21.0 and installing a version within the requested range (0.20.3). However, I then get the following error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
transformers 4.49.0 requires tokenizers<0.22,>=0.21, but you have tokenizers 0.20.3 which is incompatible.
cached-path 1.6.7 requires huggingface-hub<0.28.0,>=0.8.1, but you have huggingface-hub 0.29.1 which is incompatible.
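To see which installed packages are responsible for pins like the ones in these messages, one option is to read the requirement metadata of everything in the environment. The following is a minimal sketch using only the standard library; the helper names `normalize` and `requirers_of` are made up for illustration and are not part of pip or olmOCR.

```python
import re
from importlib.metadata import distributions


def normalize(name: str) -> str:
    """Normalize a project name (lowercase, runs of -, _ and . become one -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirers_of(target: str) -> list[tuple[str, str]]:
    """Return (installed package, requirement string) pairs that mention target.

    Hypothetical helper: it only inspects declared metadata and does not
    resolve anything.
    """
    wanted = normalize(target)
    found = []
    for dist in distributions():
        owner = dist.metadata.get("Name") or "<unknown>"
        for req in dist.requires or []:
            # A requirement string starts with the project name,
            # e.g. "tokenizers<0.22,>=0.21".
            match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
            if match and normalize(match.group(1)) == wanted:
                found.append((owner, req))
    return sorted(found)


if __name__ == "__main__":
    # List every installed package that declares a requirement on tokenizers.
    for package, requirement in requirers_of("tokenizers"):
        print(f"{package}: {requirement}")
```

Running it inside the affected environment prints one line per package that constrains tokenizers, which shows whether each pin comes from the package being installed or from something that was already present.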
It appears that the tokenizers requirements of chromadb 0.5.23 (<=0.20.3) and transformers 4.49.0 (>=0.21) cannot be satisfied at the same time in this environment. Is there any way to work around this?
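The incompatibility can be checked directly from the two version ranges quoted in the error messages. This is a rough sketch, assuming plain numeric releases only (no pre-releases, wildcards or `~=`); `satisfies` is an illustrative helper, not a pip API.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}

# Ranges copied from the pip error messages.
CHROMADB_SPEC = "<=0.20.3,>=0.13.2"   # chromadb 0.5.23
TRANSFORMERS_SPEC = "<0.22,>=0.21"    # transformers 4.49.0


def parse(version: str) -> tuple[int, ...]:
    """Turn a plain release such as '0.20.3' into (0, 20, 3)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Return True if version meets every comma-separated clause in spec."""
    candidate = parse(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*([0-9.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        bound = parse(match.group(2))
        # Pad with zeros so that 0.21 compares equal to 0.21.0.
        width = max(len(candidate), len(bound))
        left = candidate + (0,) * (width - len(candidate))
        right = bound + (0,) * (width - len(bound))
        if not OPS[match.group(1)](left, right):
            return False
    return True


if __name__ == "__main__":
    for version in ("0.20.3", "0.21.0"):
        print(
            version,
            "chromadb:", satisfies(version, CHROMADB_SPEC),
            "transformers:", satisfies(version, TRANSFORMERS_SPEC),
        )
```

Because the upper bound of the first range (0.20.3) lies below the lower bound of the second (0.21), no tokenizers release can pass both checks, so the two packages cannot share one environment at these versions.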
Versions
Python 3.11.11 ace_tools==0.0 aiohappyeyeballs==2.4.6 aiohttp==3.11.13 aiosignal==1.3.2 alembic==1.14.1 altair==5.5.0 annotated-types==0.7.0 anyio==4.8.0 asgiref==3.8.1 attrs==25.1.0 babel==2.17.0 backoff==2.2.1 bcrypt==4.2.1 beaker-py==1.34.1 beautifulsoup4==4.12.3 bibtexparser==2.0.0b8 bleach==6.2.0 blinker==1.9.0 boto3==1.37.4 botocore==1.37.4 build==1.2.2.post1 cached_path==1.6.7 cachetools==5.5.2 certifi==2025.1.31 cffi==1.17.1 charset-normalizer==3.4.1 chroma-hnswlib==0.7.6 chromadb==0.5.23 click==8.1.8 clldutils==3.24.1 cohere==5.13.12 colorama==0.4.6 coloredlogs==15.0.1 colorlog==6.9.0 cryptography==44.0.1 csvw==3.5.1 dataclasses-json==0.6.7 Deprecated==1.2.18 distro==1.9.0 dlinfo==2.0.0 docker==7.1.0 docstring_parser==0.16 durationpy==0.9 embedchain==0.1.127 fastapi==0.115.8 fastavro==1.10.0 filelock==3.17.0 flatbuffers==25.2.10 frozendict==2.4.6 frozenlist==1.5.0 fsspec==2025.2.0 ftfy==6.3.1 gitdb==4.0.12 GitPython==3.1.44 google-api-core==2.24.1 google-auth==2.38.0 google-cloud-aiplatform==1.82.0 google-cloud-bigquery==3.29.0 google-cloud-core==2.4.2 google-cloud-resource-manager==1.14.1 google-cloud-storage==2.19.0 google-crc32c==1.6.0 google-resumable-media==2.7.2 googleapis-common-protos==1.68.0 gptcache==0.1.44 grpc-google-iam-v1==0.14.0 grpcio==1.71.0rc2 grpcio-status==1.71.0rc2 grpcio-tools==1.70.0 h11==0.14.0 h2==4.2.0 hpack==4.1.0 html5lib==1.1 httpcore==1.0.7 httptools==0.6.4 httpx==0.28.1 httpx-sse==0.4.0 huggingface-hub==0.27.1 humanfriendly==10.0 hyperframe==6.1.0 idna==3.10 importlib_metadata==8.5.0 importlib_resources==6.5.2 inflect==7.5.0 isodate==0.7.2 Jinja2==3.1.5 jiter==0.8.2 jmespath==1.0.1 joblib==1.4.2 jsonpatch==1.33 jsonpointer==3.0.0 jsonschema==4.23.0 jsonschema-specifications==2024.10.1 kanjize==1.6.0 kubernetes==32.0.1 langchain==0.3.19 langchain-cohere==0.3.5 langchain-community==0.3.18 langchain-core==0.3.40 langchain-experimental==0.3.4 langchain-openai==0.2.14 langchain-text-splitters==0.3.6 langsmith==0.1.147 
language-tags==1.2.0 lingua-language-detector==2.0.2 lxml==5.3.0 Mako==1.3.9 Markdown==3.7 markdown-it-py==3.0.0 markdown2==2.5.3 MarkupSafe==3.0.2 marshmallow==3.26.1 mdurl==0.1.2 mem0ai==0.1.56 mmh3==5.1.0 monotonic==1.6 more-itertools==10.6.0 mpmath==1.3.0 multidict==6.1.0 multitasking==0.0.11 mypy-extensions==1.0.0 narwhals==1.28.0 networkx==3.4.2 numpy==2.1.3 oauthlib==3.2.2 ollama==0.4.7 -e git+https://github.com/allenai/olmocr.git@701abdb95525dbbfe75c2fc288df90bbea080043#egg=olmocr onnxruntime==1.20.1 openai==1.65.2 opentelemetry-api==1.30.0 opentelemetry-exporter-otlp-proto-common==1.30.0 opentelemetry-exporter-otlp-proto-grpc==1.30.0 opentelemetry-instrumentation==0.51b0 opentelemetry-instrumentation-asgi==0.51b0 opentelemetry-instrumentation-fastapi==0.51b0 opentelemetry-proto==1.30.0 opentelemetry-sdk==1.30.0 opentelemetry-semantic-conventions==0.51b0 opentelemetry-util-http==0.51b0 orjson==3.10.15 overrides==7.7.0 packaging==24.2 pandas==2.2.3 peewee==3.17.8 phonemizer==3.3.0 pillow==11.1.0 platformdirs==4.3.6 portalocker==2.10.1 posthog==3.16.0 propcache==0.3.0 proto-plus==1.26.0 protobuf==5.29.3 pyarrow==19.0.1 pyasn1==0.6.1 pyasn1_modules==0.4.1 pycparser==2.22 pydantic==2.10.6 pydantic-settings==2.8.1 pydantic_core==2.27.2 pydeck==0.9.1 Pygments==2.19.1 pylatexenc==2.10 pyparsing==3.2.1 pypdf==5.3.0 PyPDF2==3.0.1 pypdfium2==4.30.1 PyPika==0.48.9 pyproject_hooks==1.2.0 pysbd==0.3.4 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-pptx==1.0.2 pytz==2024.2 PyYAML==6.0.2 qdrant-client==1.13.2 rdflib==7.1.3 referencing==0.36.2 regex==2024.11.6 requests==2.32.3 requests-oauthlib==2.0.0 requests-toolbelt==1.0.0 rfc3986==1.5.0 rich==13.9.4 rpds-py==0.22.3 rsa==4.9 s3transfer==0.11.3 safetensors==0.5.2 schema==0.7.7 segments==2.2.1 setuptools==75.8.0 shapely==2.0.7 shellingham==1.5.4 six==1.16.0 smart-open==7.1.0 smmap==5.0.2 sniffio==1.3.1 soupsieve==2.6 SQLAlchemy==2.0.38 starlette==0.45.3 streamlit==1.42.2 streamlit-chat==0.1.1 
SudachiDict-full==20250129 SudachiPy==0.6.10 sympy==1.13.1 tabulate==0.9.0 tenacity==9.0.0 tiktoken==0.7.0 tokenizers==0.21.0 toml==0.10.2 torch==2.6.0 torchaudio==2.6.0 torchvision==0.21.0 tornado==6.4.2 tqdm==4.67.1 transformers==4.49.0 typeguard==4.4.1 typer==0.15.1 types-requests==2.32.0.20241016 typing-inspect==0.9.0 typing_extensions==4.12.2 tzdata==2024.2 uritemplate==4.1.1 urllib3==2.3.0 uvicorn==0.34.0 uvloop==0.21.0 watchfiles==1.0.4 wcwidth==0.2.13 webencodings==0.5.1 websocket-client==1.8.0 websockets==15.0 wrapt==1.17.2 XlsxWriter==3.2.1 yarl==1.18.3 yfinance==0.2.50 yt-dlp==2025.1.26 zipp==3.21.0 zstandard==0.23.0