
Error creating Vocabulary from pretrained with erwanf/gpt2-mini

Open RobinPicard opened this issue 4 months ago • 0 comments

Describe the issue as clearly as possible:

I get an error when trying to create a `Vocabulary` from the pretrained model "erwanf/gpt2-mini".

Steps/code to reproduce the bug:

from outlines_core import Vocabulary

vocabulary = Vocabulary.from_pretrained("erwanf/gpt2-mini")

Expected result:

A `Vocabulary` instance

Error message:

Traceback (most recent call last):
  File "/Users/robin/outlines/.idea/test.py", line 18, in <module>
    vocabulary = Vocabulary.from_pretrained("erwanf/gpt2-mini")
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: request error: https://huggingface.co/erwanf/gpt2-mini/resolve/main/tokenizer.json: status code 404
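
The 404 is for `tokenizer.json` at the root of the repository on `main`, so a first step is to check which tokenizer files the repository actually ships. Below is a minimal sketch, assuming `huggingface_hub` is installed and the Hub is reachable. The helper names (`resolve_url`, `tokenizer_files`, `has_required_file`, `check_repo`) are hypothetical and not part of `outlines_core`; only `huggingface_hub.list_repo_files` is an existing library call, and the set of "known" tokenizer filenames is my own assumption.

```python
from typing import Iterable, List

# File requested by `Vocabulary.from_pretrained`, according to the URL in the error.
REQUIRED_FILE = "tokenizer.json"

# Assumption: common tokenizer-related filenames on the Hub (not an exhaustive list).
KNOWN_TOKENIZER_FILES = {
    "tokenizer.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
}


def resolve_url(repo_id: str, filename: str = REQUIRED_FILE, revision: str = "main") -> str:
    """Build a Hub URL in the same form as the one shown in the error message."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def tokenizer_files(files: Iterable[str]) -> List[str]:
    """Keep only the tokenizer-related files from a repository listing, sorted."""
    return sorted(f for f in files if f in KNOWN_TOKENIZER_FILES)


def has_required_file(files: Iterable[str]) -> bool:
    """Return True if the listing contains the file that triggered the 404."""
    return REQUIRED_FILE in set(files)


def check_repo(repo_id: str) -> bool:
    """List the repository files on the Hub and report the tokenizer ones.

    Needs network access; the import is lazy so the helpers above
    stay usable without `huggingface_hub` installed.
    """
    from huggingface_hub import list_repo_files

    files = list_repo_files(repo_id)
    print(f"{repo_id}: tokenizer files = {tokenizer_files(files)}")
    print(f"expected at: {resolve_url(repo_id)}")
    return has_required_file(files)
```

Calling `check_repo("erwanf/gpt2-mini")` returns `False` if the repository has no `tokenizer.json`, which would be consistent with the 404 above.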

Outlines/Python version information:

0.2.11 Python 3.11.12 (main, Apr 8 2025, 14:15:29) [Clang 16.0.0 (clang-1600.0.26.6)]

```
absl-py==2.2.2
accelerate==1.6.0
aiohappyeyeballs==2.6.1
aiohttp==3.11.16
aiosignal==1.3.2
airportsdata==20250224
annotated-types==0.7.0
anthropic==0.49.0
anyio==4.9.0
astor==0.8.1
astunparse==1.6.3
attrs==25.3.0
babel==2.17.0
backoff==2.2.1
backrefs==5.9
beartype==0.15.0
blake3==1.0.4
build==1.2.2.post1
cachetools==5.5.2
cairocffi==1.7.1
CairoSVG==2.8.2
certifi==2025.1.31
cffi==1.17.1
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.4.1
chex==0.1.89
click==8.1.8
cloudpickle==3.1.1
colorama==0.4.6
compressed-tensors==0.9.2
coverage==7.8.0
cryptography==45.0.5
cssselect2==0.8.0
datasets==3.5.0
defusedxml==0.7.1
Deprecated==1.2.18
depyf==0.18.0
diff_cover==9.2.4
dill==0.3.8
diskcache==5.6.3
distlib==0.3.9
distro==1.9.0
dnspython==2.7.0
dottxt==0.1.5
einops==0.8.1
email_validator==2.2.0
etils==1.12.2
fastapi==0.115.12
fastapi-cli==0.0.7
filelock==3.18.0
flatbuffers==25.2.10
flax==0.10.5
frozenlist==1.5.0
fsspec==2024.12.0
gast==0.6.0
genson==1.3.0
gguf==0.10.0
ghp-import==2.1.0
gitdb==4.0.12
GitPython==3.1.44
google-ai-generativelanguage==0.6.15
google-api-core==2.24.2
google-api-python-client==2.167.0
google-auth==2.39.0
google-auth-httplib2==0.2.0
google-generativeai==0.8.4
google-pasta==0.2.0
googleapis-common-protos==1.70.0
griffe==1.7.3
grpcio==1.71.0
grpcio-status==1.71.0
h11==0.14.0
h5py==3.13.0
hf-xet==1.0.3
httpcore==1.0.8
httplib2==0.22.0
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.30.2
humanize==4.12.2
identify==2.6.9
idna==3.10
importlib_metadata==8.6.1
importlib_resources==6.5.2
iniconfig==2.1.0
interegular==0.3.3
iso3166==2.1.1
jax==0.5.3
jaxlib==0.5.3
Jinja2==3.1.6
jiter==0.9.0
jsonpath-ng==1.7.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
keras==3.9.2
lark==1.2.2
libclang==18.1.1
llama_cpp_python==0.3.8
llguidance==1.0.1
llvmlite==0.44.0
lm-format-enforcer==0.10.11
Markdown==3.8
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
mergedeep==1.3.4
mistral_common==1.5.4
mkdocs==1.6.1
mkdocs-autorefs==1.4.2
mkdocs-gen-files==0.5.0
mkdocs-get-deps==0.2.0
mkdocs-git-committers-plugin==0.2.3
mkdocs-git-revision-date-localized-plugin==1.4.7
mkdocs-literate-nav==0.6.2
mkdocs-material==9.6.15
mkdocs-material-extensions==1.3.1
mkdocs-redirects==1.2.2
mkdocs-section-index==0.3.10
mkdocstrings==0.29.1
mkdocstrings-python==1.16.12
ml_dtypes==0.5.1
mlx==0.24.2
mlx-lm==0.22.5
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.19.0
multidict==6.4.3
multiprocess==0.70.16
namex==0.0.8
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.11.1.4
nodeenv==1.9.1
numba==0.61.2
numpy==2.1.3
ollama==0.4.7
openai==1.74.0
opencv-python-headless==4.11.0.86
opt_einsum==3.4.0
optax==0.2.4
optree==0.15.0
orbax-checkpoint==0.11.12
-e git+ssh://git@github.com/dottxt-ai/outlines.git@48e10d9b7fdebd0b5188fe4fbae9ab61b1945824#egg=outlines
outlines_core==0.2.11
packaging==24.2
paginate==0.5.7
pandas==2.2.3
partial-json-parser==0.2.1.1.post5
pathspec==0.12.1
pillow==10.4.0
platformdirs==4.3.7
pluggy==1.5.0
ply==3.11
pre_commit==4.2.0
prometheus-fastapi-instrumentator==7.1.0
prometheus_client==0.21.1
propcache==0.3.1
proto-plus==1.26.1
protobuf==5.29.4
psutil==7.0.0
py-cpuinfo==9.0.0
pyarrow==19.0.1
pyasn1==0.6.1
pyasn1_modules==0.4.2
pycountry==24.6.1
pycparser==2.22
pydantic==2.11.3
pydantic-settings==2.8.1
pydantic_core==2.33.1
PyGithub==2.6.1
Pygments==2.19.1
PyJWT==2.10.1
pymdown-extensions==10.16
PyNaCl==1.5.0
pyparsing==3.2.3
pyproject_hooks==1.2.0
pytest==8.3.5
pytest-benchmark==5.1.0
pytest-cov==6.1.1
pytest-mock==3.14.0
python-dateutil==2.9.0.post0
python-dotenv==1.1.0
python-json-logger==3.3.0
python-multipart==0.0.20
pytz==2025.2
PyYAML==6.0.2
pyyaml_env_tag==1.1
pyzmq==26.4.0
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
responses==0.25.7
rich==14.0.0
rich-toolkit==0.14.1
rpds-py==0.24.0
rsa==4.9
safetensors==0.5.3
scipy==1.15.2
sentencepiece==0.2.0
shellingham==1.5.4
simplejson==3.20.1
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
starlette==0.46.2
sympy==1.13.1
tensorboard==2.19.0
tensorboard-data-server==0.7.2
tensorflow==2.19.0
tensorflow-io-gcs-filesystem==0.37.1
tensorstore==0.1.73
termcolor==3.0.1
tf_keras==2.19.0
tiktoken==0.9.0
tinycss2==1.4.0
tokenizers==0.21.1
toolz==1.0.0
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
tqdm==4.67.1
transformers==4.51.3
treescope==0.1.9
typer==0.15.2
typing-inspection==0.4.0
typing_extensions==4.13.2
tzdata==2025.2
uritemplate==4.1.1
urllib3==2.4.0
uvicorn==0.34.1
uvloop==0.21.0
virtualenv==20.30.0
vllm==0.8.3
watchdog==6.0.0
watchfiles==1.0.5
webencodings==0.5.1
websockets==15.0.1
Werkzeug==3.1.3
wrapt==1.17.2
xgrammar==0.1.21
xxhash==3.5.0
yarl==1.19.0
zipp==3.21.0
```

Context for the issue:

No response

RobinPicard, Jul 28 '25 08:07