tokenizers
Python 3.13 support
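tokenizers 0.20.0 cannot be built or installed with Python 3.13 RC: pip falls back to building from source, and the Rust compilation of the Python bindings fails.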
Dockerfile:
FROM python:3.13-rc-bookworm
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH=/root/.cargo/bin:$PATH
RUN pip install tokenizers==0.20.0
Output:
28.56 Compiling tokenizers v0.20.0 (/tmp/pip-install-rtrxn7wj/tokenizers_502d52710ca54c4ea47f73913fe50a86/tokenizers)
28.56 Compiling numpy v0.21.0
28.56 Compiling tokenizers-python v0.20.0 (/tmp/pip-install-rtrxn7wj/tokenizers_502d52710ca54c4ea47f73913fe50a86/bindings/python)
28.56 error[E0425]: cannot find function, tuple struct or tuple variant `PyUnicode_FromKindAndData` in module `pyo3::ffi`
28.56 --> src/tokenizer.rs:326:46
28.56 |
28.56 326 | let unicode = pyo3::ffi::PyUnicode_FromKindAndData(
28.56 | ^^^^^^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `PyUnicode_FromOrdinal`
28.56 |
28.56 ::: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyo3-ffi-0.21.2/src/unicodeobject.rs:109:5
28.56 |
28.56 109 | pub fn PyUnicode_FromOrdinal(ordinal: c_int) -> *mut PyObject;
28.56 | ------------------------------------------------------------- similarly named function `PyUnicode_FromOrdinal` defined here
28.56
28.56 error[E0425]: cannot find value `PyUnicode_4BYTE_KIND` in module `pyo3::ffi`
28.56 --> src/tokenizer.rs:327:36
28.56 |
28.56 327 | pyo3::ffi::PyUnicode_4BYTE_KIND as _,
28.56 | ^^^^^^^^^^^^^^^^^^^^ not found in `pyo3::ffi`
28.56
28.56 For more information about this error, try `rustc --explain E0425`.
28.56 error: could not compile `tokenizers-python` (lib) due to 2 previous errors
28.56 💥 maturin failed
28.56 Caused by: Failed to build a native library through cargo
28.56 Caused by: Cargo build finished with "exit status: 101": `env -u CARGO PYO3_ENVIRONMENT_SIGNATURE="cpython-3.13-64bit" PYO3_PYTHON="/usr/local/bin/python3.13" PYTHON_SYS_EXECUTABLE="/usr/local/bin/python3.13" "cargo" "rustc" "--features" "pyo3/extension-module" "--message-format" "json-render-diagnostics" "--manifest-path" "/tmp/pip-install-rtrxn7wj/tokenizers_502d52710ca54c4ea47f73913fe50a86/bindings/python/Cargo.toml" "--release" "--lib"`
28.56 Error: command ['maturin', 'pep517', 'build-wheel', '-i', '/usr/local/bin/python3.13', '--compatibility', 'off'] returned non-zero exit status 1
28.56 [end of output]
28.56
28.56 note: This error originates from a subprocess, and is likely not a problem with pip.
28.56 ERROR: Failed building wheel for tokenizers
28.56 Failed to build tokenizers
28.67 ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (tokenizers)
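
When reproducing this kind of failure, it may help to confirm which interpreter the build actually ran against and whether it is a pre-release, since pre-release interpreters may not have prebuilt wheels available and pip then builds from source, as in the log above. The following is a minimal sketch using only the standard library; the helper names `describe_interpreter` and `is_prerelease` are hypothetical and not part of tokenizers or pip.

```python
import sys


def describe_interpreter(info=sys.version_info) -> str:
    """Format a version_info-like tuple as 'X.Y.Z (releaselevel[ serial])'."""
    version = f"{info.major}.{info.minor}.{info.micro}"
    if info.releaselevel == "final":
        return f"{version} (final)"
    # releaselevel is one of 'alpha', 'beta', 'candidate' for pre-releases
    return f"{version} ({info.releaselevel} {info.serial})"


def is_prerelease(info=sys.version_info) -> bool:
    """Return True when the interpreter is an alpha, beta, or release candidate."""
    return info.releaselevel != "final"


if __name__ == "__main__":
    print("Interpreter:", describe_interpreter())
    if is_prerelease():
        # Assumption: a source build is likely, so a Rust toolchain is needed.
        print("Pre-release interpreter: expect pip to build from source.")
```

Running this inside the same image as the Dockerfile above (for example as an extra `RUN python script.py` step) would record the exact interpreter version alongside the build log.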