Add a multithreaded tokenizer test and 3.13t CI

Open ngoldbaum opened this issue 3 months ago • 3 comments

This helps support the free-threaded build. Explicitly multithreaded tests like this one exercise the behavior differences between the free-threaded build and the GIL-enabled build. See https://github.com/huggingface/tokenizers/pull/1809/ where I initially tried this with a different testing approach.

The content of the test is adapted from the async test_concurrency test.

I tried to make this use multiprocessing as well but ran into deadlocks, so I decided to stick with 4 Python threads in a thread pool.
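For readers unfamiliar with the pattern, here is a minimal sketch of a thread-pool test with 4 workers. It is not the test added in this PR: `encode` is a hypothetical pure-Python stand-in for a tokenizer call, and `run_concurrently` and the barrier-based start are illustrative assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NUM_THREADS = 4  # same thread count as described above


def encode(text):
    # Hypothetical stand-in for a tokenizer call; NOT the tokenizers API.
    return [len(word) for word in text.split()]


def run_concurrently(func, inputs, num_threads=NUM_THREADS):
    # The barrier holds every worker until all of them are ready, so the
    # calls overlap in time and a data race is more likely to surface.
    barrier = threading.Barrier(num_threads)

    def worker(_):
        barrier.wait()
        return [func(item) for item in inputs]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(worker, range(num_threads)))


inputs = ["hello world", "free threaded build", "a bc def"] * 100
expected = [encode(item) for item in inputs]  # single-threaded reference
results = run_concurrently(encode, inputs)

# Every thread must produce the same output as the single-threaded run.
assert all(result == expected for result in results)
```

Comparing each thread's output against a single-threaded reference keeps the check deterministic even though the thread scheduling is not.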

cf. https://github.com/huggingface/safetensors/pull/637 where I added a multithreaded test to safetensors.

ngoldbaum, Sep 12 '25 16:09

Perhaps add Py3.14t?

  • https://ft-checker.com/?page=13

cclauss, Oct 11 '25 04:10

@cclauss nope, not until hf-xet updates its PyO3 dependency and/or releases cp314t wheels:

       error: The configured Python interpreter version (3.14) is newer than PyO3's maximum supported version (3.13)
        = help: please check if an updated version of PyO3 is available. Current version: 0.23.5
        = help: The free-threaded build of CPython does not support the limited API so this check cannot be suppressed.
      warning: build failed, waiting for other jobs to finish...
      💥 maturin failed
        Caused by: Failed to build a native library through cargo
        Caused by: Cargo build finished with "exit status: 101": `env -u CARGO MACOSX_DEPLOYMENT_TARGET="11.0" PYO3_BUILD_EXTENSION_MODULE="1" PYO3_ENVIRONMENT_SIGNATURE="cpython-3.14-64bit" PYO3_PYTHON="/Users/goldbaum/.pyenv/versions/3.14.0t/bin/python3.14" PYTHON_SYS_EXECUTABLE="/Users/goldbaum/.pyenv/versions/3.14.0t/bin/python3.14" "cargo" "rustc" "--features" "pyo3/extension-module" "--message-format" "json-render-diagnostics" "--manifest-path" "/private/var/folders/nk/yds4mlh97kg9qdq745g715rw0000gn/T/pip-install-j1_za471/hf-xet_bf48338361074576a46118b7b50a4e99/hf_xet/Cargo.toml" "--release" "--lib" "--" "-C" "link-arg=-undefined" "-C" "link-arg=dynamic_lookup" "-C" "link-args=-Wl,-install_name,@rpath/hf_xet.abi3.so"`

ngoldbaum, Oct 17 '25 17:10

Went ahead and rebased though. @Narsil any chance you can take a look over here?

ngoldbaum, Oct 17 '25 17:10