Tip for Termux Users: Workaround for `tokenizers` Installation Issue
Hello,
This is a tip for anyone trying to install docling on Termux (Android) and running into installation failures. The latest versions may fail to install due to a dependency conflict with the tokenizers library.
The pre-compiled python-tokenizers package from the tur-packages repository seems to be incompatible. The solution is to remove the pre-built version and compile tokenizers from source with specific flags.
Here is the step-by-step workaround:
Step 1: Uninstall the Conflicting Package
First, remove the python-tokenizers package provided by the tur repository to avoid conflicts.
apt remove python-tokenizers
Step 2: Install Build Dependencies
You need to install the necessary tools to compile a Rust-based Python package from source.
pkg install rust clang make maturin
Step 3: Compile and Install tokenizers from Source
Use the following command to compile and install the tokenizers library. The CXXFLAGS are crucial for the compilation to succeed on Termux.
CXXFLAGS="-lpthread -D__ANDROID_API__=21" pip install tokenizers
Step 4: Install docling
Once the tokenizers library is successfully installed, you can proceed to install docling.
pip install docling
After following these steps, docling should be installed and working correctly on Termux.
Hope this helps other users on this platform!
Nota bene: adjust this number: -D__ANDROID_API__=21 to your Android API version.