python-scraperlib
Collection of Python code to re-use across Python-based scrapers
zimscraperlib
Usage
- This library is meant to be installed via PyPI (zimscraperlib).
- Make sure to reference it with a version constraint, as the API is subject to frequent changes.
- The API should remain the same only within the same minor version.
Example usage:
zimscraperlib>=1.1,<1.2
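To make the constraint above concrete, here is a minimal Python sketch of which releases a `>=1.1,<1.2` pin accepts. The helper `in_pinned_range` is hypothetical and not part of zimscraperlib. It compares only the major and minor numbers and ignores pre-release suffixes, which real installers handle according to PEP 440.

```python
import re


def in_pinned_range(version, lower=(1, 1), upper=(1, 2)):
    """Return True if lower <= (major, minor) < upper for `version`.

    Hypothetical helper for illustration only: it reads the leading
    "major.minor" digits and ignores patch numbers and pre-release tags.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return lower <= major_minor < upper


if __name__ == "__main__":
    # Patch releases of 1.1 are accepted; 1.0.x, 1.2.x and 2.x are not.
    for candidate in ("1.0.4", "1.1.0", "1.1.2", "1.2.0", "2.0.0"):
        print(candidate, in_pinned_range(candidate))
```

Pinning this way lets a scraper pick up patch releases automatically while requiring a deliberate change to move to the next minor version.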
Dependencies
- libmagic
- wget
- libzim (auto-installed, not available on Windows)
- Pillow
- FFmpeg
- gifsicle (>=1.92)
macOS
brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle
Linux
sudo apt install libmagic1 wget ffmpeg \
libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle
Alpine
apk add ffmpeg gifsicle libmagic wget libjpeg
Note: i18n features do not work on Alpine (see https://github.com/openzim/python-scraperlib/issues/134); one corresponding test fails as a result.
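After installing the system packages, one way to confirm that the command-line tools are reachable is a quick check from Python. This is a sketch, not something the library provides: `missing_tools` is a hypothetical helper built on the standard library's `shutil.which`, and it assumes the executables are named `wget`, `ffmpeg` and `gifsicle`. It does not verify the libmagic shared library or the gifsicle version (>=1.92).

```python
import shutil


def missing_tools(tools=("wget", "ffmpeg", "gifsicle")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; it only checks that an
    executable with that name exists, not its version.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing command-line tools:", ", ".join(missing))
    else:
        print("All command-line tools found on PATH.")
```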
Contribution
This project adheres to openZIM's Contribution Guidelines
pip install hatch
pip install ".[dev]"
pre-commit install
# For tests
invoke coverage
Users
Non-exhaustive list of scrapers using it (check status when updating API):