WikiHow scraper

wikiHow

wikihow2zim is an OpenZIM scraper that creates offline versions of the wikiHow websites, in every language wikiHow supports.

⚡ The scraper has a known, significant issue related to throttling (https://github.com/openzim/wikihow/issues/150).

License: GPL v3

Usage

wikihow2zim scrapes one language version of wikiHow per run, which you must specify with the --language argument. The list of supported languages is shown in the --help message.
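
Because each run covers a single language, building several language versions means invoking the tool once per language. The following is a minimal Python sketch of that loop, not part of wikihow2zim itself. It assumes wikihow2zim is already installed and on your PATH, and it passes only the --language argument; the helper names build_command and run_all are hypothetical, and other options listed in --help may be needed for a real run.

```python
import shutil
import subprocess
from typing import List


def build_command(language: str) -> List[str]:
    """Return the argv for one wikihow2zim run (only --language is set)."""
    code = language.strip()
    if not code:
        raise ValueError("language must be a non-empty code")
    return ["wikihow2zim", "--language", code]


def run_all(languages: List[str]) -> None:
    """Run the scraper once per language, stopping at the first failure."""
    # Fail early with a clear message if the executable is missing.
    if shutil.which("wikihow2zim") is None:
        raise RuntimeError("wikihow2zim was not found on PATH")
    for language in languages:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(language), check=True)
```

Calling run_all with a list of language codes taken from the --help output would then run the scraper for each of them in turn.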

Docker

docker run -v my_dir:/output ghcr.io/openzim/wikihow wikihow2zim --help

Python

wikihow2zim is Python 3 (3.6+) software. If you are not using the Docker image, you are advised to run it in a virtual environment to avoid installing its dependencies system-wide.

python3 -m venv env
source env/bin/activate

# using published version
pip3 install wikihow2zim
wikihow2zim --help

# running from source
python wikihow2zim/ --help

Call deactivate to quit the virtual environment.

See requirements.txt for the list of Python dependencies.

Contributing

All contributions are welcome!

Please open an issue on GitHub and/or submit a pull request.

Guidelines

  • Don't take assigned issues. Comment on them if they go stale.
  • If your contribution is far from trivial, open an issue to discuss it first.
  • Ensure your code passes black formatting, isort and flake8 (88-character line length).

We have a pre-commit hook ready for you. Install it with pip install pre-commit && pre-commit install
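
If you prefer to run the checks by hand, the sketch below shows one way to invoke the three tools in check-only mode from Python. It is an illustration rather than the project's own tooling: it assumes black, isort and flake8 are installed in the active environment, the helper names lint_commands and run_checks are hypothetical, and the repository's configuration may set options beyond the 88-character limit used here.

```python
import subprocess
import sys
from typing import List


def lint_commands(path: str = ".") -> List[List[str]]:
    """Return check-only commands for black, isort and flake8."""
    # Running each tool with "python -m" uses the active environment's copy.
    return [
        [sys.executable, "-m", "black", "--check", path],
        [sys.executable, "-m", "isort", "--check-only", path],
        [sys.executable, "-m", "flake8", "--max-line-length", "88", path],
    ]


def run_checks(path: str = ".") -> bool:
    """Run every check and return True only if all of them exit with 0."""
    # The list is built in full so every tool reports, even after a failure.
    results = [subprocess.run(cmd).returncode == 0 for cmd in lint_commands(path)]
    return all(results)
```

run_checks() returns False as soon as any tool reports a problem in its output, which makes it usable as a quick gate before committing.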