python-scraperlib

Collection of Python code to re-use across Python-based scrapers

Results 52 python-scraperlib issues

On Slack, asking about this got this response from rgaudin: ``` it’s testing locales and you don’t have them on the system testing it. Open a ticket but it will...
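A test that depends on an installed locale can first check whether the system actually provides it. The following is a minimal sketch of such a check using only the standard `locale` module; the helper name `locale_available` is hypothetical and is not part of python-scraperlib.

```python
import locale


def locale_available(name: str) -> bool:
    """Return True if `name` can be activated for LC_ALL on this system.

    Hypothetical helper for illustration; not part of python-scraperlib.
    """
    # Calling setlocale without a second argument only queries the current setting.
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        # The C library rejected the name: the locale is not usable here.
        return False
    finally:
        # Always restore the setting that was active before the probe.
        locale.setlocale(locale.LC_ALL, saved)
    return True
```

A locale-dependent test could then be skipped when the check fails, for example with `pytest.mark.skipif(not locale_available("fr_FR.UTF-8"), reason="locale not installed")`. Note that some C libraries accept almost any locale name, so a `True` result does not guarantee that locale data is really present.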

`libjpeg8-dev` referenced in the [README.md](https://github.com/openzim/python-scraperlib/blob/7d498319baadba715316c15cf9857ff2f6974a00/README.md?plain=1#L44) is obsolete, and presumably should be replaced. ```bash sudo apt install libmagic1 wget ffmpeg \ libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \ libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev...

question

`i18n.get_language_details()` returns a `dict` which is not optimal in typed environments

enhancement
good first issue
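One possible direction, sketched here under assumptions, is to wrap the returned `dict` in a small typed object so that type checkers can verify attribute access. The class name `LanguageDetails`, its fields, and the sample keys below are illustrative assumptions, not the library's confirmed return shape.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class LanguageDetails:
    """Hypothetical typed view over a language-details mapping.

    Field names are assumptions made for this sketch.
    """

    code: str
    english: str
    native: Optional[str] = None

    @classmethod
    def from_mapping(cls, data: Mapping[str, Any]) -> "LanguageDetails":
        # Required keys raise KeyError if missing; the optional one falls back to None.
        return cls(
            code=str(data["code"]),
            english=str(data["english"]),
            native=data.get("native"),
        )


# Sample input standing in for the dict a lookup might return (assumed keys).
sample = {"code": "fr", "english": "French", "native": "français"}
details = LanguageDetails.from_mapping(sample)
```

With such a wrapper, a typo like `details.englsh` is flagged by a type checker, whereas `result["englsh"]` on a plain `dict` only fails at runtime.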

Just like we have `add_redirect` in `zim/creator` and `add_redirects_to_zim` in `zim/filesystem`, we should now add support for the "new" ZIM alias with `add_alias` in `zim/creator` and `add_aliases_to_zim` in `zim/filesystem`

enhancement

This issue serves as a checklist for the release event.
- [x] Ensure the CI is green on git `main`
- [x] Check that dependency ranges are OK, upgrade if...

task

https://github.com/openzim/python-scraperlib/pull/128 has shown that there are some code QA / typing issues that have to be fixed but will ultimately lead to a breaking change in terms of API, hence...

enhancement

I am building an updated package for v3.3.0 and following up on all your kind feedback about failing tests. The following tests are now failing: ``` =================================== FAILURES =================================== __________________________ test_selocale_unsupported...

bug

This issue serves as a checklist for the release event.
- [ ] Ensure the CI is green on git `main`
- [ ] Check that dependency ranges are OK,...

task

At this stage, it seems necessary to publish user documentation. https://readthedocs.com seems a good candidate.

documentation

Currently, for i18n we rely directly on the system locale. We should depend on and use external locale data, as [Babel](https://github.com/python-babel/babel) does. This would provide real support for...

enhancement