wikihow does not retry API requests
The last wikihow_en_endless task failed right at the beginning, while listing the articles in each category.
The error returned is a 503, which is probably a transient error.
Task: https://farm.openzim.org/pipeline/bc83a4cb-341e-43f6-b1e4-e17b2324b5f0/debug
Logs:
[MainThread::2024-02-22 22:07:34,231] DEBUG:-> article: Diagnose-Auditory-Processing-Disorder
[MainThread::2024-02-22 22:07:34,231] DEBUG:-> article: Cover-Your-Ear-in-the-Shower
[MainThread::2024-02-22 22:07:50,346] ERROR:Interrupting process due to error: Call failed: {"status_code": 503, "text_body": ""}
[MainThread::2024-02-22 22:07:50,346] ERROR:Call failed: {"status_code": 503, "text_body": ""}
We should probably retry API calls not only on ConnectionError (the current pywikiapi behavior) but also on anything that looks like a transient error. Since it may be too complex to identify transient errors with certainty, it might be simpler to retry on all errors except 404 and a few others.
We might also consider adding this retry logic to web scraping calls (they are not retried either).
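To make the proposal concrete, here is a minimal sketch of what "retry on everything except a few known-permanent errors" could look like. It does not use pywikiapi's real API: `ApiCallError`, `NON_RETRYABLE_STATUSES`, `call_with_retries` and the default values are all hypothetical names and choices made up for illustration, and the list of non-retryable statuses is only a guess.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: statuses considered permanent, so retrying would not help.
NON_RETRYABLE_STATUSES = frozenset({400, 401, 403, 404})


class ApiCallError(Exception):
    """Hypothetical stand-in for the error raised by the API client."""

    def __init__(self, status_code: int, text_body: str = ""):
        super().__init__(f"Call failed: status_code={status_code}")
        self.status_code = status_code
        self.text_body = text_body


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying unless the error is known to be permanent."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ConnectionError:
            # Already retried today; kept so both cases share one policy.
            if attempt == max_attempts:
                raise
        except ApiCallError as exc:
            # Give up on permanent errors or when attempts are exhausted.
            if exc.status_code in NON_RETRYABLE_STATUSES or attempt == max_attempts:
                raise
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("loop always returns or raises")
```

A call site would wrap the request in a closure, e.g. `call_with_retries(lambda: fetch_category(name))`, where `fetch_category` is again a hypothetical placeholder for whatever function performs the API request. The same wrapper could be reused for web scraping calls if we decide to retry those as well.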
You might want to check https://github.com/openzim/wikihow/issues?q=is%3Aissue+503
My bad. In that case, I think only API requests are not retried. Thank you!
This continues to impact MANY recipes (e.g. the last runs of wikihow_ru_maxi, wikihow_pt_maxi, wikihow_nl_maxi).
Note that this issue might be made irrelevant (more or less) if we decide to switch from pywikiapi to another library as suggested in https://github.com/openzim/wikihow/issues/162