wikihow does not retry API requests

Open benoit74 opened this issue 1 year ago • 3 comments

The last wikihow_en_endless task failed early on, while listing the articles in each category.

The error returned is a 503, which is probably a transient error.

Task: https://farm.openzim.org/pipeline/bc83a4cb-341e-43f6-b1e4-e17b2324b5f0/debug

Logs:

[MainThread::2024-02-22 22:07:34,231] DEBUG:-> article: Diagnose-Auditory-Processing-Disorder
[MainThread::2024-02-22 22:07:34,231] DEBUG:-> article: Cover-Your-Ear-in-the-Shower
[MainThread::2024-02-22 22:07:50,346] ERROR:Interrupting process due to error: Call failed: {"status_code": 503, "text_body": ""}
[MainThread::2024-02-22 22:07:50,346] ERROR:Call failed: {"status_code": 503, "text_body": ""}

We should probably retry API calls not only on ConnectionError (the current pywikiapi behavior) but also on errors that look transient. Alternatively, we could retry on all errors except 404 and a few others, because identifying with certainty which errors are transient may be too complex.

We might also consider adding this retry logic to web scraping calls (they are not retried either).
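
To make the proposal concrete, here is a rough sketch of the "retry everything except 404 and a few others" option. It does not use pywikiapi's actual API: `ApiCallError`, `call_with_retry`, `NON_RETRYABLE_STATUS` and the default attempt count and delays are all hypothetical names and values chosen for illustration, and the exact list of non-retryable status codes is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: status codes treated as permanent failures and never retried.
NON_RETRYABLE_STATUS = frozenset({400, 401, 403, 404})


class ApiCallError(Exception):
    """Hypothetical stand-in for the error raised when an API call fails."""

    def __init__(self, status_code: int, text_body: str = "") -> None:
        super().__init__(f"Call failed: status_code={status_code}")
        self.status_code = status_code
        self.text_body = text_body


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying with exponential backoff on likely-transient errors."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except ConnectionError:
            # Already retried today; give up only once attempts are exhausted.
            if attempt == attempts:
                raise
        except ApiCallError as exc:
            # Permanent errors (e.g. 404) are re-raised immediately.
            if exc.status_code in NON_RETRYABLE_STATUS or attempt == attempts:
                raise
        # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

With something like this, the 503 from the logs above would be retried a few times before the process is interrupted, while a 404 would still fail on the first attempt. The same wrapper could in principle be reused around other calls.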

benoit74 avatar Feb 23 '24 09:02 benoit74

You might want to check https://github.com/openzim/wikihow/issues?q=is%3Aissue+503

rgaudin avatar Feb 23 '24 14:02 rgaudin

My bad. Then I think only API requests are not retried. Thank you!

benoit74 avatar Feb 23 '24 15:02 benoit74

This continues to impact MANY recipes (e.g. the last runs of wikihow_ru_maxi, wikihow_pt_maxi, wikihow_nl_maxi).

Note that this issue might be made irrelevant (more or less) if we decide to switch from pywikiapi to another library as suggested in https://github.com/openzim/wikihow/issues/162

benoit74 avatar May 17 '24 08:05 benoit74