wp1
Task requested on farm.zimit.kiwix.org with too much memory
We have a task on farm.zimit.kiwix.org which requests more memory than mwoffliner workers have available:
I removed the task on the farm.
Is this an issue to fix on the WP1 side, or was this just a test?
Here's the logic that WP1 uses for requesting resources:
https://github.com/openzim/wp1/blob/main/wp1/logic/selection.py#L187
I don't remember how that was determined, but you can see that it's variable.
To me it seems that this selection would have more than 5M lines! We should not offer to do that. This infrastructure was not designed to make very large ZIM files.
At the very least, the requested resources should be capped at what the workers actually have available.
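A minimal sketch of what such a cap could look like, assuming the sizing logic produces a memory figure in bytes. The function name and the 8 GiB worker limit are hypothetical placeholders, not values taken from WP1 or the farm configuration:

```python
# Hypothetical worker limit, for illustration only. The real value would
# have to come from the mwoffliner worker configuration.
ASSUMED_WORKER_MAX_MEMORY_BYTES = 8 * 1024**3


def cap_memory_request(
    requested_bytes: int,
    worker_max_bytes: int = ASSUMED_WORKER_MAX_MEMORY_BYTES,
) -> int:
    """Return the memory to request, never more than a worker can offer."""
    if requested_bytes < 0:
        raise ValueError("requested_bytes must be non-negative")
    # Keep the dynamically computed value when it fits, otherwise clamp it.
    return min(requested_bytes, worker_max_bytes)
```

With this in place, a computed request of 16 GiB would be submitted as 8 GiB instead of producing a task that no worker can pick up. It does not guarantee the scrape succeeds with the capped amount.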
Should we get rid of the dynamic resource sizing attempt and just request max resources every time?
What are reasonable limits for the number of articles we allow for a ZIM? I think the UI would need some updating if we were to establish such limits.
We currently have two tasks "abusing" (the word is a bit strong) our infrastructure:
- https://farm.zimit.kiwix.org/pipeline/6123148d-1d10-4753-9b11-25101fcd606e/debug : 443497 articles
- https://farm.zimit.kiwix.org/pipeline/65334923-5200-40d2-a58d-07c34ab6a0a9/debug : 242146 articles, 534990 files
I think we should put a cap on the number of articles in a selection, aiming at the 2-hour limit we have already set for zimit.kiwix.org (a basic rule of thumb is probably pretty easy to compute). This limit should ideally be set per user, so that we can grant ourselves more freedom and sell more time/articles to those who are ready to pay for it.
Given the difficulties we currently have updating Wikipedia ZIMs, it is probably wise to do this quite quickly: sooner or later someone will have the idea of using WP1 to create their own updated ZIM.
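The article cap with a per-user override suggested above could be sketched roughly as follows. The throughput figure, the user name, and all function names are assumptions for illustration. The real articles-per-second rate would need to be measured from past scrapes:

```python
# The 2-hour budget comes from the existing zimit.kiwix.org limit.
TIME_BUDGET_SECONDS = 2 * 60 * 60

# Placeholder throughput, NOT a measured value.
ASSUMED_ARTICLES_PER_SECOND = 5.0

# Hypothetical per-user overrides (e.g. maintainers or paying users).
USER_ARTICLE_LIMITS = {"trusted-user": 1_000_000}


def default_article_limit(
    budget_seconds: float = TIME_BUDGET_SECONDS,
    articles_per_second: float = ASSUMED_ARTICLES_PER_SECOND,
) -> int:
    """Rule of thumb: articles that fit in the time budget at a given rate."""
    return int(budget_seconds * articles_per_second)


def article_limit_for(user_id: str) -> int:
    """Use the per-user override when one exists, else the default cap."""
    return USER_ARTICLE_LIMITS.get(user_id, default_article_limit())


def selection_allowed(user_id: str, article_count: int) -> bool:
    """True when the selection is within the user's article limit."""
    return article_count <= article_limit_for(user_id)
```

Under these assumed numbers the default cap is 36,000 articles, so the 443497-article task listed above would be rejected for an ordinary user and accepted for the hypothetical `trusted-user`.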
To me this issue is the "sister" of #794 and should be fixed as a priority. The message should invite users to open a request at openzim/zim-requests.
Yes these issues are closely related.
The comment on the other bug (https://github.com/openzim/wp1/issues/794#issuecomment-2701846868) is not quite accurate. It assumes that we need to increase the memory request by ~20% because 1.14 uses 20% more memory. However, the real situation is that WP1 probably never actually processed large selections of the size being requested here. So it's not clear that even if we requested the maximum resources available every time, we would be able to scrape all of the selections that are requested (which is why we also need a solution to this issue).
Defining fair usage of WP1 is even more urgent than fixing this issue.