Unable to create ZIM files
Hello,
After many attempts at the following, they keep failing:
- Other biology
  - Log: https://farm.zimit.kiwix.org/pipeline/2ef6896e-acd0-4a8f-8edf-21d3fd8d4aeb/debug
  - Project: https://wp1.openzim.org/#/selections/wikiproject/6ed5ef65-5829-4fc9-95ed-865aab7a2d37
- Lists
  - Log: https://api.farm.zimit.kiwix.org/v1/tasks/96d8e7b1-c69e-4b99-9e6a-c55b3eecc7ab
  - Project: https://wp1.openzim.org/#/selections/wikiproject/90b881b8-985f-4f11-ac6d-cbf613d394fb
- Military
  - Log: https://api.farm.zimit.kiwix.org/v1/tasks/65334923-5200-40d2-a58d-07c34ab6a0a9
  - Project: https://wp1.openzim.org/#/selections/wikiproject/696e76b7-1284-4e1f-94fd-57906f9260d9
They have all been killed due to out-of-memory errors. It looks like 1.14.2 uses more memory than before under some conditions (which is expected, due to enhancements in the image manipulation steps).
@audiodude I think you should revise the algorithm that allocates memory to the schedule based on the number of articles in the list. In fact, it shouldn't depend much on the number of articles.
Is there a way to omit the resource request? If I'm not going to change the resource request based on the number of articles, I'd rather just request "whatever resources are available" (without hardcoding "12GB" and then forgetting to update it when more memory becomes available on the Zimfarm).
Nope, this is not possible, and I'm not even sure we want to do it.
@audiodude Can you please just increase by 20% the requested memory?
I also have the same problem
@benoit74 @audiodude We really need to fix this bug ASAP.
@benoit74 Will make the memory consumption charts next week.
@project21212 @eof21212 We are working on your issue, but it is taking more time than expected. Could it be that the ZIM you want to create for the Military WikiProject is of general interest (maybe we could build it officially, i.e. outside WP1)?
I wouldn't mind at all
I ran WPEN with the top 1k, 10k, 50k and 100k, using the dev version (1.15.0-dev0).
Memory charts (observed peaks):
- top 1000: peak at 1.57G around 10:08 UTC
- top 10000: peak at 5.21G around 16:56 UTC
- top 50000: peak at 6.62G on 14 April around 13:01 UTC
- top 100000: peak at 6.44G on 13 April around 02:56 UTC
All the memory peaks are kind of weird:
- they happen somewhere during file downloads (the observations below are not 100% precise; timestamps seem to drift by a few seconds/minutes when compared against the end time ...)
- around 98% of the downloads for top 10000
- around 10% of the downloads for top 1000
- for top 50000, at that moment we were retrying 29 failed files (out of 600k)
- for top 100000 we were supposedly maybe even past that point, writing redirects, but I doubt the timestamp is correct
- memory always rises in a kind of peak, but it never decreases back to the normal level afterwards
Digging a bit deeper, we see that even for the top 1000, the sudden increase in memory consumption happens somewhere during file downloads, not at the very beginning or the end.
It looks like we are storing something about downloaded files in memory and never releasing it. Is something happening for one very particular file? Some kind of race condition somewhere causing memory to never be released?
I will dig deeper into the top 1000 by logging every file URL and size, to try to narrow the problem down (there are "only" 27936 of them), and I will check whether the sudden increase in memory consumption is reproducible.
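As an illustration of the kind of per-file logging I have in mind, here is a minimal sketch (assuming psutil is available in the scraper environment; `log_download` is a hypothetical helper to be called wherever downloaded bytes come back, not existing scraper code):

```python
import logging

import psutil  # assumption: psutil is available in the scraper environment

logger = logging.getLogger(__name__)
process = psutil.Process()  # current process


def log_download(url: str, content: bytes) -> None:
    """Log URL, payload size and current RSS after each file download.

    Correlating RSS jumps with specific URLs/sizes should help spot the file
    (or condition) after which memory is never released.
    """
    rss_mib = process.memory_info().rss / 1024**2
    logger.info("downloaded %s (%d bytes), rss=%.1f MiB", url, len(content), rss_mib)
```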
So are you suggesting that we should always request ~8 GB of RAM, presuming that we have the 50k article limit in place?
I'm still puzzled by the situation, don't know yet what to recommend.
Given the latest results, I'm still puzzled.
I propose updating the formula used to compute the required memory to this "magic" formula:
memory = max(2, -10 + 1.75 * ln( articles_count ))
ln (the natural logarithm, base e) seems more adequate than log10 (the base-10 logarithm).
Putting a minimum in place is necessary because things get odd with a low number of articles.
Rounding to the nearest tenth of a GB probably makes sense.
This gives the following values:
| Articles count | Memory |
|---|---|
| 100 | 2G |
| 1000 | 2.1G |
| 5000 | 4.9G |
| 10000 | 6.1G |
| 50000 | 8.9G |
| 100000 | 10.1G |
| 500000 | 13.0G |
| 1000000 | 14.2G |
This is really empirical, but let's see how it works; we can always fine-tune the formula should things get too odd.
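For reference, a minimal Python sketch of the proposed formula, including the rounding to the nearest tenth of a GB (the function name and standalone-script form are mine, not existing WP1 code):

```python
import math


def required_memory_gb(articles_count: int) -> float:
    """Proposed "magic" formula: max(2, -10 + 1.75 * ln(articles_count)),
    rounded to the nearest tenth of a GB."""
    memory = max(2.0, -10.0 + 1.75 * math.log(articles_count))
    return round(memory, 1)


if __name__ == "__main__":
    # Reproduces the table above.
    for count in (100, 1_000, 5_000, 10_000, 50_000, 100_000, 500_000, 1_000_000):
        print(f"{count:>9} articles -> {required_memory_gb(count)} GB")
```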