mwoffliner
mwoffliner copied to clipboard
Error logs are not printed out if not --verbose
Printing error logs is the most important thing, this should not depend of argument --verbose. Considering that without --verbose we print a few important logs, the error should be part of them.
I suspect anyway that the log system needs to be re-architectured a bit using a standard log framework.
Error logs are always printed (regardless of --verbose
).
--verbose
only controls the logging of info
and warn
logs...
Here is a log extracted from the e2e test:
[log] [2019-07-02T14:50:56.145Z] Using Tmp Directories: { outputDirectory: '/media/kelson/SOTOKI/mwoffliner/mwo-test-1562079026790',
cacheDirectory: '/media/kelson/SOTOKI/mwoffliner/cac/dumps-1562079056144/',
dumpTmpDir: '/tmp/mwo-dump-1562079056145' }
[log] [2019-07-02T14:50:56.167Z] Getting text direction...
[log] [2019-07-02T14:50:56.167Z] Getting site info...
[log] [2019-07-02T14:50:56.168Z] Getting sub-title...
[log] [2019-07-02T14:50:56.357Z] Text direction is [ltr]
[log] [2019-07-02T14:50:56.966Z] Using a remote MCS/Parsoid instance
[log] [2019-07-02T14:50:56.966Z] Downloading article list from [https://download.kiwix.org/wp1/enwiki/tops/10] to [/tmp/mwo-dump-1562079056145/10]
[log] [2019-07-02T14:50:57.498Z] Worker [0] getting article range [0-50] of [10] [100%]
[log] [2019-07-02T14:50:57.767Z] Doing dump
[log] [2019-07-02T14:50:57.767Z] Writing zim to [/media/kelson/SOTOKI/mwoffliner/mwo-test-1562079026790/wikipedia_en_10_nopic_2019-07.zim]
[log] [2019-07-02T14:50:57.767Z] Flushing redis file store
[log] [2019-07-02T14:50:58.007Z] Found [4] stylesheets to download
[log] [2019-07-02T14:50:58.007Z] Downloading stylesheets and populating media queue
[log] [2019-07-02T14:50:58.195Z] Downloaded stylesheets
[log] [2019-07-02T14:50:58.196Z] Saving favicon.png...
[log] [2019-07-02T14:50:58.700Z] Updating article thumbnails for all articles
[log] [2019-07-02T14:50:58.700Z] Getting Main Page
[log] [2019-07-02T14:50:58.701Z] Creating main page...
[log] [2019-07-02T14:50:58.705Z] Getting articles
[log] [2019-07-02T14:50:58.706Z] Worker [0] Processing batch of article ids [["Enzyme","Association_football","+8 more"]]
[log] [2019-07-02T14:50:58.706Z] Worker [1] Processing batch of article ids [[]]
[log] [2019-07-02T14:51:08.892Z] Progress downloading articles [10/10] [100%]
[log] [2019-07-02T14:51:08.893Z] Found [11] js module dependencies
[log] [2019-07-02T14:51:08.893Z] Found [11] style module dependencies
[log] [2019-07-02T14:51:08.893Z] Downloading module dependencies
[log] [2019-07-02T14:51:09.482Z] Downloading a total of [40] files
[log] [2019-07-02T14:51:09.483Z] Worker [0] Processing batch of [10] files
[log] [2019-07-02T14:51:09.483Z] Worker [1] Processing batch of [10] files
[log] [2019-07-02T14:51:09.484Z] Worker [2] Processing batch of [10] files
[log] [2019-07-02T14:51:09.485Z] Worker [3] Processing batch of [10] files
[log] [2019-07-02T14:51:10.239Z] Progress downloading files [10/40] [25%]
[log] [2019-07-02T14:51:10.720Z] Progress downloading files [20/40] [50%]
[log] [2019-07-02T14:51:11.056Z] Progress downloading files [30/40] [75%]
[log] [2019-07-02T14:51:11.569Z] Retrying a total of [2] files
[log] [2019-07-02T14:51:11.570Z] Worker [0] Processing batch of [2] files
[log] [2019-07-02T14:51:11.775Z] Writing Article Redirects
[log] [2019-07-02T14:51:11.819Z] Finishing Zim Creation
T:14; A:423; RA:330; CA:63; UA:30; FA:0; IA:10; C:1; CC:1; UC:0; WC:0
T:16; Generate cluster offsets
T:16; ResolveRedirectIndexes
Resolve redirect
T:16; Set article indexes
set index
T:16; Resolve mimetype
T:16; create title index
T:16; 424 title index created
T:16; 3 clusters created
T:16; fill header
T:16; write zimfile
T:16; finish
[log] [2019-07-02T14:51:13.049Z] Summary of scrape actions: {
"files": {
"success": 38,
"fail": 2
},
"articles": {
"success": 10,
"fail": 0
},
"redirects": {
"written": 330
}
}
[log] [2019-07-02T14:51:13.049Z] Finished dump
We see that two files have failed to be downloaded... but no error in the log.
Download failures are currently treated as warnings.
I've been reserving "error" for fatal errors, rather than a download error.
This SO post suggests that Error should be "fatal to an operation", which would log as you suggest.
https://stackoverflow.com/a/2031209/3399654
@ISNIT0 Thx for the explanation. You are probably right, warning is a better level than error.
@kelson42 I actually meant that I would implement this. "fatal to an operation" would mean something like: an http request failed.
I do think we should log failed downloads as errors (rather than fatal to the application)
@ISNIT0 :) of that is a flip! My recommendation on this is: Make a first on the wiki a list of log levels, and for each of them a definition. One time we agree on this, then things will be easier to judge.
@ISNIT0 Have you written somewhere this definition meanwhile?
This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.
@midik Would you be able to come with a serious logging strategy/framework?
This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be now be reviewed manually. Thank you for your contributions.
Looks good now