wiki2text
Output is empty file
Hi,
Do you have any idea why the program would output an empty file? I ran the command as written in the README:
bunzip2 -c enwiki-DATE-pages-articles.xml.bz2 | ./wiki2text > enwiki.txt
where the .bz2 file is the dump of all Wikipedia page abstracts. It is called enwiki-latest-abstract.xml.gz at the following link: https://dumps.wikimedia.org/enwiki/latest/.
I'm working on Windows 10. It may help to note that the program DID work on a small file containing only one Wikipedia page but, as said above, did not work on the big dump (about 500 MB compressed, around 5 GB uncompressed).
Thanks!
Same problem here.
I'm sorry -- I don't maintain this anymore and I basically don't remember how to program in Nim.
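For anyone else hitting the empty-output symptom, here is a minimal diagnostic sketch, not a confirmed fix. The README command names a pages-articles .bz2 dump, while the file described above is the abstract .gz dump. The sketch checks which compression a file really uses and counts `<page>` versus `<doc>` opening tags near the start. It assumes, without confirmation from this thread, that wiki2text only reads `<page>` elements and that the abstract dump uses `<doc>` elements instead. The function names `sniff_compression`, `count_tags`, and `inspect_dump` are made up for illustration.

```python
import bz2
import gzip

# Magic numbers: gzip streams start with 1f 8b, bzip2 streams with "BZh".
MAGIC = {b"\x1f\x8b": "gzip", b"BZh": "bzip2"}


def sniff_compression(head: bytes) -> str:
    """Guess the compression format from the first bytes of a file."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return "unknown"


def count_tags(stream, max_lines=200_000):
    """Count <page> and <doc> opening tags in the first max_lines lines."""
    counts = {"page": 0, "doc": 0}
    for i, line in enumerate(stream):
        if i >= max_lines:
            break
        stripped = line.lstrip()
        if stripped.startswith("<page>"):
            counts["page"] += 1
        elif stripped.startswith("<doc>"):
            counts["doc"] += 1
    return counts


def inspect_dump(path, max_lines=200_000):
    """Return (compression, tag counts) for the start of a dump file."""
    with open(path, "rb") as f:
        kind = sniff_compression(f.read(4))
    # Fall back to plain open() for uncompressed XML.
    opener = {"gzip": gzip.open, "bzip2": bz2.open}.get(kind, open)
    with opener(path, "rt", encoding="utf-8", errors="replace") as stream:
        return kind, count_tags(stream, max_lines)


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        kind, counts = inspect_dump(sys.argv[1])
        print(f"compression: {kind}, tags: {counts}")
```

If the script reports gzip, or zero `<page>` tags, the file is probably not the kind of dump the README command expects, and downloading the pages-articles .bz2 file would be the next thing to try.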