
Lesson 10 notebooks: `bunzip` throws an error when unzipping `.bz2` files

Open • jcatanza opened this issue 5 years ago • 1 comment

On a Windows 10 64-bit machine:

`bunzip` throws `EOFError: Compressed file ended before the end-of-stream marker was reached` when processing these files: `viwiki-latest-pages-articles.xml.bz2` and `trwiki-latest-pages-articles.xml.bz2`

Attaching a screenshot: bunzip_error

The Windows version of 7-Zip throws a similar error.

Note 1: A valid `.xml` file is still saved.

Note 2: The problem was resolved when I downloaded the files directly from https://archive.org/details/wikipediadumps
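Since re-downloading fixed it, the original archives were probably truncated; that is the usual cause of this `EOFError`, though it is not confirmed here. A minimal sketch for checking a download before unzipping it, using only Python's standard `bz2` module. The helper name `bz2_is_complete` and the chunk size are made up for illustration and are not part of the course notebooks:

```python
import bz2


def bz2_is_complete(path, chunk_size=1024 * 1024):
    """Return True if the .bz2 file at `path` decompresses cleanly to its end.

    Hypothetical helper: streams the archive in chunks and discards the
    output, so even a large Wikipedia dump is never held in memory.
    """
    try:
        with bz2.open(path, "rb") as f:
            # Read until EOF; f.read() returns b"" once the stream is done.
            while f.read(chunk_size):
                pass
    except EOFError:
        # The file ended before the end-of-stream marker (truncated download).
        return False
    except OSError:
        # The data is not a valid bzip2 stream at all.
        return False
    return True
```

For example, `bz2_is_complete("viwiki-latest-pages-articles.xml.bz2")` returning `False` would suggest downloading the file again rather than retrying the unzip step.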

jcatanza, Feb 22 '20 00:02

Somehow I'm getting the same error :(

alirezadigi, Jul 06 '22 16:07