Ingestion bug in copyStream, wrong number of bytes expected

Open · lintool opened this issue 11 years ago · 1 comment

14/08/10 09:13:18 ERROR ingest.IngestFiles: Error ingesting file: /scratch0/webarchive/congress108/arc.sample/CONGRESS01-20040124072939-193.arc.gz
java.io.IOException: Read 394 but expected 439
        at org.warcbase.ingest.IngestFiles.copyStream(IngestFiles.java:63)
        at org.warcbase.ingest.IngestFiles.ingestArcFile(IngestFiles.java:102)
        at org.warcbase.ingest.IngestFiles.ingestFolder(IngestFiles.java:163)
        at org.warcbase.ingest.IngestFiles.main(IngestFiles.java:220)

lintool · Aug 12 '14 20:08

The current fix is to catch the exception and move on: https://github.com/lintool/warcbase/commit/a00e413edff46d4655fe621b65d1af89ffda33c4
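
For readers who have not opened the commit, the catch-and-move-on pattern might look roughly like the sketch below. This is not the code from the linked commit: `IngestSketch`, `FileIngester`, and `ingestAll` are hypothetical names invented for illustration, and the sketch assumes the whole file is skipped on failure, which the comment above does not actually specify.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IngestSketch {
    /** Hypothetical stand-in for the per-file ingestion step (not a warcbase API). */
    interface FileIngester {
        void ingest(String path) throws IOException;
    }

    /**
     * Tries to ingest every path. If one file throws an IOException, the error
     * is logged, the path is recorded, and the loop continues with the next file
     * instead of aborting the whole run.
     */
    static List<String> ingestAll(List<String> paths, FileIngester ingester) {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            try {
                ingester.ingest(path);
            } catch (IOException e) {
                // Assumption: skipping the whole file is acceptable for now.
                System.err.println("Error ingesting file: " + path + " (" + e.getMessage() + ")");
                failed.add(path);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated run: the second (made-up) file fails with a byte-count mismatch.
        List<String> failed = ingestAll(
            Arrays.asList("a.arc.gz", "b.arc.gz", "c.arc.gz"),
            path -> {
                if (path.startsWith("b")) {
                    throw new IOException("Read 394 but expected 439");
                }
            });
        System.out.println("Skipped " + failed.size() + " file(s): " + failed);
    }
}
```

Returning the list of skipped paths keeps a record of what was dropped, which would make it easier to come back to the affected files later.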

It might be worth looking into what is going on in more detail later.

lintool · Aug 16 '14 13:08