JHOVE (WARC-kb) only reports one type of WARC validation error when there are more
Issue: when validating a WARC, JHOVE returns 84 error messages, whereas JWAT returns 84 + 110 = 194 error messages.
The WARC is LUX-004-TEST-2017-12-18-20171220042523987-00100-16828_wbgrp-crawl007.us.archive.org_8443.warc (cf. https://web.archive.org/web/*/www.2houses.lu).
JHOVE 1.22.1 returns 84 ErrorMessages when validating this WARC with the command:
C:\Users\rvveenendaal\AppData\jhove>jhove -m WARC-kb c:\temp\warctests\LUX-004-TEST-2017-12-18-20171220042523987-00100-16828_wbgrp-crawl007.us.archive.org_8443.warc > c:\temp\jhove-lux100.out.txt
E.g. ErrorMessage: INVALID_EXPECTED: Entity: Incorrect payload digest, 7B9EB1DAC48E74FA5F418BC456CB410F88B81D98, DA39A3EE5E6B4B0D3255BFEF95601890AFD80709
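For anyone wanting to reproduce the count, the ErrorMessage lines in the redirected JHOVE output can be tallied with a small script. This is only a sketch: it assumes every error line has the same shape as the example above (`ErrorMessage: <TYPE>: <description>, <values>`), and the function name `tally_error_messages` is made up for illustration, not part of JHOVE or JWAT.

```python
import re
from collections import Counter

# Assumed line shape, based only on the single example line in this report:
#   ErrorMessage: <TYPE>: <description>, <value>, <value>
ERROR_LINE = re.compile(r"^\s*ErrorMessage:\s*(?P<kind>[A-Z_]+):\s*(?P<rest>.*)$")


def tally_error_messages(lines):
    """Count ErrorMessage lines by diagnosis type and by description."""
    by_kind = Counter()
    by_description = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match is None:
            continue  # not an ErrorMessage line
        by_kind[match.group("kind")] += 1
        # Drop the trailing comma-separated values (digests, URIs, ...)
        # so that identical kinds of error group together.
        description = match.group("rest").split(",", 1)[0].strip()
        by_description[description] += 1
    return by_kind, by_description
```

Feeding it the lines of c:\temp\jhove-lux100.out.txt (for example via `open(path, errors="replace")`) should give totals that can be compared directly with the per-type counts in the JWAT job summary.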
JWAT-tools-0.6.0 (the version currently embedded in JHOVE) reports the same 84 "Incorrect payload digest" messages plus an additional 110 WARC-Target-URI errors, 194 errors in total:
C:\Users\rvveenendaal\AppData\jwat-tools-0.6.0>jwattools.cmd test -e c:\temp\warctests\LUX-004-TEST-2017-12-18-20171220042523987-00100-16828_wbgrp-crawl007.us.archive.org_8443.warc
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=64M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
Showing errors: true
Validate digest: true
Using 1 thread(s).
Output Thread started.
ThreadPool started.
Queued 1 file(s).
ThreadPool shut down.
Output Thread stopped.
Job summary
GZip files: 0
- Arc: 0
- Warc: 0
Arc files: 0
Warc files: 1
Errors: 194
Warnings: 0
RuntimeErr: 0
Skipped: 0
Time: 00:00:26 (26449 ms.)
TotalBytes: 1.0 gb
AvgBytes: 43.2 mb/s
INVALID_EXPECTED: 150
REQUIRED_INVALID: 44
'WARC-Target-URI' value: 110
Incorrect payload digest: 84
Why doesn't JHOVE seem to report the 110 WARC-Target-URI errors?