jhove
Outdated JWAT version in WARC module
Dev Effort
3D
Description
The current version is 1.0.3: https://github.com/openpreserve/jhove/blob/2e9444442cdaad7f749daa2db8538310ca1cc16f/jhove-ext-modules/pom.xml#L15
The most recent version in Maven repository is 1.1.1: https://mvnrepository.com/artifact/org.jwat/jwat-warc
Version 1.0.6 fixed a critical error: Payload digest should not be checked for revisit records
Tests failed when I bumped the version. Will investigate and hopefully issue a pull request if I am able to fix it.
Have fixed 6 of the 8 failing tests, but the last 2 are causing a NullPointerException in JWAT. Will follow up and try to fix it upstream (JWAT).
Confirmed bug in JWAT (header.warcTypeIdx is null, causing a NullPointerException):
https://github.com/netarchivesuite/jwat/blob/5ae169ce839288aa5cf9927dd64d2fcc14bced69/jwat-warc/src/main/java/org/jwat/warc/WarcRecord.java#L287
It is triggered by missing WARC-Type header in https://github.com/openpreserve/jhove/blob/2e9444442cdaad7f749daa2db8538310ca1cc16f/jhove-ext-modules/src/test/resources/warc/invalid-warcheaderfieldpolicy-7.warc#L2
used by https://github.com/openpreserve/jhove/blob/2e9444442cdaad7f749daa2db8538310ca1cc16f/jhove-ext-modules/src/test/java/edu/harvard/hul/ois/jhove/module/WarcModuleTest.java#L354
See https://github.com/maeb/jhove/tree/maeb/jwat for a POC fix
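A minimal sketch of the failure mode, assuming warcTypeIdx is a boxed Integer that stays null when the WARC-Type header is missing. The class, method names, and constant value below are hypothetical and are not taken from JWAT; the actual fix is in the branch linked above.

```java
class WarcTypeGuardSketch {
    // Hypothetical record-type index for "revisit"; the value is an assumption, not JWAT's.
    static final int RT_IDX_REVISIT = 6;

    // Unguarded: comparing a boxed Integer with an int unboxes it,
    // so a null index throws NullPointerException.
    static boolean isRevisitUnsafe(Integer warcTypeIdx) {
        return warcTypeIdx == RT_IDX_REVISIT;
    }

    // Guarded: a missing WARC-Type header (null index) is treated as "not a revisit".
    static boolean isRevisit(Integer warcTypeIdx) {
        return warcTypeIdx != null && warcTypeIdx == RT_IDX_REVISIT;
    }

    public static void main(String[] args) {
        System.out.println(isRevisit(null));
        System.out.println(isRevisit(RT_IDX_REVISIT));
        try {
            isRevisitUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded comparison threw NullPointerException");
        }
    }
}
```

With a guard like this, a record without WARC-Type can still be reported as invalid by the normal diagnostics instead of aborting the parse with an exception.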
Once this is updated I need to retest #294 to see whether it's still an issue.
The fix is merged. I have been in touch with @csrster to get a release published to Maven.
Tried a naive update but hit unit test problems with the WARC module:
Tests run: 79, Failures: 8, Errors: 2, Skipped: 0, Time elapsed: 0.61 sec <<< FAILURE! - in edu.harvard.hul.ois.jhove.module.WarcModuleTest
parseInvalidWarcHeaderFieldPolicy7(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.032 sec <<< ERROR!
java.lang.NullPointerException
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.generalInvalidChecks(WarcModuleTest.java:1001)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderFieldPolicy7(WarcModuleTest.java:357)
parseInvalidWarcHeaderFieldPolicy8(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0 sec <<< ERROR!
java.lang.NullPointerException
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.generalInvalidChecks(WarcModuleTest.java:1001)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderFieldPolicy8(WarcModuleTest.java:371)
parseInvalidWarcHeaderVersion16(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.invalidErrorExpectedCheck(WarcModuleTest.java:951)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderVersion16(WarcModuleTest.java:662)
parseInvalidWarcHeaderVersion17(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.invalidErrorExpectedCheck(WarcModuleTest.java:951)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderVersion17(WarcModuleTest.java:671)
parseInvalidWarcHeaderVersion18(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.invalidErrorExpectedCheck(WarcModuleTest.java:951)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderVersion18(WarcModuleTest.java:680)
parseInvalidWarcHeaderVersion19(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.invalidErrorExpectedCheck(WarcModuleTest.java:951)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderVersion19(WarcModuleTest.java:689)
parseInvalidWarcHeaderVersion20(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.001 sec <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.invalidErrorExpectedCheck(WarcModuleTest.java:951)
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcHeaderVersion20(WarcModuleTest.java:698)
parseInvalidWarcReaderDiagnosis1(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.001 sec <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<4>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcReaderDiagnosis1(WarcModuleTest.java:787)
parseInvalidWarcFileFieldsEmpty(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<16> but was:<17>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcFileFieldsEmpty(WarcModuleTest.java:150)
parseInvalidWarcFileFieldsInvalidFormat(edu.harvard.hul.ois.jhove.module.WarcModuleTest) Time elapsed: 0.002 sec <<< FAILURE!
java.lang.AssertionError: expected:<15> but was:<16>
at edu.harvard.hul.ois.jhove.module.WarcModuleTest.parseInvalidWarcFileFieldsInvalidFormat(WarcModuleTest.java:165)
...
Results :
Failed tests:
WarcModuleTest.parseInvalidWarcFileFieldsEmpty:150 expected:<16> but was:<17>
WarcModuleTest.parseInvalidWarcFileFieldsInvalidFormat:165 expected:<15> but was:<16>
WarcModuleTest.parseInvalidWarcHeaderVersion16:662->invalidErrorExpectedCheck:951 expected:<2> but was:<3>
WarcModuleTest.parseInvalidWarcHeaderVersion17:671->invalidErrorExpectedCheck:951 expected:<2> but was:<3>
WarcModuleTest.parseInvalidWarcHeaderVersion18:680->invalidErrorExpectedCheck:951 expected:<2> but was:<3>
WarcModuleTest.parseInvalidWarcHeaderVersion19:689->invalidErrorExpectedCheck:951 expected:<2> but was:<3>
WarcModuleTest.parseInvalidWarcHeaderVersion20:698->invalidErrorExpectedCheck:951 expected:<2> but was:<3>
WarcModuleTest.parseInvalidWarcReaderDiagnosis1:787 expected:<3> but was:<4>
Tests in error:
WarcModuleTest.parseInvalidWarcHeaderFieldPolicy7:357->generalInvalidChecks:1001 » NullPointer
WarcModuleTest.parseInvalidWarcHeaderFieldPolicy8:371->generalInvalidChecks:1001 » NullPointer
So it'll need a little investigation before closing this. It appears there are really only two underlying problems.
I'll try to look into this ASAP, maybe next week. Must contact @csrster to get him to do a release of JWAT with the code in master.
Internally we use forked versions of jhove and jwat, built via jitpack.io, to work around the lack of a jwat release.
JWAT-1.1.3 is being released to Maven Central as I write this. It includes the pull request from maeb.