CDX-Writer
Field names should be case-insensitive
Also reported here: https://bitbucket.org/rajbot/warc-tools/issue/1 . I'm filing it on GitHub as well so others know about it.
According to the WARC ISO 28500 Version 1 Latest Draft, Section 4, field names should be case-insensitive. For example, Warc-Type should be treated the same as WARC-Type. Without case-insensitive matching, record.type returns None whenever the field name in the file doesn't match WARC-Type exactly.
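To illustrate the behaviour I'd expect, here is a minimal sketch of case-insensitive header lookup. I haven't checked how warc-tools stores headers internally, so `CaseInsensitiveHeaders` and `parse_header_lines` are hypothetical names for illustration only, not part of its API.

```python
class CaseInsensitiveHeaders:
    """Hypothetical header container (not the warc-tools API).

    Lookups use the lowercased field name as the key, while the
    original spelling is kept alongside the value.
    """

    def __init__(self):
        self._store = {}

    def __setitem__(self, name, value):
        # Key on the lowercased name; remember the name as written.
        self._store[name.lower()] = (name, value)

    def get(self, name, default=None):
        entry = self._store.get(name.lower())
        return default if entry is None else entry[1]


def parse_header_lines(lines):
    """Parse 'Name: value' lines into a case-insensitive container."""
    headers = CaseInsensitiveHeaders()
    for line in lines:
        # Split on the first colon only, so URLs in values survive.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not header fields
        headers[name.strip()] = value.strip()
    return headers


headers = parse_header_lines([
    "Warc-Type: response",
    "WARC-Target-URI: http://example.com/",
])
record_type = headers.get("WARC-Type")
target_uri = headers.get("warc-target-uri")
```

With something along these lines, a record written with `Warc-Type` would still be found when looked up as `WARC-Type`, instead of coming back as None.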
I'd also like to add that this matters in practice: my WARC file failed to derive proper CDX files on Internet Archive: https://archive.org/details/delcampe_20140126 .