JustAnotherArchivist
'SHA-256' is indeed the official name. But that's true for 'SHA-1' as well, yet the digest headers almost universally use the format `sha1:foo` (since that's what's used in the specification's...
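To illustrate the `sha1:foo` convention mentioned above, here is a minimal sketch of splitting a digest header value into its algorithm label and digest. The function name `split_digest` and the label normalisation (lowercasing and dropping hyphens, so that `SHA-256` and `sha256` compare equal) are assumptions made for this example and are not taken from the specification.

```python
def split_digest(value: str) -> tuple[str, str]:
    """Split a digest header value such as 'sha1:foo' into (label, digest).

    Hypothetical helper: the label normalisation below is an assumption,
    not something the WARC specification prescribes.
    """
    label, sep, digest = value.strip().partition(":")
    if not sep or not label or not digest:
        raise ValueError(f"not a labelled digest: {value!r}")
    # Assumed normalisation: lowercase and drop hyphens ('SHA-256' -> 'sha256').
    # The digest itself is left untouched, since its encoding is case-sensitive
    # for some encodings.
    return label.lower().replace("-", ""), digest
```

A reader that compares labels only after this kind of normalisation would treat `SHA-256:abc` and `sha256:abc` as the same algorithm, which sidesteps the naming question at the cost of accepting labels the specification may not intend.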
I mostly agree, but collision resistance might also be somewhat important; otherwise, bodies that are actually different might get deduped. For example, if there is a public web archival service...
The standard does not currently allow repeating the digest headers, so I'm not sure that's relevant. A library that currently accepts repeated headers is not implementing the standard correctly. It...
Agreed, repeated `extension-field`s should definitely be allowed. You might argue that 'as noted' also applies to extensions. Of course, a parser that doesn't support a particular extension wouldn't know whether...
> I haven't ever encountered any WARCs in the real world that use `WARC-IP-Address` on the `request` records. Here are some tools that do: wget, wpull, qwarc, Zeno, warcio (at...
On a related note, I think the specification should mention that the value of a `quoted-string` is its decoded contents, i.e. the concatenation of the `qdtext` and the `quoted-pair`s with...
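As a rough sketch of what "decoded contents" could mean in practice, the following assumes the RFC 7230 reading, where a `quoted-pair` is handled as if it were replaced by the character following the backslash. The function name `decode_quoted_string` is hypothetical, and line folding inside the value is deliberately not handled here.

```python
def decode_quoted_string(raw: str) -> str:
    """Return the decoded contents of a quoted-string.

    Assumption: a quoted-pair decodes to the character after the backslash
    (the RFC 7230 reading). Continuation lines are not handled.
    """
    if len(raw) < 2 or raw[0] != '"' or raw[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i = 1
    end = len(raw) - 1  # index of the closing quote
    while i < end:
        ch = raw[i]
        if ch == "\\":
            # quoted-pair: the backslash must be followed by a character
            # before the closing quote.
            if i + 1 >= end:
                raise ValueError("dangling backslash")
            out.append(raw[i + 1])
            i += 2
        elif ch == '"':
            raise ValueError("unescaped quote inside quoted-string")
        else:
            # qdtext: copied through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)
```

Under this reading, `"a\"b"` and `"a\\b"` decode to `a"b` and `a\b` respectively, so two differently escaped inputs can carry the same value.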
By the way, I realise that these definitions were taken directly from RFC 2616. The RFC has a bit of clarification but doesn't resolve this ambiguity either. Specifically, it states...
Aah, crossfire. Yeah, the 7230 definition makes much more sense. It probably can't simply be carried over to WARC though, due to the continuation lines, which were deprecated in 7230,...
Actually, the line folding does have an undesired effect: if you want to represent any white space other than a single space in a `quoted-string`, you must escape it [0]....
There exist at least two WARC-writing tools which use a different content type for warcinfo records: [crocoite](https://github.com/PromyLOPh/crocoite) and [qwarc](https://github.com/JustAnotherArchivist/qwarc) both write JSON data with `Content-Type: application/json; charset=utf-8`. The interpretation of...