warcreate

How does WARCreate handle preserving HTTP/2 communication?

Open machawk1 opened this issue 7 years ago • 5 comments

The current behavior needs to be documented. The WARC/1.1 spec does not document a "right way" to do this, but identifying the current approaches would be useful and would help guide how it might be done.

machawk1 avatar Jan 30 '18 01:01 machawk1

For reference: https://github.com/iipc/warc-specifications/issues/15

ibnesayeed avatar Jan 30 '18 16:01 ibnesayeed

That discussion was part of the impetus for creating this issue, @ibnesayeed. This issue is about determining what the current behavior is. The result of that can be cross-referenced with other discussions on the correct representation.

machawk1 avatar Jan 30 '18 17:01 machawk1

@machawk1 I would assume the "right" way would be to convert the HTTP/2 messages into HTTP/1.1.

Reasoning:

  • no existing replay system currently handles HTTP/2
  • tools like WARCreate and Squidwarc see the headers as HTTP/2 but see the bodies as if they were sent over HTTP/1.1 (not as the individual frames of the stream)
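
To make the conversion idea concrete, here is a minimal Python sketch of rendering HTTP/2 header lists as HTTP/1.1 message heads. It is not taken from WARCreate or Squidwarc; the function names are hypothetical and the input is assumed to be a list of `(name, value)` pairs that includes the HTTP/2 pseudo-headers (`:method`, `:path`, `:authority`, `:status`).

```python
from http import HTTPStatus


def h2_request_to_h1(headers):
    """Render HTTP/2 request headers as an HTTP/1.1 request head (hypothetical helper)."""
    pseudo = {k: v for k, v in headers if k.startswith(":")}
    regular = [(k, v) for k, v in headers if not k.startswith(":")]
    lines = [f"{pseudo[':method']} {pseudo[':path']} HTTP/1.1"]
    # HTTP/2 carries the host in :authority; HTTP/1.1 expects a Host header.
    if ":authority" in pseudo and not any(k.lower() == "host" for k, _ in regular):
        lines.append(f"Host: {pseudo[':authority']}")
    lines += [f"{k}: {v}" for k, v in regular]
    return "\r\n".join(lines) + "\r\n\r\n"


def h2_response_to_h1(headers):
    """Render HTTP/2 response headers as an HTTP/1.1 response head (hypothetical helper)."""
    pseudo = {k: v for k, v in headers if k.startswith(":")}
    regular = [(k, v) for k, v in headers if not k.startswith(":")]
    status = int(pseudo[":status"])
    # HTTP/2 has no reason phrase, so one is looked up for the status line.
    try:
        reason = HTTPStatus(status).phrase
    except ValueError:
        reason = ""
    lines = [f"HTTP/1.1 {status} {reason}"]
    lines += [f"{k}: {v}" for k, v in regular]
    return "\r\n".join(lines) + "\r\n\r\n"
```

Note that this rewrite is lossy: the `:scheme` pseudo-header and any stream-level framing are dropped, and the reason phrase is synthesized rather than captured.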

N0taN3rd avatar Jan 30 '18 18:01 N0taN3rd

@N0taN3rd Maybe a conversion record is more suitable for the HTTP/1.1 derivative, so that the original payload is still maintained until replay systems (and the WARC spec) catch up.
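
A rough Python sketch of what such a conversion record could look like on disk, assuming the HTTP/1.1 derivative is already available as bytes. The function name, the default `Content-Type`, and the choice of fields are illustrative assumptions, not current WARCreate behavior; how the original HTTP/2 exchange itself would be stored is the open question in this thread and is not addressed here.

```python
import uuid
from datetime import datetime, timezone


def build_conversion_record(target_uri, original_record_id, converted_block,
                            content_type="application/http;msgtype=response"):
    """Serialize a WARC/1.1 'conversion' record (hypothetical helper).

    original_record_id is the WARC-Record-ID of the record holding the
    original capture, e.g. "<urn:uuid:...>"; converted_block is bytes.
    """
    record_id = f"<urn:uuid:{uuid.uuid4()}>"
    date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    fields = [
        ("WARC-Type", "conversion"),
        ("WARC-Record-ID", record_id),
        ("WARC-Date", date),
        ("WARC-Target-URI", target_uri),
        # Links the derivative back to the record it was converted from.
        ("WARC-Refers-To", original_record_id),
        ("Content-Type", content_type),
        ("Content-Length", str(len(converted_block))),
    ]
    head = "WARC/1.1\r\n" + "".join(f"{k}: {v}\r\n" for k, v in fields) + "\r\n"
    # A WARC record ends with two CRLFs after the content block.
    return head.encode("utf-8") + converted_block + b"\r\n\r\n"
```

With this layout a replay system that only understands HTTP/1.1 could read the conversion record, while `WARC-Refers-To` keeps the link to the untouched original.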

machawk1 avatar Jan 30 '18 19:01 machawk1

I would agree with @machawk1 because, as an archivist, you never want to lose information that might be useful later just because current tools do not support something.

ibnesayeed avatar Jan 30 '18 20:01 ibnesayeed