warcreate

Store screenshot of page in WARC, too

Open machawk1 opened this issue 7 years ago • 2 comments

In https://kris-sigur.blogspot.com/2018/11/on-screenshots-in-warcs.html, @kris-sigur describes storing a screenshot in a WARC file. This would be useful for others (e.g., @CamtheWicked on Twitter, for whom I could not find a GitHub handle) and might be easier to accomplish by leveraging the native Chrome APIs where available.

I have not worked with the devtools(?) API programmatically from an extension, but this seems like it would be a suitable use case for preservation using a browser extension.

/cc @N0taN3rd because I think he may have worked with this part of the Chrome/Web-extension API.

machawk1 avatar Nov 19 '18 15:11 machawk1

I believe there are two options:

  1. using the tabCapture extension API (never played with this)
  2. using the debugger permission and the CDP command Page.captureScreenshot

N0taN3rd avatar Nov 19 '18 16:11 N0taN3rd

@N0taN3rd Thanks for the input!

tabCapture seems to be limited to the current viewport, excluding anything that is not currently visible. This would be useful, but I think the "screenshot" a user expects is of the whole page, regardless of what is currently visible.

The second option might be more feasible but a little more complex. I think it will require calling chrome.debugger.getTargets(), identifying the current tab among the results (I am not yet sure what else qualifies as a target), and then calling chrome.debugger.sendCommand() with that target and Page.captureScreenshot as the method, without any commandParams (the defaults appear to be suitable), per https://developer.chrome.com/extensions/debugger#method-sendCommand.
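A minimal sketch of that second option, under stated assumptions rather than as a tested implementation: the `DebuggerApi` interface and the `captureTabScreenshot` name are hypothetical, declared locally so the snippet is self-contained, and modeled on the promise-returning form of `chrome.debugger`. It assumes the extension holds the `debugger` permission, that the debugger has to be attached to the tab before a command can be sent, and it addresses the tab directly by `tabId` instead of going through `getTargets()`.

```typescript
// Minimal shape of the debugger calls used below. Declared locally
// (hypothetical names) and modeled on chrome.debugger, so the sketch
// does not depend on the Chrome type definitions.
interface Debuggee {
  tabId: number;
}

interface DebuggerApi {
  attach(target: Debuggee, requiredVersion: string): Promise<void>;
  sendCommand(target: Debuggee, method: string, commandParams?: object): Promise<unknown>;
  detach(target: Debuggee): Promise<void>;
}

// Attach to the tab, ask CDP for a screenshot with default parameters,
// and always detach afterwards. Resolves to the base64-encoded image data.
async function captureTabScreenshot(dbg: DebuggerApi, tabId: number): Promise<string> {
  const target: Debuggee = { tabId };
  await dbg.attach(target, "1.3"); // "1.3" is the assumed protocol version
  try {
    const result = (await dbg.sendCommand(target, "Page.captureScreenshot")) as { data: string };
    return result.data;
  } finally {
    await dbg.detach(target);
  }
}
```

In an extension this would presumably be called as `captureTabScreenshot(chrome.debugger, tab.id)`. As far as I know, Page.captureScreenshot without parameters also captures only the visible area, so a whole-page screenshot may still need extra command parameters.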

EDIT: ...and of course, converting the base64-encoded image data to something more suitable for WARC record storage. It might be easiest to keep it as base64 in the WARC, but I am unsure whether there will be issues with interpretation given that it is not a resource of web origin.
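One way the conversion could look, as a sketch only: decode the base64 string to raw bytes and wrap them in a WARC `resource` record. The helper names, the `screenshot:` URI prefix in the usage line, and the choice of a `resource` record are assumptions for illustration, not something settled in this thread. The record ID and date are passed in by the caller.

```typescript
// Decode a base64 string (e.g., the `data` field returned by
// Page.captureScreenshot) into raw bytes.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: build a single WARC/1.0 "resource" record whose
// block is the PNG image. Layout: version line, named fields, blank line,
// block, then two CRLFs to terminate the record.
function buildScreenshotRecord(
  pngBytes: Uint8Array,
  targetUri: string, // assumed convention, e.g. "screenshot:" + page URL
  recordId: string,  // e.g. "urn:uuid:..." supplied by the caller
  warcDate: string   // UTC timestamp, e.g. "2018-11-19T16:00:00Z"
): Uint8Array {
  const header =
    "WARC/1.0\r\n" +
    "WARC-Type: resource\r\n" +
    `WARC-Target-URI: ${targetUri}\r\n` +
    `WARC-Date: ${warcDate}\r\n` +
    `WARC-Record-ID: <${recordId}>\r\n` +
    "Content-Type: image/png\r\n" +
    `Content-Length: ${pngBytes.length}\r\n` +
    "\r\n";
  const encoder = new TextEncoder();
  const headerBytes = encoder.encode(header);
  const trailer = encoder.encode("\r\n\r\n");
  const record = new Uint8Array(headerBytes.length + pngBytes.length + trailer.length);
  record.set(headerBytes, 0);
  record.set(pngBytes, headerBytes.length);
  record.set(trailer, headerBytes.length + pngBytes.length);
  return record;
}
```

Usage might look like `buildScreenshotRecord(base64ToBytes(data), "screenshot:" + pageUrl, recordId, warcDate)`. Storing the decoded bytes with `Content-Type: image/png` would sidestep the question of how a replay tool interprets a base64 payload.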

machawk1 avatar Nov 19 '18 16:11 machawk1