
switch to async streaming download:

Open ikreymer opened this issue 1 year ago • 1 comments

  • download via presigned URLs using aiohttp instead of the boto APIs
  • use the async methods from stream-zip to generate the zip; note that stream-zip still does a sync->async conversion under the hood
  • follow-up to #1933 for streaming download improvements
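
For readers unfamiliar with the "sync->async conversion" mentioned above, here is a minimal, generic sketch of the idea: a blocking generator is advanced in a worker thread so that the event loop stays free between chunks. This is not stream-zip's actual code. The names `iterate_in_thread`, `sync_chunks`, and `collect` are hypothetical and exist only for illustration, and the sketch assumes Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
from typing import AsyncIterator, Callable, Iterator

# Sentinel returned by next() once the sync iterator is exhausted.
_DONE = object()


async def iterate_in_thread(
    make_iter: Callable[[], Iterator[bytes]],
) -> AsyncIterator[bytes]:
    """Expose a blocking iterator as an async iterator.

    Each next() call runs in a worker thread, so a slow producer
    (hypothetically, a sync zip generator) does not block the event loop.
    """
    it = make_iter()
    while True:
        chunk = await asyncio.to_thread(next, it, _DONE)
        if chunk is _DONE:
            return
        yield chunk


def sync_chunks() -> Iterator[bytes]:
    """Stand-in for a synchronous byte-chunk producer (hypothetical)."""
    for i in range(3):
        yield f"chunk-{i}".encode()


async def collect() -> list[bytes]:
    """Drain the async wrapper into a list, for demonstration only."""
    return [chunk async for chunk in iterate_in_thread(sync_chunks)]
```

In a streaming response, the async iterator would be consumed chunk by chunk rather than collected into a list; the point is only that each blocking step is awaited instead of run inline on the event loop.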

ikreymer avatar Jul 30 '24 20:07 ikreymer

When the multi-WACZs being produced in this branch are loaded into ReplayWeb.page, no seed pages or resources are listed. There may be something slightly off, investigating further.

tw4l avatar Jul 30 '24 21:07 tw4l

Should be fixed now! Turns out the datapackage.json was not quite valid: the path in resources was incorrect (it was not equal to name). Fixed that and matched the properties to those of a single WACZ!

ikreymer avatar Oct 03 '24 03:10 ikreymer

Tested on dev and working well! Nice job

tw4l avatar Oct 03 '24 14:10 tw4l