
[Feature]: Implement a unique WARC filename pattern similar to the one in Browsertrix

Open tuehlarsen opened this issue 1 year ago • 1 comments

Context

When you extract the WARC file from a WACZ, it always has the name data.warc.gz. It should instead be given a unique name, similar to the way Browsertrix names its WARC files.

What change would you like to see?

See above.

Requirements

No response

Todo

No response

tuehlarsen avatar Jun 26 '24 13:06 tuehlarsen

The same issue still applies.

I tried to complement Browsertrix Cloud collections with archiveweb.page crawls that were downloaded and then uploaded, and I get files named data.warc instead of the original WARC names or names containing parts of the original name. I checked this both on the initial download from archiveweb.page and on download from Browsertrix Cloud as multi-WACZ.
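Until this is implemented, a possible stopgap is to rename the WARC files while extracting them from the WACZ. The sketch below is only an illustration, not how archiveweb.page or Browsertrix work internally: the function name `extract_warcs_with_unique_names` and the `<wacz name>-<timestamp>-<index>` naming scheme are assumptions made up for this example and are not the actual Browsertrix pattern. It relies only on the fact that a WACZ is a ZIP file whose WARCs live under `archive/`.

```python
import shutil
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def extract_warcs_with_unique_names(wacz_path, out_dir, timestamp=None):
    """Extract every WARC under archive/ in a WACZ, giving each a unique name.

    The naming scheme <wacz stem>-<timestamp>-<index> is hypothetical and
    chosen for illustration; it is not the pattern Browsertrix uses.
    """
    wacz_path = Path(wacz_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Default to the current UTC time so repeated extractions do not collide.
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")

    written = []
    with zipfile.ZipFile(wacz_path) as wacz:
        # A WACZ keeps its WARC files in the archive/ directory.
        members = sorted(
            name
            for name in wacz.namelist()
            if name.startswith("archive/") and name.endswith((".warc", ".warc.gz"))
        )
        for index, name in enumerate(members):
            suffix = ".warc.gz" if name.endswith(".warc.gz") else ".warc"
            target = out_dir / f"{wacz_path.stem}-{timestamp}-{index}{suffix}"
            # Copy the bytes unchanged; only the file name differs.
            with wacz.open(name) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Calling `extract_warcs_with_unique_names("my-crawl.wacz", "warcs")` would write files named after the WACZ rather than `data.warc.gz`, so WARCs from several archives can be placed in one directory without overwriting each other.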

Klindten avatar May 12 '25 13:05 Klindten