archiveweb.page
[Feature]: Needs a unique WARC filename pattern similar to the one implemented in Browsertrix
Context
When you extract the WARC file from a WACZ, it always has the same name: data.warc.gz
It should be named in a similarly unique way as in Browsertrix.
What change would you like to see?
see above
Requirements
No response
Todo
No response
The same issue still applies.
I tried to complement Browsertrix Cloud collections with archiveweb.page crawls (downloaded, then uploaded) and get files named data.warc instead of the original WARC names or names containing parts of the original name. I checked this both on the initial download from archiveweb.page and on download from Browsertrix Cloud as multi-WACZ.
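Until this is implemented, one possible workaround is to rename the WARC while extracting it. Below is a minimal sketch in Python, not anything archiveweb.page or Browsertrix provides. It assumes the WACZ is a ZIP file with its WARCs under archive/ (as described in the WACZ spec). The function name and the `<wacz name>-<warc name>-<hash prefix>` pattern are made up for illustration and do not reproduce the Browsertrix naming scheme.

```python
import hashlib
import zipfile
from pathlib import Path


def extract_warcs_with_unique_names(wacz_path, out_dir):
    """Extract every WARC under archive/ and give it a unique filename.

    Hypothetical helper: the naming pattern is an assumption for this
    sketch, not the pattern Browsertrix uses.
    """
    wacz_path = Path(wacz_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(wacz_path) as wacz:
        for member in wacz.namelist():
            # Only WARC payloads inside the archive/ directory are of interest.
            if not member.startswith("archive/"):
                continue
            if member.endswith(".warc.gz"):
                suffix = ".warc.gz"
            elif member.endswith(".warc"):
                suffix = ".warc"
            else:
                continue
            # Reads the whole member into memory; fine for a sketch,
            # but very large WARCs would need chunked copying instead.
            payload = wacz.read(member)
            digest = hashlib.sha256(payload).hexdigest()[:12]
            base = Path(member).name[: -len(suffix)]
            target = out_dir / f"{wacz_path.stem}-{base}-{digest}{suffix}"
            target.write_bytes(payload)
            written.append(target)
    return written
```

Called as `extract_warcs_with_unique_names("my-crawl.wacz", "warcs")`, this would write something like my-crawl-data-<hash prefix>.warc.gz instead of data.warc.gz, so WARCs from several WACZ files no longer collide when collected in one place.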