
Possible to index and replay web archive with custom archive directory structure?

Open peterk opened this issue 6 years ago • 2 comments

I have a web archive with a custom directory structure (recorded with other software). Is it possible to scan this structure automatically for new WARC files without moving them to the pywb collection folder? That is, I want to keep my own archive folder intact while still being able to index and play back its contents. From the documentation, it seems I have to move all WARCs into the collection folder for them to be indexed?

peterk avatar Aug 20 '18 19:08 peterk

@peterk Sincerest apologies for the delayed reply. To answer your question: yes, you do have to move the WARCs to the collections folder.

collections/

  • coll/
      • archive/ (warcs)
      • indexes/ (cdxj)

However, if you are using Docker, you can make coll's archive and indexes directories volumes and mount your external directories onto them.

N0taN3rd avatar Sep 30 '18 01:09 N0taN3rd

@peterk Have you tried using symbolic links and changing the directory structure in config.yaml?

sydoluciani avatar Nov 01 '20 05:11 sydoluciani
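
For illustration, the symbolic-link suggestion above could be sketched roughly as follows. This is a minimal sketch, not part of pywb: the function name `link_warcs`, the `__` separator used to flatten paths, and the example paths are all assumptions made here. It walks a custom archive tree and creates symlinks inside the collection's archive/ directory, leaving the original files where they are. Whether pywb's indexer follows symlinks in a given setup is not confirmed in this thread, so the collection would still need to be indexed and tested afterwards.

```python
import os
from pathlib import Path

# File name endings treated as WARC files (an assumption for this sketch).
WARC_SUFFIXES = (".warc", ".warc.gz")


def link_warcs(source_root, archive_dir):
    """Symlink every WARC found under source_root into archive_dir.

    Returns the list of links created on this run; existing links are skipped,
    so the function can be re-run to pick up newly recorded files.
    """
    source_root = Path(source_root).resolve()
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for path in sorted(source_root.rglob("*")):
        if not path.is_file() or not path.name.endswith(WARC_SUFFIXES):
            continue
        # Flatten the relative path into the link name so that files with the
        # same name in different subdirectories do not collide.
        link_name = "__".join(path.relative_to(source_root).parts)
        link = archive_dir / link_name
        if link.is_symlink() or link.exists():
            continue
        os.symlink(path, link)
        created.append(link)
    return created
```

With hypothetical paths, `link_warcs("/data/my-archive", "collections/coll/archive")` would populate the archive/ directory shown earlier in the thread without moving any files.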