archiveweb.page
Document where the desktop application stores data
I am going to work with a number of larger collections that I want to store on an external drive. At what path does the desktop application store data? Ideally this would be user-defined, but for now knowing the path would be enough to set up a mount.
Using the AppImage, I located these directories on my system, but couldn't find any WARC or WACZ files inside:
despens@slice:~/.config$ ls -l | grep 'page'
drwx------ 3 despens despens 3 Mär 26 07:56 archiveweb.page
drwx------ 15 despens despens 23 Apr 10 08:50 archivewebpage
drwx------ 5 despens despens 8 Apr 10 08:48 ArchiveWeb.page
Doing some file system monitoring while capturing a bigger resource on my Linux system, I found the data is stored under
~/.config/archivewebpage/IndexedDB
The storage format is opaque: there are no WARC, CDX, or WACZ files to be found, only what looks like a mixture of binary data and JSON, probably a format mandated by Electron.
I have not yet tried mounting a larger drive at this path.
Thank you, @despens! On macOS, the data is found at ~/Library/Application Support/archivewebpage/IndexedDB/
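For anyone scripting around this, here is a minimal sketch that returns the two locations reported in this thread. The function name `reported_data_dir` is hypothetical and not part of the application. Only Linux and macOS are covered because those are the only paths confirmed here; any other platform raises an error rather than guessing.

```python
import sys
from pathlib import Path

# Locations relative to the home directory, exactly as reported in this thread.
# Other platforms are deliberately left out because nobody has confirmed them here.
_REPORTED_LOCATIONS = {
    "linux": Path(".config") / "archivewebpage" / "IndexedDB",
    "darwin": Path("Library") / "Application Support" / "archivewebpage" / "IndexedDB",
}


def reported_data_dir(platform=None, home=None):
    """Return the IndexedDB directory reported for the given platform.

    `platform` defaults to sys.platform and `home` to the current user's
    home directory. Raises NotImplementedError for unreported platforms.
    """
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)
    for prefix, relative in _REPORTED_LOCATIONS.items():
        if platform.startswith(prefix):
            return home / relative
    raise NotImplementedError(f"no location reported for platform {platform!r}")
```

Calling `reported_data_dir()` with no arguments uses the current platform and home directory. The function only builds the path, so it is worth checking that the directory exists before relying on it.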
I agree it would be useful to be able to define where the data is saved on our systems (especially for large files, which we might want to save on an external hard drive).
It would be super to be able to identify and access the .warc or .wacz files stored there, so we could back up specific collections.
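Until per-collection files are accessible, one stopgap is to copy the whole IndexedDB directory. The sketch below is only that: `backup_data_dir` is a hypothetical helper, it copies every collection at once, and it assumes the application is closed while copying. Whether a copied store can be restored into the application has not been tested in this thread.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy the whole data directory into a timestamped folder under backup_root.

    Returns the path of the new copy. This backs up the entire opaque store,
    not individual collections.
    """
    data_dir = Path(data_dir)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"data directory not found: {data_dir}")
    # A timestamp in the folder name keeps earlier backups from being overwritten.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{data_dir.name}-{stamp}"
    # copytree creates the target and any missing parent directories.
    shutil.copytree(data_dir, target)
    return target
```

For example, `backup_data_dir(Path.home() / ".config/archivewebpage/IndexedDB", "/mnt/external/backups")` would copy the Linux location mentioned above; the backup destination in that call is a made-up path.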
Hijacking this issue, now that the question has been answered, to turn it into a documentation issue. Users on the forum have asked this question as well, so it would be a good thing to document!
Adding another note about where the Chrome extension saves the database that the app uses: see my forum post on the issue.