Export hierarchical index files for "output as html"

Open pbpb opened this issue 7 years ago • 8 comments

Assuming scrapbook-x works like scrapbook, loading the index.html file takes forever if one has a large number of saved pages. This appears to be due to the way toggle is used -- every page needs to be loaded at the outset.

It would be much more useful if there were an option for "output as html" where only the top-level links were loaded at first, and sub-links were loaded only upon clicking the parent. I guess that would require creating a hierarchy of index.html files. It might be nice to use the native file system tree structure, so that the tree structure is evident without either scrapbook or loading the index.html files into a browser.
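A minimal sketch of what such a hierarchy of index.html files could look like, in Python. This is not how ScrapBook X exports anything; the nested-dict tree format and the names `write_index` and `safe_name` are hypothetical, made up for illustration. Each folder gets its own directory and its own small index.html, so a browser only loads one level at a time.

```python
import html
import re
from pathlib import Path


def safe_name(title):
    """Turn a folder title into a conservative directory name (hypothetical helper)."""
    return re.sub(r"[^A-Za-z0-9._-]+", "_", title).strip("_") or "folder"


def write_index(node, out_dir):
    """Write one index.html for `node`, then recurse into its sub-folders.

    Assumed (hypothetical) tree format:
      folder: {"title": str, "children": [...]}
      page:   {"title": str, "href": str}
    Page hrefs are written as given; a real export would have to make them
    relative to each index.html or absolute.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = [
        "<!DOCTYPE html>",
        '<meta charset="utf-8">',
        f"<title>{html.escape(node['title'])}</title>",
        "<ul>",
    ]
    for i, child in enumerate(node.get("children", [])):
        if "children" in child:
            # Sub-folder: link to its own index.html instead of inlining it.
            sub = f"{i:03d}-{safe_name(child['title'])}"
            write_index(child, out_dir / sub)
            href = f"{sub}/index.html"
        else:
            href = child["href"]
        lines.append(
            f'<li><a href="{html.escape(href, quote=True)}">'
            f"{html.escape(child['title'])}</a></li>"
        )
    lines.append("</ul>")
    (out_dir / "index.html").write_text("\n".join(lines), encoding="utf-8")
```

The numeric prefix on each directory name keeps the saved order visible in a plain file manager and avoids clashes between folders with the same title.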

Having a decent way to access ScrapBook data without ScrapBook is essential for the day when ScrapBook is no more, so that people don't lose everything they've saved. Right now, without ScrapBook itself to parse the rdf file, the saved data is organized only by date.

pbpb avatar Nov 15 '17 20:11 pbpb

Which browser are you going to use? ScrapBook X doesn't work in Firefox >= 57, and we are probably not going to add new features. If you are going to use the latest browser, consider Web ScrapBook.

danny0838 avatar Nov 16 '17 02:11 danny0838

My intention is to switch to FF 52esr once enough of my extensions work on it. Then I would use that as long as I could, since FF seems to be on a losing trajectory. That probably means for years, until web sites stop working in 52. Does Web ScrapBook (a) either operate from or import from the ScrapBook [X] data structure, and (b) already have something similar to what I'm asking for? It sounds like it does, but that is not clear to me.

The main reason I am interested in the feature is so that I don't lose all the info in my scrapbook data when no browser supports it any more. The current result of saving to html is unusable for a large amount of data, and there is no simple way to convert it to something useful. I would have to write my own parser for the rdf file which sounds like a huge one-time learning project.
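For what it's worth, reading the folder tree out of an RDF/XML file may be less work than it sounds. The sketch below assumes (I have not checked this against a real scrapbook.rdf) that folders are stored as ordinary `rdf:Seq` containers whose `rdf:li` children point at item URIs. Only the RDF namespace is standard; the function name `read_containers` and any URIs such as `urn:scrapbook:root` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Standard RDF syntax namespace.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def read_containers(rdf_text):
    """Map each rdf:Seq's rdf:about URI to the ordered list of its child URIs.

    Assumes folders are rdf:Seq elements with <rdf:li rdf:resource="..."/>
    children; anything else in the file is ignored.
    """
    root = ET.fromstring(rdf_text)
    containers = {}
    for seq in root.iter(f"{{{RDF}}}Seq"):
        about = seq.get(f"{{{RDF}}}about")
        containers[about] = [
            li.get(f"{{{RDF}}}resource")
            for li in seq.findall(f"{{{RDF}}}li")
        ]
    return containers
```

Starting from whichever URI turns out to be the root container, walking this mapping recursively would give the folder hierarchy; titles and other per-item properties would need a second pass over the item descriptions.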

pbpb avatar Nov 16 '17 17:11 pbpb

@pbpb I have this Python tool (https://bitbucket.org/himselfv/scraptools/overview) to export ScrapBook X data into a file/folder structure, with text files for notes and MHT / HTML for web exports.

I don't remember if the current revision works but at least some revisions exported everything fine. I was trying to make Scrapbook X run off the exported data but now with Firefox going away from extensions I guess there's no point.

himselfv avatar Nov 29 '17 14:11 himselfv

Thanks! I'll check it out.

pbpb avatar Nov 29 '17 18:11 pbpb

Hi! As a long time user of ScrapBook then ScrapBook+ and finally ScrapBook X, I have accumulated a huge history of documents, and actually I was using ScrapBook to save and archive a bunch of documents instead of the file system.

Given that there is no future for ScrapBook in Firefox, I have been looking for a way to finally export my entire ScrapBook directory properly. The import/export tools of ScrapBook X do not do it as I'd like: they export every item into the same folder, without keeping the tree of the collection. As it is now, this is a major blocker to a migration to the newest Firefox.

I understand that the Python tool just mentioned above is doing this, am I correct? Could you give instructions as to how to download, install and use this tool?

(ideally, it would be awesome if ScrapBook X itself offered a "true" export mechanism that pretty much converts the entire ScrapBook directory into its file system counterpart)

Thanks!

basille avatar Dec 27 '17 20:12 basille

Yep. I've added some more to the readme. I'm not sure how comfortable you are with command-line tools, but basically: download the scripts from the downloads section, install Python 2.7 (it has an installer) and do what the readme says. If you need help, ask here, I'll try to answer.

himselfv avatar Dec 27 '17 22:12 himselfv

Thanks @himselfv, I will definitely give it a try. I'm pretty familiar with command-line affairs, so no worries here. Your instructions and the README should make a good start.

basille avatar Dec 28 '17 03:12 basille

The current structure, which puts each item in its own directory, exists for uniqueness and to better handle links between items.

If you just want to keep the original structure working, you can consider the site indexer of Web ScrapBook.

If you do want to convert the whole database to a verbatim structure (#40), you can try ScrapBook X File Converter, but currently items can be converted to a verbatim structure only in zipped format or single HTML format.

We may consider implementing support for a verbatim data structure in Web ScrapBook in the future, but there are still many things to investigate beforehand.

danny0838 avatar Dec 28 '17 03:12 danny0838