Menu links of the downloaded site point to the online pages
Monolith would be an ideal tool for me to download a complete website. It downloads my WordPress-based website quickly and the home page works perfectly, but unfortunately all the menu links point to the online pages, so I could not get it to work fully offline. This is what I ran on Manjaro Linux/KDE:

monolith https://site-URL/ -b /home/balint/Desktop/B4X/B4X.html -o /home/balint/Desktop/B4X/B4X.html

Is this really not possible, or did I pass the wrong parameters?
Not the developer. I came here to create this same issue.
I've glanced quickly at the source code and looked at the flags, and there doesn't seem to be any functionality for this. The program simply walks through the page and embeds the resources it finds into a single output document.
You can see here that it simply copies the anchor tag: https://github.com/Y2Z/monolith/blob/2a8d5d7916347e7648a0a0a550b31cba3b75cdf1/src/html.rs#L1014-L1036
If the program recursively walked the links and built a local document tree, it would greatly increase how useful it is, imo.
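Just to illustrate what I mean (this is my own sketch, not monolith's actual code): a minimal breadth-first walk that collects same-host links, which could then each be handed to monolith. It assumes the reqwest crate (with the blocking feature) and the scraper crate; all names and the page limit are illustrative.

```rust
use std::collections::{HashSet, VecDeque};

use reqwest::Url;
use scraper::{Html, Selector};

/// Collects up to `limit` same-host pages reachable from `start` by following
/// anchor tags breadth-first. Each returned URL could then be archived
/// separately into its own self-contained .html file.
fn crawl(start: &str, limit: usize) -> Result<Vec<Url>, Box<dyn std::error::Error>> {
    let start = Url::parse(start)?;
    let mut seen: HashSet<Url> = HashSet::new();
    let mut queue: VecDeque<Url> = VecDeque::new();
    let mut pages = Vec::new();
    queue.push_back(start.clone());

    while let Some(url) = queue.pop_front() {
        if pages.len() >= limit {
            break;
        }
        if !seen.insert(url.clone()) {
            continue; // already visited
        }
        let body = reqwest::blocking::get(url.clone())?.text()?;
        pages.push(url.clone());

        // Collect anchor targets and keep only links on the same host.
        let doc = Html::parse_document(&body);
        let anchors = Selector::parse("a[href]").unwrap();
        for a in doc.select(&anchors) {
            if let Some(href) = a.value().attr("href") {
                if let Ok(link) = url.join(href) {
                    if link.host_str() == start.host_str() {
                        queue.push_back(link);
                    }
                }
            }
        }
    }
    Ok(pages)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Each collected URL could then be passed to monolith to produce one file per page.
    for page in crawl("https://example.com/", 20)? {
        println!("{page}");
    }
    Ok(())
}
```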
Hi @j-balint,
The -b option there is mostly meant for https:// URLs, e.g. to pull more resources from the internet when converting a conventionally saved file+folder HTML page locally. I think it could in theory work for file:// links, too. What if you try monolith https://site-url/ -b file:///home/balint/Desktop/B4X/ -o /home/balint/Desktop/B4X/B4X.html?
And also hello Ralph, you are absolutely correct, monolith is extremely dumb; but let's look on the bright side, at least it won't take over the world and travel through time to try and kill its creator, right?

Archiving child pages is something that's been requested since day one, and there are already programs that do that, but I can see how it could be handy to have every separate linked page in its own .html file, even if it means certain overhead. One of the problems would be sharing one of those documents, since they would try to link locally instead of externally.

There is work being done on making it possible to use monolith as a crate (library) rather than a stand-alone CLI, which would make it possible to create scrapers and follow links, hence powering browser extensions and server-side software, along with promoting the creation of monolith-based scrapers capable of archiving whole websites. I hope it sees the light of day soon.
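In the meantime, a rough sketch of what a monolith-based "whole site" archiver could look like today, by shelling out to the existing CLI once per page (only the -o flag is monolith's; the URL list, output directory, and file naming are assumptions, and the URLs could come from a crawler like the one sketched earlier in this thread). As noted above, the links inside each saved file would still point at the original site rather than at the neighbouring local files.

```rust
use std::process::Command;

/// Saves each URL as its own self-contained HTML file by invoking the
/// monolith CLI. Output files are named page-0.html, page-1.html, ... inside `out_dir`.
fn archive(urls: &[&str], out_dir: &str) -> std::io::Result<()> {
    std::fs::create_dir_all(out_dir)?;
    for (i, url) in urls.iter().enumerate() {
        let out = format!("{out_dir}/page-{i}.html");
        // -o is monolith's existing "write output to this file" flag.
        let status = Command::new("monolith")
            .arg(url)
            .arg("-o")
            .arg(&out)
            .status()?;
        if !status.success() {
            eprintln!("monolith exited with an error for {url}");
        }
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    // Hypothetical page list; links inside the saved files still point online.
    archive(&["https://example.com/", "https://example.com/about"], "archive")
}
```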