Nick Sweeting
Nick Sweeting
Just run this every few months: ```bash cat list_of_urls.txt | archivebox add ``` Or use the built-in scheduling feature: https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving Then to view your archive run: ```bash archivebox setup archivebox...
Oh this is just a bug in getpwuid not being able to resolve a numbered user to a name across different OSs. It's not intended to enforce any kind of...
Probably not going to build this into archivebox core, but you can definitely cobble something together using the CLI or REST API.
`/core/snapshot/add` is the REST endpoint you want, you can find the source for it here: `archivebox/core/views.py`:`AddView.form_valid(...)` and `archivebox/core/forms.py`:`AddLinkForm`. Use the `dev` version of ArchiveBox, as the REST API is still...
Hmm not sure I'm comfortable building this much OneNote-specific functionality into ArchiveBox. Is there any way this could be broken out as a separate tool? It could pull from the...
I'd say this is currently our 2nd most-requested feature :) It's definitely in the roadmap: #120 but not planned anytime soon because it's not ArchiveBox's primary use-case and it's extremely...
Not for a while, it's a very tricky feature to implement natively, I'd rather integrate an existing crawler and use ArchiveBox to just process the generated stream of URLs. Don't...
You can already archive everything on a user's feed with the `--depth=1` flag, it just doesn't support depth >1. However, you can achieve full recursive archiving if you do multiple...
@francwalter I've added a new option ~~`URL_WHITELIST`~~ `URL_ALLOWLIST` in 5a2c78e for this usecase. Here's an example of how to exclude everything except for URLs matching `*.example.com`: ```bash export URL_ALLOWLIST='^http(s)?:\/\/(.+)?example\.com\/?.*$' #...
Set it to the absolute path of `./node_modules/@postlight/mercury-parser/cli.js` not the relative one. e.g. `/Users/yourusername/archivebox/node_modules/@postlight/mercury-parser/cli.js`