The Doctor

345 comments by The Doctor

I honestly don't know. I just went back through my notes and even the version I built at work is on 22.04 LTS, out of concern that a later release...

@pirate It does not. I even tried running Sonic in one terminal and `archivebox update --index-only` in another (both in debug mode) and Sonic never received any connections from...

``` (archivebox) {12:01:47 @ Tue Aug 27} [drwho @ leandra:(7) ArchiveBox]$ archivebox version 0.7.2 ArchiveBox v0.7.2 BUILD_TIME=2024-08-19 15:32:40 1724106760 IN_DOCKER=False IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-6.6.8-arch1-1-x86_64-with-glibc2.38 PYTHON=Cpython FS_ATOMIC=True FS_REMOTE=False FS_USER=1000:1000 FS_PERMS=644 DEBUG=False...

``` (archivebox) {12:04:21 @ Tue Aug 27} [drwho @ leandra:(7) ArchiveBox]$ archivebox add 'https://example.com/#123456' [i] [2024-08-27 19:04:36] ArchiveBox v0.7.2: archivebox add https://example.com/ #123456 > /home/drwho/ArchiveBox [!] Warning: Missing 1 recommended...

`archivebox update --index-only` is executing right now. When it's done I'll paste the output. Incidentally, I installed Readability (I think) and enabled it (`archivebox config --set READABILITY_BINARY=/usr/bin/readable`) before this...

Oh! Something landed in logs/errors.log when I added that URL for example.com: ``` Exception in archive_methods.save_htmltotext(Link(url=https://example.com/#123456)) command=/home/drwho/archivebox/bin/archivebox add https://example.com/#123456; ts=2024-08-27__19:04:41 cannot access local variable 'cmd' where it is not associated...
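
For context, the wording "cannot access local variable 'cmd'" matches the message Python 3.11 and later attach to an `UnboundLocalError`. Below is a minimal sketch of one way that class of error can arise, assuming the variable is assigned inside a `try` block and then read in the error handler. The function names and the command are hypothetical stand-ins, not ArchiveBox's actual `save_htmltotext` code.

```python
def save_text_sketch(binary_available: bool) -> list[str]:
    """Hypothetical stand-in for an extractor step; not ArchiveBox code."""
    try:
        if not binary_available:
            # Fails before `cmd` is ever assigned.
            raise FileNotFoundError("extractor binary not found")
        cmd = ["some-extractor", "page.html"]  # hypothetical command
        return cmd
    except Exception as err:
        # Bug pattern: the handler reads `cmd`, which has no value on this path,
        # so Python raises UnboundLocalError instead of the intended error.
        raise RuntimeError(f"failed to run {cmd}") from err


def save_text_fixed(binary_available: bool) -> list[str]:
    """Same hypothetical step, with `cmd` given a value before the try block."""
    cmd: list[str] = []
    try:
        if not binary_available:
            raise FileNotFoundError("extractor binary not found")
        cmd = ["some-extractor", "page.html"]  # hypothetical command
        return cmd
    except Exception as err:
        # `cmd` is always bound here, so the intended error is raised.
        raise RuntimeError(f"failed to run {cmd}") from err


def trigger() -> str:
    """Return the message of the UnboundLocalError raised by the buggy sketch."""
    try:
        save_text_sketch(False)
    except UnboundLocalError as err:
        return str(err)
    return ""
```

Calling `trigger()` returns the interpreter's message naming `cmd`, while `save_text_fixed(False)` raises the intended `RuntimeError`. Whether the real failure follows this exact pattern is an assumption; the full traceback in logs/errors.log would confirm it.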

Contents of ~/ArchiveBox/archive/1724785476.865335/index.json: ``` { "archive_path": "archive/1724785476.865335", "base_url": "example.com/#123456", "basename": "", "bookmarked_date": "2024-08-27 19:04", "canonical": { "archive_org_path": "https://web.archive.org/web/example.com/#123456", "dom_path": "output.html", "favicon_path": "favicon.ico", "git_path": "git/", "google_favicon_path": "https://www.google.com/s2/favicons?domain=example.com", "headers_path": "headers.json", "htmltotext_path": "htmltotext.txt",...

Incidentally, I'm really looking forward to v0.8.x because the REST API should be part of that.

Is `archivebox update --index-only` supposed to create those /index.*/ files? I seem to recall reading in a couple of closed tickets' comments that this is deprecated and should not happen...

It just failed on me. Stack trace: ``` Traceback (most recent call last): File "/home/drwho/archivebox/lib/python3.11/site-packages/archivebox/search/__init__.py", line 25, in import_backend backend = import_module(backend_string) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module return...