Tessa Walsh

Results: 216 comments by Tessa Walsh

> I sometimes use an ad blocker when I archive webpages with the extension; is that bad for my archiving files or the core elements of pages? Hi @hamoudak, the...

> We are impacted by this issue as well at Kiwix, we have a website to ZIM relying on `` as well. > > Should we also develop a custom...

Thanks for flagging this!
> mkdir: cannot create directory ‘/.local’: Read-only file system
> touch: cannot touch '/.local/share/applications/mimeapps.list': No such file or directory
> /usr/bin/google-chrome: line 45: /dev/fd/63: No such file or...

> Is it possible that all the changes needed can be accommodated by chrome flags that we could already configure with `CHROME_FLAGS` as described in the [README](https://github.com/webrecorder/browsertrix-crawler#configuring-chromium--puppeteer--pywb)? It is possible!...

> We can use pytest instead of "python setup.py test" without migrating -- @tw4l do you have a preferred direction that you're going towards for python testing? If not, I'm...

Ah looking at the context a little bit more, I'm certainly not opposed to moving to `pyproject.toml`. And poetry does seem nice, though we're not using it for any other...

> For "Skip test_capture_https_proxy" do you think there's an easy way to fix it? I faintly recall this is a urllib problem. Thankfully my past self thought to add the...

Pinging @ikreymer to weigh in on the release and Poetry questions. Thanks @white-gecko and @wumpus for all the work on those PRs!

Hi @Dooriin, the pages.jsonl file is meant to be an index of the HTML pages only, but you should be able to find everything that was crawled in the CDXJ...

Hi @Dooriin, sorry for the delayed response! This is something we're actually looking into now as we develop features around assisted crawl QA in Browsertrix Cloud. We have a PR...