Ivan Kozik

148 comments by Ivan Kozik

Which OS did it fail on? The `pyenv`-based build brings in its own Python headers.

But note https://github.com/chfoo/wpull/issues/131

CRIU might be another option for suspending/resuming crawls on Ubuntu 15.10/16.04+. It works by dumping/restoring a snapshot of the process to/from disk. As root, I managed to dump and...

I had some success with `criu dump --tcp-established --shell-job --ghost-limit 20000000 -t PID` and `criu restore --tcp-established --shell-job` (in a tmux) again, but unfortunately grab-site processes crash about 50% of...
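For anyone scripting this, here is a rough sketch that only assembles the two command lines quoted above as argument lists. The helper names `criu_dump_cmd` and `criu_restore_cmd` are made up for illustration, the flags are copied verbatim from the comment, and nothing here addresses the crashes mentioned.

```python
import shlex


def criu_dump_cmd(pid, ghost_limit=20000000):
    """Build the `criu dump` argv quoted in the comment (hypothetical helper)."""
    return [
        "criu", "dump",
        "--tcp-established",
        "--shell-job",
        "--ghost-limit", str(ghost_limit),
        "-t", str(pid),
    ]


def criu_restore_cmd():
    """Build the `criu restore` argv quoted in the comment (hypothetical helper)."""
    return ["criu", "restore", "--tcp-established", "--shell-job"]


if __name__ == "__main__":
    # Print the commands instead of running them; CRIU needs root,
    # and the PID below is only a placeholder.
    print(shlex.join(criu_dump_cmd(12345)))
    print(shlex.join(criu_restore_cmd()))
```

The lists can be passed to `subprocess.run(...)` as root once the PID is known; they are printed here so the sketch has no side effects.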

Yeah, it would be better if this worked for any crawl that includes a reddit URL, not just those that start with a reddit URL.

Similarly, send `Cookie: NCR=1` to all *.blogspot.com URLs
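A minimal sketch of the host check this would need, assuming the cookie should go to `blogspot.com` itself as well as its subdomains (the comment only says `*.blogspot.com`). The function name `extra_headers_for` is hypothetical and is not part of grab-site or wpull.

```python
from urllib.parse import urlsplit


def extra_headers_for(url):
    """Return extra request headers for a URL (hypothetical helper).

    Assumption: the bare domain is treated like its subdomains.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host == "blogspot.com" or host.endswith(".blogspot.com"):
        return {"Cookie": "NCR=1"}
    return {}
```

Matching on `.blogspot.com` with the leading dot avoids false positives on unrelated hosts such as `notblogspot.com`.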

I could not repro this on macOS 10.14 (with a homebrew install) just now, but systwi says it still happens on 10.13.6 (install method unknown).

I can't repro on macOS 11. Was this fixed in lxml?

Possible implementation strategy: Implement #59 so that the user can easily adjust delays on a per-domain basis. For each 429 response, add (# of connections being used * 1 second)...
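The visible part of that idea could look roughly like the sketch below. The comment is cut off, so this assumes the added time goes onto a per-domain delay that accumulates with each 429; the class `PerDomainDelay` and its methods are invented for illustration and do not exist in grab-site.

```python
class PerDomainDelay:
    """Track an extra delay per domain that grows on 429 responses.

    Hypothetical sketch: names and accumulation behavior are assumptions.
    """

    def __init__(self, base_delay=0.0):
        self.base_delay = base_delay  # seconds applied to every domain
        self._extra = {}              # domain -> extra seconds from 429s

    def record_429(self, domain, connections_in_use):
        # Add (number of connections in use * 1 second) for this domain.
        added = connections_in_use * 1.0
        self._extra[domain] = self._extra.get(domain, 0.0) + added
        return self.delay_for(domain)

    def delay_for(self, domain):
        # Current delay in seconds between requests to this domain.
        return self.base_delay + self._extra.get(domain, 0.0)
```

Keying the state by domain keeps a rate-limited host from slowing down requests to unrelated hosts in the same crawl.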

grab-site currently doesn't really have anyone developing it (I just try to keep the install steps working), but I have no objections to the addition of WACZ support.