warcworker issues

Results 7 warcworker issues

FileNotFoundError after clicking Start archiving

Hi, I'm exploring tools for crawling social media. I got a FileNotFoundError after starting a crawl. I chose `scroll_everything` as the script. The error begins: `FileNotFoundError: [Errno 2] No such file or...`

nvanderperren

Only pulled one page

* How do I pull an entire website with this?
* How do I see what it is doing internally?

tripleo1

Add a screenshot to the README

![image](https://user-images.githubusercontent.com/19284/43676896-e4c3276e-97f9-11e8-815c-0ab5c1cc254f.png)

peterk

Save screenshot in job folder

Screenshots are currently saved in the root archive folder. It would be great to have them saved in the job dir instead.

Segerberg

[Question] Squidwarc Frontier Management and long scalable crawls

One of the use cases I have wanted to support in Squidwarc is multiple worker crawlers populating and pulling from a single master frontier. As well as a move from...

N0taN3rd

Make it possible to configure order of user scripts

When selecting which user scripts to run, make it possible to configure the order.

peterk

Rewrite the worker using JavaScript instead of Python

The worker currently uses Python 3.6 compiled from source. It could probably just as well use the bundled JavaScript facilities from the base image to work on queue items....

peterk
