ArchiveBot
                                
ArchiveBot, an IRC bot for archiving websites
From jobs `9h7sqivag9rq0rdqkzcr20efo` and `3is0spcpeysoxm5ccszzwjp92`
The forums igset is becoming quite messy: * It's large yet incomplete. There are currently 57 ignores in it, but this still doesn't cover many forum software packages, including some that...
After job completion, `!status $jobid` prints very little information. It would be useful to also have total size, number of responses and errors, and perhaps some other metrics available, e.g....
If there is a queue of 5 or more jobs and a voiced user tries to `!a` a URL for which a job exists already, the bot responds with (#337):...
`!a[o]` followed by `!yahoo` is a common pattern for some types of jobs, e.g. lists of tweets or individual Facebook pages. It would be nice if there were a `--yahoo`...
The parser will happily accept `!a https://example.org/--concurrency 1`. Note that there is no space after the URL. This will create a job for `https://example.org/--concurrency`, and the extra `1` argument will...
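A minimal sketch of one way to catch this case, assuming the check runs on the URL token before the job is created. The option list and the function name `find_glued_option` are hypothetical and are not taken from ArchiveBot's parser.

```python
# Hypothetical option names, for illustration only; not ArchiveBot's actual list.
KNOWN_OPTIONS = ("--concurrency",)


def find_glued_option(url_token):
    """Return the option name fused onto the end of the URL token, or None."""
    for option in KNOWN_OPTIONS:
        if url_token.endswith(option):
            return option
    return None


# The problematic input from the report: no space between URL and option.
print(find_glued_option("https://example.org/--concurrency"))
# A well-formed URL token is left alone.
print(find_glued_option("https://example.org/"))
```

A check like this could reject the command or ask the user to confirm, rather than silently creating a job for the fused URL.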
Would it be possible to add this mid-grab (say, if the grab is running out of control, getting spam sites, etc.)? Job 5vqqwfgciq4y7ksb6e4olk5ao is a key example of when this...
Sometimes I need to start a custom pipeline for a specific website, but if there's stuff in the ArchiveBot queue, it fills up the pipeline with other crawls. Maybe this...
I'm getting "too many open files" errors on some of my runs. This is on 20170510.02 with supposedly lots of disk space and memory left. Examples: ``` OSError: [Errno 24]...
This is a more focused portion of #132. Ignore patterns are just JSON files and can be read from disk every time an ignore set is requested. This ensures that...
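The idea of reading an ignore set from disk on every request can be sketched as follows. This is an illustration under assumptions, not ArchiveBot's code: the function name `load_ignore_set`, the one-file-per-set naming, and the `{"patterns": [...]}` layout are all hypothetical.

```python
import json
from pathlib import Path


def load_ignore_set(directory, name):
    """Read <directory>/<name>.json from disk and return its patterns.

    There is deliberately no cache: every call re-reads the file, so an
    edit on disk is visible to the next request.
    Assumed layout (not confirmed by the source): {"patterns": ["regex", ...]}
    """
    path = Path(directory) / f"{name}.json"
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    return list(data.get("patterns", []))
```

Calling `load_ignore_set("ignore_sets", "forums")` twice, with an edit to the file in between, would return the updated patterns on the second call. The trade-off is one small file read per request in exchange for never serving stale patterns.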