cobweb
Web crawler with very flexible crawling options. It can be used standalone or with Resque to perform clustered crawls.
Bumps [addressable](https://github.com/sporkmonger/addressable) from 2.3.8 to 2.8.1. Changelog Sourced from addressable's changelog. Addressable 2.8.1 refactor Addressable::URI.normalize_path to address linter offenses (#430) remove redundant colon in Addressable::URI::CharacterClasses::AUTHORITY regex (#438) update gemspec to...
Bumps [sinatra](https://github.com/sinatra/sinatra) from 1.4.8 to 2.2.0. Changelog Sourced from sinatra's changelog. 2.2.0 / 2022-02-15 Handle EOFError raised by Rack and return Bad Request 400 status. #1743 by tamazon Minor refactors...
Bumps [sidekiq](https://github.com/mperham/sidekiq) from 3.5.3 to 6.4.0. Changelog Sourced from sidekiq's changelog. 6.4.0 SECURITY: Validate input to avoid possible DoS in Web UI. Add strict argument checking #5071 Sidekiq will now...
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.8.5 to 1.12.5. Release notes Sourced from nokogiri's releases. 1.12.5 / 2021-09-27 Security [JRuby] Address CVE-2021-41098 (GHSA-2rr5-8q37-2w7h). In Nokogiri v1.12.4 and earlier, on JRuby only, the SAX...
Hello -- this looks like a great crawler, but I need a way, when crawling, to max-out crawl times on a per-url basis. Because of that I recommend two features:...
After several years of happy operation our Cobweb-dependent crawler ran into a page at https://sso.cas.org/ where the `` contains this `` tag: ``` ``` Our log file was reporting >...
Bumps [rack](https://github.com/rack/rack) from 1.6.11 to 2.2.3. Commits 1741c58 bump version 5ccca47 When parsing cookies, only decode the values a5e80f0 Bump version. b0de37d Remove trailing whitespace. 1a784e5 Prepare CHANGELOG for next...
Bumps [rake](https://github.com/ruby/rake) from 12.3.0 to 13.0.1. Changelog Sourced from [rake's changelog](https://github.com/ruby/rake/blob/master/History.rdoc). 13.0.1 Bug fixes: Fixed bug: Reenabled task raises previous exception on...
Bumps [haml](https://github.com/haml/haml) from 4.0.7 to 5.1.2. Changelog Sourced from [haml's changelog](https://github.com/haml/haml/blob/master/CHANGELOG.md). 5.1.2 Released on August 6, 2019 ([diff](https://github.com/haml/haml/compare/v5.1.1...v5.1.2)). Fix crash in some...
Hi, To start, thank you for an excellent piece of work. Appreciated. I'm trying to use this to crawl the site http://www.udemy.com/ . I added it to my Gemfile, bundle...