Web crawler with very flexible crawling options. It can be used standalone or with Resque to perform clustered crawls.

Results 21 cobweb issues

Bumps [addressable](https://github.com/sporkmonger/addressable) from 2.3.8 to 2.8.1. Changelog Sourced from addressable's changelog. Addressable 2.8.1 refactor Addressable::URI.normalize_path to address linter offenses (#430) remove redundant colon in Addressable::URI::CharacterClasses::AUTHORITY regex (#438) update gemspec to...

dependencies

Bumps [sinatra](https://github.com/sinatra/sinatra) from 1.4.8 to 2.2.0. Changelog Sourced from sinatra's changelog. 2.2.0 / 2022-02-15 Handle EOFError raised by Rack and return Bad Request 400 status. #1743 by tamazon Minor refactors...

dependencies

Bumps [sidekiq](https://github.com/mperham/sidekiq) from 3.5.3 to 6.4.0. Changelog Sourced from sidekiq's changelog. 6.4.0 SECURITY: Validate input to avoid possible DoS in Web UI. Add strict argument checking #5071 Sidekiq will now...

dependencies

Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.8.5 to 1.12.5. Release notes Sourced from nokogiri's releases. 1.12.5 / 2021-09-27 Security [JRuby] Address CVE-2021-41098 (GHSA-2rr5-8q37-2w7h). In Nokogiri v1.12.4 and earlier, on JRuby only, the SAX...

dependencies

Hello, this looks like a great crawler, but I need a way to cap crawl time on a per-URL basis. Because of that, I recommend two features:...

After several years of happy operation our Cobweb-dependent crawler ran into a page at https://sso.cas.org/ where the `` contains this `` tag: ``` ``` Our log file was reporting >...

Bumps [rack](https://github.com/rack/rack) from 1.6.11 to 2.2.3. Commits 1741c58 bump version 5ccca47 When parsing cookies, only decode the values a5e80f0 Bump version. b0de37d Remove trailing whitespace. 1a784e5 Prepare CHANGELOG for next...

dependencies

Bumps [rake](https://github.com/ruby/rake) from 12.3.0 to 13.0.1. Changelog *Sourced from [rake's changelog](https://github.com/ruby/rake/blob/master/History.rdoc).* > === 13.0.1 > > ==== Bug fixes > > * Fixed bug: Reenabled task raises previous exception on...

dependencies

Bumps [haml](https://github.com/haml/haml) from 4.0.7 to 5.1.2. Changelog *Sourced from [haml's changelog](https://github.com/haml/haml/blob/master/CHANGELOG.md).* > ## 5.1.2 > > Released on August 6, 2019 > ([diff](https://github.com/haml/haml/compare/v5.1.1...v5.1.2)). > > * Fix crash in some...

dependencies

Hi, to start, thank you for an excellent piece of work. Appreciated. I'm trying to use this to crawl the site http://www.udemy.com/. I added it to my Gemfile, bundle...