wombat
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Hello, I noticed some strange behaviour in Wombat. Let's say I want to crawl 2 websites. At first I was using Typhoeus and regex to crawl websites, but there was one...
When the request URL is **http://info.ntust.edu.tw/faith/edua/app/qry_linkoutline.aspx?semester=1031&courseno=ET5117701**, it fails with the error **encoding error : input conversion failed due to input error, bytes 0xA8 0x8B 0xE8 0xB3**, but other course pages do not. I...
This is my solution for following a bunch of links without throwing an exception on the first 404. I think it'd be prettier to add a method to the DSL,...
# Description This PR is supposed to solve #64 by adding a `:url` locator. It can be used from a `:follow` or from the root page. > I have also added...