wombat
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
_`wombat.rb` exactly as in [README](https://github.com/felipecsl/wombat/blame/master/README.md#L21-L48); added a `require 'net/http/digest_auth'` but it didn't matter because of load order._ ``` bash % ruby --version ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19] ``` ```...
Wombat can't parse local files: `` /.gem/ruby/2.3.1/gems/wombat-2.5.1/lib/wombat/processing/parser.rb:33:in `block (2 levels) in initialize': undefined method `content_type' for # (NoMethodError) ``
The "API Documentation" link on: http://felipecsl.com/wombat/ points to http://rubydoc.info/gems/wombat/2.1.1/frames On that page, the "API Documentation" link points to https://www.rubydoc.info/gems/wombat/2.0.0/frames and so on. Unrelated, gemnasium badge is reporting errors. Hope this...
Thanks for the work you do with this gem. Hello, I need to remove multiple nodes by CSS class: ``` .media .ads .cite-content ``` How do I remove a nodes...
Is it possible to crawl through the images and get the image source?
Small API improvement idea: instead of ``` stuff({ css: 'div.some-class' }, :list) ``` I want to be able to write: ``` stuff :list, css: 'div.some-class' ``` To me, this reads...
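To make the proposal concrete, here is a minimal sketch of how a property method could accept both argument orders. It is plain Ruby and not Wombat's actual implementation: `normalize_property_args` is a hypothetical helper, and the `:text` fallback is an assumption made for this sketch.

```ruby
# Hypothetical helper, not part of Wombat: accepts the selector and the
# format in either order and returns them as [selector, format].
def normalize_property_args(*args)
  # The selector is the first Hash or String among the arguments.
  selector = args.find { |arg| arg.is_a?(Hash) || arg.is_a?(String) }
  # The format is the first Symbol; :text is an assumed fallback.
  format = args.find { |arg| arg.is_a?(Symbol) } || :text
  raise ArgumentError, 'a selector is required' if selector.nil?

  [selector, format]
end

# Current style: selector hash first, then the format.
current = normalize_property_args({ css: 'div.some-class' }, :list)

# Proposed style: format first, selector as a trailing brace-less hash.
proposed = normalize_property_args(:list, css: 'div.some-class')

puts current == proposed
```

Because the helper dispatches on argument type rather than position, existing calls would keep working while the shorter form becomes available.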
Hey, is there a way to get the URL of the followed link? Something like:

```ruby
products 'css=.products-grid .item .product-name a', :follow do |url|
  url url
  title 'css=.product-name h1'
end
```
I found that a `page` function exists, and I guess it is used for a URL parameter named page, so I wrote `page: 2`. But the page function does not work. It gives...
I think this library should have a function to customize HTTP headers. http://docs.seattlerb.org/mechanize/HTTP/Agent.html#Headers
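As a rough illustration of what the request asks for, the sketch below attaches caller-supplied headers to a request using Ruby's standard `net/http` library. It does not use Wombat or Mechanize; `build_request` and the header values are hypothetical, and no network call is made.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds a GET request object carrying the given headers.
# The request is only constructed here, never sent.
def build_request(url, headers = {})
  request = Net::HTTP::Get.new(URI(url))
  headers.each { |name, value| request[name] = value }
  request
end

request = build_request(
  'http://example.com/',
  'User-Agent' => 'my-crawler/1.0',
  'Accept-Language' => 'en'
)

# Header lookups are case-insensitive.
puts request['user-agent']
puts request['accept-language']
```

A crawler-level option could forward a hash like this to whatever HTTP client the library uses underneath.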