scala-scraper
A Scala library for scraping content from HTML pages
Updates [org.scoverage:sbt-scoverage](https://github.com/scoverage/sbt-scoverage) from 2.0.0 to 2.0.2. [GitHub Release Notes](https://github.com/scoverage/sbt-scoverage/releases/tag/v2.0.2) - [Version Diff](https://github.com/scoverage/sbt-scoverage/compare/v2.0.0...v2.0.2) I'll automatically update this PR to resolve conflicts as long as you don't change it yourself. If you'd...
Just like https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver, we can use a real browser.
@ruippeixotog You did a great job with this scraper! Is there a way to extract the content that a page loads asynchronously after it has finished its initial rendering?
Before, `.siblings` would be inferred to be an `Iterable[Element#ThisType]` instead of an `Iterable[JsoupElement]`. The former lacks a lot of functionality.
- Copyright year
- Added links to CSS selector resources
Running `browser.get()` or `browser.post()` on Heroku keeps returning org.JsoupHttpStatusException 
I tried to parse a big table element with `ContentExtractors.table`, but the `buildRow` and `buildTable` methods are not tail-recursive. As a result, `ContentExtractors.table` threw a `StackOverflowError`. The URL that failed to parse: http://www.tipness.co.jp/schedule/SHP063/month
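To illustrate the kind of change this report points toward, here is a minimal sketch of accumulator-based, tail-recursive row and table building. It is not the library's code: `Cell`, `expandRow`, and `collectRows` are hypothetical names made up for this example, and the real `buildRow` and `buildTable` may handle more cases (for instance row spans) than shown here.

```scala
import scala.annotation.tailrec

object TailRecTableSketch {
  // Hypothetical cell model: the cell text plus the number of columns it spans.
  final case class Cell(text: String, colspan: Int = 1)

  // Expand one row so that a cell with colspan n fills n columns.
  def expandRow(cells: List[Cell]): Vector[String] = {
    @tailrec
    def loop(rest: List[Cell], acc: Vector[String]): Vector[String] = rest match {
      case Nil          => acc
      case cell :: tail => loop(tail, acc ++ Vector.fill(cell.colspan)(cell.text))
    }
    loop(cells, Vector.empty)
  }

  // Collect all rows with an accumulator, so stack depth stays constant
  // no matter how many rows the table has.
  def collectRows(rows: List[List[Cell]]): Vector[Vector[String]] = {
    @tailrec
    def loop(rest: List[List[Cell]], acc: Vector[Vector[String]]): Vector[Vector[String]] =
      rest match {
        case Nil         => acc
        case row :: tail => loop(tail, acc :+ expandRow(row))
      }
    loop(rows, Vector.empty)
  }
}
```

Because both inner loops are annotated with `@tailrec`, the compiler rejects them if they ever stop being tail calls, which guards against the stack overflow regressing.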
When parsing pages in a foreign language - a common use case for this library - it is sometimes necessary to parse dates formatted in another locale (e.g. different month...
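For context, locale-aware date parsing can be sketched with plain `java.time`, independent of this library. The helper `parseDate` and the sample pattern are assumptions made for this example, not part of scala-scraper's API.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatterBuilder
import java.util.Locale

object LocaleDateSketch {
  // Hypothetical helper: parse `text` using `pattern`, resolving month and
  // day names according to the given locale.
  def parseDate(text: String, pattern: String, locale: Locale): LocalDate = {
    val formatter = new DateTimeFormatterBuilder()
      .parseCaseInsensitive() // accept "Juillet" as well as "juillet"
      .appendPattern(pattern)
      .toFormatter(locale)
    LocalDate.parse(text, formatter)
  }
}
```

For example, `LocaleDateSketch.parseDate("14 juillet 2017", "d MMMM yyyy", Locale.FRENCH)` should yield the same value as `LocalDate.of(2017, 7, 14)`.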
It would be nice if the browsers had an asynchronous version of `get`; that way you could do several page loads at once. As a workaround, can I use...
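One possible workaround, sketched under assumptions, is to wrap each blocking page load in a `Future` and run them concurrently. Here `blockingFetch` is a hypothetical stand-in for a blocking call such as `browser.get(url)`, and `fetchAll` is a made-up helper name; whether a given browser instance is safe to share across threads is not something this sketch addresses.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, blocking}
import scala.concurrent.duration._

object AsyncFetchSketch {
  // Hypothetical stand-in for a blocking page load such as `browser.get(url)`.
  def blockingFetch(url: String): String = s"<html><!-- $url --></html>"

  // Start one Future per URL and gather the results in the original order.
  def fetchAll(urls: List[String])(implicit ec: ExecutionContext): Future[List[String]] =
    Future.traverse(urls) { url =>
      // `blocking` tells the execution context that this task may block.
      Future(blocking(blockingFetch(url)))
    }
}
```

A caller could then write `Await.result(AsyncFetchSketch.fetchAll(urls)(ExecutionContext.global), 30.seconds)`, or better, keep composing on the returned `Future` instead of blocking on it.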
As you probably know, Chrome now supports a headless mode (https://developers.google.com/web/updates/2017/04/headless-chrome), and one way to call it is through WebDriver. Are there any plans for scala-scraper to support headless Chrome?