`Crawly.Fetchers.Fetcher` implementation for Playwright
Crawly currently has a fetcher implementation for Splash: https://github.com/elixir-crawly/crawly/blob/5eeeb2a3ba230ee55d2411a64f9e426957dc8c40/lib/crawly/fetchers/splash.ex
I tend to use Playwright (or Puppeteer if I only care about Chromium) for browser automation and testing, so it'd be cool to be able to use some of its functionality from Crawly.
The only thing I'm unsure of is whether or not Playwright exposes a requests page/API like Splash does:
Splash exposes the `render.html` endpoint, which renders whatever page is passed in the `url` GET parameter and returns the resulting HTML.
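For reference, here's a rough sketch of how a request URL for that `render.html` endpoint can be put together. `SplashUrlSketch` and `render_url/3` are made-up names for illustration (not part of Crawly), and the `http://localhost:8050` address in the usage line is just an assumed local Splash instance:

```elixir
defmodule SplashUrlSketch do
  @moduledoc """
  Hypothetical helper, not part of Crawly: builds the URL that a
  Splash-style fetcher would request instead of the target page itself.
  """

  # base_url:     the Splash render.html endpoint (assumed to be provided by the caller)
  # target_url:   the page we actually want rendered
  # extra_params: optional keyword list of additional query parameters
  def render_url(base_url, target_url, extra_params \\ []) do
    query =
      extra_params
      # The target page travels as the `url` query parameter.
      |> Keyword.put(:url, target_url)
      # URI.encode_query/1 percent-encodes keys and values.
      |> URI.encode_query()

    base_url <> "?" <> query
  end
end
```

Usage would look something like `SplashUrlSketch.render_url("http://localhost:8050/render.html", "https://example.com", wait: 1)`. Whether Playwright can be driven through a similar single-endpoint request is exactly the open question here.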
I might end up picking this up, but I figured I'd create an issue beforehand. 😄
Hard to say. I haven't had a chance to explore either of these tools. In some of my previous projects PhantomJS was used for browser rendering, but it no longer seems to be actively maintained.
It would be interesting to see an example fetcher for Playwright or Puppeteer. Maybe we can add it to Crawly as a standard fetcher :) Just let me know how it goes!
As a non-Elixir example, I just built a scraper for sites that will save each page as a PDF using Playwright: https://github.com/Nezteb/scrape-pdf
Next weekend I'll see what I can do about a crawly fetcher for it!
https://github.com/mechanical-orchard/playwright-elixir will probably be able to support what you are looking for.
Oh nice, I'll check that out! I'll see if I can get a minimal demo of Crawly using playwright-elixir as the fetcher!