[feature] Local Browser-Based Scraping - Maybe through the chrome extension?

Open renzobuendia opened this issue 9 months ago • 5 comments

Version and OS docker

Is your feature request related to a problem? Please describe. No, it adds more ways to scrape websites.

Describe the solution you'd like Support local browser-based scraping, where the user’s own browser (Chrome, Edge, Firefox) is used to monitor website changes. This should be an additional option alongside the existing Playwright-based scraping.

Users should be able to choose between:

  1. Local Browser Scraping – Uses the browser installed on the system to load pages, execute JavaScript, and extract content.
  2. Playwright-Based Scraping – Uses headless Playwright for automated website monitoring.

Both modes should allow:

• Monitoring JavaScript-heavy websites.
• Bypassing bot detection more effectively.
• Extracting content using CSS/XPath selectors.
• Configurable check intervals to prevent detection.
• Secure, private data storage (no cloud dependencies).
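As a rough illustration of the selector-based extraction and check intervals listed above, here is a minimal standard-library sketch. It is not how changedetection.io implements these features; the function names (`extract_text`, `fingerprint`, `has_changed`, `next_delay`) are hypothetical, and `xml.etree.ElementTree` is used only because it ships with Python. It supports a limited XPath subset and needs well-formed markup, so a real scraper would use an HTML-tolerant parser.

```python
import hashlib
import random
import xml.etree.ElementTree as ET


def extract_text(markup: str, xpath: str) -> str:
    """Return the text of every element matching `xpath`, one per line.

    Assumes well-formed markup; ElementTree only understands a limited
    XPath subset.
    """
    root = ET.fromstring(markup)
    parts = ["".join(el.itertext()).strip() for el in root.findall(xpath)]
    return "\n".join(parts)


def fingerprint(text: str) -> str:
    """Hash the extracted text so only a short digest has to be stored."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, text: str) -> bool:
    """Compare the stored digest with the digest of the fresh text."""
    return previous_fingerprint != fingerprint(text)


def next_delay(base_seconds: float, jitter: float = 0.2) -> float:
    """Randomise the check interval by +/- `jitter` (a fraction of base)."""
    return base_seconds * (1 + random.uniform(-jitter, jitter))
```

A check loop would store `fingerprint(...)` after each fetch, sleep for `next_delay(...)` seconds, and notify only when `has_changed(...)` returns true. Jittering the interval makes the requests less regular, though whether that helps against any particular bot detection is an assumption.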

Describe the use-case and give concrete real-world examples Scraping login-protected pages – Since the local browser keeps session cookies, users can monitor content behind logins without having to re-authenticate programmatically.

• Example: Tracking private dashboards or web-based email inboxes for new messages.

Additional context Could look something like this:

Image

renzobuendia avatar Mar 27 '25 18:03 renzobuendia

Since there is a changedetection.io browser extension for Chrome, instead of using CDP to drive Chrome (which is what Playwright does), maybe we can get the content through the extension somehow, hmm.
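For context, driving an already-running local Chrome over CDP with Playwright might look roughly like the sketch below. It assumes Chrome was started with `--remote-debugging-port=9222` and that the `playwright` Python package is installed; the helper names (`cdp_endpoint`, `fetch_text_via_local_chrome`) are hypothetical and this is not code from changedetection.io.

```python
def cdp_endpoint(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the HTTP endpoint of a Chrome started with remote debugging.

    The default port 9222 is an assumption; use whatever port was passed
    to --remote-debugging-port.
    """
    return f"http://{host}:{port}"


def fetch_text_via_local_chrome(url: str, selector: str, endpoint: str) -> str:
    """Attach to a running Chrome over CDP and return one element's text."""
    # Imported lazily so this module loads even without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(endpoint)
        # Reuse the browser's existing context (and therefore its cookies)
        # when one is exposed; otherwise fall back to a fresh context.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        try:
            page.goto(url, wait_until="domcontentloaded")
            return page.inner_text(selector)
        finally:
            # Close only the tab we opened, not the user's browser.
            page.close()
```

A call such as `fetch_text_via_local_chrome("https://example.com", "h1", cdp_endpoint())` would open a tab in the running browser, read the element, and close the tab again. This still relies on Playwright and on the browser staying open, so it does not remove either dependency.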

dgtlmoon avatar Mar 28 '25 11:03 dgtlmoon

  1. Local Browser Scraping – Uses the browser installed on the system to load pages, execute JavaScript, and extract content.

No, this can't work in this way. That's not how browser scraping works when you're not an extension; this whole project is based on Playwright driving the browser, running as its own software.

It's NOT possible to drive Chrome without using Playwright or something similar.

dgtlmoon avatar Mar 28 '25 11:03 dgtlmoon

I will open a new feature request.

dgtlmoon avatar Mar 28 '25 11:03 dgtlmoon

The other thing that will be MASSIVELY confusing for people is that they would need to keep a browser open locally. Isn't that a problem? What do you think?

dgtlmoon avatar Mar 28 '25 11:03 dgtlmoon

It has its pros and cons, but this is essentially how I use Distill.io. Like most people, I always have a browser open. The main benefit is that it removes the need to set up Playwright.

Also, if I understand correctly, it lets you leverage the cookies and logins already established in the browser. For instance, there is a website where I can't get the authentication step to work with ChangeDetection, while Distill.io has no issue as I'm already logged in.

For me, the main issue with the Chrome extension is that it requires the user to use Chrome...

Batwam avatar Jun 06 '25 15:06 Batwam