
[Feature]: Add support for undetectable browsers

Open plord12 opened this issue 1 year ago • 1 comments

Web scraping has become quite tricky recently, with CDNs detecting bots pretty easily. It would be great if we could add support for one of the "undetectable" browsers. For example -

  • https://github.com/daijro/camoufox
  • https://secretagent.dev/

Does the upstream have similar features?

Camoufox does support Playwright, but only with Python.

Is your feature request related to a problem? Please describe.

Many websites now spot playwright-go / playwright-go-stealth and block access.

Describe the solution you'd like

In an ideal world, it would be as simple as -

playwright.Install(&playwright.RunOptions{Browsers: []string{"camoufox"}})
...
pw.Camoufox.Launch()

Additional context

Camoufox v132.0-beta.15 and earlier supplied a launch binary which was usable with playwright-go by simply setting ExecutablePath:

pw.Firefox.Launch(playwright.BrowserTypeLaunchOptions{ExecutablePath: playwright.String("launch")})

However, launch has since been removed.

To keep my scraping going, I've taken the launch source into my project, https://github.com/plord12/webscrapers (specifically https://github.com/plord12/webscrapers/tree/main/launch and https://github.com/plord12/webscrapers/blob/main/utils/utils.go#L140). This works with the latest Camoufox release.

plord12 avatar Nov 30 '24 15:11 plord12

I'm a new user of playwright-go, but I was able to run it with Camoufox as a server. You need to run it as two separate processes:

Install the latest version of camoufox

pip install -U camoufox[geoip]==0.4.6

Then install the latest version of playwright-go (not yet released at the time of writing)

go get github.com/playwright-community/playwright-go@b6313db

And install the drivers (in my case this fails because I use Fedora, but the driver is still installed)

go run github.com/playwright-community/playwright-go/cmd/playwright@b6313db install --with-deps

Then start Camoufox as a server. It prints the WebSocket endpoint on startup:

Launching server...
Server launched: 683.392ms
Websocket endpoint: ws://localhost:38835/de22fd1ffd383c1d3669beada9e7eb73 

Use that endpoint in your script:

package main

import (
	"log"
	"github.com/playwright-community/playwright-go"
)

func main() {
	pw, err := playwright.Run()
	if err != nil {
		log.Fatalf("oh lala: %v", err)
	}

	browser, err := pw.Firefox.Connect("ws://localhost:38835/de22fd1ffd383c1d3669beada9e7eb73")
	if err != nil {
		log.Fatalf("could not connect to browser: %v", err)
	}

	page, err := browser.NewPage()
	if err != nil {
		log.Fatalf("could not create page: %v", err)
	}
	if _, err = page.Goto("http://127.0.0.1:8080"); err != nil {
		log.Fatalf("could not goto: %v", err)
	}

	// The original snippet is cut off here, partway through another call on page.
}

dlouvier avatar Dec 08 '24 16:12 dlouvier