Implement stealth mode

Open route opened this issue 4 years ago • 20 comments
https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-stealth

route avatar Jan 26 '21 05:01 route

Would be great if it could pass those tests

balt5r avatar Jan 31 '21 03:01 balt5r

undetected_chromedriver might also be a good reference.

It would also probably make sense to add intoli's checks to the specs. They are on GitHub as well (here and here).

alexanderadam avatar Feb 20 '21 10:02 alexanderadam

@route Any thoughts on adding this in? We've been using ferrum for a while now and started getting blocked on one of the sites.

I'm happy to take a cut at implementing this if you want to outline some of your thoughts on how you envision doing it. I studied the source code for about an hour tonight just thinking through some options here.

brettallred avatar Apr 29 '21 03:04 brettallred

Hi @brettallred,

> I'm happy to take a cut at implementing

This would be so wonderful! :pray:

I'm not a maintainer here but I would like to see Stealth mode as an integrated extension.

My idea would be:

Specs

  • the specs could get a new directory for extensions (i.e. spec/extensions/stealth)
  • For the specs themselves, it would probably make sense to add a static page (see spec/support/views for some examples) that shows various states (it could be visually simpler than this, since we would only need to check the text output in the specs). There are nice reference pages out there with checks that could be integrated into this page:

Implementation of the extension itself

there are good references out there:

Outside of the specs, you could also check the reCAPTCHA score to see how well the scripts work.

Summary of a possible solution — TL;DR;

  1. Create an HTML file in spec/support/views containing the checks mentioned above, so that a reliable check is available within the specs; maybe also include a simple HTML table with a summary (i.e. you are [not] a bot)
  2. Write the spec so that it intentionally fails at first (the extension is not used or ready yet, so a failure shows that the spec itself works), e.g. expect(browser.body).to include("you are not a bot")
  3. Write a rake task (e.g. rake update:stealth_extension) to fetch/build the minified/compiled puppeteer-extra-plugin-stealth extension and put it in a nice extensions directory within the ferrum repository
  4. Hopefully the spec will be green now if the extension was properly loaded (remember to add Ferrum::Browser.new(extensions: %w(path/to/stealth/ext.js)) or even a shortcut like stealth_mode: true to that) :wink:
  5. optional: document how to integrate Privacy Pass
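Steps 2 to 4 could be wired together roughly like this. It is only a sketch: the extensions/stealth.min.js location and both helper names are made up for illustration, and the script itself would still have to come from the rake task in step 3. The only Ferrum feature it relies on is the extensions: option mentioned in step 4.

```ruby
# Hypothetical location of the built stealth script inside the repository
# (an assumption; the rake task from step 3 would put the file there).
STEALTH_SCRIPT = File.join("extensions", "stealth.min.js")

# Builds the option hash for Ferrum::Browser.new. Any extra keyword
# (headless:, timeout:, ...) is passed through and may override the default.
def stealth_options(script: STEALTH_SCRIPT, **overrides)
  { extensions: [script] }.merge(overrides)
end

# Launches a browser with the script preloaded. Ferrum is required lazily,
# so stealth_options can be used and tested without the gem installed.
def stealth_browser(**overrides)
  require "ferrum"
  Ferrum::Browser.new(**stealth_options(**overrides))
end
```

In a spec, the check from step 2 would then be browser = stealth_browser, a browser.go_to call pointing at the static page from step 1, and expect(browser.body).to include("you are not a bot").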

Again, this is just an idea and I'm not the maintainer here. So please take it with a grain of salt. But I think this could work in a very maintainable manner.

PS: Updating the stealth extension could even be a GitHub action later on.

alexanderadam avatar Apr 29 '21 20:04 alexanderadam

I just wanted to pass a small note that the move @alexanderadam proposed is absolutely feasible. Absurdly so. I've always been a bit intimidated wrangling the js/extension side of things so I kind of brushed that last comment off a bit, assuming additional wiring would need to happen. Tonight I stumbled back into it and noted in particular extract-stealth-evasions, and thought I'd just see where I could get with it. Woah.

First off, thank you @alexanderadam for your detailed note. I saw it this spring, but like I said... I didn't understand its proposed simplicity. Second, I wanted to report these findings just in case it inspires someone else.

ttilberg avatar Oct 01 '21 05:10 ttilberg

According to this webpage:

  • https://piprogramming.org/articles/How-to-make-Selenium-undetectable-and-stealth--7-Ways-to-hide-your-Bot-Automation-from-Detection-0000000017.html

The tests at bot.sannysoft.com and www.nowsecure.nl pass successfully with this browser configuration:

browser = Ferrum::Browser.new(browser_path: BROWSER_PATH, headless: false, browser_options: { "disable-blink-features": "AutomationControlled" })

I haven't yet found how to pass them in headless mode.
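For anyone who wants to reuse that configuration, here is a small sketch that builds the same options as a hash. The FERRUM_BROWSER_PATH environment variable and the headful_options helper are assumptions made up for this example (standing in for the BROWSER_PATH constant above); the flag itself is the one from the line above.

```ruby
# Chrome switch from the configuration above (key is the switch name
# without the leading "--").
AUTOMATION_FLAGS = { "disable-blink-features": "AutomationControlled" }.freeze

# Builds options for Ferrum::Browser.new with headless mode turned off.
# FERRUM_BROWSER_PATH is a hypothetical environment variable; when it is
# unset, :browser_path is left out entirely.
def headful_options(browser_path: ENV["FERRUM_BROWSER_PATH"], extra_flags: {})
  options = { headless: false, browser_options: AUTOMATION_FLAGS.merge(extra_flags) }
  options[:browser_path] = browser_path if browser_path
  options
end
```

It would be used as Ferrum::Browser.new(**headful_options). Like the original one-liner, this covers non-headless mode only.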

sebthemonster avatar Oct 27 '22 15:10 sebthemonster

Isn't this a problem better solved at the Chromium level?

I read this article recently, seems like there are improvements in an upcoming version of Chrome:

https://antoinevastel.com/bot%20detection/2023/02/19/new-headless-chrome.html

I'd close this issue, out of scope for Ferrum.

sandstrom avatar Feb 22 '23 17:02 sandstrom

It is, but Ferrum itself can still provide some guidance and scripts to make automation harder to detect from the start.

route avatar Feb 28 '23 05:02 route

Is there documentation on how to get the new headless mode in Ferrum?

wflanagan avatar Jul 13 '23 17:07 wflanagan

Have you found a way to pass them in headless mode?

akavitaliy avatar Aug 13 '23 17:08 akavitaliy

You can enable the new headless mode in chromium by modifying the browser options:

Ferrum::Browser.new(browser_options: { "headless": "new" })

maeve avatar Aug 14 '23 08:08 maeve

> You can enable the new headless mode in chromium by modifying the browser options:
>
> Ferrum::Browser.new(browser_options: { "headless": "new" })

It doesn't work, because there's a lot more work to be done: https://github.com/rubycdp/ferrum/pull/379

route avatar Aug 17 '23 09:08 route