fingerprint-suite

fingerprint injection makes browsers detectable as headless

Open ishfx opened this issue 1 year ago • 18 comments

Describe the bug

Simply injecting fingerprint-suite (using newInjectedContext or newInjectedPage), with or without options, in a headless or headful browser, causes the browser to be detected as headless at https://arh.antoinevastel.com/bots/areyouheadless.

  • headless = detected
  • headless + stealth = undetected
  • headless + fingerprint-suite = detected !!
  • headless + stealth + fingerprint-suite = detected !!
  • headful = undetected
  • headful + stealth = undetected
  • headful + fingerprint-suite = detected !!
  • headful + stealth + fingerprint-suite = detected !!

Whenever fingerprint-suite is used, even without any options, the browser becomes detectable.

To Reproduce

const { chromium: playwright } = require('playwright-extra')
const { newInjectedContext } = require('fingerprint-injector');

playwright.launch({ headless: false }).then(test)

async function test(browser) {
  const context = await newInjectedContext(browser, {}); // DETECTED
  // const context = await browser.newContext(); // UNDETECTED

  const page = await context.newPage();
  await page.goto('https://arh.antoinevastel.com/bots/areyouheadless');
  await page.screenshot({ path: 'detected.png', fullPage: true });
  await browser.close()
}

Expected behavior

Injecting fingerprint-suite shouldn't cause the browser to be detected as headless.

System information:

  • OS: Arch Linux x86_64 - 6.3.2-arch1-1
  • Node.js version: v16.20.0

ishfx avatar May 17 '23 14:05 ishfx

It seems that only fingerprints generated for the "chrome" browser ( browsers: ["chrome"] ) trigger the detection.

// headless mode
Device: desktop | Os: windows | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: windows | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: windows | Browser: edge | Status: You are not Chrome headless

Device: desktop | Os: macos | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: macos | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: macos | Browser: edge | Status: You are not Chrome headless
Device: desktop | Os: macos | Browser: safari | Status: You are not Chrome headless

Device: desktop | Os: linux | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: linux | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: linux | Browser: edge | Status: You are not Chrome headless

Device: mobile | Os: android | Browser: chrome | Status: You are not Chrome headless
Device: mobile | Os: android | Browser: firefox | Status: You are not Chrome headless
Device: mobile | Os: android | Browser: edge | Status: You are not Chrome headless

Device: mobile | Os: ios | Browser: edge | Status: You are not Chrome headless
Device: mobile | Os: ios | Browser: safari | Status: You are not Chrome headless

// headful mode
Device: desktop | Os: windows | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: windows | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: windows | Browser: edge | Status: You are not Chrome headless

Device: desktop | Os: macos | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: macos | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: macos | Browser: edge | Status: You are not Chrome headless
Device: desktop | Os: macos | Browser: safari | Status: You are not Chrome headless

Device: desktop | Os: linux | Browser: chrome | Status: You are Chrome headless
Device: desktop | Os: linux | Browser: firefox | Status: You are not Chrome headless
Device: desktop | Os: linux | Browser: edge | Status: You are not Chrome headless

Device: mobile | Os: android | Browser: chrome | Status: You are Chrome headless
Device: mobile | Os: android | Browser: firefox | Status: You are not Chrome headless
Device: mobile | Os: android | Browser: edge | Status: You are not Chrome headless

Device: mobile | Os: ios | Browser: edge | Status: You are not Chrome headless
Device: mobile | Os: ios | Browser: safari | Status: You are not Chrome headless

const { chromium } = require("playwright");
const { FingerprintInjector } = require("fingerprint-injector");
const { FingerprintGenerator } = require("fingerprint-generator");


async function runBrowser(device, os, browserType) {
    const browser = await chromium.launch({
        headless: false
    });

    // chrome, firefox, edge, safari

    const { fingerprint, headers } = await new FingerprintGenerator().getFingerprint({
        devices: [device],
        operatingSystems: [os],
        browsers: [browserType]
    });

    const context = await browser.newContext({
        userAgent: fingerprint.navigator.userAgent,
        colorScheme: 'dark',
        viewport: {
            width: fingerprint.screen.width,
            height: fingerprint.screen.height,
        },
        extraHTTPHeaders: {
            'accept-language': headers['accept-language'],
        },
    });

    await new FingerprintInjector().attachFingerprintToPlaywright(context, { fingerprint, headers });

    const page = await context.newPage();

    await page.goto("https://arh.antoinevastel.com/bots/areyouheadless", { waitUntil: "load"});

    const value = await page.evaluate(() => document.querySelector('#res p').textContent);

    console.log(`Device: ${device} | Os: ${os} | Browser: ${browserType} | Status: ${value}`);

    await browser.close();

}

steinerx avatar May 19 '23 02:05 steinerx
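
The script above defines runBrowser but never calls it. A minimal driver sketch, assuming the combinations listed in the log output are the ones that were tested; `MATRIX`, `buildCombos`, and `runAll` are illustrative names and not part of fingerprint-suite:

```javascript
// Device/OS/browser combinations, copied from the log output above.
const MATRIX = {
    desktop: {
        windows: ['chrome', 'firefox', 'edge'],
        macos: ['chrome', 'firefox', 'edge', 'safari'],
        linux: ['chrome', 'firefox', 'edge'],
    },
    mobile: {
        android: ['chrome', 'firefox', 'edge'],
        ios: ['edge', 'safari'],
    },
};

// Flatten the nested matrix into [device, os, browser] triples.
function buildCombos(matrix) {
    const combos = [];
    for (const [device, systems] of Object.entries(matrix)) {
        for (const [os, browsers] of Object.entries(systems)) {
            for (const browser of browsers) {
                combos.push([device, os, browser]);
            }
        }
    }
    return combos;
}

// Run the checks one at a time so only one browser is open at once.
// `run` is expected to have the same signature as runBrowser above.
async function runAll(run, matrix = MATRIX) {
    for (const [device, os, browser] of buildCombos(matrix)) {
        await run(device, os, browser);
    }
}
```

With the script above loaded, calling `runAll(runBrowser)` would print one status line per combination.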

@steinpigs thanks for this test!

ishfx avatar May 19 '23 14:05 ishfx

Any takers?

ishfx avatar May 23 '23 08:05 ishfx

Sorry for the delay, I actually started looking into this last week - but unfortunately didn't come to any conclusion. The 'Are you headless' website seems to utilize some kind of ML-like regression, where it collects the browser fingerprint and then sends it to the server, which decides whether the fingerprint is valid or not.

Since this is not an actual bot-protection service, the priority on this is a bit lower, but I'll definitely continue looking into this. Thanks for your patience! :)

barjin avatar May 23 '23 12:05 barjin

No problem, I understand. Thank you for the update. However, I wanted to mention that the injected evasions in the fingerprint-injector compromise the anonymity of the fingerprint, rendering it unusable. This is an important factor to consider.

ishfx avatar May 31 '23 15:05 ishfx

I'm not sure if this works here, but you should find a way to use the 'new' headless mode: playwright.launch({ headless: 'new' }).then(test)

mrfussion avatar Jul 07 '23 17:07 mrfussion

> I'm not sure if this works here, but you should find a way to use the 'new' headless mode: playwright.launch({ headless: 'new' }).then(test)

Thanks for the input, but that's not the issue here. Also, the headless 'new' option exists only in Puppeteer, not in Playwright: https://github.com/microsoft/playwright/blob/9bca9f1b4ff3478c20a526c8f8b41ab8ab9be6e6/packages/playwright-core/src/client/types.ts#L105

ishfx avatar Jul 07 '23 17:07 ishfx

I debugged this. The browser is flagged because the webDriver property in the fingerprint is set to true.

The https://arh.antoinevastel.com/bots/areyouheadless page has a check for this:

https://github.com/antoinevastel/fpscanner/blob/master/src/fpScanner.js#L119

@barjin

abhisheksurve45 avatar Jul 28 '23 19:07 abhisheksurve45
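
The comment above attributes the detection to the webDriver flag. A minimal sketch of that kind of check, assuming the detector simply reads a boolean `webdriver` property from a navigator-like object; `looksAutomated` is a hypothetical name and this is not the actual fpscanner code:

```javascript
// Hypothetical detector-side check: flags a client whose navigator-like
// object reports webdriver === true. Not taken from fpscanner.
function looksAutomated(nav) {
    return Boolean(nav && nav.webdriver === true);
}

// In a browser this would be called as looksAutomated(window.navigator).
console.log(looksAutomated({ webdriver: true }));   // true
console.log(looksAutomated({ webdriver: false }));  // false
console.log(looksAutomated({}));                    // false
```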

@abhisheksurve45 So why doesn't fingerprint-suite set webDriver to false?

qkxie avatar Aug 18 '23 07:08 qkxie

fingerprint-injector/fingerprint-injector.js
await page.setExtraHTTPHeaders(this.onlyInjectableHeaders(headers, browserVersion));

Without the extra headers, the detection passes.

tenkuken avatar Sep 26 '23 08:09 tenkuken

> fingerprint-injector/fingerprint-injector.js
> await page.setExtraHTTPHeaders(this.onlyInjectableHeaders(headers, browserVersion));
>
> Without the extra headers, the detection passes.

Removing accept-language will fix this issue:

fingerprint-injector/fingerprint-injector.js
await page.setExtraHTTPHeaders(this.onlyInjectableHeaders(headers, browserVersion));
++ delete extraHeaders['accept-language'];

tenkuken avatar Sep 26 '23 10:09 tenkuken

I can confirm that @tenkuken's patch works! I added accept-language to the requestHeaders array (the array of headers to be filtered out), and my tests against areyouheadless pass with both headless: old and headless: new on Puppeteer.

https://github.com/apify/fingerprint-suite/blob/098d592aec3eb855a59b8ec06c123d690808e277/packages/fingerprint-injector/src/fingerprint-injector.ts#L36-L53

FdelMazo avatar Sep 28 '23 18:09 FdelMazo

@barjin Are you planning a fix or customizable header settings that don't involve forking this?

The page.on("request") approach is one way of editing the header as well. But I think in the options we should incorporate something like:

//conf:
ignoreHeaders: ['accept-language']

That way:

[...requestHeaders, ...ignoreHeaders].forEach((header) => {
    delete filteredHeaders[header];
});

iwaduarte avatar Oct 12 '23 19:10 iwaduarte
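
A self-contained sketch of the proposal in the comment above, assuming the injector keeps a list of headers it never injects and that a user-supplied `ignoreHeaders` option is merged into it. The `ignoreHeaders` option does not exist in fingerprint-suite; `filterHeaders` and the contents of `requestHeaders` here are illustrative only:

```javascript
// Illustrative stand-in for the injector's built-in list of filtered headers.
const requestHeaders = ['accept-encoding', 'accept'];

// Return a copy of `headers` without the built-in and user-ignored entries.
// Header names are compared case-insensitively.
function filterHeaders(headers, ignoreHeaders = []) {
    const blocked = new Set(
        [...requestHeaders, ...ignoreHeaders].map((name) => name.toLowerCase()),
    );
    return Object.fromEntries(
        Object.entries(headers).filter(([name]) => !blocked.has(name.toLowerCase())),
    );
}

const filtered = filterHeaders(
    { 'Accept-Language': 'en-US,en;q=0.9', 'User-Agent': 'ExampleAgent/1.0', accept: '*/*' },
    ['accept-language'],
);
console.log(filtered); // { 'User-Agent': 'ExampleAgent/1.0' }
```

Returning a filtered copy instead of deleting keys in place leaves the generated fingerprint headers untouched for any other consumer.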

@barjin Isn't the author of that page (https://arh.antoinevastel.com/bots/areyouheadless) working for Datadome, a bot-detection company? That would arguably make this a priority.

iwaduarte avatar Oct 12 '23 19:10 iwaduarte

@iwaduarte Make it your priority then :) PRs are welcome.

Regarding the fix - I would rather not add another option in the already pretty granular options object. The other problem is that by removing the accept-language header, the fingerprint becomes less consistent (as puppeteer/playwright will probably substitute the header with a default, while still leaving the navigator.languages API injected).

A proper solution would include hiding the headlessness without compromising the other parts of the injection - which is the first (and only) rule when introducing new features into this library.

As I said, PRs (preferably with proper research/tests) are welcome. Thanks!

barjin avatar Oct 13 '23 10:10 barjin
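
The consistency concern in the comment above can be made concrete. A minimal sketch, assuming consistency means that the language tags in the Accept-Language header match navigator.languages in the same order; `parseAcceptLanguage` and `isLanguageConsistent` are hypothetical helpers, not part of fingerprint-suite:

```javascript
// Extract language tags from an Accept-Language header, highest q-value first.
function parseAcceptLanguage(header) {
    return header
        .split(',')
        .map((part) => {
            const [tag, ...params] = part.trim().split(';');
            const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
            return { tag: tag.trim(), q: q ? parseFloat(q.slice(2)) : 1 };
        })
        .filter(({ tag }) => tag.length > 0)
        .sort((a, b) => b.q - a.q) // sort is stable, so ties keep header order
        .map(({ tag }) => tag);
}

// True when the header and the injected navigator.languages list agree.
function isLanguageConsistent(acceptLanguage, navigatorLanguages) {
    const fromHeader = parseAcceptLanguage(acceptLanguage);
    return (
        fromHeader.length === navigatorLanguages.length &&
        fromHeader.every((tag, i) => tag.toLowerCase() === navigatorLanguages[i].toLowerCase())
    );
}

// A German fingerprint with a matching header, then with a default English header.
console.log(isLanguageConsistent('de-DE,de;q=0.9', ['de-DE', 'de'])); // true
console.log(isLanguageConsistent('en-US,en;q=0.9', ['de-DE', 'de'])); // false
```

The second call models the situation described above: the header is dropped and replaced by a browser default while navigator.languages still reports the injected values.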

@barjin I think you have to be more specific here. I could indeed drop a PR for the repo but "would include hiding the headlessness without compromising the other parts of the injection" does not give much to work with. How would you define compromise?

Also, if you could give a tip on what code you would advise, that would be even better :)

iwaduarte avatar Oct 13 '23 17:10 iwaduarte