
[BUG] page.on('request') is not capturing favicon.ico URI

Open alishba0133 opened this issue 3 years ago • 11 comments

Context:

Playwright Version: playwright-1.12.1
Operating System: Ubuntu 18.04
Python Version: 3.8
Browser: chromium v888113

Code Snippet

import os
import asyncio
from playwright.async_api import async_playwright



async def request(request):
    print('request %s' % request.url)


async def coroutine():
    async with async_playwright() as playwright:
        # Launch browser
        binary = playwright.chromium
        browser = await binary.launch(headless=True)
        page = await browser.new_page()
        page.on('request', request)
        await page.goto("http://nedec.co.kr/favicon.ico")
        await browser.close()


asyncio.run(coroutine())

Describe the bug

page.on('request') is not being emitted for this url. We rendered this url in puppeteer and it captures the request successfully.

alishba0133 avatar Jul 07 '21 04:07 alishba0133

We currently filter out favicon requests intentionally. What is your use case? Do you want to intercept or process it somehow?

mxschmitt avatar Jul 07 '21 11:07 mxschmitt

> We currently intentionally filter out favicon requests. What is your use-case, do you want to intercept or process it somehow?

In my project I want to save the favicon file and perform brand detection using it.

alishba0133 avatar Jul 08 '21 05:07 alishba0133

This seems like a valid use case, but it does not fit the testing world, where favicons bring flakiness.

dgozman avatar Jul 09 '21 00:07 dgozman

@dgozman, is there any workaround to capture the favicon.ico request? I was able to capture it through CDP, but that makes the logic quite complex.

Ignoring the favicon.ico request might be OK when rendering an arbitrary URL. But if someone navigates to a favicon.ico itself, for example https://www.google.com/favicon.ico, then page.goto() returns None, which is unexpected because the page renders successfully.

User code then needs custom handling for the case where the page is rendered but the response is None.

Environment:
- playwright-1.15.0
- CentOS 8.4
- Python 3.8

sohaib17 avatar Sep 23 '21 07:09 sohaib17

@sohaib17 There is no workaround right now. If this request turns out to be popular, we'll make it work.

dgozman avatar Sep 23 '21 15:09 dgozman

Looks very strange and arbitrary.

mihailik avatar Nov 16 '21 16:11 mihailik

I wanted regression tests that catch a missing modernised favicon, but it seems that even when the app omits it and falls back to favicon.ico, that is still nearly impossible to detect.

mihailik avatar Nov 16 '21 16:11 mihailik

> @sohaib17 There is no workaround right now. If this request turns out to be popular, we'll make it work.

Hello, I have the same requirement and would like to be able to trigger a favicon.ico request in headless mode. I hope you will be able to implement this feature.

Fly-Playgroud avatar Aug 12 '22 05:08 Fly-Playgroud

This one wasted a lot of time today because the favicon request shows up when debugging tests, but not when running the tests.

mrDarcyMurphy avatar Sep 30 '22 19:09 mrDarcyMurphy

I hope the favicon filter can be removed, or at least made optional. I was expecting page.on('response') to capture every response I see in the network list in the browser DevTools.

vn7n24fzkq avatar Feb 08 '23 04:02 vn7n24fzkq

Any progress on this one? We use Playwright for many security projects, and collecting the favicon is critical for us. For example, we use it to fingerprint vulnerable systems. Why does Playwright filter favicon requests, given that a standard browser makes them?

adulau avatar May 10 '23 13:05 adulau

Just trying to get more attention on this one before I start implementing a manual workaround to get the relevant favicon.

Is there any chance to see this feature implemented in the near future? Or at least a way to force fetching from the current context?

Rafiot avatar Jul 11 '23 09:07 Rafiot

Bump, just ran into this issue as well.

byt3bl33d3r avatar Aug 15 '23 23:08 byt3bl33d3r

If you're already using Chromium, this is pretty easy to do over CDP. You'll just need to use the new headless mode or a headful launch, since the old headless mode didn't fetch favicons as part of its pipeline.

A monkeypatch also seems possible here by ignoring the favicon flag, but this will probably break more often.

import { chromium, ChromiumBrowser, Page } from 'playwright';

// Initialize everything
async function initialize() {
  const browser: ChromiumBrowser = await chromium.launch({
    headless: true, // Run in headless mode
    args: ['--headless=new'] // Enable the new headless mode
  });
  const page: Page = await browser.newPage();
  const client = await page.context().newCDPSession(page);

  await client.send('Network.enable');

  // Store the favicon data here, keyed by URL
  const faviconData: Record<string, Buffer | string> = {};
  // Favicon responses whose body has not been read yet: requestId -> URL
  const pendingFavicons = new Map<string, string>();

  const isFavicon = (url: string): boolean =>
    url.endsWith('favicon.ico') || url.includes('/favicon');

  // Listen for requests for favicon
  client.on('Network.requestWillBeSent', (params) => {
    if (isFavicon(params.request.url)) {
      console.log(`Favicon request detected: ${params.request.url}`);
    }
  });

  // Listen for favicon responses and remember their requestId
  client.on('Network.responseReceived', (params) => {
    if (isFavicon(params.response.url)) {
      console.log(`Favicon response received: ${params.response.url}`);
      pendingFavicons.set(params.requestId, params.response.url);
    }
  });

  // Read the body only after loading has finished; asking for it as soon as
  // the headers arrive can fail because the body may not be available yet
  client.on('Network.loadingFinished', async (params) => {
    const url = pendingFavicons.get(params.requestId);
    if (url === undefined) return;
    pendingFavicons.delete(params.requestId);
    try {
      // Fetch response body via CDP using the matching requestId
      const { body, base64Encoded } = await client.send('Network.getResponseBody', {
        requestId: params.requestId
      });
      // Store or process the favicon data
      faviconData[url] = base64Encoded ? Buffer.from(body, 'base64') : body;
    } catch (error) {
      console.error(`Could not read favicon body for ${url}: ${error}`);
    }
  });

  // Navigate to the page
  await page.goto('https://google.com');

  // Wait 5 seconds so late favicon requests have time to complete
  await new Promise((resolve) => setTimeout(resolve, 5000));

  // Print the favicon data
  console.log(faviconData);

  // Close the browser
  await browser.close();
}

// Run the initialization
initialize().catch((error) => {
  console.error(`An error occurred: ${error}`);
});

piercefreeman avatar Aug 25 '23 16:08 piercefreeman
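
Since the original report uses the Python API, the same CDP approach can be sketched in Python. This is an untested adaptation of the TypeScript snippet above, not an official Playwright recipe. The names `is_favicon_url`, `decode_cdp_body` and `collect_favicons` are made up for illustration, and the `--headless=new` argument is carried over from that snippet on the assumption that the installed Chromium build honours it. The body is read on `Network.loadingFinished` rather than on `Network.responseReceived`, so that the resource has finished loading before it is requested.

```python
import asyncio
import base64


def is_favicon_url(url: str) -> bool:
    # Same loose URL match as the TypeScript snippet above.
    return url.endswith("favicon.ico") or "/favicon" in url


def decode_cdp_body(result: dict) -> bytes:
    # Network.getResponseBody returns {"body": str, "base64Encoded": bool}.
    if result["base64Encoded"]:
        return base64.b64decode(result["body"])
    return result["body"].encode("utf-8")


async def collect_favicons(url: str, wait_seconds: float = 5.0) -> dict:
    # Imported here so the helpers above work without Playwright installed.
    from playwright.async_api import async_playwright

    favicons = {}  # favicon URL -> raw bytes
    pending = {}   # CDP requestId -> favicon URL

    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(
            headless=True,
            args=["--headless=new"],  # assumption: taken from the snippet above
        )
        page = await browser.new_page()
        client = await page.context.new_cdp_session(page)
        await client.send("Network.enable")

        async def on_response(params):
            # Remember favicon responses; the body is read later.
            if is_favicon_url(params["response"]["url"]):
                pending[params["requestId"]] = params["response"]["url"]

        async def on_finished(params):
            favicon_url = pending.pop(params["requestId"], None)
            if favicon_url is None:
                return
            try:
                result = await client.send(
                    "Network.getResponseBody",
                    {"requestId": params["requestId"]},
                )
                favicons[favicon_url] = decode_cdp_body(result)
            except Exception as error:
                print("could not read favicon body for %s: %s" % (favicon_url, error))

        client.on("Network.responseReceived", on_response)
        client.on("Network.loadingFinished", on_finished)

        await page.goto(url)
        # Give late favicon requests time to complete, as in the snippet above.
        await asyncio.sleep(wait_seconds)
        await browser.close()

    return favicons
```

Calling `asyncio.run(collect_favicons("https://google.com"))` should return a dict mapping each captured favicon URL to its bytes, which could then be written to disk for the brand-detection use case described earlier in the thread.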

@piercefreeman You rock, thanks a lot!!!

mihailik avatar Aug 28 '23 08:08 mihailik
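
For the "manual workaround" and "force fetching from the current context" ideas raised earlier in the thread, one possible shape is to skip request events entirely: read the icon links from the rendered HTML, fall back to /favicon.ico, and download each candidate through the browser context. The sketch below is an assumption-laden illustration, not something the maintainers proposed. `favicon_candidates` and `fetch_favicons` are hypothetical names, `page` is assumed to be an already navigated Playwright async `Page`, and `page.context.request` is assumed to be available in the installed Playwright release (it is likely missing from the older versions mentioned in the thread).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _IconLinkParser(HTMLParser):
    """Collects href values of <link> tags whose rel contains the token "icon"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        # Matches rel="icon" and rel="shortcut icon", not rel="apple-touch-icon".
        if href and "icon" in rel_tokens:
            self.hrefs.append(href)


def favicon_candidates(page_url: str, html: str) -> list:
    # Declared icons first, then the conventional /favicon.ico fallback.
    parser = _IconLinkParser()
    parser.feed(html)
    candidates = [urljoin(page_url, href) for href in parser.hrefs]
    fallback = urljoin(page_url, "/favicon.ico")
    if fallback not in candidates:
        candidates.append(fallback)
    return candidates


async def fetch_favicons(page) -> dict:
    # Downloads each candidate with the context's cookies and proxy settings.
    icons = {}
    for url in favicon_candidates(page.url, await page.content()):
        response = await page.context.request.get(url)
        if response.ok:
            icons[url] = await response.body()
    return icons
```

Unlike the CDP approach, this does not observe what the browser itself requested, so it cannot confirm that a favicon request was actually made. It only retrieves the files a browser would be expected to ask for.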