Symfony panther html() issue

Open vladginosyan12 opened this issue 4 years ago • 6 comments

I am using Symfony Panther for web scraping. When Google Chrome and ChromeDriver were both at version 89, everything worked fine. But after updating both to version 92,

$crawler->filter('h1')->html();

will always return an empty string.

I think the problem is related to the ->html() method.

Could you please let me know if you have a solution for this?

vladginosyan12 avatar Jul 23 '21 11:07 vladginosyan12

Duplicate of https://github.com/symfony/panther/issues/478

LoicBoursin avatar Jul 23 '21 16:07 LoicBoursin

Encountered this issue as well

jbalatero avatar Aug 01 '21 14:08 jbalatero

No solution?

MartinsPaulo avatar Oct 07 '21 10:10 MartinsPaulo

@vladginosyan12 If this is still an issue, try:

$crawler->filter('h1')->getElement(0)->getDomProperty('innerHTML');

(for reference: https://github.com/php-webdriver/php-webdriver/discussions/921)

The html() method of Panther's DomCrawler still uses ->attr('outerHTML'), which will not work if the browser is in W3C mode, as explained in #478.
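The workaround above can be wrapped in a small helper so that call sites do not have to repeat the chain. This is only a sketch: innerHtml() is a hypothetical function, not part of Panther, the URL is a placeholder, and it assumes symfony/panther is installed together with a php-webdriver release that provides getDomProperty().

```php
<?php
// Sketch only: assumes symfony/panther and a php-webdriver version
// that provides getDomProperty() on elements.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\Panther\Client;
use Symfony\Component\Panther\DomCrawler\Crawler;

/**
 * Hypothetical helper (not part of Panther): returns the inner HTML of the
 * first node matched by the crawler, or $default when nothing matched.
 */
function innerHtml(Crawler $crawler, string $default = ''): string
{
    // getElement() returns null when the position does not exist.
    $element = $crawler->getElement(0);
    if (null === $element) {
        return $default;
    }

    // Read the DOM property instead of the attribute; this is the
    // call suggested in the comment above.
    return (string) $element->getDomProperty('innerHTML');
}

$client = Client::createChromeClient();
$crawler = $client->request('GET', 'https://example.com/'); // placeholder URL

echo innerHtml($crawler->filter('h1')), PHP_EOL;
```

Returning a default for an empty selection is a design choice made for this sketch; throwing an exception instead may suit stricter scrapers better.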

codegain avatar Mar 17 '22 10:03 codegain

Hi,

If you want to avoid the bug (or the feature), you can uninstall your current Chrome and install a version prior to 91. For example, on Debian, use the google-chrome-stable_90.0.4430.93-1_amd64.deb release. You can find it here: http://mirror.cs.uchicago.edu/google-chrome/pool/main/g/google-chrome-stable/

When you reinstall the package, don't forget to refresh the drivers with bdi:

vendor/bin/bdi detect drivers

If you want to fix the bug/feature with a recent Chrome: since version 91, ChromeDriver is W3C-compliant for "Get Element Attribute", so you must disable W3C compatibility. I didn't succeed in doing that with my config. Does anybody know how to use ChromeOptions / setExperimentalOption with a config like this?

$chromeOptions = new ChromeOptions();
$chromeOptions->setExperimentalOption('w3c', false);

$client = Client::createChromeClient(
    null,
    ['--window-size=1200,1100', '--headless', '--disable-dev-shm-usage', '--no-sandbox'],
    ['port' => 9000 + getmypid()]
);

I don't know where to pass the ChromeOptions.

Thank you. Antoine.

AntoineMUSSARD avatar Apr 21 '22 19:04 AntoineMUSSARD