Goutte not using httpclient headers

Open restucciaquito opened this issue 5 years ago • 11 comments

Hi!

I'm trying to change the User-Agent as described in the HttpClient Component documentation, but the crawler always uses 'Symfony BrowserKit'. Am I doing something wrong, or is this a bug?

$client = new Client(HttpClient::create(['timeout' => 5000, 'headers' => ['User-Agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36']]));

Thanks for your help.

restucciaquito avatar Dec 19 '19 19:12 restucciaquito

You can set the header in this manner: $client->setHeader('User-Agent', $userAgent);

I'm trying to set the CURL options, hoping someone can help me. All the links for doing this are old.

jmichaelterenin avatar Jan 14 '20 20:01 jmichaelterenin

Thanks for your answer. When I set the header the way you suggested, I get this error:

Call to undefined method Goutte\Client::setHeader()

restucciaquito avatar Jan 17 '20 16:01 restucciaquito

Because BrowserKit overrides the User-Agent header when it executes the request, you have to use setServerParameter() to put your own User-Agent back.

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

$client = new Client(HttpClient::create(array(
    'headers' => array(
        'user-agent' => 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0', // overridden with 'Symfony BrowserKit' when the request is executed
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language' => 'en-US,en;q=0.5',
        'Referer' => 'http://yourtarget.url/',
        'Upgrade-Insecure-Requests' => '1',
        'Save-Data' => 'on',
        'Pragma' => 'no-cache',
        'Cache-Control' => 'no-cache',
    ),
)));
$client->setServerParameter('HTTP_USER_AGENT', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:73.0) Gecko/20100101 Firefox/73.0');
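A rough way to picture why the header passed to HttpClient::create() gets lost while the server parameter sticks. This is only a simplified model, not BrowserKit's actual code: the three sketch* helper functions are hypothetical names made up for illustration, and the HTTP_* to header-name mapping reflects my understanding of how BrowserKit builds request headers from server parameters. The 'Symfony BrowserKit' default is the value reported in this thread.

```php
<?php
// Simplified model only: these helper functions are hypothetical and are not
// BrowserKit's real code. They mimic the behaviour described in this thread.

/** Merge caller-supplied server parameters over the assumed BrowserKit default. */
function sketchServerParameters(array $server): array
{
    return array_merge(['HTTP_USER_AGENT' => 'Symfony BrowserKit'], $server);
}

/** Turn HTTP_* server parameters into header names, e.g. HTTP_USER_AGENT => user-agent. */
function sketchHeadersFromServer(array $server): array
{
    $headers = [];
    foreach ($server as $key => $value) {
        $name = strtolower(str_replace('_', '-', $key));
        if (strncmp($name, 'http-', 5) === 0) {
            $headers[substr($name, 5)] = $value;
        }
    }
    return $headers;
}

/** Per-request headers win over the HttpClient defaults that share a name. */
function sketchFinalHeaders(array $clientDefaults, array $server): array
{
    $defaults = array_change_key_case($clientDefaults, CASE_LOWER);
    return array_merge($defaults, sketchHeadersFromServer($server));
}

$clientDefaults = ['User-Agent' => 'Mozilla/5.0 (example)', 'Accept-Language' => 'en-US'];

// Without setServerParameter(): the BrowserKit default replaces the client default.
$without = sketchFinalHeaders($clientDefaults, sketchServerParameters([]));

// With HTTP_USER_AGENT set as a server parameter: the custom value is sent.
$with = sketchFinalHeaders(
    $clientDefaults,
    sketchServerParameters(['HTTP_USER_AGENT' => 'Mozilla/5.0 (example)'])
);

echo $without['user-agent'], PHP_EOL;
echo $with['user-agent'], PHP_EOL;
```

In this model the other client defaults (Accept-Language here) pass through untouched; only the User-Agent needs the server-parameter treatment.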

kiang avatar Feb 27 '20 03:02 kiang

Thank you for that important/critical piece of information, much appreciated!

jmichaelterenin avatar Feb 27 '20 03:02 jmichaelterenin

@kiang, that's the correct answer, just tested, thanks for your great help!

restucciaquito avatar Mar 15 '20 22:03 restucciaquito

Is this new? I have multiple scripts in which

use Goutte\Client;
$client = new Client();
$client->setHeader('user-agent','whatever');

has worked flawlessly. Now, after starting a new project with the most recent version of Goutte, it suddenly fails with PHP Fatal error: Uncaught Error: Call to undefined method Goutte\Client::setHeader()

BorisAnthony avatar May 07 '20 10:05 BorisAnthony

Yup. There it is: https://github.com/FriendsOfPHP/Goutte/commit/05f6994ec1d0d8368157de7fe45063e751857086

BorisAnthony avatar May 07 '20 10:05 BorisAnthony

Yes, this answer is correct (https://github.com/FriendsOfPHP/Goutte/issues/401#issuecomment-591760247). I tested it on an updated version of Goutte.

restucciaquito avatar May 07 '20 22:05 restucciaquito

I didn't read the comments thoroughly enough the first time through, but the comment above about setServerParameter() is spot on. I only found it after digging through the code with trial and error.

This: $this->client->setServerParameter('HTTP_USER_AGENT', $userAgent);

Sets the user agent properly and solved my issue with a site.

alexhackney avatar Jan 13 '21 20:01 alexhackney

This works in the current implementation:

$client = new Client([
    'HTTP_USER_AGENT' => 'Mozilla/5.0 (X11; Linux i686; rv:78.0) Gecko/20100101 Firefox/78.0',
    'HTTP_ACCEPT' => '*/*',
]);

Have a look at the constructor of Symfony\Component\BrowserKit\AbstractBrowser, which calls the method setServerParameters(). The Goutte Client class is indirectly derived from Symfony\Component\BrowserKit\AbstractBrowser.

WeeSee avatar Jan 30 '21 09:01 WeeSee

Can you explain how to do this? When I copy the code, it just shows errors saying the argument must be of type HttpClientInterface.

adgower avatar Aug 19 '21 16:08 adgower