puppeteer
[Bug]: net::ERR_HTTP2_PROTOCOL_ERROR when accessing a http/3 page
Minimal, reproducible example
import puppeteer from "puppeteer";

const browser = await puppeteer.launch({
  headless: "new", // no error in headful mode
});
const url = "https://www.adobe.com/it/creativecloud/photography/basics/light.html";
const page = await browser.newPage();
await page.goto(url, { waitUntil: "domcontentloaded", timeout: 40000 });
// Read the text of the first <h1> on the page.
const data = await page.$eval("h1", (element) => element.innerText);
console.log(data);
console.log("done");
await browser.close();
Error string
net::ERR_HTTP2_PROTOCOL_ERROR
Bug behavior
- [X] Flaky
Background
I've tested the code on macOS and AWS Lambda and get the same error on both. It works fine with headless: false. The error occurs on multiple Node versions: 20.10.0, 18.14.1 and 16.15.0.
Expectation
I expect the code to visit the page without crashing. The culprit may be that Adobe is using HTTP/3. I was not able to confirm this, because other HTTP/3 websites do work (e.g. https://blog.cloudflare.com/http3-the-past-present-and-future).
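One way to check the HTTP/3 hypothesis is to record which protocol each response was served over. The following is only a sketch: `recordProtocols` is a hypothetical helper name, and it assumes that `page.createCDPSession()` is available in the installed Puppeteer version and that the DevTools `Network.responseReceived` event carries a `response.protocol` field (values such as `h2` or `h3`).

```javascript
// Hypothetical helper: collects the protocol reported for each response.
// Call it before page.goto() and inspect the returned array afterwards.
async function recordProtocols(page) {
  const seen = [];
  // Assumption: the page object exposes createCDPSession().
  const client = await page.createCDPSession();
  // Register the listener first so no early responses are missed.
  client.on('Network.responseReceived', (event) => {
    seen.push({ url: event.response.url, protocol: event.response.protocol });
  });
  await client.send('Network.enable');
  return seen;
}
```

If the navigation fails before any response arrives, the array may stay empty, so an empty result does not rule out a protocol problem.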
Reality
I get an error:
file:///pathtoproject/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Frame.js:167
    ? new Error(`${response.errorText} at ${url}`)
      ^
Error: net::ERR_HTTP2_PROTOCOL_ERROR at https://www.adobe.com/it/creativecloud/photography/basics/light.html
    at navigate (file:///pathtoproject/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Frame.js:167:27)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Deferred.race (file:///pathtoproject/node_modules/puppeteer-core/lib/esm/puppeteer/util/Deferred.js:80:20)
    at async CdpFrame.goto (file:///pathtoproject/node_modules/puppeteer-core/lib/esm/puppeteer/cdp/Frame.js:133:25)
    at async CdpPage.goto (file:///pathtoproject/node_modules/puppeteer-core/lib/esm/puppeteer/api/Page.js:565:20)
    at async file:///pathtoproject/index.js:9:1
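Because the report is marked as flaky, a script can retry the navigation when this specific error string appears. This is a minimal sketch, not a fix for the underlying cause: `gotoWithRetry`, `isRetryableNavigationError`, and the `attempts` and `delayMs` options are hypothetical names introduced here, and retrying only helps if the failure is truly intermittent rather than a deliberate block by the server.

```javascript
// Error strings treated as worth retrying (taken from this report).
const RETRYABLE_ERRORS = ['net::ERR_HTTP2_PROTOCOL_ERROR'];

// Returns true when the error message contains one of the retryable strings.
function isRetryableNavigationError(error) {
  const message = error instanceof Error ? error.message : String(error);
  return RETRYABLE_ERRORS.some((code) => message.includes(code));
}

// Hypothetical wrapper around page.goto(): retries up to `attempts` times,
// waiting `delayMs` between tries; other options are passed to page.goto().
async function gotoWithRetry(page, url, { attempts = 3, delayMs = 1000, ...gotoOptions } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await page.goto(url, gotoOptions);
    } catch (error) {
      lastError = error;
      // Give up on unrelated errors or once the attempts are used up.
      if (!isRetryableNavigationError(error) || attempt === attempts) break;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

In the example above, `await page.goto(url, {...})` would become `await gotoWithRetry(page, url, { waitUntil: "domcontentloaded", timeout: 40000 })`.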
Puppeteer configuration file (if used)
No response
Puppeteer version
21.7.0
Node version
20.10.0
Package manager
npm
Package manager version
8.5.5
Operating system
macOS
This issue was not reproducible. Please check that your example runs locally and the following:
- Ensure the script does not rely on dependencies outside of puppeteer and puppeteer-core.
- Ensure the error string is just the error message.
  - Bad: Error: something went wrong at Object.<anonymous> (/Users/username/repository/script.js:2:1) at Module._compile (node:internal/modules/cjs/loader:1159:14) at Module._extensions..js (node:internal/modules/cjs/loader:1213:10) at Module.load (node:internal/modules/cjs/loader:1037:32) at Module._load (node:internal/modules/cjs/loader:878:12) at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) at node:internal/main/run_main_module:23:47
  - Good: Error: something went wrong.
- Ensure your configuration file (if applicable) is valid.
- If the issue is flaky (does not reproduce all the time), make sure 'Flaky' is checked.
- If the issue is not expected to error, make sure to write 'no error'.
Once the above checks are satisfied, please edit your issue with the changes and we will try to reproduce the bug again.
@Lightning00Blade could you please try to reproduce?
The issue is reproducible, but looks like Adobe has a way to prevent automation, so closing as won't fix.
For anyone coming to this issue, what fixed this for me was providing a user agent to puppeteer. The default user agent resulted in the error, while providing one was enough for the web server to respond properly, e.g.:
import puppeteer from 'puppeteer';

const ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36';
const url = "https://www.adobe.com/it/creativecloud/photography/basics/light.html";
const browser = await puppeteer.launch({ headless: 'new' });
const page = await browser.newPage();
// setUserAgent returns a promise, so wait for it before navigating.
await page.setUserAgent(ua);
const res = await page.goto(url);
const text = await res.text();
console.log(text);
await browser.close();
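Instead of hard-coding an old Chrome version string, the user agent can be derived from the one the browser already reports. The sketch below assumes that the default headless user agent contains a `HeadlessChrome` token, which the thread does not confirm; `stripHeadlessToken` and `useNonHeadlessUserAgent` are hypothetical helper names.

```javascript
// Replaces the "HeadlessChrome" token with "Chrome"; other strings are unchanged.
function stripHeadlessToken(userAgent) {
  return userAgent.replace('HeadlessChrome', 'Chrome');
}

// Reads the browser's own user agent, removes the headless token,
// and applies the result to the page before any navigation.
async function useNonHeadlessUserAgent(browser, page) {
  const ua = stripHeadlessToken(await browser.userAgent());
  await page.setUserAgent(ua);
  return ua;
}
```

With the snippet above, `await page.setUserAgent(ua)` would be replaced by `await useNonHeadlessUserAgent(browser, page)`. As later comments note, a user agent alone is not enough for every site.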
The issue is reproducible, but looks like Adobe has a way to prevent automation, so closing as won't fix.
It also affects other pages. Unfortunately, specifying a user agent is no longer sufficient. Are there any ideas on how to get around this?
I am also getting net::ERR_HTTP2_PROTOCOL_ERROR using Chrome when trying to POST to a login endpoint. Interestingly, I got the following CORS error using product: 'firefox' at the same point:
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at
Using the user agent fixed it for me.
Interesting, no combination of User Agent or any other "tricks" would work for me. It really must depend on the site in question
Then I discovered puppeteer-extra.
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';
import AdblockerPlugin from 'puppeteer-extra-plugin-adblocker';
puppeteer.use(StealthPlugin());
puppeteer.use(AdblockerPlugin({ blockTrackers: true }))
After this, everything worked.
Thanks for this suggestion. Mine stopped working after a while 😆
UPDATE:
Sometimes the proxy you're using might cause this error. I changed mine to an unknown country and it stopped happening.
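To test whether the proxy is a factor, the browser can be launched with and without one and the results compared. This is a sketch under stated assumptions: `buildLaunchOptions` is a hypothetical helper, the proxy address is a placeholder, and it relies on Chrome's `--proxy-server` command-line switch being passed through the `args` launch option.

```javascript
// Builds options for puppeteer.launch(); the proxy argument is optional.
// Example proxy value (placeholder only): 'http://proxy.example:8080'
function buildLaunchOptions(proxyServer) {
  const args = [];
  if (proxyServer) {
    // Chrome switch that routes browser traffic through the given proxy.
    args.push(`--proxy-server=${proxyServer}`);
  }
  return { headless: 'new', args };
}
```

For example, `await puppeteer.launch(buildLaunchOptions())` runs without a proxy, while passing a proxy address runs the same script through it.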
For anyone coming to this issue
page.setUserAgent(ua);
Thanks, the issue has been fixed.
Any update on this matter?
I'm facing the same problem. Did you find a solution on your end?
@kazanemed Honestly, no. In my case the website itself seems to be the problem. I'm assuming the site knows I'm accessing it through AWS and is not letting me get the data, but I can't say for sure. Try some other websites and see whether the problem is site-specific.
I've also tried the Stealth plugin and some other tips for avoiding detection, but they didn't work either. My assumption is that the site is refusing the connection.
setting a user agent resolved the issue for me
setting a user agent also did not work for me. Trying to access sacbee.com