
[Bug]: TimeoutError no matter how much I increase the time!

Open kingsleydev19 opened this issue 2 years ago • 5 comments

Minimal, reproducible example

```typescript
import puppeteer from "puppeteer";
import axios from "axios";
import { Request, Response } from "express";
// Project-local helpers (not shown in this issue):
// deleteAllFiles, cloudinaryBufferUpload, sanitizedConfig

export const httpCreatePDF = async (req: Request, res: Response) => {
  const url = req.body.url;
  const prevPublicURL = req.body.prevPublicURL;
  const tripId = req.body.tripId;

  // Respond immediately; the result is delivered later via webhook.
  res.sendStatus(200);

  try {
    const browser = await puppeteer.launch({
      headless: true,
      executablePath: 'google-chrome-stable',
      args: ['--no-sandbox'],
    });

    const page = await browser.newPage();
    page.setDefaultNavigationTimeout(70000);

    await page.goto(url, { waitUntil: 'networkidle0', timeout: 60000 });

    const pdf = await page.pdf({ format: 'A4' });

    await browser.close();

    // Delete the previous PDF if it exists in Cloudinary.
    if (prevPublicURL) {
      await deleteAllFiles({ filesToDelete: [prevPublicURL] });
    }

    // Upload the new PDF to Cloudinary.
    const uploadResponse = await cloudinaryBufferUpload(pdf) as any;

    const response = {
      pdfImageURL: uploadResponse?.secure_url,
      pdfPublicURL: uploadResponse?.public_id,
      tripId: tripId,
    };

    // Notify the main API via webhook.
    const webHookReq = await axios.post(sanitizedConfig.WEBHOOK_URL_PDF, response);

    if (webHookReq?.status === 200) {
      return;
    }

    return console.log(response);
  } catch (error) {
    console.log(error);
  }
};
```

Error string

TimeoutError: Navigation timeout of 70000 ms exceeded

Bug behavior

  • [ ] Flaky
  • [X] PDF

Background

I just want to use Puppeteer to generate a PDF from a dynamic page. The page was built with Next.js, and its content is fetched with Next.js's getServerSideProps. The Next.js page is hosted on Vercel. Everything is on free accounts for now, as we are still in test development. The Puppeteer server is on Back4App, while our main API is on Render.

No matter how much I increase the timeout, I still get the error; I also tried timeout: 0, which didn't work. We separated the Puppeteer server code from our main API: the main API makes a request to it and receives the result via webhook when Puppeteer completes the task or fails. The entire process works well locally on my Windows 11 machine, but when deployed via Docker I get this issue. With Puppeteer running on my local machine, I also tried passing the target URL directly from Vercel, and it was successfully converted to a PDF. I can also convert the homepage, contact page, etc. to PDF when Puppeteer is deployed via Docker, but the page we really need converted always fails with the timeout error. I really don't know how to resolve this!

Here is my Dockerfile:

```dockerfile
FROM node:slim

WORKDIR /app

# copy configs to /app folder
COPY package*.json ./
COPY tsconfig.json ./

# copy source code to /app/src folder
COPY src /app/src

RUN npm install axios bcrypt cloudinary cookie-parser cors dotenv express jsonwebtoken mongoose puppeteer
RUN npm install typescript --save-dev
RUN npm run build

FROM node:slim

RUN apt-get update && apt-get install -y gnupg

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

RUN apt-get update \
    && apt-get install -y wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
       --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY package*.json ./

RUN npm install axios bcrypt cloudinary cookie-parser cors dotenv express jsonwebtoken mongoose puppeteer

COPY --from=0 /app/build ./build

RUN npm uninstall typescript

USER node

CMD ["npm", "start"]

EXPOSE 6000
```

If I use await page.goto(url, { waitUntil: 'networkidle0', timeout: 60000 });, the network timeout error disappears and the PDF gets created, but with an error from Vercel stating that 'This Serverless Function has timed out.'

Here is the URL we are targeting. It is dynamic; this is just one of them: https://bookingapp-mocha.vercel.app/builder/live/64e3aaba042c8f73316d13f1-Experience-the-Cairo-vibes
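For what it's worth, one way to make the navigation step more forgiving is to try the strict networkidle0 wait first and fall back to a looser condition before giving up. The helper below is a hypothetical sketch (gotoWithFallback is not a Puppeteer API); in the real handler the goto callback would wrap page.goto:

```typescript
// Hypothetical helper (not part of Puppeteer): try wait conditions in order,
// falling back to a looser one when the stricter one times out.
type WaitUntil = 'networkidle0' | 'networkidle2' | 'domcontentloaded';

async function gotoWithFallback(
  goto: (waitUntil: WaitUntil) => Promise<void>,
  order: WaitUntil[] = ['networkidle0', 'domcontentloaded'],
): Promise<WaitUntil> {
  let lastError: unknown;
  for (const waitUntil of order) {
    try {
      await goto(waitUntil); // e.g. page.goto(url, { waitUntil, timeout: 30000 })
      return waitUntil;      // report which condition actually succeeded
    } catch (error) {
      lastError = error;     // remember the failure and try the next condition
    }
  }
  throw lastError;           // every condition timed out
}
```

Using shorter per-attempt timeouts here also helps stay under the serverless function's overall deadline, since the total wait is bounded by the sum of the attempts.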

Expectation

To generate a PDF from the URL.

Reality

Throws a service timeout error.

Puppeteer configuration file (if used)

No response

Puppeteer version

21.4.1

Node version

16.18.1

Package manager

npm

Package manager version

9.6.4

Operating system

Windows

kingsleydev19 avatar Oct 29 '23 19:10 kingsleydev19

The issue has been labeled as confirmed by the automatic analyser. Someone from the Puppeteer team will take a look soon!


Analyzer run

github-actions[bot] avatar Oct 29 '23 19:10 github-actions[bot]

Could you try using the official Docker image? https://github.com/puppeteer/puppeteer/pkgs/container/puppeteer What you describe sounds like the browser fails to start/run because the Docker image does not have all dependencies (or has a misconfigured sandbox). You can use the dumpio: true flag to see if the browser logs contain any errors.

OrKoN avatar Oct 30 '23 08:10 OrKoN
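The dumpio suggestion amounts to adding one flag to the launch call in the handler above. A sketch of the resulting launch options (values copied from the issue's code; dumpio pipes Chrome's own stdout/stderr into the Node process, so a browser that fails to start leaves a visible trace in the container logs):

```typescript
// Sketch: the launch options from the handler above, plus dumpio.
const launchOptions = {
  headless: true,
  executablePath: 'google-chrome-stable',
  dumpio: true,            // forward Chrome's stdout/stderr to the Node process
  args: ['--no-sandbox'],  // see the sandbox discussion below before keeping this
};
// const browser = await puppeteer.launch(launchOptions);
```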

Thanks for the feedback! I am not a pro at using Puppeteer; it is my first time. I was wondering if there is any security concern with using the official image? I read that the permissions granted by the official image are more than needed. I don't know. Like I said, I only use mine to generate PDFs and nothing more. Also, it is separated from our main API and only sends back the result via an internal webhook. The response is for our main API to take the uploaded PDF link and save it in our database.

kingsleydev19 avatar Oct 30 '23 10:10 kingsleydev19

@sparkmediatech you would need to evaluate security concerns on your own depending on your use case. Disabling the sandbox also has the security considerations + it is not how the browser regularly runs. Also see https://pptr.dev/troubleshooting#setting-up-chrome-linux-sandbox (we do not recommend running without a sandbox).

OrKoN avatar Oct 30 '23 10:10 OrKoN
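As a sketch of the linked advice: the official image is typically run with the capability its bundled Chrome sandbox needs, rather than disabling the sandbox with --no-sandbox. The image tag below is an assumption matching the reporter's Puppeteer version; this is a command fragment, not a tested deployment recipe:

```shell
# Grant the SYS_ADMIN capability so Chrome's sandbox can work inside the
# container, instead of launching the browser with --no-sandbox.
docker run -i --init --cap-add=SYS_ADMIN --rm ghcr.io/puppeteer/puppeteer:21.4.1 \
  node -e "console.log('container is up')"
```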

If I use the official image, will I still need to care about what was discussed in the link you provided? The URLs we generate PDFs from are ours. We only generate from our own URLs, and each URL is properly verified in our main API before we send it to the Puppeteer API to generate a PDF of it. I really appreciate your time responding to my query. Thank you so much.

kingsleydev19 avatar Oct 30 '23 10:10 kingsleydev19

The issue is not reproducible. As a suggestion, use the full node image, as the node:slim image might be missing some dependencies.

OrKoN avatar May 16 '24 17:05 OrKoN