Chromium not found in actor-node-playwright-chrome

Open vanekj opened this issue 3 years ago • 23 comments

Hello!

I ran into an issue while configuring my crawler Dockerfile with apify/actor-node-playwright-chrome image.

When I build and run my Dockerfile, I get this error and it stops.

ERROR PlaywrightCrawler: Request failed and reached maximum retries. browserType.launchPersistentContext:
Executable doesn't exist at /home/myuser/pw-browsers/chromium-1028/chrome-linux/chrome
╔═════════════════════════════════════════════════════════════════════════╗
║ Looks like Playwright Test or Playwright was just installed or updated. ║
║ Please run the following command to download new browsers:              ║
║                                                                         ║
║     npx playwright install                                              ║
║                                                                         ║
║ <3 Playwright Team                                                      ║
╚═════════════════════════════════════════════════════════════════════════╝

Could you please help me, what can I change to make it work?

Thank you! 🙏🏻


Dockerfile

FROM apify/actor-node-playwright-chrome:18

COPY --chown=myuser:myuser package*.json ./

RUN npm --quiet set progress=false
RUN npm ci --only=production

COPY --chown=myuser:myuser . ./

CMD ["node", "src/main.js"]

.dockerignore

**/.classpath
**/.dockerignore
**/.env
**/.git
**/.gitignore
**/.project
**/.settings
**/.toolstarget
**/.vs
**/.vscode
**/*.*proj.user
**/*.dbmdl
**/*.jfm
**/charts
**/docker-compose*
**/compose*
**/Dockerfile*
**/node_modules
**/npm-debug.log
**/obj
**/secrets.dev.yaml
**/values.dev.yaml
README.md

Docker build command

$ docker build -t crawler .

Docker run command

$ docker run --env-file .env crawler

vanekj avatar Oct 26 '22 07:10 vanekj

@vanekj did you find any solution? I'm having the same issue and opened a new request, #90.

SFaraji avatar Dec 11 '22 09:12 SFaraji

@B4nan you closed my issue as a duplicate. Can you then provide a solution to the above issue? Thanks.

SFaraji avatar Dec 12 '22 08:12 SFaraji

Yes, closed because it is an exact duplicate; there is no point in having two issues for the same thing.

B4nan avatar Dec 12 '22 08:12 B4nan

Ok, now can you give a solution to the issue?

SFaraji avatar Dec 12 '22 08:12 SFaraji

@B4nan anything, mate? Are you just going to leave this open-ended with no response?

SFaraji avatar Dec 13 '22 02:12 SFaraji

Hey, can someone please provide an update regarding this ticket? @B4nan @vanekj

SFaraji avatar Jan 10 '23 04:01 SFaraji

Hi @SFaraji, I gave up on using the apify Docker image and I am using the Playwright one

# Get the base image of Node version 16
FROM node:16

# Get the latest version of Playwright
FROM mcr.microsoft.com/playwright:focal

# Set the work directory for the application
WORKDIR /app

# COPY the needed files to the app folder in Docker image
COPY package*.json /app/

# Get the needed libraries to run Playwright
RUN apt-get update && apt-get -y install libnss3 libatk-bridge2.0-0 libdrm-dev libxkbcommon-dev libgbm-dev libasound-dev libatspi2.0-0 libxshmfence-dev

# Install the dependencies in Node environment
RUN npm ci

# Start the main script
CMD ["node", "--inspect=0.0.0.0:9229", "src/main.js"]

vanekj avatar Jan 10 '23 08:01 vanekj

Did you run it on AWS Elastic Beanstalk (Docker running on 64bit Amazon Linux 2/3.5.3) as well?

SFaraji avatar Jan 10 '23 08:01 SFaraji

I am running it on my private VPS

vanekj avatar Jan 10 '23 08:01 vanekj

Thanks @vanekj, I really appreciate it, all fixed. Just wondering, is there any other way to get the necessary libraries for Playwright? It increased my Docker image size.

SFaraji avatar Jan 10 '23 10:01 SFaraji

Unfortunately, I did not play with it further to strip down the size, as I was happy it was working for my needs.

vanekj avatar Jan 11 '23 09:01 vanekj

AFAIK Apify packages are usually installed with npm install whereas you use npm ci. It might be the cause of your issues. Have you tried the recommended Dockerfiles?

mnmkng avatar Jan 12 '23 06:01 mnmkng

I'm also experiencing the same issue, looks like it might be broken 😅

underfisk avatar Feb 21 '23 20:02 underfisk

We're running hundreds of thousands of runs and thousands of builds on those images every day, so they're not broken per se. But they might be broken in some specific configurations. Please provide a reproduction scenario or more information. We would like to help, but without more info we have no way to do so.

mnmkng avatar Feb 22 '23 10:02 mnmkng

I posted in another issue, #91, where I'm using pnpm and running a Nestjs app with crawlee. A little more context: I'm deploying to ECS.

underfisk avatar Feb 22 '23 13:02 underfisk

I also have exactly the same issue. @vanekj thanks for pointing to the Playwright Docker image, it solved the issue for me.

iBubelo avatar Mar 06 '23 19:03 iBubelo

Had the same issue.

In my case, it was caused by a mismatch between the playwright version in package.json and the one in the Docker image.

For example, if I follow the crawlee doc and specify * for playwright in package.json, but use the image apify/actor-node-playwright-chrome:16-1.31.2, the issue occurs. If I replace * with 1.31.2, it works.

In summary, this is my package.json:

{
    "dependencies": {
        "@crawlee/playwright": "^3.3.0",
        "playwright": "1.31.2"
    }
}

This is my base image: FROM apify/actor-node-playwright-chrome:16-1.31.2

kejiweixun avatar Mar 31 '23 04:03 kejiweixun

Thanks for sharing @kejiweixun. That's interesting. Are you using a lock file?

mnmkng avatar Apr 03 '23 07:04 mnmkng

@mnmkng Hi, I tried to reproduce this issue but couldn't. I have changed a lot of my code, including the Dockerfile, since I thought I had "fixed" this issue, and I don't have a copy of my code from when the issue occurred.

Now in my code, I changed "playwright": "1.31.2" to "playwright": "*", and at the same time changed FROM apify/actor-node-playwright-chrome:16-1.31.2 to FROM apify/actor-node-playwright-chrome:16, and it works with no problem.

kejiweixun avatar Apr 04 '23 00:04 kejiweixun

There are new releases of playwright (one landed just a few hours ago), so once we publish a new version of crawlee, the base docker image gets rebuilt and will contain a newer version, and I am afraid that will break it for you again.

I feel like the pinning might actually be required to resolve this. Without it, you have two places that need to be synchronized, but there is no link between them. NPM with * will technically try to resolve to the latest version, which might not be available in the docker image. If you have a lockfile, you generate it locally, so it will go for the latest available version, but that version might not be available in the docker image yet, so it fails when building. And vice versa: if you have a lockfile and rebuild an older project, you can get a docker image that is too new.

I think that only pinning both the Dockerfile and the package.json dependency to an exact version can resolve all the possible cases. On the other hand, this requires users to change two places when upgrading the playwright/puppeteer version.
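A minimal sketch of how the two places could be checked against each other, assuming the image tag follows the node-playwright form seen earlier in this thread (for example 16-1.31.2). The function names playwrightVersionFromImage and checkPinning are hypothetical; they are not part of crawlee or the Apify images:

```javascript
// Hypothetical helpers, for illustration only (not part of any Apify tooling).

// Extracts "1.31.2" from "apify/actor-node-playwright-chrome:16-1.31.2".
// Returns null for tags without a playwright version, such as ":16".
function playwrightVersionFromImage(image) {
    const match = /:\d+-(\d+\.\d+\.\d+)$/.exec(image);
    return match ? match[1] : null;
}

// Classifies a base image / package.json dependency spec pair.
function checkPinning(image, spec) {
    const imageVersion = playwrightVersionFromImage(image);
    const exact = /^\d+\.\d+\.\d+$/.test(spec);
    if (imageVersion === null && !exact) return 'unpinned';     // both float
    if (imageVersion === null || !exact) return 'half-pinned';  // only one side is fixed
    return imageVersion === spec ? 'match' : 'mismatch';
}

console.log(checkPinning('apify/actor-node-playwright-chrome:16-1.31.2', '1.31.2')); // match
console.log(checkPinning('apify/actor-node-playwright-chrome:16-1.31.2', '*'));      // half-pinned
console.log(checkPinning('apify/actor-node-playwright-chrome:16', '*'));             // unpinned
```

The 'half-pinned' case is the one reported above as failing. The 'unpinned' case can work today and still drift apart later, which is the risk described in this comment.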

B4nan avatar Apr 04 '23 07:04 B4nan

Writing for future reference. I was having issues specifically deploying to Cloud Run using cloudbuild.yaml and gcloud builds submit with expected steps to build, push and deploy a Crawlee project.

Further details of my error are in the mentioned issue. I solved this by adding RUN npx playwright install to my own Dockerfile, thereby not depending on the base image to perform this install and ensuring all default browsers are present in the resulting image.
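To confirm that the browser really is present in the resulting image before the crawler starts, a preflight check along these lines might help. This is only a sketch: assertBrowserInstalled is a hypothetical helper, and the demo call uses the Node.js binary as a stand-in path that is guaranteed to exist.

```javascript
const fs = require('node:fs');

// Hypothetical preflight helper: throws one readable error when the browser
// binary is missing, instead of letting every request fail on launch.
function assertBrowserInstalled(executablePath) {
    if (!fs.existsSync(executablePath)) {
        throw new Error(
            `Browser executable not found at ${executablePath}. ` +
            'Check that the playwright version in package.json matches the base image, ' +
            'or run "npx playwright install" during the build.',
        );
    }
    return executablePath;
}

// Demo with a path that certainly exists: the running Node.js binary.
console.log(assertBrowserInstalled(process.execPath) === process.execPath);
```

In a real project the argument would presumably come from Playwright itself, for example chromium.executablePath() after const { chromium } = require('playwright'), called at the top of src/main.js before the crawler is created.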

Good luck future Googler navigating this mess! 🫡

RowanAldean avatar Jan 30 '24 22:01 RowanAldean

Another option that worked for me is appending && npx playwright install to the postinstall script in the package.json file.

jkorach avatar May 26 '24 11:05 jkorach