Building with kaniko results in missing packages in node_modules
I don't know if anyone else is experiencing this issue, but when we build with Kaniko everything goes fine (no errors shown). When we deploy the image to production, we find that packages are missing from node_modules. This is odd, because building locally works fine and the packages are listed in package.json.
Reading the Kaniko logs, it seems that it restores the folder from cache, and we suspect this may be the cause of the problem.
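A quick way to test that theory would be to disable Kaniko's layer cache for a single run and check whether node_modules comes out complete. A minimal sketch, assuming the executor honors an INPUT_NO_CACHE override the same way it honors the INPUT_PUSH and INPUT_ENGINE variables below (that variable name is an assumption, not confirmed against the plugin docs):

  variables:
    INPUT_ENGINE: 'kaniko'
    INPUT_NO_CACHE: 'true' # assumption: forwarded to Kaniko as --cache=false

If the image is correct with caching off, a stale cache layer is confirmed as the culprit.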
This is the build stage in our .gitlab-ci.yaml:
build-with-kaniko:
  stage: build
  image: gperdomor/nx-kaniko:20.11.1-alpine
  variables:
    # Nx Container
    INPUT_PUSH: 'true' # To push your image to the registry
    INPUT_ENGINE: 'kaniko' # Overriding engine of project.json files
  cache:
    key:
      files:
        - pnpm-lock.yaml
    paths:
      - .pnpm-store
  before_script:
    - npm i -g pnpm
    - pnpm config set store-dir .pnpm-store
    - pnpm i
    - NX_HEAD=$CI_COMMIT_SHA
    - NX_BASE=${CI_MERGE_REQUEST_DIFF_BASE_SHA:-$CI_COMMIT_BEFORE_SHA}
    # Login to registry
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n $CI_REGISTRY_USER:$CI_REGISTRY_PASSWORD | base64)\"}}}" > /kaniko/.docker/config.json
  script:
    - pnpm nx affected --base=$NX_BASE --head=$NX_HEAD --target=container --configuration=production --parallel=1
This is the error shown when launching the image in Docker:
node:internal/modules/cjs/loader:1147
throw err;
^
Error: Cannot find module 'cookie-parser'
Require stack:
- /usr/src/app/main.js
at Module._resolveFilename (node:internal/modules/cjs/loader:1144:15)
at Module._load (node:internal/modules/cjs/loader:985:27)
at Module.require (node:internal/modules/cjs/loader:1235:19)
at require (node:internal/modules/helpers:176:18)
at Array.__webpack_modules__ (/usr/src/app/main.js:2492:18)
at __webpack_require__ (/usr/src/app/main.js:2515:41)
at /usr/src/app/main.js:2531:49
at /usr/src/app/main.js:2581:3
at /usr/src/app/main.js:2584:12
at webpackUniversalModuleDefinition (/usr/src/app/main.js:3:20) {
code: 'MODULE_NOT_FOUND',
requireStack: [ '/usr/src/app/main.js' ]
}
Node.js v20.11.1
This is our package.json:
{
  "name": "my-app",
  "version": "0.0.1",
  "dependencies": {
    "@apollo/gateway": "2.7.1",
    "@apollo/subgraph": "2.7.1",
    "@nestjs-plugins/nestjs-nats-jetstream-transport": "2.2.6",
    "@nestjs/apollo": "12.1.0",
    "@nestjs/axios": "3.0.2",
    "@nestjs/common": "10.3.3",
    "@nestjs/config": "3.2.0",
    "@nestjs/core": "10.3.3",
    "@nestjs/graphql": "12.1.1",
    "@nestjs/jwt": "10.2.0",
    "@nestjs/microservices": "10.3.3",
    "@nestjs/platform-express": "10.3.3",
    "@nestjs/terminus": "10.2.3",
    "@nestjs/typeorm": "10.0.2",
    "@songkeys/nestjs-redis": "10.0.0",
    "axios": "1.6.7",
    "class-transformer": "0.5.1",
    "class-validator": "0.14.1",
    "cookie-parser": "1.4.6",
    "express": "4.18.3",
    "graphql": "16.8.1",
    "ioredis": "5.3.2",
    "multer": "1.4.5-lts.1",
    "nestjs-i18n": "10.4.5",
    "nestjs-pino": "4.0.0",
    "pg": "8.11.3",
    "pino-pretty": "10.3.1",
    "reflect-metadata": "0.2.1",
    "rxjs": "7.8.1",
    "tslib": "2.6.2",
    "typeorm": "0.3.20",
    "xml2js": "0.6.2"
  },
  "main": "main.js"
}
And this is our Dockerfile:
FROM docker.io/node:lts-alpine as deps
# Check https://github.com/nodejs/docker-node/tree/b4117f9333da4138b03a546ec926ef50a31506c3#nodealpine to understand why libc6-compat might be needed.
RUN apk add --no-cache libc6-compat
RUN npm i -g pnpm
WORKDIR /usr/src/app
COPY dist/apps/my-app/package*.json ./
COPY dist/apps/my-app/pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile --prod
# Production image, copy all the files and run nest
FROM docker.io/node:lts-alpine as runner
RUN apk add --no-cache dumb-init curl
ENV NODE_ENV production
ENV PORT 3000
WORKDIR /usr/src/app
COPY --from=deps /usr/src/app/node_modules ./node_modules
COPY --from=deps /usr/src/app/package.json ./package.json
COPY dist/apps/my-app .
RUN chown -R node:node .
USER node
EXPOSE 3000
CMD ["dumb-init", "node", "main.js"]
Does anyone have a solution to this?
I have the same issue on a project where I build a backend with the NestJS framework. For me it's the tslib module that is missing, resulting in Docker images that are 100 MB instead of the usual 200 MB. The only fix I found is to remove the cache of each built app from my GitLab repo's container registry and then rebuild the app.
Thanks, I may try that as a temporary workaround. I hope they'll fix this eventually.
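For anyone who wants to script that workaround instead of clicking through the UI, GitLab's container registry API can list and delete registry repositories. A hedged sketch (host, project ID, repository ID and token are placeholders):

  # find the repository id of the cache repo
  curl --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "https://gitlab.example.com/api/v4/projects/<project-id>/registry/repositories"
  # delete it so the next pipeline rebuilds from scratch
  curl --request DELETE --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
    "https://gitlab.example.com/api/v4/projects/<project-id>/registry/repositories/<repository-id>"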
Hi folks... This seems to be related to Kaniko and not to the plugin itself. The plugin basically builds the final command and arguments which are executed to build the image, but the whole build step is done by Docker, Podman or, in your case, Kaniko... In any case, can you provide a link to the repo if it's public, please?...
Also, if you execute the same command (extracted from the GitLab logs) locally and the final image works, that is another confirmation that the problem is not the plugin.
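For that kind of local check, the Kaniko executor can be run directly from its container image against the same build context. A rough sketch with placeholder paths (--no-push keeps the test local; adjust the context and Dockerfile paths to your workspace layout):

  docker run --rm -v "$PWD":/workspace gcr.io/kaniko-project/executor:latest \
    --context dir:///workspace \
    --dockerfile /workspace/apps/my-app/Dockerfile \
    --no-push

If the build behaves locally while the CI-built image doesn't, that again points at the cached layers in CI rather than at the plugin.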
Hello @gperdomor, thanks for your reply. That's what I thought too. I was using the gperdomor/nx-kaniko:18.12.0-alpine image and now I'm trying gperdomor/nx-kaniko:20.12.2-alpine. Could you tell me what version of Kaniko those images are using?
Also, as an attempt to prevent caching issues, I tried following https://github.com/gperdomor/nx-tools/blob/main/packages/nx-container/docs/advanced/cache.md and implemented "cache-from" and "cache-to" on all the apps of my Nx projects, but I don't see any changes in my GitLab container registry: the cache folders are still appName/cache instead of appName:buildcache. For reference, here is the relevant part that I modified in each app's project.json:
"container": {
"executor": "@nx-tools/nx-container:build",
"dependsOn": ["build"],
"options": {
"engine": "docker",
"metadata": {
"images": ["$CI_REGISTRY/$CI_PROJECT_PATH/backend-features"],
"load": true,
"tags": [
"type=schedule",
"type=ref,event=branch",
"type=ref,event=pr",
"type=sha,prefix=sha-"
],
"cache-from": [
"type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache"
],
"cache-to": [
"type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache,mode=max"
]
}
}
}
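For reference, the cache guide linked above places "cache-from" and "cache-to" directly under options, not under metadata, which is what the edits below eventually confirm. A sketch of that layout (metadata elided):

  "options": {
    "engine": "docker",
    "cache-from": [
      "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache"
    ],
    "cache-to": [
      "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache,mode=max"
    ],
    "metadata": { ... }
  }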
And here is my GitLab CI job:
build affected apps:
  stage: build
  image: gperdomor/nx-kaniko:20.12.2-alpine
  interruptible: true
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
  variables:
    # Nx Container
    INPUT_PUSH: 'true' # To push your image to the registry
    INPUT_ENGINE: 'kaniko' # Override the engine to use for building the image
  before_script:
    - npm ci -f --cache .npm --prefer-offline
    - NX_HEAD=$CI_COMMIT_SHA
    - NX_BASE=${CI_MERGE_REQUEST_DIFF_BASE_SHA:-$CI_COMMIT_BEFORE_SHA}
    # Login to registry
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
  script:
    - echo "Building apps..."
    - npx nx show projects --affected --with-target container --base=$NX_BASE --head=$NX_HEAD > apps.txt
    - npx nx affected --base=$NX_BASE --head=$NX_HEAD --target=container --parallel=1
  artifacts:
    paths:
      - apps.txt
  rules: # Only run on main branch but not tags
    - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_TAG == null'
PS: I'm using the latest library versions:
"@nx-tools/container-metadata": "^5.3.1",
"@nx-tools/nx-container": "^5.3.1",
EDIT: OH WAIT! I put "cache-from" and "cache-to" inside metadata; I'm going to move them inside options and see if that fixes my issue!
EDIT 2: yup, I can now see in the logs that it's pushing to appName:buildcache.
EDIT 3: but there are still some errors, see:
INFO[0066] Taking snapshot of files...
INFO[0066] Pushing layer type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:9dbbf534e4cb2b3c653eb31ceed07b924d89357bc7db9013dc177c3bdacb8467 to cache now
INFO[0066] Pushing image to type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:9dbbf534e4cb2b3c653eb31ceed07b924d89357bc7db9013dc177c3bdacb8467
INFO[0066] USER node
INFO[0066] Cmd: USER
INFO[0066] No files changed in this command, skipping snapshotting.
INFO[0066] EXPOSE 3000
INFO[0066] Cmd: EXPOSE
INFO[0066] Adding exposed port: 3000/tcp
INFO[0066] No files changed in this command, skipping snapshotting.
INFO[0066] CMD ["dumb-init", "node", "main.js"]
INFO[0066] No files changed in this command, skipping snapshotting.
WARN[0066] Error uploading layer to cache: failed to push to destination type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:23b6d6d66222c4e991ae8565aae3c4ea7cba20cf4bd727d3a9b8712484b8dbd2: Get "https://type=registry,ref=registry.companyName.com/v2/": dial tcp: lookup type=registry,ref=registry.companyName.com: no such host
INFO[0066] Pushing image to registry.companyName.com/groupName/backend/backend-app:main
INFO[0072] Pushed registry.companyName.com/groupName/backend/backend-app@sha256:069ef95d4d7cdafc7e19f754ae0e2226ccccfe1650062644a11b83cd1b099b19
INFO[0072] Pushing image to registry.companyName.com/groupName/backend/backend-app:sha-d70ea95
INFO[0072] Pushed registry.companyName.com/groupName/backend/backend-app@sha256:069ef95d4d7cdafc7e19f754ae0e2226ccccfe1650062644a11b83cd1b099b19
If you take a look, there are errors pushing layers to the cache... Any ideas? Is it because, when using Kaniko, the cache-to and cache-from options are not the same as for the Docker engine?
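That reading matches the log above: Kaniko was handed the buildx-style string type=registry,ref=... verbatim and tried to resolve it as a registry host, hence the dial tcp: lookup type=registry,ref=registry.companyName.com error. Kaniko does not understand buildx cache syntax; its native layer cache is driven by the --cache and --cache-repo flags, roughly like this (a sketch, not the exact command the plugin generates):

  /kaniko/executor \
    --context dir:///workspace \
    --dockerfile /workspace/apps/backend-app/Dockerfile \
    --destination registry.companyName.com/groupName/backend/backend-app:main \
    --cache=true \
    --cache-repo=registry.companyName.com/groupName/backend/backend-app/buildcache

So with the Kaniko engine the cache options likely need a bare repository reference rather than the type=registry,ref=...,mode=max form used with the Docker engine; worth confirming against the plugin's cache docs.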