docker-iojs

Reduce Image Size - Alpine Linux, update apk

Open megastef opened this issue 8 years ago • 7 comments

Hi,

Are there any plans to optimize the image size? We were using the iojs image, and a customer complained that the image was more than 700 MB and suggested iojs-slim. The result was still 300 MB, including spm-agent-docker.

After using this as base:

FROM alpine:edge
RUN echo "http://dl-4.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories 
RUN apk update
RUN apk add --update iojs
RUN apk add --update git

RUN apk update
RUN apk upgrade
RUN rm -rf /var/cache/apk/*

the spm-agent-docker image had a final size of 123 MB. But here is my concern:

  • How can we ensure that the latest io.js is available as an .apk package?

In any case I would like to see a tiny, official iojs base image.

Thanks, Stefan

megastef avatar Jul 24 '15 10:07 megastef

Hi @megastef,

Yes, the current image is not as slim as it could be. There is a lengthy discussion in issue #44, which you should check out, about the possibility of providing a busybox/Alpine Linux image.

The difficulty is maintaining the integrity of the installed binaries and the Docker image, since this would be considered an official distribution of Node.js / io.js, which people will use in their production environments for everything they use Node.js for (banking, transactions, confidential information etc.).

We also need to make sure that the Node.js / io.js test suite does not fail in an Alpine Linux container.

Starefossen avatar Jul 24 '15 12:07 Starefossen

@megastef You can use https://github.com/mhart/alpine-node

asafyish avatar Jul 28 '15 19:07 asafyish

I was also surprised when I saw the size of the official images. ~700MB for running node is... wrong. Fortunately I found @mhart's nice alpine-linux node containers, which are MUCH smaller (~35MB, or 5%!), and I'm using those for everything. So my question was: how come a container that's 5% the size of another one does practically the same thing (for someone who wants to run node on it)?

@brianredbeard gave a detailed and enlightening answer to my question: https://www.youtube.com/watch?v=gMpldbcMHuI

So maybe we should start an effort to maintain an image that ONLY runs node and not unnecessary things...?

AntouanK avatar Aug 25 '15 14:08 AntouanK

@AntouanK:

A lot of work has gone into this, and that effort is underway: https://github.com/nodejs/docker-iojs/issues/44 (the maintainer of the alpine-node image you use is in that discussion)

Right now, if you inspect the image layers for iojs:latest (using docker history iojs:latest), you will see that:

  • 34.6MB - Installing node.js itself
  • 26.45kB - Downloading the gpg keys

The remaining 605.5MB is buildpack-deps:jessie (480MB), which itself depends on debian:jessie (125.2MB). These two images are the base of nearly every other official base image on the Docker Registry.
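As a rough illustration of the breakdown above, here is a minimal Python sketch that totals the layer sizes quoted in this comment and works out how much of the image comes from the shared base. The dictionary keys, the helper name `base_share`, and the conversion of 1 MB to 1000 kB are assumptions made for the example. Note that the two base figures quoted add up to 605.2MB, slightly under the 605.5MB stated, presumably because of rounding.

```python
# Layer sizes in MB as quoted in this thread for iojs:latest.
# Assumption: 1 MB is taken as 1000 kB for the gpg-keys layer.
LAYERS_MB = {
    "io.js install": 34.6,
    "gpg keys": 26.45 / 1000,
    "buildpack-deps:jessie": 480.0,
    "debian:jessie": 125.2,
}

# The layers that other official images can share.
BASE_LAYERS = ("buildpack-deps:jessie", "debian:jessie")


def base_share(layers, base_names):
    """Return (total size, base size, base fraction of the total)."""
    total = sum(layers.values())
    base = sum(layers[name] for name in base_names)
    return total, base, base / total


total_mb, base_mb, fraction = base_share(LAYERS_MB, BASE_LAYERS)
print(f"total {total_mb:.1f} MB, base {base_mb:.1f} MB ({fraction:.0%} of the image)")
```

With these figures the shared base accounts for roughly 95% of the image, which is why the rest of the argument turns on whether that base is reused by other images on the same host.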

What are the implications of this? The typical use case for Docker is running multiple containers from multiple images on a single host.

When these images are downloaded over the network, they all share the same core image, where most of the important shared libraries and system utilities live. This lets them take advantage of that environment while downloading the large 605.5MB base image only once.

You will notice that, in an environment where you are running more than a single image, the on-disk and over-the-network size of minimal images that pull in exactly what they need to run actually ends up being much larger, because they aren't able to share that nice base image.

The Alpine Linux images we are working on will be for the specific use case where you are on a host that will only be running node/iojs. In this case, the shared base image isn't really helpful.
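The trade-off described here can be sketched with some simple arithmetic. The Python below is only an illustration: the function names are made up for the example, and the 150MB size for a self-contained image is a hypothetical figure, not one taken from this thread. Only the 605.5MB base and the 34.6MB node layer come from the numbers above.

```python
import math


def shared_total(base_mb, unique_mb, count):
    """Size of `count` images that share one base: the base is stored once."""
    return base_mb + unique_mb * count


def standalone_total(image_mb, count):
    """Size of `count` self-contained images that share no layers."""
    return image_mb * count


def break_even(base_mb, unique_mb, image_mb):
    """Smallest image count at which the shared layout is strictly smaller.

    Returns None if the shared layout never wins.
    """
    saving_per_image = image_mb - unique_mb
    if saving_per_image <= 0:
        return None
    return math.floor(base_mb / saving_per_image) + 1


BASE_MB = 605.5        # shared base, from the figures in this thread
UNIQUE_MB = 34.6       # per-image layer on top of the base, from this thread
STANDALONE_MB = 150.0  # hypothetical size of one self-contained image

count = break_even(BASE_MB, UNIQUE_MB, STANDALONE_MB)
print(count, shared_total(BASE_MB, UNIQUE_MB, count), standalone_total(STANDALONE_MB, count))
```

The break-even point depends heavily on the assumed standalone size. If each self-contained image is close to the ~35MB alpine-node size mentioned earlier in the thread rather than 150MB, the shared layout only wins with far more images on the host.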

retrohacker avatar Aug 25 '15 15:08 retrohacker

@wblankenship Glad to hear you are working on this.

I do know that the "fs layers" are reused. But why do we need that debian one in the first place ( apart from "it's convenient" )? Just because a lot of other images are based on it? Doesn't sound like a good enough reason to me. I thought that those 600MB have things that you need in order to run node.

I followed exactly the same process to build my raspberry pi io.js image (see docker history antouank/rpi-iojs:3.1.0; it's based on resin/rpi-raspbian:wheezy), but mostly because I didn't know how to start and build a smaller one.

Building your own image from "scratch" will give you much more control over it and make it efficient for every user, I believe.

Hope you will get those official alpine images up soon. :+1:

PS Maybe we should include a raspberry pi image in the official ones?

AntouanK avatar Aug 25 '15 15:08 AntouanK

Some companies want to use Node.js but have concerns about security. Custom images require some trust between devs, which is fine for small and mid-size open source projects, but large OS projects and companies need more levels of trust. That's why debian/ubuntu, and even fedora/rhel, images are used as a base: those images are reviewed by more eyes and have more users who can discover issues quickly.

I could sum it up in two points.

  • If you need efficiency for your own project, that is good to have for you.
  • If you need broad usage for other projects and other people (and businesses), that is good to have for the Node.js organization.

I agree that for raspberries and custom projects there is a need for very light images, but official images need to be trustworthy for everyone else.

Glad that, using the docker repo, we can share effort on both fronts: efficiency AND broad usage :-D

cronopio avatar Aug 25 '15 16:08 cronopio

@AntouanK buildpack-deps:jessie comes with a bunch of handy shared libraries and tools. Many common native modules in Node.js can build inside the official docker images because these packages are included in our base image.

This includes things like:

  • image-magick
  • libncurses
  • python
  • g++
  • gcc
  • make
  • git
  • curl
  • and many more

It's not practical for every base image to install these common packages separately, because on-disk size would explode. But if we want an environment that can build/run native modules, these packages are necessary.

Not every use case needs these packages, because not every use case needs to be able to build and use native modules. Hopefully the minimal alpine images will fill that role, but the base images are supposed to be for the general use case (the "one size fits all"); that is why they look the way they do today.

retrohacker avatar Aug 25 '15 18:08 retrohacker