
Investigate reducing co2 and gpsd image sizes

Open djgood opened this issue 3 years ago • 5 comments

See if we can reduce the sizes of the images to speed up updates, which allows a new device to come online faster and makes developing for our devices less of a waiting game! Reduces the waste of bandwidth as well ;)

An easy way that I've found to reduce image sizes is to use Docker's multi-stage builds feature. We should be able to add a build step which starts a fresh container, then copies all of the compiled/downloaded modules from the previous step, taking just what we need to run.

The simplest image that we could create would just contain the code and its dependencies, but it may be worth keeping some tools around that are helpful for debugging (e.g. i2cdetect).
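
A minimal sketch of what such a multi-stage Dockerfile.template could look like. The base image tags, the /opt/venv path, and requirements.txt are assumptions for illustration, not the project's actual files. It also assumes both images ship the same Python at the same path, so the virtualenv still works after the copy.

```dockerfile
# Build stage: compilers and headers live here and are never shipped.
# The image tags are assumptions; check what balenalib actually publishes.
FROM balenalib/%%BALENA_MACHINE_NAME%%-python:3-buster-build AS build
WORKDIR /usr/src/app

# Install dependencies into a virtualenv so they can be copied as one directory.
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# requirements.txt is a hypothetical stand-in for the project's dependency list.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Runtime stage: starts from a fresh, smaller image.
FROM balenalib/%%BALENA_MACHINE_NAME%%-python:3-buster-run

# Keep a debugging tool around: i2cdetect comes from the i2c-tools package.
RUN install_packages i2c-tools

# Take only what is needed to run: the virtualenv and the application code.
COPY --from=build /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /usr/src/app
COPY co2.py ./
CMD ["python3", "co2.py"]
```

Anything installed only in the build stage (compilers, headers, pip's download cache) stays out of the image that lands on the device.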

djgood, Nov 20 '21 00:11

I did an experiment to see how much we can save.

  • "phase 1": https://github.com/abesto/ribbit-network-frog-sensor/commit/01ffab3a7d945f9f1cb37523d8e41511ec0503b9 - multi-stage build using a virtualenv
  • "phase 2": https://github.com/abesto/ribbit-network-frog-sensor/commit/e456650f5fdb9ac3ec98a4276b9f303c34541424 - pay ridiculous amounts of build time for extra space savings

And the numbers:

  • Base image: 243MB
  • Current co2 image size: 540MB
  • phase 1: 438MB
  • phase 2: 423MB

Phase 2 is (at least in my buildx x-plat build environment) ridiculously slow, so definitely not worth it for the extra 15MB saving. Hooowever, phase 1 seems like it's worth doing? It's roughly a 19% size decrease (540MB to 438MB). Not as dramatic as you'd see with a static binary running on Alpine or whatever, but we are talking about Python. To get more, we'd probably need to start trimming the actual dependencies.

(Note, we should probably have done PYTHONDONTWRITEBYTECODE=1 on phase 1, should be zero overhead and save some space on .pyc files)
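
For reference, setting the variable is a one-line change in the Dockerfile. A sketch, assuming a requirements.txt-style install (a hypothetical stand-in for the real install step). Whether pip's own install-time compilation honours the variable is not confirmed here, so the sketch also passes pip's --no-compile flag:

```dockerfile
# Stop the interpreter from writing .pyc files on import.
ENV PYTHONDONTWRITEBYTECODE=1

# --no-compile asks pip not to byte-compile the packages it installs.
# requirements.txt is a hypothetical stand-in for the real dependency list.
RUN pip install --no-cache-dir --no-compile -r requirements.txt
```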

abesto, Jan 11 '22 21:01

... A quick look through the output of the install_packages we start with shows stuff like fonts and libgtk2 being installed, so I'm guessing there's some further space to be saved by tracking down packages that think we need a desktop environment, and convincing them of the error of their ways.

abesto, Jan 13 '22 10:01

Instead of copying all the content, you could copy in just the files needed for the installs (pyproject.toml and the lock file?). That way, changes to co2.py won't bust your cache; the build step will only rerun when it needs new packages:

# This will copy all files in our root to the working directory in the container
# It's almost guaranteed to bust the Docker build cache, so do it as late as possible
COPY . ./

https://github.com/abesto/ribbit-network-frog-sensor/blob/e456650f5fdb9ac3ec98a4276b9f303c34541424/software/co2/Dockerfile.template#L48-L50
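
A sketch of the cache-friendly ordering being suggested. requirements.txt is a hypothetical stand-in for whichever files drive the install (pyproject.toml and a lock file, if that is what the project uses):

```dockerfile
WORKDIR /usr/src/app

# Copy only the dependency manifest first. This layer and the install below
# are reused from cache until the manifest itself changes.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code last, so editing co2.py only invalidates
# the layers from this point on.
COPY . ./
```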

This step isn't really going to give any advantage: the build stage is disposable and never makes it to the devices. Due to Docker's layering and build caching, it also won't really save space on your local machine. At the moment it will just slow the build down ever so slightly while you wait for the uninstall to complete.

# Save a tiny amount of space: we don't need these at runtime
RUN pip uninstall -y wheel pip

https://github.com/abesto/ribbit-network-frog-sensor/blob/e456650f5fdb9ac3ec98a4276b9f303c34541424/software/co2/Dockerfile.template#L55-L56

Using the balena :buster-build images could also help build time, as fewer installs will be required in the build step. It will make the build-step image bigger, but as that never lands on a device there is little reason to worry.

I would also consider locking the Python version. I think it is something like balenalib/%%BALENA_MACHINE_NAME%%-python:3.12-buster-build. Otherwise a new Python release, from 3.10 to 3.11 for example, will be pushed out to the devices when using just python:buster-build. There is usually little need to be on the edge like that; upgrades can be done intermittently, as and when needed. A Python update inside the container takes a lot of bandwidth to roll out across devices. Not to mention it's helpful for ensuring breaking changes don't slip in.

Come to think of it, I would say the same for the pip --upgrade in there too. It updates pip without any version control. Better to bump the image version with pip inside than to always run the latest pip, just to create a predictable environment.
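
A sketch of pinning both. The Python tag is the one guessed at above and the pip version is only a placeholder; check which tags and versions actually exist before using either.

```dockerfile
# Pin the Python minor version in the base image tag instead of a floating tag.
# The tag is an assumption taken from the suggestion above.
FROM balenalib/%%BALENA_MACHINE_NAME%%-python:3.12-buster-build AS build

# Pin pip to an explicit version instead of `pip install --upgrade pip`.
# 23.3.1 is a placeholder; bump it deliberately when needed.
RUN pip install --no-cache-dir pip==23.3.1
```

With both pinned, a rebuild only pulls a new interpreter or a new pip when someone changes these lines on purpose.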

maggie44, Jan 17 '22 15:01

> Instead of copying all the content you could copy in just the files needed for the installs

Accurate; check out https://github.com/Ribbit-Network/ribbit-network-frog-sensor/blob/main/software/co2/Dockerfile.template, we're actually doing this!

> This step isn't really going to give any advantage: the build stage is disposable and never makes it to the devices.

We install wheel and pip after setting up the venv, and we ship the venv to the runtime image, so surely removing them saves the space that pip and wheel take up?

> Using the balena :buster-build images could also help build time

Yep, and we're also doing that! :D

> Would also consider locking the Python version. Come to think of it, would say the same for the pip --upgrade in there too.

pip I'm conflicted about, but the Python version, for sure.

abesto, Jan 17 '22 17:01

Whoops, I was looking at the experiment repos you linked; it looks like there is another one merged with a lot of these things already in it.

> We install wheel and pip after setting up the venv, and we ship the venv to the runtime image, so surely removing them saves the space that pip and wheel take up?

Ah I see. Nice catch.

I tend to install with --user instead of a virtual env, which would use the global wheel package. But they are all much of a muchness. https://pythonspeed.com/articles/multi-stage-docker-python/
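
For comparison, a sketch of the --user variant, along the lines of the linked article. It assumes the build runs as root with HOME set to /root, so packages land under /root/.local; the image tags and requirements.txt are hypothetical stand-ins.

```dockerfile
# Build stage; the image tags are assumptions.
FROM balenalib/%%BALENA_MACHINE_NAME%%-python:3-buster-build AS build
WORKDIR /usr/src/app
COPY requirements.txt ./
# --user installs into ~/.local (assumed to be /root/.local here)
# instead of the system site-packages.
RUN pip install --no-cache-dir --user -r requirements.txt

# Runtime stage.
FROM balenalib/%%BALENA_MACHINE_NAME%%-python:3-buster-run
# Copy just the user-level install into the runtime image.
COPY --from=build /root/.local /root/.local
# Make any installed console scripts findable.
ENV PATH="/root/.local/bin:$PATH"
WORKDIR /usr/src/app
COPY co2.py ./
CMD ["python3", "co2.py"]
```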

maggie44, Jan 19 '22 09:01

We're moving to an ESP32-based frog for the foreseeable future, so closing this as won't fix for now. Perhaps once the global supply chain clears up a bit, we will revisit this.

New software repo below:

https://github.com/Ribbit-Network/ribbit-network-frog-software

keenanjohnson, Dec 06 '22 00:12