
How do you handle code sharing between different services in the same monorepo?

Open itamaro opened this issue 7 years ago • 8 comments

It doesn't come up in this sample repo, but I'm assuming that in a large enough project you would have several top-level "common" directories, meant to be shared between two or more services.

With every service having its own Dockerfile under its sub-dir, it wouldn't be possible to include also the required common dirs straight from the repo, since Docker doesn't allow referencing the filesystem from outside the "build context" (e.g. ../common).

Do you have an elegant solution for this use-case?

itamaro avatar Feb 27 '17 19:02 itamaro

@itamaro This is an amazing question!

Whenever I start a new project or alter infrastructure, I prefer not to share at all. Even if there are similar parts, they are just copied/manually merged between services. This allows you to easily experiment within any given service without breaking others. If changes worked well for one service, they are copied back to the others manually.

Now let me go back to your question. There are a few ways I've used for different projects:

  1. Create a common or shared folder within the monorepo, which has its own Dockerfile. In that Dockerfile you define the build steps and copy the folder into a directory within the container (/shared or /common). You can version the shared infrastructure using Docker tags. Other services that require the common code are simply based on this common image. Whenever you change the infrastructure, you bump the Docker tag to a new version and update the base image for some or all services (a rough sketch is below the list). The downside of this approach is that whenever you update the base image, all other services based on it need to be rebuilt from scratch.
  2. Another approach is to just publish common code as a separate package. For example, in Node.JS you can use a private registry to achieve that. I like this approach much more, but the downside is that you have to maintain a private package repository, which might be too much for early products.
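
As a rough sketch of option 1 (the image names, tags and paths here are just for illustration, not exactly what we use), the shared image and a service built on top of it might look like this:

# common/Dockerfile - the shared base image
FROM node:6
WORKDIR /app
COPY . /app/common
# built and pushed as, e.g., my-registry/common-base:1.2.0

# server/Dockerfile - a service based on the shared image
FROM my-registry/common-base:1.2.0
COPY . /app/server
CMD ["node", "/app/server/src/server.js"]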

We personally use approach #1 at the moment. Our common piece is solid and changes just a few times per month.

Also, if you do use Node.JS, you can easily connect the shared piece of code in local environments the same way you connect it in Docker, using NODE_PATH. Here is how we run our services using scripts in package.json: "start": "NODE_PATH=../shared node src/server.js". Node.JS will use NODE_PATH to search for modules in addition to the regular directories.
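
Inside Docker we do roughly the same thing; a minimal compose sketch (the service name, paths and bind-mount are illustrative, not our exact setup):

version: '2'
services:
  server:
    build: ./server
    environment:
      # tell Node.js to also resolve modules from the shared code inside the container
      - NODE_PATH=/shared
    volumes:
      # for local development, bind-mount the shared folder over the copy baked into the image
      - ./shared:/shared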

Hope this helps!

anorsich avatar Feb 28 '17 10:02 anorsich

@anorsich thank you for sharing!

Couple of thoughts.

First, regarding the "not sharing at all" approach at the beginning - while the thought of duplication and maintenance overhead makes me shiver, I can see how it can work reasonably well in a single-contributor environment. From the collaborative monorepos I'm familiar with, at least, I find it difficult to imagine how it could work without significant "forking" between the different copies.

The two solutions you present definitely make sense. As with anything, I see pros and cons, and I wonder if we can get to an ultimately better solution, compared to a monolithic application, where it is trivial to have one shared directory (or many).

Some issues that bothered me when attempting the solutions you described (in no particular order):

  1. So much extra "toil" to get things working...
  2. With option 1 - managing base images can be a real hassle. You start with 1, but you end up with a bunch. You need to version them. When they change, you need to chase all the services that depend on them, update their Dockerfile to start from the new base image, and rebuild. Update deployments to use the updated images. Update compose files.
  3. Also, with base images, there's the need to "manage their cache distribution". You either let every developer build the base images for themselves, or you need to make sure all developers pull base images from a central registry, every time they are updated. There are several tradeoffs just here.
  4. Option 2 has its own overhead, as you described, with setting up and managing package repositories. This overhead is significantly magnified when you take advantage of the micro-services paradigm to actually have different technologies for different services, so you end up managing repositories for apt & pip & npm & ... (we actually do that, but not for purposes of implementing option 2)
  5. Also, I think that packaging up a small utility function as a full-fledged package just to be able to reuse it all over your codebase doesn't make sense... Call me crazy ;-)
  6. In addition, for both options, I really like having my code in the image in the same layout as the code in the repo, so I can easily bind-mount the repo-dir for local development for dynamic languages. Using installed packages or base images with special paths can break this.
  7. To compound all of that, there's also the problem of effectively managing dependencies. Once you have multiple small shared modules (micro-modules?), each with a bunch of dependencies (internal or external), it can be a nightmare to make sure that every final image contains all the required internal & external dependencies - no more, and no less.
  8. Not quite directly related to shared modules, but while I'm on a soapbox, don't get me started on the pains of build-images vs. run-images... Since we're trying to assume as little as possible about the host environment, but we do need build-environments that are broader than the runtime-environments (e.g. compiler and dev-libs in the build-image, but only shared-objects and binaries in the runtime-image), we end up with complex multi-tiered dependency management that controls what goes into build-images and what goes into runtime-images, and which build-image is used to produce artifacts in temporary host-dirs to be copied into which runtime-image (a rough sketch of this dance follows the list)... It's hell.
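
To make point 8 concrete, the kind of dance I mean looks roughly like this (the image names, packages and paths are invented for the example):

# build.Dockerfile - full toolchain, used only to produce artifacts
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y build-essential zlib1g-dev
COPY . /src
# produces /src/out/myservice
RUN make -C /src

# run.Dockerfile - runtime-only image
FROM ubuntu:16.04
RUN apt-get update && apt-get install -y zlib1g
# out/ is a temporary host dir that the artifacts were copied into
# from a container of the build image, before building this image
COPY out/myservice /usr/local/bin/myservice
CMD ["/usr/local/bin/myservice"]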

I think this is a real pain-point in the overly-simplistic approach that Dockerfiles force, and real deep solutions are needed...

We're actually building internal tooling to make it better for us. Maybe it can be relevant to others as well :-)

itamaro avatar Mar 02 '17 09:03 itamaro

@itamaro you describe the exact problem I'm having -- I'm experimenting with changing the build context of a service to be the root of the monorepo so I can copy over the /shared directory but it results in a more complicated Dockerfile and slows down the time it takes to build the image -- I was wondering whether you managed to find a sane solution to this problem?
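
Roughly what I'm experimenting with (folder names are just for the example):

version: '3'
services:
  server:
    build:
      # context is the repo root so the Dockerfile can COPY shared/,
      # but now the whole repo gets sent to the Docker daemon on every build
      context: .
      dockerfile: server/Dockerfile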

richardscarrott avatar Sep 05 '18 22:09 richardscarrott

@richardscarrott I still use a shared folder on some old projects, but for new projects I prefer to publish common code as a standalone package (npm in Node.JS). Packages take more time and likely require some CI process to publish them, but they allow you to keep Docker images more lightweight. You're also going to think twice before creating a new package, which results in less shared code.

anorsich avatar Sep 06 '18 12:09 anorsich

I was wondering whether you managed to find a sane solution to this problem?

@richardscarrott not a perfect solution, but something that worked for us pretty well.

we wrote a build tool, YaBT, that is essentially a glorified Dockerfile generator that understands the semantics of both internal (e.g. shared code) and external (e.g. apt / pip / npm packages) dependencies, and lets you describe your artifacts as a DAG (you can see examples under the tests dir). it has good support for Python & C/C++ monorepos with external dependencies from a bunch of package management frameworks, as well as parallel building & target caching (of artifacts and test execution).

you're welcome to take it for a spin, and let me know how it works for you (or submit issues as to how it isn't :-) ).

worth mentioning also Bazel, which may work for your needs (it didn't for ours).

also worth mentioning Buck, which may also work for you, but probably doesn't (because there's no support for Docker images).

YaBT shares a similar DSL with both of these, and maybe they can even coexist under the same project.

itamaro avatar Oct 10 '18 03:10 itamaro

@itamaro Have you found an elegant solution for this?

cerinoligutom avatar Jul 30 '19 19:07 cerinoligutom

Have you found an elegant solution for this?

@cerino-ligutom the build tool I mentioned in the previous comment is still being used successfully :-)

itamaro avatar Aug 03 '19 20:08 itamaro

I recently found an elegant solution and thought I'd share!

In March of this year, Docker Compose released version 2.17.0 which supports specifying multiple build contexts.

Added support for additional_contexts in the build service configuration.

I can now have the primary build context be a service-specific folder, and add a common folder as a named context passed in to the Dockerfile. It's super neat.

The docker-compose.yml file in the root of my project looks like this:

...
services:
  ...
  server:
    ...
    build:
      context: server
      dockerfile: Dockerfile
      additional_contexts:
      - shared=common
    ...
  ...
...

And my server/Dockerfile looks like this:

...
COPY . .
COPY --from=shared . ../common
...

This copies all of my source files from my server folder (minus anything excluded by .dockerignore) and all of my shared files from my common folder into my Docker image.

My folder structure:

server
├─ Dockerfile
└─ ...

...
├─ Dockerfile
└─ ...

common
└─ ...

docker-compose.yml

And that's it! I think it's pretty clean.

mstefanwalker avatar Dec 15 '23 04:12 mstefanwalker