rethinkdb-dockerfiles

New image planning

Open stuartpb opened this issue 7 years ago • 30 comments

This is a tracking issue to collect all the elements that are under consideration for a potential overhaul to the official RethinkDB image:

  • Packaging from scratch (#43) or Alpine (#32) instead of Debian
  • Packaging for ARM (#41)
  • Running as a non-root user (#39)
  • Removing the VOLUME directive (#14)
  • Housekeeping

Now, some of these have a larger potential for breaking changes than others (particularly removing the VOLUME directive), so I'm not necessarily going to do all of them immediately; however, many of these (like changing the user and packaging from scratch) shouldn't affect the daemon's normal operation, so I may tie them to the next minor bump for the official library. (Ideally, all this would be held off as potentially breaking changes that would only accompany a major version bump, but I don't see RethinkDB 3.0 happening any time in the foreseeable future.)

stuartpb avatar Jan 06 '18 11:01 stuartpb

Can I help with the Alpine container?

nicodmf avatar Jul 02 '18 12:07 nicodmf

Is this still something that's planned? Is RethinkDB something that's still maintained? Would it make sense to look at RebirthDB? (https://github.com/RebirthDB/rebirthdb)

tianon avatar Aug 30 '18 18:08 tianon

@nicodmf Yes, talk to the RebirthDB people.

@tianon All the activity is with RebirthDB (except for a small amount of stuff I do to fix stuff, and the miscellaneous PRs you see, and that'll end up in RebirthDB anyway) but it looks like they're going to become the people in charge of RethinkDB real soon. You'll notice that the main rebirthers are now owners of the RethinkDB org: https://github.com/orgs/rethinkdb/people

srh avatar Sep 07 '18 19:09 srh

I like @daveisfera's work in https://github.com/rethinkdb/rethinkdb-dockerfiles/pull/40#issuecomment-391075583 and currently that's probably the closest thing to what a new image would look like, but that is a hairy RUN line - I'd rather see something that uses more of upstream's build process, which (IIRC) uses Nix and was developed by @AtnNn.

In any case, what we really need is a new pull request implementing a build on Alpine using either of these two approaches (if we can't get this going from scratch, which is what I'd really rather have) - I can't merge a thread comment.

stuartpb avatar May 02 '19 23:05 stuartpb

RUN commands to build a package usually are big and ugly. Multi-stage builds help clean that up, but they're currently not supported for official images.

The Dockerfile that I made uses the same build process as the Alpine package and is very close to the version used for the Debian builds (i.e. just modifications to work with Alpine), so I'm not sure what the changes you're requesting would be.

daveisfera avatar May 03 '19 05:05 daveisfera

Is there a way we could pull the build command by reference from an upstream source, like how Arch Linux has the ABS? In other words, if this is how the Alpine package is built, can we perform the build and install it as our own package?

stuartpb avatar May 03 '19 08:05 stuartpb

I'm not following. A Dockerfile is basically a build system like ABS. If you go dig into the details of an ABS build, they're just as big and ugly. They have better ways to break it up and structure things so it's easier to look at and manage, but once multi-stage builds are supported, then the Dockerfile can be cleaned up in the same sort of way.

dave-nm avatar May 03 '19 14:05 dave-nm

I guess what I'm saying is, if this does the same thing as https://git.alpinelinux.org/aports/tree/community/rethinkdb/APKBUILD, why not just build from that aports tree source directly (which, from the looks of it, would also resolve #39 via https://git.alpinelinux.org/aports/tree/community/rethinkdb/rethinkdb.pre-install)? Why repeat ourselves?

stuartpb avatar May 05 '19 22:05 stuartpb

Oh, do you mean to run the APKBUILD file from Alpine inside of the Dockerfile? If that's what you're asking, then I'm not sure what that would take but I'm guessing it's possible. The downside is that we'd be tied to the version that they have installed and if we're going to couple it to their packaging like that, then why not just install the already built binaries?

daveisfera avatar May 05 '19 22:05 daveisfera

if we're going to couple it to their packaging like that, then why not just install the already built binaries?

I guess my original thinking behind this was that this was the only way to really pin a package release, since Alpine's repos only provide the latest version. The point of the Docker library seems to be about making the build process as deterministic as possible, though I'll admit that this seems like something of a farce when the first line of the Dockerfile bases the entire build on a mutable tag.
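A minimal sketch of what pinning could look like in a Dockerfile, assuming the `rethinkdb` package exists in the repositories of the chosen Alpine release. The `BASE_IMAGE` and `RETHINKDB_PKG_VERSION` build arguments are hypothetical names introduced for this example, and `<digest>` and `<version>` are placeholders rather than real values.

```dockerfile
# The default is still a mutable tag. An immutable reference can be passed
# at build time instead (the digest is a placeholder):
#   docker build --build-arg BASE_IMAGE=alpine@sha256:<digest> .
ARG BASE_IMAGE=alpine:3.15
FROM ${BASE_IMAGE}

# Hypothetical argument: an exact apk package version, passed with
#   --build-arg RETHINKDB_PKG_VERSION=<version>
ARG RETHINKDB_PKG_VERSION

# apk accepts a "name=version" constraint. When the argument is unset or
# empty, the shell expansion adds nothing and the latest package is installed.
RUN apk add --no-cache "rethinkdb${RETHINKDB_PKG_VERSION:+=$RETHINKDB_PKG_VERSION}"
```

With no build arguments this behaves like an unpinned build. With an exact package version, the build fails once the repository stops carrying that version, which is the limitation described above.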

stuartpb avatar May 05 '19 23:05 stuartpb

Argh... ultimately, what this comes down to is that RethinkDB did their own packaging for Debian/Ubuntu/CentOS, and they packaged the full matrix of releases to package versions, and if we're relying on distro packaging (which only caters to the latest version), the ability to maintain images for previous versions using only upstream-provided infrastructure falls apart.

I don't want the Dockerfiles repository to become a backports project, especially when RethinkDB upstream already maintains a backporting CI system for other distributions.

I feel like the real solution would be to get Alpine added to https://github.com/rethinkdb/rethinkdb-nix, but that would require Alpine to get added to Nix's whole VM building system first...

stuartpb avatar May 05 '19 23:05 stuartpb

Agh, whatever, we don't list anything but the latest version in the library anyway (and I vaguely recall the exchange that led to this).

I could argue that it's a bad idea to let old images' dependencies stagnate, but there's also an argument to be made for not making untested changes to obscure backports... whatever.

Yeah, at this point I'm fine just using Alpine's package. When Alpine drops support for older versions, that's fine, because so does our building of them. Anybody who wants to keep an older image building will just have to invent their own CI system to support their specific blend of old and new, I guess!
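For illustration, an image built on Alpine's package might look roughly like the sketch below. It assumes the `rethinkdb` package is available for the chosen Alpine release and that its pre-install script creates a `rethinkdb` user; the `alpine:3.15` tag and the `/data` directory are choices made for this example, not something settled in this thread.

```dockerfile
# Assumption: "rethinkdb" is packaged for this Alpine release.
FROM alpine:3.15

# Install the distro-built binary. The package's pre-install script is
# expected (not verified here) to create the "rethinkdb" user.
RUN apk add --no-cache rethinkdb \
 && mkdir -p /data \
 && chown rethinkdb /data

# Run the daemon unprivileged, with its data under /data.
USER rethinkdb
WORKDIR /data

# Client driver, intracluster, and web UI ports.
EXPOSE 28015 29015 8080

CMD ["rethinkdb", "--bind", "all"]
```

The RethinkDB version in such an image is whatever the Alpine release ships, which is the trade-off accepted above.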

stuartpb avatar May 06 '19 00:05 stuartpb

Or, really, reflecting on https://github.com/rethinkdb/rethinkdb-dockerfiles/pull/45#issuecomment-426690121, I think what'd be more appropriate than Ubuntu or Alpine going forward would be for the RethinkDB image to be based on a package for Nix: https://github.com/rethinkdb/rethinkdb-nix/issues/2

stuartpb avatar May 06 '19 00:05 stuartpb

Having a set of packages for each of the OSes is really nice (like what Postgres provides), but it's a ton of work. Also, even they don't maintain packages for Alpine and they just build the software in the Alpine base image, like I've done. Basically, they have two versions:

  1. debian base that installs the prebuilt package ( https://github.com/docker-library/postgres/blob/85aadc08c347cd20f199902c4b8b4f736341c3b8/9.6/Dockerfile )
  2. alpine base that builds dynamically ( https://github.com/docker-library/postgres/blob/85aadc08c347cd20f199902c4b8b4f736341c3b8/9.6/alpine/Dockerfile )

I would vote that that be the same strategy that Rethink uses for the flexibility and control that it provides, because honestly being tied to whatever Alpine has limits the Docker images in a way that I believe will likely make them less useful than they should be.

daveisfera avatar May 06 '19 02:05 daveisfera

@stuartpb @daveisfera I was surprised this morning when I ran a docker pull command in my project and the rethinkdb:2.3.6 image downloaded a new image from dockerhub.

The dockerhub page says the tag was updated 18 days ago.

The Dockerfile in this repo hasn't been changed for 2 years.

Given this: https://www.bankinfosecurity.com/docker-hub-breach-its-numbers-its-reach-a-12425, I'm wondering if there is cause for concern, or has one of you guys recently pushed a new image over that tag (and if so, why wouldn't it just be a new tag?)

ryedin avatar May 28 '19 14:05 ryedin

The base image was updated ( https://github.com/debuerreotype/docker-debian-artifacts/commits/fd138cb56a6a6a4fd9cb30c2acce9e8d9cccd28a/jessie/Dockerfile ). This is common to get security fixes and such out to all of the images and I don't believe that it has anything to do with the breach.

daveisfera avatar May 28 '19 14:05 daveisfera

Are you guys any closer to having AArch64 (ARM64) support on DockerHub?

https://hub.docker.com/_/rethinkdb/

lag-linaro avatar Jun 13 '19 07:06 lag-linaro

Just to be clear, it's been ~623 days since the last actual rethinkdb image update, so it's definitely ripe (even if just to get off Debian Jessie whose leftover lifespan is getting very, very thin).

Without some amount of image maintenance, we'll be adding a deprecation notice (which itself can be temporary, but we'd really much rather see the image updated :smile: :heart:).

tianon avatar Jun 19 '19 22:06 tianon

@tianon I have a PR that will update the docker images, namely https://github.com/rethinkdb/rethinkdb-dockerfiles/pull/46.

When I open the PR to change https://github.com/docker-library/official-images/blob/master/library/rethinkdb, should I open one to remove the deprecation warning or will you do it?

gabor-boros avatar Nov 29 '19 08:11 gabor-boros

RethinkDB itself was AGPLv3 and, after the transition, was relicensed under the Apache License 2.0. The Docker page for the official rethinkdb image lists AGPLv3.

Is there specific information as to what is still forcing the AGPL license for the image?

My team wants to use it, but due to our Legal department, AGPL is not a viable license for us. Thanks for all your hard work and for the information you can provide!

brecko avatar Mar 13 '20 16:03 brecko

@brecko the license should be Apache 2.0. Any other licenses are just wrong. Thank you for raising this, I’ll update that info.

gabor-boros avatar Mar 14 '20 08:03 gabor-boros

@gabor-boros Thank you for the quick response and that is good news to hear. So that my team can better align, approximately when do you think this change will be made?

brecko avatar Mar 16 '20 13:03 brecko

@brecko I wanted to do that this morning but I had no time for that. Tomorrow morning I’ll try to adjust that 😇

gabor-boros avatar Mar 16 '20 17:03 gabor-boros

PR: https://github.com/docker-library/docs/pull/1679

gabor-boros avatar Mar 17 '20 07:03 gabor-boros

Thank you @gabor-boros !

brecko avatar Mar 17 '20 17:03 brecko

I came up with a recipe for building rethinkdb 2.4.2 on alpine:3.15, which is a stable branch. It uses ARG directives that can be overridden from the build command and performs a multi-stage build across several layers. The final image is a minuscule 35.7MB (13.82MB compressed) and runs as a non-root user. Hopefully this meets the requirements to be used as the basis for an official rethinkdb:alpine image. For now, I've pushed this to Docker Hub under my account if anyone wants to try it out.

docker run -d --name=rdbtest -p 8080:8080 -p 28015:28015 -p 29015:29015 besworks/rethinkdb:latest
docker logs -f rdbtest

Or build it yourself from the latest/Dockerfile in my GitHub repo.

I've also added a version that includes python + the python rethinkdb driver (~79MB) which is tagged as besworks/rethinkdb:python

besworks avatar May 04 '22 23:05 besworks

So it gets the backtrace() with libexecinfo? Great.

I think instead of using boost-dev it should use the fetched boost library. This hard-codes the boost version to 1.60.0, which avoids any hypothetical changes to the behavior of boost's datetime library, which the query language and secondary index functions can use.

I think you also don't need icu-dev anymore.

srh avatar May 06 '22 09:05 srh

From my build log it looks like it ignored the installed boost-dev package (1.77.0-r1) anyway and instead built with the fetched version that you mentioned.

None of the build dependencies end up in the final image anyway so having an extra one here and there won't hurt a whole lot. These can easily be fine-tuned in later builds. I'll try without icu-dev next time I need to do a full run.

besworks avatar May 06 '22 13:05 besworks

For whatever it's worth, here's a Dockerfile that I made a while back that builds against alpine: https://github.com/rethinkdb/rethinkdb-dockerfiles/issues/32#issuecomment-297428635

daveisfera avatar May 06 '22 14:05 daveisfera

@daveisfera I based my Dockerfile partly on your example as well as others that I found. The difference with mine is that I use several RUN commands to create cached layers, so that I can re-run builds with various tweaks without needing to install the deps, download and unpack the source, etc. on each run.

I also build the output image from a fresh copy of the alpine base image to completely discard any build artifacts.
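The layering and fresh-base pattern described here can be sketched as follows. This is not the actual Dockerfile from the repo: a tiny C program stands in for the RethinkDB source build, and the `ALPINE_TAG` argument, the `demo` binary, and the `app` user are hypothetical names used only for illustration.

```dockerfile
# Sketch only. Override the base tag with:
#   docker build --build-arg ALPINE_TAG=<tag> .
ARG ALPINE_TAG=3.15

FROM alpine:${ALPINE_TAG} AS build

# Layer 1: build dependencies. Reused from cache while this line and the
# base image are unchanged.
RUN apk add --no-cache build-base

# Layer 2: obtain the source (a stand-in program here), cached separately
# from the compile step.
WORKDIR /src
RUN printf '#include <stdio.h>\nint main(void) { puts("built"); return 0; }\n' > main.c

# Layer 3: compile. Changing flags here re-runs only this step and later ones.
RUN gcc -O2 -o /usr/local/bin/demo main.c

# Final stage: a fresh copy of the base image. Nothing from the build stage
# is kept except what is copied explicitly.
FROM alpine:${ALPINE_TAG}
COPY --from=build /usr/local/bin/demo /usr/local/bin/demo

# Run as an unprivileged user (busybox adduser: no password, no home dir).
RUN adduser -D -H app
USER app
CMD ["demo"]
```

Because the compiler and sources exist only in the `build` stage, the size of the final image depends on the base image plus the copied binary, not on the build dependencies.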

The python tagged image actually builds from a different base too because using python:alpine came out ~20MB smaller than installing python3 from apk. This could probably be reduced even more if the rethinkdb python module was installed some way other than with pip.

besworks avatar May 06 '22 14:05 besworks