
Docker proof of concept

Open AndreasBackx opened this issue 7 years ago • 5 comments

I thought it would be neat to have mainframer work with Docker so it can be used anywhere without any setup by the users. Dockerfiles could perhaps be made available for multiple environments? I've got a proof of concept ready for Android called mainframer-android-docker. Everything is explained in the README, but if you have any questions, be sure to leave them below. Maybe this is something that could be integrated into mainframer?

AndreasBackx avatar Apr 15 '17 00:04 AndreasBackx

Any feedback on this?

AndreasBackx avatar May 04 '17 19:05 AndreasBackx

Well, yes, lots of it, not sure where to start :)

JFYI, we were considering running a Docker container for each user on the remote machine, but decided to stick to the regular setup since it's easier to maintain for a relatively small company. A good, scalable solution would require a lot of work.

Now about your POC:

  • You're removing one of the main performance benefits of Gradle, the Gradle daemon, by starting and shutting down a container for each build. This might be a show stopper for many teams.
  • I definitely don't like that mainframer.sh was changed to achieve that; it's not required at all in my opinion :) When I was planning the Docker setup on the remote machine, I was considering running a container per user with an SSH daemon in it, so it wouldn't matter to the remote user whether they connect to a real machine or to a Docker container → no changes required to mainframer.sh itself (a rough sketch follows this list).
  • The sample is Android-related while Mainframer is a general-purpose tool; it would be much nicer to include a link to a more generic sample (with an Android sample inside if you wish).
  • The build and run stages should be combined to achieve maximum automation in big teams (imagine each developer having to run the build and, moreover, understand when to do it; 99% of Android devs don't know what Docker is or how to work with it). Even better would be to simply download the image from some registry and then run it (the image can be published as part of the project's CI).
  • Correct me if I'm wrong, but it looks like each run will have to download Gradle itself and the project dependencies; this can take minutes, while the main purpose of Mainframer is to minimize build times…
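
To illustrate the second point, here is a minimal sketch of what such a per-user container could look like (the base image, user name and paths are just illustrative assumptions, not something we actually run):

# Rough sketch: a per-user build container that only runs sshd, so mainframer.sh
# can ssh into it exactly as it would into a real machine.
FROM ubuntu:16.04

RUN apt-get update && \
 apt-get install -y openssh-server openjdk-8-jdk && \
 mkdir /var/run/sshd

# One native user per container, matching the remote user mainframer connects as.
RUN useradd --create-home --shell /bin/bash builduser
COPY authorized_keys /home/builduser/.ssh/authorized_keys
RUN chown -R builduser:builduser /home/builduser/.ssh

EXPOSE 22

# sshd in the foreground is the container's single long-running process; builds
# come in over ssh, so the Gradle daemon they spawn keeps living between builds.
CMD ["/usr/sbin/sshd", "-D"]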

That's the main feedback I guess, feel free to comment :)

artem-zinnatullin avatar May 06 '17 12:05 artem-zinnatullin

Thank you for the feedback! My main goal is to get rid of all the setup required for mainframer for various projects. By providing a way of working with Docker, the users only need to have a Docker server running that they can SSH into (unless you make Docker listen on a port). This way it is easy to spin up multiple types of projects.

JFYI, we were considering running a Docker container for each user on the remote machine, but decided to stick to the regular setup since it's easier to maintain for a relatively small company. A good, scalable solution would require a lot of work.

Setting up a Docker server is a lot easier than figuring out for yourself how to set up the Android SDK and the headless configuration, which is a pain. I'm glad I had some help from people who've done it before: running Docker containers for Android. It would imo save a lot of people a headache if they could just spin up Docker and use Mainframer.

You're removing one of the main performance benefits of Gradle, the Gradle daemon, by starting and shutting down a container for each build. This might be a show stopper for many teams.

I agree that the Gradle daemon provides performance benefits that shouldn't be overlooked. The way the Android POC uses Docker can always be changed, or multiple solutions can be provided. I explain below why I used this approach.

I definitely don't like that mainframer.sh was changed to achieve that; it's not required at all in my opinion :) When I was planning the Docker setup on the remote machine, I was considering running a container per user with an SSH daemon in it, so it wouldn't matter to the remote user whether they connect to a real machine or to a Docker container → no changes required to mainframer.sh itself.

Docker containers should be kept simple: they should do one job and they should do it well. Running another process in the background is not the way you should use a container. You've already got an SSH server on the host, so you should use that instead.

There are also benefits to having a Docker container spin up and shut down only when it is needed. You don't stress the server constantly and can use the server for other purposes while it's not compiling one or more Android projects.

Gradle's documentation also recommends not using the daemon on build servers to keep everything running reliably and isolated. Docker is used to keep everything isolated.

I think we should provide two ways of using Gradle remotely (a rough sketch of both follows the list):

  • Use the approach from the POC: spin up a container with Gradle inside it and compile the project without a Gradle daemon.

    The benefits here are that it doesn't stress the system all the time and that this is the most isolated solution.

  • Take advantage of the Gradle daemon: have one Docker container running with a Gradle daemon in the foreground. Then spin up containers that use that Gradle daemon and let them shut down again.

    This would stress the system more, but it would definitely lead to faster builds. It does however break the isolation that Docker offers.

    I also wonder how Gradle handles multiple projects at the same time and how we could accomplish this.
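
Roughly, the two modes could look like this (the image and volume names are made up for illustration, and the second variant is simplified to exec-ing into one long-lived container rather than sharing a daemon across containers):

# Option 1 (the POC approach): an ephemeral container per build, no Gradle daemon.
docker run --rm \
 -v "$PWD":/project -w /project \
 -v gradle-cache:/root/.gradle \
 mainframer/android ./gradlew --no-daemon assembleDebug

# Option 2: one long-lived container that keeps the Gradle daemon warm;
# every build is just an exec into it.
docker run -d --name gradle-daemon \
 -v "$PWD":/project -w /project \
 -v gradle-cache:/root/.gradle \
 mainframer/android sleep infinity
docker exec gradle-daemon ./gradlew assembleDebug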

We should benchmark the current Android sample against my POC and compare the results: how fast the builds are, how much they stress the RAM and CPU, and perhaps the temperature.

The sample is Android-related while Mainframer is a general-purpose tool; it would be much nicer to include a link to a more generic sample (with an Android sample inside if you wish).

I agree; that is why it's a proof of concept. The general idea is there and it works. It only needs to be made broader and to work for more platforms. I personally needed a Docker version of mainframer for Android, which is why I made this.

The build and run stages should be combined to achieve maximum automation in big teams

This is what the POC could offer compared to the current Android sample. I'm unsure whether the current Android sample would work when there are multiple builds running at the same time, or whether it supports that in the first place. The POC could more easily be made to run multiple builds at the same time. Off the top of my head, you'd have to remove the name specification when starting the container and find a way to keep the .gradle, .android, and other volumes separate between each project/build.
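
Something like this could keep the caches separate per project while still allowing parallel builds (image and volume names are hypothetical):

# Derive a per-project prefix, e.g. from the project directory name, so parallel
# builds of different projects don't clobber each other's caches.
PROJECT=$(basename "$PWD")

# No --name is given, so several of these containers can run at the same time.
docker run --rm \
 -v "$PWD":/project -w /project \
 -v "${PROJECT}-gradle":/root/.gradle \
 -v "${PROJECT}-android":/root/.android \
 mainframer/android ./gradlew assembleDebug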

(imagine each developer having to run the build and, moreover, understand when to do it; 99% of Android devs don't know what Docker is or how to work with it)

This is why I wanted to make this POC. It's a lot easier to install Docker by following the tutorial on their website than it is to set up a JDK and the Android SDK and keep the support libraries up to date, not to mention running it headlessly with the licenses and keeping it automatically up to date.

Even better would be to simply download the image from some registry and then run it (the image can be published as part of the project's CI).

I don't see what this has got to do with the Docker implementation. Why would you have Docker images of Android projects when you can save APKs? And what does it have to do with simply running Gradle tasks remotely?

Correct me if I'm wrong, but it looks like each run will have to download Gradle itself and the project dependencies; this can take minutes, while the main purpose of Mainframer is to minimize build times…

It only has to do this on the first run like you would locally. It keeps all of the files in a persistent volume.

AndreasBackx avatar May 06 '17 18:05 AndreasBackx

I would like to apologise if my comments made you angry or sad; that is not the intention. I'm basically just describing my vision of Dockerizing the remote machine, and I wanted to do this anyway, so this issue is now a good resource for sharing the knowledge :)

Docker containers should be kept simple: they should do one job and they should do it well. Running another process in the background is not the way you should use a container. You've already got an SSH server on the host, so you should use that instead.

True, true. We do so for microservices/CI, but this is a pretty different use case!

There are also benefits to having a Docker container spin up and shut down only when it is needed. You don't stress the server constantly and can use the server for other purposes while it's not compiling one or more Android projects.

I would agree, but: for instance, a build without an already started Gradle daemon takes 1 minute and 8 seconds, while with an already running daemon it only takes 15-18 seconds. 4x slower. 4x. Please note that I did not clean any files between builds; the source code change was adding and removing a println() in some file.

Gradle's documentation also recommends not using the daemon on build servers to keep everything running reliably and isolated.

That is true for CI build servers! During development you want to use as many caches as possible to speed things up: build files, the Gradle daemon, etc.

Docker is used to keep everything isolated.

This is a very good goal to achieve, but isolation does not mean that you can't keep containers running!

I'm not against Dockerizing the remote machine, I actually really want it, but build performance is very important for us, so we can't add Docker at the cost of more than 5-10% of performance.

Take advantage of the Gradle daemon: have one Docker container running with a Gradle daemon in the foreground. Then spin up containers that use that Gradle daemon and let them shut down again.

Passing a build daemon between containers is very build-system dependent, and I'm not actually sure that Gradle is good at building multiple projects in parallel. We run a Gradle daemon per user, and one user usually builds one project at a time, while multiple users can build their projects in parallel.

This would stress the system more, but it would definitely lead to faster builds. It does however break the isolation that Docker offers.

You make a point about stressing the system, but I don't see much of a problem with that :) We have been running a separate machine for remote builds since December 2016; currently it serves about 15 users per day, each of them running a Gradle daemon, and when idle the machine uses about 30 GB of RAM and about 0-1% of an i7 6700K CPU.

It's not stress when the main purpose of the machine is to serve new requests as fast as possible; it's expected that more RAM may be required in the idle state, but that is a normal trade-off.

We should benchmark the current Android sample against my POC and compare the results: how fast the builds are

As I said before, without the Gradle daemon our setup gives about 4x slower builds, and this does not even include the Kotlin compiler daemon and other processes required for the build. Each time, you restart not only the Gradle daemon but also the JIT in its JVMs, and the longer a JVM is running, the faster it responds to new requests.

how much they stress the RAM and CPU, and perhaps the temperature.

Again, as pointed out before, temperature or CPU load is not an issue, since Gradle and most other build systems in daemon mode are not CPU-intensive while idle; only RAM is consumed.

It's a lot easier to install Docker by following the tutorial on their website than it is to set up a JDK and the Android SDK and keep the support libraries up to date, not to mention running it headlessly with the licenses and keeping it automatically up to date.

I agree that building a simple Docker container with everything in it is not that hard, but keeping in mind the performance goals we have, it becomes much harder. Keeping the machine up to date is not that hard; the Android Gradle plugin is now able to automatically download the required Android SDK dependencies for the build, so this problem is solved (I don't like it for CI, but for development purposes it's good). We have a pretty tiny bash script that sets up new users on the remote machine, and part of it is in this repo's recipes.

I don't see what this has got to do with the Docker implementation. Why would you have Docker images of Android projects when you can save APKs? And what does it have to do with simply running Gradle tasks remotely?

I was talking about Docker images: in big teams you would probably want to work with prebuilt images rather than ask each developer to understand how, why, and when to build a new one.

It only has to do this on the first run like you would locally. It keeps all of the files in a persistent volume.

Ah, I see, but the current version of the POC does not support multiple users properly without separate native users on the machine; that is not the goal of the POC at the moment though, so ok :)

artem-zinnatullin avatar May 06 '17 19:05 artem-zinnatullin

I would like to apologise if my comments made you angry or sad; that is not the intention. I'm basically just describing my vision of Dockerizing the remote machine, and I wanted to do this anyway, so this issue is now a good resource for sharing the knowledge :)

No need, I just write a lot more than other people and might come off as sharp, but no hard feelings 😉 Thank you for the feedback, I really do appreciate it, and apologies if I seemed angry or sad; quite the opposite! I agree that this is a good way of sharing knowledge and opinions.

I would agree, but: for instance, a build without an already started Gradle daemon takes 1 minute and 8 seconds, while with an already running daemon it only takes 15-18 seconds. 4x slower. 4x. Please note that I did not clean any files between builds; the source code change was adding and removing a println() in some file.

Good to know, I didn't know the difference was that massive. I was already happy just to get the load off my MacBook. In that case I'd say getting a daemon running is essential.

This is a very good goal to achieve, but isolation does not mean that you can't keep containers running!

I'm not saying that you cannot keep them running. With my proposal of having one Docker container run the daemon and spinning up other containers to use that daemon, I wanted to separate the concerns of each container.

I'm not against Dockerizing the remote machine, I actually really want it, but build performance is very important for us, so we can't add Docker at the cost of more than 5-10% of performance.

I completely agree. My idea for the Docker integration is that it should be completely hands-off. The one problem I see atm is: what if a Gradle daemon is kept alive while that employee is on vacation, for example? Currently we'd have to shut that daemon, or the container running it, down manually. A good idea would be some "middleware" that detects when a container hasn't been used for a while and, if so, shuts it down (a rough sketch of such a watchdog is further below). This is how I would like to see the Docker integration being used:

  • Employee A wants to start using mainframer for remote builds.
  • He simply changes some settings in his run configurations and makes sure he has access to the remote server.
  • When a build starts for Employee A, the server detects that Employee A does not have a container running and spins one up. This container keeps running with a Gradle daemon.
  • He goes home at the end of the day and isn't going to build anything anymore.
  • After 3 hours, the container detects that it hasn't built anything in 3 hours and decides to shut down in order to free up some memory. When exactly this happens can of course be configured.

Now no one at the company has to 1) worry about setting up a Gradle daemon on the remote server or 2) worry about managing that daemon (not even the sysadmins). What do you think?
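
A minimal sketch of that watchdog, assuming it runs from cron on the remote machine, that build containers follow a naming convention, and that each build touches a marker file inside its container (all three are assumptions on my side):

#!/bin/bash
# Stop build containers that haven't built anything for 3 hours.
# The "mainframer-" name prefix, the /tmp/last-build marker and the 3-hour
# limit are placeholders for whatever convention we'd settle on.
IDLE_SECONDS=$((3 * 60 * 60))
NOW=$(date +%s)

for container in $(docker ps --filter "name=mainframer-" --format '{{.Names}}'); do
 # Each build would "touch /tmp/last-build" inside its container.
 last=$(docker exec "$container" stat -c %Y /tmp/last-build 2>/dev/null || echo 0)
 if [ $((NOW - last)) -gt "$IDLE_SECONDS" ]; then
  docker stop "$container"
 fi
done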

Passing a build daemon between containers is very build-system dependent, and I'm not actually sure that Gradle is good at building multiple projects in parallel. We run a Gradle daemon per user, and one user usually builds one project at a time, while multiple users can build their projects in parallel.

I'm not sure either; it was the optimal approach in my mind because it would reduce overhead. It's like having a backend app running in a microservice/container infrastructure: web server containers run independently from the application so that requests can be load balanced and the number of containers adjusted based on load. We'd just have one Gradle daemon for multiple projects. But again, this is pure speculation and I don't know whether Gradle even supports this.

You make a point about stressing the system, but I don't see much of a problem with that :) We have been running a separate machine for remote builds since December 2016; currently it serves about 15 users per day, each of them running a Gradle daemon, and when idle the machine uses about 30 GB of RAM and about 0-1% of an i7 6700K CPU.

That's interesting, I expected it to demand more after running it on my MacBook. I'm currently using my old PC, which has an i7-2600, as a home server. I'm going to switch back to Arch Linux when I have some spare time to move everything over, and that system is running a Ryzen 7 1700, which should be a beast with Gradle. I might set it up as a Docker server in the meantime.

I agree that building a simple Docker container with everything in it is not that hard, but keeping in mind the performance goals we have, it becomes much harder. Keeping the machine up to date is not that hard; the Android Gradle plugin is now able to automatically download the required Android SDK dependencies for the build, so this problem is solved (I don't like it for CI, but for development purposes it's good).

I was saying that it's harder than it should be. Currently the only approach I saw was downloading and installing everything based on the versions used:

RUN wget -nv --output-document=android-sdk.zip https://dl.google.com/android/repository/tools_r${ANDROID_SDK_TOOLS}-linux.zip && \
 unzip -qo android-sdk.zip -d android-sdk-linux && \
 rm android-sdk.zip && \
 mkdir -p ~/.gradle && \
 echo "org.gradle.daemon=false" >> ~/.gradle/gradle.properties && \
 echo y | ${ANDROID_HOME}/tools/android --silent update sdk --no-ui --all --filter android-${ANDROID_TARGET_SDK} && \
 echo y | ${ANDROID_HOME}/tools/android --silent update sdk --no-ui --all --filter platform-tools && \
 echo y | ${ANDROID_HOME}/tools/android --silent update sdk --no-ui --all --filter build-tools-${ANDROID_BUILD_TOOLS} && \
 mkdir -p ${ANDROID_HOME}/licenses/ && \
 echo "8933bad161af4178b1185d1a37fbf41ea5269c55" > ${ANDROID_HOME}/licenses/android-sdk-license && \
 echo "84831b9409646a918e30573bab4c9c91346d8abd" > ${ANDROID_HOME}/licenses/android-sdk-preview-license

Specifically, the

wget -nv --output-document=android-sdk.zip https://dl.google.com/android/repository/tools_r${ANDROID_SDK_TOOLS}-linux.zip

is not very friendly. I left a note in my README explaining this problem:

Note: Because of how the Dockerfile gets the Android SDK Tools version, version 25.2.5 seems to be the latest available version when using https://dl.google.com/android/repository/tools_r${ANDROID_SDK_TOOLS}-linux.zip. This needs to be resolved.

Versions newer than 25.2.5 use hashes instead of version numbers to identify the downloads. We'd have to write some software to get them from the RSS feed, or keep a list of them statically somewhere that we'd have to update constantly.
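
For the first option, something as small as this might be enough: pull the SDK repository XML and list the Linux tools archives it advertises. Both the URL and the file-name pattern are assumptions on my side and would need to be checked against the actual feed:

# Rough sketch: list the SDK tools archives currently advertised for Linux.
# The repository URL and the sdk-tools-linux-<build>.zip naming are assumptions.
curl -s https://dl.google.com/android/repository/repository2-1.xml \
 | grep -oE 'sdk-tools-linux-[0-9]+\.zip' \
 | sort -u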

How do you handle versions of the build tools, SDK, and SDK tools?

We have a pretty tiny bash script that sets up new users on the remote machine, and part of it is in this repo's recipes.

I'm not a fan of using bash scripts for setting up users. They can be very hard to clean up, whereas containers are very easy to clean up. I prefer keeping my server as "clean" as possible, which is why I made this POC.

I was talking about Docker images: in big teams you would probably want to work with prebuilt images rather than ask each developer to understand how, why, and when to build a new one.

Why should the developers need to handle anything with Docker? The whole process should be invisible to them; they'd only have to work with the stuff they already know. This is why you only need to supply the versions of the build tools, SDK, and SDK tools to the POC Dockerfile. That is something I'm really trying to achieve in the end.

Ah, I see, but the current version of the POC does not support multiple users properly without separate native users on the machine; that is not the goal of the POC at the moment though, so ok :)

Yes, it should be able to work easily after a few changes.


Gradle builds might benefit from keeping a container running. Other toolchains, like Go, do not, and I think we need to think of those as well. There needs to be a clearly defined "flow" for each type of project: Gradle projects use a running container, and Go perhaps uses multi-stage containers for building and perhaps running. Multi-stage builds are new in Docker 17.05. When we're adding Docker support to mainframer, I think it should be easy to define such a flow when adding support for other project types.
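
For example, a Go project could use something like this (a sketch with made-up module paths and binary names), where the final image contains only the compiled binary:

# Stage 1: build a static binary (multi-stage builds need Docker 17.05+).
FROM golang:1.8 AS builder
WORKDIR /go/src/example.com/myapp
COPY . .
RUN CGO_ENABLED=0 go build -o /myapp .

# Stage 2: copy only the result into a tiny runtime image.
FROM alpine:3.5
COPY --from=builder /myapp /usr/local/bin/myapp
CMD ["myapp"]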

AndreasBackx avatar May 06 '17 21:05 AndreasBackx