ort icon indicating copy to clipboard operation
ort copied to clipboard

Update Dockerfile to use multi-stage builds

Open heliocastro opened this issue 3 years ago • 8 comments

  • Uses multi-stage build to reduce final container
  • Harmonize the base image to use same OpenJDK base from Eclipse Temurin and Ubuntu Focal LTS
  • Make all tooling versions customizable through docker build args
  • Each tooling is built in separate components, improving the process to add or remove specific tools

heliocastro avatar Nov 24 '21 08:11 heliocastro

Is there a way to unresolve a conversation, I think https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r757104461 (and https://github.com/heliocastro/ort/pull/18) is worth to think about

maxhbr avatar Nov 25 '21 19:11 maxhbr

Well, to solve this the easy way is to compile python as well.

I did that with CLANG on the docker_overhaul, but yes, is reasonable to update scancode.

On Thu, Jan 6, 2022 at 10:25 AM Maximilian Huber @.***> wrote:

@.**** commented on this pull request.

In docker/legacy/Dockerfile https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r779408845 :

+COPY . /usr/local/src/ort + +WORKDIR /usr/local/src/ort + +# Gradle build. +ARG ORT_VERSION +RUN --mount=type=cache,target=/tmp/.gradle/ \

  • GRADLE_USER_HOME=/tmp/.gradle/ && \
  • scripts/import_proxy_certs.sh && \
  • scripts/set_gradle_proxy.sh && \
  • sed -i -r 's,(^distributionUrl=)(.+)-all.zip$,\1\2-bin.zip,' gradle/wrapper/gradle-wrapper.properties && \
  • sed -i -r '/distributionSha256Sum=[0-9a-f]{64}/d' gradle/wrapper/gradle-wrapper.properties && \
  • ./gradlew --no-daemon --stacktrace -Pversion=$ORT_VERSION :cli:distTar :helper-cli:startScripts

+FROM adoptopenjdk:11-jre-hotspot-bionic

You should follow that, as soon as it is merged on master. But this also conflicts with the old Scancode version requiring python 3.6

— Reply to this email directly, view it on GitHub https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r779408845, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADGDAH562S22LITLVKIDPLUUVNZVANCNFSM5IVMVK5A . You are receiving this because you authored the thread.Message ID: @.***>

heliocastro avatar Jan 06 '22 12:01 heliocastro

Up to date with upstream Docker changes, including new Java 17

heliocastro avatar Jan 12 '22 15:01 heliocastro

Up to date with upstream Docker changes, including new Java 17

I just had to revert that 😢 See https://github.com/oss-review-toolkit/ort/pull/4948.

sschuberth avatar Jan 12 '22 16:01 sschuberth

Taking some notes from today's meeting with @heliocastro and @tsteenbe:

  • Uses multi-stage build to reduce final container

Now different package managers are installed in different layers of "helper images", for which the final images just copies required files. This optimizes for incremental build time, but it also reduces the image size, as only required binaries are included, and not the files to build the binaries.

  • Harmonize the base image to use same OpenJDK base from Eclipse Temurin and Ubuntu Focal LTS

This is good, but just as a remark, meanwhile also available in the "legacy" Dockerfile.

  • Make all tooling versions customizable through docker build args

Note to myself: Tool versions were customizable before, but only via environment variables, but via build arguments.

  • Each tooling is built in separate components, improving the process to add or remove specific tools

Also see explanations for the first bullet item.

sschuberth avatar Feb 22 '22 13:02 sschuberth

Removed. You are correct

On Sun, Feb 27, 2022 at 6:24 PM Maximilian Huber @.***> wrote:

@.**** commented on this pull request.

In docker/bash_prompt https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r815468412 :

@@ -0,0 +1,99 @@ +#!/bin/bash

Under which license is that file?

I am thinking that this is going to far. convenience for interactive usage should be in another image that is added on top and not included by default.

— Reply to this email directly, view it on GitHub https://github.com/oss-review-toolkit/ort/pull/4746#pullrequestreview-894527479, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADGDAFBHJND3WQVT5MS33LU5JM4VANCNFSM5IVMVK5A . You are receiving this because you were mentioned.Message ID: @.***>

heliocastro avatar Feb 28 '22 07:02 heliocastro

@fviernau @heliocastro @tsteenbe had a meeting on July 15th to discuss getting this PR merged.

@fviernau @tsteenbe agreed this can be merged once below action items 4,6 and 7 are done

Action items:

  1. Fill issue to replace virtualenv on Python analysis to use pyenv (@heliocastro)
  2. Fill issue to drop Ruby 2 in ORT (@heliocastro)
  3. Fill issue to ORT docker image should also work on ARM (@tsteenbe)
  4. Fix functional test failure "Could not build image for container run.", see https://github.com/oss-review-toolkit/ort/runs/7354719297?check_suite_focus=true How to debug this issue? (@heliocastro/ @sschuberth)
  5. Nominate @heliocastro to become an ORT contributor (@tsteenbe)
  6. Update ort-gitlab-ci to enable running with legacy Docker image (@tsteenbe)
  7. Test multi-stage Docker image with a variety of projects using different package managers (@fviernau / @tsteenbe)
  8. Once 4 has been addressed, make announcement that PR will be merged by date X (1 week period) and that users dependend on legacy image need to take mitigation measures. (@tsteenbe)

tsteenbe avatar Jul 15 '22 10:07 tsteenbe

Some work on getting sources:

Android copyright information is present in the following files in the resulting container:

/opt/android-sdk/cmdline-tools/NOTICE.txt
/opt/android-sdk/platform-tools/NOTICE.txt

The related sources are probably available on https://android.googlesource.com/ but to me it is unclear what source repositories relate to the built binaries.

Temurin JDK source is available via this GitHub API oneliner. Notice that I had to convert the git tag jdk-11.0.16.1+1 to the jdk-11.0.16.1*1 regex to match. I don't know why jq can't deal with a + character to match.

curl -s https://api.github.com/repos/adoptium/temurin11-binaries/releases | jq -r '.[] | select(.tag_name | test("jdk-11.0.16.1*1")) | .assets[] | select(.name | test("sources.*.tar.gz$")) | .browser_download_url'

Some ideas for improvement/simplification:

  • Request the assets by tag name: https://docs.github.com/en/rest/releases/releases#get-a-release-by-tag-name
  • Skip tags, get all resources and filter the assets.name or browser-download_url properties directly.

nicorikken avatar Aug 25 '22 11:08 nicorikken

Codecov Report

Merging #4746 (78f3e89) into main (2d7b039) will decrease coverage by 0.00%. The diff coverage is n/a.

:exclamation: Current head 78f3e89 differs from pull request most recent head 4214c2f. Consider uploading reports for the commit 4214c2f to get more accurate results

@@             Coverage Diff              @@
##               main    #4746      +/-   ##
============================================
- Coverage     57.72%   57.71%   -0.01%     
+ Complexity     2210     2208       -2     
============================================
  Files           321      321              
  Lines         18954    18942      -12     
  Branches       3678     3677       -1     
============================================
- Hits          10941    10933       -8     
+ Misses         6867     6864       -3     
+ Partials       1146     1145       -1     
Flag Coverage Δ
test 27.57% <ø> (-0.03%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ain/kotlin/commands/packageconfig/CreateCommand.kt 0.00% <0.00%> (ø)
.../curation/ClearlyDefinedPackageCurationProvider.kt 48.80% <0.00%> (+2.38%) :arrow_up:
...-cli/src/main/kotlin/utils/PathExcludeGenerator.kt 100.00% <0.00%> (+3.03%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

codecov[bot] avatar Sep 05 '22 13:09 codecov[bot]

Cross-posting from Slack: Please don't merge this with a whopping 43 (!) commits in the history. I guess most of this should be squashable, or?

sschuberth avatar Sep 08 '22 14:09 sschuberth

@mnonnenmacher @MarcelBochtler @fviernau could you live with the Git history in this PR?

sschuberth avatar Sep 08 '22 14:09 sschuberth

@mnonnenmacher @MarcelBochtler @fviernau could you live with the Git history in this PR?

@sschuberth I would not merge it like this, because just from a short look I see so many issues:

  • 43 commits is far too much for a single PR. This is impossible to review.
  • Many commits change multiple things at once. Just as one example, the "Revert to Java 11" commit changes the pyenv installation and there is no explanation in the commit message if this is related to the base image change or just completely unrelated.
  • There are a lot of typos and hard to understand sentences in the commit messages.
  • Several commits contain lists of changes which shows they should have been separate commits.
  • There is one commit that creates a legacy Dockerfile, but this is not the current version of the file but an old version. Instead, there is a separate commit that updates the legacy file.
  • There are version updates all over the place, it's unfeasible to check if all of these are updates related to some old version of the original Dockerfile or to the current Dockerfile. For example, the first commit downgrades ScanCode only to upgrade it again in a later commit.
  • Commit message prefixes are inconsistent, some start with "Docker fix:" or "Docker update:" or "Dockerfile:" or "docker:".
  • It's unclear why some entries were added to the .gitignore.
  • The link to the BuildKit docs was removed from the README.

And this is not the complete list.

I understand that this PR was worked on for almost a year now, and that this was probably a bit frustrating. So I want to thank @heliocastro for being to persistent with your efforts to improve our Docker setup. But nevertheless this PR should follow our standards. We have put a lot of effort into maintaining a clean Git history for this project because it brings a lot of benefits, and I would not like to ignore this now just because someone has an urge to merge these changes.

My suggestion to move this forward in a clean way would be to squash all changes into just a few commits, so that the PR can be reviewed properly, for example:

  • One commit to copy the current Dockerfile to the legacy directory.
  • A few commits that apply the relevant changes on top of the current Dockerfile, but without all the version updates that just repeat version updates which already happened in our current Dockerfile. Or even squashing all changes into a single commit would be better than the current state of the PR.
  • One commit for each version update that changes a version compared to our current Dockerfile.

@sschuberth @fviernau @MarcelBochtler Does one of you maybe have a better idea how to clean this up?

And as a general comment, if you work on a contribution that takes a long time, to a project with mandatory code review, it is super important to rebase your changes on top of the current main branch on a regular basis. Yes, this can be a lot of work, but it is the responsibility of the author to make sure that their changes are in a reviewable state.

mnonnenmacher avatar Sep 08 '22 19:09 mnonnenmacher

Does one of you maybe have a better idea how to clean this up?

Looks like a clean-up was already done in https://github.com/oss-review-toolkit/ort/pull/5760. I'll close this PR in favor of that one. Let's continue the discussion over there.

sschuberth avatar Sep 09 '22 06:09 sschuberth