ort
ort copied to clipboard
Update Dockerfile to use multi-stage builds
- Uses multi-stage build to reduce final container
- Harmonize the base image to use same OpenJDK base from Eclipse Temurin and Ubuntu Focal LTS
- Make all tooling versions customizable through docker build args
- Each tooling is built in separate components, improving the process to add or remove specific tools
Is there a way to unresolve a conversation, I think https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r757104461 (and https://github.com/heliocastro/ort/pull/18) is worth to think about
Well, to solve this the easy way is to compile python as well.
I did that with CLANG on the docker_overhaul, but yes, is reasonable to update scancode.
On Thu, Jan 6, 2022 at 10:25 AM Maximilian Huber @.***> wrote:
@.**** commented on this pull request.
In docker/legacy/Dockerfile https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r779408845 :
+COPY . /usr/local/src/ort + +WORKDIR /usr/local/src/ort + +# Gradle build. +ARG ORT_VERSION +RUN --mount=type=cache,target=/tmp/.gradle/ \
- GRADLE_USER_HOME=/tmp/.gradle/ && \
- scripts/import_proxy_certs.sh && \
- scripts/set_gradle_proxy.sh && \
- sed -i -r 's,(^distributionUrl=)(.+)-all.zip$,\1\2-bin.zip,' gradle/wrapper/gradle-wrapper.properties && \
- sed -i -r '/distributionSha256Sum=[0-9a-f]{64}/d' gradle/wrapper/gradle-wrapper.properties && \
- ./gradlew --no-daemon --stacktrace -Pversion=$ORT_VERSION :cli:distTar :helper-cli:startScripts
+FROM adoptopenjdk:11-jre-hotspot-bionic
You should follow that, as soon as it is merged on master. But this also conflicts with the old Scancode version requiring python 3.6
— Reply to this email directly, view it on GitHub https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r779408845, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADGDAH562S22LITLVKIDPLUUVNZVANCNFSM5IVMVK5A . You are receiving this because you authored the thread.Message ID: @.***>
Up to date with upstream Docker changes, including new Java 17
Up to date with upstream Docker changes, including new Java 17
I just had to revert that 😢 See https://github.com/oss-review-toolkit/ort/pull/4948.
Taking some notes from today's meeting with @heliocastro and @tsteenbe:
- Uses multi-stage build to reduce final container
Now different package managers are installed in different layers of "helper images", for which the final images just copies required files. This optimizes for incremental build time, but it also reduces the image size, as only required binaries are included, and not the files to build the binaries.
- Harmonize the base image to use same OpenJDK base from Eclipse Temurin and Ubuntu Focal LTS
This is good, but just as a remark, meanwhile also available in the "legacy" Dockerfile
.
- Make all tooling versions customizable through docker build args
Note to myself: Tool versions were customizable before, but only via environment variables, but via build arguments.
- Each tooling is built in separate components, improving the process to add or remove specific tools
Also see explanations for the first bullet item.
Removed. You are correct
On Sun, Feb 27, 2022 at 6:24 PM Maximilian Huber @.***> wrote:
@.**** commented on this pull request.
In docker/bash_prompt https://github.com/oss-review-toolkit/ort/pull/4746#discussion_r815468412 :
@@ -0,0 +1,99 @@ +#!/bin/bash
Under which license is that file?
I am thinking that this is going to far. convenience for interactive usage should be in another image that is added on top and not included by default.
— Reply to this email directly, view it on GitHub https://github.com/oss-review-toolkit/ort/pull/4746#pullrequestreview-894527479, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADGDAFBHJND3WQVT5MS33LU5JM4VANCNFSM5IVMVK5A . You are receiving this because you were mentioned.Message ID: @.***>
@fviernau @heliocastro @tsteenbe had a meeting on July 15th to discuss getting this PR merged.
@fviernau @tsteenbe agreed this can be merged once below action items 4,6 and 7 are done
Action items:
- Fill issue to replace virtualenv on Python analysis to use pyenv (@heliocastro)
- Fill issue to drop Ruby 2 in ORT (@heliocastro)
- Fill issue to ORT docker image should also work on ARM (@tsteenbe)
- Fix functional test failure "Could not build image for container run.", see https://github.com/oss-review-toolkit/ort/runs/7354719297?check_suite_focus=true How to debug this issue? (@heliocastro/ @sschuberth)
- Nominate @heliocastro to become an ORT contributor (@tsteenbe)
- Update ort-gitlab-ci to enable running with legacy Docker image (@tsteenbe)
- Test multi-stage Docker image with a variety of projects using different package managers (@fviernau / @tsteenbe)
- Once 4 has been addressed, make announcement that PR will be merged by date X (1 week period) and that users dependend on legacy image need to take mitigation measures. (@tsteenbe)
@tsteenbe Ruby issue @tsteenbe Python vitualenv replacement proposal
Some work on getting sources:
Android copyright information is present in the following files in the resulting container:
/opt/android-sdk/cmdline-tools/NOTICE.txt
/opt/android-sdk/platform-tools/NOTICE.txt
The related sources are probably available on https://android.googlesource.com/ but to me it is unclear what source repositories relate to the built binaries.
Temurin JDK source is available via this GitHub API oneliner. Notice that I had to convert the git tag jdk-11.0.16.1+1
to the jdk-11.0.16.1*1
regex to match. I don't know why jq
can't deal with a +
character to match.
curl -s https://api.github.com/repos/adoptium/temurin11-binaries/releases | jq -r '.[] | select(.tag_name | test("jdk-11.0.16.1*1")) | .assets[] | select(.name | test("sources.*.tar.gz$")) | .browser_download_url'
Some ideas for improvement/simplification:
- Request the assets by tag name: https://docs.github.com/en/rest/releases/releases#get-a-release-by-tag-name
- Skip tags, get all resources and filter the
assets.name
orbrowser-download_url
properties directly.
Codecov Report
Merging #4746 (78f3e89) into main (2d7b039) will decrease coverage by
0.00%
. The diff coverage isn/a
.
:exclamation: Current head 78f3e89 differs from pull request most recent head 4214c2f. Consider uploading reports for the commit 4214c2f to get more accurate results
@@ Coverage Diff @@
## main #4746 +/- ##
============================================
- Coverage 57.72% 57.71% -0.01%
+ Complexity 2210 2208 -2
============================================
Files 321 321
Lines 18954 18942 -12
Branches 3678 3677 -1
============================================
- Hits 10941 10933 -8
+ Misses 6867 6864 -3
+ Partials 1146 1145 -1
Flag | Coverage Δ | |
---|---|---|
test | 27.57% <ø> (-0.03%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
...ain/kotlin/commands/packageconfig/CreateCommand.kt | 0.00% <0.00%> (ø) |
|
.../curation/ClearlyDefinedPackageCurationProvider.kt | 48.80% <0.00%> (+2.38%) |
:arrow_up: |
...-cli/src/main/kotlin/utils/PathExcludeGenerator.kt | 100.00% <0.00%> (+3.03%) |
:arrow_up: |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
Cross-posting from Slack: Please don't merge this with a whopping 43 (!) commits in the history. I guess most of this should be squashable, or?
@mnonnenmacher @MarcelBochtler @fviernau could you live with the Git history in this PR?
@mnonnenmacher @MarcelBochtler @fviernau could you live with the Git history in this PR?
@sschuberth I would not merge it like this, because just from a short look I see so many issues:
- 43 commits is far too much for a single PR. This is impossible to review.
- Many commits change multiple things at once. Just as one example, the "Revert to Java 11" commit changes the pyenv installation and there is no explanation in the commit message if this is related to the base image change or just completely unrelated.
- There are a lot of typos and hard to understand sentences in the commit messages.
- Several commits contain lists of changes which shows they should have been separate commits.
- There is one commit that creates a legacy Dockerfile, but this is not the current version of the file but an old version. Instead, there is a separate commit that updates the legacy file.
- There are version updates all over the place, it's unfeasible to check if all of these are updates related to some old version of the original Dockerfile or to the current Dockerfile. For example, the first commit downgrades ScanCode only to upgrade it again in a later commit.
- Commit message prefixes are inconsistent, some start with "Docker fix:" or "Docker update:" or "Dockerfile:" or "docker:".
- It's unclear why some entries were added to the .gitignore.
- The link to the BuildKit docs was removed from the README.
And this is not the complete list.
I understand that this PR was worked on for almost a year now, and that this was probably a bit frustrating. So I want to thank @heliocastro for being to persistent with your efforts to improve our Docker setup. But nevertheless this PR should follow our standards. We have put a lot of effort into maintaining a clean Git history for this project because it brings a lot of benefits, and I would not like to ignore this now just because someone has an urge to merge these changes.
My suggestion to move this forward in a clean way would be to squash all changes into just a few commits, so that the PR can be reviewed properly, for example:
- One commit to copy the current Dockerfile to the legacy directory.
- A few commits that apply the relevant changes on top of the current Dockerfile, but without all the version updates that just repeat version updates which already happened in our current Dockerfile. Or even squashing all changes into a single commit would be better than the current state of the PR.
- One commit for each version update that changes a version compared to our current Dockerfile.
@sschuberth @fviernau @MarcelBochtler Does one of you maybe have a better idea how to clean this up?
And as a general comment, if you work on a contribution that takes a long time, to a project with mandatory code review, it is super important to rebase your changes on top of the current main branch on a regular basis. Yes, this can be a lot of work, but it is the responsibility of the author to make sure that their changes are in a reviewable state.
Does one of you maybe have a better idea how to clean this up?
Looks like a clean-up was already done in https://github.com/oss-review-toolkit/ort/pull/5760. I'll close this PR in favor of that one. Let's continue the discussion over there.