Develop a list of software licenses used by Kubeflow and its dependencies

Open jbottum opened this issue 3 years ago • 11 comments

We need an inventory of the software license(s) used by Kubeflow and its dependencies.

Kubeflow uses the Apache License 2.0, per this page: https://github.com/kubeflow/kubeflow/blob/master/LICENSE.

Kubeflow has many dependencies. We might use LicenseFinder or DepChecker to check the dependencies and their licenses.

jbottum avatar Jan 23 '23 04:01 jbottum

Tidlift is another potential tool for reviewing licenses and dependencies.

jbottum avatar Jan 24 '23 17:01 jbottum

We may need to define what Kubeflow is and what it includes. Other: potential dependency issues with Grafana, Minio, Prometheus.

@zijianjoy @akgraner @james-jwu do you have any comments on this? Is it required?

jbottum avatar Jan 24 '23 17:01 jbottum

I got some feedback from Bob. We will likely need a list of licenses like:

Regarding license discovery, my understanding is that we need to scan all software and its dependencies within the container images for Kubeflow. KFP does this regularly, and we can perhaps share some process / experience here. @zijianjoy Is this something you can provide guidance on?

james-jwu avatar Jan 25 '23 00:01 james-jwu

To share the KFP license scan process:

  • Frontend: We use license-checker and run this script: https://github.com/kubeflow/pipelines/blob/d60bc99bb61a9f23fce8eabd78634236e45a5dc1/frontend/gen_licenses.js.
  • Backend: We use the go-license tool: https://github.com/kubeflow/pipelines/tree/d60bc99bb61a9f23fce8eabd78634236e45a5dc1/backend#updating-licenses-info
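
Once a scan has produced a per-dependency report, the license inventory this issue asks for could be summarized roughly as below. This is only a sketch: it assumes the report is a CSV with three comma-separated columns (package, license URL, license name), which is an assumption and not something this thread confirms, and summarize_licenses is a hypothetical helper name.

```shell
# Count how often each license name appears in a CSV read from stdin.
# ASSUMPTION: each line is "package,license_url,license_name"; adjust the
# column number if the report from your scanner is laid out differently.
summarize_licenses() {
  awk -F, 'NF >= 3 && $3 != "" { count[$3]++ }
           END { for (name in count) printf "%s\t%d\n", name, count[name] }' | sort
}
```

Reading from stdin keeps the helper independent of any particular scanner, so reports from the frontend and backend scans can be concatenated and piped in together.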

I suggest starting with the Kubeflow WG component images and then potentially extending the license scan to their dependencies: https://github.com/kubeflow/manifests#kubeflow-components-versions. If this requirement comes from CNCF, I am guessing dependencies like istio and knative should already be license compliant, because they are CNCF projects.

zijianjoy avatar Feb 08 '23 20:02 zijianjoy

@DomFleischmann Have you been able to create an inventory of the Kubeflow images? Does our list need to include the location where those images are stored?

jbottum avatar Feb 08 '23 22:02 jbottum

@akgraner @juliusvonkohout does either of you have a list of Kubeflow images with their locations?

jbottum avatar Feb 16 '23 22:02 jbottum

@jbottum I provided the list multiple times in the security meeting, although busybox and some Workbench images are missing. Dominik then uploaded it here: https://pastebin.ubuntu.com/p/4nMrk4SXjm/ The tags are missing, since they will change for 1.7 anyway.

juliusvonkohout avatar Feb 17 '23 10:02 juliusvonkohout

With a security repository I would have a place to store them, and people could add missing entries or merge their own lists...
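
Merging several contributors' image lists could look roughly like this. It is a minimal sketch, assuming each list is a plain-text file with one image reference per line; merge_image_lists is a hypothetical helper name, not an existing tool.

```shell
# Merge any number of image-list files into one sorted, de-duplicated list.
# ASSUMPTION: every input file holds one image reference per line.
merge_image_lists() {
  # Strip trailing whitespace (including stray carriage returns) and drop
  # blank lines so that near-identical entries collapse into one.
  sed 's/[[:space:]]*$//' "$@" | sed '/^$/d' | sort -u
}
```

For example, merge_image_lists list_a list_b > merged_images would write the combined list to a hypothetical merged_images file.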

juliusvonkohout avatar Feb 17 '23 10:02 juliusvonkohout

@juliusvonkohout @DomFleischmann thanks! great work.

jbottum avatar Feb 17 '23 17:02 jbottum

Hi all, here is the list of images I got (for the manifests v1.7-branch): kf_1.7.0_images

Below is the script used to generate the image list. The script must be placed in the root directory of the manifests repo in order to run.

#!/usr/bin/env bash
VERSION=1.7.0
output_file="kf_${VERSION}_images"

# Remove any leftover 'tmp' file; -f makes this succeed even if it does not exist
rm -f tmp
# Iterate over all files named 'kustomization.yaml', 'kustomization.yml' or 'Kustomization' found recursively under ./apps and ./common
for F in $(find ./apps ./common \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \)); do
  # Directory that contains the kustomization file
  dir=$(dirname -- "$F")
  # Render the k8s resources for 'dir' with 'kustomize build'.
  # If the command fails, log the problematic directory and move on.
  kbuild=$(kustomize build "$dir")
  return_code=$?
  if [ $return_code -ne 0 ]; then
    printf 'ERROR:\t Failed "kustomize build" command for directory: %s. See error above\n' "$dir"
    continue
  fi
  # Keep the 'image:' and '- image:' lines of the 'kustomize build' output,
  # then strip the 'image:' / '- image:' prefix and any leading spaces and tabs.
  # Lastly, delete all empty lines and lines containing a '{' character.
  # Append the result to the 'tmp' file.
  grep -E -- '-?\simage:' <<<"$kbuild" | sed -re 's/\s-?\simage: *//;s/^[ \t]*//g' | sed '/^$/d;/{/d' >> tmp
done

# Sort the content of 'tmp', keep the unique records and write them to 'output_file'
sort tmp | uniq > "$output_file"
# Remove the 'tmp' file
rm -f tmp

echo "File ${output_file} successfully created"

Disclaimer: the kustomize build command fails for the following directories:

  • ./apps/kfp-tekton/upstream/env/cert-manager/platform-agnostic-multi-user
  • ./apps/kfp-tekton/upstream/env/cert-manager/dev
  • ./apps/kfp-tekton/upstream/env/platform-agnostic-multi-user-emissary
  • ./apps/kfp-tekton/upstream/env/dev
  • ./apps/kfp-tekton/upstream/env/aws
  • ./apps/kfp-tekton/upstream/env/platform-agnostic-emissary
  • ./apps/katib/upstream/installs/katib-standalone-postgres
  • ./apps/katib/upstream/installs/katib-leader-election
  • ./common/dex/overlays/ldap
  • ./common/dex/overlays/github
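
A list of failing directories like the one above could be regenerated with something along these lines. This is a sketch under assumptions: list_failed_builds and the BUILD_CMD variable are hypothetical names, and the only real command it relies on is the kustomize build call already used in the script.

```shell
# Command used to render a directory. Defaults to 'kustomize build' as in the
# script; BUILD_CMD is a hypothetical override, mainly useful for dry runs.
BUILD_CMD=${BUILD_CMD:-kustomize build}

# Print every directory argument for which the build command exits non-zero.
list_failed_builds() {
  local dir
  for dir in "$@"; do
    # $BUILD_CMD is left unquoted on purpose so 'kustomize build' splits
    # into a command and its argument.
    if ! $BUILD_CMD "$dir" >/dev/null 2>&1; then
      printf '%s\n' "$dir"
    fi
  done
}
```

Discarding the build output keeps the result to bare directory names, so it can be pasted into a comment or diffed between releases.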

@annajung pointed out that there are example images that probably also need to be considered.

difince avatar Mar 08 '23 12:03 difince

As discussed in Slack on March 15, the script above should also cover the manifests ./example folder. So this line:

for F in $(find ./apps ./common \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \)); do

needs to become:

for F in $(find ./apps ./common ./example \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \)); do
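
Since the set of folders to scan has already changed once, the search roots could be passed in as arguments instead of being hard-coded. The following is a minimal sketch of that idea; kustomization_dirs is a hypothetical helper name, and the file names it looks for are the three used in the script.

```shell
# Print the directories that contain a kustomization file under the given
# roots, one per line. Roots that do not exist are skipped.
kustomization_dirs() {
  local root
  for root in "$@"; do
    [ -d "$root" ] || continue
    find "$root" -type f \
      \( -name kustomization.yaml -o -name kustomization.yml -o -name Kustomization \) \
      -exec dirname {} \;
  done | sort -u
}
```

Calling kustomization_dirs ./apps ./common ./example from the root of the manifests repo would then cover the same folders as the updated line, and reading its output line by line also tolerates directory names that contain spaces.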

difince avatar Mar 15 '23 16:03 difince