
Building is very slow because of Python 2 and the `ulimit` on the number of open files

Open Reylak opened this issue 2 years ago • 2 comments

Environment details:

(all that matters is that I run on ArchLinux)

  • host: ArchLinux (2023-04-20), Linux 6.2.11
  • Gradlew:
------------------------------------------------------------
Gradle 6.9.1
------------------------------------------------------------

Build time:   2021-08-20 11:15:18 UTC
Revision:     f0ddb54aaae0e44f0a7209c3c0274d506ea742a0

Kotlin:       1.4.20
Groovy:       2.5.12
Ant:          Apache Ant(TM) version 1.10.9 compiled on September 27 2020
JVM:          1.8.0_362 (Oracle Corporation 25.362-b09)
OS:           Linux 6.2.11-arch1-1 amd64

Steps to reproduce the issue:

  1. clone OpenWhisk source repository
  2. ./gradlew distDocker

Provide the expected results and outputs:

OpenWhisk should build in reasonable time (a matter of minutes, less than 3min on my computer).

Provide the actual results and outputs:

OpenWhisk is very slow to build. The slow phase is building the ow-utils image, in particular installing Python 2 packages with native modules, which also have to be compiled (at least 30min and counting; I stopped the build at that point).

Proposed fix

Move to Python 3-only base Docker images.

Workaround: set a ulimit directive on Docker build commands in "gradle/docker.gradle" to lower the limit on the number of open files (nofile). For example:

dockerBuildArg = ['build', '--ulimit', 'nofile=4096:4096']

Additional information you deem important:

The underlying problem has been identified in an issue about another topic in openwhisk-deploy-kube: Python 2 does not cope well with a very high limit on the number of open FDs when subprocess.Popen is asked to close them (see Python bug report). What this has to do with building OpenWhisk is that the Docker image for ow-utils (and possibly others) uses Python 2 and installs Python packages with native modules; these must be compiled on the fly, which incurs a lot of calls to subprocess.Popen (just a guess here) that trigger this behavior.
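A rough way to see why the limit matters: when Python 2's subprocess.Popen is asked to close FDs, it sweeps over every possible descriptor number up to the limit, so the cost of each call grows with the nofile value. The sketch below is in Python 3 and only simulates that sweep; the helper names `time_failed_close` and `estimate_sweep_seconds` and the sample limits are made up for illustration, and the timing includes interpreter overhead, so treat the numbers as an order of magnitude at best.

```python
import os
import time


def time_failed_close(samples=10_000):
    """Return the average seconds spent on one close() of an unused descriptor."""
    # Open and close /dev/null to get a descriptor number known to be unused.
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)
    start = time.perf_counter()
    for _ in range(samples):
        try:
            os.close(fd)  # fails with EBADF, like most closes in a full sweep
        except OSError:
            pass
    return (time.perf_counter() - start) / samples


def estimate_sweep_seconds(nofile_limit, per_close_seconds):
    """Estimate one sweep over descriptors 3..nofile_limit-1 (0, 1, 2 are kept)."""
    return max(nofile_limit - 3, 0) * per_close_seconds


if __name__ == "__main__":
    per_close = time_failed_close()
    # Illustrative limits only; check the real value inside your container.
    for limit in (4096, 1_048_576, 1_000_000_000):
        seconds = estimate_sweep_seconds(limit, per_close)
        print(f"nofile={limit}: roughly {seconds:.3f} s per Popen call")
```

Multiplying the per-call estimate by the number of subprocesses a native build spawns gives a feel for why lowering nofile to 4096 makes the slowdown disappear.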

It is very visible on ArchLinux because the default ulimit on the number of open files in a Docker container is very high on this distribution (I am not sure why it is not on others; I guess they set a lower default value system-wide).
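To check whether a given host or container is affected, the limit can be read from inside it. This is a minimal sketch using Python 3's standard `resource` module; the 65536 threshold in the hypothetical helper `looks_problematic` is an arbitrary assumption for illustration, not a value taken from OpenWhisk or Docker.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open files for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def looks_problematic(soft_limit, threshold=65536):
    """Flag soft limits high enough to make a full descriptor sweep expensive.

    The threshold is an assumed cut-off chosen for this example only.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > threshold


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"nofile soft={soft} hard={hard}")
    if looks_problematic(soft):
        print("Soft limit is very high; the Python 2 slowdown is likely here.")
```

Running it once on the host and once inside a container (with and without the `--ulimit` workaround) shows which default each environment actually gets.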

Honestly, I think an effort should be made to migrate away from Python 2 for good in every OpenWhisk component.

Reylak, Apr 20 '23 11:04

I agree that we should just be using Python 3 everywhere. I don't recall exactly why ow-utils is using Python 2, but it's well past the Python 2 EOL date of 1/1/2020.

dgrove-oss, Apr 20 '23 21:04

The same applies to CouchDB (see the issue at openwhisk-deploy-kube). But in this case, it has more to do with using an old version of CouchDB, and I cannot tell if using a more recent version would break things for OpenWhisk.

For the record, the issue can be fixed in an Ansible deployment by adding a ulimit to the deploy task of CouchDB:

- name: "(re)start CouchDB from '{{ couchdb_image }} ' "
  vars:
    couchdb_image: "{{ couchdb.docker_image | default('apache/couchdb:' ~ couchdb.version ) }}"
    # [...]
    ulimits:
      - "nofile:4096:4096"
    pull: "{{ couchdb.pull_couchdb | default(true) }}"

Reylak, Apr 21 '23 08:04