build error

Open nparkstar opened this issue 5 years ago • 6 comments

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

I got the error below while building Nauta, using a Kubernetes pod as the build system.

---------------------------------------------------------------------------------------------
fatal: [local]: FAILED! => {"ansible_job_id": "452303190776.86355", "attempts": 5, "changed": false, "finished": 1, "msg": "Error building nauta/rpm/python - code: 127, message: The command '/bin/sh -c pip install -U pip==19.0.3 virtualenv==16.0.0 setuptools==39.2.0 wheel==0.31.1' returned a non-zero code: 127, logs: ['Step 1/12 : ARG BASE_IMAGE=shared/centos/rpm-packer', '\n', 'Step 2/12 : ARG PYTHON2_PIP_RPM_IMAGE=shared/build/rpm/python2-pip', '\n', 'Step 3/12 : FROM ${PYTHON2_PIP_RPM_IMAGE} as python2_pip_rpm_image', '\n', ' ---> 03175a7225d0\n', 'Step 4/12 : FROM ${BASE_IMAGE}', '\n', ' ---> 0cf01b1e3621\n', 'Step 5/12 : ENV RPM_VERSION=2.7', '\n', ' ---> Using cache\n', ' ---> 1a58dc8d7507\n', 'Step 6/12 : ENV RPM_RELEASE=0', '\n', ' ---> Using cache\n', ' ---> 54fe0479563b\n', 'Step 7/12 : RUN yum update -y && yum install -y python-devel python libffi-devel openssl-devel gcc gcc-c++', '\n', ' ---> Using cache\n', ' ---> d08159f4c7fb\n', 'Step 8/12 : RUN curl "https://bootstrap.pypa.io/get-pip.py" | python', '\n', ' ---> Using cache\n', ' ---> 753d99098bff\n', 'Step 9/12 : RUN pip install -U pip==19.0.3 virtualenv==16.0.0 setuptools==39.2.0 wheel==0.31.1', '\n', ' ---> Running in 7ca17da74ecf\n', '\x1b[91m/bin/sh: pip: command not found\n\x1b[0m', 'Removing intermediate container 7ca17da74ecf\n']"}
---------------------------------------------------------------------------------------------

The message says that the pip command was not found, but the pip command does exist.

Cluster configuration details: I tried on my own system.

  • Cloud provider or hardware configuration:
  • Operating system: (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 4.4.214-1.el7.elrepo.x86_64 x86_64
    NAME="Ubuntu"
    VERSION="16.04.6 LTS (Xenial Xerus)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 16.04.6 LTS"
    VERSION_ID="16.04"
    HOME_URL="http://www.ubuntu.com/"
    SUPPORT_URL="http://help.ubuntu.com/"
    BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
    VERSION_CODENAME=xenial

This system information is from a Docker container, which actually runs on Kubernetes on Nauta.

  • Nauta version and commit: (nctl version)(git rev-parse --short HEAD): cloned with git today (2020/03/10)

Nauta component related with bug: (build system/installer/nctl(cli)/dashboard/documentation/k8s/any of nauta container)

make k8s-installer-build

What is the current behavior?

What is the expected behavior?

Steps to reproduce: * *

Anything else do we need to know:

Is there any problem with building in a container (actually on Nauta)?

nparkstar, Mar 10 '20 12:03
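
For context on the `code: 127` reported in the issue above: a POSIX shell exits with status 127 when the command it is asked to run cannot be found on `PATH`, which matches the `/bin/sh: pip: command not found` line in the log. The sketch below reproduces that status with a deliberately nonexistent command name (`no_such_command_for_demo` is a made-up placeholder) and shows how `command -v` can check whether `pip` is visible to the shell. It is only an illustration and is not part of the Nauta build.

```shell
#!/bin/sh
# Run a command name that does not exist (hypothetical placeholder) in a child
# shell and capture its exit status.
if sh -c 'no_such_command_for_demo' 2>/dev/null; then
    missing_status=0
else
    missing_status=$?
fi
echo "status for a missing command: $missing_status"

# command -v succeeds only if the shell can resolve the name. Run inside the
# failing image layer, this would show whether pip is on PATH for /bin/sh.
if command -v pip >/dev/null 2>&1; then
    pip_visible=yes
else
    pip_visible=no
fi
echo "pip visible on PATH: $pip_visible"
```

If `pip` is installed but reported as not visible, the usual suspects are a `PATH` that differs between the layer that installed it and the layer that calls it, or a stale cached layer.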

Hi, it looks like the cached Docker layers are not working correctly for some reason (I can build this image without problems when the Docker cache is not used). The PR https://github.com/IntelAI/nauta/pull/43 should mitigate this issue.

mateusz-ciesielski, Mar 10 '20 14:03

Thank you for your reply. Would you tell me what I should do to debug this issue?

nparkstar, Mar 10 '20 14:03

The best way would be to simply update Nauta's code to the latest version (the PR I mentioned above is already merged) and run the build again; nauta/rpm/python should work now.

mateusz-ciesielski, Mar 10 '20 15:03

I built Nauta on bare metal successfully, but the build failed in a Kubernetes pod. The error message is below; it says that nginx could not be downloaded from the mirror site. However, I could download it easily with the command "wget http://nginx.org/packages/mainline/centos/7/x86_64/RPMS/nginx-1.13.9-1.el7_4.ngx.x86_64.rpm".

What can I do? Or is it impossible to build in a Kubernetes pod?

Partial error message:

Trying other mirror.\n ", "http://nginx.org/packages/mainline/centos/7/x86_64/RPMS/nginx-1.13.9-1.el7_4.ngx.x86_64.rpm: [Errno 12] Timeout on http://nginx.org/packages/mainline/centos/7/x86_64/RPMS/nginx-1.13.9-1.el7_4.ngx.x86_64.rpm: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds')\n ", 'Trying other mirror.\n ', '1:nginx-1.13.9-1.el7_4.ngx.x86_64: [Errno 256] No more mirrors to try.\n ', 'Removing intermediate container 12b10818fba0\n']"

Thank you,

nparkstar, Mar 12 '20 09:03

Could you tell me which particular image build is failing? If you mean that you want to build Nauta in a Kubernetes pod (for example, running make k8s-installer-build inside a Kubernetes container), be aware that it may not work; we do not support that case. However, the error message suggests a network-related error, so I'd recommend checking whether the proxy settings are configured correctly (see: https://intelai.github.io/nauta/installation-and-configuration/How_to_Build_Nauta/HBN.html#proxy-setting-requirements).

mateusz-ciesielski, Mar 12 '20 09:03
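
As a follow-up to the proxy suggestion above, the sketch below reports which of the conventional proxy environment variables are set in the current shell, which may help when comparing the bare-metal host with the pod. The variable names follow common convention, and the helper name `report_proxy_vars` is made up for illustration; the Nauta build documentation linked above remains the authority on what the build actually requires.

```shell
#!/bin/sh
# report_proxy_vars: hypothetical helper, not part of Nauta.
# Prints each conventional proxy variable and whether it is set.
report_proxy_vars() {
    for name in http_proxy https_proxy no_proxy HTTP_PROXY HTTPS_PROXY NO_PROXY; do
        # Indirect lookup via eval keeps this portable to plain /bin/sh.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is not set"
        fi
    done
}

report_proxy_vars
```

Running it both on the host where the build succeeds and inside the pod where it fails would show whether the two environments see the same proxy configuration.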

Thank you for your reply. I tried to build in a Kubernetes pod created on a Nauta cluster, because I intended to test most of my jobs on Nauta nodes. If the Nauta build is not supposed to run in a container, I will not try any more. That's not a problem; I was just testing. I will go on to test other things, including deep learning training. I really appreciate your reply and help; it was very helpful. I will keep testing and will ask more questions. Thank you very much.

nparkstar, Mar 12 '20 09:03