
Cannot upgrade minor version package in air gap environment

Open 22RC opened this issue 2 years ago • 3 comments

Describe the issue In an air-gapped environment, using an HTTP repo that has the same structure as the official Confluent repo and contains all minor versions, it is not possible to upgrade from 6.2.x to 6.2.x+1.

To Reproduce Steps to reproduce the behaviour:

  1. Create an httpd repository with the following structure:
6.2/
  | --- ...
  | --- confluent-rest-utils-6.2.0-1.noarch.rpm
  | --- confluent-schema-registry-6.2.0-1.noarch.rpm
  | --- confluent-security-6.2.0-1.noarch.rpm
  | --- confluent-server-6.2.0-1.noarch.rpm
  | --- ...
  2. Create a custom .repo file with the following content:
[Confluent]
name=Confluent repository
baseurl=https://httpd.repo.io/rpm/6.2
gpgcheck=0
enabled=1
  3. Install Confluent Platform with cp-ansible (6.2.0-post). Success!
  4. After installation, add the minor version packages (6.2.1) to the httpd repo and update the metadata:
6.2/
  | --- ...
  | --- confluent-rest-utils-6.2.0-1.noarch.rpm
  | --- confluent-rest-utils-6.2.1-1.noarch.rpm
  | --- confluent-schema-registry-6.2.0-1.noarch.rpm
  | --- confluent-schema-registry-6.2.1-1.noarch.rpm
  | --- confluent-security-6.2.0-1.noarch.rpm
  | --- confluent-security-6.2.1-1.noarch.rpm
  | --- confluent-server-6.2.0-1.noarch.rpm
  | --- confluent-server-6.2.1-1.noarch.rpm
  | --- ...
  5. Try to update to cp-ansible 6.2.1 with the same custom .repo file used in step 3.
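
For reference, the "update metadata" part of the repo update step could look roughly like the sketch below. The issue does not say which tool was used on the httpd server, so `createrepo_c` is an assumption here, and both the function name `update_repo_metadata` and the example path are hypothetical.

```shell
# Sketch: regenerate RPM repository metadata after copying new packages in.
# Assumption: the httpd server has createrepo_c installed.
update_repo_metadata() {
    local repo_dir="$1"
    if [ -z "$repo_dir" ]; then
        echo "usage: update_repo_metadata <repo-dir>" >&2
        return 2
    fi
    # --update reuses the existing metadata and only scans new or changed RPMs.
    createrepo_c --update "$repo_dir"
}

# Example (hypothetical document root behind https://httpd.repo.io/rpm/6.2):
# update_repo_metadata /var/www/html/rpm/6.2
```

Even when the server-side metadata is regenerated correctly, each client keeps its own cached copy until that copy expires or is refreshed.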

Expected behaviour cp-ansible updates the package versions and installs Confluent Platform 6.2.1.

Current behaviour dnf cannot find the packages with version 6.2.1.

Inventory File The inventory file and repo file are unchanged between 6.2.0 and 6.2.1. We use the following vars in the inventory to retrieve RPMs from the custom repo:

repository_configuration: custom
custom_yum_repofile_filepath: /root/httpd-confluent.repo

Environment (please complete the following information):

  • OS: [RHEL 8.1]
  • CP-Ansible Branch: [6.2.0-post and 6.2.1]
  • Ansible Version [ansible 2.9.6]

Additional context To work around the issue, we manually run the following on all Confluent Platform hosts:

$ dnf clean all
$ dnf makecache

Then we restart the upgrade.
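
Rather than logging in to each host, the same two commands could be pushed out with an Ansible ad-hoc call. This is only a sketch: the function name `refresh_dnf_cache_on_hosts` and the inventory path in the example are hypothetical, and it assumes the `all` group covers exactly the Confluent Platform hosts.

```shell
# Sketch: run the manual workaround on every host in an inventory.
# Assumption: "all" matches only the hosts that should be refreshed.
refresh_dnf_cache_on_hosts() {
    local inventory="$1"
    # The shell module is used so that both commands run in one remote invocation.
    ansible all -i "$inventory" --become -m shell -a "dnf clean all && dnf makecache"
}

# Example (hypothetical inventory file):
# refresh_dnf_cache_on_hosts hosts.yml
```

Note that `dnf clean all` discards cached packages as well as metadata, so the next install downloads everything again.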

22RC avatar Oct 15 '21 14:10 22RC

@22RC It seems this fix is causing issues, filling up the disk on hosts unnecessarily. See https://github.com/confluentinc/cp-ansible/pull/830

Can we think of another solution for your issue?

domenicbove avatar Nov 22 '21 15:11 domenicbove

It seems that this has been resolved with https://github.com/confluentinc/cp-ansible/pull/830. Please reopen if that's not the case.

utkarsh5474 avatar Oct 18 '22 10:10 utkarsh5474

Hi, I want to re-open this issue because we hit the following error during a fresh installation via cp-ansible:

<inf-kr-t01.> (0, b'/root\n', b'')
<inf-kr-t01> ESTABLISH SSH CONNECTION FOR USER: None
<inf-kr-t01> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o 'ControlPath="/root/.ansible/cp/59bd345a32"' inf-kr-t01 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717 `" && echo ansible-tmp-1712231769.4379592-1846660-131430882442717="` echo /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717 `" ) && sleep 0'"'"''
<inf-kr-t01> (0, b'ansible-tmp-1712231769.4379592-1846660-131430882442717=/root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717\n', b'')
Using module file /data/ansible/lib64/python3.6/site-packages/ansible/modules/dnf.py
<inf-kr-t01> PUT /root/.ansible/tmp/ansible-local-184203333bjn02i/tmpl_repb4h TO /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/AnsiballZ_dnf.py
<inf-kr-t01> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o 'ControlPath="/root/.ansible/cp/59bd345a32"' '[inf-kr-t01]'
<inf-kr-t01> (0, b'sftp> put /root/.ansible/tmp/ansible-local-184203333bjn02i/tmpl_repb4h /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/AnsiballZ_dnf.py\n', b'')
<inf-kr-t01> ESTABLISH SSH CONNECTION FOR USER: None
<inf-kr-t01> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o 'ControlPath="/root/.ansible/cp/59bd345a32"' inf-kr-t01 '/bin/sh -c '"'"'chmod u+x /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/ /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/AnsiballZ_dnf.py && sleep 0'"'"''
<inf-kr-t01> (0, b'', b'')
<inf-kr-t01> ESTABLISH SSH CONNECTION FOR USER: None
<inf-kr-t01> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o 'ControlPath="/root/.ansible/cp/59bd345a32"' -tt inf-kr-t01 '/bin/sh -c '"'"'/usr/bin/python /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/AnsiballZ_dnf.py && sleep 0'"'"''
<inf-kr-t01> (1, b'\x1b[?25l| Fetching certificate serial numbers\r\x1b[?25h\x1b[0K\r\x1b[?25l| Checking server status\r\x1b[?25h\x1b[0K\r\x1b[?25l| Fetching content for a certificate\r\x1b[?25h\x1b[0K\r\x1b[?25l| Fetching content overrides\r\x1b[?25h\x1b[0K\r\r\n{"results": [], "failed": true, "msg": "Failed to download packages: confluent-common-7.6.0-1.noarch: Cannot download, all mirrors were already tried without success", "exception": "  File \\"/tmp/ansible_ansible.legacy.dnf_payload_6xvcmr18/ansible_ansible.legacy.dnf_payload.zip/ansible/modules/dnf.py\\", line 1213, in ensure\\n  File \\"/usr/lib/python3.9/site-packages/dnf/base.py\\", line 1309, in download_packages\\n    self._download_remote_payloads(payloads, drpm, progress, callback_total)\\n  File \\"/usr/lib/python3.9/site-packages/dnf/base.py\\", line 1238, in _download_remote_payloads\\n    raise dnf.exceptions.DownloadError(errors._irrecoverable())\\n", "invocation": {"module_args": {"name": ["confluent-common-7.6.0-1", "confluent-ce-kafka-http-server-7.6.0-1", "confluent-server-rest-7.6.0-1", "confluent-telemetry-7.6.0-1", "confluent-server-7.6.0-1", "confluent-rebalancer-7.6.0-1", "confluent-security-7.6.0-1"], "state": "latest", "allow_downgrade": false, "autoremove": false, "bugfix": false, "disable_gpg_check": false, "disable_plugin": [], "disablerepo": [], "download_only": false, "enable_plugin": [], "enablerepo": [], "exclude": [], "installroot": "/", "install_repoquery": true, "install_weak_deps": true, "security": false, "skip_broken": false, "update_cache": false, "update_only": false, "validate_certs": true, "lock_timeout": 30, "allowerasing": false, "nobest": false, "conf_file": null, "disable_excludes": null, "download_dir": null, "list": null, "releasever": null}}}\r\n', b'Shared connection to inf-kr-t01 closed.\r\n')
<inf-kr-t01> Failed to connect to the host via ssh: Shared connection to inf-kr-t01 closed.
<inf-kr-t01> ESTABLISH SSH CONNECTION FOR USER: None
<inf-kr-t01> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o 'ControlPath="/root/.ansible/cp/59bd345a32"' inf-kr-t01 '/bin/sh -c '"'"'rm -f -r /root/.ansible/tmp/ansible-tmp-1712231769.4379592-1846660-131430882442717/ > /dev/null 2>&1 && sleep 0'"'"''
<inf-kr-t01> (0, b'', b'')
The full traceback is:
  File "/tmp/ansible_ansible.legacy.dnf_payload_6xvcmr18/ansible_ansible.legacy.dnf_payload.zip/ansible/modules/dnf.py", line 1213, in ensure
  File "/usr/lib/python3.9/site-packages/dnf/base.py", line 1309, in download_packages
    self._download_remote_payloads(payloads, drpm, progress, callback_total)
  File "/usr/lib/python3.9/site-packages/dnf/base.py", line 1238, in _download_remote_payloads
    raise dnf.exceptions.DownloadError(errors._irrecoverable())
failed: [inf-kr-t01] (item=confluent-security) => {
    "ansible_loop_var": "item",
    "attempts": 5,
    "changed": false,
    "invocation": {
        "module_args": {
            "allow_downgrade": false,
            "allowerasing": false,
            "autoremove": false,
            "bugfix": false,
            "conf_file": null,
            "disable_excludes": null,
            "disable_gpg_check": false,
            "disable_plugin": [],
            "disablerepo": [],
            "download_dir": null,
            "download_only": false,
            "enable_plugin": [],
            "enablerepo": [],
            "exclude": [],
            "install_repoquery": true,
            "install_weak_deps": true,
            "installroot": "/",
            "list": null,
            "lock_timeout": 30,
            "name": [
                "confluent-common-7.6.0-1",
                "confluent-ce-kafka-http-server-7.6.0-1",
                "confluent-server-rest-7.6.0-1",
                "confluent-telemetry-7.6.0-1",
                "confluent-server-7.6.0-1",
                "confluent-rebalancer-7.6.0-1",
                "confluent-security-7.6.0-1"
            ],
            "nobest": false,
            "releasever": null,
            "security": false,
            "skip_broken": false,
            "state": "latest",
            "update_cache": false,
            "update_only": false,
            "validate_certs": true
        }
    },
    "item": "confluent-security",
    "msg": "Failed to download packages: confluent-common-7.6.0-1.noarch: Cannot download, all mirrors were already tried without success",
    "results": []
}

I also see that the update_cache parameter of Ansible's yum/dnf module is false by default (see "update_cache": false in the module_args above). This means that if a host has an old metadata cache in its yum cache folder, the new metadata will not be available during the download. To solve this issue we had to rebuild the yum cache with dnf makecache.
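
One way to check whether a stale cache is really the cause on a given host is to compare what dnf sees from its local cache with what it sees after a forced refresh. The sketch below is an illustration only: the function name `show_cached_vs_fresh` is hypothetical and it has not been run against the repository from this issue.

```shell
# Sketch: list the versions of a package as seen from the local metadata
# cache, then again after forcing dnf to re-download the metadata.
show_cached_vs_fresh() {
    local pkg="$1"
    echo "== cached metadata =="
    # --cacheonly: use only the system cache, never contact the repository.
    dnf --cacheonly --showduplicates list "$pkg" || echo "not found in cached metadata"
    echo "== refreshed metadata =="
    # --refresh: treat cached metadata as expired so it is fetched again.
    dnf --refresh --showduplicates list "$pkg"
}

# Example, using a package name from the failing task:
# show_cached_vs_fresh confluent-common
```

If the version only appears in the second listing, the host was working from outdated metadata.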

Could you please review the reasons why the yum cache rebuild task was removed from the playbook with #830? If the problem noted in #830 is only about filesystem capacity or installation time, I think the removal needs to be looked at again.

22RC avatar Apr 04 '24 14:04 22RC