
unzip_with_strip_components should not rely on bash/ksh specific handling of */

Open cruschke opened this issue 10 years ago • 9 comments

Hi,

I am currently facing a weird issue with ark on Solaris 10.

       ---- Begin output of unzip -q -u -o /tmp/kitchen/cache/splunkforwarder.zip -d /tmp/d20140626-583-18zyngk && rsync -a /tmp/d20140626-583-18zyngk/*/ /cm/splunkforwarder && rm -rf /tmp/d20140626-583-18zyngk ----
       STDOUT:
       STDERR: rsync: change_dir "/tmp/d20140626-583-18zyngk/*" failed: No such file or directory (2)

The debug output of my Chef run shows these ark parameters:

       ark("splunkforwarder") do
         provider Chef::Provider::Ark
         action [:put]
         updated true
         updated_by_last_action true
         retries 0
         retry_delay 2
         guard_interpreter :default
         cookbook_name "my_monitoring"
         recipe_name "splunkforwarder"
         url "http://path/to/splunkforwarder.zip"
         version "5.0.2-149561"
         extension "zip"
         owner "vagrant"
         group "vagrant"
         path "/cm/splunkforwarder"
         release_file "/tmp/kitchen/cache/splunkforwarder.zip"
         strip_components 1
       end

The issue happens only on Solaris 10, not on Solaris 11 and not on CentOS 6 based machines.

The zip archive looks like this (so there really is a top-level directory to strip away):

-bash-3.2$ unzip -l splunkforwarder.zip  | less
Archive:  splunkforwarder.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  02-01-2013 14:54   splunkforwarder/
      509  02-01-2013 13:57   splunkforwarder/README-splunk.txt
        0  02-01-2013 14:54   splunkforwarder/bin/
    45920  02-01-2013 14:36   splunkforwarder/bin/bloom
    45920  02-01-2013 14:36   splunkforwarder/bin/btool       
...

Rsync is version 3.0.8 protocol version 30 on both Solaris 10 and Solaris 11.

When I invoke the command line that Chef is complaining about

unzip -q -u -o /tmp/kitchen/cache/splunkforwarder.zip -d /tmp/d20140626-583-18zyngk && rsync -a /tmp/d20140626-583-18zyngk/*/ /cm/splunkforwarder && rm -rf /tmp/d20140626-583-18zyngk

manually on Solaris 10 then it works OK.

I also looked at different Bash versions (3.2 on Solaris 10, 4 on Solaris 11), installed bash 4 on Solaris 10 and still no difference.

Any ideas about what the cause could be would be appreciated.

cruschke avatar Jun 26 '14 10:06 cruschke

I'm seeing the same error, but on Ubuntu 14.04

STDERR: rsync: change_dir "/tmp/d20140626-9550-76qs3q/*" failed: No such file or directory (2

joelmoss avatar Jun 26 '14 15:06 joelmoss

After spending quite some time on this issue, to save us from repacking all our archives, I think I have finally found the root cause.

On Solaris 10, sh is a true Bourne shell, while on Solaris 11 sh is a link to ksh93. Apparently sh handles */ differently from other shells, and that difference causes ark to break.

Solaris 10

  • repeat the unzip command that ark ran successfully
# unzip -q -u -o /tmp/kitchen/cache/splunkforwarder.zip -d /tmp/d20140707-583-h0xft 
  • there is something in the temp directory
# ls -l /tmp/d20140707-583-h0xft
total 8
drwxr-xr-x   7 root     root         737 Feb  1  2013 splunkforwarder
  • ark fails on rsync here, but let's test with ls instead
# ls -l /tmp/d20140707-583-h0xft/*/
/tmp/d20140707-583-h0xft/*/: No such file or directory
  • what is the shell
# ls -l /bin/sh
lrwxrwxrwx   1 root     root          13 Jun 23 15:04 /bin/sh -> ../../sbin/sh

# ls -l /sbin/sh
-r-xr-xr-x   1 root     root       82456 Sep 22  2010 /sbin/sh

Same procedure on Solaris 11

root@live-solaris-11-1:~# unzip -q -u -o /tmp/kitchen/cache/splunkforwarder.zip -d /tmp/d20140707-583-h0xft

root@live-solaris-11-1:~# ls -l /tmp/d20140707-583-h0xft
total 8
drwxr-xr-x   7 root     root         737 Feb  1  2013 splunkforwarder

root@live-solaris-11-1:~# ls -l /tmp/d20140707-583-h0xft/*/
total 176
drwxr-xr-x   3 root     root        2184 Feb  1  2013 bin
drwxr-xr-x  10 root     root        1362 Feb  1  2013 etc
-rw-r--r--   1 root     root           0 Feb  1  2013 ftr
drwxr-xr-x   3 root     root        2148 Feb  1  2013 lib
-r--r--r--   1 root     root       48789 Feb  1  2013 license-eula.txt
drwxr-xr-x   3 root     root         246 Feb  1  2013 openssl
-r--r--r--   1 root     root         509 Feb  1  2013 README-splunk.txt
drwxr-xr-x   3 root     root         180 Feb  1  2013 share
-rw-r--r--   1 root     root       15079 Feb  1  2013 splunkforwarder-5.0.2-149561-SunOS-x86_64-manifest

root@live-solaris-11-1:~# ls -l /bin/sh
lrwxrwxrwx   1 root     root           9 Jun  7 10:58 /bin/sh -> i86/ksh93

Replacing /sbin/sh on Solaris 10 with a symlink to ksh magically makes ark work as expected.
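
To repeat the comparison above without unpacking a real archive, a small helper along these lines may be useful. It is only a sketch: `check_dir_glob` is a hypothetical name, not part of ark, and it assumes the outer script is run from bash or ksh (it uses `$(...)`), while the shell under test is passed in as an argument.

```shell
# Hypothetical helper: report whether SHELL expands DIR/*/ into real paths.
# Usage: check_dir_glob /bin/sh /tmp/d20140707-583-h0xft
check_dir_glob() {
  shell=$1
  dir=$2
  # Expand the pattern inside the shell under test and print each result.
  result=$("$shell" -c 'for d in "$1"/*/; do printf "%s\n" "$d"; done' _ "$dir")
  # A pattern that was not expanded comes back as the literal string.
  if [ "$result" = "$dir/*/" ]; then
    echo "unexpanded"
  else
    echo "expanded"
  fi
}
```

Running it once with `/sbin/sh` and once with ksh against the same temp directory should show whether the two shells disagree. Note that "unexpanded" is also what you get when the directory simply contains no subdirectories.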

@joelmoss can you verify the same on Ubuntu 14.04?

I believe a solution could be to replace the execute unpack ... with bash.

cruschke avatar Jul 07 '14 13:07 cruschke

I'm having a similar issue to this but I'm using CentOS 6.5:

       [2014-08-03T22:19:45+00:00] FATAL: Stacktrace dumped to /tmp/kitchen/cache/chef-stacktrace.out
       Chef Client failed. 4 resources updated in 291.633139227 seconds
       [2014-08-03T22:19:45+00:00] ERROR: ark[terraform] (terraform::default line 23) had an error: Mixlib::ShellOut::ShellCommandFailed: execute[unpack /tmp/kitchen/cache/terraform-0.1.0.zip] (/tmp/kitchen/cache/cookbooks/ark/providers/default.rb line 55) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '23'
       ---- Begin output of unzip -q -u -o /tmp/kitchen/cache/terraform-0.1.0.zip -d /tmp/d20140803-2497-1s6dqyx && rsync -a /tmp/d20140803-2497-1s6dqyx/*/ /usr/local/terraform-0.1.0 && rm -rf /tmp/d20140803-2497-1s6dqyx ----
       STDOUT: 
       STDERR: rsync: change_dir "/tmp/d20140803-2497-1s6dqyx/*" failed: No such file or directory (2)
       rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1039) [sender=3.0.6]
       ---- End output of unzip -q -u -o /tmp/kitchen/cache/terraform-0.1.0.zip -d /tmp/d20140803-2497-1s6dqyx && rsync -a /tmp/d20140803-2497-1s6dqyx/*/ /usr/local/terraform-0.1.0 && rm -rf /tmp/d20140803-2497-1s6dqyx ----
       Ran unzip -q -u -o /tmp/kitchen/cache/terraform-0.1.0.zip -d /tmp/d20140803-2497-1s6dqyx && rsync -a /tmp/d20140803-2497-1s6dqyx/*/ /usr/local/terraform-0.1.0 && rm -rf /tmp/d20140803-2497-1s6dqyx returned 23
       [2014-08-03T22:19:46+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

I've manually stepped through the commands and everything looks okay until I try the rsync command:

[root@default-centos-65 ~]# rsync -a /tmp/d20140803-2497-1s6dqyx/*/ /usr/local/terraform-0.1.0
rsync: change_dir "/tmp/d20140803-2497-1s6dqyx/*" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1039) [sender=3.0.6]
[root@default-centos-65 ~]# ls -l /bin/sh
lrwxrwxrwx. 1 root root 4 Dec  5  2013 /bin/sh -> bash

I've also seen the same error via Test Kitchen using CentOS 7 and Ubuntu 14.04.
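
Since /bin/sh is bash here, the shell is unlikely to be the culprit in this case. In bash, a pattern such as */ that matches nothing is passed through literally, which would produce exactly this rsync error if the archive unpacks to plain files with no top-level directory. That is only a guess and nothing in this thread confirms it. A quick check, using a hypothetical helper name, could look like this:

```shell
# Hypothetical helper: print how many top-level directories DIR contains.
count_top_dirs() {
  count=0
  for entry in "$1"/*; do
    # Plain files (and the literal pattern left when DIR is empty) are skipped.
    if [ -d "$entry" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

A result of 0 for the unpacked temp directory would mean there is nothing for strip_components 1 to strip.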

rosstimson avatar Aug 03 '14 22:08 rosstimson

Just a quick FYI, for me I've managed to get things working by dropping back to an older version of ark with the following in my metadata.rb:

depends          'ark', '~> 0.6.0'

If I bump this to 0.7.0, I encounter the same issue. I've not had a chance to compare the differences yet, but it looks like there is a breaking change somewhere between 0.6.0 and 0.7.0 (for my issue at least).

rosstimson avatar Aug 04 '14 11:08 rosstimson

+1 on "seeing this on Ubuntu 14.04".

It's clear the problem is in the extra slash on this rsync command. If there is going to be a fix for this, I think it'll be in the rsync command, by dropping the trailing slash, i.e. changing:

rsync -a /tmp/d20140915-10978-ha2p3d/*/ /bla/bla/bla

to

rsync -a /tmp/d20140915-10978-ha2p3d/* /bla/bla/bla

Against the current master, the line of code where this change would happen is here.

Alternatively, the way I got around this is by simply adding:

strip_components 0

to my ark LWRP call.
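
One caveat about dropping the slash: rsync copies the contents of a source directory when the path ends in a slash, and the directory itself when it does not, so /* on its own would no longer strip the leading component. A way to keep the stripping without depending on how the shell expands */ is to loop over the entries and test each one. The sketch below is an assumption about how that could look, not ark's actual code: `strip_one_component` is a hypothetical name, and it uses cp -R in place of rsync so that it runs without rsync installed.

```shell
# Hypothetical helper: copy the contents of every top-level directory in SRC
# into DEST, which drops one leading path component.
strip_one_component() {
  src=$1
  dest=$2
  mkdir -p "$dest" || return 1
  found=0
  for entry in "$src"/*; do
    # Skip plain files and the literal pattern left when nothing matches.
    [ -d "$entry" ] || continue
    found=1
    # "$entry"/. means the directory's contents, dotfiles included.
    cp -R "$entry"/. "$dest"/ || return 1
  done
  # Fail when SRC had no top-level directory to strip.
  [ "$found" -eq 1 ]
}
```

It returns non-zero when there is no top-level directory, so the caller gets a clear failure rather than a literal `*` being passed along.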

joegoggins avatar Sep 15 '14 17:09 joegoggins

I've got time next week (after a long few weeks of classes) so I'll make that change to the slash. Alright now back to my last class.

burtlo avatar Sep 19 '14 15:09 burtlo

@joegoggins Thanks, your suggestion of strip_components 0 worked like a charm!

richid avatar Sep 20 '14 04:09 richid

thanks - also worked on 14.04

daraghm avatar Oct 01 '14 02:10 daraghm

It also helped 12.04 converge - couldn't do it without it.

darron avatar Oct 07 '14 22:10 darron