vendir icon indicating copy to clipboard operation
vendir copied to clipboard

Added support for globs for GH release unpacking

Open brittonhayes opened this issue 4 years ago • 6 comments

Changes

  • Adds support for filepath globbing for unpacking github release archives
  • fixes #49

Example Usage

Here is an example of using the globbing to unpack only arm-based tars from the goreleaser repository. This is done by adding the * wildcard within the unpackArchive.path

apiVersion: vendir.k14s.io/v1alpha1
kind: Config

directories:
    - path: globbing
      contents:
      - path: goreleaser
        githubRelease:
            slug: goreleaser/goreleaser
            latest: true
            disableAutoChecksumValidation: true
            unpackArchive:
                path: "*arm64.tar.gz"

# Result:
# | globbing/
# |-- goreleaser_Darwin_arm64/
# |---- ...
# |-- goreleaser_Linux_arm64/
# |---- ...

brittonhayes avatar Aug 18 '21 04:08 brittonhayes

@brittonhayes, you must sign our contributor license agreement before your changes are merged. Click here to sign the agreement. If you are a VMware employee, read this for further instruction.

vmwclabot avatar Aug 18 '21 04:08 vmwclabot

@brittonhayes, VMware has approved your signed contributor license agreement.

vmwclabot avatar Aug 18 '21 08:08 vmwclabot

hey @brittonhayes. it looks like when unpackArchive currently is used, we replace the entire downloaded directory with contents that were unpacked. this is a bit weird for the case when glob matches multiple paths. what should be the end result: archives added into download directory, only one wins and replaces downloaded content, etc? i think we may need to figure out that first. since there could be multiple archives, should we introduce a new option called unpackArchives (as a sibling to singular) which would retain parent directory?

cppforlife avatar Aug 19 '21 16:08 cppforlife

The way my added feature works is that all matches are packed into subdirectories of the dest path, with the subdirectory's name being the glob-matched path.

So this should prevent the issue of it overwriting the contents of the directory, since I'm using filepath.Join() to break out each glob match into a subdirectory.

E.g.

Result: | myoutputdir/ |-- match1/ |---- match1contents |-- match2/ |---- match2contents

I was unsure how much change maintainers were comfortable making to the way the unpacking works, so I kept the tweak to the code base very minimal.

Very open to feedback and ideas here if this solution doesn't meet the need yet! 🙂

brittonhayes avatar Aug 19 '21 16:08 brittonhayes

I was unsure how much change maintainers were comfortable making to the way the unpacking works, so I kept the tweak to the code base very minimal.

@brittonhayes my main concern so far is that unpackArchive configuration behaves differently in case of glob vs non-glob scenario. im still thinking a bit on what we should. is there some more generic feature we should add for unpacking...

cppforlife avatar Aug 24 '21 17:08 cppforlife

@cppforlife I think that's probably a good idea. A more general feature for unpacking would provide you the flexibility to implement additional functionality down the line.

Right now the feature of syncing/fetching the release and unpacking it are pretty tightly coupled. I'd opt for abstracting at least the unpacking part out into in interface of type Unarchiver with a method of Unarchive and then different types for each of the archives you might unpack.

To replace these functions:

func (t Archive) tryZip(path, dstPath string) (bool, error) {

I'd opt for a pattern like this:

type Unarchiver interface {
	Unarchive(path string, destPath string) error
}

type Zip struct {}

type Tar struct {}

func (t Tar) Unarchive(path string, destPath string) error {
 // do stuff
 return nil
}

func (z Zip) Unarchive(path string, destPath string) error {
 // do stuff
 return nil
}

This opens the door for stuff like this which could check the file type and use any of the above methods within the fucntion.

type Glob struct {}

func (g Glob) Unarchive(path string, destPath string) error {

	// Loop over the glob and run this switch
	// statement for each match
	// Probably with each run in a goroutine to keep
	// it snappy
	//
	switch filepath.Ext(path)
	case "zip":
		z := &Zip{}
		return z.UnpackArchive(path, destPath)
	case "tar":
		t := &Tar{}
		return t.UnpackArchive(path, destPath)
	default
		return errors.New("No archives matched")
}

Also another piece that I think is making this challenging is the part where it returns a bool to indicate that it finished unpacking one of the available kinds of archives. Since this bool boils up the the calling function in the CLI and determined if it would throw an error.

Just thoughts on it but I think finding a way to decouple some of the logic on unarchiving function and the functions that sync the archives might make it easier to implement this and add support for additional compressed formats in the future with this more general interface.

Definitely let me know if this feels over-complicated for the need. This is a late night idea I'm jotting down and could quite possibly not make any sense :sweat_smile:

brittonhayes avatar Aug 26 '21 05:08 brittonhayes