
Feature request: filter git directories generator based on files in the app directory

Open LiorLieberman opened this issue 3 years ago • 11 comments

Hi all,

We are using applicationset to provision helm applications to all our environments.

We template each environment with a value-.yaml file.

We now have a use case in which we do not need every application generated from the applicationset in every environment.

For some environments we do not need to include all apps

I suggest adding an option to repo_service (feel free to advise somewhere else) to filter only applications with the requested value-file name.

That way we could have different applications on different environments generated with applicationset.

What do you think?

LiorLieberman avatar Apr 13 '21 17:04 LiorLieberman

Hi @LiorLieberman, would you be able to provide an example of what you mean?

jgwest avatar Apr 19 '21 20:04 jgwest

Hi @jgwest sure.

Most of us have multiple environments, and not every application is desired in every environment. In our use case, for example, we provision helm applications using a values-.yaml file. Therefore I thought it would be great if the generator (the git generator, for example) checked whether each directory contains such a values-.yaml file, and provisioned a new application only if it does.

A good example would be a cluster for Production and a cluster for Dev. Let's take this directory from the argocd-example-apps repo. For simplicity, let's say the ApplicationSet is configured like this (it provisions only one application).

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cluster-addons
spec:
  generators:
  - git:
      repoURL: https://github.com/argoproj/argocd-example-apps.git
      revision: HEAD
      directories:
      - path: helm-guestbook
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/argoproj/argocd-example-apps.git
        targetRevision: HEAD
        path: '{{path}}'
        helm:
          valueFiles:
            - value.yaml
            - values-production.yaml  # in dev it would be values-dev.yaml
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'

So what I suggest is another filter layer that tells ApplicationSet to check for this specific file in the directory and to create the application only if the file is present. In the example above we would have the helm-guestbook app in Production but not in Dev.

What do you think?

LiorLieberman avatar Apr 23 '21 11:04 LiorLieberman

Sounds good. Having the ability to target particular directories based on the presence of a file matching a path filter makes sense to me, and I agree this sounds like something that would be useful to enough folks to make it worth adding. Did you have an example in mind re: how to add this feature to the ApplicationSet Git generator spec?

jgwest avatar Apr 27 '21 17:04 jgwest

Thinking maybe to add it into repo_service, GetDirectories function, but I am not sure this is the right place. Regarding the key in the spec, I think we could have another optional key under directories like pathExists or a filters struct and inside it pathExists or directoryFileExists.

example 1:

directories:
- path: '*'
  pathExists: a.yaml

example2:

directories:
- path: '*'
  filters:
  - directoryFileExists: a.yaml

LiorLieberman avatar Apr 28 '21 06:04 LiorLieberman

Cool, how about this:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cluster-addons
spec:
  generators:
  - git:
      repoURL: https://github.com/argoproj/argocd-example-apps.git
      revision: HEAD
      directories:
      - path: apps
        onlyIncludeWithFileMatch:
          # (an array of strings that match filepath.Match expressions)
          - values1.yaml  # You can just specify a file name
          - "*.yaml" # Or a wildcard string that matches filepath.Match syntax (https://golang.org/pkg/path/filepath/#Match)

onlyIncludeWithFileMatch: This is a bit longer than pathExists or directoryFileExists, but I think it's still acceptable, and has the advantage of concisely explaining the behaviour: only include the directory if a file matches.

  • It contains an array of strings, which lets the user specify multiple values (which could be useful)
  • Those strings are filepath.Match expressions, which give a bit more flexibility than just a simple filename.

The behaviour would be (I think this is the same as what you are suggesting, but I wrote it out for my own understanding):

For each directory under `directories`:
- If the directory contains an `onlyIncludeWithFileMatch` value:
  - For each folder in the Git repo that matches the `directories.path` value, check if it contains a filename that matches the `onlyIncludeWithFileMatch` value.
    - If yes, include the directory in the final directory list.
    - If no, skip the directory.
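
The steps above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions, not ApplicationSet code: `filterDirs` and `dirMatches` are hypothetical names, the repository is modelled as an in-memory map from directory path to file names, and multiple patterns are treated as "any one match is enough".

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// dirMatches reports whether at least one file name matches at least one
// pattern (OR semantics, which is an assumption of this sketch).
func dirMatches(fileNames, patterns []string) (bool, error) {
	for _, pattern := range patterns {
		for _, name := range fileNames {
			ok, err := filepath.Match(pattern, name)
			if err != nil {
				return false, err // malformed pattern
			}
			if ok {
				return true, nil
			}
		}
	}
	return false, nil
}

// filterDirs is a hypothetical helper. dirs maps a directory path to the
// file names directly inside it. With no patterns, every directory is kept.
func filterDirs(dirs map[string][]string, patterns []string) ([]string, error) {
	kept := []string{}
	for dir, files := range dirs {
		include := len(patterns) == 0
		if !include {
			ok, err := dirMatches(files, patterns)
			if err != nil {
				return nil, err
			}
			include = ok
		}
		if include {
			kept = append(kept, dir)
		}
	}
	sort.Strings(kept) // map iteration order is random, so sort for stable output
	return kept, nil
}

func main() {
	dirs := map[string][]string{
		"apps/guestbook":     {"Chart.yaml", "values.yaml", "values-production.yaml"},
		"apps/internal-tool": {"Chart.yaml", "values.yaml", "values-dev.yaml"},
	}
	kept, err := filterDirs(dirs, []string{"values-production.yaml"})
	if err != nil {
		panic(err)
	}
	fmt.Println(kept) // prints [apps/guestbook]
}
```

With the pattern `values-dev.yaml` instead, only `apps/internal-tool` would be kept, which is the per-environment behaviour requested earlier in the thread.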

What do you think?

jgwest avatar Apr 29 '21 15:04 jgwest

(an array of strings that match filepath.Match expressions)

My initial thinking was to check for the presence of one file (usually a values file whose name changes between environments). How would it work with multiple files? Should they all have to be present, or just one of them?

  • It contains an array of strings, which lets the user specify multiple values (which could be useful)

Again, would it be OR or AND between the multiple values?

Based on my initial thinking, onlyIncludeWithFileMatch won't be a perfect description, since we just want to check for the presence of a file.

I think what you described could be another useful feature: declare a pattern that at least one file in the directory must match, and include or skip the directory based on that.

Let me know your thoughts

LiorLieberman avatar Apr 30 '21 13:04 LiorLieberman

I was thinking it would be an OR between the multiple values, which should work well with your use case, and might provide useful flexibility. But, I'm just guessing at what others might find useful... :smile:

jgwest avatar Apr 30 '21 14:04 jgwest

Sure, and thanks for that. I can go with your proposal. How would you suggest implementing it? I am not sure GetDirectories would be the perfect place. Do you have something in mind?

LiorLieberman avatar May 01 '21 18:05 LiorLieberman

GetDirectories looks like a reasonable place to put it, unless there is an issue with that that you are seeing?

Perhaps something like this:

// new filter interface
type GetDirectoriesFilter interface {
	IncludeDirectory(absolutePath string, relativePath string) bool
}

type Repos interface {

	// GetFilePaths returns a list of files (not directories) within the target repo
	GetFilePaths(ctx context.Context, repoURL string, revision string, pattern string) ([]string, error)

	// GetDirectories returns a list of directories (not files) within the target repo
	GetDirectories(ctx context.Context, repoURL string, revision string, filter GetDirectoriesFilter) ([]string, error)
	// New filter interface added here /\

	// GetFileContent returns the contents of a particular repository file
	GetFileContent(ctx context.Context, repoURL string, revision string, path string) ([]byte, error)
}

func (a *argoCDService) GetDirectories(ctx context.Context, repoURL string, revision string, filter GetDirectoriesFilter) ([]string, error) {

	// (...)

	if err := filepath.Walk(repoRoot, func(path string, info os.FileInfo, fnErr error) error {
		// (...)
		relativePath, err := filepath.Rel(repoRoot, path)
		if err != nil {
			return err
		}

		if relativePath == "." { // Exclude '.' from results
			return nil
		}

		// NEW: use the filter if passed
		if filter != nil && !filter.IncludeDirectory(path, relativePath) {
			return nil
		}

		filteredPaths = append(filteredPaths, relativePath)

		return nil
	}); err != nil {
		return nil, err
	}

	return filteredPaths, nil

}

jgwest avatar May 03 '21 12:05 jgwest

I am currently exploring the applicationset controller and I have run into a similar use case. I was wondering if this proposal is still the way forward, or if there is a suggestion for how this could be achieved with the already existing ApplicationSet generators.

For me the use case is as follows:

  • We have several clusters (each corresponds to an "environment").
  • In each cluster, there is an ArgoCD running and managing/deploying the resources in the cluster
  • The individual clusters have several applications deployed and share a large set of them
  • Configuration / overrides per cluster/environment are done in values_clusterid.yaml
  • I am looking to deploy certain applications into a cluster only if that specific values.yaml exists

I was already thinking of templating the ApplicationSet resource when I saw this issue.

patrickjahns avatar Dec 31 '21 01:12 patrickjahns

Having the same question. Does anyone have a smart solution to this issue? Is this planned to be released anytime soon?

yuvalavidor avatar Dec 12 '22 11:12 yuvalavidor