applicationset icon indicating copy to clipboard operation
applicationset copied to clipboard

Support for extracting parent path/basenames of {{path}} for Git file/directory generators

Open jgwest opened this issue 2 years ago • 16 comments

As discussed here (amongst other places), there exists an obvious need with the Git generator to extract components of paths for use in ApplicationSet templates.

At present (on the master branch), the Git file and Git directory generators generate the following parameters:

  • {{path}}: path of discovered (directory/config file) within the repository, e.g. /path/to/apps/app1 or /path/to/config/config.json
  • {{path.basename}}: name of discovered (directory/config file) within the repository, eg app1 or config.json

It would be beneficial to extend this to allow templates to use parent folder names/paths as well, we just need to discuss the format and behaviour for this.

Open questions (feedback welcome):

What format should this take?

  • {{path[1]}}: Return the path to the parent file/folder of the discovered (directory/config file), for example /path/to/apps or /path/to/config
  • {{path[1].basename}}: Return the path to the name of the parent file/folder of discovered (directory/config file), for apps or
  • {{path[0]}}: Would have the same value as {{path}} (likewise for basename)
  • Another format? Any other prior art we can draw upon here? (Does Helm support something like this? What format does it use?)
    • {{path.basename[1]}}?
    • {{path~1}}/{{path~1.basename}}?
    • {{path[len-1]}}/{{path[len-1].basename}}?

What should the behaviour be if a path doesn't exist (eg {{path[4]}} for a path only containing 2 components)

  • throw error and return (my personal preference, but does this break any useful patterns?)
  • silently ignore
  • empty string

jgwest avatar Jul 13 '21 18:07 jgwest

Hi @jgwest,

I happen to be testing applicationset and I am stuck to this exact same issue. My apps are using the following directory layout ./applicationName/environments/namespaceName. Currently, there is no way for me to retrieve the applicationName. Could you give few more examples? I initially thought that {{ path }} could be a simple list of the files that has been found with their path split by /. For example, in my case, path would contain ['applicationName', 'environments', 'namespaceName']. Then I could get my applicationName by using {{ path[0] }}. If I wanted the whole path as a single string it would require a method like {{ path.toString() }} or {{ path.join('/') }}. In case the item doesn't exist it should throw an error but it would be nice to offer a {{ path[4].withDefault('test') }}.

I am not sure if this correspond to the example you gave.

benjamin-bergia avatar Jul 15 '21 12:07 benjamin-bergia

Leaving a note here that I'm in the same situation and hacked some stuff together with file generators that are just:

  name: 
  namespace

would make my directories cleaner if I could do path[0]. although I think I would like to be able to do reverse access path[-2] in case of a folder being nested deeper

elocke avatar Jul 22 '21 04:07 elocke

is there any plan to implement this in near future. this is much needed as it eliminates a lot of duplication

jwalitptl avatar Aug 25 '21 21:08 jwalitptl

Same situation for me with Git Generator. I've got this directory structure: level1/CLUSTER-NAME/level2/level3/*

I seriously thought a Jinja2 parser was implemented in this so I could simply do something like:

spec.template.metadata.name: {{ path[1] }}-{{ path.basename }}

or alternatively:

spec.template.metadata.name: {{ path | split("/")[1] }}-{{ path.basename }}

but nonetheless this doesn't work. Looking forward for this enhancement to be implemented.

Fixmetal avatar Aug 26 '21 14:08 Fixmetal

Same situation for me with Git Generator. I've got this directory structure: level1/CLUSTER-NAME/level2/level3/*

I seriously thought a Jinja2 parser was implemented in this so I could simply do something like:

spec.template.metadata.name: {{ path[1] }}-{{ path.basename }}

or alternatively:

spec.template.metadata.name: {{ path | split("/")[1] }}-{{ path.basename }}

but nonetheless this doesn't work. Looking forward for this enhancement to be implemented.

Came here to say this. My ApplicationSets will be hella cleaner and my bosses will think I'm some sort of templating demi god.

RumRogerz avatar Sep 02 '21 15:09 RumRogerz

Putting a note here, I've discovered my need for dirname({{path}}) when using the git files generator. I want to use that in the .spec.template.spec.source.path field instead of building it or hardcoding it.

elocke avatar Oct 14 '21 15:10 elocke

Hey, @jgwest I would like to pick this issue up, can you please assign it to me?

bradmccoydev avatar Oct 19 '21 01:10 bradmccoydev

How about adding regex support to the git generator?

Something like this:

spec:
  generators:
    - git:
        repoURL: ssh://git@xxxx
        revision: HEAD
        directories:
          - pathRegexp: deployments/(?P<namespace>(argo|other-namespace|another))/(?P<appname>.*)
  template:
    metadata:
      name: '{{pathRegexp.appname}}'
    spec:
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{pathRegexp.namespace}}'

svmaris avatar Nov 27 '21 18:11 svmaris

is this feature will be available in the next release?

jwalitptl avatar Dec 08 '21 12:12 jwalitptl

This sounds great!

whyman avatar Feb 14 '22 17:02 whyman

I have a couple of suggestions for @jgwest's questions.

What format should this take?

1. User-controlled components

Instead of allowing arbitrary access to any path component, we could allow the user to specify the components of interest from the generator (instead of from the template). An example based on the initial Slack discussion linked:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
spec:
  generators:
    - git:
        repoURL: [email protected]:my-org/my-repo.git
        directories:
          - path: apps/*/deployment/overlays/*
            # object defining template parameters of the form `path.components.<key>`
            # that expose the path component matched by `path`, at the corresponding index
            components:
              root: 0 # this would point to the first path component, i.e. "apps"
              app: 1 # this would point to the 2nd path component (first glob)
              env: 4 # this would point to the 5th path component (second glob)

          # when having multiple dirs with different structures, this solution is useful
          # because path components can be shifted and `{{ path[1] }} would point
          # to different things for different entries in `directories`
          - path: addons/apps/*/deployment/overlays/*
            components:
              app: 2 # this keeps the `{{ path.components.app }}` correct & consistent
              env: 5

  template:
    metadata:
      name: 'app-{{ path.components.app }}-deployment-{{ path.components.env }}'
    spec:
      # ...

Advantages:

  • perhaps easier to implement
  • works well with multiple directories that have different structures

For me, this would be the preferred solution.

2. Composable Unix-like syntax

Another way to do this would be to use something resembling composable basename/dirname operations, and allow template parameters like {{ path.dirname.dirname.dirname.basename }}.

After a quick glance at the code, this might be a bit harder to implement with the current templating engine as you would need to provide all the possible combinations of .dirname and .basename.

Edit: added env: 5 in second directory in the example.

bogatuadrian avatar Feb 23 '22 08:02 bogatuadrian

@bogatuadrian I'm not really familiar with the code, but would it be possible to use regex-based matching instead of globbing? Referring to the components using their index in the path is something that you can easily make a mistake with. Plus, changing the path structure is a bit harder because you need to move all components around as well.

I'm talking about something like this:

path: (?P<root>apps)/(?P<app>.*)/deployment/overlays/(?P<env>.*)

svmaris avatar Feb 23 '22 08:02 svmaris

regex-based matching instead of globbing

@svmaris I'm also not familiar with the code, but from a design perspective I believe that using regex instead of globbing would be a breaking change. If you use another field like pathRegexp, as you previously mentioned, it might work.

Some questions/concerns I see are:

  1. What does pathRegexp actually do? Does it do the actual matching instead of path, or does it only do the parsing of the already matched path? I would lean towards the latter, so you get:

    path: apps/*/deployment/overlays/*
    pathRegexp: (?P<root>apps)/(?P<app>.*)/deployment/overlays/(?P<env>.*)
    
  2. I'm not sure if there are actual corner cases, but * glob and .* are different. The * glob's regexp equivalent would be something like [^/]*. My point is that regexes are also complicated and you can easily make mistakes with 😄.

bogatuadrian avatar Feb 23 '22 09:02 bogatuadrian

@bogatuadrian My initial thought that regexes would not replace globbing, but could be used in addition to globbing.

But while thinking about your first question/concern: In order to actually read the files from the repo, you'd probably need to do something like you suggest, by decoupling the path and the regexp. It would be really hard (if not impossible) to implement this using just the Regex, because you would need to look at every single file in the repository, which could be really bad for performance.

In that case, regex-based matching, while probably a bit more flexible, is definitely more complex and more error-prone than just using index numbers. So my vote goes to "User-controlled components" as you suggested in your earlier comment 👍

svmaris avatar Feb 23 '22 09:02 svmaris

@svmaris But if you use both path and pathRegexp, and pathRegexp is only used to extract the named groups from the already matched paths, then I think it should be fine 🤔. Do you see any problem with this approach?

bogatuadrian avatar Feb 23 '22 09:02 bogatuadrian

@bogatuadrian No I don't. I think that could be a good approach and it would offer a bit more flexibility than using index-based components. We do need to think about what happens when a path doesn't match the pathRegexp. Probably the pathRegexp should act as a filter as well by removing all paths that don't match, so we might want to find a better name for it (like filterRegexp or pathComponentRegexp or something?)

svmaris avatar Feb 23 '22 10:02 svmaris