
Fix usage of regex functions in pipelines

Open phemmer opened this issue 7 years ago • 9 comments

Most of the regex functions are unfortunately badly broken when it comes to using them in pipelines. By broken I mean that the functions technically work, but I don't think their usage would be intuitive to anyone, and they are difficult to use.

For example, to use the regexReplaceAll function, you cannot do this:

"foo subject string" | regexReplaceAll "foo" "bar"

You instead have to do this:

"bar" | regexReplaceAll "foo" "foo subject string"

This means that if you are operating in a pipeline, you have to put your pipeline as the 2nd argument to the function, or store it in a variable first, which can be really messy and hard to maintain.

Golang templates pass the result of the previous item in the pipeline as the last argument to the function being called. Thus the arguments of all the regex functions need to be swapped around.
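To make the last-argument behavior concrete, here is a minimal sketch using only Go's standard text/template and regexp packages. The regexReplaceAll below is a local stand-in that assumes the (regex, subject, replacement) order described in this issue, not Sprig's actual code, and render is a hypothetical helper added for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"text/template"
)

// regexReplaceAll is a local stand-in (an assumption, not Sprig's source)
// that uses the argument order described above: regex, subject, replacement.
func regexReplaceAll(regex, s, repl string) string {
	return regexp.MustCompile(regex).ReplaceAllString(s, repl)
}

// render is a hypothetical helper that executes a template string with the
// stand-in function registered.
func render(tmpl string) (string, error) {
	t, err := template.New("demo").Funcs(template.FuncMap{
		"regexReplaceAll": regexReplaceAll,
	}).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, nil); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// The piped value fills the last parameter (the replacement), so the
	// subject has to be written inline as the second argument.
	out, err := render(`{{ "bar" | regexReplaceAll "foo" "foo subject string" }}`)
	fmt.Printf("%q %v\n", out, err)

	// The intuitive form pipes the subject, but it is received as the
	// replacement, and "bar" becomes the string being searched.
	out, err = render(`{{ "foo subject string" | regexReplaceAll "foo" "bar" }}`)
	fmt.Printf("%q %v\n", out, err)
}
```

Under this assumed ordering, the first call performs the replacement on the subject string, while the second searches "bar" for "foo", finds nothing, and returns "bar" unchanged.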

Unfortunately changing this is a breaking change, but I think this is a fairly major usage issue. And since these functions aren't even documented (ref #84), the impact might be minor.

phemmer avatar Mar 14 '18 12:03 phemmer

@phemmer Thanks for the comment and the concern for UX. As you point out, this is a breaking change, so we would need to deal with that in terms of semantic versioning if we accepted a change like this, even though the functions are not documented.

mattfarina avatar Mar 28 '18 19:03 mattfarina

@phemmer / @mattfarina -

I've read the documentation that Matt provided in the comments of #84

So, let's say I am reading the following value from JSON and want to convert

Hello @user I am a (string)

to

Hello_-user_I_am_a_-string-

I am looking to replace all non-alphanumeric matches with a hyphen. I've tried using "\\W+" with little success.

How would I do that in a pipeline using regexReplaceAll?

bitmvr avatar Jun 14 '18 17:06 bitmvr

I was just looking at this again trying to figure out why things ended up the way they did, what can people do, and how should we move forward.

Why did things end up this way? It's rather simple. The ordering of arguments mirrors the ordering of arguments in the underlying Go functions used.

What can people do today? The use of variables within the template seems like a possible path given the current ordering. It's not ideal and doesn't work directly with pipelines, but it can be used.

How should we move forward? There are two concerns that need to be balanced. First, there are existing users of the functions in their templates. We don't want to break them. We have found that even tiny breaks in functionality, even in a major version, bring about complaints and a fair amount of support work for us maintainers, who are already too busy. We are, in part, optimizing for time and for the users we already have. Not just users of the library's Go API but also of the template functions.
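As a rough sketch of the variable approach, assuming the (regex, subject, replacement) order discussed in this thread: store the pipeline result in a template variable, then pass that variable as the second argument. The regexReplaceAll and lower functions below are local stand-ins built on the Go standard library, and renderLabel is a hypothetical helper, not part of Sprig.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"text/template"
)

// regexReplaceAll is a local stand-in (an assumption, not Sprig's source)
// with the argument order discussed here: regex, subject, replacement.
func regexReplaceAll(regex, s, repl string) string {
	return regexp.MustCompile(regex).ReplaceAllString(s, repl)
}

// renderLabel is a hypothetical helper. The template first stores the
// pipeline result in $clean, then passes $clean as the subject argument.
func renderLabel(input string) (string, error) {
	const tmpl = `{{ $clean := . | lower }}{{ "_" | regexReplaceAll "[^a-z0-9._-]+" $clean }}`
	t, err := template.New("label").Funcs(template.FuncMap{
		"lower":           strings.ToLower, // stand-in for an earlier pipeline step
		"regexReplaceAll": regexReplaceAll,
	}).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, input); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderLabel("Hello World!")
	fmt.Printf("%q %v\n", out, err)
}
```

The variable keeps the earlier pipeline steps readable, at the cost of an extra template action per intermediate value.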

So, the path forward would be to have some new regex functions, with new names, that have a different ordering. This would be a feature request. PRs are welcome.

mattfarina avatar Sep 30 '19 16:09 mattfarina

I have a use case: cleaning up string values that may contain characters that are 'illegal' in a Kubernetes label. I was approaching it the only logical way (since there is no example in the Helm documentation, which is actually just a copy of the Sprig documentation, #84), and had begun tearing my hair out as no regex seemed to work. The top hit after a bit of searching for a usage example brought me to this bug. For it is a bug of the very worst kind: a design bug. I doubt anyone is using these functions for anything much at all; as presented, they are inconveniently unusable, even if one can figure them out. I simply want to use

{{- . | regexReplaceAllLiteral "[^A-Za-z0-9._-]+" "_" | quote -}}

but the one-liner would be very tricky to figure out without reading this bug report first (due to #84), and ugly without the proposed refactoring / new functions. For those seeking a solution similar to my usage example above, here is the ugly truth:

{{- "_" | regexReplaceAllLiteral "[^A-Za-z0-9._-]+" . | quote -}}

edrandall avatar Sep 08 '20 12:09 edrandall

I want to suggest that no one sane is using these functions in a pipeline, because the behavior is completely insane. So it's unlikely to break anything. But maybe that's not the case. :D

For a less break-stuff suggestion: introducing new pipeline-friendly functions like regexpReplaceAll (the "p" is for "pipeline"?) with the arguments in a more intuitive order (where the piped input is the input string) would likely work, and not be completely horrible.
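A minimal sketch of what such a function could look like, using only the Go standard library. The name regexpReplaceAll and its (regex, replacement, input) order are hypothetical, taken from the suggestion above rather than from any existing Sprig function, and renderPipe is an illustrative helper.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"text/template"
)

// regexpReplaceAll is a hypothetical pipeline-friendly variant: the input
// string comes last, so it can receive the piped value.
func regexpReplaceAll(regex, repl, s string) string {
	return regexp.MustCompile(regex).ReplaceAllString(s, repl)
}

// renderPipe is a hypothetical helper that executes a template string with
// the variant registered under the suggested name.
func renderPipe(tmpl string) (string, error) {
	t, err := template.New("pipe").Funcs(template.FuncMap{
		"regexpReplaceAll": regexpReplaceAll,
	}).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, nil); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// With the input last, the form from the original report works as written.
	out, err := renderPipe(`{{ "foo subject string" | regexpReplaceAll "foo" "bar" }}`)
	fmt.Printf("%q %v\n", out, err)
}
```

Because the existing function would keep its name and ordering, templates that already rely on it would be unaffected.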

dannysauer avatar Nov 04 '21 01:11 dannysauer

So what's the conclusion here? How do I get the hostname from a URL?

{{- define "hostname" -}}
{{- . | trimPrefix "http://" | trimPrefix "https://" | ???  "/.*" "" | trim | quote -}}
{{- end -}}

alexanderkjeldaas avatar Jun 27 '22 20:06 alexanderkjeldaas