sed, grep, awk

Open tjt263 opened this issue 4 years ago • 19 comments

Flavor Request


No, they're not essential, and their syntaxes are similar to php, perl, python, etc. Then again, nothing about Regex101 is strictly essential (i.e., it's not a C compiler or something of that nature); but that's not why we use it. We use it because we like to use it. It's nice, it's useful, it's convenient, and it makes our task easier. These would be welcome additions. Cheers.

tjt263 avatar Nov 02 '19 05:11 tjt263

See also #196 #165 #272

Doqnach avatar Nov 04 '19 11:11 Doqnach

+1 It would be awesome to be able to test those with this tool. Noobs like me can literally spend hours dealing with sed-based substitutions, the specific escaping they require, etc.

zar3bski avatar May 18 '20 14:05 zar3bski

Found this while trying to convert a JavaScript RegExp to a grep pattern. This would be useful for viewing a JavaScript RegExp in other languages' formats, if that is possible.

guest271314 avatar Nov 23 '20 02:11 guest271314

@guest271314 The code generator does little more than placing a written regex into a code template. It does no conversions for you. It is not guaranteed that the written regex will be correct for the specific language you generate the code for. The code generator is mostly there to give a starting point, where it is up to you to make sure the regex you wrote is correct for the chosen language.

Doqnach avatar Nov 23 '20 14:11 Doqnach

+1 to this please, I'm having a hell of a time debugging a sed --regexp-extended pattern, and the same pattern is fine in the provided languages.

xenoterracide avatar Feb 17 '21 17:02 xenoterracide

I am currently writing an ERE regex engine, which will be easy to convert to JS. The 'only' things left to do are alternatives and groups. I hope to have the time and motivation to work on it more and convert it so it can be used on the website. Alternatives are not too hard to do, but groups require a bit more work.

Ouims avatar Feb 17 '21 18:02 Ouims

+1 awk [macOS awk version 20070501 at present]

flatroof avatar Mar 22 '21 23:03 flatroof

+1 sed +1 grep

A great addition here...

Stephan972 avatar Apr 14 '21 08:04 Stephan972

Generating a correctly escaped command line from it would definitely be a bonus… it's quite a hassle to count backslashes in a regex.
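
As a rough illustration of what "generating a correctly escaped command line" could look like, here is a minimal Python sketch that leaves the shell quoting to the standard library's shlex module. The helper name build_command and the file name paper.tex are made up for this example; this is not something Regex101 provides, and it only covers POSIX shells.

```python
import shlex


def build_command(tool, flags, pattern, *files):
    """Return a command line that is safe to paste into a POSIX shell.

    shlex.join quotes each argument, so the backslashes in the pattern
    reach the tool unchanged instead of being eaten by the shell.
    """
    return shlex.join([tool, *flags, pattern, *files])


# An ERE that matches LaTeX \begin{...} / \end{...} lines.
pattern = r"^[[:space:]]*\\(begin|end)\{[a-z]+\}"
command = build_command("grep", ["-E"], pattern, "paper.tex")
print(command)
```

The point is that the regex is written once, with only the backslashes the regex flavor itself needs, and the shell layer is added mechanically afterwards.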

mpldr avatar May 26 '21 10:05 mpldr

Would be extremely helpful to have these added :)

Codex- avatar Jul 08 '21 23:07 Codex-

@Codex- can you give me an example of a regex where you need to count backslashes? I use grep and sed and I don't particularly remember having to fight the utilities, but I do have to fight bash's escaping needs.

working-name avatar Jul 09 '21 00:07 working-name

Not for counting backslashes specifically, but more for validating that the tokens are supported, e.g. the standard implementation of grep doesn't support \d (or perlre in general).
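
A minimal sketch of that kind of token validation, in Python. It flags a few Perl-style constructs that POSIX BRE/ERE do not define (\d and \D, (?...) groups, lazy quantifiers). The function name find_non_posix_tokens is hypothetical, the list of checks is far from exhaustive, and bracket expressions are skipped with a simplified rule (the first ] closes them).

```python
import re


def find_non_posix_tokens(pattern: str) -> list[str]:
    """List Perl-style constructs in `pattern` that POSIX regexes lack."""
    # Split into escape pairs (backslash + one char) and single characters,
    # so that an escaped backslash followed by "d" is not mistaken for \d.
    tokens = re.findall(r"\\.|.", pattern, flags=re.DOTALL)
    findings = []
    in_bracket = False
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if in_bracket:
            # Simplification: the first "]" ends the bracket expression.
            in_bracket = tok != "]"
            continue
        if tok == "[":
            in_bracket = True
        elif tok in (r"\d", r"\D"):
            findings.append(f"{tok}: Perl shorthand; consider [[:digit:]] or [^[:digit:]]")
        elif tok == "(" and nxt == "?":
            findings.append("(?: Perl group syntax such as (?:...) or lookaround")
        elif tok in ("*", "+", "?") and nxt == "?":
            findings.append(f"{tok}?: lazy quantifier")
    return findings


for finding in find_non_posix_tokens(r"(?:\d+)-.*?"):
    print(finding)
```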

Codex- avatar Jul 20 '21 21:07 Codex-

In addition to #165, #196, #272, #404: currently we go to https://sed.js.org/, which does the computation but has none of the other resources that r101 provides. It appears to be the only online tool that implements regexes as described in https://www.gnu.org/software/sed/manual/sed.html

pfdint avatar Jul 30 '21 05:07 pfdint

I can probably add some context here (in addition to an upvote for the feature request). From the grep man page:

grep understands three different versions of regular expression syntax: “basic” (BRE), “extended” (ERE) and “perl” (PCRE). In GNU grep there is no difference in available functionality between basic and extended syntaxes. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards. Perl-compatible regular expressions give additional functionality, and are documented in pcresyntax(3) and pcrepattern(3), but work only if PCRE is available in the system.

The rest of that page goes on to document BRE vs. ERE and is a great reference for anybody looking. In short, in GNU grep the only difference between the two is which of the special characters (?, +, (), {}, |) need a backslash to be treated as special: in BRE they do, in ERE they don't. Support is as follows:

  • awk: only supports ERE
  • sed: supports BRE (default) or ERE (with -E)
  • grep: supports BRE (default) or ERE (with -E). May support PCRE (with -P), but only if grep was built with PCRE, so it can't be relied on everywhere

BRE (default) and ERE (activated with the -E flag) are the two syntaxes that are always available in grep, and they are also the two that sed uses. -P for PCRE exists but can't be counted on, and it isn't available with sed or awk anyway.
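
To make the escaping difference concrete, here is a rough Python sketch that rewrites an ERE as a GNU-style BRE by flipping the backslash on the special characters listed above. The names ere_to_gnu_bre and _bracket_end are made up for illustration. Note that \?, \+ and \| in a BRE are GNU extensions, so the output is only meant for GNU grep/sed, and corner cases such as anchors or a leading * are ignored.

```python
ERE_SPECIALS = set("?+(){}|")


def _bracket_end(pattern: str, start: int) -> int:
    """Return the index just past the bracket expression starting at `start`."""
    i = start + 1
    if i < len(pattern) and pattern[i] == "^":
        i += 1
    if i < len(pattern) and pattern[i] == "]":
        i += 1  # a leading "]" is a literal member of the set
    while i < len(pattern):
        if pattern[i] == "[" and i + 1 < len(pattern) and pattern[i + 1] in ":.=":
            # Skip a nested [:class:], [.coll.] or [=equiv=] element.
            close = pattern.find(pattern[i + 1] + "]", i + 2)
            i = close + 2 if close != -1 else len(pattern)
        elif pattern[i] == "]":
            return i + 1
        else:
            i += 1
    return len(pattern)


def ere_to_gnu_bre(pattern: str) -> str:
    """Convert an ERE to a GNU BRE by toggling backslashes on ? + ( ) { } |."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Escaped special in ERE is a literal; in BRE the bare char is.
            out.append(nxt if nxt in ERE_SPECIALS else ch + nxt)
            i += 2
        elif ch == "[":
            end = _bracket_end(pattern, i)
            out.append(pattern[i:end])  # bracket contents are left untouched
            i = end
        elif ch in ERE_SPECIALS:
            out.append("\\" + ch)  # bare special in ERE needs a backslash in BRE
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(ere_to_gnu_bre("(foo|bar)+"))  # \(foo\|bar\)\+
```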

So, long story short, adding BRE and ERE support to the best string-related resource out there would be amazing!

tgross35 avatar May 12 '22 13:05 tgross35

Close this as a duplicate of: #165

cheako avatar May 12 '22 18:05 cheako

Not a duplicate. As explained above, that's for extended (sed -E), not basic (plain sed):

https://en.wikibooks.org/wiki/Regular_Expressions/POSIX_Basic_Regular_Expressions

kj avatar Oct 28 '22 06:10 kj

I don't think having two separate issues adds anything to the conversation, one should be sufficient.

cheako avatar Oct 28 '22 17:10 cheako

I agree, but if one is closed, then the other should be clear that it's about both BRE and ERE.

kj avatar Oct 29 '22 03:10 kj

To add a little to what @tgross35 said above: supporting sed, grep, awk, etc. directly would mean handling program-specific regular expression rules, which may also differ between implementations (e.g. GNU/Linux vs. macOS). That seems very useful, but also potentially hard to maintain and/or confusing. So just supporting POSIX Basic and Extended regular expressions would probably be the best first step. (Relevant excerpts from man pages on a random Pi I was already connected to:)

       s/regexp/replacement/
              Attempt to match regexp against the pattern space. If successful, replace that
              portion matched with replacement. The replacement may contain the special
              character & to refer to that portion of the pattern space which matched, and the
              special escapes \1 through \9 to refer to the corresponding matching
              sub-expressions in the regexp.

 Manual page sed(1) line 157 (press h for help or q to quit)
REGULAR EXPRESSIONS
       POSIX.2 BREs should be supported, but they aren't completely because of performance
       problems. The \n sequence in a regular expression matches the newline character, and
       similarly for \a, \t, and other sequences. The -E option switches to using extended
       regular expressions instead; it has been supported for years by GNU sed, and is now
       included in POSIX.

 Manual page sed(1) line 236 (press h for help or q to quit)
REGULAR EXPRESSIONS
       [...] In GNU grep there is no difference in available functionality between basic and
       extended syntaxes. In other implementations, basic regular expressions are less
       powerful. [...]

 Manual page grep(1) line 280 (press h for help or q to quit)
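
For anyone who hasn't used the s/regexp/replacement/ command quoted above, here is a small Python sketch of how the replacement side behaves (& for the whole match, \1 through \9 for sub-expressions). It is only an approximation: the function name sed_substitute is invented for this example, the regexp is in Python's syntax rather than BRE/ERE, and like a plain s command without the g flag it only replaces the first match.

```python
import re


def sed_substitute(regexp: str, replacement: str, text: str) -> str:
    """Approximate sed's s/regexp/replacement/ on a single line of text."""

    def expand(match: re.Match) -> str:
        out = []
        i = 0
        while i < len(replacement):
            ch = replacement[i]
            if ch == "\\" and i + 1 < len(replacement):
                nxt = replacement[i + 1]
                if nxt in "123456789":
                    # \1..\9 refer to the matching sub-expressions.
                    out.append(match.group(int(nxt)) or "")
                else:
                    # Any other escaped character (e.g. \&) is taken literally.
                    out.append(nxt)
                i += 2
            elif ch == "&":
                out.append(match.group(0))  # & is the whole matched portion
                i += 1
            else:
                out.append(ch)
                i += 1
        return "".join(out)

    # count=1: without the g flag, only the first match is replaced.
    return re.sub(regexp, expand, text, count=1)


print(sed_substitute(r"(\w+)@(\w+)", r"[&] \2 at \1", "bob@example ok"))
```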

UrsineRaven avatar May 17 '23 20:05 UrsineRaven