Functional rewrite

Open jehna opened this issue 5 years ago • 9 comments

I've been thinking about re-writing JSVerbalExpressions to use function composition rather than the builder-like pattern it has now.

The README.md currently shows a simple example of using VerbalExpressions:

const tester = VerEx()
    .startOfLine()
    .then('http')
    .maybe('s')
    .then('://')
    .maybe('www.')
    .anythingBut(' ')
    .endOfLine();

This can be described as a builder-like extension for the native RegExp object; you can chain the expression and add more stuff to "build" a complete regular expression.

This is a very clear approach for building simple, "one-dimensional" regular expressions. The problems with the current implementation start to surface when we do more complicated things: capture groups, lookaheads/lookbehinds, the "or" pipe and so on quickly make the expression hard to read and maintain.

For example, I think something like this is impossible to implement with VerbalExpressions at the moment:

/^((?:https?:\/\/)?|(?:ftp:\/\/)|(?:smtp:\/\/))([^ /]+)$/
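For reference, the target expression can be run directly as a native RegExp to see what lands in each capture group. The sample inputs below are illustrative and are not from the original thread.

```javascript
// The target expression from above, used as a plain RegExp literal.
const target = /^((?:https?:\/\/)?|(?:ftp:\/\/)|(?:smtp:\/\/))([^ /]+)$/;

// Group 1 captures the (optional) protocol, group 2 the rest up to the end.
const https = target.exec("https://example.com");
console.log(https[1], https[2]); // "https://" "example.com"

const ftp = target.exec("ftp://files.example.org");
console.log(ftp[1], ftp[2]); // "ftp://" "files.example.org"

// With no protocol at all, group 1 is the empty string.
const bare = target.exec("example.com");
console.log(JSON.stringify(bare[1]), bare[2]); // "" "example.com"

// A slash or a space after the protocol makes the match fail.
const withPath = target.exec("http://example.com/path");
console.log(withPath); // null
```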

To make it simpler, I'm proposing a 2.0 rewrite of VerbalExpressions that would take a functional approach, something like:

VerEx(
  startOfLine,
  "http",
  maybe("s"),
  "://",
  maybe("www."),
  anythingBut(" "),
  endOfLine
)

Motivation for this approach would be:

  • We can split regular expressions into multiple variables
    • Giving "sub-expressions" names makes their intent clearer and allows different abstraction levels within a regular expression
    • Each small part is testable with unit tests
  • Makes grouping explicit (enforce closing an opened capture group)

So the simplest example could be something like this:

const regex = VerEx(
  startOfLine,
  "http",
  maybe("s"),
  "://",
  maybe("www."),
  anythingBut(" "),
  endOfLine
);
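To make the idea concrete, here is a minimal sketch of how such combinators could work. It is not the library's actual implementation: the helpers raw, escapeLiteral and toSource are hypothetical names, and the sketch assumes that plain strings are matched literally and that anythingBut means "zero or more characters not in the given set".

```javascript
// Hypothetical sketch, not the real VerbalExpressions 2.0 API.
// Every building block is a small object carrying regex source text.
const raw = (source) => ({ source });

// Escape characters that have a special meaning in a regular expression.
const escapeLiteral = (text) => text.replace(/[.*+?^${}()|[\]\\-]/g, "\\$&");

// Plain strings are treated as literals; everything else is a building block.
const toSource = (part) =>
  typeof part === "string" ? escapeLiteral(part) : part.source;

const startOfLine = raw("^");
const endOfLine = raw("$");
const maybe = (part) => raw(`(?:${toSource(part)})?`);
// Assumption: zero or more characters outside the given set.
const anythingBut = (chars) => raw(`[^${escapeLiteral(chars)}]*`);

// VerEx joins all parts in order and compiles them into a native RegExp.
const VerEx = (...parts) => new RegExp(parts.map(toSource).join(""));

const regex = VerEx(
  startOfLine,
  "http",
  maybe("s"),
  "://",
  maybe("www."),
  anythingBut(" "),
  endOfLine
);

console.log(regex.test("https://www.google.com")); // true
console.log(regex.test("https://www.goo gle.com")); // false
```

Because every part is just a value, the result of VerEx is an ordinary RegExp and needs no wrapper class.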

And the complex example could be written e.g. like this:

VerEx(
  startOfLine,
  group(
    or(
      concat("http", maybe("s"), "://", maybe("www.")),
      "ftp://",
      "smtp://"
    )
  ),
  group(anythingBut(" /"))
);

While this looks a bit more complex, we can more easily split it up and name things:

const protocol = or(concat("http", maybe("s"), "://"), "ftp://", "smtp://");
const removeWww = maybe("www.");
const domain = anythingBut(" /");
const regex = VerEx(startOfLine, group(protocol), removeWww, group(domain));

This way we could test all of those "sub-expressions" (variables) in isolation.
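A rough sketch of what testing in isolation might look like follows. All helper implementations here are assumptions made for illustration (raw, escapeLiteral and toSource are hypothetical, and or is assumed to wrap its options in a non-capturing group); only the three named sub-expressions and the final VerEx call come from the example above.

```javascript
// Hypothetical combinator implementations, not the library's real ones.
const raw = (source) => ({ source });
const escapeLiteral = (text) => text.replace(/[.*+?^${}()|[\]\\-]/g, "\\$&");
const toSource = (part) =>
  typeof part === "string" ? escapeLiteral(part) : part.source;

const concat = (...parts) => raw(parts.map(toSource).join(""));
// Assumption: alternatives are wrapped in a non-capturing group.
const or = (...options) => raw(`(?:${options.map(toSource).join("|")})`);
const maybe = (part) => raw(`(?:${toSource(part)})?`);
// group always emits both parentheses, so it cannot be left unclosed.
const group = (...parts) => raw(`(${concat(...parts).source})`);
const anythingBut = (chars) => raw(`[^${escapeLiteral(chars)}]*`);
const startOfLine = raw("^");
const endOfLine = raw("$");
const VerEx = (...parts) => new RegExp(concat(...parts).source);

// The named sub-expressions from the example above.
const protocol = or(concat("http", maybe("s"), "://"), "ftp://", "smtp://");
const removeWww = maybe("www.");
const domain = anythingBut(" /");

// Each sub-expression can be compiled and checked on its own.
const protocolOnly = VerEx(startOfLine, protocol, endOfLine);
console.log(protocolOnly.test("https://")); // true
console.log(protocolOnly.test("gopher://")); // false

// The composed expression exposes protocol and domain as capture groups.
const regex = VerEx(startOfLine, group(protocol), removeWww, group(domain));
const match = regex.exec("https://www.example.com/path");
console.log(match[1], match[2]); // "https://" "example.com"
```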

jehna avatar Jun 27 '19 08:06 jehna

Some examples where compositional/functional patterns have been used:

jehna avatar Jun 27 '19 09:06 jehna

Huh. Interesting.

shreyasminocha avatar Jun 27 '19 09:06 shreyasminocha

So for something like:

VerEx(
  startOfLine,
  "http",
  maybe("s"),
  "://",
  maybe("www."),
  anythingBut(" "),
  endOfLine
)

… would the import statement look like one of the following:

import { VerEx, startOfLine, maybe, anythingBut, endOfLine } from 'verbal-expressions';
import * from 'verbal-expressions';

A bit concerned about global scope pollution…

shreyasminocha avatar Jun 27 '19 09:06 shreyasminocha

ES module/TypeScript imports would look like this:

import { VerEx, startOfLine, maybe, anythingBut, endOfLine } from 'verbal-expressions'

With Node.js require, you can use:

const { VerEx, startOfLine, maybe, anythingBut, endOfLine } = require('verbal-expressions')

If we still want to support global browser scripts, a common practice with libraries of this kind (e.g. Ramda, lodash) is to use a short single-character namespace. We could namespace with V or ve. In that case you would use the library as:

V.VerEx(
  V.startOfLine,
  "http",
  V.maybe("s"),
  "://",
  V.maybe("www."),
  V.anythingBut(" "),
  V.endOfLine
)
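As a sketch of how a browser build could keep the global scope clean, all functions can live on one frozen object so that only a single global name is added. The combinator bodies below are hypothetical placeholders, and using globalThis for the assignment is an assumption about how the bundle would be produced.

```javascript
// Hypothetical minimal combinators; only what this example needs.
const raw = (source) => ({ source });
const escapeLiteral = (text) => text.replace(/[.*+?^${}()|[\]\\-]/g, "\\$&");
const toSource = (part) =>
  typeof part === "string" ? escapeLiteral(part) : part.source;

// Everything the library exports is collected on one frozen object.
const V = Object.freeze({
  startOfLine: raw("^"),
  endOfLine: raw("$"),
  maybe: (part) => raw(`(?:${toSource(part)})?`),
  anythingBut: (chars) => raw(`[^${escapeLiteral(chars)}]*`),
  VerEx: (...parts) => new RegExp(parts.map(toSource).join("")),
});

// In a browser bundle this would be the only global the library adds.
globalThis.V = V;

const tester = V.VerEx(
  V.startOfLine,
  "http",
  V.maybe("s"),
  "://",
  V.maybe("www."),
  V.anythingBut(" "),
  V.endOfLine
);

console.log(tester.test("http://www.example.com")); // true
```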

jehna avatar Jun 27 '19 14:06 jehna

Sounds good.

I'd like to help out with this. How do we work this out?

shreyasminocha avatar Jun 27 '19 16:06 shreyasminocha

I can create a POC draft pull request to show a couple of ideas, and we can iterate from that. Does that sound good?

jehna avatar Jun 28 '19 12:06 jehna

Sure.

shreyasminocha avatar Jun 28 '19 15:06 shreyasminocha

@jehna How about I create a 2.0.0 branch and write some failing tests while you build your proof of concept?

shreyasminocha avatar Jun 29 '19 08:06 shreyasminocha

Ok, so I did some work that I'd like to show you: #197

jehna avatar Jul 20 '19 20:07 jehna