valibot icon indicating copy to clipboard operation
valibot copied to clipboard

Add babel-plugin-transform-regex to devDependencies

Open slevithan opened this issue 1 year ago • 13 comments

Closes #704.

Per previous discussion with @fabian-hiller in #704, this adds package babel-plugin-transform-regex to devDependencies for improved DX.

  • Tests, lint, prettier, and build all pass.
  • Renamed library/src/regex.ts as library/src/regex.src.ts.
    • regex.ts is now the transpiled output.
  • In regex.src.ts, I updated only the first two regexes (BASE64_REGEX and BIC_REGEX) to use the regex tag with more readable syntax (using free spacing, subroutines, a subroutine definition group, and noncapturing (…)).
    • The emitted source in regex.ts for these two regexes is identical to before, which helps demonstrate that the Babel plugin is doing a great job with producing efficient and readable native regexes (since the regexes being replaced were hand-tuned regex literals). As a result, this PR's only change to regex.ts is an added header comment.
    • Other regexes and the rest of the code in regex.src.ts (apart from a hint comment at the top) is untouched.
    • I'll update the remaining regexes (many of which will benefit much more than these two, which are quite simple) after this PR lands.
  • Also added package regex to devDependencies so the types work in regex.src.ts.
    • The import { regex } from 'regex'; line in regex.src.ts is stripped from transpiled source by the Babel plugin (via its removeImport option which removes any import declarations with module name 'regex').
    • To make the types for regex work, I had to update library/tsconfig.json, changing "moduleResolution": "node" to "Bundler". Values "NodeNext" and "Node16" would also work for regex.src.ts, but those values conflict with other, current tsconfig.json settings.
      • This change doesn't seem to cause any issues elsewhere, but I'm not deeply knowledgeable about TS/CJS/bundling so it's worth examining.

slevithan avatar Aug 09 '24 03:08 slevithan

Thank you for creating this PR. Since I have less time at the moment and other things have a higher priority, I will try to get back to you next week.

fabian-hiller avatar Aug 10 '24 10:08 fabian-hiller

Is it not possible to do the transformation as part of our build step? I don't like that we need a regex.ts and regex.src.ts.

fabian-hiller avatar Aug 17 '24 14:08 fabian-hiller

No problem. I've updated the PR with a solution that generates a transpiled/regex.ts file to use for the build, and then deletes it afterward in a postbuild script (which I added the rimraf package to do).

tsup (and esbuild, which it uses) doesn't have a built-in way to alias local file import paths, so I created a tiny build plugin in tsup.config.ts to do this.


If you prefer to avoid the build plugin, an alternative to the current (new) approach could be:

  1. Copy (or rename) regex.ts to a temp file (lets say regex.orig.ts).
  2. Overwrite regex.ts using Babel.
  3. Run the build with tsup.
  4. Replace regex.ts with regex.orig.ts.

But my initial thought is that moving files around like that would not be preferable.

slevithan avatar Aug 17 '24 19:08 slevithan

Thanks for the update! I am pretty busy at the moment and mainly focused on getting our v1 RC and after that our v1 release ready. I will come back to this PR when I have more time again or this feature becomes more important.

fabian-hiller avatar Aug 18 '24 11:08 fabian-hiller

No problem, take your time. If you have a moment, though, I'm curious whether you have concerns with the current approach (or need time to think about it), or if you think the current PR is on the right track to pick back up from when you have more time.

slevithan avatar Aug 18 '24 18:08 slevithan

In general, I don't like to complicate the setup because I like things simple, but of course I see the benefits of this PR.

Is it possible to run our tests with the compiled output? For security reasons this seems to be necessary in the long run.

fabian-hiller avatar Aug 18 '24 22:08 fabian-hiller

In general, I don't like to complicate the setup because I like things simple, but of course I see the benefits of this PR.

I share your concern. This was actually the primary reason for my initial approach of having a separate "source" file, with regex.ts being the transpiled output. But I see the downside of that as well.

I've simplified things somewhat in the latest update to the PR, including getting rid of rimraf, and narrowing the aliasing plugin (which can always be generalized later if for whatever reason more paths need to be aliased).

Is it possible to run our tests with the compiled output? For security reasons this seems to be necessary in the long run.

I've updated the PR to do so. Vite/Rollup share the resolve.alias option, so this was easy to do. Their local aliasing support seems very robust (and allows regexes or functions, which I didn't use here).

slevithan avatar Aug 19 '24 01:08 slevithan

Thank you for the changes! This PR is now going in a good direction.

fabian-hiller avatar Aug 19 '24 09:08 fabian-hiller

Glad to hear it! No worries if it takes you some time to come back to this, but in the meantime, I've made a couple more tweaks and I've gone ahead and updated the PR to cover the first 12 of 24 regexes in regex.ts. Hopefully this shows more substantively what the end result will be like (I think it makes sense to still wait on rewriting the remaining regexes until after this PR lands). Of the first 12 regexes, a couple (CUID2_REGEX and DECIMAL_REGEX) were so simple that they didn't benefit from regex, so I left them alone. And EMOJI_REGEX was generated by an external library, so I also left it alone so it can be updated more easily.

I also included the following question/comment above IP_REGEX:

// Q: Would it be tree-shakeable if this used interpolation without adding any new variables, as
// ``new RegExp(`^(?:${IPV4_REGEX.source.slice(1, -1)}|${IPV6_REGEX.source.slice(1, -1)})$`, 'iu')``?

If the answer is yes, that would remove a lot of redundancy, since this is the biggest regex in the file. That redundancy is much easier to verify now that I've rewritten the regexes in a readable way.

Important: I've been very careful to ensure that none of the refactored regexes changed what they match (existing bugs in IPV6_REGEX and IP_REGEX have been preserved). I've also avoided adding any capturing groups, since that would change the shape of match results.

slevithan avatar Aug 19 '24 15:08 slevithan

I've rebased on top of recent conflicting Valibot configuration changes, and additionally bumped regex to v4.1.3. The latest version of regex avoids the need to update Valibot's library/tsconfig.json to use "moduleResolution": "Bundler". So I've undone that change.

slevithan avatar Aug 24 '24 17:08 slevithan

Congrats on the v1 launch. 😊 Would it make sense for me to pick this back up now that v1.0.0 has been out for a while?

slevithan avatar Mar 31 '25 17:03 slevithan

Hey @slevithan 👋 I'm very sorry for my late reply, but I don't have much time at the moment. So I have to be really careful with prioritizing things. Right now I should probably focus on adding new actions, improving our docs, and educating people on why Valibot is better than alternative libraries in various use cases. I think I would keep our regexes as they are and maybe test our library if we add more complex regexes at some point in the future.

fabian-hiller avatar Apr 20 '25 00:04 fabian-hiller

No worries, mate. Should we close this PR?

slevithan avatar Apr 20 '25 00:04 slevithan