actix-web

Add `unicode` feature to switch between `regex` and `regex-lite` crates as a trade-off between full unicode support and binary size

Open yujincheng08 opened this issue 6 months ago • 4 comments

PR Type

Feature

PR Checklist

  • [x] Tests for the changes have been added / updated.
  • [x] Documentation comments have been added / updated.
  • [x] A changelog entry has been made for the appropriate packages.
  • [x] Format code with the latest stable rustfmt.
  • [x] (Team) Label with affected crates and semver status.

Overview

Addresses #3290. This PR aims to remove the large regex-automata dependency in order to reduce binary size.

Before cargo bloat:

226.2KiB regex_automata

After cargo bloat:

29.6KiB regex_lite

yujincheng08 avatar Feb 15 '24 16:02 yujincheng08

So the benefit of regex over regex-lite is its Unicode support, so I think what I'd prefer is a feature flag called unicode (bikeshedding TBD), enabled by default, where disabling it falls back to regex-lite. Basically, flip this around to make it additive.

robjtede avatar Feb 15 '24 16:02 robjtede

I think there's also a performance gain when using regex over regex-lite. From the regex-lite docs:

The principal difference between the regex and regex-lite crates is that the latter prioritizes smaller binary sizes and shorter Rust compile times over performance and functionality.

yujincheng08 avatar Feb 15 '24 16:02 yujincheng08

@robjtede modified as requested.

yujincheng08 avatar Feb 15 '24 17:02 yujincheng08

prioritizes smaller binary sizes and shorter Rust compile times over performance

Another good reason to have this be an on-by-default feature since this will be sort of "infectious" as a transitive dep.

Thankfully, most third-party libs use default-features = false already.

robjtede avatar Feb 15 '24 17:02 robjtede

Hi @robjtede, I rebased this PR again. Is it ready to be merged?

yujincheng08 avatar Feb 19 '24 14:02 yujincheng08

I've made some changes to isolate the conditional Regex logic (for actix-router at least) in a separate module, which also allows us to keep using regex::RegexSet for better perf.

robjtede avatar Mar 03 '24 15:03 robjtede
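
For readers following along, here is a rough sketch of what "isolating the conditional Regex logic in a separate module" can look like. It is not the code from this PR. The module name `regex_impl`, the `backend` stand-ins, the `backend_name` helper, and `compile_route` are all hypothetical, and the stand-in type replaces the real `regex::Regex` / `regex_lite::Regex` re-exports so the example compiles without dependencies. It also assumes a Cargo feature named `unicode` that is on by default and pulls in the optional `regex` dependency.

```rust
// Sketch only: `backend` holds hypothetical stand-ins so the example compiles
// without external crates. In a real crate the two branches would instead
// re-export `regex::Regex` and `regex_lite::Regex`, with an assumed Cargo
// feature `unicode` (on by default) enabling the optional `regex` dependency.
mod regex_impl {
    // Selected when the `unicode` feature is enabled (stand-in for `regex`).
    #[cfg(feature = "unicode")]
    mod backend {
        pub const NAME: &str = "regex";
    }

    // Fallback when the feature is disabled (stand-in for `regex-lite`).
    #[cfg(not(feature = "unicode"))]
    mod backend {
        pub const NAME: &str = "regex-lite";
    }

    /// Minimal stand-in exposing the `new` / `as_str` shape both crates share.
    pub struct Regex {
        pattern: String,
    }

    impl Regex {
        pub fn new(pattern: &str) -> Result<Regex, String> {
            Ok(Regex {
                pattern: pattern.to_owned(),
            })
        }

        pub fn as_str(&self) -> &str {
            &self.pattern
        }

        /// Hypothetical helper (not part of either crate): reports which
        /// backend was selected at compile time.
        pub fn backend_name(&self) -> &'static str {
            backend::NAME
        }
    }
}

// The rest of the crate imports from one place and never mentions the flag.
use regex_impl::Regex;

/// Hypothetical caller: compiles a route pattern with whichever backend is active.
pub fn compile_route(pattern: &str) -> Result<Regex, String> {
    Regex::new(pattern)
}
```

The point of this layout is that only one module contains `#[cfg(feature = "unicode")]`, so callers such as `compile_route` stay identical whichever backend is compiled in.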