babel-loader icon indicating copy to clipboard operation
babel-loader copied to clipboard

[Bugfix] Ensure stability of filename cache-keys

Open stefanpenner opened this issue 3 years ago • 18 comments

[Bugfix] Ensure stability of filename cache-keys

JSON.stringify(structure) isn’t inherently stable as it relies on various internal details of how structure was created.

As written, if a given babel configuration is create in an dynamic manner, it is possible for babel-loader to have spurious cache misses.

To address this, we can use one of the many stable stringify alternatives.

For this PR I have selected fast-json-stable-stringify for that task, as it appears both popular and it’s benchmarks look promising.

This PR does not explicitly include tests, as testing this is both tricky to test in this context, and the important tests are contained within fast-json-stable-stringify itself.

Please Read the CONTRIBUTING Guidelines In particular the portion on Commit Message Formatting

Please check if the PR fulfills these requirements

  • [x] Tests for the changes have been added (for bug fixes / features)
  • [x] Docs have been added / updated (for bug fixes / features)

What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

  • [x] Bugfix
  • [ ] Feature
  • [ ] Code style update (formatting, local variables)
  • [x] Refactoring (no functional changes, no api changes)
  • [ ] Build related changes
  • [ ] CI related changes
  • [ ] Other... Please describe:

What is the current behavior? (You can also link to an open issue here)

cache keys are unstable, causing occasional cache misses

What is the new behavior?

cache keys should be stable

Does this PR introduce a breaking change?

  • [ ] Yes
  • [x] No
  • Impact: improved cache hit frequency
  • Migration path for existing applications: none
  • Github Issue(s) this is regarding: none

stefanpenner avatar Jun 09 '21 23:06 stefanpenner

Thanks! Is it possible that the CI failure is related to this change?

nicolo-ribaudo avatar Jun 09 '21 23:06 nicolo-ribaudo

@nicolo-ribaudo failure is occurring locally too, I will investigate & address (most likely later tonight, or tomorrow).

stefanpenner avatar Jun 10 '21 00:06 stefanpenner

@nicolo-ribaudo quick-update I'll do some additional debugging this afternoon, but my initial debugging pass suggests this test failure may actually be an example of an erroneous cache miss now corrected with this change. I'm not yet 100% convinced, but intend to continue investigation this afternoon.

stefanpenner avatar Jun 10 '21 16:06 stefanpenner

@nicolo-ribaudo I believe in-fact the test failures where an example of an unstable cache key.

stefanpenner avatar Jun 10 '21 19:06 stefanpenner

@hzoo / @nicolo-ribaudo what are the next steps? Anything I can help with?

stefanpenner avatar Jun 11 '21 16:06 stefanpenner

fast-json-stable-stringify is a new dependency. I am auditing the source of fast-json-stable-stringify. The serialization is applied on every compile file so performance is crucial here.

I clone the repo and here the benchmark result. It seems that fast-stable-stringify is fastest.

fast-json-stable-stringify x 21,781 ops/sec ±1.09% (87 runs sampled)
json-stable-stringify x 16,813 ops/sec ±0.92% (92 runs sampled)
fast-stable-stringify x 25,183 ops/sec ±0.74% (93 runs sampled)
faster-stable-stringify x 20,960 ops/sec ±1.21% (93 runs sampled)

JLHwung avatar Jun 11 '21 17:06 JLHwung

@JLHwung I can switch to that 1, give me a sec.

stefanpenner avatar Jun 11 '21 17:06 stefanpenner

fast-stable-stringify seems not actively maintained: https://github.com/nickyout/fast-stable-stringify I lean to host our own fast-stable-stringify for babel libraries only.

JLHwung avatar Jun 11 '21 17:06 JLHwung

There haven't been new commits, but there are no reported bugs. It could be "maintianed, but working".

We should only self-host it if we find a bug that we need to fix, and the maintainer doesn't accept PRs.

nicolo-ribaudo avatar Jun 11 '21 17:06 nicolo-ribaudo

I agree with @nicolo-ribaudo's POV.

stefanpenner avatar Jun 11 '21 17:06 stefanpenner

Designing general stable JSON.stringify is more difficult than designing specific stable serializer for Babel options. Yes I use the term serializer because we don't have to generate a valid JSON. After all we don't read the serialized results and parse them back. We want to make sure for most different options, the hash is different. So we can perform better than most stable JSON stringify. For example,

  1. We can stringify any object key v with "\"" + v + "\"".
  2. If value is a string, we use it without escaping " and \, they are required by JSON but we don't have to output json.

Combining these two tricks I see 150% improvement (named as fast-stable-stringify-2) on the benchmark and the new serializer is now 20% slower than the native JSON.stringify:

fast-json-stable-stringify x 22,088 ops/sec ±1.16% (87 runs sampled)
fast-stable-stringify x 23,703 ops/sec ±1.59% (85 runs sampled)
fast-stable-stringify-2 x 61,288 ops/sec ±1.91% (83 runs sampled)
Native JSON.stringify x 69,951 ops/sec ±1.15% (93 runs sampled)
The fastest is Native JSON.stringify

The serializer could be faster than the native JSON.stringify because we don't scan strings. We just make sure every object can map to a string.

I have pushed the branch in https://github.com/JLHwung/fast-stable-stringify/tree/babel

JLHwung avatar Jun 11 '21 18:06 JLHwung

@JLHwung additional performance here sounds great.

stefanpenner avatar Jun 11 '21 20:06 stefanpenner

@JLHwung et.al what are the next steps?

stefanpenner avatar Jun 14 '21 15:06 stefanpenner

My personal opinion here: Since we don't have to use JSON stringify, I think we can host our own property stable config serializer. Now that the cache key is updated and the old cache should probably be invalidated, I think we should inform users on the next release note that we recommend running rm -rf ./node_modules/.cache/babel-loader before upgrading babel-loader.

JLHwung avatar Jun 14 '21 18:06 JLHwung

@JLHwung let me know when your variant of stringification is published, and I can include it here...

stefanpenner avatar Jun 14 '21 21:06 stefanpenner

@stefanpenner We don't have to publish a new package, since it is not JSON and meant to be used in babel-loader only, we can inline it in this repo.

JLHwung avatar Jun 14 '21 21:06 JLHwung

I'm down with in-lining our changes as in https://github.com/JLHwung/fast-stable-stringify/commit/f5b0ca9deae71b2634890ce01423baa41cbcd38b in the repo

hzoo avatar Jun 15 '21 00:06 hzoo

ok

stefanpenner avatar Jun 15 '21 23:06 stefanpenner