browser-pack-flat

Is it possible to retain browserify's deduping functionality?

Open • bengourley opened this issue 5 years ago • 1 comment

Thanks for the great module!

I got caught out by this when using a monorepo setup:

https://github.com/goto-bus-stop/browser-pack-flat/blob/77cafd3224332ad09e153e5c556b2193eda07bba/index.js#L41-L50

Vanilla browserify was able to dedupe successfully: even though the paths were different, the content was the same. This code discards that information, so you end up with n copies of a dependency in the bundle, where n is the number of times it exists on your file system.
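To make the content-vs-path distinction concrete, here is a minimal sketch of content-based duplicate detection. It is not browserify's actual implementation: the `findDuplicates` helper, the row shape, and the file paths are all hypothetical, and a real bundler would also need to check that the duplicates resolve to equivalent dependencies.

```javascript
const crypto = require('crypto')

// Hypothetical helper: rows are shaped loosely like { file, source }.
function findDuplicates (rows) {
  const firstSeen = new Map() // content hash -> first file seen with that content
  const dupes = {}            // duplicate file -> file it duplicates
  for (const row of rows) {
    const hash = crypto.createHash('sha1').update(row.source).digest('hex')
    if (firstSeen.has(hash)) {
      dupes[row.file] = firstSeen.get(hash)
    } else {
      firstSeen.set(hash, row.file)
    }
  }
  return dupes
}

// Made-up monorepo layout: the same dependency installed under two packages.
const rows = [
  { file: '/repo/packages/a/node_modules/validators/index.js', source: 'module.exports = 1' },
  { file: '/repo/packages/b/node_modules/validators/index.js', source: 'module.exports = 1' },
  { file: '/repo/packages/a/index.js', source: 'module.exports = 2' }
]
const dupes = findDuplicates(rows)
```

Here the copy under `packages/b` is reported as a duplicate of the one under `packages/a`, even though the two paths differ.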

I'm assuming arguments[4] is the require() function from the module wrapper? Would it be possible to work out which module it is requiring, insert a placeholder for that module, and then reference it by its eventual variable name, e.g. _$validators_14?

Happy to figure out how to PR this but I want to check there wasn't a reason this was avoided in the first place. Cheers!

bengourley, Jan 24 '19 11:01

The main reason is that a deduped module appears in the browser-pack source only once, but it can be instantiated multiple times. The same deduped module can even be instantiated with different dependencies. arguments[4] is not the require() function: it is the object of module definitions that browser-pack passes to every module wrapper (function(require,module,exports){...}), and the deduped stub uses it to re-run the original module's wrapper function. For example, if two modules A and B both have an index.js containing:

module.exports = require('./lib/index.js')

That will be deduped by browserify. When evaluating a module, browser-pack uses an object mapping to turn require() strings into module IDs. In this case, the mapping for module A could be something like {"./lib/index.js":20}, while for module B it could be {"./lib/index.js":30}. That's where the difficulty is: browser-pack-flat uses the static string values in require() calls to figure out the order in which to evaluate modules, and places them in a single flat scope that executes from start to finish. Deduped modules are problematic because 1) the module is evaluated multiple times and 2) the object returned by require() calls differs between evaluations. The easiest way to address this is to just output the code twice, which guarantees that the require() calls can be statically evaluated and that the two modules actually generate two separate exported values.

Now, this is not impossible to solve in a more optimal way. Cyclical dependencies suffer from some of the same problems; the solution to that is described in the readme: https://github.com/goto-bus-stop/browser-pack-flat#what-exactly

For deduped modules, we could do something similar, by placing the deduped module into a function wrapper, and calling those function wrappers at the appropriate points in the bundle.

// contents of the deduped module:
var _evaluateDedupedModule1 = function (module, exports, requireMap) {
  // transformed require(A) → requireMap[A]
  module.exports = requireMap["./lib/index.js"]
}

// first evaluate everything require()d by the deduped module in one particular instance
var _moduleAIndex = /* whatever node_modules/A/lib/index.js does */

// then inject the right values
var _moduleA = { exports: {} }
_evaluateDedupedModule1(_moduleA, _moduleA.exports, {
  "./lib/index.js": _moduleAIndex
})
_moduleA = _moduleA.exports;

// then the same for another particular instance of the module
var _moduleBIndex = /* whatever node_modules/B/lib/index.js does */

var _moduleB = { exports: {} }
_evaluateDedupedModule1(_moduleB, _moduleB.exports, {
  "./lib/index.js": _moduleBIndex
})
_moduleB = _moduleB.exports;

Implementing that is not exactly trivial, though, lol. Because it's quite difficult and npm dedupes identical versions of modules now (when browserify's deduping was written, npm v1/v2 were the state of the art, and they would happily install hundreds of copies), it's not been very urgent.

goto-bus-stop, Jan 24 '19 12:01