Unable to match certain webpack bundles
Describe the bug
Could not recognize a certain webpack bundle.
Details
Webpack runtime requirements in the sample:
- runtimeId = "__webpack_require__.j";
- onChunksLoaded = "__webpack_require__.O";
- nodeModuleDecorator = "__webpack_require__.nmd";
- ensureChunkHandlers = "__webpack_require__.f";
- hasOwnProperty = "__webpack_require__.o";
- ensureChunk = "__webpack_require__.e";
- exports = "__webpack_exports__";
- getChunkScriptFilename = "__webpack_require__.u";
- definePropertyGetters = "__webpack_require__.d";
- compatGetDefaultExport = "__webpack_require__.n";
- startup = "__webpack_require__.x";
- moduleFactories = "__webpack_require__.m";
All wrapped in an IIFE.
Expected Behaviour
__webpack_require__ and __webpack_modules__ should be inferred from usage, and unpacked correctly
Code
Logs
No response
Fixed: it now checks for a few more patterns, such as assignments to __webpack_require__.j, and no longer requires an entry id (__webpack_require__.s).
You can try it in the playground and the npm package will probably be updated in the next few days
Another one spotted:
exports.id = 885;
exports.ids = [885];
exports.modules = {.../* webpack 5 modules */};
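To make the shape above concrete, here is a minimal sketch of how such a chunk could be recognized once it has been loaded as a plain object. The helper name looksLikeCommonJsChunk is hypothetical and not part of webcrack; it only assumes the id/ids/modules layout shown in the snippet.

```javascript
// Hypothetical helper, not part of webcrack's API.
// Checks only the `ids` / `modules` layout shown in the snippet above.
function looksLikeCommonJsChunk(chunk) {
  if (chunk === null || typeof chunk !== "object") return false;

  // `ids` must be a non-empty array of chunk ids.
  const hasIds = Array.isArray(chunk.ids) && chunk.ids.length > 0;

  // `modules` must be an object whose values are all module factories.
  const hasModules =
    chunk.modules !== null &&
    typeof chunk.modules === "object" &&
    Object.values(chunk.modules).every((factory) => typeof factory === "function");

  return hasIds && hasModules;
}

const sampleChunk = {
  id: 885,
  ids: [885],
  modules: {
    1234: (module) => {
      module.exports = "hello";
    },
  },
};

console.log(looksLikeCommonJsChunk(sampleChunk)); // true
console.log(looksLikeCommonJsChunk({ id: 885 })); // false
```

An unpacker working on source code would perform the equivalent check on the AST (assignments to exports.ids and exports.modules) rather than on a live object.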
I wonder, is it possible for users to provide extra chunks in AST to support code splitting bundles?
@j4k0xb This is still unfinished, since entryId is not retrieved
which one is it? I haven't really looked at how multiple chunks work yet
> is it possible for users to provide extra chunks in AST to support code splitting bundles?
This is also something I would love to see, though it probably should be its own issue.
> I haven't really looked at how multiple chunks work yet
@j4k0xb Do you mean how they work in webpack/etc itself? Or just in implementing support for them in webcrack?
Not sure if this will be helpful, but this gist has some notes/references I captured while understanding more about it myself:
- https://gist.github.com/0xdevalias/8c621c5d09d780b1d321bfdb86d67cdd#reverse-engineering-webpack-apps
From a quick GitHub code search, these might be interesting too:
- https://github.com/webpack/webpack/blob/1ae20c0a0b7ea3b81566df6899543143eecbe1be/lib/Compilation.js#L4970-L5058
- https://github.com/webpack/webpack/blob/1ae20c0a0b7ea3b81566df6899543143eecbe1be/lib/buildChunkGraph.js
- https://github.com/vercel/next.js/blob/b3ad907d2bbe5f16988565ca6e99d434084bded0/packages/next/taskfile.js#L1753-L1772
> which one is it? I haven't really looked at how multiple chunks work yet
The matching logic is here:
(() => {
  var installedChunks = {
    '377': 1
  };
  __webpack_require__.O.require = chunkId => installedChunks[chunkId];
  __webpack_require__.f.require = (chunkId, promises) => {
    if (!installedChunks[chunkId]) {
      (chunk => {
        var moreModules = chunk.modules;
        var chunkIds = chunk.ids;
        var runtime = chunk.runtime;
        for (var moduleId in moreModules) {
          if (__webpack_require__.o(moreModules, moduleId)) {
            __webpack_require__.m[moduleId] = moreModules[moduleId];
          }
        }
        if (runtime) {
          runtime(__webpack_require__);
        }
        for (var i = 0; i < chunkIds.length; i++) {
          installedChunks[chunkIds[i]] = 1;
        }
        __webpack_require__.O();
      })(require(`./${__webpack_require__.u(chunkId)}`));
    }
  };
})();
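The runtime snippet above boils down to "copy the chunk's module factories into the shared registry, then mark its ids as installed". Below is a minimal sketch of that merge, assuming chunks are supplied in memory rather than loaded via require. The names moduleFactories, chunkFiles, installChunk and ensureChunk are illustrative stand-ins, not webpack or webcrack APIs.

```javascript
// Simplified stand-ins for the runtime pieces used in the snippet above.
const moduleFactories = {}; // plays the role of __webpack_require__.m
const installedChunks = { 377: 1 }; // the chunk that already contains the runtime

// Hypothetical in-memory replacement for require(`./${__webpack_require__.u(chunkId)}`).
const chunkFiles = {
  885: {
    id: 885,
    ids: [885],
    modules: {
      42: (module) => {
        module.exports = "from chunk 885";
      },
    },
  },
};

// Merge a chunk's module factories into the registry and mark its ids as installed.
function installChunk(chunk) {
  for (const moduleId of Object.keys(chunk.modules)) {
    moduleFactories[moduleId] = chunk.modules[moduleId];
  }
  for (const chunkId of chunk.ids) {
    installedChunks[chunkId] = 1;
  }
}

// Install a chunk only if it has not been installed yet.
function ensureChunk(chunkId) {
  if (!installedChunks[chunkId]) {
    installChunk(chunkFiles[chunkId]);
  }
}

ensureChunk(885);
ensureChunk(885); // second call is a no-op
console.log(Object.keys(moduleFactories));
console.log(installedChunks[885]);
```

If users could hand extra chunks to an unpacker, merging them into one module map along these lines might be enough to resolve cross-chunk module ids, though that is an assumption about a feature that does not exist yet.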
We may also parse nested webpack chunks: https://github.com/webpack/webpack/tree/main/lib/CompatibilityPlugin.js
Some unpacked samples for further inspection. Notice how some webpack runtime globals still got persisted (search for regex /require\./)
Should also match webpack ESM modules in SequenceExpression.
It already converts all kinds of sequences beforehand, in unminify.
The modules are also converted to esm (more or less) if they're inside of a bundle and it removes require.r, require.d.
What's still left is require.j and module = require.nmd(module) and Object.defineProperty(exports, "__esModule", { value: true });
Found another issue in that bundle (fars.ee/FrBW.js):
71017: module => {
  module.exports = require("path");
},
should not be transformed to:
module.exports = require( /*webcrack:missing*/"./path.js");
The code needs to be refactored to find references to the __webpack_require__ argument instead of blindly assuming that require(id) calls are the same. This should also result in better performance as it avoids traversing the whole AST.
https://github.com/j4k0xb/webcrack/blob/91809380d964128fc515be12e2474312af914eb2/packages/webcrack/src/unpack/webpack/bundle.ts#L26-L39
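To illustrate the idea of keying off the factory's own __webpack_require__ parameter, here is a deliberately crude sketch that reads the third parameter name from a factory's source text. getRequireParamName is a hypothetical helper; a real fix would presumably resolve scope bindings on the AST rather than use regular expressions, and this version does not handle destructured or default parameters.

```javascript
// Hypothetical helper, for illustration only.
// Returns the name of the factory's third parameter (the webpack require),
// or undefined when the factory does not declare one.
function getRequireParamName(factory) {
  const source = factory.toString();
  // Parameter list of `function (a, b, c)`, `(a, b, c) =>` or `a =>`.
  const match =
    source.match(/^(?:function\s*\w*\s*)?\(([^)]*)\)/) ||
    source.match(/^(\w+)\s*=>/);
  if (!match) return undefined;
  const params = match[1]
    .split(",")
    .map((param) => param.trim())
    .filter(Boolean);
  return params[2]; // order is (module, exports, require)
}

// A bundled module whose third parameter `n` is the webpack require.
const bundled = (module, exports, n) => {
  module.exports = n(123);
};

// The module from the bug report: `require` here is Node's require,
// because the factory never receives a webpack require argument.
const external = module => {
  module.exports = require("path");
};

console.log(getRequireParamName(bundled)); // "n"
console.log(getRequireParamName(external)); // undefined
```

With this information, only calls to the returned name would be treated as bundled module references, so require("path") in module 71017 would be left untouched.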
https://github.com/j4k0xb/webcrack/pull/50 / deploy-preview-50--webcrack.netlify.app contains a rewrite of the whole matching logic (very experimental)
- It will find anything with this format instead of looking for 'e', 'd', 'j', 'm', 'r', etc. properties, which should be more resilient:
  var __webpack_modules__ = { ... };
  // ...
  function __webpack_require__(moduleId) {
    // ...
    __webpack_modules__[moduleId](module, module.exports, __webpack_require__);
  }
- That require vs __webpack_require__ issue was fixed
- Many of the runtime globals are replaced
- And some refactoring that could allow managing multiple chunks in the future: https://github.com/j4k0xb/webcrack/blob/cd26c952db4d788feef10072482b08bfc7e51594/packages/webcrack/src/unpack/webpack/bundle.ts#L5-L7 https://github.com/j4k0xb/webcrack/blob/cd26c952db4d788feef10072482b08bfc7e51594/packages/webcrack/src/unpack/webpack/chunk.ts#L3-L6