UglifyJS
[PROPOSAL] mangle property
Feature request
Idea to mangle repetitive object properties.
Uglify version (uglifyjs -V)
3.4.9
Uglify script
Let's assume this default code to test compression:
var UglifyJS = require('uglify-js');
var code = ``; // see following examples
var result = UglifyJS.minify(code, {
toplevel: true,
mangle: {
properties: false,
toplevel: true
}
});
console.log(result.code);
Repetitive object properties access example
Let's assume a script that inserts 8 text nodes into the DOM or, more generally, a script that reads the same property name many times from the same or different objects.
var code = `
(function() {
var doc = new DocumentFragment();
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
document.body.appendChild(doc);
})();
`;
In this example, the property appendChild is called 9 times.
The output code is:
!function(){var e=new DocumentFragment;e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),document.body.appendChild(e)}();
The length is 303B.
As we can see, appendChild is repeated many times.
If we use properties: true, the output becomes:
!function(){var e=new DocumentFragment;e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),document.n.e(e)}();
The length is 210B, but the code becomes invalid: the DOM properties appendChild and body have been renamed to e and n.
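For comparison, the existing way to keep properties: true from renaming DOM names is the reserved list under mangle.properties. The fragment below is a minimal sketch that assumes appendChild and body are the only names that need protecting; the variable name minifyOptions is just for illustration. With both names reserved, nothing in this sample is left to mangle, so the size gain disappears, which is the gap this proposal tries to address.

```javascript
// Options for UglifyJS.minify(code, minifyOptions).
// Assumption: only these two DOM names appear in the code being minified;
// real code would need a much longer list.
const minifyOptions = {
  toplevel: true,
  mangle: {
    toplevel: true,
    properties: {
      // Property names listed here are never renamed.
      reserved: ['appendChild', 'body']
    }
  }
};
```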
Repetitive object properties access optimization
Let's rewrite the code in such a manner that an object property is not accessed with a dot but with a function instead: doc.appendChild becomes appendChild(doc).
var code = `
function appendChild(obj) {
return obj.appendChild.bind(obj);
}
(function() {
var doc = new DocumentFragment();
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(document.body)(doc);
})();
`;
The output is:
function e(e){return e.appendChild.bind(e)}var n;e(n=new DocumentFragment)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(document.body)(n);
The length is 250B and, more importantly, the code is still valid!
obj.property1.property2.property3.property4 could be written as property4(property3(property2(property1(obj)))), which could potentially compress to a(b(c(d(e)))).
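The chained form can be checked on a plain object. This is only a sketch: the four helper functions and the sample object are made up for illustration, and they cover read access only.

```javascript
// Hypothetical read-only accessor helpers, one per property name.
function property1(o) { return o.property1; }
function property2(o) { return o.property2; }
function property3(o) { return o.property3; }
function property4(o) { return o.property4; }

// Sample object, invented for this example.
const obj = { property1: { property2: { property3: { property4: 42 } } } };

const direct = obj.property1.property2.property3.property4;
const viaHelpers = property4(property3(property2(property1(obj))));

console.log(direct === viaHelpers); // prints true
```

Since every helper has to be defined once, each one only pays off if its property name is long enough and used often enough.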
Introducing this method into UglifyJS
The parser could detect repetitive property accesses and convert them to functions to enable stronger compression. I suggest something like: properties: true | false | 'none' | 'hard' | 'soft'.
Here false maps to 'none' and true maps to 'hard', both keeping their current behavior, while the new 'soft' value tries to convert properties to functions (only if the resulting size is smaller).
This technique could save a lot of bytes in classes using long property names.
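The "only if resulting size is smaller" check could be sketched roughly as follows. This is a hypothetical estimate, not anything UglifyJS implements: the function name estimateSavings is invented, it assumes the helper and its parameter both mangle to one character, it only models the call-style helper shown above, and it counts raw bytes without considering gzip.

```javascript
// Hypothetical raw-size estimate for a 'soft' mode (not part of UglifyJS).
// name:  the property name, e.g. 'appendChild'
// count: how many times that property is accessed
function estimateSavings(name, count) {
  // Direct access costs ".name" at every use.
  const directCost = count * (1 + name.length);
  // One-time helper: function e(e){return e.NAME.bind(e)}
  const helperCost = 'function e(e){return e..bind(e)}'.length + name.length;
  // Wrapped access costs "e(" plus ")" at every use.
  const wrappedCost = count * 3;
  return directCost - (helperCost + wrappedCost);
}

console.log(estimateSavings('appendChild', 9)); // prints 38
console.log(estimateSavings('appendChild', 4)); // prints -7
```

For the nine appendChild calls in the example, this estimate gives 38 bytes; the rest of the measured difference between 303B and 250B comes from other changes in the output, such as the dropped function wrapper. Under these assumptions appendChild needs at least five uses before the helper pays for itself.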
Performance
This is a simple performance test on Chrome 70:
function appendChild(obj) {
return obj.appendChild.bind(obj);
}
(function() {
var doc = new DocumentFragment();
console.time('perf');
for (let i = 0; i < 1e6; i++) {
// doc.appendChild(new Text(Math.random().toString())); // 1450ms
appendChild(doc)(new Text(Math.random().toString())); // 1823ms~1500ms
}
console.timeEnd('perf');
console.log(doc.firstChild.wholeText.length);
})();
As we can see, V8 keeps really good performance with this pattern.
PS: I present here a generic idea of how to optimize object properties with functions for compression. This method would probably require some adjustments according to the access context: call, set or get.
function appendChildCall(obj) {
return obj.appendChild.bind(obj);
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body)(new Text('a'));
// --- OR
function appendChildCall(obj) {
return obj.appendChild.apply(obj, Array.prototype.slice.call(arguments, 1));
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body, new Text('a'));
// ----
function appendChildGet(obj) {
return obj.appendChild;
}
// console.log(document.body.appendChild === document.documentElement.appendChild);
console.log(appendChildGet(document.body) === appendChildGet(document.documentElement));
// ----
function appendChildSet(obj, value) {
obj.appendChild = value;
}
// document.body.appendChild = function() { console.log('appendChild '); };
appendChildSet(document.body, function() { console.log('appendChild '); });
This sort of subexpression aliasing proposal comes up a lot. The gzip output is larger.
$ cat ex1.js
(function() {
var doc = new DocumentFragment();
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
doc.appendChild(new Text('a'));
document.body.appendChild(doc);
})();
$ cat ex1.js | terser --toplevel -mc | gzip | wc -c
91
$ cat ex2.js
function appendChild(obj) {
return obj.appendChild.bind(obj);
}
(function() {
var doc = new DocumentFragment();
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(doc)(new Text('a'));
appendChild(document.body)(doc);
})();
$ cat ex2.js | terser --toplevel -mc | gzip | wc -c
121
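The same comparison can be reproduced without a shell by using Node's built-in zlib module. This is a sketch that rebuilds the two UglifyJS outputs quoted earlier in this issue (303B and 250B) instead of running a minifier; the helper name sizes is made up, and the gzipped numbers will not exactly match the terser pipeline above, whose output differs slightly.

```javascript
const zlib = require('zlib');

// The two minified outputs quoted earlier in this issue, rebuilt from parts.
const direct =
  '!function(){var e=new DocumentFragment;' +
  'e.appendChild(new Text("a")),'.repeat(8) +
  'document.body.appendChild(e)}();';

const aliased =
  'function e(e){return e.appendChild.bind(e)}var n;' +
  'e(n=new DocumentFragment)(new Text("a")),' +
  'e(n)(new Text("a")),'.repeat(7) +
  'e(document.body)(n);';

// Report the raw and gzipped byte counts for a piece of code.
function sizes(code) {
  const buffer = Buffer.from(code);
  return { raw: buffer.length, gzipped: zlib.gzipSync(buffer).length };
}

console.log('direct ', sizes(direct));
console.log('aliased', sizes(aliased));
```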
Yes, I agree that gzip loves repetition for better compression. But it could be worth testing this method on big libraries (like angular or react which use a lot of DOM methods) and see what happens.
PS: that's why I proposed a 'soft' flag, which allows developers to test whether the code is smaller or not with this optimization.
But it could be worth testing this method on big libraries (like angular or react which use a lot of DOM methods) and see
Tested a few aliasing variations in the past. Current optimizations seem to work best for most code post gzip. In addition to toplevel, see also: passes, pure_getters and unsafe.
But don't let me stop you from creating a PR and proving otherwise. Compare sizes with test/benchmark.js. The trick is creating a general purpose solution that works with all code. Easier said than done when side effects are considered.