[PROPOSAL] mangle property

Open lifaon74 opened this issue 6 years ago • 3 comments

Feature request

Idea to mangle repetitive object properties.

Uglify version (uglifyjs -V)

3.4.9

Uglify script

Let's assume this default script is used to test compression:

var UglifyJS = require('uglify-js');
var code = ``; // see following examples
var result = UglifyJS.minify(code, {
  toplevel: true,
  mangle: {
    properties: false,
    toplevel: true
  }
});
console.log(result.code);

Repetitive object properties access example

Let's assume a script that inserts 8 text nodes into the DOM or, more generally, a script that reads the same property name many times from the same or different objects.

var code = `
(function() {
  var doc = new DocumentFragment();
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  document.body.appendChild(doc);
})();
`;

In this example, the property appendChild is accessed 9 times (8 times on doc and once on document.body). The output code is:

!function(){var e=new DocumentFragment;e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),document.body.appendChild(e)}();

The length is 303 B. As we can see, appendChild is repeated many times.

If we use properties: true the output becomes:

!function(){var e=new DocumentFragment;e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),document.n.e(e)}();

The length is 210 B, but the code is now invalid: appendChild and body are built-in DOM properties, so renaming them to e and n breaks the calls.

Repetitive object properties access optimization

Let's rewrite the code so that an object property is accessed through a function instead of dot notation: doc.appendChild becomes appendChild(doc).

var code = `
function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(document.body)(doc);
})();
`;

The output is:

function e(e){return e.appendChild.bind(e)}var n;e(n=new DocumentFragment)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(document.body)(n);

The length is 250 B and, more importantly, the code is still valid.

obj.property1.property2.property3.property4 could be written as property4(property3(property2(property1(obj)))), which could potentially compress to a(b(c(d(e)))).
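To make the chained rewrite concrete, here is a minimal sketch with one getter helper per property name. The helper names and the sample object are illustrative assumptions, not something UglifyJS generates.

```javascript
// Hypothetical getter helpers, one per property name (illustrative only).
function property1(obj) { return obj.property1; }
function property2(obj) { return obj.property2; }
function property3(obj) { return obj.property3; }
function property4(obj) { return obj.property4; }

// Sample nested object, made up for this sketch.
const obj = { property1: { property2: { property3: { property4: 42 } } } };

// Dotted access reads left to right; the helper form nests inside out.
const direct = obj.property1.property2.property3.property4;
const viaHelpers = property4(property3(property2(property1(obj))));

console.log(direct === viaHelpers); // true
```

Once the helper names are mangled to single characters, the nested form is what would shrink to something like a(b(c(d(e)))).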

Introducing this method to UglifyJS

The parser could detect repetitive property accesses and convert them to functions to enable stronger compression. I suggest something like: properties: true | false | 'none' | 'hard' | 'soft', where false maps to 'none', and true maps to 'hard' and keeps the current behavior. The new 'soft' value would try to convert property accesses to functions (only if the resulting size is smaller). This technique could save a lot of bytes in classes that use long property names.
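The "only if the resulting size is smaller" condition can be estimated with simple arithmetic. The sketch below is a rough raw-byte model of the bind-style rewrite, assuming every identifier mangles to one character. The function name estimateAliasSaving is hypothetical, and the model ignores gzip and any other transformation the minifier applies.

```javascript
// Rough raw-byte estimate for rewriting o.NAME(...) as e(o)(...).
// Assumption: helper name, parameter, and receiver are all one character.
function estimateAliasSaving(propertyName, useCount) {
  const n = propertyName.length;
  // Cost of the helper: function e(e){return e.NAME.bind(e)}
  const helperCost = 'function e(e){return e.'.length + n + '.bind(e)}'.length;
  // Each use changes "o.NAME(" into "e(o)(".
  const savingPerUse = ('o.'.length + n + '('.length) - 'e(o)('.length;
  return {
    helperCost,
    savingPerUse,
    netSaving: useCount * savingPerUse - helperCost
  };
}

console.log(estimateAliasSaving('appendChild', 9));
// { helperCost: 43, savingPerUse: 9, netSaving: 38 }
console.log(estimateAliasSaving('id', 100));
// { helperCost: 34, savingPerUse: 0, netSaving: -34 }
```

For the appendChild example the model predicts a saving of 38 B, while the measured outputs above differ by 53 B (303 B vs 250 B), so the minifier evidently changed more than just the property accesses. A 'soft' mode along these lines would skip short names such as id, where the helper never pays for itself.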

Performance

This is a simple performance test on Chrome 70:

function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  console.time('perf');
  for (let i = 0; i < 1e6; i++) {
    // doc.appendChild(new Text(Math.random().toString())); // 1450ms
    appendChild(doc)(new Text(Math.random().toString())); // 1823ms~1500ms
  }
  console.timeEnd('perf');
  console.log(doc.firstChild.wholeText.length);
})();

As we can see, V8 still performs well with this pattern.
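For readers who want to repeat the comparison outside a browser, here is a DOM-free sketch. It assumes a plain stand-in object instead of a DocumentFragment, so it only measures the overhead of the helper and bind, not real DOM insertion. The names sink and time are made up for this example, and the timings will vary by machine and engine.

```javascript
// Stand-in for a DocumentFragment (assumption: not a real DOM benchmark).
const sink = {
  count: 0,
  appendChild(x) { this.count++; return x; }
};

// Same bind-style helper as in the proposal.
function appendChild(obj) {
  return obj.appendChild.bind(obj);
}

// Runs fn(i) for the given number of iterations and reports elapsed time.
function time(label, fn, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn(i);
  const elapsed = Date.now() - start;
  console.log(label, elapsed + 'ms');
  return elapsed;
}

const N = 1e5;
const directMs = time('direct ', (i) => sink.appendChild(i), N);
const aliasedMs = time('aliased', (i) => appendChild(sink)(i), N);
```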

PS: I present here a general idea of how to optimize object properties with functions for compression. This method would probably require some adjustments depending on the access context: call, set, or get?

function appendChildCall(obj) {
  return obj.appendChild.bind(obj);
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body)(new Text('a'));

// --- OR
function appendChildCall(obj) {
  return obj.appendChild.apply(obj, Array.prototype.slice.call(arguments, 1));
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body, new Text('a'));


// ----
function appendChildGet(obj) {
  return obj.appendChild;
}
// console.log(document.body.appendChild === document.documentElement.appendChild);
console.log(appendChildGet(document.body) === appendChildGet(document.documentElement));

// ----
function appendChildSet(obj, value) {
  obj.appendChild = value;
}
// document.body.appendChild = function() { console.log('appendChild '); };
appendChildSet(document.body, function() { console.log('appendChild '); });
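The distinction between the call and get variants matters because of the receiver. The sketch below uses a plain stand-in object (an assumption, so that it runs without a DOM) to show that calling the result of the get-style helper loses this, while the bind-style helper keeps it.

```javascript
// Stand-in for a DOM node so the example runs outside a browser (assumption).
const fakeNode = {
  children: [],
  appendChild(child) { this.children.push(child); return child; }
};

function appendChildCall(obj) { return obj.appendChild.bind(obj); }
function appendChildGet(obj) { return obj.appendChild; }

// Call variant: the receiver is preserved by bind.
appendChildCall(fakeNode)('a');

// Get variant used as a call: the method runs without fakeNode as `this`.
let getThenCallFailed = false;
try {
  appendChildGet(fakeNode)('b');
} catch (err) {
  getThenCallFailed = err instanceof TypeError;
}

console.log(fakeNode.children, getThenCallFailed);
```

So a transform along these lines would need to know whether each access is a call, a read, or a write before choosing a helper.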

lifaon74, Nov 14 '18 09:11

This sort of subexpression aliasing proposal comes up a lot. The gzip output is larger.

$ cat ex1.js
(function() {
  var doc = new DocumentFragment();
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  document.body.appendChild(doc);
})();

$ cat ex1.js | terser --toplevel -mc | gzip | wc -c
      91
$ cat ex2.js
function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(document.body)(doc);
})();

$ cat ex2.js | terser --toplevel -mc | gzip | wc -c
     121
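The same comparison can be sketched in Node with the built-in zlib module. This version gzips the two minified outputs quoted in the first post rather than running terser on the sources, so the byte counts will not necessarily match the figures above.

```javascript
const zlib = require('zlib');

// Minified outputs quoted earlier in the thread, rebuilt with repeat()
// to keep the listing short.
const direct = '!function(){var e=new DocumentFragment;' +
  'e.appendChild(new Text("a")),'.repeat(8) +
  'document.body.appendChild(e)}();';
const aliased = 'function e(e){return e.appendChild.bind(e)}var n;' +
  'e(n=new DocumentFragment)(new Text("a")),' +
  'e(n)(new Text("a")),'.repeat(7) +
  'e(document.body)(n);';

// Raw size and gzipped size, both in bytes.
function sizes(code) {
  return { raw: Buffer.byteLength(code), gzip: zlib.gzipSync(code).length };
}

const directSizes = sizes(direct);
const aliasedSizes = sizes(aliased);
console.log('direct ', directSizes);
console.log('aliased', aliasedSizes);
```

The raw sizes are 303 B and 250 B as reported in the first post; the gzip column is the number to compare when the script is served compressed.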

kzc, Nov 15 '18 17:11

Yes, I agree that gzip loves repetition for better compression. But it could be worth testing this method on big libraries (like angular or react which use a lot of DOM methods) and see what happens.

PS: that's why I proposed a 'soft' flag, which allows developers to test whether the code is smaller or not with this optimization.

lifaon74, Nov 15 '18 18:11

"But it could be worth testing this method on big libraries (like angular or react which use a lot of DOM methods) and see"

Tested a few aliasing variations in the past. Current optimizations seem to work best for most code post gzip. In addition to toplevel, see also: passes, pure_getters and unsafe.

But don't let me stop you from creating a PR and proving otherwise. Compare sizes with test/benchmark.js. The trick is creating a general purpose solution that works with all code. Easier said than done when side effects are considered.
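As one illustration of how side effects can get in the way (not necessarily the case meant above), the sketch below shows that an apply-style helper changes evaluation order. In a direct call the method is looked up before the arguments are evaluated; with the helper, the arguments are evaluated first. The getter, the log array, and the methodCall helper are all made up for this example.

```javascript
const log = [];

// A getter makes the property lookup observable.
const target = {
  get method() {
    log.push('lookup');
    return function (x) { return x; };
  }
};

// An argument expression with a visible side effect.
function arg() { log.push('argument'); return 1; }

// Hypothetical helper in the "apply" style shown earlier in the thread.
function methodCall(obj, ...args) { return obj.method(...args); }

target.method(arg());
const directOrder = log.splice(0);   // ['lookup', 'argument']

methodCall(target, arg());
const aliasedOrder = log.splice(0);  // ['argument', 'lookup']

console.log(directOrder, aliasedOrder);
```

A general-purpose transform would have to prove that neither the lookup nor the arguments have observable side effects, or fall back to a form that preserves the original order.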

kzc, Nov 15 '18 20:11