regenerator icon indicating copy to clipboard operation
regenerator copied to clipboard

Inefficient use of the arguments object

Open maxnordlund opened this issue 11 years ago • 5 comments

I was playing around with the examples in the test cases and came across the use of arguments object.

function* sum() {
  var result = 0;

  for (var i = 0; i < arguments.length; ++i) {
    yield result += arguments[i];
  }

  yield result;
}

Turns into this

var sum = regeneratorRuntime.mark(function sum() {
  var result, i, args$1$0 = arguments;

  return regeneratorRuntime.wrap(function sum$(context$1$0) {
    while (1) switch (context$1$0.prev = context$1$0.next) {
    case 0:
      result = 0;
      i = 0;
    case 2:
      if (!(i < args$1$0.length)) {
        context$1$0.next = 8;
        break;
      }

      context$1$0.next = 5;
      return result += args$1$0[i];
    case 5:
      ++i;
      context$1$0.next = 2;
      break;
    case 8:
      context$1$0.next = 10;
      return result;
    case 10:
    case "end":
      return context$1$0.stop();
    }
  }, sum, this);
});

But this copies the arguments object into a variable which prevents optimizations from taking place, for a brief introduction see Bluebirds wiki page about Optimization killers. This can be fixed in to ways, both with tradeoffs.

First is to copy over the arguments into a fresh new array, some thing like this:

var sum = regeneratorRuntime.mark(function sum() {
  var result, i, args$1$0 = new Array(arguments.length);
  for(var $i = 0; $i < args$1$0.length; ++$i) args$1$0[$i] = arguments[$i];
  ...

However this breaks code which uses some of the more exotic features of the arguments object, such as the ability to assign to an index and update the corresponding parameter. The last one is actually tested for in below sum*.

function* gen(x, y) {
  yield x;
  ++arguments[0];
  yield x;
  ...
}
// gen(1, 0) would yield 1 then 2

The second approach is to hoist all code above the first yield into the outer function which would compile this:

function* sum() {
  var result = 0, args = [];

  for (var i = 0; i < arguments.length; ++i) {
    args[i] = arguments[i];
  }

  for (var i = 0; i < args.length; ++i) {
    yield result += args[i];
  }

  yield result;
}

Into something like this:

var sum = regeneratorRuntime.mark(function sum() {
  var result = 0, args = [];

  for (var i = 0; i < arguments.length; ++i) {
    args[i] = arguments[i];
  }

  return regeneratorRuntime.wrap(function sum$(context$1$0) {
    while (1) switch (context$1$0.prev = context$1$0.next) {
    case 3:
      if (!(i < args.length)) {
        context$1$0.next = 9;
        break;
      }

      context$1$0.next = 6;
      return result += args[i];
    case 6:
      ++i;
      context$1$0.next = 3;
      break;
    case 9:
      context$1$0.next = 11;
      return result;
    case 11:
    case "end":
      return context$1$0.stop();
    }
  }, sum, this);
});

I'm sure there are many more ways this could be solved, but I like to raise the issue about performance for variadic generators.

maxnordlund avatar Oct 12 '14 16:10 maxnordlund

Thanks for bringing this to my attention, @maxnordlund!

The first approach is a no-go because of the loss of potentially desirable exotic behavior, as you point out, and the second approach doesn't seem generalizable. For example, what about code where usage of the arguments object occurs after the first yield?

Though I'm unsure how to implement the second approach soundly as part of Regenerator, that kind of hoisting is absolutely the right way to solve the problem by hand. Just as you sometimes have to adjust your coding style to accommodate unexpected performance gotchas, my recommendation here would be to rewrite performance-sensitive generator functions by hand so that they take an array of arguments as a single parameter, and then write a wrapper function that converts arguments to an array.

Or, better yet, just use a transpiler for ...rest parameters to avoid the unnecessary performance consequences of the arguments object, though in order to avoid capturing the arguments object you might have to transpile ...rest parameters after generator functions in the transform pipeline.

Because Regenerator has a responsibility to preserve as much of the expected semantics of the arguments object as possible, and produce relatively predictable, straightforward code, automatic optimizations of this kind are probably out of scope. With that said, I would welcome a pull request, even if it's just to add a warning with some helpful tips about how to fix the problem.

benjamn avatar Oct 12 '14 18:10 benjamn

Actually, here's another idea: what if we implemented ...rest parameters in Regenerator, and made sure that the conversion to an array happened in the outer function, so that you didn't have to worry about the order of transforms? In that approach, we wouldn't change the implementation of arguments, but we could just recommend that people start using ...rest parameters instead.

This feels like scope creep, to be sure, but it also seems worthwhile because Regenerator has a unique opportunity to generate efficient code for generator functions that have ...rest parameters. That's why I ended up implementing async functions in Regenerator, rather than as a separate transform, because I could skip creating an extra generator function and just create a single generator object, an optimization that would be very difficult to coordinate between two separate transforms.

If that idea makes sense to you, again, I'd welcome a pull request.

Closing this issue in favor of one I just created: https://github.com/facebook/regenerator/issues/132

benjamn avatar Oct 12 '14 19:10 benjamn

Having second thoughts about #132 now. If we can detect that arguments is used in a read-only way, is that enough to prove the hoisting strategy is sound?

benjamn avatar Oct 12 '14 19:10 benjamn

Hm, it would be kinda weird for Regenerator to implement ..rest, and start to look more like Traceur. There are some more options, I think. What about changing the outer function to just a trampoline, like:

var sum = regeneratorRuntime.mark(function sum() {
  var gen$1$0 = regeneratorRuntime.wrap(function sum$(context$1$0) {
    var result, i;
    while (1) switch (context$1$0.prev = context$1$0.next) {
    case 0:
      result = 0;
      i = 0;
    case 2:
      if (!(i < args$1$0.length)) {
        context$1$0.next = 8;
        break;
      }

      context$1$0.next = 5;
      return result += args$1$0[i];
    case 5:
      ++i;
      context$1$0.next = 2;
      break;
    case 8:
      context$1$0.next = 10;
      return result;
    case 10:
    case "end":
      return context$1$0.stop();
    }
  }, sum);
  return gen.apply(this, arguments); // Propagate the arguments object into sum$
});

About hoisting, I do believe it should be fairly easy to spot any use of arguments after the first yield, and generate the same code as today. Perhaps also give a warning about unperformat behavior.

maxnordlund avatar Oct 12 '14 21:10 maxnordlund

Thank you for reporting this issue and appreciate your patience. We've notified the core team for an update on this issue. We're looking for a response within the next 30 days or the issue may be closed.

ghost avatar Aug 04 '15 18:08 ghost