webppl
webppl copied to clipboard
CPS breaks function order independence in the presence of function wrappers/decorators
WebPPL is fine with compiling this code:
var bar = function()
{
return 42;
}
var generate = function()
{
return foo();
}
but when compiling this code:
var wrap = function(fn)
{
return fn;
}
var foo = function()
{
return bar();
}
var bar = wrap(function()
{
return 42;
});
var generate = function()
{
return foo();
}
it dies with Uncaught ReferenceError: bar is not defined
.
Inspecting the compiled code, it looks like bar
is losing its name, because it no longer appears as a function declaration to the CPS transformation:
var wrap = function (globalStore, _k3, address, fn) {
var _return = _k3;
_trampoline = function () {
_return(globalStore, fn);
};
};
var foo = function (globalStore, _k4, address) {
var _return = _k4;
var _s5 = address.concat('_1');
_trampoline = function () {
bar(globalStore, _return, _s5);
};
};
var _s10 = address.concat('_2');
var _s11 = function (globalStore, _k12, address) {
var _return = _k12;
_trampoline = function () {
_return(globalStore, 42);
};
};
It's obnoxious to have to start worrying about function declaration order again in a language that's supposed to be agnostic to it. I get the need to use _s
symbols when introducing new intermediates, but is there a reason why bar
needs to be renamed in this case?
Also: decorated/wrapped functions will break the compiler if they are recursive, since the name of the function at its declaration gets mangled but its name at any recursive callsites stays the same. This is a more serious issue. One workaround is to define the recursive wrapped function using the Y combinator. So
var foo = wrap(function(x)
{
if (x == 0)
return 0;
else
return 1 + foo(x-1);
});
becomes
var foo = Y(function(fooImpl)
{
return wrap(function(x)
{
if (x == 0)
return 0;
else
return 1 + fooImpl(x-1);
});
});
I'm not sure this is a bug, although I have run into this issue before and agree that it's annoying. My impression was that it's hard to get around it, because wrap
could implement arbitrary computation, including random choice, so it's necessary for CPS to sequentialize computation in a way that isn't necessary for a list of pure function definitions. This sequentialization currently leads to the introduction of new names, which breaks recursive functions.
I'm not sure that we can get around this issue by reusing existing names instead of generating new ones, because the sequentialization also introduces new (function) blocks, and functions outside the block can't easily access functions defined within the block. But if you see a way to deal with this issue by adjusting the naming scheme, that would be great!
Alternatively, we could try to detect blocks of code that don't lead to randomness being sampled, and avoid CPS (and other transforms) for those blocks. This would be useful in general, and would handle deterministic wrappers as a special case.
It's worth figuring out how CPS-based compilers for functional languages usually handle mutual recursion... Maybe @kimmyg knows?
I think that performing a variable hoisting transform might solve at least some of this. For example, this doesn't work in WebPPL:
var id = function(x) { return x; };
var f = id(function() { return g() });
var g = function() { return 42 };
f(); // "Uncaught ReferenceError: g is not defined"
But this does:
var id = undefined, f = undefined, g = undefined;
id = function(x) { return x; };
f = id(function() { return g() });
g = function() { return 42 };
f(); // => 42
(Providing you remove the check in cps.js
that only allows assignment to the store.)
I've not tried this on any other examples or thought it through in detail, but perhaps it's a direction to explore if we look at fixing this.
I've thought more about making mutual recursion and caching go together. Currently, cache
is implemented as a header function (dealing with store, continuation, address), but in fact it doesn't need any of these and can be implemented as a plain Javascript library function. Such functions are treated differently by the CPS transform. In particular, they don't lead to the introduction of extra function scopes that break function order independence. Consider this example:
var even = cache(function(x) {
console.log('Called `even` with argument ' + x);
if (x == 0) {
return true;
} else {
return odd(x - 1);
}
});
var odd = cache(function(x) {
console.log('Called `odd` with argument ' + x);
if (x == 0) {
return false;
} else {
return even(x - 1);
}
});
console.log(odd(3));
console.log(odd(5));
The compiled code looks like this, with nested structure. However, if we instead implement cache
as a plain Javascript function and signal this fact by calling it as a method, say dp.cache
, we get code with flat structure as shown here. I've created a module that can be used for this purpose, but we should consider integrating this into core in some form that allows people to keep using cache
as before, but doesn't lead to nested scopes.
Just wanted to give this a +1, for dealing with the cache
case in particular. Spent the morning wrestling with this and then noticed there was already an issue for it!