valueRec and serialize are slow for large, recursive objects
The `MAP` aggregator calls `ad.valueRec` on all items that it adds, and the `Histogram` aggregator calls both `ad.valueRec` and `util.serialize`. These transformations make inference unusably slow when the program returns large objects (e.g. procedurally generated geometric meshes). Addressing this requires general solutions to these two subproblems:
- Avoid `ad.valueRec` when the inference routine doesn't use AD. Or, perhaps even if it does, skip this step at the user's request? For instance, if the user only wants the MAP, then it's only necessary to perform this transform on the final MAP value, not on every candidate value along the way.
- Find a better way to index histograms. I've long felt uncomfortable with the use of potentially giant serialized strings as histogram keys, and it turns out they can be a big issue. E.g. a particle filter can run very quickly, but if it takes ~1 sec to serialize each final particle value, then it becomes unusably slow.
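To make the second subproblem concrete, here's a minimal sketch of the keying pattern at issue (simplified; not webppl's actual `Histogram` code): every sampled value is serialized to a string and used as a map key, so the cost of each histogram update grows with the size of the value, not with the size of the histogram.

```javascript
// Simplified sketch of serialized-string histogram keys. With a mesh-sized
// value, the JSON.stringify call alone can dominate inference time.
function addToHistogram(hist, value, weight) {
  var key = JSON.stringify(value); // expensive for large values
  hist[key] = (hist[key] || 0) + weight;
  return hist;
}

var hist = {};
addToHistogram(hist, {x: 1}, 0.5);
addToHistogram(hist, {x: 1}, 0.25);
addToHistogram(hist, {x: 2}, 0.25);
// hist groups equal values: {'{"x":1}': 0.75, '{"x":2}': 0.25}
```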
I tested `valueRec` and `serialize` on a large object; the times taken were ~200ms and ~600ms respectively.
`valueRec`
Given that it's so slow, your suggestion of calling `valueRec` only when we know we're using AD seems sensible to me. I'll work up a PR for this.

I suppose another approach would be to walk each object to check whether it contains any AD nodes before we commit to copying the whole object. I'd expect this to be fast, but it's probably messy enough that we're better off doing it your way.
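For reference, the pre-check walk could look something like the sketch below. `isADNode` is a hypothetical stand-in for however ad.js actually tags its nodes; the point is just that we can bail out at the first AD node found, and pay for a deep copy only when one is present.

```javascript
// Hypothetical pre-check: recursively scan an object for AD nodes,
// short-circuiting as soon as one is found.
function containsADNode(x, isADNode) {
  if (isADNode(x)) { return true; }
  if (x !== null && typeof x === 'object') {
    var keys = Object.keys(x);
    for (var i = 0; i < keys.length; i++) {
      if (containsADNode(x[keys[i]], isADNode)) { return true; }
    }
  }
  return false;
}

// Illustrative tag check only; not ad.js's real representation.
var isADNode = function(x) {
  return x !== null && typeof x === 'object' && x.__adNode === true;
};
```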
When we are using AD, arranging things so that `valueRec` is only called once when `onlyMAP` is used seems reasonable. (Assuming we think this will bite someone at some point.) Perhaps we just pass a thunk (which wraps the `valueRec` call) to the `MAP` aggregator, rather than the value itself.
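The thunk idea might look roughly like this. The aggregator shape here is illustrative, not webppl's actual `MAP` aggregator API; the point is that the expensive `valueRec` call is deferred and forced only for the single value that wins.

```javascript
// Sketch: the aggregator stores thunks and scores, and only forces the
// winning thunk (which would wrap the valueRec call) at the end.
function MAPAggregator() {
  this.bestScore = -Infinity;
  this.bestThunk = null;
}
MAPAggregator.prototype.add = function(valueThunk, score) {
  if (score > this.bestScore) {
    this.bestScore = score;
    this.bestThunk = valueThunk; // cheap: no valueRec here
  }
};
MAPAggregator.prototype.toDist = function() {
  // valueRec (inside the thunk) runs exactly once, on the MAP value.
  return {val: this.bestThunk(), score: this.bestScore};
};

var calls = 0;
var agg = new MAPAggregator();
agg.add(function() { calls++; return 'a'; }, 1);
agg.add(function() { calls++; return 'b'; }, 3);
agg.add(function() { calls++; return 'c'; }, 2);
var result = agg.toDist();
// calls === 1: only the best candidate's thunk was ever forced
```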
`serialize`
It appears that `serialize` became especially slow when correct handling of `±Infinity` was added (#162, #256). Without that, the time taken to serialize the same object comes down to ~60ms.
I'm not sure how best to deal with this. One option is just to remove the `Infinity` handling, possibly bringing it back when we're using something better than `stringify` + plain objects to build histograms. (With the expectation that users will encode `Infinity` themselves, should their program return it.) Does anyone have any thoughts about that?
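For anyone unfamiliar with why the special handling exists: plain `JSON.stringify` silently turns `±Infinity` (and `NaN`) into `null`. If we dropped our handling, a user could encode it themselves with a stringify replacer, e.g. (sentinel string is just an example):

```javascript
// Plain stringify loses Infinity; a user-supplied replacer can preserve it.
function encodeInfinity(key, val) {
  if (val === Infinity) { return '__Infinity__'; }
  if (val === -Infinity) { return '__-Infinity__'; }
  return val;
}

JSON.stringify({a: Infinity, b: 1});                 // '{"a":null,"b":1}'
JSON.stringify({a: Infinity, b: 1}, encodeInfinity); // '{"a":"__Infinity__","b":1}'
```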
Related issues #179, #108.