cgpm
vscgpm: hooked instance of VsCGpm is incompatible with Engine(multiprocess=True)
engine.compose_cgpm([vscgpm], multiprocess=True)
fails with
E RuntimeError: Subprocess failed: Traceback (most recent call last):
E File "/scratch/fsaad/cgpm/cgpm/utils/parallel_map.py", line 55, in process_input
E outq_wr.send((i, ok, fx))
E PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
Further investigation reveals that the Venturecxx ripl is not picklable. The issue is that parallel_map attempts to pickle objects when communicating between the master and worker processes.
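The failure mode can be reproduced without Venture. The sketch below is only an illustration: `FakeRipl` and `can_pickle` are hypothetical names, and the assumption (suggested by the `<type 'function'>` in the traceback) is that some object reachable from the hooked cgpm stores a plain function.

```python
import pickle

class FakeRipl(object):
    """Hypothetical stand-in for an object that stores a function attribute."""
    def __init__(self):
        # A lambda stored on the instance cannot be pickled by reference.
        self.callback = lambda x: x + 1

def can_pickle(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type varies across Python versions.
        return False

ripl_picklable = can_pickle(FakeRipl())
plain_picklable = can_pickle({'a': 1})
```

A check like `can_pickle` could be run on each cgpm before dispatching to worker processes, so the error surfaces in the parent process rather than inside a subprocess traceback.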
This issue does not arise with sklearn-based cgpms such as RandomForest and LinearRegression, since those predictor objects from sklearn are picklable.
One possibility is to explicitly convert the hooked cgpm to its JSON metadata format and have the worker process deserialize the cgpm. However, the performance hit of explicit deserialization and repopulating the trace could be significant, so this approach should be profiled first.
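A rough sketch of the metadata round trip, with timing, might look like the following. Everything here is a stand-in: `HookedModel`, its `to_metadata`/`from_metadata` methods, `send_to_worker`, and `receive_in_worker` are hypothetical and do not reflect the real VsCGpm interface. The process boundary is simulated with a pickle round trip instead of an actual subprocess.

```python
import json
import pickle
import time

class HookedModel(object):
    """Hypothetical stand-in for a hooked cgpm; not the real VsCGpm API."""
    def __init__(self, source, observations):
        self.source = source
        self.observations = list(observations)
        # Unpicklable runtime state, rebuilt from the fields above.
        self.program = lambda x: x * len(self.observations)

    def to_metadata(self):
        # Only plain, JSON-friendly data goes into the metadata.
        return {'source': self.source, 'observations': self.observations}

    @classmethod
    def from_metadata(cls, metadata):
        # Re-running the constructor stands in for repopulating the trace.
        return cls(metadata['source'], metadata['observations'])

def send_to_worker(model):
    """Return the bytes that would cross the process boundary."""
    return pickle.dumps(json.dumps(model.to_metadata()))

def receive_in_worker(payload):
    """Rebuild the model on the worker side from the received bytes."""
    return HookedModel.from_metadata(json.loads(pickle.loads(payload)))

model = HookedModel('hypothetical program text', [1.0, 2.0, 3.0])
start = time.time()
rebuilt = receive_in_worker(send_to_worker(model))
elapsed = time.time() - start  # serialization plus rebuild cost, in seconds
```

With a real cgpm, the interesting number is how `elapsed` compares to the cost of the work each worker performs; if the rebuild dominates, this approach is probably not worthwhile.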
An alternative is to patch the venture.ripl.ripl.Ripl class to be picklable.
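One way such a patch could work is to attach `__getstate__` and `__setstate__` to the class from the outside, dropping the unpicklable attributes on the way out and rebuilding them on the way in. The sketch below uses a hypothetical `ThirdPartySession` class in place of the real Ripl, whose internal attributes are not known here; which fields would need to be dropped and rebuilt in Ripl is an open question.

```python
import pickle

class ThirdPartySession(object):
    """Hypothetical stand-in for a class we do not own; not the real Ripl."""
    def __init__(self, directives):
        self.directives = list(directives)
        self._rebuild()

    def _rebuild(self):
        # Recreate the function-valued attribute from picklable state.
        count = len(self.directives)
        self.evaluate = lambda x: x + count

def _getstate(self):
    # Copy the instance dict and drop the attribute pickle cannot handle.
    state = self.__dict__.copy()
    del state['evaluate']
    return state

def _setstate(self, state):
    # Restore the plain data, then rebuild the dropped attribute.
    self.__dict__.update(state)
    self._rebuild()

# Patch the class in place, without editing its source.
ThirdPartySession.__getstate__ = _getstate
ThirdPartySession.__setstate__ = _setstate

session = ThirdPartySession(['a', 'b'])
clone = pickle.loads(pickle.dumps(session))
```

Compared with the JSON metadata route, this keeps parallel_map unchanged, but the rebuild step in `__setstate__` may still incur a comparable cost if it has to repopulate the trace.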