dispy
Support for Dill or Cloudpickle?
Hi,
I'm looking to submit scripts that contain lambda functions to a compute cluster, but this seems impossible with the current setup due to the use of pickle without dill or cloudpickle. Do you have suggestions for how I might be able to submit these lambda functions? It seems that I cannot pickle them with dill and then send the pickled data over. Thanks!
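For anyone following along, here is a minimal sketch of the limitation, using only the standard library. pickle stores a function as a reference to its importable name, and a lambda has no such name. The helper `can_pickle` and the lambda `double` are made up for illustration.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle could not find an importable name for the object
        return False


# A lambda's qualified name is "<lambda>", which cannot be looked up on import.
double = lambda x: x * 2

print(can_pickle(len))     # built-in function, pickled by reference to its name
print(can_pickle(double))  # lambda, rejected by pickle
```

The first call should report True and the second False, which is the failure described above.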
With 'depends' or 'dispy_job_depends' you can pass any data: serialize it on the client and deserialize it in the compute function on the node. cloudpickle's output seems to be compatible with pickle, so you can serialize with cloudpickle on the client and that should just work (I think).
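A rough sketch of that round trip, outside of dispy, assuming cloudpickle is installed on both the client and the node. The names `pack` and `compute` are hypothetical, and how the bytes reach the node (job argument, 'depends', and so on) is not shown here.

```python
import cloudpickle  # third-party package, assumed installed: pip install cloudpickle


def pack(func):
    """Client side: serialize a lambda by value into plain bytes."""
    return cloudpickle.dumps(func)


def compute(packed_func, value):
    """Node side: rebuild the function with the standard pickle module and call it."""
    # Imported inside the function because a compute function runs on the node.
    # cloudpickle still has to be importable there, since the payload refers to
    # cloudpickle's helpers for reconstructing the function.
    import pickle

    func = pickle.loads(packed_func)
    return func(value)


payload = pack(lambda x: x * 2)  # bytes, so any pickle-based transport can carry it
print(compute(payload, 21))
```

Because `payload` is an ordinary bytes object, the default pickle-based serialization has nothing unusual to handle; only the compute function needs to know that it holds a function.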
I was not aware of cloudpickle; maybe supporting it would be useful to more users. If you want to use it right now (so you don't have to serialize the data yourself as described above), you can change the serialize and deserialize functions in the pycos/__init__.py file to use cloudpickle. No other changes are needed.
Looking into cloudpickle further (e.g., https://github.com/RaRe-Technologies/gensim/issues/558), cloudpickle (or dill) may not be a good idea in general due to performance issues. So instead of changing the pycos file, you can replace the serialize function in your dispy client program with:
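if __name__ == '__main__':
    import dispy, cloudpickle

    def serialize(obj):
        # return the serialized bytes; without 'return' this would yield None
        return cloudpickle.dumps(obj)

    # override the serializer used by the dispy client
    dispy.serialize = serialize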
I haven't tested it this way, but since cloudpickle is a drop-in replacement for pickle, it may work.