dispy
Dealing with custom types defined in C
I'm trying to use dispy to parallelize work with objects defined in C. The program will implement serialization of these objects (a link to an example would be most appreciated).
While trying to test this, I get an error -- presumably, because the serialization is not yet implemented. What I am seeing, however, is the secondary error -- from somewhere inside Python-3.6's inspect.py (function named getfile) complaining: TypeError('{!r} is a built-in class'.format(object)).
I'm guessing dispy is trying to be helpful, but it needs to catch these exceptions so that they don't hide the original one.
Is it possible to send me a small example that I can run?
Yes - you can use the file-object as an example:
def meow(o):
    return o

def processed(status, node, job):
    if status == dispy.DispyJob.Finished:
        print("%s" % job.result)
    elif status == dispy.DispyJob.Terminated:
        print("%s" % (job.exception))
    return

if __name__ == '__main__':
    import dispy
    f = open('/dev/null', 'r')
    cluster = dispy.JobCluster(
        meow,
        cluster_status = processed,
        depends = [f]
    )
    cluster.print_status()
    cluster.wait()
    cluster.print_status()
The errors I'm getting are quite unhelpful:
2018-09-05 10:16:21 pycos - version 4.8.1 with epoll I/O notifier
2018-09-05 10:16:21 dispy - dispy client version: 4.9.1
2018-09-05 10:16:21 dispy - Storing fault recovery information in "_dispy_20180905101621"
Traceback (most recent call last):
  File "d.py", line 35, in <module>
    depends = [f]
  File "/prod/pfe/local/lib/python3.6/site-packages/dispy/__init__.py", line 2547, in __init__
    lines = inspect.getsourcelines(dep)[0]
  File "/prod/pfe/local/lib/python3.6/inspect.py", line 955, in getsourcelines
    lines, lnum = findsource(object)
  File "/prod/pfe/local/lib/python3.6/inspect.py", line 768, in findsource
    file = getsourcefile(object)
  File "/prod/pfe/local/lib/python3.6/inspect.py", line 684, in getsourcefile
    filename = getfile(object)
  File "/prod/pfe/local/lib/python3.6/inspect.py", line 654, in getfile
    raise TypeError('{!r} is a built-in class'.format(object))
TypeError: <module 'io' (built-in)> is a built-in class
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/prod/pfe/local/lib/python3.6/site-packages/dispy/__init__.py", line 2803, in shutdown
    self.close()
  File "/prod/pfe/local/lib/python3.6/site-packages/dispy/__init__.py", line 2790, in close
    if self._compute:
AttributeError: 'JobCluster' object has no attribute '_compute'
After I added the module's name to my types (following this advice), the first error is gone; instead I am getting a complaint from inspect's findsource function saying OSError: source code not available.
Followed by the cryptic error about the _compute attribute...
The issue seems to be the assumption that any object with a __class__ attribute can be used to get the source for that class (see lines 2539 and 2546 in __init__.py). This can be handled in one of two ways: either check for the __module__ attribute, which I think is needed to get the source, or wrap inspect.getsourcelines in try/except and issue an appropriate warning. I will commit a fix in a couple of days (I have been working on a rather large patch and maintaining two branches; I am hoping to commit the other one soon so I don't have to apply patches to both branches and two Python versions!).
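The second option could look roughly like the sketch below. It is not dispy's actual fix: the helper name source_or_none is hypothetical, and it only illustrates catching the two exceptions reported in this thread (TypeError for built-in classes, OSError when source is unavailable) and turning them into a warning.

```python
import inspect
import warnings


def source_or_none(dep):
    """Return the source lines of `dep`, or None if they cannot be found.

    Hypothetical helper, not part of dispy. inspect.getsourcelines raises
    TypeError for built-in objects and OSError when the source code is
    not available, so both are caught and reported as a warning.
    """
    try:
        return inspect.getsourcelines(dep)[0]
    except (TypeError, OSError) as exc:
        warnings.warn("cannot get source for %r: %s" % (dep, exc))
        return None


if __name__ == "__main__":
    # `int` is implemented in C, so there is no Python source to fetch.
    print(source_or_none(int))
```

With this shape the caller sees one clear warning that names the offending dependency, instead of a traceback from deep inside inspect.py.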
use try/except with inspect.getsourcelines and issue appropriate warning
Yes, this is the bigger point: attempts to be more helpful should not make the reported error less helpful when they fail.
That said, some example of sending out a native custom type may be in order :-) In my case, I added a todict method to my class, and a constructor that can recreate the object from a dictionary -- so now, instead of trying to pass around the objects of a native type, I'm pushing their dictionary-representations. The dictionaries are converted back into native types on each node, and the native types are then passed to the proprietary library for actual computations.
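A minimal sketch of that dictionary round trip, for readers who want something concrete. The Point class is a pure-Python stand-in for the native type, and its fields and the fromdict classmethod are assumptions made for illustration; the real class and library are not shown in this thread.

```python
class Point:
    """Pure-Python stand-in for a type defined in C (hypothetical)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def todict(self):
        # Plain dictionaries of built-in types serialize without extra work.
        return {"x": self.x, "y": self.y}

    @classmethod
    def fromdict(cls, d):
        # Rebuild the object from its dictionary representation.
        return cls(d["x"], d["y"])


def compute(d):
    # This is what would run on each node: rebuild the object from the
    # dictionary, then hand it to the real computation (here, a sum).
    p = Point.fromdict(d)
    return p.x + p.y


if __name__ == "__main__":
    # Only the dictionary crosses the process boundary, never the object.
    print(compute(Point(2, 3).todict()))
```

The cost of this approach is that every call site has to remember to convert in both directions.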
It seems to work, but I'm only learning and it would've been nice to have some kind of "best practices" tutorial for such a case...
In my case, I added a todict method to my class, and a constructor that can recreate the object from a dictionary
Is it possible to add __getstate__ and __setstate__ methods to the class? If so, that is all that is needed to serialize and deserialize. If __getstate__ can return a dictionary, then __setstate__ is not required (unless special processing is needed). See, for example, class _DispyJob_ in __init__.py whose __getstate__ returns a dictionary with only necessary attributes (this class also defines __setstate__, although not required). If this works, then this is Python's serialization approach (look for __getstate__ in dispy).
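As a rough illustration of that suggestion, the sketch below uses a hypothetical Sample class whose handle attribute stands in for native state that cannot be pickled. It shows the standard pickle protocol only, with nothing dispy-specific.

```python
import pickle


class Sample:
    """Hypothetical class with one attribute that cannot be pickled."""

    def __init__(self, values, handle=None):
        self.values = list(values)
        self.handle = handle  # stand-in for a native resource

    def __getstate__(self):
        # Return only the attributes needed to rebuild the object.
        return {"values": self.values}

    def __setstate__(self, state):
        # Special processing on load: restore data, reset the handle.
        self.values = state["values"]
        self.handle = None


if __name__ == "__main__":
    # A lambda cannot be pickled, but it is never part of the state.
    original = Sample([1, 2, 3], handle=lambda: None)
    clone = pickle.loads(pickle.dumps(original))
    print(clone.values, clone.handle)
```

Here __setstate__ is defined because the handle needs resetting; without that requirement, pickle would simply update the new instance's __dict__ with the returned dictionary.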
Is it possible to add __getstate__ and __setstate__ methods to the class?
Aha! Thanks for the pointer. Yes, certainly -- once I turned my existing todict and fromdict into __getstate__ and __setstate__ respectively, things became much nicer. And, oh, the speedup so far seems linear (!) compared to using OpenMP within one machine.
I have other questions, but will use StackOverflow -- I see, there is a dispy-tag there already...