GeneticAlgorithmPython
Problems with multiprocessing
Hi everyone.
I'd like to use pygad's parallel_processing feature to enable parallelization with multiprocessing, like this:
parallel_processing=["process", 2]
My fitness function is a closure: it has access to the variables of its outer function while still meeting pygad's requirement of taking exactly 2 arguments, like this:
def do_recognize(img):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner
When I try the approach from the example above, I get an error saying that the inner function cannot be pickled:
AttributeError: Can't pickle local object 'do_recognize.<locals>.inner'
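For readers who want to reproduce the pickling failure without pygad, here is a minimal sketch using only the standard library. The names scan_stub, make_closure, fitness_with_image and can_pickle are hypothetical and chosen for illustration; scan_stub merely stands in for the real scan() function, which is not shown in this thread.

```python
import pickle
from functools import partial


def scan_stub(img, inp):
    # Hypothetical stand-in for the real scan(); returns a dummy fitness.
    return float(len(img) + len(inp))


def make_closure(img):
    # Same shape as do_recognize(): the returned function is local to this call,
    # so pickle cannot look it up by an importable name.
    def inner(inp, _):
        return scan_stub(img, inp)
    return inner


def fitness_with_image(img, inp, _):
    # Module-level function: pickle can refer to it by its qualified name.
    return scan_stub(img, inp)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, TypeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    print("closure picklable:", can_pickle(make_closure("image")))
    print("partial picklable:", can_pickle(partial(fitness_with_image, "image")))
```

Process-based parallelism has to send the fitness function to worker processes, which is why a function defined inside another function runs into this limitation while a partial over a module-level function usually does not.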
OK, my next attempt was to use functools.partial, like this:
fitness = partial(do_recognize, img)
But this doesn't work either, because pygad checks that the fitness function takes exactly 2 arguments by reading its __code__.co_argcount attribute, which partial objects do not have.
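The missing attribute can be confirmed with a short standalone check. This is only a sketch; fitness_with_image is a hypothetical three-argument function used for illustration, not code from pygad or from the original post.

```python
from functools import partial


def fitness_with_image(img, solution, solution_idx):
    # Hypothetical fitness function that also needs an image.
    return 0.0


bound = partial(fitness_with_image, "image")

# Plain Python functions carry a code object; partial objects do not.
has_code_plain = hasattr(fitness_with_image, "__code__")
has_code_bound = hasattr(bound, "__code__")

# The wrapped function is still reachable through the .func attribute.
wrapped_argcount = bound.func.__code__.co_argcount

print(has_code_plain, has_code_bound, wrapped_argcount)
```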
I came up with the following ugly hack:
Dummy = namedtuple('Dummy', 'co_argcount')
find_board = partial(do_recognize, img)
find_board.__code__ = Dummy(co_argcount=2)
This satisfies both multiprocessing and pygad, but it is not a clean solution.
I'd like to propose dropping the __code__.co_argcount attribute for the arity check and switching to something less unusual.
For example, the inspect module could be helpful:
import inspect
assert len(inspect.getfullargspec(f).args) == 2
which at least works fine with partial.
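An arity check along these lines can also be written with inspect.signature, which understands functools.partial objects; the older inspect.getargspec was removed in Python 3.11. The following is a sketch of how such a check might look, not pygad's actual implementation. The helper count_positional_params and the function fitness_with_image are hypothetical names.

```python
import inspect
from functools import partial


def count_positional_params(func):
    """Count the parameters that can still be passed positionally."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = inspect.signature(func).parameters.values()
    return sum(1 for p in params if p.kind in positional_kinds)


def fitness_with_image(img, solution, solution_idx):
    # Hypothetical fitness function that needs an extra image argument.
    return float(len(img) + len(solution))


# Binding the image positionally leaves (solution, solution_idx).
bound = partial(fitness_with_image, "image")

if __name__ == "__main__":
    print(count_positional_params(fitness_with_image))
    print(count_positional_params(bound))
```

Counting by parameter kind rather than by name keeps the check independent of how the callable was built, so plain functions and partial objects are handled the same way.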
Can you add a second argument with a default value to the do_recognize() function? For example, add x=None.
def do_recognize(img, x=None):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner
fitness = partial(do_recognize, img)
Is this solution feasible?
@ahmedfgad no, this doesn't work either:
AttributeError: 'functools.partial' object has no attribute '__code__'
@risboo6909, it is my fault for not explaining the idea.
This is sample code that I tested, and it runs. Just uncomment the line best_fitness, _ = scan(img, inp, debug=False) and comment out the line best_fitness, _ = 1, 2.
import pygad
import time
from functools import partial

def do_recognize(img, x=1):
    def inner(inp, _):
        # best_fitness, _ = scan(img, inp, debug=False)
        best_fitness, _ = 1, 2
        return best_fitness
    inp = 2
    inner2 = partial(inner, inp)
    return inner2(img)

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=do_recognize,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])

if __name__ == '__main__':
    t1 = time.time()
    ga_instance.run()
    t2 = time.time()
    print("Time is", t2-t1)
First of all, thank you for the detailed explanation.
I think I see your point now. I've tested your approach and it works fine if I just leave best_fitness, _ = 1, 2.
But the problem is that it doesn't work with my scan function. Let me try to explain. In your approach do_recognize is passed to pyGAD as the fitness function, so its first argument is expected to be a solution and its second a solution_idx, but in your example it expects an image and some x:
def do_recognize(img, x=1):
So after the first iteration of the GA, img here will be equal to solution and x will be equal to solution_idx.
But my scan function expects img to be an actual image, not a solution found by the GA. I was passing this image as an argument to the outer function so that it would be available to the nested function. Let me expand the example from my original question to make this clearer:
def do_recognize(img):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner

# here do_recognize returns a fitness function that has access to the image read from disk
fitness = do_recognize(read_image_from_disk('some_image.png'))

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=fitness,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])
Hope this clarifies my idea.
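One workaround that is sometimes used in this situation, sketched here under assumptions, is to keep the image in a module-level variable and expose a plain two-argument module-level function. Such a function has a __code__ attribute with co_argcount equal to 2 and is defined at module level rather than inside another function. The names scan_stub, load_image and IMAGE are hypothetical placeholders for the real scan() and read_image_from_disk(); whether this fits a particular pygad version or platform has not been verified here.

```python
def scan_stub(img, inp, debug=False):
    # Hypothetical stand-in for the real scan(); returns (fitness, extra).
    return float(sum(inp)) / (1 + len(img)), None


def load_image():
    # Hypothetical replacement for read_image_from_disk('some_image.png').
    return [[0, 1], [1, 0]]


# Loaded when the module is imported, so a worker process that imports
# this module gets its own copy of the image.
IMAGE = load_image()


def fitness(solution, solution_idx):
    # Exactly two parameters; the image comes from module state, not a closure.
    best_fitness, _ = scan_stub(IMAGE, solution, debug=False)
    return best_fitness


if __name__ == "__main__":
    print(fitness.__code__.co_argcount)
    print(fitness([1, 2, 3], 0))
```

The trade-off is that the image becomes global state, which is less flexible than a closure when several images need to be processed in one run.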
Thanks @risboo6909 for your explanation.
I do not know what error you get now, but the following code works. I defined a simple scan() function.
import pygad
import time

def scan(img, inp, debug=False):
    return 1, 2

def do_recognize(img, x=1):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner(img, None)

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=do_recognize,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])

if __name__ == '__main__':
    t1 = time.time()
    ga_instance.run()
    t2 = time.time()
    print("Time is", t2-t1)
Please share any errors that you get.