GeneticAlgorithmPython

Problems with multiprocessing

Open risboo6909 opened this issue 2 years ago • 5 comments

Hi everyone.

I'd like to use PyGAD's parallel_processing feature to parallelize fitness evaluation with multiprocessing, like this:

parallel_processing=["process", 2]

My fitness function is a closure: it has access to the variables of its outer function, and it still meets PyGAD's requirement of taking exactly 2 arguments:

def do_recognize(img):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner

When I use the approach from the example above, I get an error saying that the inner function cannot be pickled:

AttributeError: Can't pickle local object 'do_recognize.<locals>.inner'
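The error can be reproduced without PyGAD. pickle serializes a function as a reference to its qualified name, and a nested function's name contains `<locals>`, so a worker process cannot look it up again. The sketch below uses only the standard library; `make_fitness`, `top_level_fitness` and `can_pickle` are hypothetical names for illustration.

```python
import pickle

def make_fitness(img):
    # Nested function: its qualified name contains '<locals>', so pickle
    # cannot resolve it by name from another process.
    def inner(solution, solution_idx):
        return len(img) + len(solution)
    return inner

def top_level_fitness(solution, solution_idx):
    # Module-level function: pickle stores only a reference to its name.
    return len(solution)

def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # The exact exception type differs between Python versions.
        return False

closure = make_fitness("image-bytes")
print(closure.__qualname__)   # make_fitness.<locals>.inner
print(can_pickle(closure))    # False
# For comparison: a module-level function pickles when run as a script.
print(can_pickle(top_level_fitness))
```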

Ok, my next attempt is to use functools.partial like this:

fitness = partial(do_recognize, img)

But this doesn't work either, because PyGAD checks that the fitness function takes exactly 2 arguments by reading its __code__.co_argcount attribute, which functools.partial objects do not have.

I came up with the following ugly hack:

    Dummy = namedtuple('Dummy', 'co_argcount')
    find_board = partial(do_recognize, img)
    find_board.__code__ = Dummy(co_argcount=2)

This satisfies both multiprocessing and PyGAD, but it doesn't look good at all.

I'd like to propose dropping the __code__.co_argcount attribute for the arity check and switching to something less unusual. For example, the inspect module could help:

import inspect
assert len(inspect.getargspec(f).args) == 2

which at least works perfectly fine with partial.
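One caveat: inspect.getargspec has been deprecated since Python 3.0 and was removed in Python 3.11, so on current interpreters the same idea would go through inspect.signature. The following is a minimal sketch of such a check, not PyGAD's implementation; `positional_arity` and the three-argument variant of `do_recognize` are assumptions made for illustration.

```python
import inspect
from functools import partial

def do_recognize(img, solution, solution_idx):
    # Hypothetical three-argument variant: the image comes first so that
    # partial() can bind it, leaving the two arguments the check expects.
    return float(len(img) + len(solution))

def positional_arity(func):
    """Count the positional parameters a callable still expects."""
    params = inspect.signature(func).parameters.values()
    return sum(
        p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
        for p in params
    )

fitness = partial(do_recognize, "image-bytes")

print(hasattr(fitness, "__code__"))    # False: partial objects have no __code__
print(positional_arity(do_recognize))  # 3
print(positional_arity(fitness))       # 2
```

Because inspect.signature understands functools.partial, already-bound arguments are excluded from the count, and the check needs no `__code__` attribute.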

risboo6909, Jul 26 '22 21:07

Can you add a second argument with a default value to the do_recognize() function? For example, add x=None.

def do_recognize(img, x=None):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner

fitness = partial(do_recognize, img)

Is this solution feasible?

ahmedfgad, Jul 28 '22 00:07

@ahmedfgad no, this doesn't work either:

AttributeError: 'functools.partial' object has no attribute '__code__'

risboo6909, Jul 28 '22 17:07

@risboo6909, it is my fault for not explaining the idea clearly.

Here is sample code that I tested and that runs. To use it, uncomment the line best_fitness, _ = scan(img, inp, debug=False) and comment out the line best_fitness, _ = 1, 2.

import pygad
import time
from functools import partial

def do_recognize(img, x=1):
    def inner(inp, _):
        # best_fitness, _ = scan(img, inp, debug=False)
        best_fitness, _ = 1, 2
        return best_fitness
    inp=2
    inner2 = partial(inner, inp)
    return inner2(img)

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=do_recognize,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])

if __name__ == '__main__':
    t1 = time.time()

    ga_instance.run()

    t2 = time.time()
    print("Time is", t2-t1)

ahmedfgad, Jul 28 '22 22:07

First of all, thank you for the detailed explanation.

I guess I see your point now. I've tested your approach and it works fine if I just leave best_fitness, _ = 1, 2.

But the problem is that it doesn't work with my scan function. Let me try to explain. In your approach, do_recognize itself is passed to PyGAD as the fitness function, so its first argument is expected to be a solution and its second a solution_idx. In your example, however, it expects an image and some x:

def do_recognize(img, x=1):

So as soon as the GA starts calling it, img here will be bound to the solution and x to solution_idx.
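The binding can be checked without PyGAD by calling the function the way a GA would call a fitness function, as f(solution, solution_idx). This is only an illustrative sketch: the body of `do_recognize` is replaced by one that reports what each parameter received.

```python
def do_recognize(img, x=1):
    # Same signature as in the reply above; the body only reports
    # which value each parameter was bound to.
    return {"img": img, "x": x}

# A GA calls the fitness function positionally with (solution, solution_idx).
solution, solution_idx = [0.1, 0.2, 0.3], 0
bound = do_recognize(solution, solution_idx)

print(bound["img"])  # [0.1, 0.2, 0.3]  (the solution, not an image)
print(bound["x"])    # 0                (the solution index)
```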

But my scan function expects img to be an actual image, not a solution found by the GA. I was passing this image as an argument to the outer function so that it would be available to the nested function. Let me expand the example from my original question to make this clearer:

def do_recognize(img):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner

# Here do_recognize returns a fitness function that has access to the image read from disk
fitness = do_recognize(read_image_from_disk('some_image.png'))

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=fitness,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])

Hope this clarifies my idea.
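For comparison, one way to keep the image available without a closure is to hold it in module-level state and define the fitness function at the top level of the module. Such a function is pickled by name and still exposes `__code__.co_argcount == 2`. This is a sketch under stated assumptions and has not been run against PyGAD here; `scan` and `load_image` are hypothetical stand-ins for the real scan() and read_image_from_disk().

```python
def scan(img, inp, debug=False):
    # Hypothetical stand-in for the real scan(); returns (fitness, extra).
    return float(len(img) + len(inp)), None

def load_image():
    # Hypothetical stand-in for read_image_from_disk('some_image.png').
    return b"fake-image-bytes"

# Module-level state, evaluated at import time, so a worker process that
# re-imports this module builds its own copy of the image.
IMG = load_image()

def fitness(solution, solution_idx):
    # Top-level function with exactly two parameters.
    best_fitness, _ = scan(IMG, solution, debug=False)
    return best_fitness

print(fitness.__code__.co_argcount)  # 2
print(fitness([1, 2, 3], 0))         # 19.0
```

The trade-off is that the image must be loadable at import time. With the "spawn" start method each worker re-imports the module, so anything assigned only inside the `if __name__ == '__main__':` block would not be visible to the workers.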

risboo6909, Jul 30 '22 15:07

Thanks @risboo6909 for your explanation.

I do not know what error you are getting now, but the following code works. I defined a simple scan() function for it.

import pygad
import time

def scan(img, inp, debug=False):
    return 1, 2

def do_recognize(img, x=1):
    def inner(inp, _):
        best_fitness, _ = scan(img, inp, debug=False)
        return best_fitness
    return inner(img, None)

ga_instance = pygad.GA(num_generations=10,
                       num_parents_mating=3,
                       sol_per_pop=5,
                       num_genes=10,
                       fitness_func=do_recognize,
                       suppress_warnings=True,
                       parallel_processing=["process", 2])

if __name__ == '__main__':
    t1 = time.time()

    ga_instance.run()

    t2 = time.time()
    print("Time is", t2-t1)

Please share any errors that you get.

ahmedfgad, Aug 02 '22 15:08