GeneticAlgorithmPython Bias towards large gene values

I noticed that when my desired_output is set to <=10**3 the algorithm factorizes it with ease, but for numbers beyond 10**4 it starts giving me odd answers: solutions with a fitness of 0.99 but never 1. I also noticed that the gene values in the solutions it returns never go beyond 5 or 6. Why?

from decimal import *
import pygad
import numpy as np

function_inputs = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
desired_output = 10**4


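def fitness_func(solution, solution_idx):
    # Each gene is the exponent of the prime at the same position.
    output = 1
    solution = solution.tolist()
    for i in range(0, len(function_inputs)):
        output *= function_inputs[i]**solution[i]
    # Fitness is the squared ratio of the smaller to the larger value,
    # so it reaches 1.0 only when output == desired_output.
    if output >= desired_output:
        return 1.0 / ((output / desired_output)**2)
    else:
        return 1.0 / ((desired_output / output)**2)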


def crossover_func():
    return 0


num_generations = 500
num_parents_mating = 500
fitness_function = fitness_func
sol_per_pop = 1000
num_genes = len(function_inputs)
init_range_low = 0
init_range_high = 11
parent_selection_type = "rank"
keep_parents = -1
crossover_type = "scattered"
crossover_probability = 0.7
mutation_type = "inversion"
mutation_probability = 0.9
gene_type = int
allow_duplicate_genes = True
stop_criteria = ["reach_1.0", "saturate_20"]
gene_space = {'low': init_range_low, 'high': init_range_high}


def callback_gen(ga_instance):
    print("Generation: ", ga_instance.generations_completed)
    solution, solution_fitness, solution_idx = ga_instance.best_solution()
    print("Parameters of the best solution: {solution}".format(
        solution=solution))
    print("Fitness value of the best solution = {solution_fitness}".format(
        solution_fitness=solution_fitness))


ga_instance = pygad.GA(num_generations=num_generations,
                       num_parents_mating=num_parents_mating,
                       fitness_func=fitness_function,
                       sol_per_pop=sol_per_pop,
                       num_genes=num_genes,
                       init_range_low=init_range_low,
                       init_range_high=init_range_high,
                       parent_selection_type=parent_selection_type,
                       keep_parents=keep_parents,
                       crossover_type=crossover_type,
                       crossover_probability=crossover_probability,
                       mutation_type=mutation_type,
                       mutation_probability=mutation_probability,
                       gene_type=gene_type,
                       gene_space=gene_space,
                       allow_duplicate_genes=allow_duplicate_genes,
                       stop_criteria=stop_criteria,
                       save_solutions=True)

ga_instance.initialize_population(init_range_low, init_range_high, True, True, int)
ga_instance.run()
callback_gen(ga_instance)
ga_instance.plot_new_solution_rate()

Jul 24 '22 09:07 epixinvites

@epixinvites, thanks for using PyGAD.

This may simply be the best fitness that can be found with the current set of parameters.

I tried running your code for 10**3 and it sometimes gives wrong results. The closest result is 1001.
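
To see why a near miss such as 1001 still scores so highly, the fitness from the question can be evaluated by hand. The sketch below re-implements that fitness as a standalone function and applies it to the exponent vector for 1001 = 7 * 11 * 13. The names `product_of_powers` and `ratio_fitness` are illustrative helpers, not part of PyGAD.

```python
# Standalone re-implementation of the fitness from the question, for illustration.
# `product_of_powers` and `ratio_fitness` are hypothetical helper names.
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def product_of_powers(exponents, bases=primes):
    """Multiply each base raised to its matching exponent."""
    result = 1
    for base, exponent in zip(bases, exponents):
        result *= base ** exponent
    return result


def ratio_fitness(exponents, target, bases=primes):
    """Squared ratio of the smaller to the larger of (product, target)."""
    output = product_of_powers(exponents, bases)
    if output >= target:
        return 1.0 / ((output / target) ** 2)
    return 1.0 / ((target / output) ** 2)


near_miss = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]  # 7 * 11 * 13
exact = [3, 0, 3, 0, 0, 0, 0, 0, 0, 0]      # 2**3 * 5**3

print(product_of_powers(near_miss))     # 1001
print(ratio_fitness(near_miss, 10**3))  # about 0.998
print(ratio_fitness(exact, 10**3))      # 1.0
```

A product that is off by 0.1% loses only about 0.2% of fitness, so near misses like this sit very close to the optimum without ever reaching 1.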

After one change to the list at the beginning of the code, replacing 11 with 10, it factors 10**3 successfully on every run in fewer than 16 generations.

function_inputs = [2, 3, 5, 7, 10, 13, 17, 19, 23, 29]

This is an example of a solution found:

[1, 0, 1, 0, 2, 0, 0, 0, 0, 0]
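
A quick arithmetic check, written as a small sketch with illustrative variable names, confirms that this vector reproduces 1000 with the modified list and shows which entries of the list it actually uses.

```python
modified_inputs = [2, 3, 5, 7, 10, 13, 17, 19, 23, 29]  # 11 replaced by 10
solution = [1, 0, 1, 0, 2, 0, 0, 0, 0, 0]

# Multiply each base raised to its gene (exponent) value.
product = 1
for base, exponent in zip(modified_inputs, solution):
    product *= base ** exponent

print(product)  # 1000, i.e. 2 * 5 * 10**2

# Bases that contribute to the product (non-zero exponent).
used_bases = [b for b, e in zip(modified_inputs, solution) if e > 0]
print(used_bases)  # [2, 5, 10]
```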

Is it feasible?

Aug 02 '22 16:08 ahmedfgad

I'm trying to combine a genetic algorithm with prime factorization. 10 is not a prime, so this is not feasible.
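
For reference, an exact prime-only solution for 10**4 does exist with small exponents: 10**4 = 2**4 * 5**4. The sketch below derives that exponent vector by trial division over the original list of primes. `prime_exponents` is a hypothetical helper written for this illustration, not part of PyGAD.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def prime_exponents(n, bases=primes):
    """Return the exponent of each base in n, or None if a factor is left over."""
    exponents = []
    for p in bases:
        count = 0
        while n % p == 0:
            n //= p
            count += 1
        exponents.append(count)
    return exponents if n == 1 else None


print(prime_exponents(10**4))  # [4, 0, 4, 0, 0, 0, 0, 0, 0, 0]
print(prime_exponents(10010))  # [1, 0, 1, 1, 1, 1, 0, 0, 0, 0]
```

10010 = 2 * 5 * 7 * 11 * 13 is one example of a nearby product. Under the fitness in the question it would score (10000/10010)**2, about 0.998, which is consistent with the reported 0.99 values, although the thread does not confirm which near misses the runs actually hit.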

Aug 08 '22 10:08 epixinvites