1. Project introduction
Genetic algorithm (GA) is used to construct a project to find the approximate solution of univariate quadratic equation.
Language: python 3.6.8
Operating system: win10
python package used:
- random
- matplotlib
- re
- time
1.1 coding of solution
Here, the solution is expressed as binary code. By default, there are 24 bits in total. The first 8 bits are the integer digits of the solution, the first bit is the sign bit representing positive and negative (0-negative, 1-positive), the last 16 bits are the decimal digits of the solution, and the search space of the whole solution is [+ 127.65535, - 127.65535].
INT=27+26+25+...+20=127Float=215+214+213+...+20=65535 INT = 2^7+2^6+2^5+...+2^0=127 \\ Float= 2^{15}+2^{14}+2^{13}+...+2^0=65535 INT=27+26+25+...+20=127Float=215+214+213+...+20=65535
1.2 cross fusion of Solutions
The sub solution, that is, the coding of the offspring comes from a pair of parent / parent solutions with high fitness, and each bit of the sub solution comes from the parent solution or parent solution in a random way. Here, by default, the top 20 solutions in fitness (ascending order) are selected to form a parent solution by pairing randomly. Each pair of parents can produce 10 sub solutions.
Note: the figure only indicates the coding source of the sub solution under different coding conditions of the same bit of the parent solution. In fact, each bit should be randomly selected from the parent solution.
1.3 mutation
Here, in order to increase the difference of mutation, a mutation is defined by default, and any five bits of the one-time change sub solution become the opposite number, such as 0 to 1 and 1 to 0.
1.4 fitness calculation
Target=T(x)=coef1∗x2+coef2∗xFitness=(Target−T(x1))2 Target = T(x) = coef_1*x^2+coef_2*x \\ Fitness = (Target - T(x_1))^2 Target=T(x)=coef1∗x2+coef2∗xFitness=(Target−T(x1))2
coef1 and coef2 are constant coefficients, and Target * * * is also a constant. Since the purpose of my genetic construction is to approximate an approximate solution of a quadratic equation of one variable, the fitness here is defined as the sum of squares of errors with * * * Target, and the smaller the fitness, the higher the survival rate.
2. Genetic algorithm
For details of genetic algorithm, please refer to: Genetic algorithm from understanding to example application (Part I) (python)
Here I briefly summarize his characteristics and my understanding.
Genetic algorithm is a kind of directional random optimization, which needs to be realized. It contains four characteristics in Chapter 1: 1 code; 2. Cross integration; 3. Mutation; 4. Definition of fitness. Fitness is the direction of evolution. A good definition of fitness is very important for genetic algorithm. We choose "high-quality" individuals through the survival of the fittest, pair by pair, generate new populations through cross fusion and mutation, and then continuously screen individuals with high fitness for pairing to produce the next generation. After such repetition, we can produce a population that meets our expectations after reaching the appropriate number of iterations.
Genetic algorithm embodies the philosophical view of "natural selection, survival of the fittest". Generally speaking, although we all say that genetic algorithm is global optimization, I don't think so. In the process of cross fusion and mutation, if the operator's improper arrangement can not achieve global optimization, my arrangement is still flawed. I choose the top 20 individuals to pair arbitrarily, and each pair of parents produce 10 sub solutions. When the number of species with the highest fitness is small, My arrangement is likely to cause the offspring of cross fusion and mutation to lose the high fitness of their parents, thus annihilating them in the vast survival history of the population. A better way is to directly copy multiple individual codes with the highest fitness and give them to the offspring, so as to directly retain the characteristics of the parents.
In addition, the jumping distance of genetic algorithm in solution space is also the key to achieve global optimization. The jumping ability of genetic algorithm depends on the strategy of cross fusion and mutation. The cross integration strategy I made here is more like "homonymous inheritance". After a certain generation iteration, our group fitness is often almost the same, even with the same code, so we can't get different offspring under "homonymous inheritance", or even directly the copy of parents. At this stage, the group can only rely on unstable mutations. If we follow my current cross integration strategy, In terms of the composition of my parents, I need to ensure their diversity. Unfortunately, I don't have it. I'm also aware of it later, but fortunately, the script works normally and can get the approximate solution of the optimal solution. In addition, there are other more fancy ways in cross fusion. Extending to the chromosomes of real organisms is simply metamorphosis. For example, direct inversion of coding, arbitrary fusion of different parts of the coding of parental solution, etc. However, the chromosomes of real organisms do have cross exchange, inversion, translocation, deletion, etc., but the probability is relatively low. Therefore, in the strategy of cross fusion and mutation, it is necessary to ensure the diversity of the offspring, but also ensure that the offspring are different from their parents. At the same time, it is also necessary to preserve the characteristics of their parents by generating individuals with the same coding as their parents to continue the group. Individuals with different coding are the pioneer of development. The group looks for better coding through these different individuals, so as to better adapt to the environment. The vast and enduring survival history of the group is based on the heroic sacrifice of many ancestors. Our current happiness is earned by the blood of a revolutionary martyr and those who have made great contributions to the cause of the motherland. I hope you will study hard and ask questions and forge ahead. When it's our turn to stand up in the future, we will be able and energetic to shoulder the dark gate and let the next generation go to a wide and bright place, I hope everyone can make a share of light and heat, and the firefly will send a share of firefly light. If there is no torch in the world, I will be the only light.
3. Presentation
The control of DETAIL mode and non DETAIL mode lies in the False and True of DETAIL variable
if __name__ == "__main__": # Define a express f="1000=12x**2+25x" # Individual code information XINTSIZE = 9 XDECSIZE = 16 # Population information POPULATION_SIZE = 100 CHILD_SIZE = 10 MAX_GENERATION_SIZE = 50 MUTATE_SIZE = 5 MUTATE_RATE = 0.05 CROSS_MERGE_RATE = 0.5 # visualization detail X_RANGE = [-100,100] Y_RANGE = [-500,6000] # Print detail DETAIL = False PAUSE = True # Create a population numPopulation = Population( popSize = POPULATION_SIZE, childsSize = CHILD_SIZE, maxGenerationSize = MAX_GENERATION_SIZE, formulate=f, ) numPopulation.live() # visualization if not DETAIL: numPopulation.visualization() else: numPopulation.visualizationDetail() print("-"*80) best = numPopulation.species[0] bestX = best.getDecimal() bestTarget = numPopulation.getTarget(bestX) print("Best x = %.3f"%bestX) print("Best Target = %.3f"%bestTarget) print("%.3f = %.3fx**2 + %.3fx"%(numPopulation.target,numPopulation.xCoef,numPopulation.yCoef))
It is demonstrated here that the integer bit length is 9 bits, the decimal bit length is 16 bits, and the population size is 100 bits. Each pair of parents produce 10 offspring, the maximum generation is 50 generations, the mutation scale is 5 bits at a time, and the mutation rate is 0.05. Under the probability that the cross fusion comes from half of the parent / parent, the univariate quadratic equation:
1000=12∗x2+25x
1000=12*x^2+25x
1000=12∗x2+25x
Script output under
3.1 detail mode
Print the information of the first 20 individuals in each generation. Finally, the value of each generation will be visualized.
Console printing
Visual generation value
It can be seen that there are two optimal solutions in about five generations, but one of them gradually disappears with the progress of generations.
3.2 non detail mode
In the non detail mode, the average value of the first 20 individuals in each generation is calculated and printed out.
4. Code explanation
Here I mainly explain the important parts of the code and the specific process
4.1 numberspecifications class
It is the definition of solution. There is a code list variable inside. By default, there are 24 bits, that is, the code storage of solution. There are five methods in total. For details, see the comments in the corresponding code block. It is the container of the solution individual in the code.
4.2 Population class
It is a group and manages each individual through the categories list variable. These individuals are instances of the numberspecifications class. The live API is a part of managing the iteration of the whole group. It is a cycle until the maximum number of generations is defined.
4.2.1 live()
def live(self): """ Group life cycle """ while self.generation < self.maxGenerationSize: self.nextGeneration()
4.2.2 nextGeneration ()
def nextGeneration(self): """ Produce the next generation """ # calculate current generation fitness fitnessList = self.calFitness(self.species) # order by fitness self.sortByFitness(fitnessList, self.species) #Survival individual # child born childs = self.getChildPopulation() #choose survival child as next generation fitnessListChild = self.calFitness(childs) self.sortByFitness(fitnessListChild, childs) self.species = childs[:self.popSize] # self.generation += 1 # self.show(detail = DETAIL)
This is the processing logic of genetic algorithm, as follows:
- Calculate the fitness of the current group and rank it;
- The selected individuals reproduce to produce offspring groups;
- Calculate and rank the fitness of offspring groups;
- Replace the current population with the offspring population of the corresponding size, and add 1 to the number of generations;
- Print the group information of the current generation;
4.2.3 sortByFitness ( fitnessList, speciesList)
def sortByFitness(self, fitnessList, speciesList): """ Sort according to suitability, select sort, ascending order """ L = len(fitnessList) for i in range(L): for j in range(i,L): if fitnessList[i] > fitnessList[j]: fitnessList[i], fitnessList[j] = fitnessList[j], fitnessList[i] speciesList[i], speciesList[j] = speciesList[j], speciesList[i]
Through bubble sorting, individuals and fitness lists are arranged in the order of fitness from small to large.
4.2.4 getChildPopulation ()
def getChildPopulation(self): """ Offspring birth return child """ # selectParent fathersI, mothersI = self.selectParent() L = len(fathersI) # get childs childs = [] for i in range(L): for j in range(self.childsSize): # child born fI = fathersI[i] mI = mothersI[i] child = self.getChild(self.species[fI], self.species[mI]) # add child to child population childs.append(child) return childs
By default, the top 20 individuals in the fitness ranking are selected to form a pair of parents randomly in pairs, and the index list of the categories variable of the father individual is returned. fathersI is similar to that of the mother. Then, each pair of parents get 10 children through cross fusion and mutation, add them to the offspring group and return to the offspring group.
4.2.5 getChild (f, m)
def getChild(self,f,m): """ 1.Binary coding cross fusion 2.Mutation coding Single child """ assert isinstance(f,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class" assert isinstance(m,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class" seed = random.uniform(0,1) # do crossMerge? # decide cross position childCode = [] for i in range(f.totalSize): fromSeed = random.uniform(0,1) if fromSeed > self.crossMergeRate: childCode.append(f.code[i]) else: childCode.append(m.code[i]) # do mutate? # randomly choose x position to mutate if seed < self.mutateRate: tempPosIndex = [i for i in range(f.totalSize)] mutatePos = random.sample(tempPosIndex,self.mutatePosSize) # Change code for i in mutatePos: if childCode[i] == 0: childCode[i] = 1 else: childCode[i] = 0 # child born child = NumberSpecies(XINTSIZE,XDECSIZE) child.assignCode(childCode) return child
Enter the parent instance. Random numbers are generated through random Uniform (0, 1) extracts from an equal distribution of 0 to 1. By default, if it is greater than 0.5, the encoding bit from the parent party; otherwise, it comes from the parent party. If the generated random number is less than 0.05 by default, 5 coding bits will be mutated arbitrarily. Give the code directly to a new instance of numberspecifications and put it back as a single child.
4.2.6 selectParent ()
def selectParent(self,size = 20): """ By default, choose the top 20 parents in terms of fitness and pair them randomly return index list of select species one is father another is mother """ assert size < len(self.species), "selectParent Func size=%d par must less population%d"%(size, len(self.species)) # get size of couple coupleSize = size // 2 # get total index of couple total = set([i for i in range(coupleSize*2)]) # father and mother father = set(random.sample(total,coupleSize)) mother = list(total - father) mother = random.sample(mother,coupleSize) father = random.sample(father,coupleSize) return father, mother
Parents were randomly assigned through random Sample (population, times) randomly extracts times samples from population and returns them as the index list of parents.
5. Complete code
#python3 #-------------------------- # Author: little shark # Date: 2022/4/13 """ Solving univariate quadratic equation by genetic algorithm """ import random from matplotlib import pyplot as plt from time import sleep import re def getRange(x,y): """ x: x^2 Coefficient of y: x Coefficient of Returns the visual value range[-50, 50] """ targets = [] scale = [i for i in range(X_RANGE[0], X_RANGE[1], 1)] for i in scale: t = x * i**2 + y*i targets.append(t) return targets class NumberSpecies: """ A species approximation to the solution of a quadratic equation of one variable is defined """ def __init__(self, xIntSize = 8, xDecSize = 16): """ xIntSize: x The number of binary coded integer digits of yuan. The default is 8 bits, and the first bit is positive and negative xDecSize: x Binary coded decimal places of yuan, 16 digits by default By default, there are 24 digits in total, which are arranged as follows 8 position x Integer digit 16 bits x Decimal place of yuan """ # define a bit size to x self.xIntSize = xIntSize # define a bit size to decimal self.xDecSize = xDecSize # total size self.totalSize = xIntSize + xDecSize # define code self.code=[0 for i in range(self.totalSize)] # random it code self.random() def assignCode(self,code): """ Direct binary code """ self.code = code.copy() def show(self): """ Print binary coded information and its corresponding x element,y Decimal system of yuan """ print("code = ",self.code) print("numb = ",self.getDecimal()) #print("fitness = ",self.fitness) def random(self): """ Ramdom code Random coding """ self.code=[random.choice([0,1]) for i in range(self.totalSize)] def getDecimal(self): """ turn code into decimal Convert binary encoding to decimal """ #--------------------------------- # X part # part of x int xIntNum = 0 start = 1 signXIndex = 0 end = self.xIntSize xIntCode = self.code[start: end] for i in range(self.xIntSize-1): xIntNum += pow(2,self.xIntSize - i - 1)*xIntCode[i] # part of x decimal xDecNum = 0 start = end end = end + self.xDecSize xDecCode = self.code[start: end] for i in range(self.xDecSize): xDecNum += pow(2,self.xDecSize - i - 1)*xDecCode[i] # x str -> float xDecStr = str(xIntNum) + "." + str(xDecNum) xFinalNum = float(xDecStr) if self.code[signXIndex] == 0: xFinalNum *= -1 return xFinalNum def codeClone(self): """ return a code clone """ return self.code.copy() class Population: """ Administration NumberSpecies: 1.Control species mutation; 2.Control cross fusion; 3.Calculate suitability; 4.Constantly update the population, eliminate the species with low fitness and select the parent with high fitness/The mother parent is paired to give birth to the next generation. """ def __init__(self, popSize = 100, childsSize = 4, maxGenerationSize = 50, formulate = "" ): """ popSize: Population size childsSize: Number of children per couple maxGenerationSize: Maximum generations formulate: formula constant = c1*x**n + c2*y """ # Is expression legal? assert formulate != "", "You must input a formulate like target = c1*x**n + c2*y" # Extract information from formulate express m = re.search("([\d\.-]+)=([\d\.-]+)x\*\*2\+([\d\.-]+)x",formulate) # Is formulate legal? assert m != None,"Your formulate is not legal like: target = c1x**n + c2y " # Assign, formulate self.target = float(m.group(1)) self.xCoef = float(m.group(2)) self.xPower = 2 self.yCoef = float(m.group(3)) # Define the index of current generation self.generation = 0 # Assign, generation information self.popSize = popSize self.childsSize = childsSize self.maxGenerationSize = maxGenerationSize # Assign, about mutate and cross merge self.mutateRate = MUTATE_RATE self.crossMergeRate = CROSS_MERGE_RATE self.mutatePosSize = MUTATE_SIZE # mutate code number default 3 # Preduce population self.species = [NumberSpecies(xIntSize = XINTSIZE, xDecSize = XDECSIZE) for i in range(popSize)] # history of population: fitness & x & y & predict constant "Target" self.historicalMeanFitness = [] self.historicalMeanXDecimal = [] self.historicalMeanYDecimal = [] self.historicalMeanTargetDecimal = [] self.XhistoricalTopPopulation = [] self.TargethistoricalTopPopulation = [] def getTop20Mean(self,xDecimalList,fitnessList): """ Calculate the fitness of the top 20 species x and y The average coefficient and fitness of the element are and returned return average of coef about x & y and its target by define express """ assert len(xDecimalList) > 20, "Your population must more than 20" meanXDecimal = sum(xDecimalList[:20])/len(xDecimalList[:20]) meanfitness = sum(fitnessList[:20])/len(fitnessList[:20]) return meanXDecimal, meanfitness def calFitness(self, speciesList): """ Calculate fitness (Squared error from defined target number) """ fitnessList = [] for i,num in enumerate(speciesList): xDecNum = num.getDecimal() # get fitness base on express fitness = (self.target - self.getTarget(xDecNum)) ** 2 # save fitnessList.append(fitness) return fitnessList def sortByFitness(self, fitnessList, speciesList): """ Sort according to suitability, select sort, ascending order """ L = len(fitnessList) for i in range(L): for j in range(i,L): if fitnessList[i] > fitnessList[j]: fitnessList[i], fitnessList[j] = fitnessList[j], fitnessList[i] speciesList[i], speciesList[j] = speciesList[j], speciesList[i] def getTarget(self,x): """ According to the formula and given x The coefficient is calculated. """ return self.xCoef * x ** self.xPower + self.yCoef * x def show(self, top=20, detail = False, pause = False): """ Print generation Top N Information: including: generation index, fitness, x,Predicted results, real results There are two modes: detail mode and non detail mode Detail mode: detail = True Print Top N The above information of species Non detail mode: datail = False Print Top 20 Average of the above information of species It can be visualized and will save the generation information in the process of population evolution """ # Get decimal of x and y xDecNums = [] for i in self.species: xDecNum = i.getDecimal() xDecNums.append(xDecNum) # Get fitness fitnessList = self.calFitness(self.species) # Start show information if detail: print() print("="*80) print(" "*30,"*- Top %d -*"%top) print(" "*25,"*- %d generation -*"%self.generation) print("="*80) print("%-12s %-12s %-12s %-12s %-12s"%("Index","Fitness","XDecimal","MyTarget","RealTarget")) print("-"*80) myTargets = [] for i in range(top): my_target = self.getTarget(xDecNums[i]) myTargets.append(my_target) print("%-12d %-12.3f %-12.3f %-12.3f %-12.3f"%(i, fitnessList[i], xDecNums[i], my_target, self.target)) print("-"*80) # Save self.XhistoricalTopPopulation.append(xDecNums[:top]) self.TargethistoricalTopPopulation.append(myTargets.copy()) if pause: sleep(0.5) else: xDecimal, fitness = self.getTop20Mean(xDecNums, fitnessList) my_target = self.getTarget(xDecimal) if self.generation == 1: print("%-12s %-12s %-12s %-12s %-12s"%("Generation","Fitness","XDecimal","MyTarget","RealTarget")) print("-"*100) print("%-12d %-12.3f %-12.3f %-12.3f %-12.3f"%(self.generation,fitness, xDecimal, my_target, self.target)) # Save history self.historicalMeanFitness.append(fitness) self.historicalMeanXDecimal.append(xDecimal) self.historicalMeanTargetDecimal.append(my_target) if pause: sleep(0.5) def visualization(self): """ Visual generation history information """ # Fitness information about plt.figure(figsize=(8,5)) plt.subplot(2,1,1) plt.plot([i+1 for i in range(self.generation)],self.historicalMeanFitness,linestyle="--",marker="o") plt.ylabel("Fitness") # My Target information about plt.subplot(2,1,2) plt.plot([i+1 for i in range(self.generation)],self.historicalMeanTargetDecimal,linestyle="--",marker="o") plt.ylabel("My Target real = %.3f"%self.target) plt.show() def visualizationDetail(self): """ Visual generation value """ plt.ion() plt.figure(figsize=(8,5)) plt.ion() xScale = [i for i in range(X_RANGE[0], X_RANGE[1])] yScale = getRange(self.xCoef, self.yCoef) for i in range(len(self.XhistoricalTopPopulation)): plt.cla() plt.plot(xScale,yScale, alpha=0.7, linestyle=":",label="Express Space",color="blue") plt.axhline(self.target,color="red",alpha=0.7,linestyle="--",label="Real Target Base Line") plt.scatter( self.XhistoricalTopPopulation[i], self.TargethistoricalTopPopulation[i], alpha=0.7, color="red", label = "My Target" ) plt.title("Generation %d max is %d"%(i,self.maxGenerationSize)) plt.ylim(Y_RANGE[0],Y_RANGE[1]) plt.xlim(X_RANGE[0], X_RANGE[1]) plt.legend() plt.pause(1) plt.close() def selectParent(self,size = 20): """ By default, choose the top 20 parents in terms of fitness and pair them randomly return index list of select species one is father another is mother """ assert size < len(self.species), "selectParent Func size=%d par must less population%d"%(size, len(self.species)) # get size of couple coupleSize = size // 2 # get total index of couple total = set([i for i in range(coupleSize*2)]) # father and mother father = set(random.sample(total,coupleSize)) mother = list(total - father) mother = random.sample(mother,coupleSize) father = random.sample(father,coupleSize) return father, mother def live(self): """ Group life cycle """ while self.generation < self.maxGenerationSize: self.nextGeneration() def nextGeneration(self): """ Produce the next generation """ # calculate current generation fitness fitnessList = self.calFitness(self.species) # order by fitness self.sortByFitness(fitnessList, self.species) #Survival individual # child born childs = self.getChildPopulation() #choose survival child as next generation fitnessListChild = self.calFitness(childs) self.sortByFitness(fitnessListChild, childs) # self.generation += 1 # self.show(detail = DETAIL, pause = PAUSE) self.species = childs[:self.popSize] def getChildPopulation(self): """ Offspring birth return child """ # selectParent fathersI, mothersI = self.selectParent() L = len(fathersI) # get childs childs = [] for i in range(L): for j in range(self.childsSize): # child born fI = fathersI[i] mI = mothersI[i] child = self.getChild(self.species[fI], self.species[mI]) # add child to child population childs.append(child) return childs def getChild(self,f,m): """ 1.Binary coding cross fusion 2.Mutation coding Single child """ assert isinstance(f,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class" assert isinstance(m,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class" seed = random.uniform(0,1) # do crossMerge? # decide cross position childCode = [] for i in range(f.totalSize): fromSeed = random.uniform(0,1) if fromSeed > self.crossMergeRate: childCode.append(f.code[i]) else: childCode.append(m.code[i]) # do mutate? # randomly choose x position to mutate if seed < self.mutateRate: tempPosIndex = [i for i in range(f.totalSize)] mutatePos = random.sample(tempPosIndex,self.mutatePosSize) # Change code for i in mutatePos: if childCode[i] == 0: childCode[i] = 1 else: childCode[i] = 0 # child born child = NumberSpecies(XINTSIZE,XDECSIZE) child.assignCode(childCode) return child if __name__ == "__main__": # Define a express f="1000=12x**2+25x" # Individual code information XINTSIZE = 9 XDECSIZE = 16 # Population information POPULATION_SIZE = 100 CHILD_SIZE = 10 MAX_GENERATION_SIZE = 50 MUTATE_SIZE = 5 MUTATE_RATE = 0.05 CROSS_MERGE_RATE = 0.5 # visualization detail X_RANGE = [-100,100] Y_RANGE = [-500,6000] # Print detail DETAIL = False PAUSE = True # Create a population numPopulation = Population( popSize = POPULATION_SIZE, childsSize = CHILD_SIZE, maxGenerationSize = MAX_GENERATION_SIZE, formulate=f, ) numPopulation.live() # visualization if not DETAIL: numPopulation.visualization() else: numPopulation.visualizationDetail() print("-"*80) best = numPopulation.species[0] bestX = best.getDecimal() bestTarget = numPopulation.getTarget(bestX) print("Best x = %.3f"%bestX) print("Best Target = %.3f"%bestTarget) print("%.3f = %.3fx**2 + %.3fx"%(numPopulation.target,numPopulation.xCoef,numPopulation.yCoef))