I have recently finished reading this book, Artificial Intelligence: The Basics by Kevin Warwick (great book btw if you want to get a basic overview of the AI field), and it gave me a yearning to do some AI-programming in F#.
Many years ago I used to read articles at Generation5 and one of the articles was this Hello World program for Genetic Algorithms. I have since then seen people solving this with different programming languages and I myself once tried the above example in Python, but now it’s time to do it in F#.
If you haven’t read about Genetic Algorithms (GA) before then you will get a brief introduction below.
The idea is to mimic the biological evolution to evolve an answer to a problem, i.e. survival of the fittest. This is done by creating a population of chromosomes. Each chromosome is representing a possible answer to the problem. The population will be creating a new generation by producing new offsprings, which is done with the help of a fitness selection function and a crossover function. This shall hopefully create a new generation with chromosomes that are closer to the answer of a problem. There can also be some random mutation of new offsprings.
This might be a little bit abstract, but I hope some code might make things more clear.
I will implement a genetic algorithm that will evolve an answer that will represent the string “Hello F#”. The first step therefore is to decide how to represent the chromosome in the code.
First I’m creating a type alias for our Chromosome type and our target can then be coded as below:
So the goal is to evolve an answer that looks like the target above. Each character in our chromosome is representing a gene, each gene being an instance of a particular allele (value). We will only use visible characters as our alleles.
Now when we know how our chromosomes look likes, lets see how we can determine its fitness in a population. The fitness calculation function is the main step in the natural selection of new offsprings.
1 2 3
Here will we use the target to calculate the fitness of a chromosome. We are calculating the delta value between each character’s ascii value with the target one and then sum all the deltas to a fitness value.
The following chromosome, “Heklo E#”, would then have the fitness value of 2, since both k and F are just one step away from the right value.
Next step to look at is how we create new offsprings, which is called crossover or recombination in a GA. This is done by taking two chromosomes (parents) and mix theirs genes with each other, see the example below:
- Mom: Heklo E#
- Dad: Ghjkn E$
- Child: Hekln E$
The locus (index) where the crossover should occur between dad and mom is decided completely random. The selection of who will be parents is also done randomly, however we will only pick parents with a fitness value among the 50% best chromosomes. This could be expressed like this in our code:
1 2 3 4 5 6 7 8 9 10
The type Individual is defined like this:
One more thing that we should do in our selection of new offsprings is to select an elite of chromosomes. This is done by keeping the 10% best chromosomes of the current population. By doing this we will make sure that the next generation have chromosomes that are at least as good as its parents generation.
We could also add mutation to the mix when creating a new generation of chromosomes. This is done to avoid that the GA will get stuck in a local maximum in the search space of the problem. However I haven’t added this to the solution.
The complete solution for the Hello F# problem will look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
One part I haven’t mention earlier is how we are initializing our population. This is done with the help of the function initializePopulation which is using the createChromosome function to randomly create a chromosome. The gene is randomly selected by using, rnd.Next(32, 123), which is the span for visible ascii characters.
Running this code will not always find the target solution, which means that there is some optimizations/improvements we could do to this program. Maybe some mutation would help, or the size of the population needs to be bigger, or another size of the elite group, or perhaps we should use another representation of the chromosome (genes as bits instead).
The best way to get a deeper understanding of genetic algorithms is to start experimenting with these attributes to see how it affects the end result.
Good luck :-)