Explore the following Genetic Programming Applet. Do you understand what’s going on here? Note that you can enter a new equation to learn under “fitness cases” in settings: give the equation after “//” and then a number of X-Y pairs that solve the equation as fitness cases.
http://alphard.ethz.ch/gerber/approx/default.html
I first tried the application with the default example. The function fitting was almost perfect.
Then I tried a polynomial function with the default parameters. I chose a function with a relatively interesting shape over the input range x: [0, 1].
fitness cases: // 12*x^5 - x 0. 0. 0.1 -0.09988 0.2 -0.19616 0.3 -0.27084 0.4 -0.27712 0.5 -0.125 0.6 0.33312 0.7 1.31684 0.8 3.13216 0.9 6.18588
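The X-Y pairs above can be regenerated with a short script (a sketch of how I sampled the target; the applet itself only needs the single line shown above):

```python
# Reproduce the fitness-case table for 12*x^5 - x at x = 0.0, 0.1, ..., 0.9.
def target(x):
    return 12 * x**5 - x

cases = [(round(i / 10, 1), round(target(i / 10), 5)) for i in range(10)]
print(" ".join(f"{x} {y}" for x, y in cases))
```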
The outcome wasn’t as good as the example they provided. The “fitnesses” values were overall low.
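For reference, GP implementations commonly score a candidate by summing its absolute errors over all fitness cases and mapping the total into (0, 1], so that 1.0 means a perfect fit. This is an assumption about how the applet computes its “fitnesses” values, not something I verified:

```python
# A common GP fitness score (an assumption about this applet): sum of
# absolute errors over the fitness cases, mapped into (0, 1].
def fitness(candidate, cases):
    total_error = sum(abs(candidate(x) - y) for x, y in cases)
    return 1.0 / (1.0 + total_error)

cases = [(0.5, -0.125), (0.6, 0.33312)]
perfect = lambda x: 12 * x**5 - x
print(fitness(perfect, cases))  # close to 1.0 for a perfect candidate
```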
Obviously, the default parameters are unsuitable for different functions. They aren’t even appropriate for all polynomial functions.
With the same polynomial function, I experimented with different parameters to improve the fitting accuracy. The “Method of generation” had a significant effect on the fitting accuracy. After changing this parameter from “Ramped half and half” to “Full”, the function fitting became almost perfect.
The “Method of generation” parameter determines how the trees of the initial population are built. “Full” means every initial tree is fully balanced: the distance from the root to every terminal node is the same. Without a suitable initial population, this Genetic Programming algorithm can be very inaccurate.
I also created a trigonometric function with a relatively interesting shape within the given range, namely x: [0, 1] and y: [-1, 1].
fitness cases: // (Sin[10 x] + Cos[10 x + 3])/2 0. -0.494996 0.1 0.0939137 0.2 0.59648 0.3 0.550645 0.4 -0.00145012 0.5 -0.552212 0.6 -0.595273 0.7 -0.0910425 0.8 0.496892 0.9 0.627986
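As before, these pairs can be regenerated with a short script (a sketch using Python’s `math` module; the applet only needs the single line above):

```python
import math

# Sample (sin(10x) + cos(10x + 3)) / 2 at x = 0.0, 0.1, ..., 0.9.
def target(x):
    return (math.sin(10 * x) + math.cos(10 * x + 3)) / 2

cases = [(round(i / 10, 1), round(target(i / 10), 6)) for i in range(10)]
print(" ".join(f"{x} {y}" for x, y in cases))
```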
I first experimented with default parameters. The outcome wasn’t good.
This is because the function set “+”, “-”, and “%” cannot fully represent a trigonometric function: any combination of those operators yields only a rational function of x, which can approximate sin and cos at best locally.
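In Koza-style GP, “%” conventionally denotes protected division, which returns 1 when the denominator is 0 so that randomly generated trees never crash during evaluation. This is an assumption about what “%” means in this applet:

```python
def protected_div(a, b):
    # Protected division, as conventionally defined in GP (Koza):
    # dividing by zero yields 1.0 instead of raising an error.
    return a / b if b != 0 else 1.0
```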
I applied the same trigonometric function in the “fitness cases”. This time, I added “sin” and “cos” to the “function set”, hoping the trigonometric target would be fitted better. The outcome was good; the fitting was almost perfect.
This tells me that it is essential to use an appropriate set of base functions to fit an unknown function. At the same time, it is not true that the more base functions we use, the better the outcome will be. To verify this statement, I conducted the following experiment.
For the first polynomial function, with the default parameters, I was unable to obtain a good function fitting.
fitness cases: // 12*x^5 - x 0. 0. 0.1 -0.09988 0.2 -0.19616 0.3 -0.27084 0.4 -0.27712 0.5 -0.125 0.6 0.33312 0.7 1.31684 0.8 3.13216 0.9 6.18588
Then, I wanted to know whether applying a larger set of base functions would improve the algorithm’s performance. So I added “sin”, “cos”, and “exp” as base functions that the algorithm could use. Note that the given polynomial function does not involve trigonometric or exponential functions.
The function fitting wasn't improved; in fact, the "fitnesses" values decreased.
So, it is not a general rule in Genetic Programming that the larger the pool of base functions, the better the outcome of the function fitting will be.
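One way to see why a larger function set can hurt: the number of distinct trees of a fixed depth grows exponentially with the number of primitives, so the same population size and generation budget cover a smaller fraction of the search space. A rough count for full binary trees (treating every primitive as binary for simplicity, which over-counts for the unary sin, cos, and exp):

```python
def full_tree_count(n_functions, n_terminals, depth):
    # A full binary tree of depth d has 2^d - 1 internal nodes and 2^d leaves;
    # each internal node picks a function, each leaf picks a terminal.
    internal = 2**depth - 1
    leaves = 2**depth
    return n_functions**internal * n_terminals**leaves

# 3 operators {+, -, %} vs. 6 after adding sin, cos, exp,
# with 2 terminals and depth 4:
print(full_tree_count(3, 2, 4))
print(full_tree_count(6, 2, 4))
```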