Genetic Programming

August 28, 2014 | Pavel Shkilionak

Artificial intelligence, intelligent self-learning machines, systems that can advise on how to do work better, and robotics – all of this has always been like magic to me. When I studied at the university, I was carried away by these topics. It is even more fascinating to use knowledge from one field for another and thus solve the tasks that seemed unsolvable.

What do you think about the interaction of artificial intelligence and the theory of evolution, one of the most interesting open issues in biology? As I worked with algorithms and not hardware, I kept wondering, how we can teach a computer to be intelligent. I did research on the topic within an internal project at IBA. This article gives an overview of Genetic Programming and my speculations on how to use it in software development.

As Wiki says, ‘Genetic Programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task. Essentially GP is a set of instructions and a fitness function to measure how well a computer has performed a task. It is a specialization of genetic algorithms (GA) where each individual is a computer program. It is also a machine learning technique used to optimize a population of computer programs according to a fitness landscape determined by a program’s ability to perform a given computational task’.

In my view, GP is a way to find a solution using atomic user-defined blocks.

The following are the main principles of GP:

1. Initially, we are given past performance data to build the most suitable program for reproducing the same principles of data structure in the future. We assume that the initial data were produced by a kind of a black box. We have historical input and output data to reproduce this system with the same rules and principles for future use.

2. All programs are members of a population. It means that we get not the only solution, but a set of solutions and can choose the most suitable one.

3. Changes in a population are made using an iterative method. At each iteration, the programs that are most fit for crossover and replenishment of the population are selected.

4. A fitness function is used to determine, if a program is fit for the purpose. It is a user-defined metric that numerically presents the ability of a program to solve the defined task (to fit the mapping of the input and output parameters for a given data set).

5. The fittest individuals of a population are selected to develop the population just the way evolution selects species.

6. Changes that can be implemented in the surviving members are similar to biological evolution. A member can be mutated or crossed with another member of the population.

7. A stop condition is defined for a fitness function, when one can stop GP and pick up a solution.

To grow up a population, a programmer must define primitive blocks which will form an individual. These are terminals, including constants and variables, and primitive expressions such as +, -, *, /, cos, sin, if-else, foreach, and other predicates. Any program can be presented as a tree built up of these blocks. This way, any individual of a population can be presented.

In this case, mutation and crossover are represented as following.

We just take a randomly selected node from one individual and use it to replace a randomly selected subtree of another individual. That is a crossover. We can also take a randomly selected node of an individual and replace it with a randomly generated subtree. That is mutation.

Genetic Programming is used for neural network learning and numeric computing, as well to approximate complex functions. I was researching GP to imitate activity of a definite person. An employee while doing his or her job can make both optimal and non-optimal decisions, which makes human thinking different from digital technologies. When you are asked to point to the south, you won’t be able to do it without a mistake, and this ‘white noise’ is our individual quality. Sometimes, we do not need an accurate answer to solve a problem. After gathering the data, we can imitate this employee’s behavior for new tasks using a computer program.

Genetic programming is also useful when creating an artificial player for a game with different difficulty levels. The behavioral algorithms of an artificial intelligent player are typically ideal and therefore a human cannot beat it, if it didn’t play at give-away. Consequently, we need to make the artificial intelligence a little bit human, allowing it to make mistakes from time to time. To this end, the GP algorithms are in place because they teach the artificial intelligence human behavior, including the ability to make mistakes.

Isn’t that remarkable? I think it is real magic and those who create smart computers are magicians. Who knows, maybe it’s a way to put a soul into computer.

August 18, 2014

Is Jargon Preventing an Acceptance of Big Data?

September 4, 2014

Defining Big Data

Blog