Matthew Frederick
matthew@mich.distance.net
Cart Centering
The goal of the cart centering problem is to evolve a controller that
can stop a cart in the exact center of the track before time runs out.
At every time step, the controller has two options at its disposal: push
right or push left. The track is frictionless, so nothing will slow
down the cart but an opposite force. Unlike the controller in the
inverted pendulum problem, this controller
can't choose to do nothing. The magnitude of the force that the controller
can apply is constant. Each time the simulation starts, the cart
is given a random initial position and velocity. This way, the controller
that develops will be more robust.
Since the computer simulation is composed of discrete time slices,
it would be nearly impossible for the controller to center the cart exactly over the line
at the same moment its velocity is exactly 0.0. The simulator therefore treats any value whose
magnitude is less than 0.05 as 0.0 when evaluating an expression. On the screen,
there are two relevant numbers in green giving the cart position and cart
velocity. When the cart is within the range of the simulator's approximate
0.0, these numbers will turn blue. This is simply a way for users
to see how close a controller is to being successful.
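The tolerance described above can be sketched as a simple check. This is a minimal illustration, not the simulator's actual code: the function names are hypothetical, and applying the 0.05 threshold to the absolute value of each reading is an assumption about how "less than 0.05" is meant.

```python
TOLERANCE = 0.05  # threshold from the text; comparing magnitudes is an assumption


def is_approx_zero(value: float, tolerance: float = TOLERANCE) -> bool:
    """Return True when value is close enough to 0.0 to count as zero."""
    return abs(value) < tolerance


def is_centered(position: float, velocity: float) -> bool:
    """The cart counts as centered when both readings are approximately zero."""
    return is_approx_zero(position) and is_approx_zero(velocity)
```

Under these assumptions, a cart at position 0.04 moving at -0.03 would count as centered, while a cart sitting at the center but moving at 0.2 would not.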
This problem was adapted from a problem proposed by John Koza (1992).
Expressions
Each expression can be composed of basic math operations,
constants, conditionals, and some environmental variables. Add, subtract,
multiply, divide, absolute value, cosine, sine, inverse cosine, and inverse
sine are available to expressions, as well as a branching conditional "if
less than zero branch one way else branch the other way". Constants
range in value between -10.0 and 10.0. The following
environmental variables are available to expressions:
cartVelocity | the horizontal velocity of the cart |
cartDistance | the distance of the cart from the center of the track |
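To make the structure of an expression concrete, here is a small sketch of an evaluator for a subset of the operations listed above. The nested-tuple representation, the operator names ("add", "sub", "mul", "abs", "iflz"), and the sample expression are all hypothetical; the text does not say how expressions are stored. Division and the trigonometric functions are left out because the text does not describe how division by zero or out-of-range inputs are handled. Treating cartDistance as a signed value is also an assumption.

```python
def evaluate(expr, env):
    """Evaluate a nested-tuple expression against the environmental variables."""
    if isinstance(expr, (int, float)):
        return float(expr)              # constant
    if isinstance(expr, str):
        return env[expr]                # environmental variable lookup
    op, *args = expr
    if op == "iflz":
        # "if less than zero branch one way else branch the other way"
        cond, if_negative, otherwise = args
        chosen = if_negative if evaluate(cond, env) < 0.0 else otherwise
        return evaluate(chosen, env)
    vals = [evaluate(arg, env) for arg in args]
    if op == "add":
        return vals[0] + vals[1]
    if op == "sub":
        return vals[0] - vals[1]
    if op == "mul":
        return vals[0] * vals[1]
    if op == "abs":
        return abs(vals[0])
    raise ValueError(f"unknown operation: {op}")


# Hypothetical hand-written expression: -cartDistance - cartVelocity * |cartVelocity|
example = ("sub",
           ("mul", -1.0, "cartDistance"),
           ("mul", "cartVelocity", ("abs", "cartVelocity")))

env = {"cartDistance": 2.0, "cartVelocity": -1.0}
result = evaluate(example, env)
```

An evolved expression would be built from the same pieces, only assembled by the genetic program rather than by hand.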
The simulation is composed of a series of time steps that each represent about one-twentieth of a second in real time. The output of each controller tells the simulator how to push the cart at each time step. A value of 0.0 or more means apply a force to the right. Otherwise, the simulator will apply a force to the left.
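A single time step of this kind might look like the sketch below. The time step length and the sign rule come from the text. The force magnitude, the cart mass, and the use of semi-implicit Euler integration are assumptions made for illustration, and the function names are hypothetical.

```python
DT = 1.0 / 20.0  # seconds per time step, from the text
FORCE = 1.0      # assumed constant force magnitude
MASS = 1.0       # assumed cart mass


def push_direction(output: float) -> float:
    """Map a controller output to a direction: 0.0 or more pushes right."""
    return 1.0 if output >= 0.0 else -1.0


def step(position: float, velocity: float, output: float):
    """Advance the frictionless cart by one time step."""
    acceleration = push_direction(output) * FORCE / MASS
    velocity = velocity + acceleration * DT   # no friction term
    position = position + velocity * DT       # uses the updated velocity
    return position, velocity
```

Because the controller cannot choose to do nothing, every step changes the velocity by the same fixed amount in one direction or the other.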
Criterion for evaluating the controller
Each expression is given 3 chances to successfully center the cart. Each time, the cart starts at a slightly different position and velocity. Since a controller cannot be allowed to run forever, each attempt is limited to 350 time steps. After the controller is successful, or after 350 time steps have passed, the expression is evaluated using such factors as:
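The trial structure described above (3 attempts, each capped at 350 time steps) can be sketched as follows. The sketch only records how many steps each attempt took and does not compute a score. The starting ranges, the acceleration, the tolerance test on magnitudes, and the function names are assumptions made for illustration.

```python
import random


def run_trial(controller, position, velocity,
              max_steps=350, dt=0.05, accel=1.0, tol=0.05):
    """Return the number of steps needed to center the cart, or None on timeout."""
    steps = 0
    while not (abs(position) < tol and abs(velocity) < tol):
        if steps == max_steps:
            return None                       # ran out of time
        direction = 1.0 if controller(position, velocity) >= 0.0 else -1.0
        velocity += direction * accel * dt    # assumed unit mass and force
        position += velocity * dt
        steps += 1
    return steps


def run_trials(controller, trials=3, seed=0):
    """Run the controller from several random starts (ranges are assumed)."""
    rng = random.Random(seed)
    return [run_trial(controller,
                      rng.uniform(-0.75, 0.75),   # assumed starting position range
                      rng.uniform(-0.75, 0.75))   # assumed starting velocity range
            for _ in range(trials)]


def always_right(position, velocity):
    """A deliberately poor controller that always pushes right."""
    return 1.0


outcomes = run_trials(always_right)
```

Here a controller is any function of the cart position and velocity that returns a number. A result of None marks an attempt that used all 350 time steps without centering the cart.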
Results
The experiment was successful: a controller that can center the cart can usually be found.
References
Koza, J. R. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press.