A chain of forward models could be applied to the planning of goal-directed movements and to mental transformation. The robot used the simulation of action sequences to perceive (to understand) its location within a circle of obstacles and to perceive the relative distance to obstacles. The forward model was acquired by random exploration. No teacher was necessary.

Tani (1996), Tani and Nolfi (1999), and Jirenhed et al. (2001) also used a chain of forward models for prediction. Unlike our approach, they used Elman networks, which have a context layer (section 1.5.4). Such an approach also allows goal-directed movements by optimizing the motor commands (Tani, 1996). However, the simulation of covert action is not possible because the robot needs to move to initialize its context layer (Tani, 1996). Therefore, in the presented study, context layers were omitted.

The forward model was realized either as a multi-layer perceptron (MLP) or as an abstract recurrent neural network (RNN) based on a mixture of local PCA. In the anticipation task, the MLP was more accurate and 100 times faster than the abstract RNN. The MLP also had fewer free parameters: 355 free parameters for 12 input, 15 hidden, and 10 output neurons, against 6150 free parameters for a mixture model with 50 units and five principal components. Thus, the MLP was the favorable choice for the goal-directed movements and the mental transformation tasks.
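The two parameter counts can be checked with a short calculation. The sketch below is illustrative only: the MLP count assumes one bias per hidden and output neuron, and the per-unit breakdown for the mixture model (center, orthonormal principal directions, eigenvalues, one residual variance, in a 22-dimensional pattern space made of 12 input and 10 output components) is an assumption that is not stated in the text, although it reproduces the quoted figure. The function names are hypothetical.

```python
def mlp_free_parameters(n_in, n_hidden, n_out):
    """Weights plus biases of a fully connected MLP with one hidden layer."""
    hidden = n_in * n_hidden + n_hidden   # input-to-hidden weights + hidden biases
    output = n_hidden * n_out + n_out     # hidden-to-output weights + output biases
    return hidden + output


def mixture_free_parameters(n_units, dim, q):
    """Assumed count for a mixture of local PCA units (not given in the text).

    Per unit: a center (dim values), q orthonormal principal directions
    (dim * q entries minus q * (q + 1) / 2 orthonormality constraints),
    q eigenvalues, and one residual variance.
    """
    directions = dim * q - q * (q + 1) // 2
    per_unit = dim + directions + q + 1
    return n_units * per_unit


print(mlp_free_parameters(12, 15, 10))     # 355
print(mixture_free_parameters(50, 22, 5))  # 6150
```

Under these assumptions, each mixture unit carries 123 parameters, so the mixture model has roughly 17 times as many free parameters as the MLP.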

On the standard training set, using a few principal components (*q* = 5), MPPCA-ext was better than NGPCA and NGPCA-constV. With the addition of two more components, however, MPPCA-ext performed worse than the NGPCA variants. It was also worse than NGPCA-constV on a second training set ('change set') using seven or more principal components. An explanation for this apparent weakness for large *q* might be the following: a higher *q* leads to a smaller residual variance. Thus, the width of a local Gaussian probability density in the direction of the minor components is smaller. The steeper decline of this density in these directions leads to a likelihood (which is the product of the probabilities of all data points) that is more sensitive to the positions of single data points. Thus, MPPCA-ext is less robust since it maximizes this likelihood.
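The sensitivity argument can be made explicit for a single local unit. The following is a sketch in assumed notation (none of these symbols appear in the text above): a unit in *d* dimensions has center $c$, orthonormal principal directions $w_1, \dots, w_q$, eigenvalues $\lambda_1, \dots, \lambda_q$, and residual variance $\sigma^2$, and its density is taken to be Gaussian with variance $\lambda_l$ along $w_l$ and $\sigma^2$ along each remaining direction.

```latex
\ln p(x) = -\frac{1}{2}\left[\,\sum_{l=1}^{q}\frac{y_l^2}{\lambda_l}
          + \frac{r^2}{\sigma^2}
          + \sum_{l=1}^{q}\ln\lambda_l
          + (d-q)\ln\sigma^2 + d\ln 2\pi \right],
\qquad
y_l = w_l^{\top}(x-c),\quad
r^2 = \lVert x-c\rVert^2 - \sum_{l=1}^{q} y_l^2 ,

\frac{\partial \ln p(x)}{\partial r} = -\frac{r}{\sigma^2}.
```

For a fixed residual distance $r$, the change of the log-likelihood with $r$ scales with $1/\sigma^2$. If a larger *q* shrinks $\sigma^2$, a small displacement of a single data point along the minor directions therefore changes the likelihood more strongly, which is consistent with the reduced robustness suggested above.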

The introduction of the change set did not bring a noticeable difference to the maximal performance of the abstract RNN. However, it served to demonstrate that a higher noise-to-signal ratio for some of the components of a training pattern can be counterbalanced by adding more principal components in the mixture model (if using NGPCA-constV).

In the change set, the local variation extended into nine additional dimensions. This number possibly arises from two facts. First, the relative change of an image vector has ten components. Second, the length of an image vector is almost constant (as shown in section 7.4), which restricts the number of dimensions the variance can extend to.

NGPCA-constV was better on the change set than NGPCA, because it did not produce units that have only a few patterns assigned to them. This matches the observation in chapter 6.

Once trained, the abstract RNNs could also map in the inverse direction, that is, from two successive sensory states onto the wheel velocities. Here, however, the error was high (about 20% of the velocity range). A possible explanation is the much larger number of input dimensions compared to output dimensions (20 to 2). Thus, as examined in section 4.6, the expected error is higher than for the forward direction (which maps from 12 to 10 dimensions).

Goal-directed motion planning requires a search in a high-dimensional motor space defined by a sequence of movements. Nothing is known about the structure of the objective function defined over this space. The fact that Powell's method, which is a local minimization method, performed similarly to simulated annealing suggests that the presented task has few local minima that are not global. Whether other environments have similar properties is not known.
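The search over a motor sequence through a chain of forward models can be illustrated with a toy example. The sketch below is not the setup of this study. It makes three assumptions: the forward model is hand-written differential-drive kinematics rather than a learned network, the state is a pose rather than an image representation, and the local optimizer is a simple coordinate search standing in for Powell's method. All names and constants (`forward_model`, `wheel_base`, the velocity limit) are hypothetical.

```python
import math


def forward_model(state, command, dt=1.0, wheel_base=0.4):
    """Toy stand-in for a learned forward model: pose + wheel velocities -> next pose."""
    x, y, theta = state
    v_left, v_right = command
    v = 0.5 * (v_left + v_right)               # forward speed
    omega = (v_right - v_left) / wheel_base    # turning rate
    theta_new = theta + omega * dt
    return (x + v * math.cos(theta_new) * dt,
            y + v * math.sin(theta_new) * dt,
            theta_new)


def simulate(state, commands):
    """Chain the forward model over a sequence of motor commands."""
    for command in commands:
        state = forward_model(state, command)
    return state


def cost(flat, start, goal):
    """Squared distance between the predicted final position and the goal."""
    commands = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
    x, y, _ = simulate(start, commands)
    return (x - goal[0]) ** 2 + (y - goal[1]) ** 2


def coordinate_search(flat, start, goal, step=0.2, min_step=1e-4, limit=0.3):
    """Local search: vary one motor value at a time, halve the step when stuck."""
    flat = list(flat)
    best = cost(flat, start, goal)
    while step > min_step:
        improved = False
        for i in range(len(flat)):
            for delta in (step, -step):
                trial = list(flat)
                trial[i] = max(-limit, min(limit, trial[i] + delta))
                c = cost(trial, start, goal)
                if c < best:
                    flat, best, improved = trial, c, True
        if not improved:
            step *= 0.5
    return flat, best


start, goal = (0.0, 0.0, 0.0), (0.5, 0.3)
initial = [0.0] * 8                            # four steps, two wheel velocities each
initial_cost = cost(initial, start, goal)
plan, final_cost = coordinate_search(initial, start, goal)
print(initial_cost, final_cost)
```

Because the search only ever accepts improvements, it can stop in a local minimum. Comparing its result with a global method such as simulated annealing is one way to probe how many such minima a task has.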

The squared error in the goal-directed movement task was higher than the one observed in the prediction of random test series (compare table 7.3 with figure 7.9). The explanation is that the robot could not always execute its given motor commands successfully. The four wheels occasionally caused the robot to stick to the floor during slow turns. Such trials were omitted in the collection of test samples; however, they could not be avoided during goal-directed movements^{7.3}.

The low error for the anticipation (below 2.4 pixels squared, see figure 7.9) allowed a successful application to mental transformation. Here, a series of covert motor commands was simulated. The temporal characteristic of these motor commands was the same as for their overt execution; this matches human-behavioral findings (Jeannerod, 2001).

The robot could detect the center of the circle of obstacles by simulating a turn around its rotational axis. The maximum distance of a location, classified as center, to the real center (10 cm) is low compared to the circle diameter (180 cm) and to the length of the robot (40 cm). The remaining inaccuracy might be attributed to prediction errors, and to deviations from perfect symmetry in the circle of obstacles.

The robot could also estimate the distance to an obstacle by simulating a straight-forward movement and predicting the first interval in which the activation in the frontal sector of the image representation was below a threshold. The number of this interval scaled with the real distance to the obstacle (figure 7.14). Since disproportionately more examples were collected for straight-forward movements than for turns, the prediction in this task was more accurate than on average. A good performance was achieved for up to 13 prediction steps.
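The distance estimate described above amounts to counting simulated steps until a threshold is crossed. The following minimal sketch assumes a forward model that is callable as `forward_model(state, command)` and a state given as a list of sector activations. The toy model, the threshold, and all values are hypothetical and only illustrate the counting procedure, not the learned network of this study.

```python
def steps_until_obstacle(state, forward_model, command,
                         frontal_index, threshold, max_steps=13):
    """Simulate a repeated command covertly and return the first interval in
    which the frontal activation falls below the threshold (None if never)."""
    for step in range(1, max_steps + 1):
        state = forward_model(state, command)
        if state[frontal_index] < threshold:
            return step
    return None


def toy_forward_model(state, command):
    """Hypothetical model: every activation drops by the commanded speed."""
    return [activation - command for activation in state]


near = steps_until_obstacle([5.0, 8.0, 8.0], toy_forward_model, 1.0,
                            frontal_index=0, threshold=1.5)
far = steps_until_obstacle([9.0, 8.0, 8.0], toy_forward_model, 1.0,
                           frontal_index=0, threshold=1.5)
print(near, far)  # 4 8
```

The returned interval number grows with the initial frontal activation, so it can serve as a relative distance measure. The `max_steps` limit reflects that predictions are only trusted over a bounded horizon.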

The judgment of relative distances based on sensorimotor integration relates to a psychological experiment by Sun et al. (2003). They showed that active movement improves vision-based estimation of path lengths. In their study, subjects rode an exercise bicycle while wearing a virtual-reality headset that presented a view along a corridor. Based on visual cues, the subjects had to estimate the length of the path traveled. The better the match between exercised movement and virtual movement, the more accurate the estimate was.

For the MLP and the abstract RNN, we observed that data points outside the sensory manifold were mapped back toward the sensory manifold. As shown in section 7.3.3, the restriction of predicted states to this manifold possibly explains why the prediction error increased more slowly than expected. Compared to the abstract RNN, the MLP was better at this backward mapping (figure 7.11 and section 7.4). This might explain why the MLP was more accurate.

2005-03-22