8.3 Pattern association

Based on the two strategies for the approximation of the data distributions (section 8.2), two pattern-association methods were developed (chapters 4 and 5). In both methods, an input pattern was the offset from zero of a constrained subspace within the sensorimotor space. The span of this subspace was the output space.

In the first method, the constrained space intersected the mixture of ellipsoids. This intersection could be computed analytically, and the components of the resulting point yielded the output pattern. In contrast to a gradient descent in a potential field built on top of the mixture of ellipsoids, this approach never ends in a local minimum and needs no additional parameters.
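The analytic recall can be illustrated with a minimal sketch for a two-dimensional pattern space. It assumes that recall selects the point on the constraint with the smallest Mahalanobis-type distance to any ellipsoid; the exact error measure of chapter 4 is not restated here, and the function name `recall`, the tuple layout, and the two example ellipsoids are hypothetical.

```python
def recall(ellipsoids, input_dim, value):
    """Complete a 2-D pattern whose component `input_dim` is clamped to `value`.

    Each ellipsoid is (center, precision); precision is the symmetric,
    positive-definite inverse covariance as a nested 2x2 tuple.
    Assumption: the winning ellipsoid is the one with the smallest
    squared Mahalanobis distance to the constraint line.
    """
    out_dim = 1 - input_dim
    best_dist, best_pattern = None, None
    for center, precision in ellipsoids:
        a_ii = precision[input_dim][input_dim]
        a_io = precision[input_dim][out_dim]
        a_oo = precision[out_dim][out_dim]
        d_in = value - center[input_dim]
        # Closed-form minimiser of the quadratic form over the free component.
        d_out = -a_io / a_oo * d_in
        # Value of the quadratic form at that minimiser.
        dist = (a_ii - a_io ** 2 / a_oo) * d_in ** 2
        if best_dist is None or dist < best_dist:
            pattern = [0.0, 0.0]
            pattern[input_dim] = value
            pattern[out_dim] = center[out_dim] + d_out
            best_dist, best_pattern = dist, pattern
    return best_pattern


# Two hypothetical ellipsoids that overlap in the first component,
# so a single input can have more than one valid output.
ellipsoids = [
    ((0.0, 1.0), ((2.0, 1.0), (1.0, 2.0))),
    ((0.0, -1.0), ((4.0, 0.0), (0.0, 4.0))),
]

forward = recall(ellipsoids, input_dim=0, value=0.5)   # first component given
inverse = recall(ellipsoids, input_dim=1, value=-1.0)  # second component given
```

In this sketch no iteration and no step size are involved, and the same set of ellipsoids serves both calls, with the roles of input and output swapped only at recall time.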

In the second method, a potential field was constructed. In feature space, the potential was the squared distance to the hyper-plane that approximated the data. A kernel function made it possible to compute the potential in the original space. In this space, a gradient descent along the constraint gave the output.
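The following is a rough sketch of such a potential field and of the constrained descent, not the implementation of chapter 5. It makes several assumptions that are not taken from the text: a Gaussian kernel with a fixed width, uncentered kernel PCA, power iteration for the eigenvectors, a numerical gradient, and a step size that is halved whenever a step fails to lower the potential. All names and the toy data set are hypothetical.

```python
import math


def kernel(a, b, width=0.5):
    """Gaussian kernel; kernel type and width are assumptions of this sketch."""
    sq = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq / (2.0 * width ** 2))


def top_eigenpairs(matrix, count, iterations=500):
    """Leading eigenpairs of a symmetric PSD matrix (power iteration, deflation)."""
    n = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(count):
        vec = [1.0 + 0.01 * i for i in range(n)]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(v * v for v in nxt))
            vec = [v / norm for v in nxt]
        value = sum(
            vec[i] * sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)
        )
        pairs.append((value, vec))
        # Deflate: remove the component that was just found.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def make_potential(data, components=4):
    """Squared feature-space distance to the span of the leading components."""
    gram = [[kernel(a, b) for b in data] for a in data]
    pairs = top_eigenpairs(gram, components)

    def potential(z):
        k_z = [kernel(x, z) for x in data]
        projected = sum(
            sum(a * k for a, k in zip(vec, k_z)) ** 2 / value
            for value, vec in pairs
        )
        return kernel(z, z) - projected

    return potential


def recall(potential, z, free_dims, step=0.1, iterations=200, eps=1e-4):
    """Gradient descent that changes only the output components `free_dims`."""
    z = list(z)
    for _ in range(iterations):
        grad = []
        for d in free_dims:
            hi, lo = z[:], z[:]
            hi[d] += eps
            lo[d] -= eps
            grad.append((potential(hi) - potential(lo)) / (2.0 * eps))
        trial = z[:]
        for d, g in zip(free_dims, grad):
            trial[d] -= step * g
        if potential(trial) < potential(z):
            z = trial          # accept the step
        else:
            step *= 0.5        # otherwise shrink the step size
    return z


# Toy data on a parabola; the first component is the input, the second the output.
data = [(i / 7.0 - 1.0, (i / 7.0 - 1.0) ** 2) for i in range(15)]
potential = make_potential(data)
start = [0.5, 0.8]
end = recall(potential, start, free_dims=[1])
```

Unlike the analytic recall of the first method, this descent needs a starting point, a step size, and an iteration count, and it may stop in a local minimum of the potential.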

Both methods have two advantages over feed-forward networks. First, input and output dimensions can be chosen *after* training. In particular, for sensorimotor models, the association works in both the forward and the inverse direction. Second, the association does not fail if the training set contains redundant output patterns for a single input. The first advantage was demonstrated on the completion of images (section 4.4.1), on the kinematic arm model (section 4.5), and by using the sensorimotor model for the mobile robot as an inverse model (section 7.3.2). The second advantage was demonstrated on the kinematic arm model (section 4.5) and on the robot arm (chapter 6). The robot arm could recall a suitable arm posture to grasp an object seen in the camera, even though redundant arm postures existed.

These two advantages were also demonstrated in recurrent neural networks (Steinkühler and Cruse, 1998). Therefore, we called the first method an abstract recurrent neural network (abstract RNN). The argument also holds for the second method. However, to distinguish the two methods clearly, only the first was called an abstract RNN.

The recall with the mixture model was more accurate and faster than with kernel PCA (sections 5.3.2 and 6.3). Thus, the emphasis was on the mixture model. Nevertheless, kernel PCA demonstrated that with increasing dimensionality, a data distribution can be described better with a linear model. This matches two further observations. First, a single PCA did well on the completion of images, which had more dimensions than the sensorimotor data in the other tasks (section 4.4). Second, in the robot-arm experiment, the mixture of local PCA did better on a higher-dimensional training set, which comprised population-coded variables (chapter 6).

2005-03-22