This thesis introduces unsupervised learning algorithms for arbitrary sensorimotor associations. The experimental and mathematical understanding of these algorithms will be given considerable space. Two robot setups--a robot arm and a mobile robot--serve as a test bed for our sensorimotor theory of perception of space and shape, specifically for the learning and application of visuomotor models.

The remainder of the **Introduction** focuses on two points: the background of the sensorimotor approach to perception and the description of existing learning techniques. The first point covers experiments that show an influence of action on perception and reviews hypotheses on perception based on sensorimotor models. The second point covers artificial neural networks and their application to sensorimotor models. These networks comprise feed-forward networks, recurrent networks, and Kohonen's self-organizing maps (Kohonen, 1995).

**Chapter 2** reviews existing methods related to the new unsupervised learning techniques. The collected sensorimotor data form a distribution of points in a high-dimensional space, and the goal of learning is to find an approximation of this distribution. Two strategies are used: first, fitting a mixture of ellipsoids to the data, and second, mapping the data to a higher-dimensional space in which they can be approximated by a single hyperplane. The first strategy combines vector quantization with principal component analysis (PCA). Here, PCA is restricted to a region within the distribution; hence, it is called local PCA. A mixture of local PCA relates to a probability density model of the data (Bishop, 1995). The second strategy is based on a non-linear extension of PCA called kernel PCA (Schölkopf et al., 1998b).
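The first strategy can be illustrated with a minimal sketch, assuming a plain k-means style vector quantizer followed by an eigendecomposition of each cell's covariance matrix. The function name `local_pca`, its parameters, and the returned dictionary keys are hypothetical and chosen only for this illustration; the methods reviewed in chapter 2 may differ in detail.

```python
import numpy as np

def local_pca(data, init_centers, n_components, n_iter=20):
    """Hard vector quantization followed by PCA inside each cell (illustrative)."""
    centers = np.array(init_centers, dtype=float)

    def assign(current_centers):
        # Index of the nearest center for every data point.
        dists = np.linalg.norm(data[:, None, :] - current_centers[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    labels = assign(centers)

    units = []
    for j in range(len(centers)):
        members = data[labels == j]  # assumes every cell holds at least two points
        cov = np.cov(members, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1][:n_components]
        units.append({
            "center": centers[j],
            "axes": eigvecs[:, order],      # principal directions as columns
            "variances": eigvals[order],    # variance along each direction
        })
    return units
```

Each returned unit describes one ellipsoid: its center, its principal axes, and the variance along each axis.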

**Chapter 3** describes new algorithms to determine the parameters of a mixture of local PCA from a data distribution. One algorithm combines the vector quantizer Neural Gas (Martinetz et al., 1993) with local PCA. Another algorithm extends the mixture of probabilistic PCA (Tipping and Bishop, 1999) such that it can cope with sparse distributions, like typical sensorimotor data. Both algorithms are tested on synthetic data and on the classification of hand-written digits.
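For orientation, the rank-based center update that characterizes Neural Gas can be sketched as follows. This is only the standard vector-quantizer step; the extension to local PCA developed in chapter 3 is not shown. The names `neural_gas_step`, `eps` (learning rate), and `lam` (neighborhood range) are assumptions made for this example.

```python
import math

def neural_gas_step(centers, x, eps, lam):
    """One Neural Gas update for a single sample x; returns the new centers."""
    # Rank all units by their squared distance to x (rank 0 = closest unit).
    dists = [sum((c - xi) ** 2 for c, xi in zip(center, x)) for center in centers]
    order = sorted(range(len(centers)), key=lambda i: dists[i])
    ranks = [0] * len(centers)
    for rank, i in enumerate(order):
        ranks[i] = rank

    updated = []
    for center, rank in zip(centers, ranks):
        # The step size decays exponentially with the rank of the unit.
        step = eps * math.exp(-rank / lam)
        updated.append([c + step * (xi - c) for c, xi in zip(center, x)])
    return updated
```

Because every unit moves by an amount that depends on its rank rather than on a fixed grid neighborhood, the centers spread over the data without a predefined topology.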

**Chapter 4** describes a novel pattern-association method that builds on the mixture of local PCA from chapter 3. Input and output are parts of a data point in the sensorimotor space. In this space, the input portion of a data point is the offset from zero of a constrained subspace. The intersection of this subspace with the mixture of ellipsoids gives a completed data point, which yields the output portion in its components.
The method resembles a recurrent neural network, since input and output components can be chosen *after* learning, and arbitrary distributions can be approximated instead of just functions. The new method learns to complete images and learns the kinematics of a simulated robot arm with redundant degrees of freedom. The latter task demonstrates the advantage over feed-forward networks. In addition, the dependence on the number of input dimensions will be analyzed experimentally and theoretically.
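The recall step can be sketched under simplifying assumptions: each ellipsoid is given by a mean and a full covariance matrix, and the completed point is the point of the constrained subspace with the smallest Mahalanobis distance to any ellipsoid. The function `complete` and its arguments are hypothetical; the distance measure and the selection rule actually used in chapter 4 may differ.

```python
import numpy as np

def complete(ellipsoids, in_idx, in_values):
    """Complete a data point from the components listed in in_idx.

    ellipsoids: list of (mean, covariance) pairs (illustrative representation).
    """
    dim = len(ellipsoids[0][0])
    in_idx = list(in_idx)
    out_idx = [i for i in range(dim) if i not in in_idx]
    x = np.asarray(in_values, dtype=float)

    best_dist, best_point = np.inf, None
    for mean, cov in ellipsoids:
        mean = np.asarray(mean, dtype=float)
        cov = np.asarray(cov, dtype=float)
        dx = x - mean[in_idx]
        c_xx = cov[np.ix_(in_idx, in_idx)]
        c_yx = cov[np.ix_(out_idx, in_idx)]
        w = np.linalg.solve(c_xx, dx)
        # Smallest squared Mahalanobis distance reachable within the subspace.
        dist = float(dx @ w)
        if dist < best_dist:
            point = np.empty(dim)
            point[in_idx] = x
            # Output components at which that minimum is attained.
            point[out_idx] = mean[out_idx] + c_yx @ w
            best_dist, best_point = dist, point
    return best_point
```

Since `in_idx` is an argument of the recall rather than of the training, the same set of ellipsoids can be queried in either direction, which is the property that distinguishes this approach from a feed-forward mapping.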

**Chapter 5** introduces an alternative pattern-association method, which is based on kernel PCA. A subspace spanned by the principal components of the distribution's mapping into an infinite-dimensional space serves as a representation of the data. Here, recall is a descent in a potential field, and its region of attraction is the principal subspace. With the help of a kernel function, all computation can be done in the original space. The new pattern-association method is tested on synthetic data and on the kinematic arm model from chapter 4.
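One plausible form of such a potential is the squared reconstruction error in feature space, which can be evaluated with kernel values alone. The sketch below assumes a Gaussian kernel and a fixed number of principal components; the function names, the dictionary layout, and the parameter `sigma` are illustrative assumptions, and the descent itself is not shown.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Kernel matrix between the rows of a and the rows of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_kernel_pca(data, sigma, n_components):
    K = gaussian_kernel(data, data, sigma)
    row_mean = K.mean(axis=1)
    total_mean = K.mean()
    # Center the mapped data in feature space.
    Kc = K - row_mean[:, None] - row_mean[None, :] + total_mean
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Scale coefficients so that each principal axis has unit length;
    # assumes the selected eigenvalues are strictly positive.
    alphas = eigvecs[:, order] / np.sqrt(eigvals[order])
    return {"data": data, "sigma": sigma, "row_mean": row_mean,
            "total_mean": total_mean, "alphas": alphas}

def potential(model, z):
    """Squared feature-space distance of z from the principal subspace."""
    z = np.atleast_2d(np.asarray(z, dtype=float))
    kz = gaussian_kernel(z, model["data"], model["sigma"])[0]
    kz_centered = kz - kz.mean() - model["row_mean"] + model["total_mean"]
    projections = model["alphas"].T @ kz_centered
    # Squared length of the centered image of z; k(z, z) = 1 for this kernel.
    sq_norm = 1.0 - 2.0 * kz.mean() + model["total_mean"]
    return float(sq_norm - projections @ projections)
```

The potential is low near the training data and rises away from it, so a descent on the free components of a partially given pattern moves the pattern toward the learned distribution.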

In **chapter 6**, the pattern-association methods from chapters 4 and 5 are applied to a visuomotor model for a robot arm. The robot is equipped with a two-finger gripper and a camera. The task of the robot is to grasp an object by associating an image of the object with an arm posture. Image processing is necessary and mimics biological functions. Furthermore, it proved advantageous to use a population coding for the joint angles, the object's position, and its orientation within the image. The experiment shows that the robot can perceive an object's position and orientation in space by simulating an arm posture suitable to grasp the object.
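A population code represents a single scalar by the activations of several units with different preferred values. The sketch below assumes Gaussian tuning curves and a center-of-mass readout; both choices, and the names `encode`, `decode`, `preferred`, and `width`, are assumptions for illustration. A circular quantity such as an orientation would additionally need wrap-around handling, which is omitted here.

```python
import math

def encode(value, preferred, width):
    """Activation of each unit for the given scalar value."""
    return [math.exp(-0.5 * ((value - p) / width) ** 2) for p in preferred]

def decode(activations, preferred):
    """Read the value back as the activation-weighted mean of preferred values."""
    total = sum(activations)
    return sum(a * p for a, p in zip(activations, preferred)) / total

# Eleven units with preferred values spread evenly over [0, 1].
preferred = [i / 10 for i in range(11)]
```

Encoding a joint angle normalized to [0, 1] this way turns one input dimension into eleven, with neighboring values producing overlapping activation patterns.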

**Chapter 7** presents a forward model for a mobile robot with an omni-directional camera. The model predicts the sensory consequence of a motor command. By anticipating the effect of a sequence of motor commands, the robot can either select actions that lead to defined goal states or use the simulation of actions to estimate its location in space and its distance to obstacles. In learning the sensorimotor model, a multi-layer perceptron proved to be better than the newly developed pattern-association method described in chapter 4. The reason for this difference is explored.
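The idea of anticipation by chaining a forward model can be sketched with a toy stand-in. In the thesis the model is learned and operates on camera data; here `toy_forward_model` is a hand-written pose update used purely for illustration, and the exhaustive search in `select_sequence` is likewise an assumption, not the procedure of chapter 7.

```python
import itertools
import math

def toy_forward_model(state, command):
    """Hypothetical stand-in: predicts the next pose (x, y, heading)."""
    x, y, heading = state
    distance, turn = command
    heading = heading + turn
    return (x + distance * math.cos(heading),
            y + distance * math.sin(heading),
            heading)

def anticipate(forward_model, state, commands):
    """Feed each prediction back into the model to simulate a command sequence."""
    for command in commands:
        state = forward_model(state, command)
    return state

def select_sequence(forward_model, state, command_set, length, goal):
    """Pick the command sequence whose anticipated end position is closest to goal."""
    best = None
    for sequence in itertools.product(command_set, repeat=length):
        x, y, _ = anticipate(forward_model, state, sequence)
        error = math.hypot(x - goal[0], y - goal[1])
        if best is None or error < best[0]:
            best = (error, sequence)
    return best[1]
```

Because each prediction serves as the input of the next step, prediction errors can accumulate along the chain, which is why the length of the simulated sequence matters in practice.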

**Chapter 8** sums up the results and relates them to each other. **Appendix A** describes some common statistical tools. Some of the algorithms used are presented in **Appendix B**. Mathematical proofs can be found in **Appendix C**. **Appendix D** shows samples from a database of hand-written digits, and **Appendix E** contains lists of notations, symbols, and abbreviations.

Chapters 3 to 7, as well as appendix B.3 and appendix C, contain the contributions of this work. Parts of this research have been published previously. These parts are:

- Sections 3.2.1 and 3.4: the extension of Neural Gas to local PCA and its application to digit classification (Möller and Hoffmann, 2004).
- Sections 4.2 and 4.5: the pattern recall based on a mixture model and its application to a kinematic arm model (Hoffmann and Möller, 2003).
- Sections 6.2.2 and 6.2.3: some of the methods for the robot-arm experiments, namely the data collection (slightly different version) and the image processing that extracts the orientation of an object (Schenck et al., 2003).
- Chapter 7: the anticipation based on a multi-layer perceptron, the goal-directed movements, and the estimate of the robot's location (Hoffmann and Möller, 2004).
- Appendix C.3: the theoretical prediction of the error accumulation for the anticipation task in chapter 7 (Hoffmann and Möller, 2004).

2005-03-22