Below is the unedited preprint (not a quotable final draft) of:
Houk, J.C., Buckingham, J.T., & Barto, A.G. (1996). Models
of the cerebellum and motor learning.
Behavioral and Brain Sciences 19 (3), 368-383.
The final published draft of the target article, commentaries and
Author's Response are currently available only in paper.
Lesion studies carried out in the 19th century demonstrated that the cerebellum is important for coordinating movements (Florens, 1824). However, mechanistic models of the cerebellum awaited an analysis of the histology of the cerebellum (Braitenberg & Atwood, 1958), and combined analyses of histology and electrophysiology (Marr, 1969; Albus, 1971). The clear orthogonal relationships between parallel and climbing fibers and the dendritic trees of Purkinje cells convinced Braitenberg that the cerebellum functions as a timing organ. He viewed the parallel fibers as delay lines and climbing fibers as clock read-out mechanisms, with Purkinje cells firing only when there was a coincidence of a parallel fiber volley and a climbing fiber activation. In one of his examples, a Purkinje cell innervating an antagonist muscle fired at an appropriate time delay to terminate a movement at its intended target. Even though parallel fibers have small diameters and low conduction velocities, the time delays only amount to a few milliseconds, rather short for most problems in motor control. Another limitation of this theory is its requirement for foci of synchronized activation in the granular layer. For these and other reasons, Braitenberg's timing theory has not been as vigorously pursued as the learned pattern recognition theories developed subsequently by Marr (1969) and Albus (1971).
Like Braitenberg, Marr and Albus were impressed by the anisotropic structure of the cerebellar cortex. In constructing their information processing theories, they also incorporated several additional anatomical features, for example the marked differences in the convergence ratios of parallel (~100,000:1) and climbing (1:1) fibers onto Purkinje cells. Instead of serving as delay lines, parallel fibers were seen as providing large vectors of potential input, transmitting a diverse array of information. The climbing fibers, instead of serving as read-out devices, functioned as training signals that adjusted the synaptic weights of parallel fiber synapses, thus teaching Purkinje cells to recognize specific patterns signaled by their input vectors. Marr, Albus, and Braitenberg alike assumed that individual Purkinje cells control elemental movements that are evoked when the Purkinje cell fires or pauses. Marr suggested that the cerebral cortex, by activating specific climbing fiber inputs, would train the cerebellum to recognize appropriate contexts for emitting the same movements in a more automatic fashion. Albus instead suggested that climbing fibers would signal errors, training the Purkinje cells to select movements that would reduce these errors. Many subsequent models have been based on extensions of these basic ideas.
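The idea that climbing fibers act as training signals for parallel fiber synapses can be illustrated with a minimal perceptron-style sketch. It is not the learning rule of either Marr or Albus. The firing threshold, the learning rate, the signed error term standing in for the climbing fiber signal, and all function names are assumptions made for illustration.

```python
def responds(weights, pf_pattern, threshold=0.5):
    """Return 1 if the weighted parallel fiber input exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, pf_pattern))
    return 1 if total > threshold else 0


def train_purkinje_unit(patterns, desired, epochs=20, lr=0.1):
    """Adjust parallel fiber weights using a training signal (assumed rule)."""
    weights = [0.0] * len(patterns[0])
    for _ in range(epochs):
        for pf_pattern, want in zip(patterns, desired):
            # Hypothetical stand-in for the climbing fiber training signal.
            error = want - responds(weights, pf_pattern)
            # Only synapses of active parallel fibers are modified.
            for i, x in enumerate(pf_pattern):
                weights[i] += lr * error * x
    return weights


# Three illustrative parallel fiber patterns; only the first should be recognized.
PATTERNS = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0]]
DESIRED = [1, 0, 0]
WEIGHTS = train_purkinje_unit(PATTERNS, DESIRED)
```

After training, the unit responds to the first pattern and stays silent for the other two, including the one that shares an active fiber with it.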
In this article, cerebellar models are assessed on the basis of computations internal to the cerebellar cortex and also in relation to their function in movement control. Figure 1 summarizes the position of the cerebellar cortex in the global scheme of movement control, as a regulator of premotor networks. While Florens (1824) had realized that the cerebellar cortex only regulates movements, rather than actually controlling them, it was more than a century later when Ito's (1969) electrophysiology explained why this was so. He demonstrated that inhibition is the sole action of the Purkinje cells that project out of the cerebellar cortex. The cerebellar cortex exerts its influence by inhibiting and disinhibiting motor control actions that are formulated elsewhere, in the premotor networks of the brainstem, sensorimotor cortex and spinal cord. How does the cerebellar cortex perform these regulatory functions? We should answer this operational question prior to addressing the problem of motor learning, because a knowledge of operational principles will indicate, or at least provide insight about, what it is that Purkinje cells in the cerebellar cortex must actually learn. This places us in a better position to evaluate theories of how Purkinje cells are trained by climbing fiber signals originating in the inferior olive.
This article deals generally with diverse models of the cerebellum and motor learning, although it emphasizes the adjustable pattern generator (APG) model that has been the focus of our research. We begin, in Section 1, with a discussion of the properties of premotor networks, since this helps to specify what the cerebellar cortex may be attempting to regulate. In Section 2 we discuss the operational features of several models of the cerebellum, analyzing the mechanisms whereby the cerebellar cortex is presumed to regulate different types of motor response. Then, in Section 3, we discuss proposed roles of synaptic plasticity and the training signals sent from the inferior olive in guiding processes of motor learning.
1. Premotor Networks
What is the cerebellum regulating? This is one of the key questions that needs to be answered by a theorist attempting to model the cerebellum. Since the output actions of cerebellar cortex are exclusively inhibitory, the cerebellar nuclear cells and vestibular neurons targeted by Purkinje inhibition must be regulated by a disinhibitory action. For this to work, nuclear cells need to have something to inhibit, either pacemaker activity or excitatory input from another source. In his original paper, Marr (1969) ignored this problem and simply assumed that the nervous system would somehow convert the inhibitory outputs of the Purkinje cells that had been selected through pattern recognition into an appropriate set of elemental movements. Albus (1971) proposed that Purkinje cells are trained to pause, rather than to fire, and that these pauses select elemental commands controlled in some unspecified manner by the individual nuclear cells. In both cases, the theory on the output side of cerebellum was not very well developed.
1.1 Limb premotor network
The problem of what is being regulated by the cerebellar cortex was briefly addressed in a follow-up article by Blomfield and Marr (1970). Their proposal was based on Tsukahara's finding (Tsukahara, Korn & Stone, 1968) of an excitatory recurrent pathway between the motor cortex and the cerebellar nucleus. Blomfield and Marr postulated that the motor cortex, through some unspecified mechanism, initiates limb movements by firing a large set of cortical neurons that tend to command many more elemental movements than are actually needed, and that the continuation of these commands is dependent on positive feedback circulating through the cerebellar nucleus. The role of Purkinje cells was seen as one of eliminating, through inhibition of nuclear relays, those commands that are not needed. While the concept of positive feedback in limb premotor networks was pursued experimentally by Tsukahara, it did not receive much theoretical attention until recently (Houk 1989; Eisenman et al 1991; Houk et al 1993), the exception being Boylls' thesis (1975) and some commentary in reviews (Ito, 1970; Arbib et al 1974). Now, however, there is a substantial body of literature, reviewed in Houk, Keifer & Barto (1993), defining the anatomy and physiology of the limb premotor network and supporting the concept that the spread of positive feedback through this recurrent network is critical in limb command generation.
Figure 2 summarizes the known anatomy of the limb premotor network and its cerebellar cortical input. Since the interconnections between nuclei are topographically organized, we postulated that this diagram also applies at the level of microcircuitry, thus defining a modular architecture for generating the elemental movement commands envisioned by Marr and Albus. A band of Purkinje cells (PCs) converges on a small cluster of nuclear cells (N) that participates in a topographically organized recurrent circuit that includes thalamic (T), motor cortical (M), rubral (R), pontine (P) and lateral reticular (L) neurons. Elemental commands are generated by positive feedback transmitted through this network and are then sent to the spinal cord in corticospinal and rubrospinal fibers. In contemplating the composite motor command transmitted by all the corticospinal and rubrospinal fibers, one needs to consider a thousand to a million replications of this module. We envision that the individual elements in this large array of modules function in a partially autonomous manner, but that they also interact with each other through divergence in their loop connections (Houk et al, 1993). When this modular concept is combined with simple physiological assumptions, the resultant model provides plausible answers to several basic questions about the limb motor system.
For example, why are the movement-related discharges recorded from N, T, M, R, P and L neurons so similar to each other? We believe that this is because elemental commands are generated as a collective computation in this cortico-rubro-cerebellar recurrent network (Houk et al, 1993). An elemental command is initiated when positive feedback causes activity to intensify nearly simultaneously at all stages in a given loop. The command is terminated when many of the module's PCs fire to inhibit N activity, whereupon activity at the other stages also dies out. In this manner an elemental command is expressed similarly at each of the nuclear stages.
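The initiation and termination of an elemental command by positive feedback and Purkinje cell inhibition can be sketched with a single rate unit standing in for one recurrent loop. This is a deliberately reduced illustration, not the authors' simulation: the loop gain, time constant, brief excitatory kick, rectified saturating activation, and the name `simulate_loop` are all assumptions.

```python
def simulate_loop(pc_inhibition, steps=200, dt=0.001, gain=1.5, tau=0.02, kick=0.1):
    """Simulate one recurrent loop as a single rate unit (Euler integration).

    pc_inhibition: function of the step index returning the Purkinje cell
    inhibition applied to the loop at that step.
    """
    rate = 0.0
    trace = []
    for k in range(steps):
        external = kick if k < 20 else 0.0          # brief excitatory input
        drive = gain * rate + external - pc_inhibition(k)
        target = min(max(drive, 0.0), 1.0)          # rectify and saturate
        rate += dt / tau * (target - rate)
        trace.append(rate)
    return trace


# PCs silent: positive feedback intensifies and sustains the command.
free = simulate_loop(lambda k: 0.0)
# PCs fire from step 100 onward: the command is terminated.
stopped = simulate_loop(lambda k: 2.0 if k >= 100 else 0.0)
# Tonic PC discharge: positive feedback never gets started.
blocked = simulate_loop(lambda k: 0.5)
```

With a loop gain above one, activity outlasts the brief input, and it dies out once the inhibition exceeds the recurrent drive.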
What mechanisms are responsible for recruiting the large number of motor cortical and red nucleus neurons that participate in any given movement? We believe that commands become distributed to the population of cortical and rubral cells through divergence in modular loop connections. In this manner, positive feedback spreads from one module to another. This mechanism of progressive spread can explain the gradual rotations of population vectors in the motor cortex that occur when an animal is required to mentally rotate a visual target (Eisenman et al, 1991). A progressive spread of positive feedback also explains the common observation that reaction times are much longer than expected from the conduction delays from sensory receptors, through motor cortex, back to motor neurons. The extra delay is explained by the need for positive feedback to intensify and spread through the limb premotor network (Houk, 1989).
What is the function of the sensory response properties of cells in motor cortex and red nucleus? A sensory response to a discrete somatosensory stimulus is relatively weak and is confined to a relatively small population of neurons under conditions in which the subject ignores the stimulus, presumably because spontaneous PC discharge prevents the initiation of positive feedback. In contrast, when the subject wants to use a stimulus as a cue to initiate a movement, we postulate that PCs are turned off in preparation for the movement. This allows the sensory stimulus to be amplified by positive feedback in the cortico-rubro-cerebellar network, thus initiating the population activity that produces the movement (Sarrafizadeh, 1994).
1.2 Eye premotor networks
Recurrent pathways are also prevalent in the premotor networks that control eye movements. There are separate premotor networks for controlling smooth and saccadic eye movements and for horizontal and vertical components of each. Figure 3 is a summary diagram of the network that controls smooth horizontal eye movements, based on a conglomeration of several published models (Galiana & Outerbridge, 1984; Cannon & Robinson, 1987; Peterson & Houk, 1991). This network receives vestibular sensory input from the semicircular canals and generates eye movement commands. On each side of the brainstem, neurons in the medial vestibular nucleus (V) are interconnected with prepositus hypoglossi neurons (P) and with intermediate types (PV). In addition, the networks on the two sides of the brainstem are interconnected through a recurrent inhibitory pathway. V signals are dominated by velocity coding, P signals by position coding, and many cells (e.g., PV) combine position and velocity coding. PCs in flocculus and posterior vermal regions of the cerebellar cortex inhibit some, though not all, of this family of neurons. The output commands from both sides of the brain converge upon ocular motor neurons to control the horizontal component of smooth pursuit and optokinetic eye movements and the vestibulo-ocular reflex.
Along with many obvious similarities, there are two important differences between this smooth eye network and the limb network discussed in the previous section. One is that the smooth eye network operates more or less continuously in its active state instead of making the many transitions from inactive to active states that are characteristic of the limb network. This fits with the need to control eye position continuously to stabilize the visual world as opposed to the need to execute discrete limb movements to grasp and manipulate objects. A second important difference concerns coupling between corresponding networks on the two sides of the brain. Movements of the two eyes need to be coupled to ensure fused binocular images, whereas the left and right limbs can be operated independently in manipulation tasks. Coupling between the eye networks involves reciprocal inhibitory pathways that function as additional positive feedback loops. Modeling studies illustrate how this coupling could perform important operational and adaptive functions (Galiana 1985). The postulated operational function is to promote the conversion of vestibular head velocity sensations into eye position commands (an integration in the mathematical sense), and the adaptive function is to provide a sensitive site for regulating the gain of the vestibulo-ocular reflex. PCs in the cerebellar cortex are well situated to regulate the signals generated by this network, thus controlling integration and other dynamics. PCs also regulate the intensity of responses to vestibular input, thus modifying the effective gain of the vestibulo-ocular reflex.
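How positive feedback can turn a velocity signal into a held position signal can be shown with a one-unit abstraction. It collapses the bilateral network into a single leaky unit with a recurrent gain, which is a strong simplification and not the Galiana model. The time constant, pulse shape, gain values, and function names are assumptions.

```python
def effective_time_constant(tau, feedback_gain):
    """Time constant of a leaky unit whose output is fed back with gain < 1."""
    return tau / (1.0 - feedback_gain)


def run_network(feedback_gain, tau=0.005, dt=0.001, steps=1000):
    """Response of a leaky unit with positive feedback to a velocity pulse."""
    x = 0.0
    out = []
    for k in range(steps):
        velocity = 1.0 if k < 100 else 0.0     # 100 ms head velocity pulse
        x += dt / tau * (-x + feedback_gain * x + velocity)
        out.append(x)
    return out


no_feedback = run_network(feedback_gain=0.0)      # output tracks velocity
integrating = run_network(feedback_gain=0.999)    # output holds after the pulse
```

Without feedback the output returns to zero as soon as the pulse ends. With a recurrent gain close to one, the effective time constant lengthens from 5 ms to about 5 s in this sketch, so the output approximates the integral of the velocity input. A regulatory input that altered the recurrent gain would therefore change the quality of integration.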
Other examples of premotor networks could be highlighted, each having its characteristic processing mode and function. However, these two examples are adequate to illustrate the variety of collective computations that can be performed in a rapid and automatic fashion by the highly interconnected architecture of premotor networks. Now we turn to models of how these collective computations are operated upon by the regulatory actions of a cerebellar cortex.
2. Cerebellar cortex
How does the cerebellar cortex organize its internal computations so as to regulate the many recurrent pathways in premotor networks? Although the cerebellum has a remarkably uniform structure, the mechanisms of regulation may nevertheless differ, depending on the type of premotor network and movement that is being regulated. For example, the regulation of networks, such as the limb premotor network, that make transitions between passive and active states to control discrete limb movements may differ from the regulation of networks, such as the smooth eye network, that operate continuously to control smooth eye movements. We begin by considering models for regulating various types of discrete movements.
Elemental commands recorded from motor cortical and red nucleus neurons, M and R in Figure 2, generally take the form of an intense burst of discharge, occurring at a frequency that corresponds to the velocity of the movement and having a duration that corresponds to the duration of the movement (cf. Houk et al 1993). In addition to these phasic components, the neurons may also show tonic components. The elemental commands just described are seen in isotonic movement tasks, whereas analogous relations to force rate and force occur in isometric tasks. PCs recorded under similar conditions may show either bursts of discharge or pauses (cf. Thach et al, 1992). It is reasonable to assume that positive feedback is permitted to intensify in the loops that are regulated by pausing PCs, and that these loops generate commands to agonist muscles. Positive feedback would be inhibited from spreading to loops regulated by bursting PCs, and we assume that the latter loops are thus inhibited from generating commands to antagonist muscles. The question of how the cerebellar cortex exerts its control thus translates into a problem of understanding how appropriately timed bursts and pauses of PC discharge may be generated.
As might be anticipated, a variety of models of the cerebellar cortex are capable of explaining how these bursts and pauses could be generated. Therefore, it is helpful to have some constraints in addition to those already imposed by the basic anatomical and physiological properties of the network. In the following section we highlight a few particularly critical constraints that derive from studies of motor performance.
2.1 Additional constraints from experimental psychology
The concept that movement commands are centrally specified and are then executed in essentially an open-loop manner has evolved from a long line of studies in experimental psychology (cf. Schmidt, 1988). According to this motor program concept, the system operates in a feed-forward mode, as opposed to using sensory feedback from the periphery. Instead of providing feedback during the movement, sensory information is used to select the parameters of a motor program before it is initiated, to initiate the program, and to guide the subsequent adaptive process that mediates motor learning.
Although most investigators find sensory feedback to be ineffective in modifying ongoing motor programs, except in limited ways, Adams (1971; 1977) maintains that the endpoint of a movement is sensitive to proprioception. In support of Adams' conclusion, we found that the offsets of elemental motor commands recorded from the red nucleus can be appreciably prolonged by using a brake to prevent the animal from reaching the intended target (Houk & Gibson, 1987). We also found clear support for the concept that the program operates mostly open loop, since application of the brake had relatively minor effects on discharge frequency during the burst.
Another key constraint from performance studies is the observation that the initiation and the programming of movement are separate asynchronous processes (Gielen & van Gisbergen, 1990; Ghez et al 1990). Programming can occur in preparation for a movement, at the time of a movement, or after a "default" movement has been initiated. We interpret this to mean that a decision to perform an action corresponds to the buildup of positive feedback in the limb premotor network, whereas the regulatory input to this network from the cerebellar cortex implements decisions about what movement to perform (Houk et al, 1993). Thus, in considering models of the cerebellar cortex, we will emphasize their capacity to regulate the metrics of elemental commands as opposed to the initiation of these commands. This fits well with the fact that cerebellar cortical lesions do not interfere with starting movements but instead result in dysmetria.
2.2 Re-evaluation of Marr and Albus models
It is instructive to re-evaluate the Marr (1969) and Albus (1971) models in light of the above considerations. Marr's idea, or better the Blomfield and Marr (1970) revision of it, was that PCs would fire in response to learned input patterns and this would inhibit unneeded elemental movements. This does not explain the pauses in PC discharge that release nuclear cell discharge. Pauses might be explained by assuming that input from inhibitory interneurons predominates over excitation to these particular PCs. Another issue is that both excitatory responses and pauses of PCs need to be more than a simple selection signal, since prolonged bursts and pauses are needed to inhibit and release the portions of the premotor network that control antagonist and agonist muscles. Dynamics that might produce extended bursts and extended pauses are not included in Marr's model. If one wishes to pursue the Marr model further, these various features need to be elaborated, and their performance implications need to be explored through simulation. Surprisingly, there was no actual simulation of Marr's model until quite recently (Tyrrell & Willshaw, 1992). The latter authors found that much was unspecified in the original model. After providing these specifications, they were able to verify a number of Marr's predictions at an information processing level. However, they have not yet attempted to use their simulation to drive a premotor network, let alone to control a movement.
The situation is somewhat different for Albus' (1971) theory, since early on it was developed into an executable model called the "Cerebellar Model Articulation Controller," or CMAC (Albus, 1975). CMAC has been developed to the point of using it to control actual robotic manipulators (Miller et al, 1987, 1990). Because of these analyses, simulations and applications, the CMAC architecture is now well understood. To summarize its operating principles, it essentially functions as a static associative memory that implements locally generalizing, nonlinear maps between mossy fiber inputs and PC outputs (Albus, 1981). It does this by first treating the granule/Golgi cell network as an association layer that generates a sparse, expanded representation of mossy fiber input, and second by using adjustable weights to couple the large parallel fiber vector to PC output units with graded properties. It would not be difficult for the bursts and pauses observed in PC recordings to be simulated by this architecture.
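The two operating principles summarized above, a sparse expanded input representation followed by adjustable weights, can be sketched as a one-dimensional tile-coding memory. This is a generic illustration in the spirit of CMAC rather than Albus' published formulation. It omits hashing, and the number of tilings, tile width, learning rate, and class name are assumptions.

```python
class TileCodingMemory:
    """1-D associative memory built from several offset tilings (CMAC-like)."""

    def __init__(self, n_tilings=8, tile_width=0.2, x_min=0.0, x_max=1.0, lr=1.0):
        self.n_tilings = n_tilings
        self.tile_width = tile_width
        self.x_min = x_min
        self.lr = lr
        n_tiles = int((x_max - x_min) / tile_width) + 2
        self.weights = [[0.0] * n_tiles for _ in range(n_tilings)]

    def active_tiles(self, x):
        """Sparse expanded code: one active tile per tiling."""
        tiles = []
        for t in range(self.n_tilings):
            offset = t * self.tile_width / self.n_tilings
            tiles.append(int((x - self.x_min + offset) / self.tile_width))
        return tiles

    def predict(self, x):
        """Output is the sum of the weights of the active tiles."""
        return sum(self.weights[t][i] for t, i in enumerate(self.active_tiles(x)))

    def train(self, x, target):
        """Spread the output error equally over the active weights."""
        step = self.lr * (target - self.predict(x)) / self.n_tilings
        for t, i in enumerate(self.active_tiles(x)):
            self.weights[t][i] += step


memory = TileCodingMemory()
memory.train(0.51, 1.0)          # store a single input-output pair
at_sample = memory.predict(0.51)
nearby = memory.predict(0.56)    # shares some tiles with the trained input
distant = memory.predict(0.90)   # shares none
```

A single training example changes the output for nearby inputs in proportion to the number of shared tiles and leaves distant inputs untouched, which is what local generalization means here.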
However, it is not clear that either of the above models could satisfy the performance constraints summarized in Section 2.1. For example, how might sensory information be used to select the parameters of a motor program and to trigger its initiation, but then be disconnected during the feedforward stage of execution? PC outputs must somehow be responsive to sensory input during a programming phase, but become unresponsive during execution. In the Marr and Albus models, responses to sensory input are mediated through parallel fibers. There is no mechanism whereby these inputs would have actions during a programming phase, be disconnected during most of an execution phase, and then become active toward the end of execution in order to terminate the movement. Another issue concerns the independent control of the programming and initiation of a movement. While the Blomfield and Marr model could do this through an unspecified motor cortical process for initiating an excessive number of commands, in the Albus model there is no separation between programming and initiation.
2.3 Adjustable Pattern Generator (APG) Model
The adjustable pattern generator (APG) model was developed by Houk, Barto and colleagues specifically to address the limitations discussed in the previous paragraph (Houk, 1989; Houk et al 1990; Sinkjaer et al 1990; Berthier et al 1993; Buckingham et al, 1994; Houk and Wise, 1995). The term adjustable pattern generator refers to the ability of an APG to generate an elemental burst command with an adjustable intensity and duration. The model has an anatomically-based modular architecture that is summarized in Figure 4. Each module includes a positive feedback loop between a cerebellar nucleus cell (N) and a motor cortical cell (M), which provides an abstract representation of the cortico-rubral-cerebellar loops discussed in Section 1.1. Each N receives inhibitory input from a private set of Purkinje cells (PCs). Each set of PCs receives a private climbing fiber training input, convergent input from an array of parallel fibers (PFs), and inhibitory input from a basket cell (B) and from stellate cells (although the latter are not specifically simulated). In early versions of the model, the PFs were assumed to convey unprocessed mossy fiber input, whereas in our recent simulations we have used a layer of granule units combined with Golgi cell feedback to create a sparsely coded representation of the mossy fiber input, along the lines assumed in the Marr and Albus theories.
Figure 5 illustrates how this model achieves independent control over the initiation and programming of a movement. In this example, programming occurs in a preparatory phase that is initiated by an instruction signal which is transmitted to Bs and PCs through mossy fiber input. The balance between direct PF excitation of PCs and their inhibition through a B unit causes some PCs to switch to a more intense on-state (stippled trace in Fig. 5A) and others to switch to an off-state (solid trace). In the APG model, motor commands are not initiated by these programming events in the cerebellar cortex. Instead, the initiation of command generation is triggered independently by sensory or internal inputs to the motor cortex or red nucleus. As discussed in Section 1.1, this starts positive feedback thus causing the limb premotor network to switch from an inactive to an active state, which then initiates elemental commands in a large number of APG modules. It is only after motor commands are initiated that the effects of programming events in the cerebellar cortex become expressed. Each elemental command intensifies to a level that is determined by the degree to which the module's PCs are switched off or on by the instruction signal. In this manner, the intensities of the different elemental commands can be preset to a variety of levels, thus offering an explanation of how population vectors in motor cortex could be regulated (Eisenman et al, 1991). In agreement with the performance studies discussed earlier, this model allows the programming process to occur in a preparatory phase (the case illustrated in Fig. 5A), at the time of movement onset, or even after movement onset.
Once the PCs switch to a particular firing state, in response to the instruction stimulus, they are postulated to become refractory to further input until near the termination of the movement. As a consequence, simulated movements start to be executed in a feedforward manner. A key assumption is that climbing fibers train PCs to recognize those particular patterns of PF activity that indicate when desired endpoints are about to be reached. Occurrences of these patterns cause PCs that were turned off in the programming phase to fire strongly, and this terminates positive feedback in the premotor network. We call this type of control quasi-feedforward (Houk et al 1990) and have pointed out how it satisfies the motor program constraints from experimental psychology (Berthier et al 1993). It is also advantageous in preventing delayed feedback from causing instability oscillations (Houk et al, 1990). The fact that mossy fibers display prominent sensory properties (Van Kann et al 1993) whereas Purkinje and nuclear cells are relatively unresponsive to somatosensory stimulation (Harvey et al, 1977, 1979) is further explained by this feature.
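The sequence of programming, independent triggering, feedforward execution, and endpoint-driven termination can be sketched for a single module as follows. This is a toy illustration, not the published APG simulation. The assumptions are that PC inhibition caps the loop activity, that the command is read out directly as limb velocity, and that a fixed `lead` distance stands in for the learned parallel fiber pattern signaling that the endpoint is about to be reached. All names and numerical values are hypothetical.

```python
def run_apg_module(target, pc_level=0.0, lead=0.02, steps=2000, dt=0.001,
                   tau=0.02, gain=1.5, trigger_step=100):
    """Toy single-module sketch of quasi-feedforward control."""
    position, command = 0.0, 0.0
    pc = pc_level              # programmed state: 0.0 = PCs off, 1.0 = PCs on
    terminated = False
    peak, active_steps = 0.0, 0
    for k in range(steps):
        # Sensory input is consulted only to detect the trained
        # "endpoint about to be reached" pattern, which makes the PCs fire.
        if not terminated and target - position <= lead:
            pc, terminated = 1.0, True
        # Initiation is triggered independently of the programmed PC state.
        trigger = 0.2 if trigger_step <= k < trigger_step + 20 else 0.0
        ceiling = 1.0 - pc     # assumption: PC inhibition caps loop activity
        drive = min(max(gain * command + trigger, 0.0), ceiling)
        command += dt / tau * (drive - command)
        position += dt * command          # command read out as velocity
        peak = max(peak, command)
        if command > 0.05:
            active_steps += 1
    return {"final_position": position, "peak_command": peak,
            "duration_steps": active_steps}


near = run_apg_module(target=0.2)
far = run_apg_module(target=0.4)
slow = run_apg_module(target=0.2, pc_level=0.5)     # PCs only partly off
blocked = run_apg_module(target=0.2, pc_level=1.0)  # PCs left on
```

In this sketch the programmed PC state sets the command intensity, the trigger only starts the loop, and the movement ends near the target without any clock: the farther target simply takes longer to reach.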
In the original APG model, the quasi-feedforward characteristics of the network derived solely from the biophysical properties assumed for PC dendrites (Fig. 5B). Due to their high density of calcium channels, PC dendrites were assumed to behave like bistable binary elements possessing a zone of hysteresis, and, for simplicity, PCs were modeled as if they had only one dendrite. In a cell with many dendrites, each operating in a bistable manner, their summed effect on the soma would of course be multistable. An ionic model has been developed to demonstrate the feasibility of these ideas (Yuen et al, 1995). There are two other mechanisms that might contribute to the observed insensitivity to input, although both presume that PC dendrites behave in a binary (not necessarily bistable) fashion. If so, the recurrent inhibitory connections that Purkinje (and basket) cells make with each other might promote switching between states (Bell & Grimm, 1969). Furthermore, we have recently found that insensitivity to input during the feedforward phase can result if PCs are only trained to respond to those specific patterns of PF input that occur near the ends of movements.
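The bistable dendrite with a zone of hysteresis, and the multistable sum of several such dendrites, can be sketched with a simple two-threshold switch. This is an abstract illustration and not the ionic model cited above. The threshold values and the names `BistableDendrite` and `soma_output` are assumptions.

```python
class BistableDendrite:
    """Binary element with hysteresis: separate on and off thresholds."""

    def __init__(self, on_threshold=0.7, off_threshold=0.3):
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.state = 0

    def update(self, net_input):
        if self.state == 0 and net_input > self.on_threshold:
            self.state = 1
        elif self.state == 1 and net_input < self.off_threshold:
            self.state = 0
        # Inputs inside the hysteresis zone leave the state unchanged.
        return self.state


def soma_output(dendrites, net_input):
    """Summed effect of several bistable dendrites on the soma."""
    return sum(d.update(net_input) for d in dendrites)


single = BistableDendrite()
single_states = [single.update(x) for x in [0.5, 0.8, 0.5, 0.2, 0.5]]

tree = [BistableDendrite(0.5, 0.1), BistableDendrite(0.7, 0.3),
        BistableDendrite(0.9, 0.5)]
soma_levels = [soma_output(tree, x) for x in [0.8, 0.4, 0.2]]
```

The same input of 0.5 yields different outputs depending on history, which is the property that makes the element insensitive to moderate input changes once it has switched. Summing dendrites with different thresholds gives several stable output levels.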
Motor programs might be stored in a lookup table as detailed lists of highly specific instructions, which is close to what Marr (1969) envisioned in his original theory. However, a literal application of this scheme would exceed the storage capacity of the cerebellum (Kawato & Gomi, 1993). Instead, most investigators have favored the idea that memory may be used more frugally to store generalized motor programs that are then parameterized in order to control specific movements. In the APG model, the counterpart of a generalized motor program is a set of parallel fiber weights for proprioceptive and target inputs (Berthier et al 1993). This is analogous to Adams' (1971) "perceptual trace," since a particular constellation of parallel fiber inputs, when processed by the set of parallel fiber weights, signifies that the desired endpoint of a movement is about to be reached, causing PC firing that terminates the movement command. Once weights are learned, the model's commanded velocity is then parameterized by basket cell firing in the selection phase of the model's operation (Houk et al 1990). The velocity that is selected by turning PCs off is automatically scaled so as to depend on the distance between the initial position of the limb and the desired endpoint of the movement. Velocity can also be varied independently as a scaling factor controlled by diffuse neuromodulatory input, which can explain how velocity scaling can be applied simultaneously to all elements in a composite motor program (Schmidt, 1988). Movement duration is parameterized in the execution phase of model operation. Duration turns out to be a dependent variable that evolves from the course of the movement as opposed to being determined by an internal clock. Amplitude is parameterized by the target inputs which have synaptic influences that are graded according to where the target lies along the direction of motion controlled by the particular APG module.
2.4 Limb movement models motivated by control theory
While the formulation of the above models was based on the anatomy and physiology of the cerebellum, some authors have emphasized engineering control principles in designing cerebellar models. Control theorists have been fascinated by the potential utility of internal models of the controlled system, the latter being computational devices that predict responses when supplied with sample commands. The concept of internal models was used in a recent study by Miall, Weir, Wolpert and Stein (1993) that treats the cerebellum as a "Smith predictor." Such systems combine delayed and undelayed models of the controlled system to build controllers that are particularly suitable for controlling systems with large time delays. Miall and colleagues suggest that the cerebellum builds the appropriate models through a learning process. Once learned, these internal models become integral components of the controller. Since the motor system is characterized by large time delays, the cerebellum almost certainly needs to function as a predictive controller, but not necessarily as a Smith predictor. In another model motivated by engineering principles, Paulin (1989) suggested that the cerebellum computes like a Kalman filter, another form of predictor. The APG model has a much simpler control structure than either of these models, and it also functions as a predictive controller (Buckingham et al, 1994). There is scant evidence for several of the assumptions that Miall and colleagues make in attempting to map their Smith predictor onto the gross anatomy of the nervous system, let alone the microcircuitry of the cerebellum. Paulin does not even attempt this exercise.
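The way a Smith predictor combines delayed and undelayed internal models can be shown with a small discrete-time sketch. It is a textbook-style illustration with a perfect internal model, not the cerebellar mapping proposed by Miall and colleagues. The first-order plant, the delay of ten steps, the proportional gain, and the function name are assumptions.

```python
from collections import deque


def simulate(use_smith_predictor, kp=5.0, a=0.9, b=0.1, delay=10,
             steps=400, reference=1.0):
    """Proportional control of a delayed first-order plant.

    Plant: y[k+1] = a * y[k] + b * u[k - delay]
    """
    y = 0.0                                   # delayed plant output
    in_transit = deque([0.0] * delay)         # commands not yet acting
    model = 0.0                               # undelayed internal model
    model_history = deque([0.0] * delay)      # past outputs of that model
    outputs = []
    for _ in range(steps):
        if use_smith_predictor:
            model_delayed = model_history[0]
            # Undelayed prediction, corrected by the mismatch between the
            # real output and the delayed model output.
            feedback = model + (y - model_delayed)
        else:
            feedback = y
        u = kp * (reference - feedback)

        in_transit.append(u)
        y = a * y + b * in_transit.popleft()  # plant sees an old command

        model_history.append(model)
        model_history.popleft()
        model = a * model + b * u             # model sees the current command
        outputs.append(y)
    return outputs


plain = simulate(use_smith_predictor=False)
smith = simulate(use_smith_predictor=True)
```

With this gain the plain feedback loop oscillates with growing amplitude because of the delay, whereas the Smith predictor settles at the value the same controller would reach without any delay, here 5/6 of the reference because the controller is purely proportional.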
Kawato and Gomi (1992a) proposed that the lateral cerebellum learns to function as an "inverse model" of the limb controlled system. Inverse models do the opposite of the forward models discussed in the previous paragraph; they predict commands when supplied with sample responses. If instead of sample responses they are supplied with signals representing desired trajectories, an inverse model then generates the appropriate commands. Kawato and Gomi assumed that the motor cortex and cerebellum are simultaneously provided parietal signals representing a desired trajectory of a limb movement. The motor cortex compares the desired trajectory with sensory feedback and issues a crude command while waiting for the cerebellum to use its inverse model to compute a precise command. After the latter is sent back to the motor cortex, it is used to compute an updated command. One problem with this model is that it ignores the time delays mentioned in the previous paragraph, which could seriously degrade this control scheme. Although the authors map their controller onto the gross anatomy of the brain, no attempt is made to show how the microcircuitry of the cerebellum might be used in implementing an inverse model. These authors have also proposed models of the intermediate and medial cerebellum that are based on similar principles (Gomi and Kawato, 1992).
The effective use of internal models presumes the existence of neural stages that compute desired trajectories for movements through space, and other stages for conversion into desired changes in muscle lengths and/or joint angles (Kawato 1990). Although signals capable of specifying the positions of targets in extrapersonal space are present in the parietal association cortex (Zipser & Andersen, 1988), there is no evidence for signals that specify desired trajectories. The APG model circumvents this problem because it formulates its own trajectories based on intrinsic circuitry and properties of the neuromuscular system. These built-in trajectories tend to be straight lines in the space of intrinsic coordinates that move the limb directly from an arbitrary starting point to the desired endpoint (Berthier et al 1993). Indirect trajectories, when they are needed to avoid obstacles, can be generated by specifying via points designated, for example, by signals from the premotor cortex (Houk & Wise, 1995).
2.5 Conceptual models of limb control
In addition to the above mentioned executable models of the cerebellum as a limb controller, we will mention a few of the conceptual models that have been discussed and schematized by neurobiologists, but lack a more precise formulation that would be suitable for simulation. In this vein, Thach and colleagues (1992) described why the parallel fiber architecture is ideal for presenting PCs with the information they need to coordinate multijoint movements. We accept this idea and tacitly include it in the APG model discussed earlier. Bloedel (1992) proposed that climbing fibers selectively enhance microzones comprised of parasagittal rows of PCs rather than training PCs to recognize patterned input. This is compatible with the Llinás and Welsh (1993) scheme whereby gap junctions in the inferior olive synchronize clusters of olivary neurons that innervate parasagittal zones in cerebellar cortex. Arshavsky, Gelfand and Orlovsky (1986) offered some general guidelines about the role of the cerebellum in coordinating locomotion. Finally, Prochazka (1989) suggested that the role of the cerebellum is to control the gain of spinal and brainstem reflexes. The gain-control idea has also arisen in eye movement models that are discussed later.
2.6 Conditioned reflex models
Several models have been proposed to explain the role of the cerebellum in mediating conditioned reflex (CR) responses of the eyelid to tone and light conditioned stimuli (CS) (Thompson, 1986; Moore et al, 1989; Gluck et al, 1990; Houk, 1990; Buonomano and Mauk, 1994). The circuit implicated in this response is similar in several respects to the limb premotor network discussed in previous sections (Fig. 2). However, the CR models have generally neglected pathways through thalamus and motor cortex, since CRs can be learned and performed in decerebrate animals. Based on this simplification and other findings, Thompson (1986) proposed a conceptual model along the lines summarized in Figure 6A. Tone and light CSs project (via the pons) as mossy fibers into the cerebellum. Collaterals of the mossy fibers provide direct excitation of N cells and excitation of PCs via PFs. In this manner, N cells are both directly excited and indirectly restrained by PC inhibition to produce an N output. The combination generates a neural model of a CR that is then relayed via R to the abducens (VI) and other eye nuclei that output the CR. Unconditioned reflexes (URs) are mediated by air puff unconditioned stimuli (US) to the cornea via the trigeminal nucleus (V). The US is also transmitted through climbing fibers, and the latter train PCs and Ns to respond appropriately to their inputs.
This model is meant to explain how CRs are generated in delay conditioning tasks. In this paradigm, the CS is an extended period of stimulation that begins several hundred msec in advance of the US and continues until the US is delivered. Thus, one can assume that the CS provides a constant excitatory drive to N that can readily be shaped by PC inhibition to form an appropriate CR. The control problem for PCs is to fire initially so as to cancel the excitatory drive to N and pause when it is time to initiate the CR; depending on the particular task, the PCs may also have to resume firing at the end to cancel the excitatory drive to N, thus terminating the CR. In several implementable models based on this concept (Moore et al, 1989; Gluck et al, 1990; Buonomano and Mauk, 1994), PCs receive a set of PFs that transmit a diversity of temporal patterns related to the CS. The PC is taught to respond to those PFs that are active during periods when N needs to be inhibited and not to respond to PFs active when N needs to be disinhibited. In the Moore et al (1989) model, the different timings of PF signals are generated in the pontine nuclei as a set of variously delayed responses to the CS input. In the Gluck et al (1990) model, the different timings are again assumed to be generated outside of the cerebellum, but in this case are represented as spike trains that are modulated at several sinusoidal frequencies and phases. In the Buonomano and Mauk (1994) model, the different timings are assumed to be generated within the cerebellar cortex, as dynamic interactions between granule and Golgi cells. This latter process is analogous to the mechanisms proposed by Fujita (1982) and by Chapeau-Blondeau and Chauvet (1991) in their dynamic filter models of cerebellar function. The model by Moore and colleagues is most readily accommodated by current single unit data, which demonstrate that sources of mossy fiber input to the cerebellum have a distribution of latencies.
The Moore et al (1989) model is similar in this respect to a model proposed by Bell (1995) to explain the ability of the lateral line organ in electric fish to adaptively cancel the reafference from its own electrical discharge. The lateral line organ has phylogenetic relations to the neuronal architecture of the cerebellum (Mugnaini & Mahler, 1993).
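The general idea behind the delay conditioning models described above, in which variously delayed PF signals let a PC learn when to pause, can be sketched in a few lines of Python. This is only a toy in the spirit of a tapped delay line, not a reimplementation of any of the cited models. The one-PF-per-time-step coding, the multiplicative depression applied when the climbing fiber fires, and the constants N_TAPS, US_TIME, rate and drive are all assumptions made for illustration.

```python
N_TAPS = 20     # hypothetical number of delayed PF channels (one per time step)
US_TIME = 15    # hypothetical time step at which the US (climbing fiber) arrives


def pf_activity(t):
    """Tapped delay line: PF i is active only i steps after CS onset."""
    return [1.0 if i == t else 0.0 for i in range(N_TAPS)]


def train(trials=50, rate=0.2):
    """Depress the weights of PFs that are active when the climbing fiber fires."""
    weights = [1.0] * N_TAPS
    for _ in range(trials):
        for t in range(N_TAPS):
            pf = pf_activity(t)
            cf = 1.0 if t == US_TIME else 0.0
            for i in range(N_TAPS):
                weights[i] -= rate * cf * pf[i] * weights[i]
    return weights


def nuclear_output(weights, drive=1.0):
    """N output: constant CS drive minus PC inhibition, rectified at zero."""
    output = []
    for t in range(N_TAPS):
        pc = sum(w * p for w, p in zip(weights, pf_activity(t)))
        output.append(max(0.0, drive - pc))
    return output


response = nuclear_output(train())
print("time step of peak N output:", response.index(max(response)))
```

Before training the PC cancels the CS drive at every time step; after training it pauses only at the step paired with the US, so the N output is released there.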
In contrast to the conditioning models described in the previous paragraphs, Houk (1989; 1990) suggested that the excitatory drive to N is produced not by direct sensory responses to the CS, but instead by positive feedback in the recurrent network that interconnects red nucleus (R), the lateral reticular nucleus (L) and the cerebellar nucleus (N) in Figure 6B. This hypothesis is similar to the APG model discussed earlier in relation to limb movement control (Figs. 2 & 4). Weak sensory (CS) inputs to R need to circulate and spread to start sufficient positive feedback in the R-L-N loop to initiate a CR command. The amplitude and duration of the CR command are then controlled by PC inhibition of loop activity. This model allows PCs to pause well in advance of the CR as a preparatory programming action, in agreement with observed firing patterns of PCs (Berthier and Moore, 1986). This model is also in agreement with the diversity of signals that are recorded from R neurons during performance of CRs (Desmond and Moore, 1991). In summary, the APG version of Thompson's (1986) conceptual model suggests that the CS-US association is mainly formed outside of the cerebellum (in R, L or even the motor cortex), whereas the primary role of the cerebellum is to determine the topography of the CR (i.e., the motor program).
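The interplay of positive feedback and PC inhibition described above can be illustrated with a deliberately simple rate model in Python. The single lumped loop variable, the loop gain of 1.2, the saturation at 1.0, and the timing of the weak CS pulse and of the PC pause are all hypothetical choices; the sketch is not the published APG model.

```python
def run_loop(pc_inhibition, cs_input, gain=1.2):
    """Lumped activity of a recurrent loop with gain > 1, clipped to [0, 1].

    pc_inhibition and cs_input are per-time-step lists of equal length.
    """
    activity = 0.0
    trace = []
    for pc, cs in zip(pc_inhibition, cs_input):
        # Recurrent excitation plus sensory input, minus PC inhibition.
        activity = min(1.0, max(0.0, gain * activity + cs - pc))
        trace.append(activity)
    return trace


STEPS = 50
cs = [0.05 if t == 5 else 0.0 for t in range(STEPS)]            # brief, weak CS
pc_pause = [0.0 if t < 30 else 1.0 for t in range(STEPS)]       # PCs pause, then fire
pc_always_on = [1.0] * STEPS                                    # PCs never pause

command = run_loop(pc_pause, cs)
suppressed = run_loop(pc_always_on, cs)
print("peak command with PC pause:", max(command))
print("peak command without pause:", max(suppressed))
```

In this toy, the weak input regenerates into a full command only while the PCs are paused, and the command ends when PC firing resumes, so the PC firing pattern sets the duration of the command.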
CR commands generated by APG modules are subsequently sent to a brainstem network that produces the final output. Activity-dependent labeling results led Keifer and Houk (1995) to suggest a model in which a trigeminal-reticulo-abducens network, illustrated schematically in Figure 6B, is used to output both CRs and URs. This brainstem network may be analogous to the network formed by segmental interneurons, propriospinal neurons and motor neurons in the spinal cord, the latter being the target of the elemental commands illustrated in Figure 2.
2.7 Eye saccade models
Lesions of the cerebellum severely disrupt an animal's ability to adapt the accuracy of saccadic eye movements (Robinson & Optican, 1981). While there are many eye saccade models, only a few attempt to explain how the cerebellum might achieve this adaptive control (Grossberg & Kuperstein, 1989; Houk et al 1992; Dean et al 1994). The anatomy indicates that cerebellar regulation of saccadic eye movements should occur at two levels, at the level of a tecto-cerebellar command network and at the level of a brainstem burst-generating network (cf. Houk et al 1992). The model shown in Figure 7A highlights the cerebellar control of the tecto-cerebellar command network, which we will consider first.
The intermediate layer of cells in the superior colliculus, or tectum (T), projects to a category of brainstem reticular neurons called long lead (LL) bursters. LLs are an important source of mossy fiber input to the cerebellum, with collaterals to the cerebellar nuclei (N). The N cells targeted by these collaterals project back onto T neurons, forming a recurrent network. Neurons in the T-LL-N premotor network generate bursts of discharge that typically precede saccadic eye movements by relatively long latencies and are assumed to function as saccade commands. The distributed nature of saccade commands is similar to the situation in the premotor networks controlling limb movements (M, R, P, L, N and T neurons in Fig. 2). Pursuing this analogy further, Houk, Galiana & Guitton (1992) hypothesized that positive feedback in the tecto-reticulo-cerebellar recurrent network functions as an important driving force for generating long-lead burst activity. The model explains the psychophysical observation (Gielen & van Gisbergen, 1990) that the initiation of saccades and the specification of their kinematic parameters are controlled by separate processes. A saccadic burst command is initiated when weak visual sensory input to T neurons from the superficial layer of the model tectum initiates positive feedback in the T-LL-N loop in Figure 7A. PCs in the cerebellar cortex then regulate the intensity and duration of the bursts, thus specifying the motor program that controls the velocity, duration and direction of saccadic eye movements.
Saccade commands are vectors comprised of many elemental commands, each specifying an elemental saccade in a particular direction. (This is in analogy with limb motor commands, except that elemental saccade commands are also specialized for particular movement amplitudes.) The model proposed by Houk et al (1992) is an APG array analogous to the limb model discussed in Section 2.3, in which each T-LL-N module is regulated by its own set of PCs. The vector sum of the set of elemental commands controls the direction of the saccade. Divergence in individual T-LL-N loops explains how a large population of T neurons can be recruited to form the composite command observed at the level of the tectum (McIlwain 1986), and the model equally explains the concurrent bursting seen in LL and N neurons. The pauses that occur in N neurons just before and just after bursts (Fuchs et al 1993) appear to be expressions of PC inhibition in the process of regulating loop activity. These particular N neurons are clustered in the caudal fastigial nucleus along with other N neurons that are reciprocally connected with the brainstem burst-generating network.
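The statement that the vector sum of elemental commands sets the direction of the saccade can be made concrete with a short Python sketch. The representation of each module as a (preferred direction, activity) pair and the function name composite_saccade are assumptions introduced here for illustration; they are not taken from Houk et al (1992).

```python
import math


def composite_saccade(elemental_commands):
    """Sum elemental commands given as (preferred_direction_deg, activity) pairs.

    Returns the direction (degrees) and magnitude of the resulting vector.
    """
    x = sum(act * math.cos(math.radians(d)) for d, act in elemental_commands)
    y = sum(act * math.sin(math.radians(d)) for d, act in elemental_commands)
    return math.degrees(math.atan2(y, x)), math.hypot(x, y)


# Two hypothetical modules, one rightward (0 deg) and one upward (90 deg).
direction, magnitude = composite_saccade([(0.0, 1.0), (90.0, 1.0)])
print("direction (deg):", direction, "magnitude:", magnitude)
```

Changing the relative activity of the two modules rotates the composite command between their preferred directions, which is one way a set of PCs regulating individual modules could steer the saccade.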
The brainstem burst-generating network is shown in Figure 7B. It receives retinotopically organized saccade commands from LL neurons that participate in the tecto-cerebellar network of Figure 7A. Through a convergence and burst generating mechanism that has been modelled by several authors (Robinson, 1975; Van Gisbergen et al, 1981; Scudder 1988; Van Gisbergen et al 1989; Galiana & Guitton, 1992; Krommenhoek et al, 1993), these retinotopic inputs are converted into muscle-specific burst outputs that are sent to eye motor neurons to control saccadic movements. (Bursts are also sent to the smooth eye system to control fixations after saccades are completed.) Figure 7B shows the bilateral network for horizontal saccades whereas only one side of the tecto-reticulo-cerebellar network was illustrated in Figure 7A (as another simplification, control of OP neurons by tectal fixation neurons was not discussed). The omnipause neurons (OP) shown in the center maintain fixation between saccades by tonically inhibiting the burst neurons. Bursts are triggered by LL saccade commands; LL neurons on one side inhibit OPs while simultaneously exciting that side's excitatory burst (EB) neurons. Mutual inhibition between inhibitory burst (IB) neurons prevents both sides from firing simultaneously. N neurons in the fastigial nucleus project to EB, IB and OP neurons and receive recurrent connections back from these and other sites, as collaterals of mossy fiber inputs to the appropriate regions of the cerebellar cortex.
The cerebellum regulates the burst-generating network through inhibition and disinhibition of N neurons, only one of which is shown in Figure 7B. A recent model of this system by Dean and colleagues (1994) gives the cerebellum the function of setting the gain of feedback to the burst generator circuit. They implemented gain control using a CMAC representation of the cerebellar cortex. Like the earlier model proposed by Grossberg and Kuperstein (1989), Dean's model computes gain factors that, when inserted into the brainstem saccade network, are capable of generating accurate saccades in the course of development. They also simulated the adaptive response that occurs after weakening eye muscles or after displacing visual targets. As currently formulated, this is a trial-level model that does not predict the time course of neural signals. While some investigators see the cerebellum as exerting gain control, others have suggested that compensation is produced by subtracting an adjustable signal from a fixed-gain saccadic circuit (Optican & Robinson, 1980). This subtraction scheme is the one favored by single unit data and also by the APG array model promoted in this article.
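The difference between the gain-control and subtraction schemes can be illustrated at the trial level with a small Python sketch. It is not the CMAC model of Dean and colleagues or the scheme of Optican and Robinson; the single target amplitude, the scalar plant gain standing in for a weakened eye muscle, the error-driven update rules and the parameter values are all assumptions.

```python
def adapt(scheme, trials=100, target=10.0, plant_gain=0.7, rate=0.5):
    """Trial-by-trial saccade adaptation after a hypothetical muscle weakening.

    scheme="gain":     an adjustable gain multiplies the saccade command.
    scheme="subtract": an adjustable signal is subtracted from a fixed-gain command.
    Returns (gain, offset, final landing position).
    """
    gain, offset = 1.0, 0.0
    for _ in range(trials):
        command = gain * target - offset
        landed = plant_gain * command       # weakened plant undershoots
        error = target - landed             # post-saccadic error
        if scheme == "gain":
            gain += rate * error / target
        else:
            offset -= rate * error
    return gain, offset, plant_gain * (gain * target - offset)


for scheme in ("gain", "subtract"):
    g, s, landed = adapt(scheme)
    print(scheme, "gain:", g, "offset:", s, "landing position:", landed)
```

Both schemes restore accuracy for the trained amplitude in this toy; they differ in what is learned (a multiplicative factor versus an additive signal) and would therefore generalize differently to untrained amplitudes.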
There has been considerable debate as to whether saccades are controlled as fixed vectors in retinotopic coordinates or as endpoint trajectories in head or body coordinates (Robinson 1987; Scudder 1988; Sparks 1988; Guitton et al, 1990). We have postulated that both strategies are used, but at different locations in the network (Houk et al, 1992). In the APG array model, the cerebellar cortex operates in head (or body if the head is free) coordinates. It exerts its executive regulation by preselecting an action (as in Fig. 5), triggering it and then continuing it until the desired endpoint of a saccade is about to be reached, whereupon its PCs fire to terminate positive feedback in the loops that they regulate. PCs recognize desired endpoints on the basis of PF patterns derived from proprioceptive, efference copy and target signals in mossy fibers. Since different APGs regulate the tecto-cerebellar network (Fig. 7A) and the brainstem saccade generator (Fig. 7B), the PCs in the two systems probably utilize different vectors of PF input to perform their computation. We consider the computation to be a pattern recognition task that implements finite state control, as opposed to conventional feedback control. On the other hand, we accept the simpler feedback-controlled, fixed vector models as being valid for the computations that are performed more automatically within the tecto-reticulo-cerebellar network and within the brainstem saccade generator.
Since the saccade control system includes cerebellar connections with the final stage of motor output in the brainstem, as well as with the tectal-cerebellar network, it raises the possibility that a similar regulation may eventually be documented for limb and conditioned eyelid systems.
2.8 Smooth eye movement models
Starting with Ito (1970), many investigators have proposed that the cerebellum serves an important function in the regulation of smooth eye movements (Robinson 1976; Ito 1984; Galiana 1986; Lisberger 1994; Peterson et al, 1991; Kawato & Gomi, 1992b; Shidara et al 1993). The basic circuit upon which these theories are based is shown in Figure 8. Neurons in the vestibular nucleus receive vestibular sensory input from the semicircular canals, visual sensory input from the retina, and an inhibitory input from PCs in floccular, parafloccular and vermal regions of the cerebellar cortex. The sensory inputs to V neurons mediate basic vestibulo-ocular and optokinetic responses on a background of tonic PC inhibition, and these brainstem reflexes are then fine tuned by modulations in PC discharge. The PC signals are computed from a variety of parallel fiber signals which include vestibular discharge related to the rotational velocity of the head, optokinetic discharge related to motion of the visual field and efference copies of the eye movement commands computed by the network. Since motion of small visual targets has negligible input to V neurons other than via PCs, the models attribute pursuit movements entirely to transmission through the cerebellar cortex.
In normal animals, floccular PCs discharge steadily without modulation during vestibulo-ocular reflexes, even though powerful head velocity and efference copy inputs can be demonstrated (Miles and Lisberger, 1981). This has led to the hypothesis that the basic vestibulo-ocular reflex through the brainstem is operating with an appropriate gain without the regulatory assistance of the cerebellum. These PCs do fire, however, when the animal makes pursuit eye movements, compatible with the hypothesis that PC discharge in response to the visual parallel fiber input mediates pursuit responses through disinhibition (Lisberger 1994). Adaptation of the vestibulo-ocular reflex to visual distortion has been modeled in different ways. Ito (1989) attributes the adaptation to adjustments in PF synapses, whereas Lisberger (1994) argues that the requisite changes in gain are present in the brainstem part of this system. Other models have assigned gain change to both locations; rapid changes were attributed to the cerebellar cortex and slower changes to the vestibular nucleus (Galiana 1986; Peterson et al, 1991). Tests of the latter hypotheses suggested that at least some of the rapid gain changes occur in the vestibular nucleus (Khater et al 1993). A significant part of the compensatory response, however, is still attributable to the cerebellar cortex (Partsalis et al. 1995). On theoretical grounds, it seems advantageous to have an adaptive capacity in both PCs and in the premotor network (Houk & Barto, 1992); however, the details of timing and degree of adaptation at the two sites are still being debated.
Models of smooth pursuit should address some of its unique properties. They should be capable of explaining the ability of pursuit to continue after visual input is interrupted. Young (1977), and others after him (cf. Robinson, 1987; Lisberger, 1994), attributed this to positive feedback through the efference copy loop in Figure 8. (This assumes that the efference copy signals have a sign inversion in addition to the stage of PC inhibition.) The model in Figure 8 is oversimplified in the sense that it does not include the many recurrent loops in the smooth eye premotor network that were discussed in Section 1.2 (Fig. 3). Positive feedback in these loops should also contribute to this storage property. Only the models proposed by Galiana (1986) and Peterson et al (1991) include these features. A shortcoming of all of the above models of pursuit is their failure to confront predictive tracking, which is the ability to follow, without any significant time delay, targets that have certain deterministic properties. It is likely that much of our normal pursuit of moving targets involves at least some degree of predictive tracking. Mahamud et al. (1995) have recently shown that the APG model is capable of predictive tracking, and this feature is now being explored in more detail.
2.9 Cognitive processing models
The above models apply to the phylogenetically older parts of the cerebellum which are connected with the motor system. The newer parts of the cerebellum (much of the hemispheres and the dentate nuclei) instead are connected with the so-called association regions of the cerebral cortex (Middleton & Strick, 1994). Leiner, Leiner and Dow (1989) have proposed a conceptual model of how these newer areas of the cerebellum may be involved in the processing of cognitive information, and Ito (1993) has formally outlined how models of the cerebellum might be extended to cognitive problems. Further progress will depend on the development of network models of cognitive processing.
3. Role of the Cerebellum in Motor Learning
Models of motor learning need to address several aspects of the problem. First, it is essential to specify precisely what is being learned in an information processing sense. The models of the cerebellum reviewed in the previous sections can serve this function. Second, they need to adopt a rule for modifying synaptic efficacy, hereafter referred to as a learning rule. Preferably the learning rule should conform to, or at least be motivated by, the cellular mechanisms that underlie neuronal plasticity in the region (or regions) of brain that is (are) being modeled. Third, they need to confront the credit assignment problem, which is the difficulty of directing training signals to the appropriate sites in the network, and at the appropriate moments in the training process, in order for learning to be adaptive. Fourth, they must define the training information that is provided to the model, and this should be justified in terms of the information that is likely to be available for guiding the learning process in the organism. In this section we attempt to address each of these issues in relation to the role of the cerebellum in motor learning.
3.1 Cellular mechanisms defining learning rules
The publication of pattern recognition models of the cerebellum by Marr (1969) and Albus (1971) encouraged experimentalists to search for a cellular mechanism of synaptic plasticity in parallel fiber synapses that might implement one of the postulated learning rules. Marr had the hypothesis that the parallel fiber synaptic weight would be increased if the Purkinje cell fired at about the same time that the parallel fiber was active. This is analogous to long-term potentiation (LTP) as expressed in the hippocampus (Bliss & Collingridge, 1993) and amounts to a Hebbian rule in which a synapse is strengthened whenever the presynaptic ending and postsynaptic cell are simultaneously active. The role of the climbing fiber input was to fire the Purkinje cell unconditionally, thus reinforcing parallel fiber synapses active at the time of climbing fiber discharge. No provision was made for weight decreases.
Albus envisioned a learning rule with an opposite sign, and with more complex properties. He postulated that synaptic weight would be decreased, i.e., a long-term depression (LTD) instead of LTP, and that this would occur only in the presence of a three-way coincidence between a climbing fiber input (training signal), Purkinje cell firing (postsynaptic factor) and parallel fiber synaptic activity (presynaptic factor). This amounts to a unidirectional version of the training rule used in research on Perceptrons. He also postulated synaptic decrements on the spiny synapses of basket and stellate cells, thus providing a mechanism for countering the generally reduced excitatory input to Purkinje cells that would occur with training experience.
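The two postulated learning rules can be contrasted in a short Python sketch. The binary activity flags, the fixed step size rate, and the clamping of weights at zero in the Albus-style rule are simplifying assumptions introduced here; neither function reproduces the full formulation of the original theories.

```python
def marr_update(weights, pf_active, cf_active, rate=0.1):
    """Marr-style rule: the climbing fiber fires the PC unconditionally, so every
    PF synapse active at that moment is strengthened. Weights never decrease."""
    if not cf_active:
        return list(weights)
    return [w + rate if pf else w for w, pf in zip(weights, pf_active)]


def albus_update(weights, pf_active, cf_active, pc_firing, rate=0.1):
    """Albus-style rule: weights decrease only on a three-way coincidence of
    climbing fiber input, PC firing and PF activity.
    Clamping at zero is an assumption of this sketch."""
    if not (cf_active and pc_firing):
        return list(weights)
    return [max(0.0, w - rate) if pf else w for w, pf in zip(weights, pf_active)]


weights = [0.5, 0.5]
pf_active = [True, False]          # only the first PF is active
print("Marr: ", marr_update(weights, pf_active, cf_active=True))
print("Albus:", albus_update(weights, pf_active, cf_active=True, pc_firing=True))
```

In both cases only the synapse whose PF was active is modified, but the sign of the change is opposite, which is the distinction that the experimental search for a cerebellar learning rule set out to resolve.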
The experimental search for cellular mechanisms defining a cerebellar learning rule is discussed in other articles in this issue (Crepel et al., this volume; Linden, this volume). To summarize these observations, there now seems to be good agreement that climbing fiber activity, when coupled with other factors, produces an LTD as opposed to an LTP, in agreement with Albus' model of the learning rule (Ito 1989). Although there may be more than one mechanism for LTD, the one that is best supported by current data involves the intracellular activation of protein kinase C in the spine (Linden 1994). The model shown in Fig. 9 summarizes the intracellular steps that appear to mediate protein kinase C activation and relates these processes to a three-factor learning rule for LTD.
When a parallel fiber is active, it releases glutamate neurotransmitter which produces both depolarization of the spine, via AMPA receptors, and an activation of mGluR1 metabotropic receptors (left side of Fig. 9). The activation of metabotropic receptors should be localized to those synapses that are activated by presynaptic transmitter release, and this step has been interpreted to represent the presynaptic component in a 3-factor learning rule (Houk et al, 1990; Houk & Barto, 1992). The other two factors relate to dendrite and spine depolarization, which are required to open calcium channels and elevate spine calcium. According to the model, accumulation of sufficient spine calcium to mediate LTD requires a larger degree of spine depolarization than can be produced by the spine's own synapse. Additional spine depolarization is caused by the depolarization of the adjacent dendrite, the latter being mediated partly by climbing fiber input (the training signal) and partly by postsynaptic responses to other parallel fiber inputs (the postsynaptic factor). If both of these influences are present, the spine becomes sufficiently depolarized to elevate spine calcium appreciably. Then protein kinase C can be activated by its cofactors, calcium and the diacylglycerol produced by mGluR1 activation. Activated protein kinase C produces LTD through an action on AMPA receptors (Crepel et al., this volume). Thus, presynaptic, postsynaptic and training factors are required in combination to produce LTD. Additional support for the above model of LTD has come recently from studies of mice that lack mGluR1 (Conquet et al 1994; Aiba et al 1994). These animals display both impaired Purkinje cell LTD and severe cerebellar ataxia.
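The logic of the three-factor rule described above can be written as a small Python function. The numerical depolarization values and the calcium threshold are hypothetical, chosen only so that no single factor is sufficient on its own; the function name ltd_induced and its parameters are likewise introduced here for illustration.

```python
def ltd_induced(pf_active, cf_active, other_pf_depol, mglur1_present=True,
                own_depol=0.3, cf_depol=0.5, calcium_threshold=1.0):
    """Return True if the spine would undergo LTD under the three-factor sketch.

    pf_active:      presynaptic factor (this spine's own parallel fiber).
    cf_active:      training signal (climbing fiber depolarizes the dendrite).
    other_pf_depol: postsynaptic factor (dendritic depolarization produced by
                    other parallel fiber inputs), in arbitrary units.
    """
    own = own_depol if pf_active else 0.0
    dendrite = (cf_depol if cf_active else 0.0) + other_pf_depol
    # Spine calcium rises appreciably only if total depolarization is large enough.
    calcium_elevated = (own + dendrite) >= calcium_threshold
    # Diacylglycerol is produced only at synapses whose mGluR1 receptors are activated.
    diacylglycerol = pf_active and mglur1_present
    # Protein kinase C needs both cofactors, and its activation leads to LTD.
    return calcium_elevated and diacylglycerol


print("all three factors:     ", ltd_induced(True, True, 0.4))
print("no climbing fiber:     ", ltd_induced(True, False, 0.4))
print("no other PF activity:  ", ltd_induced(True, True, 0.0))
print("synapse itself silent: ", ltd_induced(False, True, 0.4))
print("mGluR1 absent:         ", ltd_induced(True, True, 0.4, mglur1_present=False))
```

The last case corresponds loosely to the mGluR1-deficient mice mentioned above: in the sketch, removing the receptor removes the diacylglycerol cofactor and therefore blocks LTD.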
Less is known about the learning rule for weight increases, although one seems to exist. LTP has been induced in brain slices by stimulating parallel fibers without climbing fibers (Sakurai 1987). This led Houk and Barto (1992) to postulate that the presence of presynaptic input without either postsynaptic depolarization or climbing fiber discharge mediates LTP. However, in tissue culture LTP can be produced by combining AMPA-receptor activation with depolarization, provided mGluR1 activation is omitted (Linden et al 1991). This result suggests quite a different rule for LTP, the presence of climbing fiber and postsynaptic activity coupled with the absence of presynaptic activity. The latter rule, which would have the effect of normalizing input to PCs, has not yet been tested computationally. It receives support from the recent finding that gain decreases of the vestibulo-ocular reflex (presumably mediated by LTP) are resistant to metabotropic antagonists, whereas gain increases (presumably mediated by LTD) are blocked (Carter & McElligott, 1994).
Simulation studies using the learning rule for LTD outlined in Figure 9 and the first mechanism for LTP mentioned above demonstrated the capability of finding correct PF weights, but only in a simple learning task under highly restricted conditions (Berthier et al 1993). The model's learning process was not robust enough to learn arbitrary movements in different parts of the work space, nor was it capable of training the weights of sets of PF synapses onto individual PCs. Most cerebellar modeling studies have not attempted to conform to the mechanisms of synaptic plasticity to this degree. They have simply used variants of the well-known Perceptron or LMS rules (discussed in Houk & Barto, 1992). Such studies can be looked upon as testing the operational features of a model adequately, and in some cases they may test some of the organizational issues of learning discussed later, but they fall short of addressing the basic neurobiology of motor learning, which has been an important goal in our research.
In analyzing the shortcomings of the above cellular learning rule, we found that a major problem is its failure to adequately address the temporal credit assignment problem. This is the problem of delivering appropriately timed training information to the network's neurons to insure that learning is adaptive. The actions produced by PCs are completed before they are detected by sensory feedback to generate the training information in CFs. To compensate for this problem of delayed feedback, the cerebellar learning rule needs to modify synaptic actions that occurred prior to a CF's discharge. Most synaptic physiologists have not addressed this problem of temporal credit assignment. However, the one full study that is available reported that CF stimulation must actually precede parallel fiber stimulation by 125-250 msec for an optimal LTD (Ekerot & Kano, 1989). A recent preliminary report based on field potentials suggested the opposite (and computationally more appropriate) timing (Chen & Thompson, 1992). More attention needs to be given to this important issue.
Network theorists typically address temporal credit assignment by assuming a trace mechanism that provides a short-term memory of preceding synaptic events until the arrival of the corresponding training information (Klopf 1982; Sutton & Barto, 1981). In the APG model, the most critical events to store are traces of the synaptic events that promote state transitions in PC dendrites. In a recent version of the APG model (Buckingham et al, 1994), we postulated that a trace is triggered whenever a PC switches from its off- to its on-state. Recall that this should occur when a PC recognizes a pattern of PF input predicting that the limb will soon reach its intended goal. The subsequent firing of the PC helps to terminate the movement, and it is only after this that the CF returns error information (Section 3.3). The important point here is that CF firing, in trials when it occurs, arrives several hundred msec after the PC response that needs to be evaluated. The postulated mechanism saves a trace in those spines that helped to switch the PC into its depolarized state. With this learning rule, PCs very effectively learn to respond to complex patterns of PF input to terminate movements at an intended goal (Buckingham et al, 1994). Cellular studies need to explore the possibility that spines receiving PF input concurrent with the onsets of plateau potentials undergo LTD in response to subsequent CF input.
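The role of a trace in bridging the delay between a PC state transition and the later CF signal can be sketched as follows. This is a generic eligibility-trace illustration, not the learning rule of Buckingham et al (1994): the exponential decay, the multiplicative depression, the 10 msec time step implied by the comments, and all parameter values are assumptions.

```python
def run_trial(weights, pf_at_switch, switch_step, cf_step,
              decay=0.95, rate=0.5, steps=60):
    """One trial in which a trace is set at the PC off-to-on transition.

    pf_at_switch: flags marking the spines whose PF input helped switch the PC on.
    switch_step:  time step of the PC state transition.
    cf_step:      time step of climbing fiber discharge, or None if no CF fires.
    Time steps can be read as roughly 10 msec each (an assumption).
    """
    traces = [0.0] * len(weights)
    new_weights = list(weights)
    for t in range(steps):
        traces = [decay * e for e in traces]          # traces fade with time
        if t == switch_step:
            # Tag only the spines that were active at the state transition.
            traces = [1.0 if active else e
                      for e, active in zip(traces, pf_at_switch)]
        if cf_step is not None and t == cf_step:
            # The delayed training signal depresses tagged synapses in
            # proportion to whatever trace remains.
            new_weights = [w - rate * e * w for w, e in zip(new_weights, traces)]
    return new_weights


# CF arrives 30 steps (about 300 msec under the assumed step size) after the switch.
print(run_trial([1.0, 1.0], [True, False], switch_step=10, cf_step=40))
```

Without the trace, a rule requiring strict coincidence of PF and CF activity would leave both weights unchanged in this example, because the PF input that triggered the transition is no longer active when the CF fires.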
The cellular mechanisms of LTD and LTP discussed above may not be adequate or appropriate for forming more permanent, very long-term memories of motor programs. Gilbert (1975) postulated that noradrenaline-containing cells in the locus coeruleus function to evaluate motor performance on a slower time scale than do CFs. Their firing might signal, for example, that a succession of movements performed over the course of several minutes was sufficiently successful to warrant conversion of the LTDs and/or LTPs that occurred during their execution into a more permanent memory. The latter might, for example, take the form of changes in spine density as seen by Greenough and colleagues (Black et al 1990).
3.2 Structural credit assignment
The learning rules discussed in the previous section apply to each parallel fiber synapse, there being on the order of 10^15 of these synapses. If training information were conveyed by climbing fibers indiscriminately to all of these synapses, learning would be quite impractical. Since each PC is innervated by only one climbing fiber, and since climbing fibers transmit diverse training signals (Section 3.3), there is an opportunity for learning to be guided in an efficient manner. Achieving this, however, requires routing each training signal to the appropriate PCs in the network. The problem of doing this, called the structural credit assignment problem, is potentially very difficult. Houk and Barto (1992) hypothesized that the unique modular organization of the cerebellar cortex is an evolutionary adaptation that helps to alleviate the structural credit assignment problem. Small clusters of inferior olive (IO) neurons with similar receptive fields innervate parasagittally oriented strips of PCs called microzones (Ekerot et al, 1991), and the PCs comprising this set project to a common cluster of nuclear cells (Gibson et al, 1987). This anatomical organization occurs early in development (Hawkes et al. 1993) and ensures that each APG module will receive a training signal that is particular to that module.
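The routing idea can be made concrete with a small sketch in which a CF signal modifies only the PCs belonging to its own module. The data layout, the function name apply_module_training and the fixed depression rate are hypothetical choices for illustration; they are not part of the APG model as published.

```python
def apply_module_training(weights, module_of_pc, cf_fired, ltd_rate=0.1):
    """Depress PF weights only for PCs whose own module's CF fired.

    weights: list with one entry per PC, each a list of PF synaptic weights.
    module_of_pc: module (microzone) id for each PC, same order as weights.
    cf_fired: set of module ids whose climbing fibers fired on this trial.
    ltd_rate: assumed fractional depression per CF event.
    """
    updated = []
    for pc_weights, module in zip(weights, module_of_pc):
        if module in cf_fired:
            # Training signal is private to this module's PCs.
            updated.append([w * (1.0 - ltd_rate) for w in pc_weights])
        else:
            # PCs in other modules are left untouched.
            updated.append(list(pc_weights))
    return updated

# Three PCs, the first two in module 0 and the third in module 1.
example = apply_module_training([[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
                                module_of_pc=[0, 0, 1], cf_fired={1})
```

Only the PC in module 1 is modified here. If the same signal were broadcast to every PC, each module would be trained by errors it had no part in producing.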
Figure 10 illustrates schematically 30 of the several hundred thousand modules present in the mammalian cerebellum, as if looking down on the surface of the cerebellar cortex. For simplicity, we show only sets of 5, out of the 100 PCs estimated to participate in each APG module. All of the PCs in a given set converge upon a discrete nuclear cell cluster (N); one loop to the motor cortex (M) is shown forming an elemental motor command. Climbing fibers run parasagittally (vertically on the page) and innervate small numbers of PCs within a given set. Parallel fibers run horizontally, intersecting a large number of the rectangular-shaped dendritic trees of the PCs. Note that each PC in a set is shown exposed to a different 100,000-element vector of PFs, such that the 100 PCs comprising an APG have a grand total of 10,000,000 PFs from which to select input. (There most likely is some overlap, so the actual number of PFs might instead be 2,000,000, still a very large number.) Since each APG receives its own semi-private training signal, the storage potential of the network is indeed quite exceptional (Gilbert, 1974).
While the modular organization shown in Figure 10 has great potential for appropriate credit assignment, due to precision of connectivity in the parasagittal plane, the realization of this potential requires a precise alignment between the elemental commands generated by a given module and the training information conveyed by its CF input. The elemental command has to be one capable of diminishing the firing probability of the CF (Houk & Barto, 1992). In the case of CFs with tactile receptive fields, the appropriate alignment is one that would mimic the spinal withdrawal reflex elicited by a noxious stimulus applied to its receptive field (Ekerot et al, 1991; Houk & Barto, 1992). If it is assumed that the premotor network is capable of Hebbian-like, NMDA-mediated synaptic plasticity, one can outline a plausible developmental sequence that could automatically perform this alignment (Guzmán-Lara, 1993). Once alignment is completed, the module would be ideally structured to elaborate conditioned withdrawal responses, and the same organization would be useful in developing motor programs for guiding the limb in a workspace containing obstacles. Given our hypothesis that CFs with proprioceptive receptive fields detect when movements are too small (Berthier et al, 1993; Section 3.3), we postulated that the elemental command produced by the corresponding APG should move the limb in the direction that maximally activates the CF. With proper alignment, the feedback error learning scheme discussed in Section 3.3 (Kawato & Gomi, 1993) could also use this mechanism to achieve excellent structural credit assignment.
Cerebellar models of conditioning and of eye movement control have not yet confronted the structural credit assignment problem. This is because they have generally dealt with single control modules (typically single PCs) that produce net commands as opposed to sets of modules regulating sets of elemental commands. The tacit assumption here is that all PCs in, say, the horizontal zone of the flocculus have identical output connections and thus can be represented by a single equivalent PC. If there is only one PC and only one CF in the model, there is no need to design an alignment scheme. However, this means that the model is not suitable for investigating the mechanisms that normally operate to coordinate the cooperative actions of many modules working in parallel.
3.3 Training signals
Most theories of the cerebellum assume that climbing fibers from the inferior olive transmit the essential training information that guides motor learning. However, different theories are based on different assumptions regarding the specific nature of these signals. Marr (1969; for an elaboration, see Blomfield & Marr, 1970) assumed that the IO transmits specific instructions from the motor cortex designating which elemental movements need to be executed. This begs the question as to how the motor cortex acquires such elaborate knowledge about a large set of required movements. Albus (1971) assumed that the IO compares sensory feedback with desired trajectories to signal errors in performance. His desired trajectories are less demanding although inherently similar to Marr's instructed movements. Neither theory confronts the problem of how internal standards might be acquired.
Recordings from IO neurons, or from their CF axons, provide useful constraints on the type of training information that is available. The different regions of the IO receive various combinations of sensory fibers and collaterals of motor fibers (Bloedel & Courville, 1981) signaling efference copies. The electrophysiology of the ascending sensory pathways from the spinal cord originally suggested a very limited responsiveness to low-threshold somatosensory signals, leading Oscarsson (1980) to postulate that the IO computes error signals that are dominated by efference copy inputs. However, in awake animals somatosensory responsiveness is marked, both to tactile and to proprioceptive stimuli; furthermore, the motor responses stressed by Oscarsson are difficult to demonstrate, except as inhibitory influences that gate off somatosensory responsiveness during certain phases of movement (Gellman et al 1985; Weiss et al 1990). The regions of cerebellum that regulate smooth eye movements receive CFs that are directionally selective to very low velocities of visual motion across the retina; some are sensitive to retinal slip of large, optokinetic images (Simpson 1984), and others are sensitive to slip of small, visual pursuit targets (Stone & Lisberger, 1990). Retinal slip is a natural error signal since it designates a failure of the smooth eye control system to stabilize visual images on the retina. It is a sensory signal as opposed to an efference copy.
Kawato and Gomi (1992b; 1993) refined and developed Oscarsson's error hypothesis into a theory of feedback-error learning, summarized in Figure 11A. According to this theory, the IO transmits a motor error signal that is generated by a simple feedback controller. The difference between a desired trajectory and sensory feedback reporting on the actual trajectory forms a trajectory error analogous to the error signal in Albus' theory. However, in Kawato's theory the trajectory error is processed by the feedback controller to convert it into a motor error signal, which is a vector with quite desirable training capabilities. The motor error also serves as a crude motor command that is eventually replaced by an improved motor command, after the cerebellar cortex has learned the inverse model that was described in Section 2.4. While this theory functions well in robot manipulation tasks, its consistency with the anatomy and physiology of different cerebellar control systems needs to be examined.
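The logic of feedback-error learning can be illustrated on a toy problem. The sketch below is not Kawato and Gomi's model: the first-order plant, the proportional feedback controller, the two-weight linear inverse model, the sinusoidal desired trajectory and all numerical values are assumptions chosen only to show how the output of the feedback controller can serve simultaneously as a crude command and as a training signal.

```python
import math

def feedback_error_learning(a=1.0, b=2.0, kp=5.0, eta=1.0,
                            dt=0.01, t_end=100.0):
    """Feedback-error learning on an assumed first-order plant.

    Plant (assumed):        y[k+1] = y[k] + dt * (-a*y[k] + b*u[k])
    Learned inverse model:  u_ff = w[0]*yd_dot + w[1]*yd
    Feedback controller:    u_fb = kp * (yd - y)   (the "motor error")
    Returns the learned weights and the mean |u_fb| over the first and
    last periods of the desired trajectory.
    """
    n_steps = int(round(t_end / dt))
    w = [0.0, 0.0]
    y = 0.0
    fb_abs = []
    for k in range(n_steps):
        yd = math.sin(k * dt)                          # desired trajectory
        yd_dot = (math.sin((k + 1) * dt) - yd) / dt    # forward difference
        u_fb = kp * (yd - y)                           # crude command
        u_ff = w[0] * yd_dot + w[1] * yd               # improved command
        # Net command is the sum of the learned and the feedback commands.
        y += dt * (-a * y + b * (u_ff + u_fb))
        # The motor error trains the inverse model.
        w[0] += eta * u_fb * yd_dot * dt
        w[1] += eta * u_fb * yd * dt
        fb_abs.append(abs(u_fb))
    window = int(round(2.0 * math.pi / dt))            # one trajectory period
    early = sum(fb_abs[:window]) / window
    late = sum(fb_abs[-window:]) / window
    return w, early, late

weights, early_fb, late_fb = feedback_error_learning()
```

For this plant the exact inverse model has weights 1/b and a/b, so the learned weights should approach those values while the feedback command shrinks toward zero. This corresponds to the crude command being replaced by the improved command in the course of learning.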
In the case of limb movements, Kawato and Gomi (1992a) assumed that the motor cortex functions as the simple feedback controller and that it also contains the two summing junctions. Motor cortex receives desired trajectory signals from the association cortex and actual trajectory information via sensory feedback. It computes the difference to form a trajectory error which is processed further to generate the motor error signal that is sent through the IO to provide training information to the cerebellum. The motor cortex is also the summation site where the improved command generated in the cerebellum is added to the motor error to produce the net command sent to the spinal cord. A basic problem with this model may be its assumption that IO activity reflects motor error. As pointed out earlier in this section, IO neurons are highly responsive to sensory input and are actually suppressed by motor signals. Another disadvantage is that feedback-error learning requires a higher authority, the association cortex, to produce a desired trajectory signal. Kawato's model suggests that the cerebellar cortex progressively takes over control from extracerebellar mechanisms in the course of learning, which is opposite to the shift that has been postulated by others (Galiana 1986; Peterson et al 1991; Houk & Barto, 1992; Houk et al 1992).
In the APG model, Houk and Barto (1992) accepted the sensory nature of IO signaling as a basis for the generation of training information. Tactile cells respond to light contact within a receptive field on the surface of the limb, and proprioceptive cells respond to limb movement in a particular direction (Gellman et al 1985). In both cases, responsiveness is suppressed during certain phases of the animal's movement. The model in Figure 11B attributes the suppression to inhibitory gating controlled by efference copy signals, which can occur in sensory relay neurons (SR) or directly in the IO. The tactile responses are inhibited just after a motor command ceases (Weiss et al, 1990), which is postulated to eliminate contact responses that would otherwise occur at the end of an accurate movement. This leaves uninhibited responses to contacts that occur when the limb bumps into an object during the movement, a simple indicator of motor error. The proprioceptive responses are instead inhibited during movement. In the APG array model (Berthier et al, 1993), we assumed that inhibition occurs only during primary movements, leaving the IO responsive during secondary corrective movements, which seems to be in agreement with single unit data (Gilbert & Thach, 1977; Gellman et al 1985). Because proprioceptive neurons are tuned to different directions of movement, different units in the network detect different directions of corrective movement, which provides a form of supervised training information in the model.
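The gating of tactile responses can be sketched as a simple logical operation on two time series. The discrete time steps, the fixed gate duration (gate_steps) and the function name are hypothetical; the sketch only illustrates how an efference copy could remove end-of-movement contacts while preserving contacts that signal a collision.

```python
def gated_cf_signal(motor_command, contact, gate_steps=3):
    """Hypothetical efference-copy gating of tactile IO responses.

    motor_command: list of bools, True while a motor command is active.
    contact: list of bools, True when the receptive field is touched.
    gate_steps: assumed number of steps after command offset during which
                tactile responses are inhibited.
    Returns a list of bools, True where a CF training signal is emitted.
    """
    cf = []
    steps_since_offset = None  # None until a command has ended
    for t, (cmd, touch) in enumerate(zip(motor_command, contact)):
        if t > 0 and motor_command[t - 1] and not cmd:
            steps_since_offset = 0          # command has just ceased
        elif steps_since_offset is not None:
            steps_since_offset += 1
        gated = (steps_since_offset is not None
                 and steps_since_offset < gate_steps)
        # A contact reaches the CF only when the gate is open.
        cf.append(touch and not gated)
    return cf

command = [True, True, True, True, False, False, False, False, False, False]
touch = [False, False, True, False, False, True, False, False, False, True]
cf_out = gated_cf_signal(command, touch)
```

In this example the contact during the movement (step 2) and the contact well after the gate has reopened (step 9) produce CF signals, whereas the contact just after the command ceases (step 5) is suppressed.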
The assumption that IO training signals derive from simple somatosensory properties helps to address the internal standards problem mentioned earlier. The fact that low-threshold receptive fields are aligned with nociceptive fields for the same neurons (Ekerot et al 1991) suggested that IO neurons learn to respond to low-threshold predictors of nociceptive stimuli in the course of normal development (Houk & Barto, 1992). According to this theory, nociception as a punishment signal provides the ultimate training information for the cerebellar network.
The origin of internal standards may not be an issue for smooth eye movements, since the negative image of a head movement, as sensed by vestibular receptors, can be thought of as the desired trajectory for stabilizing visual images on the retina. Furthermore, motion detectors in the retina directly transduce the trajectory errors of the optokinetic system. The accessory optic system routes crude commands directly to vestibular neurons and efference copies to the appropriate region of the IO (Kawato & Gomi, 1992b). If smooth pursuit is treated as a reflex, a similar, though somewhat more complex, argument can be made regarding this system. The movement of a visual target, as analyzed in the parietal cortex, could serve as the desired trajectory.
In applying feedback-error learning to saccadic eye movements, Dean and colleagues (1994) assumed that the retinal projection to the superficial layer of the tectum in effect computes an endpoint error in visual coordinates, as assumed earlier by Grossberg and Kuperstein (1989). The internal standard is thus replaced by the assumption that a target in the peripheral visual field needs to be foveated. Note that a trajectory error is not computed in this case, only an endpoint error. This endpoint error is then transformed by the intermediate layer of the tectum into a kind of motor error signal that was found to be suitable for driving feedback-error learning in a CMAC model of the cerebellum (Dean et al., 1994). Again, we would note that single unit recordings from CFs do not support this theory. The CFs that innervate the saccadic region of the cerebellum do not encode motor signals, although they do respond to proprioceptive inputs. In reviewing this issue, Houk, Galiana and Guitton (1992) suggested that these CFs might be sensitive to their proprioceptive input only during corrective saccades. This should be relatively easy to test experimentally.
Models of classical conditioning have assumed that the IO transmits the US, e.g., a strong puff of air to the cornea, as a training signal. These CFs should then mediate the associative learning of a CS (Thompson 1986). In analogy with IO neurons involved in limb control, those projecting to the eye blink system have exquisitely sensitive tactile receptive fields around the eye. This suggested to Weiss and colleagues (1993) that these IO neurons might be detecting eyelid closure, as opposed to the US, in which case they could provide training signals for adjusting the amplitude of the eye blink motor program. This would fit with the APG version of the conditioning model of the cerebellum discussed earlier (Fig. 6B). It also fits with conditioning studies of mutant mice lacking both mGluR1 receptors and cerebellar LTD (Aiba et al, 1994). These animals retain an ability to initiate CRs, but are impaired in their ability to regulate the amplitude of the response.
In summary, one can make a substantial case that the training information transmitted by the IO is based on relatively simple sensory responsiveness. A sensory origin of these training signals is advantageous since it largely circumvents the theoretical problem of providing an internal standard for judging motor performance.
3.4 Distributed learning
The cerebellar cortex is, of course, not the only CNS structure that mediates motor learning. While not a primary topic in the present article, perspective may be served by offering a brief statement of our views about how plasticity in the inferior olive, premotor networks, basal ganglia and cerebral cortex might interact with the cerebellar cortex to create an advantageous environment for overall motor learning (cf. Houk & Barto, 1992; Houk et al 1992; Houk, 1992; Barto 1995; Houk et al 1995; Houk & Wise, 1995).
The IO may be a site of learning that could improve its ability to generate training information. This seems likely in the course of development, during which IO neurons may acquire their ability to respond to low-threshold predictors of the nociceptive stimuli that also activate them (Section 3.3). Learning at the level of the IO may also operate in the adult, which could offer explanations for the puzzling CF responses that have been observed during some motor tasks (Mano et al 1986; Ojakangas & Ebner, 1992). In the Mano experiment, CFs responded in the interval between the random transitions in a visual target and the onsets of the movements that the animals made in attempting to follow the targets. Ito (1989) pointed out that the observed CF responses do, in fact, detect errors, namely the discrepancies between target motion and the animals' responses, which are delayed by a reaction time. Due to the random occurrences of target transitions, the observed responses might represent the earliest possible predictors of these errors. In a similar vein, the CF responses observed by Ojakangas and Ebner (1992) may represent acquired predictors of the corrective movements their animals used to compensate for gain changes in the visual target display. In both of these cases, we would postulate that IO neurons have somehow acquired responses to signals that predict the proprioceptive or tactile signals that activate these neurons at the onsets of the corrective movements. The IO system might also learn how to better use inhibitory gating to improve the quality of the training information that is supplied by CFs (Houk & Barto, 1992). Learned gating patterns could be transmitted by the inhibitory projections to the olive from the GABAergic cerebellar nuclear cells N in Figure 11B, since these neurons are probably adaptively controlled by input from PCs in the cerebellar cortex. Future microelectrode studies, if appropriately designed, might be able to detect adaptive alterations in IO responsiveness.
The cerebellar cortex is probably capable of transferring some of its motor program knowledge to the premotor network. Elsewhere we discussed how this could come about for limb and saccadic eye movements through Hebbian mechanisms supervised by PC forcing functions generated in the cerebellar cortex (Houk & Barto, 1992; Houk et al 1992). It has also been suggested that the cerebellum provides the information that is used to train the brainstem vestibulo-ocular reflex (Miles & Lisberger, 1981; Galiana, 1986; Peterson et al, 1991). The cerebellar cortex might acquire this information during task rehearsal, or it might already have adequate information and need only export it to the premotor network. Formerly it was predicted that premotor networks would learn relatively slowly as compared with the cerebellar cortex (Galiana 1986; Peterson et al 1991; Houk & Barto, 1992; Houk et al 1992). However, several recent studies suggest that rapid learning can occur at brainstem (Luebke & Robinson, 1992; Khater et al 1993) and cerebral (Sanes et al 1988; Raichle et al 1994) sites. In the Luebke and Robinson study, a process of deadaptation, which normally required only 30 minutes, was suspended by inactivation of the flocculus, supporting the hypothesis that the cerebellar cortex guides the learning. In the Raichle study, PET scans showed metabolic activity associated with a cognitive process moving from the lateral cerebellum to a sylvian site in the cerebral cortex following less than 15 minutes of practice. These cases are consistent with the hypothesis that the cerebellar cortex can export knowledge that it has previously acquired to neurons that are the targets of its regulation. This should be a fruitful area for future investigation.
Learning how to perform complex behavioral acts clearly requires the cerebral cortex and basal ganglia. We envision that these structures learn to detect and register events and contexts that are potentially useful in planning and programming motor actions (Houk & Wise, 1995). The cerebellum, once provided with this information through the cortico-ponto-cerebellar pathway, must then learn how to use it in an optimal fashion to orchestrate its own participation in complex motor acts.
Adams, J.A. (1977) Feedback theory of how joint receptors regulate the timing and positioning of a limb. Psych. Rev 84: 504-523
Aiba, A., Kano, M., Chen, C., Stanton, M.E., Fox, G.D., Herrup, K., Zwingman, T.A., Tonegawa, S. (1994) Deficient cerebellar long-term depression and impaired motor learning in mGluR1 mutant mice. Cell 79: 377-388
Albus, J.S. (1971) A theory of cerebellar function. Math. Biosci 10: 25-61
Albus, J.S. (1975) A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Trans. ASME J. Dynamic Systems, Measurement, and Control 97: 220-227
Albus, J.S. (1981) Brains, Behavior and Robotics. Peterborough, NH, Byte Books
Arbib, M.A., Boylls, C.C., Dev, P. (1974) Neural models of spatial perception and the control of movement. In: R. Oldenbourg (ed) Kybernetik und Bionik/Cybernetics, pp 216-231
Arshavsky, Y.I., Gelfand, I.M., Orlovsky, G.N. (1986) Cerebellum and rhythmical movements. In: V. Braitenberg (ed) Studies of Brain Function, Vol 13. Berlin, Springer-Verlag
Barto, A.G. (1995) Adaptive critics and the basal ganglia. In: J.C. Houk, J.L. Davis, D.G. Beiser (eds) Models of Information Processing in the Basal Ganglia, Ch 11. Cambridge, MIT Press, pp 215-232
Bell, C.C. (1994) The generation of expectations in cerebellum-like structures. The Neurobiology of Computation: Proceedings of the Annual Computational Neuroscience Meeting
Bell, C.C., Grimm, R.J. (1969) Discharge properties of Purkinje cells recorded on single and double microelectrodes. Journal of Neurophysiology 32: 1044-1055
Berthier, N.E., Moore, J.W. (1986) Cerebellar Purkinje cell activity related to the classically conditioned nictitating membrane response. Exp. Brain Res 63: 341-350
Berthier, N.E., Singh, S.P., Barto, A.G., Houk, J.C. (1993) Distributed representation of limb motor programs in arrays of adjustable pattern generators. J. Cognitive Neuroscience 5: 56-78
Black, J.E., Isaacs, K.R., Anderson, B.J., Alcantara, A.A., Greenough, W.T. (1990) Learning causes synaptogenesis, whereas motor activity causes angiogenesis, in cerebellar cortex of adult rats. Proc. Natl. Acad. Sci. USA 87: 5568-5572
Bliss, T.V.P., Collingridge, G.L. (1993) A synaptic model of memory: long-term potentiation in the hippocampus. Nature 361: 31-39
Bloedel, J.R. (1992) Functional heterogeneity with structural homogeneity: how does the cerebellum operate? Behavioral and Brain Sciences 15: 666-678
Bloedel, J.R., Courville, J. (1981) Cerebellar afferent systems. In: Brookhart, J., Mountcastle, V., Brooks, V., Geiger, S. (eds) Handbook of Physiology, Sect. 1. The Nervous System. Motor Control, Vol II. Bethesda, MD, American Physiological Society, pp 735-829
Blomfield, S., Marr, D. (1970) How the cerebellum may be used. Nature 227: 1224-1228
Boylls, C.C. (1975) A Theory of Cerebellar Function with Applications to Locomotion, COINS Tech. Rep. 76-1, Amherst, MA
Braitenberg, V., Atwood, R.P. (1958) Morphological observations on the cerebellar cortex. J. Comp. Neurol 109: 1-27
Buckingham, J.T., Houk, J.C., Barto, A.G. (1994) Controlling a nonlinear spring-mass system with a cerebellar model. In: Proceedings of the Eighth Yale Workshop on Adaptive and Learning Systems.
Buonomano, D.V., Mauk, M.D. (1994) Neural network model of the cerebellum: temporal discrimination and the timing of motor responses. Neural Computation 6: 38-55
Cannon, S.C., Robinson, D.A. (1987) Loss of the neural integrator of the oculomotor system from brainstem lesions in monkey. J. Neurophysiol 57: 1383-1409
Carter, T.L., McElligott, J.G. (1994) Metabotropic glutamate receptor antagonist (L-AP3) inhibits vestibulo-ocular reflex adaptation when administered into goldfish vestibulo-cerebellum. Soc. Neurosci. Abs 20: 17.10
Chapeau-Blondeau, F., Chauvet, G. (1991) A neural network model of the cerebellar cortex performing dynamic associations. Biol. Cybern 65: 267-279
Chen, C., Thompson, R.F. (1992) Associative long-term depression revealed by field potential recording in rat cerebellar slice. Soc. Neurosci. Abs 18: 508.2
Conquet, F., Bashir, Z.I., Davies, C.H., Daniel, H., Ferraguti, F., Bordi, F., Franz-Bacon, K., Reggiani, A., Matarese, V., Conde, F., Collingridge, G., Crepel, F. (1994) Motor deficit and impairment of synaptic plasticity in mice lacking mGluR1. Nature 372: 237-243
Crepel, F., Hemart, N., Jaillard, D., Daniel, H. (this volume) Cellular mechanisms of long-term depression in the cerebellum
Dean, P., Mayhew, J.E.W., Langdon, P. (1994) Learning and maintaining saccadic accuracy: a model of brainstem-cerebellar interactions. J. Cognitive Neuroscience 6: 117-138
Desmond, J.E., Moore, J.W. (1991) Single-unit activity in red nucleus during the classically conditioned rabbit nictitating membrane response. Neurosci. Res 10: 260-279
Eisenman, L.N., Keifer, J., Houk, J.C. (1991) Positive feedback in the cerebro-cerebellar recurrent network may explain rotation of population vectors. In: Eeckman, F. (ed) Analysis and Modeling of Neural Systems. Kluwer Academic Publishers, pp 371-376
Ekerot, C.-F., Garwicz, M., Schouenborg, J. (1991) Topography and nociceptive receptive fields of climbing fibres projecting to the cerebellar anterior lobe in the cat. J. Physiol. London 441: 257-274
Ekerot, C.-F., Kano, M. (1989) Stimulation parameters influencing climbing fibre induced long-term depression of parallel fibre synapses. Neuroscience Research 6: 264-268
Florens, P. (1824) Recherches Expérimentales sur les Propriétés et les Fonctions du Système Nerveux dans les Animaux Vertébrés. Paris, Crevot
Fuchs, A.F., Robinson, F.R., Straube, A. (1993) Role of the caudal fastigial nucleus in saccade generation. I. Neuronal discharge pattern. Journal of Neurophysiology 70: 1723-40
Fujita, M. (1982) Adaptive filter model of the cerebellum. Biol. Cybern 45: 195-206
Galiana, H.L. (1985) Commissural vestibular nuclear coupling: a powerful putative site for producing adaptive change. In: A. Berthoz, G. Melvill Jones (eds) Adaptive Mechanisms in Gaze Control: Facts and Theories, Ch 22. Elsevier Science Pub. BV, pp 327-339
Galiana, H.L. (1986) A new approach to understanding adaptive visual-vestibular interactions in the central nervous system. Journal of Neurophysiology 55: 349-374
Galiana, H.L., Guitton, D. (1992) Central organization and modelling of eye-head coordination during orienting gaze shifts. In: B. Cohen, D.L. Tomka, F. Guedry (eds) Sensing and Controlling Motion: Vestibular and Sensorimotor Function, Vol 656. Ann. N.Y. Acad. Sci., pp 452-471
Galiana, H.L., Outerbridge, J.S. (1984) A bilateral model for central neural pathways in the vestibuloocular reflex. Journal of Neurophysiology 51: 210-241
Gellman, R., Gibson, A.R., Houk, J.C. (1985) Inferior olivary neurons in the awake cat: Detection of contact and passive body displacement. J. Neurophysiol 54: 40-60
Ghez, C., Hening, W., Favilla, M. (1990) Parallel interacting channels in the initiation and specification of motor response features. In: M. Jeannerod (ed) Attention and Performance XIII: Motor Representation and Control, Vol Ch. 8. Hillsdale, New Jersey, Lawrence Erlbaum Associates, pp 265-293
Gibson, A.R., Robinson, F.R., Alam, J., Houk, J.C. (1987) Somatotopic alignment between climbing fiber input and nuclear output of the intermediate cerebellum. J. Comp. Neurol 260: 362-377
Gielen, C.C.A.M., van Gisbergen, J.A.M. (1990) The visual guidance of saccades and fast aiming movements. News in Physiol. Sci 5: 58-63
Gilbert, P.F.C. (1974) A theory of memory that explains the function and structure of the cerebellum. Brain Res 70: 1-18
Gilbert, P.F.C. (1975) How the cerebellum could memorize movements. Nature (London) 254: 688-689
Gilbert, P.F.C., Thach, W.T. (1977) Purkinje cell activity during motor learning. Brain Res 128: 309-328
Gluck, M.A., Thompson, R.F. (1990) Adaptive signal processing and the cerebellum: models of classical conditioning and VOR adaptation. In: M.A. Gluck, D.E. Rumelhart (eds) Neuroscience and Connectionist Theory, Ch 4. Hillsdale, N.J., Lawrence Erlbaum Assoc., pp 131-185
Gomi, H., Kawato, M. (1992) Adaptive feedback control models of the vestibulocerebellum and spinocerebellum. Biological Cybernetics 68: 105-114
Grossberg, S., Kuperstein, M. (1989) Neural Dynamics of Adaptive Sensory-motor Control. New York, Pergamon Press, p 66
Guitton, D., Munoz, D.P., Galiana, H.L. (1990) Gaze control in the cat: studies and modeling of the coupling between orienting eye and head movements in different behavioral tasks. J. Neurophysiol 64: 509-531
Guzmán-Lara, S. (1993) Adjusting connections using reflexes as guidance. NPB Tech Rept 8, Northwestern Univ Inst Neuroscience
Harvey, R.J., Porter, R., Rawson, J.A. (1977) The natural discharges of Purkinje cells in paravermal regions of lobules V and VI of the monkey's cerebellum. J. Physiol 271: 515-536
Harvey, R.J., Porter, R., Rawson, J.A. (1979) Discharges of intracerebellar nuclear cells in monkeys. J. Physiol 297: 559-580
Hawkes, R., Blyth, S., Chockkan, V., Tano, D., Ji, Z., Mascher, C. (1993) Structural and molecular compartmentation in the cerebellum. Can. J. Neurol. Sci 20: S29-S35
Houk, J.C. (1989) Cooperative control of limb movements by the motor cortex, brainstem and cerebellum. In: R.M.J. Cotterill (ed) Models of Brain Function. Cambridge, Cambridge Univ Press, pp 309-325
Houk, J.C. (1990) Role of cerebellum in classical conditioning. Soc. Neurosci. Abstr 16: 474
Houk, J.C. (1992) Learning in modular networks. In: K.S. Narendra (ed) Proceedings 7th Yale Workshop on Adaptive and Learning Systems. New Haven, Ctr. Sys. Sci., pp 80-84
Houk, J.C., Adams, J.L., Barto, A.G. (1995) A model of how the basal ganglia generates and uses neural signals that predict reinforcement. In: J.C. Houk, J.L. Davis, D.G. Beiser (eds) Models of Information Processing in the Basal Ganglia, Ch 13. Cambridge, MIT Press, pp 249-270
Houk, J.C., Barto, A.G. (1992) Distributed sensorimotor learning. In: G.E. Stelmach, J. Requin (eds) Tutorials in Motor Behavior II. Amsterdam, Elsevier, pp 71-100
Houk, J.C., Galiana, H.L., Guitton, D. (1992) Cooperative control of gaze by the superior colliculus, brainstem and cerebellum. In: G.E. Stelmach, J. Requin (eds) Tutorials in Motor Behavior II. Amsterdam, Elsevier, pp 443-474
Houk, J.C., Gibson, A.R. (1987) Sensorimotor processing through the cerebellum. In: King, J.S. (ed) New Concepts in Cerebellar Neurobiology. New York, NY, Alan R. Liss, Inc., pp 387-416
Houk, J.C., Keifer, J., Barto, A.G. (1993) Distributed motor commands in the limb premotor network. Trends in Neuroscience 16: 27-33
Houk, J.C., Singh, S.P., Fisher, C., Barto, A.G. (1990) An adaptive sensorimotor network inspired by the anatomy and physiology of the cerebellum. In: Miller, W.T., Sutton, R.S., Werbos, P.J. (eds) Neural Networks for Control, Ch 13. Cambridge, Mass., MIT Press, pp 301-348
Houk, J.C., Wise, S.P. (1995) Distributed modular architectures linking basal ganglia, cerebellum and cerebral cortex: their role in planning and controlling action. Cerebral Cortex 5: 95-110
Ito, M. (1969) Neurons of cerebellar nuclei. In: M.A.B. Brazier (ed) The Interneuron. 1969 UCLA Forum, pp 309-327
Ito, M. (1970) Neurophysiological aspects of the cerebellar motor control system. Int. J. Neurology 7: 162-176
Ito, M. (1984) The Cerebellum and Neural Control. New York, Raven Press
Ito, M. (1989) Long-term depression. Ann. Rev. Neurosci 12: 85-102
Ito, M. (1993) Movement and thought: identical control mechanisms by the cerebellum. Trends in Neurosciences 16: 448-450
Kawato, M. (1990) Computational schemes and neural network models for formation and control of multijoint arm trajectory. In: Miller, W.T., Sutton, R.S., Werbos, P.J. (eds) Neural Networks for Control. Cambridge, MA, MIT Press
Kawato, M., Gomi, H. (1992a) A computational model of four regions of the cerebellum based on feedback-error learning. Biological Cybernetics 68: 95-103
Kawato, M., Gomi, H. (1992b) The cerebellum and VOR/OKR learning models. Trends in Neurosciences 15: 445-453
Kawato, M., Gomi, H. (1993) Feedback-error-learning model of cerebellar motor control. In: N. Mano, I. Hamada, M.R. DeLong (eds) Role of the Cerebellum and Basal Ganglia in Voluntary Movement. Elsevier Science Pub. B.V., pp 51-61
Keifer, J., Houk, J.C. (1995) In vitro classical conditioning of abducens nerve discharge in turtles. Journal of Neuroscience
Khater, T.T., Quinn, K.J., Pena, J., Baker, J.F., Peterson, B.W. (1993) The latency of the cat vestibulo-ocular reflex before and after short- and long-term adaptation. Experimental Brain Research 94: 16-32
Klopf, A.H. (1982) The Hedonistic Neuron: A Theory of Memory, Learning and Intelligence. New York, Harper and Row Hemispheres
Krommenhoek, K.P., Van Opstal, A.J., Gielen, C.C.A.M., Van Gisbergen, J.A.M. (1993) Remapping of neural activity in the motor colliculus: a neural network study. Vision Research 33: 1287-1298
Leiner, H.C., Leiner, A.L., Dow, R.S. (1989) Reappraising the cerebellum: what does the hindbrain contribute to the forebrain? Behavioral Neuroscience 103: 998-1008
Linden, D.J. (this volume) Cerebellar long-term depression as investigated in a cell culture preparation
Linden, D.J. (1994) Long-term synaptic depression in the mammalian brain. Neuron 12: 457-472
Linden, D.J., Dickinson, M.H., Smeyne, M., Connor, J.A. (1991) A long-term depression of AMPA currents in cultured cerebellar Purkinje neurons. Neuron 7: 81-89
Lisberger, S.G. (1994) Neural basis for motor learning in the vestibuloocular reflex of primates. III. Computational and behavioral analysis of the sites of learning. Journal of Neurophysiology 72: 974-998
Llinás, R., Welsh, (1993) On the cerebellum and motor learning. Current Opinion in Neurobiology 3: 958-965
Luebke, A.E., Robinson, D.A. (1992) Climbing fiber intervention blocks plasticity of the vestibuloocular reflex. Annals N.Y. Acad. Sci 656: 428-430
Mahamud, S., Barto, A.G., Kettner, R.E., Houk, J.C. (1995) A model of prediction in smooth eye movements. Fourth Ann. Compt. Neural Systems Meeting
Mano, N., Kanazawa, I., Yamamoto, K. (1986) Complex-spike activity of cerebellar Purkinje cells related to wrist tracking movement in monkey. J. Neurophysiol 56: 137-158
Marr, D. (1969) A theory of cerebellar cortex. J. Physiol. London 202: 437-470
McIlwain, J. (1986) Effects of eye position on saccades evoked electrically from superior colliculus of alert cats. J. Neurophysiol 55: 97-112
Miall, R.C., Weir, D.J., Wolpert, D.M., Stein, J.F. (1993) Is the cerebellum a Smith predictor? J. Motor Behavior 25: 203-216
Middleton, F.A., Strick, P.L. (1994) Anatomical evidence for cerebellar and basal ganglia involvement in higher cognitive function. Science 266: 458-461
Miles, F.A., Lisberger, S.G. (1981) Plasticity in the vestibulo-ocular reflex: a new hypothesis. Ann. Rev. Neurosci 4: 273-299
Miller, W.T. (1987) Sensor-based control of robotic manipulators using a general learning algorithm. IEEE J. Robotics & Automation RA-3: 157-165
Moore, J.W., Desmond, J.E., Berthier, N.E. (1989) Adaptively timed conditioned responses and the cerebellum: a neural network approach. Biol. Cybern 62: 17-28
Mugnaini, E., Maler, L. (1993) Comparison between the fish electrosensory lateral line lobe and the mammalian dorsal cochlear nucleus. In: C.C. Bell, C.D. Hopkins, K. Grant (eds) Contributions of Electrosensory Systems to Neurobiology and Neuroethology, Vol 173. J. Comparative Physiology A, pp 683-685
Ojakangas, C.L., Ebner, T.J. (1992) Purkinje cell complex and simple spike changes during a voluntary arm movement learning task in the monkey. J. Neurophysiol 68: 2222-2236
Optican, L.M., Robinson, D.A. (1980) Cerebellar-dependent adaptive control of primate saccadic system. J. Neurophysiol 44: 1058-1076
Oscarsson, O. (1980) Functional organization of olivary projection to the cerebellar anterior lobe. In: J. Courville, C. de Montigny, Y. Lamarre (eds) The Inferior Olivary Nucleus: Anatomy and Physiology. New York, Raven Press, pp 279-289
Partsalis, A.M., Zhang, Y., Highstein, S.M. (1995) Dorsal Y group in the squirrel monkey. II. Contribution of the cerebellar flocculus to neuronal responses in normal and adapted animals. J. Neurophysiology 73: 632-650
Paulin, M. (1989) A Kalman filter theory of the cerebellum. In: M.A. Arbib, S. Amari (eds) Dynamic Interactions in Neural Networks: Models and Data. Springer-Verlag, pp 241-259
Peterson, B.W., Baker, J.F., Houk, J.C. (1991) A model of adaptive control of vestibuloocular reflex based on properties of cross-axis adaptation. Annals N.Y. Acad. Sci 627: 319-337
Peterson, B.W., Houk, J.C. (1991) A model of cerebellar-brainstem interaction in the adaptive control of the vestibuloocular reflex. Acta Otolaryngol (Stockh) 481: 428-432
Prochazka, A. (1989) Sensorimotor gain control: a basic strategy of motor systems? Progr. Neurobiol 33: 281-307
Raichle, M.E., Fiez, J.A., Videen, T.O., MacLeod, A.K., Pardo, J.V., Fox, P.T., Petersen, S.E. (1994) Practice-related changes in human brain functional anatomy during nonmotor learning. Cerebral Cortex 4: 8-26
Robinson, D.A. (1975) Oculomotor control signals. In: Lennerstrand, G., Bach-y-Rita, P. (eds) Basic Mechanisms of Ocular Motility and their Clinical Implications. Oxford, Pergamon Press, pp 337-374
Robinson, D.A. (1976) Adaptive gain control of the vestibulo-ocular reflex by the cerebellum. J. Neurophysiol 39: 954-969
Robinson, D.A. (1987) Why visuomotor systems don't like negative feedback and how they avoid it. In: M.A. Arbib, A.R. (ed) Vision, Brain and Cooperative Computation. Cambridge, MIT Press, pp 89-107
Robinson, D.A., Optican, L.M. (1981) Adaptive plasticity in the oculomotor system. In: H. Flohr, W. Precht (eds) Lesion Induced Neuronal Plasticity in Sensorimotor Systems. Berlin, Springer-Verlag, pp 295-304
Sakurai, M. (1987) Synaptic modification of parallel fibre-Purkinje cell transmission in in vitro guinea-pig cerebellar slices. J. Physiol. London 394: 463-480
Sanes, J.N., Suner, S., Lando, J.F., Donoghue, J.P. (1988) Rapid reorganization of adult rat motor cortex somatic representation patterns after motor nerve injury. Proc. Natl. Acad. Sci. USA 85: 2003-2007
Sarrafizadeh, R. (1994) Sensory triggering of limb motor programs: neural correlates of decisions for action. NPB Tech Rept 9, Northwestern Univ Inst Neuroscience
Schmidt, R.A. (1988) Motor Control and Motor Learning. Champaign, Illinois, Human Kinetics
Scudder, C.S. (1988) A new local feedback model of the saccadic burst generator. J. Neurophysiol 59: 1455-1475
Shidara, M., Kawano, K., Gomi, H., Kawato, M. (1993) Inverse-dynamics model of eye movement control by Purkinje cells in the cerebellum. Nature 365: 50-52
Simpson, J.I. (1984) The accessory optic system. Annual Review of Neuroscience 7: 13-41
Sinkjaer, T., Wu, C.H., Barto, A., Houk, J.C. (1990) Cerebellum control of endpoint position-a simulation model. IJCNN 90 II: 705-710
Sparks, D.L. (1988) Neural cartography: sensory and motor maps in the superior colliculus. Brain Behav. Evol 31: 49-56
Stone, L.S., Lisberger, S.G. (1990) Visual responses of Purkinje cells in the cerebellar flocculus during smooth-pursuit eye movements in monkeys. II. Complex spikes. J. Neurophysiology 63: 1262-1275
Sutton, R.S., Barto, A.G. (1981) Toward a modern theory of adaptive networks: expectation and prediction. Psych. Rev 88: 135-170
Thach, W.T., Goodkin, H.P., Keating, J.G. (1992) The cerebellum and the adaptive coordination of movement. Annu. Rev. Neurosci 15: 403-442
Thompson, R.F. (1986) The neurobiology of learning and memory. Science 233: 941-947
Tsukahara, N., Korn, H., Stone, J. (1968) Pontine relay from cerebral cortex to cerebellar cortex and nucleus interpositus. Brain Res 10: 448-453
Tyrrell, T., Willshaw, D.J. (1992) Cerebellar cortex: its simulation and the relevance of Marr's theory. Royal Soc. Lond. B 336: 239-257
Van Gisbergen, J.A.M., Robinson, D.A., Gielen, S. (1981) A quantitative analysis of saccadic eye movements by burst neurons. J. Neurophysiol 45: 417-442
Van Gisbergen, J.A.M., Van Opstal, A.J., Hoeks, B. (1989) The transformation of the collicular motor map into rapid eye movements: implications of a nonorthogonal muscle system. In: Personnaz, L., Dreyfus, G. (eds) Neural Networks from Models to Applications. Paris, I.D.S.E.T., pp 88-96
Van Kan, P.L.E., Gibson, A.R., Houk, J.C. (1993) Movement-related inputs to intermediate cerebellum of the monkey. Journal of Neurophysiology 69: 74-94
Weiss, C., Disterhoft, J.F., Gibson, A.R., Houk, J.C. (1993) Receptive fields of single cells from the face zone of the cat rostral dorsal accessory olive. Brain Research 605: 207-213
Weiss, C., Houk, J.C., Gibson, A.R. (1990) Inhibition of sensory responses of cat inferior olive neurons produced by stimulation of red nucleus. J. Neurophysiology 64: 1170-1185
Young, L.R. (1977) Pursuit eye movements -- what is being pursued? In: R. Baker, A. Berthoz (eds) Control of Gaze by Brain Stem Neurons, pp 29-36
Yuen, G.L., Hockberger, P.E., Houk, J.C. (1995) Bistability in cerebellar Purkinje cell dendrites modelled with high-threshold calcium and delayed-rectifier potassium channels. Biological Cybernetics
Zipser, D., Andersen, R.A. (1988) A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature 331: 679-684
[Note: Figures available only in hard copy version]
FIGURE LEGENDS
Figure 1: Position of the cerebellar cortex in motor control. Basic motor control actions are implemented by premotor networks in the brainstem, sensorimotor cortex and spinal cord. The cerebellar cortex regulates premotor actions through inhibition and disinhibition. The inferior olive transmits the climbing fiber information that trains Purkinje cells in the cerebellar cortex how to perform their regulatory functions. In this and succeeding illustrations, regular arrow heads denote predominantly excitatory connections, closed arrows denote inhibition and open arrows denote training influences.
Figure 2: Organization of the limb premotor network and its regulation by cerebellar Purkinje cell (PC) inhibition. Neural stages in the limb premotor network are: N, cerebellar nuclear cells; T, thalamic relays; M, neurons in the primary motor cortex; R, neurons in the magnocellular red nucleus; P, pontine neurons; L, lateral reticular neurons.
Figure 3: Organization of the premotor network controlling smooth eye movements in the horizontal plane. Neural stages are: V, medial vestibular neurons; P, prepositus hypoglossi neurons; PV, intermediate types.
Figure 4: Modular architecture of the adjustable pattern generator (APG) array model of the cerebellum. PCs, Purkinje cells; B, basket cell; PFs, parallel fibers; N, cerebellar nuclear cells; M, neurons in primary motor cortex. Inhibitory interneurons in a module, the Golgi, basket and stellate cells, are denoted with stippled cell bodies and axons.
Figure 5: Signals utilized by the APG model in selecting and regulating the execution of a motor program. The instruction stimulus fires basket (B) cells and causes state transitions in Purkinje cells (PCs). The trigger stimulus initiates positive feedback in those M-N loops that are disinhibited. The inserts on the right illustrate the model of dendritic bistability in PCs and the concept that bistability in several dendrites (D1 through D5) gives rise to multistability in the soma (S) of a PC.
Figure 6: Cerebellar regulation of conditioned reflexes (CRs). Part A shows a simple model of CR generation: PC, Purkinje cells; N, cerebellar nuclear cells; R, red nucleus cells; VI, cells in the abducens nucleus; V, cells in the trigeminal nucleus. Part B shows a modified model of CR generation based on the APG theory. L, lateral reticular nucleus; RF, reticular formation; IO, inferior olive.
Figure 7: Cerebellar regulation of saccadic eye movements. Part A shows a model of the tecto-reticulo-cerebellar network. PC, Purkinje cells; N, cerebellar nuclear cells; T, tectal neurons; LL, long-lead burst neurons in reticular formation. Part B shows a model of cerebellar regulation of the brainstem burst-generating network. PC, other Purkinje cells; N, other nuclear cells; OP, omnipause neurons; EB, excitatory burst neurons; IB, inhibitory burst neurons.
Figure 8: Cerebellar regulation of smooth eye movements in the horizontal plane. PC, Purkinje cells; V, medial vestibular neurons. Also see Figure 3.
Figure 9: Cellular mechanisms in a learning rule for long-term depression (LTD). Stippled box defines boundaries of a spine. According to this model, LTD requires three factors: parallel fiber input to the spine (a presynaptic factor), dendritic depolarization produced by responses to other parallel fibers (a postsynaptic factor), and dendritic depolarization produced by climbing fiber input (a training factor).
Figure 10: Structural credit assignment in cerebellar modules. Open arrows at the top are climbing fiber inputs oriented in a parasagittal plane. Parallel fiber inputs are instead oriented in the horizontal plane. The rectangles schematize the dendritic trees of individual Purkinje cells. Longitudinal sets of Purkinje cells provide focused inhibitory input to cerebellar nuclear cells (N). Stippling highlights 5 PCs that participate in one APG module, with loop connections to the motor cortex (M) shown forming an elemental motor command.
Figure 11: Two models of how climbing fiber (CF) training information is generated. Part A shows the feedback-error learning scheme, and part B shows the scheme utilized in the APG theory. IO, inferior olive; N, cerebellar nuclear cells of the GABAergic type; SR, sensory relay cells; Int, inhibitory interneurons.
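The three-factor requirement summarized in the Figure 9 legend can be illustrated with a minimal sketch. The legend states only that LTD requires a presynaptic factor, a postsynaptic factor, and a training factor; the multiplicative form, the scaling of each factor to the range 0 to 1, the lower bound of zero on the weight, and the names `ltd_update` and `learning_rate` are all assumptions made here for illustration and are not taken from the model itself.

```python
def ltd_update(weight, pf_input, dendritic_depol, cf_input, learning_rate=0.1):
    """Return the updated weight of one parallel fiber synapse.

    Assumption (not from the source): each factor lies in [0, 1] and the
    three factors combine as a product, so the weight is depressed only
    when all three are present at once.
      pf_input        -- presynaptic factor (parallel fiber input to the spine)
      dendritic_depol -- postsynaptic factor (depolarization from other parallel fibers)
      cf_input        -- training factor (depolarization from the climbing fiber)
    """
    depression = learning_rate * pf_input * dendritic_depol * cf_input
    # Hypothetical bound: a synaptic weight is not allowed to go below zero.
    return max(0.0, weight - depression)


if __name__ == "__main__":
    w = 1.0
    print(ltd_update(w, 1.0, 1.0, 1.0))  # all three factors present: weight decreases
    print(ltd_update(w, 1.0, 1.0, 0.0))  # no climbing fiber input: weight unchanged
    print(ltd_update(w, 0.0, 1.0, 1.0))  # synapse inactive: weight unchanged
```

A product was chosen because it is the simplest expression that is zero whenever any one factor is absent; other conjunctive forms (for example, thresholded factors) would serve the same illustrative purpose.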