Preprint of:

Thelen, E., Schöner, G., Scheier, C. & Smith, L. B. (2000) The Dynamics of Embodiment: A Field Theory of Infant Perseverative Reaching. Behavioral and Brain Sciences 24 (1): XXX-XXX.



This is the unedited final draft of a BBS target article that has been accepted for publication (Copyright 2000: Cambridge University Press) and is currently being circulated for Open Peer Commentary.

This preprint is for inspection only, to help prospective commentators decide whether or not they wish to prepare a formal commentary.

Please do not prepare a commentary unless you have received a formal invitation indicating that it has been possible to include you in the final list of invited commentators.

For information on becoming a commentator on this or other BBS target articles, write to bbs@soton.ac.uk

For information about subscribing or purchasing offprints of the published version, with commentaries and author's response, write to:

journals_subscriptions@cup.org (North America)
journals_subscriptions@cup.cam.ac.uk (All other countries).


The Dynamics of Embodiment:

A Field Theory of Infant Perseverative Reaching
 

Esther Thelen1

Gregor Schöner2,

Christian Scheier3

and

Linda B. Smith1





1 Department of Psychology and Program in Cognitive Science,
Indiana University,
Bloomington,
IN 47405,
USA
thelene@indiana.edu
smith4@indiana.edu
 
2Centre de Recherche en Neurosciences Cognitives,
C.N.R.S.,
Marseille,
Cedex 20,
France.
gregor@lnf.cnrs-mrs.fr
 
3 Department of Biology,
California Institute of Technology,
Pasadena,
CA 91125, USA
scheier@neuro.caltech.edu

 

 

Esther Thelen has a Ph.D. in Biological Sciences from the University of Missouri and has been Professor of Psychology and Cognitive Science at Indiana University-Bloomington since 1985. Her research has centered on the acquisition of motor skills in infants, the relation of movement to cognition, and developmental theory. She is the author of over 85 papers and, with Linda Smith, the book A Dynamical Systems Approach to the Development of Cognition and Action. She recently served as the President of the International Society for Infant Studies.
 
 
 
 
 
 

Gregor Schöner is a Directeur de Recherche at the Centre de Recherche en Neurosciences Cognitives of the French Centre National de la Recherche Scientifique. Originally trained as a theoretical physicist, he has developed ideas from dynamical systems theory for such diverse fields as perception, movement, cortical neurophysiology, and autonomous robotics, with a recent interest in cognition. Over the last 15 years he has published over 85 papers in experimental psychology, neuroscience, and engineering. He has lectured on dynamical systems ideas to many different audiences, with the goal of contributing to interdisciplinary interchange.
 
 
 
 
 
 

Christian Scheier is a postdoctoral fellow at the California Institute of Technology. He is the author of over 40 publications in the areas of time series analysis, autonomous agents, crossmodal psychophysics and infant development. He has recently published, together with Rolf Pfeifer, a book on embodied cognitive science with MIT Press. He did his Ph.D. on category learning in autonomous agents at the Artificial Intelligence Laboratory at the University of Zurich.
 
 
 
 
 
 
 
 

Linda Smith is a Chancellor's Professor of Psychology and Cognitive Science at Indiana University - Bloomington. She received her B.S. degree in 1973 from the University of Wisconsin - Madison and her Ph.D. in psychology from the University of Pennsylvania in 1977. She joined the faculty at Indiana University in 1977. Her research is directed to understanding developmental processes, especially as they apply to cognition and word learning. She has published over 80 research articles and is co-author with Esther Thelen of A Dynamical Systems Approach to the Development of Cognition and Action. Her research is supported by grants from the National Institutes of Child Health and Development and the National Institute of Mental Health.
 
 
 
 
 
 


Abstract:

The overall goal of this target article is to demonstrate a mechanism for an embodied cognition. The particular vehicle is a much-studied, but still widely debated phenomenon seen in 7-12-month-old infants. In Piaget's classic "A-not-B error," infants who have successfully uncovered a toy at location "A" continue to reach to that location even after they watch the toy hidden in a nearby location "B." Here we question the traditional explanations of the error as an indicator of infants' concepts of objects or other static mental structures. Instead, we demonstrate that the A-not-B error and its previously puzzling contextual variations can be understood by the coupled dynamics of the ordinary processes of goal-directed actions: looking, planning, reaching, and remembering. We offer a formal dynamic theory and model based on cognitive embodiment that both simulates the known A-not-B effects and offers novel predictions that match new experimental results. The demonstration supports an embodied view by casting the mental events involved in perception, planning, deciding and remembering in the same analogic dynamic language as that used to describe bodily movement, so that they may be continuously meshed. We maintain that this mesh is a pre-eminently cognitive act of "knowing" not only in infancy but also in everyday activities throughout the life span.
 

Keywords:

Cognitive development, dynamical systems theory, embodied cognition, infant development, motor control, motor planning, perception and action.



 

"It is far too little recognized how entirely the intellect is built up of practical interests" (William James 1897, p. 36).

1. Introduction

A century after William James created a psychology based on the primacy of experience, the ideas of embodiment are again entering the debate in the mind sciences. To say that cognition is embodied means that it arises from bodily interactions with the world. From this point of view, cognition depends on the kinds of experiences that come from having a body with particular perceptual and motor capabilities that are inseparably linked and that together form the matrix within which reasoning, memory, emotion, language and all other aspects of mental life are meshed. The contemporary notion of an embodied cognition stands in contrast to the prevailing cognitivist stance, which sees mind as a device to manipulate symbols and is thus concerned with the formal rules and processes by which the symbols appropriately represent the real world. There is now converging interest in embodiment from scholars in philosophy, cognitive science, psychology, linguistics, robotics, and neuroscience (Almássy et al. 1998; Ballard et al. 1997; Brooks 1991; Chiel & Beer 1997; Clark 1997; Damasio 1994; Edelman 1987; Fogel 1993; Glenberg 1997; Gibson 1969; Harnad 1990; Johnson 1987; Lakoff 1987; Lakoff & Johnson 1990; Merleau-Ponty 1963; Pfeifer & Scheier 1999; Sheets-Johnstone 1990; Talmy 1988; Thelen 1995; Thelen & Smith 1994; Varela et al. 1991).

In this target article, we contribute to this multidisciplinary effort by focusing in considerable detail on a particular, and controversial, phenomenon seen in human infants, the so-called "A-not-B error." We present theory, a model, simulations, and experiments that recast this phenomenon in embodied terms, using the assumptions and the formal language of dynamic systems. We believe this offers an attractive starting point for an embodied cognition for several reasons:

1. We cast the mental events involved in perception, planning, deciding, and remembering in the analogic language of dynamics. This situates cognition within the same continuous, time-based, and nonlinear processes as those involved in bodily movement, and in the large-scale processes in the nervous system (Freeman & Skarda 1985; Kelso 1995; Koch & Davis 1994; Port & van Gelder 1995; Singer 1990; Turvey 1990; van Gelder 1998). Finding a common language for behavior, body, and brain is a first step toward banishing the specter of dualism once and for all.

2. Because perception, action, decision, execution, and memory are cast in compatible task dynamics, the processes can be continuously meshed together. This changes the information-processing flow from the traditional input-transduction-output stream to one of time-based and often shifting patterns of cooperative and competitive interactions. The advantage is the ability to capture the subtle contextual and temporal influences that are the hallmarks of real life behavior in the world.

3. We address specifically the developmental origins of cognition. Since Piaget (1952, 1954), it has been widely acknowledged that all forms of human thought must somehow arise from the purely sensorimotor activities of infants. But it is also generally assumed that the goal of development is to rise above the "mere sensorimotor" into symbolic and conceptual modes of functioning. The task of the developmental researcher, in this view, has been to unearth the "real" cognitive competence of the child unfettered by performance deficits from immature perception, attention, or motor skills. This division between what children really "know" and what they can demonstrate they know has been a persistent theme in developmental psychology (Gelman 1979; Spelke 1990). We argue here that these discontinuities are untenable. Our message is: if we can understand this particular infant task and its myriad contextual variations in terms of coupled dynamic processes, then the same kind of analysis can be applied to any task at any age. If we can show that "knowing" cannot be separated from perceiving, acting, and remembering, then these processes are always linked. There is no time and no task when such dynamics cease and some other mode of processing kicks in. Body and world remain ceaselessly melded together.

The burden of this larger agenda rests on our dissection and reinterpretation of a classic infant perseverative reaching phenomenon, the well-known "A-not-B" error (Piaget 1954). The dynamic field model formalizes a new approach to this error first suggested in conceptual form in Thelen & Smith (1994) and subsequently extended and supported by a series of experiments described in Smith et al. (1999b). The model is an adaptation of Erlhagen & Schöner's (1998) dynamic neural field theory of motor programming (Schöner et al. 1997) and thus offers a bridge between the more general processes of motor planning and execution and the developmentally specific effects revealed by the A-not-B task. The model offers a powerful and parsimonious, yet biologically plausible, account of the many contextual influences on A-not-B tasks that have puzzled developmental psychologists for two generations. More generally, it demonstrates the elegance and usefulness of dynamic systems principles and language for understanding the intertwined processes of perceiving, deciding, acting, and remembering, and their changes over time.

The article proceeds as follows: First, we describe the A-not-B error in its canonical form and the variations that constitute the data to be explained. We show how previous explanations each capture some of the phenomenon but fail to account for all the known effects. Next we lay out the broad outlines of the new approach put forth in Thelen & Smith (1994) and Smith et al. (1999b) and the empirical support for that approach. Then we introduce the major assumptions of the dynamic field model and discuss why the model is so well-suited to explaining the A-not-B error. This is followed by a description of the model and a series of simulations that capture the main A-not-B effect as well as the known contextual variations. We evaluate the strengths and shortcomings of this model in relation to other explanations. Finally, we offer some speculations about the model's more general usefulness for integrating multiple, time-based processes of human cognition and action.

2. The A-not-B error

The A-not-B error was first described by Jean Piaget (1954) in the context of his life-long quest for the developmental origins of knowledge. Piaget was particularly concerned with the question of when infants understand the properties of objects, and especially that objects continue to exist even when they cannot be seen or directly acted upon. Through a series of clever hiding games he played with his own children, Piaget discovered that such object knowledge arises gradually and rather late in infancy. Before 7 or 8 months of age, infants refuse to search for a toy hidden under a cover, as though the toy simply ceased to exist. After about 12 months, they search robustly, even after the toy is hidden in several places in succession. But between 7 and 12 months, infants display a peculiar kind of "partial knowledge" where they search at one location, but cannot switch their search if the toy is then switched to a second or subsequent hiding place. Piaget labeled this behavior as "Stage IV" in his series of stages in the development of adult-like object permanence: Infants act as if the toy had lasting existence only where it first disappeared.

In the decades since Piaget's first descriptions, the hiding task has been repeated countless times, in laboratories all over the world, and with myriad variations (see reviews by Acredolo 1985; Bremner 1985; Diamond 1990a, 1990b; Harris 1987; Markovitch & Zelazo 1999; Munakata 1998; Wellman et al. 1986, among others). The classic, canonical version goes like this: An investigator hides a small, attractive toy under one of two identical hiding places, usually cloth covers or hiding wells with lids, and allows the infant to search and recover the toy. After a number of hidings and recoveries from the first location, "A," the investigator hides the toy in the second location "B" in full sight of the baby. If there is a delay of a few seconds between the hiding event and when the infant is permitted to search, eight-to-ten-month-old infants reliably make the "A-not-B" error, that is, they reach back to the original location "A," even though they saw the toy hidden at "B."

It is both surprising and endearing to see infants so determined to make a mistake. But the reason that the A-not-B error has intrigued developmentalists for nearly fifty years is not just the phenomenon itself, but the questions raised by the many variations of the task studied over that time. Here is the puzzle: while the A-not-B error is entirely robust in the canonical form we described above, even seemingly small alterations in the task conditions can disrupt it. Nearly every aspect of the event matters: the visual properties of the hiding locations, including the distinctiveness, distance, number, and transparency of the covers (e.g. Bremner 1978b; Butterworth 1977; Butterworth et al. 1982; Horobin & Acredolo 1986; Sophian 1985), the delay between hiding and search (e.g. Diamond 1985; Gratch et al. 1974; Harris 1973), whether search involves reaching or just looking (e.g., Hofstadter & Reznick 1996), whether there are landmarks in the environment (Acredolo 1979), whether infants search for objects, food treats or people (Diamond 1997), whether the task is done at home or in the laboratory (Acredolo 1979), whether the infant or the hiding places have been moved (e.g., Bremner 1978a), and the infants' amount of crawling experience (e.g. Bertenthal & Campos 1990; Horobin & Acredolo 1986). Such diverse context effects pose a serious challenge to Piaget's original interpretation. If the A-not-B error is a true measure of the status of infants' representations of objects, how can it be that what they know depends on so many seemingly irrelevant factors? How can it be that infants have a more mature object concept at home than in the laboratory or when the object is a cookie rather than a small toy?

2.1 Some explanations

The contemporary consensus is that Piaget's account is incorrect, but opinions differ on why the classic explanation is insufficient. One group of developmentalists argues that Piaget was asking the right question, but that he simply chose the wrong behavioral task to answer it (Baillargeon & Graber 1988; Baillargeon & DeVos 1991; Bertenthal 1996; Diamond 1990b; Munakata et al. 1997). These theorists focus on the striking decalage between what infants seem to know about hidden objects when they manually search for them compared to when they just watch hiding events. Experiments using visual violation-of-expectancy measures have shown that many months before infants routinely make the A-not-B error when reaching for hidden objects, they seem to expect that objects will be retrieved from the place they were just hidden when no reaches are involved. They demonstrate this knowledge by looking longer at events where an object is plucked from a place other than the one where the baby watched it last disappear (e.g., Ahmed & Ruffman 1998; Baillargeon & Graber 1988). Reaching is the problem, some argue, because it requires a "stronger" object representation than looking (Munakata et al. 1997), or because it involves additional means-ends performance demands (Baillargeon & Graber 1988), or because the "knowing" system is unable to control the "acting" system (Ahmed & Ruffman 1998; Bertenthal 1996). Infants, Diamond (1990b, p. 662) maintains, "really know where the [object] is even when they reach back to where they last found it."

One foundational assumption behind these dual-process (knowing vs. acting) accounts is that there lives, in the baby's head, a creature that is smarter than the body it inhabits; that there is a sharp partition between the mental events that precede the decision to act and the action itself. In this scenario, an object concept develops that is disembodied, timeless, and modular, and that may or may not actually motivate behavior. Moreover, there is the second assumption that because looking is motorically less complex than reaching, it will have a privileged access into the object knowledge module; it is a better measure of infants' "real" object concept. Bertenthal (1996) takes this further to suggest that knowing and acting are two dissociable systems, with different anatomical bases. One system represents objects, develops early, and is tapped by the visual expectancy studies. The A-not-B error comes from a second, "perception-action" system and has nothing to do with object representation. But if the A-not-B error is indeed not about the object concept at all, the intriguing question remains. What are the mechanisms that account, in the same baby, for accurate performance under some circumstances and perseverative reaching under other conditions?

One way to view the A-not-B error is simply as a reach to the wrong location. Indeed, a second group of theorists explains the A-not-B error as a manifestation of infants' immature abilities to direct their movements in space. Here the basic assumption is that young infants tend to represent space egocentrically, that is, based on their own bodies, rather than the objects' true positions in space, or an allocentric representation. For instance, infants who are trained at A and then moved around the table 180 degrees reach correctly to B at the B cue, which is still the A position from the baby's perspective (Bremner 1978a,b; Bremner & Bryant 1977). Indeed, studies show that conditions that provide clues in the environment that help disambiguate the two identical targets, A and B, tend to decrease perseverative reaching. These include making the hiding covers more distinctive, adding landmarks in the room, or testing infants in more familiar environments (see Acredolo 1985; Wellman et al. 1986). In addition, there is a strong association between infants' experience in self-locomotion and correct responses (Acredolo 1985; Bell & Fox 1992; Bertenthal & Campos 1990; Kermoian & Campos 1988). Self-locomotion is believed to increase infants' visual attention to spatial locations and thereby their ability to code them allocentrically. The spatial hypothesis itself is not sufficient, however, because it does not account for the delay effect (why would babies' spatial coding change from allocentric to egocentric in the three-second delay?) nor for the looking-reaching decalage (allocentric when looking and egocentric when reaching?).

Another way to view the A-not-B task without invoking object permanence is as behavior requiring memory for a location and a motor response. In a series of influential papers, Diamond (1985, 1988, 1990a, 1990b) invoked these two processes to address two of the important context effects: the necessary delay between hiding and retrieval, and the ability of older infants to tolerate longer delays. Diamond proposed that two processes combine: the error results from infants' poor memories for the hiding place and their inability to inhibit strong motor responses. Having once reached to A, infants must inhibit this prepotent response in order to shift to the B place. The delay is important because in infants, both the memory for the hiding location and the ability to inhibit responses are short-lived. Thus, over the few-second delay between hiding and retrieval, both the memory of the hiding place and the ability to inhibit decline. At the B trial, this leads to a B reach with a short delay and a return to the old, prepotent A response with longer delays. With age, the persistence of these processes increases and infants do not err at short delays. The source of the developmental effect, according to Diamond, is maturational change in the dorsolateral prefrontal cortex, a brain area identified by lesion studies to be involved in both memory and response inhibition (Diamond & Goldman-Rakic 1989). This account, while powerful, is incomplete. It does not offer a principled explanation of the spatial location or context effects described above, nor of why locomotion would hasten the decline of perseveration. And it cannot account for the looking-reaching decalage.

Thus, each of these accounts captures some truths about infant perseverative reaching, but none has a full explanation of both the canonical error and the richly documented effects of context which are part and parcel of the same phenomenon. In our theory, we incorporate aspects of many of the explanations of our predecessors. First, we agree with some of our colleagues that the A-not-B error is not about an object concept per se. Smith et al. (1999b) performed a critical experiment: infants were tested in the canonical task with one difference: there were no hidden objects at all. When simply cued to location A or B by waving the lids covering the wells, infants made the A-not-B error just as robustly as they did when objects were actually hidden and recovered. But we deeply disagree with the widely held assumptions that knowing and acting are modular and dissociable. Indeed the cornerstone of our dynamic model is that "knowing" is perceiving, moving, and remembering as they evolve over time, and that the error can be understood simply and completely in terms of these coupled processes.

Second, we further agree that the A-not-B error is about moving to a location in space. But it is also about remembering a cued location, and being unable to inhibit a previous response. What we will demonstrate with the model, however, is that there is no need to posit individual and separate mechanisms such as egocentric or allocentric coding, memory or response inhibition deficits, or incomplete object knowledge. Infants indeed sometimes act as though their responses are egocentric or as though they lack memory or inhibition. But they act that way because of the coupled interactions of the very same dynamic processes that make them appear to sometimes "know" where the object was hidden. Because all of the processes contributing to the behavior are coupled, continuous, and based in time, we can account in one model for both the error itself and for the decline in perseverative responding in different situations and at different ages.

3. A dynamic systems approach

The starting point of the dynamic model is with new assumptions. The A-not-B error is not about what infants have and don't have as enduring concepts, traits, or deficits, but what they are doing and have done. What they do is reach repeatedly to one location and then return to the original location when the goal has changed. From a dynamic perspective, this perseveration is emergent from the real-time dynamics of visually elicited reaching, the memory dynamics of repeating the same action several times in succession, and the intrinsic dynamics of these processes in infants. According to our view, the error arises from the same multiple processes that produce goal-directed reaching at any age and we indeed create the error using a model originally formulated to simulate the general process of motor planning for reaching. The age and context effects arise naturally from the parameters of the model, which in turn, we derive from a fine analysis of the actual task.

3.1 A task analysis

Thus, to begin, we describe the error task in the simplest behavioral terms. Then, we identify the factors in the task itself or in the baby that are known to affect the behavioral outcome, that is, the tendency to reach to A when cued at B. We report here the details of the canonical task used in Smith et al. (1999b). There were two versions of this task, which differed only in whether a toy was actually hidden or whether there was no hidden object and the infant was cued simply with the cover to the hiding well (lids-only). If this account seems burdened with details, it is because these details matter profoundly, as we will further document below.

For both the hidden toy and lids-only versions, the infant sat on a parent's lap at a small table facing the experimenter, surrounded by neutral and unmarked walls or screens. Here are the events that transpired in a typical A-side hiding trial. The infants first saw the hiding box, which was 30 cm by 23 cm by 5.5 cm. It was painted brown and contained two circular hiding wells, each with a radius of 4.5 cm and depth of 4.5 cm. The centers of the wells were 12.5 cm apart. The wells were each covered by a circular wooden lid with a small round handle in the middle, all painted the same brown color as the box. The box and the covers were in view throughout the whole procedure and thus constituted a continuous visual task input, as depicted in Figure 1. The notable characteristic of the task input was its lack of visual specificity. The two lids were indistinguishable from each other and blended into the background of the box. No familiar or distinct environmental landmarks were evident. Once the two lids were in place, there was little else to demarcate them except their relative spatial position.

Figure 1: A task analysis of the canonical A-not-B error, depicting a typical A-side hiding event. The box and hiding wells constitute the continually present visual task input. The specific input comes from the transient cue of hiding the toy in the A well. A delay is imposed between hiding and allowing the infant to reach. During these events, the infant looks at the events, remembers the cued location and undertakes a mental planning process leading to activation of reach parameters, followed by reaching itself. Finally, the infant remembers the parameters of the current reach.
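The reported box dimensions make the spatial ambiguity of the task input concrete. The short calculation below is only a sketch: it uses the well radius and center spacing given above, and the variable names are ours, not the authors'.

```python
# Dimensions reported for the hiding box in the canonical task (cm).
well_radius_cm = 4.5
center_spacing_cm = 12.5

# Width of one well opening.
well_diameter_cm = 2 * well_radius_cm

# Strip of box surface separating the rims of the two wells.
edge_gap_cm = center_spacing_cm - well_diameter_cm

# Distance from the outer rim of one well to the outer rim of the other.
outer_span_cm = center_spacing_cm + well_diameter_cm

print(f"well diameter: {well_diameter_cm} cm")
print(f"gap between well rims: {edge_gap_cm} cm")
print(f"outer span of both wells: {outer_span_cm} cm")
```

On these figures the two identical lids are each 9 cm wide and separated by only 3.5 cm of box surface, which is consistent with the description of the targets as closely spaced and poorly demarcated.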

The trial began with the box well out of the infants' reach. The experimenter first called attention to the toy by waving it or tapping it on the box, always in the vicinity of the A side, and often calling the child's name. (All experiments were counterbalanced: half the infants had the A side on their right and half had the A side on their left.) When the infant was clearly looking at the toy, the experimenter hid the toy under the lid. This visual (and auditory) cue we call the specific input, a transitory indication of which lid is specified as the goal (Figure 1). (In the lids-only condition, no toy was hidden and only the lid to the well was waved and tapped.) We then imposed a short delay of 3 seconds for 8-month-olds and 5 seconds for 10-month-olds. During the delay, the infants most often looked at the cued location. Infrequently, they also glanced at the experimenter. During this delay, infants needed to remember the cued location in the face of the ambiguous task input. After the delay, the experimenter pushed the box into the infants' reach.

After infants saw the specific location cue, they needed to decide whether to reach to the A or to the B side, and plan the appropriate movement parameters to actually activate the muscles of their arms to go to the right or to the left. This planning was done largely in the absence of the well-specified goal. We have conceived this planning in Figure 1 as mental activation functions that precede the reach and are initially equally likely at A or B, but gradually shift to the A side. (We shall further justify this depiction below.) The reach itself was initiated once activation reached a certain threshold, sufficient to activate the muscles to move the arm in the specified direction. The infants continued to look at the A target throughout the planning and reaching.
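The idea of activation that starts out equal at A and B, drifts toward the cued side, and triggers a reach at threshold can be illustrated with a toy simulation. This is a minimal sketch under our own assumptions, not the dynamic field model itself: it uses two independent leaky integrators, and the function name and every parameter value (input strengths, threshold, time constant) are hypothetical.

```python
def simulate_reach(cue_a=1.0, cue_b=0.0, task_input=0.5,
                   threshold=1.2, tau=10.0, dt=1.0, max_steps=500):
    """Euler-integrate two leaky activations until one crosses threshold.

    Returns (side, step), where side is "A", "B", or None if no reach
    is initiated within max_steps. All values are illustrative only.
    """
    u_a = u_b = 0.0  # both sites start at the same activation level
    for step in range(1, max_steps + 1):
        # Each site relaxes toward its total input: the task input shared
        # by both sites plus the specific cue delivered to that site.
        u_a += dt / tau * (-u_a + task_input + cue_a)
        u_b += dt / tau * (-u_b + task_input + cue_b)
        if max(u_a, u_b) >= threshold:
            return ("A" if u_a >= u_b else "B"), step
    return None, max_steps


side, step = simulate_reach()
print(f"cue at A -> reach to {side} after {step} steps")
print("cue at B ->", simulate_reach(cue_a=0.0, cue_b=1.0))
print("no cue   ->", simulate_reach(cue_a=0.0))
```

In this sketch the shared task input alone never reaches threshold, so a reach is initiated only at a cued site. The sketch deliberately omits the interactions between sites and the memory of previous reaches on which the account of the error depends.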

Finally, the infants grasped the target toy or lid and shifted their glances back to the experimenter, who also withdrew the test box. A critical assumption on our part at this juncture is that the just executed reaching act was remembered for some unknown time after the reaching act was completed. Again, we will elaborate further on this memory process in a later section.

The A trials were repeated several times before infants were asked to switch to B for two trials. The events for the B trials were identical to those at the A side except that the experimenter hid the toy in or cued the previously unspecified location. Under these canonical conditions in both hiding and lids-only conditions, 70-80% of infants reached back to A on the first B trial, and an equal proportion continued to perseverate on the second B trial. This rate is significantly different from chance responding, that is, from an equal probability to go for either target after the B cue.

3.2. Events and processes in the A-not-B task

A foundational assumption of our theory and model is that both the error and correct reaching emerge from the coupled dynamics of looking, planning, reaching, and remembering within the particular context of the task: repeating a novel and confusing reaching action. Having described these events and processes in a canonical A-not-B trial, we now further analyze their contributions to the infants' behavior. It is important to note here that although we address these contributions one at a time, the point of our dynamic analysis is that they are continuously coupled and interactive. Everything counts! These detailed context and infant effects are the data upon which the model is based. We need not invoke any new constructs or traits.

3.2.1. Task input. As we mentioned earlier, the notable characteristic of the test scene was that the hiding box and wells were poorly specified as targets. Once the toy was hidden or the lid placed on the box, infants could easily confuse the targets, requiring them to decide on a reaching direction in the absence of well-marked locational cues. Indeed, there is ample evidence in the literature that perseverative responding depends on target ambiguity and that therefore, errors are reduced by manipulations that make the targets more distinct from one another (see Wellman et al. 1986). For instance, the error is decreased when the two hiding locations are visually very different, for example, when one cover is blue and the other cover is white (Butterworth et al. 1982). There is less clear evidence on the importance of the spatial separation of the targets, but for the most part, only relatively small differences in this distance have been manipulated (Wellman et al. 1986; Sophian 1985). However, in the study with the largest target separation, only 6 of 56 9-month-old infants made the A-not-B error, far fewer than would be expected in the canonical situation (Horobin & Acredolo 1986). Using multiple, rather than just two, hiding locations both reduces error and increases correct responding, meaning that infants return to the just cued location more than at just chance levels (Horobin & Acredolo 1986; Wellman et al. 1986). The effect of multiple targets as part of the task input may be to add spatial landmarks to the scene, giving the infant other relational cues for reach direction once the specific input is completed. Indeed, the addition of the familiar landmarks of infants' homes dramatically reduced egocentric responding (Acredolo 1979).
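To give a sense of how low 6 errors in 56 infants is, the sketch below computes the observed error rate and the exact binomial probability of seeing 6 or fewer errors under two reference rates. The choice of reference rates is our assumption for illustration: 0.5 stands for chance responding between two targets, and 0.7 for the lower bound of the 70-80% perseveration reported above for the canonical task. The calculation also assumes infants respond independently; it is not an analysis from the cited study.

```python
from math import comb


def binom_cdf(k, n, p):
    """Exact probability of k or fewer successes in n Bernoulli(p) trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


errors, n_infants = 6, 56  # counts reported for the widest target separation

error_rate = errors / n_infants
p_if_chance = binom_cdf(errors, n_infants, 0.5)      # hypothetical reference
p_if_canonical = binom_cdf(errors, n_infants, 0.7)   # hypothetical reference

print(f"observed error rate: {error_rate:.1%}")
print(f"P(<= {errors} errors | rate 0.5): {p_if_chance:.2e}")
print(f"P(<= {errors} errors | rate 0.7): {p_if_canonical:.2e}")
```

The observed rate is about 11%, and under either reference rate so few errors would be extremely unlikely, which is the sense in which the result falls far below the canonical situation.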

In addition to the A-not-B task being visually confusing for babies, it is also entirely novel. Infants have been grabbing objects for four or five months before they typically do the A-not-B test, but during this time, they have been reaching for well-specified, single, often colorful and highly distinct toys. Before this, no one has asked them to repeatedly choose between identical, closely spaced, boring items such as the covers to the hiding locations.

Indeed, the task is so confusing and novel for infants that they have to be trained to do it. All A-not-B studies involve some, often unspecified, number of training trials. For example, in the canonical hidden object task used by Smith et al. (1999b), the training consisted of four trials in which infants were gradually led from a familiar task to the novel A-not-B test. To do this, the experimenter first placed the toy alone at the edge of the box at the A location and slid the box forward toward the infant, while verbally encouraging the infant to reach for the toy. This task was familiar and infants reached reliably. On the second training trial (A2), the toy was placed inside the well but left uncovered, and on the third trial (A3), the toy was partially hidden in the well. In the last training trial (A4), the toy was completely hidden. In the lids-only version, the A-side lid was progressively moved back over the four trials from the front edge of the box to a position in line with the B-side lid. The training was followed by two test trials at A, with the toy completely hidden or with the lids lined up equidistant from the edge of the box, and then the two B trials where the cue was switched, as described above.

Because the task input is novel and confusing, Smith et al. (1999b) predicted that without the training events to help disambiguate the A and B locations, infants would not be strongly disposed to reach to A even on the first A trial. They tested this by eliminating the four pre-training trials and commencing with only the two A test trials. With no training, only 35% of the infants reached correctly to A on the first A trial, compared to over 75% of infants who had been conventionally trained. Moreover, without training, a sizeable proportion of the infants did not reach at all, but just stared at the hiding wells, as though they could not figure out what to do, and they often refused to reach on the remaining trials as well.

In previous studies of A-not-B, training was justified as necessary to teach babies to reach for a hidden or unfamiliar object, and it does accomplish this. But this training, when done at the A location, also accustoms infants to repeatedly reach to A. In an effort to demonstrate these two effects, Smith et al. (1999b) trained infants for the typical A-not-B task, not at the two-welled test box, but at a different box with only a single, center well. They reasoned that, given the confusion of the targets and the fewer reaches to A, training at this single "C" location should render infants less likely to consistently reach to A on both A and B trials. Indeed, infants trained at the single well were more likely to reach to A on the A trials than babies with no pretraining. Simple practice with the hiding and recovery events in one location made the two-well condition less novel. But the "train at C" infants were also less likely to reliably reach to A on both the A and B trials than those babies given the standard training regime. Stable behavior at A and perseveration back to A depended on previously reaching to the A side. (We will discuss the effects of repetition further below.) Thus, the training trials, thought to be merely "warm-up" by previous investigators, are also critical contributors to the dynamics that produce the A-not-B error, both in helping disambiguate the confusing targets (affecting A-side behavior) and in building up a tendency to stay at A when cued at B (affecting the appearance of the error).

Thus, target ambiguity alone is not the single cause of perseveration; it acts together with the cue, the delay, and the particular dynamics of infant reaching and remembering. Nonetheless: The relative ambiguity of the task input is a critical parameter in the model.

3.2.2. Specific input. Given the largely tonic and confusing nature of the task input, infants receive transient clues about where to reach from the actions of the experimenter, who waves and/or taps the toy or target lid. This activity invariably directed infants' visual attention to the target location, and they continued to look at that location throughout the hiding and reaching event (Smith et al. 1999b). The finding that waving the lid to the hiding well was as effective in producing perseverative reaching as was actually hiding a toy demonstrates that it is not hiding per se that is the critical stimulus. However, this does not mean that the nature of the specific input is unimportant. In our dynamic view, it is one of the critical events.

Recall that the specific input serves to demarcate the target location and then it disappears; the infant must remember it when it is no longer evident. Given that the specific input serves both to capture visual attention and to provide a spatial target in memory, it follows that some specific inputs may be more effective than others for either or both of these purposes. Some objects and events may be more interesting for babies and some may be more memorable than others. Specific inputs with more "punch" should therefore increase correct responding, both on A and on B trials.

We found some support for this assumption in the literature. For instance, inspired by Smith et al.'s (1999b) demonstration of perseverative responding with no hidden objects, Munakata (1997) tried the following variant: She presented infants with only lids on the A trials, but on the B trials, either hid a toy at B or cued just the lid. In the lids-only condition, infants reached perseveratively at A, replicating Smith et al. (1999b). The error was decreased, however, in the condition where the toy was shown for the first time at B. Munakata (1997) interpreted this as infants representing lids-hiding-toys differently from lids with no toys. Our interpretation of these interesting results is simpler: the novel and visually interesting toy captured visual attention more strongly than the lid alone and pulled infants away from their habitual response. This interpretation is further supported by Munakata's (1997) second experiment, where toys were first hidden on A and then infants were cued with either toys or just lids on B. In this manipulation, infants made the error in both conditions. Of course, because both the lid and the toy were used on the A trials, neither the lid alone nor lid-plus-toy provided a sufficiently strong specific input to counteract the repeated reaches to A.

Our assumption that the strength of the specific cue makes a difference is further confirmed by experiments reported recently by Diamond (1997). Diamond coded infants' levels of interest in the hidden toys as they were hidden and then uncovered. Infants made significantly fewer A-not-B errors when the toy was different from that used on previous trials and when infants' interest in that toy was high. More remarkably, when Diamond substituted pieces of cookie for hidden toys, all infants reached correctly, even at delays in which they had previously erred when toys were hidden.

Furthermore, when the "punch" of the specific input is enhanced by an event that increases infants' visual attention in one direction, it should be more potent in influencing reaching direction. Likewise, when the specific cue is diminished by competing claims on infants' visual attention, the power of that cue for directing the subsequent reach should be weakened. This is exactly what Smith et al. (1999b) found. In one experiment, these investigators manipulated the direction of infants' gaze after the A trials by simply tapping on a little blue rod placed either on the far right or the far left side of the testing arena. After 4 training trials, half the infants had their visual attention directed to the just-trained A side and half to the upcoming B side. Reliably more infants in the A-side-tap condition reached to A on the A trials than those in the B-side-tap condition, and although there were no further taps, these infants also reached more to A on the B trials. In a second experiment, the researchers provided the pulls to visual attention before the B trials. In this case, infants given the additional cue to A all stayed at A, committing the error. Conversely, and in line with predictions, infants seeing the tap in the direction of B reached correctly on the B trials.

The critical feature, then, of the contribution of the specific cue is not whether it is a toy or a lid or a cookie or just an event to look at, but the power of the stimulus for capturing infants' attention and/or for remaining in memory when it is no longer in view. When the specific input is strong it will have a powerful influence; when it is weak, it may be swamped by the other system dynamics.

Thus: The relative strength of the specific input is a second critical element in the model.

3.2.3. Delay. Infants do not perseverate at any age if they are allowed to reach immediately after the object is hidden. The error emerges in the delay between the cessation of the specific input and when infants are permitted to act by sliding the box forward (Figure 1). In addition, as they get older, infants require increasingly longer gaps before they perseverate. Eight-month-olds make the error with a 3-second delay, but 10-month-olds require 5 seconds to perseverate. The delay effect is robust (e.g. Diamond 1985; Harris 1973; Wellman et al. 1986) and is especially hard to reconcile with a strictly Piagetian interpretation of infants' stage-specific object knowledge, since there is no reason why infants would know less after a longer delay, when they presumably have more time to process the situation.

What sorts of mechanisms can account for this apparent switch from right-to-wrong in 3 seconds? One class of explanations invokes a shift of level of processing within the delay period. For instance, Gratch et al. (1974) proposed that at no delay, infants' actions are guided by a motor memory of the most recent event, but at the longer delay, the concept of the hidden object kicks in, so infants return to the habitual hiding location. Harris (1983, 1987) agreed that the no-delay performance was dominated by a simple motor response, but argued that, at longer delays, infants also had problems with object identity in that they believed the object hidden at B was not the same one that was previously hidden at A. In contrast, Wellman et al. (1986) saw the delay as revealing a conflict between two mechanisms for search. At short delays, infants relied on more immature, direct-search strategies: go where the object last disappeared. Longer delays, however, allowed activation of the conflicting, albeit more mature, strategy of an inferred search based on the movement of the object to its hiding position, leading to sometimes correct and sometimes errorful actions. Competing short and longer-term memories are also at the heart of the recent connectionist model of Munakata (1998).

As mentioned earlier, these accounts have several troubling aspects. Strategy-based theories are hard to reconcile with the age and delay interactions. If retrieving a correct strategy takes time, as infants get older and as they are given more time to retrieve a correct strategy (longer delay), they should be increasingly correct, but the reverse is true. Even more problematic, in our view, is the sharp distinction between knowledge (more mature, more conceptual) and action (more immature, less planful), especially in the face of compelling evidence that this "knowledge" comes and goes with each variant of the task.

Nonetheless, the heart of the explanation of the A-not-B task, we believe, lies in what happens during this delay when infants are faced with the ambiguous task input and yet must decide whether to go to A or to B based on a cue that is no longer present. Our model is similar to Diamond's well-known theory or to the recent connectionist model suggested by Munakata (1998), in postulating interacting dynamic processes that lead to one behavioral outcome at short delays and the probability of another outcome as time passes. But, specifically and uniquely, we place the locus of these dynamics in the motor planning process, envisioned as a continuous dynamic field that evolves under the influence of several input parameters, and whose behavior can be sampled during the delay.

In sum: What happens during the delay is also critical as it is a window on the natural dynamics of the contributing processes.

3.2.4. Reaching. Our focus on the A-not-B error as centered in the motor plan for reaching is well-supported by the developmental evidence. As we mentioned earlier, one of the most serious challenges for Piaget's original explanation was the discovery of the apparent decalage between infants' understanding of objects when reaching for them compared to when they were questioned by looking measures alone. Thus, even several months before they demonstrate reliable reaching errors, infants are not apparently confused about the last location of a hidden object when just looking. Likewise, 8- to 12-month-old infants tolerate a longer delay before erring in a visual violation-of-expectancy search task than when manually searching (Ahmed & Ruffman 1998). Moreover, when infants only watch the hiding events at A and B, they more often look correctly at B than they reach to B in the conventional search task (Hofstadter & Reznick 1996). Thus, at the very same age, infants show that they "know" more about hidden objects in the visual modality than with manual action.

The apparent confusion of what infants "really know" in the wake of such conflicting stories is only an issue when the foundational assumption is that infants "really know" something in the absence of the processes that demonstrate, in the moment, what they do in light of what they have just done. As argued in Smith et al. (1999b), the A-not-B error is about the behavior of reaching in the context of a particular perceptual scene, specific task dynamics, and dynamics of reaching and remembering intrinsic to infants at particular stages of development. Perseveration or the lack thereof in looking tasks has its own contributing dynamics, which may or may not produce the same behavioral outcome (and clearly do not). Neither reaching nor looking is a better measure of the infant mind; both are very revealing windows into the complex coupled dynamics that produce goal-directed behavior. But whether infants do better at looking than reaching is somewhat of a side issue for our present goal, which is to elucidate the dynamics of the reaching A-not-B error. Once these dynamics are understood, a similar analysis can be applied to looking tasks and comparisons made.

At the heart of our model, therefore, is the act of reaching, which requires that infants see and remember a target location as a goal, that they plan the appropriate movement parameters for a trajectory in space and time that will transport their hands to the desired object, and that they activate muscles that will carry out their intended movement. These processes are perceptual and they are motor and they are cognitive. Indeed, the model demonstrates very clearly the impossibility of making clear distinctions between these processes as they evolve in coupled and parallel fashion over the time of each trial and the time of the whole experiment.

In particular, we focus on the evolution of the reaching plan between the time the target is cued and the time infants actually move their arms forward to pick up the toy or lid. This plan, we contend, is the locus of both the error and correct responding because it integrates the perceptual input of the tonic task conditions, the cuing of the location with toy or lid, the infants' visual attention to that cue, and the memory of previous actions in the same situation. The combined dynamics of these processes constitute the infants' decisions whether to move toward the A side or toward the B side. All of these are expressed in movement parameters because the ultimate behavior is an action requiring a correct movement. The fact that the visual, memory, and motor processes are coupled and continuous means that changes in any of their parameters can (and often do) affect whether infants reach correctly or perseveratively. Again, we will justify this feature of the model in more detail in the next section.

Thus, the error emerges in the context of the specific behavior of reaching.

3.2.5. Remembering. In an earlier section, we presented evidence that infants reliably produced the A-not-B error only after a number of training trials, which served both to disambiguate the targets and to establish a repeated pattern of reaching to the A side. Indeed, in every version of the task reported, experimenters elicit the error only after several reaches to the A location. Nonetheless, in their comprehensive meta-analysis, Wellman et al. (1986) found no overall effect of the number of A reaches on the commission of the error. Does repetition matter? This is a critical question for understanding the task dynamics, and we believe the answer is unequivocally yes: repetitions matter profoundly.

In a number of recent studies, Smith, Thelen, and their associates have shown conclusively that commission of location errors with the B cue is strictly a function of the number of prior reaches to A. The strongest influence on where infants will reach on any trial is where they have just reached. This is true whether the target objects are hidden toys or just the lids.

As we described earlier, perseverative errors decreased when infants were not trained at all or trained at a neutral location. Both of these manipulations simply reduced the number of times infants have reached to A (Smith et al. 1999b). Furthermore, infants tended to perseverate even when no specific cue was offered. In a recent experiment, Smith et al. (1999b) gave groups of infants 6 trials in the lids-only task. One group was cued on one side before they reached at all and then allowed to reach spontaneously. These babies went to the cued side on their first reach and tended to stay there. The other groups were allowed to reach spontaneously on the first trial and then they were cued once, either after 1, 3 or 5 reaches. The cue was always opposite of infants' preferred side; a "B" trial to infants' personal "A" choices. The infants who were perturbed by a B trial after reaching five times to their A choices stayed at the A side. Without any hidden objects or even location cues, infants built a habit to go to A strong enough to counteract the new pull on visual attention. Infants perturbed after one or three reaches, in contrast, were more likely to switch to B when cued, and to switch sides more spontaneously thereafter.

In addition, in several studies where other parameters of the A-not-B task were manipulated, including aspects of the specific and task input as well as the dynamics of the reaching arm, the Smith/Thelen group always found a strong effect of the number of trials to A on whether infants stayed at A or switched. This effect could be detected because, as noted by previous researchers, infants' tendencies to go to A or B are never absolute, but probabilistic. That is, on any given cue to A, even the very first one, there is a chance that, spontaneously, infants will go to B. Even after several trials to A, when the probability to stick with A is high, a few infants will go to B.

Diedrich et al. (1999) captured this effect of spontaneous switches by a new measure of the growing effect of previous reaches--the relative memory strength to A. The index is predicated on the simple assumption that each reach in one direction creates a memory trace that increases the likelihood of subsequent reaches in that direction. Thus, each time an infant reaches to A or to B, the memory of that target increases. By subtracting the memory of B from the memory of A, we can express the probability the infant will return to A. The maximum memory strength for any direction is 1 (8/8), meaning that the infant will have reached to that side in, say, all 8 of 8 trials. Likewise, the minimum memory strength is 0/8, when the baby never reached in that direction. Figure 2 shows the evolving memory strength to A in a canonical two-target, no-hidden-object task. The infants in this study were divided into those that perseverated on both B trials and those that got at least one B trial correct. Note that, from the earliest trials, the infants who were more likely correct at B were those who, spontaneously, had reached previously to B even when cued to A, while those who strongly perseverated had stuck with A throughout.
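
The index can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure of Diedrich et al. (1999): the function name, the "A"/"B" coding of reaches, and the use of the fixed total number of trials as the denominator (suggested by the 8/8 and 0/8 examples above) are assumptions made here for illustration.

```python
def relative_memory_strength_to_a(reaches, total_trials=8):
    """Running 'relative memory strength to A' after each trial.

    reaches: sequence with one entry per trial, "A", "B", or None (no reach).
    total_trials: assumed fixed denominator (8 in the example in the text).
    Returns a list with one value per trial: memory(A) minus memory(B),
    where memory(side) = reaches to that side so far / total_trials.
    """
    if total_trials <= 0:
        raise ValueError("total_trials must be positive")
    if len(reaches) > total_trials:
        raise ValueError("more reaches than trials")
    count_a = 0
    count_b = 0
    trajectory = []
    for side in reaches:
        if side == "A":
            count_a += 1
        elif side == "B":
            count_b += 1
        # Entries that are neither "A" nor "B" leave both traces unchanged.
        trajectory.append((count_a - count_b) / total_trials)
    return trajectory


# Hypothetical infant who strays to B once during the A trials.
print(relative_memory_strength_to_a(["A", "A", "B", "A", "A", "A"]))
```

Under this reading the value ranges from -1 (every reach to B) to 1 (every reach to A), so it behaves as an index of the bias toward A rather than as a probability in the strict sense.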

Figure 2. The effects of repeated reaching on the A-not-B error. The measure "Relative memory strength to A" (explained in text) for a canonical no-hidden-object task in 9-month-old infants. Shown for infants who made the error on both B trials (n = 9), on only the first B trial (n = 3) and who were correct on both B trials (n = 5). Correct B trial responding was preceded by spontaneous reaches to B even when A was cued.

How are these results reconciled with the conclusions of Wellman et al. (1986) that "number of A trials was consistently unrelated to performance" (p. 31)? As mentioned previously, in all A-not-B studies, experimenters train infants to do the hidden object task, which may involve some number of reaches to the A side. These training trials were thought to be a prelude to the task, and were not included in the actual count of repetitions to A. Infants thus may have actually reached to the A location three or four times before the first official A trial is counted. Furthermore, in some versions of the task, infants are cued to the A side until they reach a criterion of two or three successive "correct" reaches to A and then switched to B (e.g. Diamond 1985; Bell & Fox 1992). Correct in these tasks means not just touching the cover, but lifting the cover and touching the hidden toy. In this version of the procedure, therefore, the actual number of reaches to A before the switch is not reported and is unknown. An infant may make several "incorrect" reaches to the A side, or several "correct" reaches but not in succession. Without counting training and/or actual movements in the A direction, the full range of parametric effects for repetition was therefore not obtained, and indeed cannot be ascertained from the information provided in the published reports (Diedrich et al. 1999). When these parameters are fully explored, as in Smith et al. (1999b), the effect of repeating movements is overwhelmingly strong. Indeed, a recently published new meta-analysis of A-not-B reaches the identical conclusion: that perseveration is a function of the number of A-side reaches (Marcovitch & Zelazo 1999).

That the direction of reaching depends critically on where infants just previously reached means that the A-not-B task is a memory task on two time scales. First, it requires that infants remember the location of the specified target in its absence, as discussed above. But second, there must be memory dynamics between the time when the reach is executed in one trial and the sight of the target and decision to reach on the succeeding trial. We suggest that this time between retrieval trials is also of critical importance because infants remember the actions they just performed during the time between recovering the toy and the next specific cue. Thus, when the second cycle of cuing, deciding and reaching commences, it is initiated within the time span of the memory of the previous cycle. The third cycle builds upon the memory of the first two and so on, such that when the baby is finally cued to B, the memory traces from the first six reaches to the A side may be very strong. This memory is a motor memory whose content can be examined, at least in part. Critical to the model, therefore, is that the memory of one action is in the same space of movement parameters as the plan for the subsequent actions and can thus influence the evolution of the next movement. Again, we substantiate these claims in a later section.

Infants make perseverative location errors because the motor memory of one reach persists and influences subsequent reaches.

3.2.6. Development. In the canonical form of the task, only infants between 7 and 12 months of age consistently make the error. Before this age span, infants do not search at all for hidden objects, and after a year, infants search successfully where they saw the object last disappear, even after it has been displaced several times. Thus, in traditional interpretations, perseveration is seen as a distinct stage in cognitive development: that of incomplete object representation.

It is incorrect, however, to assume that perseverative reaching responses are unique to a particular stage in infancy. Try moving an article in your kitchen from a long-established location and start cooking dinner! Indeed, adults with no perceptual or motor impairments can be trained into perseverative or biased responding within an experimental session in the laboratory (e.g., Ghilardi et al. 1995). Especially relevant is a recent study where Smith et al. (1999a) elicited strong perseverative reaching responses in 2-year-olds, at an age at which the Piagetian object concept should be strongly established. These authors asked toddlers to retrieve a toy hidden in a long narrow sandbox, a task that differed from the classic A-not-B exercise primarily in the less well-delineated targets. (Once the toy was hidden in the sand, no covers to the hiding places helped mark the possible locations.) After several recoveries from the A side, toddlers continued to search in the vicinity of the A location even after the toy was hidden on the other side.

We contend here that the processes that create perseverative responding in infancy are not special, but are the very same processes that lead to correct responding at this age, and also to correct and perseverative responding in individuals at any age. Nonetheless, it is incumbent upon us to explain which aspects of the processes can account for the developmental effects: why location errors occur at particular ages and under certain conditions and not at others.

We believe that all of the above-mentioned contexts and parameters contribute in coupled and perhaps nonlinear ways, and that they all may have age- and experience-specific dynamics. In short, the age differences reside in the particulars of the environment and timing demands of the tasks. Consider, for instance, the problem infants face in distinguishing the two identical lids. There are good developmental reasons for infants to have difficulties at say, nine months, and to improve their abilities in this regard by 12 months, but still to be confused at 24 months when the task becomes more ambiguous. In the last part of the first year, infants have limited experience with spatial localization, especially before they have moved themselves around. Understandably, self-locomotion may focus infants' attention on where things are in the environment because these objects and places become relevant in ways they were not when babies are still transported by others (Horobin & Acredolo 1986). Increasing visual attention alone may change the impact of the various manipulations of the task. In addition, infants may learn to pay attention to the relationships of objects in the environment to each other and thus form a better conception of right side versus left side when faced with an otherwise ambiguous visual scene. These experiential effects are co-evolving with infants' increasing skill at reaching. As infants move about by walking, cruising, and crawling, they reach for different items, at different levels and locations, and from a variety of postures. Thus, their action capabilities expand and become more flexible. At the same time, we may suppose, these interrelated perception-action experiences also impact memory processes, as babies now have reasons to remember where things once were when they are presently out of sight. 
What is critical here, however, is that these changes are gradual and not all-or-none: even given good reaching skills and adept use of landmarks, toddlers and even adults may still become confused when the target locations are transient or not well-marked or when the delay between target cue and go signal for reaching is extended.

Given these multiply-determined, dynamic, and cascading effects, the power of the model is to offer entry points for actually probing how the process parameters may change over developmental time. As a starting assumption, we focus on one of the likely sites for developmental change, the mental processes involved in the motor planning and decision to reach to either the A or B side. We will spend considerable time in the next sections justifying why it is appropriate to express the evolution of the A-not-B error dynamics in terms of a motor field, and therefore, how developmental events may impact upon the behavior of that field under certain task conditions.

Thus, and finally, age-related changes in the likelihood of perseverative reaching may result from parameter changes in multiple contributing processes; one candidate is a change in the properties of the integrative motor planning process.

4. A dynamic field model: Overview and rationale

Previous accounts have assigned the cause of the A-not-B error to infants' deficits in object knowledge, spatial localization, memory, or inhibition. In contrast, we center our attention on reaching, and in particular, the processes that lead to a directional reach to A or B. The challenge, therefore, is to explain, in terms of the normal processes involved in reaching, behavioral phenomena that make it look as though infants really do have problems with object permanence, cannot escape their body-centered understanding of space, and lack something in their memories or inhibitory mechanisms. Thus, our formulation is justified insofar as we can show that the well-documented context- and age-effects described above can be manifest in the domain of the action of reaching.

The model describes the mental events that constitute the decision to reach to A or B as activations in a dynamic field expressed as directional movement parameters (Erlhagen & Schöner 1999; Schöner et al. 1997). The field, which has nonlinear properties, evolves continuously under the influence of input dynamics from three sources, also expressed in compatible directional parameters. These include the specifications of the task environment, which establishes the decision field, the specific cue to reach to A or B, which is transient and must be remembered, and, after the first reach, a memory dynamics which biases the field for the subsequent reach. Both the properties of the field itself and the input dynamics can be assigned parameters that are derived from data, and we simulate, using such parameters, the robust effects documented in the literature. We also use the model to generate novel predictions.
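
To make the interplay of the three inputs concrete, the following Python sketch reduces the planning process to two leaky integrators, one for the A direction and one for the B direction. It is a deliberately simplified illustration and not the field model itself: it omits the continuous direction dimension and the nonlinear interaction within the field, and every name and parameter value (`simulate_b_trial`, `memory_a`, `cue_strength`, the time constant, and so on) is a hypothetical choice made here rather than a quantity derived from data.

```python
def simulate_b_trial(delay_s, memory_a=1.0, cue_strength=4.0, cue_s=2.0,
                     task_input=1.0, tau_s=1.0, dt=0.01):
    """Return "A" or "B": the site with higher activation when the reach is allowed.

    Each site follows tau * du/dt = -u + input (forward Euler integration).
    Task input is tonic and equal at both sites, the specific cue is applied
    at B only while it is present, and the memory input biases A throughout.
    All values are arbitrary illustrative assumptions.
    """
    u_a = 0.0
    u_b = 0.0
    n_cue = round(cue_s / dt)              # steps during which the cue is visible
    n_total = n_cue + round(delay_s / dt)  # cue period followed by the delay
    for step in range(n_total):
        cue_b = cue_strength if step < n_cue else 0.0
        input_a = task_input + memory_a    # tonic input plus memory of past A reaches
        input_b = task_input + cue_b       # tonic input plus transient cue
        u_a += dt / tau_s * (-u_a + input_a)
        u_b += dt / tau_s * (-u_b + input_b)
    return "A" if u_a > u_b else "B"


for delay in (0.0, 3.0):
    print(delay, simulate_b_trial(delay), simulate_b_trial(delay, memory_a=0.0))
```

With these arbitrary values, the sketch selects B when the reach is allowed immediately and A after a 3-second delay, because the cue-driven activation at B decays while the memory input continues to support A; with the memory input set to zero it selects B at both delays. The sketch shows only that a transient cue competing with a persistent memory bias can produce a delay-dependent switch, not how the full nonlinear field behaves.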

Note here three critical aspects of the model. First, although we maintain that the model is biologically plausible, and discuss this further below, it is an entirely abstract model, and not a neuroanatomically specific account of looking, reaching, and remembering. In its present form the model also does not incorporate specific bodily parameters such as muscle anatomy, segment masses and centers of gravity, or joint configurations. Rather, the model captures the abstract, collective dynamics of multiple processes, which are likely happening in parallel in many integrated sites in the brain and the body. Second, and related, the concrete mathematical functional form of the model is not unique (see for instance, Grossberg 1980; Wilson & Cowan 1973). What is important here are the assumptions of blended, continuous performance and how closely these assumptions both match the previous experimental results and generate good, testable new experiments. That this precise formulation works validates our assumptions but does not preclude other theoretical instantiations using similar assumptions. Third, we put forth this model as only the first step towards a fully embodied account by casting the mental events that precede the movement into the same dynamic language appropriate for movement itself: continuous and time-based. The important step of integrating the actual movement dynamics with the motor plan dynamics remains to be done. It is a difficult challenge, given the complex nature of the coordination and control of limb movements (see Erlhagen & Schöner 1999; Kopecz & Schöner 1995; also Jordan 1990; Bullock & Grossberg 1998; Houk et al. 1995 for neural network accounts of the motor control problems).

4.1. Integration in a motor planning field

Reaching in infants, like reaching in adults, begins when individuals see objects they want. We know from a vast literature that 150-300 milliseconds elapse between the visual fixation of the target or some other "go" signal and the actual movement of the arm and hand toward it. During this time, conventionally called the reaction time, persons are processing the visual stimuli, effecting some sort of transformation from visual space to body-space and establishing the specific movement parameters for the execution of a movement that will attain the goal. After this "motor program" is set, the traditional explanation goes, then the reach is triggered and the actual movement occurs. In terms of the cartoon of Figure 1, the series of events can be imagined as the diagonal row of icons from left to right: look, plan, reach.

The first part of the process, involving attention, vision, and planning, comprises the mental or cognitive events usually studied by psychologists. The prevailing theoretical frame of reference has been that of information processing: the vocabulary revolves around notions of programs, codes, and representations, and processes such as feature extraction, response choice, and serial stages. These mental constructs are accessed through manipulating the attentional demands of the task, the nature of the stimuli or the memory load and by looking at reaction times and error rates. Investigators cared about mental events; the movement itself was of little or no concern. In contrast, movement scientists have been more focused on the actual control and execution of the reach itself. In this case, they focused on kinematic variables such as trajectories and velocities, and the biomechanical and neuromuscular contributions to the transport of a physical entity--a real limb--in space and time. They might vary the load on the limb, speed and accuracy requirements, work space demands, postures, and so on, and measure actual movements, forces, and muscle patterns. People plan in order to move; yet, in the traditional formulations, there was no common currency between the activities of the mind and those of the limb (for discussions see Allport 1987; Poulton 1981; Keele 1981; but also Georgopoulos 1986; Hommel 1996; Prinz 1997; Rosenbaum et al. 1995; Schöner 1995).

Despite this historical dualism, it is increasingly recognized that the processes of action planning and execution cannot, in reality, be so neatly parceled and expressed in such incommensurate dimensions. Empirical advances over the last decade or so have gone far in breaking down the strict serial processing assumptions of "first preparation, then execution," in favor of much more parallel, mutually influential mechanisms. There is growing evidence, for instance, that the visual input is not just important at the start of the process, but is continuously and intimately influential throughout the planning and execution of the reach trajectory, and that adults are proficient at producing on-line corrections to movements, indicating that the programming process is an ongoing dynamic, not rigidly fixed by the initial target specifications (see, for instance, Goodale et al. 1986; Prablanc & Martin, 1992). Relatedly, perception itself is not isolated from action: the very act of perceiving is enmeshed with actions that accompany it (Muessler & Hommel 1997). Indeed, we believe that it is just this amalgam of processes that gives rise to infant perseveration.

Because we take the unusual stance that the A-not-B error arises from a motor planning process that is part of a dynamic perception-action loop, we take some time at this point to substantiate this fundamental assumption. Specifically, we give evidence that 1) actions are planned in movement parameter space; 2) the plans are continuous and graded in nature; 3) plans evolve under continuous perceptual influence of both task and cue; and 4) the system has history.

4.1.1. Actions are planned in movement parameters. The most compelling and direct evidence that what has been traditionally called "cognition" can be effected in movement parameters comes from the pioneering work of Apostolos Georgopoulos and his colleagues (for reviews, see Georgopoulos, 1986, 1990, 1991, 1995). In their now-classic experiments, these investigators trained monkeys to reach for different targets in space while they recorded simultaneously from many neurons in the motor cortex. They discovered populations of neurons that together code for the direction of movement. This code is a product of an ensemble of neurons, each of which is only broadly directionally tuned. Together, however, they provide a unique population vector that points in the direction of the movement to the target. There are several aspects of these findings that are especially relevant for our model. First, the code is body-centered: the population vector points in the direction of the target despite the monkey starting from various positions. Visual and body information are in the same parameters. Second, the vector emerges gradually and continuously in the planning period between cue and reach. As illustrated in Figure 3 (top panel), in the approximately 150 msec between the presentation of the target and the actual start of the movement (indicated by the velocity vectors of the movement) neuronal activity gradually builds and predicts the upcoming movement. Third, the vectors mirror the direction of the velocity of path of the hand, suggesting that the population vector carries information about the instantaneous velocity of the hand. Again, this is strong evidence of the compatibility and continuity of the planning/action code. Fourth, the population vector predicts the direction of reaching during a delay. In a good analogue of the infant A-not-B task, researchers presented the target briefly, then instructed the monkey not to move until a cue was given 450-750 ms later. 
Figure 3 (middle panel) shows the vectors in the direction of the cued signal held in memory during the delay, again as a continuous and graded signal. Indeed, when the light was not turned off during the delay, requiring no memory, the vector length was somewhat decreased (Smyrnis et al. 1992). Finally, specific task-related cognitive manipulations learned by the monkeys can be detected in the evolution of the directional vectors. In one such study, Georgopoulos and colleagues trained a monkey to sometimes move a handle 90 degrees counterclockwise from a reference direction and sometimes move directly to it. During the reaction time--before actual movement--this directional rotation could be detected in the population vectors (Figure 3 bottom panel), indicating that the animal was performing a mental rotation from the accustomed to the novel motor direction. Again, this rotation was gradual and continuous and involved activation of cells tuned not just to the stimulus and rotated direction, but also those tuned to the directions passed through during the rotation. This, according to Georgopoulos, "provided for the first time a direct visualization of a dynamic cognitive process" (1992, p. 514). Similarly, in a recent paper, Kettner et al. (1996) showed that in monkeys trained to move to two locations in a sequence, activation of both sets of directionally specific neurons--to the first and then the succeeding movement--could be detected in motor and premotor cortex, again before the movement actually began.

Figure 3: Top panel: cortical population vectors point in the direction of the movement before the movement begins. Middle panel: population vectors in the direction of the cued signal are held in memory during the delay. Bottom panel: Rotational vector detected during the reaction time when animal was performing a mental rotation. (A) Rotation task showing direction of stimulus (S) and movement (M). (B) Neuronal population vectors calculated from the onset of S until the onset of M. (From Georgopoulos, 1995).
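The population-vector idea described above can be made concrete with a small numerical sketch. This is only an illustration under stated assumptions, not the analysis used in the cited recordings: we assume eight hypothetical cells with evenly spaced preferred directions and idealized cosine tuning, and the names `cosine_tuning` and `population_vector`, the baseline rate, and the gain are all invented for the example.

```python
import math

def cosine_tuning(preferred_deg, movement_deg, baseline=10.0, gain=8.0):
    """Firing rate of one broadly tuned cell (assumed cosine tuning)."""
    delta = math.radians(movement_deg - preferred_deg)
    return baseline + gain * math.cos(delta)

def population_vector(preferred_dirs_deg, rates, baseline=10.0):
    """Sum each cell's preferred-direction unit vector, weighted by its
    rate relative to baseline, and return the direction in degrees."""
    x = sum((r - baseline) * math.cos(math.radians(p))
            for p, r in zip(preferred_dirs_deg, rates))
    y = sum((r - baseline) * math.sin(math.radians(p))
            for p, r in zip(preferred_dirs_deg, rates))
    return math.degrees(math.atan2(y, x)) % 360.0

# Eight hypothetical cells with preferred directions every 45 degrees.
preferred = [i * 45.0 for i in range(8)]
movement = 70.0  # direction of the upcoming reach, in degrees
rates = [cosine_tuning(p, movement) for p in preferred]
decoded = population_vector(preferred, rates)
```

No single cell in this sketch is tuned to 70 degrees, yet the weighted sum of the broadly tuned cells recovers that direction, which is the sense in which the ensemble, rather than any one neuron, carries the directional code.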

The A-not-B error is an error of reach direction. Directional vector coding has been described in several areas of the brain, including primary motor cortex (Georgopoulos et al. 1982), dorsal premotor cortex (Caminiti et al. 1991), areas 2 and 5 of the parietal cortex (Kalaska et al. 1983), and in the cerebellum (Fortier et al. 1989). Moreover, in several of these areas, the neural response dynamics are congruent with the visual-motor integration we propose in the model, as we discuss further in a later section (Pelligrino & Wise 1993). Thus, it is tempting to conceptualize activation in the model to be direction-specific. However, there is also considerable debate among motor physiologists over which movement parameters are actually encoded in the brain. For instance, researchers have discovered neurons in motor, premotor, and parietal cortex that are specific to a body-centered code, and change with the postures of the arm, before, during, and after the movement (e.g. Caminiti et al. 1991; Lacquaniti et al. 1995; Scott & Kalaska 1997; Scott et al. 1997). It is possible and likely that the CNS uses multiple frames of reference. The precise coordinates of the movement parameters are not critical for our model, however, as the activation in parameter space is purely abstract, and could therefore be topographic with respect to body segments as easily as metric with respect to extrinsic coordinates.

4.1.2. The plans are continuous and graded in nature. The analogue character of motor planning is clearly manifest in the direct recordings from monkey cortex. But there is similarly compelling evidence of the continuous and graded nature of this process from behavioral studies of adult humans done by Claude Ghez and his collaborators (Hening et al. 1988a, 1988b). Their task, like the one used in the monkey studies, is similar to the infant A-not-B in that the target was ambiguous and that there was a delay between preparation and execution. Participants were asked to match forces generated by isometric muscle contractions to one of three target amplitudes shown on a computer screen. (Think of contracting your biceps with your arm held still: a little, some more, and a lot.) The experimenters instructed the participants to respond on the last of four successive, equally-spaced tones. The manipulation was that the target specification flashed on the screen at varying intervals between the third and fourth tone (see Figure 4). When the interval between target and go signal was long, individuals had a long time to prepare their response. When the interval was short, participants had to respond presumably before the planning process was complete. Indeed, what the Ghez group found was that the distribution of responses varied systematically with the time available for preparation. Figure 4 also illustrates these results. At short intervals and without knowledge of the upcoming target, individuals responded with a middle, "default" response. Gradually, as the time between target and response signal increased, these force amplitudes shifted toward the target values. (In contrast, when participants knew well ahead what their target would be, this interval did not matter.)
By this clever experiment compelling participants to move before they were "really ready," these researchers sampled the response preparation time and showed that the planning gradually evolves--it is not an all-or-nothing trigger for response initiation. Equally remarkably, and also relevant, is that the default response, the middle amplitude, was prepared in advance, presumably specified from prior trials in the task.

Figure 4: The effect of response preparation time on target accuracy. Top panel: participants were given variable times between the appearance of the target for an isometric force amplitude (three levels of force, indicated by the filled bars in the top panel) and the "go" signal indicated by the fourth of 4 tones spaced 500 msec apart (S-R interval). The diagonally striped bar is the center or default value, which is what the participants prepare ahead of the specific target cue. The effect is shown on the bottom panel. When the S-R interval is short, participants respond with the middle default response. Target accuracy increased continuously with increased planning time. (Redrawn from Hening et al. 1988a).

Together, the Ghez and Georgopoulos studies point to planning for action as specified in movement parameters and as having real dynamics, that is, a time course of activation and decay. The dynamics can be viewed directly by neural recording or indirectly through behavioral manipulations. We contend that the A-not-B error is, much like the Ghez studies, a window on these planning dynamics, with the ambiguous targets and delays providing the necessary manipulations.

4.1.3. Plans evolve under continuous perceptual specification of task and cue. Having established the likelihood that the plan for action is expressed in the same dynamic variables that control action--that planning and reaching are meshed and continuous--we now add the third parallel channel, looking. There is a large and growing body of evidence that visual behavior is intimately linked to every aspect of the reaching task, and therefore, that we are justified in assimilating the visual dynamics into the reaching field. Looking affects reaching and reaching affects looking.

Consider the following adult experiments which link direction of eye movements with motor responses. Fisk and Goodale (1985) asked participants to reach with right and left hands to targets that were both ipsilateral and contralateral to the hand used. Latency to reach onset was shorter when participants moved to targets on the same side as the reaching hand and also when using their dominant hand. This effect was precisely echoed in their eye movements. Although eyes moved toward the targets 50 msec sooner than hands did, eye movement latencies were also decreased when the hand was reaching to the ipsilateral side and/or was the preferred hand. Now there are reasons to expect that neuromuscular and biomechanical factors would make it easier to prepare movements on the same side or with the accustomed limb. Eye movements to the right or left, however, if independently specified, should be equally facilitated; there should be no biases in the system. But in the dynamics of this task, there are. One conclusion is likely: "The relationship between ocular and manual latencies suggests that production of these two motor responses is far from independent and that programming within the two systems must be integrated (at least temporally) at some level of the central nervous system" (p. 170).

A second study of looking and reaching closely parallels the A-not-B task: the target is only briefly presented and must be remembered and the task situation provides ambiguous cues. This experiment, conducted by Enright (1995), demonstrates that momentary changes in gaze affect reaching accuracy. Participants initially fixated a central target. After a few seconds, a peripheral target lit up and then was turned off. The central target remained on for another 2 seconds--the delay-- and when it went off, participants had to reach toward the remembered target. Three different conditions were used. In one, individuals kept their gaze on the central focus and pointed in the dark. In the second, they were instructed to look at the missing target immediately after the fixation light went off, and in the third, to shift their gaze to the target during the time it was lit and to keep it there. Participants had more accurate aiming when they shifted their eyes toward the target, whether the target was visible or not. It is not so surprising that people would be better when they immediately shifted their gaze and held it at the target. But the difference between those who kept their gaze at midline and those who shifted just as they reached suggested that it was the eye orientation during the pointing process itself (even though the target was memorized and not visible in either case) that had the impact on the movement outcome. This must mean, according to the author, that information about the direction of the gaze and not the visual information per se, combines with the spatial specification of the upcoming movement which, in this case, is stored in memory. Where one looks matters in how one reaches.

These experiments showed the ongoing influence of looking at the specific targets on reaching, but other studies also show that even seemingly irrelevant distractors in the visual field perturb reaching performance, and most importantly, that the disturbance is action-centered. These experiments are pertinent because the A-not-B task can be seen as one where the target (the cued location) is made ambiguous by distractors in the field. Tipper et al. (1992) presented adults with a 3 x 3 matrix of buttons on a sloped board. Buttons in the middle row, when lit by a red light, were the targets and those in the front and back rows, when flashed yellow, acted as distractors in some trials. Latency to reach was consistently longer when distractors were present, but asymmetrically so. Distractors interfered more with the reach plan when they were below and in front of the targets, and thus acted as visual perturbations to the path of the hand, than when they were above and behind them, out of the hand path. In addition, and consistent with the Fisk and Goodale studies described above, right-handed participants were slower when the distractor was on the right side, again in the path of the hand. The visual pull for attention was again manifest in motor dimensions.

Another particularly clear demonstration of the influence of the entire visual task environment, in interaction with the target dynamics, comes from a study by Jackson et al. (1995). These authors asked participants to reach for a wooden block, sometimes in the presence of a wooden dowel placed either midline or peripheral to the target. They found a significant distortion of both reach and grasp kinematics when the distractors were in the visual field, but only in the condition when participants closed their eyes. There was an effect of flanker position as well, whereby distractors at midline affected only those reaches that crossed midline, whereas those at the periphery distorted reaches to the ipsilateral side. The effect of vision is especially relevant: when the targets were continually in view, the authors suggested, participants had time to attentively select the correct target and prepare the motor response. When eyes were closed, however, the target location had to be remembered for about 2 seconds. As in the A-not-B task, during this gap, memory of the distractors must have influenced the memory of the specified target location--in reach kinematic parameters.

One further line of evidence of the reach-look synergy comes from studies where limb proprioception is manipulated as well as vision of limb and target. For example, when participants were instructed to move to either visible or memorized targets without seeing their limbs, the directional biases in target accuracy depended on how far the hand starting position was from the middle of the body (Ghez et al. 1995; Ghilardi et al. 1995; see also Prablanc et al. 1979; Rossetti et al. 1994; Desmurget et al. 1995, among others). When both the hand and the target were visible, no such biases appeared. Presumably, given the relatively less practiced task of reaching from either one side or the other, people needed to update their hand-to-target motor plan with additional visual information about the position of their limbs.

4.1.4. The system has history. The A-not-B task is about looking and reaching, but it is also about remembering on two time scales. The target position must be remembered during the reach. But also critical to our model assumptions is that the memory of the just-completed reach is retained to bias subsequent movements. As documented in the previous section, this memory likely includes the entire task set-up, which "pre-shapes" and biases the field in the same movement parameters as the reach plan (see also Mitz et al. 1991; Spencer 1999). But we also propose that the motor memory of the just-completed movement is retained and integrated into the next plan. There is evidence for this assumption in the adult literature. For example, in the Ghilardi et al. (1995) study described earlier, participants showed biased directional errors when reaching from unusual starting positions such as the far right or left of their bodies. With extensive training in one area of this novel work space, say to the right or to the left, these biases were eliminated. However, the repeated movements in the trained area then skewed performance in the untrained areas toward the trained position, even creating systematic errors in midline where none were found before the experimental training. Directional accuracy, therefore, was a direct consequence of repeated moving in that task. When people are accustomed to reaching from a middle position, they make errors in other areas of the workspace. This visual-motor map must be highly dynamic, however, as it could be changed through altered experience. The system retains a memory of previous movements that incorporates the feel of the arm in relation to the target and uses the memory to plan future responses.

Equally dramatic evidence of these movement field memories was provided by studies of Shadmehr and Mussa-Ivaldi (1994) (see also Lackner & Dizio 1994), who created novel reaching conditions by subjecting participants to an artificial force field. At first, participants could not make straight reaches to a target; when they tried their habitual reach parameters, the novel forces created curved trajectories. With extensive training, however, participants learned to adjust their arm dynamics to the new environment, and performed straight reaches again. When the force field was then unexpectedly turned off, the learned adaptation remained, and participants now produced curved movements, as though they were still compensating for the unusual field. These "after-effects" again demonstrated that, even in adults, repetition of an action changes subsequent movements, indicating that the system retains a memory for the movement parameters from trial to trial.

4.2. Motor plans in infants

Are the fundamental dynamics that produce these coupled interactions of looking, reaching, and remembering in adults the same as those that lead to perseverative reaching in infants? Although direct behavioral evidence is sparse, a few recent experiments increase our confidence that model assumptions chosen from adult studies are also good for infants.

We discuss first the issue of looking and reaching, and especially that the direction of the gaze is mutually coupled to the direction of movement. This was directly tested by Smith et al. (1999b) using the canonical A-not-B hiding, but adding a simple manipulation of the direction of visual attention, as we described earlier. Infants whose visual attention was pulled in the direction of their original movement training stuck with their ongoing motor habit; conversely, infants whose glances were in the opposite direction of their movements were more likely to switch to the new target. In infants, as well as adults, goal-directed reaching is coupled to the direction of visual attention.

Second, infants code reach direction in a manner that incorporates both postural and trajectory information, as suggested by the neurophysiological evidence in primates mentioned previously. Furthermore, these parameters are held in memory and influence subsequent reaches. For instance, in another experiment, Smith et al. (1999) reasoned that if the memory built up of repeated reaches to A is based specifically on the position of the hand and arm in relation to the target, then shifts of posture that disturb the remembered hand-target trace should also disrupt the perseverative pull to A, much like a glance in the B direction competes with the activity pulling the baby to A. To test this idea, the researchers gave infants the standard hidden object A-not-B task. For the training trials and A-side test trials infants sat on a parent's lap, as is customary. However, between trials A2 and B1, parents stood their infants up so that the baby had to reach down to uncover the toy. Control group infants were distracted visually with a colorful, noisy toy shown to them at midline. Infants who saw the centered visual distraction perseverated when cued at B. However, when their posture was shifted, infants' tendencies to return to A were dramatically reduced. Indeed, infants who reached from a standing position tended to reach correctly to B, a level of performance not seen in any previous manipulation. (In other experiments, even when perseveration is reduced, infants are not normally correct, but at chance levels of going to either A or B.) Body memory, therefore, was whole-body memory, incorporating the trace of the reach from a specific arm-to-body position to a specific location. Disrupting that memory through a bodily perturbation was especially powerful in interrupting the influence of previous reaches.

Recently, Diedrich et al. (in press) provided equally compelling support to our assumption that the A-not-B error is generated from motor memories. These investigators, for the first time, actually tracked the path of infants' hands while they engaged in a no-hidden-objects version of the task, using computerized motion analysis equipment. At nine months--the age of the participants--infants are not yet fully skilled reachers, as evidenced in their hand trajectories. Although they go quickly and rather accurately to the target lid, their hands trace a somewhat bumpy course, speeding up and slowing down several times (Hofsten 1991; Thelen et al. 1996). Each reach of each infant, therefore, has a distinctive speed signature, which normally varies from reach to reach. However, when infants reached repeatedly to the A side in the two-target task, Diedrich et al. (in press) discovered a remarkable result. In those infants who perseverated to A on both B trials, and presumably had built up a strong motor memory of the direction of their reaches, their trajectories converged in form. That is, the speed bumps became increasingly alike, as evidenced by increasingly strong pairwise correlations. This is illustrated in Figure 5, which is an example from a single infant reaching to a single target "C" (left panel) and in an A-not-B lids-only task where the baby perseverated on both B trials (right panel). Note that although the repetition at the single lid produced some trajectory resemblance in C3 and C4, there was a strong tendency for trajectory convergence in the two-lid task. For the group of infants, convergence was less strong in infants who spontaneously reached occasionally to B during the A trials, who reached correctly to B, and in infants reaching to only one target. It is unlikely that several reaches will have exactly the same time-space signature repeatedly in the absence of some memory of the previous reach.
We can conclude that, in the conditions that produce perseverative reaching--repetition in the face of the novel and confusing two targets-- both the direction and the pattern of changes of forces producing hand accelerations are held in memory from one trial to the next. And this memory is a powerful influence on the movement parameters generated for the next reach. In sum, there are good behavioral and neurophysiological reasons to seat the reach decision in a movement parameter field that integrates the visual characteristics of the task and the memory of the previous actions. Also critical is our assumption of a graded and continual field where these integrated dynamics evolve.

Figure 5. Speed profiles of an individual nine-month-old infant in two reaching tasks. Successive trials are shown on the Y axis. Left panel: hand speeds while reaching to a single, centered target 6 times. Right panel: hand speeds while reaching in a no-hidden-object A-not-B task where the infant reached to A on both B trials, although the kinematic data from trial B1 were missing. (See Diedrich et al., in press.)

5. The dynamic field model

The model presented herein is an extension and modification of a dynamic field theory of motor programming originally formulated by Erlhagen and Schöner (1999) that is based on mathematical models formulated by Amari (1977), Grossberg (1980), and Wilson & Cowan (1973) (see Grossberg, 1988 for reviews). Erlhagen and Schöner developed the model to formulate the processes of movement planning in dynamic language that ultimately may be reconciled with the dynamics of movement execution. Kopecz and Schöner (1995) and Schöner, Kopecz, & Erlhagen (1997) have applied a similar model to the planning of eye movement saccades. Readers are referred to these papers for additional technical details concerning the models.

5.1. The movement planning field

We begin by describing the dynamics of the movement planning field, the site of the integration of visual input and motor memory, and the generation of the decision to reach to A or B. As we have argued earlier, this field must be able to generate and maintain specific activation states denoting the directional parameters of the reach in a continuously evolving manner that simulates the gradual specification of motor plans seen in experiments. Thus, the dimensions of the field in this case are the movement parameters appropriate to planning and executing a reach in a specific direction to the right or to the left. The field represents the relative activation states of those parameters. At this point, we conceptualize this field only in abstract terms as a site where visual input and memory are integrated into movement parameters supporting movement amplitude, direction, or time. Later in the discussion, we will speculate further as to possible neuroanatomical areas where such a field might evolve.

Although the model dynamics result in a dichotomous choice-- A or B--it is important to emphasize again that the behavioral dimensions supporting the choice are continuous directional parameters where A and B are locations on this continuum. Unlike classic symbolic models where either A or B constitute the universe of choices, the field model allows metric specification of particular parameters from a continuum of possible actions (see Lewin, 1946, for an early version of choice behavior on a continuum.) The activation field then assigns an activation variable to each site on the dimension. The specification of the movement is thus a function of the amount of activation at particular values representing, in this case, direction. Thus, the field literally has a shape that reflects different possible movement states: a sharp local peak indicates a well-specified motor act. Activations that are more graded and distributed imply that the movement parameters are less-well specified, resulting in more random responding or less accurate actions toward the targets.

In Figure 6, we depict the dynamic field in terms of direction to the A or B side as a continuous space spanning the infants' visual and reaching field and locating the targets at A or B. A reach to A requires strong, above-threshold activation at the A target in movement parameters: infants need to activate whatever combination of muscles is needed to get their hand to the cued target. In the upper left panel we depict the specification of a reach to A. In contrast, without any cues to A or B and when both are visible, infants may have equal and relatively low activation at both locations, and may not reach at all. The upper right panel illustrates this condition. Graded information about the target choices is reflected in the activation space, but not in sufficient strength to trigger a movement. The bottom panel represents another possibility where the distributed activation is asymmetric, favoring the A site, but where the activation has still not reached threshold for movement generation.

Figure 6. Activation in continuous movement parameter fields at values corresponding to the A or B direction. Top left: activation passes threshold (dotted line) for a reach to A. Top right: subthreshold activation leads to no reach at all. Bottom: Graded and asymmetrical, but still subthreshold, activation at both sites.

5.1.1. Dynamics. Given our extended justification for choosing the parameters of the field, we begin by defining the movement parameter, x, and its dynamic field u(x) as representing those motor values the baby can continually specify to move in any direction from right to left. Our foundational assumption is that this dynamic field changes continuously with time, thus u(x,t). The state of the field depends, however, not just on the dimension x, but also on its own level of activation, u. This means that u(x,t) itself has continual dynamics, where the next state depends on the previous one. The level of activation, therefore, cannot jump, but must build up gradually instead.

This build-up depends on the nature of the field and the inputs to the system, in this case the information about the task structure itself, the specific cue to A or B, and after the first reach, the memory of previous reaches. These inputs, as we stated earlier, are expressed in the dimensions of the field: S(x,t), here, again, in a reach direction specifying A or B. (We explain the contributions to S(x,t) in detail in the next section.) The simplest case, in which the inputs are simply added to the field, can be expressed mathematically as

(1)   \tau \dot{u}(x,t) = -u(x,t) + S(x,t)

where we have also added a linear decay term, -u(x,t), which together with the constant τ defines the time scale over which the field gradually builds up or decays.

5.1.2. Time scale. Because the field is dynamic, its time scale of activation is critical. We illustrate the operation of τ, the time scale parameter, with a simple case where each site on the field evolves independently, that is, without influencing its neighboring sites. Under this condition, the activation at each site relaxes over time to the level of the input, and the stationary solution, u(x) = S(x), directly reflects the input. However, when the input changes, the activation in the field does not change with the input instantaneously, but has a certain inertia. Specifically, when the input changes in a step-like manner, say from S(x) = 0 to S(x) = S0(x), then the field changes exponentially according to:

(2)   u(x,t) = S_0(x) + \Delta u(x) \exp(-t/\tau)

where Δu(x) is the initial deviation of u(x,0) from S0(x). Thus, τ is the amount of time during which the distance between the current activation level and the input level is reduced to 1/e, or about 37% of its initial value. Because τ expresses a proportional change, the temporal evolution of a site proceeds independently of its current activation, although its activation level itself is strictly a function of its previous activation. This is a characteristic of any dynamic system close to its stationary state. Thus, without cooperativity, the field evolves over time to assume the shape of the specific input and then it decays.
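The relaxation behavior described here can be checked numerically. The following is a minimal sketch, not the authors' simulation code: it integrates a single uncoupled site with the Euler method and compares the gap remaining after one time constant with 1/e. The function name `relax` and all numerical values are hypothetical choices made for illustration.

```python
import math

def relax(u0, s0, tau, dt, t_end):
    """Euler-integrate tau * du/dt = -u + s0 for a single, uncoupled site."""
    u = u0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        u += dt / tau * (-u + s0)
    return u

tau = 1.0   # time scale in arbitrary units (hypothetical value)
s0 = 2.0    # level of the step input at this site (hypothetical value)

# Start at zero, switch the input on, and integrate for exactly one tau.
u_after_tau = relax(u0=0.0, s0=s0, tau=tau, dt=0.001, t_end=tau)

# Fraction of the initial distance to the input level that is still left.
remaining = (s0 - u_after_tau) / s0
analytic = math.exp(-1.0)  # 1/e, about 0.37
```

With a sufficiently small step, `remaining` should agree closely with `analytic`, which is the sense in which the time constant sets how quickly a site tracks its input.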

5.1.3. Cooperativity. We used the simple case of site independence to illustrate the temporal properties of the field. This limit case is unrealistic, however, because if all the sites were independent, the field would only reflect the exact parameters of the inputs and could not reach a decision in the face of several competing inputs. A mechanism for integrating graded information is needed: the sites must be coupled so that a single decision can evolve (Amari & Arbib, 1977). To produce such a self-sustaining peak from multiple inputs, the field is endowed with interactions such that sites that are close together are mutually excitatory, whereas more distant sites are inhibitory. We refer to these interactions within the field itself that can enhance (or inhibit) activations as cooperativity. (Note that, strictly speaking, these interactions contain both excitatory and inhibitory activations. Typically only the former are called cooperative connections, while the inhibitory interactions are usually considered competitive. We adopt the term cooperativity to stand for the combined effects of the interactions as both mutually produce (or inhibit) self-sustaining peaks.)

Again, cooperativity, g_intra-field, like the other contributions, is a function of the state of the field itself and so may be expressed as a term added to Equation (1):

(3)   \tau \dot{u}(x,t) = -u(x,t) + S(x,t) + g_{\text{intra-field}}[u(x',t)]

More specifically, gintra-field is composed of two functions: an interaction kernel, and a threshold function. The interaction kernel w(x-x') allows the model to generate self-sustaining solutions by the balance of local and global excitation and inhibition. The interactions may arise from any point, x' in the field. An additional assumption is that these cooperative interactions are homogeneous within the field, so that no point is privileged over others and the interaction thus depends only on the difference, x-x', the distance between the sites. The interaction kernel has the form:

(4)   w(x - x') = -w_i + w_e \exp\left[ -\frac{(x - x')^2}{2\sigma_w^2} \right]

where wi > 0 and we > 0 are the strengths of the inhibitory and excitatory components and σw > 0 is the size of the excitatory region, which establishes the size of the localized activation patterns. This is illustrated in the top panel of Figure 7.

Figure 7. Cooperative interactions within the dynamic motor field. Top panel: Interaction kernel w(r) consists of a local excitatory zone of width σw and strength wexcite and a global inhibitory contribution of strength winhibit. Bottom panel: The contribution of any location in the neural field to the cooperative interaction is determined by a threshold function f(u) applied to the neural activation. The slope (β) of this function determines the degree to which subthreshold values of activation contribute to the interaction.

Not all sites in the motor parameter field can contribute to the interaction at all times, however, as this would lead to a single, inflexible solution. As is true in real neural systems, only those sites that are activated communicate to other sites. Thus, a threshold function, f(u), allows only certain levels of activation to enter into the interaction:

(5)   f(u) = \frac{1}{1 + \exp(-\beta u)}

where β is the slope of the sigmoid function, which is scaled to a zero-to-one range and rises where the system becomes activated, as depicted in the bottom panel of Figure 7. (For a discussion of the different types of nonlinearity involved, see Grossberg, 1973.)

Thus g_intra-field is the product of the interaction kernel and the threshold function, which is then integrated over all sites of the field:

(6)   g_{\text{intra-field}}[u(x',t)] = \int w(x - x') \, f(u(x',t)) \, dx'
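The intra-field interaction just described (kernel times threshold function, accumulated over all sites) can be illustrated on a discretized field. This is a hedged sketch under stated assumptions: the Gaussian-minus-constant kernel shape, the logistic threshold function, the function names, and every parameter value are illustrative choices, not values taken from the simulations reported here.

```python
import math

def kernel(d, w_e=1.0, w_i=0.5, sigma_w=5.0):
    """Local Gaussian excitation minus constant global inhibition (assumed form)."""
    return w_e * math.exp(-d * d / (2.0 * sigma_w ** 2)) - w_i

def sigmoid(u, beta=1.5):
    """Soft threshold mapping activation onto the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-beta * u))

def interaction(u, dx=1.0, **kernel_params):
    """Discretized integral of w(x - x') * f(u(x')) over all sites x'."""
    n = len(u)
    f = [sigmoid(v) for v in u]
    return [sum(kernel((i - j) * dx, **kernel_params) * f[j] for j in range(n)) * dx
            for i in range(n)]

# A field at a low resting level with one active bump around site 20.
u = [-5.0] * 41
for i in range(18, 23):
    u[i] = 5.0

g = interaction(u)
near = g[20]  # net interaction received at the center of the bump
far = g[0]    # net interaction received at a distant site
```

In this configuration the center of the bump receives net excitation while the distant site receives net inhibition, which is the local-excitation and global-inhibition pattern the text attributes to the kernel.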

When the cooperative functions are added to the other contributions to the field, as sketched in Equation (3), the dynamics take the form of Equation (7), reflecting the continual evolution of the motor field with a particular time scale, cooperativity, and inertia, in the presence of sensory inputs.

(7)   \tau \dot{u}(x,t) = -u(x,t) + S(x,t) + \int w(x - x') \, f(u(x',t)) \, dx'

We now add two additional parameters to the equation. The first, h, sets a resting level to the field. Recall that we are able to set a threshold for entering into the interaction kernel. At any given threshold, however, the number of sites that actually participate depends upon the resting level of the field. If this level, h, is small, only sites with strong input contribute to the interaction because the threshold is relatively greater. In contrast, if the sites are already somewhat active, and the threshold is effectively lower, the interaction is much less localized and more widely distributed.

This resting level has profound implications for the resulting field dynamics. When h is low and only strong inputs predominate, the system is driven largely by inputs and less by the local interactions. The field behaves more like the one shown in Figure 7 where the output reflects the input. When h is large, however, many sites contribute to the interaction and the localized excitation becomes amplified by the many excitatory connections of its neighbors and the corresponding surrounding inhibition. In this regime, excitation can become self-sustained even without continual input, and the field can express a decision in the face of multiple inputs (Amari 1977). This is the critical mechanism for the integration of perception and memory into a decision field as it allows inputs of different relative types, strengths, and degrees of specificity to contribute to an integrated motor outcome.

Finally, we add a term, q(x,t), for Gaussian noise, giving the field dynamics the overall form:

(8)   \tau \dot{u}(x,t) = -u(x,t) + S(x,t) + \int w(x - x') \, f(u(x',t)) \, dx' + h + q(x,t)

As in any dynamic system that exhibits multiple states, these fields are sensitive to noise near instabilities, and thus noise is justified in the equation. In reality as well, infants' behavior is noisy. The A-not-B decision is always a probabilistic one; sometimes infants spontaneously reach to either A or B whatever the input or history of the system.
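To make the full field dynamics concrete (decay, input, intra-field interaction, resting level, and noise), the following sketch integrates a small discretized field with the Euler method. It is an illustration under loud assumptions, not the authors' implementation: the Gaussian-minus-constant kernel, the logistic threshold function, the two-bump input, the crude per-step noise term (a careful stochastic integration would scale the noise with the square root of the time step), and all parameter values are hypothetical.

```python
import math
import random

def simulate_field(s_a=6.0, s_b=4.0, h=-5.0, noise=0.1, seed=0,
                   n=41, x_a=10, x_b=30, tau=10.0, dt=1.0, steps=200):
    """Euler integration of tau * du/dt = -u + S + integral(w * f(u)) + h + q."""
    rng = random.Random(seed)
    # Hypothetical field parameters, chosen only for illustration.
    w_e, w_i, sigma_w, beta, sigma_in = 1.0, 0.3, 3.0, 1.5, 3.0

    def gauss(d, sigma):
        return math.exp(-d * d / (2.0 * sigma * sigma))

    # Interaction kernel, precomputed for every pair of sites.
    w = [[w_e * gauss(i - j, sigma_w) - w_i for j in range(n)] for i in range(n)]
    # Constant input: one Gaussian bump at A and a weaker one at B.
    s = [s_a * gauss(i - x_a, sigma_in) + s_b * gauss(i - x_b, sigma_in)
         for i in range(n)]

    u = [h] * n  # start every site at the resting level
    for _ in range(steps):
        f = [1.0 / (1.0 + math.exp(-beta * v)) for v in u]
        u = [u[i] + dt / tau * (-u[i] + s[i] + h
                                + sum(w[i][j] * f[j] for j in range(n))
                                + noise * rng.gauss(0.0, 1.0))
             for i in range(n)]
    return u

u_final = simulate_field()
peak_site = max(range(len(u_final)), key=u_final.__getitem__)
```

With these settings the stronger input at A should end up with the higher activation, while the weaker bump at B is suppressed by the inhibition spreading from A. Varying `h` or `noise` in this sketch is one way to explore the regimes discussed in the text.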

As we show in the simulations below, these intrafield dynamics capture both the age and the delay effects in the A-not-B error, as well as their interactions. In principle, one might imagine developmental changes in any of these interaction parameters: in h, the ability of the field to generate a localized solution; in the strengths of wi, we, and σw, which determine the relative basins of inhibition surrounding peaks of excitation; and/or in the threshold function. For the purposes of our simulations, however, we will assume that the main developmental effect lies in h, the ability of the field to enter the cooperative, self-sustaining regime.

5.2. Inputs to the field

With these intrinsic properties, the action decision evolves in the motor planning field under the specification of three sources of input:

(9)   S(x,t) = S_{\text{task}}(x) + S_{\text{spec}}(x,t) + S_{\text{mem}}(x,t)

The first two, Stask and Sspec, are the parameters of the persisting task environment and of the cuing event that are both present in the very first reach to A. The third, Smem, is the contribution to the current motor decision from the system's history and thus effectively enters into the model after the first reach. Critical to this formulation is that the inputs to the decision field are expressed in movement parameters so that they may be mutually coupled. The assumption here is that the inputs, like the movement parameters, are not discretely specified, but are identified as locations on a continuous field.

5.2.1 Task input. People move within an immediate spatial environment which usually remains stationary. This environment provides persistent visual (or tactile or auditory) input that specifies the task space-- what objects and surfaces delineate the continuous targets and supports for action. These are the features of the world which constitute the behavioral alternatives within the intentions of the actor: the possibilities to move in one direction or another (or forcefully or gently, etc). The task layout is thus prespecified and independent of immediate signals to act. In experimental situations, the task parameters typically do not change during the performance of a single trial, although they will often be varied for different experimental contexts. Without additional specific attentional cues or memory traces, the strength and symmetry of the task input determine the contours of the decision field. When specific attentional or memory inputs are added, they combine with the existing and tonic activation patterns set by the task environment.

Consider the infant reaching task with the two possible target locations, A and B, always in view. In Figure 8, we illustrate the task input (before any additional cues are provided) as activations in motor parameter space. The first way that the task input can be characterized is by the target locations along the decision field. These are represented as probability distributions in the field centered on two locations, xA, xB. In the upper left graph of Figure 8, we represent the activation distributions around two similar targets that are well-separated and provide two clearly specified goals. The upper right panel, in contrast, shows these distributions when A and B are close together, their distributions overlap, and there is a greater probability that the baby would, by chance, reach to either A or B. Indeed, target confusion as evidenced by A-not-B errors is reduced when the covers are relatively farther apart (Acredolo 1985).

Figure 8. Inputs to the decision field at the first reach to A, showing possible configurations of the visual stimuli. Task input. T1: two clearly specified, separate targets; T2: two specified targets close together; T3: two, boring, identical targets; T4: one attractive and one boring target. Specific input. S1: well-specified cue to A; S2: poorly specified cue to A, for example, a quick wave of the hand over the target; S3: well-specified cue to B. The bottom panel shows the time evolution of T3 and S1, where a strong cue to A interacts with two weakly specified targets.

The second way in which the task environment can be parameterized is by the distinctiveness of the targets, and hence, their relative attractiveness. Imagine as an extreme case, a hungry infant faced with two targets, a cookie and a familiar toy. Without any additional cuing, the baby would likely spontaneously reach for the cookie and would persist in going for the cookie despite being repeatedly enticed with the toy. This is a measure, therefore, of the strength of the task environment to compel reaching to one side or another. Thus, Stask,0 is the dimension of the field expressing the tendency of the infant to go to B when A is cued or to go to A when B is cued, in other words, to "spontaneously" be pulled to one side or another. Finally, parameter cA/B is used to express any asymmetry in the task arrangement. When the targets are alike and placed side-by-side, cA/B = 1.

To illustrate this, refer again to Figure 8. The bottom left panel represents the typical A-not-B situation used in Smith et al. (1999b) where the task input specifies two identical covers or targets at A and B. Since the targets are indistinguishable from each other and not highly distinct from the background, there is no strong incentive for infants to reach to either A or B or even to reach at all. The task contribution to the motor decision field is centered around the target areas, but not biased to either location, and not very strong. Next, imagine that the usual A-not-B targets are replaced by one plain brown lid and one colorful and attractive toy. As in the illustration on the lower right of Figure 8, the decision field would then be biased even before the experimenter called attention to the A or B side because of the increased possibility that infants will go for the toy rather than the lid. A third possibility --two attractive toys side-by-side-- is shown in the illustration at the top left of Figure 8. Here, as in the first situation, the task environment does not offer a basis for choice of A or B, although both targets are attractive and the probability for a reach to either is high. Finally, a parameter σtask characterizes the spatial spread of the activation function, similar to that used in the field dynamics. For our simulations, this parameter is fixed.

Together, then, the specification of the task input for the A-not-B paradigm has the following Gaussian form:

(10)
 
 

We chose the Gaussian because the function conveniently expresses the forms of the three parameters of the task input (location, strength, and width).
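A Gaussian task input with the parameters named in the text (target locations, strength, width, and an asymmetry factor) might be sketched as follows. The function name and values are hypothetical, and one detail is a plain assumption on our part: the asymmetry factor is applied here to the B-side bump, which the surrounding text does not settle.

```python
import math

def task_input(x, x_a, x_b, s_task0, sigma_task, c_ab=1.0):
    """Sum of two Gaussian bumps centered on the target locations A and B.

    ASSUMPTION: the asymmetry factor c_ab scales the bump at B; with
    c_ab = 1 the two targets contribute equally.
    """
    bump_a = math.exp(-(x - x_a) ** 2 / (2.0 * sigma_task ** 2))
    bump_b = math.exp(-(x - x_b) ** 2 / (2.0 * sigma_task ** 2))
    return s_task0 * (bump_a + c_ab * bump_b)

# Hypothetical layout: targets 20 units either side of the midline.
x_a, x_b = -20.0, 20.0

# Identical targets: the task input is the same at both locations.
sym_a = task_input(x_a, x_a, x_b, s_task0=1.0, sigma_task=5.0)
sym_b = task_input(x_b, x_a, x_b, s_task0=1.0, sigma_task=5.0)

# Asymmetric layout: the task input now favors the A location.
asym_a = task_input(x_a, x_a, x_b, s_task0=1.0, sigma_task=5.0, c_ab=0.5)
asym_b = task_input(x_b, x_a, x_b, s_task0=1.0, sigma_task=5.0, c_ab=0.5)
```

Moving the targets closer together in this sketch makes the two bumps overlap, which corresponds to the less clearly separated configuration described for Figure 8.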

As discussed in an earlier section, because the targets are typically similar and relatively close, without other input, infants have no compelling reason to reach preferentially to A or B and their performance is at chance levels (Smith et al. 1999). To overcome this, infants must be trained to go consistently to the A side by making that location more salient. Experimenters do this by moving the object-to-be-hidden, or the lid, forward so that it is closer to the baby, providing a visually more distinctive target.(1) This adds asymmetry to the task input field by making cA/B not equal to one, and thus biases the first reach--and subsequent reaches-- to one side. Changing the distinctiveness of the lids, that is, making one lid a different shape or color, also biases the task field, so that once one side is cued, infants may have continual reminders of the differential targets. We simulate both the training and distinctive target effects.

Experiments have shown that other features of the task environment such as visual landmarks or background colors and surfaces can interact with the distinctiveness and placement of the targets to determine whether infants perseverate or not. As summed up by Butterworth et al. (1982) whatever the perceptual basis of this task, "it is extremely sensitive to variations in context" (p. 447). The task input field can be adjusted to simulate these variations.

5.2.2. Specific input. While looking at the task scene, the infant is cued to one target, A or B, and this cue also contributes in a graded way to the movement decision. Thus, the model has a second source of input, Sspec, the phasic visual cue of the experimenter waving, tapping, or otherwise calling attention to the target object. The specific input is similar to the task input in that it is characterized as a location in motor parameter space, with a particular strength and activation spread. It differs because it is a time-limited input to the field to simulate the transient visual cue which then must be held in memory. Thus, xspec represents in the equation the location where the attentional cue is delivered, in this case either at A or B. The strength of the input, Sspec,0, can be varied to capture the saliency of the cue. For instance, waving or hiding a bright, glittery or noisy object, a brightly colored toy, or a cookie will be more attention-grabbing than a plain colored cloth or lid, as is further illustrated in Figure 8. In addition, as in the task input, the specific input can have different values for its activation spread, σspec. The cue duration is entered into the model as a step function. Thus, before the cue is given, the specific input is zero. During the cue, the specific input is non-zero at a constant level and then it instantaneously returns to zero at the termination of the specified time. A cue of longer duration provides more input to that location in the field. The form of the specific input is:

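(11)   S_{\text{spec}}(x,t) = S_{\text{spec},0} \exp\left[ -\frac{(x - x_{\text{spec}})^2}{2\sigma_{\text{spec}}^2} \right]

during the time interval of length T when such input is present, and zero otherwise.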
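The time-limited character of the specific input can be sketched as a Gaussian in the movement parameter that is gated by a step function in time. The function name, the `t_on` onset parameter, and the numerical values are hypothetical and serve only to illustrate the description above.

```python
import math

def specific_input(x, t, x_spec, s_spec0, sigma_spec, t_on, duration):
    """Gaussian cue at x_spec, present only for t_on <= t < t_on + duration."""
    if t_on <= t < t_on + duration:
        return s_spec0 * math.exp(-(x - x_spec) ** 2 / (2.0 * sigma_spec ** 2))
    return 0.0

# Hypothetical cue at the A location (x = -20), switched on at t = 1 s for 2 s.
cue = dict(x_spec=-20.0, s_spec0=3.0, sigma_spec=5.0, t_on=1.0, duration=2.0)

before = specific_input(-20.0, 0.5, **cue)  # before the cue is given
during = specific_input(-20.0, 1.5, **cue)  # while the cue is present
after = specific_input(-20.0, 3.0, **cue)   # after the cue has ended
off_target = specific_input(20.0, 1.5, **cue)  # at the B location during the cue
```

Lengthening `duration` leaves the cue at its constant level for more integration steps, which is how a longer cue delivers more total input to that location in the field.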

5.2.3. Memory input. The third source of bias to the field adds its influence and is in the form of a memory field in movement parameters retaining the shape of previous decisions to go to A or B. The memory field gets its input from the motor planning field, and thus encodes the history of all previous reaches. Because the planning field itself has integrated the two visual inputs, that of the task and specific cue, the memory also indirectly reflects the whole perceptual aspects of the repeated task. As it builds, the memory input contributes not only to the probability of a reach to A or B after cueing, but also to the likely direction of a spontaneous reach. As reported in Smith et al. (1999b), once infants have made a decision to go to one target or another in the absence of a specific cue, they are likely to stick with their choices. The memory input becomes stronger after each repeated reach to one location such that it may swamp the task and the transient cue. This is the heart of the A-not-B error.

The memory field itself has dynamics: it evolves continuously in time. These dynamics occur on two time scales, one related to its growth and one to its decay. First, the memory field grows in parallel with the motor planning field in the few seconds between cue and reach. More specifically, the planning field enters the memory field whenever the planning field is activated above a certain threshold, u0. The memory field has its own time scale, specified by τmem:

Here τmem governs the growth process of the memory, and Θ(u(x,t) - u0) = 1 if u(x,t) > u0 and zero otherwise. The memory field reaches a maximum level after approximately 6 trials: τmem = 6.

This is where we conventionally test for the error with a cue to B.

The second time scale is that of the decay of the memory field. In theory, we assume that the memory field decays slowly in the absence of activity in the motor planning field in the time that elapses between one reach and the cue for the succeeding trial. For the purpose of these simulations, we have assumed that the time scale of this decay is much slower than the inter-trial interval, and therefore does not enter into the model as a parameter.

We believe this to be reasonable because perseveration, and hence, the persistence of the memory, is robust despite variability in the inter-trial intervals: delays between the end of one reach and the initiation of the succeeding reach ranged from 20 to 50 seconds. This means that within the time scale of the typical A-not-B task, this decay may not be critical, although we have not empirically tested these limits.

In summary, the contribution of the memory input takes the form

where Smem,0 is the strength of the memory input.

5.3 Output

The field represents a parametric movement plan, but is not linked to a model of the actual motor control of the arm (see Kopecz & Schöner, 1995, for integration with motor control of eye movements). Here we approximate such control by a simple read-out procedure: when the movement is elicited after the delay, we assume that the location in the field with the maximal activation describes the movement that is actually performed. This approximation is reasonable as long as no other manipulations or perturbations intervene at the control level.
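The read-out procedure amounts to selecting the site with maximal activation. A minimal sketch follows; the function name, the sampled directions, and the activation values are hypothetical.

```python
def read_out(u, x_values):
    """Return the movement parameter at the site with maximal activation."""
    if len(u) != len(x_values):
        raise ValueError("u and x_values must have the same length")
    peak = max(range(len(u)), key=lambda i: u[i])
    return x_values[peak]

# Hypothetical field sampled at five reach directions (negative = toward A).
directions = [-30, -15, 0, 15, 30]
activation = [-4.0, 2.5, -1.0, 0.5, -4.0]

chosen = read_out(activation, directions)
```

Here the largest activation sits at the second site, so the read-out selects the direction -15. In the case of ties, this sketch returns the first maximal site.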

6. Simulations

Before we discuss the simulations of the model and compare them to experiments, a few comments about the model parameters may be useful. As with any model, we have constructed a mathematical abstraction--one of many possible abstractions-- of complex and multi-determined behavioral events. Some aspects of the events, such as the timing of cues and trial length, can be assigned realistic parameter values. For others, such as the relative separation of the targets or the strength of the cues, parameter values are less directly mapped onto experimental factors. In principle, the basic effects are described by the model over a wide range of values of these parameters. However, the constellation of parameters is strongly constrained by the experimental results. The model is successful if two conditions are met in parameter assignment. First, the ensemble of parameters must produce the primary experimental effects when values are fixed such that all the orders of magnitude are reasonable in relation to one another. For instance, to produce the canonical error, the specific input must be sufficient to produce an A-side decision on the first reaches, but not so strong as to dominate the field interactions as the memory strength builds. Within such constraints, the precise parameter values are not critical: the qualitative effects are robust within a range of parameter values. But having determined these values on the primary result, the test of the model is to keep certain parameters fixed--the characteristics of the field, for instance--and to simulate different experimental results, as well as generate new testable predictions. Here, for example, we use the settings of the canonical effect and manipulate different values of the visual input. Together, these conditions show that the model is both internally consistent and externally valid. 
Note that the internal structure of the dynamic field model precludes simulation of any arbitrary input-output relationship through judicious choice of parameters. Strong theoretical assumptions about the mechanisms involved are incorporated into the equations, such as localized input with uniform width, homogeneous symmetrical interaction kernel, and superposition of inputs. Many connectionist models, in contrast, have been shown to be universal approximators under some conditions (Hornik, Stinchcombe, & White, 1989). The question addressed by the model, then, is how a decision to reach to A or B evolves under varying conditions, represented by the parameters of the model. To reach a decision, the dynamics of the inputs are thus coupled to the dynamics of the movement field. The model is integrated in time for 10 seconds, which realistically represents the time interval that begins when the infant looks at the display, continues as the experimenter waves the target and the delay is imposed, and ends with the decision to go to one target or another. Evolution in time results from the dynamic equation, the solutions of which provide values of activation at all sites in the field as a function of time. These solutions can be visualized as activation landscapes representing the relative strengths of various values of the movement parameter signifying direction.

In the simulations, we solved the dynamic equations on a digital computer using the Euler procedure, in which one time step represents 50 msec. The results do not depend on the time step, which we chose to be sufficiently small so that the numerically obtained solutions approximate the real solutions of the dynamical system. An individual simulation run of 10 s realistically models a single reaching act. At the end of each run, the activation in the movement field is reset to zero, readying the system for a new trial. By contrast, we continuously update the memory field to reflect the build up of a history of reaches. A sequence of such individual trials reflects the experimental paradigm (6 reaches with cues at A, two reaches with cues at B). Such entire trial sequences are the