Below is the unedited penultimate draft of:
MacNeilage, P.F. (19XX). The Frame/Content Theory of Evolution of Speech Production. Behavioral and Brain Sciences, XX (X): XXX-XXX.
The final published draft of the target article, commentaries and Author's Response are currently available only in paper.
For information about subscribing or purchasing offprints of the published version, with commentaries and author's response, write to: journals_subscriptions@cup.org (North America) or journals_marketing@cup.cam.ac.uk (All other countries).

THE FRAME/CONTENT THEORY OF EVOLUTION OF SPEECH PRODUCTION

Peter F MacNeilage
Department of Psychology
University of Texas at Austin
Austin TX 78712
USA
macneilage@mail.utexas.edu

Keywords

speech, language, evolution, communication, neuropsychology

Abstract

The species-specific organizational property of speech is a continual mouth open-close alternation, the two phases of which are subject to continual articulatory modulation. The cycle constitutes the syllable and the open and closed phases are segments - vowels and consonants respectively. The fact that segmental serial ordering errors in normal adults obey syllable structure constraints suggests that syllabic "Frames" and segmental "Content" elements are separately controlled in the speech production process. The frames may derive from cycles of mandibular oscillation, present in humans from babbling onset, which are responsible for the open-close alternation. These communication-related frames perhaps first evolved when the ingestion-related cyclicities of mandibular oscillation (associated with mastication (chewing), sucking and licking) took on communicative significance as lipsmacks, tonguesmacks and teeth chatters - displays which are prominent in many nonhuman primates. The new role of Broca's area and its surround in human vocal communication may have derived from its evolutionary history as the main cortical center for the control of ingestive processes. The frame and content components of speech may have subsequently evolved separate realizations within two general-purpose primate motor control systems: (1) A motivation-related medial "intrinsic" system, including anterior cingulate cortex and the supplementary motor area, for self-generated behavior, formerly responsible for ancestral vocalization control and now also responsible for frames, and (2) a lateral "extrinsic" system, including Broca's area and surround, and Wernicke's area, specialized for response to external input (and therefore the emergent vocal learning capacity) and more responsible for Content.


1. INTRODUCTION

This target article is concerned with the evolution of speech production as "action". The question is, how did we evolve the capacity to do what we do with the speech production apparatus when we speak? There will be little concern with the evolution of the conceptual structure that underlies speech actions. Instead, the focus will be on a capability typically taken for granted in current linguistic theory and cognitive science: How is our remarkable capacity for making the serially organized complexes of movements which constitute speech to be explained?

The basic thesis is quite simple. Speech differs from vocal communication of other mammals in that we alone superimpose a continual rhythmic alternation between an open and closed mouth (a "frame") on the sound production process. The likelihood that this cyclicity, associated with the syllable, evolved from ingestive cyclicities (e.g. chewing) is indicated by the fact that much of the new development of the brain for speech purposes occurred in and around Broca's area, in a frontal perisylvian region basic to the control of ingestive movements in mammals. An evolutionary route from ingestive cyclicities to speech is suggested by the existence of a putative intermediate form present in many other higher primates, namely, visuofacial communicative cyclicities such as lipsmacks, tonguesmacks and teeth chatters. The modification of the frontal perisylvian region leading to syllable production presumably made its other ingestion-related capabilities available for use in modulation of the basic cycle in the form of different consonants and vowels ("content"). More generally, it is suggested that the control of speech production evolved by descent with modification within two general-purpose primate cortical motor control systems, a medial system, associated with vocalization control in all primates, and a lateral system, including Broca's area which has the necessary emergent vocal learning capacity.

In Darwin's words, evolution is a matter of "descent with modification" (Darwin, 1859, p 420). We must therefore accept the constraint noted by Huxley: "The doctrine of continuity is too well established for it to be permissible to me to suppose that any complex natural phenomenon comes into existence suddenly, and without being preceded by simpler modifications" (Huxley, 1917, p 236). Consequently the most successful theory of evolution of speech, as the action component of language, will be the one that best characterizes this descent with modification, with an accurate and dispassionate assessment of prior states and the end state, and of the nature of the difference between them. The best characterization will not be the one that humans often find congenial - one that exults in the glories of the end state and trivializes the precursors. As Darwin (1871) said, "man bears the indelible stamp of his lowly origins" (p 597).

This characterization immediately rules out any explanation of the ultimate causes of language in terms of the Chomskyan concept of "Universal Grammar" (Chomsky, 1986). This concept is in the tradition of Platonic essentialism (see Mayr, 1982 pp 37-38 on essentialism in biology and Lakoff, 1987 for a characterization of the essentialistic assumptions underlying generative grammar) according to which form has a priori status. In response to the currently accepted view, derived from evolutionary theory, that language has not always been present, Chomsky has departed from both Platonism and orthodox evolutionary theory in implying an instantaneous onset for language form, resulting from "a mutation" (Chomsky, 1988, p 170). However, despite this accommodation to the fact of evolution, there is apparently no room for a role of modification in the Chomskyan scenario.

The following assumptions will be made in the attempt to characterize the state prior to language evolution in this target article: (1) As the vocal characteristics of call systems of all living nonhuman primates are basically similar despite considerable differences in the closeness of the relations of the various taxa to forms ancestral to humans, it will be assumed that the call systems of forms ancestral to humans were similar to presently observable ones. (2) Most work on brain organization underlying vocal communication in nonhuman primates has been done on two taxa: rhesus monkeys, which are old world monkeys, and squirrel monkeys, which are new world monkeys. These taxa probably had a common ancestry which was also common to humans, about 40 million years ago. The brain organization underlying call production in these two living taxa seems to be relatively similar (Jurgens, 1979). It will be assumed that this similarity owes a good deal more to properties of ancestral brain organization than to convergent evolution of organization radically differing from ancestral organization. It is therefore also assumed that the brain organization underlying call production in these two taxa is basically similar to that of forms ancestral to humans. It is concluded that in underlying brain organization as well as in vocal production, the problem of accounting for the evolution of human speech production can be considered, for practical purposes, to be the problem of accounting for the change from characteristics displayed by other living primates to characteristics of humans.

2. EVOLUTION OF PRIMATE VOCAL PRODUCTION: NATURE OF THE HUMAN-NONHUMAN DIFFERENCE

2.1 Vocal production systems of other mammals.

The three main components of the vocal production system of mammals - the respiratory, phonatory and articulatory components - are shown schematically in Figure 1a. They are shown in the typical horizontal plane characteristic of quadrupeds. With the advent of bipedalism in hominids, the respiratory and phonatory components take on a vertical orientation. In addition, as shown in Figure 1b, in advanced hominids the posterior part of the articulatory system takes on a vertical configuration, but the anterior part does not, resulting in a 2-tube vocal tract (perhaps in the last few hundred thousand years according to Lieberman, 1984).

The main role of the respiratory component in sound production is to produce an outward flow of air under pressure (Hixon, 1973). Phonation (or voicing) is produced when the vocal folds are brought together in such a way that they vibrate when activated by the outward air flow (Negus, 1949). The articulatory component - basically the mouth - is usually opened at least once for a vocal episode, and the shape of the cavity between lips and larynx - the vocal tract - modulates the voice source in the form of resonances (Fant, 1960). The value of the evolution of the two-tubed vocal tract (Lieberman, 1984) in hominids was that it considerably increased the acoustic potential for making different sounds (Carre, Lindblom & MacNeilage, 1995). The question being raised here, however, is: How did humans evolve the organizational capacity to make use of this potential by producing rapid and highly variegated sound sequences in syllabic packages?

Except for humans, mammals typically have a very small repertoire of different calls, with some seeming to involve a graded continuum. For example, in a recent study of gelada baboon vocalizations (Aich, Moos-Heilen and Zimmermann, 1990) "at least 22 acoustically different vocal patterns" were distinguished. Their distinctively holistic character, lacking independently variable internal subcomponents, is indicated by the fact that they are often given names with single auditory connotations. Names given to gelada baboon calls by Dunbar and Dunbar (1975) include "moan", "grunt", "vocalized yawn", "vibrato moan", "yelp", "hnn pant", "staccato cough", "snarl", "scream", "aspirated pant" and "how bark". Some calls of other primates occur only alone, some alone and in series, and some only in series. Although such combination occurs "often" (Marler, 1977, p 24), different acoustic units are not typically combined into series in other primates, and when they are, different arrangements of internal subcomponents do not seem to have separate meanings in themselves (e.g. Robinson, 1979).

2.2 The nature of speech.

The main difference between speech and other mammalian call systems involves the articulatory component. In all mammals, the operation of the respiratory and phonatory components can be most generally described in terms of modulated biphasic cyclicities. In respiration, the basic cycle is the inspiration-expiration alternation and the expiratory phase is modulated to produce vocalizations. In the phonatory system, the basic cycle is the alternation of the vocal folds between an open and closed position during phonation ("Voicing" in humans) (Broad, 1973). This cycle is modulated in its frequency, presumably in all mammals, by changes in vocal fold tension and subglottal pressure level, producing variations in perceived pitch.

The articulatory system in nonhuman mammals is typically only used in an open configuration during call production, although some calls in some animals (e.g. "girneys" in Japanese macaques - see Green, 1975) seem to involve a rhythmic series of open-close alternations. However in human speech in general, the fact that the vocal tract alternates more or less regularly between a relatively open and a relatively closed configuration, open for vowels and closed for consonants, is basic enough to be a definitional characteristic (MacNeilage, 1991a). With the exception of a few words consisting of a single vowel, virtually every utterance of every speaker of every one of the world's languages involves an alternation between open and closed configurations of the vocal tract. As noted earlier, the syllable, a universal unit in speech, is defined in terms of a nucleus with a relatively open vocal tract, and margins with a relatively closed vocal tract. Modulation of this open-close cycle in humans takes the form of typically producing different basic units, consonants and vowels, collectively termed phonemes, in successive closing and opening phases. Thus, speech is distinguished from other mammalian vocal communication, in movement terms, by the fact that a third, articulatory, level of modulated cyclicity continuously coexists with the two levels present in other mammals.

Figure 2 is a schematic view of the structure of the English word "tomato". It can be described as consisting of two levels, suprasegmental and segmental. The segmental level, consisting of consonants and vowels, can be further divided into a number of subattributes or features. (In more behaviorally oriented treatments, subattributes of phonemes are described in terms of gestures (e.g. Browman and Goldstein, 1986).) For example, for the sound [t], a featural description would be applied to its voicing properties, the place in the vocal tract at which occlusion occurred and the fact that it involves a complete occlusion of the vocal tract. At the suprasegmental level, the term "stress" refers roughly to the amount of energy involved in producing a syllable, which is correlated with its perceptual prominence. In English at least, more stressed syllables tend to be louder, and have higher fundamental frequencies and longer durations. Intonation refers to the global pattern of fundamental frequency (rate of vocal fold vibration). In multisyllabic words spoken in isolation, and in simple declarative sentences such as "The boy hit the ball" there is a terminal fall in fundamental frequency. The syllable lies at the interface between the suprasegmental and the segmental levels. At the suprasegmental level it is the unit in terms of which stress is distributed, a unit of rhythmic organization, and a point of inflexion for intonation contours. At the segmental level it provides an organizational superstructure for the distribution of consonants and vowels. (For further detail see Levelt, 1989, Chapter 8.)

3. HOW IS THE NEW HUMAN CAPABILITY ORGANIZED? IN A FRAME/CONTENT MODE.

3.1 Serial ordering errors in speech.

How do we discover the organizational principles underlying syllabic frames and their modulation by internal content? Normal speakers sometimes make errors in the serial organization of their utterances. It was Lashley's (1951) insight to realize that serial ordering errors provide important information about both the functional units of action and their serial organization. At the level of sounds (rather than words) the most frequent unit to be misplaced is the single segment (consonant or vowel). For example, in a corpus collected by Shattuck-Hufnagel (1980) about two thirds of the errors involve single segments. The other errors involve for the most part sub-syllabic groupings of segments.

There is some agreement on the existence of 5 types of segmental speech error, often called "exchange" (spoonerisms), "substitution", "shift", "addition" and "omission". In previous discussions of the implications of speech errors, the author and colleagues have focussed primarily on exchange errors (MacNeilage, 1973; MacNeilage, Studdert-Kennedy and Lindblom, 1984; 1985; MacNeilage, 1985; 1987a,b) as they are the only relatively frequently occurring type in which the source of the unit can be unequivocally established. However, much evidence from other error types is consistent with that from exchange errors.

The central fact about exchange errors is that in virtually all segmental exchanges, the units move into a position in syllable structure like the one they vacated; syllable-initial consonants exchange with other syllable-initial consonants, vowels exchange with vowels, and syllable-final consonants with other syllable-final consonants. For example, Shattuck-Hufnagel (1979) reported that of a total of 211 segmental exchanges between words "all but 4 take place between phonemes in similar positions in their respective syllables" (p 307).

Examples from Fromkin (1973) are:

Initial Consonants: well made - mell wade

Vowels: ad hoc - odd hack

Final Consonants: top shelf - toff shelp

This result, which is widely attested in studies of both spontaneous and elicited errors (Levelt, 1989), demonstrates that there is a severe syllable position constraint on the serial organization of the sound level of language. Most notably, the position-in-syllable constraint seems virtually absolute in preserving a lack of interaction between consonants and vowels. There are numbers of consonant-vowel and vowel-consonant syllables in English that are mirror images of each other; e.g. "eat" vs "tea"; "no" vs "own"; "abstract" vs "bastract". Either form therefore naturally occurs as a sequence of the two opposing vocal tract phases, but exchange errors which would turn one such form into the other are not attested.

3.2 Metaphors for speech organization: Slot/Segment and Frame/Content (F/C).

For Shattuck-Hufnagel (1979), these error patterns imply the existence of a scan-copy mechanism that scans the lexical items of the intended utterance for representation of segments, and then copies these representations into slots in a series of canonical syllable structure matrices. The fundamental conception underlying this "Slot/Segment" hypothesis is that "slots in an utterance are represented in some way during the production process independent of their segmental contents" (Shattuck-Hufnagel, 1979 p 303). It is this conception that also underlies the "Frame/Content" metaphor (henceforth F/C) used by the author and colleagues (MacNeilage et al 1984; 1985; MacNeilage, 1985; 1987a,b), and by Levelt (1989). The only difference lies in the choice of terms for the two components. In the present terms, syllable-structure Frames are represented in some way during the production process independent of segmental Content elements.
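
The separation of slots from their segmental contents can be illustrated with a small computational sketch. This is only a toy under stated assumptions, not a model proposed in the literature cited here: letters stand in for phonemes, each syllable has a single vowel, and the names Slot, build_frames, exchange and spell are hypothetical. It shows the one property at issue: an exchange is possible only between slots occupying the same position in their respective syllable frames.

```python
from dataclasses import dataclass

VOWELS = set("aeiou")  # toy assumption: letters stand in for phonemes


@dataclass
class Slot:
    syllable: int   # index of the syllable frame the slot belongs to
    position: str   # "onset", "nucleus" or "coda"
    segment: str    # content element copied into the slot


def build_frames(syllables):
    """Scan each syllable and copy its segments into positional slots."""
    slots = []
    for index, syllable in enumerate(syllables):
        seen_vowel = False
        for segment in syllable:
            if segment in VOWELS:
                position = "nucleus"
                seen_vowel = True
            else:
                position = "coda" if seen_vowel else "onset"
            slots.append(Slot(index, position, segment))
    return slots


def exchange(slots, a, b):
    """Swap the content of two slots, but only if their frame positions match."""
    if slots[a].position != slots[b].position:
        raise ValueError("exchange violates the syllable-position constraint")
    slots[a].segment, slots[b].segment = slots[b].segment, slots[a].segment


def spell(slots):
    """Read the segments back out, syllable by syllable."""
    out = {}
    for slot in slots:
        out[slot.syllable] = out.get(slot.syllable, "") + slot.segment
    return [out[i] for i in sorted(out)]


# Toy rendering of "well made" as the syllables "wel" and "mad".
slots = build_frames(["wel", "mad"])
exchange(slots, 0, 3)   # onset with onset: permitted
print(spell(slots))     # → ['mel', 'wad']
```

In this sketch an attempt to exchange an onset with a nucleus, such as exchange(slots, 0, 1), raises an error. That corresponds to the near-absence of consonant-vowel exchanges in the error corpora, although the sketch merely stipulates the constraint rather than explaining it.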

The speech errors which reveal the F/C mode of organization of speech production presumably occur at the stage of interfacing the lexicon with the motor system. The motor system is required to both produce the overall rhythmic organization associated with syllables, basically by means of an open-close alternation of the vocal tract, and to continually modulate these cycles by producing particular consonants and vowels during closing and opening phases. Rather than there being holistic chunking of output into an indissoluble motor package for each syllable, there may have developed, in the production system, some natural division of labor whereby the basic syllabic cycle and the phasic modulations of the cycle are separately controlled. Thus, perhaps, when frame modulation, by means of varying consonants and vowels, evolved as a favored means of increasing the message set, the increasing load on this aspect of production led to the development of a separate mechanism for its motor control.

According to the above conception, which will be amplified in subsequent discussion, fundamental phylogenetic properties of the motor system have played the primary role in determining the F/C structure of speech. It is assumed that as this occurred, the consequences of the two-part division of labor then ramified into the organization of the prior stage of lexical storage. There is good evidence that there is in fact independent lexical representation of segmental information and information about syllable structure in the mental lexicon. This evidence comes from a set of studies on the "Tip of the Tongue" (TOT) phenomenon occurring when people find themselves able to retrieve some information about the word they wish to produce but cannot produce the whole word. Levelt (1989) concludes that "... lexical form information is not all-or-none. A word's representation in memory consists of components that are relatively accessible and there can be metrical information about the number and accents of syllables without these syllables being available" (p 321).

The conception of the syllable as the receptacle for segments during motor organization is supported by another body of evidence. Garrett (1988) has pointed out that there is little evidence that syllables themselves are moved about in serial ordering errors "except where the latter are ambiguous as to their classification (i.e. they coincide with morphemes, or the segmental makeup of the error unit is ambiguous)" (p 82). Thus "syllables appear to constrain error rather than indulge in it...". (For a similar conclusion, see Levelt, 1989, p 322.)

3.3 Lack of evidence for subsegmental units.

It is of interest to note that in emphasizing this dual-component (syllable and segment) conception of speech production, no role is accorded to the most nested subcomponent in the linguistic description of syllable structure, the distinctive feature, or its functional counterpart, the gesture, the units most favored in current phonological and phonetic conceptions of the organization of speech. This contrarian stance is taken primarily on the grounds of the paucity of evidence from speech errors that the feature/gesture is an independent variable in the control of speech production. The fact that members of most pairs of segments involved in errors are similar, differing only by one feature, has sometimes been taken to mean that the feature is a functional unit in the control process. However the proposition that phonetic similarity is a variable in potentiating errors of serial organization can be made without dependence on an analysis in terms of features. When two exchanged segments differ by one feature, it cannot be determined whether features or whole segments have been exchanged. But as Shattuck-Hufnagel and Klatt (1979) have pointed out, when the two segments participating in an exchange error differ by more than one feature, a parsimonious interpretation of the view that features are functional units would suggest that the usual number of features exchanged would be one. However, in an analysis of 72 exchange errors in which the members of the pairs of participating segments differed by more than one feature there were only three cases where only a single feature was involved in the exchange. This of course is not conclusive evidence against the independence of features/gestures as units in the control process, but it does serve to encourage a conception of production in which their independence is not required.

3.4 Speech and typing.

A perspective on this dual-component view of speech production organization can be gained by comparing it with another language output behavior - typing. There is evidence to suggest that there is a considerable commonality between spoken language and typing - even copy typing - in early stages of the process of phonological output, stages in which there is a role of the lexicon. For example, Grudin (1981) found that on 11 of 15 occasions, copy typists spontaneously corrected the spelling of a misspelled word with which they were inadvertently presented. However typing does not possess an F/C mode of organization. Any typist knows that in contrast with spoken language, exchange errors occur not between units with comparable positions in an independently specified superordinate frame structure, but simply between adjacent letters (MacNeilage, 1964). And this is true whether the units are in the same syllable or in different syllables. In addition, unlike in speech, there is no constraint against exchanging actions symbolizing consonants and actions symbolizing vowels. Vowel and consonant letters exchange with each other about as often as would be predicted from the relative frequency with which vowel letters and consonant letters appear in written language (MacNeilage, 1985). Nespoulous et al (1985) have reported a similar freedom from phonotactic constraints of the language in agraphics.

In concluding this section on adult speech organization, it should be emphasized that the present focus on the frame-content dichotomy is not simply a case of deification of some marginal phenomenon. As Levelt puts it: "Probably the most fundamental insight from modern speech error research is that a word's skeleton or frame and its segmental content are independently generated" (1992, p 10). Speech error data have in turn been the most important source of information in the psycholinguistic study of language production.

4. HOW DID THE FRAME/CONTENT MODE EVOLVE?

4.1 Evolution as tinkering.

Francois Jacob's metaphor of evolution as tinkering has gained wide acceptance (Jacob, 1977). Evolution does not build new structures from scratch like an engineer. Instead it takes whatever is available, and, where called for by natural selection, molds it to new use. This is presumably equally true for structures and behaviors. There are of course plenty of examples of this in the evolution of vocalization. No structure in the speech production system initially evolved for vocalization. Our task is to determine what modifications of existing capacities led to speech. Specifically the question is: how was the new articulatory level of modulated cyclicity tinkered into use?

4.2 Cyclicities and tinkering.

An obvious answer suggests itself. The oral system has an extremely long history of ingestive cyclicities involving mandibular oscillation, probably extending back to the evolution of the first mammals, circa 200 million years ago. Chewing, licking and sucking are extremely widespread mammalian activities, which, in terms of casual observation, have obvious similarities with speech, in that they involve successive cycles of mandibular oscillation. If ingestion-related mandibular oscillation was modified for speech purposes, the articulatory level would be like the other two levels in making use of pre-existing cyclicities. The respiratory cycle originally evolved for gas exchange, and the larynx initially evolved as a valve protecting the lungs from invasion by fluids. Presumably vocal fold cyclicities were initially adventitious results of release of air through the valve under pressure, a phenomenon similar to that sometimes observed in the anal passage, but one which presumably had more potential for control.

It is well known that biphasic cycles are the main way in which the animal kingdom does work which is extended in the time domain. Long ago, Lashley (1951) attempted, more or less unsuccessfully, to bring to our attention the importance of rhythm generators as a basis for serially organized behaviors, even behaviors as complex as speech. Examples of such biphasic cycles are legion: locomotion of many different kinds in aquatic, terrestrial and aerial media, the heartbeat, respiration, scratching, digging, copulating, vomiting, milking cows, pedal alarm 'calling' in rabbits, cyclical ingestive processes etc. The conservative connotation of the tinkering metaphor is applicable to the fact that biphasic cyclicities, once invented, do not appear to be abandoned but are often modified for uses somewhat different than the original one. For example, Cohen (1988) makes the astonishing claim that an evolutionary continuity in a biphasic vertebrate locomotory cycle of flexion and extension can be traced over a period of a half a billion years: "There is ... a clear phylogenetic pathway from lampreys to mammalian quadrupeds for the locomotor central pattern generator (CPG)" (p 160). She points out that "With the evolution of more sophisticated and versatile vertebrates, more levels of control have been added to an increasingly more sensitive and labile CPG coordinating system." She concludes, however, that "In this view the basic locomotor CPG need change very little to accommodate the increasing demands natural selection placed on it" (p 161).

4.3 Ingestive Cyclicities.

Ingestive oral cyclicities are similar to locomotion in having a CPG in the brain stem with similar characteristics across a wide range of mammals. In fact the similarity between the locomotor and ingestive CPGs is sufficiently great that Rossignol, Lund and Drew (1988) were motivated to suggest a single neural network model for the two CPGs and the CPG for respiration. Lund and Enomoto (1988) characterize mastication as "one of the types of rhythmical movements that are made by coordinated action of masticatory, facial, lingual, neck and supra- and infra-hyoid muscles" (p 49). In fact, this description is apt for speech. The question is whether speech would develop an entirely new rhythm generator, with its own totally new superordinate control structures which could respond to coordinative demands similar to those made on the older system, if evolution is correctly characterized as a tinkering operation, making conservative use of existing CPGs. The answer to this question must be no! If so, then it is not unreasonable to conclude that speech makes use of the same brain stem pattern generator as ingestive cyclicities do, and that its control structures for speech purposes are, in part at least, shared with those for ingestion.

In coming to this conclusion one needs to resist a tendency to regard mastication as too simple to be a candidate for tinkering into speech. As Luschei and Goldberg (1981) point out, mastication is "a rhythmic activity that seems to proceed successfully in a highly "automatic" fashion, even in the face of wide variation in the loads presented by eating different food materials" (p 1237). However they warn us that "Movements of mastication are actually quite complex and they must bring the teeth to bear on the food material in a precise way" (p 1238). In addition, they note that "... the mandible is often used in a controlled manner for a variety of tasks. For the quadrupeds, in particular, the mandible constitutes an important system for manipulation of objects in the environment" (p 1238). The inaccessibility of the masticatory system to direct observation presumably contributes to a tendency to underestimate its prowess. The reader may have shared the author's surprise, on biting his tongue, that it does not occur more often.

Perhaps part of the reason that so little attention has been given to the possibility that ingestive cyclicities were precursors to speech is that speech is a quite different function than ingestion. However, functional changes which occur when locomotor cyclicities of the limbs are modified for scratching and digging do not prompt a denial of the relation of these functions to locomotion. In the author's opinion, it is the anthropocentric view of speech as having exalted status that is the main reason for the neglect of the possibility that actions basic to it may have had ingestive precursors.

4.4 Visuofacial Communicative Cyclicities.

If the articulatory cyclicity of speech indeed evolved from ingestive cyclicities how would this have occurred? An important fact in this regard is that mandibular cyclicities, though not common in nonhuman vocalization systems, are extremely common as faciovisual communicative gestures. "Lipsmacks", "tonguesmacks" and "teeth chatters" can be distinguished. Redican (1975) describes the most common of these, the lipsmack, as follows: "the lower jaw moves up and down but the teeth do not meet. At the same time the lips open and close slightly and the tongue is brought forward and back between the teeth so that the movements are usually quite audible... . The tongue movements are often difficult to see, as the tongue rarely protrudes far beyond the lips" (p 138). Perhaps these communicative events evolved from ingestive cyclicities.

It is surprising that more attention has not been drawn to the similarity between the movement dynamics of the lipsmack and the dynamics of the syllable (MacNeilage, 1986). The up and down movements of the mandible are typically reduplicated in a rhythmic fashion in the lipsmack, as they are in syllables. In addition to its similarity to syllable production in motor terms, there are a number of other reasons to believe that the lipsmack could be a precursor to speech. First, it is analogous to speech in its ubiquity of occurrence. Redican (1975) believes that it may occur in a wider variety of social circumstances than any of the other facial expressions that he reviewed. A second similarity between the lipsmack and speech is that it typically occurs in the context of positive social interactions. A third similarity is that, unlike many vocal calls of the other primates, the lipsmack is an accompaniment of one-on-one social interactions involving eye contact, and sometimes what appears to be turn-taking. This is the most likely context for the origin of true language.

Finally, in some circumstances the lipsmack is accompanied by phonation. Andrew (1976) identifies a class of "humanoid grunts" involving low frequency phonation in baboons, sometimes combined with lipsmacking. In the case he studied most intensively, mandibular lowering was accompanied by tongue protrusion and mandibular elevation by tongue retraction. Green (1975) describes a category of "atonal girneys" in which phonation is modulated "by rapid tongue flickings and lipsmacks." Green particularly emphasizes the labile morphology of these events, stating that "a slightly new vocal tract configuration may be assumed after each articulation" (p 45). Both Andrew and Green suggest that these vocal events could be precursors to speech.

How, exactly, might ingestive cyclicities get into the communicative repertoire? Lipsmacks occurring during grooming have often been linked with the oral actions of ingestion of various materials discovered during the grooming process, as they often precede the ingestion of such materials. In young infants they have been characterized as consisting of, or deriving from, nonnutritive sucking movements. It does not seem too far-fetched to suggest that gestures anticipatory to ingestion may have become incorporated in communicative repertoires.

5. PHYLOGENY AND ONTOGENY: DEVELOPMENT OF THE FRAME/CONTENT MODE.

5.1 Manual ontogeny recapitulates phylogeny.

The claim, originating with Haeckel, that ontogeny recapitulates phylogeny, has been discredited in a number of domains of inquiry (Gould, 1977; Medicus, 1992). However, in the realm of human motor function there is some evidence in favor of it. Paleontological evidence, plus the existence of living forms homologous with ancestral forms, allows a relatively straightforward reconstruction of the general outlines of evolutionary history of the hand (Napier 1962). Mammals ancestral to primates are considered to have the property of convergence-divergence of the claws or paws of the forelimbs but not to have prehensility (the capability of enclosing an object within the limb extremity). This is considered to have first developed with the hand itself in ancestral primates (Prosimians) about 60 million years ago. Precise control of individual fingers, including opposability of the thumb, which allows a precision grip, only became widespread in higher primates, whose ancestral forms evolved about 40 million years ago (MacNeilage, 1989). In human infants, while convergence-divergence is present from birth, spontaneous manual prehension does not develop until about 3-4 months (Hofsten, 1984), and "it is not until 9 months of age that infants start to be able to control relatively independent finger movements" (Hofsten, 1986).

5.2 Speech ontogeny: Frames, then content.

A similar relationship exists between the putative phylogeny of speech and its ontogeny. Infants are born with the ability to phonate, which involves the cooperation between the respiratory and phonatory systems characteristic of all mammals. Meier et al (1997) have recently found that infants may produce "jaw wags", rhythmic multicycle episodes of mouth open-close alternation without phonation - a phenomenon similar to lipsmacks - as early as 5 months of age. Then, at about 7 months of age, infants begin to babble, producing rhythmic mouth open-close alternations accompanied by phonation.

Work with Davis and other colleagues has shown convincingly that the main source of variance in the articulatory component of babbling (7-12 months) and subsequent early speech (12-18 months) is mandibular oscillation. The ability of the other articulators - lips, tongue, soft palate - to actively vary their position from segment to segment, and even from syllable to syllable, is extremely limited. We have termed this phenomenon "frame dominance" (Davis and MacNeilage, 1995).

We have hypothesized that frame dominance is indicated by 5 aspects of babbling and early speech patterns. Three of these hypotheses involve relations between consonants and vowels in consonant-vowel (CV) syllables, the most favored syllable type in babbling and early speech, and the other two involve relations between syllables. The first two hypotheses concerned the possible lack of independence of the tongue within consonant-vowel (CV) syllables: 1. Consonants made with a constriction in the front of the mouth (e.g. "d", "n") will be preferentially associated with front vowels. 2. Consonants made with a constriction in the back of the mouth (e.g. "g") will be preferentially associated with back vowels. 3. A third hypothesis was that consonants made with the lips (e.g. "b", "m") will be associated with central vowels; that is, vowels that are neither front nor back. It was suggested that as no direct mechanical linkage could be responsible for lip closure co-occurring with central tongue position, these syllables may be produced simply by mandibular oscillation, with both lips and tongue in resting positions. These CV syllable types were called "pure frames".
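One common way to quantify co-occurrence predictions of this kind is to compare the observed count of each consonant-vowel pairing with the count expected if consonants and vowels combined independently. The following is a minimal sketch of that arithmetic only; the category labels, the toy data and the function name are illustrative assumptions, and the procedure is not claimed to be the one used in the studies cited here. A ratio above 1 for the front-front, back-back and lip-central cells would be in the direction of hypotheses 1 to 3.

```python
from collections import Counter

# Hypothetical toy data (not from any study): each CV syllable is coded as
# (consonant category, vowel category).
syllables = [
    ("front", "front"), ("front", "front"), ("front", "central"),
    ("back", "back"), ("back", "back"), ("back", "front"),
    ("lip", "central"), ("lip", "central"), ("lip", "back"),
]

def observed_expected_ratios(pairs):
    """Return {(consonant, vowel): observed / expected} for every cell.

    The expected count assumes consonant and vowel categories combine
    independently: row total * column total / grand total.
    """
    n = len(pairs)
    cell = Counter(pairs)
    cons = Counter(c for c, _ in pairs)
    vow = Counter(v for _, v in pairs)
    ratios = {}
    for c in cons:
        for v in vow:
            expected = cons[c] * vow[v] / n
            ratios[(c, v)] = cell[(c, v)] / expected
    return ratios

ratios = observed_expected_ratios(syllables)
for key in [("front", "front"), ("back", "back"), ("lip", "central")]:
    print(key, round(ratios[key], 2))
```

In real data the cells are rarely balanced, so the expected counts differ from cell to cell and a significance test on the full table would normally accompany the ratios.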

The lack of independent control of articulators other than the mandible during the basic oscillatory sequence of babbling is further illustrated by the fact that about 50% of the time, a given syllable will be followed by the same syllable (Davis and MacNeilage, 1995). This phenomenon has been called "reduplicated babbling", and apparently involves an unchanging configuration of the tongue, lips, and soft palate from syllable to syllable. It was further hypothesized that even when successive syllables differed (a phenomenon called "variegated babbling"), the difference might most often be related to frame control, reflected in changes in the elevation of the mandible between syllables. In general it was proposed that changes in the vertical dimension, which could be related to amount of elevation of the mandible, would be more frequent than changes in the horizontal dimension. Changes in the horizontal dimension would be between a lip and tongue articulation for consonants, or changes in the front-back dimension of tongue position for consonants or for vowels. The resultant hypotheses were: 4. There will be relatively more intersyllabic changes in manner of articulation (specifically, amount of vocal tract constriction) than in place of constriction for consonants. 5. There will be relatively more intersyllabic changes in tongue height than in the front-back dimension for vowels.
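The intersyllabic predictions can be made concrete by tallying, for each pair of successive syllables, whether the syllable is repeated and, if not, which dimensions changed. The sketch below is only an illustration of that bookkeeping; the coding scheme, the labels and the toy utterance are assumptions introduced here, not the transcription system of the cited work. Hypotheses 4 and 5 predict that, over a large sample, the "manner" count exceeds the "place" count and the "height" count exceeds the "front_back" count.

```python
# Hypothetical coding of one babbled utterance: each syllable is
# (consonant (place, manner), vowel (height, front_back)).
utterance = [
    (("lip", "stop"), ("low", "central")),
    (("lip", "stop"), ("low", "central")),            # same syllable repeated
    (("lip", "glide"), ("mid", "central")),           # manner and height change
    (("tongue_front", "glide"), ("mid", "central")),  # place change
]

def count_intersyllabic_changes(syllables):
    """Tally reduplications and per-dimension changes between neighbours."""
    counts = {"reduplicated": 0, "manner": 0, "place": 0,
              "height": 0, "front_back": 0}
    for prev, cur in zip(syllables, syllables[1:]):
        if prev == cur:
            counts["reduplicated"] += 1
            continue
        (p_place, p_manner), (p_height, p_fb) = prev
        (c_place, c_manner), (c_height, c_fb) = cur
        counts["manner"] += int(p_manner != c_manner)
        counts["place"] += int(p_place != c_place)
        counts["height"] += int(p_height != c_height)
        counts["front_back"] += int(p_fb != c_fb)
    return counts

counts = count_intersyllabic_changes(utterance)
print(counts)
```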

To date, in three papers (Davis and MacNeilage, 1995; MacNeilage and Davis, 1996; Zlatic, MacNeilage, Matyear and Davis, 1997) we have reported a total of 99 tests of these 5 hypotheses regarding the predominant role of frames in prespeech babbling, early speech, and babbling concurrent with early speech. Fourteen infants were studied. Of these tests, 91 showed positive results, typically at statistically significant levels, 6 showed counter-trends, and 2 showed an absence of trend.
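As a rough check on how lopsided this tally is, one can ask how often 91 or more of the 97 tests that showed any trend would come out positive if each test were an independent coin flip. This is an illustrative sign-test calculation added here, not an analysis from the cited papers, and the independence assumption is certainly too strong, since many tests come from the same infants. The variable names are hypothetical.

```python
from math import comb

# Counts taken from the text above.
positive, counter_trend, no_trend = 91, 6, 2
total = positive + counter_trend + no_trend   # all reported tests
n = positive + counter_trend                  # tests showing some trend

# One-sided sign test: probability of at least `positive` successes in n
# fair, independent trials (independence is an assumption of this sketch).
tail = sum(comb(n, k) for k in range(positive, n + 1))
p_one_sided = tail / 2 ** n

print(total, n, p_one_sided)
```

Even allowing generously for non-independence, the imbalance between positive results and counter-trends is far larger than chance alone would be expected to produce.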

Is it a mere coincidence that the frame dominance pattern that we have found in both babbling and the earliest words is similar to the pattern postulated here for the earliest speech of hominids, or is this pattern showing us the most basic properties of hominid speech production? If the earliest speech patterns were not like this, what were they like and why? And why has this question not received attention?

Another way of looking at this matter is to argue that modern hominids have evolved higher levels of both manual and vocal skill than their ancestors, but that this skill only becomes manifest later in development. The question of skill development in speech production requires some background. Most work on the sound preferences in babbling and early words has been done on consonants. Labial, alveolar and velar stops (e.g. "b", "d", "g" respectively) and labial and alveolar nasals ("m", "n") are most favored. Lindblom and Maddieson (1988) have made a classification of consonants into 3 levels of difficulty, in terms of the number of separate action subcomponents they require. Ordinary stops and nasals are in the "simple" category. In fact, consonants that are widely considered to be more difficult to produce than ordinary stops and nasals - e.g. liquids, such as those written in English orthography as "r" and "l", and fricatives such as "th" - are relatively infrequent in babbling and early words even though they fall within the simple category (Locke, 1983), and even remain problematical for life for some speakers. Thus, the progression in development of consonant production is from simple sounds to those that can be considered to require more skill.

The possibility that this was also the sequence of events in the evolution of language is supported by another aspect of the work of Lindblom and Maddieson. They found in a survey of the consonant inventories of languages, that languages with small inventories tended to only have their "simple" consonants, languages with medium-sized inventories differed mainly by also including "complex" consonants, and languages with the largest inventories tended to also add "elaborated" consonants, the most complex subgroup in the classification. Presumably the first true language/s had a small number of consonants. It seems that the only way that the beyond-chance allocation of difficult consonants to languages with larger inventories can be explained is by arguing that they tended to employ consonants of greater complexity as the size of their inventory increased. If so, the tendency for infants to add more difficult consonants later in acquisition suggests that ontogeny recapitulates phylogeny.

5.3 Sound pattern of the first language.

If babbling and early speech patterns are similar to those of the first language, what was it like? I have proposed "that the conjoint set of sounds and sound patterns favored in babbling and in the world's languages constitutes, in effect the fossil record of true speech" (MacNeilage, 1994). The proposed consonants are the voiceless unaspirated stops [p], [t], [k] (as in "bill", "dill", "gill") and the nasals [m] and [n] (as in "man"). (The square brackets denote phonetic symbols.) The two semivowels [w] and [j] as in "wet" and "yet" can also occupy the consonant position in syllables. The three vowels are versions of the three point vowels [i], [u] and [a]. Only the consonant-vowel syllable type is allowed, either alone or with one reiteration. Some constraints on possible intersyllabic combinations, similar to those observed in babbling and early speech, are imposed. An initial corpus of 102 words is proposed.
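To give a sense of the size of the space from which such a corpus is drawn, the sketch below enumerates every one- and two-syllable form that the stated inventory permits before any intersyllabic constraints are applied. It assumes that "one reiteration" means a word has at most two CV syllables and that the semivowels count as consonants for this purpose; the constraints that reduce the space to 102 words are not specified in this section, so they are deliberately not modelled.

```python
from itertools import product

# Inventory as listed in the text; [j] is the semivowel of "yet".
consonants = ["p", "t", "k", "m", "n", "w", "j"]
vowels = ["i", "u", "a"]

# Every CV syllable the inventory allows.
syllables = [c + v for c, v in product(consonants, vowels)]

# Unconstrained word forms: a single CV syllable, or any two in sequence
# (assumption: "one reiteration" means at most two syllables).
one_syllable = list(syllables)
two_syllable = [a + b for a, b in product(syllables, repeat=2)]
forms = one_syllable + two_syllable

print(len(syllables), len(two_syllable), len(forms))
```

On these assumptions the proposed 102-word corpus would occupy roughly a fifth of the unconstrained space of forms.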

5.4 Frames and rhythmic behavior.

Phylogeny can profitably be characterized as a succession of ontogenies. The important role in evolution of biphasic cycles with their basically fixed rhythms is paralleled by their important role in ontogeny. From the beginning of babbling, utterances typically have a fixed rhythm in which the syllable frame is the unit. Mastery of rhythm does not develop from non-rhythmicity as it does in learning to play the piano. I appeal to the intuition of the reader as parent or supermarket shopper that intersyllable durations of babbling utterances often sound completely regular.

This initial rhythmicity provides a basis for the control of speech throughout life. For example, Kozhevnikov and Chistovich (1965) have observed that when speakers changed speaking rate, the relative duration of stressed and unstressed syllables remained more or less constant, suggesting the presence of a superordinate rhythmic control generator related to syllable structure. They also noted that the typical finding of shorter segment durations in syllables with more segments reflected an adjustment of a segmental component to a syllabic one.

Thelen (1981) has emphasized the fact that babbling is simply one of a wide variety of repetitive rhythmic movements characteristic of infants in the first few months of life: "kicking, rocking, waving, bouncing, banging, rubbing, scratching, swaying ..." (p 238). As she notes, the behavior "stands out not only for its frequency but also for the peculiar exuberance and seemingly pleasurable absorption often seen in infants moving in this manner" (p 238). She believes that such "rhythmic stereotypies are transition behavior between uncoordinated behavior and complex, coordinated motor control." In her opinion, they are "phylogenetically available to the immature infant. In this view, rhythmical patterning originating as motor programs essential for movement control... (underlining mine) are "called forth", so to speak, during the long period before full voluntary control develops, to serve adaptive needs later met by goal-corrected behavior" (p 253). She suggests an adaptive function for such stereotypies as aids to the infants in becoming active participants in their social environment. This, in turn, suggests a scenario whereby the child could have, to some degree, become father of the man, in the evolution of speech, by encouraging use of rhythmic syllabic vocalization for communication purposes. (See also Wolff, 1967, 1968, for an earlier discussion of a similar thesis.)

5.5 Perceptual consequences of the open-close alternation.

The focus of this target article is on speech production. From this standpoint, the evolution of the mouth open-close alternation for speech is seen as the tinkering of an already available motor cyclicity into use as a general-purpose carrier wave for time-extended message production, with its subsequent modulation increasing message set size. It has also been pointed out, however, that the open-close alternation confers perceptual benefits. In particular, the acoustic transients, associated with consonants, that accompany onset and offset of vocal tract constriction are considered to be especially salient to the auditory system (e.g. Stevens, 1989). The ability to produce varied transients at high rates may have been an important hominid-specific communicative development. In addition, the regularly repeating high amplitude events provided by the vowels may have played an important role in inducing rhythmic imitations.

6. COMPARATIVE NEUROBIOLOGY OF THE FRAME/CONTENT MODE

6.1 The evolution of Broca's Area

The possibility that the mandibular cycle is the main articulatory building block of speech gains force from the fact that the region of the inferior frontal lobe that contains Broca's area in humans is the main cortical locus for the control of ingestive processes in mammals (Woolsey, 1958). In particular, the equivalents in the monkey of Brodmann's area 44 - the posterior part of classical Broca's area - and the immediately posterior area 6 have been clearly implicated in mastication (Luschei and Goldberg, 1981), and electrical stimulation of area 6 in humans evokes chewing movements (Foerster, 1936). In addition, in recent high resolution Positron Emission Tomography (PET) studies, cortical tissue at the confluence of areas 44 and 6 has been shown to be activated during speech production. Figure 3 shows regions of activation of posterior inferior frontal cortex in two studies in which subjects spoke written words (Peterson et al, 1988 - square; LeBlanc, 1992 - circle). The points are plotted on horizontal slice z = 16mm of the normalized human brain coordinates made available by Talairach (Talairach and Tournoux, 1988). The figure was generated by use of the Brainmap database (Fox et al, 1995). Both areas straddle the boundary between areas 6 and 44 of Brodmann. Fox (1995) reports additional evidence of joint activation of areas 6 and 44 during single word speech.

Of course a landmark event in the history of neuroscience was the discovery that Broca's area played an important role in the motor control of speech. More recently a good deal of significance has been attached to the discovery by paleontologists that the surface configuration of the cortex in this region underwent relatively sudden changes in Homo habilis (e.g. Tobias, 1987). The question of exactly why it was this particular area of the brain which took on this momentous new role has received little attention. Perhaps part of the answer may come not only from the recognition of the importance of our ingestive heritage in the evolution of speech, but also from acknowledging the more general fact that the main change from other primate vocalization to human speech has come in the articulatory system. Consistent with this fact, bilateral damage to Broca's area and the surrounding region does not interfere in any obvious way with monkey vocalization (Jurgens, Kirzinger and von Cramon, 1982), but unilateral damage to the region of Broca's area in the left hemisphere, if sufficiently extensive, results in a severe deficit in speech production. But despite the involvement of Broca's area in the control of the articulatory apparatus, caution is advised in drawing implications from this part of Homo habilis morphology for the evolution of speech. This region is also involved in manual function in monkeys (Gentilucci et al, 1988; Rizzolatti et al, 1988) and in humans (Fox, 1995).

6.2 Medial frontal cortex and speech evolution

At first glance, evolution of a new vocal communication capacity in Broca's area of humans appears to constitute a counterexample to Darwin's basic tenet of descent with modification. It has often been considered to be an entirely new development (e.g. Lancaster, 1973; Myers, 1976; Robinson, 1976). The main region of cortex controlling vocal communication in monkeys is Anterior Cingulate Cortex, on the medial surface of the hemisphere (Jurgens, 1987). Vocalization can be evoked by electrical stimulation of this region and damage to it impairs the monkey's ability to voluntarily produce calls on demand (e.g. in a conditioning situation). However, a clue to the evolutionary sequence of events for speech comes from consideration of the Supplementary Motor Area (SMA), an area immediately superior to anterior cingulate cortex and closely connected with it. While this area has not been implicated in vocal communication in monkeys, it is consistently activated in brain imaging studies of speech (Roland, 1993), and it is active even when the subjects merely think about making movements (Orgogozo and Larson, 1979). It was given equal status with Broca's and Wernicke's areas as a language area in the classic monograph of Penfield and Roberts (1959).

Two properties of the SMA are of particular interest in the context of the Frame/Content theory. A number of investigators have reported that electrical stimulation of this area often makes patients involuntarily produce simple consonant-vowel syllable sequences such as "dadadada" or "tetetete" (Brickner, 1940; Erikson and Woolsey, 1951; Penfield and Welch, 1951; Penfield and Jasper, 1954; Chauvel, 1976; Woolsey, Erikson and Gilson, 1979; Dinner and Luders, 1995). Penfield and Welch concluded from their observations of rhythmic vocalizations that "These mechanisms, which we have activated by gross artificial stimuli, may, however, under different conditions, be important in the production of the varied sounds which men often use to communicate ideas." (p 303) The author believes that this conclusion was of profound importance for the understanding of the mechanism of speech production, and its evolution, but apparently it has been totally ignored.

In addition, Jonas (1981) has summarized 8 studies of irritative lesions of the SMA which have reported involuntary production of similar sequences by 20 patients. The convergence of these two types of evidence strongly suggests that the SMA is involved in frame generation in modern humans.

It thus appears that the evolution of a communicative role for Broca's area was not an entirely de novo development. It is more likely that when mandibular oscillations became important for communication, their control for this purpose shifted to the region of the brain which was already most important for control of communicative output - medial cortex. But it may have been that once the mandibular cycle was co-opted for communicative purposes, the overall motor abilities associated with ingestion also became available for tinkering into use for communicative purposes. This is consistent with the fact that a typical result of damage to Broca's area is what has been called "apraxia of speech" - a disorder of motor programming revealed by phonemic paraphasias and distortions of speech sounds (e.g. MacNeilage, 1982).

6.3 Medial and lateral premotor systems

Further understanding of this particular distribution of speech motor roles and how they relate to properties of manual control can be gained by viewing the overall problem of primate motor control from a broader perspective. It is now generally accepted that the SMA and inferior premotor cortex of areas 6 and 44 are the main areas of premotor cortex for two fundamentally different motor subsystems for bodily action in general (e.g. Eccles, 1982; Rizzolatti, Matelli and Pavesi, 1983; Goldberg, 1985, 1992; Passingham, 1987). Using the terminology of Goldberg, Anterior Cingulate Cortex and the SMA are part of a "Medial Premotor System" (MPS) associated primarily with "Intrinsic" or self-generated activity, while the areas of inferior premotor cortex are part of a "Lateral Premotor System" (LPS) associated primarily with "Extrinsic" actions; that is, actions responsive to external stimulation. The connectivity of these two premotor areas is consistent with this proposed division of labor. While the sensory input to the SMA is primarily from deep somatic afferents, inferior premotor cortex receives heavy multimodal sensory input - somatic input from anterior parietal cortex, visual input primarily from posterior parietal cortex, and auditory input from superior temporal cortex, including Wernicke's area in the left hemisphere of humans (Pandya, 1987).

This basic action dichotomy has been well established by studies involving both lesion and electrical recording in monkeys. The MPS has been shown to be primarily involved in tasks in which monkeys produce sequences of previously learned manual actions with no external prompting, while LPS is primarily involved in sequencing tasks in which the component acts are cued by sensory stimulation (e.g. lights) (Tanji, Shima, Matsuzaka and Halsband, 1995). The human equivalent of the findings from monkey lesion studies of MPS is an initial akinesia, an inability to spontaneously generate bodily actions. A symptom often encountered in such patients is the "Alien Hand Sign" (Goldberg, 1992). The hand contralateral to the lesion, typically the right hand, seems to take on a life of its own, without the control of the patient. In such patients the normal balance of MPS and LPS apparently shifts towards a dominance of the LPS. If an object is introduced into the intrapersonal space of a patient with the alien hand sign, the patient will grasp the object with such force that the fingers have to be prized off it. The relative role of the two systems in patients with MPS lesions is further shown by a study by Watson, Fleet, Gonzales-Rothi and Heilman (1986). They showed that such patients were maximally impaired in attempts to pantomime acts from verbal instruction. Less impairment was noted in attempts to imitate the neurologist's actions, and actual use of objects was most normal.

There are equivalent effects of MPS lesions for speech. The initial effect is often complete mutism - inability to spontaneously generate speech. However, subsequently, while spontaneous speech remains sparse, such patients typically show almost normal repetition ability. In these cases, Passingham (1987) has surmised that "it is Broca's area speaking" (p 159). A similar pattern of results has been observed in patients with Transcortical Motor Aphasia, which typically involves interference with the pathway from the SMA to inferior premotor cortex (Freedman, Alexander and Naeser, 1984).

In contrast to these results of MPS lesions on speech are results of lesions of LPS, which tend to affect repetition more than spontaneous speech. In particular, this pattern is often observed in Conduction aphasics who tend to have damage in inferior parietal cortex affecting transmission of information from Wernicke's to Broca's area. Thus the medial and lateral patients described here show a "double dissociation", a pattern much valued in neuropsychology because it provides evidence that there are two separable functional systems in the brain (Shallice, 1988). Further evidence for this dichotomy comes from patients with "Isolation of the Speech Area". These patients, who have lost most cortex except for lateral perisylvian cortex have no spontaneous speech, but may repeat input obligatorily, without instruction (Geschwind, Quadfassel and Segarra, 1968).

6.4 The lateral system and speech learnability

Typical bodily actions are visually guided. While the motivationally based intention is generated in MPS, which may also help in providing the basic action skeleton, the action itself is normally accomplished, while taking into account target-related information available to vision, by means of LPS. In contrast, spontaneously generated speech episodes are not sensorily guided to any important degree. However, as we have seen, the lateral system has an extremely good repetition capacity. Normal humans can repeat short stretches of speech with input-output latencies for particular sounds that are often shorter than typical simple auditory reaction times (about 140 ms; see Porter and Castellanos, 1980). People have been puzzled as to why we possess this rather amazing capacity when, in the words of Stengel and Lodge-Patch (1955), repetition is an ability that lacks functional purpose.

A background for a better understanding of the repetition phenomenon comes from evidence from PET studies for the activation of ventral lateral frontal cortex (roughly Broca's Area) in tasks that do not involve any overt speech (Demonet et al, 1993):

"... for example, the categorization of visually presented letters on the basis of their phonetic value (Sergent et al, 1992), a rhyming task on auditorily presented pairs of syllables (Zatorre et al, 1992), a sequential phoneme monitoring task on auditorily presented nonwords with serial processing (Demonet et al, 1992), the memorization of a sequence of visually presented consonants (Paulesu et al, 1993), a lexical decision task on visually presented letter strings (Price et al, 1993) and monitoring tasks for various language stimuli either auditorily or visually presented (Fiez et al, 1993)." (p 44).

As the authors note:

"The observed activation of this premotor area in artificial metalinguistic comprehension tasks suggests the involvement of sensorimotor transcoding processes that are also involved in other psychological phenomena such as motor theory of perception of speech (Liberman and Mattingly, 1985), inner speech (Stuss and Benson, 1986; Wise et al, 1991), the articulatory loop of working memory (Baddeley, 1986), or motor strategies developed by infants during the period of language acquisition (Kuhl and Meltzoff, 1982)." They note that the presence of this sensorimotor transcoding capacity is also suggested by "disorders of phonetic discrimination in Broca-type aphasic patients (Blumstein et al, 1977) as well as in subjects during electrical stimulations of the left inferior frontal region (Ojemann, 1983)." (p 44).

The utility of this capacity and the probable reason for its origin becomes clearer when one notes that while humans learn whichever one of the 6000 or so languages they grow up with, monkeys have negligible vocal learning capacity (Jurgens, 1995). The human repetition capacity is presumably associated with the now well established Phonological Loop of working memory, which involves subvocalization as an aid in temporary storage of speech material (Baddeley, 1986). Baddeley (1995) has recently speculated that this capability probably evolved in order for language to be learned. Thus while in adults, the primary role of the LPS for spontaneous speech is probably transmission of previously learned and now stored lexical information relevant to pronunciation from Wernicke's to Broca's area, the primary role of LPS in infants is that it allows speech to be learned. It is somewhat ironic, in view of the special modular innate status often claimed for the human speech capacity, that from a perceptual-motor perspective the main change in vocal organization from other primates to humans may be evolution in the LPS of a capacity to learn speech. Furthermore, rather than having a unique form, the overall brain organization of motor output for speech seems to be no different than that for other bodily activity. Both are equally subject to the basic intrinsic-extrinsic functional dichotomy.

Lateral cortex presumably allows humans to not only say what they hear but do what they see, in general bodily terms. The presence of some ability of the SMA patients described by Watson et al (1986) to imitate demonstrations of object use when they cannot pantomime such use is evidence of this. But there is also evidence that monkeys may possess some comparable ability. Pellegrino et al (1992) have observed numerous instances in which single neurons in ventral lateral premotor cortex which had been shown to be active in various movement complexes performed by the animal, also discharged when experimenters performed the same movements in front of the animal.

It seems likely that we have grossly underestimated the importance of our capacity for matching movements to input patterns, vested in the LPS, in our attempts to understand the evolution of cognition in general. Elsewhere, I have summarized an argument to this effect by Donald (1991). He believes that "... evolution of a generalized mimetic capacity in Homo Erectus was the first major step in the evolution of a hominid capacity beyond the great ape level, and was a necessary precursor to the evolution of language, which probably evolved in homo sapiens. This hypothesis addresses the otherwise anomalous centrality in human culture of a wide range of behaviors including tribal ritual, dance, games, singing and music in general, all of which involve a capacity for the production of intentional representational displays but have virtually no analogs in living great apes. A wide variety of actions and modalities can be incorporated for the mimetic purpose: `Tones of voice, facial expressions, eye movements, manual signs and gestures, postural attitudes, patterned whole body movements of various sorts...' (p 169). Donald makes the plausible argument that this mimetic capacity must have evolved before language, because language provides such a rich cognitive endowment that it would be hard to explain the necessity for mimesis once language had evolved." (MacNeilage, 1994, pp 186-187).

6.5 Speech input from posterior cortex.

Finally, with reference to speech, a word is in order about the input to the two proposed motor control subsystems. There is general agreement that perisylvian cortex in the temporoparietal region is involved in phonological representations of at least the stem forms of many content words, especially nouns. In contrast, grammatical morphemes (function words and affixes) and perhaps aspects of verbs may be primarily controlled from frontal cortex, judging by the agrammatism that follows extensive lesions in lateral frontal cortex in classical Broca's aphasia. Most segmental serial ordering errors of speech in both normals and aphasics involve content word stems, not grammatical morphemes, and the F/C theory presented here is most relevant to content words.

Patients with lesions in temporoparietal cortex typically produce paraphasic speech - speech replete with segmental errors. Acoustic studies have shown that these errors for the most part are errors of choice of segments rather than errors in their motor control, the latter errors being more prominent in patients with ventral frontal lesions (MacNeilage, 1982). From this, one can conclude that temporoparietal cortex is involved in phonological encoding of lexical items - access to phonological information about words, and successful delivery of this information to the production control apparatus. It is hypothesized that this encoding phase involves two kinds of information, one kind for each of the motor control subsystems that have been discussed. Information regarding numbers of syllables in the word, suprasegmental information regarding stress placement, and perhaps information about vowels may be sent to the medial system for frame generation. Information about consonants and vowels may be sent to the lateral system for generation of content elements adjusted to their segmental context, as suggested earlier. According to this conception, the subsequent reintegration of the frame and content components must take place in lateral premotor cortex.

6.6 The role of prefrontal cortex.

The full story of the evolution of speech must include the history of selection pressures for communication in the context of overall hominid evolution. An important neurobiological development in this regard is the enormous expansion of prefrontal cortex, a region involved in higher order organizational functions in general. Prefrontal cortex is heavily interconnected with the limbic system, leading MacLean (1982) to suggest that it affords "an increased capacity to relate internal and external experience" (p 311). Deacon (1992) accords prefrontal cortex the primary role in the development in humans of a "low arousal" learnable communicative capacity independent of innate emotion-based vocalizations because of its "dominant status in the loop linking sensory analysis, emotional arousal and motor output" (p 155). A specific functional linkage between dorsolateral prefrontal cortex and the SMA in humans was recently shown by Frith et al (1991) in a study involving word generation as distinct from word repetition. Paus et al (1993) also observed joint activation of prefrontal cortex with medial cortical sites during speech tasks. Studies cited earlier (e.g. Posner et al, 1988) also suggest that prefrontal cortex may have played a dominant role in the evolution of grammar.

7. SOME IMPLICATIONS OF THE THEORY.

7.1 Testability.

Is this theory testable? Predictions regarding levels of activity of the SMA and areas 6 and 44 in certain tasks, testable by means of brain imaging studies, can be made. One straightforward prediction is that mastication, sucking and licking will involve more activity in ventral area 6 than in area 44, and more activity in area 44 than in the SMA. Another prediction involves the general claim that the SMA is specialized for frame generation and ventral premotor cortex for content generation. This prediction can be tested using artificial forms of speech which manipulate the relative roles of the frame and content components: 1. "Reiterant Speech", a condition in which segmental content demands are minimized but syllabic and phonatory demands are not, should produce relatively higher activity levels in the SMA, and perhaps more activity in area 6 than in area 44. In this condition, the subject attempts to simulate words or utterances using only one syllable. For example, if the stimulus word is "concatenate", the subject says "maMAmama", producing the same number of syllables as in the stimulus word with major stress on the second syllable. 2. Bite block speech, or speech with the teeth clenched, eliminates the demand on the mandible for syllabification but increases the demand on segmental production, because every segment must be produced in an unusual way to compensate for the inability to adopt the usual jaw position for the sound. This condition should produce higher relative levels of activity in area 44 than in either area 6 or the SMA.
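
The construction of a reiterant speech target can be illustrated with a minimal sketch. It assumes that the syllable count and the position of the major stress in the stimulus word are already known and supplied by hand; the function name `reiterant` and its parameters are hypothetical and are introduced here only for illustration, not taken from any experimental protocol.

```python
def reiterant(n_syllables: int, stressed: int, syllable: str = "ma") -> str:
    """Return a reiterant-speech target for a stimulus word.

    n_syllables: number of syllables in the stimulus word (assumed known).
    stressed:    1-based position of the syllable carrying major stress.
    syllable:    the single syllable that replaces all segmental content.
    The stressed syllable is written in upper case, as in "maMAmama".
    """
    if n_syllables < 1:
        raise ValueError("a word needs at least one syllable")
    if not 1 <= stressed <= n_syllables:
        raise ValueError("stressed position must lie within the word")
    parts = [
        syllable.upper() if position == stressed else syllable.lower()
        for position in range(1, n_syllables + 1)
    ]
    return "".join(parts)


# "concatenate": four syllables, major stress on the second.
print(reiterant(4, 2))
```

The sketch keeps the frame-level information (number of syllables, stress placement) and discards all segmental content, which is the manipulation the reiterant condition is meant to achieve.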

7.2 Comparison with other theories.

How does this theory compare with other current general conceptions of speech production that have implications for evolution? The concept of the syllable was found to be central in all the areas of subject matter considered in the formulation of this theory. With the exception of conceptions based primarily on evidence from segmental errors in speech (Shattuck-Hufnagel, 1979; Dell, 1986; Levelt, 1989), this emphasis is not shared in other conceptions of speech as a behavior. The syllable is given virtually no attention in two other current theories of the evolution of speech, the two-tubed vocal tract theory of Lieberman (1984) and the motor theory of speech perception (Liberman and Mattingly, 1985). It is not mentioned in the most prominent conception of brain-language relations, the Wernicke-Geschwind scenario, reiterated by Damasio and Geschwind in 1984 (also see Damasio and Damasio, 1992). It scarcely figures in the most prominent current conception of the on-line control of speech - the articulatory phonology perspective of Browman and Goldstein (1986). It is incidental to what has till recently been the dominant conception of acquisition of speech production - the theory of Jakobson (1968). The Frame/Content theory suggests that all of these approaches require drastic restructuring.

In other contexts, the syllable falls victim to a functional eclecticism which results in a lack of recognition that speech might be different from other functions because it has been subject to different selection pressures. For example, Norman and Rumelhart (1983) have constructed a model of typing control consistent with typical typing error patterns. The model is based on "the assumption that the motor control of a learned movement is represented by means of a motor schema, an organized unit of knowledge, differing from the form of knowledge widely studied in the literature on memory, language, and thought only in that it has as its output the control of body movements" (p 55). (See also Rumelhart and Norman, 1982). This eclectic approach to mental organization, unaffected by the possibility that different functions may be subject to different phylogenetic constraints, is relatively common in both cognitive science and neuroscience. The present contention is that no theory of either the organization of speech or its evolution is viable unless it includes the dual components metaphorically labelled frame and content in the present discussion, whatever theories might be advanced to account for any other aspect of human function.

7.3 Other instances of the frame/content mode.

In earlier writings MacNeilage (1987a) and his colleagues (MacNeilage et al, 1984, 1985) had suggested that the F/C mode of phonology may have had a precursor in an F/C mode of bimanual coordination, in which the holding hand is the frame and the manipulating hand contributes content elements. The author wishes to retract this view as he was unable to conceive of an adaptation, induced by a specific selection pressure, that would have achieved the transfer of such a generalized organization capability from the manual to the vocal system. In this journal and elsewhere, my colleagues and I have suggested an alternative view of the hand-mouth relation whereby both the manual and the speech specialization arose from a left hemisphere specialization for whole-body postural control already present in prosimians (MacNeilage, Studdert-Kennedy & Lindblom, 1987; MacNeilage, 1991b). It is possible that this role of the left hemisphere for whole body motor control may be fundamental to all vertebrates (MacNeilage, 1997a). There seem to be other important F/C modes of complex behavior. Garrett (1988) has argued for an F/C mode of syntax on the basis of evidence from serial ordering errors involving morphemes and words. This author regards this evidence as an extremely important clue as to the means of evolution of grammar, but would regard this mode as analogous to the F/C mode of organization of phonology rather than homologous. The F/C mode of organization can also be implicated in much hand/mouth interaction, such as that involved in one-handed and two-handed feeding (MacNeilage, 1992). It appears that the F/C mode is an important means of evolution of complex action systems. Presumably this is because it makes it possible to produce a large number of output states with a small number of basic organizational configurations - one basic frame in the case of speech.

7.4 Innate subsegmental units.

F/C theory provides no justification for the postulation of innate subsegmental units - either the Feature of linguistic theory (Chomsky and Halle, 1968) or its functional equivalent, the Gesture (Liberman and Mattingly, 1985). The concept of the feature, including the notion that it provides an innate basis for the "phonetic possibilities of Man" (Chomsky and Halle, 1968), arises from circular reasoning, as Ohala (e.g. Ohala, 1978) has repeatedly pointed out, and the author has discussed elsewhere the problem of providing independent evidence for the concept of gesture in on-line production and perception, let alone genetic structure (MacNeilage, 1990). In addition to the inherent inability of essentialistic concepts, such as the concept of distinctive feature, to form part of a theory involving change, the concept is of no value in the present context, because it lacks functional implications. For example, a distinctive feature such as "+high" (MacNeilage, 1991a) refers to an end state of the tongue which can vary across a subclass of vowels. The characterization is considered to be abstract and only indirectly related to articulation (e.g. Anderson, 1981). However, no coherent theory of the transforms from the putative single abstract representation to its various manifestations in vowels has ever been presented. The present articulatory connotations of the definitions of features ("high" refers to an articulator) introduced by Chomsky and Halle (1968) to replace the perceptually based features of Jakobson, Fant and Halle (1951) were not chosen on the basis of any evidence regarding speech production, but only because articulatory terms had more straightforward connotations than perceptual terms. As a result of this decision, however, the discipline of phonology is now ill-equipped to describe, let alone explain, the many features of sound patterns of languages that apparently develop for perceptual reasons - for example the fact that nasals tend to assimilate to the place of articulation of adjacent stop consonants, but fricatives do not (Hura et al, 1994).

7.5 Input-output relationships.

The F/C theory includes the suggestion that there have been major developments in the efficiency of input-output linkages in the evolution of speech. The motor theory of speech perception (Liberman and Mattingly, 1985) also emphasizes the importance of the evolution of input/output relationships for speech. In this theory, the gesture is the fundamental unit of both input and output, with the abstract representation of the output unit serving as a basis for categorical perception of input. However, the motor theory calls for the opposite relation between phylogeny and ontogeny from the one suggested here. According to the motor theory (Liberman and Mattingly, 1985), gestures originated as separate entities and then, under pressures for rapid message transmission, became increasingly coarticulated with neighboring units to the point where only perceptual access to gestural invariance at some abstract production level makes them perceivable. If true, this would be a case in which ontogeny reverses phylogeny. As we have seen, the frame dominance state in babbling and early speech is characterized primarily by heavy coarticulation of successive articulatory positionings, and subsequent developments are in the direction of reducing coarticulation rather than increasing it (e.g. Nittrouer, Studdert-Kennedy, and McGowan, 1989). Thus, rather than being the initial elements out of which speech was created, gestures, if they can be adequately defined, will probably be best regarded as later emergents, phylogenetically and ontogenetically. The F/C theory suggests instead that the syllable frame should be regarded as providing an initial common basis for interactions between perceptual, lexical and motor subcomponents of the speech system in earlier hominids and modern infants.

7.6 Was the first language spoken or signed?

The question of whether spoken or sign language was the first language is considered in detail elsewhere (MacNeilage, 1997a), with the following conclusions:

(1) The current ubiquity of spoken language encourages a belief in its evolutionary priority. The reasons usually given for an historical switch from signed to spoken language - the lack of omnidirectionality of sign, the fact that it prevents other uses of the hands, and its lack of utility in the dark - seem insufficient to have caused a total shift from manual to vocal language.

(2) The likelihood that there is a left hemisphere vocal communication specialization in frogs, birds, mice, gerbils and monkeys, and the many instances of right handedness in groups of higher nonhuman primates (both reviewed in MacNeilage, 1997b), cast doubt on the frequently encountered contention that tool construction and use in Homo habilis were crucial manual adaptations for language.

(3) The repeatedly obtained finding that language lateralization is more closely related to foot preference - an index of postural asymmetry - than to handedness, which is an index of skill (Searleman, 1980; Maki, 1990; Day and MacNeilage, 1996; Elias and Bryden, 1997), casts further doubt on an early role of manual language.

(4) Recent claims that there is a left hemisphere specialization for language independent of the modality (Poizner, Klima and Bellugi, 1987; Petitto and Marentette, 1991) give the spurious impression that an historical shift from signed to spoken language could easily have occurred. These claims are found to be unjustified.

7.7 Coda

According to the F/C theory, the evolution of the control of the movements of speech from prespeech vocalizations involved pre-existing phonatory capacities and a specific sequence of adaptations proceeding from ingestive cyclicities, via visuofacial communicative cyclicities, to syllables, which ultimately became modulated in their internal content. The overall form of the theory is relatively straightforward, taking as it does a well accepted notion of the dual structure of speech organization (frames and content elements) and mapping it onto a relatively well accepted notion of the dual structure of primate cortical motor systems (medial and lateral) which were presumably modified for the purpose. It is hoped that this theory will provide an antidote, in addition to the one provided in this medium by Pinker and Bloom (1990), to the tendency to regard language as "an embarrassment for evolutionary theory" (Premack, 1986, p 133). The author's guess is that language will eventually prove to be amenable to current mainstream evolutionary theory. A Neo-Darwinian approach to speech may prove to be the thin end of the wedge for the understanding of language evolution.

Acknowledgements

This paper was prepared with support from research grant # HD-27733 from the Public Health Service. I thank Hugh Buckingham, Antonio Damasio, Barbara Davis, Randy Diehl, Rogers Elliot, Jean-Luc Nespoulous, Giacomo Rizzolatti, Marilyn Vihman and Steven Wise for their comments on the manuscript. I also wish to thank Mario Liotti and Larry Parsons for generating Figure 3 by use of BrainMap software and Peter Fox for making the facilities of the Research Imaging Center, University of Texas Health Science Center, San Antonio, available for this purpose.

References

Aich, H., Moos-Heilen, R. & Zimmermann, E. (1990) Vocalizations of adult gelada baboons (Theropithecus gelada): Acoustic structure and behavioral context. Folia Primatologica, 55, 109-132.

Anderson, S. (1981) Why phonology isn't natural. Linguistic Inquiry, 12, 493-539.

Andrew, R.J. (1976) Use of formants in the grunts of baboons and other nonhuman primates. Annals of the New York Academy of Sciences, Vol. 280, 673-693.

Baddeley, A.D. (1986) Working memory. Clarendon Press.

Baddeley, A.D. (1995) Working memory. In: Principles of neuroscience, ed. M.S. Gazzaniga. MIT Press.

Blumstein, S.E., Baker, H. & Goodglass, H. (1977) Phonological factors in auditory comprehension in aphasia. Neuropsychologia, 15, 19-30.

Brickner, R.M. (1940) A human cortical area producing repetitive phenomena when stimulated. Journal of Neurophysiology 3, 128-130.

Broad, D.J. (1973) Phonation. In: Normal aspects of speech, hearing and language, eds. F.D. Minifie, T.J. Hixon & F. Williams. Prentice-Hall.

Browman, C.P. & Goldstein, L. (1986) Toward an articulatory phonology. Phonology Yearbook 3. Cambridge University Press.

Carre, R., Lindblom, B. & MacNeilage, P.F. (1995) Acoustic factors in the evolution of the vocal tract. (Translation) C.R. Academie des Sciences Paris, t 320, Serie IIb, 471-476.

Chauvel, P.C. (1976) Les stimulations de l'aire motrice supplementaire chez l'homme. Thesis, Universite de Rennes.

Chomsky, N. & Halle, M. (1968) The sound pattern of English. Harper and Row.

Chomsky, N. (1986) Knowledge of language: Its nature, origin, and use. Praeger.

Chomsky, N. (1988) Language and problems of knowledge: The Managua lectures. MIT Press.

Cohen, A.H. (1988) Evolution of the vertebrate central pattern generator for locomotion. In: Neural control of rhythmic movements, eds. A. Cohen, S. Rossignol & S. Grillner. Wiley.

Darwin, C. (1859) The origin of species. John Murray.

Darwin, C. (1871) The descent of man. Great Books, Encyclopedia Britannica.

Davis, B.L. & MacNeilage, P.F. (1995) The articulatory basis of babbling. Journal of Speech and Hearing Research, 38, 1199-1211.

Day, L.B. & MacNeilage, P.F. (1996) Postural asymmetries and language lateralization in humans (Homo sapiens). Journal of Comparative Psychology, 110, 86-96.

Deacon, T.W. (1992) The neural circuits underlying primate calls and human language. In: Language origin: A multidisciplinary approach, eds. J. Wind, B. Chiarelli, B. Bichakjian & A. Nocentini. Kluwer.

Dell, G.S. (1986) A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283-321.

Demonet, J-F., Chollet, F., Ramsay, S., Cardebat, D., Nespoulous, J-L., Wise, R., Rascol, A. & Frackowiak, R.S.J. (1992) The anatomy of phonological and semantic processing in normal subjects. Brain, 115, 1753-1768.

Demonet, J-F., Wise, R. & Frackowiak, R.S.J. (1993) Language functions explored in normal subjects by positron emission tomography: A critical review. Human Brain Mapping, 1, 39-47.

Dinner, D.S. & Luders, H.O. (1995) Human supplementary sensorimotor area: Electrical stimulation and movement-related potential studies. In: Epilepsy and the functional anatomy of the frontal lobe, eds. H.H. Jasper, S. Riggio & P.S. Goldman-Rakic. Raven Press.

Donald, M. (1991) Origins of the modern mind: Three stages in the evolution of culture and cognition. Harvard University Press.

Dunbar, R.I.M. & Dunbar, P. (1975) Social dynamics of gelada baboons. Contributions to Primatology, Vol. 6. Karger.

Eccles, J.C. (1982) The initiation of voluntary movements by the supplementary motor area. Archiv Psychiatrie Nervenkrankheiten, 231, 423-441.

Elias, L.J. & Bryden, M.P. (1997) Footedness is a better predictor of language lateralization than handedness. Laterality (In press).

Erickson, T.C. & Woolsey, C.N. (1951) Observations on the supplementary motor area of man. Transactions of the American Neurological Association, 76, 50-52.

Fant, C.G.M. (1960) Acoustic theory of speech production. Mouton.

Fiez, J.A., Raichle, M.E., Tallal, P.A. & Petersen, S.E. (1993) Activation of a left frontal area near Broca's area during auditory detection and phonological access tasks. Journal of Cerebral Blood Flow and Metabolism, 13 (Supplement 1) S519.

Foerster, O. (1936) The motor cortex in man in the light of Hughlings Jackson's doctrines. Brain, 59, 135-159.

Fox, P.T. (1995) Broca's area: Motor encoding in somatic space. Behavioral and Brain Sciences, 18, 344-345.

Fox, P.T., Mikiten, S., Davis, G. & Lancaster, J.L. (1995) Brainmap: A database of human functional brain mapping. In: Advances in functional neuroimaging: Technical foundations, eds. R.W. Thatcher, M. Hallet & T. Zeffiro. Academic Press.

Freedman, M., Alexander, M.P. & Naeser, M.A. (1984) Anatomic basis of transcortical motor aphasia. Neurology, 34, 409-417.

Frith, C.D., Friston, K.J., Liddle, P.F. & Frackowiak, R.S.J. (1991) A PET study of word finding. Neuropsychologia, 29, 1137-1148.

Fromkin, V.A. (Ed.) (1973) Speech errors as linguistic evidence. Mouton.

Garrett, M.F. (1988) Processes in language production. In: Linguistics: The Cambridge survey, Vol. III, Language: Psychological and biological aspects, ed. F.J. Newmeyer. Cambridge University Press.

Gentilucci, M., Fogassi, L., Luppino, G., Matelli, M., Camarda, R. & Rizzolatti, G. (1988) Functional organization of inferior area 6 in the macaque monkey I. Somatotopy and the control of proximal movements. Experimental Brain Research, 71, 475-490.

Geschwind, N., Quadfassel, F.A. & Segarra, J. (1968) Isolation of the speech area. Neuropsychologia, 6, 327-340.

Goldberg, G. (1985) Supplementary motor area structure and function: Review and hypothesis. Behavioral and Brain Sciences, 8, 567-616.

Goldberg, G. (1992) Premotor systems: attention to action and behavioral choice. In: Neurobiology of motor program selection, eds. J. Kein, C.R. McCrohan & W. Winlow. Pergamon Press.

Gould, S.J. (1977) Ontogeny and phylogeny. Belknap.

Green, S. (1975) Variations of vocal pattern with social situation in the Japanese monkey (Macaca fuscata): A field study. In: Primate behavior: Vol. 4, Developments in field and laboratory research, ed. L.A. Rosenblum. Academic Press.

Grudin, J. (1981) The organization of serial order in typing. Unpublished doctoral dissertation, University of California at San Diego.

Hixon, T.J. (1973) Respiratory function in speech. In: Normal aspects of speech, hearing and language, eds. F.D. Minifie, T.J. Hixon & F. Williams. Prentice-Hall.

Hofsten, C. von (1984) Developmental changes in the organization of prereaching movements. Developmental Psychology, 20, 378-388.

Hofsten, C. von (1986) The early development of the manual system. In: Precursors to infant speech, eds. B. Lindblom & R. Zetterstrom. Stockton Press.

Hura, S.L., Lindblom, B. & Diehl, R.L. (1994) On the role of perception in shaping phonological assimilation rules. Language and Speech, 35, 59-72.

Huxley, T.H. (1917) Methods and results: Essays. Appleton & Co.

Jacob, F. (1977) Evolution and tinkering. Science, 196, 1161-1166.

Jakobson, R. (1968) Child language, aphasia, and phonological universals. Mouton.

Jakobson, R., Fant, C.G.M. & Halle, M. (1951) Preliminaries to speech analysis. MIT Press.

Jonas, S. (1981) The supplementary motor region and speech emission. Journal of Communication Disorders, 14, 349-373.

Jurgens, U. (1979) Neural control of vocalization in non-human primates. In: Neurobiology of social communication in primates, eds. H.D. Steklis & M.J. Raleigh. Academic Press.

Jurgens, U. (1987) Primate communication: Signalling, vocalization. In: Encyclopedia of Neuroscience, ed. G. Adelman. Birkhauser.

Jurgens, U. (1995) Neuronal control of vocal production in human and non-human primates. In: Current topics in primate vocal communication, eds. E. Zimmerman, J.D. Newman & U. Jurgens. Plenum Press.

Jurgens, U., Kirzinger, A. & Cramon, D. von (1982) The effect of deep reaching lesions in the cortical face area on phonation: A combined case report and experimental monkey study. Cortex, 18, 125-140.

Kozhevnikov, V.A. & Chistovich, L. (Eds) (1965) Speech: Articulation and perception. Washington, Clearing House for Federal, Scientific and Technical Information, JPRS, 30, 543.

Kuhl, P.K. & Meltzoff, A.N. (1982) The bimodal perception of speech in infancy. Science, 218, 1138-1141.

Lakoff, G. (1987) Women, fire and dangerous things: What categories reveal about the mind. The University of Chicago Press.

Lancaster, J. (1973) Primate behavior and the emergence of human culture. Holt, Rinehart & Winston.

Lashley, K.S. (1951) The problem of serial order in behavior. In: Cerebral mechanisms in behavior: The Hixon symposium, ed. L.A. Jeffress. Wiley.

LeBlanc, P. (1992) Language localization with activation PET scanning. Journal of Neurosurgery, 31, 369-373.

Levelt, W.J.M. (1989) Speaking: From intention to articulation. MIT Press.

Levelt, W.J.M. (1992) Accessing words in speech production: Stages, processes and representations. Cognition, 42, 1-22.

Liberman, A.M. & Mattingly, I.G. (1985) The motor theory of speech perception revised. Cognition, 21, 1-36.

Lieberman, P. (1984) The biology and evolution of language. Harvard University Press.

Lindblom, B. & Maddieson, I. (1988) Phonetic universals in consonant systems. In: Language, speech and mind, eds. L.M. Hyman & C.N. Li. Routledge.

Locke, J. (1983) Phonological acquisition and change. Academic Press.

Lund, J.P. & Enomoto, S. (1988) The generation of mastication by the central nervous system. In: Neural control of rhythmic movements, eds. A. Cohen, S. Rossignol & S. Grillner. Wiley.

Luschei, E.S. & Goldberg, L.J. (1981) Neural mechanisms of mandibular control: mastication and voluntary biting. In: Handbook of physiology, The nervous system, Vol. 2. Washington, D.C.: American Physiological Society.

MacLean, P.D. (1982) On the origin and progressive evolution of the triune brain. In: Primate brain evolution: Methods and concepts, eds. E. Armstrong & D. Falk. Plenum.

MacNeilage, P.F. (1964) Typing errors as clues to serial ordering mechanisms in language behavior. Language and Speech, 7, 144-159.

MacNeilage, P.F. (1973) Central processes controlling speech production during sleep and waking. In: The psychophysiology of thinking: Studies of covert processes, eds. F.J. McGuigan & R.A. Schoonover. Academic Press.

MacNeilage, P.F. (1982) Speech production mechanisms in aphasia. In: Speech motor control, eds. S. Grillner, B. Lindblom, J. Lubker & A. Persson. Pergamon.

MacNeilage, P.F. (1985) Serial ordering errors in speech and typing. In: Phonetic linguistics, ed. V.A. Fromkin. Academic Press.

MacNeilage, P.F. (1986) Bimanual coordination and the beginnings of speech. In: Precursors to early speech, eds. B. Lindblom & R. Zetterstrom. Stockton Press.

MacNeilage, P.F. (1987a) The evolution of hemispheric specialization for manual function and language. In: Higher brain functions: explorations of the brain's emergent properties, ed. S.P. Wise. Wiley.

MacNeilage, P.F. (1987b) Speech: Motor control. In: Encyclopedia of neuroscience, ed. G.A. Adelman. Birkhauser.

MacNeilage, P.F. (1989) Grasping in modern primates: The evolutionary context. In: Vision and action: The control of grasping, ed. M.A. Goodale. Ablex.

MacNeilage, P.F. (1990) The gesture as a unit in speech perception theories. In: Modularity and the motor theory, eds. I.G. Mattingly & M.G. Studdert-Kennedy. Erlbaum.

MacNeilage, P.F. (1991a) Articulatory phonetics. In: Oxford international encyclopedia of linguistics, ed. W. Bright. Oxford University Press.

MacNeilage, P.F. (1991b) The "Postural Origins" theory of primate neurobiological asymmetries. In: Biological and behavioral determinants of language development, eds. N.A. Krasnegor, D.M. Rumbaugh, R.L. Schiefelbusch & M.G. Studdert-Kennedy. Erlbaum.

MacNeilage, P.F. (1992) Evolution and lateralization of the two great primate action systems. In: Language origin: A multidisciplinary approach, eds. J. Wind, B. Chiarelli, B. Bichakjian & A. Nocentini. Kluwer.

MacNeilage, P.F. (1994) Prolegomena to a theory of the sound pattern of the first language. Phonetica, 51, 184-194.

MacNeilage, P.F. (1997a) Evolution of the mechanism of language output: Comparative neurobiology of vocal and manual communication. In: Evolution of language, eds. J.R. Hurford, C. Knight & M.G. Studdert-Kennedy. Cambridge University Press (In press).

MacNeilage, P.F. (1997b) Towards a unified view of cerebral hemispheric specializations in vertebrates. In: Comparative neuropsychology, ed. A.D. Milner. Oxford University Press (In press).

MacNeilage, P.F. & Davis, B.L. (1996) From babbling to first words: Phonetic patterns. Proceedings of the first ECA tutorial and research workshop on speech production modelling. Autrans, France, 155-157.

MacNeilage, P.F., Studdert-Kennedy, M.G. & Lindblom, B. (1984) Functional precursors to language and its lateralization. American Journal of Physiology, 246, (Regulatory Integrative and Comparative Physiology 15) R 912-915.

MacNeilage, P.F., Studdert-Kennedy, M.G. & Lindblom, B. (1985) Planning and production of speech: An overview. In: Planning and production of speech in normally hearing and deaf people, ed. J. Lauter. ASHA Reports, 15-21.

MacNeilage, P.F., Studdert-Kennedy, M.G. & Lindblom, B. (1987) Primate handedness reconsidered. Behavioral and Brain Sciences, 10, 247-263.

Maki, S. (1990) An experimental approach to the postural origins theory of neurobiological asymmetries in primates. Unpublished Ph.D. Dissertation, University of Texas at Austin.

Marler, P. (1977) The structure of animal communication sounds. In: Recognition of complex acoustic signals, ed. T.H. Bullock. Dahlem Konferenzen.

Mayr, E. (1982) The growth of biological thought. Belknap.

Medicus, G. (1992) The inapplicability of the biogenetic rule to behavioral development. Human Development, 35, 1-8.

Meier, R.P., McGarvin, L., Zakia, R.A.E. & Willerman, R. (1997) Silent mandibular oscillations in vocal babbling. Phonetica (In press).

Myers, R.E. (1976) Comparative neurology of vocalization and speech: Proof of a dichotomy. Annals of the New York Academy of Sciences, Vol. 280, 745-757.

Napier, J.R. (1962) The evolution of the hand. Scientific American, 207, 56-62.

Negus, V.E. (1949) The comparative anatomy and physiology of the larynx. Hafner.

Nespoulous, J-L., Lecours, A.R., Lafond, D. & Joanette, Y. (1985) Jargonagraphia with(out) Jargonaphasia. Paper presented at the BABBLE conference, Niagara Falls, March 1985.

Nittrouer, S., Studdert-Kennedy, M.G. & McGowan, R.S. (1989) The emergence of phonetic segments: Evidence from the spectral structure of fricative-vowel syllables spoken by children and adults. Journal of Speech and Hearing Research, 32, 120-132.

Norman, D.A. & Rumelhart, D.E. (1983) Studies of typing from the LNR research group. In: Cognitive aspects of skilled typwriting, ed. W.E. Cooper. Springer-Verlag.

Ohala, J.J. (1978) Phonological notations as models. In: Proceedings of the Twelfth International Congress of Linguists, Innsbruck, eds. W.U. Dressler & W. Meid. Institut fur Sprachwissenschaft der Universitat Innsbruck.

Ojemann, G. (1983) Brain organization for language from the perspective of electrical stimulation mapping. Behavioral and Brain Sciences, 6, 189-230.

Orgogozo, J.M. & Larsen, B. (1979) Activation of the supplementary motor area during voluntary movement in man suggests it works as a supramotor area. Science, 206, 847-850.

Pandya, D. (1987) Association cortex. In: The encyclopedia of neuroscience, ed. G. Adelman. Birkhauser.

Passingham, R.E. (1987) Two cortical systems for directing movement. In: Ciba Foundation Symposium No. 132, eds. G. Bock, M. O'Connor & J. Marsh. Wiley.

Paulesu, E., Frith, C.D. & Frackowiak, R.S.J. (1993) The neural correlates of the verbal component of working memory. Nature, 362, 342-344.

Paus, T., Petrides, M., Evans, A.C. & Meyer, E. (1993) The role of the human anterior cingulate cortex in the control of oculomotor, manual and speech responses: A positron emission tomography study. Journal of Neurophysiology, 70, 453-469.

Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V. & Rizzolatti, G. (1992) Understanding motor events: a neurophysiological study. Experimental Brain Research, 91, 176-180.

Penfield, W. & Jasper, H. (1954) Epilepsy and the functional anatomy of the human brain. Little Brown.

Penfield, W. & Roberts, L. (1959) Speech and brain mechanisms. Princeton University Press.

Penfield, W. & Welch, K. (1951) The supplementary motor area of the cerebral cortex: A clinical and experimental study. A.M.A. Archives of Neurology and Psychiatry, 66, 289-317.

Petersen, S.E., Fox, P.T., Posner, M.I., Mintun, M. & Raichle, M.E. (1988) Positron emission tomographic studies of the cortical anatomy of single word processing. Nature, 331, 585-589.

Petitto, L. & Marentette, P. (1991) Babbling in the manual mode: evidence for the ontogeny of language. Science, 251, 1493-1496.

Pinker, S. & Bloom, P. (1990) Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-784.

Poizner, H., Klima, E. & Bellugi, U. (1987) What the hands reveal about the brain. MIT Press.

Porter, R.J. & Castellanos, F.X. (1980) Speech production measures of speech perception: rapid shadowing of VCV syllables. Journal of the Acoustical Society of America, 67, 1349-1356.

Posner, M.I., Petersen, S.E., Fox, P.T. & Raichle, M.E. (1988) Localization of cognitive operations in the human brain. Science, 240, 1627-1631.

Premack, D. (1986) Gavagai! MIT Press.

Price, C., Wise, R., Howard, D., Patterson, K., Watson, K. & Frackowiak, R.S.J. (1993) The brain regions involved in the recognition of visually presented words. Journal of Cerebral Blood Flow and Metabolism, 13 (Supplement 1) S501.

Redican, W.K. (1975) Facial expressions in nonhuman primates. In: Primate behavior: Developments in field and laboratory research, Vol. 4, ed. L.A. Rosenblum. Academic Press.

Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G. & Matelli, M. (1988) Functional organization of inferior area 6 in the macaque monkey: II. Area F5 and the control of distal movements. Experimental Brain Research, 71, 491-507.

Rizzolatti, G., Matelli, M. & Pavesi, G. (1983) Deficits in attention and movement following the removal of postarcuate (area 6) and prearcuate (area 8) cortex in macaque monkeys. Brain, 106, 655-673.

Robinson, B.W. (1976) Limbic influences on human speech. Annals of the New York Academy of Sciences, Vol. 280, 761-771.

Robinson, J.G. (1979) An analysis of the organization of vocal communication in the titi monkey (Callicebus moloch). Zeitschrift fur Tierpsychologie, 49, 381-405.

Roland, P. (1993) Brain activation. Wiley-Liss.

Rossignol, S., Lund, J.P. & Drew, T. (1988) The role of sensory inputs in regulating patterns of rhythmical movements in higher vertebrates: A comparison between locomotion, respiration and mastication. In: Neural control of rhythmic movements in vertebrates, eds. A. Cohen, S. Rossignol & S. Grillner. Wiley.

Rumelhart, D.E. & Norman, D.A. (1982) Simulating a skilled typist: a study of skilled cognitive-motor performance. Cognitive Science, 6, 1-36.

Searleman, A. (1980) Subject variables and cerebral organization for language. Cortex, 16, 239-254.

Sergent, J., Zuck, E., Levesque, M. & Macdonald, B. (1992) Positron emission tomography study of letter and object processing: Empirical findings and methodological considerations. Cerebral Cortex, 2, 68-80.

Shallice, T. (1988) From neuropsychology to mental structure. Cambridge University Press.

Shattuck-Hufnagel, S. (1979) Speech errors as evidence for a serial ordering mechanism in sentence production. In: Sentence processing: Psycholinguistic studies presented to Merrill Garrett, eds. W.E. Cooper & E.C.T. Walker. Erlbaum.

Shattuck-Hufnagel, S. (1980) Speech units smaller than the syllable. Journal of the Acoustical Society of America, 72 (Suppl. 1).

Shattuck-Hufnagel, S. & Klatt, D.H. (1979) The limited use of distinctive features and markedness in speech production: Evidence from speech error data. Journal of Verbal Learning and Verbal Behavior, 18, 41-55.

Stengel, E. & Lodge Patch, I.C. (1955) 'Central' aphasia associated with parietal symptoms. Brain, 78, 401-416.

Stevens, K.N. (1989) On the quantal nature of speech. Journal of Phonetics, 17, 3-46.

Stuss, D.T. & Benson, D.F. (1986) The frontal lobes. Raven Press.

Talairach, J. & Tournoux, P. (1988) Co-planar stereotaxic atlas of the human brain. 3-dimensional proportional system: An approach to cerebral imaging. Translated by Mark Rayport. Thieme Medical Publishers Inc.

Tanji, J., Shima, Y., Matsuzaka, Y. & Halsband, U. (1995) Neuronal activity in the supplementary, presupplementary, and premotor cortex of monkey. In: Functions of the cortico-basal ganglia loop, eds. M. Kimura & A.M. Graybiel. Springer.

Thelen, E. (1981) Rhythmical behavior in infancy: An ethological perspective. Developmental Psychology, 17, 237-257.

Tobias, P. (1987) The brain of Homo habilis: A new level of organization in cerebral evolution. Journal of Human Evolution, 16, 741-761.

Watson, R.T., Fleet, W.S., Gonzales-Rothi, L. & Heilman, K.M. (1986) Apraxia and the supplementary motor area. Archives of Neurology, 43, 787-792.

Wise, R.J., Chollet, F., Hadar, U., Friston, K., Hoffner, E. & Frackowiak, R. (1991) Distribution of cortical neural networks involved in word comprehension and word retrieval. Brain, 114, 1803-1817.

Wolff, P.H. (1967) The role of biological rhythms in psychological development. Bulletin of the Menninger Clinic, 31, 197-218.

Wolff, P.H. (1968) Stereotypic behavior and development. Canadian Psychologist, 9, 474-483.

Woolsey, C.N. (1958) Organization of somatic sensory and motor areas of the cerebral cortex. In: Biological and biochemical bases of behavior, eds. H.F. Harlow & C.N. Woolsey. University of Wisconsin Press.

Woolsey, C.N., Erickson, T.C. & Gilson, W.E. (1979) Localization in somatic sensory and motor areas of human cerebral cortex as determined by direct recording of evoked potentials and electrical stimulation. Journal of Neurosurgery, 51, 476-506.

Zatorre, R.J., Evans, A.C., Meyer, E. & Gjedde, A. (1992) Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846-849.

Zlatic, L., MacNeilage, P.F., Matyear, C. & Davis, B.L. (1997) Babbling of twins in a bilingual environment. Applied Psycholinguistics (In press).