To be published in Behavioral and Brain Sciences (in press)

© Cambridge University Press 2008

 

Below is the unedited, uncorrected final draft of a BBS target article that has been accepted for publication. This preprint has been prepared for potential commentators who wish to nominate themselves for formal commentary invitation. Please DO NOT write a commentary until you receive a formal invitation. If you are invited to submit a commentary, a copyedited, corrected version of this paper will be posted.

 

 

 

 

 

Language as Shaped by the Brain

 

 

 

 

Morten H. Christiansen
Department of Psychology
Cornell University
Ithaca, NY 14853
USA
and
Santa Fe Institute
1399 Hyde Park Road
Santa Fe, NM 87501
USA
email: mhc27@cornell.edu

Nick Chater
Division of Psychology and Language Sciences
University College London
London, WC1E 6BT
UK
email: n.chater@ucl.ac.uk

 

 

 


Short abstract

It is widely assumed that human learning and the structure of human languages are intimately related. It is typically argued that this relationship is rooted in a language-specific biological endowment, which encodes universal, but communicatively arbitrary, principles of language structure (a universal grammar). We argue instead that the mesh between learners and languages arises because language has been shaped to fit the human brain, rather than vice versa. If so, then apparently arbitrary aspects of linguistic structure may result from general learning and processing biases.

 

 

 

Long abstract

It is widely assumed that human learning and the structure of human languages are intimately related. This relationship is frequently suggested to derive from a language-specific biological endowment, which encodes universal, but communicatively arbitrary, principles of language structure (a universal grammar or UG). How might such a UG have evolved? We argue that UG could not have arisen either by biological adaptation or non-adaptationist genetic processes, resulting in a logical problem of language evolution. Specifically, as the processes of language change are much more rapid than processes of genetic change, language constitutes a “moving target” both over time and across different human populations, and hence cannot provide a stable environment to which language genes could have adapted. We conclude that a biologically determined UG is not evolutionarily viable. Instead, the original motivation for UG—the mesh between learners and languages—arises because language has been shaped to fit the human brain, rather than vice versa. Following Darwin, we view language itself as a complex and interdependent “organism,” which evolves under selectional pressures from human learning and processing mechanisms. That is, languages themselves are shaped by severe selectional pressure from each generation of language users and learners. This suggests that apparently arbitrary aspects of linguistic structure may result from general learning and processing biases deriving from the structure of thought processes, perceptuo-motor factors, cognitive limitations, and pragmatics.

 

 

 

 

Keywords: biological adaptation, cultural evolution, grammaticalization, language acquisition, language evolution, linguistic change, natural selection, universal grammar

 


1. Introduction

 

Natural language constitutes one of the most complex aspects of human cognition, yet children already have a good grasp of their native language before they can tie their shoes or ride a bicycle. The relative ease of acquisition suggests that when the child makes a “guess” about the structure of language on the basis of apparently limited evidence, she has an uncanny tendency to guess right. This strongly suggests that there must be a close relationship between the mechanisms by which the child acquires and processes language, and the structure of language itself.

 

What is the origin of this presumed close relationship between the mechanisms children use in acquisition and the structure of language? One view is that specialized brain mechanisms specific to language acquisition have evolved over long periods of natural selection (e.g., Pinker & Bloom, 1990). A second view rejects the idea that these specialized brain mechanisms have arisen through adaptation, and assumes that they have emerged through some non-adaptationist route, just as it has been argued that many biological structures are not the product of adaptation (e.g., Bickerton, 1995; Gould, 1993; Jenkins, 2000; Lightfoot, 2000). Both these viewpoints put the explanatory emphasis on brain mechanisms specialized for language—and ask how they have evolved.

 

In this paper, we develop and argue for a third view, which takes the opposite starting point. It asks not, Why is the brain so well suited to learning language?, but instead, Why is language so well suited to being learned by the brain? We propose that language has adapted through gradual processes of cultural evolution to be easy to learn to produce and understand. Thus the structure of human language must inevitably be shaped around human learning and processing biases deriving from the structure of our thought processes, perceptuo-motor factors, cognitive limitations, and pragmatic constraints. Language is easy for us to learn and use, not because our brains embody knowledge of language, but because language has adapted to our brains. Following Darwin (1900), we argue that it is useful metaphorically to view languages as “organisms”, i.e., highly complex systems of interconnected constraints, that have evolved in a symbiotic relationship with humans. According to this view, whatever domain-general learning and processing biases people happen to have will tend to become embedded in the structure of language—because it will be easier to learn to understand and produce languages, or specific linguistic forms, that fit these biases.

 

We start by introducing The Logical Problem of Language Evolution, which faces theories proposing that humans have evolved specialized brain mechanisms for language. The following two sections, Evolution of Universal Grammar by Biological Adaptation and Evolution of Universal Grammar by Non-adaptationist Means, evaluate adaptationist and non-adaptationist explanations of language evolution, concluding that both face insurmountable theoretical obstacles. Instead, we present an alternative perspective, Language as Shaped by the Brain, in which language is treated as an evolutionary system in its own right, adapting to the human brain. The next two sections, Constraints on Language Structure and How Constraints Shape Language over Time, discuss what biases have shaped language evolution and how these can be observed in language change mediated by cultural transmission. Finally, in the Scope of the Argument, we consider the wider implications of our theory of language evolution, including a radical recasting of the problem of language acquisition.

 

2. The Logical Problem of Language Evolution

 

For a period spanning three decades, Chomsky (1965, 1972, 1980, 1986, 1988, 1993) has argued that a substantial innate endowment of language-specific knowledge is necessary for language acquisition. These constraints form a Universal Grammar (UG); that is, a collection of grammatical principles that hold across all human languages. In this framework, a child’s language ability gradually unfolds according to a genetic blueprint in much the same way as a chicken grows a wing (Chomsky, 1988). The staunchest proponents of this view even go as far as to claim that “doubting that there are language-specific, innate computational capacities today is a bit like being still dubious about the very existence of molecules, in spite of the awesome progress of molecular biology” (Piattelli-Palmarini, 1994: p. 335).

 

There is considerable variation in current conceptions of the exact nature of UG, ranging from being close to the Principles and Parameters Theory (PPT; Chomsky, 1981) of pre-minimalist generative grammar (e.g., Crain, Goro & Thornton, 2006; Crain & Pietroski, 2006), to the Simpler Syntax (SS) version of generative grammar proposed by Jackendoff (2002) and colleagues (Culicover & Jackendoff, 2005; Pinker & Jackendoff, 2005), to the Minimalist Program (MP) in which language acquisition is confined to learning a lexicon from which cross-linguistic variation is proposed to arise (Boeckx, 2006; Chomsky, 1995). From the viewpoint of PPT, UG consists of a set of genetically specified universal linguistic principles combined with a set of parameters to account for variations among languages (Crain et al., 2006). Information from the language environment is used during acquisition to determine the parameter settings relevant for individual languages. The SS approach combines elements from construction grammar (e.g., Goldberg, 2006) with more traditional structural principles from generative grammar, including principles relating to phrase structure (X-bar theory), agreement, and case-marking. Along with constraints arising from the syntax-semantics interface, these basic structural principles form part of a universal “toolkit” of language-specific mechanisms, encoded in a genetically specified UG (Culicover & Jackendoff, 2005). By contrast, proponents of MP construe language as a perfect system for mapping between sound and meaning (Chomsky, 1995). In a departure from earlier generative approaches, only recursion (in the form of Merge) is considered to be unique to the human language ability (Hauser, Chomsky & Fitch, 2002).
Variation among languages is now explained in terms of lexical parameterization (Borer, 1984); that is, differences between languages are no longer explained in terms of parameters associated with grammars (as in PPT), but primarily in terms of parameters associated with particular lexical items (though some non-lexical parameters currently remain; Baker, 2001; Boeckx, 2006).

 

Common to these three current approaches to generative grammar is the central assumption that the constraints of UG (whatever their form) are fundamentally arbitrary—i.e., not determined by functional considerations. That is, these principles cannot be explained in terms of learning, cognitive constraints, or communicative effectiveness. For example, consider the principles of binding, which have come to play a key role in generative linguistics (Chomsky, 1981). The principles of binding capture patterns of, among other things, reflexive pronouns (e.g., himself, themselves) and accusative pronouns (him, them, etc.), which appear, at first sight, to defy functional explanation. Consider examples (1)-(4), where the subscripts indicate co-reference, and asterisks indicate ungrammaticality.

 

(1) John_i sees himself_i

(2) *John_i sees him_i

(3) John_i said he_i/j won

(4) *He_i said John_i won

 

In (1), the pronoun himself must refer to John; in (2), the pronoun him cannot. In (3), the pronoun he may refer to John or to another person; in (4), it cannot refer to John. These and many other cases indicate that an extremely rich set of patterns governs the behavior of pronouns, and these patterns appear arbitrary: numerous alternative patterns would, from a functional standpoint, seem to serve equally well. These patterns are instantiated in PPT by the principles of binding theory (Chomsky, 1981), in SS by constraints arising from structural and/or syntax-semantics interface principles (Culicover & Jackendoff, 2005), and in MP by limitations on movement (internal merge; Hornstein, 2001). Independent of their specific formulations, the constraints on binding, while apparently universal across natural languages, are assumed to be arbitrary, and hence may be presumed to be part of the genetically encoded UG.

 

Putative arbitrary universals, such as the restrictions on binding, contrast with functional constraints on language. Whereas the former are hypothesized to derive from the internal workings of a UG-based language system, the latter originate from cognitive and pragmatic constraints related to language acquisition and use. Consider the tendency in English to place long phrases after short ones; for example, as evidenced by so-called “heavy-NP shifts”. In (5), the long (or “heavy”) direct-object noun phrase (NP), the book he had not been able to locate for over two months, appears at the end of the sentence, separated from its canonical postverbal position by the prepositional phrase (PP) under his bed. Both corpus analyses (Hawkins, 1994) and psycholinguistic sentence-production experiments (Stallings, MacDonald & O’Seaghdha, 1998) suggest that (5) is much more acceptable than the standard (or “non-shifted”) version in (6), in which the direct object NP is placed immediately following the verb.

 

(5) John found PP[under his bed] NP[the book he had not been able to locate for over two months].

(6) John found NP[the book he had not been able to locate for over two months] PP[under his bed].

 

Whereas individuals speaking head-initial languages, such as English, tend to prefer short phrases before long, speakers of head-final languages, such as Japanese, have been shown to have the opposite long-before-short preference (Yamashita & Chang, 2001). In both cases, the preferential ordering of long versus short phrases can be explained in terms of minimization of memory load and maximization of processing efficiency (Hawkins, 2004). As such, the patterns of length-induced phrasal reordering are generally considered within generative grammar to be a performance issue related to functional constraints outside the purview of UG (although some functionally-oriented linguists have suggested that these kinds of performance constraints may shape grammar itself; e.g., Hawkins, 1994, 2004). In contrast, the constraints inherent in UG are arbitrary and non-functional in the sense that they do not relate to communicative or pragmatic considerations, nor do they derive from limitations on the mechanisms involved in using or acquiring language. Indeed, some generative linguists have argued that aspects of UG hinder communication (e.g., Chomsky, 2005; Lightfoot, 2000).

 

If we suppose that such arbitrary principles of UG are genetically specified, then this raises the question of the evolutionary origin of this genetic endowment. Two views have been proposed.

 

Adaptationists emphasize a gradual evolution of the human language faculty through natural selection (e.g., Briscoe, 2003; Corballis, 1992, 2003; Dunbar, 2003; Greenfield, 1991; Hurford, 1991; Jackendoff, 2002; Nowak, Komarova & Niyogi, 2001; Pinker, 1994, 2003; Pinker & Bloom, 1990; Pinker & Jackendoff, 2005). Linguistic ability confers added reproductive fitness, leading to a selective pressure for language genes[i]; richer language genes encode increasingly elaborate grammars.

 

Non-adaptationists (e.g., Bickerton, 1995—but see Bickerton, 2003; Chomsky, 1988; Jenkins, 2000; Lightfoot, 2000; Piattelli-Palmarini, 1989) suggest that natural selection only played a minor role in the emergence of language in humans, focusing instead on a variety of alternative possible evolutionary mechanisms by which UG could have emerged de novo (e.g., due to as few as two or three key mutation “events”, Lanyon, 2006).

 

In the next two sections, we argue that both of these views, as currently formulated, face profound theoretical difficulties resulting in a logical problem of language evolution[ii]. This is because, on analysis, it is mysterious how proto-language—which must have been, at least initially, a cultural product likely to be highly variable both over time and geographical locations—could have become genetically fixed as a highly elaborate biological structure. Hence there is no currently viable account of how a genetically encoded UG could have evolved. In subsequent sections, we argue that the brain does not encode principles of UG—and therefore neither adaptationist nor non-adaptationist solutions are required. Instead, language has been shaped by the brain: language reflects pre-existing, and hence non-language-specific, human learning and processing mechanisms.

 

3. Evolution of Universal Grammar by Biological Adaptation

 

The adaptationist position is probably the most widely held view of the origin of UG. We first describe adaptationism in biology and its proposed application to UG before outlining three conceptual difficulties for adaptationist explanations of language evolution.

 

3.1 Adaptation: The very idea

 

Adaptation is a candidate explanation for the origin of any innate biological structure. In general, the idea is that natural selection has favored genes that code for biological structures that increase fitness (in terms of expected numbers of viable offspring).[iii] Typically, a biological structure contributes to fitness by fulfilling some purpose—the heart is assumed to pump blood, the legs to provide locomotion, or UG to support language acquisition. If so, natural selection will generally favor biological structures that fulfill their purpose well, so that, over the generations, hearts will become well adapted to pumping blood, legs well adapted to locomotion, and any presumed biological endowment for language acquisition will become well adapted to acquiring language.

 

Perhaps the most influential statement of the adaptationist viewpoint is by Pinker and Bloom (1990). They argue that “natural selection is the only scientific explanation of adaptive complexity. ‘Adaptive complexity’ describes any system composed of many interacting parts where the details of the parts’ structure and arrangement suggest design to fulfill some function” (p. 709; their emphasis). As another example of adaptive complexity, they refer to the exquisite optical and computational sophistication of the vertebrate visual system. Pinker and Bloom note that such a complex and intricate mechanism has an extremely low probability of occurring by chance. Whatever the influence of non-adaptational factors (see below), they argue that there must additionally have been substantial adaptation to fine-tune a system as complex as the visual system. Given that language appears as complex as vision, Pinker and Bloom conclude that it is also highly improbable that language is entirely the product of non-adaptationist processes (see also Pinker, 2003).

 

The scope and validity of the adaptationist viewpoint in biology are controversial (e.g., Dawkins, 1986; Gould, 2002; Gould & Lewontin, 1979; Hecht Orzack & Sober, 2001); and some theorists have used this controversy to question adaptationist views of the origin of UG (e.g., Bickerton, 1995; Lewontin, 1998). Here, we take a different tack. We argue that, whatever the merits of adaptationist explanation in general, and as applied to vision in particular, the adaptationist account cannot extend to a putative UG.

 

3.2 Why universal grammar could not be an adaptation to language

 

Let us suppose that a genetic encoding of universal properties of language did, as the adaptationist view holds, arise as an adaptation to the environment, here to the linguistic environment. This point of view seems to work most naturally for aspects of language that have a transparent functional value. For example, the compositional character of language (i.e., the ability to express an infinite number of messages using a finite number of lexical items) seems to have great functional advantages. A biological endowment that allows, or perhaps requires, that language has this form appears likely to lead to enhanced communication; and hence to be positively selected. Thus, over time, functional aspects of language might be expected to become genetically encoded across the entire population. But UG, according to Chomsky (e.g., 1980, 1988), consists precisely of linguistic principles that appear highly abstract and arbitrary, i.e., which have no functional significance. To what extent can an adaptationist account of the evolution of a biological basis for language explain how a genetic basis could arise for such abstract and arbitrary properties of language?

 

Pinker and Bloom (1990) provide an elegant approach to this question. They suggest that the constraints imposed by UG, such as the binding constraints mentioned above, can be construed as communication protocols for transmitting information over a serial channel. While the general features of such protocols (e.g., concerning compositionality, or the use of a small set of discrete symbols) may be functionally important, many of the specific aspects of the protocol do not matter, as long as everyone (within a given speech community) adopts the same protocol. For example, when using a modem to communicate between computers, a particular protocol might have features such as odd parity, handshake on, 7 bit, etc. However, there are many other settings that would be just as effective. What is important is that the computers that are to interact adopt the same set of settings—otherwise communication will not be possible. Adopting the same settings is therefore of fundamental functional importance to communication between computers, but the particular choice of settings is not. Similarly, when it comes to the specific features of UG, Pinker and Bloom suggest that “in the evolution of the language faculty, many ‘arbitrary’ constraints may have been selected simply because they defined parts of a standardized communicative code in the brains of some critical mass of speakers” (1990: p. 718)[iv]. Thus, such arbitrary constraints on language can come to have crucial adaptive value to the language-user; genes that favor such constraints will be positively selected. Over many generations, the arbitrary constraints may then become innately specified.

 

We will argue that this viewpoint faces three fundamental difficulties, concerning the dispersion of hominid populations, language change, and the question of what is genetically encoded. We consider these in turn.

 

 

 

3.2.1 Problem 1: The dispersion of human populations

 

Pinker and Bloom’s (1990) analogy with communications protocols, while apt, is something of a double-edged sword. Communications protocols and other technical standards typically diverge rapidly unless there is concerted oversight and enforcement to maintain common standards. Maintaining and developing common standards is an integral part of software and hardware development. In the absence of such pressures for standardization, protocols would rapidly diverge. Given that language presumably evolved without top-down pressures for standardization, divergence between languages seems inevitable. To assume that “universal” arbitrary features of language would emerge from adaptation by separate groups of language users would be analogous to assuming that the same set of specific features for computer communication protocols might emerge from separate teams of scientists, working in separate laboratories (e.g., that different modem designers independently alight on odd parity, handshake on, 7 bit error correction, and so on). Note that this point would apply equally well, even if the teams of scientists emerged from a single group. Once cut off from each other, groups would develop in independent ways. Indeed, in biological adaptation, genes appear to rapidly evolve to deal with a specific local environment. Thus, Darwin observed rich patterns of variations in fauna (e.g., finches) across the Galapagos Islands, and interpreted these variations as adaptation to local island conditions. Hence, if language genes have adapted to local linguistic environments, we should expect a range of different biologically encoded UGs, each specifically adapted to its local linguistic context.
Indeed, one might expect, if anything, that language-genes would diverge especially rapidly—because the linguistic environment in each population is assumed to be itself shaped by the different language-genes in each subpopulation, thus amplifying the differences in the linguistic environment. If so, then people should have, at minimum, some specific predisposition to learn and process languages associated with their genetic lineage. This does not appear to be the case—and it is a key assumption of the generative linguistics perspective that the human language endowment does not vary in this way but is universal across the species (Chomsky, 1980; Pinker, 1994).

 

There is an interesting contrast, here, with the human immune system, which has evolved in response to a very rapidly changing microbial environment. Crucially, the immune system can build new antibody proteins (and the genetic mechanisms from which antibody proteins are constructed) without having to eliminate old antibody proteins (Goldsby, Kindt, Osborne & Kuby, 2003). Therefore, natural selection will operate to enrich the coverage of the immune system (though such progress will not always be cumulative, of course); there is no penalty for the immune system following a fast-moving “target” (defined by the microbial environment). But the case of acquiring genes coding for regularities in language is very different because, at any one time, there is just one language (or at most two or three) that must be acquired; hence a bias that helps learn a language with property P will thereby inhibit learning languages with not-P. The fact that language change is so fast (so that whether the current linguistic environment has property P or not will vary rapidly, on the time scale of biological evolution) means that such biases will, on balance, be counterproductive.

 

Given that the immune system does coevolve with the microbial environment, different co-evolutionary paths have been followed when human populations have diverged. Therefore populations that have co-evolved to their local microbial environment are often poorly adapted to other microbial environments. For example, when Europeans began to explore the New World, they succumbed in large numbers to the diseases they encountered, while conversely, European diseases caused catastrophic collapse in indigenous populations (e.g., Diamond, 1997). If an innate UG had co-evolved with the linguistic environment, similar radically divergent co-evolutionary paths might be expected. Yet, as we have noted, the contrary appears to be the case.

 

The problem of divergent populations arises across a range of different scenarios concerning the relationship between language evolution and the dispersion of human populations. One scenario is that language evolution is recent, and occurred during the dispersion of modern humans (Homo sapiens sapiens). In this case, whether language was discovered once, and then spread throughout human populations, or was discovered in various locations independently, there remains the problem that adaptations to language would not be coordinated across geographically dispersed groups. It is tempting to suggest that all of these sublanguages will, nonetheless, obey universal grammatical principles, thus providing some constancy in the linguistic environment—but this appeal would, of course, be circular, as we are attempting to explain the origin of such principles. We shall repeatedly have to steer around this circularity trap below.  

 

An alternative scenario is that language evolution pre-dates the dispersion of modern humans. If so, then it is conceivable that prior dispersions of hominid populations, perhaps within Africa, did lead to the emergence of diverse languages and diverse UGs, adapted to learning and processing such languages, and then that subsequently, one local population proved to be adaptively most successful, and came to displace other hominid populations. Thus, on this account, our current UG might conceivably be the only survivor of a larger family of such UGs due to a population “bottleneck”; the universality of UG would arise, then, because it was genetically encoded in the sub-population from which modern humans descended[v]. This viewpoint is not without difficulties. Some interpretations of the genetic and archaeological evidence suggest that the last bottleneck in human evolution occurred between 500,000 and 2,000,000 years ago (e.g., Hawks, Hunley, Lee & Wolpoff, 2000); few researchers in language evolution believe that language, in anything like its modern form, is this old. Moreover, even if we assume a more recent bottleneck, any such bottleneck must at least predate the 100,000 years or so since the geographical dispersion of human populations, and 100,000 years still seems to provide sufficient time for substantial linguistic divergence to occur. Given that the processes of genetic adaptation to language most likely would continue to operate[vi], different genetic bases for language would be expected to evolve across geographically separated populations. That is, the evolution of UG by adaptation would appear to require rapid adaptations for language prior to the dispersion of human populations, followed by an abrupt cessation of such adaptation, for a long period after dispersion. The contrast between the evolution of the putative “language organ” and that of biological processes, such as digestion, is striking.
The digestive system is evolutionarily very old, and many orders of magnitude older than the recent divergence of human populations. Nonetheless, digestion appears to have adapted in important ways to recent changes in the dietary environment; for example, with apparent coevolution of lactose tolerance and the domestication of milk-producing animals (Beja-Pereira et al., 2003).

 

 

 

3.2.2 Problem 2: Language change

 

Whatever the timing of the origin of language and hominid dispersion, the thesis that a genetically encoded UG arose through adaptation faces a second problem: that, even within a single population, linguistic conventions change rapidly. Hence the linguistic environment over which selectional pressures operate presents a “moving target” for natural selection. If linguistic conventions change more rapidly than genes change via natural selection, then genes that encode biases for particular conventions will be eliminated—because, as the language changes, the biases will be incorrect, and hence decrease fitness. More generally, in a fast changing environment, phenotypic flexibility to deal with various environments will typically be favored over genes that bias the phenotype narrowly toward a particular environment. Again, there is a tempting counter-argument—that the linguistic principles of UG will not change, and hence these aspects of language will provide a stable linguistic environment over which adaptation can operate. But, of course, this argument falls into the circularity trap, because the genetic endowment of UG is proposed to explain language universals; so it cannot be assumed that the language universals pre-date the emergence of the genetic basis for UG.

 

Christiansen, Reali and Chater (2006) illustrate the problems raised by language change in a series of computer simulations. They assume the simplest possible set-up: that (binary) linguistic principles and language “genes” stand in one-to-one correspondence. Each gene has three alleles—one biased in favor of each version of the corresponding principle, and one neutral allele[vii]. Agents learn the language by trial-and-error, where their guesses are biased according to which alleles they have. The fittest agents are allowed to reproduce, and a new generation of agents is produced by sexual recombination and mutation. When the language is fixed, there is a selection pressure in favor of the “correctly” biased genes, and these rapidly come to dominate the population, as illustrated by Figure 1. This is an instance of the Baldwin effect (Baldwin, 1896; for discussion see Weber & Depew, 2003) in which information that is initially learned becomes encoded in the genome. A frequently cited example of the Baldwin effect is the development of calluses on the keels and sterna of ostriches (Waddington, 1942). The proposal is that calluses are initially developed in response to abrasion where the keel and sterna touch the ground during sitting. Natural selection then favored individuals that could develop calluses more rapidly, until callus development became triggered within the embryo and could occur without environmental stimulation. Pinker and Bloom suggest that the Baldwin effect in a similar way could be the driving force behind the adaptation of UG. Natural selection will favor learners who are genetically disposed rapidly to acquire the language to which they are exposed. Hence, over many generations this process will lead to a genetically specified UG.

 

 

Figure 1. The effect of linguistic change on the genetic encoding of arbitrary linguistic principles. Results are shown from a simulation with a population size of 100 agents, a genome size of 20, survival of the top 50% of the population, and starting with 50% neutral alleles. When there is no linguistic change, alleles encoding specific aspects of language emerge quickly—i.e., a Baldwin effect occurs—but when language is allowed to change, neutral alleles become more advantageous. Similar results were obtained across a wide range of different simulation parameters (Adapted from Christiansen, Reali & Chater, 2006).

 

However, when language is allowed to change (e.g., due to exogenous forces such as language contact), the effect reverses—biased genes are severely selected against when they are inconsistent with the linguistic environment, and neutral genes come to dominate the population. The selection in favor of neutral genes occurs even for low levels of language change (i.e., the effect occurs, to some degree, even if language change equals the rate of genetic mutation). But, of course, linguistic change (prior to any genetic encoding) is likely to have been much faster than genetic change. After all, in the modern era, language change has been astonishingly rapid, leading, for example, to the wide phonological and syntactic diversity of the Indo-European language group, from a common ancestor about 10,000 years ago (Gray & Atkinson, 2003). Language in hunter-gatherer societies changes at least as rapidly. Papua New Guinea, settled within the last 50,000 years, has an estimated one-quarter of the world’s languages. These are enormously linguistically diverse, and most originate in hunter-gatherer communities (Diamond, 1992)[viii]. Thus, from the point of view of natural selection, it appears that language, like other cultural adaptations, changes far too rapidly to provide a stable target over which natural selection can operate. Human language learning therefore may be analogous to typical biological responses to high levels of environmental change—i.e., to develop general-purpose strategies which apply across rapidly-changing environments, rather than specializing to any particular environment. This strategy appears to have been used, in biology, by “generalists” such as cockroaches and rats, in contrast, for example, to pandas and koalas, which are adapted to extremely narrow environmental niches.
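The set-up described above can be illustrated with a minimal Python sketch. It is not the published model. The population size (100), genome size (20), survival of the top 50%, and 50% neutral starting alleles follow the Figure 1 caption; the bias strength, mutation rate, trial cap, number of generations, and the rule that each principle flips independently with probability `change_rate` per generation are illustrative assumptions, as are all function names.

```python
import random

NEUTRAL = None  # neutral allele; the two biased alleles are 0 and 1


def random_genome(rng, genome_size, p_neutral):
    """One gene per binary principle: neutral, or biased toward 0 or 1."""
    return [NEUTRAL if rng.random() < p_neutral else rng.randint(0, 1)
            for _ in range(genome_size)]


def trials_to_learn(rng, genome, language, bias=0.95, max_trials=50):
    """Total trial-and-error guesses needed to learn every principle.

    bias and max_trials are assumed values. Fewer trials means fitter.
    """
    total = 0
    for allele, target in zip(genome, language):
        if allele is NEUTRAL:
            p_correct = 0.5
        elif allele == target:
            p_correct = bias
        else:
            p_correct = 1.0 - bias
        trials = 1
        while rng.random() >= p_correct and trials < max_trials:
            trials += 1
        total += trials
    return total


def next_generation(rng, population, language, survival=0.5, mutation=0.01):
    """Keep the fastest learners, then recombine and mutate (assumed rate)."""
    ranked = sorted(population, key=lambda g: trials_to_learn(rng, g, language))
    parents = ranked[:max(2, int(len(ranked) * survival))]
    children = []
    while len(children) < len(population):
        mother, father = rng.sample(parents, 2)
        # Sexual recombination: each gene comes from one parent at random.
        child = [rng.choice(pair) for pair in zip(mother, father)]
        # Mutation: occasionally redraw an allele from the three options.
        child = [rng.choice([NEUTRAL, 0, 1]) if rng.random() < mutation else a
                 for a in child]
        children.append(child)
    return children


def simulate(change_rate, generations=200, pop_size=100, genome_size=20, seed=0):
    """Return the final fraction of neutral alleles in the population."""
    rng = random.Random(seed)
    language = [rng.randint(0, 1) for _ in range(genome_size)]
    population = [random_genome(rng, genome_size, 0.5) for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(rng, population, language)
        # Exogenous change: each principle flips with probability change_rate.
        language = [1 - v if rng.random() < change_rate else v for v in language]
    neutral = sum(a is NEUTRAL for genome in population for a in genome)
    return neutral / (pop_size * genome_size)
```

Comparing `simulate(0.0)` with `simulate(0.1)` gives the final share of neutral alleles under a fixed and a changing language; the exact values depend on the assumed parameters and the random seed.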

 

A potential limitation of our argument so far is that we have assumed that changes in the linguistic environment are “exogenous.” But many aspects of language change may be “endogenous,” i.e., may arise because the language is adapting due to selection pressures from learners, and hence their genes. Thus, one might imagine the following argument: suppose there is a slight, random, genetic preference for languages with feature A rather than B. Then this may influence the language spoken by the population to have feature A, and this may in turn select for genes that favor the feature A[ix]. Such feedback might, in principle, serve to amplify small random differences into, ultimately, rigid arbitrary language universals. However, as Figure 2 illustrates, when linguistic change is genetically influenced, rather than random, it turns out that, while this amplification effect can occur, leading to a Baldwin effect, it does not emerge from small random fluctuations. Instead, it only occurs when language is initially strongly influenced by genes. But if arbitrary features of language would have to be predetermined strongly by the genes from the very beginning, then this leaves little scope for subsequent operation of the Baldwin effect as envisioned by Pinker and Bloom.

 

 Figure 2. The Baldwin effect, where genes influence language: the role of population influence (i.e., genetic “feedback”) on the emergence of the Baldwin effect for language-relevant alleles when language is allowed to change 10 times faster than biological change. Only when the pressure from the learners’ genetic biases is very high (~50%) can the Baldwin effect overcome linguistic change. (Adapted from Christiansen, Reali & Chater, 2006).
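One hypothetical way to express genetic feedback on the language is sketched below in Python. The source does not specify the update rule, so everything here is an assumption made for illustration: the function names, the `influence` parameter (the probability that a changing principle adopts the population's majority bias), and the choice to flip the principle otherwise.

```python
import random
from collections import Counter

NEUTRAL = None  # neutral allele; the two biased alleles are 0 and 1


def majority_bias(population, locus):
    """Most common biased allele at a locus, or None if there is no majority."""
    counts = Counter(g[locus] for g in population if g[locus] is not NEUTRAL)
    if counts[0] == counts[1]:
        return None
    return 0 if counts[0] > counts[1] else 1


def update_language(rng, language, population, change_rate, influence):
    """Change each principle with probability change_rate.

    When a principle changes, it follows the learners' majority bias with
    probability `influence` (assumed feedback rule); otherwise it flips.
    """
    new_language = []
    for locus, value in enumerate(language):
        if rng.random() < change_rate:
            preferred = majority_bias(population, locus)
            if preferred is not None and rng.random() < influence:
                value = preferred
            else:
                value = 1 - value
        new_language.append(value)
    return new_language
```

With `influence` set to 0 the change is purely exogenous; raising it toward 1 lets the population's genes pull the language toward their own biases.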

 

3.2.3. Problem 3: What is genetically encoded?

 

Even if the first two difficulties for adaptationist accounts of UG could be solved, the view still faces a further puzzle: why is it that genetic adaptation occurred only to very abstract properties of language, rather than also occurring to its superficial properties? Given the spectacular variety of surface forms of the world’s languages, in both syntax (including every combination of basic orderings of subject, verb and object, and a wide variety of less constrained word orders) and phonology (including tone and click languages, for example), why did language genes not adapt to these surface features?[x] Why should genes become adapted to capture the extremely rich and abstract set of possibilities countenanced by the principles of UG, rather than merely encoding the actual linguistic possibilities in the specific language that was being spoken (i.e., the phonological inventory and particular morphosyntactic regularities of the early click-language, from which the Khoisan family originated and which might be the first human language; e.g., Pennisi, 2004)? The unrelenting abstractness of the universal principles makes them difficult to reconcile with an adaptationist account.

 

One of the general features of biological adaptation is that it is driven by the constraints of the immediate environment. It can have no regard for distant or future environments that might one day be encountered. For example, the visual system is highly adapted to the laws of optics as they hold in normal environments. Thus, human vision mis-estimates the length of a stick in water, because it does not correct for the refraction of light through water (a situation not commonly encountered in the human visual world). By contrast, the visual system of the archerfish, which must strike airborne flies with a water jet from below the water surface, does make this correction (Rossel, Corlija & Schuster, 2002). Biological adaptation produces systems designed to fit the environment to which adaptation occurs; there is, of course, no selectional pressure to fit environments that have not occurred, or that might occur only at some point in the future. Hence, if a UG did adapt to a past linguistic environment, it would seem inevitable that it would adapt to that language environment as a whole: thus adapting to its specific word order, phonotactic rules, inventory of phonemic distinctions, and so on. In particular, it seems very implausible that an emerging UG would be selected primarily for extremely abstract features, which apply equally to all possible human languages (not just the language evident in the linguistic environment in which selection operates). This would be analogous to an animal living in a desert environment somehow developing adaptations that are not specific to desert conditions, but that are equally adaptive in all terrestrial environments.

 

The remarkable abilities of the young indigo bunting to use stars for navigational purposes—even in the absence of older birds to lead the way—might at first seem to counter this line of reasoning (e.g., Hauser, 2001; Marcus, 2004). Every autumn this migratory bird uses the location of Polaris in the night sky to fly from its summer quarters in the Northeast United States to its winter residence in the Bahamas. As demonstrated by Emlen (1970), the indigo bunting uses celestial rotation as a reference axis to discover which stars point to true north. Thus, when Emlen raised young fledglings in a planetarium that was modified to rotate the night sky around Betelgeuse, the birds oriented themselves as if north was in the direction of this bright star. Crucially, what has become genetically encoded is not a star map, because star constellations change over evolutionary time and thus form moving targets, but instead that which is stable: that stationary stars indicate the axis of earth’s rotation, and hence true north.

 

Similarly, it is tempting to claim that the principles of UG are just those that are invariant across languages, whereas contingent aspects of word order or phonology will vary across languages. Thus, one might suggest that only the highly abstract, language-universal, principles of UG will provide a stable basis upon which natural selection can operate. But this argument is again, of course, a further instance of the circularity trap. We are trying to explain how a putative UG might become genetically fixed, and hence we cannot assume UG is already in place. Thus, this counterargument is blocked.

 

We are not, of course, arguing that abstract structures cannot arise by adaptation. Indeed, abstract patterns, such as the body plan of mammals or birds, are conserved across species, and constitute a complex and highly integrated system. Notice, though, that such abstract structures are still tailored to the specific environment of each species. Thus, while bats, whales, and cows have a common abstract body plan, these species embody dramatically different instantiations of this pattern, adapted to their ecological niches in the air, in water, or on land. Substantial modifications of this kind can occur quite rapidly, due to changes in a small number of genes and/or their pattern of expression. For example, the differing beak shape in Darwin’s finches, adapted to different habitats in the Galapagos Islands, may be largely determined by as few as two genes: BMP4, the expression of which is associated with the width as well as depth of beaks (Abzhanov, Protas, Grant, Grant & Tabin, 2004), and CaM, the expression of which is correlated with beak length (Abzhanov et al., 2006). Again, these adaptations are all related closely to the local environment in which an organism exists. In contrast, adaptations for UG are hypothesized to be for abstract principles holding across all linguistic environments, with no adaptation to the local environment of specific languages and language users.

 

In summary, Pinker and Bloom (1990), as we have seen, draw a parallel between the adaptationist account of the development of the visual system, and an adaptationist account of a putative language faculty. But the above arguments indicate that the two cases are profoundly different. The principles of optics, and the structure of the visual world, have many invariant features across environments (e.g., Simoncelli & Olshausen, 2001), but the linguistic environment is vastly different from one population to another. Moreover, the linguistic environment, unlike the visual environment, will itself be altered in line with any genetic changes in the propensity to learn and use languages, thus amplifying differences between linguistic environments further. We conclude, then, that linguistically-driven biological adaptation cannot underlie the evolution of language.

 

It remains possible, though, that the development of language did have a substantial impact on biological evolution. The arguments given here merely preclude the possibility that linguistic conventions that would originally differ across different linguistic environments could somehow become universal across all linguistic communities, by virtue of biological adaptation to the linguistic environment. This is because, in the relevant respects, the linguistic environment for the different populations is highly variable, and hence any biological adaptations could only serve to entrench such differences further. But there might be features that are universal across linguistic environments that might lead to biological adaptation (such as the means of producing speech; Lieberman, 1984; or the need for enhanced memory capacity, or complex pragmatic inferences; Givón & Malle, 2002). However, these language features are likely to be functional, i.e., they facilitate language use—and thus would typically not be considered part of UG.

 

It is consistent with our arguments that the emergence of language influenced biological evolution in a more indirect way. The possession of language might have fundamentally changed the patterns of collective problem solving and other social behavior in early humans, with a consequent shift in the selectional pressures on humans engaged in these new patterns of behavior. But universal, arbitrary constraints on the structure of language cannot emerge from biological adaptation to a varied pattern of linguistic environments. Thus, the adaptationist account of the biological origins of UG cannot succeed.

 

 

 

 

4. Evolution of Universal Grammar by Non-adaptationist Means

 

Some theorists advocating a genetically-based UG might concur with our arguments against adaptationist accounts of language evolution. For instance, Chomsky (1972, 1988, 1993) has for more than two decades expressed strong doubts about neo-Darwinian explanations of language evolution, hinting that UG may be a by-product of increased brain size or yet unknown physical or biological evolutionary constraints. Further arguments for a radically non-adaptationist perspective have been advanced by Jenkins (2000), Lanyon (2006), Lightfoot (2000), and Piattelli-Palmarini (1989, 1994).

 

Non-adaptationists typically argue that UG is both highly complex, and radically different from other biological machinery (though see Hauser et al., 2002). They suggest, moreover, that UG appears to be so unique in terms of structure and properties, that it is unlikely to be a product of natural selection amongst random mutations. However, we argue that non-adaptationist attempts to explain a putative language-specific genetic endowment also fail.

 

To what extent can any non-adaptationist mechanism account for the development of a genetically encoded UG, as traditionally conceived? In particular, can such mechanisms account for the appearance of genetically specified principles that are presumed to be (a) idiosyncratic to language, and (b) of substantial complexity? We argue that the probability that non-adaptationist factors played a substantial role in the evolution of UG is vanishingly small.

 

The argument involves a straightforward application of information theory. Suppose that the constraints embodied in UG are indeed language-specific, and hence do not emerge as side-effects of existing processing mechanisms. This means that UG would have to be generated at random by non-adaptationist processes. Suppose further that the information required to specify a language acquisition device, so that language can be acquired and produced, over and above the pre-linguistic biological endowment, can be represented as a binary string of N bits (this particular coding assumption is purely for convenience). Then the probability of generating this sequence of N bits by chance is 2^(-N). If the language-specific information could be specified using a binary string that would fit on one page of normal text (which would presumably be a considerable underestimate, from the perspective of most linguistic theory), then N would be over 2500. Hence the probability of generating the grammar by a random process would be less than 2^(-2500). So to generate this machinery by chance (i.e., without the influence of the forces of adaptation) would be expected to require of the order of 2^2500 individuals. But the total population of humans over the last two million or so years, including the present, is measured in billions, and is much smaller than 2^35. Hence, it is astronomically unlikely that non-adaptationist mechanisms would “chance” upon a specification of a language organ or language instinct[xi].
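The arithmetic behind this argument can be checked with a short Python sketch. It takes the text's own figures (N = 2500 bits and a total human population below 2 to the power 35) and adds one assumption of its own: each individual is treated as a single independent random draw, so that a union bound applies. The variable names are hypothetical.

```python
import math
from fractions import Fraction

N_BITS = 2500          # "one page" of language-specific information (from the text)
POPULATION_LOG2 = 35   # the text's bound: fewer than 2**35 humans in total

# Probability that one random draw produces the required N-bit string.
p_single = Fraction(1, 2 ** N_BITS)

# Union bound: the chance that any of 2**35 independent draws succeeds
# is at most 2**35 * 2**(-2500) = 2**(-2465).
p_any = (2 ** POPULATION_LOG2) * p_single

# The same bound expressed as a power of ten, for readability.
log10_p_any = (POPULATION_LOG2 - N_BITS) * math.log10(2)

print(f"upper bound on success probability: 2^{POPULATION_LOG2 - N_BITS}")
print(f"which is about 10^{log10_p_any:.1f}")
```

Exact rational arithmetic is used because a probability this small underflows to zero in floating point; only the final power-of-ten summary relies on logarithms.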

 

It is sometimes suggested, apparently in the face of this type of argument, that the recent evolutionary-developmental biology literature has revealed how local genetic changes, e.g., on homeobox genes, can influence the expression of other genes, and through a cascade of developmental influences, result in extensive phenotypic consequences (e.g., Gerhart & Kirschner, 1997; Laubichler & Maienschein, 2007). Yet if UG arises from a small “tweak” to pre-linguistic cognitive machinery, then general cognitive machinery will provide the vast bulk of the explanation of language structure; without this machinery, the impact of the tweak would be impossible to understand. Thus, the vision of universal grammar as a language-specific innate faculty or language organ would have to be retracted. But the idea that a simple tweak might lead to a complex, highly interdependent, and intricately organized system, such as the putative UG, is highly implausible. Small genetic changes lead to modifications of existing complex systems (and these modifications can be quite far-reaching); they do not lead to the construction of new complexity. Thus, a mutation might lead to an insect having an extra pair of legs, and a complex set of genetic modifications (almost certainly under strong and continuous selectional pressure) may modify a leg into a flipper, but no single gene creates an entirely new means of locomotion, from scratch. The whole burden of the classic arguments for UG is that UG is both highly organized and complex, and utterly distinct from general cognitive principles. Thus, the emergence of a putative UG requires the construction of a new complex system, and the argument sketched above notes that the probability of even modest new complexity arising by chance is astronomically low.

 

The implication of this argument is that it is extremely unlikely that substantial quantities of linguistically idiosyncratic information have been specified by non-adaptationist means. Indeed, the point applies more generally to the generation of any complex, functional biological structures. Thus, it is not clear how any non-adaptationist account can explain the emergence of something as intricately complex as UG. 

 

Some authors who express skepticism concerning the role of adaptation implicitly recognize this kind of theoretical difficulty. Instead, many apparently complex and arbitrary aspects of cognition and language are suggested to have emerged out of the constraints on building any complex information processing system, given perhaps currently unknown physical and biological constraints (e.g., Chomsky, 1993; see Kauffman, 1995, for a related viewpoint on evolutionary processes). A related perspective is proposed by Gould (1993), who views language as a spandrel—i.e., as emerging as a byproduct of other cognitive processes. Another option would be to appeal to exaptation (Gould & Vrba, 1982) whereby a biological structure that was originally adapted to serve one function is put to use to serve a novel function. Yet the non-adaptationist attracted by these or other non-adaptationist mechanisms is faced with a dilemma. If language can emerge from general physical, biological or cognitive factors, then the complexity and idiosyncrasy of UG is illusory; language emerges from general non-linguistic factors, a conclusion entirely consistent with the view we advocate here. If, by contrast, UG is maintained to be sui generis and not readily derivable from general processes, the complexity argument bites: i.e., the probability of a new and highly complex adaptive system emerging by chance is astronomically low.

 

The dilemma is equally stark for the non-adaptationist who attempts to reach for other non-adaptationist mechanisms of evolutionary change. There are numerous mechanisms that amount to random perturbations (from the point of view of the construction of a highly complex adaptive system) (Schlosser & Wagner, 2004). These include genetic drift (Suzuki, Griffiths, Miller & Lewontin, 1989), the random fluctuations in gene frequencies in a population; genetic hitch-hiking (Maynard-Smith, 1978), a mechanism by which non-selected genes “catch a ride” with another gene (nearby on the chromosome) that was subject to selection; epigenesis (Jablonka & Lamb, 1989), which causes heritable cell changes due to environmental influences but without corresponding changes to the basic DNA sequences of that cell; horizontal genetic transfer (Syvanen, 1985) by which genetic material shifts from one species to another; and transposons (McClintock, 1950), mobile genetic elements that can move around in different positions within the genome of a cell and thus alter its phenotype. Each of these mechanisms provides a richer picture of the mechanisms of evolutionary change—but provides no answer to the question of how novel and highly complex adaptive systems, such as the putative UG, might emerge de novo. However, if language is viewed as embodying novel complexity, then the emergence of this complexity by non-adaptationist (and hence, from an adaptive point of view, random) mechanisms is astronomically unlikely.

 

We may seem to be faced with a paradox. It seems clear that the mechanisms involved in acquiring and processing language are enormously intricate and moreover intimately connected to the structure of natural languages. The complexity of these mechanisms rules out, as we have seen in this section, a non-adaptationist account of their origin. However, if these mechanisms arose through adaptation, this adaptation cannot, as we argued above, have been adaptation to language. But if the mechanisms that currently underpin language acquisition and processing were originally adapted to carry out other functions, then how is their apparently intimate relationship with the structure of natural language to be explained? How, for example, are we to explain that the language acquisition mechanisms seem particularly well-adapted to learning natural languages, but not to any of a vast range of conceivable non-natural languages (e.g. Chomsky, 1980)? As we now argue, the paradox can be resolved if we assume that the “fit” between the mechanisms of language acquisition and processing, on the one hand, and natural language, on the other, has arisen because natural languages themselves have “evolved” to be as easy to learn and process as possible: language has been shaped by the brain, rather than vice versa.

 

5. Language as Shaped by the Brain

 

We propose, then, to invert the perspective on language evolution, shifting the focus from the evolution of language users to the evolution of languages. Figure 3 provides a conceptual illustration of these two perspectives (see also Andersen, 1973; Hurford, 1990; Kirby & Hurford, 1997). The UG adaptationists (a) suggest that selective pressure toward better language abilities gradually led to the selection of more sophisticated UGs. In contrast, (b) we propose to view language as an evolutionary system in its own right (see also e.g., Christiansen, 1994; Deacon, 1997; Keller, 1994; Kirby, 1999; Ritt, 2004), subject to adaptive pressures from the human brain. As a result, linguistic adaptation allows for the evolution of increasingly expressive languages that can nonetheless still be learned and processed by domain-general mechanisms. From this perspective, we argue that the mystery of the fit between human language acquisition and processing mechanisms and natural language may be unraveled, and we might, furthermore, understand how language has attained its apparently “idiosyncratic” structure.

 

Instead of puzzling that humans can only learn a small subset of the infinity of mathematically possible languages, we take a different starting point: the observation that natural languages exist only because humans can produce, learn and process them. In order for languages to be passed on from generation to generation, they must adapt to the properties of the human learning and processing mechanisms; the structures in each language form a highly interdependent system, rather than a collection of independent traits. The key to understanding the fit between language and the brain is to understand how language has been shaped by the brain, not the reverse. The process by which language has been shaped by the brain is, in important ways, akin to Darwinian selection. We therefore suggest that it is a productive metaphor to view languages as analogous to biological species, adapted through natural selection to fit a particular ecological niche: the human brain.

 

This viewpoint does not rule out the possibility that language may have played a role in the biological evolution of hominids. Good language skills may indeed enhance reproductive success. But the pressures working on language to adapt to humans are significantly stronger than the selection pressures on humans to use language. In the case of the former, a language can only survive if it is learnable and processable by humans. In the case of the latter, adaptation towards language use is merely one of many selective pressures working on hominid evolution (including, for example, avoiding predators and finding food). Whereas humans can survive without language, the opposite is not the case. Thus, prima facie, language is more likely to have been shaped to fit the human brain than the other way round. Languages that are hard for humans to learn and process cannot come into existence at all.

 

 

Figure 3. Illustration of two different views on the direction of causation in language evolution: a) biological adaptations of the brain to language (double arrows), resulting in gradually more intricate UGs (curved arrows) to provide the basis for increasingly complex language production and comprehension (single arrows); b) cultural adaptation of language to the brain (double arrows), resulting in increasingly expressive languages (curved arrows) that are well suited to being acquired and processed by domain-general mechanisms (single arrows).

 

5.1. Historical parallels between linguistic and biological change

 

The idea of language as an adaptive, evolutionary system has a prominent historical pedigree dating back to Darwin and beyond. One of the earliest proponents of the idea that languages evolve diachronically was the eighteenth-century language scholar, Sir William Jones, the first Western scholar to study Sanskrit and note its affinity with Greek and Latin (Cannon, 1991). Later, nineteenth-century linguistics was dominated by an organistic view of language (McMahon, 1994). Franz Bopp, one of the founders of comparative linguistics, regarded language as an organism that could be dissected and classified (Davies, 1987). Wilhelm von Humboldt—the father of generative grammar (Chomsky, 1965; Pinker, 1994)—argued that “… language, in direct conjunction with mental power, is a fully-fashioned organism…” (von Humboldt, 1836/1999, p. 90; original emphasis). More generally, languages were viewed as having life-cycles that included birth, progressive growth, procreation, and eventually decay and death. However, the notion of evolution underlying this organistic view of language was largely pre-Darwinian. This is perhaps reflected most clearly in the writings of another influential linguist, August Schleicher. Although he explicitly emphasized the relationship between linguistics and Darwinian theory (Schleicher, 1863; quoted in Percival, 1987), Darwin’s principles of mutation, variation, and natural selection did not enter into the theorizing about language evolution (Nerlich, 1989). Instead, the evolution of language was seen in pre-Darwinian terms as the progressive growth towards attainment of perfection, followed by decay. 

 

Darwin (1900), too, recognized the similarities between linguistic and biological change[xii]:

 

The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel … We find in distinct languages striking homologies due to community of descent, and analogies due to a similar process of formation. The manner in which certain letters or sounds change when others change is very like correlated growth … Languages, like organic beings, can be classed in groups under groups; and they can be classed either naturally, according to descent, or artificially by other characters. Dominant languages and dialects spread widely, and lead to the gradual extinction of other tongues. A language, like a species, when once extinct, never … reappears … A struggle for life is constantly going on among the words and grammatical forms in each language. The better, the shorter, the easier forms are constantly gaining the upper hand … The survival and preservation of certain favored words in the struggle for existence is natural selection. (p. 106)

 

In this sense, natural language can be construed metaphorically as akin to an organism whose evolution has been constrained by the properties of human learning and processing mechanisms. A similar perspective on language evolution was revived, within a modern evolutionary framework, by Stevick (1963) and later by Nerlich (1989). Sereno (1991) has listed a number of parallels between biological organisms and language (with the biological comparisons in parentheses):

 

An intercommunicating group of people defines a language (cf. gene flow in relation to a species); language abilities develop in each speaker (cf. embryonic development); language must be transmitted to offspring (cf. heritability); there is a low level process of sound and meaning change that continuously generates variation (cf. mutation); languages gradually diverge, especially when spatially separated (cf. allopatric speciation); geographical distributions of dialects (cf. subspecies, clines) gradually give rise to wholesale rearrangements of phonology and syntax (cf. macroevolution); sociolinguistic isolation can lead to language divergence without spatial discontinuity (cf. sympatric speciation). (p. 472)

 

Christiansen (1994) pushed the analogy a little further, suggesting that language may be viewed as a “beneficial parasite” engaged in a symbiotic relationship with its human hosts, without whom it cannot survive (see also Deacon, 1997). Symbiotic parasites and their hosts tend to become increasingly co-adapted (e.g., Dawkins, 1976). But note that this co-adaptation will be very lopsided, because the rate of linguistic change is far greater than the rate of biological change. Whereas Danish and Hindi needed less than 7,000 years to evolve from a common hypothesized proto-Indo-European ancestor into very different languages (Gray & Atkinson, 2003), it took our remote ancestors approximately 100,000–200,000 years to evolve from the archaic form of Homo sapiens into the anatomically modern form, sometimes termed Homo sapiens sapiens. Indeed, as we argued above, the rapidity of language change and the geographical dispersal of humanity suggest that biological adaptation to language is negligible. This suggestion is further corroborated by work in evolutionary game theory, showing that when two species with markedly different rates of adaptation enter a symbiotic relationship, the rapidly evolving species adapts to the slowly evolving one but not the reverse (Frean & Abraham, 2004).

 

5.2. Language as a system

 

But in what sense should language be viewed as akin to an integrated organism, rather than as a collection of separate traits, evolving relatively independently? The reason is that language is highly systematic—so much so, indeed, that much of linguistic theory is concerned with tracking the systematic relationships between different aspects of linguistic structure. Although language is an integrated system, it can, nonetheless, be viewed as comprising a complex set of “features” or “traits” which may or may not be passed on from one generation to the next (concerning lexical items, idioms, aspects of phonology, syntax and so on). To a first approximation, traits that are easy for learners to acquire and use will become more prevalent; traits that are more difficult to acquire and use will disappear. Thus, selectional pressure from language learners and users will shape the way in which language evolves. Crucially, the systematic character of linguistic traits means that, to some degree at least, the fates of different traits in a language are intertwined. That is, the degree to which any particular trait is easy to learn or process will, to some extent, depend on the other features of the language—because language users will tend to learn and process each aspect of the language in the light of their experience with the rest. This picture is familiar in biology—the selectional impact of any gene depends crucially on the rest of the genome; the selectional forces on each gene, for good or ill, are tied to the development and functioning of the entire organism. 

 

Construing language as an evolutionary system has implications for explanations of what is being selected in language evolution. From the viewpoint of generative grammar, the unit of selection would seem to be either specific UG principles (in PPT; Newmeyer, 1991), particular parts of the UG toolkit (in SS; Culicover & Jackendoff, 2005), or recursion in the form of Merge (in MP; Hauser et al., 2002). In all cases, selection would seem to take place at a high level of abstraction that cuts across a multitude of specific linguistic constructions. Our approach suggests a different perspective inspired by the “lexical turn” in linguistics (e.g., Combinatory Categorial Grammar, Steedman, 2000; Head-driven Phrase Structure Grammar, Sag & Pollard, 1987; Lexical-Functional Grammar, Bresnan, 1982), focusing on specific lexical items with their associated syntactic and semantic information. Specifically, we adopt a Construction Grammar view of language (e.g., Croft, 2000, 2001; Goldberg, 2006; O’Grady, 2005), proposing that individual constructions consisting of words or combinations thereof are among the basic units of selection.

 

To spell out the parallel, the idiolect of an individual speaker is analogous to an individual organism; a language (e.g., Mandarin, French) is akin to a species. A linguistic “genotype” corresponds to the neural representation of an idiolect, instantiated by a collection of mental “constructions”, which are here analogous to genes, and gives rise to linguistic behavior—the language “phenotype”—characterized by a collection of utterances and interpretations. Just as the fitness of an individual gene depends on its interaction with other genes, so the fitness of an individual construction is intertwined with those of other constructions; i.e., constructions are part of a (linguistic) system. A species in biology is defined by the ability to interbreed; a “language species” is defined by mutual intelligibility. Hence, interbreeding and mutually intelligible linguistic interactions can be viewed as analogous processes by which genetic material and constructions can propagate.

 

The long-term survival of any given construction is affected both by its individual properties (e.g., frequency of usage) and how well it fits into the overall linguistic system (e.g., syntactic, semantic, or pragmatic overlap with other constructions). In a series of linguistic and corpus-based analyses, Bybee (2007) has shown how frequency of occurrence plays an important role in shaping language from phonology to morphology to morphosyntax, due to the effects of repeated processing experiences with specific examples (either types or tokens). Additionally, groups of constructions overlapping in terms of syntactic, semantic, and/or pragmatic properties emerge and form the basis for usage-based generalizations (e.g., Goldberg, 2006; Tomasello, 2003). Crucially, however, these groupings lead to a distributed system of local generalizations across partially overlapping constructions, rather than the abstract, mostly global generalizations of current generative grammar.

 

In psycholinguistics, the effects of frequency and pattern overlap have been observed in so-called Frequency × Regularity interactions. As an example, consider the acquisition of the English past tense. Frequently occurring mappings, such as go → went, are learned more easily than more infrequent mappings, such as lie → lay. However, low-frequency patterns may be more easily learned if they overlap in part with other patterns. Thus, the partial overlap in the mappings from stem to past tense in sleep → slept, weep → wept, keep → kept (i.e., -eep → -ept) makes the learning of these mappings relatively easy even though none of the words individually has a particularly high frequency. Importantly, the two factors—frequency and regularity (i.e., degree of partial overlap)—interact with each other. High-frequency patterns are easily learned independent of whether they are regular or not, whereas the learning of low-frequency patterns suffers if they are not regular (i.e., if they do not have partial overlap with other patterns). Results from psycholinguistic experimentation and computational modeling have revealed such Frequency × Regularity interactions across many aspects of language, including auditory word recognition (Lively, Pisoni & Goldinger, 1994), visual word recognition (Seidenberg, 1985), English past tense acquisition (Hare & Elman, 1995), and sentence processing (Juliano & Tanenhaus, 1994; MacDonald & Christiansen, 2002; Pearlmutter & MacDonald, 1995).
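
The shape of such an interaction can be shown with a small numerical sketch. It assumes, purely for illustration, that frequency and regularity are each scaled between 0 and 1 and that they combine in a "noisy-OR" fashion: a mapping is learned if either frequent exposure or support from overlapping patterns succeeds. This combination rule, the function names and the values used are assumptions, and are not taken from the models in the studies cited above.

```python
def learnability(frequency, regularity):
    """Assumed noisy-OR combination of two factors, each in [0, 1]."""
    return 1.0 - (1.0 - frequency) * (1.0 - regularity)


def regularity_benefit(frequency, low_regularity=0.0, high_regularity=0.8):
    """How much a mapping gains from overlapping with other patterns."""
    return learnability(frequency, high_regularity) - learnability(
        frequency, low_regularity
    )


# Hypothetical values: a high-frequency and a low-frequency mapping.
for label, frequency in [("high frequency", 0.9), ("low frequency", 0.1)]:
    print(
        f"{label}: irregular={learnability(frequency, 0.0):.2f}, "
        f"regular={learnability(frequency, 0.8):.2f}, "
        f"benefit of regularity={regularity_benefit(frequency):.2f}"
    )
```

Under this assumed rule, high-frequency mappings are learned well whether or not they are regular, whereas low-frequency mappings depend heavily on regularity, which is the qualitative signature of a Frequency × Regularity interaction.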

 

In our case, we suggest that similar interactions between frequency and pattern overlap are likely to play an important role in language evolution. Individual constructions may survive through frequent usage or because they participate in usage-based generalizations through syntactic, semantic or pragmatic overlap with other similar constructions. Further support for this suggestion comes from artificial language learning studies with human subjects, demonstrating that certain combinations of artificial-language structures are more easily learned than others given sequential learning biases (e.g., Christiansen, 2000; Christiansen & Reeder, 2006; Saffran, 2001; see Section 6.3). For example, Ellefson and Christiansen (2000) compared human learning across two artificial languages that only differed in the order of words in two out of six sentence types. They found not only that the more “natural” language was learned better overall but also that the four sentence types common to both languages were learned better as well. This suggests that the artificial languages were learned as integrated systems, rather than as collections of independent items. Further corroboration comes from a study by Kaschak and Glenberg (2004) who had adult participants learn the needs construction (e.g., “The meal needs cooked”), a feature of the American English dialect spoken in the northern midlands region from western Pennsylvania across Ohio, Indiana, and Illinois to Iowa. The training on the needs construction facilitated the processing of related modifier constructions (e.g., “The meal needs cooked vegetables”), again suggesting that constructions form an integrated system that can be affected by the learning of new constructions. Thus, although constructions are selected independently, they also provide an environment for each other within which selection takes place, just as the selection of individual genes is tied to the survival of the other genes that make up an organism.

 

5.3. The nature of language universals

 

We have argued that language is best viewed as a linguistic system adapted to the human brain. But if evolution is unlikely to have bestowed us with an innate UG, then how can we account for the various aspects of language that UG constraints are supposed to explain? That is, how can we explain the existence of apparent language universals: regularities in language structure and use? Notice, however, that it is by no means clear exactly what counts as a language universal. Rather, the notion of language universals differs considerably across language researchers (e.g., the variety in perspectives among contributions in Christiansen, Collins & Edelman, in press). Many linguists working within the generative grammar framework see universals as primarily, and sometimes exclusively, deriving from UG (e.g., Hornstein & Boeckx, in press; Pinker & Jackendoff, in press). Functional linguists, on the other hand, view universals as arising from patterns of language usage due to pragmatic, processing and other constraints, and amplified in diachronic language change (e.g., Bybee, in press). However, even within the same theoretical linguistic framework, there is often little agreement about what the exact universals are. For example, when surveying specific universals proposed by different proponents of UG, Tomasello (2004) found little overlap between proposed universals.

 

Although there may be little agreement about specific universals, some consensus can nonetheless be found with respect to their general nature. Thus, within mainstream generative grammar approaches (including MP and PPT), language universals are seen as arising from the inner workings of UG. As noted by Hornstein and Boeckx (in press),

 

on this conception universals are likely to be quite abstract. They need not be observable even were one to survey thousands of languages looking for commonalities (unlike, say Greenbergian Universals). In fact, on this conception, the mere fact that every language displayed some property P does not imply that P is a universal in the sense of being a feature of UG. Put more paradoxically, the fact that P holds universally does not imply that P is a universal. Conversely, some property can be a universal even if only manifested in a single natural language. The only thing that makes a principle a Universal on this view is that it is a property of our innate ability to grow a language. (p. 3-4)

 

Thus, from the perspective of MP and PPT, language universals are by definition properties of UG; that is, they are formal universals (Chomsky, 1965). A similar view of universals also figures within the SS framework (Culicover & Jackendoff, 2005), defined in terms of the universal toolkit encoded in UG. Because different languages are hypothesized to use different subsets of tools, the SS approach—like MP and PPT—suggests that some universals may not show up in all languages (Pinker & Jackendoff, in press). However, both notions of universals face the logical problem of language evolution discussed above: How could the full set of UG constraints have evolved if any single linguistic environment only ever supported a subset of them?

 

The solution to this problem, we suggest, is to adopt a non-formal conception of universals in which they emerge from processes of repeated language acquisition and use. We see universals as products of the interaction between constraints deriving from the way our thought processes work, from perceptuo-motor factors, from cognitive limitations on learning and processing, and from pragmatic sources (Section 6 below). This view implies that most universals are unlikely to be found across all languages; rather, “universals” are more akin to statistical trends tied to patterns of language use. Consequently, specific universals fall on a continuum ranging from being attested in only some languages to being found across most languages. An example of the former is the class of implicational universals, such as that verb-final languages tend to have postpositions (Dryer, 1992), whereas the presence of nouns and verbs in most, if not all, languages (minimally as typological prototypes; Croft, 2001) is an example of the latter. Thus, language universals, we suggest, are best construed as statistical tendencies with varying degrees of universality across the world’s languages.

 

We have argued that language is too variable, both in time and space, to provide a selectional pressure that might shape the gradual adaptation of an innate UG encoding arbitrary, but universal linguistic constraints. Moreover, a putative innate UG would be too complex and specialized to have credibly arisen through non-adaptationist mechanisms. Instead, we have proposed that the fit between language and the brain arises because language has evolved to be readily learned and processed by the brain. We now consider what kinds of non-linguistic constraints are likely to have shaped language to the brain, and given rise to statistical tendencies in language structure and use.

 

6. Constraints on Language Structure

 

We have proposed that language has adapted to the non-linguistic constraints deriving from language learners and users, giving rise to observable linguistic universals. But how far can these constraints be identified? To what extent can linguistic structure previously ascribed to an innate UG be identified as having a non-linguistic basis? Clearly, establishing a complete answer to this question would require a vast program of research. In this section, we illustrate how research from different areas of the language sciences can be brought together to explain aspects of language previously thought to require the existence of UG for their explanation. For the purpose of exposition, we divide the constraints into four groups relating to thought, perceptuo-motor factors, cognition, and pragmatics. These constraints derive from the limitations and idiosyncratic properties of the human brain and other parts of our body involved in language (e.g., the vocal tract). However, as we note below, any given linguistic phenomenon is likely to arise from a combination of multiple constraints that cut across these groupings, and thus across different kinds of brain mechanisms.

 

6.1. Constraints from thought

 

The relationship between language and thought is potentially very rich, but also extremely controversial. Thus, the analytic tradition in philosophy can be viewed as attempting to understand thought through a careful analysis of language (e.g., Blackburn, 1984); it has been widely assumed that the structure of sentences (or utterances, and perhaps the contexts in which they stand), and the inferential relations over them, provide an analysis of thought. A standard assumption is that thought is largely prior to, and independent of, linguistic communication. Accordingly, fundamental properties of language such as compositionality, function-argument structure, quantification, aspect and modality, may arise from the structure of the thoughts language is required to express (e.g., Schoenemann, 1999). Moreover, presumably language provides a reasonably efficient mapping of the mental representation of thoughts, with these properties, into phonology. This viewpoint can be instantiated in a variety of ways. For example, Steedman’s emphasis on incremental interpretation (e.g., that successive partial semantic representations are constructed as the sentence unfolds—i.e., the thought that a sentence expresses is built up piecemeal) is one motivation for categorial grammar (e.g., Steedman, 2000). From a very different stance, the aim of finding a “perfect” relationship between thought and phonology is closely related to the goals of the Minimalist Program (Chomsky, 1995).[xiii] Indeed, Chomsky (e.g., 2005) has recently suggested that language may have originated as a vehicle for thought, and only later become exapted to serve as a system of communication. This viewpoint would not, of course, explain the content of a putative UG, which concerns principles for mapping mental representations of thought into phonology; and this mapping surely is specific to communication: inferences are, after all, presumably defined over mental representations of thoughts, rather than phonological representations, or, for that matter, syntactic trees.

 

The lexicon is presumably also strongly constrained by processes of perception and categorization—the meanings of words must be both learnable and cognitively useful (e.g., Murphy, 2002); indeed, the philosophical literature on lexical meaning, from a range of theoretical perspectives, sees cognitive constraints as fundamental to understanding word meaning, whether these constraints are given by innate systems of internal representation (Fodor, 1975), or primitive mechanisms of generalization (Quine, 1960). Cognitive linguists (e.g., Croft & Cruse, 2004) have argued for a far more intimate relation between thought and language: for example, basic conceptual machinery (e.g., concerning spatial structure) and the mapping of such structure into more abstract domains (e.g., via metaphor) are, according to some accounts, evident in languages (e.g., Lakoff & Johnson, 1980). And from a related perspective (e.g., Croft, 2001), some linguists have argued that semantic categories of thought (e.g., of objects and relations) may be shared between languages, whereas syntactic categories and constructions are defined by language-internal properties, such as distributional relations, so that the attempt to find cross-linguistic syntactic universals is doomed to failure.

 

6.2. Perceptuo-motor constraints

 

The motor and perceptual machinery underpinning language seems inevitably to have some influence on language structure. The seriality of vocal output, most obviously, forces a sequential construction of messages. A perceptual and memory system which is typically a “greedy” processor, and has a very limited capacity for storing “raw” sensory input of any kind (e.g., Haber, 1983) may, moreover, force a code which can be interpreted incrementally (unlike many practical codes in communication engineering, in which information is stored in large blocks, e.g., MacKay, 2003). The noisiness and variability (both with context and speaker) of vocal (or, indeed, signed) signals may, moreover, force a “digital” communication system, with a small number of basic messages: i.e., one that uses discrete units (phonetic features or phonemes). The basic phonetic inventory is transparently related to deployment of the vocal apparatus, and it is also possible that it is tuned, to some degree, to respect “natural” perceptual boundaries (Kuhl, 1987). Some theorists have argued for more far-reaching connections. For example, MacNeilage (1998) argues that aspects of syllable structure emerge as a variation on the jaw movements involved in eating, and for some cognitive linguists, the perceptual-motor system is a crucial part of the machinery on which the linguistic system is built (e.g., Hampe, 2006). The depth of the influence of perceptual and motor control on more abstract aspects of language is controversial—but it seems plausible that such influence may be substantial.

 

 

6.3. Cognitive constraints on learning and processing

 

In our framework, language acquisition is construed not as learning a distant grammar, but as learning how to process language. Although constraints on learning and processing are often treated separately (e.g., Bybee, 2007; Hawkins, 2004; Tomasello, 2003), we see them as being highly intertwined, subserved by the very same underlying mechanisms. Language processing involves extracting regularities from highly complex sequential input, pointing to a connection between general sequential learning (e.g., planning, motor control, etc., Lashley, 1951) and language: both involve the extraction and further processing of discrete elements occurring in complex temporal sequences. It is therefore not surprising that sequential learning tasks have become an important experimental paradigm for studying language acquisition and processing (sometimes under the heading of “artificial grammar/language learning”, Gómez & Gerken, 2000, or “statistical learning”, Saffran, 2003). Sequential learning has thus been demonstrated for a variety of different aspects of language, including speech segmentation (Curtin, Mintz & Christiansen, 2005; Saffran, Aslin & Newport, 1996; Saffran, Newport & Aslin, 1996), discovering complex word-internal structure between nonadjacent elements (Newport & Aslin, 2004; Onnis, Monaghan, Chater & Richmond, 2005; Peña, Bonatti, Nespor & Mehler, 2002), acquiring gender-like morphological systems (Brooks, Braine, Catalano, Brody & Sudhalter, 1993; Frigo & McDonald, 1998), locating syntactic phrase boundaries (Saffran, 2001, 2002), using function words to delineate phrases (Green, 1979), integrating prosodic and morphological cues in the learning of phrase structure (Morgan, Meier & Newport, 1987), integrating phonological and distributional cues (Monaghan, Chater & Christiansen, 2005), and detecting long-distance relationships between words (Gómez, 2002; Onnis, Christiansen, Chater & Gómez, 2003).

 

The close relationship between sequential learning and grammatical ability has been further corroborated by recent neuroimaging studies, showing that people trained on an artificial language have the same event-related potential (ERP) brainwave patterns to ungrammatical artificial-language sentences as to ungrammatical natural-language sentences (Christiansen, Conway & Onnis, 2007; Friederici, Steinhauer & Pfeifer, 2002). Moreover, novel incongruent musical sequences elicit ERP patterns that are statistically indistinguishable from syntactic incongruities in language (Patel, Gibson, Ratner, Besson & Holcomb, 1998). Results from a magnetoencephalography (MEG) experiment further suggest that Broca’s area plays a crucial role in processing music sequences (Maess, Koelsch, Gunter & Friederici, 2001). Finally, event-related functional magnetic resonance imaging (fMRI) has shown that the same brain area—Broca’s area—is involved in an artificial grammar learning task and in normal natural language processing (Petersson, Forkstam & Ingvar, 2004). Further evidence comes from behavioral studies with language impaired populations, showing that aphasia (Christiansen, Kelly, Shillcock & Greenfield, 2007; Hoen et al., 2003), language learning disability (Plante, Gómez & Gerken, 2002), and specific language impairment (Hsu, Christiansen, Tomblin, Zhang & Gómez, 2006; Tomblin, Mainela-Arnold & Zhang, 2007) are associated with impaired sequential learning. Together, these studies strongly suggest that there is considerable overlap in the neural mechanisms involved in language and sequential learning[xiv] (see also Conway, Karpicke & Pisoni, 2007; Ullman, 2004; Wilkins & Wakefield, 1995, for similar perspectives).

 

This psychological research can be seen as providing a foundation for work in functional and typological linguistics indicating how theoretical constraints on sequential learning and processing can be used to explain certain universal patterns in language structure and use. One suggestion, from O’Grady (2005), is that the language processing system seeks to resolve linguistic dependencies (e.g., between verbs and their arguments) at the first opportunity—a tendency that might not be syntax-specific, but instead an instance of a general cognitive tendency to attempt to resolve ambiguities rapidly in linguistic (Clark, 1975) and perceptual input (Pomerantz & Kubovy, 1986). In a similar vein, Hawkins (1994, 2004) and Culicover (1999) propose specific measures of processing complexity (roughly, the number of linguistic constituents required to link syntactic and conceptual structure), which they assume underpin judgments concerning linguistic acceptability. The collection of studies in Bybee (2007) further underscores the importance of frequency of use in shaping language. Importantly, this line of work has begun to detail learning and processing constraints that can help explain specific linguistic patterns, such as the aforementioned examples of pronoun binding (previous examples 1-4; see O’Grady, 2005) and heavy NP-shift (examples 5-6; see Hawkins, 1994, 2004), and indicates an increasing emphasis on performance constraints within linguistics.

 

In turn, a growing body of empirical research in computational linguistics, cognitive science, and psycholinguistics has begun to explore how these theoretical constraints may be instantiated in terms of computational and psychological mechanisms. For instance, basic word order patterns may derive from memory constraints related to sequential learning and processing of linguistic material, as indicated by computational simulations (e.g., Christiansen & Devlin, 1997; Kirby, 1999; Lupyan & Christiansen, 2002; Van Everbroeck, 1999), human experimentation involving artificial languages (e.g., Christiansen, 2000; Christiansen & Reeder, 2006), and cross-linguistic corpus analyses (e.g., Bybee, 2002; Hawkins, 1994, 2004). Similarly, behavioral experiments and computational modeling have provided evidence for general processing constraints (instead of innate subjacency constraints) on complex question formation (Berwick & Weinberg, 1984; Ellefson & Christiansen, 2000).

 

6.4. Pragmatic constraints

 

Language is likely, moreover, to be substantially shaped by the pragmatic constraints involved in linguistic communication. The program of developing and extending Gricean implicatures (Grice, 1967; Levinson, 2000; Sperber & Wilson, 1986) has revealed enormous complexity in the relationship between the literal meaning of an utterance and the message that the speaker intends to convey. Pragmatic processes may, indeed, be crucial in understanding many aspects of linguistic structure, as well as the processes of language change.

 

Consider the nature of anaphora and binding. Levinson (2000) notes that the patterns of “discourse” anaphora (7) and syntactic anaphora (8) have interesting parallels.

 

(7)        a. John arrived. He began to sing.

           b. John arrived. The man began to sing.

(8)        a. John arrived and he began to sing.

           b. John arrived and the man began to sing.

 

In both (7) and (8), the first form indicates preferred coreference of he and John; the second form prefers non-coreference. The general pattern is that brief expressions encourage coreference with a previously introduced item; Grice’s maxim of quantity implies that, by default, a prolix expression will not be used where a brief expression could be, and hence prolix expressions are typically taken to imply non-coreference with previously introduced entities. Where the referring expression is absent, then coreference may be required as in (9), in which the singer can only be John:

 

(9) John arrived and began to sing.

 

It is natural to assume that syntactic structures emerge, diachronically, from reduction of discourse structures; in Givón’s phrase, “Yesterday’s discourse is today’s syntax” (as cited in Tomasello, 2006). The shift, over time, from default constraint to rigid rule is widespread in language change and much studied in the sub-field of grammaticalization (see Section 7.1).

 

Applying this pragmatic perspective to the binding constraints, Levinson (1987a, 1987b, 2000) notes that the availability, but non-use, of the reflexive himself provides a default (and later, perhaps, rigid) constraint that him does not corefer with John in (10).

 

(10)      a. John_i likes himself_i

          b. John_i likes him_j

 

Levinson (2000), building on related work by Reinhart (1983), provides a comprehensive account of the binding constraints, and putative exceptions to them, purely on pragmatic principles (see also Huang, 2000, for a cross-linguistic perspective). In sum, pragmatic principles can at least partly explain both the structure, and origin, of linguistic patterns that are often viewed as solely formal, and hence arbitrary.

 

6.5. The impact of multiple constraints

 

In this section, we have discussed four types of constraints that have shaped the evolution of language. Importantly, we see these constraints as interacting with one another, such that individual linguistic phenomena arise from a combination of several different types of constraints. For example, the patterns of binding phenomena are likely to require explanations that cut across the four types of constraints, including constraints on cognitive processing (O’Grady, 2005) and pragmatics (Levinson, 1987a; Reinhart, 1983). That is, the explanation of any given aspect of language is likely to require the inclusion of multiple overlapping constraints deriving from thought, perceptuo-motor factors, cognition, and pragmatics.

 

The idea of explaining language structure and use through the integration of multiple constraints goes back at least to early functionalist approaches to the psychology of language (e.g., Bates & MacWhinney, 1979; Bever, 1970; Slobin, 1973). It plays an important role in current constraint-based theories of sentence comprehension (e.g., MacDonald, Pearlmutter & Seidenberg, 1994; Tanenhaus & Trueswell, 1995). Experiments have demonstrated how adults’ interpretations of sentences are sensitive to a variety of constraints, including specific world knowledge relating to the content of an utterance (e.g., Kamide, Altmann & Haywood, 2003), the visual context in which the utterance is produced (e.g., Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995), the sound properties of individual words (Farmer, Christiansen & Monaghan, 2006), the processing difficulty of an utterance as well as how such difficulty may be affected by prior experience (e.g., Reali & Christiansen, 2007), and various pragmatic factors (e.g., Fitneva & Spivey, 2004). Similarly, the integration of multiple constraints—or “cues”—also figures prominently in contemporary theories of language acquisition (see e.g., contributions in Golinkoff et al., 2000; Morgan & Demuth, 1996; Weissenborn & Höhle, 2001; for a review, see Monaghan & Christiansen, in press).

 

The multiple-constraints satisfaction perspective on language evolution also offers an explanation for why language is unique to humans: as a cultural product, language has been shaped by constraints from multiple mechanisms, some of which have properties unique to humans. Specifically, we suggest that language does not involve any mechanisms qualitatively different from those of extant apes, but instead a number of quantitative evolutionary refinements of older primate systems (e.g., for intention sharing and understanding, Tomasello, Carpenter, Call, Behne & Moll, 2005; or complex sequential learning and processing[xv], Conway & Christiansen, 2001). These changes could be viewed as providing necessary pre-adaptations that, once in place, allowed language to emerge through cultural transmission (e.g., Elman, 1999). It is also conceivable that initial changes, if functional, could have been subject to further amplification through the Baldwin effect, perhaps resulting in multiple quantitative shifts in human evolution. The key point is that none of these changes would result in the evolution of UG. The species-specificity of a given trait does not necessitate postulating specific biological adaptations for that trait. For example, even though playing tag may be species-specific and perhaps even universal, few people, if any, would argue that humans have evolved specific adaptations for playing this game. Thus, the uniqueness of language is better viewed as part of the larger question: why are humans different from other primates? It seems clear that considering language in isolation is not going to give us the answer to this question.

 

7. How Constraints Shape Language over Time

 

According to the view that language evolution is determined by the development of UG, there is a sharp divide between questions of language evolution (how the genetic endowment could arise evolutionarily), and historical language change (which is viewed as variation within the genetically determined limits of possible human languages). By contrast, if language has evolved to fit prior cognitive and communicative constraints, then it is plausible that historical processes of language change provide a model of language evolution; indeed, historical language change may be language evolution in microcosm. This perspective is consistent with much work in functional and typological linguistics (e.g., Bever & Langendoen, 1971; Croft, 2000; Givón, 1998; Hawkins, 2004; Heine & Kuteva, 2002).

 

At the outset, it is natural to expect that language will be the outcome of competing selectional forces. On the one hand, as we shall note, there will be a variety of selectional forces that make the language “easier” for speakers/hearers; on the other, it is likely that expressibility is a powerful selectional constraint, tending to increase linguistic complexity over evolutionary time. For instance, it has been suggested that the use of hierarchical structure and limited recursion to express more complex meanings may have arrived at later stages of language evolution (Jackendoff, 2002; Johansson, 2006). Indeed, the modern Amazonian language, Pirahã, lacks recursion and has one of the world’s smallest phoneme inventories (though its morphology is complex), limiting its expressivity (Everett, 2005; but see also the critique by Nevins, Pesetsky & Rodrigues, 2007, and Everett’s, 2007, response).

 

While expressivity is one selectional force that may tend to increase linguistic complexity, it will typically stand in opposition to another: ease of learning and processing will tend to favor linguistic simplicity. But the picture may be more complex: in some cases, ease of learning and ease of processing may stand in opposition. For example, regularity makes items easier to learn; the shortening of frequent items, and consequent irregularity, may make aspects of language easier to say. There are similar tensions between ease of production (which favors simplifying the speech signal), and ease of comprehension (which favors a richer, and hence more informative, signal). Moreover, whereas constraints deriving from the brain provide pressures toward simplification of language, processes of grammaticalization can add complexity to language (e.g., by the emergence of morphological markers). Thus, part of the complexity of language, just as in biology, may arise from the complex interaction of competing constraints.

 

7.1. Language evolution as linguistic change

 

Recent theory in diachronic linguistics has focused on grammaticalization (e.g., Bybee, Perkins & Pagliuca, 1994; Heine, 1991; Hopper & Traugott, 1993): the process by which functional items, including closed-class words and morphology, develop from what are initially open-class items. This transitional process involves a “bleaching” of meaning, phonological reduction, and increasingly rigid dependencies with other items. Thus, the English number one is likely to be the root of a(n). The Latin cantare habeo (I have (something) to sing) mutated into chanterai, cantaré, canterò (I will sing in French, Spanish, Italian). The suffix corresponds phonologically to I have in each language (respectively, ai, he, ho; the have element has collapsed into inflectional morphology, Fleischman, 1982). The same processes of grammaticalization can also cause certain content words over time to get bleached of their meaning and become grammatical particles. For example, go and have, when used as auxiliary verbs (as in I am going to sing or I have forgotten my hat), have been bleached of their original meanings concerning physical movement and possession (Bybee et al., 1994). The processes of grammaticalization appear gradual, and follow historical patterns, suggesting that there are systematic selectional pressures operative in language change. More generally, these processes provide a possible origin of grammatical structure from a proto-language initially involving perhaps unordered and uninflected strings of content words.

 

From a historical perspective, it is natural to view many aspects of syntax as emerging from processing or pragmatic factors. Revisiting our discussion of binding constraints, we might view complementary distributions of reflexive and non-reflexive pronouns as initially arising from pragmatic factors; the resulting pattern may be acquired and modified by future generations of learners, to some degree independently of those initial factors (e.g., Givón, 1979; Levinson, 1987b). Thus, binding constraints might be a complex product of many forces, including pragmatic factors, and learning and processing biases, and hence the subtlety of those constraints should not be entirely surprising. But from the present perspective, the fact that such a complex system of constraints is readily learnable is neither puzzling nor indicative of an innately specified genetic endowment. Rather, the constraints are learnable because they have been shaped by the very pragmatic, processing and learning constraints with which the learner is endowed.

 

Understanding the cognitive and communicative basis for the direction of grammaticalization and related processes is an important challenge. But equally, the suggestion that this type of observable historical change may be continuous with language evolution opens up the possibility that research on the origin of language may not be a theoretically isolated island of speculation, but may connect directly with one of the most central topics in linguistics: the nature of language change (e.g., Zeevat, 2006). Indeed, grammaticalization has become the center of many recent perspectives on the evolution of language as mediated by cultural transmission across hundreds (perhaps thousands) of generations of learners (e.g., Bybee et al., 1994; Givón, 1998; Heine & Kuteva, 2002; Schoenemann, 1999; Tomasello, 2003). Although the present approach also emphasizes the importance of grammaticalization in the evolution of complex syntax, it differs from other approaches in that we see this diachronic process as being constrained by limitations on learning and processing. Indeed, there have even been intriguing attempts to explain some aspects of language change with reference to the learning properties of connectionist networks. For example, Hare & Elman (1995) demonstrated how cross-generational learning by sequential learning devices can model the gradual historical change in English verb inflection from a complex past tense system in Old English to the dominant “regular” class and small classes of “irregular” verbs of modern English.

 

7.2. Language evolution through cultural transmission

 

How far can language evolution, and historical processes of language change, be explained in terms of general mechanisms of cultural transmission: by attempting to capture the processes by which information is passed from person to person? And how might information be selectively distorted by such processes? Crucial to any such model, whether concerning language or not, are assumptions about the channel over which cultural information is transmitted; the structure of the network of social interactions over which transmission occurs; and the learning and processing mechanisms that support the acquisition and use of the transmitted information (Boyd & Richerson, 2005).

 

A wide range of recent computational models of the cultural transmission of language has been developed, with different points of emphasis. Some of these models have considered how language is shaped by the process of transmission over successive generations, by the nature of the communication problem to be solved and/or by the nature of the learners (e.g., Batali, 1998; Kirby, 1999). For example, Kirby, Dowman and Griffiths (2007) show that, if information is transmitted directly between individual learners, and learners sample grammars from the Bayes posterior distribution of grammars given that information, then language asymptotically converges to match the priors initially encoded by the learners. In contrast, Smith, Brighton and Kirby (2003), using a different model of how information is learned, indicate how compositional structure in language might have resulted from the complex interaction of learning constraints and cultural transmission, resulting in a “learning bottleneck”. Moreover, a growing number of studies have started to investigate the potentially important interactions between biological and linguistic adaptation in language evolution (e.g., Christiansen et al., 2006; Hurford, 1990; Hurford & Kirby, 1999; Kvasnicka & Pospichal, 1999; Livingstone & Fyfe, 2000; Munroe & Cangelosi, 2002; Smith, 2002, 2004; Yamauchi, 2001).
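
The convergence-to-the-prior result can be illustrated with a minimal sketch. This is not the model of Kirby, Dowman and Griffiths (2007) itself: the two grammars, the prior values, the utterance probabilities and the number of utterances per generation are all hypothetical choices made for illustration. Each learner hears a few utterances from the previous learner, computes the Bayesian posterior over grammars, and samples a grammar from it. The code computes the resulting generation-to-generation transition probabilities exactly and iterates them.

```python
from math import comb

def likelihood(k, n, p_a):
    """Probability of hearing k utterances of type 'a' out of n."""
    return comb(n, k) * p_a ** k * (1 - p_a) ** (n - k)

def transition_matrix(prior, p_a, n):
    """T[g][h]: probability that a learner taught by a speaker of grammar g
    ends up sampling grammar h from its posterior."""
    m = len(prior)
    T = [[0.0] * m for _ in range(m)]
    for g in range(m):
        for k in range(n + 1):
            p_data = likelihood(k, n, p_a[g])           # data produced by g
            joint = [prior[h] * likelihood(k, n, p_a[h]) for h in range(m)]
            z = sum(joint)                              # evidence for this data
            for h in range(m):
                T[g][h] += p_data * joint[h] / z        # posterior sampling
    return T

def iterate(dist, T, generations):
    """Push a distribution over grammars through the chain of learners."""
    for _ in range(generations):
        dist = [sum(dist[g] * T[g][h] for g in range(len(dist)))
                for h in range(len(T))]
    return dist

# Hypothetical illustration values (not taken from the cited study).
PRIOR = [0.8, 0.2]    # learners' prior over two grammars
P_A = [0.9, 0.3]      # probability of utterance type 'a' under each grammar
N_UTTERANCES = 3      # utterances heard per generation

T = transition_matrix(PRIOR, P_A, N_UTTERANCES)
# Start with every speaker using the grammar that the prior disfavors.
final = iterate([0.0, 1.0], T, 200)
```

Under these assumptions, `final` approaches `PRIOR` regardless of the starting distribution, because the prior is the stationary distribution of a chain of learners who sample from their posteriors. Learners who instead pick the single most probable grammar would require a different transition rule and need not show the same convergence.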

 

Of particular interest here are simulations indicating that apparently arbitrary aspects of linguistic structure may arise from constraints on learning and processing (e.g., Kirby, 1998, 1999; Van Everbroeck, 1999). For example, it has been suggested that subjacency constraints may arise from cognitive limitations on sequential learning (Ellefson & Christiansen, 2000). Moreover, using rule-based language induction, Kirby (1999) accounted for the emergence of typological universals as a result of domain-general learning and processing constraints. Finally, note that, in line with the present arguments, a range of recent studies have challenged the plausibility of biological adaptation to arbitrary features of the linguistic environment (e.g., Christiansen et al., 2006; Kirby et al., 2007; Kirby & Hurford, 1997; Munroe & Cangelosi, 2002; Yamauchi, 2001).

 

The range of factors known to be important in cultural transmission (e.g., group size and networks of transmission between group members, fidelity of transmission) has been explored relatively little in simulation work. Furthermore, to the extent that language is shaped by the brain, enriching models of cultural transmission of language, against the backdrop of learning and processing constraints, will be an important direction for the study both of historical language change and language evolution. More generally, viewing language as shaped by cultural transmission (Arbib, 2005; Bybee, 2002; Donald, 1998) only provides the starting point for an explanation of linguistic regularities. The real challenge, we suggest, is to delineate the wide range of constraints, from perceptuo-motor to pragmatic (as sketched above), that operate on language evolution. Detailing these constraints is likely to be crucial for explanations of complex linguistic regularities, and how they can readily be learned and processed.

 

We note here that this perspective on the adaptation of language differs importantly from the processes of cultural change that operate through deliberate and conscious innovation and/or evaluation of cultural variants. On our account, the processes of language change operate to make languages easier to learn and process, and more communicatively effective. But these changes do not operate through processes either of “design” or deliberate adoption by language users. Thus, following Darwin, we view the origin of the adaptive complexity in language as analogous to the origin of adaptive complexity in biology. Specifically, the adaptive complexity of biological organisms is presumed to arise from random genetic variation, winnowed by natural selection (a “blind watchmaker”; Dawkins, 1986); we argue that the adaptive complexity of language arises, similarly, from random linguistic variation winnowed by selectional pressures, though here concerning learning and processing (so again, we have a blind watchmaker).

 

By contrast, for aspects of cultural change for which variants are either created, or selected, by deliberate choice, the picture is very different. Such cultural products can be viewed instead as arising from the incremental action of processes of intelligent design, and more or less explicit evaluations, and decisions to adopt (see Chater, 2005). Many phenomena discussed by evolutionary theorists concerning culture (e.g., Campbell, 1965; Richerson & Boyd, 2005), including those described by meme-theorists (e.g., Blackmore, 1999; Dawkins, 1976; Dennett, 1995), fall into this latter category: explanations of fashions (e.g., wearing baseball caps backwards), catch-phrases, memorable tunes, engineering methods, cultural conventions and institutions (e.g., marriage, revenge killings), scientific and artistic ideas, religious views, and so on, seem patently to be products of sighted watchmakers; i.e., they are products, in part at least, of many generations of intelligent designers, imitators, and critics.

 

Our focus here concerns, instead, the specific, and interdependent, constraints operating on particular linguistic structures and of which people have no conscious awareness. Presumably, speakers do not deliberately contemplate syntactic reanalyses of existing structures, bleach the meaning of common verbs so that they play an increasingly syntactic role, or collapse discourse structure into syntax or syntactic structure into morphology. Of course, there is some deliberate innovation in language (e.g., people consciously invent new words and phrases). But such deliberate innovations should be sharply distinguished from the unconscious operation of the basic learning and processing biases that have shaped the phonological, syntactic and semantic regularities of language.

 

7.3. Language change “in vivo”

 

We have argued that language has evolved over time to be compatible with the human brain. However, it might be objected that it is not clear that languages become better adapted over time given that they all seem capable of expressing a similar range of meanings (Sereno, 1991). In fact, the idea that all languages are fundamentally equal and independent of their users (uniformitarianism) is widely adopted in linguistics, preventing many linguists from thinking about language evolution (Newmeyer, 2003). Yet, much variation exists in how easy it is to use a given language to express a particular meaning given the limitations of human learning and processing mechanisms.

 

The recent work on creolization in sign language provides a window onto how pressures towards increased expressivity interact with constraints on learning and processing “in vivo”. In less than three decades, a sign language has emerged in Nicaragua, created by deaf children with little exposure to established languages. Senghas, Kita and Özyürek (2004) compared signed expressions for complex motions produced by deaf signers of Nicaraguan Sign Language (NSL) with the gestures of hearing Spanish speakers. The results showed that the hearing individuals used a single simultaneous movement combining both manner and path of motion, whereas the deaf NSL signers tended to break the event into two consecutive signs: one for the path of motion and another for the manner. Moreover, this tendency was strongest for the signers who had learned NSL more recently, indicating that NSL has changed from using a holistic way of denoting motion events to a more sequential, compositional format. Although such creolization may be considered as evidence of UG (e.g., Bickerton, 1984; Pinker, 1994), the results may be better construed in terms of cognitive constraints on cultural transmission. Indeed, computational simulations have demonstrated how iterated learning in cultural transmission can change a language starting as a collection of holistic form-meaning pairings into a more compositional format, in which sequences of forms are combined to produce meanings previously expressed holistically (see Kirby & Hurford, 2002, for a review). Similarly, human experimentation operationalizing iterated learning within a new “cross-generational” paradigm (in which the output of one artificial-language learner is used as the input for subsequent “generations” of language learners) has shown that such learning biases over generations can change the structure of artificial languages from holistic mappings to a compositional format (Cornish, 2006). A compositional format allows language to have increased expressivity, while being learnable from exposure to a finite set of form-meaning pairings. Thus, the change towards using sequential compositional forms to describe motion events in NSL can be viewed as a reflection of similar processes of learning and cultural transmission.
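
The learnability advantage of a compositional format can be made concrete with a small sketch. The vocabulary of manners and paths, the signal names and the choice of which events the learner observes are hypothetical, and the two learners are deliberately simple: one memorizes whole signals, the other is assumed to segment each signal into a manner sign and a path sign. The sketch is not a reimplementation of any of the cited simulations or experiments.

```python
from itertools import product

MANNERS = ["roll", "hop", "slide"]
PATHS = ["up", "down", "across"]
MEANINGS = list(product(MANNERS, PATHS))  # nine motion events

# Holistic language: one unanalyzable signal per event.
HOLISTIC = {meaning: f"sign{i}" for i, meaning in enumerate(MEANINGS)}
# Compositional language: a manner sign followed by a path sign.
COMPOSITIONAL = {(m, p): f"{m.upper()} {p.upper()}" for m, p in MEANINGS}

# Transmission bottleneck: the learner only ever observes three events.
SEEN = [("roll", "up"), ("hop", "down"), ("slide", "across")]

def expressible_holistic(language, seen):
    """A rote learner can only reuse the whole signals it has observed."""
    lexicon = {meaning: language[meaning] for meaning in seen}
    return [m for m in MEANINGS if m in lexicon]

def expressible_compositional(language, seen):
    """A segmenting learner can recombine the parts it has observed."""
    manner_sign, path_sign = {}, {}
    for manner, path in seen:
        first, second = language[(manner, path)].split()
        manner_sign[manner], path_sign[path] = first, second
    return [(m, p) for m, p in MEANINGS
            if m in manner_sign and p in path_sign]

holistic_coverage = len(expressible_holistic(HOLISTIC, SEEN))
compositional_coverage = len(expressible_compositional(COMPOSITIONAL, SEEN))
```

With the same three observations, the rote learner of the holistic language can express three of the nine events, whereas the segmenting learner of the compositional language can express all nine. A language that must pass repeatedly through such a bottleneck is therefore transmitted more faithfully when it is compositional.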

 

In a similar vein, the rapid emergence of a regular SOV (subject-object-verb) word order in Al-Sayyid Bedouin Sign Language (ABSL; Sandler, Meir, Padden & Aronoff, 2005) can be interpreted as arising from constraints on learning and processing. ABSL has a longer history than NSL, going back some 70 years. The Al-Sayyid Bedouin group forms an isolated community with a high incidence of congenital deafness, located in the Negev desert region of southern Israel. In contrast to NSL, which developed within a school environment, ABSL has evolved in a more natural setting and is recognized as the second language of the Al-Sayyid village. A key feature of ABSL is that it has developed a basic SOV word order within sentences (e.g., boy apple eat), with modifiers following heads (e.g., apple red). Although this type of word order is very common across the world (Dryer, 1992), it is found neither in the local spoken Arabic dialect nor in Israeli Sign Language, suggesting that ABSL has developed these grammatical regularities de novo. In a series of computational simulations, Christiansen & Devlin (1997) found that languages with consistent word orders were easier for a sequential learning device to learn than languages with inconsistent word orders. Thus, a language with a grammatical structure such as ABSL was easier to learn than one in which an SOV word order was combined with a modifier-head order within phrases. Similar results were obtained when human subjects were trained on artificial languages with either consistent or inconsistent word orders (Christiansen, 2000; Christiansen & Reeder, 2006). Further simulations have demonstrated how sequential learning biases can lead to the emergence of languages with regular word orders through cultural transmission, even when starting from a language with a completely random word order (Christiansen & Dale, 2004; Reali & Christiansen, in press).
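
How a weak learning bias can turn a random word order into a regular one over generations can be sketched as follows. This is a toy illustration, not the sequential learning device used in the cited simulations: the language is reduced to a single probability of producing SOV order, and the learner's bias is a hypothetical "sharpening" rule that slightly exaggerates whichever order was heard more often. The bias strength, sample size, number of generations and random seed are arbitrary assumptions.

```python
import random

def sharpen(freq, bias):
    """Hypothetical learner bias: exaggerate the more frequent order."""
    a, b = freq ** bias, (1.0 - freq) ** bias
    return a / (a + b)

def transmit(p_sov, n_utterances, bias, rng):
    """One generation: hear n utterances, estimate the SOV rate, sharpen it."""
    heard_sov = sum(rng.random() < p_sov for _ in range(n_utterances))
    return sharpen(heard_sov / n_utterances, bias)

def run_chain(generations=200, n_utterances=20, bias=2.0, seed=1):
    """Pass the language down a chain of learners, one per generation."""
    rng = random.Random(seed)
    p_sov = 0.5              # completely random word order at the outset
    history = [p_sov]
    for _ in range(generations):
        p_sov = transmit(p_sov, n_utterances, bias, rng)
        history.append(p_sov)
    return history

history = run_chain()
# Proportion of utterances that follow the majority order at the end.
consistency = max(history[-1], 1.0 - history[-1])
```

In this sketch, sampling noise breaks the initial symmetry and the bias amplifies it, so the chain drifts towards a language that uses one order almost exclusively. Which order wins depends on chance; the regularity itself is a product of the learners' bias acting repeatedly through transmission.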

 

Differences in learnability are not confined to newly emerged languages but can also be observed in well-established languages. For example, Slobin and Bever (1982) found that when children learning English, Italian, Turkish, or Serbo-Croatian were asked to act out reversible transitive sentences, such as the horse kicked the cow, using familiar toy animals, language-specific differences in performance emerged. Turkish-speaking children performed very well already at 2 years of age, most likely due to the regular case markings in this language, indicating who is doing what to whom. Young English- and Italian-speaking children initially performed slightly worse than the Turkish children but quickly caught up around 3 years of age, relying on the relatively consistent word order information available in these languages, with subjects preceding objects. The children acquiring Serbo-Croatian, on the other hand, had problems determining the meaning of the simple sentences, most likely because this language uses a combination of case markings and word order to indicate agent and patient roles in a sentence. Crucially, only masculine and feminine nouns take on accusative or nominative markings and can occur in any order with respect to one another, but sentences with one or more unmarked neuter nouns are typically ordered as subject-verb-object. Of course, Serbo-Croatian children eventually catch up with the Turkish-, English-, and Italian-speaking children, but these results do show that some meanings are harder to learn and process in some languages compared to others, indicating differential fitness across languages (see Lupyan & Christiansen, 2002, for corroborating computational simulations).

 

Within specific languages, substantial differences also exist between individual idiolects; e.g., as demonstrated by the considerable differences in language comprehension abilities between cleaners, janitors, undergraduates, graduate students, and lecturers from the same British university (Dabrowska, 1997). Even within the reasonably homogeneous group of college students, individual differences exist in sentence processing abilities due to underlying variations in learning and processing mechanisms combined with variations in exposure to language (for a review, see MacDonald & Christiansen, 2002). Additional sources of variation are likely to come from the incorporation of linguistic innovations into the language. In this context, it has been suggested that innovations may primarily be due to adults (Bybee, in press), whereas constraints on children’s acquisition of language may provide the strongest pressure towards regularization (e.g., Hudson Kam & Newport, 2005). Thus, once we abandon linguistic uniformitarianism, it becomes clear that there is much variability for linguistic adaptation to work with.

 

In sum, we have argued that human language has been shaped by selectional pressure from thousands of generations of language learners and users. Linguistic variants that are easier to learn to understand and produce, and variants that are more economical, expressive and generally effective in communication, persuasion, and perhaps the signaling of status and social group, will be favored. Just as with the multiple selectional pressures operative in biological evolution, the matrix of factors at work in driving the evolution of language is complex. Nonetheless, as we have seen, candidate pressures can be proposed (e.g., the pressure for incrementality, minimizing memory load, regularity, brevity, and so on), and regular patterns of language change that may be responses to those pressures can be identified (e.g., the processes of successive entrenchment, generalization and erosion of structure evident in grammaticalization). Thus, the logical problem of language evolution that appears to confront attempts to explain how a genetically specified linguistic endowment could become encoded, does not arise; it is not the brain that has somehow evolved to fit language, but the reverse.

 

8. Scope of the Argument

 

In this paper, we have presented a theory of language evolution as shaped by the brain. From this perspective, the close fit between language learners and the structure of natural language that motivates many theorists to posit a language-specific biological endowment may instead arise from processes of adaptation operating on language itself. Moreover, we have argued that there are fundamental difficulties with postulating a language-specific biological endowment. It is implausible that such an endowment could evolve through adaptation (because the prior linguistic environments would be too diverse to give rise to universal principles). It is also unlikely that a language-specific endowment of any substantial complexity arose through non-adaptational genetic mechanisms, because the probability of a functional language system arising essentially by chance is vanishingly small. Instead, we have suggested that some apparently arbitrary aspects of language structure may arise from the interaction of a range of factors, from general constraints on learning, to impacts of semantic and pragmatic factors, and concomitant processes of grammaticalization and other aspects of language change. But, intriguingly, it is also possible that many apparently arbitrary aspects of language can be explained by relatively natural cognitive constraints, and hence that language may be rather less arbitrary than at first supposed (e.g., Bates & MacWhinney, 1979, 1987; Bybee, 2007; Elman, 1999; Kirby, 1999; Levinson, 2000; O’Grady, 2005; Tomasello, 2003).

 

 

 

 

8.1. The logical problem of language evolution meets the logical problem of language acquisition

 

The present viewpoint has interesting theoretical implications concerning language acquisition. Children acquire the full complexity of natural language over a relatively short amount of time, from exposure to noisy and partial samples of language. The ability to develop complex linguistic abilities from what appears to be such poor input has led many to speak of the “logical” problem of language acquisition (e.g., Baker & McCarthy, 1981; Hornstein & Lightfoot, 1981). One solution to the problem is to assume that learners have some sort of biological “head-start” in language acquisition, namely that their learning apparatus is precisely meshed with the structure of natural language. This viewpoint is, of course, consistent with theories according to which there is a genetically specified language organ, module or instinct (e.g., Chomsky, 1986, 1993; Crain, 1991; Piattelli-Palmarini, 1989, 1994; Pinker, 1994; Pinker & Bloom, 1990). But it is also consistent with the present view that languages have evolved to be learnable. According to this view, the mesh between language learning and language structure has occurred not because specialized biological machinery embodies the principles that govern natural languages (UG), but rather because the structure of language has evolved to fit with pre-linguistic learning and processing constraints.

 

If language has evolved to be learnable, then the problem of language acquisition may have been mis-analyzed. Language acquisition is frequently viewed as a standard problem of induction (e.g., Gold, 1967; Jain, Osherson, Royer & Sharma, 1999; Osherson, Stob & Weinstein, 1986; Pinker, 1984, 1989), where there is a vast space of possible grammars that are consistent with the linguistic data to which the child is exposed. Accordingly, it is often readily concluded that the child must have innate knowledge of language structure to constrain the space of possible grammars to a manageable size. But, if language is viewed as having been shaped by the brain, then language learning is by no means a standard problem of induction. To give an analogy, according to the standard view of induction, the problem of language acquisition is like being in an unreasonable quiz show, where you have inadequate information, but must somehow guess the “correct” answer. According to the present view, by contrast, there is no externally given correct answer; instead, the task is simply to give the same answer as everybody else, because the structure of language will have adapted to conform to this most “popular” guess. This is a much easier problem: whatever learning biases people have, so long as these biases are shared across individuals, learning should proceed successfully. Moreover, the viewpoint that children learn language using general-purpose cognitive mechanisms, rather than language-specific mechanisms, has also been advocated independently from a variety of different perspectives ranging from usage-based and functional accounts of language acquisition (e.g., Bates & MacWhinney, 1979, 1987; MacWhinney, 1999; Seidenberg, 1997; Seidenberg & MacDonald, 2001; Tomasello, 2000a, 2000b, 2000c, 2003) to cultural transmission views of language evolution (e.g., Davidson, 2003; Donald, 1998; Ragir, 2002; Schoenemann, 1999), to neurobiological approaches to language (e.g., Arbib, 2005; Deacon, 1997; Elman et al., 1996) and formal language theory (Chater & Vitányi, 2007).

 

From this perspective, the problem of language acquisition is very different from learning, say, some aspect of the physical world. In learning naïve physics, the constraints to be learned (e.g., how rigid bodies move, how fluids flow, and so on) are defined by processes outside the cognitive system. External processes define the “right” answers, to which learners must converge. But in language acquisition, the structure of the language to be learned is itself determined by the learning of generations of previous learners (see Zuidema, 2003). Because learners have similar learning biases, this means that the first wild guesses that the learner makes about how some linguistic structure works are likely to be the right guesses. More generally, in language acquisition, the learner’s biases, if shared by other learners, are likely to be helpful in acquiring the language—because the language has been shaped by processes of selection to conform with those biases. This also means that the problem of the poverty of the stimulus (e.g., Chomsky, 1980; Crain, 1991; Crain & Pietroski, 2001) is reduced, because language has been shaped to be learnable from the kind of noisy and partial input available to young children. Thus, language acquisition is constrained by substantial biological constraints—but these constraints emerge from cognitive machinery that is not language-specific.
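
The claim that shared biases make the learner's first guesses likely to be right can be illustrated with a toy transmission chain. Everything in the sketch is a hypothetical simplification: six candidate structures stand in for possible analyses of some construction, the evidence available to each learner rules out a random subset of wrong candidates but never identifies the right one, and a learner's bias is modeled as a preference ranking over the candidates. Each learner's guess becomes the language of the next generation.

```python
import random

OPTIONS = list(range(6))  # six hypothetical candidate structures

def ambiguous_data(target, rng):
    """Evidence rules out some options but never the one actually in use."""
    return [o for o in OPTIONS if o == target or rng.random() < 0.5]

def guess(candidates, ranking):
    """Pick the candidate that the learner's bias ranks highest."""
    return min(candidates, key=ranking.index)

def agreement_rate(shared_bias, generations=200, seed=0):
    """Fraction of learners whose guess matches the language they heard."""
    rng = random.Random(seed)
    common = OPTIONS[:]        # one ranking shared by every learner
    language = OPTIONS[-1]     # start from the least-preferred structure
    hits = 0
    for _ in range(generations):
        # Idiosyncratic learners each get their own random ranking.
        ranking = common if shared_bias else rng.sample(OPTIONS, len(OPTIONS))
        learned = guess(ambiguous_data(language, rng), ranking)
        hits += learned == language
        language = learned     # the learner becomes the next teacher
    return hits / generations

shared = agreement_rate(True)
idiosyncratic = agreement_rate(False)
```

When the bias is shared, the language can only move towards structures that the bias prefers, so after at most a handful of mismatches it settles on a structure that every subsequent learner guesses correctly from the same impoverished evidence. When each learner's bias is idiosyncratic, the evidence is no more ambiguous, yet learners frequently fail to recover the language they were exposed to.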

 

 

 

 

8.2. Natural selection for functional aspects of language?

 

It is important to emphasize what our arguments are not intended to show. In particular, we are not suggesting that biological adaptation is not relevant for language. Indeed, it seems likely that a number of preadaptations for language might have occurred (see Hurford, 2003, for a review), such as the ability to represent discrete symbols (Deacon, 1997; Tomasello, 2003), to reason about other minds (Malle, 2002), to understand and share intentions (Tomasello, 2003; Tomasello et al., 2005), and to perform pragmatic reasoning (Levinson, 2000); there may also be a connection with the emergence of an exceptionally prolonged childhood (Locke & Bogin, 2006). Similarly, biological adaptations might have led to improvements to the cognitive systems that support language, including increased working memory capacity (Gruber, 2002), domain-general capacities for word learning (Bloom, 2001), and complex hierarchical sequential learning abilities (Calvin, 1994; Conway & Christiansen, 2001; Greenfield, 1991; Hauser et al., 2002), though these adaptations are likely to have been for improved cognitive skills rather than for language.

 

Some language-specific adaptations may nonetheless have occurred as well, but given our arguments above these would only be for functional features of language, and not the arbitrary features of UG. For example, changes to the human vocal tract may have resulted in more intelligible speech (Lieberman, 1984, 1991, 2003; though see also Hauser & Fitch, 2003); selectional pressure for this functional adaptation might apply relatively independently of the particular language. Similarly, it remains possible that the Baldwin effect may be invoked to explain cognitive adaptations to language, provided that these adaptations are to functional aspects of language, rather than putatively arbitrary linguistic structures. For example, it has been suggested that there might be a specialized perception apparatus for speech (e.g., Vouloumanos & Werker, 2007), or enhancement of the motor control system for articulation (e.g., Studdert-Kennedy & Goldstein, 2003). But explaining innate adaptations even in these domains is likely to be difficult because, if adaptation to language occurs at all, it is likely to occur not merely to functionally universal features (e.g., the fact that languages segment into words), but to specific cues for those features (e.g., for segmenting those words in the current linguistic environment, which differ dramatically across languages; Cutler, Mehler, Norris & Segui, 1986; Otake, Hatano, Cutler & Mehler, 1993). Hence, adaptationist explanations, even for functional aspects of language and language processing, should be treated with considerable caution.

 

8.3. Implications for the co-evolution of genes and culture

 

Our argument may, though, have applications beyond language. Many theorists have suggested that, just as there are specific genetic adaptations to language, there may also be specific genetic adaptations to other cultural domains. The arguments we have outlined against biological adaptationism in language evolution appear to apply equally to rule out putative co-evolution of the brain with any rapidly changing and highly varied aspect of human culture—from marriage practices and food sharing practices, to music and art, to folk theories of religion, science or mathematics. We speculate that, in each case, the apparent fit between culture and the brain arises primarily because culture has been shaped to fit with our prior cognitive biases. Thus, by analogy with language, we suggest that nativist arguments across these domains might usefully be re-evaluated, from the perspective that culture may have adapted to cognition much more substantially than cognition has adapted to culture.

 

In summary, we have argued that the notion of UG is subject to a logical problem of language evolution, whether it is suggested to be the result of gradual biological adaptation or other nonadaptationist factors. Instead, we have proposed to explain the close fit between language and learners as arising from the fact that language is shaped by the brain, rather than the reverse.

 

 

Acknowledgments

 

This research was partially supported by the Human Frontiers Science Program grant RGP0177/2001-B. MHC was supported by a Charles A. Ryskamp Fellowship from the American Council of Learned Societies and by the Santa Fe Institute; NC was supported by a Major Research Fellowship from the Leverhulme Trust. The work presented has benefited from discussions with Andy Clark, Jeff Elman, Robert Foley, Anita Govindjee, Ray Jackendoff, Stephen Mithen, Jennifer Misyak, and David Rand. We are also grateful for the comments on a previous version of this paper from Paul Bloom, Michael Corballis, Adele Goldberg, 6 anonymous BBS reviewers, as well as Christopher Conway, Rick Dale, Lauren Emberson, and Thomas Farmer.

 

References

 

Abzhanov, A., Kuo, W.P., Hartmann, C., Grant, B.R., Grant, P.R. & Tabin, C.J. (2006). The calmodulin pathway and evolution of elongated beak morphology in Darwin's finches. Nature, 442, 563-567.

Abzhanov, A., Protas, M., Grant, B.R., Grant, P.R. & Tabin, C.J. (2004). Bmp4 and morphological variation of beaks in Darwin's finches. Science, 305, 1462-1465.

Alter, S. (1998). Darwinism and the linguistic image: Language, race, and natural theology in the nineteenth century. Baltimore, MD: Johns Hopkins University Press.

Andersen, H. (1973). Abductive and deductive change. Language, 49, 765-793.

Arbib, M.A. (2005). From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics. Behavioral & Brain Sciences, 28, 105-124.

Baker, C. L. & McCarthy, J. J. (Eds.) (1981). The logical problem of language acquisition. Cambridge, MA: MIT Press.

Baker, M.C. (2001). The atoms of language: The mind’s hidden rules of grammar. New York: Basic Books.

Baker, M.C. (2003). Language differences and language design. Trends in Cognitive Sciences, 7, 349-353.

Baldwin, J.M. (1896). A new factor in evolution. American Naturalist, 30, 441-451.

Batali, J. (1998). Computational simulations of the emergence of grammar. In J.R. Hurford, M. Studdert-Kennedy & C. Knight (Eds.), Approaches to the evolution of language: Social and cognitive bases (pp. 405-426). Cambridge: Cambridge University Press.

Bates, E., & MacWhinney, B. (1979). A functionalist approach to the acquisition of grammar. In E. Ochs & B. Schieffelin (Eds.), Developmental pragmatics (pp. 167-209). New York: Academic Press.

Bates, E. & MacWhinney, B. (1987). Competition, variation, and language learning. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 157-193). Hillsdale, NJ: Erlbaum.

Beer, G. (1996). Darwin and the growth of language theory. In Open fields: Science in cultural encounter (pp. 95-114). Oxford: Oxford University Press.

Beja-Pereira, A., Luikart, G., England, P. R., Bradley, D. G., Jann, O. C., Bertorelle, G., Chamberlain, A. T., Nunes, T. P., Metodiev, S., Ferrand, N., & Erhardt, G. (2003). Gene-culture coevolution between cattle milk protein genes and human lactase genes. Nature Genetics, 35, 311-313.

Berwick, R.C. & Weinberg, A.S. (1984). The grammatical basis of linguistic performance: language use and acquisition. Cambridge, MA: MIT Press.

Bever, T.G. (1970). The cognitive basis for linguistic structures. In R. Hayes (Ed.), Cognition and language development (pp. 277-360). New York: Wiley & Sons.

Bever, T.G. & Langendoen, D.T. (1971). A dynamic model of the evolution of language. Linguistic Inquiry, 2, 433-463.

Bickerton, D. (1984). The language bio-program hypothesis. Behavioral and Brain Sciences, 7, 173-212.

Bickerton, D. (1995). Language and human behavior. Seattle, WA: University of Washington Press.

Bickerton, D. (2003). Symbol and structure: a comprehensive framework for language evolution. In M.H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 77-93). Oxford: Oxford University Press.

Blackburn, S. (1984). Spreading the word. Oxford: Oxford University Press.

Blackmore, S.J. (1999). The meme machine. Oxford: Oxford University Press.

Bloom, P. (2001). Précis of How children learn the meanings of words. Behavioral and Brain Sciences, 24, 1095-1103.

Boeckx, C. (2006). Linguistic minimalism: Origins, concepts, methods, and aims. New York: Oxford University Press.

Borer, H. (1984). Parametric syntax: Case studies in Semitic and Romance languages. Dordrecht: Foris.

Boyd, R. & Richerson, P.J. (2005). The origin and evolution of cultures. Oxford: Oxford University Press.

Bresnan, J. (1982). The mental representation of grammatical relations. Cambridge, MA: MIT Press.

Briscoe, E.J. (2003). Grammatical assimilation. In M.H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 295-316). Oxford: Oxford University Press.

Brooks, P.J., Braine, M.D.S., Catalano, L., Brody, R.E., & Sudhalter, V. (1993). Acquisition of gender-like noun subclasses in an artificial language:  The contribution of phonological markers to learning. Journal of Memory and Language, 32, 76-95.

Bybee, J.L. (2002). Sequentiality as the basis of constituent structure. In T. Givón, & B. Malle (Eds.), The evolution of language out of pre-language (pp. 107-132). Philadelphia, PA: John Benjamins.

Bybee, J.L. (2007). Frequency of use and the organization of language. New York: Oxford University Press.

Bybee, J.L. (in press). Language universals and usage-based theory. In M.H. Christiansen, C. Collins & S. Edelman (Eds.), Language universals. New York: Oxford University Press.

Bybee, J.L., Perkins, R.D. & Pagliuca, W. (1994). The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago: University of Chicago Press.

Calvin, W.H. (1994). The emergence of intelligence. Scientific American, 271, 100-107.

Campbell, D.T. (1965). Variation and selective retention in socio-cultural evolution. In H. R. Barringer, G.I. Blanksten and R.W. Mack (Eds.), Social change in developing areas: A reinterpretation of evolutionary theory (pp. 19-49). Cambridge, MA: Schenkman.

Cannon, G. (1991). Jones's “Spring from some common source”: 1786–1986. In S. M. Lamb and E. D. Mitchell (Eds.), Sprung from some common source: Investigations into the pre-history of languages. Stanford, CA: Stanford University Press.

Cavalli-Sforza, L.L. & Feldman, M.W. (2003). The application of molecular genetic approaches to the study of human evolution. Nature Genetics, 33, 266-275.

Chater, N. (2005). Mendelian and Darwinian views of memes and cultural change. In S. Hurley, & N. Chater (Eds.) Perspectives on imitation: From neuroscience to social science (Vol. 2) (pp. 355-362). Cambridge, MA: MIT Press.

Chater, N. & Vitányi, P. (2007). ‘Ideal learning’ of natural language: Positive results about learning from positive evidence. Journal of Mathematical Psychology, 51, 135-163.

Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.

Chomsky, N. (1972). Language and mind. Harcourt, Brace and World (extended edition).

Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.

Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris Publications.

Chomsky, N. (1986). Knowledge of language. New York: Praeger.

Chomsky, N. (1988). Language and the problems of knowledge. The Managua Lectures. Cambridge, MA: MIT Press.

Chomsky, N. (1993). Language and thought. Wakefield, RI: Moyer Bell.

Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.

Chomsky, N. (2005). Three factors in language design. Linguistic Inquiry, 36, 1-22.

Christiansen, M.H. (1994). Infinite languages, finite minds: Connectionism, learning and linguistic structure. Unpublished doctoral dissertation, Centre for Cognitive Science, University of Edinburgh, U.K.

Christiansen, M.H. (2000). Using artificial language learning to study language evolution: Exploring the emergence of word order universals. In J. L. Dessalles & L. Ghadakpour (Eds.), The Evolution of Language: 3rd International Conference (pp. 45-48). Paris, France: Ecole Nationale Supérieure des Télécommunications.

Christiansen, M.H., Collins, C. & Edelman, S. (Eds.) (in press). Language universals. New York: Oxford University Press.

Christiansen, M.H., Conway, C.M. & Onnis, L. (2007). Overlapping neural responses to structural incongruencies in language and statistical learning point to similar underlying mechanisms. In Proceedings of the 29th Annual Cognitive Science Society Conference (pp. 173-178). Mahwah, NJ: Lawrence Erlbaum.

Christiansen, M.H. & Dale, R. (2004). The role of learning and development in the evolution of language. A connectionist perspective. In D. Kimbrough Oller & U. Griebel (Eds.), Evolution of communication systems: A comparative approach. The Vienna Series in Theoretical Biology (pp. 90-109). Cambridge, MA: MIT Press.

Christiansen, M.H. & Devlin, J.T. (1997). Recursive inconsistencies are hard to learn: A connectionist perspective on universal word order correlations. In Proceedings of the 19th Annual Cognitive Science Society Conference (pp. 113-118). Mahwah, NJ: Lawrence Erlbaum.

Christiansen, M.H., Kelly, L., Shillcock, R. & Greenfield, K. (2007). Impaired artificial grammar learning in agrammatism. Submitted manuscript.

Christiansen, M.H., Reali, F. & Chater, N. (2006). The Baldwin effect works for functional, but not arbitrary, features of language. In A. Cangelosi, A. Smith & K. Smith (Eds.), Proceedings of the Sixth International Conference on the Evolution of Language (pp. 27-34). London: World Scientific Publishing.

Christiansen, M.H. & Reeder, P.A. (2006). Cognitive constraints on word order universals: Evidence from connectionist modeling and artificial grammar learning. Manuscript in preparation.

Clark, H.H. (1975). Bridging. In R.C. Schank & B.L. Nash-Webber (Eds.), Theoretical issues in natural language processing (pp. 169-174). New York: Association for Computing Machinery.

Conway, C.M., & Christiansen, M.H. (2001). Sequential learning in non-human primates. Trends in Cognitive Sciences, 5, 539-546.

Conway, C.M., Karpicke, J. & Pisoni, D.B. (2007). Contribution of implicit sequence learning to spoken language processing: Some preliminary findings with hearing adults. Journal of Deaf Studies and Deaf Education, 12, 317-334.

Corballis, M.C. (1992). On the evolution of language and generativity. Cognition, 44, 197-226.

Corballis, M.C. (2003). From hand to mouth: The gestural origins of language. In M.H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 201-218). Oxford: Oxford University Press.

Cornish, H. (2006). Iterated learning with human subjects: An empirical framework for the emergence and cultural transmission of language. Unpublished Master's thesis, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, U.K.

Crain, S. (1991). Language acquisition in the absence of experience. Behavioral and Brain Sciences, 14, 597-650.

Crain, S., Goro, T. & Thornton, R. (2006). Language acquisition is language change. Journal of Psycholinguistic Research, 35, 31-49.

Crain, S., & Pietroski, P. (2001). Nature, nurture and universal grammar. Linguistics and Philosophy, 24, 139–186.

Crain, S., & Pietroski, P. (2006). Is Generative Grammar deceptively simple or simply deceptive? Lingua, 116, 64-68.

Croft, W. (2000). Explaining language change: an evolutionary approach. Harlow, Essex: Longman.

Croft, W. (2001). Radical construction grammar: Syntactic theory in typological perspective. New York: Oxford University Press.

Croft, W. & Cruse, D. A. (2004). Cognitive linguistics. Cambridge, UK: Cambridge University Press.

Culicover, P.W. (1999). Syntactic nuts. Oxford: Oxford University Press.

Culicover, P.W. & Jackendoff, R. (2005). Simpler syntax. New York: Oxford University Press.

Curtin, S., Mintz, T.H. & Christiansen, M.H. (2005). Stress changes the representational landscape: Evidence from word segmentation. Cognition, 96, 233-262.

Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385-400.

Dabrowska, E. (1997). The LAD goes to school: A cautionary tale for nativists. Linguistics, 35, 735-766.

Darwin, C. (1900). The descent of man, and selection in relation to sex (2nd Edition). New York: P.F. Collier and Son.

Davidson, I. (2003). The archaeological evidence of language origins: States of art. In M.H. Christiansen, & S. Kirby (Eds.), Language evolution (pp. 140-157). New York: Oxford University Press.

Davies, A. M. (1987). “Organic” and “Organism” in Franz Bopp. In H. M. Hoenigswald and L. F. Wiener (Eds.), Biological metaphor and cladistic classification (pp. 81–107). Philadelphia, PA: University of Pennsylvania Press.

Dawkins, R. (1976). The selfish gene. New York: Oxford University Press.

Dawkins, R. (1986). The blind watchmaker: Why the evidence of evolution reveals a universe without design. Harmondsworth, UK: Penguin.

Deacon, T.W. (1997). The symbolic species: The co-evolution of language and the brain. New York: W.W. Norton.

Dediu, D. & Ladd, D.R. (2007). Linguistic tone is related to the population frequency of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin. Proceedings of the National Academy of Sciences, 104, 10944-10949.

Dennett, D. C. (1995). Darwin's dangerous idea: Evolution and the meanings of life. New York: Simon & Schuster.

de Vries, M., Monaghan, P., Knecht, S. & Zwitserlood, P. (in press). Syntactic structure and artificial grammar learning: The learnability of embedded hierarchical structures. Cognition.

Diamond, J. (1992). The third chimpanzee: The evolution and future of the human animal. New York: Harper Collins.

Diamond, J. (1997). Guns, germs, and steel: The fates of human societies. New York: Harper Collins.

Donald, M. (1998). Mimesis and the executive suite: Missing links in language evolution. In J.R. Hurford, M. Studdert-Kennedy and C. Knight (Eds.), Approaches to the evolution of language (pp. 44-67). Cambridge, U.K.: Cambridge University Press.

Dryer, M. S. (1992). The Greenbergian word order correlations. Language, 68, 81–138.

Dunbar, R.I.M. (2003). The origin and subsequent evolution of language. In M.H. Christiansen, & S. Kirby (Eds.), Language evolution (pp. 219-234). New York: Oxford University Press.

Ellefson, M.R. & Christiansen, M.H. (2000). Subjacency constraints without universal grammar: Evidence from artificial language learning and connectionist modeling. In The Proceedings of the 22nd Annual Conference of the Cognitive Science Society (pp. 645-650). Mahwah, NJ: Lawrence Erlbaum.

Elman, J.L. (1999). Origins of language: A conspiracy theory. In B. MacWhinney (Ed.), The emergence of language. Hillsdale, NJ: Lawrence Erlbaum.

Elman, J.L., Bates, E.A., Johnson, M.H., Karmiloff-Smith, A., Parisi, D. & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.

Emlen, S.T. (1970). Celestial rotation: Its importance in the development of migratory orientation. Science, 170, 1198-1201.

Enard, W., Przeworski, M., Fisher, S. E., Lai, C. S. L., Wiebe, V., Kitano, T., et al. (2002). Molecular evolution of FOXP2, a gene involved in speech and language. Nature, 418, 869-872.

Everett, D.L. (2005). Cultural constraints on grammar and cognition in Pirahã. Current Anthropology, 46, 621-646.

Everett, D.L. (2007). Cultural constraints on grammar in Pirahã: A reply to Nevins, Pesetsky, and Rodrigues (2007) [On-line]. Available: http://ling.auf.net/lingBuzz/000427.

Farmer, T.A., Christiansen, M.H. & Monaghan, P. (2006). Phonological typicality influences on-line sentence comprehension. Proceedings of the National Academy of Sciences, 103, 12203-12208.

Fisher, S.E. (2006). Tangled webs: Tracing the connections between genes and cognition. Cognition, 101, 270-297.

Fitneva, S.A., & Spivey, M.J. (2004). Context and language processing: The effect of authorship. In J.C. Trueswell & M.K. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the language-as-product and language-as-action traditions (pp. 317-328). Cambridge, MA: MIT Press.

Fleischman, S. (1982). The future in thought and language: Diachronic evidence from Romance. Cambridge, U.K.: Cambridge University Press.

Fodor, J. A. (1975). The language of thought. Cambridge, MA: Harvard University Press.

Frean, M.R. & Abraham, E.R. (2004). Adaptation and enslavement in endosymbiont-host associations. Physical Review E, 69, 051913.

Friederici, A. D., Bahlmann, J., Heim, S., Schubotz, R.I. & Anwander, A. (2006). The brain differentiates human and non-human grammars: Functional localization and structural connectivity. Proceedings of the National Academy of Sciences, 103, 2458-2463.

Friederici, A. D., Steinhauer, K., & Pfeifer, E. (2002). Brain signatures of artificial language processing: Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences of the United States of America, 99, 529-534.

Frigo, L. & McDonald, J.L. (1998). Properties of phonological markers that affect the acquisition of gender-like subclasses. Journal of Memory and Language, 39, 218-245.

Gerhart, J. & Kirschner, M. (1997). Cells, embryos and evolution: Toward a cellular and developmental understanding of phenotypic variation and evolutionary adaptability. Cambridge, U.K.: Blackwell.

Givón, T. (1979). On understanding grammar. New York: Academic Press.

Givón, T. (1998). On the co-evolution of language, mind and brain. Evolution of Communication, 2, 45-116.

Givón, T. & Malle, B. F. (Eds.) (2002). The evolution of language out of pre-language. Amsterdam: Benjamins.

Gold, E. (1967). Language identification in the limit. Information and Control, 10, 447-474.

Goldberg, A.E. (2006). Constructions at work: The nature of generalization in language. New York: Oxford University Press.

Goldsby, R. A., Kindt, T. J., Osborne, B. A., & Kuby, J. (2003). Immunology (5th Edition). New York: W.H. Freeman and Company.

Golinkoff, R.M., Hirsh-Pasek, K., Bloom, L., Smith, L., Woodward, A., Akhtar, N., Tomasello, M., & Hollich, G. (Eds.) (2000). Becoming a word learner: A debate on lexical acquisition. New York: Oxford University Press.

Gómez, R.L. (2002). Variability and detection of invariant structure. Psychological Science, 13, 431-436.

Gómez, R.L., & Gerken, L.A. (2000). Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4, 178-186.

Gould, S.J. (1993). Eight little piggies: Reflections in natural history. New York: Norton.

Gould, S. J. (2002). The structure of evolutionary theory. Cambridge, MA: Harvard University Press. 

Gould, S. J. & Lewontin, R. C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society of London (Series B), 205, 581-598. 

Gould, S.J. & Vrba, E.S. (1982). Exaptation - a missing term in the science of form. Paleobiology, 8, 4-15.

Gray, R. D. & Atkinson, Q. D. (2003). Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature, 426, 435-439.

Green, T. R. G. (1979). Necessity of syntax markers: 2 Experiments with artificial languages. Journal of Verbal Learning and Verbal Behavior, 18, 481-496.

Greenfield, P.M. (1991). Language, tools and brain: The ontogeny and phylogeny of hierarchically organized sequential behavior. Behavioral and Brain Sciences, 14, 531-595.

Grice, H. P. (1967). Logic and conversation. William James Lectures, Ms., Harvard University.

Gruber, O. (2002). The co-evolution of language and working memory capacity in the human brain. In M.I. Stamenov & V. Gallese (Eds.), Mirror neurons and the evolution of brain and language (pp. 77–86). Amsterdam: John Benjamins.

Haber, R. N. (1983). The impending demise of the icon: the role of iconic processes in information processing theories of perception (with commentaries). Behavioral and Brain Sciences, 6, 1-55.

Hamilton, W. D. (1964). The genetical evolution of social behaviour. Journal of Theoretical Biology, 7, 1-52.

Hampe, B. (Ed.) (2006). From perception to meaning: Image schemas in cognitive linguistics. Berlin: Mouton de Gruyter.

Hare, M. & Elman, J.L. (1995). Learning and morphological change. Cognition, 56, 61-98.

Hauser, M.D. (2001). Wild minds: What animals really think. New York: Owl Books.

Hauser, M.D., Chomsky, N. & Fitch, W.T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569-1579.

Hauser, M.D. & Fitch, W.T. (2003). What are the uniquely human components of the language faculty? In M.H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 158-181). Oxford: Oxford University Press.

Hawkins, J.A. (1994). A performance theory of order and constituency. Cambridge: Cambridge University Press.

Hawkins, J.A. (2004). Efficiency and complexity in grammars. Oxford: Oxford University Press.

Hawks, J.D., Hunley, K., Lee, S-H. & Wolpoff, M. (2000). Population bottlenecks and Pleistocene human evolution. Molecular Biology and Evolution, 17, 2-22.

Hecht Orzack, S. & Sober, E. (Eds.) (2001). Adaptationism and optimality. Cambridge: Cambridge University Press.

Heine, B. (1991). Grammaticalization. Chicago: University of Chicago Press.

Heine, B. & Kuteva, T. (2002). On the evolution of grammatical forms. In A. Wray (Ed.), Transitions to language (pp. 376-397). Oxford, U.K.: Oxford University Press.

Hinton, G.E. & Nowlan, S.J. (1987). How learning can guide evolution. Complex Systems, 1, 495-502.

Hoen, M., Golembiowski, M., Guyot, E., Deprez, V., Caplan, D. & Dominey, P.F. (2003). Training with cognitive sequences improves syntactic comprehension in agrammatic aphasics. NeuroReport, 14, 495-499.

Hopper, P. & Traugott, E. (1993). Grammaticalization. Cambridge, UK: Cambridge University Press.

Hornstein, N. (2001). Move! A minimalist approach to construal. Oxford: Blackwell.

Hornstein, N. & Boeckx, C. (in press). Universals in light of the varying aims of linguistic theory. In M.H. Christiansen, C. Collins & S. Edelman (Eds.), Language universals. New York: Oxford University Press.

Hornstein, N. & Lightfoot, D. (Eds.) (1981). Explanations in linguistics: The logical problem of language acquisition. London: Longman.

Hsu, H.-J., Christiansen, M.H., Tomblin, J.B., Zhang, X. & Gómez, R.L. (2006). Statistical learning of nonadjacent dependencies in adolescents with and without language impairment. Poster presented at the 2006 Symposium on Research in Child Language Disorders, Madison, WI.

Huang, Y. (2000). Anaphora: A cross-linguistic study. Oxford: Oxford University Press.

Hudson Kam, C.L. & Newport, E.L. (2005). Regularizing unpredictable variation: The roles of adult and child learners in language formation and change. Language Learning and Development, 1, 151-195.

Hurford, J. (1990). Nativist and functional explanations in language acquisition. In I. M. Roca (Ed.), Logical issues in language acquisition (pp. 85-136). Dordrecht: Foris.

Hurford, J.R. (1991). The evolution of the critical period for language learning. Cognition, 40, 159-201.

Hurford, J.R. (2003). The language mosaic and its evolution. In M.H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 38-57). Oxford: Oxford University Press.

Hurford, J.R. & Kirby, S. (1999). Co-evolution of language size and the critical period. In D. Birdsong (Ed.), Second language acquisition and the critical period hypothesis (pp. 39-63). Mahwah, NJ: Erlbaum.

Jablonka, E. & Lamb, M.J. (1989). The inheritance of acquired epigenetic variations. Journal of Theoretical Biology, 139, 69-83.

Jackendoff, R. (2002). Foundations of language: Brain, meaning, grammar, evolution. New York: Oxford University Press. 

Jain, S., Osherson, D., Royer, J., & Sharma, A. (1999). Systems that learn (2nd ed.). Cambridge, MA: MIT Press.

Jenkins, L. (2000). Biolinguistics: Exploring the biology of language. Cambridge: Cambridge University Press.

Johansson, S. (2006). Working backwards from modern language to proto-grammar. In A. Cangelosi, A.D.M. Smith, & K. Smith (Eds.), The Evolution of Language (pp. 160-167). Singapore: World Scientific.

Juliano, C. & Tanenhaus, M.K. (1994). A constraint-based lexicalist account of the subject/object attachment preference. Journal of Psycholinguistic Research, 23, 459-471.

Kamide, Y., Altmann, G.T.M., & Haywood, S. (2003). The time-course of prediction in incremental sentence processing: Evidence from anticipatory eye-movements. Journal of Memory and Language, 49, 133-159.

Kaschak, M.P. & Glenberg, A.M. (2004). This construction needs learned. Journal of Experimental Psychology: General, 133, 450-467.

Kauffman, S. A. (1995). The origins of order: Self-organization and selection in evolution. Oxford: Oxford University Press. 

Keller, R. (1994). On language change: The invisible hand in language. London: Routledge.

Kirby, S. (1998). Fitness and the selective adaptation of language. In J.R. Hurford, M. Studdert-Kennedy, & C. Knight (Eds.), Approaches to the evolution of language: Social and cognitive bases (pp. 359-383). New York: Cambridge University Press.

Kirby, S. (1999). Function, selection and innateness: The emergence of language universals. Oxford: Oxford University Press.

Kirby, S., Dowman, M., & Griffiths, T. (2007). Innateness and culture in the evolution of language. Proceedings of the National Academy of Sciences, 104, 5241-5245.

Kirby, S. & Hurford, J. (1997). Learning, culture and evolution in the origin of linguistic constraints. In P. Husbands and I. Harvey (Eds.), ECAL97 (pp. 493-502). Cambridge, MA: MIT Press.

Kirby, S. & Hurford, J.R. (2002). The emergence of linguistic structure: An overview of the iterated learning model. In A. Cangelosi & D. Parisi (Eds.), Simulating the evolution of language (pp. 121-148). London: Springer Verlag.

Kuhl, P.K. (1987). The special mechanisms debate in speech research: Categorization tests on animals and infants. In S. Harnad (Ed.), Categorical perception: The groundwork of cognition (pp. 355-386). Cambridge: Cambridge University Press.

Kvasnicka, V., & Pospichal, J. (1999). An emergence of coordinated communication in populations of agents. Artificial Life, 5, 318-342.

Lai, C. S.L., Fisher, S.E., Hurst, J. A., Vargha-Khadem, F., & Monaco, A.P. (2001). A forkhead-domain gene is mutated in a severe speech and language disorder. Nature, 413, 519-523.

Lai, C. S. L., Gerrelli, D., Monaco, A. P., Fisher, S. E., & Copp, A. J. (2003). FOXP2 expression during brain development coincides with adult sites of pathology in a severe speech and language disorder. Brain, 126, 2455–2462.

Lakoff, G. and Johnson, M. (1980). Metaphors we live by. Chicago: University of Chicago Press.

Lanyon, S.J. (2006). A saltationist approach for the evolution of human cognition and language. In A. Cangelosi, A.D.M. Smith, & K. Smith (Eds.), The Evolution of Language (pp. 176-183). Singapore: World Scientific.

Lashley, K.S. (1951). The problem of serial order in behavior. In L.A. Jeffress (Ed.), Cerebral mechanisms in behavior (pp. 112-146). New York: Wiley.

Laubichler, M.D. & Maienschein, J. (Eds.) (2007). From embryology to evo-devo: A history of developmental evolution. Cambridge, MA: MIT Press.

Levinson, S.C. (1987a). Pragmatics and the grammar of anaphora: A partial pragmatic reduction of binding and control phenomena. Journal of Linguistics, 23, 379-434.

Levinson, S.C. (1987b). Minimization and conversational inference. In M. Papi and J. Verschueren (Eds.), The Pragmatic Perspective: Proceedings of the International Conference on Pragmatics at Viareggio (pp. 61-129). Amsterdam: J. Benjamins.

Levinson, S.C. (2000). Presumptive meanings: The theory of generalized conversational implicature. Cambridge, MA: MIT Press.

Lewontin, R.C. (1998). The evolution of cognition: Questions we will never answer. In D. Scarborough & S. Sternberg (Eds.), An invitation to cognitive science, Volume 4: Methods, models, and conceptual issues. Cambridge, MA: MIT Press.

Li, M. & Vitányi, P. (1997). An introduction to Kolmogorov complexity and its applications (2nd ed). Berlin: Springer.

Lieberman, P. (1984). The biology and evolution of language. Cambridge, MA: Harvard University Press.

Lieberman, P. (1991). Speech and brain evolution. Behavioral and Brain Sciences, 14, 566-568.

Lieberman, P. (2003). Motor control, speech, and the evolution of human language. In M.H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 255-271). New York: Oxford University Press.

Lightfoot, D. (2000). The spandrels of the linguistic genotype. In C. Knight, M. Studdert-Kennedy & J.R. Hurford (Eds.), The evolutionary emergence of language: Social function and the origins of linguistic form (pp. 231-247). Cambridge, U.K.: Cambridge University Press.

Lively, S.E., Pisoni, D.B. & Goldinger, S.D. (1994). Spoken word recognition. In M.A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 265-318). San Diego, CA: Academic Press.

Livingstone, D., & Fyfe, C. (2000). Modelling language-physiology coevolution. In C. Knight, M. Studdert-Kennedy and J. R. Hurford (Eds.), The evolutionary emergence of language: Social function and the origins of linguistic form (pp. 199-215). Cambridge, U.K.: Cambridge University Press.

Locke, J.L. & Bogin, B. (2006). Language and life history: A new perspective on the development and evolution of human language. Behavioral & Brain Sciences, 29, 259-280.

Lupyan, G. & Christiansen, M.H. (2002). Case, word order, and language learnability: Insights from connectionist modeling. In Proceedings of the 24th Annual Conference of the Cognitive Science Society (pp. 596-601). Mahwah, NJ: Lawrence Erlbaum.

MacDermot, K. D., Bonora, E., Sykes, N., Coupe, A. M., Lai, C. S. L., Vernes, S. C., et al. (2005). Identification of FOXP2 truncation as a novel cause of developmental speech and language deficits. American Journal of Human Genetics, 76, 1074–1080.

MacDonald, M.C. & Christiansen, M.H. (2002). Reassessing working memory: A comment on Just & Carpenter (1992) and Waters & Caplan (1996). Psychological Review, 109, 35-54.

MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676-703.

Mackay, D. J. C. (2003). Information theory, inference, and learning algorithms. Cambridge: Cambridge University Press.

MacNeilage, P.F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Sciences, 21, 499-511.

MacWhinney, B. (Ed.) (1999). The emergence of language. Mahwah, NJ: Erlbaum.

Maess, B., Koelsch, S., Gunter, T.C. & Friederici A.D. (2001). Musical syntax is processed in Broca’s area: an MEG study. Nature Neuroscience, 4, 540–545.

Malle, B.F. (2002). The relation between language and theory of mind in development and evolution. In T. Givón, & B. Malle (Eds.), The evolution of language out of pre-language (pp. 265-284). Philadelphia, PA: John Benjamins.

Marcus, G.F. (2004). The birth of the mind: How a tiny number of genes creates the complexities of human thought. New York: Basic Books.

Maynard-Smith, J. (1978). Optimization theory in evolution. Annual Review of Ecology and Systematics, 9, 31-56.

McClintock, B. (1950). The origin and behavior of mutable loci in maize. Proceedings of the National Academy of Sciences, 36, 344–355.

McMahon, A.M.S. (1994). Understanding language change. Cambridge: Cambridge University Press.

Monaghan, P., Chater, N. & Christiansen, M.H. (2005). The differential role of phonological and distributional cues in grammatical categorisation. Cognition, 96, 143-182.

Monaghan, P. & Christiansen, M.H. (in press). Integration of multiple probabilistic cues in syntax acquisition. In H. Behrens (Ed.), Trends in corpus research: Finding structure in data (TILAR Series). Amsterdam: John Benjamins.

Morgan, J.L. & Demuth, K. (1996). Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ: Lawrence Erlbaum Associates.

Morgan, J.L., Meier, R.P., & Newport, E.L. (1987). Structural packaging in the input to language learning: Contributions of prosodic and morphological marking of phrases to the acquisition of language. Cognitive Psychology, 19, 498-550.

Munroe, S. & Cangelosi, A. (2002). Learning and the evolution of language: The role of cultural variation and learning cost in the Baldwin Effect. Artificial Life, 8, 311-339.

Murphy, G. L. (2002). The big book of concepts. Cambridge, MA: MIT Press.

Nerlich, B. (1989). The evolution of the concept of ‘linguistic evolution’ in the 19th and 20th century. Lingua, 77, 101–112.

Nettle, D. & Dunbar, R.I.M. (1997). Social markers and the evolution of reciprocal exchange. Current Anthropology, 38, 93-99.

Nevins, A., Pesetsky, D. & Rodrigues, C. (2007). Pirahã exceptionality: A reassessment [On-line]. Available: http://ling.auf.net/lingBuzz/000411.

Newmeyer, F.J. (1991). Functional explanation in linguistics and the origins of language. Language and Communication, 11, 3-28.

Newmeyer, F. (2003). What can the field of linguistics tell us about the origin of language? In M.H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 58-76). New York: Oxford University Press.

Newport, E.L. & Aslin, R.N. (2004). Learning at a distance: I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48, 127-162.

Nowak, M.A., Komarova, N.L. & Niyogi, P. (2001). Evolution of universal grammar. Science, 291, 114-118.

Odling-Smee, F.J., Laland, K.N. & Feldman, M.W. (2003). Niche construction: The neglected process in evolution. Princeton, NJ: Princeton University Press.

O’Grady, W. (2005). Syntactic carpentry: An emergentist approach to syntax. Mahwah, NJ: Erlbaum.

Onnis, L., Christiansen, M.H., Chater, N. & Gómez, R. (2003). Reduction of uncertainty in human sequential learning: Evidence from artificial grammar learning. In Proceedings of the 25th Annual Conference of the Cognitive Science Society (pp. 886-891). Mahwah, NJ: Lawrence Erlbaum.

Onnis, L., Monaghan, P., Chater, N. & Richmond, K. (2005). Phonology impacts segmentation in speech processing. Journal of Memory and Language, 53, 225-237.

Osherson, D., Stob, M. & Weinstein, S. (1986). Systems that learn. Cambridge, MA: MIT Press.

Otake, T., Hatano, G., Cutler, A., & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 258-278.

Packard, M. & Knowlton, B. (2002). Learning and memory functions of the basal ganglia. Annual Review of Neuroscience, 25, 563-593.

Patel, A. D., Gibson, E., Ratner, J., Besson, M., & Holcomb, P. J. (1998). Processing syntactic relations in language and music: An event-related potential study. Journal of Cognitive Neuroscience, 10, 717-733.

Pearlmutter, N.J. & MacDonald, M.C. (1995). Individual differences and probabilistic constraints in syntactic ambiguity resolution. Journal of Memory and Language, 34, 521-542.

Peña, M., Bonatti, L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298, 604-607.

Pennisi, E. (2004). The first language? Science, 303, 1319-1320.

Percival, W.K. (1987). Biological analogy in the study of languages before the advent of comparative grammar. In H.M. Hoenigswald & L.F. Wiener (Eds.), Biological metaphor and cladistic classification (pp. 3–38). Philadelphia, PA: University of Pennsylvania Press.

Petersson, K. M., Forkstam, C., & Ingvar, M. (2004). Artificial syntactic violations activate Broca's region. Cognitive Science, 28, 383-407.

Piattelli-Palmarini, M. (1989). Evolution, selection and cognition: From “learning” to parameter setting in biology and in the study of language. Cognition, 31, 1-44.

Piattelli-Palmarini, M. (1994). Ever since language and learning: Afterthoughts on the Piaget-Chomsky debate. Cognition, 50, 315-346.

Pinker, S. (1984). Language learnability and language development. Cambridge, MA: Harvard University Press.

Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.

Pinker, S. (1994). The language instinct: How the mind creates language. New York, NY: William Morrow and Company.

Pinker, S. (2003). Language as an adaptation to the cognitive niche. In M. H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 16-37). Oxford: Oxford University Press.

Pinker, S. & Bloom, P. (1990). Natural language and natural selection. Behavioral & Brain Sciences, 13, 707-727.

Pinker, S. & Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95, 201-236.

Pinker, S. & Jackendoff, R. (in press). The components of language: What’s specific to language, and what’s specific to humans? In M.H. Christiansen, C. Collins & S. Edelman (Eds.), Language universals. New York: Oxford University Press.

Plante, E., Gómez, R.L., & Gerken, L.A. (2002). Sensitivity to word order cues by normal and language/learning disabled adults. Journal of Communication Disorders, 35, 453-462.

Pomerantz, J. R. & Kubovy, M. (1986). Theoretical approaches to perceptual organization: Simplicity and likelihood principles. In K. R. Boff, L. Kaufman & J. P. Thomas (Eds.), Handbook of perception and human performance. Volume 2: Cognitive processes and performance (pp. 36-1–36-46). New York: Wiley.

Quine, W. V. O. (1960). Word and object. Cambridge, MA: MIT Press.

Raddick, G. (2000). Review of S. Alter's Darwinism and the Linguistic Image. British Journal for the History of Science, 33, 122–124.

Raddick, G. (2002). Darwin on language and selection. Selection, 3, 7–16.

Ragir, S. (2002). Constraints on communities with indigenous sign languages: Clues to the dynamics of language origins. In A. Wray (Ed.), Transitions to language (pp. 272-294). Oxford: Oxford University Press.

Reali, F. & Christiansen, M.H. (2007). Processing of relative clauses is made easier by frequency of occurrence. Journal of Memory and Language, 57, 1-23.

Reali, F. & Christiansen, M.H. (in press). Sequential learning and the interaction between biological and linguistic adaptation in language evolution. Interaction Studies.

Reinhart, T. (1983). Anaphora and semantic interpretation. Chicago: University of Chicago Press.

Richerson, P.J. & Boyd, R. (2005). Not by genes alone: How culture transformed human evolution. Chicago: University of Chicago Press.

Ritt, N. (2004). Selfish sounds and linguistic evolution: A Darwinian approach to language change. Cambridge: Cambridge University Press.

Rossel, S., Corlija, J., & Schuster, S. (2002). Predicting three-dimensional target motion: How archer fish determine where to catch their dislodged prey. Journal of Experimental Biology, 205, 3321-3326.

Sag, I.A. & Pollard, C.J. (1987). Head-driven phrase structure grammar: An informal synopsis. CSLI Report 87-79. Stanford, CA: Stanford University.

Saffran, J.R. (2001). The use of predictive dependencies in language learning. Journal of Memory and Language, 44, 493-515.

Saffran, J.R. (2002). Constraints on statistical language learning. Journal of Memory and Language, 47, 172-196.

Saffran, J.R. (2003). Statistical language learning: Mechanisms and constraints. Current Directions in Psychological Science, 12, 110-114.

Saffran, J.R., Aslin, R.N., & Newport, E.L. (1996a). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.

Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996b). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35, 606-621.

Sandler, W., Meir, I., Padden, C. & Aronoff, M. (2005). The emergence of grammar: Systematic structure in a new language. Proceedings of the National Academy of Sciences, 102, 2661-2665.

Schlosser, G., & Wagner, G. P. (Eds.) (2004). Modularity in development and evolution. Chicago, IL: University of Chicago Press.

Schoenemann, P.T. (1999). Syntax as an emergent characteristic of the evolution of semantic complexity. Minds and Machines, 9, 309-346.

Seidenberg, M.S. (1985). The time course of phonological code activation in two writing systems. Cognition, 19, 1-30.

Seidenberg, M.S. (1997). Language acquisition and use: Learning and applying probabilistic constraints. Science, 275, 1599-1604.

Seidenberg, M.S. & MacDonald, M. (2001). Constraint-satisfaction in language acquisition. In M.H. Christiansen & N. Chater (Eds.), Connectionist psycholinguistics (pp. 281-318). Westport, CT: Ablex.

Senghas, A., Kita, S. & Özyürek, A. (2004). Children creating core properties of language: Evidence from an emerging sign language in Nicaragua. Science, 305, 1779-1782.

Sereno, M.I. (1991). Four analogies between biological and cultural/linguistic evolution. Journal of Theoretical Biology, 151, 467-507.

Simoncelli, E. P. & Olshausen, B. A. (2001). Natural image statistics as neural representation. Annual Review of Neuroscience, 24, 1193-1215.

Slobin, D.I. (1973). Cognitive prerequisites for the development of grammar. In C.A. Ferguson and D.I. Slobin (Eds.), Studies of child language development (pp. 175-208). New York: Holt, Rinehart & Winston.

Slobin, D.I., & Bever, T.G. (1982). Children use canonical sentence schemas: A crosslinguistic study of word order and inflections. Cognition, 12, 229-265.

Smith, K. (2002). Natural selection and cultural selection in the evolution of communication. Adaptive Behavior, 10, 25-44.

Smith, K. (2004). The evolution of vocabulary. Journal of Theoretical Biology, 228, 127-142.

Smith, K., Brighton, H. & Kirby, S. (2003). Complex systems in language evolution: the cultural emergence of compositional structure. Advances in Complex Systems, 6, 537-558.

Sperber, D. & Wilson, D. (1986). Relevance. Oxford: Blackwell.

Stallings, L., MacDonald, M. & O’Seaghdha, P. (1998). Phrasal ordering constraints in sentence production: phrase length and verb disposition in heavy-NP shift. Journal of Memory and Language, 39, 392-417.

Steedman, M. (2000). The syntactic process. Cambridge, MA: MIT Press.

Stevick, R.D. (1963). The biological model and historical linguistics. Language, 39, 159–169.

Studdert-Kennedy, M. & Goldstein, L. (2003). Launching language: The gestural origin of discrete infinity. In M.H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 235-254). New York: Oxford University Press.

Suzuki, D.T., Griffiths, A.J.F., Miller, J.H. & Lewontin, R.C. (1989). An introduction to genetic analysis (4th edition). New York, NY: W. H. Freeman.

Syvanen, M. (1985). Cross-species gene transfer: Implications for a new theory of evolution. Journal of Theoretical Biology, 112, 333-343.

Tanenhaus, M.K., Spivey-Knowlton, M.J., Eberhard, K.M. & Sedivy, J.E. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632-1634.

Tanenhaus, M.K. & Trueswell, J.C. (1995). Sentence comprehension. In J. Miller & P. Eimas (Eds.), Handbook of cognition and perception (pp. 217-262). San Diego, CA: Academic Press.

Tomasello, M. (2000a). Do young children have adult syntactic competence? Cognition, 74, 209-253.

Tomasello, M. (2000b). The item-based nature of children’s early syntactic development. Trends in Cognitive Sciences, 4, 156-163.

Tomasello, M. (Ed.) (2000c). The new psychology of language: Cognitive and functional approaches. Hillsdale, NJ: Erlbaum.

Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.

Tomasello, M. (2004). What kind of evidence could refute the UG hypothesis? Studies in Language, 28, 642-644.

Tomasello, M. (2006). Origins of human communication. The Jean Nicod Lectures, May 2006, Paris.

Tomasello, M., Carpenter, M., Call, J., Behne, T. & Moll, H. (2005). Understanding and sharing intentions: The origins of cultural cognition. Behavioral & Brain Sciences, 28, 675-691.

Tomblin, J.B., Mainela-Arnold, M.E. & Zhang, X. (2007). Procedural learning in adolescents with and without specific language impairment. Language Learning and Development, 3, 269-293.

Tomblin, J.B., Shriberg, L., Murray, J., Patil, S. & Williams, C. (2004). Speech and language characteristics associated with a 7/13 translocation involving FOXP2. American Journal of Medical Genetics, 130B, 97.

Ullman, M.T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92, 231-270.

van Everbroeck, E. (1999). Language type frequency and learnability: A connectionist appraisal. In Proceedings of the 21st Annual Cognitive Science Society Conference (pp. 755–760). Mahwah, NJ: Erlbaum.

Voight, B.F., Kudaravalli, S., Wen, X. & Pritchard, J.K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4, e72.

von Humboldt, W. (1999). On language: On the diversity of human language construction and its influence on the mental development of the human species. Cambridge, U.K.: Cambridge University Press.

Vouloumanos, A. & Werker, J.F. (2007). Listening to language at birth: Evidence for a bias for speech in neonates. Developmental Science, 10, 159-164.

Waddington, C.H. (1942). Canalization of development and the inheritance of acquired characters. Nature, 150, 563-565.

Weber, B.H., & Depew, D.J. (Eds.) (2003). Evolution and learning: The Baldwin effect reconsidered. Cambridge, MA: MIT Press.

Weissenborn, J. & Höhle, B. (Eds.) (2001). Approaches to bootstrapping: Phonological, lexical, syntactic and neurophysiological aspects of early language acquisition. Philadelphia, PA: John Benjamins.

Wilkins, W.K. & Wakefield, J. (1995). Brain evolution and neurolinguistic preconditions. Behavioral & Brain Sciences, 18, 161-182.

Yamashita, H. & Chang, F. (2001). “Long before short” preference in the production of a head-final language. Cognition, 81, B45-B55.

Yamauchi, H. (2001). The difficulty of the Baldwinian account of linguistic innateness. In J. Kelemen and P. Sosík (Eds.), ECAL01 (pp. 391-400). Prague: Springer.

Yang, C.D. (2002). Knowledge and learning in natural language. New York: Oxford University Press.

Zeevat, H. (2006). Grammaticalisation and evolution. In A. Cangelosi, A. D. M. Smith, & K. Smith (Eds.), The evolution of language (pp. 372-378). Singapore: World Scientific.

Zuidema, W. (2003). How the poverty of the stimulus solves the poverty of the stimulus. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 (pp. 51-58). Cambridge, MA: MIT Press.

 

 


Endnotes



[i] For the purposes of exposition, we use the term “language genes” as short-hand for genes that may be involved in encoding a potential UG. By using this term we do not mean to suggest that this relationship necessarily involves a one-to-one correspondence between individual genes and a specific aspect of language (or cognition).

 

[ii] Intermediate positions, which accord some role to both non-adaptationist and adaptationist mechanisms, are, of course, possible. Such intermediate viewpoints inherit the logical problems that we discuss below for both types of approach, in proportion to the relative contribution presumed to be associated with each. Moreover, we note that our arguments have equal force independent of whether one assumes that language has a vocal (e.g., Dunbar, 2003) or manual-gesture (e.g., Corballis, 2003) based origin.

 

[iii] Strictly, the appropriate measure is the more subtle inclusive fitness, which takes into account the reproductive potential not just of an organism, but also a weighted sum of the reproductive potentials of its kin, where the weighting is determined by the closeness of kinship (Hamilton, 1964). Moreover, mere reproduction is only of value to the degree that one's offspring have a propensity to reproduce, and so down the generations.

 

[iv] In addition, Pinker and Bloom (1990) point out that it is often the case that natural selection has several (equally adaptive) alternatives to choose from to carry out a given function (e.g., both the invertebrate and the vertebrate eye support vision despite having significant architectural differences).

 

[v] One prominent view is that language emerged within the last 100,000 to 200,000 years (e.g., Bickerton, 2003). Hominid populations over this period, and before, appear to have undergone waves of spread: “… modern languages derive mostly or completely from a single language spoken in East Africa around 100 kya … it was the only language then existing that survived and evolved with rapid differentiation and transformation.” (Cavalli-Sforza & Feldman, 2003: p. 273).

 

[vi] Human genome-wide scans have revealed evidence of recent positive selection for more than 250 genes (Voight, Kudaravalli, Wen & Pritchard, 2006), making it very likely that genetic adaptations for language would have continued in this scenario.

 

[vii] This set-up closely resembles the one used by Hinton and Nowlan (1987) in their simulations of the Baldwin effect, to which Pinker and Bloom (1990) refer in support of their adaptationist account of language evolution. The simulations are also similar in format to other models of language evolution (e.g., Briscoe, 2003; Kirby & Hurford, 1997; Nowak, Komarova & Niyogi, 2001). Note, however, that the reported simulations have a very different purpose from work on understanding historical language change from a UG perspective, for example, as involving successive changes in linguistic parameters (e.g., Baker, 2001; Lightfoot, 2000; Yang, 2002).

 

[viii] Some recent theorists have proposed that a further pressure for language divergence between groups is the sociolinguistic tendency for groups to “badge” their in-group by difficult-to-fake linguistic idiosyncrasies (Baker, 2003; Nettle & Dunbar, 1997). Such pressures would increase the pace of language divergence, and thus exacerbate the problem of divergence for adaptationist theories of language evolution.

 

[ix] This type of phenomenon, where the genetically-influenced behavior of an organism affects the environment to which those genes are adapting, is known as Baldwinian niche construction (Odling-Smee, Laland & Feldman, 2003; Weber & Depew, 2003).

 

[x] Indeed, a population genetic study by Dediu and Ladd (2007) could, on the one hand, be taken as pointing to biological adaptations for a surface feature of phonology: the adoption of a single-tier phonological system relying only on phoneme-sequence information to differentiate between words instead of a two-tier system incorporating both phonemes and tones (i.e., pitch contours). Specifically, two particular alleles of ASPM and Microcephalin, both related to brain development, were strongly associated with languages that incorporate a single-tier phonological system, even when controlling for geographical factors and common linguistic history. On the other hand, given that the relevant mutations would have had to occur independently several times, the causal explanation plausibly goes in the opposite direction, from genes to language. The two alleles may have been selected for other reasons relating to brain development but once in place they made it harder to acquire phonological systems involving tonal contrasts, which, in turn, allowed languages without tonal contrasts to evolve more readily. This perspective (also advocated by Dediu & Ladd) dovetails with our suggestion that language is shaped by the brain, as discussed below. However, either of these interpretations would argue against an adaptationist account of UG.

 

[xi] We have presented the argument in informal terms. A more rigorous argument is as follows. We can measure the amount of information embodied in universal grammar, U, over and above the information in pre-existing cognitive processes, C, by the length of the shortest code that will generate U from C. This is the conditional Kolmogorov complexity K(U|C) (Li & Vitányi, 1997). By the coding theorem of Kolmogorov complexity theory (Li & Vitányi, 1997), the probability of randomly generating U from C is approximately $2^{-K(U|C)}$. Thus, if universal grammar has any substantial complexity, then it has a vanishingly small probability of being encountered by a random process, such as a non-adaptational mechanism.
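
To give a sense of scale, the following is a minimal worked illustration of the coding-theorem estimate in this endnote. The value of 100 bits for K(U|C) is a purely hypothetical assumption chosen for illustration; it is not an estimate made in the text.

```latex
% Illustrative assumption only: K(U | C) = 100 bits (hypothetical value).
\[
P(U \mid C) \;\approx\; 2^{-K(U \mid C)} \;=\; 2^{-100} \;\approx\; 7.9 \times 10^{-31}
\]
```

Under this estimate each additional bit of complexity halves the probability, so even a modestly complex U would be very unlikely to arise from C by a random process.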

 

[xii] Darwin may have had several reasons for pointing to these similarities. Given that comparative linguistics at the time was considered to be a model science on a par with geology and comparative anatomy, he may have used comparisons between linguistic change—which was thought to be well understood at that time—and species change to corroborate his theory of evolution (Alter, 1998; Beer, 1996). Darwin may also have used these language-species comparisons to support the notion that less “civilized” human societies spoke less civilized languages, because he believed that this was predicted by his theory of human evolution (Raddick, 2000, 2002).

 

[xiii] Chomsky has sometimes speculated that the primary role of language may be as a vehicle for thought, rather than communication (e.g., Chomsky, 1980). This viewpoint has its puzzles—for example, the existence of anything other than semantic representations is difficult to understand, as it is these over which thought is defined; and the semantic representations in Chomsky’s recent theorizing are, indeed, too underspecified to support inference, throwing the utility of even these representations into doubt.

 

[xiv] Some studies purportedly indicate that the mechanisms involved in syntactic language are not the same as those involved in most sequential learning tasks (e.g., Friederici, Bahlmann, Heim, Schubotz & Anwander, 2006; Peña et al., 2002). However, the methods used in these studies have subsequently been shown to be fundamentally flawed (de Vries, Monaghan, Knecht & Zwitserlood, in press, and Onnis et al., 2005, respectively), thereby undermining their negative conclusions. Thus, the preponderance of the evidence suggests that sequential learning tasks tap into the mechanisms involved in language acquisition and processing.

 

[xv] The current knowledge regarding the FOXP2 gene is consistent with the suggestion of a human pre-adaptation for sequential learning (Fisher, 2006). FOXP2 is highly conserved across species, but two amino acid changes have occurred after the split between humans and chimps, and these became fixed in the human population about 200,000 years ago (Enard et al., 2002). In humans, mutations to FOXP2 result in severe speech and orofacial motor impairments (Lai, Fisher, Hurst, Vargha-Khadem & Monaco, 2001; MacDermot et al., 2005). Studies of FOXP2 expression in mice and imaging studies of an extended family pedigree with FOXP2 mutations have provided evidence that this gene is important to neural development and function, including that of the corticostriatal system (Lai, Gerrelli, Monaco, Fisher & Copp, 2003). This system has been shown to be important for sequential (and other types of procedural) learning (Packard & Knowlton, 2002). Crucially, preliminary findings from a mother and daughter with a translocation involving FOXP2 indicate that they have problems with both language and sequential learning (Tomblin, Shriberg, Murray, Patil & Williams, 2004).