To be published in Behavioral and Brain Sciences (in press)
© Cambridge University Press 2008
Below
is the unedited, uncorrected final draft of a BBS target article that has been
accepted for publication. This preprint has been prepared for potential commentators
who wish to nominate themselves for formal commentary invitation. Please DO NOT
write a commentary until you receive a formal invitation. If you are invited to
submit a commentary, a copyedited, corrected version of this paper will be
posted.
Language as Shaped by the Brain
Department of Psychology, Cornell University, Ithaca, NY 14853
and Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA
email: mhc27@cornell.edu
Division of Psychology and Language Sciences, University College London, London, WC1E 6BT
email: n.chater@ucl.ac.uk
Short abstract
It is widely
assumed that human learning and the structure of human languages are intimately
related. It is typically argued that this relationship is rooted in a
language-specific biological endowment, which encodes universal, but
communicatively arbitrary, principles of language structure (a universal
grammar). We argue instead that the mesh between learners and languages arises
because language has been shaped to fit the human brain, rather than vice
versa. If so, then apparently arbitrary aspects of linguistic structure may
result from general learning and processing biases.
Long abstract
It is widely
assumed that human learning and the structure of human languages are intimately
related. This relationship is frequently suggested to derive from a
language-specific biological endowment, which encodes universal, but
communicatively arbitrary, principles of language structure (a universal
grammar or UG). How might such a UG have evolved? We argue that UG could not
have arisen either by biological adaptation or non-adaptationist genetic processes,
resulting in a logical problem of
language evolution. Specifically, as the processes of language change are
much more rapid than processes of genetic change, language constitutes a
“moving target” both over time and across different human populations, and
hence cannot provide a stable environment to which language genes could have
adapted. We conclude that a biologically determined UG is not evolutionarily
viable. Instead, the original motivation for UG—the mesh between learners and
languages—arises because language has been shaped to fit the human brain,
rather than vice versa. Following Darwin, we view language itself as a complex
and interdependent “organism,” which evolves under selectional pressures from
human learning and processing mechanisms. That is, languages themselves are
shaped by severe selectional pressure from each generation of language users
and learners. This suggests that apparently arbitrary aspects of linguistic
structure may result from general learning and processing biases deriving from
the structure of thought processes, perceptuo-motor factors, cognitive
limitations, and pragmatics.
Keywords: biological adaptation, cultural
evolution, grammaticalization, language acquisition, language evolution,
linguistic change, natural selection, universal grammar
1. Introduction
Natural language constitutes one of the most complex aspects of human cognition, yet children already have a good grasp of their native language before they can tie their shoes or ride a bicycle. The relative ease of acquisition suggests that when the child makes a “guess” about the structure of language on the basis of apparently limited evidence, she has an uncanny tendency to guess right. This strongly suggests that there must be a close relationship between the mechanisms by which the child acquires and processes language, and the structure of language itself.
What
is the origin of this presumed close relationship between the mechanisms
children use in acquisition and the structure of language? One view is that
specialized brain mechanisms specific to language acquisition have evolved over
long periods of natural selection (e.g., Pinker & Bloom, 1990). A second
view rejects the idea that these specialized brain mechanisms have arisen
through adaptation, and assumes that they have emerged through some
non-adaptationist route, just as it has been argued that many biological
structures are not the product of adaptation (e.g., Bickerton, 1995; Gould,
1993; Jenkins, 2000; Lightfoot, 2000). Both these viewpoints put the
explanatory emphasis on brain mechanisms specialized for language—and ask how
they have evolved.
In
this paper, we develop and argue for a third view, which takes the opposite
starting point. It asks not, Why is the brain so well suited to learning
language?, but instead, Why is language so well suited to being learned by the
brain? We propose that language has
adapted through gradual processes of cultural evolution to be easy to learn to
produce and understand. Thus the structure of human language must inevitably be
shaped around human learning and processing biases deriving from the structure
of our thought processes, perceptuo-motor factors, cognitive limitations, and
pragmatic constraints. Language is easy for us to learn and use, not because our
brains embody knowledge of language, but because language has adapted to our
brains. Following Darwin (1900), we argue that it is useful metaphorically to
view languages as “organisms”, i.e., highly complex systems of interconnected
constraints, that have evolved in a symbiotic relationship with humans.
According to this view, whatever domain-general learning and processing biases
people happen to have will tend to become embedded in the structure of
language—because it will be easier to learn to understand and produce
languages, or specific linguistic forms, that fit these biases.
We
start by introducing The Logical Problem
of Language Evolution, which faces theories proposing that humans have
evolved specialized brain mechanisms for language. The following two sections, Evolution of Universal Grammar by Biological
Adaptation and Evolution of Universal
Grammar by Non-adaptationist Means, evaluate adaptationist and
non-adaptationist explanations of language evolution, concluding that both face
insurmountable theoretical obstacles. Instead, we present an alternative
perspective, Language as Shaped by the
Brain, in which language is treated as an evolutionary system in its own
right, adapting to the human brain. The next two sections, Constraints on Language Structure and How Constraints Shape Language over Time, discuss what biases have
shaped language evolution and how these can be observed in language change
mediated by cultural transmission. Finally, in the Scope of the Argument, we consider the wider implications of our
theory of language evolution, including a radical recasting of the problem of
language acquisition.
2. The logical problem of language evolution
For
a period spanning three decades, Chomsky (1965, 1972, 1980, 1986, 1988, 1993)
has argued that a substantial innate endowment of language-specific knowledge
is necessary for language acquisition. These constraints form a Universal Grammar (UG); that is, a
collection of grammatical principles that hold across all human languages. In
this framework, a child’s language ability gradually unfolds according to a
genetic blueprint in much the same way as a chicken grows a wing (Chomsky,
1988). The staunchest proponents of this view even go as far as to claim that
“doubting that there are language-specific, innate computational capacities
today is a bit like being still dubious about the very existence of molecules,
in spite of the awesome progress of molecular biology” (Piattelli-Palmarini,
1994: p. 335).
There
is considerable variation in current conceptions of the exact nature of UG,
ranging from being close to the Principles
and Parameters Theory (PPT; Chomsky, 1981) of pre-minimalist generative
grammar (e.g., Crain, Goro & Thornton, 2006; Crain & Pietroski, 2006),
to the Simpler Syntax (SS) version of
generative grammar proposed by Jackendoff (2002) and colleagues (Culicover
& Jackendoff, 2005; Pinker & Jackendoff, 2005), to the Minimalist Program (MP) in which
language acquisition is confined to learning a lexicon from which
cross-linguistic variation is proposed to arise (Boeckx, 2006; Chomsky, 1995).
From the viewpoint of PPT, UG consists of a set of genetically specified
universal linguistic principles combined with a set of parameters to account
for variations among languages (Crain et al., 2006). Information from the
language environment is used during acquisition to determine the parameter
settings relevant for individual languages. The SS approach combines elements
from construction grammar (e.g., Goldberg, 2006) with more traditional structural
principles from generative grammar, including principles relating to phrase
structure (X-bar theory), agreement, and case-marking. Along with constraints
arising from the syntax-semantic interface, these basic structural principles
form part of a universal “toolkit” of language-specific mechanisms, encoded in
a genetically specified UG (Culicover & Jackendoff, 2005). By contrast,
proponents of MP construe language as a perfect system for mapping between
sound and meaning (Chomsky, 1995). In departure from earlier generative
approaches, only recursion (in the form of Merge) is considered to be unique to
the human language ability (Hauser, Chomsky & Fitch, 2002). Variation among
languages is now explained in terms of lexical parameterization (Borer, 1984);
that is, differences between languages are no longer explained in terms of
parameters associated with grammars (as in PPT), but primarily in terms of
parameters associated with particular lexical items (though some non-lexical
parameters currently remain; Baker, 2001; Boeckx, 2006).
Common
to these three current approaches to generative grammar is the central
assumption that the constraints of UG (whatever their form) are fundamentally
arbitrary—i.e., not determined by functional considerations. That is, these
principles cannot be explained in terms of learning, cognitive constraints, or
communicative effectiveness. For example, consider the principles of binding,
which have come to play a key role in generative linguistics (Chomsky, 1981).
The principles of binding capture patterns of, among other things, reflexive
pronouns (e.g., himself, themselves) and accusative pronouns (him, them,
etc.), which appear, at first sight, to defy functional explanation. Consider
examples (1)-(4), where the subscripts indicate co-reference, and asterisks
indicate ungrammaticality.
(1) John_i sees himself_i
(2) *John_i sees him_i
(3) John_i said he_i/j won
(4) *He_i said John_i won
In (1), the pronoun himself must refer to John; in (2), the pronoun him cannot.
In (3), the pronoun he may refer to John or to another person; in (4), it cannot refer
to John. These and many other cases indicate that an extremely rich set of
patterns governs the behavior of pronouns, and these patterns appear
arbitrary: numerous alternative patterns would, it seems, serve equally well
from a functional standpoint. These patterns are instantiated in
PPT by the principles of binding theory (Chomsky, 1981), in SS by constraints
arising from structural and/or syntax-semantics interface principles (Culicover
& Jackendoff, 2005), and in MP by limitations on movement (internal merge,
Hornstein, 2001). Independent of their specific formulations, the constraints
on binding, while apparently universal across natural languages, are assumed to
be arbitrary—and hence may be presumed to be part of the genetically encoded
UG.
Putative
arbitrary universals, such as the restrictions on binding, contrast with
functional constraints on language. Whereas the former are hypothesized to
derive from the internal workings of a UG-based language system, the latter
originate from cognitive and pragmatic constraints related to language
acquisition and use. Consider the tendency in English to place long phrases
after short ones; for example, as evidenced by so-called “heavy-NP shifts”. In
(5), the long (or “heavy”) direct-object noun phrase (NP), the book he had not been able to locate for over two months,
appears at the end of the sentence, separated from its canonical postverbal
position by the prepositional phrase (PP) under
his bed. Both corpus analyses (Hawkins, 1994) and psycholinguistic
sentence-production experiments (Stallings, MacDonald & O’Seaghdha, 1998)
suggest that (5) is much more acceptable than the standard (or “non-shifted”)
version in (6), in which the direct object NP is placed immediately following
the verb.
(5)
John found PP[under his bed] NP[the book he had not been
able to locate for over two months].
(6)
John found NP[the book he had not been able to locate for over two
months] PP[under his bed].
Whereas
individuals speaking head-initial languages, such as English, tend to prefer
short phrases before long, speakers of head-final languages, such as Japanese,
have been shown to have the opposite long-before-short preference (Yamashita
& Chang, 2001). In both cases, the preferential ordering of long versus
short phrases can be explained in terms of minimization of memory load and
maximization of processing efficiency (Hawkins, 2004). As such, the patterns of
length-induced phrasal reordering are generally considered within generative
grammar to be a performance issue related to functional constraints outside the
purview of UG (although some functionally oriented linguists have suggested
that these kinds of performance constraints may shape grammar itself; e.g.,
Hawkins, 1994, 2004). In contrast, the constraints inherent in UG are arbitrary
and non-functional in the sense that they neither relate to communicative or
pragmatic considerations nor derive from limitations on the mechanisms involved in
using or acquiring language. Indeed, some generative linguists have argued that
aspects of UG hinder communication (e.g., Chomsky, 2005; Lightfoot, 2000).
If
we suppose that such arbitrary principles of UG are genetically specified, then
this raises the question of the evolutionary origin of this genetic endowment.
Two views have been proposed.
Adaptationists emphasize a gradual evolution of the
human language faculty through natural
selection (e.g., Briscoe, 2003; Corballis, 1992, 2003; Dunbar, 2003;
Greenfield, 1991; Hurford, 1991; Jackendoff, 2002; Nowak, Komarova &
Niyogi, 2001; Pinker, 1994, 2003; Pinker & Bloom, 1990; Pinker &
Jackendoff, 2005). Linguistic ability confers added reproductive fitness,
leading to a selective pressure for language genes[i];
richer language genes encode increasingly elaborate grammars.
Non-adaptationists (e.g., Bickerton, 1995—but see
Bickerton, 2003; Chomsky, 1988; Jenkins, 2000; Lightfoot, 2000;
Piattelli-Palmarini, 1989) suggest that natural selection only played a minor
role in the emergence of language in humans, focusing instead on a variety of
alternative possible evolutionary mechanisms by which UG could have emerged de novo (e.g., due to as few as two or
three key mutation “events”, Lanyon, 2006).
In the next two sections, we argue that
both of these views, as currently formulated, face profound theoretical
difficulties resulting in a logical
problem of language evolution[ii].
This is because, on analysis, it is mysterious how proto-language—which must
have been, at least initially, a cultural product likely to be highly variable
both over time and geographical locations—could have become genetically fixed
as a highly elaborate biological structure. Hence there is no currently viable
account of how a genetically encoded UG could have evolved. In subsequent
sections, we argue that the brain does not encode principles of UG—and
therefore neither adaptationist nor non-adaptationist solutions are required.
Instead, language has been shaped by the brain: language reflects pre-existing,
and hence non-language-specific, human learning and processing mechanisms.
3. Evolution of Universal Grammar by biological adaptation
The
adaptationist position is probably the most widely held view of the origin of
UG. We first describe adaptationism in biology and its proposed application to
UG before outlining three conceptual difficulties for adaptationist
explanations of language evolution.
3.1 Adaptation: The very idea
Adaptation
is a candidate explanation for the origin of any innate biological structure.
In general, the idea is that natural selection has favored genes that code for
biological structures that increase fitness
(in terms of expected numbers of viable offspring).[iii]
Typically, a biological structure contributes to fitness by fulfilling some
purpose—the heart is assumed to pump blood, the legs to provide locomotion, or
UG to support language acquisition. If so, natural selection will generally
favor biological structures that fulfill their purpose well, so that, over the
generations, hearts will become well adapted to pumping blood, legs well
adapted to locomotion, and any presumed biological endowment for language
acquisition will become well adapted to acquiring language.
Perhaps
the most influential statement of the adaptationist viewpoint is by Pinker and Bloom
(1990). They argue that “natural
selection is the only scientific explanation of adaptive complexity.
‘Adaptive complexity’ describes any system composed of many interacting parts
where the details of the parts’ structure and arrangement suggest design to
fulfill some function” (p. 709; their emphasis). As another example of adaptive
complexity, they refer to the exquisite optical and computational
sophistication of the vertebrate visual system. Pinker and Bloom note that such
a complex and intricate mechanism has an extremely low probability of occurring
by chance. Whatever the influence of non-adaptational factors (see below), they
argue that there must additionally have been substantial adaptation to
fine-tune a system as complex as the visual system. Given that language appears
as complex as vision, Pinker and Bloom conclude that it is also highly
improbable that language is entirely the product of nonadaptationist processes
(see also Pinker, 2003).
The scope and validity of the adaptationist viewpoint in biology is controversial (e.g., Dawkins, 1986; Gould, 2002; Gould & Lewontin, 1979; Hecht Orzak & Sober, 2001); and some theorists have used this controversy to question adaptationist views of the origin of UG (e.g., Bickerton, 1995; Lewontin, 1998). Here, we take a different tack. We argue that, whatever the merits of adaptationist explanation in general, and as applied to vision in particular, the adaptationist account cannot extend to a putative UG.
3.2. Why universal grammar could not be an adaptation to language
Let
us suppose that a genetic encoding of universal properties of language did, as
the adaptationist view holds, arise as an adaptation to the environment, here
to the linguistic environment. This
point of view seems to work most naturally for aspects of language that have a
transparent functional value. For
example, the compositional character of language (i.e., the ability to express
an infinite number of messages using a finite number of lexical items) seems
to have great functional advantages. A biological endowment that allows, or
perhaps requires, that language has this form appears likely to lead to
enhanced communication; and hence to be positively selected. Thus, over time,
functional aspects of language might be expected to become genetically encoded
across the entire population. But UG, according to Chomsky (e.g., 1980, 1988),
consists precisely of linguistic principles that appear highly abstract and
arbitrary—i.e., which have no functional significance. To what extent can an
adaptationist account of the evolution of a biological basis for language
explain how a genetic basis could arise for such abstract and arbitrary
properties of language?
Pinker
and Bloom (1990) provide an elegant approach to this question. They suggest
that the constraints imposed by UG, such as the binding constraints
mentioned above, can be construed as communication protocols for transmitting
information over a serial channel. While the general features of such protocols
(e.g., concerning compositionality, or the use of a small set of discrete
symbols) may be functionally important, many of the specific aspects of the
protocol do not matter, as long as everyone (within a given speech community)
adopts the same protocol. For
example, when using a modem to communicate between computers, a particular
protocol might have features such as odd parity, handshake on, 7 bit, etc.
However, there are many other settings that would be just as effective. What is
important is that the computers that are to interact adopt the same set of settings—otherwise
communication will not be possible. Adopting the same settings is therefore of
fundamental functional importance to communication between computers, but the
particular choice of settings is not. Similarly, when it comes to the specific
features of UG, Pinker and Bloom suggest that “in the evolution of the language
faculty, many ‘arbitrary’ constraints may have been selected simply because
they defined parts of a standardized communicative code in the brains of some
critical mass of speakers” (1990: p. 718)[iv].
Thus, such arbitrary constraints on language can come to have crucial adaptive
value to the language-user; genes that favor such constraints will be
positively selected. Over many generations, the arbitrary constraints may then
become innately specified.
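The logic of the protocol analogy can be sketched in a few lines of Python. This is only an illustration: the class name LinkSettings, its fields, and the function can_communicate are hypothetical and are not drawn from Pinker and Bloom or from any real modem software. The sketch shows that two quite different conventions work equally well, provided both parties share the same one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkSettings:
    """Hypothetical serial-link settings; the field names are illustrative."""
    parity: str        # e.g. "odd" or "even"
    data_bits: int     # e.g. 7 or 8
    handshake: bool    # handshake on or off


def can_communicate(a: LinkSettings, b: LinkSettings) -> bool:
    """Assumed rule: two endpoints can talk only if every setting matches."""
    return a == b


# Two quite different conventions, each internally consistent.
convention_a = LinkSettings(parity="odd", data_bits=7, handshake=True)
convention_b = LinkSettings(parity="even", data_bits=8, handshake=False)

print(can_communicate(convention_a, convention_a))  # True: shared convention
print(can_communicate(convention_b, convention_b))  # True: a different shared convention also works
print(can_communicate(convention_a, convention_b))  # False: mismatched conventions fail
```

Nothing in the check depends on which values are chosen, only on whether they agree, which is the sense in which the particular settings are arbitrary.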
We will argue that this viewpoint
faces three fundamental difficulties, concerning the dispersion of hominid
populations, language change, and the question of what is genetically encoded. We consider these in turn.
3.2.1 Problem 1: The dispersion of human populations
Pinker and Bloom’s (1990) analogy
with communications protocols, while apt, is something of a
double-edged sword. Communications protocols and other technical standards
typically diverge rapidly unless there is concerted oversight and enforcement
to maintain common standards. Maintaining and developing common standards is an
integral part of software and hardware development. In the absence of such
pressures for standardization, protocols would rapidly diverge. Given that
language presumably evolved without top-down pressures for standardization,
divergence between languages seems inevitable. To assume that “universal” arbitrary features of language
would emerge from adaptation by separate groups of language users would be
analogous to assuming that the same set of specific features for computer
communication protocols might emerge from separate teams of scientists, working
in separate laboratories (e.g., that different modem designers independently
alight on odd parity, handshake on, 7 bit error correction, and so on). Note
that this point would apply equally well, even if the teams of scientists
emerged from a single group. Once cut off from each other, groups would develop
in independent ways. Indeed, in biological adaptation, genes appear to rapidly
evolve to deal with a specific local environment. Thus, Darwin observed rich
patterns of variations in fauna (e.g., finches) across the Galapagos Islands,
and interpreted these variations as adaptation to local island conditions.
Hence, if language genes have adapted to local linguistic environments, we
should expect a range of different biologically encoded UGs, each specifically
adapted to its local linguistic context. Indeed, one might expect, if anything,
that language-genes would diverge especially rapidly—because the linguistic
environment in each population is assumed to be itself shaped by the different
language-genes in each subpopulation, thus amplifying the differences in the linguistic
environment. If so, then people should have, at minimum, some specific
predisposition to learn and process languages associated with their genetic
lineage. This does not appear to be the case—and it is a key assumption of the
generative linguistics perspective that the human language endowment does not
vary in this way but is universal across the species (Chomsky, 1980; Pinker,
1994).
There
is an interesting contrast, here, with the human immune system, which has
evolved in response to a very rapidly changing microbial environment. Crucially, the immune
system can build new antibody proteins (and the genetic mechanisms from which
antibody proteins are constructed) without having to eliminate old antibody
proteins (Goldsby, Kindt, Osborne & Kuby, 2003). Therefore, natural
selection will operate to enrich the
coverage of the immune system (though such progress will not always be
cumulative, of course); there is no penalty for the immune system following a
fast-moving “target” (defined by the microbial environment). But the case of
acquiring genes coding for regularities in language is very different—because,
at any one time, there is just one language (or at most two or three) that must
be acquired—and hence a bias that helps learn a language with property P will thereby inhibit learning languages with not-P. The fact that language change is so fast (so that whether the
current linguistic environment has property P
or not will vary rapidly, in the time scale of biological evolution) means that
such biases will, on balance, be counterproductive.
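The claim that such biases are counterproductive on balance can be illustrated with a toy calculation. It rests on assumptions that go beyond the text: a single binary property (P or not-P); a learner who guesses independently on each trial until correct; a bias strength b (with 0 ≤ b < 1) that makes the learner guess P with probability (1+b)/2; and a linguistic environment that is equally likely to have P or not-P at the time of learning. Under these assumptions, the expected number of guesses T is:

```latex
\begin{align}
\mathbb{E}[T \mid \text{language has } P] &= \frac{2}{1+b}, \\
\mathbb{E}[T \mid \text{language has not-}P] &= \frac{2}{1-b}, \\
\mathbb{E}[T] &= \frac{1}{2}\cdot\frac{2}{1+b} + \frac{1}{2}\cdot\frac{2}{1-b}
              = \frac{2}{1-b^{2}} \;\ge\; 2 .
\end{align}
```

A neutral learner (b = 0) needs 2 guesses on average, and any nonzero bias raises that average, because the cost of a mismatched bias outweighs the gain from a matched one.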
Given
that the immune system does coevolve with the microbial environment, different
co-evolutionary paths have been followed when human populations have diverged.
Therefore populations that have co-evolved with their local microbial environment
are often poorly adapted to other microbial environments. For example, when
Europeans began to explore the New World, they succumbed in large numbers to
the diseases they encountered, while conversely, European diseases caused
catastrophic collapse in indigenous populations (e.g., Diamond, 1997). If an
innate UG had co-evolved with the linguistic environment, similar radically
divergent co-evolutionary paths might be expected. Yet, as we have noted, the
contrary appears to be the case.
The problem of divergent populations
arises across a range of different scenarios concerning the relationship
between language evolution and the dispersion of human populations. One
scenario is that language evolution is recent, and occurred during the dispersion
of modern humans (Homo sapiens sapiens).
In this case, whether language was discovered once, and then spread throughout
human populations, or was discovered in various locations independently, there
remains the problem that adaptations to language would not be coordinated
across geographically dispersed groups. It
is tempting to suggest that all of these sublanguages will, nonetheless, obey
universal grammatical principles, thus providing some constancy in the
linguistic environment—but this appeal would, of course, be circular, as we are
attempting to explain the origin of
such principles. We shall repeatedly have to steer around this circularity trap below.
An alternative scenario is that
language evolution pre-dates the dispersion of modern humans. If so, then it is
conceivable that prior dispersions of hominid populations, perhaps within
Africa, did lead to the emergence of diverse languages and diverse UGs, adapted
to learning and processing such languages, and then that subsequently, one local
population proved to be adaptively most successful, and came to displace other
hominid populations. Thus, on this account, our current UG might conceivably be
the only survivor of a larger family of such UGs due to a population
“bottleneck”—the universality of UG would arise, then, because it was
genetically encoded in the sub-population from which modern humans descended[v].
This viewpoint is not without difficulties.
Some interpretations of the genetic and archaeological evidence suggest that
the last bottleneck in human evolution occurred between 500,000 and
2,000,000 years ago (e.g., Hawks,
Hunley, Lee & Wolpoff, 2000);
few researchers in language evolution believe that language,
in anything like its modern form, is this old. Moreover, even if we assume a
more recent bottleneck, any such bottleneck must at least predate the 100,000
years or so since the geographical dispersion of human populations, and 100,000
years still seems to provide sufficient time for substantial linguistic
divergence to occur. Given that the processes of genetic adaptation to language
most likely would continue to operate[vi],
different genetic bases for language would be expected to evolve across
geographically separated populations. That is, the evolution of UG by
adaptation would appear to require rapid adaptations for language prior to the
dispersion of human populations, followed by an abrupt cessation of such
adaptation, for a long period after dispersion. The contrast between the
evolution of the putative “language organ” and that of biological processes,
such as digestion, is striking. The digestive system is evolutionarily very
old, and many orders of magnitude older than the recent divergence of human
populations. Nonetheless, digestion appears to have adapted in important ways
to recent changes in the dietary environment; for example, with apparent
coevolution of lactose tolerance and the domestication of milk-producing
animals (Beja-Pereira et al., 2003).
3.2.2 Problem 2: Language change
Whatever the timing of the origin of language and hominid dispersion, the thesis that a genetically encoded UG arose through adaptation faces a second problem: that, even within a single population, linguistic conventions change rapidly. Hence the linguistic environment over which selectional pressures operate presents a “moving target” for natural selection. If linguistic conventions change more rapidly than genes change via natural selection, then genes that encode biases for particular conventions will be eliminated—because, as the language changes, the biases will be incorrect, and hence decrease fitness. More generally, in a fast changing environment, phenotypic flexibility to deal with various environments will typically be favored over genes that bias the phenotype narrowly toward a particular environment. Again, there is a tempting counter-argument—that the linguistic principles of UG will not change, and hence these aspects of language will provide a stable linguistic environment over which adaptation can operate. But, of course, this argument falls into the circularity trap, because the genetic endowment of UG is proposed to explain language universals; so it cannot be assumed that the language universals pre-date the emergence of the genetic basis for UG.
Christiansen, Reali and Chater (2006) illustrate the problems raised by language change in a series of computer simulations. They assume the simplest possible set-up: that (binary) linguistic principles and language “genes” stand in one-to-one correspondence. Each gene has three alleles: one biased in favor of each version of the corresponding principle, and one neutral allele[vii]. Agents learn the language by trial and error, where their guesses are biased according to which alleles they have. The fittest agents are allowed to reproduce, and a new generation of agents is produced by sexual recombination and mutation. When the language is fixed, there is a selection pressure in favor of the “correctly” biased genes, and these rapidly come to dominate the population, as illustrated by Figure 1. This is an instance of the Baldwin effect (Baldwin, 1896; for discussion see Weber & Depew, 2003), in which information that is initially learned becomes encoded in the genome. A frequently cited example of the Baldwin effect is the development of calluses on the keels and sterna of ostriches (Waddington, 1942). The proposal is that calluses initially developed in response to abrasion where the keel and sterna touch the ground during sitting. Natural selection then favored individuals that could develop calluses more rapidly, until callus development became triggered within the embryo and could occur without environmental stimulation. Pinker and Bloom (1990) suggest that the Baldwin effect could, in a similar way, be the driving force behind the adaptation of UG: natural selection will favor learners who are genetically disposed to rapidly acquire the language to which they are exposed, so that, over many generations, this process will lead to a genetically specified UG.

Figure 1. The effect of linguistic change on the genetic encoding of arbitrary linguistic principles. Results are shown from a simulation with a population size of 100 agents, a genome size of 20, survival of the top 50% of the population, and starting with 50% neutral alleles. When there is no linguistic change, alleles encoding specific aspects of language emerge quickly—i.e., a Baldwin effect occurs—but when language is allowed to change, neutral alleles become more advantageous. Similar results were obtained across a wide range of different simulation parameters (Adapted from Christiansen, Reali & Chater, 2006).
However, when language is allowed to change (e.g., due to exogenous forces such as language contact), the effect reverses—biased genes are severely selected against when they are inconsistent with the linguistic environment, and neutral genes come to dominate the population. The selection in favor of neutral genes occurs even for low levels of language change (i.e., the effect occurs, to some degree, even if language change equals the rate of genetic mutation). But, of course, linguistic change (prior to any genetic encoding) is likely to have been much faster than genetic change. After all, in the modern era, language change has been astonishingly rapid, leading, for example, to the wide phonological and syntactic diversity of the Indo-European language group, from a common ancestor about 10,000 years ago (Gray & Atkinson, 2003). Language in hunter-gatherer societies changes at least as rapidly. Papua New Guinea, settled within the last 50,000 years, has an estimated one-quarter of the world’s languages. These are enormously linguistically diverse, and most originate in hunter-gatherer communities (Diamond, 1992)[viii]. Thus, from the point of view of natural selection, it appears that language, like other cultural adaptations, changes far too rapidly to provide a stable target over which natural selection can operate. Human language learning therefore may be analogous to typical biological responses to high levels of environmental change—i.e., to develop general-purpose strategies which apply across rapidly-changing environments, rather than specializing to any particular environment. This strategy appears to have been used, in biology, by “generalists” such as cockroaches and rats, in contrast, for example, to pandas and koalas, which are adapted to extremely narrow environmental niches.
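The logic of these simulations can be illustrated with a minimal sketch. This is not the code used by Christiansen, Reali and Chater (2006); it only follows the verbal description given here (binary principles, three alleles per gene, trial-and-error learning, survival of the top 50%, recombination and mutation). The guess probabilities, the mutation rate, the per-generation rate of language change, and the use of uniform crossover are all assumptions made for illustration.

```python
import random

NEUTRAL = 2  # alleles 0 and 1 are biased toward that value of a principle

# Assumed chance that one guess is right, by how the allele relates to the language.
P_MATCH, P_NEUTRAL, P_MISMATCH = 0.95, 0.5, 0.05


def trials_to_learn(allele, target, rng):
    """Count trial-and-error guesses until the agent hits the target value."""
    if allele == NEUTRAL:
        p = P_NEUTRAL
    elif allele == target:
        p = P_MATCH
    else:
        p = P_MISMATCH
    trials = 1
    while rng.random() >= p:
        trials += 1
    return trials


def simulate(change_rate, generations=200, pop_size=100, genome_size=20,
             mutation_rate=0.01, seed=0):
    """Return the fraction of neutral alleles in the final generation."""
    rng = random.Random(seed)
    language = [rng.randrange(2) for _ in range(genome_size)]
    # Start with roughly 50% neutral alleles, as in the Figure 1 caption.
    population = [[NEUTRAL if rng.random() < 0.5 else rng.randrange(2)
                   for _ in range(genome_size)] for _ in range(pop_size)]
    for _ in range(generations):
        # Exogenous change (assumed form): each principle flips with prob. change_rate.
        language = [1 - v if rng.random() < change_rate else v for v in language]
        # Fewer learning trials means higher fitness.
        scored = [(sum(trials_to_learn(a, t, rng) for a, t in zip(genome, language)), i)
                  for i, genome in enumerate(population)]
        scored.sort()
        survivors = [population[i] for _, i in scored[:pop_size // 2]]
        children = []
        for _ in range(pop_size):
            mother, father = rng.sample(survivors, 2)
            # Uniform crossover, then mutation to a random allele (assumed scheme).
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [rng.randrange(3) if rng.random() < mutation_rate else a
                     for a in child]
            children.append(child)
        population = children
    neutral = sum(genome.count(NEUTRAL) for genome in population)
    return neutral / (pop_size * genome_size)


if __name__ == "__main__":
    print("fixed language:   ", simulate(change_rate=0.0))
    print("changing language:", simulate(change_rate=0.1))
```

Under these assumptions, a fixed language should leave few neutral alleles, whereas a changing language should leave many; the exact values depend on the assumed parameters and on the random seed.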
A potential limitation of our argument so far is that we have assumed that changes in the linguistic environment are “exogenous.” But many aspects of language change may be “endogenous,” i.e., may arise because the language is adapting due to selection pressures from learners, and hence their genes. Thus, one might imagine the following argument: suppose there is a slight, random, genetic preference for languages with feature A rather than B. Then this may influence the language spoken by the population to have feature A, and this may in turn select for genes that favor the feature A[ix]. Such feedback might, in principle, serve to amplify small random differences into, ultimately, rigid arbitrary language universals. However, as Figure 2 illustrates, when linguistic change is genetically influenced, rather than random, it turns out that, while this amplification effect can occur, leading to a Baldwin effect, it does not emerge from small random fluctuations. Instead, it only occurs when language is initially strongly influenced by genes. But if arbitrary features of language would have to be predetermined strongly by the genes from the very beginning, then this leaves little scope for subsequent operation of the Baldwin effect as envisioned by Pinker and Bloom.
Figure 2. The Baldwin effect, where genes influence language: the role of population influence (i.e., genetic “feedback”) on the emergence of the Baldwin effect for language-relevant alleles when language is allowed to change 10 times faster than biological change. Only when the pressure from the learners’ genetic biases is very high (~50%) can the Baldwin effect overcome linguistic change. (Adapted from Christiansen, Reali & Chater, 2006).
3.2.3. Problem 3: What is genetically encoded?
Even if the first two difficulties for adaptationist accounts of UG could be solved, the view still faces a further puzzle: why is it that genetic adaptation occurred only to very abstract properties of language, rather than also occurring to its superficial properties? Given the spectacular variety of surface forms of the world’s languages, in both syntax (including every combination of basic orderings of subject, verb and object, and a wide variety of less constrained word orders) and phonology (including tone and click languages, for example), why did language genes not adapt to these surface features?[x] Why should genes become adapted to capture the extremely rich and abstract set of possibilities countenanced by the principles of UG, rather than merely encoding the actual linguistic possibilities in the specific language that was being spoken (i.e., the phonological inventory and particular morphosyntactic regularities of the early click-language, from which the Khoisan family originated and which might be the first human language; e.g., Pennisi, 2004)? The unrelenting abstractness of the universal principles makes them difficult to reconcile with an adaptationist account.
One of the general features of biological adaptation is that it is driven by the constraints of the immediate environment. It can have no regard for distant or future environments that might one day be encountered. For example, the visual system is highly adapted to the laws of optics as they hold in normal environments. Thus, human vision mis-estimates the length of a stick in water, because it does not correct for the refraction of light through water (a situation not commonly encountered in the human visual world). By contrast, the visual system of the archerfish, which must strike airborne flies with a water jet from below the water surface, does make this correction (Rossel, Corlija & Schuster, 2002). Biological adaptation produces systems designed to fit the environment to which adaptation occurs; there is, of course, no selectional pressure to fit environments that have not occurred, or that might occur at some point in the future. Hence, if a UG did adapt to a past linguistic environment, it would seem inevitable that it would adapt to that language environment as a whole: thus adapting to its specific word order, phonotactic rules, inventory of phonemic distinctions, and so on. In particular, it seems very implausible that an emerging UG would be selected primarily for extremely abstract features, which apply equally to all possible human languages (not just the language evident in the linguistic environment in which selection operates). This would be analogous to an animal living in a desert environment somehow developing adaptations that are not specific to desert conditions, but that are equally adaptive in all terrestrial environments.
The
remarkable abilities of the young indigo bunting to use stars for navigational
purposes—even in the absence of older birds to lead the way—might at first seem
to counter this line of reasoning (e.g., Hauser, 2001; Marcus, 2004). Every
autumn this migratory bird uses the location of Polaris in the night sky to fly
from its summer quarters in the Northeast United States to its winter residence
in the Bahamas. As demonstrated by Emlen (1970), the indigo bunting uses
celestial rotation as a reference axis to discover which stars point to true
north. Thus, when Emlen raised young fledglings in a planetarium that was
modified to rotate the night sky around Betelgeuse, the birds oriented
themselves as if north was in the direction of this bright star. Crucially, what has become
genetically encoded is not a star map, because star constellations change over
evolutionary time and thus form moving targets, but instead that which is
stable: that stationary stars indicate the axis of earth’s rotation, and hence
true north.
Similarly, it is tempting to claim that the principles of UG are just those that are invariant across languages, whereas contingent aspects of word order or phonology will vary across languages. Thus, one might suggest that only the highly abstract, language-universal, principles of UG will provide a stable basis upon which natural selection can operate. But this argument is again, of course, a further instance of the circularity trap. We are trying to explain how a putative UG might become genetically fixed, and hence we cannot assume UG is already in place. Thus, this counterargument is blocked.
We
are not, of course, arguing that abstract structures cannot arise by
adaptation. Indeed, abstract patterns, such as the body plan of mammals or
birds, are conserved across species, and constitute a complex and highly
integrated system. Notice, though, that such abstract structures are still
tailored to the specific environment of each species. Thus, while bats, whales,
and cows have a common abstract body plan, these species embody dramatically
different instantiations of this pattern, adapted to their ecological niches in
the air, in water, or on land. Substantial modifications of this kind can occur
quite rapidly, due to changes in a small number of genes and/or their pattern
of expression. For example, the differing beak shape in Darwin’s finches, adapted
to different habitats in the Galapagos Islands, may be largely determined by as
few as two genes: BMP4, the
expression of which is associated with the width as well as depth of beaks
(Abzhanov, Protas, Grant, Grant & Tabin, 2004), and CaM, the expression of which is correlated with beak length
(Abzhanov et al., 2006). Again, these adaptations are all related closely to
the local environment in which an organism exists. In contrast, adaptations for
UG are hypothesized to be for abstract principles holding across all linguistic
environments, with no adaptation to the local environment of specific languages
and language users.
In summary, Pinker and Bloom (1990), as we have seen, draw a parallel between the adaptationist account of the development of the visual system, and an adaptationist account of a putative language faculty. But the above arguments indicate that the two cases are profoundly different. The principles of optics, and the structure of the visual world, have many invariant features across environments (e.g., Simoncelli & Olshausen, 2001), but the linguistic environment is vastly different from one population to another. Moreover, the linguistic environment, unlike the visual environment, will itself be altered in line with any genetic changes in the propensity to learn and use languages, thus amplifying differences between linguistic environments further. We conclude, then, that linguistically-driven biological adaptation cannot underlie the evolution of language.
It
remains possible, though, that the development of language did have a
substantial impact on biological evolution. The arguments given here merely
preclude the possibility that linguistic conventions that would originally differ across different linguistic
environments could somehow become universal across all linguistic communities,
by virtue of biological adaptation to the linguistic environment. This is
because, in the relevant respects, the linguistic environment for the different
populations is highly variable, and hence any biological adaptations could only
serve to entrench such differences further. But there might be features that
are universal across linguistic environments that might lead to biological
adaptation (such as the means of producing speech; Lieberman, 1984; or the need
for enhanced memory capacity, or complex pragmatic inferences; Givón &
Malle, 2002). However, these language features are likely to be functional,
i.e., they facilitate language use—and
thus would typically not be considered part of UG.
It
is consistent with our arguments that the emergence of language influenced
biological evolution in a more indirect way. The possession of language might
have fundamentally changed the patterns of collective problem solving and other
social behavior in early humans, with a consequent shift in the selectional
pressures on humans engaged in these new patterns of behavior. But universal,
arbitrary constraints on the structure of language cannot emerge from
biological adaptation to a varied pattern of linguistic environments. Thus, the
adaptationist account of the biological origins of UG cannot succeed.
4. Evolution of Universal Grammar by Non-adaptationist Means
Some theorists advocating a genetically-based UG might concur with our arguments against adaptationist accounts of language evolution. For instance, Chomsky (1972, 1988, 1993) has for more than two decades expressed strong doubts about neo-Darwinian explanations of language evolution, hinting that UG may be a by-product of increased brain size or yet unknown physical or biological evolutionary constraints. Further arguments for a radically non-adaptationist perspective have been advanced by Jenkins (2000), Lanyon (2006), Lightfoot (2000), and Piattelli-Palmarini (1989, 1994).
Non-adaptationists typically argue that
UG is both highly complex, and radically different from other biological
machinery (though see Hauser et al., 2002). They suggest, moreover, that UG
appears to be so unique in terms of structure and properties, that it is
unlikely to be a product of natural selection amongst random mutations.
However, we argue that non-adaptationist attempts to explain a putative
language-specific genetic endowment also fail.
To
what extent can any non-adaptationist mechanism account for the development of
a genetically encoded UG, as traditionally conceived? In particular, can such
mechanisms account for the appearance of genetically specified principles that
are presumed to be (a) idiosyncratic to language, and (b) of substantial
complexity? We argue that the probability that non-adaptationist factors played
a substantial role in the evolution of UG is vanishingly small.
The argument involves a straightforward application of information theory. Suppose that the constraints embodied in UG are indeed language-specific, and hence do not emerge as side-effects of existing processing mechanisms. This means that UG would have to be generated at random by non-adaptationist processes. Suppose further that the information required to specify a language acquisition device, so that language can be acquired and produced, over and above the pre-linguistic biological endowment, can be represented as a binary string of $N$ bits (this particular coding assumption is purely for convenience). Then the probability of generating this sequence of $N$ bits by chance is $2^{-N}$. If the language-specific information could be specified using a binary string that would fit on one page of normal text (which would presumably be a considerable underestimate, from the perspective of most linguistic theory), then $N$ would be over 2500. Hence the probability of generating the grammar by a random process would be less than $2^{-2500}$. So to generate this machinery by chance (i.e., without the influence of the forces of adaptation) would be expected to require of the order of $2^{2500}$ individuals. But the total population of humans over the last two million or so years, including the present, is measured in billions, and is much smaller than $2^{35}$. Hence, the probability of non-adaptationist mechanisms “chancing” upon a specification of a language organ or language instinct is astronomically low[xi].
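The arithmetic of this argument can be checked in log space, which avoids handling such small numbers directly. The short sketch below takes the figures given in the text (a 2500-bit specification and a bound of 2 to the power 35 on the number of humans) and applies a union bound: the chance that at least one individual receives the required string is at most the number of individuals times the chance for one individual. The function names are hypothetical and chosen only for this illustration.

```python
import math


def log2_prob_any_match(n_bits, log2_population):
    """Union bound, in log2, on the chance that at least one of
    2**log2_population individuals randomly receives one specific
    n_bits-long binary string."""
    return log2_population - n_bits


def log2_to_log10(x):
    """Convert a base-2 exponent into the equivalent base-10 exponent."""
    return x * math.log10(2)


bound = log2_prob_any_match(n_bits=2500, log2_population=35)
print(bound)                        # exponent of 2 in the bound
print(round(log2_to_log10(bound)))  # the same bound as an exponent of 10
```

With these figures the bound is 2 to the power -2465, or roughly 10 to the power -742, so even counting every human who has ever lived barely changes the exponent.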
It
is sometimes suggested, apparently in the face of this type of argument, that
the recent evolutionary-developmental biology literature has revealed how local
genetic changes, e.g., on homeobox genes, can influence the expression of other
genes, and through a cascade of developmental influences, result in extensive
phenotypic consequences (e.g., Gerhart & Kirschner, 1997; Laubichler & Maienschein, 2007).
Yet if UG arises from a small “tweak” to pre-linguistic cognitive
machinery, then general cognitive machinery will provide the vast bulk of the
explanation of language structure—without this machinery, the impact of the
tweak would be impossible to understand. Thus, the vision of universal grammar
as a language-specific innate faculty or language organ would have to be
retracted. But the idea that a simple tweak might lead to a complex, highly
interdependent, and intricately organized system, such as the putative UG, is
highly implausible. Small genetic changes lead to modifications of existing
complex systems (and these modifications can be quite far-reaching); they do
not lead to the construction of new complexity. Thus, a mutation might lead to
an insect having an extra pair of legs, and a complex set of genetic
modifications (almost certainly under strong and continuous selectional
pressure) may modify a leg into a flipper, but no single gene creates an
entirely new means of locomotion, from scratch. The whole burden of the classic
arguments for UG is that UG is both highly organized and complex, and utterly
distinct from general cognitive principles. Thus, the emergence of a putative
UG requires the construction of a new complex system, and the argument sketched
above notes that the probability of even modest new complexity arising by
chance is astronomically low.
The
implication of this argument is that it is extremely unlikely that substantial
quantities of linguistically idiosyncratic information have been specified by
non-adaptationist means. Indeed, the point applies more generally to the
generation of any complex, functional biological structures. Thus, it is not
clear how any non-adaptationist account can explain the emergence of something
as intricately complex as UG.
Some
authors who express skepticism concerning the role of adaptation implicitly
recognize this kind of theoretical difficulty. Instead, many apparently complex
and arbitrary aspects of cognition and language are suggested to have emerged
out of the constraints on building any complex information processing system,
given perhaps currently unknown physical and biological constraints (e.g.,
Chomsky, 1993; see Kauffman, 1995, for a related viewpoint on evolutionary
processes). A related perspective is proposed by Gould (1993), who views
language as a spandrel—i.e., as emerging as a byproduct of other cognitive
processes. Another option would be to appeal to exaptation (Gould & Vrba, 1982) whereby a biological structure
that was originally adapted to serve one function is put to use to serve a
novel function. Yet the non-adaptationist attracted by these or other
non-adaptationist mechanisms is faced with a dilemma. If language can emerge
from general physical, biological or cognitive factors, then the complexity and
idiosyncrasy of UG are illusory; language emerges from general non-linguistic
factors, a conclusion entirely consistent with the view we advocate here. If,
by contrast, UG is maintained to be sui
generis and not readily derivable from general processes, the complexity
argument bites: i.e., the probability of a new and highly complex adaptive
system emerging by chance is astronomically low.
The
dilemma is equally stark for the non-adaptationist who attempts to reach for
other non-adaptationist mechanisms of evolutionary change. There are numerous
mechanisms that amount to random perturbations (from the point of view of the
construction of a highly complex adaptive system) (Schlosser & Wagner,
2004). These include genetic drift
(Suzuki, Griffiths, Miller & Lewontin, 1989), the random fluctuations in
gene frequencies in a population; genetic
hitch-hiking (Maynard-Smith, 1978), a mechanism by which non-selected genes
“catch a ride” with another gene (nearby on the chromosome) that was subject to selection; epigenesis (Jablonka & Lamb, 1989),
which causes heritable cell changes due to environmental influences but without
corresponding changes to the basic DNA sequences of that cell; horizontal genetic transfer (Syvanen,
1985) by which genetic material shifts from one species to another; and transposons (McClintock, 1950), mobile
genetic elements that can move around in different positions within the genome
of a cell and thus alter its phenotype. Each of these mechanisms provides a
richer picture of the mechanisms of evolutionary change—but provides no answer
to the question of how novel and highly complex adaptive systems, such as the
putative UG, might emerge de novo.
However, if language is viewed as embodying novel complexity, then the
emergence of this complexity by non-adaptationist (and hence, from an adaptive
point of view, random) mechanisms is astronomically unlikely.
We
may seem to be faced with a paradox. It seems clear that the mechanisms
involved in acquiring and processing language are enormously intricate and
moreover intimately connected to the structure of natural languages. The
complexity of these mechanisms rules out, as we have seen in this section, a
non-adaptationist account of their origin. However, if these mechanisms arose
through adaptation, this adaptation cannot, as we argued above, have been
adaptation to language. But if the
mechanisms that currently underpin language acquisition and processing were
originally adapted to carry out other functions, then how is their apparently
intimate relationship with the structure of natural language to be explained?
How, for example, are we to explain that the language acquisition mechanisms
seem particularly well-adapted to learning natural languages, but not to any of
a vast range of conceivable non-natural languages (e.g. Chomsky, 1980)? As we
now argue, the paradox can be resolved if we assume that the “fit” between the
mechanisms of language acquisition and processing, on the one hand, and natural
language, on the other, has arisen because natural languages themselves have
“evolved” to be as easy to learn and process as possible: language has been
shaped by the brain, rather than vice versa.
5. Language as Shaped by the Brain
We
propose, then, to invert the perspective on language evolution, shifting the
focus from the evolution of language
users to the evolution of languages.
Figure 3 provides a conceptual illustration of these two perspectives (see also
Andersen, 1973; Hurford, 1990; Kirby & Hurford, 1997). The UG
adaptationists (a) suggest that selective pressure toward better language
abilities gradually led to the selection of more sophisticated UGs. In
contrast, (b) we propose to view language as an evolutionary system in its own
right (see also e.g., Christiansen, 1994; Deacon, 1997; Keller, 1994; Kirby,
1999; Ritt, 2004), subject to adaptive pressures from the human brain. As a
result, linguistic adaptation allows for the evolution of increasingly
expressive languages that can nonetheless still be learned and processed by
domain-general mechanisms. From this perspective, we argue that the mystery of
the fit between human language acquisition and processing mechanisms and
natural language may be unraveled, and we might, furthermore, understand how
language has attained its apparently “idiosyncratic” structure.
Instead
of puzzling that humans can only learn a small subset of the infinity of
mathematically possible languages, we take a different starting point: the
observation that natural languages exist only because humans can produce, learn
and process them. In order for languages to be passed on from generation to
generation, they must adapt to the properties of the human learning and
processing mechanisms; the structures in each language form a highly
interdependent system, rather than a
collection of independent traits. The
key to understanding the fit between language and the brain is to understand
how language has been shaped by the brain, not the reverse. The process by which
language has been shaped by the brain is, in important ways, akin to Darwinian
selection; we therefore suggest that it is a productive metaphor to view
languages as analogous to biological species, adapted through natural selection
to fit a particular ecological niche: the human brain.
This viewpoint does not rule out the
possibility that language may have played a role in the biological evolution of
hominids. Good language skills may indeed enhance reproductive success. But the
pressures working on language to adapt to humans are significantly stronger
than the selection pressures on humans to use language. In the case of the former,
a language can only survive if it is
learnable and processable by humans. On the other hand, adaptation towards language
use is merely one of many selective
pressures working on hominid evolution (including, for example, avoiding
predators and finding food). Whereas humans can survive without language, the
opposite is not the case. Thus, prima
facie language is more likely to have been shaped to fit the human brain
than the other way round. Languages that are hard for humans to learn and
process cannot come into existence at all.

Figure 3. Illustration of two different views on
the direction of causation in language evolution: a) biological adaptations of
the brain to language (double arrows), resulting in gradually more intricate
UGs (curved arrows) to provide the basis for increasingly complex language
production and comprehension (single arrows); b) cultural adaptation of
language to the brain (double arrows), resulting in increasingly expressive
languages (curved arrows) that are well suited to being acquired and processed
by domain-general mechanisms (single arrows).
5.1. Historical parallels between linguistic and biological change
The
idea of language as an adaptive, evolutionary system has a prominent historical
pedigree dating back to Darwin and beyond. One of the earliest proponents of
the idea that languages evolve diachronically was the eighteenth-century
language scholar, Sir William Jones, the first Western scholar to study
Sanskrit and note its affinity with Greek and Latin (Cannon, 1991). Later,
nineteenth-century linguistics was dominated by an organistic view of language
(McMahon, 1994). Franz Bopp, one of the founders of comparative linguistics,
regarded language as an organism that could be dissected and classified
(Davies, 1987). Wilhelm von Humboldt—the father of generative grammar (Chomsky,
1965; Pinker, 1994)—argued that “… language, in direct conjunction with mental
power, is a fully-fashioned organism…”
(von Humboldt, 1836/1999, p. 90; original emphasis). More generally, languages
were viewed as having life-cycles that included birth, progressive growth,
procreation, and eventually decay and death. However, the notion of evolution
underlying this organistic view of language was largely pre-Darwinian. This is
perhaps reflected most clearly in the writings of another influential linguist,
August Schleicher. Although he explicitly emphasized the relationship between
linguistics and Darwinian theory (Schleicher, 1863; quoted in Percival, 1987),
Darwin’s principles of mutation, variation, and natural selection did not enter
into the theorizing about language evolution (Nerlich, 1989). Instead, the evolution
of language was seen in pre-Darwinian terms as the progressive growth towards
attainment of perfection, followed by decay.
Darwin
(1900), too, recognized the similarities between linguistic and biological
change[xii]:
The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel … We find in distinct languages striking homologies due to community of descent, and analogies due to a similar process of formation. The manner in which certain letters or sounds change when others change is very like correlated growth … Languages, like organic beings, can be classed in groups under groups; and they can be classed either naturally, according to descent, or artificially by other characters. Dominant languages and dialects spread widely, and lead to the gradual extinction of other tongues. A language, like a species, when once extinct, never … reappears … A struggle for life is constantly going on among the words and grammatical forms in each language. The better, the shorter, the easier forms are constantly gaining the upper hand … The survival and preservation of certain favored words in the struggle for existence is natural selection. (p. 106)
In
this sense, natural language can be construed metaphorically as akin to an
organism whose evolution has been constrained by the properties of human
learning and processing mechanisms. A similar perspective on language evolution
was revived, within a modern evolutionary framework, by Stevick (1963) and
later by Nerlich (1989). Sereno (1991) has listed a number of parallels between
biological organisms and language (with the biological comparisons in
parentheses):
An intercommunicating group of people
defines a language (cf. gene flow in relation to a species); language abilities
develop in each speaker (cf. embryonic development); language must be
transmitted to offspring (cf. heritability); there is a low level process of
sound and meaning change that continuously generates variation (cf. mutation);
languages gradually diverge, especially when spatially separated (cf.
allopatric speciation); geographical distributions of dialects (cf. subspecies,
clines) gradually give rise to wholesale rearrangements of phonology and syntax
(cf. macroevolution); sociolinguistic isolation can lead to language divergence
without spatial discontinuity (cf. sympatric speciation). (p. 472)
Christiansen
(1994) pushed the analogy a little further, suggesting that language may be
viewed as a “beneficial parasite” engaged in a symbiotic relationship with its
human hosts, without whom it cannot survive (see also Deacon, 1997). Symbiotic
parasites and their hosts tend to become increasingly co-adapted (e.g.,
Dawkins, 1976). But note that this co-adaptation will be very lopsided, because
the rate of linguistic change is far greater than the rate of biological
change. Whereas Danish and Hindi needed less than 7,000 years to
evolve from a common hypothesized proto-Indo-European ancestor into very
different languages (Gray & Atkinson, 2003), it took our remote ancestors
approximately 100,000–200,000 years to evolve from the archaic form of Homo sapiens into the anatomically
modern form, sometimes termed Homo
sapiens sapiens. Indeed,
as we argued above, the rapidity of language change, and the geographical
dispersal of humanity, suggests that biological adaptation to language is
negligible. This suggestion is further corroborated by work in evolutionary
game theory, showing that when two species with markedly different rates of
adaptation enter a symbiotic relationship, the rapidly evolving species adapts
to the slowly evolving one but not the reverse (Frean & Abraham, 2004).
5.2. Language as a system
But
in what sense should language be viewed as akin to an integrated organism, rather than as a collection of
separate traits, evolving relatively independently? The reason is that language
is highly systematic—so much so,
indeed, that much of linguistic theory is concerned with tracking the
systematic relationships between different aspects of linguistic structure.
Although language is an integrated system, it can, nonetheless, be viewed as
comprising a complex set of “features” or “traits” which may or may not be
passed on from one generation to the next (concerning lexical items, idioms,
aspects of phonology, syntax and so on). To a first approximation, traits that
are easy for learners to acquire and use will become more prevalent; traits
that are more difficult to acquire and use will disappear. Thus, selectional
pressure from language learners and users will shape the way in which language
evolves. Crucially, the systematic character of linguistic traits means that,
to some degree at least, the fates of different traits in a language are
intertwined. That is, the degree to which any particular trait is easy to learn
or process will, to some extent, depend on the other features of the
language—because language users will tend to learn and process each aspect of
the language in the light of their experience with the rest. This picture is
familiar in biology—the selectional impact of any gene depends crucially on the
rest of the genome; the selectional forces on each gene, for good or ill, are
tied to the development and functioning of the entire organism.
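The first-approximation claim above, that easily acquired traits spread while hard-to-acquire traits disappear, can be illustrated with a deliberately minimal sketch. It is not a model taken from the literature cited here. The two variant names, their learnability values, and the reweighting rule are all assumptions made for illustration, and the sketch ignores the interdependence between traits that the paragraph stresses.

```python
def next_generation(shares, learnability):
    """One round of transmission: each variant's share is reweighted by how
    reliably learners acquire it (an assumed rule), then renormalized."""
    weighted = {v: shares[v] * learnability[v] for v in shares}
    total = sum(weighted.values())
    return {v: w / total for v, w in weighted.items()}


def simulate(shares, learnability, generations):
    """Return the list of share distributions, starting with the initial one."""
    history = [shares]
    for _ in range(generations):
        shares = next_generation(shares, learnability)
        history.append(shares)
    return history


# Hypothetical values: two competing variants of the same linguistic trait.
start = {"easy_variant": 0.5, "hard_variant": 0.5}
learnability = {"easy_variant": 0.9, "hard_variant": 0.6}

history = simulate(start, learnability, generations=20)
for g in (0, 5, 10, 20):
    print(g, round(history[g]["easy_variant"], 4))
```

Under these assumed values the harder variant is almost entirely displaced within twenty generations of learners. A fuller sketch would let each variant's learnability depend on which other variants are currently prevalent.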
Construing
language as an evolutionary system has implications for explanations of what is being selected in language
evolution. From the viewpoint of generative grammar, the unit of selection
would seem to be either specific UG principles (in PPT; Newmeyer, 1991),
particular parts of the UG toolkit (in SS; Culicover & Jackendoff, 2005),
or recursion in the form of Merge (in MP; Hauser et al., 2002). In all cases,
selection would seem to take place at a high level of abstraction that cuts
across a multitude of specific linguistic constructions. Our approach suggests
a different perspective inspired by the “lexical turn” in linguistics (e.g.,
Combinatory Categorial Grammar, Steedman, 2000; Head-driven Phrase Structure
Grammar, Sag & Pollard, 1987; Lexical-Functional Grammar, Bresnan,
1982), focusing on specific lexical items with their associated syntactic and
semantic information. Specifically, we adopt a Construction Grammar view of
language (e.g., Croft, 2000, 2001; Goldberg, 2006; O’Grady, 2005), proposing that
individual constructions consisting of words or combinations thereof are among
the basic units of selection.
To
spell out the parallel, the idiolect of an individual speaker is analogous to
an individual organism; a language (e.g., Mandarin, French) is akin to a
species. A linguistic “genotype” corresponds to the neural representation of an
idiolect, instantiated by a collection of mental “constructions”, which are
here analogous to genes, and gives rise to linguistic behavior—the language
“phenotype”—characterized by a collection of utterances and interpretations.
Just as the fitness of an individual gene depends on its interaction with other
genes, so the fitness of an individual construction is intertwined with those
of other constructions; i.e., constructions are part of a (linguistic) system.
A species in biology is defined by the ability to interbreed; a “language
species” is defined by mutual intelligibility. Hence, interbreeding and
mutually intelligible linguistic interactions can be viewed as analogous
processes by which genetic material and constructions can propagate.
The
long-term survival of any given construction is affected both by its individual
properties (e.g., frequency of usage) and how well it fits into the overall
linguistic system (e.g., syntactic, semantic, or pragmatic overlap with other
constructions). In a series of linguistic and corpus-based analyses, Bybee
(2007) has shown how frequency of occurrence plays an important role in shaping
language from phonology to morphology to morphosyntax, due to the effects of
repeated processing experiences with specific examples (either types or
tokens). Additionally, groups of constructions overlapping in terms of
syntactic, semantic, and/or pragmatic properties emerge and form the basis for
usage-based generalizations (e.g., Goldberg, 2006; Tomasello, 2003). Crucially,
however, these groupings lead to a distributed system of local generalizations across partially overlapping constructions,
rather than the abstract, mostly global generalizations of current generative
grammar.
In
psycholinguistics, the effects of frequency and pattern overlap have been
observed in so-called Frequency × Regularity interactions. As an example, consider the
acquisition of the English past tense. Frequently occurring mappings, such as go → went,
are learned more easily than more infrequent mappings, such as lie → lay. However,
low-frequency patterns may be more easily learned if they overlap in part with
other patterns. Thus, the partial overlap in the mappings from stem to past
tense in sleep → slept, weep → wept, keep → kept (i.e., -eep → -ept) makes the learning of these
mappings relatively easy even though none of the words individually has a
particularly high frequency. Importantly, the two factors—frequency and
regularity (i.e., degree of partial overlap)—interact with each other. High
frequency patterns are easily learned independent of whether they are regular
or not, whereas the learning of low-frequency patterns suffers if they are not
regular (i.e., if they do not have partial overlap with other patterns).
Results from psycholinguistic experimentation and computational modeling have
observed such Frequency × Regularity
interactions across many aspects of language, including auditory word
recognition (Lively, Pisoni & Goldinger, 1994), visual word recognition
(Seidenberg, 1985), English past tense acquisition (Hare & Elman, 1995),
and sentence processing (Juliano & Tanenhaus, 1994; MacDonald &
Christiansen, 2002; Pearlmutter & MacDonald, 1995).
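The interaction described above can be made concrete with a toy calculation. This is only a sketch under stated assumptions, not one of the connectionist or experimental models cited: the function learning_score, the saturation constant k, and all token counts are hypothetical. The single assumption doing the work is that ease of learning is a saturating function of total "support", where support is an item's own frequency plus the frequencies of other items sharing its stem-to-past-tense pattern.

```python
def learning_score(own_frequency, neighbor_frequencies, k=50.0):
    """Toy ease-of-learning score in [0, 1).

    Support is the item's own token count plus the token counts of other
    items that share the same mapping (e.g., -eep -> -ept). The saturating
    form support / (support + k) is an assumption made for illustration.
    """
    support = own_frequency + sum(neighbor_frequencies)
    return support / (support + k)


# Hypothetical token counts per fixed amount of input.
high_isolated = learning_score(500, [])       # frequent, no overlapping pattern
high_overlap = learning_score(500, [8, 6])    # frequent, two overlapping neighbors
low_isolated = learning_score(5, [])          # infrequent, no overlapping pattern
low_overlap = learning_score(5, [8, 6])       # infrequent, two overlapping neighbors

benefit_for_high = high_overlap - high_isolated
benefit_for_low = low_overlap - low_isolated
print(round(benefit_for_high, 3), round(benefit_for_low, 3))
```

With these assumed numbers, pattern overlap barely changes the score of the frequent item but substantially raises the score of the infrequent one, which is the qualitative shape of a Frequency × Regularity interaction.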
In
our case, we suggest that similar interactions between frequency and pattern
overlap are likely to play an important role in language evolution. Individual
constructions may survive through frequent usage or because they participate in
usage-based generalizations through syntactic, semantic or pragmatic overlap
with other similar constructions. Further support for this suggestion comes
from artificial language learning studies with human subjects, demonstrating
that certain combinations of artificial-language structures are more easily
learned than others given sequential learning biases (e.g., Christiansen, 2000; Christiansen & Reeder, 2006; Saffran,
2001; see Section 6.3). For example, Ellefson and Christiansen (2000) compared
human learning across two artificial languages that only differed in the order
of words in two out of six sentence types. They found that not only was the
more “natural” language learned better overall but also that the four sentence
types common to both languages were learned better as well. This suggests that
the artificial languages were learned as integrated systems, rather than as
collections of independent items. Further corroboration comes from a study by
Kaschak and Glenberg (2004) who had adult participants learn the needs construction (e.g., “The meal
needs cooked”), a feature of the American English dialect spoken in the
northern midlands region from western Pennsylvania across Ohio, Indiana, and
Illinois to Iowa. The training on the needs construction facilitated the processing
of related modifier constructions (e.g., “The meal needs cooked vegetables”),
again suggesting that constructions form an integrated system that can be
affected by the learning of new constructions. Thus, although constructions are
selected independently, they also provide an environment for each other within
which selection takes place, just as the selection of individual genes is tied
to the survival of the other genes that make up an organism.
5.3. The nature of language universals
We
have argued that language is best viewed as a linguistic system adapted to the
human brain. But if evolution is unlikely to have bestowed us with an innate
UG, then how can we account for the various aspects of language that UG
constraints are supposed to explain? That is, how can we explain the existence
of apparent language universals: regularities in language structure and use?
Notice, however, that it is by no means clear exactly what counts as a language
universal. Rather, the notion of language universals differs considerably
across language researchers (e.g., the variety in perspectives among
contributions in Christiansen, Collins & Edelman, in press). Many linguists
working within the generative grammar framework see universals as primarily,
and sometimes exclusively, deriving from UG (e.g., Hornstein & Boeckx, in
press; Pinker & Jackendoff, in press). Functional linguists, on the other
hand, view universals as arising from patterns of language usage due to
pragmatic, processing and other constraints, and amplified in diachronic
language change (e.g., Bybee, in press). However, even within the same
theoretical linguistic framework, there is often little agreement about what
the exact universals are. For example, when surveying specific universals
proposed by different proponents of UG, Tomasello (2004) found little overlap
between proposed universals.
Although
there may be little agreement about specific universals, some consensus can
nonetheless be found with respect to their general nature. Thus, within
mainstream generative grammar approaches (including MP and PPT), language
universals are seen as arising from the inner workings of UG. As noted by
Hornstein and Boeckx (in press),
on this conception universals are likely to be quite abstract. They need not be observable even were one to survey thousands of languages looking for commonalities (unlike, say Greenbergian Universals). In fact, on this conception, the mere fact that every language displayed some property P does not imply that P is a universal in the sense of being a feature of UG. Put more paradoxically, the fact that P holds universally does not imply that P is a universal. Conversely, some property can be a universal even if only manifested in a single natural language. The only thing that makes a principle a Universal on this view is that it is a property of our innate ability to grow a language. (pp. 3-4)
Thus,
from the perspective of MP and PPT, language universals are by definition
properties of UG; that is, they are formal
universals (Chomsky, 1965). A similar view of universals also figures within
the SS framework (Culicover & Jackendoff, 2005), defined in terms of the
universal toolkit encoded in UG. Because different languages are hypothesized
to use different subsets of tools, the SS approach—like MP and PPT—suggests
that some universals may not show up in all languages (Pinker & Jackendoff,
in press). However, both notions of universals face the logical problem of
language evolution discussed above: How could the full set of UG constraints
have evolved if any single linguistic environment only ever supported a subset
of them?
The
solution to this problem, we suggest, is to adopt a non-formal conception of
universals in which they emerge from processes of repeated language acquisition
and use. We see universals as products of the interaction between constraints
deriving from the way our thought processes work, from perceptuo-motor factors,
from cognitive limitations on learning and processing, and from pragmatic
sources (Section 6 below). This view implies that most universals are unlikely
to be found across all languages; rather, “universals” are more akin to
statistical trends tied to patterns of language use. Consequently, specific
universals fall on a continuum ranging from being attested only in some
languages to being found across most languages. An example of the former is the
class of implicational universals, such as that verb-final languages tend to
have postpositions (Dryer, 1992), whereas the presence of nouns and verbs in
most, if not all, languages (minimally as typological prototypes; Croft, 2001)
is an example of the latter. Thus, language universals, we suggest, are best
construed as statistical tendencies with varying degrees of universality across
the world’s languages.
We
have argued that language is too variable, both in time and space, to provide a
selectional pressure that might shape the gradual adaptation of an innate UG
encoding arbitrary, but universal linguistic constraints. Moreover, a putative
innate UG would be too complex and specialized to have credibly arisen through
non-adaptationist mechanisms. Instead, we have proposed that the fit between
language and the brain arises because language has evolved to be readily
learned and processed by the brain. We now consider what kinds of
non-linguistic constraints are likely to have shaped language to the brain, and
given rise to statistical tendencies in language structure and use.
6. Constraints on Language Structure
We
have proposed that language has adapted to the non-linguistic constraints
deriving from language learners and users, giving rise to observable linguistic
universals. But how far can these constraints be identified? To what extent can
linguistic structure previously ascribed to an innate UG be identified as
having a non-linguistic basis? Clearly, establishing a complete answer to this
question would require a vast program of research. In this section, we
illustrate how research from different areas of the language sciences can be
brought together to explain aspects of language previously thought to require
the existence of UG for their explanation. For the purpose of exposition, we
divide the constraints into four groups relating to thought, perceptuo-motor
factors, cognition, and pragmatics. These constraints derive from the
limitations and idiosyncratic properties of the human brain and other parts of
our body involved in language (e.g., the vocal tract). However, as we note
below, any given linguistic phenomenon is likely to arise from a combination of
multiple constraints that cut across these groupings, and thus across different
kinds of brain mechanisms.
6.1. Constraints from thought
The
relationship between language and thought is potentially very rich,
but also extremely controversial. Thus, the analytic tradition in philosophy
can be viewed as attempting to understand thought through a careful analysis of
language (e.g., Blackburn, 1984); it has been widely assumed that the structure
of sentences (or utterances, and perhaps the contexts in which they stand), and
the inferential relations over them, provide an analysis of thought. A standard
assumption is that thought is largely prior to, and independent of, linguistic
communication. Accordingly, fundamental properties of language such as compositionality,
function-argument structure, quantification, aspect and modality, may arise
from the structure of the thoughts language is required to express (e.g.,
Schoenemann, 1999). Moreover, presumably language provides a reasonably
efficient mapping of the mental representation of thoughts, with these
properties, into phonology. This viewpoint can be instantiated in a variety of
ways. For example, Steedman’s emphasis on incremental interpretation (e.g.,
that successive partial semantic representations are constructed as the
sentence unfolds—i.e., the thought that a sentence expresses is built up
piecemeal) is one motivation for categorial grammar (e.g., Steedman, 2000).
From a very different stance, the aim of finding a “perfect” relationship between
thought and phonology is closely related to the goals of the Minimalist Program
(Chomsky, 1995).[xiii]
Indeed, Chomsky (e.g., 2005) has recently suggested that language may have
originated as a vehicle for thought, and only later become exapted to serve as
a system of communication. This viewpoint would not, of course, explain the
content of a putative UG, which concerns principles for mapping mental
representations of thought into phonology; and this mapping surely is specific to communication: inferences
are, after all, presumably defined over mental representations of thoughts,
rather than phonological representations, or, for that matter, syntactic trees.
The
lexicon is presumably also strongly constrained by processes of perception and
categorization—the meanings of words must be both learnable and cognitively
useful (e.g., Murphy, 2002); indeed, the philosophical literature on lexical
meaning, from a range of theoretical perspectives, sees cognitive constraints
as fundamental to understanding word meaning, whether these constraints are
given by innate systems of internal representation (Fodor, 1975), or primitive
mechanisms of generalization (Quine, 1960). Cognitive linguists (e.g., Croft
& Cruse, 2004) have argued for a far more intimate relation between
thought and language: for example, that basic conceptual machinery (e.g.,
concerning spatial structure) and the mapping of such structure into more
abstract domains (e.g., via metaphor) is, according to some accounts, evident
in languages (e.g., Lakoff & Johnson, 1980). And from a related perspective
(e.g., Croft, 2001), some linguists have argued that semantic categories of
thought (e.g., of objects and relations) may be shared between languages,
whereas syntactic categories and constructions are defined by language-internal
properties, such as distributional relations, so that the attempt to find
cross-linguistic syntactic universals is doomed to failure.
6.2. Perceptuo-motor constraints
The motor and perceptual machinery
underpinning language seems inevitably to have some influence on language
structure. The seriality of vocal output, most obviously, forces a sequential
construction of messages. A perceptual and memory system which is typically a
“greedy” processor, and has a very limited capacity for storing “raw” sensory
input of any kind (e.g., Haber, 1983) may, moreover, force a code which can
be interpreted incrementally (rather than the many practical codes in
communication engineering, in which information is stored in large blocks,
e.g., MacKay, 2003). The noisiness and variability (both with context and
speaker) of vocal (or, indeed, signed) signals may, moreover, force a “digital”
communication system, with a small number of basic messages: i.e., one that
uses discrete units (phonetic features or phonemes). The basic phonetic
inventory is transparently related to deployment of the vocal apparatus, and it
is also possible that it is tuned, to some degree, to respect “natural”
perceptual boundaries (Kuhl, 1987). Some theorists have argued for more far-reaching
connections. For example, MacNeilage (1998) argues that aspects of syllable
structure emerge as a variation on the jaw movements involved in eating, and
for some cognitive linguists, the perceptual-motor system is a crucial part of
the machinery on which the linguistic system is built (e.g., Hampe, 2006). The
depth of the influence of perceptual and motor control on more abstract aspects
of language is controversial—but it seems plausible that such influence may be
substantial.
6.3. Cognitive constraints on learning and processing
In
our framework, language acquisition is construed not as learning a distant
grammar, but as learning how to process
language. Although constraints on learning and processing are often treated
separately (e.g., Bybee, 2007; Hawkins, 2004; Tomasello, 2003), we see them as
being highly intertwined, subserved by the very same underlying mechanisms.
Language processing involves extracting regularities from highly complex
sequential input, pointing to a connection between general sequential learning
(e.g., planning, motor control, etc., Lashley, 1951) and language: both involve
the extraction and further processing of discrete elements
occurring in complex temporal sequences. It is therefore not surprising that sequential learning tasks have become an
important experimental paradigm for studying language acquisition and
processing (sometimes under the heading of “artificial grammar/language
learning”, Gómez & Gerken, 2000, or “statistical learning”, Saffran, 2003).
Sequential learning has thus been demonstrated for a variety of different
aspects of language, including speech segmentation (Curtin, Mintz &
Christiansen, 2005; Saffran, Aslin & Newport, 1996; Saffran, Newport &
Aslin, 1996), discovering complex word-internal structure between nonadjacent
elements (Newport & Aslin, 2004; Onnis, Monaghan, Chater & Richmond,
2005; Peña, Bonatti, Nespor & Mehler, 2002), acquiring gender-like
morphological systems (Brooks, Braine, Catalano, Brody & Sudhalter, 1993;
Frigo & McDonald, 1998), locating syntactic phrase boundaries (Saffran,
2001, 2002), using function words to delineate phrases (Green, 1979),
integrating prosodic and morphological cues in the learning of phrase structure
(Morgan, Meier & Newport, 1987), integrating phonological and
distributional cues (Monaghan, Chater & Christiansen, 2005), and detecting
long-distance relationships between words (Gómez, 2002; Onnis, Christiansen,
Chater & Gómez, 2003).
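To make the idea of sequential (statistical) learning in speech segmentation more concrete, the following sketch estimates syllable-to-syllable transitional probabilities from an unbroken stream and posits a word boundary wherever that probability dips. It is an illustration under stated assumptions rather than a reconstruction of any cited experiment: the four trisyllabic nonsense words, the way the stream is built, and the 0.9 threshold are all hypothetical choices, and each syllable is assumed to occur in only one word.

```python
from collections import Counter
from itertools import permutations

# Hypothetical trisyllabic "words"; each syllable belongs to exactly one word.
WORDS = [("bi", "da", "ku"), ("pa", "do", "ti"),
         ("go", "la", "bu"), ("tu", "pi", "ro")]


def build_stream(words):
    """Concatenate every ordering of the words into one unbroken syllable stream."""
    stream = []
    for order in permutations(words):
        for word in order:
            stream.extend(word)
    return stream


def transitional_probabilities(stream):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(stream, tps, threshold=0.9):
    """Insert a boundary wherever the transitional probability is below threshold."""
    chunks, current = [], [stream[0]]
    for prev, nxt in zip(stream, stream[1:]):
        if tps[(prev, nxt)] < threshold:
            chunks.append("".join(current))
            current = []
        current.append(nxt)
    chunks.append("".join(current))
    return chunks


stream = build_stream(WORDS)
tps = transitional_probabilities(stream)
recovered = set(segment(stream, tps))
print(sorted(recovered))
```

In this artificial stream, transitions inside a word are perfectly predictable while transitions across word boundaries are not, so the dips in transitional probability line up with the word edges. Natural speech is far noisier, which is one reason the studies cited above also examine how multiple cues are integrated.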
The close relationship between sequential learning and grammatical
ability has been further corroborated by recent neuroimaging studies, showing
that people trained on an artificial language have the same event-related
potential (ERP) brainwave patterns to ungrammatical artificial-language
sentences as to ungrammatical natural-language sentences (Christiansen, Conway
& Onnis, 2007; Friederici, Steinhauer & Pfeifer, 2002). Moreover, novel
incongruent musical sequences elicit ERP patterns that are statistically
indistinguishable from syntactic incongruities in language (Patel, Gibson,
Ratner, Besson & Holcomb, 1998). Results from a magnetoencephalography
(MEG) experiment further suggest that Broca’s area plays a crucial role in
processing music sequences (Maess, Koelsch, Gunter & Friederici, 2001).
Finally, event-related functional magnetic resonance imaging (fMRI) has shown
that the same brain area—Broca’s area—is involved in an artificial grammar
learning task and in normal natural language processing (Petersson, Forkstam & Ingvar, 2004). Further evidence comes from behavioral studies with
language impaired populations, showing that aphasia (Christiansen,
Kelly, Shillcock & Greenfield, 2007; Hoen et al.,
2003), language learning disability (Plante,
Gómez & Gerken, 2002), and specific language impairment (Hsu, Christiansen,
Tomblin, Zhang & Gómez, 2006; Tomblin, Mainela-Arnold & Zhang, 2007)
are associated with impaired sequential learning. Together, these studies
strongly suggest that there is considerable overlap in the neural mechanisms
involved in language and sequential learning[xiv] (see also Conway, Karpicke & Pisoni, 2007; Ullman,
2004; Wilkins & Wakefield, 1995, for similar perspectives).
This psychological research can be seen as providing a foundation
for work in functional and typological linguistics indicating how theoretical
constraints on sequential learning and processing can be used to explain
certain universal patterns in language structure and use. One suggestion, from O’Grady (2005), is
that the language processing system seeks to resolve linguistic dependencies
(e.g., between verbs and their arguments) at the first opportunity—a tendency
that might not be syntax-specific, but instead an instance of a general
cognitive tendency to attempt to resolve ambiguities rapidly in linguistic
(Clark, 1975) and perceptual input (Pomerantz & Kubovy, 1986). In a similar
vein, Hawkins (1994, 2004) and Culicover (1999) propose specific measures of
processing complexity (roughly, the number of linguistic constituents required
to link syntactic and conceptual structure), which they assume underpin
judgments concerning linguistic acceptability. The collection of studies in
Bybee (2007) further underscores the importance of frequency of use in shaping
language. Importantly, this line of work has begun to
detail learning and processing constraints that can help explain specific
linguistic patterns, such as the aforementioned examples of pronoun binding
(previous examples 1-4; see O’Grady, 2005) and heavy NP-shift (examples 5-6;
see Hawkins, 1994, 2004), and indicates an increasing emphasis on
performance constraints within linguistics.
In turn, a growing body of empirical
research in computational linguistics, cognitive science, and psycholinguistics
has begun to explore how these theoretical constraints may be instantiated in
terms of computational and psychological mechanisms. For instance, basic word order patterns may
derive from memory constraints related to sequential learning and
processing of linguistic material, as indicated by computational simulations
(e.g., Christiansen & Devlin, 1997; Kirby, 1999; Lupyan & Christiansen,
2002; Van Everbroeck, 1999), human experimentation involving artificial
languages (e.g., Christiansen, 2000; Christiansen & Reeder, 2006), and
cross-linguistic corpus analyses (e.g., Bybee, 2002; Hawkins, 1994, 2004).
Similarly, behavioral experiments and computational modeling have provided
evidence for general processing constraints (instead of innate subjacency
constraints) on complex question formation (Berwick & Weinberg, 1984;
Ellefson & Christiansen, 2000).
6.4. Pragmatic constraints
Language
is likely, moreover, to be substantially shaped by the pragmatic constraints
involved in linguistic communication. The program of developing and extending
Gricean implicatures (Grice, 1967; Levinson, 2000; Sperber & Wilson, 1986)
has revealed enormous complexity in the relationship between the literal
meaning of an utterance and the message that the speaker intends to convey.
Pragmatic processes may, indeed, be crucial in understanding many aspects of linguistic
structure, as well as the processes of language change.
Consider
the nature of anaphora and binding. Levinson (2000) notes that the patterns of
“discourse” anaphora (7) and syntactic anaphora (8) have interesting parallels.
(7)
a. John arrived. He began to sing.
b. John arrived. The man began to sing.
(8)
a. John arrived and he began to
sing.
b. John arrived and the man began to
sing.
In
both (7) and (8), the first form indicates preferred coreference of he and John; the second form prefers non-coreference. The general pattern
is that brief expressions encourage coreference with a previously introduced
item; Grice’s maxim of quantity implies that, by default, a prolix expression
will not be used where a brief expression could be, and hence prolix
expressions are typically taken to imply non-coreference with previously
introduced entities. Where the referring expression is absent, then coreference
may be required as in (9), in which the singer can only be John:
(9)
John arrived and began to sing.
It
is natural to assume that syntactic structures emerge, diachronically, from
reduction of discourse structures; in Givón’s phrase, “Yesterday’s
discourse is today’s syntax” (as cited in Tomasello, 2006). The shift, over
time, from default constraint to rigid rule is widespread in language change
and much studied in the sub-field of grammaticalization (see Section 7.1).
Applying
this pragmatic perspective to the binding constraints, Levinson (1987a, 1987b,
2000) notes that the availability, but non-use, of the reflexive himself provides a default (and later,
perhaps, rigid) constraint that him
does not corefer with John in (10).
(10)
a. John_i likes himself_i
b. John_i likes him_j
Levinson
(2000), building on related work by Reinhart (1983), provides a comprehensive
account of the binding constraints, and putative exceptions to them, purely on
pragmatic principles (see also Huang, 2000, for a cross-linguistic
perspective). In sum, pragmatic principles can at least partly explain both the
structure, and origin, of linguistic patterns that are often viewed as solely
formal, and hence arbitrary.
6.5. The impact of multiple constraints
In
this section, we have discussed four types of constraints that have shaped the
evolution of language. Importantly, we see these constraints as interacting
with one another, such that individual linguistic phenomena arise from a
combination of several different types of constraints. For example, the
patterns of binding phenomena are likely to require explanations that cut
across the four types of constraints, including constraints on cognitive
processing (O’Grady, 2005) and pragmatics (Levinson, 1987a; Reinhart, 1983).
That is, the explanation of any given aspect of language is likely to require
the inclusion of multiple overlapping constraints deriving from thought,
perceptuo-motor factors, cognition, and pragmatics.
The
idea of explaining language structure and use through the integration of
multiple constraints goes back at least to early functionalist approaches to
the psychology of language (e.g., Bates & MacWhinney, 1979; Bever, 1970;
Slobin, 1973). It plays an important role in current constraint-based theories
of sentence comprehension (e.g., MacDonald, Pearlmutter &
Seidenberg, 1994; Tanenhaus & Trueswell, 1995). Experiments have demonstrated how adults’ interpretations
of sentences are sensitive to a variety of constraints, including specific
world knowledge relating to the content of an utterance (e.g., Kamide, Altmann
& Haywood, 2003), the visual context in which the utterance is produced
(e.g., Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy,
1995), the sound properties
of individual words (Farmer, Christiansen & Monaghan, 2006), the processing
difficulty of an utterance as well as how such difficulty may be affected by
prior experience (e.g., Reali & Christiansen, 2007), and various pragmatic
factors (e.g., Fitneva & Spivey, 2004). Similarly, the integration of
multiple constraints—or “cues”—also figures prominently in contemporary
theories of language acquisition (see e.g., contributions in Golinkoff et al.,
2000; Morgan & Demuth, 1996; Weissenborn & Höhle, 2001; for a review,
see Monaghan & Christiansen, in press).
The
multiple-constraints satisfaction perspective on language evolution also offers
an explanation for why language is unique to humans: as a cultural product,
language has been shaped by constraints from multiple mechanisms, some of which
have properties unique to humans. Specifically, we suggest that language does
not involve any qualitatively different mechanisms compared to extant apes, but
instead a number of quantitative evolutionary refinements of older primate
systems (e.g., for intention sharing and understanding, Tomasello, Carpenter,
Call, Behne & Moll, 2005; or complex sequential learning and processing[xv],
Conway & Christiansen, 2001). These changes could be viewed as providing
necessary pre-adaptations that, once in place, allowed language to emerge
through cultural transmission (e.g., Elman, 1999). It is also conceivable that
initial changes, if functional, could have been subject to further
amplification through the Baldwin effect, perhaps resulting in multiple
quantitative shifts in human evolution. The key point is that none of these
changes would result in the evolution of UG. The species-specificity of a given
trait does not necessitate postulating specific biological adaptations for that
trait. For example, even though playing tag may be species-specific and perhaps
even universal, few people, if any, would argue that humans have evolved
specific adaptations for playing this game. Thus, the uniqueness of language is
better viewed as part of the larger question: why are humans different from
other primates? It seems clear that considering language in isolation is not
going to give us the answer to this question.
7. How Constraints Shape Language over Time
According
to the view that language evolution is determined by the development of UG,
there is a sharp divide between questions of language evolution (how the
genetic endowment could arise evolutionarily), and historical language change
(which is viewed as variation within the genetically determined limits of
possible human languages). By contrast, if language has evolved to fit prior
cognitive and communicative constraints, then it is plausible that historical
processes of language change provide a model of language evolution; indeed,
historical language change may be language evolution in microcosm. This
perspective is consistent with much work in functional and typological linguistics
(e.g., Bever & Langendoen, 1971; Croft, 2000; Givón, 1998; Hawkins, 2004;
Heine & Kuteva, 2002).
At
the outset, it is natural to expect that language will be the outcome of
competing selectional forces. On the one hand, as we shall note, there will be
a variety of selectional forces that make the language “easier” for
speakers/hearers; on the other, it is likely that expressibility is a powerful
selectional constraint, tending to increase linguistic complexity over
evolutionary time. For instance, it has been suggested that the use of
hierarchical structure and limited recursion to express more complex meanings
may have arrived at later stages of language evolution (Jackendoff, 2002;
Johansson, 2006). Indeed, the modern Amazonian language, Pirahã, lacks
recursion and has one of the world’s smallest phoneme inventories (though its
morphology is complex), limiting its expressivity (Everett, 2005; but see also
the critique by Nevins, Pesetsky & Rodrigues, 2007, and Everett’s, 2007,
response).
While expressivity is one selectional force that may tend to increase linguistic
complexity, it will typically stand in opposition to another: ease of learning
and processing will tend to favor linguistic simplicity. But the picture may be
more complex: in some cases, ease of learning and ease of processing may stand
in opposition. For example, regularity makes items easier to learn; the shortening of frequent items,
and consequent irregularity, may make aspects of language easier to say. There are similar tensions between
ease of production (which favors simplifying the speech signal), and ease of
comprehension (which favors a richer, and hence more informative, signal).
Moreover, whereas constraints deriving from the brain provide pressures toward
simplification of language, processes of grammaticalization can add complexity
to language (e.g., by the emergence of morphological markers). Thus, part of
the complexity of language, just as in biology, may arise from the complex
interaction of competing constraints.
7.1. Language evolution as linguistic change
Recent theory in diachronic linguistics has focused on grammaticalization (e.g.,
Bybee, Perkins & Pagliuca, 1994; Heine, 1991; Hopper & Traugott, 1993):
the process by which functional items, including closed class words and
morphology, develop from what are initially open-class items. This transitional
process involves a “bleaching” of meaning, phonological reduction, and
increasingly rigid dependencies with other items. Thus, the English number one is likely to be the root of a(n).
The Latin cantare habeo (I have (something) to sing) mutated into chanterai,
cantaré, canterò (I will sing in French, Spanish,
Italian). The suffix corresponds phonologically to I have in each language (respectively, ai, he, ho; the have element has collapsed into inflectional morphology,
Fleischman, 1982). The same processes of grammaticalization can also cause
certain content words over time to get bleached of their meaning and become
grammatical particles. For example, go and have, when used as auxiliary
verbs (as in I am going to sing or I have forgotten my hat), have been
bleached of their original meanings concerning physical movement and possession
(Bybee et al., 1994). The processes of grammaticalization appear gradual, and
follow historical patterns, suggesting that there are systematic selectional
pressures operative in language change. More generally, these processes provide
a possible origin of grammatical structure from a proto-language initially
involving perhaps unordered and uninflected strings of content words.
From a historical perspective, it is
natural to view many aspects of syntax as emerging from processing or pragmatic
factors. Revisiting our discussion of binding constraints, we might view
complementary distributions of reflexive and non-reflexive pronouns as
initially arising from pragmatic factors; the resulting pattern may be acquired
and modified by future generations of learners, to some degree independently of
those initial factors (e.g., Givón, 1979; Levinson, 1987b). Thus, binding
constraints might be a complex product of many forces, including pragmatic
factors, and learning and processing biases—and hence the subtlety of those
constraints should not be entirely surprising. But from the present
perspective, the fact that such a complex system of constraints is readily
learnable, is neither puzzling, nor indicative of an innately specified genetic
endowment. Rather the constraints are learnable because they have been shaped
by the very pragmatic, processing and learning constraints with which the
learner is endowed.
Understanding the cognitive and communicative basis for the direction of grammaticalization
and related processes is an important challenge. But equally, the suggestion
that this type of observable historical change may be continuous with language
evolution opens up the possibility that research on the origin of language may
not be a theoretically isolated island of speculation, but may connect directly
with one of the most central topics in linguistics: the nature of language
change (e.g., Zeevat, 2006). Indeed, grammaticalization has become the center
of many recent perspectives on the evolution of language as mediated by
cultural transmission across hundreds (perhaps thousands) of generations of learners
(e.g., Bybee et al., 1994; Givón, 1998; Heine & Kuteva, 2002; Schoenemann,
1999; Tomasello, 2003). Although the present approach also emphasizes the
importance of grammaticalization in the evolution of complex syntax, it differs
from other approaches in that we see this diachronic process as being
constrained by limitations on learning and processing. Indeed, there have even
been intriguing attempts to explain some aspects of language change with
reference to the learning properties of connectionist networks. For example,
Hare & Elman (1995) demonstrated how cross-generational learning by
sequential learning devices can model the gradual historical change in English
verb inflection from a complex past tense system in Old English to the dominant
“regular” class and small classes of “irregular” verbs of modern English.
7.2. Language evolution through cultural transmission
How far can language evolution, and historical processes of language change, be
explained in terms of general mechanisms of cultural transmission, that is, by
attempting to capture the processes by which information is passed from person
to person, and how information might be selectively distorted by such
processes? Crucial to any such model, whether concerning language or not, are
assumptions about the channel over which cultural information is transmitted;
the structure of the network of social interactions over which transmission
occurs; and the learning and processing mechanisms that support the acquisition
and use of the transmitted information (Boyd & Richerson, 2005).
A wide range of recent computational models of the cultural transmission of
language has been developed, with different points of emphasis. Some of these
models have considered how language is shaped by the process of transmission
over successive generations, by the nature of the communication problem to be
solved and/or by the nature of the learners (e.g., Batali, 1998; Kirby, 1999).
For example, Kirby, Dowman and Griffiths (2007) show that, if information is
transmitted directly between individual learners, and learners sample grammars
from the Bayes posterior distribution of grammars given that information, then
language asymptotically converges to match the priors initially encoded by the
learners. In contrast, Smith, Brighton and Kirby (2003), using a different
model of how information is learned, indicate how compositional structure in
language might have resulted from the complex interaction of learning
constraints and cultural transmission, resulting in a “learning bottleneck”. Moreover, a growing number of studies
have started to investigate the potentially important interactions between
biological and linguistic adaptation in language evolution (e.g., Christiansen
et al., 2006; Hurford, 1990; Hurford & Kirby, 1999; Kvasnicka &
Pospichal, 1999; Livingstone & Fyfe, 2000; Munroe & Cangelosi, 2002;
Smith, 2002, 2004; Yamauchi, 2001).
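The convergence-to-the-prior result can be illustrated with a minimal sketch. It is not the model of Kirby, Dowman and Griffiths (2007); it assumes, purely for illustration, two candidate grammars, utterances that reveal the speaker's grammar except with a fixed noise probability, and learners who observe a small fixed number of utterances and then sample a grammar from their posterior. All names and parameter values (prior, noise, n_utterances) are hypothetical. Rather than simulating individual learners, the sketch computes the exact generation-to-generation transition probabilities and iterates them.

```python
from itertools import product


def likelihood(data, grammar, noise):
    """P(data | grammar): each utterance matches the grammar with prob. 1 - noise."""
    p = 1.0
    for utterance in data:
        p *= (1.0 - noise) if utterance == grammar else noise
    return p


def transition_matrix(prior, noise, n_utterances):
    """T[teacher][learner]: chance that a learner exposed to a teacher using
    grammar `teacher` ends up sampling grammar `learner` from its posterior.
    Illustrative assumption: exactly two grammars, labelled 0 and 1."""
    grammars = (0, 1)
    T = [[0.0, 0.0], [0.0, 0.0]]
    # Enumerate every possible data set the learner could observe.
    for data in product(grammars, repeat=n_utterances):
        lik = [likelihood(data, g, noise) for g in grammars]
        evidence = sum(prior[g] * lik[g] for g in grammars)
        posterior = [prior[g] * lik[g] / evidence for g in grammars]
        for teacher in grammars:
            for learner in grammars:
                T[teacher][learner] += lik[teacher] * posterior[learner]
    return T


def iterate(dist, T, generations):
    """Push a distribution over grammars through successive generations."""
    for _ in range(generations):
        dist = [sum(dist[t] * T[t][l] for t in range(2)) for l in range(2)]
    return dist


prior = [0.8, 0.2]  # hypothetical learning bias favoring grammar 0
T = transition_matrix(prior, noise=0.1, n_utterances=3)
# Start with every speaker using the dispreferred grammar 1.
final = iterate([0.0, 1.0], T, generations=200)
print(final)
```

Under these assumptions, the distribution over grammars after many generations is close to the learners' prior, and starting the chain from the other grammar leaves the long-run outcome unchanged.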
Of particular interest here are simulations indicating that apparently
arbitrary aspects of linguistic structure may arise from constraints on
learning and processing (e.g., Kirby, 1998, 1999; Van Everbroeck, 1999). For
example, it has been suggested that subjacency constraints may arise from
cognitive limitations on sequential learning (Ellefson & Christiansen,
2000). Moreover, using rule-based language induction, Kirby (1999) accounted
for the emergence of typological universals as a result of domain-general
learning and processing constraints. Finally, note that, in line with the
present arguments, a range of recent studies have challenged the plausibility of
biological adaptation to arbitrary features of the linguistic environment
(e.g., Christiansen et al., 2006; Kirby et al., 2007; Kirby & Hurford,
1997; Munroe & Cangelosi, 2002; Yamauchi, 2001).
The range of factors known to be
important in cultural transmission (e.g., group size and networks of
transmission between group members, fidelity of transmission) has been explored
relatively little in simulation work. Furthermore, to the extent that language
is shaped by the brain, enriching models of cultural transmission of
language, against the backdrop of learning and processing constraints, will be
an important direction for the study both of historical language change and
language evolution. More generally, viewing
language as shaped by cultural transmission (Arbib, 2005; Bybee, 2002; Donald,
1998) only provides the starting point for an explanation of linguistic
regularities. The real challenge, we suggest, is to delineate the wide range of
constraints, from perceptuo-motor to pragmatic (as sketched above), that
operate on language evolution. Detailing these constraints is likely to be
crucial for explanations of complex linguistic regularities, and how they can
readily be learned and processed.
We note here that this perspective on the adaptation of language differs
importantly from the processes of cultural change that operate through
deliberate and conscious innovation and/or evaluation of cultural variants. On
our account, the processes of language change operate to make languages easier
to learn and process, and more communicatively effective. But these changes do
not operate through processes either of “design” or deliberate adoption by
language users. Thus, following Darwin, we view the origin of the adaptive
complexity in language as analogous to the origin of adaptive complexity in
biology. Specifically, the adaptive complexity of biological organisms is
presumed to arise from random genetic variation, winnowed by natural selection
(a “blind watchmaker”; Dawkins, 1986); we argue that the adaptive complexity of
language arises, similarly, from random linguistic variation winnowed by
selectional pressures, though here concerning learning and processing (so
again, we have a blind watchmaker).
By contrast, for aspects of cultural changes for which variants are either
created, or selected, by deliberate choice, the picture is very different. Such
cultural products can be viewed instead as arising from the incremental action
of processes of intelligent design, and more or less explicit evaluations, and
decisions to adopt (see Chater, 2005). Many phenomena discussed by evolutionary
theorists concerning culture (e.g., Campbell, 1965; Richerson & Boyd,
2005)— including those described by meme-theorists (e.g., Blackmore, 1999;
Dawkins, 1976; Dennett, 1995)—fall into this latter category: explanations of
fashions (e.g., wearing baseball caps backwards), catch-phrases, memorable
tunes, engineering methods, cultural conventions and institutions (e.g.,
marriage, revenge killings), scientific and artistic ideas, religious views,
and so on, seem patently to be products of sighted
watchmakers; i.e., they are products, in part at least, of many generations of
intelligent designers, imitators, and critics.
Our focus here concerns, instead, the specific, and interdependent, constraints
operating on particular linguistic structures and of which people have no
conscious awareness. Presumably, speakers do not deliberately contemplate
syntactic reanalyses of existing structures, bleach the meaning of common verbs
so that they play an increasingly syntactic role, or collapse discourse
structure into syntax or syntactic structure into morphology. Of course, there
is some deliberate innovation in language (e.g., people consciously invent new
words and phrases). But such deliberate innovations should be sharply
distinguished from the unconscious operation of the basic learning and
processing biases that have shaped the phonological, syntactic and semantic
regularities of language.
7.3. Language change “in vivo”
We have argued that language has evolved over time to be compatible with the human
brain. However, it might be objected that it is not clear that languages become
better adapted over time given that they all seem capable of expressing a
similar range of meanings (Sereno, 1991). In fact, the idea that all languages
are fundamentally equal and independent of their users—uniformitarianism—is
widely adopted in linguistics, preventing many linguists from thinking about
language evolution (Newmeyer, 2003). Yet, much variation exists in how easy it
is to use a given language to express a particular meaning given the
limitations of human learning and processing mechanisms.
The recent work on creolization in sign language provides a window onto how
pressures towards increased expressivity interact with constraints on learning
and processing “in vivo”. In less than three decades, a sign language has
emerged in Nicaragua, created by deaf children with little exposure to
established languages. Senghas, Kita and Özyürek (2004) compared signed
expressions for complex motions produced by deaf signers of Nicaraguan Sign
Language (NSL) with the gestures of hearing Spanish speakers. The results
showed that the hearing individuals used a single simultaneous movement combining
both manner and path of motion, whereas the deaf NSL signers tended to break
the event into two consecutive signs: one for the path of motion and another
for the manner. Moreover, this tendency was strongest for the signers who had
learned NSL more recently, indicating that NSL has changed from using a
holistic way of denoting motion events to a more sequential, compositional
format. Although such creolization may be considered as evidence of UG (e.g.,
Bickerton, 1984; Pinker, 1994), the results may be better construed in terms of
cognitive constraints on cultural transmission. Indeed, computational
simulations have demonstrated how iterated learning in cultural transmission
can change a language starting as a collection of holistic form-meaning
pairings into a more compositional format, in which sequences of forms are
combined to produce meanings previously expressed holistically (see Kirby &
Hurford, 2002, for a review). Similarly, human experimentation operationalizing
iterated learning within a new “cross-generational” paradigm—in which the
output of one artificial-language learner is used as the input for subsequent
“generations” of language learners—has shown that such learning biases over
generations can change the structure of artificial languages from holistic
mappings to a compositional format (Cornish, 2006). This allows language to
have increased expressivity, while being learnable from exposure to a finite
set of form-meaning pairings. Thus, the change towards using sequential
compositional forms to describe motion events in NSL can be viewed as a
reflection of similar processes of learning and cultural transmission.
In a similar vein, the rapid emergence of a regular SOV (subject-object-verb) word
order in Al-Sayyid Bedouin Sign Language (ABSL; Sandler, Meir, Padden &
Aronoff, 2005) can be interpreted as arising from constraints on learning and
processing. ABSL has a longer history than NSL, going back some 70 years. The
Al-Sayyid Bedouin group forms an isolated community with a high incidence of
congenital deafness, located in the Negev desert region of southern Israel. In
contrast to NSL, which developed within a school environment, ABSL has evolved
in a more natural setting and is recognized as the second language of the
Al-Sayyid village. A key feature of ABSL is that it has developed a basic SOV
word order within sentences (e.g., boy apple eat), with modifiers following heads
(e.g., apple red). Although this type of word order is very common across
the world (Dryer, 1992), it is found neither in the local spoken Arabic dialect
nor in Israeli Sign Language, suggesting that ABSL has developed these
grammatical regularities de novo. In
a series of computational simulations, Christiansen & Devlin (1997) found
that languages with consistent word order were easier to learn by a sequential
learning device compared to inconsistent word orders. Thus, a language with a
grammatical structure such as that of ABSL was easier to learn than one in which an SOV
word order was combined with a modifier-head order within phrases. Similar
results were obtained when human subjects were trained on artificial languages
with either consistent or inconsistent word orders (Christiansen, 2000;
Christiansen & Reeder, 2006). Further simulations have demonstrated how
sequential learning biases can lead to the emergence of languages with regular
word orders through cultural transmission—even when starting from a language
with a completely random word order (Christiansen & Dale, 2004; Reali &
Christiansen, in press).
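How a weak learning bias can be amplified over generations into a regular word order can be sketched in a few lines. The sketch below is not the simulations cited above, which used sequential learning devices; it assumes, for illustration only, a language in which some proportion of sentences is SOV and a learner who slightly exaggerates whichever order is more frequent in the input. The function names, the power-law form of the bias and the parameter values are all hypothetical.

```python
def regularize(p_sov, bias):
    """Map the proportion of SOV sentences in the input to the proportion the
    learner produces. bias > 1 exaggerates the majority order; bias == 1
    reproduces the input proportions (hypothetical functional form)."""
    a = p_sov ** bias
    b = (1.0 - p_sov) ** bias
    return a / (a + b)


def transmit(p_sov, bias, generations):
    """Pass the language down a chain of learners, each trained on the output
    of the previous one; return the SOV proportion at every generation."""
    history = [p_sov]
    for _ in range(generations):
        p_sov = regularize(p_sov, bias)
        history.append(p_sov)
    return history


# A near-random starting language: 55% SOV, 45% other orders.
history = transmit(p_sov=0.55, bias=1.5, generations=20)
for generation, p in enumerate(history):
    print(generation, round(p, 4))
```

With bias set to 1 the learner reproduces its input and the mixture of orders does not change; with any value above 1, a small initial imbalance (standing in for a chance fluctuation in an otherwise random word order) grows until a single order dominates.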
Differences in learnability are not confined to newly emerged languages but can also be
observed in well-established languages. For example, Slobin and Bever (1982)
found that when children learning English, Italian, Turkish, or Serbo-Croatian
were asked to act out reversible transitive sentences, such as the horse kicked the cow, using familiar
toy animals, language-specific differences in performance emerged.
Turkish-speaking children performed very well already at 2 years of age, most
likely due to the regular case markings in this language, indicating who is
doing what to whom. Young English and Italian-speaking children initially
performed slightly worse than the Turkish children but quickly caught up around
3 years of age, relying on the relatively consistent word order information
available in these languages, with subjects preceding objects. The children
acquiring Serbo-Croatian, on the other hand, had problems determining the
meaning of the simple sentences, most likely because this language uses a
combination of case markings and word order to indicate agent and patient roles
in a sentence. Crucially, only masculine and feminine nouns take on accusative
or nominative markings and can occur in any order with respect to one another,
but sentences with one or more unmarked neuter nouns are typically ordered as
subject-verb-object. Of course, Serbo-Croatian children eventually catch up
with the Turkish, English, and Italian-speaking children, but these results do
show that some meanings are harder to learn and process in some languages
compared to others, indicating differential fitness across languages (see
Lupyan & Christiansen, 2002, for corroborating computational simulations).
Within specific languages, substantial differences also exist between individual
idiolects; e.g., as demonstrated by the considerable differences in language
comprehension abilities between cleaners, janitors, undergraduates, graduate
students, and lecturers from the same British university (Dabrowska, 1997).
Even within the reasonably homogeneous group of college students, individual
differences exist in sentence processing abilities due to underlying variations
in learning and processing mechanisms combined with variations in exposure to
language (for a review, see MacDonald & Christiansen, 2002). Additional
sources of variation are likely to come from the incorporation of linguistic
innovations into the language. In this context, it has been suggested that
innovations may primarily be due to adults (Bybee, in press), whereas
constraints on children’s acquisition of language may provide the strongest
pressure towards regularization (e.g., Hudson Kam & Newport, 2005). Thus,
once we abandon linguistic uniformitarianism, it becomes clear that there is
much variability for linguistic adaptation to work with.
In sum, we have argued that human language has been shaped by selectional pressure
from thousands of generations of language learners and users. Linguistic
variants that are easier to learn to understand and produce, and variants that are
more economical, expressive and generally effective in communication,
persuasion, and perhaps the signaling of status and social group, will be favored.
Just as with the multiple selectional pressures operative in biological
evolution, the matrix of factors at work in driving the evolution of language
is complex. Nonetheless, as we have seen, candidate pressures can be proposed
(e.g., the pressure for incrementality, minimizing memory load, regularity,
brevity, and so on), and regular patterns of language change that may be
responses to those pressures can be identified (e.g., the processes of
successive entrenchment, generalization and erosion of structure evident in
grammaticalization). Thus, the logical problem of language evolution that
appears to confront attempts to explain how a genetically specified linguistic
endowment could become encoded, does not arise; it is not the brain that has
somehow evolved to fit language, but the reverse.
8. Scope of the Argument
In this paper, we have presented a theory of language evolution as shaped by the
brain. From this perspective, the close fit between language learners and the
structure of natural language that motivates many theorists to posit a
language-specific biological endowment may instead arise from processes of
adaptation operating on language itself. Moreover, we have argued that there
are fundamental difficulties with postulating a language-specific biological
endowment. It is implausible that such an endowment could evolve through
adaptation (because the prior linguistic environments would be too diverse to
give rise to universal principles). It is also unlikely that a
language-specific endowment of any substantial complexity arose through
non-adaptational genetic mechanisms, because the probability of a functional language
system arising essentially by chance is vanishingly small. Instead, we have
suggested that some apparently arbitrary aspects of language structure may
arise from the interaction of a range of factors, from general constraints on
learning, to impacts of semantic and pragmatic factors, and concomitant
processes of grammaticalization and other aspects of language change. But,
intriguingly, it is also possible that many apparently arbitrary aspects of
language can be explained by relatively natural cognitive constraints—and hence
that language may be rather less arbitrary than at first supposed (e.g., Bates
& MacWhinney, 1979, 1987; Bybee, 2007; Elman, 1999; Kirby, 1999; Levinson,
2000; O’Grady, 2005; Tomasello, 2003).
8.1. The logical problem of language evolution meets the logical problem of language acquisition
The present viewpoint has interesting theoretical implications concerning language
acquisition. Children acquire the full complexity of natural language over a
relatively short amount of time, from exposure to noisy and partial samples of
language. The ability to develop complex linguistic abilities from what appears
to be such poor input has led many to speak of the “logical” problem of
language acquisition (e.g., Baker & McCarthy, 1981; Hornstein &
Lightfoot, 1981). One solution to the problem is to assume that learners have
some sort of biological “head-start” in language acquisition—that their
learning apparatus is precisely meshed with the structure of natural language.
This viewpoint is, of course, consistent with theories according to which there
is a genetically specified language organ, module or instinct (e.g., Chomsky,
1986, 1993; Crain, 1991; Piattelli-Palmarini, 1989, 1994; Pinker, 1994; Pinker
& Bloom, 1990). But it is also consistent with the present view that
languages have evolved to be learnable. According to this view, the mesh
between language learning and language structure has occurred not because
specialized biological machinery embodies the principles that govern natural
languages (UG), but rather because the structure of language has evolved to fit
with pre-linguistic learning and processing constraints.
If language has evolved to be learnable, then the problem of language acquisition may have been mis-analyzed. Language acquisition is frequently viewed as a standard problem of induction (e.g., Gold, 1967; Jain, Osherson, Royer & Sharma, 1999; Osherson, Stob & Weinstein, 1986; Pinker, 1984, 1989), where there is a vast space of possible grammars that are consistent with the linguistic data to which the child is exposed. Accordingly, it is often readily concluded that the child must have innate knowledge of language structure to constrain the space of possible grammars to a manageable size. But, if language is viewed as having been shaped by the brain, then language learning is by no means a standard problem of induction. To give an analogy, according to the standard view of induction, the problem of language acquisition is like being in an unreasonable quiz show, where you have inadequate information, but must somehow guess the “correct” answer. But according to the present view, by contrast, there is no externally given correct answer; instead, the task is simply to give the same answer as everybody else—because the structure of language will have adapted to conform to this most “popular” guess. This is a much easier problem—whatever learning biases people have, so long as these biases are shared across individuals, learning should proceed successfully. 
Moreover, the viewpoint that children learn language using general-purpose cognitive mechanisms, rather than language-specific mechanisms, has also been advocated independently from a variety of different perspectives, ranging from usage-based and functional accounts of language acquisition (e.g., Bates & MacWhinney, 1979, 1987; MacWhinney, 1999; Seidenberg, 1997; Seidenberg & MacDonald, 2001; Tomasello, 2000a, 2000b, 2000c, 2003) to cultural transmission views of language evolution (e.g., Davidson, 2003; Donald, 1998; Ragir, 2002; Schoenemann, 1999), to neurobiological approaches to language (e.g., Arbib, 2005; Deacon, 1997; Elman et al., 1996) and formal language theory (Chater & Vitányi, 2007).
From this perspective, the problem of language acquisition is very different from learning, say, some aspect of the physical world. In learning naïve physics, the constraints to be learned (e.g., how rigid bodies move, how fluids flow, and so on) are defined by processes outside the cognitive system. External processes define the “right” answers, to which learners must converge. But in language acquisition, the structure of the language to be learned is itself determined by the learning of generations of previous learners (see Zuidema, 2003). Because learners have similar learning biases, this means that the first wild guesses that the learner makes about how some linguistic structure works are likely to be the right guesses. More generally, in language acquisition, the learner’s biases, if shared by other learners, are likely to be helpful in acquiring the language—because the language has been shaped by processes of selection to conform with those biases. This also means that the problem of the poverty of the stimulus (e.g., Chomsky, 1980; Crain, 1991; Crain & Pietroski, 2001) is reduced, because language has been shaped to be learnable from the kind of noisy and partial input available to young children. Thus, language acquisition is constrained by substantial biological constraints—but these constraints emerge from cognitive machinery that is not language-specific.
8.2. Natural selection for functional aspects of language?
It is important to emphasize what our arguments are not intended to show. In particular, we are not suggesting that
biological adaptation is not relevant for language. Indeed, it seems likely
that a number of preadaptations for language might have occurred (see Hurford,
2003, for a review), such as the ability to represent discrete symbols (Deacon,
1997; Tomasello, 2003), to reason about other minds (Malle, 2002), to understand
and share intentions (Tomasello, 2003; Tomasello et al., 2005), and to perform
pragmatic reasoning (Levinson, 2000); there may also be a connection with the
emergence of an exceptionally prolonged childhood (Locke & Bogin, 2006).
Similarly, biological adaptations might have led to improvements to the
cognitive systems that support language, including increased working memory
capacity (Gruber, 2002), domain-general capacities for word learning (Bloom,
2001), and complex hierarchical sequential learning abilities (Calvin, 1994;
Conway & Christiansen, 2001; Greenfield, 1991; Hauser et al., 2002), though
these adaptations are likely to have been for improved cognitive skills rather
than for language.
Some language-specific adaptations may nonetheless have occurred as well, but given
our arguments above these would only be for functional features of language,
and not the arbitrary features of UG. For example, changes to the human vocal
tract may have resulted in more intelligible speech (Lieberman, 1984, 1991,
2003—though see also Hauser & Fitch, 2003); selectional pressure for this
functional adaptation might apply relatively independently of the particular
language. Similarly, it remains possible that the Baldwin effect may be invoked
to explain cognitive adaptations to language, provided that these adaptations
are to functional aspects of language, rather than putatively arbitrary
linguistic structures. For example, it has been suggested that there might be a
specialized perception apparatus for speech (e.g., Vouloumanos & Werker, 2007), or enhancement of the
motor control system for articulation (e.g., Studdert-Kennedy & Goldstein,
2003). But explaining innate adaptations even in these domains is likely to be
difficult—because, if adaptation to language occurs at all, it is likely to
occur not merely to functionally universal features (e.g., the fact that
languages segment into words), but to specific cues for those features (e.g.,
for segmenting those words in the current linguistic environment, which differ
dramatically across languages; Cutler, Mehler, Norris & Segui, 1986; Otake,
Hatano, Cutler & Mehler, 1993). Hence, adaptationist explanations, even for
functional aspects of language and language processing, should be treated with
considerable caution.
8.3. Implications for the co-evolution of genes and culture
Our argument may, though, have
applications beyond language. Many theorists have suggested that, just as there
are specific genetic adaptations to language, there may also be specific genetic
adaptations to other cultural domains. The arguments we have outlined against
biological adaptationism in language evolution appear to apply equally to rule
out putative co-evolution of the brain with any rapidly changing and highly
varied aspect of human culture—from marriage practices and food sharing
practices, to music and art, to folk theories of religion, science or
mathematics. We speculate that, in each case, the apparent fit between culture
and the brain arises primarily because culture has been shaped to fit with our
prior cognitive biases. Thus, by analogy with language, we suggest that
nativist arguments across these domains might usefully be re-evaluated, from
the perspective that culture may have adapted to cognition much more
substantially than cognition has adapted to culture.
In summary, we have argued that the notion of UG is subject to a logical problem
of language evolution, whether it is suggested to be the result of gradual
biological adaptation or of nonadaptationist factors. Instead, we have
proposed to explain the close fit between language and learners as arising from
the fact that language is shaped by the brain, rather than the reverse.
This research
was partially supported by the Human Frontier Science Program grant
RGP0177/2001-B. MHC was supported by a Charles A. Ryskamp Fellowship from the
American Council of Learned Societies and by the Santa Fe Institute; NC was
supported by a Major Research Fellowship from the Leverhulme Trust. The work
presented has benefited from discussions with Andy Clark, Jeff Elman, Robert
Foley, Anita Govindjee, Ray Jackendoff, Stephen Mithen, Jennifer Misyak, and
David Rand. We are also grateful for the comments on a previous version of this
paper from Paul Bloom, Michael Corballis, Adele Goldberg, six anonymous BBS
reviewers, as well as Christopher Conway, Rick Dale, Lauren Emberson, and
Thomas Farmer.
Abzhanov, A., Kuo, W.P., Hartmann, C.,
Grant, B.R., Grant, P.R. & Tabin, C.J. (2006). The calmodulin pathway and evolution
of elongated beak morphology in Darwin's finches. Nature, 442, 563-567.
Abzhanov, A., Protas, M., Grant, B.R.,
Grant, P.R. & Tabin, C.J. (2004). Bmp4
and morphological variation of beaks in Darwin's finches. Science, 305, 1462-1465.
Alter,
S. (1998). Darwinism and the linguistic
image: Language, race, and natural theology in the nineteenth century.
Baltimore, MD: Johns Hopkins University Press.
Andersen, H. (1973). Abductive and
deductive change. Language, 49,
765-793.
Arbib, M.A. (2005). From monkey-like
action recognition to human language: An evolutionary framework for
neurolinguistics. Behavioral & Brain
Sciences, 28, 105-124.
Baker, C. L. & McCarthy, J. J. (Eds.)
(1981). The logical problem of language
acquisition. Cambridge, MA: MIT Press.
Baker, M.C. (2001). The atoms of language: The mind’s hidden rules of grammar. New
York: Basic Books.
Baker, M.C. (2003). Language differences and language
design. Trends in Cognitive Sciences,
7, 349-353.
Baldwin, J.M. (1896). A new factor in evolution. American Naturalist, 30, 441-451.
Batali, J. (1998). Computational simulations of the emergence of
grammar. In J.R. Hurford, M. Studdert-Kennedy & C. Knight (Eds.), Approaches
to the evolution of language: Social and cognitive bases (pp. 405-426). Cambridge: Cambridge
University Press.
Bates, E.,
& MacWhinney, B. (1979). A functionalist approach to the acquisition of
grammar. In E. Ochs & B. Schieffelin (Eds.), Developmental pragmatics (pp. 167-209). New York: Academic Press.
Bates, E. & MacWhinney, B. (1987).
Competition, variation, and language learning. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp.
157-193). Hillsdale, NJ: Erlbaum.
Beer, G. (1996). Darwin and the
growth of language theory. In Open
fields: Science in cultural encounter
(pp. 95-114). Oxford: Oxford University Press.
Beja-Pereira, A., Luikart, G.,
England, P. R., Bradley, D. G., Jann, O. C., Bertorelle, G., Chamberlain, A.
T., Nunes, T. P., Metodiev, S., Ferrand, N., & Erhardt, G. (2003).
Gene-culture coevolution between cattle milk protein genes and human lactase
genes. Nature Genetics, 35, 311-313.
Berwick, R.C. & Weinberg, A.S.
(1984). The grammatical basis of
linguistic performance: language use and acquisition. Cambridge, MA: MIT
Press.
Bever, T.G. (1970). The cognitive basis
for linguistic structures. In R. Hayes (Ed.), Cognition and language development (pp. 277-360). New York: Wiley
& Sons.
Bever, T.G. & Langendoen, D.T.
(1971). A dynamic model of the evolution of language. Linguistic Inquiry, 2, 433-463.
Bickerton, D. (1984). The language
bio-program hypothesis. Behavioral and
Brain Sciences, 7, 173-212.
Bickerton, D. (1995). Language and human behavior. Seattle,
WA: University of Washington Press.
Bickerton, D. (2003). Symbol and structure: a comprehensive
framework for language evolution. In M.H. Christiansen and S. Kirby (Eds.), Language
evolution (pp. 77-93). Oxford: Oxford University Press.
Blackburn, S. (1984). Spreading the word. Oxford: Oxford University Press.
Blackmore, S.J. (1999). The meme
machine. Oxford: Oxford University Press.
Bloom,
P. (2001). Précis of How children learn the meanings of words. Behavioral and Brain Sciences, 24, 1095-1103.
Boeckx,
C. (2006). Linguistic minimalism:
Origins, concepts, methods, and aims. New York: Oxford University Press.
Borer,
H. (1984). Parametric syntax: Case
studies in Semitic and Romance languages. Dordrecht: Foris.
Boyd, R. & Richerson, P.J. (2005). The
origin and evolution of cultures. Oxford: Oxford University Press.
Bresnan,
J. (1982). The mental representation of
grammatical relations. Cambridge, MA: MIT Press.
Briscoe, E.J. (2003). Grammatical assimilation. In M.H. Christiansen and S.
Kirby (Eds.), Language evolution (pp.
295-316). Oxford: Oxford
University Press.
Brooks, P.J., Braine, M.D.S., Catalano, L., Brody, R.E., &
Sudhalter, V. (1993). Acquisition of gender-like noun subclasses in an
artificial language: The contribution of
phonological markers to learning. Journal
of Memory and Language, 32,
76-95.
Bybee, J.L.
(2002). Sequentiality as the basis of constituent structure. In T. Givón, &
B. Malle (Eds.), The evolution of language out of pre-language (pp. 107-132).
Philadelphia, PA: John Benjamins.
Bybee, J.L.
(2007). Frequency of use and the
organization of language. New York: Oxford University Press.
Bybee, J.L.
(in press). Language universals and usage-based theory. In M.H. Christiansen,
C. Collins & S. Edelman (Eds.), Language
universals. New York: Oxford University Press.
Bybee, J.L.,
Perkins, R.D. & Pagliuca, W. (1994). The
evolution of grammar: Tense, aspect and modality in the languages of the world.
Chicago: University of Chicago Press.
Calvin, W.H.
(1994). The emergence of intelligence. Scientific
American, 271, 100-107.
Campbell, D.T. (1965). Variation and
selective retention in socio-cultural evolution. In H. R. Barringer, G.I.
Blanksten and R.W. Mack (Eds.), Social change in developing areas: A
reinterpretation of evolutionary theory (pp. 19-49). Cambridge, MA: Schenkman.
Cannon, G. (1991). Jones's “Spring from some common source”:
1786–1986. In S. M. Lamb and E. D. Mitchell (Eds.), Sprung from some common source: Investigations into the pre-history of
languages. Stanford, CA: Stanford University Press.
Cavalli-Sforza, L.L. & Feldman, M.W.
(2003). The application of molecular genetic approaches to the study of human
evolution. Nature Genetics, 33,
266-275.
Chater,
N. (2005). Mendelian and Darwinian
views of memes and cultural change. In
S. Hurley, & N. Chater (Eds.) Perspectives on imitation: From
neuroscience to social science (Vol. 2) (pp. 355-362). Cambridge, MA: MIT
Press.
Chater, N. & Vitányi, P. (2007). ‘Ideal learning’ of natural language: Positive results about learning from positive evidence. Journal of Mathematical Psychology, 51, 135-163.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1972). Language and mind (extended edition). New York: Harcourt, Brace and World.
Chomsky, N. (1980). Rules and representations. New York: Columbia University Press.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris Publications.
Chomsky, N. (1986). Knowledge of language. New York: Praeger.
Chomsky, N. (1988). Language and the problems of knowledge. The Managua Lectures.
Cambridge, MA: MIT Press.
Chomsky, N. (1993). Language and thought. Wakefield, RI: Moyer Bell.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Chomsky, N. (2005). Three factors in
language design. Linguistic Inquiry, 36, 1-22.
Christiansen, M.H. (1994). Infinite languages, finite minds:
Connectionism, learning and linguistic structure. Unpublished doctoral
dissertation, Centre for Cognitive Science, University of Edinburgh, U.K.
Christiansen, M.H. (2000). Using
artificial language learning to study language evolution: Exploring the
emergence of word order universals. In J. L. Dessalles & L. Ghadakpour (Eds.), The Evolution of Language: 3rd International
Conference (pp. 45-48). Paris, France: Ecole Nationale Supérieure des
Télécommunications.
Christiansen,
M.H., Collins, C. & Edelman, S. (Eds.) (in press). Language universals. New York: Oxford University Press.
Christiansen, M.H., Conway, C.M. &
Onnis, L. (2007). Overlapping neural responses to structural incongruencies in
language and statistical learning point to similar underlying mechanisms. In Proceedings
of the 29th Annual Cognitive Science Society Conference (pp. 173-178). Mahwah, NJ: Lawrence Erlbaum.
Christiansen,
M.H. & Dale, R. (2004). The role of learning and development in the evolution
of language. A connectionist perspective. In D. Kimbrough Oller & U.
Griebel (Eds.), Evolution of
communication systems: A comparative approach. The Vienna Series in Theoretical
Biology (pp. 90-109). Cambridge, MA: MIT Press.
Christiansen, M.H. & Devlin, J.T.
(1997). Recursive inconsistencies are hard to learn: A connectionist
perspective on universal word order correlations. In Proceedings of the 19th Annual Cognitive Science Society Conference
(pp. 113-118). Mahwah, NJ: Lawrence Erlbaum.
Christiansen, M.H., Kelly, L., Shillcock,
R. & Greenfield, K. (2007). Impaired
artificial grammar learning in agrammatism. Submitted manuscript.
Christiansen, M.H., Reali, F. &
Chater, N. (2006). The Baldwin effect works for
functional, but not arbitrary, features of language. In
A. Cangelosi, A. Smith & K. Smith (Eds.), Proceedings of the Sixth International Conference on the Evolution of
Language (pp. 27-34).
London: World Scientific Publishing.
Christiansen, M.H. & Reeder, P.A.
(2006). Cognitive constraints on word
order universals: Evidence from connectionist modeling and artificial grammar
learning. Manuscript in preparation.
Clark,
H.H. (1975). Bridging. In R.C.
Schank & B.L. Nash-Webber (Eds.), Theoretical issues in natural language processing
(pp. 169-174). New York: Association
for Computing Machinery.
Conway, C.M., & Christiansen, M.H.
(2001). Sequential learning in non-human primates. Trends in Cognitive Sciences, 5, 539-546.
Conway, C.M., Karpicke, J. & Pisoni,
D.B. (2007). Contribution of implicit sequence learning to spoken language
processing: Some preliminary findings with hearing adults. Journal of Deaf Studies and Deaf Education, 12, 317-334.
Corballis, M.C. (1992). On the evolution
of language and generativity. Cognition,
44, 197-226.
Corballis, M.C. (2003). From hand to
mouth: The gestural origins of language. In
M.H.
Christiansen and S. Kirby (Eds.), Language evolution (pp. 201-218). Oxford: Oxford University Press.
Cornish, H. (2006). Iterated
learning with human subjects: An empirical framework for the emergence and
cultural transmission of language. Unpublished
Masters thesis, School of Philosophy, Psychology and Language Sciences,
University of Edinburgh, U.K.
Crain, S. (1991). Language acquisition in
the absence of experience. Behavioral and
Brain Sciences, 14, 597-650.
Crain, S., Goro, T. & Thornton,
R. (2006). Language acquisition is language change. Journal of Psycholinguistic Research, 35, 31-49.
Crain,
S., & Pietroski, P. (2001). Nature, nurture and universal grammar. Linguistics and Philosophy, 24, 139–186.
Crain,
S., & Pietroski, P. (2006). Is Generative Grammar deceptively simple or
simply deceptive? Lingua, 116, 64-68.
Croft,
W. (2000). Explaining language change: an
evolutionary approach. Harlow, Essex: Longman.
Croft,
W. (2001). Radical construction grammar:
Syntactic theory in typological perspective. New York: Oxford University
Press.
Croft,
W. & Cruse, D. A. (2004). Cognitive
linguistics. Cambridge, UK: Cambridge University Press.
Culicover,
P.W. (1999). Syntactic nuts. Oxford:
Oxford University Press.
Culicover,
P.W. & Jackendoff, R. (2005). Simpler
syntax. New York: Oxford University Press.
Curtin,
S., Mintz, T.H. & Christiansen, M.H. (2005). Stress changes the
representational landscape: Evidence from word segmentation. Cognition, 96, 233-262.
Cutler, A., Mehler, J., Norris, D., &
Segui, J. (1986). The syllable's differing role in the segmentation of French
and English. Journal of Memory and
Language, 25, 385-400.
Dabrowska, E. (1997). The LAD goes to
school: A cautionary tale for nativists. Linguistics,
35, 735-766.
Darwin, C. (1900). The descent of man, and selection in relation to sex (2nd Edition).
New York: P.F. Collier and Son.
Davidson, I. (2003). The
archaeological evidence of language origins: States of art. In M.H. Christiansen,
& S. Kirby (Eds.), Language evolution
(pp. 140-157). New York: Oxford University Press.
Davies, A. M. (1987). “Organic” and
“Organism” in Franz Bopp. In H. M. Hoenigswald and L. F. Wiener (Eds.), Biological metaphor and cladistic
classification (pp. 81–107). Philadelphia, PA: University of Pennsylvania
Press.
Dawkins,
R. (1976). The selfish gene. New
York: Oxford University Press.
Dawkins, R. (1986). The blind watchmaker: Why the evidence of evolution reveals a universe
without design. Harmondsworth, UK: Penguin.
Deacon, T.W. (1997). The symbolic species: The co-evolution of language and the
brain. New York: W.W. Norton.
Dediu, D. & Ladd, D.R. (2007).
Linguistic tone is related to the population frequency of the adaptive
haplogroups of two brain size genes, ASPM
and Microcephalin. Proceedings of the National Academy of
Sciences, 104, 10944-10949.
Dennett, D. C. (1995).
Darwin's dangerous idea: Evolution and the meanings of life. New York: Simon & Schuster.
de Vries, M., Monaghan, P., Knecht, S. & Zwitserlood, P.
(in press). Syntactic structure and
artificial grammar learning: The learnability of embedded hierarchical
structures. Cognition.
Diamond, J. (1992). The
third chimpanzee: The evolution and future of the human animal. New York:
Harper Collins.
Diamond, J. (1997). Guns,
germs, and steel: The fates of human societies. New York: Harper Collins.
Donald, M. (1998). Mimesis and the executive suite:
Missing links in language evolution. In J.R. Hurford, M. Studdert-Kennedy and
C. Knight (Eds.), Approaches to the
evolution of language (pp. 44-67). Cambridge, U.K.: Cambridge University
Press.
Dryer, M. S.
(1992). The Greenbergian word order correlations. Language, 68, 81–138.
Dunbar, R.I.M. (2003). The origin and subsequent evolution
of language. In M.H. Christiansen, & S. Kirby (Eds.), Language evolution (pp. 219-234). New York: Oxford University
Press.
Ellefson, M.R. & Christiansen, M.H.
(2000). Subjacency constraints without universal grammar: Evidence from
artificial language learning and connectionist modeling. In The Proceedings of the 22nd Annual
Conference of the Cognitive Science Society (pp. 645-650). Mahwah, NJ:
Lawrence Erlbaum.
Elman, J.L. (1999). Origins of language:
A conspiracy theory. In B. MacWhinney (Ed.), The emergence of language. Hillsdale, NJ: Lawrence Erlbaum.
Elman,
J.L., Bates, E.A., Johnson, M.H., Karmiloff-Smith, A., Parisi, D. &
Plunkett, K. (1996). Rethinking
innateness: A connectionist perspective on development.
Cambridge, MA: MIT Press.
Emlen, S.T. (1970). Celestial
rotation: Its importance in the development of migratory orientation. Science, 170, 1198-1201.
Enard, W., Przeworski, M., Fisher,
S. E., Lai, C. S. L., Wiebe, V., Kitano, T., et al. (2002). Molecular evolution
of FOXP2, a gene involved in speech and language. Nature, 418, 869-872.
Everett, D.L.
(2005). Cultural constraints on grammar and cognition in Pirahã. Current Anthropology, 46, 621-646.
Everett, D.L.
(2007). [On-line]. Cultural constraints on grammar in Pirahã: A Reply to Nevins,
Pesetsky, and Rodrigues (2007). Available:
http://ling.auf.net/lingBuzz/000427.
Farmer, T.A.,
Christiansen, M.H. & Monaghan, P. (2006). Phonological typicality
influences on-line sentence comprehension. Proceedings
of the National Academy of Sciences, 103, 12203-12208.
Fisher, S.E.
(2006). Tangled webs: Tracing the connections between genes and cognition. Cognition, 101, 270-297.
Fitneva, S.A.,
& Spivey, M.J. (2004). Context and language processing: The effect of
authorship. In J.C. Trueswell & M.K. Tanenhaus (Eds.), Approaches to studying world-situated language use: Bridging the
language-as-product and language-as-action traditions (pp. 317-328).
Cambridge, MA: MIT Press.
Fleischman, S. (1982). The future in thought and language:
Diachronic evidence from Romance. Cambridge, U.K.: Cambridge University
Press.
Fodor,
J. A. (1975). The language of thought.
Cambridge, MA: Harvard University Press.
Frean, M.R. & Abraham, E.R. (2004).
Adaptation and enslavement in endosymbiont-host associations. Physical Review E, 69, 051913.
Friederici, A. D., Bahlmann, J., Heim,
S., Schubotz, R.I. & Anwander, A. (2006). The brain differentiates human
and non-human grammars: Functional localization and structural connectivity. Proceedings of the National Academy of
Sciences, 103, 2458-2463.
Friederici, A. D., Steinhauer, K., &
Pfeifer, E. (2002). Brain signatures of artificial language processing:
Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences of the United States of
America, 99, 529-534.
Frigo, L. & McDonald, J.L. (1998).
Properties of phonological markers that affect the acquisition of gender-like
subclasses. Journal of Memory and
Language, 39, 218-245.
Gerhart, J. & Kirschner, M. (1997). Cells, embryos and evolution: Toward a
cellular and developmental understanding of phenotypic variation and
evolutionary adaptability. Cambridge, U.K.: Blackwell.
Givón, T. (1979). On understanding grammar. New York:
Academic Press.
Givón, T.
(1998). On the co-evolution of language, mind and brain. Evolution of Communication, 2,
45-116.
Givón, T. & Malle, B. F. (Eds.)
(2002). The evolution of language out of
pre-language. Amsterdam: Benjamins.
Gold, E. (1967). Language identification
in the limit. Information and Control,
10, 447-474.
Goldberg, A.E. (2006). Constructions at work: The nature of
generalization in language. New York: Oxford University Press.
Goldsby, R. A., Kindt, T. K., Osborne, B.
A., & Kuby, J. (2003). Immunology (5th Edition). New
York: W.H. Freeman and Company.
Golinkoff, R.M., Hirsh-Pasek, K., Bloom,
L., Smith, L., Woodward, A., Akhtar, N., Tomasello, M., & Hollich, G.
(Eds.) (2000). Becoming a word learner: A
debate on lexical acquisition. New York: Oxford University Press.
Gómez, R.L. (2002). Variability and
detection of invariant structure. Psychological
Science, 13, 431-436.
Gómez, R.L., & Gerken, L.A. (2000).
Infant artificial language learning and language acquisition. Trends in Cognitive Sciences, 4, 178-186.
Gould, S.J. (1993). Eight little piggies: Reflections in natural history. New York:
Norton.
Gould, S. J. (2002). The structure of evolutionary theory. Cambridge, MA: Harvard
University Press.
Gould, S. J. & Lewontin, R. C.
(1979). The spandrels of San Marco and the Panglossian paradigm: A critique of
the adaptationist programme. Proceedings
of the Royal Society of London (Series B), 205, 581-598.
Gould, S.J. & Vrba, E.S. (1982).
Exaptation - a missing term in the science of form. Paleobiology, 8, 4-15.
Gray, R. D. & Atkinson, Q. D. (2003).
Language-tree divergence times support the Anatolian theory of Indo-European
origin. Nature, 426, 435-439.
Green, T. R. G. (1979). Necessity of
syntax markers: 2 Experiments with artificial languages. Journal of Verbal Learning and Verbal Behavior, 18, 481-496.
Greenfield, P.M. (1991). Language, tools
and brain: The ontogeny and phylogeny of hierarchically organized sequential
behavior. Behavioral and Brain Sciences,
14, 531-595.
Grice, H. P. (1967). Logic and conversation. William James Lectures, Ms., Harvard
University.
Gruber, O. (2002). The co-evolution of language and working memory capacity in the
human brain. In M.I. Stamenov
& V. Gallese (Eds.), Mirror neurons
and the evolution of brain and language (pp. 77–86). Amsterdam: John
Benjamins.
Haber, R. N. (1983). The
impending demise of the icon: the role of iconic processes in information processing theories of perception (with
commentaries). Behavioral and Brain
Sciences, 6, 1-55.
Hamilton,
W. D. (1964). The genetical evolution of social behaviour. Journal of Theoretical Biology, 7,
1-52.
Hampe, B. (Ed.) (2006). From perception to meaning: Image schemas
in cognitive linguistics. Berlin: Mouton de Gruyter.
Hare, M. &
Elman, J.L. (1995). Learning
and morphological change. Cognition, 56,
61-98.
Hauser, M.D. (2001). Wild minds: What animals really think. New York: Owl Books.
Hauser, M.D., Chomsky, N. & Fitch,
W.T. (2002). The faculty of language: What is it, who has it, and how did
it evolve? Science, 298, 1569-1579.
Hauser, M.D. & Fitch, W.T. (2003).
What are the uniquely human components of the language faculty? In M.H.
Christiansen & S. Kirby (Eds.), Language
evolution (pp. 158-181). Oxford: Oxford University Press.
Hawkins, J.A. (1994). A performance theory of order and
constituency. Cambridge: Cambridge University Press.
Hawkins, J.A.
(2004). Efficiency and complexity in
grammars. Oxford: Oxford University Press.
Hawks, J.D., Hunley, K., Lee, S-H. & Wolpoff, M. (2000). Population
bottlenecks and Pleistocene human evolution. Molecular Biology and Evolution,
17, 2-22.
Hecht Orzack, S.
& Sober, E. (Eds.) (2001). Adaptationism
and optimality. Cambridge:
Cambridge University Press.
Heine, B. (1991). Grammaticalization. Chicago: University of Chicago Press.
Heine, B. & Kuteva, T. (2002). On the
evolution of grammatical forms. In A. Wray (Ed.), Transitions to language (pp. 376-397).
Oxford, U.K.: Oxford University Press.
Hinton, G.E. & Nowlan, S.J. (1987).
How learning can guide evolution. Complex
Systems, 1, 495-502.
Hoen, M., Golembiowski, M., Guyot, E.,
Deprez, V., Caplan, D. & Dominey, P.F. (2003). Training with cognitive
sequences improves syntactic comprehension in agrammatic aphasics. NeuroReport, 14, 495-499.
Hopper, P. & Traugott, E. (1993). Grammaticalization. Cambridge, UK:
Cambridge University Press.
Hornstein, N. (2001). Move! A minimalist approach to construal.
Oxford: Blackwell.
Hornstein, N. & Boeckx, C. (in
press). Universals in light of the varying aims of linguistic theory. In M.H.
Christiansen, C. Collins & S. Edelman (Eds.), Language universals. New York: Oxford University Press.
Hornstein, N. & Lightfoot, D. (Eds.)
(1981). Explanations in linguistics: The
logical problem of language acquisition. London: Longman.
Hsu, H.-J., Christiansen, M.H., Tomblin,
J.B., Zhang, X. & Gómez, R.L. (2006). Statistical
learning of nonadjacent dependencies in adolescents with and without language
impairment. Poster presented at the 2006 Symposium on Research in Child
Language Disorders, Madison, WI.
Huang, Y. (2000). Anaphora: A cross-linguistic study. Oxford: Oxford University
Press.
Hudson
Kam, C.L. & Newport, E.L. (2005). Regularizing unpredictable variation: The
roles of adult and child learners in language formation and change. Language Learning and Development, 1, 151-195.
Hurford, J. (1990). Nativist and
functional explanations in language acquisition. In I. M. Roca (Ed.), Logical issues in language acquisition
(pp. 85-136). Dordrecht: Foris.
Hurford,
J.R. (1991). The evolution of the critical period for language learning. Cognition, 40, 159-201.
Hurford, J.R. (2003). The language mosaic
and its evolution. In M.H. Christiansen & S. Kirby (Eds.), Language evolution (pp. 38-57). Oxford:
Oxford University Press.
Hurford,
J.R. & Kirby, S. (1999). Co-evolution of language size and the critical
period. In D. Birdsong (Ed.), Second language acquisition and the critical
period hypothesis (pp. 39-63). Mahwah, NJ: Erlbaum.
Jablonka, E. & Lamb, M.J. (1989). The
inheritance of acquired epigenetic variations. Journal of Theoretical Biology, 139, 69-83.
Jackendoff, R. (2002). Foundations of language: Brain, meaning,
grammar, evolution. New York: Oxford University Press.
Jain, S., Osherson, D., Royer, J., &
Sharma, A. (1999). Systems that learn
(2nd ed.). Cambridge, MA: MIT Press.
Jenkins, L. (2000). Biolinguistics: Exploring the biology of language. Cambridge:
Cambridge University Press.
Johansson, S. (2006). Working backwards
from modern language to proto-grammar. In A. Cangelosi, A.D.M. Smith, & K.
Smith (Eds.), The Evolution of Language
(pp. 160-167). Singapore: World Scientific.
Juliano, C. & Tanenhaus, M.K. (1994).
A constraint-based lexicalist account of the subject/object attachment
preference. Journal of Psycholinguistic
Research, 23, 459-471.
Kamide, Y., Altmann, G.T.M., &
Haywood, S. (2003). The time-course of prediction in incremental sentence
processing: Evidence from anticipatory eye-movements. Journal of Memory and Language, 49,
133-159.
Kaschak, M.P. & Glenberg, A.M.
(2004). This construction needs learned. Journal
of Experimental Psychology: General, 133, 450-467.
Kauffman, S. A. (1995). The origins of order: Self-organization and
selection in evolution. Oxford: Oxford University Press.
Keller,
R. (1994). On language change: The
invisible hand in language. London: Routledge.
Kirby, S. (1998). Fitness and the selective
adaptation of language. In J.R. Hurford, M. Studdert-Kennedy, & C. Knight
(Eds.), Approaches to the evolution of
language: Social and cognitive bases (pp. 359-383). New York: Cambridge
University Press.
Kirby, S. (1999). Function, selection and innateness: The emergence of language universals. Oxford: Oxford University
Press.
Kirby,
S., Dowman, M., & Griffiths, T. (2007). Innateness and culture in the
evolution of language. Proceedings of the National Academy of Sciences,
104, 5241-5245.
Kirby, S. & Hurford, J. (1997).
Learning, culture and evolution in the origin of linguistic constraints. In P.
Husbands and I. Harvey (Eds.), ECAL97
(pp. 493-502). Cambridge, MA: MIT Press.
Kirby, S. & Hurford, J.R. (2002). The
emergence of linguistic structure: An overview of the iterated learning model.
In A. Cangelosi & D. Parisi (Eds.), Simulating
the evolution of language (pp. 121-148). London: Springer Verlag.
Kuhl, P.K. (1987). The special mechanisms
debate in speech research: Categorization tests on animals and infants. In S.
Harnad (Ed.), Categorical perception: The groundwork of cognition (pp.
355-386). Cambridge: Cambridge University Press.
Kvasnicka, V., & Pospichal, J.
(1999). An emergence of coordinated communication in populations of agents. Artificial Life, 5, 318-342.
Lai, C. S.L., Fisher, S.E., Hurst, J. A.,
Vargha-Khadem, F., & Monaco, A.P. (2001). A forkhead-domain gene is mutated
in a severe speech and language disorder. Nature,
413, 519-523.
Lai, C. S. L., Gerrelli, D., Monaco, A.
P., Fisher, S. E., & Copp, A. J. (2003). FOXP2 expression during brain
development coincides with adult sites of pathology in a severe speech and
language disorder. Brain, 126,
2455–2462.
Lakoff,
G. & Johnson, M. (1980). Metaphors we live by. Chicago: University of
Chicago Press.
Lanyon, S.J. (2006). A saltationist
approach for the evolution of human cognition and language. In A. Cangelosi,
A.D.M. Smith, & K. Smith (Eds.), The
Evolution of Language (pp. 176-183). Singapore: World Scientific.
Lashley, K.S. (1951). The problem of
serial order in behavior. In L.A. Jeffress (Ed.), Cerebral mechanisms in behavior (pp. 112-146). New York: Wiley.
Laubichler, M.D. & Maienschein, J. (Eds.) (2007). From
embryology to evo-devo: A history of developmental evolution. Cambridge, MA: MIT Press.
Levinson, S.C. (1987a). Pragmatics
and the grammar of anaphora: A partial pragmatic reduction of binding and
control phenomena. Journal of Linguistics,
23, 379-434.
Levinson,
S.C. (1987b). Minimization and conversational inference. In M. Papi and J.
Verschueren (Eds.), The Pragmatic Perspective: Proceedings of the
International Conference on Pragmatics at Viareggio (pp. 61-129).
Amsterdam: J. Benjamins.
Levinson,
S.C. (2000). Presumptive meanings: The theory of
generalized conversational implicature. Cambridge, MA: MIT Press.
Lewontin,
R.C. (1998). The evolution of cognition: Questions we will never answer. In D.
Scarborough & S. Sternberg (Eds.), An
invitation to cognitive science, Volume 4: Methods, models, and conceptual
issues. Cambridge, MA: MIT Press.
Li,
M. & Vitányi, P. (1997). An
introduction to Kolmogorov complexity and its applications (2nd
ed). Berlin: Springer.
Lieberman, P. (1984). The biology and evolution of language.
Cambridge, MA: Harvard University Press.
Lieberman, P. (1991). Speech and brain
evolution. Behavioral and Brain Sciences,
14, 566-568.
Lieberman, P. (2003). Motor control,
speech, and the evolution of human language. In M.H. Christiansen & S.
Kirby (Eds.), Language evolution (pp.
255-271). New York: Oxford University Press.
Lightfoot, D. (2000). The spandrels of
the linguistic genotype. In C. Knight, M. Studdert-Kennedy & J.R. Hurford
(Eds.), The evolutionary emergence of
language: Social function and the origins of linguistic form (pp. 231-247).
Cambridge, U.K.: Cambridge University Press.
Lively, S.E., Pisoni, D.B. &
Goldinger, S.D. (1994). Spoken word recognition. In M.A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp.
265-318). San Diego, CA: Academic Press.
Livingstone, D., & Fyfe, C. (2000). Modelling
language-physiology coevolution. In C. Knight, M. Studdert-Kennedy and J. R.
Hurford (Eds.), The evolutionary emergence of
language: Social function and the origins of linguistic form (pp. 199-215).
Cambridge, U.K.: Cambridge University Press.
Locke, J.L. & Bogin, B. (2006). Language
and life history: A new perspective on the development and evolution of human
language. Behavioral & Brain
Sciences, 29, 259-280.
Lupyan, G. &
Christiansen, M.H. (2002). Case, word order, and language learnability:
Insights from connectionist modeling. In Proceedings
of the 24th Annual Conference of the Cognitive Science Society (pp.
596-601). Mahwah, NJ: Lawrence Erlbaum.
MacDermot, K. D., Bonora, E., Sykes, N., Coupe, A.
M., Lai, C. S. L., Vernes, S. C., et al. (2005). Identification of FOXP2
truncation as a novel cause of developmental speech and language deficits. American Journal of Human Genetics, 76,
1074–1080.
MacDonald, M.C. & Christiansen, M.H.
(2002). Reassessing working memory: A comment on Just & Carpenter (1992) and Waters & Caplan (1996). Psychological Review, 109, 35-54.
MacDonald, M. C., Pearlmutter, N. J.,
& Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity
resolution. Psychological Review, 101,
676–703.
Mackay, D. J. C. (2003). Information theory,
inference, and learning algorithms. Cambridge: Cambridge University
Press.
MacNeilage, P.F. (1998). The frame/content
theory of evolution of speech production. Behavioral and Brain Sciences,
21, 499-511.
MacWhinney,
B. (Ed.) (1999). The emergence of
language. Mahwah, NJ: Erlbaum.
Maess, B.,
Koelsch, S., Gunter, T.C. & Friederici, A.D. (2001). Musical syntax is
processed in Broca’s area: an MEG study. Nature
Neuroscience, 4, 540–545.
Malle, B.F. (2002). The relation between language and theory of mind in development
and evolution. In T.
Givón, & B. Malle (Eds.), The
evolution of language out of pre-language (pp. 265-284). Philadelphia, PA:
John Benjamins.
Marcus, G.F. (2004). The birth of the mind: How a tiny number of genes creates the
complexities of human thought. New York: Basic Books.
Maynard-Smith, J. (1978). Optimization
theory in evolution. Annual Review of
Ecology and Systematics, 9,
31-56.
McClintock, B. (1950). The origin and
behavior of mutable loci in maize. Proceedings
of the National Academy of Sciences, 36,
344–355.
McMahon, A.M.S. (1994). Understanding language change.
Cambridge: Cambridge University Press.
Monaghan,
P., Chater, N. & Christiansen, M.H. (2005). The differential role of
phonological and distributional cues in grammatical categorisation. Cognition, 96, 143-182.
Monaghan,
P. & Christiansen, M.H. (in press). Integration of multiple probabilistic
cues in syntax acquisition. In H. Behrens (Ed.), Trends in corpus research: Finding structure in data (TILAR
Series). Amsterdam: John Benjamins.
Morgan,
J.L. & Demuth, K. (1996). Signal to
syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah,
NJ: Lawrence Erlbaum Associates.
Morgan, J.L.,
Meier, R.P., & Newport, E.L. (1987). Structural packaging in the input to
language learning: Contributions of prosodic and morphological marking of
phrases to the acquisition of language. Cognitive
Psychology, 19, 498-550.
Munroe, S.,
& Cangelosi, A. (2002). Learning and the evolution of language: the role of
cultural variation and learning cost in the Baldwin Effect. Artificial Life, 8, 311-339.
Murphy, G. L. (2002). The big book of
concepts. Cambridge, MA: MIT Press.
Nerlich, B.
(1989). The evolution of the concept of ‘linguistic evolution’ in the 19th and
20th century. Lingua, 77, 101–112.
Nettle, D. & Dunbar,
R.I.M. (1997). Social markers and the evolution of reciprocal exchange. Current Anthropology, 38, 93-99.
Nevins, A., Pesetsky, D. & Rodrigues, C. (2007). Pirahã exceptionality: A reassessment
[On-line]. Available: http://ling.auf.net/lingBuzz/000411.
Newmeyer, F.J. (1991). Functional explanation in linguistics and
the origins of language. Language and
Communication, 11, 3-28.
Newmeyer, F. (2003). What can the field
of linguistics tell us about the origin of language? In M.H. Christiansen &
S. Kirby (Eds.), Language evolution
(pp. 58-76). New York: Oxford University Press.
Newport, E.L. & Aslin, R.N. (2004).
Learning at a distance: I. Statistical learning of non-adjacent dependencies. Cognitive Psychology, 48, 127-162.
Nowak, M.A., Komarova, N.L. & Niyogi,
P. (2001). Evolution of universal grammar. Science,
291, 114-118.
Odling-Smee,
F.J., Laland, K.N. & Feldman, M.W. (2003). Niche construction: The neglected process in evolution. Princeton,
NJ: Princeton University Press.
O’Grady, W. (2005). Syntactic carpentry: An emergentist approach to syntax. Mahwah, NJ:
Erlbaum.
Onnis, L., Christiansen, M.H., Chater, N.
& Gómez, R. (2003). Reduction of uncertainty in human sequential learning:
Evidence from artificial grammar learning. In Proceedings of the 25th Annual Conference of the Cognitive Science
Society (pp. 886-891). Mahwah, NJ: Lawrence Erlbaum.
Onnis, L., Monaghan, P., Chater, N. &
Richmond, K. (2005). Phonology impacts segmentation in speech processing. Journal of Memory and Language, 53, 225-237.
Osherson, D., Stob, M. and Weinstein, S.
(1986). Systems that learn.
Cambridge, MA: MIT Press.
Otake, T., Hatano, G., Cutler, A., &
Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32,
258-278.
Packard, M. & Knowlton, B. (2002).
Learning and memory functions of the basal ganglia. Annual Review of Neuroscience, 25, 563-593.
Patel, A. D., Gibson, E., Ratner, J.,
Besson, M., & Holcomb, P. J. (1998). Processing syntactic relations in
language and music: An event-related potential study. Journal of Cognitive Neuroscience, 10, 717-733.
Pearlmutter, N.J. & MacDonald, M.C.
(1995). Individual differences and probabilistic constraints in syntactic
ambiguity resolution. Journal of Memory
and Language, 34, 521-542.
Peña, M., Bonatti, L., Nespor, M., &
Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298, 604-607.
Pennisi, E. (2004). The first language? Science, 303, 1319-1320.
Percival, W.K. (1987). Biological analogy in
the study of languages before the advent of comparative grammar. In H.M.
Hoenigswald & L.F. Wiener (Eds.), Biological
metaphor and cladistic classification (pp. 3–38). Philadelphia, PA:
University of Pennsylvania Press.
Petersson, K. M., Forkstam, C., & Ingvar,
M. (2004). Artificial syntactic violations activate Broca's region. Cognitive Science, 28, 383-407.
Piattelli-Palmarini,
M. (1989). Evolution, selection and cognition: From “learning” to parameter
setting in biology and in the study of language. Cognition, 31, 1-44.
Piattelli-Palmarini, M. (1994). Ever
since language and learning: Afterthoughts on the Piaget-Chomsky debate. Cognition, 50, 315-346.
Pinker, S. (1984). Language learnability and language development. Cambridge, MA:
Harvard University Press.
Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure.
Cambridge, MA: MIT Press.
Pinker, S.
(1994). The language instinct: How the
mind creates language. New York, NY: William Morrow and Company.
Pinker, S.
(2003). Language as an adaptation to the cognitive niche. In M. H. Christiansen
and S. Kirby (Eds.), Language evolution (pp. 16-37). Oxford: Oxford
University Press.
Pinker, S.
& Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13,
707-727.
Pinker, S.
& Jackendoff, R. (2005). The faculty of language: What’s special about it? Cognition, 95, 201-236.
Pinker, S.
& Jackendoff, R. (in press). The components of language: What’s specific to
language, and what’s specific to humans? In M.H. Christiansen, C. Collins &
S. Edelman (Eds.), Language universals.
New York: Oxford University Press.
Plante, E., Gómez, R.L., & Gerken, L.A.
(2002). Sensitivity to word order cues by normal and language/learning disabled
adults. Journal of Communication
Disorders, 35, 453-462.
Pomerantz,
J. R. & Kubovy, M. (1986). Theoretical approaches to perceptual
organization: Simplicity and likelihood principles. In K. R. Boff, L. Kaufman
& J. P. Thomas (Eds.) Handbook of
Perception and Human Performance. Volume 2: Cognitive Processes and Performance.
(pp. 36-1-36-46) New York: Wiley.
Quine, W. V. O.
(1960). Word and object. Cambridge,
MA: MIT Press.
Raddick, G. (2000). Review of S.
Alter's Darwinism and the Linguistic
Image. British Journal for the History of Science, 33, 122–124.
Raddick,
G. (2002). Darwin on language and selection. Selection, 3, 7–16.
Ragir, S. (2002). Constraints on
communities with indigenous sign languages: Clues to the dynamics of language
origins. In A. Wray (Ed.), Transitions to
language (pp. 272-294). Oxford: Oxford University Press.
Reali, F. & Christiansen, M.H.
(2007). Processing of relative clauses is made easier by frequency of
occurrence. Journal of Memory and
Language, 57, 1-23.
Reali, F. & Christiansen, M.H. (in
press). Sequential learning and the interaction between biological and
linguistic adaptation in language evolution.
Interaction Studies.
Richerson,
P.J. & Boyd, R. (2005). Not by genes
alone: How culture transformed human evolution. Chicago: Chicago University
Press.
Reinhart, T. (1983). Anaphora and semantic interpretation. Chicago: Chicago University
Press.
Ritt,
N. (2004). Selfish sounds and linguistic
evolution: A Darwinian approach to language change. Cambridge: Cambridge
University Press.
Rossel, S., Corlija, J., & Schuster,
S. (2002). Predicting three-dimensional target motion: How archer fish
determine where to catch their dislodged prey. Journal of Experimental Biology, 205, 3321-3326.
Sag, I.A. & Pollard, C.J. (1987). Head-driven phrase structure grammar: An
informal synopsis. CSLI Report 87-79. Stanford, CA: Stanford University.
Saffran, J.R. (2001). The use of
predictive dependencies in language learning. Journal of Memory and Language, 44,
493-515.
Saffran, J.R. (2002). Constraints on
statistical language learning. Journal of
Memory and Language, 47, 172-196.
Saffran, J.R. (2003). Statistical
language learning: Mechanisms and constraints. Current Directions in Psychological Science, 12, 110-114.
Saffran, J.R., Aslin, R.N., &
Newport, E.L. (1996a). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
Saffran, J. R., Newport, E. L., &
Aslin, R. N. (1996b). Word segmentation: The role of distributional cues. Journal of Memory and Language, 35,
606-621.
Sandler, W., Meir, I., Padden, C. &
Aronoff, M. (2005). The emergence of grammar: Systematic structure in a new
language. Proceedings of the National
Academy of Sciences, 102, 2661-2665.
Schlosser, G., & Wagner, G. P. (Eds.)
(2004). Modularity in development and
evolution. Chicago, IL: University of Chicago Press.
Schoenemann, P.T. (1999). Syntax as an
emergent characteristic of the evolution of semantic complexity. Minds and Machines, 9, 309-346.
Seidenberg, M.S. (1985). The time course
of phonological code activation in two writing systems. Cognition, 19, 1-30.
Seidenberg, M.S. (1997). Language
acquisition and use: Learning and applying probabilistic constraints. Science, 275, 1599-1604.
Seidenberg, M.S. & MacDonald, M.
(2001). Constraint-satisfaction in language acquisition. In M.H. Christiansen
& N. Chater (Eds.), Connectionist
psycholinguistics (pp. 281-318). Westport, CT: Ablex.
Senghas, A., Kita, S. & Özyürek, A.
(2004). Children creating core properties of language: Evidence from an
emerging sign language in Nicaragua. Science,
305, 1779-1782.
Sereno, M.I. (1991). Four analogies
between biological and cultural/linguistic evolution. Journal of Theoretical Biology, 151, 467-507.
Simoncelli, E. P. & Olshausen, B. A.
(2001). Natural image statistics as neural representation. Annual Review of Neuroscience, 24,
1193-1215.
Slobin, D.I. (1973). Cognitive
prerequisites for the development of grammar. In C.A. Ferguson and D.I. Slobin
(Eds.), Studies of child language
development (pp. 175-208). New York: Holt, Rinehart & Winston.
Slobin, D.I., & Bever, T.G.
(1982). Children use canonical sentence schemas: A crosslinguistic study of
word order and inflections. Cognition,
12, 229-265.
Smith, K. (2002). Natural selection and cultural selection in the
evolution of communication. Adaptive Behavior, 10, 25-44.
Smith, K. (2004). The evolution of vocabulary. Journal of Theoretical Biology, 228, 127-142.
Smith, K., Brighton, H. & Kirby, S. (2003). Complex systems in
language evolution: the cultural emergence of compositional structure. Advances
in Complex Systems, 6, 537-558.
Sperber, D. & Wilson, D. (1986).
Relevance. Oxford: Blackwell.
Stallings, L., MacDonald, M. & O’Seaghdha, P.
(1998). Phrasal ordering constraints in sentence production: phrase length and
verb disposition in heavy-NP shift. Journal
of Memory and Language, 39, 392-417.
Steedman, M. (2000). The syntactic process. Cambridge, MA: MIT Press.
Stevick,
R.D. (1963). The biological model and historical linguistics. Language, 39, 159–169.
Studdert-Kennedy,
M. & Goldstein, L. (2003). Launching language: The gestural origin of
discrete infinity. In M.H. Christiansen and S. Kirby (Eds.), Language evolution (pp. 235-254). New
York: Oxford University Press.
Suzuki, D.T., Griffiths, A.J.F., Miller,
J.H. & Lewontin, R.C. (1989). An introduction
to genetic analysis (4th edition). New York, NY: W. H. Freeman.
Syvanen, M. (1985). Cross-species gene transfer: Implications for a new
theory of evolution. Journal of
Theoretical Biology, 112, 333-343.
Tanenhaus, M.K., Spivey-Knowlton, M.J., Eberhard,
K.M. & Sedivy, J.E. (1995). Integration of visual and linguistic
information in spoken language comprehension.
Science, 268, 1632-1634.
Tanenhaus, M.K. & Trueswell, J.C. (1995).
Sentence comprehension. In J. Miller & P. Eimas (Eds.), Handbook of cognition and perception
(pp. 217-262). San Diego, CA: Academic Press.
Tomasello, M. (2000a). Do young children
have adult syntactic competence? Cognition,
74, 209-253.
Tomasello, M. (2000b). The item-based
nature of children’s early syntactic development. Trends in Cognitive Sciences, 4,
156-163.
Tomasello, M. (Ed.) (2000c). The new
psychology of language: Cognitive and functional approaches. Hillsdale, NJ:
Erlbaum.
Tomasello,
M. (2003). Constructing a language: A
usage-based theory of language acquisition. Cambridge, MA: Harvard
University Press.
Tomasello,
M. (2004). What kind of evidence could refute the UG hypothesis? Studies in Language, 28, 642-644.
Tomasello, M. (2006). Origins of human communication. The Jean
Nicod Lectures, May 2006, Paris.
Tomasello,
M., Carpenter, M., Call, J., Behne, T. & Moll, H. (2005). Understanding and
sharing intentions: The origins of cultural cognition. Behavioral & Brain Sciences, 28, 675-691.
Tomblin, J.B., Mainela-Arnold, M.E. &
Zhang, X. (2007). Procedural learning in adolescents with and without specific
language impairment. Language Learning
and Development, 3, 269-293.
Tomblin, J.B., Shriberg, L., Murray, J., Patil, S.
& Williams, C. (2004). Speech and language characteristics associated with
a 7/13 translocation involving FOXP2.
American Journal of Medical Genetics,
130B, 97.
Ullman,
M.T. (2004). Contributions of memory circuits to language: The
declarative/procedural model. Cognition,
92, 231-270.
van
Everbroeck, E. (1999). Language type frequency and learnability: A
connectionist appraisal. In Proceedings
of the 21st Annual Cognitive Science Society Conference (pp. 755–760).
Mahwah, NJ: Erlbaum.
Voight, B.F., Kudaravalli, S., Wen, X.
& Pritchard, J.K. (2006). A map of recent positive selection in the human
genome. PLoS Biology, 4, e72.
von Humboldt, W. (1999). On language: On the diversity of human
language construction and its influence on the mental development of the human
species. Cambridge, U.K.: Cambridge University Press.
Vouloumanos, A. & Werker, J.F. (2007).
Listening to language at birth: Evidence for a bias for speech in neonates. Developmental Science, 10, 159-164.
Waddington, C.H. (1942). Canalization of development and the
inheritance of acquired characters. Nature,
150, 563-565.
Weber, B.H., & Depew, D.J. (Eds.)
(2003). Evolution and learning: The
Baldwin effect reconsidered. Cambridge, MA: MIT Press.
Weissenborn, J. & Höhle, B. (Eds.)
(2001). Approaches to bootstrapping:
Phonological, lexical, syntactic and neurophysiological aspects of early
language acquisition. Philadelphia, PA: John Benjamins.
Wilkins, W.K. & Wakefield, J. (1995).
Brain evolution and neurolinguistic preconditions. Behavioral & Brain Sciences, 18, 161-182.
Yamashita, H. & Chang, F. (2001).
“Long before short” preference in the production of a head-final language. Cognition, 81, B45-B55.
Yamauchi, H. (2001). The difficulty of
the Baldwinian account of linguistic innateness. In J. Kelemen and P. Sosík
(Eds.), ECAL01 (pp. 391-400). Prague:
Springer.
Yang, C.D. (2002). Knowledge and learning in natural language. New York: Oxford
University Press.
Zeevat, H. (2006). Grammaticalisation and evolution. In A. Cangelosi, A. D. M. Smith, & K. Smith (Eds.) The Evolution of Language (pp. 372-378). Singapore: World Scientific.
Zuidema, W. (2003). How the poverty of the stimulus solves the poverty of the stimulus. In S. Becker, S. Thrun, & K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 (pp. 51-58). Cambridge, MA: MIT Press.
Endnotes
[i] For the purposes of exposition, we use
the term “language genes” as short-hand for genes that may be involved in
encoding a potential UG. By using this term we do not mean to suggest that this
relationship necessarily involves a one-to-one correspondence between individual
genes and a specific aspect of language (or cognition).
[ii] Intermediate positions, which accord
some role to both non-adaptationist and adaptationist mechanisms, are, of
course, possible. Such intermediate viewpoints inherit the logical problems
that we discuss below for both types of approach, in proportion to the relative
contribution presumed to be associated with each. Moreover, we note that our
arguments have equal force independent of whether one assumes that language has
a vocal (e.g., Dunbar, 2003) or manual-gesture (e.g., Corballis, 2003) based
origin.
[iii] Strictly, the appropriate measure is the
more subtle inclusive fitness, which
takes into account the reproductive potential not just of an organism, but also
a weighted sum of the reproductive potentials of its kin, where the weighting
is determined by the closeness of kinship (Hamilton, 1964). Moreover, mere
reproduction is only of value to the degree that one's offspring have a
propensity to reproduce, and so down the generations.
[iv] In addition, Pinker and Bloom (1990)
point out that it is often the case that natural selection has several (equally
adaptive) alternatives to choose from to carry out a given function (e.g., both
the invertebrate and the vertebrate eye support vision despite having
significant architectural differences).
[v] One prominent view is that language
emerged within the last 100,000 to 200,000 years (e.g., Bickerton, 2003).
Hominid populations over this period, and before, appear to have undergone
waves of spread; “… modern languages derive mostly or completely from a single
language spoken in East Africa around 100 kya … it was the only language then
existing that survived and evolved with rapid differentiation and
transformation.” (Cavalli-Sforza & Feldman, 2003: p. 273)
[vi] Human genome-wide scans have revealed
evidence of recent positive selection for more than 250 genes (Voight,
Kudaravalli, Wen & Pritchard, 2006), making it very likely that genetic
adaptations for language would have continued in this scenario.
[vii] This set-up closely resembles the one used by Hinton and Nowlan (1987)
in their simulations of the Baldwin effect, and to which Pinker and Bloom
(1990) refer in support of their adaptationist account of language evolution.
The simulations are also similar in format to other models of language
evolution (e.g., Briscoe, 2003; Kirby & Hurford, 1997; Nowak, Komarova
& Niyogi, 2001). Note, however, the reported simulations have a very
different purpose from work on understanding historical language change from a
UG perspective, for example, as involving successive changes in linguistic
parameters (e.g., Baker, 2001; Lightfoot, 2000; Yang, 2002).
[viii] Some recent theorists have proposed that
a further pressure for language divergence between groups is the sociolinguistic
tendency for groups to “badge” their in-group by difficult-to-fake linguistic
idiosyncrasies (Baker, 2003; Nettle & Dunbar, 1997). Such pressures would
increase the pace of language divergence, and thus exacerbate the problem of
divergence for adaptationist theories of language evolution.
[ix] This type of phenomenon, where the
genetically-influenced behavior of an organism affects the environment to which
those genes are adapting, is known as Baldwinian niche construction
(Odling-Smee, Laland & Feldman, 2003; Weber & Depew, 2003).
[x] Indeed, a population genetic study by
Dediu and Ladd (2007) could, on the one hand, be taken as pointing to
biological adaptations for a surface feature of phonology: the adoption of a
single-tier phonological system relying only on phoneme-sequence information to
differentiate between words instead of a two-tier system incorporating both
phonemes and tones (i.e., pitch contours). Specifically, two particular alleles
of ASPM and Microcephalin, both related to brain development, were strongly
associated with languages that incorporate a single-tier phonological system,
even when controlling for geographical factors and common linguistic history.
On the other hand, given that the relevant mutations would have had to occur independently
several times, the causal explanation plausibly goes in the opposite direction,
from genes to language. The two alleles may have been selected for other
reasons relating to brain development but once in place they made it harder to
acquire phonological systems involving tonal contrasts, which, in turn, allowed
languages without tonal contrasts to evolve more readily. This perspective
(also advocated by Dediu & Ladd) dovetails with our suggestion that
language is shaped by the brain, as discussed below. However, either of these
interpretations would argue against an adaptationist account of UG.
[xi] We have presented the argument in
informal terms. A more rigorous argument is as follows. We can measure the
amount of information embodied in universal grammar, U, over and above the information in pre-existing cognitive
processes, C, by the length of the
shortest code that will generate U
from C. This is the conditional
Kolmogorov complexity K(U|C) (Li
& Vitányi, 1997). By the coding theorem of Kolmogorov complexity theory (Li
& Vitányi, 1997), the probability of randomly generating U from C is approximately $2^{-K(U|C)}$.
Thus, if universal grammar has any substantial complexity, then it has a
vanishingly small probability of being encountered by a random process, such as
a non-adaptational mechanism.
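To give a feel for the scale involved, the following minimal sketch evaluates the approximate chance probability 2^(-K(U|C)) for a few values of K(U|C). The complexity values (10, 100, and 1000 bits) and the function name chance_probability are purely illustrative assumptions, not estimates of the actual complexity of any proposed universal grammar.

```python
from fractions import Fraction


def chance_probability(k_bits: int) -> Fraction:
    """Return 2^(-k_bits), the approximate probability of generating U
    from C by a random process when K(U|C) equals k_bits.

    k_bits is a hypothetical stand-in for the conditional Kolmogorov
    complexity, which is not computable in general.
    """
    if k_bits < 0:
        raise ValueError("complexity in bits must be non-negative")
    return Fraction(1, 2 ** k_bits)


if __name__ == "__main__":
    # Illustrative complexities only; none is taken from the literature.
    for k in (10, 100, 1000):
        p = chance_probability(k)
        print(f"K(U|C) = {k:>4} bits: probability about {float(p):.3e}")
```

Because the probability halves with every additional bit, even a modest amount of information specific to universal grammar makes a chance origin extremely unlikely under this approximation.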
[xii] Darwin may have had several reasons for
pointing to these similarities. Given that comparative linguistics at the time
was considered to be a model science on a par with geology and comparative
anatomy, he may have used comparisons between linguistic change—which was
thought to be well understood at that time—and species change to corroborate
his theory of evolution (Alter, 1998; Beer, 1996). Darwin may also have used
these language-species comparisons to support the notion that less “civilized”
human societies spoke less civilized languages, because he believed that this
was predicted by his theory of human evolution (Raddick, 2000, 2002).
[xiii] Chomsky has sometimes speculated that
the primary role of language may be as a vehicle for thought, rather than
communication (e.g., Chomsky, 1980). This viewpoint has its puzzles—for
example, the existence of anything other than semantic representations is
difficult to understand, as it is these over which thought is defined; and the
semantic representations in Chomsky’s recent theorizing are, indeed, too
underspecified to support inference, throwing the utility of even these
representations into doubt.
[xiv] Some studies purportedly indicate that
the mechanisms involved in syntactic language are not the same as those
involved in most sequential learning tasks (e.g., Friederici, Bahlmann, Heim,
Schubotz & Anwander, 2006; Peña et al., 2002). However, the methods used in
these studies have subsequently been shown to be fundamentally flawed (de
Vries, Monaghan, Knecht & Zwitserlood, in press, and Onnis et al., 2005,
respectively), thereby undermining their negative conclusions. Thus, the
preponderance of the evidence suggests that sequential learning tasks tap into
the mechanisms involved in language acquisition and processing.
[xv] The current knowledge regarding the FOXP2 gene is consistent with the
suggestion of a human pre-adaptation for sequential learning (Fisher, 2006). FOXP2 is highly conserved across species
but two amino acid changes have occurred after the split between humans and
chimps, and these became fixed in the human population about 200,000 years ago
(Enard et al., 2002). In humans, mutations to FOXP2 result in severe speech and orofacial motor impairments (Lai,
Fisher, Hurst, Vargha-Khadem & Monaco, 2001; MacDermot et al., 2005).
Studies of FOXP2 expression in mice
and imaging studies of an extended family pedigree with FOXP2 mutations have provided evidence that this gene is important
to neural development and function, including that of the corticostriatal system
(Lai, Gerrelli, Monaco, Fisher & Copp, 2003). This system has been shown to
be important for sequential (and other types of procedural) learning (Packard
& Knowlton, 2002). Crucially, preliminary findings from a mother and
daughter with a translocation involving FOXP2
indicate that they have problems with both language and sequential learning
(Tomblin, Shriberg, Murray, Patil & Williams, 2004).