Ray Jackendoff
Program in Linguistics, MS 013
Brandeis University
Waltham, MA 02454 USA
tel: 617-484-5394
fax: 617-484-0164
Short abstract:
The goal of this study is to reintegrate the theory of generative grammar into the
cognitive sciences. Generative grammar was correct to focus on the child's
acquisition of language as its central problem, leading to the hypothesis of an
innate Universal Grammar. However,
generative grammar was mistaken to assume that the syntactic component is the
sole source of combinatoriality, and that everything else is
"interpretive." The proper approach is a parallel architecture, in
which phonology, syntax, and semantics are autonomous generative systems,
linked by interface components. The
parallel architecture leads to an integration within linguistics, and to a far
better integration with the rest of cognitive neuroscience. It fits naturally into the larger
architecture of the mind/brain and permits a properly mentalistic theory of
semantics. It leads to a view of
linguistic performance in which the rules of grammar are directly involved in
processing. Finally, it leads to a
natural account of the incremental evolution of the language capacity.
1. Introduction
In the 1960s, when I became a graduate student in linguistics,
generative grammar was the hot new topic. Everyone from philosophers to psychologists to anthropologists to
educators to literary theorists was reading about transformational
grammar. But by the late 1970s, the
bloom was off the rose, although most linguists didn’t realize it; and by the
1990s, linguistics was arguably far on the periphery of the action in cognitive
science. To some extent, of course,
such a decline in fortune was simply a matter of fashion and the arrival of new
methodologies such as connectionism and brain imaging. However, there are deeper reasons for
linguistics’ loss of prestige, some historical and some scientific.
The basic questions I want to take up here, then, are:
What was right about generative grammar in the
1960s, such that it held out such promise?
What was wrong about it, such that it didn’t
fulfill its promise?
How can we fix it, so as to restore its value to the
other cognitive sciences?
The goal is to integrate linguistics with the other cognitive sciences,
not to eliminate the insights achieved by any of them. To understand language and the brain, we
need all the tools we can get. But
everyone will have to give a little in order for the pieces to fit together
properly.
The position developed in Foundations of Language is that the
overall program of generative grammar was correct, as was the way this program
was intended to fit in with psychology and biology. However, a basic technical mistake at the heart of the formal
implementation, concerning the overall role of syntax in the grammar, led to
the theory being unable to make the proper connections both within linguistic
theory and with neighboring fields. Foundations
of Language develops an alternative, the parallel architecture,
which offers far richer opportunities for integration of the field. In order to understand the motivation for
the parallel architecture, it is necessary to go through some history.
2. Three founding themes of generative grammar
The remarkable first chapter of Noam Chomsky’s Aspects of the Theory
of Syntax (1965) set the agenda for everything that has happened in
generative linguistics since. Three
theoretical pillars support the enterprise:
mentalism, combinatoriality, and acquisition.
Mentalism. Before Aspects, the predominant view
among linguists – if it was even discussed – was that language is something
that exists either as an abstraction, or in texts, or in some sense “in the
community” (the latter being the influential view of Saussure (1915), for
example). Chomsky urged the view that
the appropriate object of study is the linguistic system in the mind/brain of
the individual speaker. According to
this stance, a community has a common language by virtue of all speakers in the
community having essentially the same linguistic system in their minds/brains.1
The term most often used for this linguistic system is “knowledge”,
perhaps an unfortunate choice. However,
within the theoretical discourse of the time, the alternative was thinking of
language as an ability, a “knowing how” in the sense of Ryle (1949), which
carried overtones of behaviorism and stimulus-response learning, a sense from
which Chomsky with good reason wished to distance himself. It must be stressed, though, that whatever
term is used, the linguistic system in a speaker’s mind/brain is deeply
unconscious and largely unavailable to introspection, in the same way that our
processing of visual signals is deeply unconscious. Thus language is a kind of mind/brain property hard to associate
with the term “knowledge”, which commonly implies accessibility to
introspection. Foundations of
Language compromises with tradition by systematically using the term f-knowledge
(‘functional knowledge’) to describe whatever is in speakers’ heads that
enables them to speak and understand their native language(s).
There still are linguists, especially those edging off toward semiotics
and hermeneutics, who reject the mentalist stance and assert that the only
sensible way to study language is in terms of the communication between
individuals (a random example is Dufva and Lähteenmäki 1996). But on the whole, the mentalistic outlook of
generative grammar has continued to be hugely influential throughout
linguistics and cognitive neuroscience.
More controversial has been an important distinction made in Aspects
between the study of competence – a speaker’s f-knowledge of language –
and performance, the actual
processes (viewed computationally or neurally) taking place in the mind/brain
that put this f-knowledge to use in speaking and understanding sentences. I think the original impulse behind the
distinction was methodological convenience.
A competence theory permits linguists to do what they have always done,
namely study phenomena like Bulgarian case marking and Turkish vowel harmony,
without worrying too much about how the brain actually processes them. Unfortunately, in response to criticism
from many different quarters (especially in response to the collapse of the
derivational theory of complexity as detailed in e.g. Fodor, Bever, and Garrett
19xx), linguists have tended to harden the distinction into a firewall: competence theories were taken to be immune
to evidence from performance. And so
a gulf opened between linguistics and
the rest of cognitive science, one that has persisted until the present.
Foundations
does not abandon the competence-performance distinction, but does return it to
its original status as a methodological rather than ideological
distinction. Although the innovations
in Foundations are largely in the realm of competence theory, one of
their important consequences is that there is a far closer connection to
theories of processing, as well as the possibility of a two-way dialogue
between competence and performance theories.
We return to this issue in section 9.3.
Combinatoriality: The earliest published work
in generative grammar, Chomsky’s Syntactic Structures (1957), began with
the observation that a language contains an arbitrarily large number of
sentences. Therefore, in addition to
the finite list of words, a characterization of a language must contain a set
of rules (or a grammar) that collectively describes or
“generates” the sentences of the language.
Syntactic Structures showed that the rules of natural language
cannot be characterized in terms of a finite-state Markov process, nor in terms
of a context-free phrase structure grammar.
Chomsky proposed that the appropriate form for the rules of a natural
language is a context-free phrase structure grammar supplemented by
transformational rules. Not all
subsequent traditions of generative grammar (e.g. Head-Driven Phrase Structure
Grammar (Pollard and Sag 1994) and Lexical-Functional Grammar (Bresnan 1982,
2001)) have maintained the device of transformational rules; but they all
contain machinery designed to overcome the shortcomings of context-free
grammars pointed out in 1957.2
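The point that a finite word list plus a finite set of rules can describe an unbounded set of sentences can be illustrated with a toy phrase structure grammar. The sketch below is only an illustration: the rules, category labels, and vocabulary are assumptions invented for the example, not a fragment of any grammar discussed here, and a context-free grammar of this kind is exactly the device that Syntactic Structures argued is insufficient on its own.

```python
from itertools import product

# Hypothetical toy grammar. The recursive rule NP -> NP PP makes the set of
# sentences unbounded even though the rules and the lexicon are finite.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["NP", "PP"]],
    "VP": [["V", "NP"]],
    "PP": [["P", "NP"]],
    "Det": [["the"]],
    "N": [["bear"], ["lion"]],
    "V": [["chased"]],
    "P": [["near"]],
}

def expand(symbol, depth):
    """Return every word string derivable from `symbol` using at most
    `depth` nested rule applications."""
    if symbol not in RULES:      # a word of the lexicon, not a category
        return [[symbol]]
    if depth == 0:               # no rule applications left
        return []
    results = []
    for rhs in RULES[symbol]:
        parts = [expand(s, depth - 1) for s in rhs]
        for combo in product(*parts):
            results.append([w for part in combo for w in part])
    return results

def sentences(depth):
    """Distinct sentences derivable from S within the given depth bound."""
    return sorted({" ".join(ws) for ws in expand("S", depth)})
```

Raising the depth bound always yields more sentences, which is the sense in which the rule system, rather than any list, characterizes the language.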
Transferred into the mentalistic framework of 1965, the consequence of
combinatoriality is that speakers of the language must have rules of language
(or mental grammars) in their heads as part of their f-knowledge. Again there is a certain amount of
controversy arising from the term “rules”.
Rules of grammar in the sense of generative grammar are not like any of
the sorts of rules or laws in ordinary life:
rules of etiquette, rules of chess, traffic laws, or laws of
physics. They are unconscious
principles that play a role in the production and understanding of
sentences. Again, to ward off improper
analogies, Foundations uses the term f-rules for whatever the
combinatorial principles in the head may be.
Generative linguistics leaves open how directly the f-rules are involved in processing, but, as suggested
above, the unfortunate tendency among linguists has been not to care. The theory in Foundations, though,
makes it possible to regard the rules as playing a direct role in processing;
see again section 9.3.
An important reason for the spectacular reception of early generative
grammar was that it went beyond merely claiming that language needs rules: it offered rigorous formal techniques for
characterizing the rules, based on approaches to the foundations of mathematics
and computability developed earlier in the century. The technology suddenly made it possible to say lots of
interesting things about language and ask lots of interesting questions.
For the first time ever it was possible to provide detailed descriptions
of the syntax of natural languages (not only English but German, French,
Turkish, Mohawk, Hidatsa, and Japanese were studied early on). In addition, generative phonology took off
rapidly, adapting elements of Prague School phonology of the 1930s to the new
techniques. With Chomsky and Halle’s
1968 Sound Pattern of English as its flagship, generative phonology
quickly supplanted the phonological theory of the American structuralist
tradition.
Acquisition: Mentalism and combinatoriality together lead
to the crucial question: How do
children get the f-rules into their heads?
Given that the f-rules are unconscious, parents and peers cannot verbalize
them; and even if they could, children would not understand, since they don’t
know language yet. The best the
environment can do for a language learner is provide examples of the language
in a context. From there on it is up to
the language learner to construct the principles on his or her own –
unconsciously of course.
Chomsky asked the prescient question:
what does the child have to “(f-)know in advance” in order to accomplish
this feat? He phrased the problem in
terms of the “poverty of the stimulus”:
many different generalizations are consistent with the data presented to
the child, but the child somehow comes up with the “right” one, i.e. the one
that puts him or her in tune with the generalizations of the language
community. I like to put the problem a
bit more starkly: The whole community
of linguists, working together for decades with all sorts of crosslinguistic
and psycholinguistic data unavailable to children, has still been unable to
come up with a complete characterization of the grammar of a single natural
language. Yet every child does it by
the age of ten or so. Children don’t
have to make the choices we do: for
instance they don’t have to decide whether the “right” choice of grammar is in
the style of transformational grammar, the Minimalist Program, Optimality
Theory, Role and Reference Grammar, Tree-Adjoining Grammar, Cognitive Grammar,
connectionist networks, or some as yet unarticulated alternative. They already f-know it in advance.
One of the goals of linguistic theory, then, is to solve this “Paradox
of Language Acquisition” by discovering what aspects of linguistic f-knowledge
are not learned, but rather form the basis for the child’s
learning. The standard term for the
unlearned component is Universal Grammar or UG, a term that again
perhaps carries too much unwanted baggage.
In particular, UG should not be confused with universals of
language: it is rather what shapes the
acquisition of language. I prefer to
think of it as a toolkit for constructing language, out of which the child (or
better, the child’s brain) f-selects tools appropriate to the job at hand. If the language in the environment happens
to have a case system (like German), UG will help shape the child’s acquisition
of case; if it has a tone system (like Mandarin), UG will help shape the
child’s acquisition of tone. But if the
language in the environment happens to be English, which lacks case and tone,
these parts of UG will simply be silent.
What then is the source of language universals? Some of them will indeed be determined by
UG, for instance the overall “architecture” of the grammatical system: the parts of the mental grammar and their
relations (of which much more below).
Other universals, especially what are often called “statistical” or
“implicational” universals, may be the result of biases imposed by UG. For instance, UG may say that if a language
has a case system, the simplest such systems are thus-and-so; these will be
widespread systems crosslinguistically; they will be acquired earlier by
children; and systems may tend to change toward them over historical time. Other universals may be a consequence of the
functional properties of any relatively efficient communication system: for instance, the most frequently used
signals tend to be short. UG doesn’t
have to say anything about these universals at all; they will come about
through the dynamics of language use in the community (a process which of
course is not very well understood).
If UG is not learned, how does the child acquire it? The only alternative is through the
structure of the brain, which is determined through a combination of genetic
inheritance and the biological processes resulting from expression of the
genes, the latter in turn determined by some combination of inherent structure
and environmental input. Here
contemporary science is pretty much at an impasse. We know little about how genes determine brain structure and
nothing about how the details of brain structure determine anything about
language structure, even aspects of language as simple as speech sounds. Filling out this part of the picture is a
long-term challenge for cognitive neuroscience. It is premature to reject the hypothesis of Universal Grammar, as
some have (e.g. Elman et al. 1996 and Deacon 1997), arguing that we don’t know
how genes could code for language acquisition.
After all, we don’t know how genes code for birdsong or sexual behavior
or sneezing either, but we don’t deny that there is a genetic basis behind
these.
There next arises the question of how much of UG is a human cognitive
specialization for language and how much is a consequence of more general
capacities. The question has often been
oversimplified to a binary decision between language being entirely special or
entirely general, with a strong bias inside generative linguistics towards the
former and outside generative linguistics towards the latter. The truth of the matter undoubtedly lies
somewhere in between. To be sure, many
people (including myself) would find it satisfying if a substantial part of
language acquisition were a consequence of general human cognitive factors; but
the possibility of some specialization overlaying the general factors must not
be discounted. My view is that we
cannot determine what is general and what is special until we have comparable
theories of other cognitive capacities, including other learned
cognitive capacities. To claim that
language is parasitic on, say, motor control, perhaps because both have
hierarchical and temporal structure (this seems to be the essence of
Corballis’s (1991) position) – but without stating a theory of the f-knowledge
involved in motor control – is to
coarsen the fabric of linguistic theory to the point of unrecognizability. The closest approach to a comparable theory
is the music theory of Lerdahl and Jackendoff 1983, which displays some
striking parallels and some striking differences with language.
Of course, if UG – the ability to learn language – is in part a human
cognitive specialization, it must be determined by some specifically human
genes, which in turn had to have come into existence sometime since the hominid
line separated from the other great apes.
One would therefore like to be able to tell some reasonable story about
how it could be shaped by natural selection or other evolutionary
processes. We return to this issue in
section 9.4.
This approach to the acquisition of language has given rise to a
flourishing tradition of developmental research (references far too numerous to
mention) and a small but persistent tradition in learnability theory (e.g.
Wexler and Culicover 1980, Baker and McCarthy 1981). And certainly, even if the jury is not yet in on the degree to
which language acquisition is a cognitive specialization, there have been all
manner of phenomena investigated that bear on the issue, for instance:
My impression is that, while there are questions about
all of these cases, en masse they offer an overwhelming case for some degree of
genetic specialization for language learning in humans.
These three foundational issues of generative grammar
– mentalism, combinatoriality, and acquisition – have stood the test of time;
if anything they have become even more important over the years in the rest of
cognitive science. It is these three
issues that connect linguistics intimately with psychology, brain science, and
genetics. Much of the promise of
generative linguistics arose from this new and exciting potential for
scientific unification.
3. The broken promise: Deep Structure would be the key to the mind
A fourth major point of Aspects, and the one
that seeped most deeply into the awareness of the wider public, concerned the
notion of Deep Structure. A basic claim
of the 1965 version of generative grammar was that in addition to the surface
form of sentences, i.e. the form we hear, there is another level of syntactic
structure, called Deep Structure, which expresses underlying syntactic
regularities of sentences. For
instance, a passive sentence like (1a)
has a Deep Structure in which the noun phrases are in the order of the
corresponding active (1b).
(1) a. The bear was chased by the lion.
b. The lion chased the bear.
Similarly, a question such as (2a) has a Deep
Structure closely resembling that of the corresponding declarative (2b).
(2) a. Which martini did Harry drink?
b. Harry drank that martini.
In the years preceding Aspects, the question
arose of how syntactic structure is connected to meaning. Following a hypothesis first proposed by
Katz and Postal (1964), Aspects made the striking claim that the
relevant level of syntax for determining meaning is Deep Structure.
In its weakest version, this claim was only that
regularities of meaning are most directly encoded in Deep Structure, and this
can be seen in (1) and (2). However,
the claim was sometimes taken to imply much more: that Deep Structure IS meaning, an interpretation that
Chomsky did not at first discourage.3 And this was the part of generative
linguistics that got everyone really excited.
For if the techniques of transformational grammar lead us to meaning, we
can uncover the nature of human thought.
Moreover, if Deep Structure is innate – being dictated by Universal
Grammar – then linguistic theory gives us unparalleled access to the essence of
human nature. No wonder everyone
wanted to learn linguistics.
What happened next was that a group of generative
linguists, notably George Lakoff, John Robert Ross, James McCawley, and Paul
Postal, pushed very hard on the idea that Deep Structure should directly encode
meaning. The outcome, the theory of
Generative Semantics (e.g. McCawley 1968, Postal 1970, Lakoff 1971), increased
the “abstractness” and complexity of Deep Structure, to the point that the
example Floyd broke the glass was famously posited to have eight
underlying clauses, each corresponding to some feature of the semantics. All the people who admired Aspects
for what it said about meaning loved Generative Semantics, and it swept the
country. But Chomsky himself reacted
negatively, and with the aid of his then-current students (full
disclosure: present author included),
argued vigorously against Generative Semantics. When the dust of the ensuing “Linguistics Wars” cleared around
1973 (Newmeyer 1980, Harris 1993, Huck and Goldsmith 1995), Chomsky had won –
but with a twist: he no longer claimed
that Deep Structure was the sole level that determines meaning (Chomsky
1972). Then, having won the battle, he
turned his attention, not to meaning, but to relatively technical constraints
on movement transformations (e.g. Chomsky 1973, 1977).
The reaction in the larger community was shock: for one thing, at the fact that the
linguists had behaved so badly; but more substantively, at the sense that there
had been a “bait and switch.” Chomsky
had promised Meaning with a capital M and then had withdrawn the
offer. Many researchers, both inside
and outside linguistics, turned away from generative grammar with disgust,
rejecting not only Deep Structure but mentalism, innateness, and sometimes even
combinatoriality. And when, later in
the 1970s, Chomsky started talking about meaning again, in terms of a syntactic
level of Logical Form (e.g. Chomsky 1981), it was too late: the damage had been done. From this point on, the increasingly
abstract technical apparatus of generative grammar was of no interest to more than
a tiny minority of cognitive scientists, much less the general public.
Meanwhile, various non-Chomskyan traditions of
generative grammar developed, most notably Relational Grammar (Perlmutter
1983), Head-Driven Phrase Structure Grammar (Pollard and Sag 1987, 1994),
Lexical-Functional Grammar (Bresnan 1982, 2001), Formal Semantics (Partee 1976,
Heim and Kratzer 19xx), Optimality Theory (Prince and Smolensky 1993),
Construction Grammar (Fillmore et al. 1988, Goldberg 1995), and Cognitive
Grammar (Lakoff 1987, Langacker 1987, Talmy 2000). On the whole, these approaches to linguistics (with the possible
exception of Cognitive Grammar) have made even less contact with philosophy,
psychology, and neuroscience than the recent Chomskyan tradition. My impression is that many linguists have
simply returned to the traditional concerns of the field: describing languages,
with as little theoretical and cognitive baggage as possible. While this is perfectly fine – particularly
since issues of innateness don’t play too big a role when you’re trying to record
an endangered language before its speakers all die – the sense of excitement
and danger that comes from participating in the integration of fields has
become attenuated.
4. The scientific mistake: syntactocentrism
So much for pure intellectual history. We now turn to what I think was an important
mistake at the core of generative grammar, one that in retrospect lies behind
much of the alienation of linguistic theory from the cognitive sciences. Chomsky did demonstrate that language
requires a generative system that makes possible an infinite variety of
sentences. However, he explicitly
assumed, without argument (1965: 16, 17, 75, 198), that generativity is
localized in the syntactic component of the grammar – the construction of
phrases from words – and that phonology
(the organization of speech sounds) and semantics (the organization of meaning)
are purely “interpretive”, that is, that their combinatorial properties are
derived strictly from the combinatoriality of syntax.
In 1965 this was a perfectly reasonable view. The important issue at that time was to show
that something in language is generative. Generative syntax had provided powerful new tools, which were
yielding copious and striking results.
At the time, it looked as though phonology could be treated as a sort of
low-level derivative of syntax: the syntax gets the words in the right order,
then phonology massages their pronunciation to adjust them to their local
environment. As for semantics,
virtually nothing was known: the only
things on the table were the rudimentary proposals of Katz and Fodor (1963) and
some promising work by people such as Bierwisch (1967, 1969) and Weinreich
(1966). So the state of the theory
offered no reason to question the assumption that all combinatorial complexity
arises from syntax.
Subsequent shifts in mainstream generative linguistics
stressed major differences in outlook.
But one thing that remained unchanged was the assumption that syntax is
the sole source of combinatoriality.
Figure 1 diagrams the architecture of components in three major stages
of Chomskyan syntactic theory: the Aspects theory, Principles and
Parameters (or Government-Binding) Theory (Chomsky 1981), and the Minimalist
Program (Chomsky 1995). The arrows
denote direction of derivation.

Figure 1. Architecture of Chomsky's theories over the years.
These shifts alter the components of syntax and their
relation to sound and meaning. What
remains constant throughout, though, is that (a) there is an initial stage of
derivation in which words or morphemes are combined into syntactic structures;
(b) these structures are then massaged by various syntactic operations; and (c)
certain syntactic structures are shipped off to phonology/phonetics to be
pronounced and other syntactic structures are shipped off to “semantic
interpretation” to be understood. In
short, syntax is the source of all linguistic organization.
I believe that this assumption of “syntactocentrism” –
which, I repeat, was never explicitly grounded – was an important mistake at the heart of the field.4 The correct approach is to regard linguistic
structure as the product of a number of parallel but interacting generative
capacities – at the very least, one each for phonology, syntax, and
semantics. As we will see, elements of
such a “parallel architecture” have been implicit in practice in the field for
years. What is novel in the present
work is bringing these practices out into the open, stating them as a
foundational principle of linguistic organization, and exploring the
large-scale consequences.
5. Phonology as an exemplar of the parallel architecture
An unnoticed crack in the assumption of
syntactocentrism appeared in the middle to late 1970s, when the theory of
phonology underwent a major sea change.
Before then, the sound system of language had been regarded essentially
as a sequence of speech sounds. Any
further structure, such as the division into words, was thought of as simply
inherited from syntax. However, beginning with work such as Goldsmith (1979)
and Liberman and Prince (1977), phonology rapidly came to be thought of as
having its own autonomous structure, in fact multiple structures or tiers. Figure 2 provides a sample, the structure of
the phrase the big apple. The
phonological segments appear at the bottom, as terminal elements of the
syllabic tree.

Figure 2. Phonological structure of the big apple.
There are several innovations here. First, syllabic structure is seen as
hierarchically organized. At the center
of the syllable (notated as σ) is a syllabic nucleus (notated N),
which is usually a vowel but sometimes a syllabic consonant such as the l
in apple. The material following
the nucleus is the syllabic coda (notated C); this groups with the
nucleus to form the rhyme (notated R), the part involved in
rhyming. In turn, the rhyme groups with
the syllabic onset (notated O) to form the entire syllable. Syllables are grouped together into larger
units such as feet and phonological words (here, the bracketing subscripted Wd). Notice that in Figure 2, the word the
does not constitute a phonological word on its own; it is attached (or
cliticized) onto the word big.
Finally, phonological words group into larger units such as phonological
phrases. Languages differ in their
repertoire of admissible nuclei, onsets, and codas, but the basic hierarchical
organization and the principles by which strings of segments are divided into syllables
are universal. (It should also be
mentioned that signed languages have parallel syllabic organization, except
that the syllables are built out of manual rather than vocal constituents
(Klima and Bellugi 1979, Fischer and Siple 1990).)
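The hierarchy just described (onset and rhyme within the syllable, nucleus and coda within the rhyme, syllables within phonological words) can be made concrete with a small data-structure sketch. Everything specific in it is an assumption for illustration: the class and field names are hypothetical, and the segments of the big apple are given in a rough ASCII stand-in rather than in the notation of the figure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Syllable:
    onset: List[str]      # O: material before the nucleus (may be empty)
    nucleus: List[str]    # N: usually a vowel, sometimes a syllabic consonant
    coda: List[str] = field(default_factory=list)  # C: material after the nucleus

    @property
    def rhyme(self) -> List[str]:
        # R groups the nucleus with the coda.
        return self.nucleus + self.coda

    @property
    def segments(self) -> List[str]:
        # The whole syllable groups the onset with the rhyme.
        return self.onset + self.rhyme

@dataclass
class PhonWord:
    syllables: List[Syllable]

# Assumed ASCII transcription: "dh" and "@" stand in for the sounds of "the".
the = Syllable(onset=["dh"], nucleus=["@"])
big = Syllable(onset=["b"], nucleus=["i"], coda=["g"])
ap = Syllable(onset=[], nucleus=["ae"])
ple = Syllable(onset=["p"], nucleus=["l"])   # syllabic l serves as nucleus

# "the" is cliticized onto "big", so the phrase has two phonological words.
phrase = [PhonWord([the, big]), PhonWord([ap, ple])]
```

Note that a Syllable never contains another Syllable, and none of the labels is a syntactic category such as noun or verb.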
These hierarchical structures are not built out of
syntactic primitives such as nouns, verbs, and determiners; their units are
intrinsically phonological. In
addition, the structures, though hierarchical, are not recursive.5 Thus the principles governing these structures
are not derivable from syntactic structures; they are an autonomous system of
generative rules.
Next consider the metrical grid in Figure 2. Its units are beats, notated as
columns of xs. A column with
only one x is a weak beat, and more xs in a column indicate a
relatively stronger beat. Each beat is
associated with a syllable; the strength of a beat indicates the relative
stress on that syllable, so that for example in Figure 2 the first syllable of apple
receives maximum stress. The basic principles
of metrical grids are in part autonomous of language: they also appear, for instance, in music (Lerdahl and Jackendoff
1983), where they are associated with notes instead of syllables. Metrical grids place a high priority on
rhythmicity: an optimum grid presents an alternation of strong and weak beats,
as is found in music and in much poetry.
On the other hand, the structure of syllables exerts an influence on the
associated metrical grid: syllables
with heavy rhymes (i.e. containing a coda or a long vowel) “want” to be
associated with relatively heavy stress.
The stress rules of a language concern the way syllabic structure comes
to be associated with a metrical grid; languages differ in ways that are now
quite well understood (e.g. Kager 1995, Halle and Idsardi 1995).
Again, metrical grids are built of nonsyntactic
units. As they are to some degree
independent of syllabic structure, they turn out to be a further autonomous
“tier” of phonological structure.
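A metrical grid of this kind can be sketched as a list of syllables paired with beat strengths. The particular strengths below are assumptions: the text states only that a single x is a weak beat and that the first syllable of apple receives maximum stress. The function names are likewise hypothetical.

```python
# Assumed beat strengths (number of xs per column) for "the big apple".
GRID = [("the", 1), ("big", 2), ("ap", 3), ("ple", 1)]

def render(grid):
    """Draw the grid as rows of x's, strongest level on top, with the
    syllables on the last row."""
    top = max(height for _, height in grid)
    width = max(len(syllable) for syllable, _ in grid) + 1
    rows = []
    for level in range(top, 0, -1):
        row = "".join(
            ("x" if height >= level else " ").ljust(width) for _, height in grid
        )
        rows.append(row.rstrip())
    rows.append("".join(syllable.ljust(width) for syllable, _ in grid).rstrip())
    return "\n".join(rows)

def strongest(grid):
    """Return the syllable carrying the tallest column, i.e. the main stress."""
    return max(grid, key=lambda pair: pair[1])[0]

print(render(GRID))
```

The grid refers only to beats and syllables, which is why it can be treated as a tier of its own rather than as something read off syntactic structure.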
At a larger scale of phonological organization we find
prosodic units over which intonation contours are defined. These are comparable in size to syntactic
phrases but do not coincide with them.
Here are two examples.
(3) Syntactic bracketing:
[Sesame Street] [is [a production [of [the Children’s Television Workshop]]]]
Prosodic bracketing (two pronunciations):
a. [Sesame Street is a production of] [the Children’s Television Workshop]
b. [Sesame Street] [is a production] [of the Children’s Television Workshop]
(4) Syntactic bracketing:
[This] [is [the cat [that chased [the rat [that ate [the cheese]]]]]]
Prosodic bracketing:
[This is the cat] [that chased the rat] [that ate the cheese]
The two pronunciations of (3) are both acceptable, and
other prosodic bracketings are also possible.
However, the choice of prosodic bracketing is not entirely free, since
for instance [Sesame] [Street is a production of the] [Children’s Television
Workshop] is an impossible phrasing.
Now notice that the first constituent of (3a) and the second constituent
of (3b) do not correspond to any syntactic constituent. We would be hard pressed to know what
syntactic label to give to Sesame Street is a production of. But as an intonational constituent it
is perfectly fine. Similarly in (4),
the syntax is relentlessly right-embedded, but the prosody is flat and
perfectly balanced into three parts.
Again, the first two constituents of the prosody do not correspond to
syntactic constituents of the sentence.
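The mismatch can be checked mechanically by reducing each bracketing in (3) and (4) to the set of word spans it groups together and asking which prosodic spans have no syntactic counterpart. This is a minimal sketch; the function names are hypothetical and the comparison looks only at which words are grouped, not at category labels.

```python
def spans(bracketing):
    """Return the set of (start, end) word spans, one per bracketed
    constituent, with `end` exclusive."""
    tokens = bracketing.replace("[", " [ ").replace("]", " ] ").split()
    found, stack, position = set(), [], 0
    for tok in tokens:
        if tok == "[":
            stack.append(position)          # remember where this group starts
        elif tok == "]":
            found.add((stack.pop(), position))
        else:
            position += 1                   # an ordinary word
    return found

def unmatched(prosody, syntax):
    """Prosodic constituents with no syntactic constituent over the same words."""
    return sorted(spans(prosody) - spans(syntax))

SYNTAX_3 = "[Sesame Street] [is [a production [of [the Children's Television Workshop]]]]"
PROSODY_3A = "[Sesame Street is a production of] [the Children's Television Workshop]"
PROSODY_3B = "[Sesame Street] [is a production] [of the Children's Television Workshop]"

SYNTAX_4 = "[This] [is [the cat [that chased [the rat [that ate [the cheese]]]]]]"
PROSODY_4 = "[This is the cat] [that chased the rat] [that ate the cheese]"

print(unmatched(PROSODY_3A, SYNTAX_3))  # span of "Sesame Street is a production of"
print(unmatched(PROSODY_3B, SYNTAX_3))  # span of "is a production"
print(unmatched(PROSODY_4, SYNTAX_4))   # spans of the first two prosodic phrases
```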
The proper way to deal with this lack of correspondence
is to posit a phonological category of Intonational Phrase, which plays a role
in the assignment of intonation contours and the distribution of stress
(Beckman and Pierrehumbert 1986, Ladd 1996).
Intonational Phrases are to some degree correlated with syntax; their
boundaries tend to be at the beginning of major syntactic constituents; but
their ends do not necessarily correlate with the ends of the
corresponding syntactic constituents.
At the same time, intonational phrases have their own autonomous
constraints, in particular a strong preference for rhythmicity and parallelism
(as evinced in (2) for example), and a preference for saving the longest
prosodic constituent for the end of the sentence.6
Another example of mismatch between syntax and
phonology comes from contractions such as I’m and Lisa’s (as in Lisa’s
a doctor). These are clearly
phonological words, but what is their syntactic category? It is implausible to see them either as noun
phrases that incidentally contain a verb or as verbs that
incidentally contain a noun. Keeping
phonological and syntactic structure separate allows us to say the natural
thing: they are phonological words that
correspond to two separate syntactic constituents.
(5) Syntactic structure: [NP I] [V (a)m] [NP Lisa] [V (i)s]
Phonological structure: [Wd I’m] [Wd Lisa’s]
Since every different sentence of the language has a
different phonological structure, and since phonological structures cannot be
derived from syntax, the usual arguments for combinatoriality lead us to the
conclusion that phonological structure is generative. However, in addition to the generative principles that describe
these structures, it is necessary to introduce a new kind of principle into the
grammar, what might be called “correspondence rules” or “interface rules.” These rules (I revert to the standard term
“rules” rather than being obsessive about “f-rules”) regulate the way the
independent structures correspond with each other. For instance, the relation between syllable weight and metrical
weight is regulated by an interface rule between syllabic and metrical
structure; the relation between syntactic and intonational constituents is
regulated by an interface rule between syntactic and prosodic structure.
An important property of interface rules is that they don’t
“see” every aspect of the structures they are connecting. For instance, the rules that connect
syllabic content to metrical grids are totally insensitive to syllable onset:
universally, stress rules care only about what happens in the rhyme. Similarly, although the connection between
syntax and phonology “sees” certain syntactic boundaries, it is insensitive to
the depth of syntactic embedding. Moreover, syntactic structure is totally
insensitive to the segmental content of the words it is arranging (e.g. there
is no syntactic rule that applies only to words that begin with b). Thus interface rules implement not
isomorphisms between the structures they relate, but rather only partial
homomorphisms.
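The sense in which an interface rule “sees” only part of a structure can be made concrete with a small sketch. It is an illustration under stated assumptions, not a rule of any particular language: the `Syllable` class and the weight criterion (a syllable counts as heavy if it has a long vowel or a coda) are hypothetical simplifications, and languages differ in what they count as heavy. The point is only that the mapping never consults the onset, so syllables differing in onset alone receive the same metrical weight, a many-to-one correspondence rather than an isomorphism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Syllable:
    onset: str    # consonants before the vowel
    nucleus: str  # the vowel; a doubled letter stands for a long vowel (assumption)
    coda: str     # consonants after the vowel

def metrical_weight(syllable):
    """Map a syllable to a metrical weight by inspecting the rhyme only.

    The onset field is never read: the mapping is blind to it.
    """
    has_long_vowel = len(syllable.nucleus) > 1
    has_coda = bool(syllable.coda)
    return "heavy" if has_long_vowel or has_coda else "light"

# Different onsets, same rhyme: the interface assigns the same weight.
print(metrical_weight(Syllable("", "a", "")))     # light
print(metrical_weight(Syllable("str", "a", "")))  # light
# A change in the rhyme is visible to the interface.
print(metrical_weight(Syllable("t", "a", "n")))   # heavy
print(metrical_weight(Syllable("t", "aa", "")))   # heavy
```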
This is not to say that we should think of speakers as
thinking up phonological and syntactic structures independently in the hope
they can be matched up by the interfaces.
That would be the same sort of mistake as thinking that speakers start
with the symbol S and generate a syntactic tree, finally putting in
words so they know what the sentence is about.
At the moment we are not thinking in terms of production; rather we are
stating the principles (of “competence”) in terms of which sentences are
well-formed. We will get back to how
this is related to processing in section 9.3.
Now the main point of this section. This view of phonological structure,
developed in the late 1970s and almost immediately adopted as standard, is
deeply subversive of the syntactocentric assumption that all linguistic
combinatoriality originates in syntax.
According to this view, phonological structure is not just a passive
hand-me-down derived from low-level syntax: it has its own role in shaping the
totality of linguistic structure. But
at the time of these changes, no great commotion was made about this most
radical aspect of the new phonology.
Phonologists for the most part were happy to get on with exploring this
exciting way of doing things, and for them, the consequences for syntax didn’t
matter. Syntacticians, for their part,
simply found phonology irrelevant to
their concerns of constraining movement rules and the like, especially since
phonology had now developed its own arcane technical machinery. So neither subdiscipline really took notice;
and as the technologies diverged, the relation between syntax and phonology
became a no-man’s-land (or perhaps only a very-few-man’s-land). Tellingly, as far as I can determine, in all
of Chomsky’s frequent writings on the character of the human language capacity,
there is no reference at all to post-1975 phonology – much less to the
challenge that it presents to his overall syntactocentric view of language.
6. The syntax-semantics interface
I have treated the developments in phonology first
because they are less controversial. But
in fact the same thing happened in semantics.
Over the course of the 1970s and 1980s, several radically different
approaches to semantics developed: within
linguistics, at least Formal Semantics (growing out of formal logic) (Partee
1976, Heim and Kratzer 19xx), Cognitive Grammar (Lakoff 1987, Langacker 1987,
Talmy 2000), and Conceptual Semantics (Jackendoff 1983, 1990, Pinker 1989,
Pustejovsky 1995), plus approaches within computational linguistics and
cognitive psychology. Whatever their
differences, all these approaches take meaning to be deeply combinatorial. None of them take the units of semantic
structure to be syntactic units such as NPs and VPs; rather, the units are
intrinsically semantic entities like objects, events, actions, properties, and
quantifiers.7 Therefore,
whichever semantic theory we choose, it is necessary to grant semantics an
independent generative organization, and it is necessary to include in the
theory of grammar an interface component that correlates semantic structures
with syntactic and phonological structures.
In other words, the relation of syntax to semantics is qualitatively parallel
to the relation of syntax to phonology.
However, apparently no one pointed out the challenge to syntactocentrism
– except the Cognitive Grammarians, who mostly went to the other extreme and
denied syntax any independent role, and who have been steadfastly
ignored by mainstream generative linguistics.
The organization of phonological structure into
semi-independent tiers finds a parallel in semantics as well. Linguistic meaning can be readily partitioned
into two independent aspects. On one
hand there is what might be called “propositional structure”: who did what to whom and so on. For instance, in The bear chased the lion,
there is an event of chasing in which the bear is the chaser and the lion is
the “chasee”. On the other hand, there is
also what is now called “information structure”: the partitioning of the message into old vs. new information,
topic vs. comment, presupposition vs. focus, and so forth. We can leave the propositional structure of
a sentence intact but change its information structure, by using stress (6a-c)
or various focusing constructions (6d-f):
(6) a. The BEAR chased the lion.
b. The bear chased the LION.
c. The bear CHASED the lion.
d. It was the bear that chased the lion.
e. What the bear did was chase the lion.
f. What happened to the lion was the bear chased it.
Thus the propositional structure and the information
structure are orthogonal dimensions of meaning, and can profitably be regarded
as autonomous tiers. (Foundations
proposes a further split of propositional structure into descriptive and
referential tiers, an issue too complex for the present context.)
Like the interface between syntax and phonology, that
between syntax and semantics is not an isomorphism. Some aspects of syntax make no difference in semantics. For instance, the semantic structure of a
language is the same whether or not the syntax marks subject-verb agreement,
verb-object agreement, or nominative and accusative case. The semantic structure of a language does
not care whether the syntax calls for the verb to be after the subject (as in
English), at
the end of the clause (as in Japanese), or second in a main clause and final in
a subordinate clause (as in German). As
these aspects of syntax are not correlated with or derivable from semantics,
the interface component disregards them.
Similarly, some aspects of semantics have little if
any systematic effect in syntax. Here
are a few well-known examples.
(7) a. Where is my hat?
b. (Now, Billy:) What’s the capital of New York?
c. Would you please open the window?
d. Is the Pope Catholic?
(8) a. Jill jumped until the alarm went off.
b. Jill slept until the alarm went off.
c. Jill jumped when the alarm went off.
The standard account of this contrast (Talmy 2000,
Verkuyl 1993, Pustejovsky 1995, Jackendoff 1997) is that the meaning of until
is to set a temporal bound on an ongoing process. When the verb phrase already denotes an ongoing process, such as
sleeping, all is well. But when the
verb phrase denotes an action that has a natural temporal ending, such as
jumping, then its interpretation is “coerced” into repeated action – a
sort of ongoing process – which in turn can have a temporal bound set on it by until. For present purposes, the point is that the
sense of repetition arises from semantic combination, without any direct
syntactic reflex. (On the other hand,
there are languages such as American Sign Language that have a grammatical
marker of iteration; this will have to be used in the translation of (8a).)
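The coercion account of (8) can be sketched as a small composition procedure. This is a toy illustration, not the formalism of any of the cited works: the two-verb lexicon, the aspectual labels, and the `REPEAT` and `UNTIL` predicates are hypothetical names introduced here. The sketch shows only that the repetition in (8a) is introduced during semantic combination, with no extra word or syntactic marker in the input.

```python
# Hypothetical mini-lexicon; the aspectual class of each verb is an assumption.
ASPECTUAL_CLASS = {"sleep": "process", "jump": "bounded_event"}

def situation(verb):
    """Build the semantic structure contributed by the verb phrase."""
    return {"class": ASPECTUAL_CLASS[verb], "predicate": verb}

def coerce_to_process(sit):
    """Leave a process alone; wrap a bounded event in repetition."""
    if sit["class"] == "process":
        return sit
    return {"class": "process", "predicate": "REPEAT", "argument": sit}

def until(sit, bound):
    """'until' sets a temporal bound on a process, coercing its argument if needed."""
    return {"predicate": "UNTIL", "process": coerce_to_process(sit), "bound": bound}

jumped = until(situation("jump"), "the alarm went off")   # cf. (8a)
slept = until(situation("sleep"), "the alarm went off")   # cf. (8b)

print(jumped["process"]["predicate"])  # REPEAT
print(slept["process"]["predicate"])   # sleep
```

Both calls have the same shape, mirroring the identical syntax of (8a) and (8b); the difference in interpretation arises only inside `coerce_to_process`.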
(9) a. [One waitress says to another]:
The ham sandwich wants another cup of coffee.
[Interpretation: ‘the person who ordered/is eating the ham sandwich...’]
b. Chomsky is on the top shelf next to Plato.
[Interpretation: ‘the book by Chomsky ...’]
Such cases of “reference transfer” contain no syntactic
reflex of the italicized parts of the interpretation. One might be tempted to dismiss these phenomena as “mere
pragmatics”, hence outside the grammatical system. But this proves impossible, because reference transfer can have
indirect grammatical effects. A clear
example involves imagining that Richard Nixon went to see the opera Nixon in
China (yes, a real opera!), and what happened was that:
(10) Nixon was astonished to see himself sing a foolish duet with Pat.
The singer of the duet, of course, is the actor
playing Nixon; thus the interpretation of himself involves a
reference transfer. However, we cannot
felicitously say that what happened next was that:
(11) *(Up on stage,) Nixon was astonished to see himself get up and walk out.
That is, a reflexive pronoun referring to the acted
character can have the real person as antecedent, but not vice versa
(Fauconnier 1985, Jackendoff 1992).
Since the use of reflexive pronouns is central to grammar, reference
transfer cannot be seen as “extragrammatical.”
(12) Everyone in this room knows at least two languages.
a. ‘John knows English and French; Sue knows Hebrew and Hausa; ....’
b. ‘... namely, Mandarin and Navajo.’
Should there be two different syntactic structures
associated with these two interpretations?
Chomsky 1957 said no; Chomsky 1981 said yes; Generative Semantics said
yes; I am inclined to say no (Jackendoff 1996, Foundations chapter
12). The problem with finding two
different syntactic structures is that it requires systematic and drastic
distortions of the syntactic tree that never show up in the surface syntax of
any language. The problem with having
only one syntactic structure is that it makes the syntax-semantics interface
more complex. The point to be made here
is that the scope of quantification may well be a further example of the
“dirtiness” of the interface between syntax and semantics; this continues to be
an important issue in linguistic theory.
In each of these cases, a syntactocentric theory is
forced to derive the semantic distinctions from syntactic distinctions. Hence it is forced into artificial solutions
such as empty syntactic structure and elaborate movement, which have no
independent motivation beyond providing grist for the semantics. On the other hand, if the semantics is
treated as independent from syntax but correlated with it, it is possible to
permit a less than perfect correlation; it is then an empirical issue to
determine how close the match is.