To be published in Behavioral and Brain Sciences (in press)
© Cambridge University Press 2003


Below is the unedited précis of a book that is being accorded BBS multiple book review. This preprint has been prepared for potential commentators who wish to nominate themselves for formal commentary invitation. Please do not write a commentary unless you receive a formal invitation. Invited commentators will receive full instructions. Commentary must be based on the book, not the précis.


Précis of: Foundations of Language: Brain, Meaning, Grammar, Evolution

Oxford and New York: Oxford University Press, 2002 (506 pp.)

 

 

Ray Jackendoff

Program in Linguistics, MS 013

Brandeis University

Waltham, MA 02454  USA

jackendoff@brandeis.edu

tel: 617-484-5394

fax: 617-484-0164

 

 

Short abstract: The goal of this study is to reintegrate the theory of generative grammar into the cognitive sciences. Generative grammar was correct to focus on the child's acquisition of language as its central problem, leading to the hypothesis of an innate Universal Grammar.  However, generative grammar was mistaken to assume that the syntactic component is the sole source of combinatoriality, and that everything else is "interpretive." The proper approach is a parallel architecture, in which phonology, syntax, and semantics are autonomous generative systems, linked by interface components.  The parallel architecture leads to an integration within linguistics, and to a far better integration with the rest of cognitive neuroscience.  It fits naturally into the larger architecture of the mind/brain and permits a properly mentalistic theory of semantics.  It leads to a view of linguistic performance in which the rules of grammar are directly involved in processing.  Finally, it leads to a natural account of the incremental evolution of the language capacity.

 

 

1.  Introduction

 

In the 1960s, when I became a graduate student in linguistics, generative grammar was the hot new topic.  Everyone from philosophers to psychologists to anthropologists to educators to literary theorists was reading about transformational grammar.  But by the late 1970s, the bloom was off the rose, although most linguists didn’t realize it; and by the 1990s, linguistics was arguably far on the periphery of the action in cognitive science.   To some extent, of course, such a decline in fortune was simply a matter of fashion and the arrival of new methodologies such as connectionism and brain imaging.  However, there are deeper reasons for linguistics’ loss of prestige, some historical and some scientific. 

 

The basic questions I want to take up here, then, are: 

 

What was right about generative grammar in the 1960s, such that it held out such promise?

What was wrong about it, such that it didn’t fulfill its promise?

How can we fix it, so as to restore its value to the other cognitive sciences?

 

The goal is to integrate linguistics with the other cognitive sciences, not to eliminate the insights achieved by any of them.  To understand language and the brain, we need all the tools we can get.  But everyone will have to give a little in order for the pieces to fit together properly. 

 

The position developed in Foundations of Language is that the overall program of generative grammar was correct, as was the way this program was intended to fit in with psychology and biology.  However, a basic technical mistake at the heart of the formal implementation, concerning the overall role of syntax in the grammar, led to the theory being unable to make the proper connections both within linguistic theory and with neighboring fields.  Foundations of Language develops an alternative, the parallel architecture, which offers far richer opportunities for integration of the field.  In order to understand the motivation for the parallel architecture, it is necessary to go through some history.

 

2.  Three founding themes of generative grammar

 

The remarkable first chapter of Noam Chomsky’s Aspects of the Theory of Syntax (1965) set the agenda for everything that has happened in generative linguistics since.  Three theoretical pillars support the enterprise:   mentalism, combinatoriality, and acquisition. 

 

Mentalism.  Before Aspects, the predominant view among linguists – if it was even discussed – was that language is something that exists either as an abstraction, or in texts, or in some sense “in the community” (the latter being the influential view of Saussure (1915), for example).  Chomsky urged the view that the appropriate object of study is the linguistic system in the mind/brain of the individual speaker.  According to this stance, a community has a common language by virtue of all speakers in the community having essentially the same linguistic system in their minds/brains.1

 

The term most often used for this linguistic system is “knowledge”, perhaps an unfortunate choice.  However, within the theoretical discourse of the time, the alternative was thinking of language as an ability, a “knowing how” in the sense of Ryle (1949), which carried overtones of behaviorism and stimulus-response learning, a sense from which Chomsky with good reason wished to distance himself.  It must be stressed, though, that whatever term is used, the linguistic system in a speaker’s mind/brain is deeply unconscious and largely unavailable to introspection, in the same way that our processing of visual signals is deeply unconscious.  Thus language is a kind of mind/brain property hard to associate with the term “knowledge”, which commonly implies accessibility to introspection.  Foundations of Language compromises with tradition by systematically using the term f-knowledge (‘functional knowledge’) to describe whatever is in speakers’ heads that enables them to speak and understand their native language(s).

 

There still are linguists, especially those edging off toward semiotics and hermeneutics, who reject the mentalist stance and assert that the only sensible way to study language is in terms of the communication between individuals (a random example is Dufva and Lähteenmäki 1996).  But on the whole, the mentalistic outlook of generative grammar has continued to be hugely influential throughout linguistics and cognitive neuroscience. 

 

More controversial has been an important distinction made in Aspects between the study of competence – a speaker’s f-knowledge of language – and  performance, the actual processes (viewed computationally or neurally) taking place in the mind/brain that put this f-knowledge to use in speaking and understanding sentences.  I think the original impulse behind the distinction was methodological convenience.  A competence theory permits linguists to do what they have always done, namely study phenomena like Bulgarian case marking and Turkish vowel harmony, without worrying too much about how the brain actually processes them.   Unfortunately, in response to criticism from many different quarters (especially in response to the collapse of the derivational theory of complexity as detailed in e.g. Fodor, Bever, and Garrett 19xx), linguists have tended to harden the distinction into a firewall:  competence theories were taken to be immune to evidence from performance.  And so began a gulf between linguistics and  the rest of cognitive science that has persisted until the present. 

 

Foundations does not abandon the competence-performance distinction, but does return it to its original status as a methodological rather than ideological distinction.  Although the innovations in Foundations are largely in the realm of competence theory, one of their important consequences is that there is a far closer connection to theories of processing, as well as the possibility of a two-way dialogue between competence and performance theories.  We return to this issue in section 9.3.

 

Combinatoriality:   The earliest published work in generative grammar, Chomsky’s Syntactic Structures (1957), began with the observation that a language contains an arbitrarily large number of sentences.  Therefore, in addition to the finite list of words, a characterization of a language must contain a set of rules (or a grammar) that collectively describes or “generates” the sentences of the language.  Syntactic Structures showed that the rules of natural language cannot be characterized in terms of a finite-state Markov process, nor in terms of a context-free phrase structure grammar.  Chomsky proposed that the appropriate form for the rules of a natural language is a context-free phrase structure grammar supplemented by transformational rules.  Not all subsequent traditions of generative grammar (e.g. Head-Driven Phrase Structure Grammar (Pollard and Sag 1994) and Lexical-Functional Grammar (Bresnan 1982, 2001)) have maintained the device of transformational rules; but they all contain machinery designed to overcome the shortcomings of context-free grammars pointed out in 1957.2
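The basic point about combinatoriality, that a finite list of words plus a finite set of rules can characterize an arbitrarily large set of sentences, can be illustrated with a minimal sketch in Python.  The toy grammar and the function name generate are invented here purely for illustration and are not drawn from the book.  The sketch shows only how recursion in phrase structure rules yields an unbounded set of sentences; it does not bear on the separate argument that context-free rules are insufficient for natural language.

```python
import itertools

# Hypothetical toy phrase structure grammar (an assumption for illustration).
# Each nonterminal maps to a list of possible right-hand sides.
# The second NP rule is recursive: an NP can contain a VP, which can contain an NP.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],
    "VP": [["V", "NP"], ["slept"]],
    "N": [["cat"], ["rat"]],
    "V": [["chased"]],
}


def generate(symbol, depth):
    """Return every word sequence derivable from `symbol` with a
    derivation tree of at most `depth` levels of rule application."""
    if symbol not in GRAMMAR:          # a terminal word stands for itself
        return [(symbol,)]
    if depth == 0:                     # no rule applications left
        return []
    results = []
    for rhs in GRAMMAR[symbol]:
        # Derive each right-hand-side symbol, then combine the pieces in order.
        parts = [generate(s, depth - 1) for s in rhs]
        for combo in itertools.product(*parts):
            results.append(tuple(word for part in combo for word in part))
    return results


shallow = generate("S", 3)
deeper = generate("S", 5)
```

Raising the depth bound always admits further sentences (for example, the cat that chased the rat slept), so the five rules above characterize a set with no longest member.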

 

Transferred into the mentalistic framework of 1965, the consequence of combinatoriality is that speakers of the language must have rules of language (or mental grammars) in their heads as part of their f-knowledge.  Again there is a certain amount of controversy arising from the term “rules”.  Rules of grammar in the sense of generative grammar are not like any of the sorts of rules or laws in ordinary life:  rules of etiquette, rules of chess, traffic laws, or laws of physics.  They are unconscious principles that play a role in the production and understanding of sentences.  Again, to ward off improper analogies, Foundations uses the term f-rules for whatever the combinatorial principles in the head may be.  Generative linguistics leaves open how directly the f-rules are  involved in processing, but, as suggested above, the unfortunate tendency among linguists has been not to care.  The theory in Foundations, though, makes it possible to regard the rules as playing a direct role in processing; see again section 9.3.

 

An important reason for the spectacular reception of early generative grammar was that it went beyond merely claiming that language needs rules:  it offered rigorous formal techniques for characterizing the rules, based on approaches to the foundations of mathematics and computability developed earlier in the century.   The technology suddenly made it possible to say lots of interesting things about language and ask lots of  interesting questions.  For the first time ever it was possible to provide detailed descriptions of the syntax of natural languages (not only English but German, French, Turkish, Mohawk, Hidatsa, and Japanese were studied early on).  In addition, generative phonology took off rapidly, adapting elements of Prague School phonology of the 1930s to the new techniques.  With Chomsky and Halle’s 1968 Sound Pattern of English as its flagship, generative phonology quickly supplanted the phonological theory of the American structuralist tradition. 

 

Acquisition:  Mentalism and combinatoriality together lead to the crucial question:  How do children get the f-rules into their heads?  Given that the f-rules are unconscious, parents and peers cannot verbalize them; and even if they could, children would not understand, since they don’t know language yet.  The best the environment can do for a language learner is provide examples of the language in a context.  From there on it is up to the language learner to construct the principles on his or her own – unconsciously of course. 

 

Chomsky asked the prescient question:  what does the child have to “(f-)know in advance” in order to accomplish this feat?  He phrased the problem in terms of the “poverty of the stimulus”:  many different generalizations are consistent with the data presented to the child, but the child somehow comes up with the “right” one, i.e. the one that puts him or her in tune with the generalizations of the language community.  I like to put the problem a bit more starkly:  The whole community of linguists, working together for decades with all sorts of crosslinguistic and psycholinguistic data unavailable to children, has still been unable to come up with a complete characterization of the grammar of a single natural language.  Yet every child does it by the age of ten or so.  Children don’t have to make the choices we do:  for instance they don’t have to decide whether the “right” choice of grammar is in the style of transformational grammar, the Minimalist Program, Optimality Theory, Role and Reference Grammar, Tree-Adjoining Grammar, Cognitive Grammar, connectionist networks, or some as yet unarticulated alternative.  They already f-know it in advance. 

 

One of the goals of linguistic theory, then, is to solve this “Paradox of Language Acquisition” by discovering what aspects of linguistic f-knowledge are not learned, but rather form the basis for the child’s learning.  The standard term for the unlearned component is Universal Grammar or UG, a term that again perhaps carries too much unwanted baggage.  In particular, UG should not be confused with universals of language:  it is rather what shapes the acquisition of language.  I prefer to think of it as a toolkit for constructing language, out of which the child (or better, the child’s brain) f-selects tools appropriate to the job at hand.  If the language in the environment happens to have a case system (like German), UG will help shape the child’s acquisition of case; if it has a tone system (like Mandarin), UG will help shape the child’s acquisition of tone.  But if the language in the environment happens to be English, which lacks case and tone, these parts of UG will simply be silent. 

 

What then is the source of language universals?  Some of them will indeed be determined by UG, for instance the overall “architecture” of the grammatical system:  the parts of the mental grammar and their relations (of which much more below).  Other universals, especially what are often called “statistical” or “implicational” universals, may be the result of biases imposed by UG.  For instance, UG may say that if a language has a case system, the simplest such systems are thus-and-so; these will be widespread systems crosslinguistically; they will be acquired earlier by children; and systems may tend to change toward them over historical time.  Other universals may be a consequence of the functional properties of any relatively efficient communication system:  for instance, the most frequently used signals tend to be short.  UG doesn’t have to say anything about these universals at all; they will come about through the dynamics of language use in the community (a process which of course is not very well understood). 

 

If UG is not learned, how does the child acquire it?  The only alternative is through the structure of the brain, which is determined through a combination of genetic inheritance and the biological processes resulting from expression of the genes, the latter in turn determined by some combination of inherent structure and environmental input.  Here contemporary science is pretty much at an impasse.  We know little about how genes determine brain structure and nothing about how the details of brain structure determine anything about language structure, even aspects of language as simple as speech sounds.  Filling out this part of the picture is a long-term challenge for cognitive neuroscience.  It is premature to reject the hypothesis of Universal Grammar, as some have (e.g. Elman et al. 1996 and Deacon 1997), arguing that we don’t know how genes could code for language acquisition.  After all, we don’t know how genes code for birdsong or sexual behavior or sneezing either, but we don’t deny that there is a genetic basis behind these.

 

There next arises the question of how much of UG is a human cognitive specialization for language and how much is a consequence of more general capacities.  The question has often been oversimplified to a binary decision between language being entirely special or entirely general, with a strong bias inside generative linguistics towards the former and outside generative linguistics towards the latter.  The truth of the matter undoubtedly lies somewhere in between.  To be sure, many people (including myself) would find it satisfying if a substantial part of language acquisition were a consequence of general human cognitive factors; but the possibility of some specialization overlaying the general factors must not be discounted.  My view is that we cannot determine what is general and what is special until we have comparable theories of other cognitive capacities, including other learned cognitive capacities.  To claim that language is parasitic on, say, motor control, perhaps because both have hierarchical and temporal structure (this seems to be the essence of Corballis’s (1991) position) – but without stating a theory of the f-knowledge involved in motor control –  is to coarsen the fabric of linguistic theory to the point of unrecognizability.  The closest approach to a comparable theory is the music theory of Lerdahl and Jackendoff 1983, which displays some striking parallels and some striking differences with language. 

 

Of course, if UG – the ability to learn language – is in part a human cognitive specialization, it must be determined by some specifically human genes, which in turn had to have come into existence sometime since the hominid line separated from the other great apes.  One would therefore like to be able to tell some reasonable story about how it could be shaped by natural selection or other evolutionary processes.  We return to this issue in section 9.4. 

 

This approach to the acquisition of language has given rise to a flourishing tradition of developmental research (references far too numerous to mention) and a small but persistent tradition in learnability theory (e.g. Wexler and Culicover 1980, Baker and McCarthy 1981).  And certainly, even if the jury is not yet in on the degree to which language acquisition is a cognitive specialization, there have been all manner of phenomena investigated that bear on the issue, for instance: 

 


 

My impression is that, while there are questions about all of these cases, en masse they offer an overwhelming case for some degree of genetic specialization for language learning in humans. 

 

These three foundational issues of generative grammar – mentalism, combinatoriality, and acquisition – have stood the test of time; if anything they have become even more important over the years in the rest of cognitive science.  It is these three issues that connect linguistics intimately with psychology, brain science, and genetics.  Much of the promise of generative linguistics arose from this new and exciting potential for scientific unification.  

 

3.  The broken promise: Deep Structure would be the key to the mind

 

A fourth major point of Aspects, and the one that seeped most deeply into the awareness of the wider public, concerned the notion of Deep Structure.  A basic claim of the 1965 version of generative grammar was that in addition to the surface form of sentences, i.e. the form we hear, there is another level of syntactic structure, called Deep Structure, which expresses underlying syntactic regularities of sentences.  For instance, a passive sentence like (1a)  has a Deep Structure in which the noun phrases are in the order of the corresponding active (1b). 

 

(1)   a.     The bear was chased by the lion.

        b.     The lion chased the bear.  

 

Similarly, a question such as (2a) has a Deep Structure closely resembling that of the corresponding declarative (2b).

 

(2)   a.     Which martini did Harry drink?

        b.     Harry drank that martini. 

 

In the years preceding Aspects, the question arose of how syntactic structure is connected to meaning.  Following a hypothesis first proposed by Katz and Postal (1964), Aspects made the striking claim that the relevant level of syntax for determining meaning is Deep Structure.

 

In its weakest version, this claim was only that regularities of meaning are most directly encoded in Deep Structure, and this can be seen in (1) and (2).  However, the claim was sometimes taken to imply much more:  that Deep Structure IS meaning, an interpretation that Chomsky did not at first discourage.3  And this was the part of generative linguistics that got everyone really excited.  For if the techniques of transformational grammar lead us to meaning, we can uncover the nature of human thought.  Moreover, if Deep Structure is innate – being dictated by Universal Grammar – then linguistic theory gives us unparalleled access to the essence of human nature.   No wonder everyone wanted to learn linguistics.

 

What happened next was that a group of generative linguists, notably George Lakoff, John Robert Ross, James McCawley, and Paul Postal, pushed very hard on the idea that Deep Structure should directly encode meaning.  The outcome, the theory of Generative Semantics (e.g. McCawley 1968, Postal 1970, Lakoff 1971), increased the “abstractness” and complexity of Deep Structure, to the point that the example Floyd broke the glass was famously posited to have eight underlying clauses, each corresponding to some feature of the semantics.  All the people who admired Aspects for what it said about meaning loved Generative Semantics, and it swept the country.  But Chomsky himself reacted negatively, and with the aid of his then-current students (full disclosure:  present author included), argued vigorously against Generative Semantics.  When the dust of the ensuing “Linguistics Wars” cleared around 1973 (Newmeyer 1980, Harris 1993, Huck and Goldsmith 1995), Chomsky had won – but with a twist:  he no longer claimed that Deep Structure was the sole level that determines meaning (Chomsky 1972).  Then, having won the battle, he turned his attention, not to meaning, but to relatively technical constraints on movement transformations (e.g. Chomsky 1973, 1977).

                         

The reaction in the larger community was shock:  for one thing, at the fact that the linguists had behaved so badly; but more substantively, at the sense that there had been a “bait and switch.”  Chomsky had promised Meaning with a capital M and then had withdrawn the offer.  Many researchers, both inside and outside linguistics, turned away from generative grammar with disgust, rejecting not only Deep Structure but mentalism, innateness, and sometimes even combinatoriality.  And when, later in the 1970s, Chomsky started talking about meaning again, in terms of a syntactic level of Logical Form (e.g. Chomsky 1981), it was too late:  the damage had been done.  From this point on, the increasingly abstract technical apparatus of generative grammar was of no interest to more than a tiny minority of cognitive scientists, much less the general public. 

 

Meanwhile, various non-Chomskyan traditions of generative grammar developed, most notably Relational Grammar (Perlmutter 1983), Head-Driven Phrase Structure Grammar (Pollard and Sag 1987, 1994), Lexical-Functional Grammar (Bresnan 1982, 2001), Formal Semantics (Partee 1976, Heim and Kratzer 19xx), Optimality Theory (Prince and Smolensky 1993), Construction Grammar (Fillmore et al. 1988, Goldberg 1995), and Cognitive Grammar (Lakoff 1987, Langacker 1987, Talmy 2000).  On the whole, these approaches to linguistics (with the possible exception of Cognitive Grammar) have made even less contact with philosophy, psychology, and neuroscience than the recent Chomskyan tradition.  My impression is that many linguists have simply returned to the traditional concerns of the field: describing languages, with as little theoretical and cognitive baggage as possible.   While this is perfectly fine – particularly since issues of innateness don’t play too big a role when you’re trying to record an endangered language before its speakers all die – the sense of excitement and danger that comes from participating in the integration of fields has become attenuated.

 

4.   The scientific mistake:  syntactocentrism

 

So much for pure intellectual history.  We now turn to what I think was an important mistake at the core of generative grammar, one that in retrospect lies behind much of the alienation of linguistic theory from the cognitive sciences.  Chomsky did demonstrate that language requires a generative system that makes possible an infinite variety of sentences.  However, he explicitly assumed, without argument (1965: 16, 17, 75, 198), that generativity is localized in the syntactic component of the grammar – the construction of phrases from words –  and that phonology (the organization of speech sounds) and semantics (the organization of meaning) are purely “interpretive”, that is, that their combinatorial properties are derived strictly from the combinatoriality of syntax. 

 

In 1965 this was a perfectly reasonable view.  The important issue at that time was to show that something in language is generative.  Generative syntax had provided powerful new tools, which were yielding copious and striking results.  At the time, it looked as though phonology could be treated as a sort of low-level derivative of syntax: the syntax gets the words in the right order, then phonology massages their pronunciation to adjust them to their local environment.  As for semantics, virtually nothing was known:  the only things on the table were the rudimentary proposals of Katz and Fodor (1963) and some promising work by people such as Bierwisch (1967, 1969) and Weinreich (1966).  So the state of the theory offered no reason to question the assumption that all combinatorial complexity arises from syntax. 

 

Subsequent shifts in mainstream generative linguistics stressed major differences in outlook.  But one thing that remained unchanged was the assumption that syntax is the sole source of combinatoriality.  Figure 1 diagrams the architecture of components in three major stages of Chomskyan syntactic theory: the Aspects theory, Principles and Parameters (or Government-Binding) Theory (Chomsky 1981), and the Minimalist Program (Chomsky 1995).  The arrows denote direction of derivation.

 

Figure 1. Architecture of Chomsky's theories over the years.

 

 

These shifts alter the components of syntax and their relation to sound and meaning.  What remains constant throughout, though, is that (a) there is an initial stage of derivation in which words or morphemes are combined into syntactic structures; (b) these structures are then massaged by various syntactic operations; and (c) certain syntactic structures are shipped off to phonology/phonetics to be pronounced and other syntactic structures are shipped off to “semantic interpretation” to be understood.   In short, syntax is the source of all linguistic organization. 


 

I believe that this assumption of “syntactocentrism” – which, I repeat, was never explicitly grounded – was an important mistake at the heart of the field.4  The correct approach is to regard linguistic structure as the product of a number of parallel but interacting generative capacities – at the very least, one each for phonology, syntax, and semantics.  As we will see, elements of such a “parallel architecture” have been implicit in practice in the field for years.  What is novel in the present work is bringing these practices out into the open, stating them as a foundational principle of linguistic organization, and exploring the large-scale consequences.

 

5.  Phonology as an exemplar of the parallel architecture

 

An unnoticed crack in the assumption of syntactocentrism appeared in the middle to late 1970s, when the theory of phonology underwent a major sea change.  Before then, the sound system of language had been regarded essentially as a sequence of speech sounds.  Any further structure, such as the division into words, was thought of as simply inherited from syntax.  However, beginning with work such as Goldsmith (1979) and Liberman and Prince (1977), phonology rapidly came to be thought of as having its own autonomous structure, in fact multiple structures or tiers.  Figure 2 provides a sample, the structure of the phrase the big apple.  The phonological segments appear at the bottom, as terminal elements of the syllabic tree.

 

 

Figure 2. Phonological structure of the big apple.


 

 

There are several innovations here.  First, syllabic structure is seen as hierarchically organized.  At the center of the syllable (notated as σ) is a syllabic nucleus (notated N), which is usually a vowel but sometimes a syllabic consonant such as the l in apple.  The material following the nucleus is the syllabic coda (notated C); this groups with the nucleus to form the rhyme (notated R), the part involved in rhyming.  In turn, the rhyme groups with the syllabic onset (notated O) to form the entire syllable.  Syllables are grouped together into larger units such as feet and phonological words (here, the bracketing subscripted Wd).  Notice that in Figure 2, the word the does not constitute a phonological word on its own; it is attached (or cliticized) onto the word big.  Finally, phonological words group into larger units such as phonological phrases.  Languages differ in their repertoire of admissible nuclei, onsets, and codas, but the basic hierarchical organization and the principles by which strings of segments are divided into syllables are universal.  (It should also be mentioned that signed languages have parallel syllabic organization, except that the syllables are built out of manual rather than vocal constituents (Klima and Bellugi 1979, Fischer and Siple 1990).)
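The hierarchical organization just described can be written down as a small data structure.  The following Python sketch is only one possible encoding; the class names and the rough segmental transcription of the big apple are assumptions made here for illustration, not a reproduction of Figure 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rhyme:
    nucleus: str        # N: usually a vowel, sometimes a syllabic consonant
    coda: str = ""      # C: material following the nucleus (may be empty)


@dataclass
class Syllable:
    onset: str          # O: material preceding the nucleus (may be empty)
    rhyme: Rhyme        # R: nucleus plus coda

    def segments(self) -> str:
        return self.onset + self.rhyme.nucleus + self.rhyme.coda


@dataclass
class PhonWord:
    syllables: List[Syllable]   # Wd: a grouping of syllables

    def segments(self) -> str:
        return "".join(s.segments() for s in self.syllables)


# Assumed rough transcription, for illustration only.
the = Syllable("ð", Rhyme("ə"))
big = Syllable("b", Rhyme("ɪ", "g"))
ap = Syllable("", Rhyme("æ"))
ple = Syllable("p", Rhyme("l"))     # syllabic l serves as the nucleus

# "the" is cliticized onto "big", so the two form a single phonological word.
phrase = [PhonWord([the, big]), PhonWord([ap, ple])]
```

Note that nothing in these classes mentions nouns, verbs, or determiners; the units are phonological throughout, and the grouping of the with big is stated directly at the level of the phonological word.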

 

These hierarchical structures are not built out of syntactic primitives such as nouns, verbs, and determiners; their units are intrinsically phonological.  In addition, the structures, though hierarchical, are not recursive.5   Thus the principles governing these structures are not derivable from syntactic structures; they are an autonomous system of generative rules.

 

Next consider the metrical grid in Figure 2.  Its units are beats, notated as columns of xs.  A column with only one x is a weak beat, and more xs in a column indicate a relatively stronger beat.  Each beat is associated with a syllable; the strength of a beat indicates the relative stress on that syllable, so that for example in Figure 2 the first syllable of apple receives maximum stress.   The basic principles of metrical grids are in part autonomous of language:  they also appear, for instance, in music (Lerdahl and Jackendoff 1983), where they are associated with notes instead of syllables.  Metrical grids place a high priority on rhythmicity: an optimum grid presents an alternation of strong and weak beats, as is found in music and in much poetry.  On the other hand, the structure of syllables exerts an influence on the associated metrical grid:  syllables with heavy rhymes (i.e. containing a coda or a long vowel) “want” to be associated with relatively heavy stress.  The stress rules of a language concern the way syllabic structure comes to be associated with a metrical grid; languages differ in ways that are now quite well understood (e.g. Kager 1995, Halle and Idsardi 1995).
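A metrical grid in this notation is easy to render mechanically.  The Python sketch below assumes illustrative beat strengths for the four syllables of the big apple; only the fact that the first syllable of apple bears the strongest beat comes from the text, and the remaining heights, like the function name render_grid, are assumptions made here.

```python
def render_grid(syllables, heights):
    """Draw one column of x's per syllable; taller columns are stronger beats."""
    top = max(heights)
    col_width = max(len(s) for s in syllables) + 1
    rows = []
    # Build the grid from the strongest level down to level 1.
    for level in range(top, 0, -1):
        row = "".join(("x" if h >= level else " ").ljust(col_width)
                      for h in heights)
        rows.append(row.rstrip())
    # Label each column with its syllable.
    rows.append("".join(s.ljust(col_width) for s in syllables).rstrip())
    return "\n".join(rows)


syllables = ["the", "big", "ap", "ple"]
heights = [1, 2, 3, 1]      # assumed strengths; only the peak on "ap" is from the text

grid = render_grid(syllables, heights)
strongest = syllables[heights.index(max(heights))]
```

Printing grid gives a column of xs above each syllable, with the tallest column over the first syllable of apple.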

 

Again, metrical grids are built of nonsyntactic units.  As they are to some degree independent of syllabic structure, they turn out to be a further autonomous “tier” of phonological structure. 

       

At a larger scale of phonological organization we find prosodic units over which intonation contours are defined.  These are comparable in size to syntactic phrases but do not coincide with them.  Here are two examples.

 

(3)   Syntactic bracketing:

        [Sesame Street] [is [a production [of [the Children’s Television Workshop]]]]

        Prosodic bracketing (two pronunciations):

        a.  [Sesame Street is a production of] [the Children’s Television Workshop]

        b.  [Sesame Street] [is a production] [of the Children’s Television Workshop]

 

(4)   Syntactic bracketing:

        [This] [is [the cat [that chased [the rat [that ate [the cheese]]]]]]

        Prosodic bracketing:

        [This is the cat] [that chased the rat] [that ate the cheese]

 

The two pronunciations of (3) are both acceptable, and other prosodic bracketings are also possible.  However, the choice of prosodic bracketing is not entirely free, since for instance [Sesame] [Street is a production of the] [Children’s Television Workshop] is an impossible phrasing.  Now notice that the first constituent of (3a) and the second constituent of (3b) do not correspond to any syntactic constituent.  We would be hard pressed to know what syntactic label to give to Sesame Street is a production of.  But as an intonational constituent it is perfectly fine.  Similarly in (4), the syntax is relentlessly right-embedded, but the prosody is flat and perfectly balanced into three parts.  Again, the first two constituents of the prosody do not correspond to syntactic constituents of the sentence.
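The mismatch just described can be checked mechanically.  The following is a minimal Python sketch, not part of the original analysis: it reads the bracketings of (3) and (4) exactly as given above, records each bracketed constituent as a span of word positions, and lists the prosodic constituents that have no syntactic counterpart.  The names spans and unmatched are hypothetical, chosen only for this illustration.

```python
import re

def spans(bracketing):
    """Return the set of (start, end) word spans of every bracketed constituent."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", bracketing)
    open_positions, found, position = [], set(), 0
    for token in tokens:
        if token == "[":
            open_positions.append(position)      # a constituent starts here
        elif token == "]":
            found.add((open_positions.pop(), position))  # it ends before `position`
        else:
            position += 1                        # an ordinary word
    return found

def unmatched(syntactic, prosodic):
    """List prosodic constituents (as word spans) with no syntactic counterpart."""
    return sorted(spans(prosodic) - spans(syntactic))

SYN_3 = "[Sesame Street] [is [a production [of [the Children's Television Workshop]]]]"
PROS_3A = "[Sesame Street is a production of] [the Children's Television Workshop]"
PROS_3B = "[Sesame Street] [is a production] [of the Children's Television Workshop]"
SYN_4 = "[This] [is [the cat [that chased [the rat [that ate [the cheese]]]]]]"
PROS_4 = "[This is the cat] [that chased the rat] [that ate the cheese]"

print(unmatched(SYN_3, PROS_3A))  # span of "Sesame Street is a production of"
print(unmatched(SYN_3, PROS_3B))  # span of "is a production"
print(unmatched(SYN_4, PROS_4))   # spans of the first two prosodic constituents
```

Under these assumptions the sketch reports one unmatched span for (3a), one for (3b), and two for (4), in line with the observations above.  It compares word spans only and says nothing about which prosodic bracketings are possible in the first place.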

 

The proper way to deal with this lack of correspondence is to posit a phonological category of Intonational Phrase, which plays a role in the assignment of intonation contours and the distribution of stress (Beckman and Pierrehumbert 1986, Ladd 1996).  Intonational Phrases are to some degree correlated with syntax: their boundaries tend to fall at the beginning of major syntactic constituents, but their ends do not necessarily correlate with the ends of the corresponding syntactic constituents.  At the same time, Intonational Phrases have their own autonomous constraints, in particular a strong preference for rhythmicity and parallelism (as evinced in (2) for example), and a preference for saving the longest prosodic constituent for the end of the sentence.6

 

Another example of mismatch between syntax and phonology comes from contractions such as I’m and Lisa’s (as in Lisa’s a doctor).  These are clearly phonological words, but what is their syntactic category?  It is implausible to see them either as noun phrases that incidentally contain a verb or to see them as verbs that incidentally contain a noun.  Keeping phonological and syntactic structure separate allows us to say the natural thing:  they are phonological words that correspond to two separate syntactic constituents.

 

(5)   Syntactic structure:             [NP I] [V (a)m]             [NP Lisa] [V (i)s]

        Phonological structure:        [Wd I’m]                      [Wd Lisa’s]

 

Since every different sentence of the language has a different phonological structure, and since phonological structures cannot be derived from syntax, the usual arguments for combinatoriality lead us to the conclusion that phonological structure is generative.  However, in addition to the generative principles that describe these structures, it is necessary to introduce a new kind of principle into the grammar, what might be called “correspondence rules” or “interface rules.”  These rules (I revert to the standard term “rules” rather than being obsessive about “f-rules”) regulate the way the independent structures correspond with each other.  For instance, the relation between syllable weight and metrical weight is regulated by an interface rule between syllabic and metrical structure; the relation between syntactic and intonational constituents is regulated by an interface rule between syntactic and prosodic structure. 

 

An important property of interface rules is that they don’t “see” every aspect of the structures they are connecting.  For instance, the rules that connect syllabic content to metrical grids are totally insensitive to syllable onset: universally, stress rules care only about what happens in the rhyme.  Similarly, although the connection between syntax and phonology “sees” certain syntactic boundaries, it is insensitive to the depth of syntactic embedding.  Moreover, syntactic structure is totally insensitive to the segmental content of the words it is arranging (e.g. there is no syntactic rule that applies only to words that begin with b).  Thus interface rules implement not isomorphisms between the structures they relate, but rather only partial homomorphisms. 
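The selective “blindness” of an interface rule can be illustrated with a small Python sketch.  It is a toy under stated assumptions, not a proposal about any particular language: the Syllable record, the function metrical_weight, and the three sample syllables are all hypothetical.  The only linguistic content taken from the text is that a rhyme counts as heavy if it contains a coda or a long vowel, and that the onset plays no role.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Syllable:
    onset: str                 # consonants before the vowel
    nucleus: str               # the vowel
    coda: str = ""             # consonants after the vowel
    long_vowel: bool = False

def metrical_weight(syllable):
    """Toy interface rule: consult only the rhyme; the onset is never read."""
    heavy = bool(syllable.coda) or syllable.long_vowel
    return "heavy" if heavy else "light"

# Hypothetical syllables, chosen only to vary onset and coda.
ta = Syllable(onset="t", nucleus="a")
stra = Syllable(onset="str", nucleus="a")
tan = Syllable(onset="t", nucleus="a", coda="n")

for syllable in (ta, stra, tan):
    print(syllable.onset + syllable.nucleus + syllable.coda, metrical_weight(syllable))
```

Because ta and stra differ as syllables but receive the same weight, the mapping discards information and cannot be inverted.  That is the sense in which such a rule falls short of an isomorphism.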

 

This is not to say that we should think of speakers as thinking up phonological and syntactic structures independently in the hope they can be matched up by the interfaces.  That would be the same sort of mistake as thinking that speakers start with the symbol S and generate a syntactic tree, finally putting in words so they know what the sentence is about.  At the moment we are not thinking in terms of production; rather we are stating the principles (of “competence”) in terms of which sentences are well-formed.  We will get back to how this is related to processing in section 9.3. 

                                                                       

Now the main point of this section.  This view of phonological structure, developed in the late 1970s and almost immediately adopted as standard, is deeply subversive of the syntactocentric assumption that all linguistic combinatoriality originates in syntax.  According to this view, phonological structure is not just a passive hand-me-down derived from low-level syntax: it has its own role in shaping the totality of linguistic structure.  But at the time of these changes, no great commotion was made about this most radical aspect of the new phonology.  Phonologists for the most part were happy to get on with exploring this exciting way of doing things, and for them, the consequences for syntax didn’t matter.  Syntacticians, for their part, simply found phonology irrelevant to their concerns of constraining movement rules and the like, especially since phonology had now developed its own arcane technical machinery.  So neither subdiscipline really took notice; and as the technologies diverged, the relation between syntax and phonology became a no-man’s-land (or perhaps only a very-few-man’s-land).  Tellingly, as far as I can determine, in all of Chomsky’s frequent writings on the character of the human language capacity, there is no reference at all to post-1975 phonology – much less to the challenge that it presents to his overall syntactocentric view of language. 

       

6.  The syntax-semantics interface

 

I have treated the developments in phonology first because they are less controversial.  But in fact the same thing happened in semantics.  Over the course of the 1970s and 1980s, several radically different approaches to semantics developed:  within linguistics, at least Formal Semantics (growing out of formal logic) (Partee 1976, Heim and Kratzer 19xx), Cognitive Grammar (Lakoff 1987, Langacker 1987, Talmy 2000), and Conceptual Semantics (Jackendoff 1983, 1990, Pinker 1989, Pustejovsky 1995), plus approaches within computational linguistics and cognitive psychology.  Whatever their differences, all these approaches take meaning to be deeply combinatorial.  None of them take the units of semantic structure to be syntactic units such as NPs and VPs; rather, the units are intrinsically semantic entities like objects, events, actions, properties, and quantifiers.7  Therefore, whichever semantic theory we choose, it is necessary to grant semantics an independent generative organization, and it is necessary to include in the theory of grammar an interface component that correlates semantic structures with syntactic and phonological structures.  In other words, the relation of syntax to semantics is qualitatively parallel to the relation of syntax to phonology.  However, apparently no one pointed out the challenge to syntactocentrism – except the Cognitive Grammarians, who mostly went to the other extreme and denied syntax any independent role, and who have been steadfastly ignored by mainstream generative linguistics.

 

The organization of phonological structure into semi-independent tiers finds a parallel in semantics as well.  Linguistic meaning can be readily partitioned into two independent aspects.  On one hand there is what might be called “propositional structure”:  who did what to whom and so on.  For instance, in The bear chased the lion, there is an event of chasing in which the bear is the chaser and the lion is the “chasee”.  On the other hand, there is also what is now called “information structure”:  the partitioning of the message into old vs. new information, topic vs. comment, presupposition vs. focus, and so forth.  We can leave the propositional structure of a sentence intact but change its information structure, by using stress (6a-c) or various focusing constructions (6d-f):

 

(6)   a.   The BEAR chased the lion.

        b.   The bear chased the LION.

        c.   The bear CHASED the lion.

        d.   It was the bear that chased the lion.

        e.   What the bear did was chase the lion.

        f.    What happened to the lion was the bear chased it.

 

Thus the propositional structure and the information structure are orthogonal dimensions of meaning, and can profitably be regarded as autonomous tiers.  (Foundations proposes a further split of propositional structure into descriptive and referential tiers, an issue too complex for the present context.) 

 

Like the interface between syntax and phonology, that between syntax and semantics is not an isomorphism.  Some aspects of syntax make no difference in semantics.  For instance, the semantic structure of a language is the same whether or not the syntax marks subject-verb agreement, verb-object agreement, or nominative and accusative case.  The semantic structure of a language does not care whether the syntax calls for the verb to be after the subject (as in English), at the end of the clause (as in Japanese), or second in a main clause and final in a subordinate clause (as in German).  As these aspects of syntax are not correlated with or derivable from semantics, the interface component disregards them. 

 

Similarly, some aspects of semantics have little if any systematic effect in syntax.  Here are a few well-known examples.

 


 

 

  (7)  a. Where is my hat?

        b. (Now, Billy:) What’s the capital of New York?

        c. Would you please open the window?

        d. Is the Pope Catholic?

 


 

 

        (8)  a.   Jill jumped until the alarm went off.

              b.   Jill slept until the alarm went off.

              c.   Jill jumped when the alarm went off.

 

The standard account of this contrast (Talmy 2000, Verkuyl 1993, Pustejovsky 1995, Jackendoff 1997) is that the meaning of until is to set a temporal bound on an ongoing process.  When the verb phrase already denotes an ongoing process, such as sleeping, all is well.  But when the verb phrase denotes an action that has a natural temporal ending, such as jumping, then its interpretation is “coerced” into repeated action – a sort of ongoing process – which in turn can have a temporal bound set on it by until.  For present purposes, the point is that the sense of repetition arises from semantic combination, without any direct syntactic reflex.  (On the other hand, there are languages such as American Sign Language that have a grammatical marker of iteration; this will have to be used in the translation of (8a).)
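The logic of this account can be sketched in a few lines of Python.  This is a deliberately crude illustration, not an implementation of any of the cited theories: the table ASPECT, the labels “process” and “bounded”, and the function construal are assumptions introduced here, and only the verbs and connectives of (8) are covered.

```python
# Toy aspectual classes for the two verbs in (8); the labels are assumptions.
ASPECT = {"jump": "bounded", "sleep": "process"}

def construal(verb, connective):
    """Return a gloss of how the verb phrase is construed with the connective."""
    aspect = ASPECT[verb]
    if connective == "until":
        # 'until' sets a temporal bound on an ongoing process.
        if aspect == "process":
            return "ongoing process, bounded"
        # A naturally bounded action is first coerced into repetition.
        return "repeated action, bounded"
    # In this sketch any other connective (here only 'when') triggers no coercion.
    return aspect + " action, no coercion" if aspect == "bounded" else "ongoing process, no coercion"

print(construal("jump", "until"))   # (8a)
print(construal("sleep", "until"))  # (8b)
print(construal("jump", "when"))    # (8c)
```

The repetition in (8a) is produced inside construal, from the combination of the verb's class with until; nothing in the input string marks it.  That is the point at issue: the sense of iteration has a semantic source and no syntactic one.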

 

 

        (9)  a.        [One waitress says to another]:

                      The ham sandwich wants another cup of coffee.

                      [Interpretation: ‘the person who ordered/is eating the ham sandwich...’]

              b.        Chomsky is on the top shelf next to Plato.

                      [Interpretation: ‘the book by Chomsky ...’]

 

Such cases of “reference transfer” contain no syntactic reflex of the added parts of the interpretation (‘the person who ordered/is eating’, ‘the book by’).  One might be tempted to dismiss these phenomena as “mere pragmatics”, hence outside the grammatical system.  But this proves impossible, because reference transfer can have indirect grammatical effects.  A clear example involves imagining that Richard Nixon went to see the opera Nixon in China (yes, a real opera!), and what happened was that:

 

        (10)        Nixon was astonished to see himself sing a foolish duet with Pat.

 

The singer of the duet, of course, is the actor playing Nixon; thus the interpretation of himself involves a reference transfer.  However, we cannot felicitously say that what happened next was that:

 

        (11)        *(Up on stage,) Nixon was astonished to see himself get up and walk out.

 

That is, a reflexive pronoun referring to the acted character can have the real person as antecedent, but not vice versa (Fauconnier 1985, Jackendoff 1992).  Since the use of reflexive pronouns is central to grammar, reference transfer cannot be seen as “extragrammatical.”

 

 

        (12)        Everyone in this room knows at least two languages.

                    a.        ‘John knows English and French; Sue knows Hebrew and Hausa; ....’

                    b.        ‘... namely, Mandarin and Navajo.’

 

Should there be two different syntactic structures associated with these two interpretations?  Chomsky 1957 said no; Chomsky 1981 said yes; Generative Semantics said yes; I am inclined to say no (Jackendoff 1996, Foundations chapter 12).  The problem with finding two different syntactic structures is that it requires systematic and drastic distortions of the syntactic tree that never show up in the surface syntax of any language.  The problem with having only one syntactic structure is that it makes the syntax-semantics interface more complex.  The point to be made here is that the scope of quantification may well be a further example of the “dirtiness” of the interface between syntax and semantics; this continues to be an important issue in linguistic theory.
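Whatever the right syntactic treatment, the two interpretations of (12) do have different truth conditions, and this can be made concrete with a small Python sketch.  The function names and the two toy “rooms” are hypothetical; the rooms simply encode the situations described in (12a) and (12b).  The sketch states what each reading requires and takes no position on how the readings are related to syntactic structure.

```python
def each_knows_two(knows):
    """Reading (12a): every person knows at least two languages, possibly different ones."""
    return all(len(languages) >= 2 for languages in knows.values())

def two_known_by_all(knows):
    """Reading (12b): there are at least two languages that every person knows."""
    # Assumes the room is non-empty; set.intersection needs at least one set.
    shared = set.intersection(*knows.values())
    return len(shared) >= 2

# Toy situations built from the glosses in (12a) and (12b).
room_a = {"John": {"English", "French"}, "Sue": {"Hebrew", "Hausa"}}
room_b = {"John": {"Mandarin", "Navajo"}, "Sue": {"Mandarin", "Navajo"}}

print(each_knows_two(room_a), two_known_by_all(room_a))
print(each_knows_two(room_b), two_known_by_all(room_b))
```

In the first room only reading (12a) holds; in the second room both hold, since any situation satisfying (12b) also satisfies (12a).  The two readings therefore differ only in the order in which the quantifiers are evaluated, which is what a theory of the interface has to account for.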

 

In each of these cases, a syntactocentric theory is forced to derive the semantic distinctions from syntactic distinctions.  Hence it is forced into artificial solutions such as empty syntactic structure and elaborate movement, which have no independent motivation beyond providing grist for the semantics.  On the other hand, if the semantics is treated as independent from syntax but correlated with it, it is possible to permit a less than perfect correlation; it is then an empirical issue to determine how close the match is.