Below is the unedited preprint (not a quotable final draft) of:
Muller, Ralph-Axel. (1996). Innateness, autonomy, universality? Neurobiological approaches to language. Behavioral and Brain Sciences, 19 (4): 611-675.
The final published draft of the target article, commentaries and Author's Response are currently available only in paper.

INNATENESS, AUTONOMY, UNIVERSALITY? NEUROBIOLOGICAL APPROACHES TO LANGUAGE

Ralph-Axel Mueller
PET Center,
Children's Hospital of Michigan,
Wayne State University,
Detroit MI 48201-2196,
USA
rmueller@pet.wayne.edu

Keywords

brain development, dissociations, distributive representations, epigenesis, evolution, functional localization, individual variation, innateness, language.

Abstract

The concepts of the innateness, universality, species-specificity, and autonomy of the human language capacity have had an extreme impact on the psycholinguistic debate for over thirty years. These concepts are evaluated from several neurobiological perspectives, with an emphasis on the emergence of language and its decay due to brain lesion and progressive brain disease.

Evidence of perceptuomotor homologies and preadaptations for human language in nonhuman primates suggests a gradual emergence of language during hominid evolution. Regarding ontogeny, the innate component of language capacity is likely to be polygenic and shared with other developmental domains. Dissociations between verbal and nonverbal development are probably rooted in the perceptuomotor specializations of neural substrates rather than the autonomy of a grammar module. Aphasiological data often assumed to suggest modular linguistic subsystems can be accounted for in terms of a neurofunctional model incorporating perceptuomotor-based regional specializations and distributivity of representations. Thus, dissociations between grammatical functors and content words are due to different conditions of acquisition and resulting differences in neural representation. Since human brains are characterized by multifactorial interindividual variability, strict universality of functional organization is biologically unrealistic.

A theoretical alternative is proposed according to which (a) linguistic specialization of brain areas is due to epigenetic and probabilistic maturational events, not to genetic 'hard-wiring', and (b) linguistic knowledge is neurally represented in distributed cell assemblies whose topography reflects the perceptuomotor modalities involved in the acquisition and use of a given item of knowledge.


1 INTRODUCTION

In the 1960s and 1970s, there was a heated debate about the classical notions of 'innate ideas' and the 'tabula rasa' (e.g. Hook 1969, Stich 1975) - a debate that had been triggered by Chomsky's (1959, 1965) hypothesis that children are endowed with an innate 'universal grammar' developing autonomously, i.e. independently of other cognitive domains. These issues of autonomy and innateness have since been central to the psycholinguistic debate. In this paper, I will only briefly sketch the history of these two concepts and then concentrate on neurobiological evidence to show that

(a) basic assumptions discussed in linguistics can at least tentatively be evaluated from the neuroscientific point of view; and

(b) the closer one studies language as it may be 'represented' in the brain, the fuzzier apparently irreconcilable dualisms such as 'autonomous vs. integrated' or 'innate vs. learned' become.

It should be noted that this paper is not a critique of generative grammar, but an evaluation of three fundamental concepts (autonomy, innateness, and universality) that have characterized discussions on language in various theoretical contexts. The goal of this paper is to confront these concepts with types of evidence they are usually not confronted with. Reference to linguistic theory will only set the stage for the following empirical discussions.

2 THE LINGUISTIC BACKGROUND

Nearly 40 years ago, Chomsky (e.g. 1959; cf. 1975; Chomsky & Ronat 1979) designed a 'generative' model of the language capacity intended to overcome the limitations of behaviorist and 'structuralist' models (cf. Lyons 1977). Chomsky's design of a mentalist theory of language that would explain knowledge in the minds of speakers (and not just describe finite samples of speech behavior) was related to his humanist rationalism (Chomsky 1969, 1972b&c). Chomsky (1966) pleaded for a 'Cartesian' concept of language he detected in the 17th-century 'Port-Royal grammar' of Arnauld and Lancelot (Brekle 1966), who had treated grammar as a mental, rational (i.e. non-random and explicable), and universal system with rules applying to an underlying level not directly observable in speech behavior. Descartes [1637] himself had defined mind as being universal, species-specific, unitary, non-material, and innate, i.e. every human being, but no (other) animal, had an indivisible and non-mechanical mind right from birth.

With one exception, Chomsky (1965, 1972a) has applied these attributes to his concept of the language capacity, which he defines as universal, non-derivable in evolution from lower species, creative, and genetically programmed.[1] The exception is the indivisible, unitary nature of mind. Here Chomsky (e.g. 1980, 1981) takes the diametrically opposite position and claims that grammar is both internally modular and autonomous with regard to nonverbal cognition, and above all to those aspects of semantics that are related to common-sense knowledge. The autonomy hypothesis is rooted in an attempt to formalize the grammatical capacities of average or 'ideal speaker/hearers' in a way compatible with mathematics[2] and motivated by the observation that many structural properties of language(s) do not seem to be explicable in semantic terms (Newmeyer 1986: 26ff., 172ff.). Yet, in spite of claims by Katz & Bever (1976) and Katz & Fodor (1964) that pragmatics and common-sense semantics have to be abstracted from for a formally explicit model of language, many authors have felt the exclusion of these fundamental aspects of language to be counter-intuitive (the intuition being that meaning and intention are central to natural language; Givón 1989, Lakoff 1977, Langacker 1987, Levelt 1992, Searle 1974; cf. Newmeyer 1980: chs.4-5).

Intimately related to the autonomy hypothesis is the claim that essential principles of language acquisition are innate, i.e. genetically programmed. This claim is based on what Chomsky (1967, 1980: 34ff.) has called the 'argument from the poverty of stimulus', according to which there has to be an innate basis to language acquisition because the language stimuli received by the child are insufficient for acquiring abstract grammatical rules solely via inductive learning (cf. Crain 1991). More specifically, the "learnability dilemma" (Wexler & Manzini 1987) is based on two putative shortcomings of the speech input: there is no 'negative evidence' (e.g. parents correcting their children), and positive evidence is of "degenerate quality and narrowly limited extent" (Chomsky 1965: 58; but see Bates & Carnevale 1993).

According to more recent revisions of generative grammar, the innate 'universal grammar' is autonomous with regard to other cognitive domains and is itself organized into modular subsystems of principles (concerning government, binding, thematic roles etc.; Chomsky 1981, 1982). Principles or, alternatively, particular lexical items (Ouhalla 1991) may be associated with parameters, which are innately predetermined as to their possible values, but set according to language experience (Meisel 1995, Roeper 1988).

This nativist autonomy position fits into a general modular model of mind that came to dominate the cognitive sciences in the past decade (e.g. Fodor 1983, Gardner 1983, Gazzaniga 1985, 1988, Marr 1982, Minsky 1986, Poeppel et al. 1991). Fodor (1983) has drawn the following picture of the human mind:

(a) Perception is partitioned into 'informationally encapsulated' and genetically prewired 'input systems' (modules), whose function is purely bottom-up, i.e. there is no feedback from higher cognitive systems and no interaction between input systems while they are analyzing (aspects of) a stimulus.

(b) Higher cognition is holistic, i.e. each representation can interact with any other representation in the 'central systems'.

(c) Scientific models can be constructed for modular levels of cognition only. Central systems are "bad candidates for scientific study" (ibid.: 127).

Fodor's views have been amply discussed (e.g. Fodor 1985, Garfield 1987, Pylyshyn & Demopoulos 1986) and have had a considerable impact on the cognitive science community.

I will now turn to the neurobiological examination of the concepts of innateness, autonomy, and universality of language. Any detailed discussion of psycholinguistic data will be beyond the scope of this paper. On occasion, though, I will briefly refer to such data for readers less familiar with the background of the issues.

3 EMERGING LANGUAGE

I will discuss language emergence in terms of three orders of temporal resolution: slow (phylogeny), medium (ontogeny), and fast (microprocessing). Since straightforward evidence on speech emergence in hominids is unavailable, the phylogeny section will be concerned with possible homologies between human language and animal communication systems, which might be evidence against a genetically anchored syntax autonomy. In the ontogeny section, a review of basic principles of brain development will be followed by a discussion of a possible genetic basis of linguistic knowledge, the critical period for language acquisition, and developmental dissociations between verbal and non-verbal domains. Finally, in the microprocessing section electrophysiological evidence concerning the serial modularity vs. parallel distributivity of language processing will be reviewed.

3.1 Phylogeny

The modular-nativist position is highly compatible with the assumption that language emerged abruptly in evolution. Lenneberg (1967) opted for such a discontinuity hypothesis. According to an even more radical claim, phylogeny is characterized by a successive addition of fully-fledged modules due to mutations (e.g. Ellis & Young 1988: 10f., Marr 1977: 41). The idea of abrupt 'additions of modules' is, however, incompatible with the higher mammalian evolution of the neocortex (Pandya et al. 1988: 73). Pinker & Bloom (1990) have tried to reconcile the Chomskyan modular-nativist view with evolutionary evidence and, in particular, with selectionism. They argue that the structure of language, at each stage of its gradual development in hominids, constituted a selectional advantage and can thus be explained from a functionalist point of view. For example, the capacity to produce embedded structures by means of recursive rules (as in 'She said that he thinks that...') is thought to be functional in a society of hunter-gatherers like Homo habilis, i.e. of hominids that lived about two million years ago (ibid.: 724f). One may wonder whether selectionism necessitates strict gradualism (Eldredge & Tattersall 1982, Gould 1982) or whether data on the modern hunter-gatherers !Kung San should be applied to hominids that had only about half the cranial capacity of Homo sapiens and a rather primitive supralaryngeal tract (Lieberman 1984: 306ff.). Understandably, there is a huge gap between Pinker & Bloom's very general functionalist considerations and the very specific hypotheses of principles-and-parameters theory. This is also true of Newmeyer's (1991) functional approach to the emergence of universal grammar. He argues that syntax, defined as the interface between conceptual systems and phonology, has to be autonomous because, due to its intermediate status, it must directly reflect neither phonological nor conceptual structure. However, the term 'autonomy', as used by Newmeyer here, means little more than 'difference' and therefore simply does not match the concept as defined in generative linguistics. For an analogy, the system of complex motor programs, say, in the premotor cortex is an 'interface' between planning (presumably localizable in prefrontal areas) and motor execution (involving a hierarchy of structures including primary motor cortex, basal ganglia, cerebellum etc.). For this 'interface' to function as it does, it must work according to principles different from both those of planning and those of execution. But how enlightening would it be to call such an intermediate system 'autonomous'? On the contrary, Newmeyer's idea that syntax evolved as a 'mediator' between preexisting conceptual systems and preexisting perceptuomotor programs would imply that syntactic principles can be explained only by tracing their conceptual and perceptuomotor origins.

Other authors attempting to bridge the gap between generative linguistics and biology oppose functionalist gradualism. Piattelli-Palmarini (1989) reviews evidence on the immune system being equipped a priori with a repertoire of antibodies and reacting to nearly any possible antigen without learning, i.e., without having to create a new antibody in reaction to an external stimulus. He views this as an analogy to an innate foundation of language acquisition (cf. also Jerne 1985). However, Piattelli-Palmarini does not believe that the principles of language can be explained functionally and in selectional terms. Therefore, the genetic changes leading to the emergence of language, in his view, were not directly selected for, but happened due to 'proximity effects', i.e. genetic alterations (not themselves directly advantageous in selectional terms) that 'by chance' accompanied selectionally advantageous mutations. Gradualism of language evolution is questioned by Bickerton, who has described creole languages as direct expressions of an innate and species-universal 'language bioprogram' (Bickerton 1988). He concludes from language acquisition data - demonstrating, as he thinks, an abrupt onset of syntax at about age two - that language emergence in hominids was equally discontinuous (Bickerton in prep.). Even though Bickerton (e.g. 1990) by no means neglects evolutionary issues, his view is reminiscent of Chomsky's claim that there are no analogues to human language in animals. Chomsky himself has even concluded that it is pointless to enquire into the evolution of language at all (e.g. 1972a: 66ff. and in Rieber & Voyat 1983: 58f.).

In the following, I will give a less pessimistic account of the issue and survey a number of approaches that may help explain speech emergence within a more general cognitive framework.

3.1.1 Communication in monkeys and apes

3.1.1.1 Natural vocalizations

In spite of the complex sociocultural structures established by chimpanzees in the wild (involving tool construction and use, cooperative hunting, food sharing, social and kinship relations etc.; Dingwall 1979, Goodall 1971, 1979), it is commonly assumed that communicative behavior in monkeys and apes is strictly non-homologous to human language, mainly because it is essentially affective (whereas most of human speech is cognitive-referential), non-volitional, and controlled by subcortical brain structures (Dingwall 1979). Homology refers to "behaviors that are similar in closely related species, that can be related to structures showing a high degree of concordance in a number of parameters, and that could - together with their structural correlates - be traced back to a common ancestor" (Dingwall 1988: 279f.). In addition, as pointed out by Deacon (1990: 638), homologies may underlie "new adaptations that are radically different than the ancestral function". Thus, even absence of behavioral continuity does not imply discontinuity in the phylogeny of cognitive components and neural substrates involved in an emerging function such as language.

According to a study by Gardner et al. (1989), 'cross-fostered' chimpanzees (raised in a human environment) produce vocalizations mainly during emotionally exciting events, whereas the artificial gestural signs they have acquired are often used in anticipation of events announced by an experimenter and are thus less closely tied to the presence of affective stimuli. However, in many nonhuman primates elaborate systems of natural vocalizations have been found that appear to have some referential components (like alarm calls informing conspecifics of the source of danger). In addition to this elementary semanticity, many calls are discrete, i.e. do not intergrade (Steklis 1988), and examples of protosyntax can be observed (Demers 1988: 326ff.). For instance, Robinson (1979) found that in the Titi monkey the order of call sequences has a specific influence on response.

Primate vocalizations can be elicited by electrical stimulation of subcortical (but not neocortical) structures. However, this may be epiphenomenal to the affective responses triggered by the stimulation (Marin et al. 1979: 190ff.). In humans, electrical stimulation in the neocortical 'language areas' does not elicit speech either, but interferes with language performance (Lebrun & Leleux 1993, Ojemann 1988, 1991). Steklis (1988) reviews data in support of neocortical involvement and some degree of hemispheric asymmetry in primate vocalization control and perception (see also Petersen et al. 1978). In an ablation experiment, Heffner & Heffner (1984) showed that the superior temporal neocortex of the left hemisphere is necessarily involved in the categorical perception of species-specific vocalizations in the Japanese macaque. As discussed by Mueller-Preuss & Ploog (1983), the neural control of vocalizations in nonhuman primates is widely distributed, with neocortical areas effecting volitional control on the highest level, and limbic, diencephalic and metencephalic structures involved on lower levels. This model could account for deceptive communicative behavior (Cheney & Seyfarth 1991) and operant conditioning of vocalizations (Aitken & Wilson 1979) in nonhuman primates, i.e. behavioral potentials that seem to imply some degree of volitional control. Primate vocalizations are, therefore, neither purely affective nor strictly 'hard-wired'. As further shown by Hopkins & Savage-Rumbaugh (1991), Kanzi, a bonobo raised among humans, spontaneously developed a more elaborate system of vocalizations than conspecifics growing up among themselves. This suggests that learning contributes to the acquisition of calls in nonhuman primates.

It would therefore appear premature to rule out possible homologies between at least some components or aspects of human language and primate vocalizations. Cerebral control appears to have gradually shifted from the limbic system and other lower brain structures to the neocortex. A complementary set of findings from neurolinguistics shows that in humans injury to subcortical brain structures (thalamus and basal ganglia) sometimes results in language impairments comparable to aphasia caused by neocortical damage (Basso et al. 1987, Brown 1975, 1982, Crosson et al. 1986, Demonet et al. 1991, Wallesch & Wyke 1985, Wallesch et al. 1985). Electrical stimulation data (Bhatnagar et al. 1989, Lebrun & Leleux 1993, Ojemann 1988) support the view that subcortical structures are involved in human language functions.

3.1.1.2 Artificial 'languages'

Gardner & Gardner (1969) reported that the chimpanzee Washoe spontaneously used gestural signs and combined them into short chains in a non-imitative and non-random manner. The enthusiasm about having discovered protolinguistic capacities in apes comparable to early stages of human language acquisition (e.g. Dingwall 1979, Lieberman 1984, Passingham 1979, Premack 1980) was, however, dampened by objections from Seidenberg & Petitto (1979) and Terrace et al. (1979), who argued that chimpanzee 'speech' is essentially imitative (i.e. triggered by subtle prompts from experimenters), full of meaningless repetitions, and non-spontaneous. The field seemed to have reached a majority consensus in the mid-eighties according to which pongid 'language acquisition' was not comparable to that in children because it was based on explicit training and conditioning, and because productive syntax was beyond the capacities of pongids (cf. reviews by Tartter 1986: 263-318 and Walker 1983: 339-81).

However, more recent work by Greenfield & Savage-Rumbaugh (1990, 1991) suggests that some of these conclusions may be too restrictive for at least one pongid species, the bonobo (Pan paniscus). Two bonobos, Kanzi and Mulika, that were present, but not involved, in training experiments with their foster mother, spontaneously (i.e. without conditioning) acquired a 'language' consisting of gestures and lexigrams (cf. Savage-Rumbaugh & Lewin 1994). Greenfield & Savage-Rumbaugh (1993) argue that the frequent repetitions (of [parts of] caretaker utterances), previously taken as an indication of inferior communicative capacities, serve pragmatic functions (such as confirmation or request) mostly comparable to the role of repetition at early stages of child language acquisition.[3] As suggested by Savage-Rumbaugh (1991), the gestural signs introduced in earlier training experiments may have been too complex for young chimps to imitate and acquire spontaneously. Contrary to the communicative behavior of apes in these earlier experiments, the vast majority of 'utterances' produced by Kanzi were spontaneous comments, requests or announcements and not prompted by human companions (Savage-Rumbaugh 1986: 384ff.). Seidenberg & Petitto (1987) have conjectured that Kanzi's experimentally acquired communication was all due to food-related conditioning. In my view, this is both far-fetched[4] and hard to disprove, because intentions and meaningful representations can only be attributed to apes, never directly observed. This, one should note, is the case with children as well. Spontaneous communication has been observed not only in Pan paniscus, but to some extent also in the common chimp (Pan troglodytes). Fouts et al. (1989) report that Washoe, when introduced to conspecifics that had no knowledge of sign language, spontaneously signed to them. Washoe also spontaneously began to teach signs to a foster baby.

With regard to bonobos, Greenfield & Savage-Rumbaugh (1990, 1991) detect a "protogrammar" in Kanzi's and Mulika's two- and three-sign concatenations, which they liken to early stages of language acquisition in children. Whereas Terrace et al. (1979) found that multisign utterances by their chimpanzee subject Nim added little meaning to corresponding short utterances, Kanzi's gesture and lexigram combinations are reported to be mostly non-redundant (Savage-Rumbaugh 1986: 392ff.). More impressive still is Kanzi's English speech comprehension, which is comparable to that of a 2-year-old child (Savage-Rumbaugh et al. 1993). Above all, his improved performance in response to embedded sentences of the type "Go get object X that's in location Y" as compared to the type "Go to location Y and get object X" suggests some comprehension of syntactic structure (ibid.: 88ff.).

In sum, these studies demonstrate that nonhuman primates can acquire some components of human language. This potential is probably present to different degrees in different pongid species and requires a linguistically rich environment during the neurodevelopmental plasticity period in order to be fully realized.

3.1.2 Speech emergence and perceptuomotor evolution

As already noted, the idea that language capacity came into being abruptly and as a whole at some stage of hominid evolution is akin to the assumption of grammar being innate, autonomous and universal. This idea, however, appears to be mainly based on a teleological fallacy in that it presupposes evolution to proceed like an engineer. Jacob (1977) has expressed his criticism of this engineering logic in the metaphor of 'evolution the tinkerer': "Evolution does not produce novelties from scratch. It works on what already exists, either transforming a system to give it new functions or combining several systems to produce a more elaborate one" (ibid.: 1164). As Mayr (1991: 4) underlines, natural selection "makes use of any structure or behavior or other component of variability that happens to be available" - a quality he calls "opportunistic" (cf. also Mayr 1988: 408ff.). The common assumption that mind/brain is modular because man-made 'thinking' machines are most easily constructed that way is vacuous from a biological point of view (cf. Mueller 1992). Rather, such evolutionary principles suggest that an integrative approach to language emergence should be the default solution.

Attempts along these lines have been made (e.g. Bates et al. 1991c, Dingwall 1988, Lieberman 1984, 1991). Wind (1983: 21), in his attempt to sketch all processes that probably contributed to language emergence in primate evolution, comes up with a diagram of nearly one hundred interrelations. However, there seems to be a trade-off between inclusiveness and explanatory value. A model acknowledging that virtually every organic subsystem and every environmental influence may interactively contribute to the development of a new capacity would explain little. The alternative, however, must not be to discard the issue altogether, assuming that universal grammar is by definition a discrete subsystem that cannot be traced back to anything in any nonhuman species. The crucial issue is not the (probably vain) search for equivalence of behavioral capacities or neural substrates in apes and humans, but the discovery of homologies and preadaptations.[5] Even if the evidence for homologies in the communicative domain, presented in section 3.1.1, were unconvincing, other kinds of preadaptations related to speech emergence might still be found, for example in the auditory or motor domains. Before reviewing these issues, I will briefly consider some general neural variables that might account for human-specific language capacities. I concede that much of the evidence cited in the following sections sheds light on the issue of autonomy and phylogenetic discontinuity of speech emergence only indirectly. But indirect evidence is better than a theoretical a priori.

3.1.2.1 Are there specifically human brain properties?

If the language faculty is innate and autonomous, as many linguists claim, some corresponding specificity should exist on some level of neural organization.[6] Yet, it is hard to find neuroevolutionary trends which might specifically account for the increase in general intelligence that humans detect when comparing themselves to other animals, let alone to pinpoint substrates for linguistic capacities assumed to have abruptly originated in hominid evolution. The following variables have to be considered:

- General encephalization, which has been investigated by means of a variety of brain/body weight ratios. In this regard, humans do indeed rank quite high, though not much higher than dolphins (and lower than sparrows; Jerison 1985, Walker 1983: 130ff.). Encephalization can be viewed as one expression of a general size increase trend in the evolution of plants and animals, but is only loosely related to an increase in internal complexity (Bonner 1988, Deacon 1990: 649ff.).

- Structural encephalization in mammals mainly relating to the greater than proportional growth of the forebrain, but also the cerebellum (Leiner et al. 1989, Passingham 1979).

- An increase in neocortex linked to structural encephalization which is believed to imply:

(a) a greater than proportional expansion of 'association cortices' (Pandya et al. 1988, Thatcher 1980; but see below)

(b) a multiplication of functionally distinct areas within each sensorimotor modality. Whereas, for example, 12 visual areas have been found in the cat brain, 32 such areas have been identified in macaque monkeys (Kaas 1987, Van Essen et al. 1992). The same trend has been observed for other sensorimotor modalities (Kaas 1987, Kaas et al. 1979, Rizzolatti & Gentilucci 1988).

- Hemispheric asymmetries, namely:

(a) Morphological asymmetries such as the asymmetry of the planum temporale (Galaburda 1984; Geschwind & Levitsky 1968; cf. Charles et al. 1994). The fact that the planum temporale can be as much as ten times larger on the left (where it is part of Wernicke's area) suggests a causal relation to left hemisphere language dominance found in about 95% of humans. However, the planum is bigger on the left in only about 65% of the brains examined and, in addition, similar asymmetries have been detected in some monkeys and apes, and in hominids of the Peking and Neanderthal type that probably had no language in the modern sense (Dingwall 1988: 302f., LeMay 1984, 1985, Lieberman 1984). More generally, brain morphological asymmetries are not unique to primates and have even been identified outside the mammalian class, for example in songbirds, where song-control nuclei are bigger in the left forebrain of males (Nottebohm 1984).

(b) Functional asymmetries, which in most humans seem to imply left-lateralization of linguistic, analytic, and sequential functions and right-lateralization of spatial, holistic, and gestalt functions (Bradshaw & Nettleton 1981, Dean 1986). Comparable asymmetries have been found in many nonhuman species as well (Steklis 1988, Walker 1983: 161ff.), in fact, "brain-behavioral lateralization is a rather general property of vertebrates" (Denenberg 1984: 131).

All of these points refer to general mammalian and higher vertebrate trends and thus, if anything, suggest evolutionary continuity. There is little indication of an abrupt emergence of substrates of linguistic capacities (cf. Passingham 1979). Interestingly, for the visual domain comparative cytoarchitectonic data do in fact suggest an evolutionary discontinuity. Rockel et al. (1980) counted the neurons within a given volume of tissue through the depth of the neocortex. Whereas in non-primate mammals the number of neurons was virtually identical in all cortices studied (prefrontal, motor, somatosensory, temporal, parietal and visual), in primates they counted more than twice as many neurons in visual area 17 as in the other areas. This suggests that cognitive discontinuity (more complex visual processing in primates) is reflected in neural discontinuity. No corresponding discontinuity has been detected for the 'language areas' so far.

3.1.2.2 Language and auditory perception

Electrophysiological and 'high-amplitude-sucking' experiments have shown that human infants and neonates (even premature ones) can categorically perceive speech sounds. Thus, neonates can discriminate speech stimuli from nonverbal auditory stimuli (like music), but can also differentiate between speech stimuli on the basis of such properties as voice-onset time (VOT, distinguishing voiced and unvoiced consonants) and place of articulation (Bertoncini et al. 1989, Molfese 1987, Molfese & Betz 1988). These capacities are 'innate', but not necessarily genetically programmed. In a study with Kikuyu infants, Streeter (1976) shows that some infant phonological distinctions are experience-dependent. There are indications that speech stimuli are received by the fetus in utero and to some extent remembered until after birth (Krasnegor 1989; cf. DeCasper & Fifer 1980). These 'prenatalist' data cannot explain all phonological abilities in neonates, such as their capacity to distinguish phonemes that are non-distinctive in the maternal language (e.g. /l/ and /r/ in Japanese; Mehler 1989). But they suggest that phonological learning occurs before birth. In addition, the 'innate' phonological capacities found can hardly be interpreted as part of a specifically human 'language acquisition device' (LAD) or the like, since animal studies have shown that comparable capacities exist in various birds, rodents, monkeys, and apes (Kluender et al. 1987, Miller & Jusczyk 1990, Springer 1979).

On the other hand, there is evidence for some degree of specificity of phonological versus nonlinguistic auditory processing in humans (cf. Blumstein 1987). Zatorre et al. (1992) report different neural activation patterns for phonetic as compared to nonverbal auditory stimulation.[7] And Whalen & Liberman (1987) show that non-phonemic perception of acoustic formant transitions ("chirps") is suppressed in a syllabic context, presumably because speech perception by a 'phonetic module' has priority over nonverbal auditory perception. Phonemic specificity, according to Liberman & Mattingly (1985, 1989), is based on the motor organization of the human vocal tract, in the sense that speech perception implies recovering the vocal tract 'gestures' of speech production.

This phonemic specificity is, however, compatible with a gradual evolution of phonemic capacities. First, according to the motor theory of speech perception, phonemic decoding is based on the specificity of the human vocal tract and its motor control systems. This is highly compatible with the notion of perceptuomotor bases of language functions proposed here. Second, the fact that phonemic perception has developed into a specific type of auditory processing during human phylogeny is not at odds with its being preadapted from and to some extent homologous with auditory (sub)functions in nonhuman species. As pointed out by Kuhl (1993: 251), available data indicate that "in the evolution of speech existing auditory abilities were exploited in the selection of linguistically contrastive sounds". The categorical perception of VOT, for instance, can be attributed to a general mammalian auditory temporal resolution of c.25 msec (Lieberman 1984: 170ff.; cf. Warren 1988). Steinbuechel & Poeppel (1991) relate this to a general principle of perceptual binding by phase-locked oscillations in the 30 Hz range, with each oscillatory period being a "zone of atemporality" (ibid.: 63; see section 6).

3.1.2.3 Language and motor functions

Motor preadaptations for language are most likely to have originated in the following two motor subsystems:

(a) The vocal tract system: As seen in section 3.1, it is conceivable, though as yet not established, that species-specific calls in the common ancestors of man and ape were involved in language emergence. Still, vocalisations in modern pongids are likely to be primarily controlled by subcortical structures (in the limbic system, thalamus and midbrain; Dingwall 1979: 64f., Ojemann & Creutzfeldt 1987: 695f.). So whatever might correspond to Broca's area in apes (cf. Passingham 1982) is not a 'call center', though it may to some extent be involved in vocalization control (Mueller-Preuss & Ploog 1983). As argued above, this suggests a gradual shift of control rather than an abrupt emergence of a supramodal and autonomous language competence in Broca's area that would have nothing in common with similarly localized areas in pongid brains or with the neighboring motor areas in the human brain (cf. Rizzolatti & Gentilucci 1988).

If language evolution is linked to motor systems, it is 'non-specific' from the linguistic point of view, but there may or may not be specificity within the motor domain. Allott (1988) claims that a general 'motor alphabet' (not specific for any group of muscles and comparable across related species), when applied to the human vocal tract, yields the set of phonemes found in human languages. Thelen (1991: 349) suggests that vowels originate from neonate "comfort noises", and Studdert-Kennedy (1991: 20) traces the syllable back to "rhythmic jaw oscillation" analogous to rhythmic movements in the extremities. On the other hand, Liberman & Mattingly (1985, 1989) argue that speech perception can only function through feedback with motor programs specifically tuned to the human vocal tract. From the viewpoint of motor theories of language, the proximity of Broca's area (roughly corresponding to Brodmann area [BA] 44; Mohr 1976) to the face section of the primary motor cortex seems natural. Fox et al. (1988) report metabolic imaging data contradicting a language-specific role of Broca's area, which they found to 'light up' during non-speech tongue and finger movements (see also Fox & Pardo 1991: 129). However, the low spatial resolution of this positron emission tomography (PET) study may have contributed to these results. In an optical imaging study, Haglund et al. (1992) were in fact able to distinguish areas within the left inferior frontal gyrus activated in tongue movement vs. naming. Sergent et al. (1992) found only the superior part of BA 44 activated in a PET study on keyboard performance in pianists. And clinical evidence shows that deficits of phoneme production in speech apraxia and morphosyntactic deficits in Broca's aphasia can occur independently (Huber et al. 1989, Vogel et al. 1988; cf. section 4.1).

(b) Gestural systems: Homologies and preadaptations to human language have also been discussed for the gestural domain (Kimura 1979). Fischer (1988: 70) contends that gestures form the basis of semanticity in that they are interpreted by apes as "imminent action toward themselves or other parts of the environment". It has also been argued that manual and linguistic developments in children have the same origins and are to some extent parallel (Greenfield 1991, Raffler-Engel 1988, Volterra et al. 1979; cf. Thal et al. 1991b). McNeill (1985) observes this parallelism in adult speech performance as well.

On the other hand, the Piagetian claim that language acquisition is a mere by-product of general cognitive development founded on sensorimotor intelligence (Piaget [1954], 1976, Sinclair 1969, 1974) is no longer tenable in its strongest form (see below 3.2.3). A weaker version, according to which language capacity develops in close association with object manipulation during early stages, but becomes more 'modular' from the third year onwards (due to the late maturation of connections between Broca's area and prefrontal cortices), has been put forward by Greenfield (1991).

Clinical evidence on the relation between aphasia and gestural deficits seems mixed. On the one hand, Kimura & Watson (1989) report a correlation between deficits in the production of speech and of nonverbal gestures. Aphasics also seem to have deficits in pantomime performance that cannot be accounted for in terms of co-occurring apraxia (i.e. an impairment of volitional movement; Wang & Goodglass 1992). Hanlon et al. (1990) show that gesturing with the right hand facilitates naming in nonfluent aphasics. And as Poeck (1988, 1989) underlines, aphasia and apraxia share two essential features in that they are exclusively caused by lesions to the speech-dominant hemisphere and exist in humans only. On the other hand, however, there is little doubt that apraxia and aphasia can occur independently (ibid.).

Disturbances of gestural sign language in deaf subjects are usually caused by left-hemispheric lesions (Kimura 1979). This is supported by Bellugi et al. (1989) and Poizner et al. (1987) who, however, interpret their data on 'sign aphasia' in terms of both verbal and sign languages tapping the same supramodal 'universal grammar' (not related to general motor capacities). A study by Corina et al. (1992) with deaf signers supports a left dominance for signing, but not for nonlinguistic gestures (cf. also Poizner & Kegl 1992). Further, Petitto (1987) reports a discontinuity of pronoun acquisition by deaf children from prelinguistic gesture to sign language form, even though both forms are similar motorically. This is seen as evidence for a discontinuity between perceptuomotor and linguistic development assumed to apply to language acquisition in both the auditory and the visual modalities.[8]

Other data, however, indicate that modality differences are reflected in the functional organization of language. Cortical stimulation and single-unit recording in a bilingual signer-speaker demonstrated divergent neural organization of signed vs. spoken language (Haglund et al. 1993), with signed (but not oral) naming involving the left anterior temporal lobe. Braun et al. (1995) report PET data showing inferior parietal activation for sign production, contrasting with prefrontal activation for oral language production in bilingual signer-speakers. Dorsoparietal activation for sign (but not English) comprehension was also found by Neville et al. (1995). Poizner & Kegl's (1993) review of 'sign aphasia' studies suggests that lesion localization and symptomatology do not relate in the same way in deaf sign aphasics and in hearing aphasics. Thus, patient NS showed agrammatic deficits following left temporoparietal lesion. Marcotte & Morere (1990) report left hemisphere dominance in only about 33% of congenitally or early deaf subjects and suggest a neurofunctional reorganisation due to early lack of auditory stimulation (see also Kelly & Tomlinson-Keasy 1981). Finally, Sanders et al. (1989) observe right-hemisphere advantage for semantic categorization and sign language comprehension in early deaf subjects.

In general, there appears to be neither a strict parallelism of language and gesture nor a strict autonomy of language. Most likely, the phylogenesis of language involved a modality shift from primarily gestural (with the vocal component being mainly subcortically governed) to primarily vocal (and mainly controlled neocortically), but still accompanied by or - in case of sensory or motor deprivation - substituted by a gestural component (cf. Corballis 1992). Such gestural roots would be compatible with a gradual specialization of the neural bases of language during hominid evolution and human ontogeny.

3.2 Ontogeny

In spite of extensive research in the past 30 years, evidence on 'normal' language acquisition has produced little consensus on the questions central to this paper, namely whether language development requires normal nonverbal cognitive abilities and environmental models (other than for the 'triggering' of a LAD; for just a few of the opposing views and reviews see: Bates et al. 1991a, Crain 1991, Cromer 1974, Gleitman et al. 1988, Jaeggli & Safir 1989, Lightfoot 1989, Pinker 1987, Rice & Kemper 1984, Roeper 1988, Sinclair 1973). Relating language acquisition to brain maturation is still a daring enterprise. Nonetheless, the study of neurogenesis brings to light a number of principles that considerably constrain biologically realistic theories of language acquisition, although they are far from sufficient for the design of a comprehensive alternative model. I will first examine the neural principles assumed to correspond to the emergence and development of cognition, and will then review evidence concerning a genetically programmed LAD, a specifically linguistic 'critical period', and dissociations between verbal and nonverbal development.

3.2.1 Neurogenesis

Embryonic and fetal brain development can be described as a sequence of events (neural induction, neuronal proliferation, migration, aggregation, and differentiation) which in part temporally overlap due to differences in regional schedules (Cowan 1979, Goldman-Rakic 1986, Kolb & Whishaw 1990: ch.25). Location and timing of the 'birth' of a nerve cell is a major determinant for the 'final address' of the cell and its functional differentiation in the mature nervous system (Caviness et al. 1989, Rakic 1981, Rohrer 1990). The maximum number of neurons is reached after about 100 days of gestation. The ensuing growth of order in the brain can be attributed to two types of processes.

(a) Destruction: Selective loss of nerve cells may concern up to 85% of fetal neurons. Neuronal selection is based on limited trophic resources for which neurons and neuronal groups compete (Purves 1988).[9] Frequent activation secures the survival of neurons (Edelman 1987, Rakic 1989). Deficits of selective loss in neurogenesis are probably related to developmental disturbances (cf. Plante et al. 1991). In dyslexics, abnormal accumulations of nerve cell bodies have been found in neocortical layer I (Kemper 1984) and anomalous patterns of structural asymmetry of the planum temporale may be due to neuromigrational disorders and reduced selective loss (Galaburda 1988, Galaburda et al. 1989: 381ff., Leonard et al. 1993; cf. Plante et al. 1991). In early blind subjects, high occipital glucose metabolism is assumed to be related to deficient "synaptic revision" (loss) due to lack of visual stimulation (Veraart et al. 1990). Analogous findings for early deaf subjects are reported by Catalán-Ahumada et al. (1993).

(b) Construction: Whereas neurons do not regenerate after birth, neuronal processes (axons, dendrites) and synapses can be pruned, but can also regenerate during the entire lifetime. In addition, both neurons and synapses grow in size if frequently activated. Synapses are strengthened if pre- and postsynaptic neurons often fire simultaneously (Loewel & Singer 1992, Singer 1990). The loci of activity-related synaptic changes are numerous and interactive (Squire 1987: ch.2). Stabilization of neurons and their processes is related to the local density of glial cells which transport trophic substances from capillaries to neurons. The glia/neuron ratio in a given area can therefore be taken as an indicator of functionality, and thus be tentatively interpreted in cognitive terms (Diamond et al. 1985, Diamond 1990). Glial cells also provide insulating myelin sheaths for axons. Myelination is a reliable indicator of the functional maturation of a given brain area (Lecours 1981). It occurs postnatally and with regional specificity: first in the primary sensorimotor areas of the neocortex and last in the 'association areas'. This pattern is paralleled by the regional development of cerebral glucose metabolism (Chugani 1994a, Chugani & Phelps 1986, Chugani et al. 1987). In the prefrontal cortex, myelination is fully achieved in the second decade only (Trevarthen 1983).

The preceding sketch will suffice to show that there is no unique neural principle or mechanism that could, on its own, explain psychogenesis and knowledge increase. In particular, destructive and constructive principles interact and have different effects on cognition at different times in development. The exuberance of neural substrates in the fetus and in early life is probably non-functional in the cognitive sense, and loss and pruning are related to neurofunctional organization and the evolution of infant cognition.[10] Later in life, neuronal loss may, however, become cognitively dysfunctional, as is particularly evident in pathological conditions like Alzheimer's disease (cf. section 4.2).

The question remains to what degree brain architecture, such as regional and laminar organization and connectivity patterns, is specified in the genome. The hypothesis of 'chemoaffinity' put forward by Roger Sperry nearly half a century ago (cf. Levi-Montalcini 1990), which implies that neural circuits are strictly determined by the genome, had to be modified in view of the selective mechanisms mentioned above (Edelman 1987: 100ff.). Still, the prenatal brain is by no means an unstructured equipotential mass. Radial glial cells and calcium ion channels secure the orderly migration of neurons to specific loci in the cortex (Caviness et al. 1989, Komuro & Rakic 1992, Rakic 1981). Nerve-growth factors (NGFs) can direct axonal growth to a target and can prevent selective loss of neurons (Levi-Montalcini & Calissano 1979). NGFs and neurotrophic substances compose an entire set of agents, each with a particular chemical structure and a specific role in particular regions of the developing nervous system (Rohrer 1990, Thomson 1990). Axons of early maturing subplate neurons that are mostly lost soon after birth function as 'pioneers' for axons of later maturing neurons (McConnell 1991: 291ff.). Various types of cell adhesion molecules (CAMs) and surface adhesion molecules (SAMs) serve important and probably specific functions in the constitution of cell assemblies (Edelman 1987). Hormones have specific effects on brain development and function (Kimura 1992, McEwen 1989, Tallal 1991).

There is thus considerable evidence for the specificity of neurogenetic events. A genetic factor is involved in any neurodevelopmental event, any behavior or cognitive process (Benno 1990, Kupfermann 1991). Yet there is no reason to resort to rationalist notions of innate ideas or to the modern metaphor of a genetically programmed hard-wiring. If mind/brain development has to be viewed in terms of 'nature vs. nurture' at all, the truth will lie somewhere between genetic predetermination and environmental shaping (Edelman 1987) or, more specifically, between endogenous mechanisms such as 'chemoaffinity' and selective principles influenced by various types of environment (McConnell 1991, Purves 1988, Rakic 1989; see below 6.1).

3.2.2 Language genes

There is a long history to the notion of developmental language disorders being hereditary and the assumption that language capacities are therefore genetically programmed (Lenneberg 1967: 248ff.). Lenneberg (ibid.: 265) denied the "need to assume genes for language" and Ludlow & Cooper (1983: 5) have estimated that only a very small fraction of children with developmental dysphasia suffer from chromosomal anomalies. Recently, however, the debate about language genes has been revived because familial aggregation of 'specific language impairment' (SLI) could be shown in several studies (Crago & Gopnik 1994, Gopnik & Crago 1991, Tallal et al. 1989a, 1991, Tomblin 1989, Tomblin & Buckwalter 1994).

The evidence for language genes is as yet far from straightforward and there is some contradiction between individual sets of data and hypotheses. First, 'specific language impairment' is often rather loosely defined. Tallal et al. (1989a&b, 1991), for example, classify parents of SLI children who either reported 'below-average school achievement' or 'having been kept back a grade in school' as having a history of language disorder. Such criteria make the statement that 66% of SLI boys have an "affected parent" (Tallal et al. 1989b: 989) rather vague. Crago & Gopnik (1994: 45) are correct in calling for a "carefully defined linguistic phenotype". However, even if linguistic deficits are formally described (as is the case in Gopnik's [1990] and Crago & Gopnik's [1994] study of 'syntactico-semantic feature blindness' in a family presenting aggregation of SLI), equal emphasis must be put on investigating possible deficits in other, nonlinguistic domains (cf. Vargha-Khadem & Passingham 1990). Vargha-Khadem et al. (1995), studying the same family, report intellectual deficit beyond the morphosyntactic domain (severely reduced performance IQ in affected as compared to non-affected members) as well as articulatory deficit and orofacial dyspraxia. As Bishop's (1992) review shows, deficits have been found in SLI subjects for various nonverbal domains and tasks (such as conservation, anticipatory imagery, integration of complex information, hierarchical thinking, etc.).

Second, the symptoms and underlying anomalies reported for developmental language impairment are a mixed lot. Thus, SLI is described as a specifically linguistic deficit of morphological features (Gopnik 1990, Gopnik & Crago 1991), a general auditory (Cohen et al. 1991) or perceptuomotor deficit (Tallal et al. 1991), a memory "retrieval failure" (Leonard 1994: 103), or one of "nonlinguistic sequencing and short-term verbal memory" (Curtiss & Tallal 1991: 207). Third, familial distribution of SLI has been related to either a dominant (Crago & Gopnik 1994, Tallal et al. 1989b) or recessive (Tomblin 1989) autosomal (i.e. non-sex) allele. And finally, the observation that within the SLI population children with a family history of language impairment were not more severely impaired than those without (Tallal et al. 1991) appears hard to reconcile with strict heredity.

There can be little doubt about a genetic component in SLI, as is found in other developmental deficits such as dyslexia or attention deficit disorder (Denckla 1993, Pennington 1995). The most convincing concept for relating language capacity and genome is that of polygenic inheritance, which means that numerous genes may have interactive influence on the acquisition of language-related capacities, but these genes also influence many other non-linguistic developments (Hurford 1991: 175ff., Pennington & Smith 1983; see below 6.1). According to Spuhler (1979: 30), "well over 100 major gene loci are known to affect the central language system", but "none of these have been named 'genes for language' as such". For example, the contrast between Turner syndrome (one X chromosome missing; visuospatial deficits, relatively good verbal skills) and Klinefelter syndrome (one extra X chromosome; normal nonverbal IQ, but lexical and syntactic deficits) could be interpreted as a 'double dissociation' (cf. section 4.1) for verbal vs. nonverbal intelligence (Nass 1993). Yet no one would expect a particular locus on the X chromosome to be specifically responsible for language capacity or impairment (Hay 1985: ch.2). Interestingly, as mentioned by Pennington & Smith (1983: 380), each of these and other sex-chromosome anomalies allows a considerable phenotypic variability (presumably because they are buffered by epigenetic mechanisms; see 6.1), "such that some children in each [abnormal] karyotype group are completely normal". Even if specific types of linguistic deficit could be shown to be inherited, this would not mean that underlying nonlinguistic causes could be ruled out unless subjects were exhaustively tested in a large range of cognitive domains. Just stating that nonverbal or general IQ is normal in SLI by no means demonstrates the specificity of a developmental language disorder.

3.2.3 The critical period

Related to the idea of 'language genes' is the hypothesis of a genetically based developmental schedule determining a critical period for language acquisition. Since Lenneberg (1967), the notion that first language acquisition can only be normal if it occurs during a critical period (from age two until the onset of puberty, according to him) has been generally accepted (see review by Hurford 1991). Often, this critical period has been assumed to be a specific feature of language acquisition, and not one of learning in general. Lenneberg (1967: 158ff.) cited four aspects of brain development related to this hypothesis: (1) pronounced morphological development with growth coming to a halt at the end of the period; (2) steady and orderly histological development (of dendritic arbors in particular) during the period; (3) high levels of cholesterol and cerebrosides related to myelination; and (4) changes in brain electrophysiology. More recently, additional types of data have emerged. Chugani et al. (1987) and Chugani & Phelps (1991) report that in children from age two onwards global cerebral glucose metabolism as detected by PET exceeds adult levels by a factor of two or more. And EEG coherence data suggest cyclical maturational growth spurts from infancy to adolescence that roughly correspond to the Piagetian 'stages' of psychogenesis (Hudspeth & Pribram 1990, Thatcher 1987, 1992, 1994).

Evidence for a critical period of language acquisition is abundant. There are cases of 'feral' and deprived children, who receive language stimulation only after the onset of puberty and do not acquire language normally (Curtiss 1977, 1988: 97ff.; cf. Dennis 1951). The importance of the age of acquisition can also be observed in deaf signers, who according to Newport (1988) apply morphologic analytic acquisition strategies when learning sign language early - as opposed to adult learners, who predominantly use holistic rote strategies.

The 'Kennard principle', a clinical concept comparable to the critical period hypothesis, suggests that prospects for recovery are the better the earlier a postnatal lesion occurs (Bay 1975, Lenneberg 1967: 142ff., Vargha-Khadem et al. 1985, 1992). Even though the effect of age on lesion outcome is more complex (Gilbert 1988, Kolb 1990, Kolb & Tomie 1988, Rudel 1978), this principle is generally valid. Left hemispherectomy, which would lead to global aphasia in right-handed adults, permits language (re)acquisition if performed during the first decade (Leleux & Lebrun 1981, Strauss & Verity 1983, Vargha-Khadem & Polkey 1992, Verity et al. 1982), or if an exclusively left-hemispheric lesion occurred early in life (Glees 1980, Ogden 1988, Vargha-Khadem et al. 1992). With the exception of slow-growing tumors (DeVos et al. 1995), this usually leads to right-hemisphere dominance (Mueller et al. 1995c, Rey et al. 1988). Some studies, though, yield evidence for persistent syntactic and general cognitive deficits following early left hemisphere lesion (cf. Ozanne & Murdoch 1990: 31ff., Rankin et al. 1981). Vargha-Khadem et al. (1991) find some problems with grammatical morphology in patients with left-sided pathology who underwent hemispherectomy in adolescence. And Dennis (1980) detects subtle but lasting syntactic deficits even after early left hemispherectomy (see review by Vargha-Khadem & Polkey 1992). Such 'deficits', for example in the comprehension of passive negative sentences, can however also be observed in neurologically normal subjects with low mental age or social background (Bishop 1983). Indeed, nonverbal intelligence usually suffers more than verbal intelligence from a lesion-induced right-hemisphere shift of language dominance (Verity et al. 1982; cf. Mills et al. 1994). Strauss et al. (1990) relate this to a 'crowding effect'.
In a longitudinal study, Curtiss & Jackson (in press) observed delayed language acquisition, but no evidence for persistent syntactic deficits in a six-year-old who had undergone left hemidecortication at 13 months. Similar results for two patients with left hemispherectomy in late adolescence, following early left hemisphere pathology, are presented by Ogden (1988), who suggests that linguistic recovery after left hemispherectomy is a very slow process.[11]

There is additional evidence against a general left 'prepotency' for all aspects of language. Bates et al. (in press) report lexical and grammatical delays in children with early frontal lesion of either hemisphere, and Thal et al. (1991a) observed association of right vs. left lesion with lower receptive vs. productive vocabulary. Feldman et al. (1992: 100) found "[n]ormal rates of [language] progress after initial delays" in children suffering from early brain damage to either right or left hemisphere (see also Wulfeck et al. 1991b). Thus, after extensive left hemisphere lesion in infancy, the right hemisphere appears to be capable of sustaining normal language acquisition that may, however, be slower than in the average normal child. Language acquisition is extremely plastic in the context of extensive early brain damage. Analogously, the left-hemispheric dominance in the general right-handed population is unlikely to be strictly determined by the genome. Rather, functional asymmetry appears to be due to different maturational schedules in the two hemispheres. Thus, a visuospatial specialization of the early-maturing right hemisphere is followed by a linguistic specialization of the later-maturing left hemisphere (Nass 1993).

Task-related PET data on patients with early unilateral lesion suggest that the allocation of language is flexible not only in terms of lateralization, but also of localization within the affected hemisphere (Mueller et al. 1995b). Language is, in fact, more plastic (i.e. less 'hard-wired') than motor functions (which are persistently impaired after early damage to either hemisphere in cerebral palsy; Eicher & Batshaw 1993) and visuospatial capacities (which are severely disturbed after early right-hemisphere damage; Carlsson et al. 1994, Stiles & Nass 1991, Stiles & Thal 1993).

All of the above data indicate that the period between age two and the onset of puberty is one of extreme neural plasticity (see also Chugani 1994a, Neville 1991). There is, however, little in the data cited that suggests a specifically linguistic type of mechanism. Neither is the critical age phenomenon specifically human (Hubel & Wiesel 1970, Kolb 1990).[12] I will therefore turn to other neurobiological approaches that address the question of a human-specific, innate and 'hard-wired' language acquisition device triggered at some stage around age two (e.g. Lightfoot 1989). Such dramatic processes should be expected to manifest somehow in the functional and structural development of the (emerging) 'language areas' of the brain. PET data suggest that one-year-olds already resemble adults in terms of regional metabolic patterns (though not of absolute metabolic rate; Chugani & Phelps 1986, Chugani et al. 1987: 490). According to Molfese & Molfese (1985, 1994) and Molfese & Betz (1988), auditory evoked potentials for speech stimuli in neonates predict the level of language proficiency at 3 years of age. Presumably, the highly variable neonate capacities for fine auditory discriminations are a major factor in later language acquisition. This would be at odds with the hypothesis of innate universality of the speech capacity and abrupt triggering of a LAD at age 2. On the other hand, data on EEG coherence (supposed to be negatively related to functional differentiation; cf. Thatcher 1994) analyzed by Greenfield (1991: 542ff.) suggest that around the end of the second year of life, Broca's area, originally involved in both tool use and word acquisition in unison with the adjacent primary motor cortex, differentiates into two subareas, with only the inferior portion specializing in advanced language functions.

With regard to structural specificity, Lenneberg (1967: 160) observed a "dramatic increase" in the density of neuronal processes (i.e. axons and dendrites) between 15 and 24 months after birth. In a more recent study, Simonds & Scheibel (1989) analyzed the morphology of dendritic arbors in the inferior posterior frontal lobes bilaterally in children's brains. Dendritic systems are assumed to be essential for the 'information-processing' functions of neural ensembles (Jacobs & Scheibel 1993). Simonds & Scheibel (1989) did indeed find a strong development in Broca's area between 24 and 36 months of age (as opposed to an earlier dendritic development of the adjacent section of the primary motor cortex). Yet this finding does not really square with the idea of a LAD maturing in the left hemisphere at about age two because the pattern was observed in both hemispheres.

In sum, there is a great deal of neuroscientific evidence for the psycholinguistic notion of a critical period for language acquisition. Indeed, most structural and functional variables show that the first decade of life is a period of extreme neural activity, selectivity, and plasticity. Yet so far, there are few neurobiological data supporting the hypothesis of generative linguistics that certain autonomous and specifically linguistic properties are hard-wired (into the left hemisphere) and abruptly triggered at the beginning of syntax acquisition. Relatively late maturation of secondary and association cortices (such as the 'language areas') is generally found in the telencephalon (Lecours 1981, Rose 1980, Thatcher 1987) and does not in itself warrant the idea of 'autonomous subsystems' of knowledge.

3.2.4 Developmental dissociations

Developmental disorders are occasionally associated with dissociations between linguistic and nonverbal capacities, which have been interpreted in terms of a modularity of grammar (Curtiss 1982, Fromkin 1988, Yamada 1988). Curtiss (1982) reports on three subjects who, in spite of severely retarded nonverbal cognition, attain high levels of morphosyntax. 'Marta', for example, an adolescent with an IQ below 45 and nonverbal capacities generally on the 2-year-old level, produces morphosyntactically rich utterances like: "He was saying that I lost my battery-powered watch that I loved" (Yamada 1988: 184, for a similar case cf. Curtiss & Yamada 1981). Curtiss (1988) detects a 'double dissociation' because of other cases showing an inverse pattern (i.e. language being selectively disturbed in the context of good nonverbal development). Thus, in spite of "relatively normal nonlinguistic cognitive and social function, plus good lexical abilities", the hearing impaired subject 'Chelsea', who acquired language in her thirties only, produces distorted phrase structures like "The girl is cone the ice cream shopping buying the man" (ibid.: 99).

Models of language acquisition attributing a necessary role to general sensorimotor and preoperative intelligence (Sinclair 1969, 1974) appear to be refuted by such evidence. According to Curtiss, these cases suggest, not just a general autonomy of language, but more specifically, the autonomy of morphosyntactic development vis-à-vis nonverbal cognition and other language-related capacities (lexico-semantic and pragmatic). With regard to developmental dysphasia, even more specific accounts are presented by Clahsen (1989), Gopnik (1990), Grimm & Weinert (1990), Rice (1994), Watkins (1994) and others relating to selective deficits of sentence structure, grammatical morphology, and grammatical agreement (for critical review, see Leonard 1994).

A different type of developmental dissociation is seen in Williams syndrome, a rare and genetically-based metabolic disorder (hypercalcemia) associated with a sharp dissociation of highly developed language versus low general intelligence and, in particular, strongly disturbed visuoconstructive capacities (Bellugi et al. 1991, 1992, 1994). A Williams patient (WP) has, for example, severe problems drawing an elephant (the result being an unrecognizable collection of body parts completely neglecting the overall gestalt), but can eloquently describe the animal verbally (Bellugi et al. 1992). Bellugi et al. (1988: 283ff.) have tentatively interpreted this syndrome as support for "the autonomy of language", but it is clear that the dissociation is not comparable to those discussed above. As opposed to the pattern in 'Marta' and analogous cases, there is no division here between morphosyntax and semantics, and Pinker's (1991: 534) statement on a "selective sparing of syntax" in this population is misleading (cf. Bates et al. 1995a). In adult WPs, phonological, lexicosemantic, and pragmatic language components appear to be intact as well, although early language acquisition is delayed (Singer et al. in press, Thal et al. 1989). Relatively intact linguistic capacities in WPs can further be related to two nonlinguistic domains: first, high socioaffective function (Reilly et al. 1990), which may be related to good facial recognition (in the context of visuospatial deficit; Bellugi et al. 1994), and second, hypernormal auditory sensitivity, which suggests that WPs' good language development may have a perceptual basis.

Language performance in WPs could also be related to cerebellar hyperplasia in the context of general cerebral hypoplasia (Jernigan & Bellugi 1990, 1994). The cerebellum is known to be important for fine motor control and motor learning (Ghez 1991, Nitschke et al. 1995), as well as for motor imagery (Decety et al. 1990). In addition, Leiner et al. (1989, 1993) suggest the neocerebellum may be involved in language and other higher cognitive functions.[13] If the neocerebellum is involved in WPs' language behavior, this could be related to language acquisition strategies in these subjects being strongly based on rote learning (as suggested by Thal et al. 1989). Rubba & Klima (1991) found examples of 'non-conformist' preposition use in one WP studied, which may be explained as overgeneralizations from idiomatic prepositional expressions (like 'all of a sudden' or 'once upon a time') that were highly frequent in this subject's speech. This may indicate that language acquisition in WPs is an extreme form of a non-analytic 'expressive/pronominal style' observed in some normal children (Thal & Bates 1990) and, more generally, that different styles of language acquisition are associated with different patterns of neural allocation of language functions, in part beyond the perisylvian cortices.

As pointed out, Williams syndrome provides no straightforward evidence for a modularity of morphosyntax. There are additional and more general reasons against interpreting such dissociations in terms of modularity. First, the cognitive profile of hyperverbal, but semantically weak, behavior in the context of visuospatial deficits may reflect general, non-localizable early lesion effects. This profile is, for example, also found in hydrocephalus and spina bifida patients (Rahlson 1984). Second, cases such as those presented in Curtiss (1988) are extremely rare. It is unclear to what degree they can shed light on 'normal' neurofunctional organization at all. As reported by Bates et al. (1995a, in press), lexicosemantic and grammatical development are not dissociated in their large sample of children with early focal brain lesion.

The plasticity of the developing brain implies that sensory impairment of one modality leads to compensatory expansion of the neocortical representation of other sensory modalities. There is electrophysiological evidence for increased visual processing in the temporal lobes of early deaf subjects (Neville & Lawson 1987, Neville 1990, 1991) and for tactile processing in early blind persons' occipital lobes (Sadato et al. 1995, Uhl et al. 1991). In a related animal experiment, Sur et al. (1990) 'rewired' retinal afferents in ferrets to the medial geniculate nucleus, the 'auditory' area of the thalamus, and subsequently showed that visual stimuli were processed in what would normally be auditory neocortex. It can by no means be excluded that neurogenetic disorders in humans lead to comparable (less gross but possibly more pervasive) neurofunctional reorganizations. In fact, as sections 3.2.1 above and 6.1 below show, such effects have to be expected in rare cases of anomaly and early injury (see also Bach-y-Rita 1990, Cohen et al. 1993, Stein 1989). From this point of view, the intriguing developmental dissociations mentioned above may be comparable to specializations in so-called 'idiots savants'. I do not agree with Gardner (1983: 63) that the existence of savants and prodigies with highly specialized talents suggests that the corresponding capacity in normal individuals is a modular 'intelligence'. In view of typical savant specializations, such as calendar calculation beyond existing perpetual calendars (Horwitz et al. 1965), memorizing decades of weather information, or generation of 20-digit prime numbers (Treffert 1988), this would have absurd consequences. The same caution applies to at least some cases of severe developmental abnormalities (see, for example, Dorman 1991).

Interestingly, studies by Curtiss & Tallal (1991) and Curtiss et al. (1992) on the broader population of 'language-impaired children' yield results more compatible with a non-specificity view. Phonological, morphological and syntactic development were delayed, but not deviant, in these children. The underlying deficits were found to concern general (i.e. not specifically linguistic) rapid serial perception and short-term memory (see also Shear et al. 1992, Tallal et al. 1993). Neville et al. (1993) found an abnormal bilateral event-related potential (N400, see section 3.3) for grammatical functors in this population, which was, however, accompanied by a general slowing of auditory, visual, and tactile processing.

According to extensive data discussed by Bates & Thal (1991) and Bates et al. (1995a), variability and dissociations are features of language learning in normal children as well, but the lines of division (that are by no means sharp) lie between comprehension and production (i.e. production can lag behind to strongly differing degrees) and are related to different learning styles ('analytic/referential/nominal' vs. 'holistic/expressive/pronominal'). These dualisms are reminiscent of labels applied to cognitive modes of the left vs. right hemispheres (Bradshaw & Nettleton 1981).

The evidence reviewed in this section yields no clear answer as to whether language acquisition is characterized by genetically based universal principles and whether morphosyntactic knowledge constitutes an autonomous subsystem subserving both comprehension and production. In my view, the apparently contradictory empirical data suggest that syntax is a non-discrete cognitive specialization that may be selectively disturbed or developed in abnormal ontogenesis, but is typically embedded in and emerging from the perceptuomotor domains involved in language comprehension and production in the normal child. As will be discussed in section 6, the mixture of evidence pro and contra autonomy is precisely what is predicted by a model of distributive representations that incorporates existing knowledge about perceptuomotor-based regional brain specializations.

3.3 Microprocessing

Whereas the previous section looked at the emergence of language as a slow (phylo- and ontogenetic) process, this section will focus on the fast and microscopic emergence of individual cognitive and linguistic processes in (subsystems of) mind/brain. Brown's (1988, 1994) theory of microgenesis assumes that "growth processes in phylo-ontogeny correspond with the sequence of stages in cognitive processing" (1988: 4). Unfortunately, this biologically appealing approach has been debated and empirically tested to a limited extent only (for example, Brown 1988, Buckingham 1991, Hanlon et al. 1990). An opposing model that has been dominant for much of the last century abstracts from phylogeny and views cognition as a linear process serially activating specialized subsystems ('centers'). Lichtheim's (1885) 'house diagram' of language processing was based on the metaphor of the reflex arc. This serial localization was more recently revived by Geschwind (1965, 1970, 1979).

Geschwind's views are essentially compatible with Fodor's (1983) claim that speech perception is carried out by a (set of) domain-specific, encapsulated and automatic 'input system(s)' in a rigidly bottom-up manner (permitting no feedback from higher 'central systems'). This is supported by data suggesting that semantic, contextual, and pragmatic factors play no role in early syntactic analysis (e.g., Clifton & Ferreira 1987, Forster 1987; see reviews in Carston 1988, Garfield 1987). Other psychometric data indicate at least partly interactive processing between levels of speech perception (Altman 1987, Gernsbacher 1991, Marslen-Wilson & Tyler 1987, Shillcock & Bard 1993). Concerning speech production, data on performance errors of normal speakers have led Garrett (1988) to propose a model in which each type of error corresponds to a level of production. The fact that some types of errors never occur is taken as evidence for the strict seriality of processing. For example, utterances like "They were turking talkish" (for talking Turkish) are explained as errors on the level of lexical assignment. According to the model, these do not involve bound morphemes ('ing' and 'ish'), which are inserted at a later production stage and therefore appear to be 'stranded' (ibid.: 76ff.). With regard to lexical retrieval, Levelt (1992) opts for a distinction between lexical selection and phonological encoding in terms of serial modules. On the other hand, the observation that 'mixed errors', displaying both semantic and phonetic similarity to the target (e.g. 'rat' instead of cat), occur more often than would be expected statistically has led Dell & O'Seaghdha (1992) to suggest that semantic and phonological processes of speech production are both serial and interactive.

A detailed discussion of the intricacies of psycholinguistic research on modular seriality vs. interactive parallelism is beyond the scope of this neurobiological paper. Yet it is safe to say that no definite answers have so far been established in psycholinguistics, and it is legitimate to examine what neurobiology might contribute. Clinical data are an easy remedy for crude versions of serial modularism. The neural flow charts presented in Geschwind (1979: 163) might be thought to suggest that lesions in Broca's and Wernicke's area should exclusively lead to deficits of speech production and speech perception respectively. This is clearly untrue and, obviously, Geschwind himself was aware of this (cf. section 4.1). Inhibitive electrical stimulation in Broca's area leads to failures in speech comprehension (Gordon et al. 1990, Ojemann 1988). And cerebral blood flow studies show that Broca's area is activated, not only during speech production, but also during speech perception (Celsis et al. 1991, Fiez et al. 1993, in press, Lassen et al. 1978, Lechevalier et al. 1989, Silbersweig et al. 1993).

Blood flow imaging by means of PET has a low temporal resolution. Minimum stimulation and scanning windows are 20-30s (Hurtig et al. 1994, Silbersweig et al. 1993). This limits the applicability of PET to fast microprocessing. A complementary electrophysiological approach (with good temporal, but very low spatial resolution) is that of event-related potentials (ERPs). The N400, a negative potential measured 400 msec after stimulus onset, was shown by Kutas & Hillyard (1984) to be related to the perception of semantic anomalies, which suggests that this potential reflects the activation of a semantic subsystem at a rather late stage of language comprehension. In addition, Neville et al. (1991) detect a negative potential after 125 msec correlated with syntactic anomalies. An N125 effect (in response to phrase structure violations, for example) was measured only over the frontal and temporal lobes of the left hemisphere. A similar early negativity related to syntactic violations was registered by Friederici et al. (1993). However, in both studies this effect was much weaker than effects related to the N400. Neville et al. (1992) also found a left frontotemporal N280 related to the comprehension of grammatical functors (as opposed to content words).[14]

Neville et al. (1991: 159) draw the picture of specifically syntactic processes (reflected in the N125) followed by specifically semantic processes (reflected in the N400; see also Holcomb 1993). This serial-modular view is, however, called into question by evidence showing that the N400 is influenced by a range of non-semantic factors, such as modality of stimulus presentation, attention, subject's age and hand preference (Kutas & Kluender 1994, McCarthy & Nobre 1993, Van Petten & Kutas 1991). More importantly, it was shown that sentential factors also influence the N400, which is weaker for a particular word occurring at the end of a grammatical sentence than for the same word occurring at the end of an ungrammatical string (Van Petten & Kutas 1991: 151ff.). An N400 is associated, not only with incongruent content words, but also with grammatical functors (Neville et al. 1992) and morphosyntactic violation (Friederici et al. 1993, Muente et al. 1990, Muente & Heinze 1994). Kutas & Kluender (1994: 201) conclude that the N400 "reflect[s] the interactions of various lexical, semantic, and pragmatic factors with the surface structural properties of language". A high-resolution 64-electrode study by Curran et al. (1993: 207f) even suggests that "the N400 [is] characterized not by a unique scalp topography, but an interval in which the electrical terrain was featureless", which may indicate a highly distributive stage of processing involving syntactic, lexico-semantic, and other functions.

In summary, language-related ERPs, especially the N400, probably reflect the activity of a great number of brain areas and functional systems, involved in both verbal and non-verbal processes. Yet, there is no necessity to counter rigidly serial-modularist hypotheses with the notion of absolute parallelism. Semantic processes quite plausibly involve representations more widely distributed over the forebrain than syntactic functions (cf. section 6.3), which suggests that syntactic analysis of structurally transparent sentences may be more rapid than semantic analysis although both processes temporally and spatially overlap. Even if electrophysiological evidence remained equivocal on the question of serial vs. parallel language processing, the working hypothesis should be one of feedback and interaction between subsystems. Neuroanatomy shows that, at least in the forebrain, virtually all feedforward connections (such as the sensory afferents running from the thalamus to the neocortex) are coupled with corresponding feedback projections (e.g. Edelman 1987, Nieuwenhuys et al. 1988: 239ff., Pandya et al. 1988). Anybody who, like Fodor (1983), postulates strict 'bottom-up' perceptual processing either ignores neuroanatomy or implicitly claims a whole class of connections in the brain to be non-functional. The latter, however, is biologically untenable for reasons discussed in section 3.2.1.

4 DEGENERATING LANGUAGE

After examining various aspects of the emergence of language, I will now turn to evidence derived from two types of language loss due to (1) nonprogressive and (2) progressive brain lesion.

4.1 'Acquired' aphasia

The following very simple inference relates aphasiology to the issue of this paper: If language competence is stored in a set of hardwired and autonomous subsystems in the brain, neural damage should, at least in some cases, cause selective deficits involving (certain aspects of) linguistic capacity, leaving other cognitive domains intact.[15] One of the foremost goals of traditional aphasiology has indeed been to classify the aphasias according to correlations of lesion loci with symptom complexes (e.g. Goodglass & Kaplan 1979, Lichtheim 1885, Luria 1982). Also, aphasia has been viewed as leaving 'general cognition' intact (e.g. Wernicke 1874: 23f.). These 'localizationist' tenets have been challenged by 'holistic' critics such as Hughlings-Jackson (1875), Marie (1906), Head (1926), Goldstein (1948), Critchley (1970), and Eisenson (1984), who see the relation between lesion and symptoms to be complicated by compensatory and disinhibitory effects (see section 5.1) and explain aphasia in terms of underlying cognitive disturbances. Localizationists like Lichtheim (1885: 470f.) seemed to be aware of the ideal-typical status of syndromes such as 'conduction aphasia' or 'transcortical sensory aphasia'[16]. Within the theoretical framework of more recent cognitive neuropsychology, Caramazza (1984: 17) has postulated a "strong" definition of syndromes according to which "symptoms cooccur because of an impairment to a single processing mechanism". Advocates of cognitive neuropsychology such as Caramazza (1984, 1986), Coltheart (1987), Ellis & Young (1988), Linebarger (1990) and, with reservations, Caplan (1987, 1991) and Shallice (1988, 1991) subscribe to the view that a 'double dissociation' observed in two clinical cases, with function F disturbed but function G retained in one patient and G disturbed but F retained in another, is sufficient for postulating modules subserving F and G respectively.
With Saffran's (1982: 334) 'subtractivity assumption' and Caramazza's (1984) 'transparency condition', effects of compensation and disinhibition in the lesioned brain are put aside on methodological grounds (for criticism see Mueller 1992, Schweiger & Brown 1988).

Clinical dissociations are by no means a safe and direct avenue to realistic models of brain functional organization (cf. Shallice 1988: 217ff.), but they remain a most valuable type of evidence. Any model of neurofunctional organization has to account for cases of selective or apparently isolated deficits (like category-specific semantic disturbances or prosopagnosia, a selective deficit of face recognition). A general approach will be discussed in section 6. Here, I will restrict discussion to one dissociation that has been a central theme in neurolinguistics for nearly 20 years. The basic observation is that aphasias due to anterior lesions are often characterized by loss of grammatical functors and inflectional morphemes in the context of relatively preserved content words, whereas the pattern tends to be inverse with more posterior lesions (Lonzi & Zanobio 1983). Since the loss of grammatical morphemes is usually accompanied by poverty of syntactic structure, it has been suggested that such agrammatism might be due to a specific disturbance of syntax (Berndt et al. 1983, Blumstein 1988, Linebarger 1990, Zurif 1984). However, it has been shown that the pattern of grammatical losses in 'agrammatics' is often not supramodal - as should be expected from a 'central deficit' - and depends on the type of task. Thus, dissociations between production vs. reception, oral vs. reading comprehension, and speaking vs. writing have been observed (Druks & Marshall 1991, Ellis & Young 1988: 241ff., Howard 1985, Paradis 1988, Tesak 1991). It has also been shown that 'agrammatics' are capable of grammaticality judgments to a surprising degree (Berndt et al. 1988, Linebarger 1990, Wulfeck et al. 1991a). In addition, if agrammatism were explicable in terms of modules of universal grammar (Chomsky 1981), the deficit should be comparable from one language to another (except for differences due to parametric and peripheral variations). 
This does not seem to be the case (Bates & Wulfeck 1989, Bates et al. 1991b, Simons et al. 1988, Slobin 1991, Tzeng et al. 1991, Vaid & Pandit 1991).

Caplan's (1991) assertion that "agrammatism is a theoretically coherent aphasic category" therefore remains debatable. Caplan (1987: 286ff.) himself has conceded that agrammatism may be but a stronger form of syntactic deficit than paragrammatism (usually caused by left posterior perisylvian lesions). Whereas in Broca's aphasia production would be blocked, in Wernicke's aphasia a compensatory strategy would lead to fluent but abnormal output (cf. Lonzi & Zanobio 1983). Together with the mentioned indications of preserved grammaticality judgments, this suggests that the 'representation' of syntactic knowledge is distributed over wide areas of the (left) perisylvian cortex and/or that retrieval and processing, but not grammatical knowledge, are impaired (Bates et al. 1991b, Wulfeck et al. 1991a). Underlying nonlinguistic disturbances have been considered concerning "rapid processing" (Zurif 1990: 133) or "a generalized routine... of automatized, rapid exhaustive searches of memory" (Grodzinsky et al. 1985: 79). Friederici & Frazier (1992: 24f) define agrammatism as a "limitation on processing resources which interfere more with syntactic analyses than with lexical aspects (verbal memory) or with thematic analysis".[17] According to Goodglass & Menn (1985), the dissociation between grammatical functors and content words can be explained, not by syntactic, but by semantic criteria. The variable of meaningfulness elegantly and very simply distinguishes retained from disturbed lexical items in 'agrammatics'. As Friederici (1982) has shown for prepositions, lexical items are located on a continuum from meaningful to purely grammatical. Thus, grammatical functors and content words do not constitute discrete categories. The selective vulnerability of grammatical lexemes and morphemes (Bates et al. 1991b) is explained by their lack of semantic content.
As shown by Blackwell & Bates (1995), Kilborn (1991), and Tesak (1994), such selective vulnerability manifests itself even in normal speakers under increased cognitive load of task conditions, or when stimuli are partially masked by noise. In neural terms, this vulnerability can be explained in the following way: Whereas the representation of content words is widely distributed and strongly connected to experiential and conceptual representations (mostly outside the perisylvian 'language areas'; cf. Howard et al. 1992) and therefore quite resistant to focal brain damage or resource limitations, functors are represented in more circumscribed areas (cf. Braitenberg & Pulvermueller 1992, Pulvermueller & Preil 1991, 1994). The critical question is: How are lexical items acquired? While the acquisition of content word meanings is tied to environmental contexts (Pinker 1987, Schlesinger 1982), the context of acquisition for closed-class words is much more reduced. The grammatical function of the latter is primarily inferred from a linguistic, sentential context. Analogous to this condition of acquisition, the neural representation of closed-class words is much less distributive (see below section 6.3 and Mueller in prep.; cf. Pulvermueller 1992).

4.2 Language and progressive brain disease

This brief discussion will focus on Alzheimer's disease (AD), a progressive neural pathology leading to (pre)senile dementia. Prominent changes are neurofibrillary tangles and neuritic plaques disrupting neuronal transmission. The regional pattern of pathology is quite complex, but the mesiotemporal lobe appears to be strongly affected (corresponding to salient dysfunctions of declarative memory) whereas primary sensorimotor cortices, but also Broca's and possibly Wernicke's area may be less involved at earlier stages (Arnold et al. 1991; cf. Zola-Morgan & Squire 1990). It has been suggested that there are two different types of AD, one predominantly aphasic, the other non-aphasic in the early stages (Faber-Langendoen et al. 1988).[18] The second type would imply a dissociation between depressed nonverbal and relatively retained verbal cognition. Whitaker (1976) presents a case of surprising residual language capacities in the context of severe dementia. Her patient never spoke spontaneously. In echolalic utterances, however, she corrected grammatical anomalies (as in 'Can you told me your name?'), but repeated without correction semantically anomalous stimuli (like 'I had a building for breakfast'). Kempler et al. (1987) analogously report that semantic errors were significantly more numerous than syntactic errors in the spontaneous speech of 10 less severe AD subjects. And Murdoch et al. (1988: 133) conclude from similar data that "mild to moderately demented patients' syntactic and phonological abilities remain relatively more intact than their semantic abilities..." (see also Bayles 1984, Chan et al. 1993). Comparably, in Creutzfeldt-Jakob disease, progressive aphasia with stronger semantic than syntactic and phonological deficits has been observed even in the absence of general dementia (Mandell et al. 1989). On the other hand, the pattern is reversed in Parkinson patients (Cummings & Benson 1989). Grossman et al. 
(1992) found impaired sentence comprehension in these patients to be mainly related to a language-specific attentional deficit and to problems with the perception of grammatical morphemes. Lieberman (1991: 95ff.) likens the linguistic deficits in Parkinson patients to Broca's aphasia. The focus of neural degeneration in Parkinson patients is, however, not Broca's area but the basal ganglia.

These data, once again, appear to support a 'double dissociation' between syntax and semantics. The pattern in Alzheimer's disease has been interpreted as evidence for the modularity of syntax, selectively retained in these cases supposedly due to transient sparing of the left anterior perisylvian cortex (e.g. Kempler et al. 1987). However, the dissociations found are usually graded. Even Whitaker's (1976) case of a behaviorally rather discrete dissociation does not prove a corresponding discrete dissociation on the neurofunctional level. Syntax may be weakly and 'gracefully degraded' (see Bates et al. in press), whereas semantic representations are more strongly impaired, leading to complete dysfunction on the behavioral level (see section 5.1 below).

To sum up, acquired lesions and progressive degeneration in the forebrain can lead to specific disturbances and dissociations supporting the notion of regional functional specificity. However, there is little evidence for autonomously functioning 'modular' subsystems (for further discussion, see section 6.2).

5 UNIVERSALITY

In generative linguistics, genetic determination of language competence is linked to a universality assumption that has two aspects: First, universal grammar is defined as "the system of principles, conditions, and rules that are elements or properties of all human languages... by [biological] necessity" (Chomsky 1976: 29). The second aspect is based on the abstraction from socio-psychological variables of language use. It implies that an "ideal speaker-listener... in a completely homogeneous speech-community" becomes the exclusive object of linguistics (Chomsky 1965: 3). In this section, the concept of intra-species universality will be evaluated with regard to both the lesioned and the 'normal' brain.

5.1 'Pathonormal inference'

A simple common-sense interpretation of clinical data is that a function being disturbed after focal brain lesion was previously localized in the damaged area. This 'pathonormal inference', as I will call it (cf. Mueller 1991, 1992), has been applied to models of healthy brain function for millennia (Benton & Joynt 1960, Clarke & Dewhurst 1972) and until modern times (Shallice 1991, Caramazza & McCloskey 1991), though often in a much more sophisticated form (e.g. Shallice 1988). In cognitive neuropsychology, reference is often made to Saffran's (1982) statement that brain lesion results in a 'subtraction' of impaired functions from the normal system. Pathonormal inference implicitly presupposes universality of neurofunctional organization in 'normal' mind/brains. This becomes clear when one examines the single-case methodology common in cognitive neuropsychology (Shallice 1988: ch.10). Whereas in Goldstein (1948) the study of single individual cases was motivated by the recognition of interindividual variation, to Caramazza (1986) this method aims at identifying functionally specialized modules in the brain in spite of the fact that 'pure cases' (with damage to only one module) are very rare. It is assumed that evidence from a few cases studied in depth sheds light on the functional organization of the normal brain generally. Complicating factors are considered by Shallice (1988: 217ff.) but are judged non-essential because "the overall danger of misleading conclusions is not that severe" (Shallice 1991: 434; see also Caramazza 1992). For a criticism of this conclusion, I will first outline several points regarding brain damage (some of which are also discussed by Shallice 1988) and will then add considerations from the neurobiology of healthy brains.

Pathonormal inference with its implicit universality assumption is problematic for at least three reasons:

(a) It implicitly presupposes a modular organization. In 'neural networks' built by humans and therefore known to represent in a distributive fashion, artificial circumscribed 'lesions' can result in selective output ('behavioral') deficits (Wood 1982). Seidenberg & McClelland (1989a) and Hinton & Shallice (1991), for example, showed that connectionist single-route models of reading, when lesioned, produce some of the symptoms (like regularization of exception words vs. semantic paralexia and lexicalization of non-words) that have traditionally been taken as evidence for a dual-route model and a dissociation between 'surface' and 'deep' dyslexia (cf. Ellis & Young 1988: ch.8; Ganis & Chater 1991). Corresponding results of lesioned network simulations of past tense acquisition and covert recognition in prosopagnosia have been reported by Marchman (1993) and Farah et al. (1993). Analogously, it has been shown in connectionist models of development that dissociations on the behavioral level can result from 'learning' in non-modular distributed systems. Network simulations of English past tense acquisition exhibit many 'natural' features typically found in children as well, such as consistency and frequency effects (Daugherty & Seidenberg 1992), U-shaped performance patterns (with an intermediate stage of overgeneralizations like 'thinked' and 'goed'), and an abrupt transition from rote learning to rule-governed behavior (Plunkett & Marchman 1991). In these connectionist simulations, regular and irregular forms were not represented in separate modules (as supposed in the traditional 'dual-mechanism' model of psycholinguistics; Pinker 1991, 1993; cf. Marchman & Bates 1994). This does not imply that connectionist modelling is by definition incompatible with modularity (e.g. Tanenhaus et al. 1987), but it does show that clinical evidence believed to unambiguously support modularity can be accounted for in non-modular terms.

(b) Pathonormal inference requires a conceptual terminology appropriate for dysfunctions of biological brains. The agrammatism debate, however, is an obvious example of 'explanatory categories' being imported from fields little concerned with neurobiology (like the theory of grammar; cf. Caplan 1982, Ellis 1987, Grodzinsky 1990, and above section 4.1).

(c) Pathonormal inference unduly neglects neuroplasticity. The major compensatory mechanisms effective in the damaged brain are (Kolb & Whishaw 1990: 729ff.):

- Behavioral strategies helping to overcome or bypass dysfunctions, like the many apparently compulsive behaviors in aphasics described by Goldstein (1948), which complicate the overall picture of disturbances (see also Eisenson 1984: ch.4).

- Disinhibition, which typically results, not in 'subtractive', but in inappropriately rich performance.

- The partial take-over of impaired functions by undamaged areas due to functional reorganization in the developing and, to some extent, the mature brain (Bookheimer et al. 1995, Chollet et al. 1991, Devinsky et al. 1993, Duchowny & Jayakar 1993, Frackowiak et al. 1995, Friedman & Cocking 1986, Pons et al. 1991, Seitz et al. 1995, Stein 1989, Weiller et al. 1995; cf. sections 3.2.3 and 6.1).

- The regrowth and restoration of damaged axons, the rerouting of axons away from damaged targets, and the sprouting of axons towards new targets (cf. Johansson & Grabowski 1994).

Such mechanisms are effective in the adult brain and can a fortiori be assumed for the developing brain (Kolb 1990). In addition, the effectiveness of plasticity is probably 'contextual'. Hormones, for example, influence the neural and behavioral effects of lesions (Stein 1989).

Correlating data on localized structural damage in the brain (as provided by computer tomography or magnetic resonance imaging) with symptom complexes is complicated by a further factor. Studies of metabolic function in aphasics have shown "that structural lesions are associated with metabolic changes in brain regions beyond the site of damage" (Metter et al. 1989: 27; cf. Chugani 1994b, Frackowiak et al. 1991). This effect may be due to transsynaptic (and therefore often distal) degenerative changes after neuronal damage (Jessell 1991). Metabolic depression in the absence of structural damage was, for example, found in the left prefrontal cortex in Wernicke's aphasics and in the right cerebellum in Broca's aphasics (Metter et al. 1987). The latter phenomenon of 'crossed cerebellar diaschisis' appears to be most strongly correlated with deep middle cerebral artery infarcts and with cortical lesions in central frontoparietotemporal areas (Pantano et al. 1986). Since cerebral blood flow and metabolism are clearly related to an area's involvement in cognitive processing (Raichle 1987), it cannot be excluded that cerebellar dysfunction contributes to 'agrammatism' in Broca's aphasics (cf. section 3.2.3). This does not, of course, mean that grammar is 'located' in the cerebellum. Yet it suggests that morphosyntactic processing relies on a distributed network of brain structures, of which the neocerebellum is part. This is supported by a report on agrammatic deficits in a patient with right cerebellar infarct (Silveri et al. 1994).

To sum up, inferring functionally specialized modular brain subsystems from clinical dissociations is doubtful because it disregards many aspects of individual variability and postmorbid plasticity. In particular, dissociations (including double dissociations) are fully compatible with distributive functional brain organization (see section 6.2).

5.2 Normal universality

Interindividual variation (IV) is a common notion in traditional behavioral psychology (Carroll 1988) but is largely abstracted away in the cognitive sciences - even though Darwinist thinking has recently regained interest (e.g. Bates et al. 1991c, Changeux & Dehaene 1990, Edelman 1987, Lightfoot 1989, Pinker & Bloom 1990). From the Darwinist point of view, IV is indispensable. Selection within a species can be effective only if members are different from each other (Mayr 1988: 224; cf. Lloyd & Gould 1993). Individual differences may be hereditary or epigenetically based (cf. section 6.1). I will briefly outline some aspects of IV concerning human brains (cf. also Yeo 1989):

(a) Brain volume and weight can vary by a factor of two or more. Sheer mass may be thought to be cognitively irrelevant, but animal studies have shown that regional brain mass is correlated with behavioral capacity (Diamond 1990, Greenough & Bailey 1988).

(b) Neurovascular organization is extremely variable. Though the vascular system is not a locus of cognitive processing, this type of IV plays a considerable role in the clinic because obstruction at a particular point in the vascular system can lead to different patterns of damage in different brains.

(c) IV on the morpho-anatomical level, above all in the pattern of fissures and convolutions (Kolb & Whishaw 1990: 75f.), makes it hard to identify corresponding areas (assumed to perform specific cognitive functions) in different brains. Patterns of anatomical asymmetry in language-related areas that are subject to extreme IV have been reported by Geschwind & Levitsky (1968) and Galaburda (1984; cf. Charles et al. 1994). IV has also been found with regard to the extent and location of cytoarchitectonic fields (Galaburda 1984).19 On a yet more microscopic level, Jacobs et al. (1993), studying brain slices from 20 right-handers, found strong variability in the dendritic structure of Wernicke's area that was related to the person's sex, education, social activities, and profession (see also Scheibel et al. 1990).

(d) IV on the level of neurofunctional organization, i.e. the differential allocation of given functions in individual brains (e.g. Schlaug et al. 1994). Many variables of this type of IV are related to hemispheric asymmetries. There is a consensus that the left hemisphere is mainly involved in language-related functions and the right in visuospatial processing (Bradshaw & Nettleton 1981, Kolb & Whishaw 1990: ch.15). Yet, it is clear that there is no universal pattern and that many variables lead to a broad spectrum of IV (Galaburda et al. 1990). The two most discussed variables are sex and hand preference. The hypothesis of language functions being more bilateral in women than in men has only been partly confirmed (Connor 1985, Kimura 1985, 1992, McGlone 1980). This is probably due to the existence of variability both between and within the sexes. Another aspect of sex differences is that, while hormones generally influence brain development and function (Bradford 1987, McEwen 1989), high levels of testosterone in particular have been claimed to be positively related to the occurrence of developmental disorders (like dyslexia), autoimmune abnormalities and left-handedness (Geschwind & Behan 1984, Nass 1993; cf. Tallal 1991). The percentage of atypical language lateralization in left-handers is considerably higher (perhaps 30%) than in right-handers (5% or lower) - although the latter figure may be culture-specific. Thus, Hu et al. (1990) found a strikingly high ratio (c. 30%) of 'crossed aphasia' (resulting from right-hemispheric lesions) in a population of right-handed Chinese aphasics (see also Lecours et al. 1984). Hand preference is by no means a binary variable. As Dean (1986: 93) suggests, "lateral preference patterns... not only... vary from individual to individual but... for each individual as a function of the cerebral system under study". Thus, there appears to be a "right-shifted" normal distribution with absolute lateral preferences at the extremes and most individuals somewhere in between (Annett 1985: ch.13).

In addition to IV, brain function probably also varies intraindividually, in the sense that given cognitive functions may, to some extent, be neurally allocated differently under different circumstances (Molfese & Burger-Judisch 1990). Thus, Raichle et al. (1994) report dramatic changes in regional brain activations from naive to practiced performance in a semantic verb association task.

All in all, human brains are characterized by considerable and multifactorial IV. This may appear immaterial for universalist models such as the Chomskyan theory of grammar. From a Turing-machine functionalist perspective (Fodor 1981), it may be argued that identical programs can run on physically different machines and that neuroscientific data are therefore irrelevant. Yet, Chomsky has explicitly claimed universal grammar to be a biologically-based entity. And, as shown previously, it is likely that many of the physical variables discussed are related to cognitive variables. There are empirical answers to the functionalist/physicalist issue (cf. Churchland 1986), and neural IV should not be disregarded in cognitive models.

6 TOWARDS A BIOLOGICAL MODEL OF LANGUAGE IN THE BRAIN

Many of the arguments presented so far were directed against strong hypotheses about the autonomy, innateness, species-specificity and species-universality of the language capacity as they are found in linguistics, psycholinguistics, and some areas of neuropsychology. My intention was to show that such hypotheses are neurobiologically questionable, not to make equally strong claims to the contrary. I do not think that apes can acquire human-like syntax, that language acquisition is all based on rote learning, or that the brain is an equipotential mass. What I suggest, rather, is that dualisms such as 'innate vs. learned' or 'modular vs. holistic' dissolve the closer the mind is approached from the neurobiological perspective.

6.1 Epigenesis

In the debate about cognition and communication in chimpanzees (cf. section 3.1.1), the evidence that DNA in these apes is nearly 99% identical to human DNA and that chimpanzees are closer genetically to humans than to gorillas (Diamond 1988, Gibbons 1990) has created much surprise.20 It has also been argued that the gene 'code' is not long enough to specify the complexity of the adult human brain, let alone that of the total organism (Benno 1990, Changeux 1979, Clarke 1981). Furthermore, strict genetic determinism would seem incompatible with the fact that genome size is little related to the complexity of an organism (Bonner 1988: 135f.) and that, for example, in mouse and man one finds about the same amount of DNA per cell (Changeux 1986: 185). This amount is no direct clue, however, to the number of genes, because there is an as yet undetermined amount of (possibly non-functional) 'junk DNA' that may theoretically be higher in mice than in humans (Loomis & Gilpin 1986; cf. Lewin 1986). Junk DNA, however, would make the mentioned problem of the 'poverty of the gene code' (to adapt a Chomskyan expression, cf. section 2) even more acute. The phylogenetic emergence of language capacity is therefore unlikely to be based on a single gene as sometimes suggested (cf. section 3.2.2), though it may be related to changes in 'gene nets' (Bonner 1988: 174f.).

The issue is, of course, not as simple as sketched so far. The principle of pleiotropy, for example, implies that a local genetic alteration can result in a multitude of phenotypic changes distributed over various subsystems of the organism. This would imply that a putative gene involved in 'universal grammar' would also influence other, nonlinguistic domains of development. So, to the degree that pleiotropy applies to the neural bases of higher cognitive development, an autonomy of linguistic or morphosyntactic development would be biologically unexpected. The same is true for Gould's (1977) hypothesis that the vast phenotypic changes during hominization were due to limited changes in regulatory genes leading to heterochrony, i.e. alterations in developmental schedules and an extension of the period of juvenile plasticity (neoteny).

If one abstracts from the rigid dualism of a linguistic rule being either based on innate principles or learned and looks at how the brain develops (cf. section 3.2.1), it becomes apparent that the genome, rather than strictly determining early (let alone adult) brain organization, creates a very high probability that development, under normal circumstances, results in a 'normal' type of brain architecture and neurofunctional organization. The genome is thus not a 'program' but a set of constraints permitting a considerable degree of imprecision in brain development (Clarke 1981), which is governed by the neuro-Darwinist principles of abundance and selection (Changeux 1986, Changeux & Dehaene 1990, Edelman 1987; cf. section 3.2.1).

According to Braitenberg & Schuez (1991, 1992), the statistics of neocortical organization are incompatible with "deterministic wiring", but yield the picture of "a network whose elements are to a large extent connected at random" (1991: 94). As Hay (1985: 67) concludes from the study of genetic disorders, developmental schedules of biochemical interaction do not have to be absolutely exact, but "must be within fine limits if later abilities are not to be impaired". Changeux (1979, 1986) calls this range of tolerance the "genetic envelope". And Katz (1983) argues that 'ontogenetic buffer mechanisms' (such as the selective loss and stabilization of neurons and synapses) mediate between genome and phenotype and create relative universality in phenotypes that is not strictly determined by the gene code (see also Johnson 1993). This principle of ontogeny is analogous to what Waddington (1969), in an evolutionary context, called the 'epigenetic space', situated between genotype and phenotype space and signifying "an essential indeterminacy in the relation between phenotype and genotype" (ibid.: 364).

The epigenetic approach implies that a developing neural structure (on all levels of complexity from a cell membrane to a cerebral hemisphere) results from interactions of a preexisting structure with the environment. The consequences of this simple principle become highly complex because (a) there are different ranges of environment (local, distal [but still within the organism], exterior [the world]), (b) one structure is part of other structures' local or distal environment, and (c) development consists of an indefinite number of epigenetic steps for an indefinite number of structures (on numerous levels of complexity), and with each step the structure and all types of environment may change. Finally, (d) the organism created by these epigenetic interactions is itself part of the exterior environment interacting with (subsystems of) other organisms (cf. Plunkett & Sinha 1992).

The stereotypical question of whether a given mental capacity is 'innate' or 'learned' virtually dissolves: Birth, from the neurogenetic point of view, is a rather arbitrary point in development (see section 3.2.1), and with each epigenetic step, influences of the genetic program and the various ranges of environment become more interwoven (cf. Benno 1990). From this perspective, no property is strictly speaking genetically programmed on the cellular level, let alone on the level of functionally specialized cell assemblies, columns, or brain areas. Transplantation experiments have shown that individual neurons differentiate and functionally specialize according to the host, and not the donor, area (Bjoerklund 1991, Sanes & Lichtman 1992). Neurons are therefore not strictly predestined to subserve specific functions. Furthermore, as Walsh & Cepko (1992) demonstrate, neuronal clones (i.e. neurons deriving from the same precursor neuroblast) are widely distributed in the mature nervous system and do not aggregate into the same functional subdivision. This suggests that functional and topographic subsystems do not evolve in a predetermined way, but are shaped by interaction with internal milieus and external stimulation (cf. Creutzfeldt 1977, Johnson 1993, McConnell 1991, O'Leary 1989, Wimer 1990).

From this perspective, the claim that linguistic principles are innate or genetically programmed is an oversimplification with little precise meaning.

6.2 Specificity and Distribution

Local specificity and distributivity are not, as often suggested in discussions about a modular vs. holistic architecture of mind/brain, exclusive alternatives but two complementary aspects of biological knowledge representation. It is therefore not surprising that evidence is mixed and apparently contradictory. 'Neural network' simulations with distributed representations display many features of learning and 'behavior' (like 'graceful degradation', spontaneous categorization, and content-addressability; Elman 1993, Hanson & Burr 1990, McClelland & Rumelhart 1986, Smolensky 1988) known from biological intelligence. This seems to support a distributive model of neurofunctional organization. On the other hand, cognitive neuropsychologists have accumulated evidence for the dissociability of cognitive functions following localized brain lesion (Caramazza 1990, Ellis & Young 1988, Shallice 1988). Selective deficits pertaining to grammatical function words and inflections (see section 4.1) or to circumscribed semantic domains, such as animate or inanimate categories (Hillis & Caramazza 1991, McCarthy & Warrington 1988), body parts (Goodglass & Budin 1988), or fruits and vegetables (Farah & Wallace 1992, Hart et al. 1985) have been described. On a much more microscopic level, there is evidence for the selective response of particular neurons in monkey cortex to complex stimuli, like faces of particular individuals (Perrett et al. 1987) or paired items in a picture association task (Sakai & Miyashita 1991). Calling such nerve cells "person recognition units" or "pair-coding neurons" is, however, quite misleading. First, there is no discrete specialization because these neurons also respond to other faces or, respectively, to other pictures in the paired-associate task. Second and more importantly, such results are exactly what a model of distributed representations would predict (see discussion of comparable data from the human medial temporal lobe by Heit et al. 1988). Thus, in the Sakai & Miyashita study on the representation of paired associates, 59 of the 91 picture-selective neurons responded to more than two pictures. Computer simulations of distributed representation have shown that units are differentially activated by particular inputs, but this does not mean that any neuron is 'specialized' in the representation of any particular input or category (Hinton et al. 1986). Although Perrett et al. (1987) do not state from how many cells they recorded, it is clear that it can have been no more than a minute fraction of the totality of cells that may conceivably be involved in face recognition (cf. Lu et al. 1991). Thus, these data provide no evidence for local 'engrams' (cf. Lashley 1950).

As to the evidence for clinical dissociations, I have already argued in sections 4 and 5.1 that these do not straightforwardly support modularity (in the sense of discrete functional subsystems). Dissociations are well compatible with a model of graded regional specificity (Goldberg 1990). The same applies to the mentioned evidence on category-specific semantic disturbances (or to comparable semantic field effects in normal speakers' lexical errors; Garrett 1992). Even if, for example, a selective disturbance of naming and categorizing fruits and vegetables from picture stimuli could be shown to result from a focal lesion (which is not true for the case studies presented by Farah & Wallace [1992] and Hart et al. [1985]), this would not imply the existence of a corresponding lexical or conceptual module. As much as a model of distributed representation predicts functional specialization on the neuronal level, it does so on the level of brain areas, if only because each area is linked to the sensorimotor peripheries by a specific connectivity pattern. Clinical dissociations show that some components of behavior selectively suffer from brain damage in a particular area. This does not at all mean that this area is the sole locus of the intact subfunction, nor does it mean that the subfunction has broken down completely. It only proves that the regional dysfunction is strong enough to undermine behavioral expression of a subfunction. Analogous to the concept of selective vulnerability discussed in section 4.1, a deficit of an assumed cognitive 'subfunction' may be due to a graded dysfunction (or processing overload) within a relatively large brain area that is not at all discretely 'specialized' in the 'subfunction'. There is evidence from connectionist simulations (as discussed in section 5.1) that such artifacts occur in distributed systems.

Therefore, data frequently discussed in support of functionally discrete neural modules are by no means incompatible with the ever-increasing evidence for distributive principles of representation (Mesulam 1990, Posner & Rothbart 1994, Squire 1987). Interconnectivity of neurons in the neocortex is extremely dense. The average number of synapses per cortical neuron is estimated at 20,000 or more (Abeles 1991: 56ff.). Based on statistical analyses of the rodent forebrain, Braitenberg & Schuez (1991) estimate that each neuron in the neocortex will contact any other either directly, or by no more than two or three synaptic relays. Therefore, "[a]ny sufficiently large portion of the cortex is informed... about the activity in the rest of the cortex" (ibid.: 199). If this is true for primary and higher sensorimotor areas, it will rule out the 'informational encapsulation' of perceptual systems (including language) postulated by Fodor (1983). On the contrary, as "there is no way of isolating pieces of the cortex from each other, the correct description of cortical information handling would seem to be one in terms of global states" (Braitenberg & Schuez 1991: 199). Such global or, to be more conservative, distributive principles of function imply that one particular neocortical neuron or synapse will never be the unique site of any meaningful representation, but will participate in the storage of many different but related representations and in many different types of processing. Electrophysiological studies have shown that a particular neuron can be involved in various functions (both sensory and associative for example; Penfield & Perot 1963) and that only the synchronous firing of many neurons corresponds to elementary cognitive functions (Singer 1993, Vaadia et al. 1991; cf. Marder 1991, Meyrand et al. 1991). Creutzfeldt et al. (1989), recording directly from single neurons and neuron groups in humans, showed that, although there were response preferences for certain types or parts of verbal stimuli in some neurons of the superior temporal lobe, such single-unit specificity was never tied to any single phonetic feature of linguistic theory. Rather, one neuron would respond to phoneme combinations such as /cr/, /gr/, /sk/, /st/, /str/ (ibid: 456), suggesting that features such as 'guttural' or individual phonemes are stored, not in any single neuron, but in a distributed 'assembly' consisting of neurons with fuzzy response preferences. Fuzzy response preferences are also reported by Haglund et al. (1993) for single-unit recordings in left anterior temporal cortex of a bilingual signer-speaker.
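
The Braitenberg & Schuez estimate of very short synaptic paths, cited in this paragraph, can be made plausible by a rough upper bound. The following is only a back-of-the-envelope sketch under loudly stated assumptions: the figure of 20,000 synapses per neuron quoted from Abeles is taken as the fan-out s, every synapse is assumed (unrealistically) to reach a distinct neuron, and the total number of neocortical neurons N is assumed to be on the order of 10^10 for a human brain, a figure that is not given in the text. R_k denotes the number of neurons reachable within k synaptic steps.

```latex
R_k \le s^{k}, \qquad s = 2 \times 10^{4}
\quad\Longrightarrow\quad
R_1 \le 2 \times 10^{4}, \qquad
R_2 \le 4 \times 10^{8}, \qquad
R_3 \le 8 \times 10^{12} \;>\; N \approx 10^{10}
```

Under these assumptions two steps cannot cover all N neurons, whereas three steps could in principle. Since many synapses converge on the same targets, the true reach is smaller than this bound, so the arithmetic shows only that paths of a few relays are not ruled out by the numbers; it does not establish the estimate itself.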

As mentioned before, each sensory system consists of a multitude of topographic maps and several pathways working in parallel (Kaas 1987, Kaas et al. 1979, Van Essen 1992). Functional imaging studies have demonstrated that each cognitive task corresponds to complex patterns of activation distributed over the neocortex (Friberg 1993, Haxby et al. 1991, Pahl 1990, Raichle 1987, Roland & Friberg 1985, Sergent et al. 1992). This has been noted in particular in language-related imaging studies (Demonet et al. 1992, 1993, Mazziotta & Metter 1988, Posner et al. 1988, 1989, Zatorre et al. 1992, Wise et al. 1991). Distributivity is possibly even stronger in the child than in the adult brain, with increasing cognitive skill corresponding to decreasing distributivity (Mills et al. 1994). Therefore, even though functional specificity and distributivity are features of the brain at any age, there may be a trend from distributivity to specificity in development (cf. Greenfield 1991).

To sum up, evidence on neuronal and regional specificity of function does not imply that concepts, percepts or cognitive subfunctions actually reside in neatly circumscribed loci. On the contrary, a distributive model of knowledge representation predicts that particular neurons are more involved in some representations than in others and that particular areas participate in some cognitive functions more than in others. Neuronal and regional specialization are by no means incompatible with distributed representation. This is reflected in a number of theoretical concepts to be discussed in the following.

6.2.1 Binding in 'convergence zones'

If knowledge representation is distributive, the question arises of how complex percepts and concepts can be unitary in introspection (cf. Crick 1984). Damasio (1989, 1990b) and Damasio & Damasio (1994) have proposed a solution based on the hypothesis that there are association cortices ('convergence regions' consisting of numerous 'convergence zones') which possess convergent afferent and divergent feedback connections to many sensorimotor cortices - an anatomical claim for which there is some evidence (e.g. Pandya et al. 1988: 56ff., Braitenberg & Schuez 1991: 143ff.). According to Damasio, activity patterns distributed over 'lower' sensory cortices are recorded by such convergence zones and can be 'retroactivated' or 'reconstructed' by them via their divergent feedback or 'reentrant' projections (cf. Edelman 1987, Posner & Rothbart 1994). On the basis of clinical data (Damasio 1990c, Damasio & Damasio 1990, 1994, Damasio et al. 1990a,b, 1991b), polymodal convergence zones have been located in the anterior temporal and the prefrontal cortices. Convergence zones are not expected to store 're-representations' of all the fragmentary sensorimotor representations which are combined to percepts, concepts etc., but only 'codes' capable of 'reconstructing' these fragments. From this perspective, even abstract conceptual (and linguistic) knowledge would involve distributed activations in perceptuomotor cortices.

6.2.2 Cell assemblies and phase-locked oscillations

At least two theoretical considerations suggest that the smallest neural substrate for meaningful representations is not the single neuron (as claimed by Barlow 1972 and Gilinsky 1984), but the cell assembly:

- Neuronal death is an everyday phenomenon in brains, but the sudden extinction of knowledge happens only when tissue containing very many neurons is damaged.

- The brain's capacity to store procedural and declarative memories has no definite limit. Knowledge representation in overlapping cell assemblies is not only more plausible in terms of the complex interrelations between mental concepts (Barsalou 1987), but also much more powerful because much more knowledge can be represented in a distributive than in a localist system (Palm 1990).

Cell assemblies (CAs) are defined as functional units (i.e. smallest 'meaningful' neural substrates) comprising neurons that are strongly interconnected by excitatory synapses such that activation of a subset leads to 'ignition' of the whole assembly as soon as a given threshold is passed (Hebb 1949; Braitenberg & Schuez 1991: 201ff., Pulvermueller 1992, Pulvermueller et al. 1994). Neurons participating in one CA may be widely distributed over the neocortex and lower brain areas. Any neuron may participate in an indefinite number of CAs. Such partly overlapping neuronal substrates are assumed to lead to serial activations of CAs ('synfire chains'; Abeles 1991: 232ff.).
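
The notion of 'ignition' can be illustrated with a deliberately minimal threshold model. The sketch below is not taken from any of the cited works; the function names `build_weights` and `run`, the network size, the two assemblies, and the threshold values are all hypothetical choices made for illustration. It assumes binary neurons, uniform excitatory weights within an assembly, and no inhibition or decay.

```python
import numpy as np

def build_weights(n_neurons, assemblies, w_in=1.0):
    """Excitatory weights: w_in between members of the same assembly, 0 elsewhere."""
    w = np.zeros((n_neurons, n_neurons))
    for members in assemblies:
        for i in members:
            for j in members:
                if i != j:
                    w[i, j] = w_in
    return w

def run(w, seed_neurons, threshold, steps=10):
    """Synchronous threshold updates; active neurons stay active (toy assumption)."""
    active = np.zeros(w.shape[0], dtype=bool)
    active[list(seed_neurons)] = True
    for _ in range(steps):
        drive = w @ active.astype(float)           # summed excitatory input per neuron
        new_active = active | (drive >= threshold)
        if np.array_equal(new_active, active):     # nothing changed: stable state
            break
        active = new_active
    return set(np.flatnonzero(active).tolist())

# Two hypothetical assemblies sharing neurons 4 and 5.
assembly_a = [0, 1, 2, 3, 4, 5]
assembly_b = [4, 5, 6, 7, 8, 9]
w = build_weights(10, [assembly_a, assembly_b])

below = run(w, {0, 1}, threshold=3)      # too few seeds: no ignition
ignited = run(w, {0, 1, 2}, threshold=3) # assembly A ignites, B stays silent
chained = run(w, {0, 1, 2}, threshold=2) # lower threshold: activity spreads from A to B
print(below, ignited, chained)
```

With the higher threshold, the two shared neurons alone cannot recruit the second assembly, so the overlap does not destroy the identity of either assembly. With the lower threshold, activity passes from one assembly to the next through the overlap, which is loosely analogous to the serial activation of overlapping assemblies mentioned above; it is not meant as a model of synfire chains in Abeles's sense.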

The cell assembly hypothesis is supported, in general, by the evidence for distributed knowledge representation discussed above. More specific supportive data relate to multineuronal associations of repeating spike patterns (Abeles & Gerstein 1988) and phase-locked oscillations in groups of neurons distributed over non-neighboring cortices and even fairly distant brain structures (like the lateral geniculate and the visual cortex). Gamma oscillations in the range of 20-30 Hz are reflected by behavioral variables as well (Madler & Poeppel 1987, Poeppel & Logothetis 1986, Poeppel et al. 1990a,b, 1991). Phase-coupling of oscillations appears to reflect the multineuronal response to one particular stimulus (Engel et al. 1991, Gray et al. 1989, Jagadeesh et al. 1992, John 1989, Singer 1990), or to different aspects of a particular perceptual gestalt (Singer 1993: 356f.). Thus, the spatial patterns of such oscillations can be claimed to be meaningful in a cognitive sense (Bressler 1990, Freeman 1991, Freeman & Maurer 1989).

Oscillations have been presented as a solution to the aforementioned 'binding problem' (see also Eckhorn 1991), in the sense that synchronous oscillations can bind the activity of neuronal groups in various modal and association cortices as well as subcortical areas.
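
What 'phase-locking' means operationally can be shown with two coupled phase oscillators. The sketch below uses the Kuramoto model, a standard toy model that is not employed in the studies cited here; the frequencies (25 and 27 Hz, chosen to fall in the range mentioned above), the coupling strength, and the function names `simulate_pair` and `phase_locking_value` are assumptions made purely for illustration. The phase-locking value is the length of the mean unit vector of the phase difference: close to 1 when the difference is stable, close to 0 when it drifts.

```python
import numpy as np

def simulate_pair(f1_hz, f2_hz, coupling, t_end=2.0, dt=1e-4, phase0=(0.0, 1.0)):
    """Euler integration of two Kuramoto oscillators; returns the phase difference over time."""
    w1, w2 = 2 * np.pi * f1_hz, 2 * np.pi * f2_hz
    n = int(round(t_end / dt))
    th1, th2 = phase0
    diff = np.empty(n)
    for k in range(n):
        diff[k] = th1 - th2
        d1 = w1 + coupling * np.sin(th2 - th1)   # oscillator 1 is pulled towards 2
        d2 = w2 + coupling * np.sin(th1 - th2)   # oscillator 2 is pulled towards 1
        th1 += d1 * dt
        th2 += d2 * dt
    return diff

def phase_locking_value(diff):
    """Magnitude of the mean unit phasor of the phase difference (0 = drifting, 1 = locked)."""
    return float(np.abs(np.mean(np.exp(1j * diff))))

# Hypothetical parameters: coupling is in rad/s; locking requires 2 * coupling > |w1 - w2|.
plv_uncoupled = phase_locking_value(simulate_pair(25.0, 27.0, coupling=0.0))
plv_coupled = phase_locking_value(simulate_pair(25.0, 27.0, coupling=10.0))
print(plv_uncoupled, plv_coupled)
```

Without coupling, the two units keep their own frequencies and the phase difference drifts through all values. With sufficient coupling, they settle on a common frequency and a fixed phase lag, which is the sense in which distant neuronal groups could be 'bound' by synchrony despite differing intrinsic properties.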

6.2.3 Toposemanticity

Topographic representation in perceptuomotor pathways up to primary, and even secondary, unimodal cortices (Kelly & Dodd 1991: 280f) can be viewed as the most obvious reflection of a general principle of toposemanticity (Mueller in prep.). In the somatosensory and motor cortices, for example, toposemanticity is straightforward in that the gestalt of the body is reflected - though with distorted proportions - in functional topography.21 Comparable tonotopic and retinotopic representation exists in auditory and visual cortices (e.g. Kaas et al. 1979, Merzenich & Brugge 1973, Sereno & Allman 1991). With regard to cortical interneurons, toposemanticity implies that the functional role of each neuron is determined by

(a) its localization in anatomical space, i.e. the relative proximity to primary sensorimotor cortices (local connectivity effect), and

(b) its distal connections (or local connections to neurons with long axons) directly or indirectly linking it to primary cortices and pathways to the sensorimotor periphery (indirect proximity effect).

6.3 'Language areas'

The traditional notion of areas of the (left) perisylvian cortex being the storage sites of a specific and autonomous type of linguistic knowledge can be revised on the basis of the concepts introduced above. The 'language areas' of Broca and Wernicke may be viewed as convergence zones, but in a sense partly differing from Damasio & Damasio's (1994) definition. Whereas the Damasio model suggests linearity of processing streams in that activations in sensorimotor cortices converge onto association areas, where they are coded and potentially retroactivated by linear feedback, I propose to define convergence in terms of cell assemblies, i.e. in parallelist rather than serialist terms. Neurons in both sensorimotor and association cortices may engage in phase-locked oscillation if they belong to the same assembly (or to sufficiently overlapping assemblies). The distributed topography of a lexicosemantic cell assembly, for example, can be traced back to the perceptuomotor components involved in the acquisition of a given word meaning (cf. Damasio &a