If the cortex is an associative memory, strongly connected cell assemblies will form when neurons in different cortical areas are frequently active at the same time. The cortical distributions of these assemblies must be a consequence of where in the cortex correlated neuronal activity occurred during learning. An assembly can be considered a functional unit exhibiting activity states such as full activation (ignition) after appropriate sensory stimulation (possibly related to perception) and continuous reverberation of excitation within the assembly (a putative memory process). This has implications for cortical topographies and activity dynamics of cell assemblies representing words. Cortical topographies of assemblies should be related to aspects of the meaning of the words they represent, and physiological signs of cell assembly ignition should be followed by possible indicators of reverberation. The following postulates are discussed in detail: (1) assemblies representing phonological word forms are strongly lateralized and distributed over perisylvian cortices; (2) assemblies representing highly abstract words, such as grammatical function words, are also strongly lateralized and restricted to these perisylvian regions; (3) assemblies representing concrete content words include additional neurons in both hemispheres; (4) assemblies representing words referring to visual stimuli include neurons in visual cortices; (5) assemblies representing words referring to actions include neurons in motor cortices. 
Two main sources of evidence are used for evaluating these proposals: (a) imaging studies aiming at localizing word processing in the brain, based on stimulus-triggered event-related potentials (ERP), positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), and (b) studies of the temporal dynamics of fast activity changes in the brain, as revealed by high-frequency responses recorded in the electroencephalogram (EEG) and magnetoencephalogram (MEG). These data provide evidence for processing differences between words and matched meaningless pseudowords, and between word classes such as concrete content and abstract function words, and words evoking visual or motor associations. There is evidence for early word class-specific spreading of neuronal activity and for equally specific high-frequency responses occurring later. These results support a neurobiological model of language in the Hebbian tradition. Competing large-scale neuronal theories of language are discussed in the light of the summarized data. A final paragraph addresses neurobiological perspectives on the problem of serial order of words in syntactic strings.
The issue I would like to address is that of different vocabulary classes. At school, one learns to categorize words into fifty or so lexical categories, such as noun or verb, and one may also be asked to categorize words on the basis of their meaning, according to semantic criteria. Of course, it is useful, for didactic purposes, to make a large number of distinctions between classes of words, not only based on their meaning and their function in syntactic structures, but also based on criteria such as their intonation, syllable complexity, number of letters or speech sounds, or the frequency with which they are used in ordinary language. However, one may wonder whether some of these distinctions reflect differences that are biologically real. This would mean that the members of word classes A and B, which can be distinguished based on linguistic or didactic criteria, would also be represented differently in the human brain. In psycholinguistics, much effort has been spent to demonstrate processing differences between word classes, for example between the major lexical classes called content words (or open-class words, including nouns, verbs, and adjectives) and function words (or closed-class words, including articles, pronouns, auxiliary verbs, conjunctions etc.). (Some of these studies will be discussed in Section 5.) However, if the conclusion is that two classes A and B differ, it becomes important and necessary to specify the difference. It is good to know that two words are different; it is, however, better to know (or to have an idea about) what the actual differences are. A biological approach aims at specifying the difference in terms of neurons and neuronal connections.
During recent years, more and more neuropsychological studies have been devoted to the investigation of cortical mechanisms necessary for word processing, and psychophysiological studies have been investigating the brain areas that "light up" when words are being produced or comprehended. Such studies are most welcome, because they may contribute to an answer to the what question mentioned above. One aspect of this question concerns the localization of neuronal correlates of words, the question where representations are housed and processes take place. However, even if questions such as: "which word classes will be selectively impaired after a focal brain lesion in cortical area X?", or: "which brain areas will become active when words of class A are being produced or comprehended?" will have found their definite answers, the question of why this is so may still be open. Why are words of class A being processed in area X? An explanation of language mechanisms in the brain is only possible by answering such why questions based on known biological principles. But even definite and exhaustive answers to where and why questions may still not be considered a satisfactory end point of cognitive neuroscientific research: If it is clear where in the brain particular language units are being represented and processed, and if it is clear why this is so, one may still ask how language representations are laid down, and how these representations are being activated when language units are being processed.
This target article will certainly not provide a complete answer to where, why and how questions related to language. It will provide preliminary answers to the where question as far as words of certain classes are concerned, it hopes to convince the reader that the why question can be answered in a few clear cases, and it aims at specifying some very basic features of cortical representations and the way they become active and maintain their activity. All this is done on the basis of a brain model rooted in Hebb's concept of cell assemblies. In fact, the purpose of this article is not only to discuss the issue of words in the brain. Rather, it aims at making evident that the Hebbian approach is a powerful tool for cognitive neuroscience that may lead to a biological explanation of our language capacity and may provide explanations of other higher cognitive capacities, too.
2. The Hebbian model, recent modifications, and some evidence
In the late 1940s, Donald Hebb (1949) proposed a neuropsychological theory of cortical functioning that can be considered an alternative to both localizationist and holistic approaches. Localizationists would assume that small cortical areas are fully capable of performing complex cognitive operations. A localizationist would, for example, propose that an area of a few square centimeters of cortical surface is the locus of word comprehension (Broca, 1861; Lichtheim, 1885; Wernicke, 1874). According to this view, the psychological process (word comprehension) is restricted to the area, that is, no other areas are assumed to contribute to this specific process. Only under pathological conditions or during development may there be a shift of the process to another equally narrow area (Luria, 1970; Luria, 1973). In contrast, a holistic approach would imply that the entire cortex exhibits equipotentiality with regard to all cognitive operations and that all cortical areas (or even brain parts) contribute to sufficiently complex processes, such as, for example, those involved in language (for discussion, see Freud (1891), Lashley (1950), and, for an overview, Deacon (1989)).
The Hebbian proposal is in sharp contrast to both of these views. Cell assemblies with defined cortical topographies are envisaged to form the neurobiological representations of cognitive elements such as gestalt-like figures or words. This position is radically different from a localizationist approach, because it assumes that neurons in different cortical areas may be part of the same functional unit. The Hebbian viewpoint is also different from the holistic view that "everything is equally distributed", because it assumes that the distributed representation of, for example, an image may involve cortical areas entirely different from those contributing to the representation of, say, an odor. Accordingly, the representation of a word would not be restricted to a small cortical locus, but would be distributed over well-defined areas, for example over Broca's, Wernicke's, and some other areas.
The Hebbian model was based on three fundamental assumptions about cortical functioning which can be summarized as follows:
Hebb was frequently criticized, because his assumptions were considered to be too speculative and because some of his colleagues believed that his ideas would not be testable. Therefore, it is necessary to discuss his assumptions in the light of evidence presently available.
                      neuron L active    neuron L inactive
neuron M active             +w                  --
neuron M inactive           --                  --

+w indicates an increase in connection strength between neurons L and M; hyphens indicate no change in connection strength.
Table 1: Associative synaptic learning according to a Hebbian coincidence rule
Electrophysiological studies have demonstrated that many cortical and subcortical neurons strengthen their connections when they are frequently active at the same time. If a neuron, call it L, sends one connection to a second neuron, M, their synapse will strengthen when both are repeatedly active together, so that L will later have a stronger influence on M. Because this effect may last for many hours or days, or even longer, it has been termed long-term potentiation (LTP) (Ahissar et al., 1992; Gustafsson et al., 1987). After this kind of associative learning, connection strength will be a function of the frequency of coincident activity. Table 1 describes this kind of coincidence learning (Palm, 1982).
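The coincidence rule of Table 1 can be illustrated by a minimal Python sketch. The function names and the step size w = 0.1 are assumptions made for illustration only; the text does not specify how large the increment should be.

```python
def coincidence_update(l_active: bool, m_active: bool, w: float = 0.1) -> float:
    """Change in L-to-M connection strength under the coincidence rule (Table 1).

    Only joint activity of L and M strengthens the connection; every other
    combination leaves it unchanged. The value of w is a hypothetical choice.
    """
    return w if (l_active and m_active) else 0.0


def train_coincidence(events, strength: float = 0.0, w: float = 0.1) -> float:
    """Apply the rule to a sequence of (L active, M active) observations."""
    for l_active, m_active in events:
        strength += coincidence_update(l_active, m_active, w)
    return strength


# Three coincidences and one event in which only L is active:
# the final strength depends on the number of coincidences alone.
events = [(True, True), (True, True), (True, False), (True, True)]
strength = train_coincidence(events)
```

Under this rule the resulting strength is simply proportional to the number of coincident activations, which is what "a function of the frequency of coincident activity" amounts to in the simplest case.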
One may object to this and similar learning rules that coincidence learning is only one form of associative learning known to take place between neocortical neurons. If only one of the two neurons is active while the other one remains silent, this could also have an effect on the strength of their connection. In fact, it was shown by electrophysiological experiments that activation of presynaptic neuron L alone, while the membrane potential of postsynaptic neuron M is stable (or only slightly depolarizes), leads to a weakening of their synaptic connection (Artola et al., 1990; Artola & Singer, 1987; Artola & Singer, 1993; Rauschecker & Singer, 1979). Because this reduction (or depression) of the influence of one neuron on the other is long-lasting, the phenomenon has been called long-term depression (LTD). There is also evidence for LTD occurring when presynaptic neurons are silent while postsynaptic neurons frequently fire (Tsumoto, 1992; Tsumoto & Suda, 1979). Therefore, the original idea proposed by Hebb needs a slight but important modification: Connection strength is not only modified by coincident activity, it also changes if only one of two connected neurons is active while the other one is inactive. Table 2 describes this kind of learning, which will be called correlation learning below, because after this kind of synaptic modification, the strength of the synaptic connection will include information not only about the frequency of coincident firing of neurons, but, in addition, about how strong the correlation was between their activations.
This formulation is very general and, for example, does not make distinctions implied by more precise formulations of synaptic learning rules (Artola & Singer, 1993; Bienenstock et al., 1982; Tsumoto, 1992), in which, for example, the states called "active" and "inactive" above have been replaced by gradual activity levels (quantified in terms of the frequency of action potentials or the membrane potential of the postsynaptic neuron). In addition, the above formulations leave open the question of how the w-values should actually be chosen. Whereas w1 may be assumed to be larger than w2 and w3, the exact values of the variables are unknown. These questions will not be addressed here, because they have been discussed in great detail based on what is known about synaptic dynamics in the neocortex (Tsumoto, 1992) and in the light of storage properties of artificial associative networks (Palm, 1982; Willshaw & Dayan, 1990; Palm & Sommer, 1995). In the present context, it is most important to keep in mind that a correlation rule, rather than a coincidence rule, is a fundamental principle of synaptic learning in the cortex.
                      neuron L active    neuron L inactive
neuron M active            +w1                 -w2
neuron M inactive          -w3                 --

+w1, -w2 and -w3 indicate positive or negative changes in connection strength; hyphens indicate no change.
Table 2: Associative synaptic learning according to a correlation rule
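To make the difference between coincidence and correlation learning concrete, the following Python sketch applies the rule of Table 2 to two neuron pairs that coincide equally often but differ in how well their activity is correlated. The numerical values of w1, w2 and w3 are purely hypothetical (the exact values are unknown); they only respect the assumption that w1 is larger than w2 and w3.

```python
def correlation_update(l_active: bool, m_active: bool,
                       w1: float = 0.10, w2: float = 0.02, w3: float = 0.02) -> float:
    """Change in L-to-M connection strength under the correlation rule (Table 2).

    w1, w2 and w3 are illustrative values only, chosen so that w1 > w2, w3.
    """
    if l_active and m_active:
        return +w1          # coincident activity: strengthening (LTP)
    if m_active and not l_active:
        return -w2          # only postsynaptic M active: weakening (LTD)
    if l_active and not m_active:
        return -w3          # only presynaptic L active: weakening (LTD)
    return 0.0              # both silent: no change


def final_strength(events, start: float = 0.0) -> float:
    """Accumulate the changes over a sequence of (L active, M active) events."""
    strength = start
    for l_active, m_active in events:
        strength += correlation_update(l_active, m_active)
    return strength


# Both sequences contain five coincidences, so a pure coincidence rule
# could not tell them apart.
well_correlated = [(True, True)] * 5 + [(False, False)] * 5
poorly_correlated = [(True, True)] * 5 + [(True, False)] * 5

s_well = final_strength(well_correlated)
s_poor = final_strength(poorly_correlated)
```

With these assumed parameters the poorly correlated pair ends up with the weaker connection, so the final strength carries information about correlation and not merely about the number of coincidences.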
It appears uncontroversial that excitatory cortical neurons located close to each other are likely to have a synaptic contact. Although this probability is not 100% - it is actually far below that (Braitenberg, 1978a; Braitenberg & Schüz, 1991) - it is evident that adjacent neurons are much more likely to be connected than neurons located far apart, that is, in distant cortical areas (Young et al., 1995). It is, however, clear from neuroanatomical studies that most cortical pyramidal cells have long axons reaching distant areas or subcortical structures, and that connections from one area project to several other areas. In the macaque, for example, what may be considered the homologues of Broca's and Wernicke's areas are not only intensely connected to each other, they also exhibit connections to additional premotor, higher visual and association cortices (Deacon, 1992a; Deacon, 1992b; Pandya & Vignolo, 1971; Pandya & Yeterian, 1985). Therefore, if correlated neuronal activity is present in a large number of neurons in different cortical areas, some of these neurons will exhibit direct connections to each other. These neurons will become more strongly associated even if they are located far apart. Thus, although the cortex is not a fully connected associative memory in which every processing unit is connected to every other one, it still appears to be an associative network well suited to allow for both local and between-area associative learning (Braitenberg & Schüz, 1991; Fuster, 1994; Palm, 1982).
If neurons in an associative network exhibit correlated activity, they will acquire a stronger influence on each other. This implies that these neurons will be more likely to act together as a group. Hebb calls such anatomically and functionally connected neuron groups "cell assemblies". The strong within-assembly connections are likely to have two important functional consequences: (i) If a sufficiently large number of the assembly neurons are being stimulated by external input (either through sensory fibers or through cortico-cortical fibers), activity will spread to additional assembly-members and, finally, the entire assembly will be active. This explosion-like process has been called ignition of the assembly (Braitenberg, 1978b). (ii) After an assembly has ignited, activity will not stop immediately (due to fatigue or regulation processes), but the strong connections within the assembly will allow activity to be retained for some time. Cell assemblies are sometimes conceptualized as packs of neurons without ordered inner structure. However, according to Hebb's (1949) proposal, assembly neurons are connected so that ordered spreading and reverberation of neuronal activity can occur.
The latter point needs further elaboration: Figure 1 is taken from Hebb's 1949 book and depicts what the author believed to be a possible inner structure of an assembly. In this diagram, arrows represent subgroups of neurons included in the assembly. These subgroups would each become active at exactly the same point in time. Arrowheads indicate to which other subgroups a given subgroup would project, and numbers denote a possible activity sequence. After synchronous activity of the neurons represented by the arrow labeled "1", a wave of excitation will run through the assembly as indicated by the numbers, and activity will finally cease. Thus, it is evident that already in Hebb's early proposal, a cell assembly was conceptualized as a highly structured entity. Whereas ignition of the assembly may simultaneously involve all assembly neurons, there is also the possibility of a wave of excitation circulating and reverberating in the many loops of the assembly. The wave can be described as a spatio-temporal pattern of activity in which many cortical neurons participate.
The question whether cell assemblies that represent stimuli and cognitive entities exist in cortex was long believed to be impossible to test by empirical research. As mentioned earlier, this belief was probably one of the main reasons why Hebb's theory was not generally accepted in the 1940s and 1950s. However, more recent experimental work has provided strong evidence for the Hebbian ideas. Neurophysiological work by Abeles, Aertsen, Gerstein and their colleagues (Abeles, 1982; 1991; Abeles et al., 1993; 1994; Aertsen et al., 1989; Gerstein et al., 1989) revealed exactly timed spatio-temporal firing patterns in cortical neurons. The specific neuronal connections these patterns are probably related to were labeled synfire chains by Abeles, because a subpopulation of neurons must synchronously activate the next subpopulation in order to keep the chain going. Importantly, spatio-temporal activity patterns actually detected in cortical neurons frequently involve the repeated activation of a given neuron, thus suggesting reverberations due to loops in the chain (Abeles et al., 1993). Evidently, the concept of a reverberating synfire chain emerging from recent neurophysiological data comes very close to Hebb's original proposal summarized in Figure 1. In contrast to the original proposal, it appears more realistic to postulate connections not only between consecutive subpopulations of neurons, but, in addition, connections that skip subgroups, and directly link, for example, subgroups 1 and 3 in the example illustration (Figure 1). Such bypass connections may be realized by relatively slowly-conducting cortico-cortical fibers (Miller, 1996). Further, Abeles' findings suggest that the neuron subgroups represented by arrows in Hebb's diagram overlap, so that a given neuron can be part of, say, subgroups 1 and 7.
In summary, after its full activation (ignition), neuronal activity may reverberate in the loops of an assembly. Ignition and reverberation may represent important functional states of Hebbian cell assemblies. On the cognitive level, ignition may correspond to perception of a meaningful stimulus and to activation of its representation. The fact that an object partially hidden behind another one can frequently be identified can be explained by full ignition of a cell assembly after stimulation of only some of its neurons (Hebb, 1949). Sustained activity of the assembly and reverberation of activity therein may represent an elementary process underlying short-term or active memory (Fuster, 1989; 1994; Fuster & Jervey, 1981). The latter view arises from studies that evidence a systematic relationship between the occurrence of defined spatio-temporal activity patterns in cortex and particular engrams an experimental animal has to keep in active memory (Fuster, 1994; Villa & Fuster, 1992).
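Ignition after partial stimulation, and the sustained activity that follows it, can be illustrated with a deliberately simplified Python sketch. It assumes binary neurons, an assembly whose members are all connected to each other with equal strength, synchronous updates and a fixed firing threshold; fatigue and threshold regulation are left out. The assembly size, the threshold and all names are hypothetical choices, not values taken from the literature.

```python
def simulate_assembly(stimulated: int, assembly_size: int = 10,
                      threshold: int = 3, steps: int = 5) -> list:
    """Return the number of active assembly neurons at each time step.

    At step 0 the first `stimulated` neurons are activated by external input.
    Afterwards a neuron fires whenever at least `threshold` OTHER assembly
    members were active in the previous step (all-to-all, equal weights).
    """
    active = set(range(stimulated))
    history = [len(active)]
    for _ in range(steps):
        next_active = set()
        for neuron in range(assembly_size):
            # Input from active assembly members, excluding the neuron itself.
            drive = len(active - {neuron})
            if drive >= threshold:
                next_active.add(neuron)
        active = next_active
        history.append(len(active))
    return history


ignited = simulate_assembly(stimulated=3)   # enough input: activity spreads and persists
fizzled = simulate_assembly(stimulated=2)   # too little input: activity dies out
```

In this toy setting, stimulating three of ten neurons is enough for the whole assembly to become and remain active, whereas stimulating two is not. The sharp dependence on the number of initially stimulated neurons is the point of the sketch; the particular numbers carry no empirical weight.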
Recent neurophysiological work has revealed not only well-timed spatio-temporal activity patterns in cortical neurons related to memory processes; another line of research has uncovered stimulus-specific synchronization of activity in cortical neurons related to perceptual processes. If an elementary visual stimulus, for example a bar moving in a particular direction, is presented to an experimental animal, numerous neurons in various visual cortices in both hemispheres start to synchronize their firing and, in many cases, exhibit coherent rhythmic activity in a relatively high frequency range, that is, above 20 Hz (Eckhorn et al., 1988; Engel et al., 1990; 1991; Gray et al., 1989; Kreiter & Singer, 1992). [1] This provides further evidence that neurons in different areas are strongly coupled and can act as a unit. Although synchronization phenomena have been observed in subcortical structures and even in the retina (Neuenschwander & Singer, 1996; Sillito et al., 1994; Steriade et al., 1993), cortico-cortical connections are apparently necessary for synchronization of neuron responses in cortex (Gray et al., 1989; Engel et al., 1991; Singer & Gray, 1995). Because synchronized responses change with stimulus features, for example the direction in which a bar moves (Eckhorn et al., 1988; Gray et al., 1989; Gray & Singer, 1989), the idea receives support that there are stimulus-specific distributed neuron groups. It appears that these neurophysiological data can only be explained if cell assemblies are assumed that are (a) activated by specific external stimuli, (b) distributed over different cortical areas, and (c) connected through cortico-cortical fibers (and possibly additional subcortical connections).
These results can be interpreted as evidence for a simplified version of Hebb's theory according to which cell assemblies must synchronously oscillate at high frequencies when active. However, synchronous oscillations are a special case of well-timed activity (Abeles et al., 1993; Aertsen & Arndt, 1993). Therefore, these data are also consistent with the weaker position made explicit by Hebb that cell assemblies generate well-timed activity patterns in their many neurons. The latter position would imply that at least a fraction of the activated neurons (e.g., those forming one subgroup represented by an arrow in Figure 1) exhibit synchronized activity when the assembly reverberates (see Pulvermüller et al. (1997) for further discussion).
If it is taken into account that most cortico-cortical fibers conduct action potentials with velocities around 10 m/s or faster (Aboitiz et al., 1992; Miller, 1996), it becomes clear that a wave of activity running through and reverberating within an assembly will lead to rather fast activity changes. Suppose a large-scale physiological recording device, for example an electrode recording the local field potential - or even an EEG electrode or an MEG coil - is placed close to a fraction of the neurons of the assembly sketched in Figure 1. In this case, a reverberating wave of activity in the assembly will cause rather fast activity changes to be picked up by the recording device. If the neuronal subpopulations represented by arrows are assumed to be located in different cortical areas separated, say, by a few cm, it will take some hundredths of a second until neuronal activity has travelled the loop labelled 1-2-3 and the neurons denoted by the first arrow (the first and the fourth in the sequence) become synchronously active for the second time. It follows that synchronous and fast reverberating activity in the assembly is most likely to lead to spectral dynamics in the high frequency range (> 20 Hz) recorded by the large-scale device. [2]
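The order-of-magnitude argument above can be retraced with a short Python calculation. The loop geometry (three legs of 5 cm each) and the per-synapse delay of 2 ms are assumptions introduced only for this illustration; only the conduction velocity of about 10 m/s and the separation of "a few cm" come from the text.

```python
def loop_period_s(n_legs: int, leg_length_m: float,
                  velocity_m_per_s: float, synaptic_delay_s: float = 0.0) -> float:
    """Time for activity to travel once around a loop of n_legs connections.

    Each leg contributes its conduction time (length / velocity) plus an
    optional synaptic delay. The synaptic delay is a hypothetical parameter.
    """
    return n_legs * (leg_length_m / velocity_m_per_s + synaptic_delay_s)


# Assumed loop 1-2-3: three legs of 5 cm, fibers conducting at 10 m/s,
# and an assumed 2 ms delay at each synaptic relay.
period_s = loop_period_s(n_legs=3, leg_length_m=0.05,
                         velocity_m_per_s=10.0, synaptic_delay_s=0.002)
frequency_hz = 1.0 / period_s
```

Under these assumptions one circulation takes roughly two hundredths of a second, which corresponds to a repetition rate well above 20 Hz. Shorter distances or faster fibers would shift the estimate to higher frequencies, slower fibers to lower ones.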
If specific dynamics in high-frequency cortical activity are taken as an indicator of reverberating activity in Hebbian cell assemblies, the question whether particular cognitive processes are related to high-frequency dynamics becomes particularly relevant for further testing the Hebbian ideas. It is known from animal experiments that if the receptive fields of two neurons in visual cortices are each stimulated by a moving bar and both stimuli are aligned and together move in the same direction, neuron responses can synchronize their fast rhythmic activity. However, if one neuron is stimulated by a bar moving in a particular direction, while the other is stimulated by a bar moving in the opposite direction, synchrony of rhythmic responses vanishes (Engel et al., 1991). This result and similar findings indicate that synchrony of high-frequency neuronal activity reflects gestalt criteria, for example the fact that two objects move together (Singer, 1995; Singer & Gray, 1995). Consistent with this finding in animals, patterns of regularly moving bars have been found to evoke stronger high-frequency electrocortical responses recorded in the EEG compared to irregular bar patterns (Lutzenberger et al., 1995). Further support for the role of high-frequency cortical activity in cognitive processing comes from studies of electrocortical responses to attended and unattended stimuli (Tiitinen et al., 1993). Most importantly, gestalt-like figures such as Kanizsa's triangle led to stronger high-frequency EEG responses around 30 Hz compared to physically similar stimuli that are not perceived as a coherent gestalt (Tallon et al., 1995; Tallon-Baudry et al., 1996). Thus, dynamics of high-frequency responses appear to be an indicator of the cognitive process of gestalt perception.
These results are consistent with the idea that gestalts, such as a coherent bar pattern or a triangle, activate cortical cell assemblies that generate coherent high-frequency responses, whereas physically similar stimuli that are not perceived as coherent gestalts lack cortical representations and, therefore, evoke desynchronized electrocortical responses. Therefore, the idea that cell assemblies are relevant for cognitive processing not only receives support from recordings in animals' brains, it is also consistent with non-invasive recordings of human brain activity using large-scale recording techniques such as EEG.
In summary, recent theoretical and empirical research provided support for the existence of Hebbian cell assemblies and for their importance for cognitive brain processes. However, it must be noted that, based on experimental and theoretical work, the Hebbian concept and the assumptions connected with it have slightly changed. Some of these modifications are summarized in the following postulates (which are closely related to points (1) to (3) above):
Future empirical testing of the modified Hebbian framework is, of course, necessary and neuroimaging techniques make it possible to perform such testing, although techniques available at present do not allow for localizing each member of a widely distributed neuron set in different cortical areas. If an assembly ignites and stays active, signs of activity should be visible in single cell and multiple unit responses, local field potentials and more global electrocortical activity, and possibly also in metabolic changes in the brain. The cortical topography of these activity signs may allow for some conclusions on assembly topographies. In addition to general signs of activity enhancement - enhanced blood flow, larger event-related potentials, more powerful single cell responses - changes in well-timed high-frequency cortical responses may include information about reverberatory neuronal activity in cell assemblies.
It may be appropriate at this point to mention possible theoretical problems of the Hebbian approach, some of which have been summarized in a recent article by Milner (1996). If an ignition takes place, there is danger that activity will spread to additional assemblies and finally the entire cortex or even brain, resulting in overactivity such as seen during seizures. In order to avoid this, it is necessary to have a control device regulating the cortical equilibrium of activity. This device has been called "threshold control mechanism" (Braitenberg, 1978b) and its neuroanatomical substrate has been proposed to be located in the basal ganglia (Miller & Wickens, 1991; Wickens, 1993) or, as an alternative, in the hippocampus (Fuster, 1994). Furthermore, if a large number of cell assemblies are built up in the cortex, this may lead to an increase in average connection strength, and, in the worst case, to a clumping together of all assemblies. This would make it impossible to activate representations individually. However, this problem primarily occurs if a coincidence learning rule is assumed (Table 1). If LTD rules are added (for example in the case of correlation-based learning as sketched in Table 2), simultaneous activity of a set of cortical neurons will not only lead to synaptic strengthening between them, but, in addition, to weakening of connections to neurons outside the set (Hetherington & Shapiro, 1993; Palm, 1990; Willshaw & Dayan, 1990). In this case, the problem will only occur if w-parameters (see Table 2) are chosen inappropriately. It has also been argued that the cell assembly framework is not flexible enough to allow for a representation of complex objects. If a house includes a door and a window, how would the respective representations relate to each other? Here, it is necessary to allow for hierarchical organizations of cell assemblies: One assembly may be a subset of another one. 
This is also important for the semantic representations of words with similar meanings, for example for hyponyms and hyperonyms. Adjustment of the global activation threshold may account for whether the set or its subset is being activated (Braitenberg, 1978b). Furthermore, concepts that have features in common may be represented in cell assemblies that share some of their neurons. These assemblies will, therefore, not be entirely different neuron sets, but they will overlap. The relations of inclusion and overlap can be realized quite naturally within a cell assembly-theory built upon the Hebbian notion (Braitenberg, 1978b; Palm, 1982). Therefore, a modified version of the original Hebbian proposal appears to be well-suited to provide neurobiological answers to important questions in cognitive science.
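The relations of inclusion and overlap can be pictured by treating each assembly as a set of neuron labels, as in the following Python sketch. The neuron numbers and the assembly names are arbitrary placeholders chosen for illustration; they do not correspond to any measured topography.

```python
# Each assembly is modelled as a set of (arbitrary) neuron labels.
house = frozenset(range(0, 12))
door = frozenset(range(0, 4))        # contained in the "house" assembly
window = frozenset(range(4, 8))      # also contained in "house"
garage = frozenset(range(10, 16))    # shares some neurons with "house"

# Inclusion: one assembly is a subset of another (hierarchical organization).
door_in_house = door <= house
window_in_house = window <= house

# Overlap: two assemblies share some, but not all, of their neurons.
shared = house & garage
overlapping = bool(shared) and not (garage <= house) and not (house <= garage)
```

The sketch captures only the set-theoretic side of the proposal. How a threshold control mechanism would select between a set and its subset is not modelled here.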
3. Cortical distribution of cell assemblies
In recent years, the Hebbian idea of distributed assemblies with defined cortical topographies has been incorporated into large-scale neuronal theories of language and other cognitive functions (Abeles, 1991; Braitenberg & Schüz, 1991; Damasio, 1989; Edelman, 1992; Elbert & Rockstroh, 1987; Fuster, 1994; Gerstein et al., 1989; Mesulam, 1990; Miller & Wickens, 1991; Palm, 1982; Pulvermüller, 1992; Singer, 1995; Wickens et al., 1994). At this point, there appears to be a consensus that neurons in distant cortical areas can work together as functional units. However, the Hebbian framework not only postulates that there are large-scale neuronal networks, it also provides clear-cut criteria for the formation of cell assemblies and, therefore, straightforward predictions on assembly topographies.
For assembly formation, Hebb (1949) outlines the following scenario (p. 235f): If a particular object is frequently being visually perceived, a set of neurons in visual cortices will repeatedly become active at the same time. Therefore, a cell assembly will form representing the shape of the object. This assembly is distributed over cortical regions where simultaneous neuronal activity is evoked by visual stimulation, that is, in primary and higher-order visual cortices in the occipital lobes, for example in Brodmann's (1909) areas 17, 18, 19 and 20. For convenience, Figure 2 displays a lateral view of the left cortical hemisphere on which the approximate locations of Brodmann's areas are indicated. If correlated neuronal activity is caused by input through other sensory modalities, or if it is related to motor output, the cortical distribution of the co-activated set of neurons will be different. For example, if motor behavior co-occurs with sensory stimulation, cell assemblies may form including neurons in motor and sensory cortices. To put it in a more general manner, the cortical localization of a representation is a function of where in the cortex simultaneous activity occurred when the representation had been acquired or learned.
While correlated neuronal activity of a connected cortical neuron set is a sufficient condition for cell assembly formation to occur, correlated occurrence of sensory stimuli is not. In the most extreme case, when an individual is asleep, correlated stimuli (for example in the somatosensory and acoustic modality) may not cause enough cortical activity to lead to synaptic strengthening. The same may be true in an individual exhibiting very low arousal. Furthermore, the amount of cortical activation caused by a stimulus depends on whether or not it is being attended (Heinze et al., 1994; Mangun, 1995). Thus, in order to make it possible for correlated stimuli to induce synaptic learning, sufficient arousal and attention to these stimuli appear necessary, and synaptic learning may be a function of how much attention is being directed to relevant stimuli. In the following considerations it will be tacitly assumed that correlated stimuli receive a sufficient amount of attention from the learning individual in order to allow long-lasting changes of synaptic connections to occur.
3.1 Assemblies representing word forms
Turning to language, it appears relevant to ask where in the cortex correlated neuronal activity occurs during verbal activities at early ontogenetic stages, when language learning takes place (Pulvermüller, 1992; Pulvermüller & Schumann, 1994). The infant's repeated articulations of syllables during the babbling phase are controlled by neuronal activity in inferior motor, premotor and prefrontal cortices (Brodmann areas 4, 44, 45). One may well envisage that one specific synfire chain controls the articulation of a given syllable and thus represents the articulatory program (Braitenberg & Pulvermüller, 1992). In addition to and simultaneously with cortical activity related to motor programs, specific neurons in the auditory system are being stimulated by the sounds produced during articulation (Braitenberg & Schüz, 1992; Fry, 1966). These neurons are localized in primary and higher-order auditory cortices (superior temporal lobe; Brodmann areas 41, 42 and 22). Furthermore, somatosensory self-stimulation during articulatory movements evokes activity in somatosensory cortices (inferior parietal lobe; areas 1-3 and 40). Therefore, neuronal activity can be assumed to be present almost simultaneously in specific primary and higher-order motor and sensory (auditory and somatosensory) cortices. All of these areas are within the first gyrus surrounding the Sylvian fissure, the so-called perisylvian cortex (Bogen & Bogen, 1976). Neuroanatomical evidence from monkeys suggests that the perisylvian areas are strongly and reciprocally connected, whereby long-distance connections between areas anterior to motor, adjacent to primary auditory and posterior to primary somatosensory cortex are particularly relevant (Deacon, 1992b; Pandya & Yeterian, 1985; Young et al., 1995).
Given that the necessary long-distance connections are available, it follows by learning rule (1') (see also Table 2) that the co-activated neurons in the perisylvian areas develop into cell assemblies (Braitenberg, 1980; Braitenberg & Schüz, 1992; Braitenberg & Pulvermüller, 1992; Pulvermüller, 1992). Figure 3 represents an attempt to sketch such a perisylvian assembly. The individual circles in this diagram are thought to represent local clusters of strongly connected neurons. On the psychological level, the network may be considered the organic counterpart of a syllable frequently produced during babbling, or the embodiment of the phonological form of a word acquired later during language acquisition.
The Hebbian framework implies that different gestalts and word forms have distinct cortical assemblies, because perception of these entities will activate different but possibly overlapping populations of neurons. If a language is not learned through the vocal and auditory modalities, but through the manual and visual modalities (sign languages), cortical localization of cell assemblies representing meaningful elements should be different. Because gestures are performed with the hands and perceived through the eyes, they are related to neuronal activity further away from the Sylvian fissure (more superior motor cortices and occipital visual cortices). Thus, meaningful gestures included in sign languages must be assumed to be represented in these extra-perisylvian visual, motor and association cortices (Pulvermüller, 1992).
In assuming cell assemblies distributed over perisylvian cortices, the Hebbian perspective is in apparent contrast to older localizationist models according to which motor and acoustic representations of words are stored separately in Broca's (areas 44 and 45) and Wernicke's regions (posterior part of area 22), respectively (Geschwind, 1970; Lichtheim, 1885; Wernicke, 1874). The Hebbian view implies that the motor and acoustic representations of a word form are not separate, but that they are strongly connected so that they form a distributed functional unit. For this unit to function properly, both motor and acoustic parts need to be intact. This is important for the explanation of aphasias, in particular of the fact that these organic language disturbances in the majority of cases affect all modalities through which language is being transmitted. Whereas localizationist models have great difficulty explaining this (see, for example, Lichtheim (1885) for discussion), a cell assembly model can account for the multimodality of most aphasias. [3] Furthermore, the assumption that word form representations are distributed over inferior frontal and superior temporal areas receives support from imaging studies revealing simultaneous activation of both language areas when words or word-like elements are being perceived (Zatorre et al., 1992; Mazoyer et al., 1993; Fiez et al., 1996).
3.2 Cortical lateralization
From the Hebbian viewpoint, localization of language mechanisms is determined by associative learning and by the neuroanatomical and neurophysiological properties of the learning device (the cortex). The cortical loci where simultaneous activity occurs during motor performance and during sensory stimulation follow from the wiring of efferent and afferent cortical connections which are genetically determined. Genetic factors are also important for the formation of cortico-cortical fiber bundles which are a necessary condition for long-distance association of co-activated neurons located in different areas. Furthermore, a pure associationist approach may have difficulty explaining why, in most right-handers, the left hemisphere - but not the right - is necessary for many aspects of language processing. Left hemispheric "language dominance" is evident from lesion studies in adults and in infants (Woods, 1983), and from psychophysiological experiments in young children demonstrating that stronger language-specific electrocortical activity can be recorded from the left hemisphere than from the right (Molfese & Betz, 1988; Dehaene-Lambertz & Dehaene, 1994). Neuroanatomical correlates of language laterality have been found in the size of perisylvian areas (Galaburda et al., 1978; 1991; Geschwind & Levitsky, 1968; Steinmetz et al., 1990), and in size (Hayes & Lewis, 1993), ordering (Seldon, 1985) and dendritic arborization (Jacobs et al., 1993; Jacobs & Scheibel, 1993; Scheibel et al., 1985) of pyramidal cells in language areas. For differences in size of particular areas, epigenetic processes appear to be very important (Steinmetz et al., 1995).
It is well-known that differences in cell size and dendritic arborization may be due to sensory stimulation and motor output (Diamond, 1990; Diamond et al.1967) and, consistent with this view, language laterality has been proposed to be due to environmental factors, for example to lateralized auditory stimulation before birth (Previc, 1991). Such stimulation may well underlie some of the morphological asymmetries mentioned. However, there are also arguments for a contribution of genetic factors to language lateralization (Annett, 1979). At this point, it therefore appears safer not to dismiss a possible role of genetics here. For the Hebbian framework to operate, an anatomical substrate is necessary and this substrate is determined by genetic factors. Nevertheless, given the brain with its pre-programmed input and output pathways, its specific cortico-cortical projections and its probably genetically determined left-hemispheric preference for language, the Hebbian approach leads to highly specific hypotheses about cortical distribution of language-related processing units.
One of these hypotheses concerns the cortical realization of laterality of language. According to localizationists, language processes take place in only one hemisphere. The Hebbian framework suggests a different view. Although genetic and/or environmental factors lead to stronger language-related activation of left perisylvian cortex when language is being produced or perceived, articulation of a word form is probably controlled by bi-hemispheric activity in motor regions, and acoustic perception of the word certainly leads to activation of bilateral auditory cortices. Because neurons in both hemispheres are co-activated when a word form is being produced or perceived, the cell assembly representing the word form should be distributed over both perisylvian cortices (Mohr et al., 1994; Pulvermüller & Mohr, 1996; Pulvermüller & Schönle, 1993). However, if the left hemisphere's neurons are more likely to respond to language stimuli and to control precisely timed articulations, cell assemblies representing word forms would be lateralized to the left in the following sense: They include a large number of neurons in the left hemisphere and a smaller number of neurons in the right. According to this view, a lateralized cell assembly is not restricted to one hemisphere, but a greater percentage of its neurons would be in the "dominant" hemisphere and a smaller percentage in the "non-dominant" hemisphere (Pulvermüller & Mohr, 1996).
What would be the cause of this? Given that genetically programmed differences in the hemispheres' anatomical and physiological properties are the cause of lateralization of cognitive functions, it becomes important to develop ideas about how left/right-differences in the "hardware" could influence the "software". Based on an extensive and profound review of neuroanatomical and neurophysiological asymmetries, Robert Miller (1987; 1996) recently proposed that axonal conduction times in the left hemisphere are slightly slower, on average, compared to the right. According to Miller, this may lead to a bias in favor of the left hemisphere for storing short time delays, such as are important for distinguishing between certain phonemes (Liberman et al., 1967). For example, the probability of finding a neuron that specifically responds to a /p/, but does not respond to a /b/, may be greater in the left hemisphere than in the right, because neurons with slowly conducting axons that could be used as delay lines for hardwiring the long (> 50 ms) voice onset time of the voiceless stop consonant would be more common in the left hemisphere. The availability of axons with particular conduction times may also be relevant for attributing additional distinctive features to acoustic input (Sussman, 1988; 1989). If neurons sensitive to certain phonetic features have a higher probability of being housed in the left hemisphere, the neuron ensemble representing a phonological word form should finally be lateralized to the left. Although Miller's theory of cortical lateralization needs further support by empirical data, it clearly shows how hemispheric specialization at the cognitive and functional level may arise from basic neuroanatomical and physiological differences between the hemispheres.
3.3 Word categories
Associative learning may not only be relevant for the cortical representation of word forms, it may also play an important role in the acquisition of word meanings. When the meaning of a concrete content word is being acquired, the learner may be exposed to stimuli of various modalities related to the word's meaning, or the learner may perform actions the word refers to. Although such stimulus and response contingencies are certainly not sufficient for full acquisition of word meanings (Gleitman & Wanner, 1982; Landau & Gleitman, 1985) - they would, for example, not allow one to distinguish between the morning and the evening star (Frege, 1980) - they may nevertheless have important brain-internal consequences. From the Hebbian viewpoint, it is relevant that neurons related to a word form become active together with neurons related to perceptions and actions reflecting aspects of its meaning. If this co-activation happens frequently, it will change the assembly representing the word. Co-activated neurons in motor, visual and other cortices and the perisylvian assembly representing the word form will develop into a higher-order assembly. A content word may thus be laid down in cortex as an assembly including a phonological (perisylvian) and a semantic (mainly extra-perisylvian) part (Pulvermüller, 1992).
After such an assembly has formed, the phonological signal will be sufficient for igniting the entire ensemble, including the semantic representation and, vice versa, the assembly may also become ignited by input only to its semantic part. [4] Thus, frequent co-occurrence and correlation of word form and meaning-related stimuli is only necessary at some point during the acquisition process. Later on, the strong connections within the higher-order assembly guarantee ignition of the entire assembly when part of it is being activated and, thus, they guarantee a high correlation of activity of all assembly parts, and, therefore, endurance of the assembly.
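The idea that input to one part of a strongly connected assembly ignites the whole can be illustrated with a toy threshold network. The following is only a minimal sketch and not part of the model as published: the unit counts, the weight values (1.0 inside the assembly, 0.1 elsewhere), the firing threshold, and the function name `ignite` are all illustrative assumptions, and units are assumed to stay active once they have fired.

```python
def ignite(weights, stimulated, threshold=1.5, max_steps=10):
    """Spread activity through a network of binary threshold units.

    A unit becomes active when the summed weight from currently active
    units reaches the threshold. Active units never switch off here,
    which is a deliberate simplification.
    """
    n = len(weights)
    active = set(stimulated)
    for _ in range(max_steps):
        updated = set(active)
        for j in range(n):
            drive = sum(weights[i][j] for i in active if i != j)
            if drive >= threshold:
                updated.add(j)
        if updated == active:  # no further spreading
            break
        active = updated
    return active


# Hypothetical layout: units 0-2 form the phonological (perisylvian) part,
# units 3-5 the semantic part, units 6-7 lie outside the assembly.
N_UNITS = 8
ASSEMBLY = set(range(6))
weights = [
    [0.0 if i == j else (1.0 if i in ASSEMBLY and j in ASSEMBLY else 0.1)
     for j in range(N_UNITS)]
    for i in range(N_UNITS)
]

from_word_form = ignite(weights, {0, 1, 2})   # input to phonological part
from_meaning = ignite(weights, {3, 4, 5})     # input to semantic part
from_one_unit = ignite(weights, {0})          # too little input to ignite

print(sorted(from_word_form), sorted(from_meaning), sorted(from_one_unit))
```

Under these assumed values, stimulating either half recruits all six assembly units and leaves the two outside units silent, whereas a single active unit does not provide enough drive for ignition.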
When phonological word forms become meaningful, quite different cortical processes may take place, depending on what kind of information is being laid down in the associative network. Hebbian associationist logic suggests that cortical representations radically differ between words of different vocabulary types. In the following paragraphs, a few such differences will be discussed.
3.3.1 Content and function words: Neurons activated by stimuli related to the meaning of most concrete content words (nouns, adjectives and verbs) are likely to be housed in both hemispheres. For example, the visual perceptions of objects that can be referred to as a "mouse" will probably activate equal numbers of left- and right-hemispheric neurons, because a corresponding visual stimulus is equally likely to be perceived in the right and left visual half-fields, and, in many cases, will be at fixation so that half of it is projected to the left visual field (right hemisphere) and the other half to the right visual field (left hemisphere). Therefore, if word form representations are strongly lateralized to the left, the assemblies representing content words (word form plus meaning) will be less strongly lateralized. Assemblies with different degrees of laterality are sketched in Figure 4.
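The dilution of laterality described here can be made concrete with a small arithmetic sketch. The neuron counts below are invented purely for illustration, and the index (L - R) / (L + R) is a commonly used convention for expressing laterality, not a measure proposed in the text.

```python
def laterality_index(left, right):
    """Return (L - R) / (L + R): +1 is fully left, 0 symmetric, -1 fully right."""
    return (left - right) / (left + right)


# Hypothetical neuron counts (left hemisphere, right hemisphere).
word_form = (80, 20)  # strongly left-lateralized perisylvian part
semantic = (50, 50)   # meaning-related neurons, equal in both hemispheres

# A function word is assumed to consist of the word form part only;
# a content word adds the bilateral semantic part.
function_word = word_form
content_word = (word_form[0] + semantic[0], word_form[1] + semantic[1])

li_function = laterality_index(*function_word)
li_content = laterality_index(*content_word)
print(f"function word: {li_function:.2f}, content word: {li_content:.2f}")
```

With these made-up counts, adding an equal number of left- and right-hemispheric semantic neurons halves the index (from 0.6 to 0.3) without changing the lateralization of the word form part itself.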
In contrast to content words with concrete and well-imaginable meaning, function words such as pronouns, auxiliary verbs, conjunctions and articles primarily serve a grammatical purpose. Many of them significantly contribute to the meaning of sentences, for example "and", "or", "not", and "if". However, their meaning cannot be explained based on objects or actions the words refer to. Rather their meaning appears to be a more complex function of their use (Wittgenstein, 1967) and can only be learned in highly variable linguistic and non-linguistic contexts. Evidently, the correlation between the occurrence of a particular function word and certain stimuli or actions is low. Therefore, there is no reason why the perisylvian assembly representing the word form should incorporate additional neurons. If this is correct, assemblies representing function words remain limited to the perisylvian cortex and strongly left-lateralized in typical right-handers.
Note that this argument depends on the formulation of the cortical learning rule. If coincidence of neuronal activity were the factor causing synaptic modification, function words should have widely distributed cell assemblies, because these words occur in a multitude of stimulus constellations and, in addition, they occur much more frequently compared to most content words (Francis & Kucera, 1982; Ortmann, 1975). When a function word (for example the article "the") is being learned, it may be used with various content words ("the cat", "the dog", "the horse") and, if there is a systematic relationship between the use of the content words and the occurrence of non-linguistic stimuli (e.g., animal pictures), there will be a strong coincidence between the occurrences of each of these non-linguistic stimuli and the word form. If only coincidence learning took place, cell assemblies representing function words should include even more neurons in visual cortices than most content word assemblies, because the assembly representing the function word would incorporate all neurons related to coincident visual non-linguistic stimuli. However, if connections weaken when only pre- or only postsynaptic neurons fire (Table 2), the relatively infrequent co-occurrence of the function word with each of the visual stimuli will guarantee that its assembly does not become associated with representations of any of these visual stimuli. In essence, correlation of neuronal activity is relevant for synaptic strengthening in the cortex, and this implies that function words are represented in cell assemblies restricted to perisylvian areas, or, at least, that they do not include large numbers of neurons outside.
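The difference between the two learning rules can be illustrated with a toy simulation. This is a hedged sketch rather than the rule specified in Table 2: the trial schedule, the gain and loss values, the clipping of weights to the range 0 to 1, and the function name `train` are assumptions made only for illustration. In the assumed schedule, a cat picture and the word "cat" occur together on every tenth trial, while "the" occurs on six of every ten trials, including all trials with the picture.

```python
def train(word_on, stim_on, rule, gain=0.05, loss=0.02):
    """Return the word-stimulus connection weight after all trials.

    Both rules strengthen the connection when word and stimulus units
    fire together. Only the "correlation" rule also weakens it when
    exactly one of the two fires.
    """
    w = 0.0
    for word, stim in zip(word_on, stim_on):
        if word and stim:
            w += gain
        elif rule == "correlation" and (word or stim):
            w -= loss
        w = min(1.0, max(0.0, w))  # keep the weight within [0, 1]
    return w


trials = range(1000)
cat_picture = [t % 10 == 0 for t in trials]  # picture on every 10th trial
word_cat = list(cat_picture)                 # "cat" only with the picture
word_the = [t % 10 < 6 for t in trials]      # "the" on 60% of trials

results = {
    (word, rule): train(pattern, cat_picture, rule)
    for word, pattern in (("cat", word_cat), ("the", word_the))
    for rule in ("coincidence", "correlation")
}
for key, weight in sorted(results.items()):
    print(key, round(weight, 2))
```

Under these assumptions, the coincidence rule links "the" to the picture as strongly as it links "cat", whereas the correlation rule keeps the "cat" link and drives the "the" link back to zero, because "the" mostly occurs without that particular picture.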
3.3.2 Abstract content words: One may argue that the postulated difference in semantic meaning between content and function words does not apply to all members of these vocabulary classes. Rather, it appears that there is a continuum of meaning complexity between the "simple" concrete content words that have clearly defined entities they can refer to (so-called referents), more abstract items that may or may not be used to refer to objects and actions, and function words that cannot be used to refer to objects. It may, therefore, appear inappropriate to make a binary distinction between vocabulary classes based on semantic criteria. If semantic criteria are crucial for intra-cortical representation, the suggested gradual differences in the correlation between word form and meaning-related stimuli or actions should be reflected in gradual differences in cortical lateralization and distributedness of assemblies. An abstract content word, such as "philosophy", may therefore have an assembly somewhat in-between typical content and function word assemblies: It may exhibit an intermediate degree of laterality, mainly consisting of perisylvian neurons but including a few neuron clusters outside perisylvian areas.
Among the abstract content words are words referring to emotional states, for example "anger" and "joy". For these words, it is not difficult to find characteristic visual stimuli related to their meaning - for example angry or joyful faces. In addition, there are characteristic meaning-related patterns of muscle activity - namely the contraction of the respective face muscles - and autonomic nervous system activity (Ekman et al., 1983; Levenson et al., 1990). It should therefore be noted that, although these words do not refer to objects and actions in the sense in which the word "house" refers to an object, the likely co-occurrence of body movements, visual stimuli and patterns of muscle contractions with the word forms may nevertheless lead to the formation of widely distributed cortical cell assemblies representing these words. In addition to cortical neurons added to the word form representations during learning, these assemblies have been proposed to acquire additional links to subcortical neurons in structures of the limbic system related to emotional states (Pulvermüller & Schumann, 1994). "Emotion words" may therefore be represented by a cortical assembly plus a limbic assembly-tail. The amygdala and the frontal septum may be the most important structures that link the cortical assembly to its subcortical tail (Schumann, 1990).
These considerations should make it clear that the degree of abstractness of an item is not the only factor influencing assembly topographies. According to the present proposal, the important criterion is the strength of the correlation of the occurrences of a given word form and a class of non-linguistic stimuli or actions. In the clear cases, this likelihood is related to abstractness, but there are exceptions.
3.3.3 Action words, perception words, and other word classes: Content words are used to refer to odors, tastes, somatic sensations, sounds, visual perceptions, and motor activities. During language learning, word forms are frequently produced when stimuli the words refer to are perceived or actions they refer to are carried out by the infant. If the cortex is an associative memory, the modalities and processing channels through which meaning-related information is being transmitted must be important for formation of cortical assemblies. This has inspired recent models of word processing in the brain postulating distinct cortical representations for word classes that can be distinguished based on semantic criteria (Warrington & McCarthy, 1987; Warrington & Shallice, 1984).
If the modality through which meaning-related information is transmitted determines the cortical distribution of cell assemblies, a fundamental distinction between action and perception words can be made. Action words would refer to movements of one's own body and would, thus, be frequently used when such actions are being performed. In this case, a perisylvian assembly representing the word form would become linked to neurons in motor, premotor and prefrontal cortices related to motor programs. Perception words whose meaning can best be explained using prototypical stimuli would consist of a perisylvian assembly plus neurons in posterior cortex. In many cases, visual stimuli are involved and the respective word category may therefore be labelled vision words. Assemblies representing words of this category would be distributed over perisylvian and visual cortices in parietal, temporal and/or occipital lobes. Figure 5 presents sketches of the assembly types postulated for action and vision words. Examples of words whose meaning is related to the visual modality are concrete nouns with well-imaginable referents, such as animal names. The best examples of action words are in the category of action verbs.
This model draws too simple a picture of the relation between word forms and their meanings, because it does not explain homonymy (Bierwisch, 1982; Miller, 1991). If a phonological word form has two exclusive meanings - if it can, for example, be used as a noun with one meaning or as a verb with another meaning (to/the beat) - a mechanism must be assumed that realizes the exclusive-or relationship between the two meanings. As suggested earlier, homonyms could be represented by overlapping cell assemblies, that is, by two content word assemblies sharing one perisylvian phonological part. Inhibition between the semantic assembly parts is unlikely to be wired in cortex, because the percentage of cortical inhibitory neurons is low and these neurons are usually small (Braitenberg & Schüz, 1991). Intracortical inhibitors would therefore be unlikely candidates for mediating inhibition between cortical areas - for example between assembly parts in frontal and occipital lobes. However, such mutual inhibition between overlapping assemblies could be realized by striatal connections (Miller & Wickens, 1991). Accordingly, homonymic content words may be realized as widely distributed assemblies sharing their perisylvian part while inhibiting each other through striatal connections. This wiring would allow the perisylvian word form representation to become active together with only one of its "semantic" assembly parts (see Pulvermüller (1992) for further discussion). [5]
The argument made above for action and vision words can be extended to words referring to stimuli perceived through other modalities. For those, additional word categories - odor, taste, pain, touch, and sound words - can be postulated. Members of these word classes should be represented in assemblies with specific cortical topographies. Whereas, for example, an assembly representing a pain or touch word may include substantial numbers of neurons in somatosensory cortices, sound words may have exceptionally high numbers of neurons in bilateral auditory cortices included in their assemblies. Again, it must be stressed that neurons responding to stimuli of various modalities and neurons controlling body movements and actions are located in both hemispheres. It is for this reason that cell assemblies representing these words are assumed to be distributed over both hemispheres and to be less strongly lateralized compared to assemblies representing function words (Pulvermüller & Mohr, 1996).
The definition of action words is particularly delicate, because not all action-related associations involve the motor modality. Here it is important to distinguish movements which are performed by one's own body from movements that are visually perceived. "To fly" or "the plane", for example, are words which are frequently heard by a child when it perceives certain moving visual stimuli. Although a relation of visual stimuli to the motor modality can hardly be denied - because perception of visual stimuli is usually accompanied by eye movements related to neuronal activity in frontal eye fields - this eye movement-related neuronal activity is probably not very stimulus-specific (similar saccades are made when looking at different objects). Therefore, the correlation between visual input patterns and the occurrence of the word forms "fly" or "plane" may be highest and these words may, thus, be organized in assemblies including a significant number of neurons in visual cortices responding to specific moving contours. These words should therefore be classified not as action words but as vision words of a certain kind (as words referring to visually perceived movements). On the other hand, action words as defined above, that is, words usually referring to movements of one's own body, may include movement detectors in visual cortices in their assemblies. Many body movements are visually perceived when they are performed, suggesting that sensory-motor assemblies are being established for representing these actions - an idea for which there is ample support from recent studies (Fadiga et al., 1995; Gallese et al., 1996; Rizzolatti et al., 1996). These considerations indicate that Figure 5 draws too crude a picture of cell assemblies representing action words. Such assemblies can include additional neurons in visual cortices primarily processing movement information - many of which are probably located in the posterior part of the middle temporal gyrus (Zeki et al., 1991).
A similar point can be made for somatosensory stimulations caused by body movements, suggesting that neurons in parietal cortices may also be added to the assembly representing an action word.
Further word class-distinctions can be made based on the cortical areas active during meaning-related motor activity. Different kinds of action words can be distinguished considering the muscles most relevant for performing the actions (to chew, to write, to kick), the complexity of the movement (to knock, to write), and the number of muscles involved (to nod, to embrace). These factors may "shift" the neurons in frontal lobes added to the perisylvian assembly in the inferior/superior (mouth/hand/foot representation) or anterior/posterior direction (complex/simple movements), or enlarge/reduce their cortical distribution (many/a few muscles involved in movement).
Similar, more fine-grained distinctions are desirable for vision words. Some vision words refer to static (house), others to moving objects (train), some refer to colors or colored objects (iguana), others to objects lacking colors (penguin), and, furthermore, some visual stimuli are very simple (line) while others are more complex (square, cube, house, town, megalopolis). This suggests that different sets of neurons are being added to the assembly when contingencies between words and different kinds of visual stimuli are being learned. The assembly of a word usually referring to colors or colored objects may include neurons maximally responding to color, and, as discussed above, neurons sensitive to moving visual stimuli may be included in the assemblies representing words referring to such stimuli. Recently, cortical processing streams have been discovered in temporal lobes that are primarily concerned with movement or color information from the visual input (Corbetta et al., 1990; Watson et al., 1993; Zeki et al., 1991). If movement detecting cells are more frequent in one area, for example in the posterior middle temporal gyrus, and neurons in primary and secondary visual cortex that respond to color preferentially project to other areas, for example in the inferior temporal lobe, this would suggest that words referring to colors or colored objects are realized as assemblies including additional neurons in color areas (e.g., in the inferior temporal gyrus), and words referring to visually perceived movements have assemblies comprising additional neurons in visual movement areas (in the middle temporal gyrus).
It is important to stress (1) that word types defined in this way [6] do not necessarily have a congruent lexical category - most verbs but not all of them are action words, and there may be action words from other lexical categories - and (2) that it is not always clear from theoretical consideration to which category a particular word should be assigned. Most concrete content words probably exhibit a high correlation with stimuli of more than one modality, and their presentation may, therefore, remind subjects of multimodal stimuli. While verbs referring to body movements are likely to be action words, and many concrete nouns (such as animal names) are almost certainly vision words, other word groups - for example nouns referring to tools - probably lead to both visual and motor associations. When evaluating the present ideas about word class-differences related to word meaning in neuroscientific experiments, it is, therefore, most important to quantitatively assess semantic associations elicited by word stimuli. The only way to do this is by asking study participants.
4. Cortical activation during word processing: predictions and methodological remarks
Cognitive brain theories lead to empirical predictions on psychophysiological studies. However, testing such predictions is not trivial. In the case of language, it is particularly difficult to design experiments and interpret their results, because there are so many possible confounds to which, for example, a physiological processing difference between two stimulus words could be attributed. Further, the subtraction logic used in many imaging studies of cognitive processes has frequently been criticized, and one may prefer designs that could prove more useful in testing precise predictions on cognitive processes of comparable complexities.
After summarizing selected predictions derived from the Hebbian model (4.1), the subtraction logic underlying many imaging studies will be contrasted to what will be called the double dissociation approach to neuroimaging (4.2), and, finally, methodological issues specific to the investigation of word processing will be addressed (4.3).
4.1 Predictions on brain processes of word processing
Hebbian logic suggests that content and function words, and words referring to actions and perceptions have different neurobiological counterparts. The cell assemblies representing these lexical elements may differ with regard to their laterality and cortical topography. While all assemblies representing words are assumed to include a strongly lateralized perisylvian part, neurons outside perisylvian language areas (and in both hemispheres) would only be added to the assembly if words refer to actions and perceivable objects. If assembly topographies are a function of semantic word properties, signs of cortical activity should differ when these different assemblies are being activated. [7] Based on these ideas, one would expect
(i) function words to evoke strongly left-lateralized signs of cortical activity restricted to perisylvian cortices,
(ii) content words to evoke less lateralized signs of cortical activity in perisylvian areas and outside,
(iii) action words to evoke additional activity signs in motor cortices of frontal lobes [8], and
(iv) vision words to evoke additional activity signs in visual cortices of occipital and inferior temporal lobes.
These are some of the predictions based on the above considerations that relate to the where question. When the assumptions leading to these predictions were discussed in Section 3, the why question was traced back, in each case, to a Hebbian learning rule postulating that correlated neuronal activity is the driving force of assembly formation. With regard to the how question, it is important to recall that cell assemblies were assumed to exhibit two functional states, namely ignition (or full activation) and reverberation (or sustained partial activity). When outlining empirical tests of the cell assembly framework and its application to language, one may not only be interested in testing predictions about assembly topographies, but one may also want to think about possibilities to distinguish and detect possible physiological signs of ignition and reverberation. As detailed in Section 2, ignition may be reflected in a sudden spreading of neuronal activity shortly after stimulation, whereas reverberation would follow ignition and could become visible in high-frequency brain responses. Therefore, the following additional predictions are possible:
(v) Shortly after stimulation, signs of cell assembly ignition are simultaneously present at the cortical loci where the assembly is located.
(vi) After a longer delay, signs of reverberation emerge in the same areas.
It is not possible to deduce the exact point in time when these putative physiological processes take place. However, because words are recognized quite fast - for example, lexical decisions, that is, judgements on whether letter strings are real words or not, can be made as early as half a second after onset of written stimuli - it is clear that the postulated physiological process of cell assembly activation must take place within the first few hundred milliseconds after the stimulus has been presented.
While numerous additional predictions can be derived from the discussion in Section 3, Sections 5 and 6 will focus on hypotheses (i) - (vi). These hypotheses will be discussed based on results from psychophysiological experiments.
4.2 Subtractions versus double dissociations in psychophysiology
In psychophysiology, numerous neuroimaging techniques are available for investigating higher cognitive processes. Activity of large neuron ensembles can be visualized using electrophysiological recording techniques, such as electroencephalography (EEG) and magnetoencephalography (MEG). These techniques provide exact information about temporal dynamics of electrophysiological activation and deactivation processes that occur in the millisecond range. They also allow for localization of sources, although such localization is usually much less precise compared to imaging of brain metabolism. Metabolic imaging techniques with high spatial resolution, such as positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), are extremely valuable for localizing brain structures that maximally become active and, therefore, increase their metabolic rates during cognitive tasks. However, the metabolic methods only give a rough picture of temporal dynamics of brain processes, and it is, therefore, important to use both electrophysiological and metabolic imaging techniques when investigating brain processes of cognitive functions.
It is necessary to recall that important information about where, why and how cognitive processes take place in the human brain was obtained before modern imaging techniques became available. Most of these studies used the individuals' behavior as the dependent measure. In addition, studies of neurological patients with focal lesions can answer the question of which brain structures are necessary for particular cognitive operations (Jackson, 1878; Jackson, 1879). Studies of healthy individuals in whom stimulus information reaches only one hemisphere - for example using the technique of lateralized tachistoscopic presentation of visual stimuli - can provide important insights into the hemispheres' roles in language processing (Hellige, 1993; Pulvermüller & Mohr, 1996). Together with such neuropsychological evidence, modern neuroimaging and psychophysiological data can provide even stronger conclusions about language mechanisms in the human brain (Posner & Raichle, 1994).
In recent years, a large number of imaging studies of word processing have been carried out, many of which are relevant for evaluating the Hebbian model outlined above. When interpreting these results, it is necessary to consider basic methodological issues. Giving an overview of all methodological problems that may become relevant is outside the scope of the present article (see, for example, Posner and Raichle (1995) and comments therein). Rather, two important points will be mentioned briefly: the so-called subtraction logic and the question of stimulus matching, both of which are crucial for investigating word class-differences.
Various dependent measures recorded by large-scale imaging techniques are usually interpreted as signs of cortical activity. However, the exact mechanisms by which an increase in cortical activation, that is, in the frequency of excitatory postsynaptic potentials in a set of neurons, may lead to an increase of the CO2 concentration in numerous blood vessels, to an increase of intracellular glucose levels, to an enhancement of biomagnetic signals, or to a more positive or negative event-related brain potential are not sufficiently well understood to make quantitative predictions possible. For example, one may predict that higher glucose metabolism or event-related potentials with higher amplitudes are present in or close to inferior prefrontal cortex during processing of a given word class, but quantification of the expected difference, for example in terms of microvolts, would not be possible. Ultimately, even the rationale underlying the more/less-logic may be flawed, because an increase in biomagnetic activity or an enhancement of cortical metabolism may be due to activation of inhibitory neurons (Mitzdorf, 1985; Posner & Raichle, 1995). Nevertheless, at least in cortex, excitatory neurons represent the majority (> 85 percent of cortical neurons are excitatory), and they are, on average, much larger than inhibitory neurons (Braitenberg & Schüz, 1991). Further, the function of inhibitory neurons is probably to control excitatory activity in cortex, rather than to process more specific information. It is therefore possible, but not likely, that an enhancement of large-scale measures of cortical activity exclusively reflects inhibitory processes on the neuronal level. (This may be more likely for structures with high percentages of inhibitory neurons, such as the striatum.) Thus, in the large majority of cases, it appears possible to draw conclusions from large-scale neuroimaging dependent measures on activity changes in large numbers of excitatory neurons in cortex.
The logic underlying all imaging work is that a dependent measure indicates a difference in brain activity between two conditions. In most cases, a critical condition is compared to a baseline or control condition. In the simplest case, looking at an empty computer screen or at a fixation cross may be compared to reading words or to making lexical decisions on these stimuli. Using a more complex design, the task of silently reading a word may be compared to generation of a verb that somehow relates to the meaning of a displayed word. If an area of cortex is found to "light up" in such an experiment, one can conclude that the perceptual, cognitive or motor operations induced by the two conditions differ with regard to neuronal activity in this particular area.
Unfortunately, however, in many experiments there are several differences between critical and control conditions. For example, the tasks of looking at an empty screen and of making lexical decisions on words appearing on the screen differ in several respects: with regard to perception - either a word or nothing is perceived; with regard to higher cognitive processes - the stimulus has to be classified as a real word or as a meaningless element, or nothing has to be done; and with regard to motor activity - either a button press is required or not. Likewise, the comparison of silently reading a noun (e.g., cow) with silently generating a word that refers to an activity related to the object the noun refers to (e.g., to milk, to buy) involves quite different cognitive processes. Although identical words may be displayed in the two conditions and no overt response may be required, the two conditions differ because only one of them requires strong attention and involves search processes, semantic inferences, repeated lexical access, etc. (see also the discussion in Posner and Raichle (1995)). Finally, another difference between the reading and the generation task is that verbs are involved only in the latter (while nouns are being read in both conditions). If an area is found to "light up" in the generation condition compared to the reading condition, it is not clear which of the many different cognitive processes relates to the difference in brain activity. The difference may even be used to evaluate prediction (iii) above (because action verbs are probably used in only one of the conditions), but, of course, if the prediction is met, the experimental result would not provide strong support for it, due to the many possible confounds.
A solution to the problem may lie in a more careful selection of the conditions and stimuli that are being compared. If, for example, silently reading words is compared to reading random letter strings made up of the same letters, one may believe that, in this case, the critical and the control conditions differ only with regard to well-defined linguistic processes, such as word form identification and processing of semantic information. However, the objection can be raised that processing of words is not even necessary in such conditions, because random letter strings can frequently be distinguished from real words by looking only at the first three letters of the items and by deciding whether these letters can be combined according to the phonological rules of the language the real words are taken from. Thus, word processing could be avoided by experiment participants in these conditions. In order to allow conclusions on processes specific for words, even more similarity between the stimulus classes should be demanded. For example, only letter strings that are in accord with the phonological rules of the language could be allowed as pseudowords, and lexical decisions could be required so that experiment participants would be forced to attend to and process the stimuli. In this case, a neuroimaging difference between conditions could be attributed to the difference between word and pseudoword processing, although, from a psycholinguistic perspective, these processes may differ in various respects (including word form identification, semantic processes, and the use of a "time out" strategy for rejecting pseudowords (Jacobs & Grainger, 1994; Mohr et al., 1994; Grainger & Jacobs, 1996)). Nevertheless, a difference in brain activity between these conditions would allow stronger conclusions on the cortical processes induced by the words.
In many cases, two conditions are being compared where condition 1 is considered to induce a subset of the processes induced in condition 2. The subtraction of the brain responses would then be interpreted as reflecting the psychological processes condition 2 exhibits, but 1 lacks. Subtractions can be performed repeatedly, so that a hierarchy of conditions corresponds to a set of subtractions (Posner & Raichle, 1995). However, the principal problems remain, namely (I) that each pair of conditions may differ in more than one psychological process, making it difficult to attribute a physiological contrast to one of them, and (II) that statistical criteria for the comparison of two conditions are difficult to choose if multiple pairs of physiological data are compared. If many comparisons are being made (when data from tens of channels or thousands of voxels are contrasted), the likelihood of a difference occurring by chance is high. On the other hand, if critical significance levels are adjusted to reduce the likelihood of significant results (for example, by following a Bonferroni logic), an actual difference between brain responses in two conditions may be masked, because the overly rigid statistical criterion is almost impossible to reach (Wise et al., 1991).
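The trade-off described under (II) can be made concrete with a small calculation. The following Python sketch assumes, purely for illustration, that the comparisons are statistically independent (neighbouring channels or voxels generally are not) and uses hypothetical counts of 32 channels and 10,000 voxels; the function names are illustrative and not taken from any of the cited studies.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise
    error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    # Hypothetical counts: a single test, 32 channels, 10,000 voxels.
    for n_tests in (1, 32, 10_000):
        fwer = familywise_error_rate(alpha, n_tests)
        per_test = bonferroni_threshold(alpha, n_tests)
        print(f"{n_tests:>6} tests: uncorrected family-wise error = {fwer:.3f}, "
              f"Bonferroni per-test alpha = {per_test:.2e}")
```

Under these assumptions, the uncorrected chance of at least one spurious difference approaches certainty as the number of comparisons grows, whereas the corrected per-test criterion becomes so strict that a real but moderate difference is unlikely to reach it.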
The only way to avoid problem (I) appears to be choice of maximally similar experimental conditions. To investigate word class-specific processes, a good option appears to be a comparison of two psycholinguistically similar stimulus classes while the experimental task is kept constant in conditions 1 and 2. In order to reduce the risk of obtaining by-chance-results with standard significance criteria (II), more risky predictions can be derived and tested. One way to do this is to predict interactions between topographical variables and stimulus classes, rather than only a more or less in activity at a not yet specified locus. In the best case, condition 1 and condition 2 would induce quite similar cognitive processes, but condition 1 would induce a process not induced by 2, and, vice versa, condition 2 would also induce a specific process not induced by 1. Based on theoretical predictions, processing of stimuli of class 1 in the task chosen may then be assumed to activate a set A of cortical loci not activated by class 2, whereas stimuli of class 2 processed in the same task would be assumed to activate a different set B of areas not activated by 1. (Of course there may be additional areas C activated by both classes.) The brain areas activated by the two conditions or stimulus types would be distinct, and each set of areas would include loci not included in the other. This can be called a physiological double dissociation. The prediction to be tested by analysis of variance would be that direct comparison of the two activation patterns leads to a significant interaction of the task variable with the topography variable. It is very unlikely that such a prediction is being verified by chance in a neuroimaging experiment, in particular if the loci where differences are actually found have been specified before the experiment based on theoretical considerations. 
The rationale underlying this is very similar to the logic used in neuropsychology, where double dissociations are taken as strong evidence for processing differences (Shallice, 1988; 1989) - although the dependent measure is behavioral in neuropsychology, but physiological in psychophysiology.
In summary, one perspective on overcoming some of the problems of a simple subtraction logic in neuroimaging experiments is offered by a double dissociation approach to psychophysiology. In this approach, physiological signs induced by maximally similar tasks - or even patterns of brain activation caused by matched stimuli of different classes in the same task - are being compared, and the prediction would be that class 1 of stimuli activates cortical loci A more strongly than class 2, whereas class 2 induces stronger activity signs than class 1 at distinct loci B. With regard to the present discussion, classes 1 and 2 may represent different word categories - for example action and vision words - and loci A and B would then be large sets of cortical areas - for example motor versus visual cortices.
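The predicted pattern can be sketched as a simple interaction contrast. The Python example below is a minimal illustration, not the analysis used in any of the studies discussed: it assumes one activity value per participant for each combination of stimulus class (1, 2) and set of loci (A, B), and all names are hypothetical. For such a 2 x 2 within-subject design, the squared t statistic of the per-subject difference of differences equals the F value of the class-by-locus interaction in an analysis of variance.

```python
import math
import statistics
from typing import NamedTuple, Sequence


class SubjectData(NamedTuple):
    """Hypothetical activity measures for one participant."""
    class1_at_A: float
    class2_at_A: float
    class1_at_B: float
    class2_at_B: float


def interaction_t(subjects: Sequence[SubjectData]) -> float:
    """One-sample t statistic of the per-subject interaction contrast
    (class 1 - class 2 at A) - (class 1 - class 2 at B)."""
    contrasts = [
        (s.class1_at_A - s.class2_at_A) - (s.class1_at_B - s.class2_at_B)
        for s in subjects
    ]
    mean = statistics.mean(contrasts)
    # Standard error of the mean contrast; needs at least two subjects
    # whose contrasts are not all identical.
    sem = statistics.stdev(contrasts) / math.sqrt(len(contrasts))
    return mean / sem


def is_crossover(subjects: Sequence[SubjectData]) -> bool:
    """True if, on average, class 1 dominates at A and class 2 dominates at B."""
    diff_at_A = statistics.mean(s.class1_at_A - s.class2_at_A for s in subjects)
    diff_at_B = statistics.mean(s.class1_at_B - s.class2_at_B for s in subjects)
    return diff_at_A > 0 and diff_at_B < 0
```

A significant interaction alone does not establish a double dissociation; the crossover check expresses the additional requirement that each class is the stronger one at its own set of loci. The t statistic would be evaluated against a t distribution with n - 1 degrees of freedom, where n is the number of participants.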
4.3 Word properties affecting brain processes
Given that comparable stimulus materials are used in an imaging experiment on processing differences between word classes, the expectation would be that defined cortical areas "light up" when members of a given word class are being processed (see predictions (i) - (iv)). But what would "comparable" mean in this case? Behavioral studies in which response times and accuracies of responses were exactly measured have clearly shown that various properties of stimuli influence information processing in the brain, and many of these results from behavioral studies could be confirmed by psychophysiological experiments. Imaging techniques with good spatial resolution have been in use for only a few years and, therefore, many methodological studies on the influence of stimulus properties have not yet been performed using these techniques. When evaluating imaging studies of word processing, it is essential to keep in mind the stimulus properties for which behavioral and earlier psychophysiological studies have demonstrated strong effects on brain processes.
Words can vary along several dimensions. The naive observation that long words are more difficult to read than short ones is paralleled by the observation that words of different length elicit different electrocortical responses measured in the EEG. This appears to be the case regardless of whether the items are presented acoustically (Woodward et al., 1990) or visually (Kaufman, 1994). A second important factor influencing behavioral and physiological responses to words is whether they are common or exceptional. In contrast to pictures or real objects, for which it is difficult to estimate whether they are frequently or rarely being perceived, the frequency of words can be exactly determined based on the evaluation of large corpora of spoken or written text. Word frequency is well known to strongly influence response times and accuracies of word processing (see, for example, Bradley (1978); Mohr et al. (1996)). In addition, word frequency has a strong influence on cortical potentials evoked by word presentation (Polich & Donchin, 1988; Rugg, 1990; Rugg & Doyle, 1992). Because certain word classes exhibit enormous differences in word frequencies, this variable may affect the outcome of studies of word class differences. For example, most function words are in the highest frequency range, while only a small percentage of the content words can be found in this high range, and most content words are used only rarely. Thus, word frequency is a likely confounding factor of experimental results about differences between word classes.
Additional possible confounds of word category differences are related to psychological processes induced by the stimuli. Some words are more arousing than others: the word "spider" may lead to much more pronounced brain activity in an arachnophobic patient compared to "beetle", and normal individuals may exhibit similar differences in brain responses. That event-related potentials reliably differ between more or less arousing words has been shown by numerous studies (Chapman et al., 1980; Johnston et al., 1986; Naumann et al., 1992; Williamson et al., 1991), and there is also evidence that a variable called "valence", that is, the degree to which the stimulus is evaluated as positive or negative, can have an effect on event-related potentials. Therefore, there is some reason to believe that what has been called the "affective meaning" of words (Osgood et al., 1975) can influence the brain processes these stimuli induce. Stimulus matching for the variables valence and arousal therefore appears desirable - except, of course, if the role of these variables in word processing is the subject of the experiment.
Another variable strongly affecting behavioral and physiological responses to word stimuli is the context in which they are being presented. There are different types of context effects. They can be elicited not only if words are presented in well-formed or ill-formed sentences, but also when words are presented one by one. If a word occurs twice in the same experiment, event-related potentials are usually more positive-going for the second occurrence (see, for example, Smith & Halgren (1987); Rugg (1985)). The repetition effect appears to be quite complex and can interact with other variables, for example word frequency (Rugg, 1990). Therefore, if a physiological difference is observed between words of different frequencies that are repeatedly presented in the same experiment, it cannot be decided to which variable the difference should be attributed.
Context effects can also occur between different words that are semantically related (semantic priming). Presentation of a prime word changes electrocortical signs of activity elicited by a subsequently presented target which is semantically related to the prime (Holcomb & Neville, 1990; Nobre & McCarthy, 1994; Rugg, 1985). Similar priming effects may also occur when a word is being presented in sentence context. A pronounced negative deflection is seen when meaningful words appear at the end of a sentence where they are highly uncommon (Kutas & Hillyard, 1980), and different brain waves have been identified that may indicate different forms of syntactic or semantic violations (Neville et al., 1991; Osterhout & Holcomb, 1992). While there are several different effects of sentence context on word-evoked potentials, at least one of these effects appears to be quite similar to the effect induced by semantic priming (Van Petten, 1993). Most importantly, context effects are not necessarily the same for all word classes (Besson et al., 1992). As mentioned above for the effects of word frequency and word repetition, sentence context effects may also vary between word classes. Event-related potentials elicited by content words are attenuated by a sentence context provided that semantic and syntactic restrictions are met by the sentence. In contrast, function words also show attenuation of event-related potentials when presented in semantically deviant strings that still preserve some basic sentence-like structure (Van Petten & Kutas, 1991). If words are presented in sentences or in sentence-like word strings, it may well be that not only the effect of a stimulus word is seen in the neurophysiological response, but a complex blend of the effects of the critical word, its preceding words, and their semantic and syntactic relations.
The various context effects may therefore either artificially produce word class differences, or they may mask real processing differences between word classes.
When investigating brain processes distinguishing between word classes, it appears necessary to keep in mind these effects of word length, word frequency, emotional (arousal and valence) properties of the stimuli, as well as those of word repetition, priming and syntactic and semantic sentence context. These properties of word stimuli may confound results of any imaging study revealing differences in brain activity evoked by two word groups. Only if such confounds are excluded can a strong conclusion on differences between lexical or semantic word categories be drawn. [9]
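As a rough illustration of how stimulus matching might be checked before an experiment, the following Python sketch compares two word lists on length and on log-transformed frequency using Welch's t statistic. This is a sketch under stated assumptions, not a procedure taken from the studies cited: the function names are hypothetical, each item is assumed to be a (word, occurrences per million) pair drawn from some corpus, and further variables such as arousal or valence ratings could be added in the same way.

```python
import math
import statistics
from typing import Dict, Sequence, Tuple

Item = Tuple[str, float]  # (word, occurrences per million in some corpus)


def welch_t(x: Sequence[float], y: Sequence[float]) -> float:
    """Welch's t statistic for two independent samples (each of size >= 2)."""
    diff = statistics.mean(x) - statistics.mean(y)
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    if se == 0.0:
        # Both samples are constant: identical means count as no difference.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def matching_report(class1: Sequence[Item], class2: Sequence[Item]) -> Dict[str, float]:
    """Return Welch's t for word length and log10 frequency (class1 minus class2)."""
    measures = {
        "length": lambda item: float(len(item[0])),
        "log_frequency": lambda item: math.log10(item[1]),
    }
    return {
        name: welch_t([f(i) for i in class1], [f(i) for i in class2])
        for name, f in measures.items()
    }
```

A t value close to zero on every variable is compatible with, but does not prove, adequate matching; with small stimulus sets, item-by-item pairing on each variable is the more conservative choice.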
5. Brain activity during word processing: where?
In this Section, studies on the cortical areas activated during word processing will be discussed. The main question will be whether there is evidence for or against predictions (i) - (iv). Studies on differences between content and function words will be dealt with in Section 5.1, and Section 5.2 will be concerned with action and vision words and related categories.
5.1 Content and function words
Neuropsychological work clearly indicates that different brain areas are necessary for processing content and function words. Whereas aphasic patients with anomia have difficulty finding content words (Benson, 1979), function words are more difficult to produce for patients with agrammatic aphasia (Caramazza & Berndt, 1985; Pick, 1913). Aspects of agrammatics' deficit in language comprehension can also be explained based on the assumption that they have a selective deficit in processing these lexical items (Pulvermüller, 1995a). Lesions within the entirety of the perisylvian region can be the cause of the agrammatic language disturbance (Vanier & Caplan, 1990). In contrast, lesions at various cortical sites outside left-hemispheric perisylvian cortices can lead to selective impairment in using or comprehending word categories included in the content word vocabulary (see discussion in Section 5.3). If function word representations are assumed to be restricted to perisylvian cortices (see Figure 3), and content word representations are assumed to be more widely distributed (see examples in Figure 5), a perisylvian lesion will destroy a large percentage of neurons included in function word representations, but will only remove a smaller part of the representations of content words. In contrast, lesions outside the perisylvian region will only affect representations of content words. Thus, different cortical distributions of cell assemblies representing content and function words can account for the double dissociation in processing content and function words in specific aphasic impairments such as agrammatism and anomia (Pulvermüller, 1995a; Pulvermüller & Preissl, 1991).
In addition, evidence from behavioral experiments in healthy individuals using lateralized tachistoscopic presentation has provided further support for processing differences between content and function words. It is well known that, in right-handed individuals, words presented in the right visual hemifield (and, thus, to the left hemisphere) exhibit a processing advantage over words presented in the left visual hemifield (to the right hemisphere) ("right visual field advantage"; see, for example, Bradley (1978)). In behavioral experiments, these effects can be exactly quantified in terms of response times and accuracies. A frequently applied paradigm is lexical decision, where words and matched meaningless pseudowords are presented in random order and study participants have to indicate whether an item is a legal word or not. In lexical decision experiments, the "right visual field advantage" has been found to be stronger for function words compared to content words matched for word frequency and length (Chiarello & Nuding, 1987; Mohr et al., 1994). For function words, direct stimulation of the left hemisphere leads to faster or more accurate responses compared to stimulation of the right hemisphere. This is consistent with the idea that cell assemblies representing function words are strongly lateralized to the left (Section 3.3.1). The weaker or even absent right visual field advantage for content words supports the idea that cell assemblies underlying content word processing are less lateralized (Mohr et al., 1994).
Several studies investigating event-related potentials (ERPs) have been conducted in search of differential brain activity induced by content and function words. Garnsey's (1985) early experiment revealed a fine-grained word class-difference in event-related potentials uncovered by principal component analysis. Neville, Mills and Lawson (1992) presented content and function words in sentence context and had subjects indicate whether the sentences made sense or not. Words of the two classes were not matched for word length or frequency. These authors reported a left-lateralized component evoked by function words which peaked at 280 ms post stimulus onset, whereas a peak more symmetrical over the hemispheres was evoked by content words at 350 ms. A similar result was obtained by Nobre and McCarthy (1994), who used stimuli matched for word length but not for word frequency. These authors presented words one by one and their subjects studied the sequence while trying to detect words of a particular semantic class. Again, a left-lateralized negative peak followed function word presentation (latency: 288 ms), whereas content words led to an enhanced negativity (latency: 364 ms) which was more symmetrical over the hemispheres. Gevins, Cutillo and Smith (1995) used a cued two stimulus-paradigm in which subjects had to indicate whether two stimuli were similar according to phonological, syntactic or semantic criteria. These authors reported a lateralized positivity (latency: 445 ms) elicited by function words which was most pronounced over left frontal regions, whereas content words failed to elicit a late lateralized component. However, these authors do not report stimulus lengths or frequencies and it is, therefore, not possible to exclude the most likely confounds.
In an experiment comparing brain responses to content and function words matched for word frequency and word length (Pulvermüller et al., 1995a), in which study participants had to make speeded lexical decisions, a negative-going wave which peaked around 160 ms after onset of visual stimuli revealed a significant interaction of the factors word class and hemisphere. The peak in the event-related potential was equally visible over both hemispheres after presentation of content words, but it was pronounced over the left hemisphere and reduced over the right when function words were processed. Mean event-related potentials obtained between 150 and 300 ms after stimulus onset revealed a significant hemisphere by word class interaction (left/right difference strong for function words, but minimal or absent for content words).
It is important to point out some of the differences between these studies. For example, the tasks to be performed by participants were very different (lexical decision, sentence judgement etc.). In spite of these differences, all of these experiments revealed differences in electrocortical responses between the major vocabulary types. Results were very similar in the studies by Neville et al. and by Nobre and McCarthy. In both cases an early left-lateralized component was found after function words and a component symmetrical over the hemispheres followed content words after a longer delay. In Gevins et al.'s results, function words led to a left-lateralized component which occurred much later compared to both earlier studies, and, again, no such lateralized component was present for content words. In our study, we found no word class difference in latencies of event-related potentials, but this study again confirmed the observation of a left-lateralized component evoked by function words and a component symmetrical over the hemispheres evoked by content words. Thus, all of these studies agreed on the finding of left-lateralized electrocortical responses to function word presentation and less or even absent lateralization of potentials evoked by content words.
Checking these studies against possible confounds reveals the following: Words were presented in sentence context only in Neville et al.'s experiment, whereas context effects are likely to play a minor role in the remaining studies. Matching of stimuli for word length was performed for Nobre and McCarthy's and for Pulvermüller et al.'s experiment. Only the latter study used content and function words matched for word frequency. As already pointed out in Section 4.3, the issue of frequency matching is of particular relevance for electrocortical content/function word differences, because there are data (reported by King and Kutas (1995)) indicating that latency differences may be due to different word frequencies of the stimuli chosen from the two vocabulary classes. After frequency-matching of stimuli, word class-differences in latencies of event-related potentials indeed vanished. However, the differences in laterality of electrocortical responses to content and function words were still present with frequency-matched stimuli (Pulvermüller et al., 1995a). Therefore, the difference in laterality - rather than the difference in latency - appears to be characteristic for the major word classes. [10]
These studies are consistent with predictions (i) and (ii) proposed in Section 4.1. A possible explanation for the differences in cortical laterality of brain responses to content and function words is that specific cortical representations of these stimuli have different degrees of laterality. At present, there is no strong evidence from neuroimaging that content and function word representations are differently distributed within each hemisphere, although neuropsychological data support this view (Pulvermüller et al.1996c; Vanier & Caplan, 1990). However, recent preliminary PET data indicate that this prediction may also be correct (Nobre et al.1997).
The Hebbian viewpoint suggests that differences in cortical loci involved in representing and processing words depend on semantic word properties. However, the summarized studies do not include information about which of the many properties distinguishing content and function words are crucial for differential brain activation induced by these stimuli. Content and function words not only differ with regard to semantic criteria (e.g., only the former can be used to refer to objects and actions), they also belong to different lexical categories, and even their phonological structure may be different. In order to find out whether semantic factors are indeed crucial, it is necessary to compare words that share phonological and lexical properties and differ only in their meaning. In a study investigating event-related brain potentials to nouns with concrete and abstract meaning, electrocortical responses were also found to be different over the hemispheres (Kounios & Holcomb, 1994). Abstract nouns led to an interhemispheric difference in electrocortical activity, whereas concrete nouns evoked similar responses over both hemispheres. This is consistent with the assumption that semantic differences underlie differential laterality of event-related potentials to concrete and abstract nouns. One may argue that this result makes it plausible that the same is true for the difference between content and function words, although this suggestion cannot be proven to be correct at present. However, consistent with this view, the high degree of abstractness of function words is paralleled by a strong interhemispheric difference in event-related potentials, and the smaller degree of abstractness of abstract nouns is paralleled by a weaker interhemispheric difference evoked in a lexical decision task. [11]
This pattern of results is in agreement with the assumption of strongly lateralized cell assemblies representing function words, weakly lateralized assemblies representing concrete content words, and a moderate degree of laterality for assemblies representing abstract content words (see Section 3.3.2). Therefore, the view that the degree of laterality of brain responses to words reflects semantic stimulus properties receives support from the summarized psychophysiological studies.
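The notion of a graded "degree of laterality" can be made concrete with a simple index. The sketch below is an illustration only: the normalized index (|L| - |R|) / (|L| + |R|) is a common convention and is not taken from the studies cited above, and the function name and all amplitude values are hypothetical.

```python
def laterality_index(left, right):
    """Normalized left/right difference of response magnitudes.

    Returns a value between -1 (purely right) and +1 (purely left);
    0 means the response is symmetrical over the hemispheres.
    This index is an assumed convention, not a measure from the cited studies.
    """
    total = abs(left) + abs(right)
    if total == 0:
        return 0.0  # no response on either side: treat as symmetrical
    return (abs(left) - abs(right)) / total


# Hypothetical amplitudes (arbitrary units), chosen only to illustrate
# the ordering function words > abstract nouns > concrete nouns.
function_li = laterality_index(3.0, 1.0)
abstract_li = laterality_index(2.5, 1.5)
concrete_li = laterality_index(2.0, 2.0)
```

With these invented values the index is largest for function words, intermediate for abstract nouns and zero for concrete nouns, which mirrors the ordering of laterality described in the text.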
5.2 Action and vision words
If the cortical distribution of word representations is determined by the cortical pathways through which meaning-related information is transmitted, differences in cortical localization should distinguish not only representations of content and function words but also words that differ in their motor and visual associations: nouns and verbs, or animal and tool names, should have cell assemblies with different cortical topographies. The Hebbian model, and probably any associationist approach, suggests that semantic word class differences determine differences in cortical representations. Most importantly, however, based on a Hebbian associationist model, semantic differences between word categories can be used to generate predictions about the cortical areas that are involved in processing words of such categories. As discussed in paragraph 3.3.3, words eliciting strong visual associations can be expected to be represented and processed in perisylvian and additional visual cortices in inferior temporal and occipital areas, whereas words with strong motor associations would be expected to involve additional motor areas in the frontal lobe. Concrete nouns referring to animals or large man-made objects appear to be examples of typical vision words, verbs referring to actions usually performed by humans probably are typical action words, and words referring to tools may evoke both strong motor and visual associations.
Neuropsychological data clearly indicate that focal brain lesions can affect these word categories to different degrees. Whereas lesions in temporal and/or occipital regions sometimes selectively impair processing of nouns, lesions in frontal areas have been reported to be associated with deficits in processing verbs (Damasio & Tranel, 1993; Daniele et al.1994; Goodglass et al.1966; Miceli et al.1984). There is also evidence for more fine-grained disturbances primarily affecting, for example, words referring to small man-made objects, such as tools, or words referring to living entities, such as animals (Damasio et al.1996; Warrington & McCarthy, 1983; 1987; Warrington & Shallice, 1984). The relationship between anatomical lesion site and category-specific deficit has not yet been investigated systematically for all cortical lobes. However, studies of lesions in the left temporal lobe indicate that damage in the middle part of the inferior temporal lobe most strongly impairs naming of animals, whereas more posterior lesions involving inferior and middle temporal gyri result in a more pronounced deficit in naming tools (Damasio et al.1996). The idea that cell assemblies representing words of different semantic and lexical categories have different cortical distributions therefore receives some support from neuropsychological research, although it is not yet clear whether all of the more exact predictions about the cortical loci involved can be verified.
Imaging work that might reveal clues about processing differences between nouns and verbs was frequently carried out after Petersen et al. (1989) and Wise et al. (1991) reported that verb generation involved cortical areas less activated during noun reading. These authors and several more recent investigations used PET to measure brain activity while experiment participants either read visually presented nouns (reading task) or tried to generate verbs that "go with" the nouns (verb generation task). [12] If "car" is being presented, generation of "drive" or "race" may be expected. For evaluation, brain activity maps from the reading task were subtracted from those from the verb generation task. Significantly enhanced brain metabolism in a particular area during the generation task was attributed to cognitive processes necessary for verb generation and not necessary for reading nouns.
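The subtraction logic described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis pipeline of the cited PET studies: the function name, the array layout (subjects by voxels) and the toy values are hypothetical, and the statistical testing that the original studies applied to the difference maps is omitted.

```python
import numpy as np


def subtraction_map(generation_maps, reading_maps):
    """Mean per-voxel difference: generation task minus reading task.

    Both inputs are arrays of shape (n_subjects, n_voxels) holding one
    activity map per subject and task (layout assumed for illustration).
    """
    generation_maps = np.asarray(generation_maps, dtype=float)
    reading_maps = np.asarray(reading_maps, dtype=float)
    # Subtract within subject first, then average across subjects.
    return (generation_maps - reading_maps).mean(axis=0)


# Hypothetical toy data: 3 subjects, 4 voxels, arbitrary units.
reading = np.array([[1.0, 2.0, 1.0, 3.0],
                    [1.0, 2.0, 1.0, 3.0],
                    [1.0, 2.0, 1.0, 3.0]])
generation = np.array([[1.0, 3.0, 1.0, 3.0],
                       [1.0, 4.0, 1.0, 3.0],
                       [1.0, 5.0, 1.0, 3.0]])

diff = subtraction_map(generation, reading)
```

In this toy example only the second voxel shows enhanced activity in the generation task. Whatever the tasks have in common cancels out in the subtraction, so the difference reflects every process that distinguishes the two tasks and not only the one of interest.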
While not all of the studies agree on the cortical loci of activity enhancement during verb generation, it appears that increased blood flow in prefrontal and temporal cortices can be observed. [13] Activity enhancement in the left frontal lobe has been reported in Broca's area and anterior and superior to it (McCarthy et al.1993; Petersen et al.1989). Also Wernicke's region (posterior area 22) (Wise et al.1991) and the middle temporal gyrus (Fiez et al.1996) showed increased blood flow. Thus, during verb generation stronger activity in perisylvian language cortices and in additional premotor, prefrontal and temporal areas was found. Figure 6 presents results from one study revealing both prefrontal and middle temporal activation during verb generation relative to the reading condition.
When interpreting these results in order to draw conclusions on cognitive processes, such as processing of a particular class of words, the following should be noted. As the above example clearly demonstrates ("car" leading to generation of "drive" or "race"), the generated words are not necessarily verbs, in particular if the experimental language is English, where many verbs can also be used as nouns and vice versa. From this point of view, it does not appear appropriate to call it a "verb generation task", but rather a task to generate action words. However, even this may not be correct, because subjects may have been instructed to describe "what the nouns might be used for or what they might do" (Fiez et al.1996, p.1) - thus allowing for generation of both action words and vision words related to perceived movements. In addition, arguments raised in Section 4.2 become relevant here, namely that it is difficult to interpret these results in psychological terms, because comparison of word generation to the reading task reveals several differences on the cognitive level. Recall that generation of action words requires not only semantic processes but also, for example, a search process, and it probably leads to stronger attention compared to the highly automatized process of reading common words. Furthermore, in most cases no information about stimulus or response properties is given that would allow for evaluation of possible confounds as pointed out in Section 4.3. Based on these PET results alone, it is, therefore, not possible to attribute blood-flow changes to verb or action word processing. Nevertheless, assuming that action words were frequently produced by experiment participants, these results appear consistent with the following view. 
During generation of action words, an additional cell assembly was activated (compared to the reading task) that includes neurons not only in perisylvian cortices but, in addition, in prefrontal, premotor and middle temporal areas. This is probably not too far from what could be expected based on the associationist framework discussed in Section 3 (see also prediction (iii) above). However, from a methodologically more rigorous point of view, it appears necessary to compare brain activity when action and vision words are being processed in the same task (see Section 4).
In a recent PET study, Martin, Haxby, Lalonde, Wiggs and Ungerleider (1995) presented achromatic line drawings of objects and had subjects generate action names and color words associated with the objects. Direct comparison of activity patterns evoked during generation of these word categories revealed increased metabolic rates in the ventral temporal lobe when color words were generated. In contrast, generation of action words led to stronger activity in more superior temporal areas on the middle temporal gyrus, and in inferior frontal areas, but not in additional motor cortices. Thus, this study fails to support hypothesis (iii) above. This failure may reflect the fact, emphasized by these authors (footnote 26, p.105), that many of the words actually generated by experimental subjects did not refer to movements the subjects would perform themselves, but rather to movements of objects that are perceived visually. Examples of responses listed by these authors include the verbs "fly", "see" and "sleep", for which visual associations are plausible, but a classification as action words may appear inappropriate. If many verbs without motor associations were produced, this may be the reason why visual areas were activated instead of additional motor areas relevant for controlling hand or foot movements. This point further underscores the necessity of carefully controlling both stimulus and response properties. It is, however, important to note that part of the left middle temporal gyrus was active during verb generation in the study by Martin et al. and that either the same or a closely adjacent area has been found to be active during verb generation from visually presented nouns (Fiez et al.1996; Petersen et al.1989).
Differences between action and vision words were also investigated using event-related potentials calculated from EEG recordings. Most of these studies compared electrocortical responses to nouns and verbs. Whereas an early study (Samar & Berent, 1986) reported generally more positive potentials following verbs (compared to nouns), more recent work using larger electrode arrays (32 or 64 channels) and more sophisticated analysis techniques (e.g., current source density analysis) suggests word class-differences in cortical topographies of event-related potentials. In a study investigating potentials evoked by several word classes, Dehaene (1995) presented numerals, nouns (animals' and persons' names) and verbs matched for word length. [14] While word-evoked potentials were generally larger over the left hemisphere, word class differences were discovered over both hemispheres around 300 ms after stimulus onset (see p. 2155). Verbs elicited a left-lateralized positive component maximal over inferior frontal cortical sites which was not found for nouns. Both nouns referring to animals and verbs led to almost identical left-temporal negativities. These results are consistent with the assumption of additional left-frontal activity during processing of verbs, but do not indicate any noun/verb processing differences in more posterior cortical loci. With regard to the methods, it should, however, be noted that no matching for word frequency or arousal and valence values was performed for nouns and verbs, one third of the verb stimuli had homophonous common nouns, and stimuli were repeated in the experiment. The first point makes a replication with matched stimulus materials desirable.
Presenting nouns and verbs matched for word frequency, length, arousal and valence in a lexical decision task, Preissl and colleagues (Preissl et al.1995; Pulvermüller, 1996a) found electrocortical differences already around 200 ms after onset of visual stimuli. When comparing average noun- and verb-evoked potentials (between 200 and 230 ms), significant differences were only seen over the frontal cortex. After submission of data to current source density analysis in order to maximize the contribution of local generators to the signal (Hjorth, 1975; Perrin et al.1989), stronger electrocortical signs of activity were found after verb presentation over bilateral motor cortices, whereas more pronounced event-related potentials over visual cortices in the occipital lobes were seen after nouns. Most importantly, stimuli were carefully evaluated for motor and visual associations. Ratings of experiment participants confirmed differences in associations of body movements and visual scenes elicited by stimulus words. Verbs were judged to elicit significantly stronger motor associations than nouns, and nouns were judged to elicit stronger visual associations compared to the verbs.[15] The electrocortical differences seen over motor and visual cortices paralleled these differences in conscious motor and visual associations. The left diagram in Figure 9 presents these differences in event-related potential topographies elicited by well-matched nouns and verbs. These data are in agreement with predictions (iii) and (iv) listed in Section 4.1. They can be explained by the assumption that action words activate additional neuronal generators close to motor cortices, whereas vision words spark additional neuron populations in or close to primary visual areas in the occipital lobes.
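The comparison of average potentials within a fixed time window, as in the 200 to 230 ms interval mentioned above, can be illustrated as follows. This is a hedged sketch and not the procedure of the cited study: the function name, the 10 ms sampling grid, the two-channel layout and all amplitude values are hypothetical, and neither current source density transformation nor significance testing is included.

```python
import numpy as np


def window_mean(erp, times_ms, start_ms, end_ms):
    """Mean amplitude per channel within [start_ms, end_ms].

    erp:      array of shape (n_channels, n_samples), averaged potentials
    times_ms: array of shape (n_samples,), sample latencies in ms
    """
    erp = np.asarray(erp, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    mask = (times_ms >= start_ms) & (times_ms <= end_ms)
    return erp[:, mask].mean(axis=1)


# Hypothetical toy data: 2 channels ("central", "occipital"), 0 to 390 ms
# in 10 ms steps. Samples 20..23 correspond to 200..230 ms.
times = np.arange(0, 400, 10)
verbs = np.zeros((2, times.size))
nouns = np.zeros((2, times.size))
verbs[0, 20:24] = 2.0  # invented stronger central response after verbs
nouns[1, 20:24] = 1.0  # invented stronger occipital response after nouns

# Positive values: larger after verbs; negative values: larger after nouns.
verb_minus_noun = window_mean(verbs, times, 200, 230) - window_mean(nouns, times, 200, 230)
```

A per-channel difference of this kind is what a topographical comparison operates on. In the toy data the sign flips between the central and the occipital channel, which is the qualitative pattern the text reports for verbs and nouns.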
It could be argued that although an influence of the confounding factors discussed in 4.3 appears unlikely in this case, it is not clear whether the electrocortical word class-differences are related to semantic associations elicited by the stimuli, or to the fact that stimuli belong to different lexical categories (noun and verb). However, because the assumption that semantic differences are crucial can explain the topographical differences found in electrocortical responses, this view should probably be preferred. Differential involvement of motor and visual cortices could be predicted based on associationist principles. In contrast, there is no a priori reason why members of different lexical categories should be housed in different cortical areas. However, in order to further confirm the idea that semantic properties of words, not their lexical categories, are crucial for differences in cortical activation patterns, it is appropriate to look at stimuli from the same lexical category (nouns) that nevertheless evoke either primarily visual associations (e.g., animal names) or associations of body movements (e.g., tool names). [16]
Recently, Damasio and her colleagues (1996) examined differences in brain activity during naming of animals and tools. In a PET study investigating activity changes in the temporal lobes, they found strong activation of the middle part of the left inferior temporal gyrus during animal naming (compared to a baseline condition), whereas enhancement of activity in more posterior cortices in the inferior and middle temporal gyri was found when naming of tools was compared to the baseline. These results suggest that different neuronal populations and cortical areas in the temporal lobe contribute to processing of vision words compared to words with additional motor associations.
Differences in brain activation during naming of tools and animals were also investigated in a PET study by Martin and colleagues (1996). In this case, subjects had to silently name objects depicted either in line drawing or in silhouette (to eliminate differences in internal detail of drawings [17]). The names of these objects were matched for word frequency. Direct statistical comparison of activity patterns elicited by animal and tool naming revealed the following. Animal naming led to relatively enhanced blood flow in primary and higher visual cortices in the calcarine sulcus in the left hemisphere (and to small activity foci in prefrontal lobe). In contrast, tool naming was accompanied by activity enhancement in left premotor areas, plus an activity increase in the middle temporal gyrus. These data provide additional evidence that areas outside the perisylvian cortices contribute to processing of animal and tool names. Consistent with earlier studies using the verb generation task, a cortical locus in the left middle temporal gyrus was activated when words with strong motor associations (tool names, action verbs) were generated. In contrast to the results of the Damasio study, activity enhancement during animal naming involved occipital visual cortices rather than inferior temporal sites (which is consistent with prediction (iv)). Most importantly, however, naming of tools led to an additional activity focus in the premotor area controlling hand movements (Figure 7). This is consistent with the assumption that processing of words with motor associations activates motor cortices involved in programming such movements (prediction (iii)). The additional focus in middle temporal gyrus where tool naming led to stronger signs of activity may be related to associations of visually perceivable movements.
Although this latter study has several methodological advantages over other PET studies (e.g., matching of stimuli, of responses, calculation of significant differences between critical conditions rather than only between critical condition and baseline), it should be kept in mind that a naming study was carried out and differences between naming conditions may be related to several cognitive processes. Looking at the list of methodological desiderata from Section 4, it should be mentioned that for most PET studies it is not clear whether and to which degree complexity, frequency, arousal or valence values, and repetition of stimuli or responses influenced the results. [18] Furthermore, when naming of depicted animals and tools is being compared, it must be noted that animal pictures include many curved lines, are usually rather complex and can include various colors or shadings, whereas tools can be drawn with a few straight lines and usually lack extensive coloring or shading. If matching of visual stimuli for visual complexity has not been performed, physical differences of stimuli may account for differential activation of visual pathways specialized for processing of particular aspects of stimuli.
The possible merit of exact investigation of psychological properties of stimuli and responses can be further illustrated based on results from the Damasio study already mentioned above (Damasio et al.1996). In this investigation, highest activation values during naming of famous persons' faces were observed in the temporal poles of both hemispheres. It is unclear to which psychological variable this activity enhancement relates. However, it is clear from psychophysiological investigations that faces are among the most arousing stimuli (Lang, 1979; Lang et al.1990), and words referring to such stimuli are very likely to exhibit comparatively high arousal values too. It has been proposed that high-arousal words (that is, words evoking strong emotional associations) are represented in cell assemblies that include additional neurons in the amygdala and subcortical structures (e.g., midbrain dopamine system) (Pulvermüller & Schumann, 1994; Schumann, 1990) [19]. This provides a tentative explanation why Damasio and colleagues found enhanced activity in temporal poles during naming of famous persons [20]. When persons' names were retrieved, it may be that cell assemblies including large numbers of amygdala neurons became active, and therefore, blood flow increase was found in adjacent cortical areas strongly connected to the amygdalae, that is, in temporal poles. Thus, differential arousal values of words may explain differential involvement of amygdalae and temporal poles during naming of pictures of famous persons.
In summary, these studies include the following results relevant to the idea of different cortical representation and processing of action and vision words:
1. PET and fMRI studies using the verb generation task revealed enhanced activity in perisylvian language areas and adjacent temporal and prefrontal cortices in the left hemisphere. Perisylvian activity enhancement may be accounted for by assuming that an additional word form representation is being activated in the generation task (relative to the baseline, usually noun reading). Activation of additional cortical areas outside the perisylvian region may indicate psychological processes coupled to word form processing. Whereas prefrontal activity increase dorsal to Broca's area may relate to body movements the words refer to, activity enhancement in middle temporal gyrus may be related to visual imagination of movements.
2. ERP studies indicate that nouns with strong visual associations and verbs with strong motor associations activate different cortical generators in both hemispheres. Stronger signs of electrocortical activity following action verbs have been recorded from anterior and central regions, whereas the nouns led to more pronounced activity signs over occipital visual cortices. These differences appear to be related to neuronal activity in or close to primary motor or visual cortices underlying movement and visual associations, respectively.
3. PET studies of animal and tool naming provide additional evidence for processing differences between action and vision words. Tool naming with nouns that probably elicit motor associations activated premotor cortices and additional sites in the middle temporal gyrus, and naming animals using visual nouns led to activity enhancement in inferior temporal cortices and in occipital cortices close to the primary visual area.
Although these studies are subject to methodological problems to different degrees (as pointed out in great detail above), a coherent picture can nevertheless be drawn on their basis. Both ERP and PET studies support a contribution of occipital areas close to primary visual cortices to processin