Below is the unedited preprint (not a quotable final draft) of:
Farah, M.J. (1994). Neuropsychological inference with an interactive brain: A critique of the "locality" assumption. Behavioral and Brain Sciences 17 (1): 43-104.
The final published draft of the target article, commentaries and Author's Response are currently available only in paper.

NEUROPSYCHOLOGICAL INFERENCE WITH AN INTERACTIVE BRAIN: A CRITIQUE OF THE LOCALITY ASSUMPTION

Martha J. Farah
Department of Psychology
University of Pennsylvania
Philadelphia, PA 19104
mfarah@cattell.psych.upenn.edu

Keywords

cognitive architecture, face recognition, lesions, localization, modularity, neural nets, neuropsychology, semantics, vision

Abstract

When cognitive neuropsychologists make inferences about the functional architecture of the normal mind from selective cognitive impairments, they generally assume that the effects of brain damage are local, that is, that the nondamaged components of the architecture continue to function as they did before the damage. This assumption follows from the view that the components of the functional architecture are modular, i.e., informationally encapsulated. In this target article it is argued that this "locality" assumption is probably incorrect. Inferences about the functional architecture can nevertheless be made from neuropsychological data with an alternative set of assumptions, according to which human information processing is graded, distributed, and interactive. These claims are supported by three examples of neuropsychological dissociations and a comparison of the inferences obtained from these impairments with and without the locality assumption. The three dissociations involve selective impairments in knowledge of living things, disengaging visual attention, and overt face recognition. In all three, the neuropsychological phenomena lead to more plausible inferences about the normal functional architecture when the locality assumption is abandoned. Also discussed are the relations between the locality assumption in neuropsychology and broader issues, including Fodor's modularity hypothesis and the choice between top-down and bottom-up research approaches.

"The fact that the various parts of the encephalon, though anatomically distinct, are yet so intimately combined and related as to form a complex whole, make it natural to suppose that lesions of greater or lesser extent in any one part should produce such general perturbation of the functions of the organ as a whole as to render it at least highly difficult to trace any uncomplicated connection between the symptoms produced and the lesion as such." Ferrier, 1886

INTRODUCTION

Brain damage often has rather selective effects on cognitive functioning, impairing some abilities while sparing others. Psychologists interested in describing the "functional architecture" of the mind, that is, the set of relatively independent information-processing subsystems that underlies human intelligence, have recognized that patterns of cognitive deficit and sparing after brain damage are a potentially useful source of constraints on the functional architecture. In this article I wish to focus on one of the assumptions that frequently underlies the use of neuropsychological data in the development of cognitive theories.

The locality assumption. Cognitive neuropsychologists generally assume that damage to one component of the functional architecture will have exclusively "local" effects. In other words, the nondamaged components will continue to function normally, and the patient's behavior will therefore manifest the underlying impairment in a relatively direct and straightforward way. This assumption follows from a view of the cognitive architecture as being "modular" in the sense of being "informationally encapsulated" (Fodor, 1983).

According to this version of the modularity hypothesis, the different components of the functional architecture do not interact with one another except when one has completed its processing, at which point it makes the end product available to other components. Even these interactions are limited, so that a given component receives input from relatively few (perhaps just one) of the other components. Thus, a paradigm module takes its input from just one other component of the functional architecture (e.g., phonetic analysis would be hypothesized to take its input just from pre-phonetic acoustic analysis), carries out its computations without being affected by other information available in other components (even potentially relevant information, such as semantic context), and then presents its output to the next component in line, for which it might be the sole input (e.g., the auditory input lexicon, which would again be hypothesized to take only phonetic input).

In short, in such an architecture, each component minds its own business and knows nothing about most of the other components. What follows for a damaged system is that most of the components will be oblivious to the loss of any one, carrying on precisely as before. If the components of the functional architecture were informationally encapsulated, then the locality assumption would hold; the removal of one component would have only very local effects on the functioning of the system as a whole, affecting performance only in those tasks that directly call upon the damaged component. Indeed, one of Fodor's other criteria for module-hood, which he suggests will be coincident with informational encapsulation, is that modules make use of dedicated hardware and can therefore be selectively impaired by local brain damage. In contrast, if the different components of the cognitive system were highly interactive, such that each one depended on input from many or most of the others, then damage to any one component could significantly modify the functioning of the others.

Several cognitive neuropsychologists have pointed out that informational encapsulation and/or the locality of the effects of brain damage are assumptions, expressing varying degrees of confidence in them (Allport, 1985; Caplan, 1981; Humphreys & Riddoch, 1987; Kinsbourne, 1971; Klein, 1977; Kosslyn & Van Kleek, 1991; Moscovitch & Umilta, 1990; Shallice, 1988). For example, Shallice (1988, chapter 2) endorses a weaker and more general version of modularity than Fodor's, according to which components of the functional architecture can be distinguished conceptually according to their specialized functions, and empirically by the relatively selective deficits that ensue upon damage to one of them. He likens this concept of modularity to Posner's (1978) "isolable subsystems," and offers the following criterion from Tulving (1983) for distinguishing modular systems with some mutual dependence among modules from fully interactive systems: Components of a modular system, in this weaker sense, may not operate as efficiently when other components have been damaged, but they will nevertheless continue to function roughly normally. According to this view, the locality assumption is not strictly true, but is nevertheless roughly true: One would not expect pronounced changes in the functioning of nondamaged components.

An assumption closely related to the locality assumption is the "transparency assumption" of Caramazza (1984; 1986). Although different statements of the transparency assumption leave room for different interpretations, it seems likely that the transparency assumption is weaker than the locality assumption. Particularly in more recent statements of the assumption (e.g., Caramazza, 1992), it appears that transparency requires only that the behavior of the damaged system be understandable in terms of the functional architecture of the normal system. Changes in the functioning of nondamaged components are not considered a violation of the transparency assumption, so long as they are understandable. In particular, interactivity and consequent nonlocal effects are permitted according to the transparency assumption; presumably only if the nonlocal interactions became unstable and chaotic would the transparency assumption be violated.

Unlike the weaker transparency assumption, the locality assumption licenses quite direct inferences from the manifest behavioral deficit to the identity of the underlying damaged cognitive component, of the form "selective deficit in ability A implies component of the functional architecture dedicated to A." Obviously, such inferences can go awry if the selectivity of the deficit is not real, for example if the tasks testing A are merely harder than the comparison tasks, if there are other abilities that are not tested but which are also impaired, or if a combination of functional lesions is mistaken for a single lesion (see Shallice, 1988, ch. 10, for a thorough discussion of other possibilities for misinterpretation of dissociations within a weakly modular theoretical framework). In addition, even simple tasks tap several components at once, and properly designed control tasks are needed in order to pinpoint the deficient component, and absolve intact components downstream. However, assuming the relevant ability has been experimentally isolated, and the deficit is truly selective, the locality assumption allows us to delineate and characterize the components of the functional architecture in a direct, almost algorithmic way.1

The locality assumption is ubiquitous in cognitive neuropsychology. At this point the reader may think that the locality assumption is naive, and the direct inferences that it licenses constitute a mindless reification of deficits as components of the cognitive architecture, which "good" cognitive neuropsychologists would not employ. Note, however, that the locality assumption is justifiable in terms of informational encapsulation. Furthermore, whether or not this seems an adequate justification, it is the case that many of the best-known findings in neuropsychology fit this form of inference. A few examples will be given here and three more will be discussed in detail later. Perusal of the recent journals and textbooks in cognitive neuropsychology will reveal many more examples of this form of inference.

Within the domain of reading, phonological dyslexics show a selective deficit in reading tasks which require grapheme-to-phoneme translation; they are able to read real words (which can be read by recognizing the word as a whole), they can copy and repeat nonwords (demonstrating intact graphemic and phonemic representation), but they cannot read nonwords, which must be read by grapheme-to-phoneme translation. This has been interpreted as an impairment in a grapheme-to-phoneme translation mechanism, and hence as evidence for the existence of such a mechanism in the normal architecture (e.g., Coltheart, 1985). Similarly, in surface dyslexia a selective deficit in reading irregular words, with preserved regular word and nonword reading, has been used to identify a deficit in whole-word recognition, and hence to infer a whole-word reading mechanism distinct from the grapheme-to-phoneme route (e.g., Coltheart, 1985).

In the production and understanding of spoken language, some patients are selectively impaired at processing closed class, or "function" words, leading to the conclusion that these lexical items are represented by a separate system from open class or "content" words (e.g., Zurif, 1980).

In the domain of vision, some right hemisphere-damaged patients show an apparently selective impairment in the recognition of objects viewed from unusual perspectives. This has been taken to imply the existence of a stage or stages of visual information processing concerned specifically with shape constancy (e.g., Warrington, 1985). Highly selective deficits in face recognition have been taken to support the existence of a specialized module for face recognition, distinct from more general-purpose recognition mechanisms (e.g., DeRenzi, 1986).

In the domain of memory, the finding that patients can be severely impaired at learning facts and other so-called "declarative" or "explicit" knowledge while displaying normal learning of skills and other forms of implicit knowledge is interpreted as evidence for multiple learning systems, one of which is dedicated to the acquisition of declarative knowledge (e.g., Squire, 1992).

Undoubtedly, some of these inferences may be proved wrong in the light of further research. For example, perhaps there is a confounding between the factor of interest and the true determinant of the deficit. In the case of aphasics who seem selectively impaired at processing closed class words, perhaps speech stress pattern, and not lexical class, determines the boundaries of the deficit. Critical thinkers can undoubtedly find reasons to question the inferences in all of the examples given above. However, note that in most cases the question will concern the empirical specifics of the case, such as stress pattern versus lexical class. In the course of scientific debate on these and other deficits, the form of the inference is rarely questioned. If we can truly establish a selective deficit in ability A, then it seems reasonable to attribute the deficit to a lesion of some component of the functional architecture that is dedicated to A, that is, necessary for A and necessary only for A. We are, of course, thereby assuming that the effects of the lesion on the functioning of the system are local to the lesioned component.

Two empirical issues about the locality assumption. Although it is reasonable to assume that the effects of a lesion are confined to the operation of the lesioned components and the relatively small number of components downstream in a system with informationally encapsulated modules, we do not yet know whether the brain is such a system. There is, in fact, some independent reason to believe that it is not. Neurologists have long noted the highly interactive nature of brain organization, and the consequent tendency for local damage to unleash new emergent organizations or modes of functioning in the remaining system (e.g., Ferrier, 1886; Jackson, 1873). Of course, the observations that led to these conclusions were not primarily of cognitive disorders. Therefore, whether or not the locality assumption holds in the domain of cognitive impairments, at least to a good approximation, is an open empirical question.

Note that we should be concerned more about "good approximations" than precise generalizations for purposes of guiding neuropsychological methodology. As already mentioned, Shallice (1988) has pointed out that modularity versus interactionism is a matter of degree. From the point of view of neuropsychological methodology, if nonlocal interactions were to weakly modulate the behavior of patients after brain damage, this would not necessarily lead to wrong inferences using the locality assumption. In such a case, in which the remaining parts of the system act ever-so-slightly differently following damage, the cognitive neuropsychologist will simply fail to account for 100% of the variance in the data (not a novel experience for most of us) but will make the correct inference about functional architecture. However, if deviations from locality were a first-order effect, then the best-fitting theory for the data using the locality assumption will be false.

There is also a second question concerning the locality assumption in cognitive neuropsychology: Is the locality assumption really indispensable to cognitive neuropsychology? Must we abandon all hope of relating patient behavior to theories of the normal functional architecture if lesions in one part of the system can change the functioning of other parts? Like the first question, this one is also a matter of empirical truth or falsehood.

Nevertheless, unlike many empirical questions, these two are not of the type that lend themselves to single critical experiments. They concern very general properties of the functional architecture of cognition, and of our abilities to make scientific inferences about complex systems, using all of the formal and informal methods and types of evidence available to us. Therefore, the most fruitful approach to answering these two questions would involve an analysis of the body of cognitive neuropsychology research, or at least an extensive sample of it.

As a starting point, I will describe three different neuropsychological dissociations that have been used to make inferences about the functional architecture of the mind. The aspect of cognition under investigation in each case is different: semantic memory, visual attention, and the relation between visual recognition and awareness. What all three cases have in common is the use of the locality assumption. For each case, I will explore alternative inferences about the functional architecture that are not constrained by the locality assumption.

How will such explorations answer the questions posed above? We can assess the empirical basis for the locality assumption by comparing the conclusions about functional architecture that are arrived at with and without the locality assumption. Specifically, we can determine which conclusions are preferable, in the sense of being simpler and according better with other, independent evidence about the functional architecture. If the locality assumption generally leads to preferable conclusions, this suggests that we are probably justified in using it. However, if the assumption of locality often leads to nonpreferable conclusions, then this suggests that we ought not to assume that the effects of brain damage on the functioning of the cognitive architecture are local. The question of whether it is possible to draw inferences about the functional architecture from neuropsychological dissociations without the locality assumption will also be addressed by the degree to which sensible conclusions can be reached without assuming locality.

An architecture for interactive processing. Of course, comparisons between the results of inferences made with and without the locality assumption will be meaningful only if both types of inferences are constrained in principled ways. The locality assumption is one type of constraint on the kinds of functional architectures that can be inferred from a neuropsychological dissociation. It limits the elements in our explanation of a given neuropsychological deficit to just those elements existing in the normal functional architecture (minus the damaged component), operating in their normal fashion. If we simply eliminate that constraint without replacing it with other principled constraints on how local damage affects the remaining parts of the system, then the comparison proposed above will not be fair to the locality assumption. We could always pick the simplest, most appealing model of the normal functional architecture and say "the way in which the remaining parts of the system change their functioning after damage produces this deficit," without saying why we chose to hypothesize that particular change in functioning, as opposed to some other which cannot explain the deficit.

The parallel distributed processing (PDP) framework will be used as a source of principled constraints on the ways in which the remaining parts of the system behave after local damage. Computer simulation will be used to test the sufficiency of the PDP hypotheses to account for the dissociations in question. Readers who would like a detailed introduction to PDP are referred to Rumelhart and McClelland's (1986) collection of readings. For present purposes, the relevant principles of PDP are:

Distributed representation of knowledge. In PDP systems, representations consist of patterns of activation distributed over a population of units. Different entities can therefore be represented using the same set of units, because the pattern of activation over the units will be distinctive. Long term memory knowledge is encoded in the pattern of connection strengths distributed among a population of units.

Graded nature of information processing. In PDP systems processing is not all or none: Representations can be partially active, for example by the partial or subthreshold activation of some of those units that would normally be active. Partial knowledge can be embodied in connection strengths, either before learning has been completed or after partial damage.

Interactivity. The units in PDP models are highly interconnected, and thus mutual influence among different parts of the system is the rule rather than the exception. These influences can be excitatory, as when one part of a distributed representation activates the remaining parts (pattern completion), or they can be inhibitory, as when different representations compete with one another to become active or maintain their activation. Note that interactivity is the aspect of the PDP framework that is most directly incompatible with the locality assumption. If the normal operation of a given part of the system depends on the influence of some other part, it may not operate normally after that other part has been damaged.
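
The graded and interactive principles can be illustrated with a deliberately tiny sketch. It is not drawn from any of the simulations discussed in this article: the two-unit network, the external input of 0.4, and the mutual excitatory weight of 0.5 are all hypothetical values chosen only to make the point visible. Each unit's activation is graded (clipped to the range 0 to 1) and depends on the other unit's activation.

```python
def settle(external_input, weights, lesioned=frozenset(), steps=50):
    """Synchronously update graded activations (clipped to 0..1) until they settle."""
    n = len(external_input)
    act = [0.0] * n
    for _ in range(steps):
        # Every unit reads the previous activations of all units it is connected to.
        act = [
            0.0 if i in lesioned else
            min(1.0, max(0.0, external_input[i]
                         + sum(weights[i][j] * act[j] for j in range(n))))
            for i in range(n)
        ]
    return act

# Hypothetical values: two units (A = index 0, B = index 1) that excite each other.
external_input = [0.4, 0.4]
weights = [[0.0, 0.5],
           [0.5, 0.0]]

intact = settle(external_input, weights)                 # both units present
damaged = settle(external_input, weights, lesioned={1})  # unit B removed

print("A with B intact:  ", round(intact[0], 3))
print("A with B lesioned:", round(damaged[0], 3))
```

In this toy system unit A settles at an activation of about 0.8 when unit B is intact, but only 0.4 once B is removed, even though A itself and its external input are untouched. That is the sense in which interactivity is incompatible with strictly local effects of damage.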

The psychological plausibility of PDP is controversial, but need not be definitively established here before proceeding. Instead, just as the locality assumption is being identified as an assumption and evaluated, so PDP is being treated as an alternative assumption, to be evaluated as well, as a specific alternative to the locality assumption. In addition, as will be discussed further in the General Discussion, much of the controversy surrounding PDP concerns its adequacy for language and reasoning. It is possible that the arguments being advanced here will not generalize to these cognitive domains.

REINTERPRETING DISSOCIATIONS WITHOUT THE LOCALITY ASSUMPTION: THREE CASE STUDIES

The functional architecture of semantic memory: Category-specific?

The existence of patients with apparently category-specific impairments in semantic memory knowledge has led to the inference that semantic memory has a categorical organization, with different components dedicated to representing knowledge from different categories. The best documented forms of category-specific knowledge deficit (as opposed to pure naming or visual recognition deficits) are the deficits in knowledge of living and nonliving things.

Evidence for selective impairments in knowledge of living and nonliving things. In the 1980s, Warrington and her colleagues began to report the existence of patients with selective impairments in knowledge of either living or nonliving things (Warrington & Shallice, 1984; Warrington & McCarthy, 1983; 1987). Warrington and Shallice (1984) described four patients who were much worse at identifying living things (animals, plants) than nonliving things (inanimate objects). All four of these patients had recovered from Herpes encephalitis, and all had sustained bilateral temporal lobe damage. Two of the patients were studied in detail, and showed a selective impairment for living things across a range of tasks, both visual and verbal. Table 1 shows examples of their performance in a visual identification task, in which they were to identify by name or description the item shown in a colored picture, and in a verbal definition task, in which the names of these same items were presented auditorily, and they were to define them. Examples of their definitions are shown in Table 2. Other cases of selective impairment in knowledge of living things include additional postencephalitic patients described by Pietrini, Nertempi, Revello, Pinna and Ferro-Milone (1988), Sartori and Job (1988), and Silveri and Gianotti (1988), a patient with encephalitis and strokes described by Mehta and Newcombe, two head injury patients described by Farah, McMullen and Meyer (1991), and a patient with a focal degenerative disease described by Basso, Capitani and Laiacona (1988). In all of these cases there was damage to the temporal regions, known to be bilateral except in Pietrini et al.'s case 1, and the case of Basso et al., where there was evidence only of left temporal damage.

The opposite dissociation, namely impaired knowledge of nonliving things with relatively preserved knowledge of living things, has also been observed. Warrington and McCarthy (1983, 1987) described two cases of global dysphasia following large left hemisphere strokes in which semantic knowledge was tested in a series of matching tasks. Table 3 shows the results of a matching task in which the subjects were asked to point to the picture, in an array, that corresponded to a spoken word. Their performance with animals and flowers was reliably better than with nonliving things. One of these subjects was also tested with a completely nonverbal matching task, in which different-looking depictions of objects or animals were to be matched to one another in an array, and showed the same selective preservation of knowledge of animals relative to inanimate objects.

Although these patients are not entirely normal in their knowledge of the relatively spared category, they are markedly worse at recognizing, defining, or answering questions about items from the impaired category. The existence of a double dissociation makes it unlikely that a sheer difference in difficulty underlies the apparent selectivity of the deficits, and some of the studies cited above tested several alternative explanations of the impairments in terms of factors other than semantic category (such as name frequency, familiarity, etc.) and failed to support them.

Interpretation of "living" and "nonliving things" deficits relative to the functional architecture of semantic memory. Using the locality assumption, the most straightforward interpretation of the double dissociation between knowledge of living and nonliving things is that these two bodies of knowledge are represented by two separate category-specific components of the functional architecture of semantic memory. A related interpretation is that semantic memory is represented using semantic features such as "animate," "domestic," and so on, and that the dissociations described here result from damage to these features (Hillis & Caramazza, 1991). In either case, the dissociations seem to imply a functional architecture for semantic memory that is organized along rather abstract semantic or taxonomic lines. Figure 1 represents a category-specific model of semantic memory and its relation to visual perception and language.

However, Warrington and colleagues have suggested an alternative interpretation, according to which semantic memory is fundamentally modality-specific. They argue that selective deficits in knowledge of living and nonliving things may reflect the differential weighting of information from different sensorimotor channels in representing knowledge about these two categories. They have pointed out that living things are distinguished primarily by their sensory attributes, whereas nonliving things are distinguished primarily by their functional attributes. For example, our knowledge of an animal such as a leopard, by which we distinguish it from other similar creatures, is predominantly visual. In contrast, our knowledge of a desk, by which we distinguish it from other furniture, is predominantly functional (i.e., what it is used for). Thus, the distinctions between impaired and preserved knowledge in the cases reviewed earlier may not be "living/nonliving" distinctions per se, but "sensory/functional" distinctions, as illustrated in Figure 2.

The modality-specific hypothesis seems preferable to a strict semantic hypothesis for two reasons. First, it is more consistent with what is already known about brain organization. It is well known that different brain areas are dedicated to representing information from specific sensory and motor channels. Functional knowledge could conceivably be tied to the motor system. A second reason for preferring the sensory/functional hypothesis to the living/nonliving hypothesis is that exceptions to the living/nonliving distinction have been observed in certain cases. For example, Warrington and Shallice (1984) report that their patients, who were deficient in their knowledge of living things, also had impaired knowledge of gemstones and fabrics. Warrington and McCarthy's (1987) patient, whose knowledge of most nonliving things was impaired, seemed to have retained good knowledge of very large outdoor objects such as bridges or windmills. It is at least possible that our knowledge of these aberrant categories of nonliving things is primarily visual.

Unfortunately, there appears to be a problem with the hypothesis that "living things impairments" are just impairments in sensory knowledge, and "nonliving things impairments" are just impairments in functional knowledge. This hypothesis seems to predict that cases of "living things impairment" should show good knowledge of the functional attributes of living things, and cases of "nonliving things impairment" should show good knowledge of the visual attributes of nonliving things. The evidence available in cases of "nonliving things impairment" is limited to performance in matching-to-sample tasks, which does not allow us to distinguish knowledge of visual or sensory attributes from knowledge of functional attributes. However, there does appear to be adequate evidence available in cases of "living things impairment," and in at least some cases it disconfirms these predictions (see Farah & McClelland, 1991, for a review). For example, although the definitions of living things shown in Table 2 contain little visual detail, in keeping with the sensory/functional hypothesis, they are also skimpy on functional information. If these cases had lost just their visual semantic knowledge, then why can't they retrieve functional attributes of living things, for example the fact that parrots are kept as pets and can talk, that daffodils are a spring flower, and so on? A more direct and striking demonstration of the apparently categorical nature of the impairment is provided by Newcombe et al. (in press), whose subject was impaired relative to normal subjects in his ability to sort living things according to such nonsensory attributes as whether or not they were generally found in the United Kingdom, in contrast to his normal performance when the task involved nonliving things.

In sum, the sensory/functional hypothesis seems preferable to the living/nonliving hypothesis because it is more in keeping with what we already know about brain organization. However, it does not seem able to account for the impaired ability of these patients to retrieve nonvisual information about living things.

Accounting for category-specific impairments with an interactive modality-specific architecture. Jay McClelland and I have modeled the double dissociation between knowledge of living and nonliving things using a simple autoassociative memory architecture with modality-specific components (Farah & McClelland, 1991). We found that a two-component semantic memory system, consisting of visual and functional components, could be lesioned to produce selective impairments in knowledge of living things and nonliving things. More importantly, we found that such a model could account for the impairment of both visual and functional knowledge of living things.

The basic architecture of the model is shown in Figure 2. There are three pools of units, representing the names of items, the perceived appearances of items, and the semantic memory representations of items. The semantic memory pool is subdivided into visual semantic memory and functional semantic memory. An item, living or nonliving, is represented by a pattern of +1 and -1 activations over the name and visual units, and a pattern of +1 and -1 activations over a subset of the semantics units. The relative proportion of visual and functional information comprising the semantic memory representation of living and nonliving things was derived empirically. Normal subjects identified terms in dictionary definitions of the living and nonliving items used by Warrington and Shallice (1984) as referring to visual properties or functional properties. This experiment confirmed that visual and functional information was differentially weighted in the definitions of living and nonliving things, and the results were used to determine the average proportions of visual and functional units in the semantic memory representations of the living and nonliving items. For the living items, about seven times as many visual semantic units as functional semantic units participated in the semantic memory pattern; for nonliving items the proportions were closer to equal. Units of semantic memory not involved in a particular item's representation took the activation value of 0.
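
The representational scheme just described can be sketched in a few lines. The pool sizes and the numbers of participating semantic units below are illustrative assumptions, not values stated in this section: 24 name units, 24 picture units, 60 visual and 20 functional semantic units, with 14 visual and 2 functional units per living item (a 7:1 ratio) and 8 visual and 6 functional units per nonliving item (closer to equal). The split into 10 living and 10 nonliving items is likewise assumed; only the total of 20 items is implied by the scoring procedure.

```python
import random

def signed_units(rng, n_units, n_active):
    """Pattern with n_active units set to +1 or -1 and all other units left at 0."""
    pattern = [0] * n_units
    for i in rng.sample(range(n_units), n_active):
        pattern[i] = rng.choice((-1, 1))
    return pattern

def make_item(rng, n_visual_active, n_functional_active,
              n_name=24, n_picture=24, n_visual_sem=60, n_functional_sem=20):
    """One item: full +1/-1 patterns on the name and picture pools,
    and +1/-1 values on only a subset of the semantic units."""
    return {
        "name": signed_units(rng, n_name, n_name),
        "picture": signed_units(rng, n_picture, n_picture),
        "visual_semantics": signed_units(rng, n_visual_sem, n_visual_active),
        "functional_semantics": signed_units(rng, n_functional_sem,
                                             n_functional_active),
    }

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Assumed proportions: living items 14 visual : 2 functional (7:1),
# nonliving items 8 visual : 6 functional (closer to equal).
living = [make_item(rng, 14, 2) for _ in range(10)]
nonliving = [make_item(rng, 8, 6) for _ in range(10)]

def n_participating(pattern):
    """Number of units that take part in an item's representation."""
    return sum(1 for a in pattern if a != 0)

print(n_participating(living[0]["visual_semantics"]),
      n_participating(living[0]["functional_semantics"]))
```

Because living items place most of their semantic pattern in the visual pool, removing visual semantic units removes a larger share of what represents a living item than of what represents a nonliving one.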

The model was trained using the delta rule (Rumelhart, Hinton & McClelland, 1986) to associate the correct semantic and name portions of its pattern when presented with the visual portion of its pattern as input, and the correct semantic and visual portions of its pattern when presented with the name portion as input. It was then damaged by eliminating different proportions of functional or visual semantic units, and its performance was assessed in a simulated picture-name matching task. In this task, each item's visual input representation is presented to the network and the pattern that is activated in the name units is assessed, or each pattern's name is presented and the resultant visual pattern is assessed. The resultant pattern is scored as correct if it is more similar to the correct pattern than to any of the other 19 patterns.
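The scoring rule just described can be sketched in a few lines. This is a minimal illustration, not the original simulation code: the function names and the example patterns are hypothetical, and the dot product is assumed as the similarity measure, since the text does not name one for the matching task.

```python
def similarity(a, b):
    """Dot product of two equal-length activation patterns (assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))


def scored_correct(output, patterns, target_index):
    """True if `output` is more similar to the target pattern than to every other pattern."""
    target_score = similarity(output, patterns[target_index])
    return all(
        target_score > similarity(output, other)
        for i, other in enumerate(patterns)
        if i != target_index
    )


# Hypothetical 4-unit name patterns for three items (the model itself used 20 items).
patterns = [[1, 1, -1, -1], [1, -1, 1, -1], [-1, 1, 1, -1]]
degraded_output = [0.6, 0.4, -0.2, -0.5]  # a weakened version of pattern 0

print(scored_correct(degraded_output, patterns, 0))
```

Under this rule a degraded output can still be scored correct, provided it remains closer to its own pattern than to any competitor.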

Figure 3a shows the averaged picture-to-name and name-to-picture performance of the model for living and nonliving items under varying degrees of damage to visual semantics. With increased damage, the model's performance drops, and it drops more precipitously for living things, in effect showing an impairment for living things of comparable selectivity to the patients in the literature. Figure 3b shows that the opposite dissociation is obtained when functional semantics is damaged.

The critical challenge for a modality-specific model of semantic memory is to explain how damage could create an impairment in knowledge of living things that includes functional knowledge of living things. To evaluate the model's ability to access functional semantic knowledge, we presented either name or visual input patterns as before, but instead of assessing the match between the resulting output pattern and the correct output pattern, we assessed the match between the resulting pattern in functional semantics and the correct pattern in functional semantics. The normalized dot product of these two patterns, which provides a measure between 0 (completely dissimilar) and 1 (identical), served as the dependent measure.
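As a rough sketch of the dependent measure, a normalized dot product can be computed as below. The cosine-style normalization is an assumption; the text gives only the measure's range, and the function name and example patterns are hypothetical.

```python
import math


def normalized_dot(a, b):
    """Dot product of two patterns divided by the product of their lengths (cosine form)."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0:
        return 0.0  # an all-zero pattern carries no information about the target
    return sum(x * y for x, y in zip(a, b)) / norm


# Hypothetical functional-semantics pattern and a degraded retrieval of it.
correct = [1, -1, 1, 0, 0, -1]
degraded = [0.5, -0.5, 0, 0, 0, 0]

print(normalized_dot(correct, correct))
print(normalized_dot(degraded, correct))
```

With +1/-1 patterns this cosine form can go negative for anticorrelated patterns, whereas the text describes a range from 0 to 1, so the original normalization may have differed in detail.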

Figure 4 shows the accuracy with which functional semantic memory information could be activated for living and nonliving things after different degrees of damage to visual semantics. At all levels of damage, the ability to retrieve functional semantic knowledge of living things is disproportionately impaired.

These dissociations can be understood as follows. In the case of picture-name matching, the ability of a given output unit (e.g., a name unit, in the case of picture-to-name matching) to attain its correct activation value depends on the input it receives from the units to which it is connected. These consist of other name units (collateral connections) and both visual and functional semantics units. Therefore, the more semantics units that have been eliminated, the more the output units are deprived of the incoming activation they need to attain their correct activation values. Because most of the semantic input to the name units of living things is from visual semantics, whereas the same is not true for nonliving things, damage to visual semantics will eliminate a greater portion of the activation needed to activate the name patterns for living things than nonliving things, and will therefore have a more severe impact on performance.

The same principle applies to the task of activating functional semantics, although in this case the units are being deprived of collateral activation from other semantics units. Thus, when visual semantic units are destroyed, one of the sources of input to the functional semantics units is eliminated. For living things, visual semantics comprises a proportionately larger source of input to functional semantics units than for nonliving things, hence the larger effect for these items.
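The proportional argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every semantic unit contributes equally to its targets and that a lesion removes the same fraction of each item's visual semantic units. The 7:1 ratio for living things follows the text; the 1:1 ratio for nonliving things is a stand-in for "closer to equal," and the 40% damage level is arbitrary.

```python
def semantic_input_lost(visual_units, functional_units, visual_damage):
    """Fraction of an item's semantic input removed when `visual_damage` of visual semantics is lost."""
    return visual_damage * visual_units / (visual_units + functional_units)


# Illustrative ratios only: 7:1 for living things, an assumed 1:1 for nonliving things.
living = semantic_input_lost(visual_units=7, functional_units=1, visual_damage=0.4)
nonliving = semantic_input_lost(visual_units=1, functional_units=1, visual_damage=0.4)

print(f"living: {living:.2f}, nonliving: {nonliving:.2f}")
```

The same lesion removes a larger share of the semantic input for living things, which is the asymmetry the explanation relies on. The actual network's behavior also depends on its learned weights, which this arithmetic ignores.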

Relevance of the locality assumption. Contrary to the locality assumption, when visual semantics is damaged the remaining parts of the system do not continue to function as before. In particular, functional semantics, which is part of the non-damaged residual system, becomes impaired in its ability to achieve the correct patterns of activation when given input from vision or language. This is because of the loss of collateral support from visual semantics. The ability of this model to account for the impairment in accessing functional knowledge of living things depends critically upon this nonlocal aspect of its response to damage.

The functional architecture of visual attention: A "disengage" module?

One of the best-known findings in cognitive neuropsychology concerns the "disengage" deficit that follows unilateral parietal damage. In an elegant series of studies, Posner and his colleagues have shown that parietal-damaged patients have a selective impairment in their ability to disengage attention from a location in the spared ipsilesional hemifield, in order to move it to a location in the affected contralesional hemifield (e.g., Posner, Walker, Friedrich & Rafal, 1984). From this they have inferred the existence of a "disengage" component in the functional architecture of visual attention.

Evidence for the disengage deficit. Posner and colleagues inferred the existence of a "disengage" operation from experiments using a cued simple reaction time task. The typical task consists of a display, like that shown in Figure 5a, which the subject fixates centrally, and in which both "cues" and "targets" are presented. The cue is usually the brightening of one of the boxes, as depicted in Figure 5b. This causes attention to be allocated to the region of space around the bright box. The target, usually a simple character such as an asterisk, is then presented in one of the boxes, as shown in Figure 5c. The subject's task is to press a button as soon as possible after the appearance of the target, regardless of its location. When the target is "validly" cued, that is, occurs on the same side of the display as the cue, reaction times to the target are faster than with no cue, because attention is already optimally allocated for perceiving the target. When the target is "invalidly" cued, then reaction times are slower than with no cue, because attention is focused on the wrong side of space.

When parietal-damaged patients are tested in this paradigm, they perform roughly normally on validly cued trials on either side of the display, as well as on invalidly cued trials when the target appears on the side of space ipsilateral to their lesion. However, their reaction times are greatly slowed to invalidly cued contralesional targets. It is as if, once attention has been engaged on the ipsilesional, or "good," side, it cannot be disengaged to be moved to a target occurring on the contralesional, or "bad," side.

Interpretation of the disengage deficit relative to the functional architecture of visual attention. The disproportionate difficulty that parietal-damaged patients have in disengaging their attention from the good side to move it to the bad side led Posner and colleagues to infer the existence of a separate component of the functional architecture for disengaging attention. The resulting model of attention therefore postulates distinct components for engaging and disengaging attention, as shown in Figure 6.

Accounting for the disengage deficit with an interactive architecture that has no "disengage" component. Jonathan Cohen, Richard Romero and I have modelled normal cuing effects and the disengage deficit using a simple model of visual attention that contains no "disengage" component (Cohen, Romero & Farah, in press).2

The model is depicted in Figure 7. The first layer consists of visual transducer, or input, units, through which stimuli are presented to the network. These units send their output to visual perception units. These units represent the visual percept of a stimulus at a particular location in space. In this simple model, there are only two locations in visual space. The visual perception units are connected to two other kinds of units. One is the response unit, which issues the detection response when it has gathered sufficient activation from the perception units to reach its threshold. We will interpret the number of processing cycles that intervene between the presentation of a target to one of the visual transducer units and the attainment of threshold activation in the response unit as a direct correlate of reaction time.

The visual perception units are also connected to a set of spatial attention units corresponding to their spatial location. The spatial attention units are activated by the visual perception unit at the corresponding location, and reciprocally activate that same unit, creating a resonance with that visual perception unit that reinforces its activation. These reciprocal connections are what allows the spatial attention units to facilitate perception.

The spatial attention units are also connected to each other. For units corresponding to a given location, these connections are excitatory, that is, they reinforce each other's activation. The connections between units corresponding to different locations are inhibitory. That is, whichever location's units are more active, they will drive down the activation of the other location's units. These mutually inhibitory connections are what give rise to attentional limitations in the model, that is, the tendency to attend to just one location at a time.

Connection strengths in this model were set by hand. Units in the model can take on activation values between 0 and 1, have a resting value of 0.1, and do not pass on activation to other units until their activation reaches a threshold of 0.9.

Before the onset of a trial, all units are at resting level activation, except for the attention units which are set to 0.5 to simulate the subject's allocation of some attention to each of the two possible stimulus locations. Presentation of a cue is simulated by clamping the activation value of one of the visual input units to 1 for the duration of the cuing interval. Presentation of the target is then simulated by clamping the activation value of one of the visual input units to 1. The target is validly cued if the same input unit is activated by both cue and target, and invalidly cued if different input units are activated. We also simulated a neutral cuing condition, in which no cue preceded the target. The number of processing cycles needed for the perception unit to raise the activation value of the response unit to threshold after target onset is the measure of reaction time. By regressing these numbers of cycles onto the data from normal subjects, we were able to fit the empirical data with our model.
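The mapping from processing cycles to reaction times can be sketched as an ordinary least-squares line fit. This is only an assumed form of the regression (a straight line with an intercept), and the cycle counts and reaction times below are placeholder values, not model output and not the data of Posner et al.

```python
def fit_line(cycles, reaction_times):
    """Least-squares slope and intercept for reaction_time = intercept + slope * cycles."""
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(reaction_times) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, reaction_times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder values for the valid, neutral and invalid conditions (hypothetical).
placeholder_cycles = [4, 8, 12]
placeholder_rts = [250, 290, 330]  # milliseconds

slope, intercept = fit_line(placeholder_cycles, placeholder_rts)
print(f"ms per cycle: {slope:.1f}, intercept: {intercept:.1f} ms")
```

On this reading, the slope converts cycles into milliseconds and the intercept absorbs processing time that the model does not simulate.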

Figure 8 shows the data from normal subjects obtained by Posner et al. (1984) and the model's best fit to the data. Why does our model show effects of valid and invalid cuing? In our model, attentional facilitation due to valid cuing is the result of both residual activation from the cue and top-down activation that the attention units give the perception unit at its corresponding location. When the perception unit is activated by the cue, it activates the attention units on that side, which feed activation back to the perception unit, establishing a resonance that strengthens the activation of the target representation. Attentional inhibition due to invalid cuing is the result of the activated attention unit at the cued location suppressing the activation of the attention unit at the target location, leading to diminished top-down activation of the target perception unit. That is, the attention units on the cued side inhibit the attention units on the opposite side. As a result, when the target is presented to the opposite side, the attention unit on that side must first overcome the inhibition of the attention unit on the cued side before it can establish a resonance with its perception unit, and response time is therefore prolonged.

This very simple model of attention, which has no "disengage" component, captures the qualitative relations among the speeds of response in the three different conditions, and can be fit quantitatively to these average speeds with fairly good precision. In this regard, it seems preferable to a model that postulates separate components for orienting, engaging and disengaging attention. However, the "disengage" component was postulated on the basis of the behavior of parietal-damaged subjects, not normal subjects. The critical test of this model is therefore whether it produces a disengage deficit when damaged.

A subset of the attention units on one side of the model was eliminated, and the model was run in the valid and invalid cuing conditions. (No patient data was available for the neutral condition.) Figure 9 shows the data of Posner et al. (1984) from parietal-damaged patients and the simulation results, fit to the data in the same way as before. Both sets of results show a disengage deficit: a disproportionate slowing from invalid cuing when the target is on the damaged side.

Why does the model show a disengage deficit when its attention units are damaged? The answer lies in the competitive nature of attentional allocation in the model, and the imbalance introduced into the competition by unilateral damage. Attentional allocation is competitive, in that once the attention units on one side have been activated, they inhibit attentional activation on the other side. When there are fewer attention units available on the newly stimulated side, then the competition is no longer balanced, and much more bottom-up activation will be needed on the damaged side before the remaining attention units can overcome the inhibition from the attention units on the intact side, and establish a resonance with the perception unit.
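The imbalance argument can be illustrated with a toy simulation. The sketch below is not the Cohen, Romero and Farah network: it collapses each side into a single attention pool, omits the perception and response units, and uses arbitrary parameter values. All names and numbers are assumptions chosen only to show how unequal pool sizes skew a mutually inhibitory competition.

```python
def clip(value):
    """Keep an activation within the 0-to-1 range."""
    return max(0.0, min(1.0, value))


def cycles_to_detect(cue_side, target_side, intact, cue_cycles=5,
                     rate=0.1, inhibition=0.5, threshold=0.9, max_cycles=1000):
    """Cycles after target onset until the target side's attention pool reaches threshold.

    `intact` maps each side to the fraction of its attention units that survive.
    """
    activation = {"left": 0.5, "right": 0.5}  # some attention at both locations

    def step(stimulated_side):
        new = {}
        for side, other in (("left", "right"), ("right", "left")):
            # Bottom-up input and outgoing inhibition both scale with surviving units.
            bottom_up = intact[side] if side == stimulated_side else 0.0
            competition = inhibition * intact[other] * activation[other]
            new[side] = clip(activation[side] + rate * (bottom_up - competition))
        activation.update(new)  # synchronous update of both pools

    for _ in range(cue_cycles):
        step(cue_side)
    for cycle in range(1, max_cycles + 1):
        step(target_side)
        if activation[target_side] >= threshold:
            return cycle
    return max_cycles


def cuing_cost(target, other, intact):
    """Invalid-cue minus valid-cue detection time for a target on the given side."""
    return (cycles_to_detect(other, target, intact)
            - cycles_to_detect(target, target, intact))


healthy = {"left": 1.0, "right": 1.0}
damaged = {"left": 1.0, "right": 0.5}  # half of the right-side units removed

print("intact model:", cuing_cost("right", "left", healthy))
print("damaged model, target on intact side:", cuing_cost("left", "right", damaged))
print("damaged model, target on damaged side:", cuing_cost("right", "left", damaged))
```

In this sketch the cost of an invalid cue should come out largest for targets on the damaged side, even though nothing in the code is dedicated to disengaging attention. Validly cued targets on the damaged side are also somewhat slowed here, a simplification that the full model need not share.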

One might wonder whether we have really succeeded in simulating the disengage deficit without a disengage component, or whether some part of the model with a different label, such as the attention units or the inhibitory connections between attention units, is actually the disengage component. To answer this question, consider some of the attributes that would define a disengage component: First, it should be brought into play by perception of the target, and not the cue, on a given trial. Second, it should be used to disengage attention and not for any other function. By these criteria, there is no part of the model that is a disengager. The attention units as well as their inhibitory connections are brought into play by both cue and target presentations. In addition, the attention units are used as much for engaging attention as for disengaging it. Therefore, we conclude that the disengage deficit is an emergent property of imbalanced competitive interactions among remaining parts of the system, which do not contain a distinct component for disengaging attention.

Relevance of the locality assumption. After damage to the attention units on one side of the model, the nondamaged attention units on the other side function differently. Specifically, once activated they show a greater tendency to maintain their activation. This is because of the reduced ability of the attention units on the damaged side to recapture activation from the intact side, even when they are receiving bottom-up stimulus activation. The ability of this model to account for the disengage deficit depends critically upon this nonlocal aspect of its response to damage.

The functional architecture of visual face recognition: Separate components for visual processing and awareness?

Prosopagnosia is an impairment of face recognition that can occur relatively independently of impairments in object recognition, and which is not caused by impairments in lower-level vision, or memory. Prosopagnosic patients are impaired in tests of face recognition such as naming faces or classifying them according to semantic information such as occupation, and are also impaired in everyday life situations that call for face recognition. Furthermore, by their own introspective reports, prosopagnosics do not feel as though they recognize faces. However, when tested using certain indirect techniques, some of these patients show evidence of face recognition. This has been taken to imply that their impairment lies not in face recognition per se, but in the transfer of the products of their face recognition system to another system required for conscious awareness. This in turn implies that different components of the functional architecture of the mind are needed to enable perception and awareness of perception.

Evidence for dissociated recognition and awareness of recognition. Three representative types of evidence will be summarized here. The most widely documented form of "covert" face recognition occurs when prosopagnosics are taught to associate names with photographs of faces. For faces and names that were familiar to the subjects prior to their prosopagnosia, correct pairings are learned faster than incorrect pairings (e.g., de Haan, Young & Newcombe, 1987a). An example of this type of finding is shown in Table 4. It seems to imply that, at some level, the subject must have preserved knowledge of the faces' identities. The other two tasks are reaction time tasks. One is a task that measures speed of visual analysis of faces, in which subjects must respond as quickly as possible whether two photographs depict the same face or different faces. Normal subjects perform this task faster with familiar than unfamiliar faces. Surprisingly, as shown in Table 5, a prosopagnosic subject showed the same pattern, again implying that he was able to recognize the faces, albeit without conscious awareness of recognizing them (de Haan, et al., 1987). The last task to be reviewed is a kind of semantic priming task. Subjects must classify printed names as being actors or politicians as quickly as possible, while on some trials photographs of faces are presented in the background. Even though the faces are irrelevant to the task subjects must perform, they influence reaction times to the names. Specifically, normal subjects are slowed in classifying the names when the faces come from the other occupation category. As shown in Table 6, the prosopagnosic patient who was tested in this task showed the same pattern of results, implying that he was unconsciously recognizing the faces sufficiently fully to derive occupation information from them (de Haan, et al., 1987).

Interpretation of covert recognition relative to the functional architecture of visual recognition and conscious awareness. The dissociation between performance on explicit tests of face recognition and patients' self-report of their conscious experience of looking at faces, on the one hand, and performance on implicit tests of face recognition on the other, has suggested to many authors that face recognition and the ability to make conscious use of face recognition depend on different components of the functional architecture. For example, de Haan, Bauer and Greve (1992) interpret covert recognition in terms of the components shown in Figure 10, in which separate components of the functional architecture subserve face recognition and conscious awareness thereof. According to their model, the face-specific visual and mnemonic processing of a face (carried out within the "Face processing module") proceeds normally in covert recognition, but the results of this process cannot access the "Conscious awareness system" because of a lesion at location number 1.

Accounting for dissociated covert and overt recognition with an interactive architecture with the same components used for overt and covert recognition. Randy O'Reilly, Shaun Vecera and I have modeled overt and covert recognition using the five-layer recurrent network shown in Figure 11, in which the same set of so-called "face units" subserve both overt and covert recognition (Farah, O'Reilly & Vecera, 1992). The "face input" units subserve the initial visual representation of faces, the "semantics" units subserve representation of the semantic knowledge of people that can be evoked by either the person's face or name, and the "name" units subserve the representation of names. Hidden units were used to help the network learn the associations among patterns of activity in each of these three layers. These are located between the "face" and "semantic" units, (called the "face hidden" units) and between the "name" and the "semantic" units (the "name" hidden units). Thus, there are two pools of units that together comprise the visual face recognition system in our model, in that they represent visual information about faces: the "face input" units and the "face hidden" units.

The connectivity among the different pools of units was based on the assumption that in order to name a face, or to visualize a named person, one must access semantic knowledge of that person. Thus, face and name units are not directly connected, but send activation to one another through hidden and semantic units. All connections shown in Figure 11 are bidirectional.

Faces and names are represented by random patterns of 5 active units out of the total of 16 in each pool. Semantic knowledge is represented by 6 active units out of the total of 18 in the semantic pool. The only units for which we have assigned an interpretation are the "occupation units" within the semantic pool. One of them represents the semantic feature "actor" and the other represents the semantic feature "politician." The network was trained to be able to associate an individual's face, semantics, and name whenever one of these was presented, using the Contrastive Hebbian Learning algorithm (Movellan, 1990). After training, the network was damaged by removing units.

Figure 12 shows the performance of the model in a 10-alternative forced-choice naming task for face patterns, after different degrees of damage to the "face input" and "face hidden" units. At levels of damage corresponding to removal of 62.5% and 75% of the "face" units in a given layer, the model performs at or near chance on this overt recognition task. This is consistent with the performance of prosopagnosic patients who manifest covert recognition. Such patients perform poorly, but not invariably at chance, on overt tests of face recognition.
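The pattern format and the lesioning procedure can be sketched as follows. This is a hedged illustration rather than the original code: the function names are hypothetical, active units are assumed to be coded 1 and inactive units 0, and units are assumed to be removed at random. The pool sizes and damage levels come from the text.

```python
import random


def random_pattern(n_units, n_active, rng):
    """A pattern with exactly `n_active` randomly chosen active units (coded 1, others 0)."""
    active = set(rng.sample(range(n_units), n_active))
    return [1 if i in active else 0 for i in range(n_units)]


def surviving_units(n_units, proportion_removed, rng):
    """Indices of the units left after removing a random proportion of a pool."""
    n_removed = round(n_units * proportion_removed)
    removed = set(rng.sample(range(n_units), n_removed))
    return [i for i in range(n_units) if i not in removed]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

face = random_pattern(16, 5, rng)        # 5 of 16 units active
semantics = random_pattern(18, 6, rng)   # 6 of 18 units active

for damage in (0.625, 0.75):
    left = surviving_units(16, damage, rng)
    print(f"{damage:.1%} damage leaves {len(left)} of 16 units")
```

For a 16-unit pool, the two damage levels mentioned in the text correspond to removing 10 and 12 units, leaving 6 and 4. In a 10-alternative forced-choice task, chance performance is 10% correct.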

In contrast, the damaged network showed faster learning of correct face-name associations. When retrained after damage, the network consistently showed more learning for correct pairings than incorrect in the first 10 training epochs, as shown in Figure 13. The damaged network also completed visual analysis of familiar faces faster than unfamiliar. When presented with face patterns after damage, the face units completed their analysis of the input (i.e., the face units settled) faster for familiar than unfamiliar faces, as shown in Figure 14. And finally, the damaged network showed semantic interference from faces in a name classification task. Figure 15 shows that when the network was presented with name patterns, and the time to classify them according to occupation (i.e., the number of processing cycles for the occupation units to reach threshold) was measured, classification time was slowed when a face from the incorrect category was shown, relative to faces from the correct category and, in some cases, to a no-face baseline.

Why does the network retain "covert recognition" of the faces at levels of damage that lead to poor or even chance levels of overt recognition? The general answer lies in the nature of knowledge representation in PDP networks. As already mentioned, knowledge is stored in the pattern of weights connecting units. The set of the weights in a network that cannot correctly associate patterns because it has never been trained (or has been trained on a different set of patterns) is different in an important way from the set of weights in a network that cannot correctly associate patterns because it has been trained on those patterns and then damaged. The first set of weights is random with respect to the associations in question, whereas the second is a subset of the necessary weights. Even if it is an inadequate subset for performing the overt association, it is not random; it has, "embedded" in it, some degree of knowledge of the associations. Furthermore, consideration of the tasks used to measure covert recognition suggests that the covert measures should be sensitive to this embedded knowledge.

A damaged network would be expected to re-learn associations that it originally knew faster than novel associations because of the nonrandom starting weights. The faster settling with previously learned inputs can be attributed to the fact that the residual weights come from a set designed to create a stable pattern from that input. Finally, to the extent that the weights continue to activate partial and subthreshold patterns over the nondamaged units in association with the input, then these resultant patterns would contribute activation towards the appropriate units downstream, which are simultaneously being activated by intact name units.
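The difference between damaged and untrained weights can be shown with a deliberately tiny example. The sketch below substitutes a one-shot Hebbian linear associator for the recurrent network and Contrastive Hebbian Learning used in the model, and the patterns are hypothetical. It is meant only to illustrate how a lesioned set of trained weights can still produce a partial, weakened version of the correct output.

```python
def hebbian_weights(inputs, targets):
    """Weights set to the sum over patterns of target times input, scaled by input size."""
    n_in = len(inputs[0])
    n_out = len(targets[0])
    return [
        [sum(t[o] * x[i] for x, t in zip(inputs, targets)) / n_in for i in range(n_in)]
        for o in range(n_out)
    ]


def respond(weights, x, removed=frozenset()):
    """Output activations when the input units listed in `removed` have been lesioned."""
    return [
        sum(w * xi for i, (w, xi) in enumerate(zip(row, x)) if i not in removed)
        for row in weights
    ]


# Hypothetical orthogonal "face" patterns and the "semantic" patterns they should evoke.
faces = [[1, 1, -1, -1], [1, -1, 1, -1]]
semantics = [[1, 1], [1, -1]]

trained = hebbian_weights(faces, semantics)
untrained = [[0.0] * 4 for _ in range(2)]
damage = {0, 1}  # remove half of the face units

print(respond(trained, faces[0]))                    # intact, trained
print(respond(trained, faces[0], removed=damage))    # damaged, trained
print(respond(untrained, faces[0], removed=damage))  # never trained
```

In this example the damaged network's output keeps the correct signs at half strength, whereas the untrained network produces no output at all. Such weakened but correctly signed activation is the kind of embedded knowledge that covert measures could plausibly pick up.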

Relevance of the locality assumption. The role of the locality assumption is less direct in this example than in the previous two, but it is nevertheless relevant. Many authors have reasoned according to the locality assumption that the selective loss of overt recognition and preservation of covert recognition implies that there has been localized damage to a distinct component of the functional architecture needed for overt but not covert recognition. The alternative account, proposed here, suggests that partial damage to the visual face recognition component changes the relative ability of the remaining parts of the system (i.e., the remaining parts of the face recognition component along with the other components) to perform the overt and covert tasks. Specifically, the discrepancy between the difficulty of the overt and covert tasks is increased, as can be seen by comparing the steep drop in overt performance as a function of damage shown in Figure 12 with the relatively gentle fall-off in the magnitude of the covert recognition effects shown in Figures 13-15. According to the model, this is because the information processing required by the covert tasks can make use of partial knowledge encoded in the weights of the damaged network, and is therefore more robust to damage than the information processing required by the overt task. In other words, with respect to the relative ability of the remaining system to perform overt and covert tasks, the effects of damage were nonlocal. The ability of the model to account for the dissociation between overt and covert recognition depends critically on this violation of the locality assumption.

GENERAL DISCUSSION

Evaluating the truth and methodological necessity of the locality assumption

The foregoing examples were intended as a small "data base" with which to test two empirical claims about the locality assumption: First, that the locality assumption is true, in other words, that after local brain damage the remaining parts of the system continue to function as before. Second, that the locality assumption is necessary, in other words, that there is no other way to make principled inferences from the behavior of brain-damaged subjects to the functional architecture of the mind, and the only alternative is therefore to abandon cognitive neuropsychology.

The examples allow us to assess the likely truth of the locality assumption by assessing the likely truth of the different inferences made with and without the locality assumption. Of course, each such pair of inferences was made on the basis of the same data and fits that data equally well, so that the choice between them rests on considerations of parsimony and of consistency with other information about brain organization. On the basis of these considerations, the inferences made without the locality assumption seem preferable. In the case of semantic memory, the model obtained without the locality assumption is consistent with an abundance of other data implicating modality-specificity as a fundamental principle of brain organization, and with the lack of any other example of a purely semantic distinction determining brain organization. In the case of visual attention, the model obtained without the locality assumption has fewer components: Although the overviews of the models presented in Figures 6 and 7 are not strictly comparable (Fig. 6 includes components postulated to account for other attentional phenomena, and Fig. 7 includes separate depictions of the left and right hemispheres' attentional mechanisms as well as two different levels of stimulus representation), it can be seen that the same "Attention" component shown in Figure 7 does the work of both the "Engage" and "Disengage" components in Figure 6. Similarly, setting aside the irrelevant differences in the complexity of Figures 10 and 11, due to factors such as the greater range of phenomena to be explained by Figure 10, it is clear that the same visual "Face" components in Figure 11 do the work of the visual "Face" components and "Conscious Awareness System" in Figure 10, at least as far as explaining performance in overt and covert tasks.

It should be noted that the success of these models is a direct result of denying the locality assumption. As explained in the subsections entitled "Relevance of the locality assumption," in linking each neuropsychological dissociation to the more parsimonious functional architecture, a key explanatory role is played by the nonlocal effects that damage to one component of the architecture has on the functioning of other components. Therefore, the weight of evidence from the three cases discussed here suggests that the locality assumption is false. Finally, with respect to the necessity of the locality assumption, the examples presented constitute existence proofs that principled inferences can be made in cognitive neuropsychology without the locality assumption.

Possible objections

In this section, I will consider some possible objections to these conclusions, with the hope of clarifying what has and has not been demonstrated above.

PDP and box-and-arrow: apples and oranges? One kind of objection concerns the comparability of the hypotheses that were derived with and without the locality assumption. The two types of hypothesis do indeed differ in some fundamental ways, and comparing them may be a bit like comparing apples and oranges. Nevertheless, apples and oranges do share some dimensions along which it is meaningful to compare them, and I will argue that the hypotheses being compared here are likewise comparable in the ways discussed above.

For example, it might be objected that the computer models that deny the locality assumption can only demonstrate the sufficiency of a theory, not its empirical truth, whereas the alternative hypotheses are empirically grounded. It is true that the models presented here have only been shown to be sufficient to account for the available data, but this is also true of the alternative hypotheses, and indeed of any hypothesis. It is always possible that a hypothesis can fit all of the data so far collected, but that some other, as yet undiscovered, data could falsify it. The reason that this may seem more problematic for PDP models is that there is a research tradition within computer modeling that takes as its primary goal the accomplishment of a task, rather than the fitting of psychological data (e.g., Sejnowski & Rosenberg, 1986), relying exclusively on computational constraints rather than empirical constraints to inform the models. However, this is not a necessary feature of modeling, and the models presented here are constrained as much as the alternative hypotheses are by the empirical data.

Furthermore, the computational models presented here and the alternative hypotheses are on equal footing with respect to the distinction between prediction and retrodiction of data. In all three cases discussed, the locality assumption has been used to derive an hypothesis, post hoc, from the observed neuropsychological dissociation. It was not the case that researchers had already formulated hypotheses that semantic memory was subdivided by taxonomic category, or that there was a distinct component of the attention system for disengaging attention, or that awareness of face recognition depended on distinct parts of the mental architecture from face recognition, and that they then went looking for the relevant dissociations to test those hypotheses. Rather, they began with the data, and inferred their hypotheses, just as we have done with the models presented earlier. Both the hypotheses derived using the locality assumption and the PDP models presented here await further testing with new data. An example of the way in which new data can be used to distinguish between the competing hypotheses comes from the work of Verfaellie, Rapcsak and Heilman (1990) with a bilateral parietal-damaged patient. They found that, contrary to their expectation of a bilateral disengage deficit, their subject showed diminished effects of attentional cuing. When attention units are removed bilaterally from the Cohen et al. model, which was developed before the authors knew of the Verfaellie et al. finding, the model also shows reduced attentional effects rather than a bilateral disengage deficit. This is because the disengage deficit in our model is caused by the imbalance in the number of attention units available to compete with one another after unilateral damage; bilateral damage does not lead to an imbalance, but does, of course, reduce the overall number of attention units and therefore the magnitude of the attentional effects.

Another way in which the comparisons presented above might seem mismatched is in their levels of description. The hypotheses derived using the locality assumption concern "macrostructure," that is, the level of description that identifies the components of the functional architecture, as shown in the so-called "box-and-arrow" models. In contrast, the hypotheses that deny the locality assumption appear to concern "microstructure," that is, the nature of the information processing that goes on within the architectural components. In fact, however, the latter hypotheses concern both microstructure and macrostructure, as should be clear from the macrostructures depicted in Figures 2, 7, and 11. We can therefore compare the two types of hypothesis at the level of macrostructure.

The locality assumption can be saved with more fine-grained empirical analysis of the deficit. Perhaps the prospects for the locality assumption look so glum because the types of data considered so far are unduly limited. The arguments and demonstrations given above concern a relatively simple type of neuropsychological observation, namely, a selective deficit in some previously normal ability. I have focussed on this type of observation for two reasons, the first of which is its very simplicity, and the seemingly straightforward nature of the inferences that follow from it. At first glance, a truly selective deficit in A does seem to demand the existence of an A component, and this inference is indeed sound under the assumption that the A component is informationally encapsulated. The second reason for focussing on this form of inference is that it is still the most common form of inference in cognitive neuropsychology, as argued earlier in the section on "ubiquity."

Nevertheless, there are other, more fine-grained, ways of analyzing patient performance that are used increasingly by cognitive neuropsychologists to pinpoint the underlying locus of impairment within a patient's functional architecture. The two most common are qualitative error analyses, and selective experimental manipulations of difficulty of particular processing stages. Can the use of the locality assumption be buttressed by the additional constraints offered by these methods? Several recent PDP simulations of patient performance suggest that these more fine-grained analyses are just as vulnerable to nonlocal effects of brain damage as are the more brute-force observations of deficit per se.

For example, semantic errors in single word reading (e.g., pear -> "apple") have been considered diagnostic of an underlying impairment in the semantic representations used in reading, and visual errors (pear -> "peer") are generally taken to imply a visual processing impairment (e.g., Coltheart, 1985). Hinton and Shallice (1991) showed how a PDP simulation of reading could produce both kinds of errors when lesioned either in the visual or the semantic components of the model. Humphreys, Freeman and Muller (in press) make a similar point in the domain of visual search: Error patterns suggestive of an impairment in gestalt-like grouping processes can arise either from direct damage to the parts of the system that accomplish grouping, or by adding noise to earlier parts of the system. In both cases, the nondiagnosticity of error types results from the interactivity among the different components of the model.

Another well-known example of the use of error types to infer the locus of impairment is the occurrence of regularization errors in the reading performance of surface dyslexics (e.g., Coltheart, 1985). As mentioned earlier, surface dyslexics fail to read irregular words, and this has been interpreted, using the locality assumption, as the loss of a whole-word reading route, with preservation of the sublexical grapheme-phoneme translation route. The inference that these patients are relying on the latter route seems buttressed by a further analysis of the nature of their errors, which are typically regularizations (e.g., pint is pronounced like "lint"). However, Patterson, Seidenberg & McClelland (1989) showed that a single-route architecture, composed only of whole-word spelling-sound correspondences, produced both a selective impairment in the reading of irregular words and a tendency to regularize them, when partially damaged. With the distributed representations used in their model, similar orthographies and phonologies have similar representations at each of these levels, and there is consequently a tendency towards generalization. Although with training the system learns not to generalize the pronunciation of, say, pint to the pronunciation of most other -int words (such as lint, mint, hint), this tendency is unmasked at moderate levels of damage. The model's regularization errors are probably best understood as a result of the distributed nature of the word representations in their model. However, the principles of PDP are closely interrelated, and the regularization effects can also be viewed as the result of interactions among different word representations, with the less common pronunciations losing their "critical mass" and therefore being swamped by the remaining representations of more common pronunciations.

Analysis of selective deficits, and of the nature of the errors produced, have in common the use of a purely observational method. Perhaps experimental manipulations designed to tax the operation of specific components offer a more powerful way of pinpointing the locus of impairment. Two recent models speak to this possibility, and show that direct manipulations of particular processing stages are no more immune to nonlocal effects than are the previous methods. Mozer and Behrmann's (1990) model of visual-spatial neglect shows how manipulation of a stimulus property designed to affect post-visual processing, namely lexicality of a letter string (word, pseudoword, nonword), can have pronounced effects on the performance of a model whose locus of damage is visual. Interactions between attended visual information and stored lexical representations allow letter-strings to be reconstructed more efficiently the more they resemble familiar words. Tippett and Farah (1992) showed how apparently conflicting results in the literature on the determinants of naming difficulty in Alzheimer's disease can be accounted for with a single hypothesis. Although most researchers believe that the naming impairment in Alzheimer's disease results from an underlying impairment of semantic knowledge, manipulations of visual difficulty (degraded visual stimuli) and lexical access difficulty (word frequency) have pronounced effects on patients' likelihood of naming, leading to alternative hypotheses of visual agnosia or anomia (Nebes, 1989). When semantic representations were damaged, a PDP model of visual naming showed heightened sensitivity to visual degradation and word frequency. Thus, when one component of an interactive system is damaged, the system as a whole becomes more sensitive to manipulations of the difficulty of any of its components.
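
One simple way in which such heightened sensitivity can arise is sketched below in Python. This is not the Tippett and Farah model: it collapses the interactive dynamics of a network into a single saturating output function, and the function names, the GAIN and BIAS values, and the 0-to-1 "integrity" scale for each component are all assumptions made for illustration. The point is only that when the components jointly determine the output through a nonlinearity, the same visual or lexical manipulation costs more once the semantic component is weakened.

```python
import math

# Hypothetical parameters for this sketch only.
GAIN = 4.0  # steepness of the saturating output function
BIAS = 1.5  # total support at which naming succeeds half the time


def naming_probability(visual, semantic, lexical):
    """Chance of naming, given the integrity (0 to 1) of each component."""
    support = visual + semantic + lexical
    return 1.0 / (1.0 + math.exp(-GAIN * (support - BIAS)))


def visual_degradation_cost(semantic, degraded=0.5):
    """Drop in naming when the visual input is degraded, at a given semantic integrity."""
    return naming_probability(1.0, semantic, 1.0) - naming_probability(degraded, semantic, 1.0)


def low_frequency_cost(semantic, low_frequency=0.5):
    """Drop in naming for a hard-to-access (low-frequency) word, at a given semantic integrity."""
    return naming_probability(1.0, semantic, 1.0) - naming_probability(1.0, semantic, low_frequency)


for label, semantic in (("intact semantics", 1.0), ("damaged semantics", 0.2)):
    print(label,
          "visual cost:", round(visual_degradation_cost(semantic), 3),
          "frequency cost:", round(low_frequency_cost(semantic), 3))
```

In this toy the undamaged system sits near ceiling, so degrading the visual input or lowering word frequency has little effect; with the semantic component weakened, the same manipulations produce a much larger drop, even though neither the visual nor the lexical component has been damaged.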

In sum, the problem of nonlocal effects of brain damage is not limited to inferences based on the range and boundaries of the impairment; it also affects inferences based on the qualitative mode of failure, and the sensitivity of the system to manipulations designed to affect specific components directly.

PDP could be false. A different type of objection concerns the assumptions of the PDP framework. As already acknowledged, PDP is controversial. How can one be convinced by the comparisons involving PDP models that the locality assumption is false if it has not first been established that PDP is a correct way of characterizing human information processing? First, it should be pointed out that much of the controversy concerning PDP involves the adequacy of PDP models of language and reasoning, which are not relevant here. Few vision researchers would deny that the basic principles of PDP are likely to apply to visual attention and pattern recognition (see, e.g., the recent textbook overviews of these topics by Allport, 1989; Biederman, 1990; Hildreth & Ullman, 1989; Humphreys & Bruce, 1989; and even Pinker, 1985, who has been critical of PDP models of language). Semantic memory may be a more controversial case. Second, and perhaps more important, one can remain agnostic about PDP as a general framework for human information processing and still appreciate that the particular models presented here are credible alternatives to those derived using the locality assumption. In fact, PDP, like the locality assumption, is ultimately an empirical claim that will gain or lose support according to how well it helps explain psychological data. The ability of PDP to provide parsimonious accounts for neuropsychological dissociations like the ones described here counts in its favor. Finally, even if PDP were false, there remain other ways of conceptualizing human information processing that provide explicit, mechanistic alternatives to modularity. For example, in production system architectures (see Klahr, Langley & Neches, 1987) working memory is highly nonencapsulated. Kimberg and Farah (in press) found that weakening association strengths in working memory produced an array of specific and characteristic frontal impairments, which were not in any way transparently related to working memory. Although interactive computation is at the heart of PDP, which makes PDP the natural architecture to contrast with the locality assumption, other architectures are also capable of accommodating high degrees of interactivity.

General implications of denying the locality assumption

Modularity. The truth of the locality assumption has implications for issues in psychology beyond how best to infer functional architecture from the behavior of brain-damaged patients. As discussed at the outset, the locality assumption follows from a view of the mind and brain corresponding to the modularity hypothesis as described by Fodor (1983). According to this hypothesis, the components of the functional architecture are informationally encapsulated, that is, their inputs and outputs are highly constrained. Components interact only when one has completed its processing, at which point it makes the end product available to a relatively small number of other components. If this were true, then the effects of damaging one component should be relatively local. Alternatively, if we judge that the best interpretation of various neuropsychological deficits (on the grounds of parsimony or consistency with other scientific knowledge, not on the grounds of a priori preferences for encapsulation or interactivity) involves denying the locality assumption, then this counts as evidence against modularity.

The term "modularity" is often used in a more general sense than I have used it so far, and this more general sense is not challenged by the failure of the locality assumption. Specialized representations are sometimes called "modules," such that the model in Figure 2 could be said to contain "visual knowledge" and "functional knowledge" modules. In this more general sense, the "modularity hypothesis" is simply the hypothesis that there is considerable division of labor among different parts of the functional architecture, with, for example, knowledge of language represented by a separate part of the system (functionally, and possibly anatomically) from other knowledge. Of course, if such a system is highly interactive, it may be difficult to delineate and characterize the different modules, but this is a problem of how you find something out, not of what something is or whether it exists.

"Top-down" versus "bottom-up" research strategies. Denying the locality assumption also has a more general implication for research strategy in cognitive neuroscience. Most researchers in neuroscience and cognitive science acknowledge that there are multiple levels of description of the nervous system, from molecules to thoughts, and that one of the goals of science is a complete description of the nervous system at all of these levels. However, such researchers may differ in their opinions as to the most efficient way to arrive at this complete description. The bottom-up, or reductionist, approach is to begin with the most elementary levels of description, such as the biophysics of neurons, on the belief that it will be impossible to understand higher levels of organization if one does not know precisely what is being organized. This approach is anathema to cognitive neuroscience, which is, by definition, forging ahead with the effort to understand such higher-level properties of the brain as perception, memory, and so forth, while acknowledging that our understanding of the more elementary level of description is far from complete.

The main alternative approach, explicitly endorsed by many cognitive neuroscientists, is the top-down approach, according to which the most efficient way to understand the nervous system is by successive stages of analysis of systems at higher levels of description in terms of lower levels of description. It is argued that our understanding of lower levels will be facilitated if we know what higher level function they serve. It is also argued that the complexity of the task of understanding the brain will be reduced by the "divide and conquer" aspect of this strategy, in which the system is analyzed into simpler components which can then be further analyzed individually (e.g., Kosslyn et al.'s, 1990, "Hierarchical Decomposition Constraint"). In the context of the three examples discussed earlier, this corresponds to first deriving the macrostructural hypotheses, in which the relevant components of the functional architecture are identified, and then investigating the microstructure of each component's internal operation. Unfortunately, to derive a macrostructure from neuropsychological data requires either making the locality assumption or considering the system's microstructure, as was done in the foregoing examples. If the locality assumption is false, then the microstructure has implications for the macrostructure, and one cannot be assured of arriving at the correct macrostructural description without also considering hypotheses about microstructure.

Thus, even if one's only goal is to arrive at the correct macrostructural description of the functional architecture, as is the case for most cognitive neuropsychologists, the three examples presented here suggest that one must nevertheless consider hypotheses about microstructure. This points out a correspondence between theories of functional architecture and the methodologies for studying the functional architecture. If one holds that the components of the functional architecture are informationally encapsulated, then one can take a strictly top-down approach to the different levels of description, "encapsulating" one's investigations of the macrostructure from considerations of microstructure. In contrast, if one views the functional architecture as a highly interactive system, with each component responding directly or indirectly to the influences of many others, then one must adopt a more interactive mode of research, in which hypotheses about macrostructure are influenced by constraints simultaneously imposed at both the macrostructural and microstructural levels.

Implications for cognitive neuropsychology. The conclusion that the locality assumption may be false is a disheartening one. It undercuts much of the special appeal of neuropsychological dissociations as evidence about the functional architecture. Although perhaps naive in hindsight, this special appeal came from the apparent directness of neuropsychological data. Conventional methods of cognitive psychology are limited to what Anderson (1978) has called "input-output" data: Manipulation of stimuli and instructions on the input end, and the measurement of responses and response latencies at output. From the relations between these, the nature of the intervening processing must be inferred. Such inferences are indirect, and as a result often underdetermine choices between competing hypotheses. In contrast, brain damage has its effects directly on the intervening processing; it constitutes a direct manipulation of the "black box." Unfortunately, the examples presented here suggest that even if the manipulation of the intervening processing is direct, the inferences by which the effects of the manipulations must be interpreted are not. In Ferrier's (1886) words, it may well be "at least highly difficult to trace any uncomplicated connection between the symptoms produced and the lesion as such." The locality assumption, which constitutes the most straightforward way of interpreting neuropsychological impairments, does not necessarily lead to the correct interpretation. If the locality assumption is indeed false, then dissociations lose their special status as a particularly direct form of evidence about the functional architecture.

Even for cognitive neuropsychologists who would not claim any special status for neuropsychological data, abandoning the locality assumption would make their work harder. The interpretation of dissociations without the locality assumption requires exploring a range of possible models which, when damaged, might be capable of producing that dissociation. What makes this difficult is that the relevant models would not necessarily have components corresponding to the distinctions between preserved and impaired abilities, and we therefore lack clear heuristics for selecting models to test.

The foregoing demonstrations and arguments are not intended to settle decisively the issue of whether the locality assumption is correct. As already acknowledged, this is not the type of issue that can be decided based on a single study or even a small number of studies. Instead, my goal has been to call attention to the fact that we do not have any firm basis for an opinion one way or the other, despite the widespread use of the locality assumption. Furthermore, in at least a few cases the best current interpretation seems to involve denying the locality assumption.

It is possible that some cognitive domains will conform more closely to the locality assumption than others, and if so this would have interesting theoretical, as well as methodological, implications concerning the degree of informational encapsulation in different subsystems of the functional architecture. However, until we have a broad enough empirical basis for deciding when the locality assumption can safely be used and when it will lead to incorrect inferences, we cannot simply assume it is true, as has been done almost universally in the past.

References

Allport, D. A. (1985). Distributed memory, modular subsystems, and dysphasia. In Newman, S. K. and Epstein, R. (Eds.) Current Perspectives in Dysphasia. Edinburgh: Churchill Livingstone.

Allport, D. A. (1989). Visual attention. In M. I. Posner (Ed.), Foundations of Cognitive Science. Cambridge, MA: MIT Press.

Anderson, J. R. (1978). Arguments concerning representation for mental imagery. Psychological Review, 85, 249-277.

Basso, A., Capitani, E., & Laiacona, M. (1988). Progressive language impairment without dementia: A case with isolated category specific semantic defect. Journal of Neurology, Neurosurgery and Psychiatry, 51, 1201-1207.

Biederman, I. (1990). Higher-Level Vision. In D. N. Osherson, S. M. Kosslyn, & J. M. Hollerbach (Eds.) Visual Cognition and Action. Cambridge: MIT Press.

Bruyer, R., Laterre, C., Seron, X., Feyereisen, P., Strypstein, E., Pierrard, E., & Rectem, D. (1983). A case of prosopagnosia with some preserved covert remembrance of familiar faces. Brain and Cognition, 2, 257-284.

Caplan, D. (1981). On the cerebral localization of linguistic functions: Logical and empirical issues surrounding deficit analysis and functional localization. Brain and Language, 14, 120-137.

Caramazza, A. (1984). The logic of neuropsychological research and the problem of patient classification in aphasia. Brain and Language, 21, 9-20.

Caramazza, A. (1986). On drawing inferences about the structure of normal cognitive systems from the analysis of patterns of impaired performance: The case for single patient studies. Brain and Cognition, 5, 41-66.

Caramazza, A. (1992). Is cognitive neuropsychology possible? Journal of Cognitive Neuroscience, 4, 80-95.

Caramazza, A., Hillis, A. E., Rapp, B. C. & Romani, C. (1990) The multiple semantics hypothesis: Multiple confusions? Cognitive Neuropsychology, 7, 161-190.

Cohen, J. D., Romero, R. D., & Farah, M. J. (in press). Disengaging from the disengage function: The relation of macrostructure to microstructure in parietal attentional deficits. Journal of Cognitive Neuroscience.

Coltheart, M. (1985). Cognitive neuropsychology and the study of reading. In M.I. Posner & O.S.M. Marin (Eds.) Attention and Performance XI. London: Erlbaum Associates.

De Haan, E. H. F., Bauer, R. M. & Greve, K. W. (1992). Behavioral and physiological evidence for covert recognition in a prosopagnosic patient. Cortex, 28, 77-95.

De Haan, E. H. F., Young, A., & Newcombe, F. (1987). Faces interfere with name classification in a prosopagnosic patient. Cortex, 23, 309-316.

De Renzi, E. (1986). Current issues in prosopagnosia. In H.D. Ellis, M.A. Jeeves, F. Newcombe, and A. Young (Eds.), Aspects of Face Processing. Dordrecht: Martinus Nijhoff.

Farah, M. J., & McClelland, J. L. (1991). A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General, 120, 339-357.

Farah, M. J., McMullen, P. A., & Meyer, M. M. (1991). Can recognition of living things be selectively impaired? Neuropsychologia, 29, 185-193.

Farah, M. J., O'Reilly, R. C., & Vecera, S. P. (in press). Dissociated overt and covert recognition as an emergent property of lesioned neural networks. Psychological Review.

Ferrier, D. (1886). The Functions of the Brain. London: Smith, Elder & Co.

Fodor, J. A. (1983). The modularity of mind. Cambridge, MA: The MIT Press.

Hillis, A. E. & Caramazza, A. (1991) Brain, 114, 2081-2094.

Hildreth, E. C. & Ullman, S. (1989) The computational study of vision. In M. I. Posner (Ed.), Foundations of Cognitive Science. Cambridge, MA: MIT Press.

Hinton, G.E., McClelland, J.L., & Rumelhart, D.E. (1986). Distributed representations. In D.E.Rumelhart & J.L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (pp.77-109). Cambridge, MA: MIT Press.

Hinton, G. E., & Shallice, T. (1991). Lesioning an attractor network: Investigations of acquired dyslexia. Psychological Review, 98 (1), 74-95.

Humphreys, G. W., & Bruce, V. (1989). Visual Cognition: Computational, Experimental and Neuropsychological Perspectives. Hove: Erlbaum.

Humphreys, G. W., Freeman, T., & Muller, H.J. (in press). Lesioning a connectionist model of visual search: Selective effects on distractor grouping. Canadian Journal of Psychology.

Humphreys, G. W. & Riddoch, M. J. (1987) Visual Object Processing: A Cognitive Neuropsychological Approach. Hillsdale, NJ: Erlbaum Associates.

Humphreys, G. W., & Riddoch, M. J. (in press) Interactions between object and space systems revealed through neuropsychology. In D. Meyer and S. Kornblum (Eds.) Attention and Performance XIV. Cambridge: MIT Press.

Jackson, J. H. (1873). On the anatomical and physiological localization of movements in the brain. Lancet, 1, 84-85, 162-164, 232-234.

Kimberg, D. Y. & Farah, M. J. (in press) A unified account of cognitive impairments following frontal lobe damage: The role of working memory in complex, organized behavior. Journal of Experimental Psychology: General.

Klahr, D., Langley, P. & Neches, R. (1987) Production System Models of Learning and Development. Cambridge, MA: MIT Press.

Klein, B. von E. (1977) Inferring functional localization from neurological evidence. In E. Walker (ed.) Explorations in the Biology of Language. Cambridge: Bradford Books.

Kosslyn, S. M., Flynn, R. A., Amsterdam, J. B., & Wang, G. (1990). Components of high-level vision: A cognitive neuroscience analysis and accounts of neurological syndromes. Cognition, 32, 203-277.

Kosslyn, S.M. & Van Kleek, M. (1990). Broken brains and normal minds: Why Humpty Dumpty needs a skeleton. In E. Schwartz (Ed.), Computational neuroscience. Cambridge: The MIT Press.

Moscovitch, M. & Umilta, C. (1990) Modularity and neuropsychology: Implications for the organization of attention and memory in normal and brain-damaged people. In M. Schwartz (Ed.) Modular Processes in Dementia. Cambridge, MA: MIT Press/Bradford Books.

Movellan, J. R. (1990) Contrastive Hebbian learning in the continuous Hopfield model. In D. S. Touretzky, G. E. Hinton, & T. J. Sejnowski (Eds.) Proceedings of the 1989 Connectionist Models Summer School. San Mateo, CA: Morgan Kaufmann, 10-17.

Mozer, M.C. & Behrmann, M. (1990). On the interaction of selective attention and lexical knowledge: A connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience, 2, 96-123.

Nebes, R.D. (1989). Semantic memory in Alzheimer's disease. Psychological Bulletin, 106, 377-394.

Newcombe, F., Mehta, Z., & de Haan, E. H. F. (in press). Category-specificity in visual recognition. In M. J. Farah & G. Ratcliff (Eds.) The Neural Bases of High-Level Vision: Collected Tutorial Essays. Hillsdale: Erlbaum Associates.

Patterson, K. E., Seidenberg, M.S. & McClelland, J. L. (1989) Connections and disconnections: Acquired dyslexia in a computational model of reading processes. In R.G.M. Morris (Ed.), Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford: Oxford University Press.

Pietrini, V., Nertimpi, T., Vaglia, A., Revello, M.G., Pinna, V., & Ferro-Milone, F. (1988). Recovery from herpes simplex encephalitis: Selective impairment of specific semantic categories with neuroradiological correlation. Journal of Neurology, Neurosurgery, and Psychiatry, 51, 1284-1293.

Pinker, S. (1985). Visual cognition: An introduction. In S. Pinker (Ed.), Visual Cognition. Cambridge, MA: The MIT Press.

Posner, M. I., Walker, J. A., Friedrich, F. J., & Rafal, R. D. (1984). Effects of parietal lobe injury on covert orienting of visual attention. Journal of Neuroscience, 4, 1863-1874.

Rosenberg, C. R. & Sejnowski, T. J. (1986) NETtalk: A parallel network that learns to read aloud. EE & CS Technical Report # JHU-EECS-86/01. Johns Hopkins University, Baltimore, MD.

Rumelhart, D. E., & McClelland, J. L. (1986). Parallel Distributed Processing. Vol. 1: Foundations. Cambridge, MA: MIT Press.

Sartori, G., & Job, R. (1988). The oyster with four legs: A neuropsychological study on the interaction of visual and semantic information. Cognitive Neuropsychology, 5, 105-132.

Shallice, T. (1988) From Neuropsychology to Mental Structure. New York: Cambridge University Press.

Silveri, M. C. & Gainotti, G. (1988). Interaction between vision and language in category-specific semantic impairment. Cognitive Neuropsychology, 5, 677-709.

Squire, L.R. (1992) Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychological Review, 99, 195-231.

Tippett, L.J. & Farah, M.J. (1992). A computational model of naming in Alzheimer's disease: Semantic, visual, and lexical factors. Society for Neuroscience Abstracts, 18, part I, 735.

Verfaellie, M., Rapcsak, S. Z., & Heilman, K. M. (1990). Impaired shifting of attention in Balint's syndrome. Brain and Cognition, 12, 195-204.

Warrington, E. K. (1985) Agnosia: The impairment of object recognition. In P. J. Vinken, G. W. Bruyn, and H. L. Klawans (Editors), Handbook of Clinical Neurology. Amsterdam: Elsevier.

Warrington, E. K., & McCarthy, R. (1983). Category specific access dysphasia. Brain, 106, 859-878.

Warrington, E. K., & McCarthy, R. (1987). Categories of knowledge: Further fractionation and an attempted integration. Brain, 110, 1273-1296.

Warrington, E. K., & Shallice, T. (1984). Category specific semantic impairments. Brain, 107, 829-854.

Zurif, E. B. (1980). Language mechanisms: A neuropsychological perspective. American Scientist, 305-311.

Acknowledgements

The writing of this paper was supported by ONR grant N00014-91-J1546, NIMH grant R01 MH48274, NIH career development award K04-NS01405, and a grant from the McDonnell-Pew Program in Cognitive Neuroscience. I thank my co-authors on the projects described herein for their collaboration and tutelage in PDP modeling: Jonathan Cohen, Jay McClelland, Randy O'Reilly, Rick Romero, and Shaun Vecera. Special thanks to Jay McClelland for his encouragement and support. I thank several colleagues for discussions of the ideas in this paper: John Bruer, Alfonso Caramazza, Clark Glymour, Mike McCloskey, Morris Moscovitch, Edmund Rolls and Larry Squire. I also thank Larry Weiskrantz for calling my attention to the passage from Ferrier quoted at the beginning. Finally, I thank the reviewers and editor of BBS for useful comments and criticisms of a previous draft: C. Allen, S. Harnad, G. Humphreys, M. McCloskey, M. Oaksford, T. Van Gelder, and four anonymous reviewers.

Table 1.

Picture identification

Case    Living    Nonliving
JBR     6%        90%
SBY     0%        75%

Spoken word definition

Case    Living    Nonliving
JBR     8%        79%
SBY     0%        52%

Table 2.

Examples of definitions

Living things:

JBR
Parrot - don't know
Daffodil - plant
Snail - an insect animal
Eel - not well
Ostrich - unusual

SBY
Duck - an animal
Wasp - bird that flies
Crocus - rubbish material
Holly - what you drink
Spider - a person looking for things, he was a spider for his nation or country

Nonliving things:

JBR
Tent - temporary outhouse, living home
Briefcase - small case used by students to carry papers
Compass - tools for telling direction you are going
Torch - hand-held light
Dustbin - bin for putting rubbish in

SBY
Wheelbarrow - object used by people to take material about
Towel - material used to dry people
Pram - used to carry people, with wheels and a thing to sit on
Submarine - ship that goes underneath the sea
Umbrella - object used to protect you from water that comes

Table 3.

Spoken word-picture matching

Case    Animals    Flowers    Objects
VER     86%        96%        63%
YOT     86%        86%        67%

Picture-picture matching

Case    Animals    Objects
YOT     100%       69%

Table 4

Performance on correct and incorrect face-name pairings in a face-name relearning task

Trial:                1   2   3   4   5   6   7   8   9   10  11  12
Correct pairings      2   1   1   2   1   2   0   3   2   3   2   2
Incorrect pairings    0   0   0   1   1   0   0   0   1   1   0   0

Table 5

Speed of visual matching for familiar and unfamiliar faces

                         Familiar     Unfamiliar
Prosopagnosic subject    2795 msec    3297 msec
Normal subjects          1228 msec    1253 msec

Table 6

Priming of occupation judgements

                         Baseline     Unrelated    Related
Prosopagnosic subject    1565 msec    1714 msec    1560 msec
Normal subjects          821 msec     875 msec     815 msec

Table Captions

Table 1. Performance of two patients from Warrington and Shallice's (1984) study of semantic memory impairment, on two tasks assessing knowledge of living and nonliving things.

Table 2. Examples of definitions of living and nonliving things given by two patients described by Warrington and Shallice (1984).

Table 3. Performance of two patients described by Warrington and McCarthy (1983; 1987) on tasks assessing knowledge of living and nonliving things.

Table 4. Savings in face-name relearning. Number of correct responses, out of three possible choices, for twelve successive trials of face-name relearning in patient PH (from de Haan et al., 1987).

Table 5. Mean response times to decide whether two faces are the same or different, on the basis of internal facial features, for patient PH and age-matched normal subjects (from de Haan et al., 1987).

Table 6. Mean response times to decide whether a printed name belongs to a politician or an actor, as a function of the occupation of a simultaneously displayed face, for patient PH and age-matched normal subjects (from de Haan et al., 1987).

Figure Captions

Figure 1 Category-specific functional architecture for semantic memory.

Figure 2 Modality-specific functional architecture for semantic memory.

Figure 3 (a) Effects of different degrees of damage to visual semantic memory units on ability of network to associate names and pictures, of living things (diamonds) and nonliving things (squares). (b) Effects of different degrees of damage to functional semantic memory units on ability of network to associate names and pictures, of living things (diamonds) and nonliving things (squares).

Figure 4 Effects of different degrees of damage to visual semantic memory units on ability of network to activate correct pattern in functional semantic memory units for living things (diamonds) and nonliving things (squares).

Figure 5 Sequence of trial events in the lateralized simple reaction time task. (a) fixation display; (b) cue; (c) target.

Figure 6 Functional architecture of visual attention system derived by Posner and associates from the study of brain-damaged patients.

Figure 7 Functional architecture of visual attention system as modelled by Cohen et al. (1992).

Figure 8 Performance of normal subjects and network in lateralized cued simple reaction time task. Number of cycles needed for "response" unit to reach threshold has been regressed onto reaction times.

Figure 9 Performance of parietal-damaged patients and damaged network in lateralized cued reaction time task. Number of cycles needed for "response" unit to reach threshold has been regressed onto reaction times.

Figure 10 Functional architecture of perception and awareness, proposed by Schacter et al. (1990).

Figure 11 Functional architecture of face perception as modelled by Farah et al. (1992).

Figure 12 Effects of different degrees of damage to face units on 10-alternative forced choice performance.

Figure 13 Performance at face-name association initially following damage and after ten retraining epochs, for different degrees of damage to face units.

Figure 14 Effect of different degrees of damage to face units on number of cycles needed for face units to settle for familiar patterns (closed triangles) and unfamiliar patterns (open triangles).

Figure 15 Effect of different degrees of damage to face units on the number of cycles needed for the "actor" and "politician" units to reach threshold when presented with names and faces.

1. There are, of course, many other ways to make a wrong inference using the locality assumption, even with the foregoing conditions satisfied, but these have to do with the particular content of the hypotheses being inferred and its relation to the data, not the use of the locality assumption per se. For example, Caramazza, Hillis, Rapp and Romani (1990) have pointed out that selective impairments in modality-specific knowledge do not imply that knowledge of different modalities is represented in different formats; dissociability will not, in general, tell us about representational formats.

2. Humphreys and Riddoch (in press) have independently suggested that the disengage deficit can be explained without hypothesizing a disengage component in the normal architecture.