Published in Behavioral and Brain Sciences
Volume 25, Number 3: 297-330 (June 2002)
© 2002 Cambridge University Press
The self-organizing consciousness
Pierre Perruchet
Annie Vinter
Running title: The self-organizing consciousness
Pierre Perruchet and Annie Vinter
Université de Bourgogne
LEAD/CNRS
6 Bd Gabriel
21000
email: pierre.perruchet@u-bourgogne.fr or annie.vinter@u-bourgogne.fr
http://www.u-bourgogne.fr/LEAD/
Short Abstract
We propose that the isomorphism generally observed between the representations composing our momentary phenomenal experience and the structure of the world is the end-product of a progressive organization that emerges thanks to elementary associative processes that take our conscious representations themselves as the stuff on which they operate, a thesis that we summarize in the concept of Self-Organizing Consciousness (SOC). We show that the SOC framework accounts for the discovery of words and objects, and for word-object mapping. We then argue that isomorphic representations may underlie seemingly rule-governed behavior, as is observed in the areas of implicit learning of arbitrary structures, language, problem solving, and automatisms. This analysis provides support for the so-called "mentalistic" framework (e.g. Dulany, 1997), which avoids postulating the existence of unconscious representations and computations.
Long Abstract
The conventional cognitive framework rests on the existence of a powerful cognitive unconscious. Indeed, most psychological models heavily rely on the possibility of performing manipulations and transformations of unconscious representations using algorithms that are unable to operate while accommodating the functional constraints of conscious thought.
This paper explores the viability of an alternative framework which has its origins in the work of Dulany (1991, 1997). In this alternative, "mentalistic" framework, to borrow Dulany's terminology, the only representations people create and manipulate are those which form the momentary phenomenal experience. The main challenge is to explain why the phenomenal experience of adult people consists of perceptions and representations of the world which are generally isomorphic with the world structure, without needing recourse to a powerful cognitive unconscious. Our proposal is that this isomorphism is the end-product of a progressive organization that emerges thanks to elementary associative processes that take the conscious representations themselves as the stuff on which they operate. We summarize this thesis in the concept of Self-Organizing Consciousness (SOC).
We first provide evidence of self-organization in the context of an experimental example which concerns the progressive extraction of words from an artificial language presented as an unsegmented speech flow (e.g. Saffran et al., 1997). Our approach is supported by a computer-implemented model, PARSER, the details of which are presented elsewhere (Perruchet & Vinter, 1998b). A remarkable feature of PARSER is that the only representations generated by the model closely match the conscious representations people may have when performing the task. We then show that, provided that we accept a few simple assumptions about the properties of the world that are likely to capture subjects' attention, the rationale underlying PARSER may be extended to the discovery of the relevant units which form natural language and the physical world, and also accounts for word-object mapping.
We then apply the same principles to more complex aspects of the world structure. We show how the SOC framework can account for some forms of behavior seemingly based on the unconscious knowledge of the syntactical structure of the surrounding environment. This demonstration, which was originally stimulated by the literature on implicit learning of arbitrary structures, finds some echoes in the literature on language processing (notably in the so-called distributional approaches, e.g. Redington, Chater, & Finch, 1998), problem solving (for instance in the computation/representation trade-off proposed by Clark & Thornton, 1997), incubation (e.g. Mandler, 1994), decision making, and automatism (notably in the instance-based models, as proposed by Logan, e.g. 1988, and Tzelgov, e.g. 1997). We also show how the SOC framework, in conjunction with simple additional hypotheses, readily accounts for transfer between event patterns across sensory content, as shown for instance in the Marcus et al. (1999) study.
Finally, we argue against the empirical reliability of some additional phenomena that seemingly require the action of the cognitive unconscious. In this context, we critically examine the studies reporting that implicit memory and implicit learning can occur without any attentional processing of the material during the familiarization phase (e.g. Eich, 1984; Cohen, Ivry, & Keele, 1990), and the data allegedly demonstrating the possibility of unconscious processing of semantic information (e.g. Dehaene et al., 1998). Issues related to the apparent dissociation between performance and consciousness in neuropsychological syndromes, such as blindsight, are also briefly discussed.
Our analysis leads to the surprising conclusion that there is no need for the concepts of unconscious representations and knowledge and, a fortiori, the notion of unconscious inferences: Conscious mental life, when considered within a dynamic perspective, could be sufficient to account for adapted behavior. This alternative framework is more parsimonious than the prevalent conceptions in cognitive and developmental sciences because it manages to account for very sophisticated behavior while respecting the important constraints inherent to the conscious/attentional system, such as limited capacity, seriality of processing, and quick forgetting (and even takes advantage of these constraints).
KEYWORDS: Associative learning, automatism, consciousness, development, implicit learning, incubation, language, mental representation, perception, phenomenal experience.
Contents
1- Questioning the Cognitive Unconscious Postulate
1.1. The Computational View of Mind
1.2. The Cognitive Unconscious
1.3. An Alternative Framework
1.4. The Objectives of the Paper
2- The Notion of Self-Organizing Consciousness (SOC)
2.1. Complex Conscious Representations Account for Seemingly Rule-Governed Behavior
2.2. Conscious Representations Self-Organize.
2.3. Overview of the Sections 3-7.
3. The Case for Word Extraction
3.1. The Word-Extraction Issue
3.2. PARSER: The Principles of the Model
3.3. PARSER and Consciousness
3.4. PARSER and Alternative Computational Models
4- Learning the World Units
4.1. Word Extraction in Natural Language
4.2. The Formation of Objects and Word-Object Mapping
5. From Lexicon to Syntax
5.1. Studies Involving Artificial Grammar
5.2. Learning Syntax in Natural Language
5.3. Converging Lines of Evidence from Psycholinguistic Research
5.4. Unconscious Rule Processing Outside of the Language Area
6. Abstracting Away From the Sensory Content
6.1 Experimental Evidence for Abstraction
6.2 The Outline of a Reappraisal
6.3 Perceptual Primitives Can Be Abstract and Relational
6.4 Is Our Account of Transfer More Parsimonious?
6.5 Analyzing Transfer Limitations and Failure
7. Problem Solving, Decision Making, and Automaticity
7.1 Problem Solving and Incubation
7.2 Decision Making
7.3 Automaticity
8. Other Purported Evidence for the Cognitive Unconscious
8.1 Implicit memory and learning without attentional encoding
8.2 What About "The Unconscious Processing of Semantic Information"
8.3. Blindsight and Other Neuropsychological Disorders
9. Conclusion
9.1. Summary
9.2. Looking Towards the Future
1- Questioning the Cognitive Unconscious Postulate
In this introductory section, we point out that, in contradiction to the widespread idea that the issue of consciousness is computationally irrelevant (1.1), the prevalent computational view of mind is grounded on the postulate of an omnipotent cognitive unconscious, which has been tacitly present from the very beginnings of the information processing tradition (1.2). We then outline an alternative perspective, in which this postulate becomes useless (1.3). This section ends with the presentation of our objectives and an overview of the paper (1.4).
1.1. The Computational View of Mind
The objective of psychology, in the prevalent computational view of mind, is to study how human subjects process information. It should be noted that this objective includes no mention of the status, conscious versus unconscious, of the processed information. Addressing this issue is generally conceived of as unnecessary. Indeed, the nature of the representations and computations included in these models is such that they could, in principle, be either conscious or unconscious when implemented in a human brain. This contention holds irrespective of whether this processing is construed in terms of rule abstraction and application as in the mainstream tradition, or in terms of multivariate statistics computation as in the connectionist approach. The fact of being conscious or unconscious is, for a mental construct, a property that does not affect the way this construct intervenes within a processing sequence.
The function assigned to consciousness, when it is considered, generally consists in making certain parts of cognitive functioning accessible. To quote Baars (1998), "Many proposals about brain organization and consciousness reflect a single underlying theme that can be labeled the 'theater metaphor'. In these views, the overall function of consciousness is to provide very widespread access to unconscious brain regions." And elsewhere in the same paper: "A classical metaphor for consciousness has been a 'bright spot' cast by a spotlight on the stage of a dark theater... Nearly all current hypotheses about consciousness and selective attention can be viewed as variants of this fundamental idea". In keeping with this metaphor, the states and operations involved in information processing models occur in the same way whether they are concurrently accessed or not. This type of speculation is often summarized in the claim that consciousness is "computationally irrelevant".
1.2. The Cognitive Unconscious
We agree with the claim that qualifying as conscious or unconscious a representation or an operation involved in a computational model has no effect on the way this model works. But the computational irrelevance of consciousness can no longer be maintained if, instead of considering piecemeal aspects of the models, we consider their overall conditions of functioning. It appears then that most information processing models necessarily rely on a cognitive unconscious, for at least two reasons. First, the algorithms forming the models rarely match the phenomenal experience of the subjects running the tasks that, presumably, trigger these algorithms. Second, and more importantly, these algorithms are generally unable to work while accommodating the functional constraints of conscious thought, such as limited capacity, seriality, relative slowness of processing, and quick memory decay. As Lewicki, Hill, and Czyzewska (1992) wrote to emphasize the power of the cognitive unconscious: "Our conscious thinking needs to rely on notes (with flowcharts or lists of if-then statements) or computers to do the same job that our nonconscious operating processing algorithms can do instantly and without external help" (Lewicki et al., p.798).
Chomskyan psycholinguistics provides a striking illustration of these points. Whatever the fuzziness of the operational measures of consciousness, it is not tenable that the conscious mind is endowed with a Universal Grammar, makes assumptions about the properties of the ambient language, and tests hypotheses in order to set parameters at their appropriate values. To a lesser extent, similar remarks can be addressed to most information processing models. For instance, it is quite common to assume the existence of a syntactic processing device. Even the discovery of words in the continuous speech stream has been conceived of as the product of a mathematical algorithm of optimization, performed thanks to a statistical inference method (e.g. Brent, 1996; see below Section 3.4). Some untaught rules of spelling are also assumed to be unconsciously abstracted (e.g. Bryant, Nunes, & Snaith, 2000).
Of course, the premise of a cognitive unconscious is not limited to the studies on language. Let us consider the transcoding of numerals. One of the most influential transcoding models (McCloskey, 1992) assumes that all numerical inputs are translated into an amodal and abstract representation of quantity, associating every number to a power of ten (e.g. 4030 should be coded $(4)10^3$, $(3)10^1$). Motor activities are also of concern. For instance, how do fielders modulate their speed up to catch a ball before it reaches the ground? According to McLeod and Dienes (1993), they run so that $d^2(\tan\alpha)/dt^2 = 0$, where $\alpha$ is the angle of elevation of gaze from fielder to ball. The authors wrote: "Children probably discover this somewhat obscure strategy... by extrapolating from their experience of watching balls thrown towards them... This strategy is obviously not available consciously. That its effectiveness is discovered demonstrates the power of the brain's unconscious problem-solving abilities" (McLeod & Dienes, 1993, p.23).
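To make the fielder criterion concrete, the following minimal sketch checks it numerically. It is not taken from McLeod and Dienes (1993): it assumes a drag-free projectile and a fielder standing still, and all names (`tan_elevation`, `second_difference`, `demo`) and parameter values are hypothetical. Under these assumptions, the tangent of the gaze elevation angle grows linearly with time only when the fielder stands at the landing point, so its second time derivative vanishes there.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed, no air resistance)


def tan_elevation(t, vx, vy, fielder_x):
    """Tangent of the gaze elevation angle from a stationary fielder to the ball."""
    ball_x = vx * t                    # horizontal position of the ball
    ball_y = vy * t - 0.5 * G * t * t  # height of the ball
    return ball_y / (fielder_x - ball_x)


def second_difference(f, t, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def demo(vx=12.0, vy=15.0, offset=10.0):
    """Compare a fielder at the landing point with one standing `offset` metres too far back."""
    flight_time = 2.0 * vy / G
    landing_x = vx * flight_time
    times = [0.2 * flight_time, 0.5 * flight_time, 0.8 * flight_time]
    at_landing = [
        second_difference(lambda t: tan_elevation(t, vx, vy, landing_x), t)
        for t in times
    ]
    too_far = [
        second_difference(lambda t: tan_elevation(t, vx, vy, landing_x + offset), t)
        for t in times
    ]
    return at_landing, too_far


if __name__ == "__main__":
    at_landing, too_far = demo()
    print("at landing point:", at_landing)
    print("too far back:   ", too_far)
```

In this sketch the second differences are numerically indistinguishable from zero at the landing point and clearly negative for the fielder standing too far back, which is the signal that would prompt a move forward. The point of the example is only to show what the criterion amounts to when written out as an explicit computation.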
These few examples make it clear that, by construing information processing as the main target of psychological science, regardless of the conscious status of the processed information, the prevalent view does not remain neutral with regard to the consciousness issue. This view rests in fact on the existence of a cognitive unconscious (Shevrin & Dickman, 1980; Kihlstrom, 1987). By this expression, we mean that the prevalent view takes for granted the existence of unconscious representations, together with the possibility of performing unconscious manipulations and transformations on these representations. By the same token, the concept of cognitive unconscious includes the assumption that the notions of unconscious knowledge and memory are meaningful, and most authors would probably add to this list the notions of unconscious rule abstraction, unconscious analysis, unconscious reasoning, unconscious inference, and so on.
1.3. An Alternative Framework
1.3.1. The "mentalistic" framework. This paper explores the possibility of an alternative framework, in which the cognitive unconscious has no place. Mental life is posited as co-extensive with consciousness. This idea, in fact, is not new. It has even occupied a respected position in the philosophical tradition since Descartes. More recently, this framework has been cogently articulated by Dulany (1991, 1997), who called it, for want of a better term, the "mentalistic" framework.
The mentalistic view does not challenge the notion of representation as such, nor the idea that rule abstraction, analysis, reasoning, and inferences can be performed on these representations. The conscious experience of each of us provides direct evidence for such operations. This evidence supports the conservative conclusion that we abstract rules and make various computations and inferences when we have direct experience of doing so. These aspects of mental life, which Dulany (1997) calls the "deliberative episodes", are not of focal concern in this paper, although we do not intend to play down their importance in any way.
A departure from the standard cognitive view arises when there is no conscious evidence of performing the cognitive operations that a psychological model stipulates. As pointed out above, the lack of concurrent subjective experience is not thought of as a problem in the information processing tradition, because consciousness is thought of as providing only an optional access to the product of unconscious computations. By contrast, the mentalistic view rejects the notions of unconscious rule abstraction, computation, analysis, reasoning, and inference. Because unconscious representations have no other function than to enter into these activities, eliminating the possibility of these activities actually makes the overall notion of unconscious representation objectless 1. Accordingly, the most salient feature of the mentalistic framework is the denial of the very notion of unconscious representations. The only representations that exist, in this view, are those that are embedded in the momentary phenomenal experience.
Representations, of course, are generated by neural processes, of which we are unaware. Thus, in the mentalistic framework, mental life comprises only two categories of events: the conscious representations, and the unconscious processes generating those representations. The two are linked like the head and the tail of a coin. To quote an earlier paper of ours: "Processes and mechanisms responsible for the elaboration of knowledge are intrinsically unconscious, and the resulting mental representations and knowledge are intrinsically conscious. No other components are needed." (Perruchet, Vinter, & Gallego, 1997, p.44; see also O'Brien and Opie, 1999a, for a link between the notions of representation and consciousness).
1.3.2. About terminology. Common sense knowledge of notions such as process, representation and computation, even if difficult to constrain within an exhaustive definition (as is the case for many other concepts), appears sufficient at this point, because the originality of the mentalistic perspective is anything but a matter of subtle terminological nuances. However, it may be useful to exclude one particular understanding of the notion of representation and computation.
It has become increasingly common to define any pattern of neural activity as a representation, especially in the connectionist framework (e.g. Elman et al., 1996, p.364). Given this approach, any biological consequence of the presentation of a stimulus is a representation of this stimulus. For example, the projection of the world on the retina of the eye provides a representation of the world. A logical consequence of this definition is that most representations are fully unconscious. Of course, such a definition has its own internal consistency: From the observer’s point of view, retinal images are indeed world representations. However, the meaning of the concept is different in the mentalistic framework. Throughout the present paper, the word "representation" designates a mental event that assumes the function of some meaningful component of the represented world (e.g.: a person, an object, a movement, a scene) within the representing world. At least two functions can be envisaged (Dulany, 1997). A representation may evoke other representations (the representation of a pencil may evoke the representation of a pencil box, an exercise book, and so on). It may also enter as an argument into deliberative mental episodes (the representation of a pencil may be involved in reasoning, inference, action planning, and other mental activities). In this terminology, the retinal projection of the pencil does not represent the pencil, because the mosaic of cells of the retinal surface activated by the light reflected by the pencil does not fulfill any of these functions.
Likewise, the notion of computation does not extend to any neural activity, but instead designates the mental operations that take representations as arguments. In the following, the term "computation" will be taken to be synonymous with expressions such as "computation on mental representations".
1.3.3. An illustration. The concrete implications of endorsing a mentalistic view will now be illustrated using a very simple situation. Let us assume that a stimulus S1, initially neutral with regard to its behavioral consequences, comes to elicit an avoidance reaction after its repeated pairing with an aversive stimulus S2. Everyone will have recognized here the schema of a classically conditioned reaction. A first interpretation may be that people have acquired some knowledge about the S1-S2 relationships, then draw the inference "If S1 then S2", thus triggering an avoidance reaction when S1 is displayed. This is a version of the expectancy theory of conditioning, first proposed by Tolman (e.g. 1932), and nowadays largely accepted. This view is compatible with a mentalistic standpoint, as long as people have explicit knowledge about the S1-S2 relationships, and have explicitly drawn the inference that S2 is likely to occur when S1 occurs.
Let us now assume that people no longer remember the earlier S1-S2 pairings during the test, thus making impossible the explicit inference that S1 will be followed by S2. Experimental data suggest that a conditioned reaction can still occur in these conditions (e.g. Gruber, Reed & Block, 1968). In the standard cognitive view, the loss of explicit memory does not matter. People are now assumed to rely on their implicit memory of the S1-S2 pairings, and make the unconscious inference "If S1 then S2". Such an adjustment causes no difficulty, given that the presence or the absence of consciousness is held to be computationally irrelevant.
This interpretation obviously violates the premise of the mentalistic framework. However, is this interpretation mandatory? It is worth remembering here that an alternative interpretation of conditioned performance was proposed long ago. During the training phase, some subjects’ perceptual experiences comprise (at least) some features of S1 endowed with the negative valence triggered by S2. Elementary associative mechanisms are then sufficient to ensure that a negative valence becomes a new intrinsic property of S1. The conditioned avoidance reaction, in this interpretation, is directly elicited by S1. The crucial point is that the formation of knowledge about the stimulus relationships is no longer involved: the link between S1 and S2 has no need to be stored in memory and remembered, either explicitly or implicitly, nor exploited through inferential reasoning. What changes with training is the intrinsic representation of S1, which becomes negatively valenced.
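The alternative interpretation can be sketched in a few lines of code. This is an illustration under stated assumptions, not a model proposed in this paper: the incremental update rule, the learning rate, the valence scale, and the avoidance threshold are all hypothetical choices. What the sketch is meant to show is structural: the only quantity that changes with training is the valence attached to S1 itself, and no S1-S2 link is stored or consulted at test.

```python
def condition(pairings, learning_rate=0.3, s2_valence=-1.0):
    """Return the valence carried by S1 after each S1-S2 pairing.

    The update rule and its parameters are illustrative assumptions.
    """
    s1_valence = 0.0  # S1 is initially neutral
    history = []
    for _ in range(pairings):
        # S1 is experienced together with the affect triggered by S2,
        # so a share of that affect becomes attached to S1 itself.
        s1_valence += learning_rate * (s2_valence - s1_valence)
        history.append(s1_valence)
    return history


def avoids(s1_valence, threshold=-0.5):
    """Avoidance depends only on the current valence of S1 (hypothetical threshold)."""
    return s1_valence < threshold


if __name__ == "__main__":
    history = condition(10)
    print("valence of S1 after training:", history[-1])
    print("avoidance before training:", avoids(0.0))
    print("avoidance after training: ", avoids(history[-1]))
```

Note that `avoids` takes the valence of S1 as its sole argument: once training is over, the response can be produced even if every trace of the pairings themselves were lost.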
There is overwhelming evidence that both interpretations are needed to account for all the reported conditioning data. Some paradigms certainly trigger one process more than the other. For instance, Garcia, Rusiniak, and Brett (1977) tease apart the behavior of rats preparing to cope with a painful reinforcer signaled by an auditory stimulus (a situation mainly involving the knowledge of the S1-S2 relationships), and the behavior of rats that acquire an aversion to a flavor previously associated with sickness (a situation mainly involving a change in the intrinsic representation of S1). However, most paradigms are presumably able to generate both forms of responding. In the discussion published in the pages following Garcia et al.'s (1977) contribution, Seligman distinguishes the learning of an "if-then" relationship from the acquisition of a hedonic shift. These two processes, even though very different, are both generated, according to Seligman, by Pavlovian situations. The responses elicited by these mechanisms differ from each other on a variety of experimental variables in a consistent way (e.g. in their sensitivity to the precise timing of events, in their resistance to extinction, and so on), thus strengthening the idea that conditioned behavior has a dual nature (e.g. Konorski, 1967;
The dual nature of conditioned responses makes it possible to encompass all the available data within a mentalistic framework. When the knowledge of the stimulus relationships is consciously represented, conditioned responses may be of one or the other form. When explicit knowledge is no longer available, however, there is no need to invoke an unconscious analog to our conscious mode of reasoning. Responses may be due to a change in the intrinsic representation of S1. In this case, there are only successive conscious experiences, with S1, initially neutral, acquiring the negative valence initially induced by S2, through the action of unconscious associative processes. Most of the conditioning literature is consistent with this interpretation. It appears, in particular, that those conditioned responses that are endowed with characteristics typical of the responses due to the formation of knowledge about the stimulus relationships, are closely linked to the conscious knowledge of these relationships (for detailed arguments, see Perruchet, 1984) 2.
In this example, it is easy to understand how the very same observed behavior --a conditioned response without concurrent awareness of stimulus contingencies-- can be explained either in a standard cognitive view relying on the cognitive unconscious, or alternatively in a mentalistic framework which eliminates this postulate. The subsequent sections are devoted to the objective of assessing whether very complex adaptive behavior, commonly taken as indicative of unconscious rule abstraction or other unconscious computations on cognitive representations, can also be accounted for in another way, without introducing much more than the principles set out above for the conditioning data.
1.3.4. A hopeless project? At first glance, the weight of the empirical evidence runs against the view presented above as the behavior under examination becomes more and more complex. The main supporting argument is that most current psychological models accounting for complex behavioral phenomena rely, with indisputable success, on the existence of unconscious representations and computations.
The fact that the models based on a cognitive unconscious work might seem to negate the potential interest of an alternative model. However, the argument is not as straightforward as it might seem. Indeed, computational algorithms are so powerful that they can simulate virtually any phenomenon, without proving anything about the computational nature of the actual mechanisms underlying that phenomenon. Computational algorithms generate a perfect description of the rotation of the planets around the sun, although the solar system does not compute in any way. In order to be considered as providing a model of the mechanisms actually involved, and not only a simulation of the end-product of mechanisms acting at a different level, computational models have to perform better than alternative, non-computational explanations. The point is that the comparison needed to reach such a conclusion has never been conducted. As asserted above, the possibility of a powerful cognitive unconscious has been embedded within the principles of the information processing tradition from its very beginning, without being clearly articulated and hence without being directly challenged. Given these conditions, the current focus on the notion of cognitive unconscious appears to be simply the consequence of making earlier tacit postulates explicit. To summarize, although the pervasiveness of the concept of a cognitive unconscious and its overall success can hardly be disputed, the demonstrative power of these arguments is undermined by a hidden circularity.
1.4. The Objectives of the Paper
It is worth pointing out from the outset that our project does not consist in showing that the prevalent computational framework is unwarranted, for any logical or empirical reasons. This objective would entail demonstrating that consciousness is necessary for any form of representation and computation. But there are major obstacles facing any such demonstration. There is no theoretical reason for claiming that representations and computation need to be conscious. Moreover, it is difficult to conceive of any form of empirical demonstration. Indeed, addressing the question of the necessity of consciousness for any mental construct requires us to demonstrate that unconscious representations and computations do not exist, and demonstrating non-existence is beyond the reach of any empirical investigation. Our aim is to assess the viability of a mentalistic view, instead of directly questioning the prevalent framework. This leads us to address a different issue, presented below.
1.4.1. Necessity versus sufficiency. Let us start from a twofold consideration. On the one hand, we know that at least some mental events are conscious, because we have direct and personal evidence of their existence. Even those who argue that consciousness is epiphenomenal cannot reject this assessment (although a few philosophers have questioned the very existence of consciousness; see Rey, 1991, and the refutation of Rey’s position by Velmans, 1991). On the other hand, the existence of an unconscious mental life is a postulate or a presupposition. This presupposition is so deeply ingrained in our modern culture that it is taken for granted by most people. But the fact remains that we have, by definition, no direct proof of an unconscious counterpart to our conscious mental life. It emerges from these two premises that the mentalistic framework is more parsimonious than the prevalent view, because it exclusively relies on the representations and the mental operations we are aware of, whereas the prevalent view postulates, in addition, a parallel cognitive apparatus 3.
In this context, questions about consciousness, in striking contrast with the overwhelming practice, may be framed in terms of sufficiency, rather than necessity. As a consequence, the question we address is: "is it sufficient to rely on the transient and labile representations that form one's momentary phenomenal experiences, when the conventional framework commonly assumes that a large number of representations are stored in mind and manipulated in various unconscious operations?".
The example in Section 1.3.3 illustrates the point. We do not argue that subjects are unable to build and use unconscious knowledge about the S1-S2 contingencies on the grounds that consciousness should be necessary for these operations. What we do show is that this hypothesis is only one among several possible interpretations of the fact that conditioned reactions persist beyond the forgetting of the S1-S2 contingencies. Positing that the affective reaction elicited by the occurrence of S1 has evolved during training due to unconscious associative processes is sufficient to account for the data.
1.4.2. A major objective and some additional issues. The major part of this paper, namely the sections 2 to 7, will be devoted to the presentation of a new model, called the SOC Model, with SOC standing for Self-Organizing Consciousness. This expression is a short-cut, and as such, it is potentially misleading. It might suggest that we intend to address the hard issues commonly linked to the notion of consciousness, such as the problem of knowing how neural events generate conscious mental states. In fact, this paper focuses more modestly on the contents of consciousness, such as they can be described at an informational level 4. We propose that conscious contents are endowed with self-organizing properties, which make it possible to account for a wide range of adaptive phenomena that are commonly considered to be mediated by the cognitive unconscious. Our objective is to suggest that most of the phenomena of interest for cognitive scientists can be accounted for by this model, which avoids any recourse to the concepts of unconscious representations and computation.
The last section (Section 8) will deal with somewhat different issues. For quite obvious reasons, the SOC model is not devised to account for data that we consider to lack a justifiable empirical basis. However, such data may constitute an a priori reason for some readers to reject our approach. Section 8 addresses such phenomena, and notably the data allegedly demonstrating the possibility of unconscious processing of semantic information. We also briefly discuss, in this section, the apparent dissociation between performance and consciousness observed in a few neuropsychological syndromes, such as blindsight.
2. The Notion of Self-Organizing Consciousness (SOC)
In the first section, we presented an outline of how a mentalistic framework could account for a response apparently based on unconscious memory and inference, taking as example a specific finding from the conditioning area. We now have to address a far more difficult challenge, namely to account for the most complex aspects of behavior on which contemporary cognitive science focuses. Our approach comprises two steps. The first step consists in showing that a large number of phenomena that seemingly require unconscious rule abstraction processes, inferences, analyses, and other complex implicit operations, can be accounted for by the formation of conscious representations that are isomorphic to the world structure. The second step concerns the formation of these representations, and more precisely the causes of their isomorphism to the world structure. We suggest that this isomorphism is the end-product of a self-organizing process. The general ideas underpinning these two steps will be briefly outlined in turn in this section, then developed at length in the following sections.
2.1. Complex Conscious Representations Account for Seemingly Rule-Governed Behavior
2.1.1 Trading representation against computation. Complex and integrative representations, we argue, make rule knowledge objectless. Here, our thesis relies heavily on the idea that neural systems "trade representation against computation", to borrow the expression used by Clark and Thornton (1997). The above discussion concerning certain findings in Pavlovian conditioning (Section 1.4.) provides a first insight into the meaning of this claim. As shown above, the change in the intrinsic representation of S1, and notably the fact that this representation, initially neutral, becomes affectively valenced during the training phase, may replace, at a functional level, the formation of the knowledge of the S1-S2 contingency and the logical inference "if S1 then S2".
Although often indirect, supporting evidence for a
representation/computation trade-off can be found in various areas of
psychology. Examples include the instance-based model of categorization (e.g.
Brooks, 1978), the so-called episodic (e.g. Neal & Hesketh, 1997) or
fragmentary (e.g. Perruchet, 1994) accounts of implicit learning, the notion of
mental models in problem solving (e.g. Johnson-Laird, 1983), and the
memory-based theory of automatism (
This position, we argue, increases the a priori plausibility of the representation-based views, and expands their explanatory power, for at least two reasons. Firstly, if the momentary phenomenal experience is the only mental event, the whole power of the neural system may be recruited for its construction. Secondly, the construction of a representation can profit from the presence of the momentary sensory input, instead of relying exclusively on the internal memory capacity of the brain. The growing literature on change blindness and other related phenomena (e.g. see review in Noë, Pessoa & Thompson, 2000) leads us to emphasize the importance of this factor, on the grounds that perceptual experience may be more dependent on the real world than previously thought. If, for instance, a visual scene is changed in such a way that the perception of a movement is prevented (e.g. the changes occur during an eye blink or an ocular saccade, or a blank mask is inserted between the two displays), changes are surprisingly difficult to notice. Such phenomena indicate that the world could play the role of an "outside memory" (O’Regan, 1992) in the formation of the perceptual experience, hence relieving the brain of the need to retain a detailed representation of the world. These factors make the task of constructing the representations composing the current phenomenal experience considerably easier than the task of forming the permanent and ready-to-use internal model of the world required in the prevalent view of mind.
2.1.2 The isomorphism between the actual and the represented world. In order to solve problems that, at first glance, require rule abstraction and complex computation, a representation has to be isomorphic to the world structure. And indeed, by and large, phenomenal experience provides an internal representation of the world that is isomorphic to its structure. We generally perceive continuous speech as a meaningful sequence of words, the visual environment as composed of persons and objects, and so on. In some sense, the adapted nature of conscious representations is not a speculative and optional proposal, but derives from the most fundamental principle of evolutionary biology: as pointed out by Velmans, "if the experienced world did not correspond reasonably well to the actual one, our survival would be threatened" (Velmans, 1998, p. 51). If one adheres to the views outlined above, the structural isomorphism between our conscious representations and the world is the major phenomenon we have to explain. However, some preliminary comments are warranted to make it clear that this isomorphism is not perfect, and does not need to be so.
First, the representations we create are limited by sensory constraints. For instance, we do not have any perception of sounds outside the 20-20,000 Hz range, and our eyes are able to detect only a very small bandwidth of the electromagnetic spectrum, from around 370 nm to around 730 nm. Likewise, phenomenal experience does not provide us with any direct representation of the structure of the physical world at other scales, such as atomic microstructure or galactic organization.
Second, even the parts of the world available to our sensory equipment may be represented only partially, or even erroneously. The fact that our representation of the surrounding world does not include the whole scene currently available to our sensory equipment, but instead is limited to a narrow focus, has been recently documented in the visual domain by the studies on change blindness alluded to above. Examples of misrepresentation are also plentiful. The sun's rays at the day’s end seemingly diverge in all directions whereas they are in fact (nearly) parallel, and star constellations at night have no physical reality due to the varying distances of their elements from the earth. In addition, there are innumerable cases in which our representations are biased by our interests, motivations, and their relevance for survival. The phenomenal experience of the world may even be maladaptive, as in the case of perceptual illusions in which perceptual processes which are generally well-suited in natural situations cease doing their job reliably when faced with highly specific patterns. Such phenomena illustrate that percepts and representations are isomorphic to the world structure only in a limited way. For the sake of brevity, we continue to refer to the isomorphism between subjects' representations and world structure throughout this paper, even though the very phenomenon we are attempting to account for cannot be described as a simple term-to-term matching.
2.2. Conscious Representations Self-Organize
The main question we have to address at this point is: How to account for the fact that the content of the phenomenal experience is, even in a limited sense, isomorphic to the world 5, if this content is not the product of a powerful unconscious processor manipulating unconscious representations? Our answer consists in considering consciousness within a dynamic perspective, that is to say a perspective centered on learning principles. The key point is that each conscious experience triggers associative learning mechanisms that take the components of this experience as the "stuff" on which they operate. Thanks to this phenomenon, consciousness does not only serve an immediate adaptive function, but also participates in its own development, each conscious experience allowing us to improve the content of subsequent conscious experiences. We summarize this thesis in the proposal that phenomenal experience is self-organizing.
Psychological textbooks routinely point out that there are multiple forms of learning. But they also mention that associative learning is the most fundamental and primitive, maybe the form to which all other forms are reducible in fine. Because our framework is primarily motivated by the search for maximal parsimony, we rely exclusively on conventional associative mechanisms in the following. Relying on associative principles --reminiscent of the old-fashioned behaviorist psychology for many-- within a mentalistic framework centered on the concept of consciousness may appear anachronistic. However, the paradox is one of appearance only. Although behaviorism was grounded on associative principles, the reverse is not true: Associative principles can serve equally well in other frameworks. The mentalistic view provides a highly relevant integrative framework, for at least two reasons that will be considered in turn. First, there is a natural relation between associative learning and consciousness, mediated by the concept of attention (2.2.1.). Second, the assumption that learning associates conscious contents implies that associations involve complex representations, a property that considerably improves the power of an association-based view (2.2.2).
2.2.1. Associative learning and consciousness. The issues of learning and consciousness are generally considered separately. As a case in point, "learning" is nearly absent from the indexes of the numerous recently published volumes on consciousness. However, reasons for considering the two issues jointly arise from the close link between learning and attention, on the one hand, and attention and consciousness on the other.
Attentional processes are sufficient for associative memory and learning to occur. This means that no superimposed operations - such as some forms of intentional orientation towards learning - are required. This phenomenon is known from the conditioning and skill learning experiments run during the behaviorist era. It was subsequently "rediscovered" in the context of the level-of-processing framework in the seventies (e.g. Craik & Lockhart, 1972), and more recently in the context of the studies on implicit learning (e.g. Whittlesea & Dorken, 1993). The resulting picture is that many authors, using different terminologies, have proposed a view compatible with the claim that associative learning is an automatic process that associates all the components that are present in the attentional focus at a given point (French & Miner, 1994; Jimenez & Mendez, 1999; Logan & Etherton, 1994; Stadler, 1995; Treisman & Gelade, 1990; Wagner, 1981). Associative learning and memory are nothing other than the by-products of attentional processing (see Section 8.1. for a reappraisal of some contradictory evidence).
Now, there is a close relation between attention and consciousness. It must be acknowledged that the psychological literature offers a somewhat fuzzy picture of this relation. Across and even within domains and epochs, one term is often preferred to the other. But this preference lacks any clear justification. For instance, the methods devised to investigate perception without attention differ from the methods devised to investigate perception without consciousness. In the former, the stimuli are supraliminal but maintained outside the current focus of attention as a result of the task demands, whereas in the latter, attention is directed toward the target but stimulus quality is degraded. However, these terminological differences are linked more to historical contingencies than to theoretically rooted reasons. At the empirical level, it turns out that both kinds of manipulations lead to analogous findings (Merikle & Joordens, 1997). A more general argument for dissociating the two concepts is that attention is selective whereas "consciousness incorporates both a central focus, and a rich polymodal periphery", to borrow the expression used by O'Brien and Opie (1999b, p.191). This argument amounts to defining attention as the conceptually driven attentional mechanisms that are directed towards a specific source of information in response to task instructions. This view defines what Schmidt and Dark (1998) call the intention-equals-attention view, according to which participants' intention to attend exclusively to a target is sufficient to restrict attentional processing to this target. All proposals for a dissociation (e.g. Baars, 1997; Velmans, 1999) amount to such a confusion. However, the fact that the instructions ask participants to pay attention to a target does not prevent them from making quick attentional shifts toward non-attended information.
Therefore, unless one endorses a highly restrictive definition of attended information as the informational content on which subjects are asked to focus, we see no reason to dissociate between attention and consciousness on the basis of their relative selectivity.
That said, the fact that attention and consciousness refer to the same phenomenon does not mean that they are one and the same concept. Attention is generally located on the side of the processes, and consciousness on the side of the mental states resulting from these processes. As Pribram (1980) says: "'Consciousness' refers to states which have contents; 'attention' refers to processes which organize these contents into one or another conscious state". What constitutes the content of the phenomenal experience at a given moment is what is attended to at this moment, and vice versa (e.g. Cowan, 1995; Mandler, 1975; Miller, 1962; Posner & Boies, 1971).
2.2.2. Associative learning and complex representations. At first glance, associative mechanisms appear to be underpowered for the function that we assign to them. Essential to our claim is the idea that the oft-mentioned limitations of associative learning principles are overcome whenever complex representations are conceived of as the stuff on which associative processes operate. The fact that complex representations can enter into associative links, and the high explanatory power of this mode of functioning, has been pointed out in the modern literature on conditioning and learning. The following quote, borrowed from one of the leading theoreticians of animal learning, illustrates the point:
"Properly understood... associative learning theory is remarkably powerful. Of course, such a theory must... reject the restrictive assumption of S-R theory, which allowed associations to be formed only between a stimulus and a response, and should assume that a representation of any event, be it an external stimulus or an action, can be associated with the representation of any other event, whether another external stimulus, a reinforcer, the affective reaction elicited by the reinforcer, or an animal's own actions. Equally important, however, it must allow that the representation of external events that can enter into such associations may be quite complex. They need not be confined to a faithful copy of an elementary sensation such as a patch of red light; they may be representations of combinations or configurations of such elementary stimuli; they may even include information about certain relationships between elementary stimuli. But once we have allowed associative learning theory these new assumptions, we have a powerful account, capable of explaining quite complex behavior -including behavior that many have been happy to label cognitive and to attribute to processes assumed to lie beyond the scope of any theory of learning" (Mackintosh, 1997, 883-884; italics are ours).
However, by and large, the fact that associative principles apply to complex representations has not been exploited, and hence the power of associative learning theory has not been fully appreciated. The symbolic framework assigns a minimal role, if any, to associative processes, and most of the connectionist models, although rooted in associative principles, consider only associations between the input units of the network, which code the material piecemeal (note that the so-called constructive methods overcome this limitation, e.g. Fahlman & Lebiere, 1990).
To summarize, we propose that basic principles of associative learning and memory allow conscious representations to reach their high degree of organization and adaptiveness, provided that we consider that associations occur between the rich content of conscious experiences. The notion of self-organization excludes any organizing cognitive systems or principles that would be superimposed on phenomenal consciousness6. The phenomenal consciousness itself ensures its own improvement in representational power, thanks to the propensity of conscious representations to evolve in accordance with basic associative learning principles. Because consciousness is an unavoidable companion of our daily life, this means that every life episode has a learning function. There are no separate phases for learning and for performance: Each phenomenal experience contributes to improving people's ability to perceive and represent the genuine structure of the world in subsequent interactions.
2.3. Overview of Sections 3 to 7
Thus two main ideas are embedded in the notion of Self-Organizing Consciousness (SOC). The first is that conscious representations that are isomorphic to the world structure, due to their ability to integrate various elements in a cohesive picture, can account for adaptive behaviors commonly attributed to rule-governed thought. The second is that ubiquitous principles of associative memory and learning are sufficient to account for the formation of these representations. The subsequent sections deal with these two aspects, although, in order to begin the demonstration at its logical starting point, we begin with the second one.
We start by demonstrating the self-organizing nature of phenomenal experience in the language domain. This domain is especially relevant to our position, because it is the domain in which the notion of the cognitive unconscious may be the most deeply rooted as a result of the Chomskyan tradition. In the next section (Section 3), we show that the ability to extract the words forming an artificial language presented as an unsegmented speech flow may be accounted for as an autonomous change in the phenomenal experience of the materials, due to the action of elementary associative mechanisms. This interpretation has been supported by a computational model, the details of which are presented elsewhere (Perruchet & Vinter, 1998b). Section 4 proposes a generalization of this model to word extraction in natural language, to the formation of objects, and to the word-object mapping issue.
Sections 5 and 6 introduce a generalization of the SOC framework to other dimensions. While Sections 3 and 4 concern the formation of conscious representations of elements that are generally construed as the actual world units (words and objects), Section 5 applies the same principles to more complex aspects of the world structure. We show how the formation of complex representations that are isomorphic with the world structure can account for some forms of behavior seemingly based on the unconscious knowledge of the syntactical structure of the surrounding environment. Section 6 deals with the fact that human behavior may be sensitive to structural aspects of the world that transcend its surface features. This problem, reminiscent of the criticisms Chomsky levelled at the once prevalent current of behaviorism, is obviously crucial for the validity of our view. We show how the SOC framework readily accounts for transfer between event patterns cutting across their sensory content. Section 7 shows how the SOC framework may find some echoes in the literature on problem solving, incubation, decision making, automaticity, and implicit memory.
To sum up, these sections provide, we hope, a model of how organisms deprived of a powerful cognitive unconscious can behave adaptively when faced with complex world-size situations thanks to the formation of structurally relevant conscious representations of these situations.
3. The Case of Word Extraction
3.1. The Word-Extraction Issue
Language acquisition initially proceeds from auditory input, and linguistic utterances usually consist of sentences linking several words without clear physical boundaries. The question thus arises: How do infants become able to segment a continuous speech stream into words? Recent psycholinguistic research has identified a number of potentially relevant factors. Analyses of the statistical structure of different languages have shown that a number of features are correlated with the presence of word boundaries, and could therefore be used as cues for segmenting the speech signal into words (see review in Jusczyk, 1997; McDonald, 1997). However, the question remains of how infants abstract the statistical regularities that they seemingly exploit. It cannot be claimed that these regularities are learned inductively from word exposure without falling into circular reasoning, with word knowledge being simultaneously the prerequisite and the consequence of knowledge of statistical regularities. In addition to the difficulties inherent in their exploitation, prosodic and phonological cues in any case provide only probabilistic information.
The importance of prosodic and phonological cues in word discovery is
further questioned by recent experimental studies showing that these cues are
not necessary. For instance, Saffran,
The participants in the study conducted by Saffran et al. (1996b) were told before the training session began that the artificial language contained words, and they were asked to figure out where the words started and ended. The processes used in these conditions may be different from those involved in natural language acquisition. Two subsequent papers from the same laboratory (Saffran, Aslin, & Newport, 1996a; Saffran, Newport, Aslin, Tunick, & Barrueco, 1997) partially respond to this objection. In Saffran et al. (1997), the participants' primary task was to create an illustration using a coloring program. They were not told that the continuous series of syllables, which were presented as a sound background, consisted of a language, nor that they would be tested later in any way. In the subsequent forced choice test, participants still performed significantly better than chance (although performance is comparatively impaired in these conditions, see Ludden & Gupta, 2000). A still more direct indication of the relevance of these data with regard to infants acquiring their mother tongue was provided by Saffran et al. (1996a), who reported studies carried out with 8-month-old infants. The infants were tested with the familiarization-preference procedure used by Jusczyk and Aslin (1995), in which infants controlled the exposure duration of the stimuli by their visual fixation on a light. The infants showed longer fixation (and hence listening) times for nonwords than for words, thus demonstrating that they were sensitive to word structure after a brief exposure to an artificial language. Overall, the studies conducted by Saffran and co-workers offer impressive support for the hypothesis that people are able to learn the words forming a continuous speech stream without any prosodic or phonological cues for word boundaries.
3.2. PARSER: The Principles of the Model
Our aim here is to show that word extraction can be explained by the action of elementary, associative-like processes acting on the initial conscious percepts, the result of which is to modify the conscious experience we have of the linguistic input.
What is the phenomenal experience of the listener of a new language such as the one used in the Saffran et al. experiments, at the beginning and end of training respectively? When people are confronted with material consisting of a succession of elements, each of them matching some of their processing primitives, they segment this material into small and disjunctive parts comprising a small number of primitives. As adults, we have direct evidence of the phenomenon. For instance, when asked to read nonsense consonant strings, we read the material not on a regular rhythmic, letter-by-letter basis, but rather by chunking a few letters together. In a more experimental vein, when adults are asked to write down this kind of material, they frequently reproduce the strings as separate groups of 2, 3, or 4 letters (Servan-Schreiber & Anderson, 1990). The same phenomenon presumably occurs when a listener is faced with an unknown spoken language, with the syllables or other phonological units forming the subjective processing primitives instead of the letters. Certainly, when hearing an unknown language at a normal locution rate, the processing of the material is usually not exhaustive. Rather, subjects pick up a chunk of a few syllables from time to time. But this difference does not alter the basic phenomenon of chunking. Chunking, we contend, is a ubiquitous phenomenon, due to the intrinsic constraints of attentional processing, with each chunk corresponding to one attentional focus.
This initial segmentation is assumed to depend on a large variety of factors. Some factors are linked to the participants. For instance, prior experience of another language may endow participants with different processing primitives. Also, the current state of attention and vigilance may partly determine the chunk size. Other factors are associated with the situation, such as the signal/noise ratio, the time parameters of the speech signal, and the relative perceptual saliency of the components of the signal. The mixture of these factors is very likely to mean that a listener's initial conscious experience consists of a succession of chunks which are different in length and content from the words of the language.
After extensive exposure to the language, the listener's phenomenal experience is presumably the experience each of us has of our mother tongue, that is the experience of perceiving a sequence of words. Our proposal is that the final phenomenal experience of perceiving words emerges through the progressive transformation of the primitives guiding the initial perception of the language, and that this transformation is due to the self-organizing property of the content of phenomenal experience. The basic principle is fairly simple. The primitives forming a chunk, that is those that are perceived within one attentional focus as a consequence of their experienced temporal proximity, tend to pool together and form a new primitive for the system. As a consequence, they can enter as a unitary component into a new chunk in a further processing step 7. This explains why the phenomenal experience changes with practice. But why do the initial primitives evolve into a small number of words instead of innumerable irrelevant processing units?
The reason lies in the combined consideration of two phenomena. The first depends on the properties of the human processing system. The future of the chunk which forms a conscious episode depends on ubiquitous laws of associative learning and memory. If the same experience does not re-occur within some temporal lag, the possibility of a chunk acting as a processing primitive rapidly vanishes, as a consequence of both natural decay and interference with the processing of similar material. The chunks evolve into primitives only if they are repeated. Thus some primitives emerge through a natural selection process, because forgetting and interference lead the human processing system to select the repeated parts from all of those generated by the initial, presumably mostly irrelevant, chunking of the material. The relevance of this phenomenon becomes clear when viewed in relation to a property inherent to any language. If the speech signal is segmented into small parts on a random basis, these parts have more chance of being repeated if they match a word, or a part of a word, than if they straddle word boundaries. In consequence, the primitives that emerge from the natural selection due to forgetting and interference are more likely to match a word, or a part of a word, than a between-word segment.
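The property of language invoked above can be checked on a toy example. The following Python sketch is only an illustration under stated assumptions: the six trisyllabic "words", the chunk lengths (2 or 3 syllables), and all function names are hypothetical choices made here, not the materials or procedures of the studies cited. It cuts a continuous stream at random, blind to word boundaries, and compares how often chunks lying inside a word recur with how often chunks straddling a boundary recur.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy lexicon (an assumption for illustration, not the original stimuli).
WORDS = [
    ("ba", "bu", "pu"), ("bu", "pa", "da"), ("du", "ta", "ba"),
    ("pa", "tu", "bi"), ("pi", "da", "bu"), ("tu", "ti", "bu"),
]

def make_stream(n_words, rng):
    """Concatenate randomly drawn words; tag each syllable with its word-token index."""
    stream, previous = [], None
    for token in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])  # no immediate repeats
        stream.extend((syllable, token) for syllable in word)
        previous = word
    return stream

def random_chunks(stream, rng, min_len=2, max_len=3):
    """Cut the stream into consecutive chunks of random length, ignoring word boundaries."""
    chunks, i = [], 0
    while i < len(stream):
        size = rng.randint(min_len, max_len)
        chunks.append(stream[i:i + size])
        i += size
    return chunks

def repetition_by_type(n_words=2000, seed=0):
    """Mean number of occurrences of a chunk's content, split by chunk type."""
    rng = random.Random(seed)
    chunks = random_chunks(make_stream(n_words, rng), rng)
    counts = Counter(tuple(s for s, _ in chunk) for chunk in chunks)
    totals = defaultdict(list)
    for chunk in chunks:
        if len(chunk) < 2:          # a leftover single syllable at the end of the stream
            continue
        content = tuple(s for s, _ in chunk)
        # "within": all syllables belong to the same word token; otherwise the chunk straddles.
        kind = "within" if len({t for _, t in chunk}) == 1 else "straddling"
        totals[kind].append(counts[content])
    return {kind: sum(values) / len(values) for kind, values in totals.items()}

if __name__ == "__main__":
    print(repetition_by_type())
```

In this toy setting there are far fewer distinct within-word fragments than distinct boundary-straddling fragments, so each within-word chunk is encountered again much sooner, which is the asymmetry on which selection by forgetting is assumed to operate.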
This account has been implemented in a computer program, PARSER. Technical details about PARSER are provided in Appendix A, and an on-line presentation of the model is available at the URL (http://www.u-bourgogne.fr/LEAD/francais/personnel/perruche/SOC.html). Simulations have revealed that PARSER extracts the words of the language well before exhausting the material presented to adults in the Saffran et al. (1996b) experiments, and the material presented to 8-month-old infants8 in the Saffran et al. (1996a) experiments. These results were obtained with an exhaustive chunking of the input. When a more realistic fragmentary processing of the material was simulated, performance was impaired, but remained fairly good. PARSER was able to reproduce the performance of actual subjects while processing only 3 to 5 percent (depending on the experiment) of the sequences presented to participants. This finding suggests that PARSER was able to simulate the results obtained under attention-disturbing conditions (Saffran et al., 1997), where inattentional gaps were presumably more frequent than under standard conditions. Finally, the good performance of PARSER was not limited to the trisyllabic words used by Saffran et al., but also extended to a language consisting of one- to five-syllable words (Perruchet & Vinter, 1998b).
To summarize, we suggest that parsing results from the interaction between one property of language --essentially that the probability of repeatedly selecting the same group of syllables by chance is higher if these syllables form intra-word rather than between-word components-- and the properties of the processing system, essentially that repeated perceptual chunks evolve into processing primitives which in turn determine the way further material is perceived. Note that our solution to the word extraction issue does not involve any new and specialized learning devices. The fact that complex material is processed as a succession of chunks each comprising a few primitives is supported by a large body of literature (e.g. Cowan, 1999). The unitization of these primitives due to their processing within the same attentional focus is one of the basic tenets of associative learning (e.g., Mackintosh, 1975). Likewise, the laws of forgetting and the effects of repetition are ubiquitous phenomena. Moreover, the interdependence of processing units and incoming information --the nature of the processing primitives determines how the material is perceived and the nature of the material determines the transformation of the processing primitives, and so on recursively-- is consistent with a developmental principle initially described by Piaget's concepts of assimilation and accommodation (e.g., Piaget, 1985). Most current theories of development, although they use different terminology, also rely on the constructive interplay between assimilation-like and accommodation-like processes (e.g. Case, 1993; Fischer & Granott, 1995; Karmiloff-Smith, 1992).
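The interplay summarized above (chunks formed within one attentional focus, strengthening through repetition, forgetting, interference, and the reuse of strong chunks as new primitives) can be made concrete with a small Python sketch. It is a deliberately simplified illustration written for this passage, not the PARSER program itself: the class name, the method names, the parameter values, and the update rules are all assumptions, and the actual specification given in Appendix A may differ on every point.

```python
import random

class ChunkLearner:
    """Toy chunk-based learner; a simplified sketch, not the published PARSER."""

    def __init__(self, gain=1.0, decay=0.05, interference=0.005, threshold=1.0):
        # All four values are illustrative assumptions.
        self.gain = gain                  # weight added to a perceived chunk
        self.decay = decay                # forgetting applied to every other unit
        self.interference = interference  # extra loss for units sharing a syllable
        self.threshold = threshold        # weight needed to act as a primitive
        self.weights = {}                 # unit (tuple of syllables) -> weight

    def next_primitive(self, stream, pos):
        """Return the longest strong unit matching the stream at pos, else one syllable."""
        best = (stream[pos],)
        for unit, weight in self.weights.items():
            if (weight >= self.threshold and len(unit) > len(best)
                    and tuple(stream[pos:pos + len(unit)]) == unit):
                best = unit
        return best

    def perceive(self, stream, pos, n_primitives):
        """Form one percept (one attentional focus) from up to n_primitives primitives."""
        percept, taken = (), 0
        while taken < n_primitives and pos < len(stream):
            primitive = self.next_primitive(stream, pos)
            percept += primitive
            pos += len(primitive)
            taken += 1
        return percept, pos

    def update(self, percept):
        """Strengthen the perceived chunk; every other unit decays and may suffer interference."""
        for unit in list(self.weights):
            if unit == percept:
                continue
            loss = self.decay
            if set(unit) & set(percept):
                loss += self.interference
            self.weights[unit] -= loss
            if self.weights[unit] <= 0:
                del self.weights[unit]    # the unit is forgotten
        if len(percept) >= 2:             # single syllables remain built-in primitives
            self.weights[percept] = self.weights.get(percept, 0.0) + self.gain

    def train(self, stream, rng):
        """Process the whole stream, one randomly sized percept (1 to 3 primitives) at a time."""
        pos = 0
        while pos < len(stream):
            percept, pos = self.perceive(stream, pos, rng.randint(1, 3))
            self.update(percept)
        return self.weights

if __name__ == "__main__":
    rng = random.Random(0)
    words = [("ka", "mo", "ti"), ("re", "lu", "da"), ("bi", "so", "nu"), ("ge", "fa", "po")]
    stream = [s for _ in range(500) for s in rng.choice(words)]  # hypothetical toy language
    lexicon = ChunkLearner().train(stream, rng)
    for unit, weight in sorted(lexicon.items(), key=lambda kv: -kv[1])[:5]:
        print("".join(unit), round(weight, 2))
```

The sketch makes no claim about reproducing the simulation results reported above; it only shows how the content of one percept can change the primitives available to the next one, so that perception and learning are two sides of the same loop.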
3.3. PARSER and the Issue of Consciousness
The functioning of PARSER, like the functioning of any other computational model, does not depend in any way on the conscious/unconscious status we ascribe to its components. As a consequence, PARSER does not demonstrate that consciousness is necessary for word extraction. Its objective lies elsewhere. As set out in Section 1.4.1, the aim of this paper is not to demonstrate the necessity of consciousness, but instead to assess whether conscious thought, although endowed with severe capacity limitations, is sufficient to account for performance. We pointed out that devising a model to simulate conscious states while respecting the properties of conscious thought introduces considerable constraints. The point we wish to emphasize here is that PARSER meets many of these constraints. Crucially, the only representations included in the model closely match the conscious representations subjects may have when performing the task. The early coding of the material as a set of short and disjunctive units, as well as the final coding of the input as a sequence of words, are assumed to closely match the phenomenal perceptual experience of the listeners. This correspondence also extends to the entire training phase, thus permitting our model to perform word segmentation while mimicking the on-line conscious processing of incoming information. By doing so, PARSER demonstrates that the transient and labile representations composing the momentary phenomenal experiences are sufficient for word extraction, provided that simple and ubiquitous associative processes are allowed to operate on these representations. There is no need for unconscious representations, nor for any forms of unconscious computation on these representations.
It is worth noting that the constraints inherent to conscious thought cannot be conceived of as limitations to the model. PARSER works well, not despite these constraints, but thanks to them. For instance, the fact that attention is limited to the simultaneous perception of a few primitives --a property of the conscious/attentional system usually thought of as a serious handicap-- is the very property that offers the system a set of candidate units. If humans perceived a complex scene as a single unit, PARSER's principles would not work. Likewise, forgetting is essential to the functioning of the model because, if it did not forget, PARSER would fail to extract the relevant units from the multiple candidate units processed by the system. This aspect of the model makes it especially relevant for a rational analysis of cognition, such as that initiated by Anderson and Milson (1989). This approach contrasts with the common mechanistic explanation, in which the cognitive system is described "as an assortment of apparently arbitrary mechanisms, subject to equally capricious limitations, with no apparent rationale or purpose", to borrow Chater and Oaksford's (1999) characterization. The rational analysis of cognition shows how apparent limitations actually serve adaptive functions, due to the characteristics of the surrounding environment. For instance, the fact that memory decays gradually over time is viewed as adaptive, because it turns out that the probability that any memory component will be needed to deal with a subsequent situation also decays over time. In this way, the efficiency of the retrieval of information from memory parallels the probability of this information being recruited for adaptive goals.
Although focusing on another function, our analysis follows the same approach: Memory breakdown, considered in conjunction with the preventing effect of repetitions, is adaptive, because it turns out that, in any language, a given segment has more chance of being repeated if it matches a word than if it straddles word boundaries. In this context, forgetting allows the selective disappearance of structurally irrelevant units9.
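The joint effect of forgetting and repetition can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the PARSER program: the integer `gain` and `decay` values are arbitrary assumptions, the function name `expose` is hypothetical, and the sequence of percepts is written by hand so that one candidate unit ("babu") recurs while the others occur once.

```python
def expose(percepts, gain=10, decay=3):
    """Return the weights of the units still in memory after all percepts.

    Each percept strengthens its unit by `gain`; every stored unit then
    loses `decay`; a unit whose weight drops to zero or below is forgotten.
    """
    memory = {}
    for unit in percepts:
        memory[unit] = memory.get(unit, 0) + gain
        for key in list(memory):
            memory[key] -= decay
            if memory[key] <= 0:
                del memory[key]
    return memory

# Hand-written, hypothetical percepts: "babu" stands for a word-like chunk,
# the other chunks stand for segments straddling word boundaries.
percepts = ["babu", "pudo", "babu", "tiba", "babu", "gala", "babu", "kemi"]
memory = expose(percepts)
print(memory)
```

In this run the repeated chunk accumulates weight, the chunks seen once early in the sequence have already vanished, and the chunks seen once near the end are still decaying.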
3.4. PARSER and Alternative Computational Models
As mentioned above, the primary objective of this paper is to highlight the internal consistency of a framework grounded on a set of premises which are strikingly different from those of the standard cognitive approach. This objective prevents a detailed and exhaustive comparison with alternative models. However, pointing out some differences may help to illustrate some specificities of the SOC framework, of which PARSER provides the instantiation for the word segmentation issue. To this end, we briefly compare PARSER with two other models of word segmentation, respectively based on a symbolic and a connectionist architecture. The comparison concerns only the basic principles of the models, given that empirical comparative analyses are not yet available.
One recent symbolic model of word segmentation has been developed by Brent and Cartwright (1996). The authors construe segmentation as an optimization problem. The principle of the method is akin to establishing a list of all the possible segmentations of a given utterance (although the authors used computational tools which prevented the program from proceeding in this way). The choice between possible segmentations is then made in order to fulfill a number of criteria. These criteria are threefold (according to the somewhat simplified presentation by Brent, 1996): minimize the number of novel words, minimize the sum of the lengths of the novel words, and maximize the product of the relative frequencies of all the words. The process of optimization is performed thanks to a statistical inference method, called the "minimum representation (or description) length" method. When units have been created by the system, they help to choose among different possible segmentations of the utterances. In addition, the choice between possible segmentations takes account of certain phonotactic constraints on the form of English words. This method has been applied with some success for parsing phonetic transcripts of child-directed speech into words.
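To make the combinatorial flavor of this approach concrete, here is a minimal Python sketch of an exhaustive search over segmentations. It is not the Brent and Cartwright program: the lexicographic ordering of the three criteria is a simplifying assumption made here, the minimum description length inference and the phonotactic constraints are not implemented, and the lexicon, its relative frequencies, and the function names are all hypothetical.

```python
from itertools import combinations
from math import prod

def segmentations(units):
    """List every way of cutting a sequence of units into contiguous words."""
    n = len(units)
    result = []
    for k in range(n):                               # k = number of internal cuts
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            result.append([tuple(units[a:b]) for a, b in zip(bounds, bounds[1:])])
    return result

def score(segmentation, lexicon):
    """Smaller is better: few novel words, short novel words, frequent known words."""
    novel = {w for w in segmentation if w not in lexicon}
    known = [w for w in segmentation if w in lexicon]
    freq_product = prod(lexicon[w] for w in known)   # equals 1 when no word is known
    return (len(novel), sum(len(w) for w in novel), -freq_product)

def best_segmentation(units, lexicon):
    return min(segmentations(units), key=lambda s: score(s, lexicon))

# Hypothetical lexicon: word -> relative frequency (made-up values).
lexicon = {("big",): 0.4, ("do", "gi"): 0.6}
utterance = ["big", "do", "gi"]
print(best_segmentation(utterance, lexicon))
```

Even in this reduced form, every candidate segmentation has to be generated and scored before one is selected, which is the feature of the approach discussed in the comparison with PARSER.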
Most of the connectionist models which address the word segmentation issue rely on the simple recurrent network, or SRN, initially proposed by Elman (e.g. 1990; see also Cleeremans, 1993). An SRN is a network which is designed to learn to predict the next event of a sequence. To this end, at each time step, the activations of the hidden units are stored in a layer of context units, and these activations are fed back to the hidden units on the next time step (hence the term "recurrent"). In this way, at each step, the hidden layer processes both the current input and the results of the processing of the immediately preceding step, and so on recursively. With the exception of this feature, an SRN works as many networks do, using the back propagation of errors as a learning algorithm. The comparison between the predicted event and the next actual event of the sequence is used to adjust the weights in the network at each time step, in such a way as to decrease the discrepancy between the two events. Elman (1990) presented such a network with a continuous stream of phonemes one phoneme at a time, the task being to predict the next phoneme in the sequence. The accuracy of prediction was assessed through the root mean square error for predicting individual phonemes. After training, the error curve had a strikingly marked saw-tooth shape. As a rule, the beginning of any word coincided with the tip of the teeth. This means that after a word, the network was unable to predict the next phoneme. However, as the identity of more and more of the phonemes in a word was revealed, the accuracy of prediction increased up to the last phoneme of the word, and the error curve therefore fell progressively. The start of the next tooth indexed the beginning of the next word. Therefore, an SRN appears able to parse a continuous speech flow into words (for more recent models, see Aslin, Woodward, LaMendola, & Bever, 1996; Christiansen, Allen, & Seidenberg, 1998).
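The recurrent step described above, in which the hidden activations are copied into context units and fed back on the next time step, can be sketched as follows. This Python fragment covers the forward pass only: training by back propagation is omitted, the weights are random and untrained, and the tanh hidden units, the softmax output, the layer sizes, and the class name `ElmanForward` are assumptions made for illustration rather than details of Elman's (1990) network.

```python
import math
import random

class ElmanForward:
    """Forward pass of a simple recurrent network (no learning)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_in)       # input   -> hidden
        self.w_ctx = matrix(n_hidden, n_hidden)  # context -> hidden
        self.w_out = matrix(n_out, n_hidden)     # hidden  -> output
        self.context = [0.0] * n_hidden          # empty context at the start

    def step(self, x):
        """Process one input vector and return a distribution over next events."""
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row_in, x))
                      + sum(w * ci for w, ci in zip(row_ctx, self.context)))
            for row_in, row_ctx in zip(self.w_in, self.w_ctx)
        ]
        scores = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_out]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        self.context = hidden                    # stored for the next time step
        return [e / total for e in exps]

net = ElmanForward(n_in=3, n_hidden=4, n_out=3)
first = net.step([1.0, 0.0, 0.0])
second = net.step([1.0, 0.0, 0.0])   # same input, different context
```

Because the context differs between the two calls, the same input yields two different predictions, which is what allows such a network to become sensitive to the preceding events of a sequence.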
Needless to say, nothing in those models matches the conscious experience of the learner of a new language. The operations involved in the Brent and Cartwright model, such as the computation of all the possible segmentations of an utterance in order to choose the one responding to pre-specified criteria, far exceed the level of complexity that can be achieved by a conscious operator, whether complexity is assessed in terms of computational sophistication or memory capacity. The consequence is that the Brent and Cartwright model is grounded on the postulate of a powerful cognitive unconscious, even if there is no explicit mention of this postulate in their paper. By contrast, an SRN relies on mechanisms that, although lacking direct support (there is no evidence of a neural implementation of the error backpropagation algorithm underpinning SRN functioning, as acknowledged by Elman et al., 1997), are a little more realistic at the neurobiological level. However, the model's contents are even more distant from the learner's experience. Even the final state, namely the representation of the input as a set of words, is not directly provided by the network: Words can only be inferred from the graded distribution of errors after learning is completed.
These remarks on alternative models can hardly be thought of as criticisms by themselves, given that these models were not devised to account for conscious experience. However, they illustrate the specificity of the SOC framework. PARSER, which implements the SOC framework in the word segmentation issue, accounts for the formation of words while closely mimicking the subjective experience of the learner, and without calling on other principles or mechanisms than the ubiquitous principles of associative learning and memory. By contrast, the alternative models rely on various postulates about states and operations we have no evidence of, while giving strictly no function to the representations of which we have direct and immediate evidence through conscious experience. The end-result is that, in the alternative models of word segmentation considered here, costly assumptions are made about unconscious operations while the content of phenomenal experience is left both unexplained and objectless.
4. Learning the World Units
The achievement of PARSER in simulating experimental data on artificial, over-simplified languages supports the idea that conscious representations, far from being a phenomenal by-product of complex analytical processes, are capable of self-organization. We now intend to show that our model provides a reasonable account of word extraction in natural language (4.1.), and also extends to the formation of object representations and word-object mapping (4.2.).
The general position taken in this section is as follows. On the one hand, natural conditions are far more complex than the experimental conditions considered so far, and this leads one to expect our model to perform worse in the former case than in the latter. In particular, it appears likely that relevant units represent a very restricted proportion of the potential units that may be initially perceived, and that the process of natural selection on which our model is based will not be sufficiently efficient. However, on the other hand, the complexity of natural conditions may paradoxically help to build the relevant units. To understand the reasons, we have to go back to the basic principles of the SOC framework, and notably to the role of attentional factors in unit formation. A new unit associates the processing primitives that are attended to simultaneously. With the simple artificial languages considered so far, the primitives embedded within a single attentional focus at the beginning of training are randomly selected on the basis of their temporal contiguity, because there are no other guides to constrain chunking. However, natural conditions often provide clues, which are generally excluded in experimental conditions in order to achieve better control. These clues, we will show, guide the formation of the initial chunks by orienting people's attention, and allow us to deal with the problem of the unmanageable number of possible units.
4.1. Word Extraction in Natural Language
Natural language acquisition does not consist in identifying six words used again and again in a few minutes, but many thousands of words distributed over years. Are the principles underlying PARSER general enough to be easily applied to such different complexity and time scales? As we have mentioned, PARSER works thanks to the interaction between one property of the language and a few properties of the human processing system. There is no reason to believe that this interaction occurs only with the simplistic language used by Saffran and co-workers. The target property of the language, namely that the probability of repeatedly selecting the same group of syllables by chance is higher if these syllables form intra-word rather than between-words components, is obviously shared by Saffran et al.'s artificial material and by any natural language. Likewise, the properties of the processing system on which PARSER relies are very general. For instance, one fundamental assumption of the model is that a cognitive unit is forgotten when not repeated and strengthened with repetition. This assumption may be taken for granted irrespective of whether the process occurs in the few minutes of an experimental session or across larger time scales, in keeping with a long-standing tradition of research into the laws of memory and associative learning. In consequence, PARSER's principles seem to be relevant to natural as well as to artificial language. Briefly stated, the generality of PARSER is ensured by both the generality of the behavioral laws (e.g., only repeated units shape long-lasting representations) and the generality of the language property (the most repeated units are the words) on which it relies.
However, beyond the theoretical relevance of the principles, it is possible that the complexity of the situation may give rise to an insoluble difficulty. This could be the case if natural language really consisted of a continuous, uninterrupted speech flow. But natural language includes pauses. These provide natural cues for segmenting the speech flow from its very onset. Although the information is insufficient for full segmentation, it may be quite useful for children given that child-directed language is characterized by very short utterances separated by clear pauses. Incorporating the information provided by the pauses into PARSER is straightforward: we simply need to constrain selection of the number of primitives perceived in one attentional focus in such a way that the content of an attentional focus does not straddle pauses. It is worth stressing that this change is not an ad-hoc, poorly motivated addition to the model. Indeed, this change is fully consonant with the SOC framework, and notably with the importance of attentional factors. Pauses, in fact, partly determine the content of the attentional focus, because attention naturally gathers events in close temporal proximity. Furthermore, pauses are only one among many prosodic and phonological cues capable of orienting attention in natural language processing. Overall, although we acknowledge that PARSER is certainly underpowered to deal with natural language, the principles that it implements are general enough for us to be optimistic about achieving an improved version exploiting the multiple cues which are likely to constrain the selection of the primitives embedded in each attentional focus.
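The pause constraint described above amounts to a small change in the way percepts are drawn. The Python sketch below illustrates that change under stated assumptions: the pause marker `"#"`, the `max_size` parameter, and the function name are hypothetical, and the primitives are plain syllables, whereas in the full model a primitive may itself be a previously formed unit.

```python
import random

PAUSE = "#"  # hypothetical marker for a pause in the input

def attentional_chunks(stream, max_size=3, seed=0):
    """Cut a stream into chunks of 1..max_size primitives that never straddle a pause."""
    rng = random.Random(seed)
    chunks, utterance = [], []
    for item in list(stream) + [PAUSE]:       # final sentinel flushes the last utterance
        if item != PAUSE:
            utterance.append(item)
            continue
        start = 0
        while start < len(utterance):
            size = rng.randint(1, max_size)   # random size of the attentional focus
            chunks.append(tuple(utterance[start:start + size]))  # slice stops at the pause
            start += size
        utterance = []
    return chunks

stream = ["lu", "ka", PAUSE, "ba", "bu", "pu", "do", PAUSE, "ti"]
chunks = attentional_chunks(stream)
print(chunks)
```

Whatever sizes are drawn, no chunk can combine the end of one utterance with the beginning of the next, so the candidate units that straddle a pause are never created in the first place.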
4.2. The Representation of Objects and the Word-Object Mapping Issue
PARSER was initially built to account for the segmentation of a continuous speech flow observed in the experiments by Saffran and her co-workers. Saffran, Johnson, Aslin, and
Some adaptations are warranted if we are to achieve our objective. Accounting for the formation of object representations implies a change in the primitives of the system, which will no longer be the syllables or other phonological units, but, for instance, spatially oriented features. Likewise, the natural principles guiding the initial chunking of primitives will no longer be temporal proximity, but spatial contiguity. However, instantiating these adaptations confronts us with a problem, which arises from the fact that the number of initial units is much greater than with linguistic material. Indeed, in the auditory speech flow, the number of possible units is limited by the sequential nature of the speech signal. For instance, a 3-syllable message can be composed of three one-syllable words, two words consisting of one and two syllables (in either order), or one 3-syllable word. This results in only four possibilities. By contrast, a visual display can be decomposed into a virtually unlimited set of different parts, even if each part includes only spatially contiguous elements. Under these conditions, the formation of relevant units would appear to be an intractable problem.
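The count of four possibilities for a 3-syllable message can be verified by listing the ordered patterns of word lengths. The short Python function below is an illustrative sketch (the function name is hypothetical); it also shows that, for a sequential signal of n syllables, the number of patterns is 2 to the power n-1, since each of the n-1 gaps between syllables either is or is not a word boundary.

```python
def word_length_patterns(n_syllables):
    """Every ordered way of splitting n syllables into consecutive word lengths."""
    if n_syllables == 0:
        return [()]
    patterns = []
    for first in range(1, n_syllables + 1):            # length of the first word
        for rest in word_length_patterns(n_syllables - first):
            patterns.append((first,) + rest)
    return patterns

print(word_length_patterns(3))   # the four possibilities mentioned in the text
for n in range(1, 7):
    print(n, len(word_length_patterns(n)))
```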
This problem, again, finds a solution in the idea that units are formed by the concurrent attentional processing of a small number of primitives. The point is that infants' attention is captured by an array of stimuli sharing specific properties. One of these properties, for instance, is novelty (e.g. Kagan, 1971). If, at a given moment, several primitives are new for the infants, it is highly probable that these primitives are processed conjointly in the attentional focus, hence forming a new unit. Now, if several primitives are new for a subject, there is also a good chance that they will be the components of one and the same meaningful unit, such as an actual object. The same line of reasoning may be followed with movement. It has been established that infants' attention is attracted by a moving display (e.g. Haith, 1978; Bronson, 1982; Vinter, 1986). If several elementary features move concurrently, they have a high probability both of being attentionally processed by infants and of belonging to the same real object (of course, many objects do not move; however, it is imaginable that the perceived movement generated by eye displacement in a 3-D visual field makes it possible to generalize this phenomenon to motionless objects).
The logic applied to the segmentation of the linguistic input into words and to the segmentation of the world into objects may be extended to word/object mapping. Note that the potential problem raised by the number of candidate units is exacerbated here. In real life, infants may capture within a single attentional focus unrelated componential aspects of the environment, such as a sound frequency together with the orientation of a segment of a visual display. To illustrate the latter issue, let us consider an example inspired by a question raised by Karmiloff-Smith (1992, p.40). When an adult points toward a cat and says "look, a cat", how can the child pair the word "cat" with the whole animal, rather than, say, with the cat's whiskers, the color of the cat's fur, or the background context? A solution based on the selective role of attention still works. What is likely to become associated is what captures the infant's attention, that is, essentially, what is new and/or moving. Presumably, considering the auditory input first, "cat" is newer than "look", because "look" has been associated with many contexts before. As a consequence, it is highly probable that "cat", rather than "look", enters into the momentary attentional focus. On the other hand, it is also highly probable that the infant's attention is focused on the animal, which moves as a whole, rather than on one of its parts, or on the other elements of the context, which are presumably both more familiar and motionless.
Of course, the process of mapping as described above may sometimes fail. The infant may be quite familiar with cats, and surprised by the russet color of the fur of this specific cat. We predict that, in this case, the infant would mismap the word "cat" to the color russet. It is worth noting first that, in real world settings, this situation may be infrequent, because adults would tend to spell out what is presumably the most novel for the infants, and more generally, what they infer to be their present object of attention. On the other hand, errors of mapping do in fact occur during language development. What is needed is not a theory predicting a perfect mapping from the outset, but a theory able to predict the final achievement. Our model of learning is precisely adapted to extracting signal from noise. In general, the correct mapping will be the final outcome, because the infants will hear "cat" for animals which are not russet, and will hear "russet" for animals which are not cats.
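The way in which repeated pairings wash out an initial mismapping can be illustrated with simple co-occurrence counts. The Python sketch below is only a schematic rendering of the argument: the episodes, the feature labels, and the function names are hypothetical, and raw counting stands in for the strengthening and forgetting processes of the model.

```python
from collections import Counter, defaultdict

def cooccurrences(episodes):
    """Count, for each heard word, the features attended to at the same moment."""
    counts = defaultdict(Counter)
    for word, attended in episodes:
        counts[word].update(attended)
    return counts

def best_mapping(counts):
    """Map each word to the feature it has most often been paired with."""
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

# Hypothetical episodes: (word heard, features in the attentional focus).
episodes = [
    ("cat", ("russet", "cat-shape")),        # the surprising russet cat
    ("cat", ("cat-shape", "black")),
    ("cat", ("cat-shape", "grey")),
    ("russet", ("russet", "cat-shape")),
    ("russet", ("russet", "squirrel-shape")),
    ("russet", ("russet", "leaf-shape")),
]

after_one = cooccurrences(episodes[:1])
after_all = cooccurrences(episodes)
print(best_mapping(after_all))
```

After the first episode alone, "cat" is paired equally often with the color and with the shape, so the mapping is still open to error; after all six episodes, each word is paired most often with the feature it actually designates.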
To summarize, our model of learning, initially applied to the word extraction issue, suggests a new account of infants' basic ability to parse the physical world into objects, and to map words and objects. The apparent problem posed by the unmanageable number of potential units that can be initially perceived finds a simple solution thanks to the fact that attention is naturally captured by a tightly defined set of events. Of course, this account, in its present form, is just a first draft of a more complete developmental model. Such a model should address many other points. For instance, as a rule, a word does not designate a specific object or animal, but a category of objects or animals. It is easy to imagine how the phenomenon may be encompassed in a framework based on the laws of associative learning and memory. Differences between specific instances of, say, cats, can be viewed as noise for the system, whereas the common features are located in the to-be-detected signal. When the word cat is associated with different instances of cats, idiosyncratic features of the animals, because they are not repeated, disappear from the representation while common features are reinforced.
5. From Lexicon to Syntax
Up to now, we have proposed an interpretation for the formation of conscious representations of parts of the world, such as words and objects. However, the existence of linguistically or physically relevant representations is not commonly considered as sufficient for accounting for human behavior. Representations are generally construed as the elementary bricks of thought, and complex human behavior is assumed to rely on the formation of some kind of abstract knowledge, in which the bricks are combined on the basis of some organizing (e.g. logical) principles. In the language domain for instance, there is a conventional distinction between the lexicon and the syntax. Both of them are assumed to be mediated by different neural mechanisms, and the role of language exposure in the acquisition process is conceived of as very different: although some impact of learning in word acquisition is acknowledged even by strong nativists, the acquisition of grammar is attributed to innate and specialized modules. Needless to say, we do not deny that adult humans are able to abstract rules. The very existence of sciences such as logic, physics, and linguistics, testifies to the human ability to abstract the structure of complex environments. Since this section is devoted to language, it is important to point out from the outset that we agree with the contention that humans can achieve genuine knowledge of the syntax of their language. However, in the mentalistic framework, the formation and manipulation of abstract knowledge is restricted to conscious activities.
Our proposal is that the notion of self-organizing consciousness offers a way of thinking about rule-governed behavior in cases where no conscious rule analysis is performed, without having recourse to the notion of unconscious rule abstraction. The idea is that the separation between basic units on the one hand, and rules governing those units on the other, or between lexicon and syntax in linguistic terminology, is warranted in a scientific approach (i.e. from the observer's viewpoint) but has no relevance for the processing system. The purpose of the processing system is to generate a representation of the world which integrates all the momentary input (internal and external) into a coherent and meaningful scene. This complex and integrative representation, we will argue, makes rule knowledge objectless. The mentalistic framework is especially well-suited to pushing the representation/computation trade-off (Clark and Thornton, 1997) to its ultimate end. Indeed, claiming that representations exist only in the momentary phenomenal experience is primarily restrictive when contrasted with the conventional cognitive approach, which postulates that innumerable representations are stored and processed in parallel in the cognitive unconscious. But there is a positive counterpart. If there is no cognitive unconscious, the full power of the neural system may be mobilized for the formation of the current phenomenal experience. This opens up the possibility of generating a multifaceted and highly complex representation of the world.
Our demonstration of the ability of conscious representations to account for improved performances in rule-governed situations starts in the context of artificial languages generated by a simple finite-state grammar (5.1). Then we turn to natural language. Section 5.2 is an attempt to generalize, on a speculative basis, the principles whose efficiency has been demonstrated in connection with finite-state grammars. Section 5.3 indicates a few directions in contemporary psycholinguistic research that also exploit the ability of lexical representations to explain apparent rule-based, syntactical abilities. We then turn away from the field of language and examine the studies in implicit learning that exploit non-linguistic material (5.4).
5.1. Studies Involving Artificial Grammars
In the artificial language considered in Section 3, which was used by Saffran and co-workers (e.g. Saffran et al., 1997), the subject's task is to discover the lexicon. There are no syntactical constraints, insofar as the words of the lexicon are displayed in random order. By contrast, in the situations considered in the literature on artificial grammar learning, the discovery of the lexicon raises no particular problems, because the units of the language match some of the subject's processing primitives. However, the combination of these units is governed by syntactical rules, which are the to-be-learned components of the situation. In most cases, the situation involves a set of consonants, the order of which is governed by a finite-state grammar, such as that initially introduced by Miller (1958). Finite-state grammars have been extensively used by Reber (e.g. Reber, 1967) and many other researchers (e.g. Dulany et al., 1984; Shanks et al., 1997) working in the implicit learning field (for reviews, see Cleeremans, Destrebecqz, & Boyer, 1998; Reber, 1993).
In a conventional situation, participants are first exposed to a set of consonant strings following a finite state grammar such as that represented in Figure 1, without being asked to learn the rules or even being informed of the structured nature of the material. A subsequent test is performed in order to reveal whether participants have learned about the grammar. This test generally consists in asking them to judge the grammaticality of new strings. The usual outcome is that participants are able to classify the new strings as grammatical or ungrammatical with better-than-chance accuracy, whereas they lack conscious knowledge about the grammar. The initial conclusion of these studies was that the mind is endowed with an unconscious information processing device able to abstract the rules governing the experimental material, and then to apply these rules in other contexts (e.g. Reber, 1967). Because the conclusions of these early studies accorded well with the prevalent zeitgeist, this interpretation went unchallenged for many years.

Figure 1. Schematic diagram of the grammar used in several earlier studies (e.g. Dulany, Carlson, & Dewey, 1984)
However, further studies, initiated by the seminal papers by Brooks (1978) and Dulany, Carlson, and Dewey (1984), made it clear that these conclusions were premature. To borrow the distinction proposed by Smith, Langston, and Nisbett (1992), the early studies failed to distinguish between a system that follows rules and one that simply conforms to rules. A ball falling on the ground conforms to the law of gravity, but does not follow this law. Experimental evidence in implicit learning situations shows that the participants conform to the rules underlying the situations, but there is no proof that the rules have been learned in any way. Several alternative interpretations have been proposed. Because this literature has been reviewed extensively elsewhere (e.g. Berry & Dienes, 1993; see also the Handbook of Implicit Learning edited by Stadler and Frensch, 1998), we will focus on our own interpretation.
In keeping with the SOC framework, our re-interpretation (e.g. Perruchet & Vinter, 1998a; Perruchet, Vinter, & Gallego, 1997) of the phenomenon is that the training phase modifies the way the data are consciously coded and perceived. Assuming, for the sake of illustration, that XRX is a frequent recursion in the finite state grammar, participants no longer perceive X and R as two familiar but separate entities, but perceive XRX as an increasingly familiar unit. One possible explanation for the above-chance grammaticality judgments of a new string including XRX is that participants interpret, more or less automatically, the level of perceptual fluency as an indicator of grammaticality. Strings that can be easily read because chunks of letters are directly perceived as familiar units would tend to be judged as grammatical. In short, in our re-appraisal, the formation of the conscious unit XRX replaces the unconscious extraction, retention, and use of a rule such as: If XR, then X.
It might seem, at first glance, that any fragment of a grammatical utterance is itself grammatical, and can be recombined with another fragment to form a new grammatical string. Given this logic, the initial chunking of the material would not matter. And indeed the notion of "fragmentary knowledge" conveys the tacit implication that it is a quite impoverished form of knowledge. This view is faulty, as may be illustrated using the example of natural language. For instance, in the preceding sentence, "this view", or "natural language" form structurally relevant sequences, in the sense that they can be recombined with a large number of other sequences, whereas "faulty as may" cannot be easily integrated as a component in another linguistic context, although it is a component of a legal sentence. It is obvious that it is preferable to become familiar with the former sequences than with the latter. Likewise, in the letter strings generated by a finite-state grammar, it is preferable to become familiar with a subset of sequences --for instance those that are generated by a recursive loop-- than with other, randomly selected, sequences. We (Perruchet, Vinter, Pacteau, & Gallego, in press) have shown that participants in an artificial grammar learning setting indeed formed the structurally relevant units. They were asked to read each string generated by a finite state grammar and, immediately after reading, to mark with a slash bar the natural segmentation positions. The participants repeated this task after a phase of familiarization with the material which consisted either of learning items by rote, performing a short-term matching task, or searching for rules. The same number of total units was observed before and after the training phase, thus indicating that participants did not tend to form increasingly larger units. However, the number of different units reliably decreased, whatever the task during training. 
This result was taken as evidence that participants' processing units become increasingly relevant as training progressed (see also Servan-Schreiber & Anderson, 1990). Perruchet et al. (in press) also showed that PARSER, the computer model which was used previously to account for the discovery of words in an unsegmented speech flow (Perruchet & Vinter, 1998b; see Section 3), also accounted for participants' actual performance. Thus the principles that make it possible to discover the lexical units of an artificial language built from the random concatenation of words, also proved to be efficient in the discovery of the syntactically relevant units of an artificial language built from a finite-state grammar.
It is worth examining why such simple principles work well in a situation that was once thought of as involving grammatical rule abstraction. It is because first-order and second-order dependency rules capture virtually all the structural constraints of the standard finite-state grammars. For instance, Perruchet and Gallego (1997) have demonstrated that consideration of only the first-order dependency rules is sufficient to account for the performance of the participants in the Reber (1976) experiments and many others which use the same material. Indeed, assuming that participants classify test items as grammatical if they consist only of permissible bigrams (whatever their location in the strings) would result in the production of 90% correct responses, a success level that greatly exceeds observed performance. The same demonstration may be repeated for other standard situations of implicit learning, such as the repeated sequence tasks (Perruchet & Gallego, 1997).
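The bigram-based classification assumed in this demonstration is easy to state in code. The Python sketch below is an illustration only: the training and test strings are made up (they are not generated by the grammar of Figure 1 and are not the items of any cited experiment), and the function names are hypothetical. A test string is judged grammatical if every pair of adjacent letters it contains occurred somewhere in the training strings, whatever its location.

```python
def bigrams(string):
    """Set of adjacent letter pairs in a string."""
    return {string[i:i + 2] for i in range(len(string) - 1)}

def permissible_bigrams(training_strings):
    """All letter pairs encountered during the study phase."""
    allowed = set()
    for s in training_strings:
        allowed |= bigrams(s)
    return allowed

def judged_grammatical(string, allowed):
    """Accept a string if it contains only permissible bigrams."""
    return bigrams(string) <= allowed

# Hypothetical study items and test items.
training = ["MTV", "MTTV", "VXM", "VXRM"]
allowed = permissible_bigrams(training)

for item in ["MTTTV", "VXRM", "MVT", "VXT"]:
    print(item, judged_grammatical(item, allowed))
```

Note that "MTTTV" is accepted although it never appeared during the study phase, without any rule having been represented: the judgment rests entirely on the familiarity of the component pairs.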
Note that we have dealt separately with the lexical level (in Section 3) and the syntactical level (in this section), while language acquisition implies the simultaneous acquisition of lexicon and syntax. This does not constitute a problem. The starting point for PARSER is the idea that each attentional chunk includes a small number of primitives, and that the primitives which are processed together form a new internal primitive, as a by-product of their joint attentional processing. After having discovered the words forming the artificial language used in the Saffran et al. (1996 a and b, 1997) experiments, PARSER obviously goes on creating new units. These units, which are the concatenation of a few words, rapidly vanish. Indeed, because word order is random in Saffran et al.'s material, the repetition of the same word sequence is not frequent enough to allow the strengthening of any word sequence. Let us now suppose that, instead of being randomly ordered, the words are subjected to some syntactic constraints. The constraints would make some sequences grammatical and the other sequences ungrammatical. In this case, PARSER forms long-lived units consisting of the grammatical sequences. Moreover, as shown in current studies run in collaboration with Axel Cleeremans, PARSER discovers the most frequent multi-word sequences, which have a good chance of being the most syntactically relevant. If we transpose the results from the computational model to the level of the phenomenal consciousness of actual people, it appears that the very same process that permitted word formation during the initial stage of learning is able to generate the phenomenal experience of well-formedness for syntactically correct word sequences. This phenomenal experience can be the source of various overt behaviors, such as grammaticality judgments or verbal productions.
5.2. Learning Syntax in Natural Language
Of course, it is premature to claim that the above outline is directly relevant to natural languages. First, it may be argued that any approach relying on associative learning mechanisms can in principle provide only a statistical approximation to genuine syntactic knowledge, whereas people make no errors. We believe that this objection amounts to both underestimating a priori the power of associative mechanisms, and exaggerating the actual accuracy of people's performance. For instance, we mentioned above (Section 3) that PARSER, although relying only on associative learning mechanisms, was able to extract the words in the language of Saffran et al. (e.g. 1996) without any errors. Admittedly, this language is oversimplified, but, at the same time, a very limited amount of exposure to the material is sufficient to learn it. The level of performance that can be reached when a more complex language is studied over a more extended period is currently a matter of speculation. On the other hand, people's ability to master the syntax of a natural language may have been overemphasized in the Chomskyan tradition. For instance, even simple spontaneous oral productions are rarely error-free, and it is fairly difficult to capture the syntactical structure of a complex sentence whenever semantics cannot help. To conclude, assessing the ultimate explanatory power of associative mechanisms is a matter for further empirical investigations and computational studies.
However, there is a second category of objections, stemming from the fact that the finite-state grammars used in the laboratory studies provide a poor analog for the grammars of natural languages. The finite-state grammars used in the implicit learning literature mainly involve first-order and second-order dependency rules between contiguous elements. By contrast, natural languages involve higher-order dependency rules and remote dependencies. At a more qualitative level, it has long been known that the grammars of natural languages cannot be conceived of in terms of a finite-state grammar. Also, it remains unclear how our claims account for other aspects of syntactic knowledge, and especially the abstraction of syntactic classes such as nouns and verbs.
The part of the argument based on the consideration that our account works well only with first-order and second-order dependency rules is not as problematic as it might seem. Indeed, in PARSER, the dependency rules are captured through the formation of new processing primitives, which can themselves become the components of subsequent primitives. Thanks to this possibility of hierarchical processing, we can speculate that PARSER should become at least partially sensitive to higher-order dependency rules. However, the order of the dependency rules is only one aspect. Many other aspects of natural language have no counterpart in artificial languages governed by a finite-state grammar. We acknowledge that a model designed to deal with artificial languages cannot deal with natural languages without undergoing substantial changes. But the essential question is: Beyond the limitations of PARSER in its current implementation, are the fundamental principles underlying the SOC model able to account for the acquisition of syntax in natural language? Although we have no definitive response, we believe that there are arguments allowing us to answer this question in the positive.
As an example, let us consider the dependencies between remote elements, and more precisely, the case of a sequence AXB in which A and B are associated irrespective of the length and nature of X. There are many occurrences of such a structure in natural language. For instance, in the sentence: "The window of my office is open", "The window" (A) is associated with "is open" (B) irrespective of the intervening phrase "of my office" (X), which may be deleted or replaced by an infinite number of subordinate clauses. PARSER is a priori unable to capture the relation, because the model posits that new units can only be formed between contiguous elements. However, the general principle that PARSER instantiates is that new units result from the processing of a few primitives within the same attentional focus. When people encounter sequential material, the simplest assumption is that each attentional focus embraces a small number of contiguous elements. In artificial, meaningless languages, there is no obvious reason to expect a different type of chunking. However, there are clearly no functional or structural constraints here. Each of us commonly mixes present and past events in his or her current phenomenal experience. It is in keeping with our general approach to assume that a new unit may be composed of spatially or temporally remote events, provided that there is some reason for those events to become associated in phenomenal experience. It is easy to imagine several developmental sketches accounting for how two remote events can be joined in a unitary experience. For instance, a link between A and B may emerge in situations where both events are contiguous (a case which, in our example, corresponds to the simplest utterance: "The window is open"). Then the occurrence of A without its usual successor may result in the retention of A in working memory until B occurs in order to complete the percept AB. At this moment, A and B will be simultaneously held in the attentional focus despite their objective separation, thus providing conditions favoring both the strengthening of their association and the understanding of the sentence. This is again consonant with the SOC framework, which relies on the assumption that perception is shaped by earlier representations.
5.3. Converging Lines of Evidence from Psycholinguistic Research
Although they developed completely independently of our own framework, there are a number of directions in psycholinguistic research that are able to help us consider the question of language learning within the SOC framework. As an example of such work, the re-emergent distributional approaches to language have recently shown that abstract classes and categories are often associated with simple statistical properties that make them tractable by all-purpose statistical learning mechanisms. Interestingly, even simple properties such as co-occurrence statistics turn out to be informative about syntactic classes. For instance, Redington, Chater and Finch (1998) studied a large natural language corpus taken from the CHILDES database (MacWhinney, 1995), comprising over 2.5 million words of adult speech. They measured the information that the context of a given word provided about the syntactic category of this word (among 12 possible categories). Context was defined by the two words to either side of the target word. The authors showed that "highly local contexts are the most informative concerning syntactic category and that the amount of information they provide is considerable" (Redington et al., 1998, p. 452; see also Gasser & Smith, 1998). Distributional approaches have also proven to be able to account for other aspects of language, such as the development of word meaning (McDonald & Ramscar, 2001).
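To give a feel for what a context defined by "the two words to either side of the target word" amounts to, the following sketch builds such context counts for a toy word sequence. It is an illustration only: the corpus is invented, the overlap measure is a deliberately crude stand-in, and nothing here reproduces the actual procedure or statistics of Redington et al. (1998).

```python
from collections import Counter, defaultdict


def context_vectors(tokens, window=2):
    """Map each word to counts of (relative position, neighbouring word) events."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                vectors[target][(offset, tokens[j])] += 1
    return vectors


def overlap(a, b):
    """Crude similarity (an assumption of this sketch): number of shared context events."""
    return sum(min(a[k], b[k]) for k in a.keys() & b.keys())


# Toy corpus, invented for illustration.
toy = "the dog sees the cat the cat sees the dog".split()
vectors = context_vectors(toy)

print(overlap(vectors["dog"], vectors["cat"]))   # two nouns share local contexts
print(overlap(vectors["dog"], vectors["sees"]))  # a noun and a verb do not
```

Even in this tiny example, words of the same class end up with similar local contexts, without any category label being supplied to the procedure.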
Converging lines of evidence have evolved in other contexts. For instance, careful scrutiny of the linguistic productions of young children shows that these productions are organized around particular words and phrases, instead of operating with abstract linguistic categories and schemas. This finding of the item-based learning and use of language appears fairly general (for a review, see Tomasello, 2000). Of course, "item-based", or "memory-based" (McKoon & Ratcliff, 1998) approaches to grammar have not gone unchallenged. Some authors go on to argue that there is a modular dissociation between syntax and lexicon (e.g. Grodzinsky, in press). We are not familiar enough with the domain to offer new arguments in either direction. Our intention was simply to point out that distinguished figures in the psycholinguistic literature have been prepared to reject the idea that language processing necessarily involves syntactical rules. Such a view confers a high degree of probability on one of the main propositions of this paper, namely that it may be possible to explain the apparent use of abstract rules in terms of the formation of complex representations.
5.4 Unconscious Rule Processing Outside of the Language Area
Thus far, we have focused on studies on artificial or natural languages in order to illustrate the idea that apparent rule processing may be reducible to the formation of complex representations. The same idea can be illustrated in other fields. In particular, this idea finds strong support in the literature on implicit learning that is not based on linguistic material. Outside of the artificial grammar settings, studies on implicit learning have primarily involved two situations: the so-called serial reaction time (SRT) situations, and the control of complex systems. Most of the SRT studies have been designed on the basis of Nissen and Bullemer's paradigm. A target stimulus appears on successive trials at one of three or four possible positions, and participants are asked to react to the appearance of the target by pressing a key on the keyboard that spatially matches the location of the target. Unknown to the participants, the same sequence of trials is repeated throughout the sessions. Under these conditions, participants usually exhibit a reliable improvement in performance when compared with a control group presented with randomly generated series. The tasks involving the control of complex and interactive systems have their origin in Broadbent's studies (e.g. Broadbent, 1977). Participants are placed in front of a computer simulating a complex system, such as a city transport system. Unknown to them, the parameters of the system are governed by a linear equation. The task consists of regulating the system, that is, participants have to manipulate a number of parameters in order to reach and maintain a predetermined target state of the system. Several studies have shown that the initial abstractionist account of performance improvement involved unnecessary assumptions, because alternative interpretations based on simpler memory processes proved to be sufficient (see for instance Cleeremans & McClelland, 1991; Marescaux, Dejean, & Karnas, 1990; Perruchet & Amorim, 1992; Perruchet, Pacteau, & Gallego, 1997; Shanks & St.John, 1994; Stadler, 1992; Whittlesea & Dorken, 1993).
Rather than examining in detail the findings resulting from these conventional situations, we focus below on a specific paradigm initially designed by Lewicki, Hill, and Bizot (1988). Like almost all other studies in the field, this paradigm serves our primary objective, which is to show that what is initially interpreted as compelling evidence of unconscious rule abstraction can also be explained in terms of the formation of conscious percepts and representations which are isomorphic with the structure of the material. However, this specific paradigm was also chosen because it allows us to illustrate another point, namely that our interpretation can work even in cases where there is no obvious relationship between the actual rules generating the structure of the material and the participants' conscious processing units. The point is that we may be sensitive to surface regularities that are a remote by-product of the rule, so remote in fact that the logical link between the rules and their by-products may be quite difficult to discover. This subsection is dedicated to those skeptical readers who doubt the power of our approach because of their failure to understand how it can apply after a cursory examination of certain complex situations.
In the Lewicki et al. (1988) paradigm, participants were asked to perform a four-choice reaction time task, with the targets appearing in one of four quadrants on a computer screen. They were simply asked to track the targets on the numeric keypad of the computer as fast as possible. The sequence looked like a long and continuous series of randomly located targets. However, this sequence was organized on the basis of subtle, non-salient rules. Indeed, unbeknown to participants, the sequence was divided into a succession of "logical" blocks of five trials each. In each block, the first two target locations were random, while the last three were determined by rules of the form: "If the target describes a movement m while it moves from location n-2 to n-1, then it describes a movement m' from location n-1 to n". Depending on whether n is the third, fourth, or fifth trial of the logical block, if m is horizontal (or vertical, or diagonal), m' is vertical or diagonal (or horizontal or diagonal, or horizontal or vertical, respectively). It should be noted that to discover these second-order dependency rules, participants must inevitably segment the whole sequence into a succession of 5-trial subsequences. That is to say, any trial within the long displayed sequence must be identified as the first, second, ..., fifth trial within the logical 5-trial block to which it belongs.
The results obtained by Lewicki et al. were clear. The participants were unable to verbalize the nature of the manipulation and, in particular, they had no explicit knowledge of the subdivision into logical blocks of five trials, which was a precondition which had to be satisfied if they were to grasp the other rules. However, performance on the final trials of each block, the locations of which were predictable from the rules, improved at a faster rate and was better overall than performance on the first, random, trials. Lewicki et al. (1988) accounted for these results by postulating that the structuring rules were discovered by a powerful, multipurpose unconscious algorithm abstractor.
Perruchet, Gallego, and Savy (1990) provided the basis for a radically different interpretation (for an alternative interpretation based on connectionist modeling, see Cleeremans and Jimenez, 1998). Perruchet et al. demonstrated that participants learned the task without ever performing the segmentation of the sequence into logical blocks. Instead, they were sensitive to the relative frequency of small units, comprising 2 or 3 successive locations. Some of the possible sequences of 2 or 3 locations were more frequent than others, because the rules determining the last 3 trials within each 5-trial block prohibited certain transitions from occurring. In particular, an examination of the rules shows that they never generated back and forth movements (i.e., m' is never the inverse movement of m). As a consequence, the back and forth transitions were less frequent on the whole sequence than the other possible movements. The crucial point is that these less frequent events, which presumably elicit longer reaction times, were exclusively located on the random trials. This stems not from an unfortunate bias in randomization, but from a logical principle: The rules determined both the relative frequency of certain events within the entire sequence and the selective occurrence of these events in specific trials. The validity of this interpretation was tested by deriving predictions concerning specific features of fine-grained performance from an abstractionist model, on the one hand, and from our alternative model on the other. The empirical data clearly supported our re-analysis.
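The logical principle at stake can be checked with a small simulation. The sketch below is an illustration under explicit assumptions, not a reconstruction of the original material: the rule table is hypothetical (its only relevant property is that m' is never the same movement as m, which on a 2 x 2 display is what a back and forth movement requires), and the target is assumed to change location on every trial.

```python
import random

# Each movement flips the row, the column, or both, on a 2 x 2 display.
MOVES = {"H": (0, 1), "V": (1, 0), "D": (1, 1)}  # (row flip, column flip)

# Hypothetical rule table: position in block -> {movement m: movement m'}.
# By construction, m' always differs from m.
RULES = {
    3: {"H": "V", "V": "D", "D": "H"},
    4: {"H": "D", "V": "H", "D": "V"},
    5: {"H": "V", "V": "H", "D": "H"},
}


def apply_move(loc, move):
    """Return the location reached from loc by the given movement."""
    d_row, d_col = MOVES[move]
    return (loc[0] ^ d_row, loc[1] ^ d_col)


def generate(n_blocks, rng):
    """Return (position in block, location) pairs for n_blocks 5-trial blocks."""
    loc, last_move, trials = (0, 0), None, []
    for _ in range(n_blocks):
        for pos in range(1, 6):
            # Trials 1 and 2 are random; trials 3 to 5 follow the rules.
            move = rng.choice("HVD") if pos <= 2 else RULES[pos][last_move]
            loc = apply_move(loc, move)
            trials.append((pos, loc))
            last_move = move
    return trials


def back_and_forth_by_position(trials):
    """Count, per position in block, trials where the target returns to location n-2."""
    counts = {pos: 0 for pos in range(1, 6)}
    for k in range(2, len(trials)):
        pos, loc = trials[k]
        if loc == trials[k - 2][1]:
            counts[pos] += 1
    return counts


counts = back_and_forth_by_position(generate(2000, random.Random(0)))
print(counts)
```

In this simplified setting, back and forth transitions occur only on the first two trials of each block, so a learner who is merely slower on rare transitions will appear to have "learned the rules" governing trials three to five without ever segmenting the sequence into blocks.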
It should be noted that the subsequences of 2 or 3 successive locations considered by Perruchet et al. (1990) are presumably the events on which the subjects focused attentionally, and which formed their phenomenal experience of the task. Thus, adaptive performance may again be construed as a change of phenomenal experience due to the properties of this experience. Exposure to the material shapes the way it is consciously perceived and processed, and the modification of the phenomenal experience triggers the improvement in motor performance. What is new in this case, however, with regard to the situations examined above, is the fact that the link between the generating rules and the surface regularities that conscious coding can capture is far from obvious. In any case, the authors, reviewers, and the first readers of the Lewicki et al. (1988) paper were presumably all unaware of it 10. The question of the relevance of this experimental example to real-world situations is a matter for further speculation, but, at the very least, these findings strongly suggest that our account could be relevant in cases where at first glance it seems to be inappropriate.
6. Abstracting Away From the Sensory Content
In the preceding section, we claimed that the changes in the way we consciously perceive and represent our environment may underlie some apparent phenomena of syntax sensitivity. In some cases, it is easy to see how a simple representation may replace genuine rule knowledge. For instance, it is easy to see how perceiving XRX as a unit may replace the rule: 'If XR then X'. In the last example, taken from Perruchet et al. (1990), understanding how the same explanatory schema works is far more difficult, due to the fact that conscious processing units encode a remote by-product of the rules. But after careful scrutiny, the logic of the reappraisal is unquestionable. However, adaptation to other situations does not seem reducible to the same approach. These situations are not necessarily complex, as can be seen from the first experimental situation (Marcus, Vijayan, Rao, and Vishton, 1999) we deal with below. Their common characteristic is that they reveal participants' ability to abstract away from the sensory content of the training situation, an ability that seemingly cannot be explained by any association-based account.
6.1 Experimental Evidence for Abstraction
As a case in point, let us consider the recent experiments by Marcus et al. (1999). Seven-month-old infants were exposed to a simplified, artificial language during a training phase. Then they were presented with a few test items, some of which belonged to the same language while the others introduced some structural novelty. The infants controlled the exposure duration of the stimuli by their visual fixation on a light. Their discrimination was assessed through their longer fixation (and hence listening) times for items introducing structural novelty. On all these points, the paradigm was identical to that used in the studies by Saffran which are described above (e.g. Saffran et al., 1997). However, by contrast with the studies of Saffran and co-workers, in which the test items consisted of the syllables which formed the training sentences, Marcus and co-workers introduced a change in the sensory content of the material.
For instance, in one experiment, infants heard 16 three-word sentences such as gatiti, linana, or tanana, during the study phase. All of these sentences were constructed on the basis of an ABB grammar. The infants were then presented with 12 other three-word sentences, such as wofefe and wofewo. The crucial point is that, although all of the test items were composed of new words, only half of them were constructed from the grammar with which the infants had been familiarized. In the selected example, the grammatical item was wofefe. Wofewo introduces a structural novelty in that it is generated from a competing ABA grammar. The infants tended to listen more to the sentences generated by the ABA grammar, thus indicating their sensitivity to the structural novelty. In another experiment, infants were shown to be able to discriminate sentences generated by an AAB grammar. These results were successfully replicated in various other conditions, involving systematic counterbalancing of material and careful control of the phonetic features forming the training and the test items.
Similar studies using more complex material have been performed with eleven-month-old infants (Gomez & Gerken, 1999) and with adults, most of them using the artificial grammar learning paradigm. As described above, in this paradigm, participants are first exposed to a set of letter strings generated by a finite-state grammar such as the one represented in Figure 1. Participants' performance is usually assessed through their judgments of the grammaticality of new strings during a subsequent test phase. In some studies, the letters forming the study items are changed in a consistent way for the test of grammaticality (e.g. C is always replaced by X, B by L, and so on). Reber (1969) and several subsequent studies (e.g. Dienes & Altman, 1997; Manza & Reber, 1997; Mathews et al. 1989; Shanks, Johnstone, & Staggs, 1997; Whittlesea & Wright, 1997) have shown that participants still outperform chance level under these conditions. The principle underlying the transfer in the so-called "changed letter procedure" has been extended to other surface changes. For instance, the training items and the test items may be, respectively, auditory items and visual items (Manza & Reber, 1997), colors and color names, sounds and letters (Dienes & Altman, 1997), or vice-versa. Successful transfer was observed in each case. Reber claimed that these results testify to the fact that participants are able to abstract the "syntax" of the displayed material, independently of the "vocabulary".
The transfer paradigm has also been used in other contexts. For instance, Wulf and Schmidt (1997) reported experiments on implicit motor learning in continuous pursuit tracking. Unbeknown to the participants, each trial during the training sessions was divided into three segments. The target moved pseudo-randomly during two segments of each trial, whereas the other segment was the same throughout the four sessions. The test session included a transfer task in which the tracking patterns were scaled differently in amplitude or speed compared to the training sessions. The authors observed that participants selectively improved their tracking accuracy on the repeated segment, and that variations in the amplitude or the timing of the target displacement during the transfer phases had no detrimental impact on performance. Wulf and Schmidt speculated, to quote: "If the surface structure in grammar learning is analogous to the scaled versions in terms of amplitude and overall duration in the present study, then it is tempting to suggest a parallel between the learning processes in these two domains. In both, the fundamental, or "deep", structure can apparently be learned implicitly." (Wulf & Schmidt, 1997, p. 1002)
At first glance, evidence for transfer between event patterns cutting across their sensory contents cannot be accounted for by any model that relies on the statistical and distributional properties of the material, such as connectionist modeling or our own model. Indeed, the formation of an associative link between, say, ga, ti, and ti, whatever its strength, seems fundamentally unable to explain transfer to wo, fe, and fe, as observed in the Marcus et al. (1999) experiments. Accordingly, Marcus et al. concluded that infants have the capacity to represent algebra-like rules and, in addition, "have the ability to extract those rules rapidly from small amounts of input and to generalize those rules to novel instances" (p. 79). Pinker (1999) echoes this conclusion, and points out that "Marcus et al.'s experiment is a reminder that humans also think in abstractions, rules, and variables" (p. 41), besides their sensitivity to simple associative learning mechanisms. Demonstrations of transfer in more complex situations have elicited similar comments. For instance, Reber (1993), talking about performance in the transfer letter paradigm in artificial grammar learning studies, claimed that "the abstractive perspective is the only model of mental representation that can deal with the existence of transfer of knowledge across stimulus domains" (Reber, 1993, p. 121).
6.2 The Outline of a Reappraisal
We have no problem with the claim that the evidence of transfer reviewed above is indicative of abstraction. However, we challenge the view that abstraction is indicative of rule formation and rule use and, more generally, is indicative of high-level conceptual processing. Other authors have made the same point. Regarding artificial grammar learning studies, Brooks and Vokey (1991) must be credited for the first account of transfer that does not rely on rule abstraction. More recently, the idea that transfer does not imply rule abstraction has gained support from the possibility of accounting for transfer performance within a connectionist framework (Altman & Dienes, 1999; Christiansen, Conway, & Curtin, 2000; McClelland & Plaut, 1999; Seidenberg & Elman, 1999). Redington and Chater (in press) have also cogently argued "that surface-independence and rule-based knowledge are orthogonal concepts". In the following, we focus the discussion on our own position, although our arguments are not incompatible with, and in some respects are similar to, those of other authors. Our claim is that transfer is a natural implication of the SOC model.
Let us return to PARSER. PARSER shows how the initial conscious percept, which is generally irrelevant to the material structure, becomes increasingly isomorphic with the structurally relevant units, thanks to the elementary principles of associative learning and memory. In Section 3, we considered that the initial percept exactly matched the content of the perceived stimuli. For instance, given the auditory string badubatibu.., we assumed that participants first form the auditory units baduba, tibu, and so on, by chunking together the auditory primitives ba, du, ti, and bu, and this assumption was sufficient to account for the data. However, it is worth stressing that this assumption is overly restrictive. Indeed, the primitives which enter into the associations are internal representations that only partially match the external stimuli that trigger these representations. For instance, as a result of earlier associations, the representations of ba, du, ti, and bu involve a written component in literate people. Thus, when a new association is built between, say, the components of the auditory percept baduba, the new unit is not limited to the auditory domain, but naturally extends to the area of generalization of the primitive components, and especially to the visual domain. More generally, many examples of transfer originate in the fact that conscious primitives entering into the new associations are not tied to a fixed, domain-specific format of representation, but are instead often amodal, flexible, and domain-general. Conscious knowledge is represented in a cross-system code (e.g. Fodor, 1983; Karmiloff-Smith, 1992), a property that ensures that any conscious content possesses a certain abstractness.
Going a step further, it may also be argued that when a few syllables are perceived within one attentional focus, the resulting conscious experience is not necessarily limited to the sum of these syllables (even considering that they are represented in a cross-system code) but instead may embed some direct perception of the overall structure. For instance, baduti will not be perceived as bababa or baduba. The obvious difference lies in the number of repetitions of the same primitives. There is no doubt that a part of the representation of bababa is that it consists in the repetition of the same syllable (a pattern that we refer to as a "run" below), and that a part of the representation of baduba is that the same syllable is repeated with an intervening syllable (a pattern that we refer to as a "trill" below). Coding a pattern as a run or a trill entails some form of relational coding, the relation involved here being the same-different relationship. Thus our assumption is that the sensory input processed within one attentional focus may also integrate some relational information.
If we take it for granted that such abstract and relational primitives are parts of conscious representations, then there is no reason not to apply the same reasoning that we applied to more basic primitives in PARSER. Abstract primitives, if they are frequently involved in the conscious perception of a given material, can emerge from noise on the basis of a selection process analogous to the one that we showed to be responsible for the formation of sensory-based, concrete representations. As is the case for concrete representations, the extraction of regularities is facilitated by the fact that, in its turn, the initial perception determines the way further material is perceived. Thus, when some abstract relations have been perceived frequently enough to become perceptual primitives, they are automatically detected in the new material whenever present. However, in this case, the end-product of the process will be the emergence of representations coding the deep structure of the situation at hand, which makes transfer to other surface features natural. To oversimplify the matter for the sake of understanding, one could say that, in the conventional account, perception provides the system with a database composed of elementary, sensory-based primitives, from which the unconscious processor abstracts the deep underpinning rules. In our account, the primitives are a little more abstract and complex. However, with these new primitive units, no further conceptual operations are needed to account for transfer.
It is worthy of note that this interpretation is viable only if the coding of the incoming information in an abstract and relational format remains simple enough to be attributed to low-level perceptual processes. Admittedly, if it turns out that the perceptual primitives needed to account for the available data are, say, nested high-level order dependency rules, it would be unrealistic to claim that these primitives are directly coded by elementary perceptual mechanisms. Thus it is important to show that the available evidence of transfer can be explained in terms of the coding of fairly simple relations. In the following section, we examine the form of abstract and relational coding needed to account for the available findings on transfer. We will show that only surprisingly simple forms of coding are required. At the same time, it is equally important to show that transfer would fail if the specific constraints that our approach posits are not met. This aspect will be examined in Section 6.5.
6.3 Perceptual Primitives Can Be Abstract and Relational
To begin with the most simple case, let us consider the Manza and Reber (1997) results, showing a transfer between auditory and visual modalities in the artificial grammar learning area. These authors interpret their findings as providing support for their abstractionist, rule-based view. Although the authors do not make their interpretation more explicit, we assume that their line of reasoning could be as follows. If, for instance, subjects perceive the visual sequence XMX, they abstract the knowledge that the letter X can be repeated with a lag of one letter. When they perceive again XMX, but in the auditory modality, they may experience some familiarity with the display, because the same rule applies. This interpretation undoubtedly works well. However, the phenomenon can be easily explained without having recourse to rules. It suffices to consider that there is a direct correspondence between the visual and the auditory format of the letters X and M. It is worth stressing the differences between the two approaches. In the former case, a rule-governed pattern needs to be extracted from the visual stimuli, before being transferred to the auditory stimuli. In the latter case, matching is direct, and independent of the structure of the material. A simple thought experiment may help to clarify the differences, and, by the same token, demonstrates the irrelevance of a rule-based account. Suppose that the material is generated randomly, instead of being generated by a finite-state grammar, and thus presents no rule-governed, salient pattern. For the sake of illustration, suppose that a string such as XMT is presented. In a rule-based interpretation, transfer should not occur, because a structure can not be abstracted. 
Now, it is quite obvious that the prior auditory presentation of XMT increases familiarity with the visual display XMT even though there is no common salient structure. (Alternatively, it could be argued that in XMT all the letters are different, and that this feature is a structural characteristic. In that case, a rule-based interpretation would predict equal transfer to any letter string in which the letters are all different, such as DZM, a prediction that is clearly invalid.)
The same comment can be applied to other studies. For example, Dienes and Altmann (1997) observed a positive transfer between colors and the names of colors, which can also be accounted for by the natural mapping between the primitives involved in the experiment. Again, transfer would probably occur even with randomly generated stimulus sequences, thus demonstrating the irrelevance of a rule-based interpretation. However, not all studies of transfer can be explained using so simple an argument. As a case in point, the above explanation does not apply to the Marcus et al. studies in which transfer is observed between, say, gatiti and wofefe, because there is no natural mapping between ga and wo, or ti and fe.
Re-interpretation of the Marcus et al. data demands recourse to another property of conscious percepts, namely the direct coding of simple relations between the components of one percept. The relation that needs to be coded is the relation "same-different", or, in other words, the only ability that infants need to exhibit is that of coding the repetition of an event. If one postulates that infants are able to detect whether two successive stimuli are the same or not, the Marcus et al. results are easily explained. Indeed, as pointed out by McClelland and Plaut (1999), gatiti, wofefe, and more generally all the ABB items, can be coded as different-same, whereas none of the other items can be coded using the same schema. AAB items are coded as same-different; ABA items instantiate a slightly more sophisticated pattern. Note that there is no indication in the data that this pattern is actually perceived as special: Considering that ABA items do not match the pattern of the other items is sufficient to account for the data. However, it does not seem unrealistic to assume that a trill pattern is also directly perceived when the components of this pattern can be processed within a single attentional focus. The numerous studies (e.g. Bornstein & Krinsky, 1985) showing infants' early sensitivity to symmetrical displays support this assumption.
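As an illustration only, the same/different coding described above can be written out in a few lines of Python. The function name relational_code and the hand-segmented syllable lists are assumptions made for this sketch; they are not taken from the Marcus et al. procedure or from the McClelland and Plaut analysis. The point is simply that items sharing no syllable can still share a relational code.

```python
def relational_code(syllables):
    """Code an item as the same/different relations between successive syllables."""
    return tuple(
        "same" if a == b else "different"
        for a, b in zip(syllables, syllables[1:])
    )

# Hypothetical items, segmented into syllables by hand.
gatiti = ["ga", "ti", "ti"]  # ABB
wofefe = ["wo", "fe", "fe"]  # ABB
wowofe = ["wo", "wo", "fe"]  # AAB
wofewo = ["wo", "fe", "wo"]  # ABA

print(relational_code(gatiti))  # ('different', 'same')
print(relational_code(wofefe))  # ('different', 'same')
print(relational_code(wowofe))  # ('same', 'different')
print(relational_code(wofewo))  # ('different', 'different')
```

Under this adjacent-pair coding, the two ABB items receive the same code although they have no syllable in common, and neither the AAB nor the ABA item matches it. Coding the ABA item as a trill would require comparing the first and third syllables as well, which this sketch deliberately leaves out.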
At first glance, the demonstrations of transfer stemming from the more complex situations of artificial grammar learning in adults imply the coding of far more complex relations. We now argue that in fact, as surprising as this conclusion may be, the very same abilities as we have invoked up to now are sufficient. Indeed, although finite-state grammars embed complex relations, the coding of fairly simple patterns appears sufficient to account for improved performance in transfer situations. For instance, Whittlesea and Wright (1997, Exp. 4) reported successful transfer between letters and colors in artificial grammar learning. In the experiment, five out of the 20 training items begin with a salient alternation ("RMR"). Now, it turned out that color alternation at the beginning of a string appeared in legal test items, but never in illegal test items. It is enough to assume that participants consider the test items beginning with an alternation to be grammatical, and respond at random on the others, to simulate observed performance. If we take this interpretation for granted, then transfer is easy to account for. Indeed, although there is no natural link between, for instance, R and a red square, a natural mapping may be established between the subjective unit "RMR" and "RED/YELLOW/RED", or any other color alternation. Again, the observation of a positive transfer is irrelevant as to whether subjects have abstracted the complex grammar used to generate the material. It can be accounted for more parsimoniously by assuming that subjective units are at least partially represented in a relational code.
For a still more complex illustration, let us consider one of the recent studies by Shanks et al. (1997), which concluded that transfer in artificial grammar learning is mediated at least to some extent by abstract knowledge. Experiment 1 used a standard changed-letter procedure, in which the letters used during study, M, R, T, V, and X, were replaced by C, H, J, L, and N respectively for the test. Shanks et al. introduced 5 types of violations in their ungrammatical transfer strings. The only violation that led participants to reject the strings in a forced choice grammaticality test was illegal letter repetitions. In the original grammar, only R, T, and V could be repeated. Thus, in legal transfer items, H, J, and L could also be repeated, but C and N could not. Shanks et al. showed that participants rejected transfer items including a repetition of one of these two letters at a significant level. Such a result suggests that subjects were able to perform a quite sophisticated analysis, including at least two steps. They first have to identify the fact that M and X were never repeated in the original set, then to establish a correct mapping between M and C, on the one hand, and X and N on the other.
It can be shown that correct responses imply neither of these steps. Let us assume that participants have formed subjective units, each composed of a few letters. An examination of the training strings shows that these subjective units include far fewer repetitions than if letters had been selected at random. The training strings included 9 repetitions, whereas we assessed (through a computational simulation) the number of repetitions expected by chance at about 22. Now, looking at the 5 pairs of transfer strings testing the "illegal letter repetition" feature, it appears that ungrammatical test strings always include more letter repetitions than grammatical test strings. For participants to choose the grammatical item from each pair, it is enough that they feel encoding units including a letter repetition to be unfamiliar. The point is that there is strictly no need to infer what letter repetitions were legal in the study strings, or to establish a letter-to-letter mapping: it suffices to be sensitive to the fact that subjective units rarely include a letter repetition, whatever the nature of these letters. Transfer originates in the fact that a unit's feature such as "including a letter repetition" may be captured naturally, and not in the abstraction of the rules of the finite state grammar used to generate the letter strings (for other analyses pointing to the primary importance of repetition structure to account for transfer in artificial grammar learning, see Gomez, Gerken, & Schvaneveldt, 2000; Tunney & Altmann, 1999).
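How the chance level of repetitions was simulated is not detailed in the text. The following Python sketch shows one way such an estimate could be obtained, under loudly stated assumptions: letters are drawn independently and uniformly from the five study letters, a repetition is counted whenever a letter immediately follows itself, and the string lengths are hypothetical placeholders because the actual training set is not reproduced here. The sketch therefore illustrates the method only and is not expected to reproduce the figure of about 22.

```python
import random

LETTERS = "MRTVX"  # the study letters named in the text

def count_repetitions(string):
    """Number of positions at which a letter immediately repeats."""
    return sum(a == b for a, b in zip(string, string[1:]))

def expected_repetitions(lengths, n_runs=10_000, seed=0):
    """Monte Carlo estimate of the total repetitions in a set of random strings."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_runs):
        for n in lengths:
            s = "".join(rng.choice(LETTERS) for _ in range(n))
            total += count_repetitions(s)
    return total / n_runs

# Hypothetical string lengths (assumption: NOT the actual training strings).
lengths = [4, 5, 6, 6, 7]
estimate = expected_repetitions(lengths)

# Analytic check: each adjacent pair matches with probability 1/5.
analytic = sum(n - 1 for n in lengths) / len(LETTERS)
print(round(estimate, 2), analytic)
```

The observed count for a real training set would be obtained by applying count_repetitions to each training string (or to each subjective unit) and summing; comparing that sum with the chance estimate gives the kind of contrast described above.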
With an appropriate change in terminology, we believe that the studies by Wulf and Schmidt (1997), in which successful transfer was observed on repeated patterns in a motor tracking task even though the tracking patterns were scaled differently in amplitude or speed in comparison with the training session, can be easily encompassed within the same line of reasoning. Indeed, to be brief, the analogy between a small and a large movement pattern is immediate and natural. More formally, the natural correspondence between the training and the transfer patterns follows from the long-standing contention that motor behavior may be subdivided into a deep, spatial-temporal structure (Schmidt's "relative timing" of movement), and a component which is scalable in terms of amplitude and rate. There is a natural term-to-term mapping between the training and the transfer patterns because movement is not encoded in absolute spatial or temporal units, but instead as a generalizable internal schema. The spatial-temporal structure, we argue, is analogous to the representations emerging from the processing of the strings of letters in artificial grammar learning: both are, in some sense, schematic, flexible, prone to generalization, although the dimension on which generalization occurs is unrelated to the dimension involved in the generation of the rules.
6.4 Is Our Account of Transfer More Parsimonious?
To recapitulate, in the conventional models, the data made available to the central processor are the individual sensory-based events. The task of finding analogies between events which differ in their surface appearance is the job of some further inferential processes. These processes belong to the domain of cognition, and more precisely, because we are not aware of them, to the realm of the sophisticated cognitive unconscious. In our alternative conception, unconscious (but elementary) processes provide a conscious representation of the sensory input that is directly framed in some abstract and relational way, as any conscious content is. With this modified input, the performance observed in transfer situations no longer needs to be explained in terms of a sophisticated unconscious processor. The ubiquitous learning and memory processes evoked in the previous sections are sufficient to explain the emergence of a reliable representation of the deep structure of the material. In Sections 3 and 4, we indicated how simple principles of associative learning and memory explain the emergence of conscious representations which are increasingly isomorphic to the world structure in cases where the sensory domain remains identical. When applied to more abstract primitives, the very same principles account for the discovery of the structure of the material in cases where the sensory domain is changed. Suppose, for instance, that a grammar-generated string such as ABA is naturally perceived as a trill. If this particular pattern is not repeated, it will be quickly forgotten, and other more frequent patterns will certainly emerge. However, if a trill re-occurs frequently, even under different surface features, it will become a part of the subject's representation, which in turn guides the perception of the material which is displayed subsequently.
Thus, where the conventional approach makes use of complex inferential, rule inference processes which are applied to unconscious representations, we propose no operations other than those driven by the ubiquitous mechanisms that are basic to our approach.
Opponents of this position might argue that our conception simply shadows or resituates the problem instead of solving it. The argument would be that positing that ongoing sensory information is directly coded into an abstract and relational code is akin to taking as premises the to-be-explained phenomenon, and presumably further consideration of this initial stage of processing would indicate that it, in fact, involves the same kind of complex machinery that most authors include under the label of cognitive unconscious. This criticism is unsound, however, because the relationships we assume to be directly coded by low-level perceptual processes are considerably simpler than the abstract rules of the mainstream tradition. They are limited to a few aspects, including the same/different distinction, the properties of symmetry, repetition, and alternation, and relationships along some perceptual dimensions such as smaller than or brighter than. It is not biologically implausible to assume that these relationships are coded at earlier stages of neural processing, although there is as yet no direct evidence (one exception is the direct coding of the relation brighter than, which is at least partially coded at the retinal level by lateral interaction between concurrent stimulations).
In the absence of more extensive neuropsychological arguments, our hypothesis finds some support in the primacy of relational coding in phylogenetic evolution. It has long been shown that animals such as rats are able to perform tasks involving elementary forms of relational learning successfully. For instance, if rats are trained with two stimuli differing in brightness in such a way that the choice of the brighter is rewarded and the choice of the darker not rewarded, they subsequently choose the brighter of two new stimuli even though the absolute brightness of the new rewarded stimulus may be identical to that of the old unrewarded stimulus. Thus rats appear to be sensitive to the relationship between stimuli rather than to their absolute properties. Such a demonstration has been replicated with various animal species and using a variety of simple relationships, such as larger than. Primates and a number of birds also appear able to learn a discriminative response to pairs of stimuli depending on whether they are identical or different, and once acquired, this ability transfers to any new stimulus pair irrespective of its nature (see note 11). Within the perspective of evolutionary biology, these results are not at all surprising. In many cases, the raw information provided by an isolated event is only partially relevant. For instance, the retinal size of a perceived object or animal is uninformative, because it depends on the distance between the observer and the distal stimulus. Similarly, the absolute brightness provides incomplete information, because perceived brightness depends on the ambient luminance. Considerably more reliable information is provided by a relational coding by means of which the size or brightness of a new stimulus is assessed by comparison with contextual stimuli.
6.5 Analyzing Transfer Limitations and Failure
As pointed out earlier (Section 1.5.1), a major advantage of a parsimonious account lies, somewhat paradoxically, in its limited power, which makes it easier to falsify. Indeed, our account is certainly unable to explain all possible kinds of transfer, and a demonstration that these types of transfer actually occur should be taken as a compelling refutation. This section is devoted to showing that transfer is in fact severely limited, as our account anticipates.
6.5.1. The Transfer Decrement Phenomenon. In experiments where positive evidence of transfer is reported, performance levels on the transfer situations are, as a rule, lower than performance levels on the original training situation. This so-called transfer decrement phenomenon raises a problem for a rule-based standpoint. In an authoritative discussion on the use of abstract rules, Smith et al. (1992) posit as the first of their eight criteria for rule use that "Performance on rule-governed items is as accurate with unfamiliar as with familiar material" (Smith et al., 1992, p. 7; see also Anderson, 1994, p. 35). In the context of artificial grammar learning studies, Whittlesea and Dorken posit that "a subject who learned a useful rule would have equal success in transfer on stimuli presented either in the same or different features, because the rule is applicable regardless of the features in which items are presented" (Whittlesea and Dorken, 1997, p. 66). Manza and Reber (1997) acknowledge this implication of their own abstractionist view. Thus, an essential prediction of any system that uses algebraic rules to represent its knowledge about some domain is that its transfer performance on novel items should be just as good as its performance on familiar items. The question arises: why is the phenomenon of transfer decrement ubiquitous?
A simple way to reconcile the empirical evidence with the assumption that knowledge is rule-based is to assume that the rules are not absolute, but probabilistic, and that they have limited scope. Although this argument is logically sound, it is clear that it severely undermines the core advantage of rule-based approaches, namely that they provide general and abstract descriptions of the stimuli. Rules that only apply to familiar cases obviously have only limited interest. In short, rules have a potential adaptive value insofar as they can be applied to novel situations. This is indeed what made them so attractive to early cognitivists such as Chomsky. Another possible explanation of the transfer decrement phenomenon in a rule-based framework would be that the usual training conditions provide insufficient practice. This explanation accords with Manza and Reber's (1997) view. According to these authors, performance is initially sensitive to low-level surface features, then becomes increasingly independent of those features, and exclusively determined by the deep structure of the material. After sensory-based representations have been built, to quote, "an ‘abstractor’ would come into play, gradually removing irrelevant surface elements and leaving only structural elements in the representation" (Manza & Reber, 1997, p. 101). The transfer decrement phenomenon would correspond to an intermediate stage of training in which performance would reflect a mix of influences from specific and abstract components, and in which the top level of the abstractive process has not yet been attained.
Pacton, Perruchet, Fayol and Cleeremans (2001) tested this hypothesis. They reasoned that training in laboratory settings is necessarily restricted, both in duration and in the number of stimuli experienced by participants. To overcome this limitation, they tracked the time course of transfer performance over the extended durations typical of the acquisition of complex skills in natural settings. They examined the development of children's sensitivity to certain orthographic regularities based on experience of printed language. For instance, some experiments exploited the fact that, in French, the consonants that can be doubled are only doubled in medial position of words (i.e. never at the beginning or at the end). This rule is never taught and the situation therefore taps implicit learning processes. Children became increasingly sensitive to the legal position of double consonants from grade 1 to grade 5. However, the major point of interest concerned whether this sensitivity transfers to consonants that are never doubled in French. Rule-based approaches would predict that children learn the rule that consonants are only doubled in medial position from a subset of consonants that are seen in doublets, and then transfer this knowledge to consonants that are never seen in doublets. This should result in a progressive convergence of performance on seen and unseen material with training.
Pacton et al.'s results clearly invalidate this prediction. There was no trend towards a reduction of transfer decrement amplitude over the five years of training that were examined. The performance curves for seen and unseen material remained parallel throughout practice. This parallelism was observed in several experiments and also applied to other orthographic rules. Overall, these results suggest that even after exposure to, presumably, several million words in which a rule applies, children's orthographic behavior still cannot be readily qualified as rule-directed.
Note that the persistence of transfer decrement across extended practice is fully consistent with our view. To be fair, the persistence of transfer decrement is consistent with any view that relies on statistical or distributional properties of the material. Indeed, in such views, transfer is construed as generalization, with generalization gradients depending on the similarity between familiar and novel forms. In contrast with the predictions issuing from an abstractionist view, there is no obvious reason to expect that the amount of generalization depends on the level of training. Distributional approaches would predict continued lower levels of performance on novel material, even after extensive training, because the similarity between familiar and novel situations remains the same across time. In keeping with this observation, any statistical approach is able to account for parallelism between performance on familiar and novel material over practice.
6.5.2. Accounting for Transfer Failure. Up to now, we have dealt with the results showing evidence for transfer, even if the transfer decrement phenomenon makes this evidence less powerful than abstractionist theorists would presumably hope. This emphasis on positive results is warranted. However, it is worth stressing that totally negative results are certainly the most frequent outcome in the relevant literature. With the notable exception of between-letter transfer in artificial grammar learning, transfer failure has frequently been reported in the literature on implicit learning. Total failure to obtain transfer to new material with dissimilar surface features is the rule in studies involving serial reaction time tasks (e.g. Stadler, 1989; Willingham, Nissen, & Bullemer, 1989) or control process tasks (e.g. Berry & Broadbent, 1988; Squire & Frambach, 1990) (see note 12). In the conclusion to their review of transfer in the most current implicit learning paradigms, Berry and Dienes (1993, p. 180) pointed out that "the knowledge underlying performance on numerous tasks... often fails to transfer to different tasks involving conceptually irrelevant perceptual changes". This empirical finding leads the authors to propose that limited transfer to related tasks is one of the few key features of performance in implicit learning tasks. Likewise, a surprising specificity of learning has been observed in the coordination between perception and action during infancy (e.g. Adolph, 2000). In the literature on problem solving, which will be discussed in the next section, there is also overwhelming evidence for the difficulty of transferring the solution of a problem to another, when both problems have the same deep structure but different surface features (e.g. Clement, 1994).
A model positing a powerful unconscious rule abstractor is obviously equipped to account for positive results, but, as an inevitable consequence, is deeply underpowered in the face of negative results. Demonstrations of the empirical influence of problem content on performance have challenged the prevalent models of problem solving in the last decade, which have typically had recourse to formal or abstract rules, with a striking separation (based on the computer analogy) between rule-based programs and stored representations (e.g. Braine, 1978; Cheng & Holyoak, 1985). Most rule-based accounts have difficulty in predicting when and how transfer occurs and when and how transfer fails. By contrast, the SOC framework makes predictions about the conditions that are likely to promote, or hamper, the possibility of transfer. Briefly, transfer is expected only when the commonality between the training and the new situation is a part of the conscious representations triggered by the two situations. In other words, transfer is only possible when the elements common to the original and the new situation are components of the conscious percepts. More precisely, the SOC framework anticipates that transfer occurs only when subjects' attention has been focused on the common abstract features. Many results lend support to this prediction. As Reeves and Weisberg (1994) concluded in their review, "in almost all cases, subjects must either work at schema induction by comparing the similarity between base analogues (Catrambone & Holyoak, 1985; Reeves & Weisberg, 1990), mapping one analog onto another (Ross & Kennedy, 1990), or being explicitly provided with schematic principles that accompany the base analogues (Fong et al., 1986; Gick & Holyoak, 1983)" (Reeves & Weisberg, 1994, pp.390; see also Clement, 1994).
Needless to say, we are not arguing that the SOC model is the only one capable of accounting for these data (see e.g. Singley & Anderson, 1989). Rather, our claim is that the findings evidencing transfer limitations and failures are compatible with this model, while they are difficult to reconcile with the idea of a cognitive unconscious giving automatic access to the deep structure of a problem.
7. Problem Solving, Decision Making, and Automaticity
The experimental studies presented above suggest that the formation of conscious representations which are consonant with the structure of the material accounts for at least some of the phenomena usually attributed to processes that would operate through the sequential, analytical manipulation of information. Our aim now is to show that this suggestion may find echoes in the literature on problem solving, decision making, and automaticity.
7.1 Problem Solving and Incubation
Each of us has direct evidence of the sequential manipulation of symbols according to certain logical rules. Indeed, the solution to a problem is sometimes obtained through the effortful elaboration of a chain of reasoning. However, in many cases, the solution to a problem springs to mind without the phenomenal experience of engaging in logic-analytic operations. Conclusions simply rise to consciousness, without being the outcome of a worked-out inference. This dual nature of reasoning was acknowledged long ago, and has been framed in different terminologies (Sloman, 1996). For instance, Smolensky (1988) distinguished between a rule interpreter and an intuitive processor. Likewise, Shastri and Ajjanagadde (1993) opposed reflective reasoning, which requires conscious deliberation, and reflexive reasoning, in which inferences appear as a reflex response of our cognitive apparatus. Johnson-Laird (1983, p. 127) talks about explicit and implicit inferences, and Hinton (1990) distinguished between complex (rational) and simple (intuitive) inferences to refer to the same distinction. We are only concerned here with the second aspect of these dichotomies, the one which taps what Dulany (1997) calls the evocative mental episodes.
7.1.1 Problem Solving as the Formation of New Subjective Units. In keeping with the dominant zeitgeist, solving complex problems without the apparent involvement of explicit, deliberative processes, is commonly attributed to the action of an unconscious and sophisticated processor. The underlying idea is that the solution to a problem may be worked out in the absence of conscious awareness of the operations required by this problem. Our suggestion is that intuition and insight, and all the cases in which logic-like operations are apparently performed by the mind in the absence of conscious thought, can be encompassed within the notion of self-organizing consciousness. We have seen above how the notion of self-organizing consciousness allows us to account for the formation of internal representations which are increasingly congruent with the world structure. If we expand the scope of these representations to the various dimensions involved in a given problem, it becomes conceivable that a representation contains, in some sense, both the data and the solution of the problem. The solution pops up in the mind, because it is a part of the model of the world that people have built through automatic associative processes.
Let us take a simple example, one relating to the notion of transitivity. In the linear ordering tasks, two premises are presented, the formal expression of them being: A is longer than B and B is longer than C. Participants have to judge whether an expression such as: A is longer than C, is correct. It can be assumed that people solve this task because they have some formal notion about the transitivity of the expression "longer than", and that they apply the transitivity rule to the problem at hand. However, it is far simpler to assume that people have built an integrative representation of the premises in the form of a linear array, and then read the response to the question directly on this representation. There is now a consensus about the idea that people proceed in this way (Evans, Newstead, & Byrne, 1993). This illustrates how a representation which is isomorphic to the world structure makes rule knowledge unnecessary.
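The contrast between applying a transitivity rule and reading the answer off an integrated representation can be sketched in Python. This is a minimal illustration under stated assumptions: the function names build_array and is_longer are hypothetical, and the premises are assumed to form a single determinate chain in which each new premise shares exactly one term with the array built so far. It is not offered as a model of how people actually construct the array.

```python
def build_array(premises):
    """Integrate 'X is longer than Y' premises into one ordered list.

    Assumption: the premises form a single determinate chain, and each
    premise after the first shares exactly one term with the array so far.
    """
    array = []
    for longer, shorter in premises:
        if not array:
            array = [longer, shorter]
        elif longer in array and shorter not in array:
            array.insert(array.index(longer) + 1, shorter)
        elif shorter in array and longer not in array:
            array.insert(array.index(shorter), longer)
    return array

def is_longer(array, x, y):
    """Read the answer off the array: x is longer if it comes before y."""
    return array.index(x) < array.index(y)

array = build_array([("A", "B"), ("B", "C")])
print(array)                       # ['A', 'B', 'C']
print(is_longer(array, "A", "C"))  # True
```

Note that no transitivity rule appears anywhere in the code: the relation between A and C was never stated, yet it is available as a by-product of the positions that the terms occupy in the array.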
This claim is reminiscent of various proposals, from the notion of mental models advanced by Johnson-Laird (1983), to the representation/computation trade-off envisaged by Clark and Thornton (1997). Shastri and Ajjanagadde's (1993) simulation model of reasoning relies on the same general view. These authors show how a neural network may simulate reasoning through the formation of a model of the world. To borrow their terms: "The network encoding of the Long term Knowledge Base is best viewed as a vivid internal model of the agent's environment, where the interconnections between (internal) representations directly encode the dependencies between the associated (external) entities. When the nodes in this model are activated to reflect a given state of affairs in the environment, the model spontaneously simulates the behavior of the external world and in doing so makes predictions and draws inferences". In Shastri and Ajjanagadde's framework, the internal model of the world takes the form of a neural network, and the authors do not provide a detailed account of the question of learning. Moreover, they say nothing about the issue of consciousness. However, it is easy to see how the same view can be held about the conscious representations which are built thanks to their self-organizing properties: Representations become able to provide a model of the world in which some structural relations that have not been encoded as such can be directly "read", instead of being computed through analytical inference processes.
7.1.2 Incubation. A marginal aspect in the literature on problem solving concerns the phenomenon of incubation. Everyone has had the experience of the solution to a problem suddenly occurring after we have given up our deliberative and unsuccessful search for it. The phenomenon may happen for relatively simple problems of daily life, as well as in more sophisticated situations. For example, Henri Poincaré provided a fine-grained description of this effect based on his own experience of the resolution of very complex mathematical problems. The phenomenon was termed incubation by Wallas (1926). According to Wallas, when the solution to a problem is not directly reached through explicit, step-by-step reasoning, it may be useful to suspend the search for a solution, in order to allow "the free working of the unconscious or partially conscious processes of the mind". This phenomenon is somewhat difficult to investigate in the laboratory, but there is nevertheless some experimental evidence for it. For instance, Fulgosi and Guilford (1968) asked their participants to anticipate the consequences of various improbable events, either for a period of four minutes, or during two sessions of two minutes separated by unrelated activities. Delays of at least 20 minutes were beneficial in producing responses. Such phenomena provide, at first glance, clear-cut evidence for the fact that after suspension of deliberative search, a sophisticated cognitive unconscious takes over and goes on searching in parallel to the overt activities.
However, as claimed by Mandler (1994) in an overview of the phenomenon, "there is no direct evidence that complex unconscious 'work' (new elaborations and creations of mental contents) contributes to the incubation effects" (Mandler, 1994, p. 20). This is because incubation can be accounted for in much simpler terms. Instead of imagining that the filler task leaves the cognitive unconscious free to search for a solution, it may be assumed that the intervening task makes it possible to forget certain aspects which are irrelevant to the solution of the problem at hand. The forgetting of inappropriate elements of response should promote the emergence of a new perceptual structuring. Smith and Blankenship (1989, 1991) have provided experimental evidence for this hypothesis: When misleading information was given to subjects while they were trying to solve various problems, an incubation delay led both to an improvement in problem solving and reduced memorization of the misleading information, with a close relation between the two effects.
Here again we find the idea developed in PARSER that forgetting is crucial for the formation of perceptual representations isomorphic to the structure of the material. For the sake of illustration, let us suppose that the correct segmentation of batubidutaba is batubi/dutaba, but that a subject initially perceives batu/bidu/taba. These percepts shape new internal units, and because perception is guided in turn by earlier processing units, the same display has a chance of eliciting the same erroneous perception in subsequent trials. Fortunately, internal units are progressively forgotten during the delay intervening between two repetitions. This makes it possible for a new parsing -which may turn out to be correct- to occur in subsequent trials. In one sense, one could say that, in PARSER, correct segmentation is the product of an incubation effect. Obviously, the subjective experience of mind popping is lacking with an artificial language, because a solution, whatever it is, never corresponds to a meaningful perception, as may be the case with other materials. But it is easy to imagine how the model could account for mind popping in a situation where a correct solution could be immediately identified as such, instead of being gradually confirmed with training.
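The role assigned to forgetting in this example can be made concrete with a deliberately minimal sketch. It is not the PARSER model itself: the unit weights, the decay rate, and the threshold below are hypothetical values chosen only to show how a delay filled with forgetting can release perception from an earlier, erroneous parsing.

```python
# Toy illustration only: NOT the published PARSER model.
# All numerical values (weights, decay rate, threshold) are hypothetical.

def decay(lexicon, rate):
    """Forgetting: every stored unit loses `rate` in weight; units at or below zero vanish."""
    return {unit: w - rate for unit, w in lexicon.items() if w - rate > 0}

def guides_perception(lexicon, threshold):
    """Return the units strong enough to shape the next percept."""
    return {unit for unit, w in lexicon.items() if w >= threshold}

# Units created by the erroneous parsing batu/bidu/taba.
lexicon = {"batu": 1.0, "bidu": 1.0, "taba": 1.0}
THRESHOLD = 1.0

# With no delay, the erroneous units still guide perception of the same display.
no_delay = guides_perception(lexicon, THRESHOLD)

# After a delay, the units have decayed below threshold, so the display
# can be parsed afresh (possibly, but not necessarily, as batubi/dutaba).
after_delay = guides_perception(decay(lexicon, rate=0.3), THRESHOLD)

print(no_delay)     # the three erroneous units
print(after_delay)  # empty set: perception is no longer constrained
```

In this sketch the delay does no "work" on the problem; it only removes the constraint that earlier units placed on perception, which is the sense in which correct segmentation can be described as an incubation effect.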
To conclude this discussion of problem solving, it appears that the formation of conscious representations thanks to elementary mechanisms of associative learning is able to account for many cases where the discovery of a solution has been attributed to some unconscious analytical reasoning. The phenomenon of incubation, which gives us a strong intuitive feeling that some unconscious genius goes on working inside our minds alongside our conscious occupations, might be nothing other than the forgetting of structurally irrelevant solutions. This conclusion fits well with the conclusion reached in earlier sections about other forms of learning. It could also be expanded to other forms of learning that space limitations prevent us from examining in detail. For instance, studies on concept learning have yielded similar findings. In a study involving complex, ill-defined concepts, Carlson and Dulany (1985) concluded that "hypotheses of unconscious learning are most strongly disconfirmed by evidence that the content of conscious awareness could, given reasonable process assumption, account for the learning observed" (Carlson & Dulany, 1985, p. 45).
7.2 Decision Making
Going a step further in our speculation, decision making might prove to be another area of application of our framework. Of course, as in the case of reasoning, we do not refer here to the decisions that are the products of a deliberate, step-by-step conscious analysis, but to the decisions that emerge immediately, before any rational considerations. Most often, when faced with a choice, we have an immediate preference for one alternative, and explicit thoughts, when they occur, are merely able to suggest a posteriori justifications. It might again seem that spontaneous decisions are the product of an unconscious analysis of all the factors relevant for this decision. Our model suggests a far more parsimonious explanation, provided that we make some additional assumptions. Phenomenal experience does not only comprise the cold representations of the world: it is emotionally valenced, either positively or negatively. Our proposal is that decisions could be directly based on this affective valence, and that the affective valence is itself the end-result of associative processes such as those involved in PARSER. In other words, we suggest that a situation is directly perceived as positively or negatively valenced, this feature being a consequence of the self-organizing property of consciousness. Indeed, there is no reason to think that emotive components escape from the associative processes that shape conscious experience. On the contrary, we have experimental evidence, through the studies on conditioning, and especially the recent studies on evaluative conditioning (e.g. De Houwer, Hendrickx, & Baeyens, 1997), that the emotive components are responsive to the same mechanisms as those involved in PARSER (see also the quotation of Mackintosh, 1997, in Section 2.1.3 of the present paper).
Thus the conscious representations which have developed under natural conditions are probably endowed with an emotive dimension which results from self-organization and which may be directly responsible for the decision.
7.3 Automaticity
The terms automatic and unconscious are often used interchangeably in everyday language. This is also the case in the writings of several psychologists, such as Jacoby (e.g. Jacoby, Ste-Marie, & Toth, 1993). At the same time, there is a consensus on the idea that an automatic mode of responding is not limited to the simplest situations. Combining these two premises leads us to infer the existence of complex and sophisticated unconscious processing, a conclusion that is at odds with our general framework. Which of the two premises turns out to be questionable? We have no problem with the claim that people are able to deal with complex situations in automatic ways. Reading is often designated as the archetypical example of automatism, and, irrespective of the fuzziness inherent in the concept of complexity, it must be acknowledged that accessing the meaning of a word from its graphemic representation is anything but a simple task. However, we strongly disagree with the collapsing of the notions of automaticity and unconsciousness. To make the point clear, we need to return to the literature on automatism formation.
There is general consensus that automaticity can be defined in terms of three main criteria (e.g. Neumann, 1984). The first refers to a mode of operation: An automatic process is not subject to interference from attended activities, and does not interfere with such activities. This criterion is often operationalized by the lack of interference in dual task experiments, in which participants have to carry out two actions simultaneously. The search tasks, in which participants are assumed to perform operations in parallel on a single visual display, are also used for the same purpose. The second criterion refers to a mode of control: An automatic process can be triggered without a supporting intention (strategies, expectancies, and so on), and, once started, cannot be stopped intentionally. The Stroop task is the preferred way of investigating this property. In the prototypical version of this task, of which many variants exist, subjects are asked to name the color of a word while ignoring the word. The time taken to identify the color when it is paired with an incongruent color word is usually found to be longer than when it is paired with a neutral word, an effect revealing that the irrelevant word has been processed without intent. Finally, the concept of automaticity is defined by a mode of representation: Automatic processes are often unconscious. All of these properties are conceived of as a consequence of extended training. A given processing, initially susceptible to interference and under subjects' conscious control, progressively loses these properties during practice with the task. This general pattern of changes, which can be observed in many situations of our everyday lives, leaves us with the idea that the very same operations that are initially performed consciously come to be performed, after appropriate training, by a powerful unconscious processor operating in parallel.
To begin with, it is worth emphasizing that the above description provides an idealized view of the phenomenon. The whole literature on automatism is characterized by a few initial papers which have posited a set of defining criteria (e.g. Shiffrin & Schneider, 1977; Hasher & Zacks, 1979), followed by an overwhelming number of experiments demonstrating that these criteria are never fulfilled, even in those activities, such as reading, that everyone believes to be prototypical of automatisms. A convincing argument for the graded nature of automatisms was presented in two well-documented reviews as early as the mid-1980s (Neumann, 1984; Kahneman & Treisman, 1984). Subsequent research has confirmed and strengthened this standpoint. Maybe we should place special emphasis on the Stroop effect, because this effect is recurrently described as a compelling demonstration that reading lies outside of people's intentional control. In a recent experimental paper entitled "The Stroop effect and the myth of automaticity", Besner, Stolz, and Boutilier (1997) report that the Stroop effect is eliminated when a single letter instead of the whole word is colored. From this and other related findings, they conclude that empirical data "are inconsistent with the widespread view reiterated in over 500 journal articles, chapters, and textbooks that a Stroop effect occurs because unconscious/automatic processes cannot be prevented from computing the semantics of the irrelevant word" (Besner et al., 1997, p. 224).
However, the fact that the properties of automatisms are gradual rather than all-or-none is not sufficient to rule out the view that, as an effect of repeated practice, cognitive operations and representations progressively relax their initial link with conscious awareness. Consciousness, in this view, appears to be an optional quality of cognitive activities, a proposal that is in contradiction with our framework. The point we wish to make here is that although our framework is indeed incompatible with the possibility of transferring operations from a conscious to an unconscious mode, the idea that automatization consists in such a transfer is only one of several theoretical accounts of the phenomenon. This account is instantiated by the LaBerge and Samuels (1974) theory, in which automatization is equated to the progressive withdrawal of attention from operations that are otherwise left qualitatively unchanged. In a similar vein, Shiffrin and Schneider (1977) argue for a transition from serial to parallel processing. These theories are obviously consistent with the prevalent zeitgeist, and converge to strengthen the view that the cognitive unconscious can perform the very same processing as conscious thought but with even greater proficiency.
These interpretations of automatisms were challenged by Logan and his collaborators (e.g. Logan, 1988). For Logan, the withdrawal of attention that characterizes automatization is not a cause, but a consequence of a change in the nature of the operations performed by the learner. The change is described as a transition from performance based on a general algorithm to performance based on memory retrieval. Logan illustrates this idea in the field of arithmetic computation: Initially, children perform, say, additions, with a general counting algorithm but, after practice, they retrieve sums directly from memory without counting. The point is that step-by-step counting operations do not transfer from a conscious to an unconscious mode of control: They are simply deleted, and replaced by another operation. This theory accounts nicely for the empirical data in which the notion of automatism is rooted. Indeed, retrieval from memory requires a minimal amount of cognitive effort and attention, and hence interferes minimally with other operations. Also, retrieval from memory is often triggered by the surrounding stimuli without any possibility of intentional control. Lastly, the nature of the process engaged to retrieve the solution is unavailable to consciousness. For instance, in the face of the problem: 5+3=?, an adult subject produces the response "8" with minimal cost, has difficulty in preventing the occurrence of this solution in mind, and has no introspective knowledge of the way by which the solution pops into his/her mind. These characteristics strikingly differ from those of the operations undertaken by a child performing the same addition on her fingers.
Several, although not all, aspects of Logan's theory of automaticity are directly compatible with the SOC model. To go a step further, Logan's so-called "instance theory" is based on three main assumptions. The first is the obligatory encoding assumption, which asserts that attentional processing of an event causes it to be encoded in memory. The second assumption is the obligatory retrieval assumption, which asserts that attentional processing of an event causes the retrieval of whatever was associated with this event in the past. The SOC model shares the same two assumptions. However, it strongly departs from the instance theory on the third assumption. Logan assumes that each event is represented separately in memory, even if it is identical to a previously experienced event. As extensively described above, the SOC model is rooted in an associative theory of learning and memory, which provides, we believe, a far better account of the progressive tuning to the world structure of subjective percepts and representations. However, our point here is not to discuss Logan's theory further, but rather to borrow the elements of this theory that allow the SOC model to encompass the data related to automaticity.
In the SOC model, automaticity may be construed as the possibility for a subject of forming a new conscious representation the components of which were previously perceived as independent primitives. Note that this definition does not differ from the one we proposed for implicit learning. When people create a new unit such as bupada, this unit is also composed of initially independent primitives such as bupa and da. The difference lies in the fact that, for instance, the final unit bupada is given in the data, and needs only to be captured through selection from other possible units. By contrast, the final unit "5+3=8" needs to be built through time-consuming operations on the part of the subjects. But this difference does not mean that the final outcome differs: after training, people evoke the conscious unit "5+3=8" in the very same way that they evoke the conscious unit "bupada". As Logan contends, automatic behavior is nothing other than memory retrieval.
The difference between this interpretation and the various interpretations framed in terms of attention withdrawal (e.g. LaBerge & Samuels, 1974) or parallel processing (e.g. Shiffrin & Schneider, 1977) is overwhelming. This difference does not primarily refer to a simple/complex dimension. Presumably, the biological mechanisms involved in creating a single conscious perception and representation are incredibly complex (but all the resources of the neural circuitry can be recruited for this task, given that it is the only one to be performed at a given moment). The point is that these mechanisms are grounded on associative principles, and do not involve the manipulation of unconscious symbol-like representations. A particularly clever empirical demonstration that automatic behavior does not consist in performing unconsciously the very same set of operations initially performed under attentional control has been provided by Zbrodoff (1999) in the context of arithmetic problems. Zbrodoff reasoned that if skilled people pass through intermediate counts while they solve a simple addition problem (e.g. going through 5 and 6 when they solve 4+3) as children do when they begin to do arithmetic, intermediate counts should have a priming effect on subsequent tasks that involve those intermediate counts. She tracked this effect while subjects practiced alphabet arithmetic problems (e.g. B+4=F), and found that the priming effect of intermediate counts, observable in novices, disappeared after extensive practice.
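The contrast between the two modes of solving an alphabet arithmetic problem can be illustrated with a short sketch. It is offered only as an illustration of the logic of the argument, not as a model of Zbrodoff's procedure or of Logan's theory: the function names and the `memory` dictionary are hypothetical, and the sketch assumes an addend of at least 1.

```python
import string

LETTERS = string.ascii_uppercase

def solve_by_counting(letter, addend):
    """Novice strategy: step through the alphabet one letter at a time.

    Returns the answer together with the intermediate counts passed through.
    Assumes addend >= 1 and that the answer stays within A-Z.
    """
    start = LETTERS.index(letter)
    visited = [LETTERS[start + k] for k in range(1, addend + 1)]
    return visited[-1], visited[:-1]

def solve_by_retrieval(letter, addend, memory):
    """Practiced strategy: the answer is evoked as a single stored unit.

    No intermediate counts are generated, so there is nothing to prime.
    """
    return memory[(letter, addend)], []

# Hypothetical store built up through practice with the problem B+4=F.
memory = {("B", 4): "F"}

print(solve_by_counting("B", 4))           # ('F', ['C', 'D', 'E'])
print(solve_by_retrieval("B", 4, memory))  # ('F', [])
```

Both functions return the same answer, but only the counting strategy produces the intermediate letters C, D, and E. If practiced subjects were running the counting routine unconsciously, those intermediates should still be available to prime later tasks; their absence is what retrieval of a single unit predicts.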
Up to now, we have dealt only with cognitive automatisms, such as reading and arithmetic calculation. In the psychological literature, as well as in everyday language, the notion of automatism also embraces the motor components of behavior. Is it possible to encompass these aspects within the view outlined here? We believe that the answer is yes, provided we accept the possibility that action and its results may be admitted as components of the phenomenal experience, in the same way as sensory input. This would lead to the formation of rich representations including not only our body and the world, but the interaction between them. Again, the entire literature on conditioning, and especially instrumental or operant conditioning, provides striking demonstrations that organisms' responses can enter into associative links. In keeping with our general framework, it follows that our own action and its consequences can participate in the self-organization of conscious representations, thus providing structurally isomorphic representations of the world, including ourselves and the consequences of our own actions. To take a simple example, flipping the light switch while entering a familiar dark room may become a constitutive component of the phenomenal experience of this life episode.
We mentioned above that automaticity and the absence of consciousness are frequently referred to as identical (e.g. Jacoby, Ste-Marie, & Toth, 1993). Our conclusion is, ironically, the exact opposite. The phenomenal experience is, to a large extent, the product of an automatization process. Tzelgov recently entitled one book chapter: "Automatic but conscious: That is how we act most of the time" (Tzelgov, 1997, p. 217). We fully agree with this claim, which, unsurprisingly, Tzelgov infers from his endorsement of Logan's theory of automaticity on the one hand, and Dulany's mentalistic framework on the other. Obviously, we are conscious of the output of the mechanisms involved, and not of the mechanisms themselves. But the automatisms have no specificity in this regard: this is the case for all biological processes. Automatic behaviors are unconscious in the same sense that, say, the explicit remembering of the past may be said to be unconscious: In both cases, we have no access to the mechanisms generating the current phenomenal experience. What gives us the feeling that some sophisticated computation on symbolic representations occurs unconsciously in automatized performance is linked to the belief that performance after extensive practice involves the very same set of operations that was required at the beginning of practice. Once this assumption is abandoned, automatized activities can be qualified in the same way as any other activities: they are the conscious outcomes of unconscious mechanisms.
8. Revisiting Other Purported Evidence for the Cognitive Unconscious
The primary objective of all the prior sections was to demonstrate the viability of a framework involving exclusively conscious representations and computation, by showing how the concept of Self-Organizing Consciousness could account for a wide range of adaptive phenomena usually considered to be mediated by the cognitive unconscious. However, for obvious reasons, we have not commented on arguments that lend support to the cognitive unconscious based on data we consider to lack a reasonable empirical basis. We now have to consider these arguments. First, we will deal with the idea that one or several events could influence the processing of a subsequent event without the initial episode(s) having been attentionally processed. Second, we will turn towards the phenomenon of unconscious semantic priming, which constitutes one of the most immediate objections to be raised when the possibility is suggested, in formal settings or informal discussions, that a cognitive unconscious might have no actual existence. Finally, we will examine the literature on rare neuropsychological syndromes, such as blindsight, which also lend apparent support to a cognitive unconscious. We do not intend to provide an exhaustive discussion of these issues. Rather, our aim is to outline the way the arguments relying on these phenomena can be discounted through the detailed examination of a few examples, while referring to other discussions in the literature when available (for other critical examinations of the literature, see e.g. Dulany, 1991, 1997; O’Brien & Opie, 1999a, 1999b).
8.1 Implicit Memory and Learning Without Attentional Encoding
Research into implicit memory (or repetition priming) provides overwhelming evidence that processing stimuli may induce changes in performance on the subsequent identification or production of the same stimuli, without it being necessary to retrieve the initial encoding episode explicitly. For instance, when the initial event is the reading of verbal items, subsequent facilitation in the processing of these items has been reported for word completion, tachistoscopic identification, identification in a perceptual clarification procedure, and many other perceptual tasks that do not require the explicit retrieval of the initial event. The implicit memory tasks may also rely more on the encoded meaning of concepts than on the perceptual record of the items. For instance, in the category-exemplar generation test, participants are asked to name the first exemplar of a given semantic category that comes to mind. Exemplars that were previously displayed are evoked more frequently than unseen exemplars. Several studies suggest that the effect of the initial event may be observed even in cases where the explicit retrieval of this event is not only absent from the task demand, but made impossible, due either to specific experimental manipulations (e.g. a large study-test interval) or to amnesic disorders due to neurological lesions (for reviews, see e.g. Roediger & McDermott, 1993; for a skeptical standpoint about the experimental demonstration of the phenomenon in normal subjects, see Butler & Berry, 2001).
Although they have been the object of a considerable amount of interest over the past two decades, the basic phenomena highlighted in implicit memory research are anything but new. Indeed, nearly a century of research into conventional situations of learning and conditioning teaches us that prior experiences influence behavior in subsequent situations, without subjects explicitly remembering the events involved in the original experiences. Clear-cut evidence may also be found in everyday life. To take a simple example: Each of us is able to complete 5+3 with the solution 8, and this ability has obviously been acquired through experience. But it is unlikely that anyone is able to evoke the original training experience. The fact that repetition priming is generally studied after a single exposure differentiates this paradigm from standard learning studies, which generally involve multiple trials, and this feature may explain why the phenomenon has been compared to memory tasks such as recall and recognition, rather than being integrated in the field of learning. But this is merely a statistical difference: Many studies investigate "one-trial learning", and other studies investigate the effect of multiple repetitions on priming. These studies confirm the conclusion that might reasonably be anticipated, namely that the same processes are involved in both cases (see also Logan, 1990).
Overall, these phenomena are fully compatible with our framework. Moreover, they provide elementary examples of the progressive transformation of conscious experiences after earlier identical or similar processing episodes, and are therefore central to our approach. It should be noted that we have assumed that this transformation was mediated by the unconscious tuning of processing mechanisms as a by-product of their recruitment, and not by the explicit recall of earlier episodes. Therefore, the observation of an effect caused by a past event that is currently forgotten, whether the phenomenon occurs in normal subjects or amnesic patients (for a review, see Gabrieli, 1998), does not undermine our view in any way.
However, one aspect that is potentially difficult to reconcile with our framework has been reported. Indeed, some results suggest that an earlier event can have an effect even though this event has not been attentionally processed. In implicit memory research, the study material is usually presented in fairly standard conditions, that is without any attempt to prevent subjects from paying attention to the displayed items. Some studies have investigated implicit memory after subjects had been faced with a secondary task during the study phase and these studies have returned positive results (e.g. Parkin & Russo, 1990). But these conditions were not intended to entirely prevent attentional processing. Eich (1984), on the other hand, has reported implicit memory for verbal information that was claimed to be totally ignored in a selective listening procedure. If such a result turns out to be robust, it argues against one of our basic postulates, because it suggests that the meaning of a word can be accessed unconsciously. Also, it argues in favor of a dissociation between the conscious/attentional system and learning, whereas the linkage between the two notions is a fundamental principle of the concept of Self-Organizing Consciousness.
An examination of the literature, however, leads us to doubt the reliability of implicit memory without attentional encoding during the study phase. For instance, Eich’s conclusion has been challenged by Wood, Stadler, and Cowan (1997), who showed that Eich's positive results were due to the slow rate of presentation used in this study. This allowed participants to pay at least some amount of attention to the to-be-ignored channel. There is now overwhelming evidence that attention to the material at the time of encoding is a necessary condition for the observation of an effect of these materials in subsequent implicit memory tests, such as word completion and perceptual identification tasks (e.g. Crabb & Dark, 1999), reading tasks (MacDonald & MacLeod, 1998), or object decision tasks (Ganor-Stern, Seamon, & Carrasco, 1998).
The very same conclusion emerges from the implicit learning area, in which interest focuses on the effect of more complex and structured situations than those involved in implicit memory research. In most studies, the to-be-learned material is displayed in normal conditions, and the need for the attentional processing of this material has been acknowledged ever since Reber's early papers (1967) on artificial grammar learning. However, the hypothesis that implicit learning could occur without attentional encoding has been proposed in different contexts. Berry and Broadbent (1988), for instance, have introduced the concept of unselective (i.e. without attention) learning. Unselective learning was assumed to occur when the situation was too complex to be solved by attention-based mechanisms. Cohen, Ivry, and Keele (1990) also assumed nonattentional learning, although their proposal was diametrically opposed to Berry and Broadbent’s position. Their hypothesis was that attention is required for learning complex sequences, while nonattentional learning is effective for the simplest forms of sequential dependencies. In both cases, supporting evidence was provided by studies that used a concurrent secondary task during the training session.
Recent studies strongly challenge the claim that two forms of learning can be distinguished, with a nonattentional form emerging when the situation is very complex (e.g. Berry & Broadbent, 1988) or very simple (Cohen et al., 1990). In some cases, the prior evidence has not been replicated. For instance, Green and Shanks (1993) failed to replicate some of the results obtained by Broadbent and co-workers despite extensive attempts to do so, and observed that, as a rule, the secondary task impaired performance irrespective of the complexity of the task. In other cases, the prior evidence has been reinterpreted. Subsequent reappraisal has shown that, as in the field of memory research, the dual task conditions routinely used in the studies investigating the role of attention in implicit learning paradigms did not prevent brief attentional shifts towards the relevant information. As a case in point, Jimenez and Mendez (1999) conclude from their recent experimental studies that selective attention to the predictive dimensions in a sequence learning paradigm is necessary to learn about the sequential relationships (see also Jiang & Chun, 2001, for similar evidence from another paradigm, and Frensch, Buchner, & Lin, 1994, and Hsiao & Reber, 1998, for other approaches that emphasize the role of attention in learning).
8.2 The Unconscious Processing of Semantic Information
We now have to examine results suggesting the possibility of semantic representations without concurrent conscious experience. Of special relevance is the so-called unconscious semantic priming effect. The semantic priming effect designates the influence of a prime on the processing of an immediately following target, when this influence logically implies access to the meaning of the prime, beyond its low-level perceptual features. For instance, a word prime may shorten the naming time of a semantically related target, or influence the liking judgment of the target (e.g. Greenwald, Draine, & Abrams, 1996) even though the prime and the target are different. The unconscious semantic priming (USP) phenomenon corresponds to the case where the influential prime is unconsciously identified. Admittedly, the USP phenomenon requires that the system generates and uses a symbolic representation without any conscious counterpart, a requisite that is obviously at odds with the principles underpinning a mentalistic framework.
8.2.1. Does the USP phenomenon have a solid empirical basis? First, it should be mentioned that there are many reports of failures. For instance, Bar and Biederman (1998) asked their subjects to name a familiar object presented in subliminal conditions, as proven by the fact that performance was at chance in an immediately subsequent forced-choice test of recognition. This brief exposure resulted in a substantial increase in the naming accuracy of the same object later in the session. However, this effect was limited to the case where the very same object was presented in the two occurrences. No facilitation was observed when the second object shared the same name but not the same shape (e.g. an office swivel chair and a four-legged kitchen chair, or a motorboat and a sailboat). A considerable decline in the effect was observed when the same object was slightly translated in the visual field. Similar failures to reveal unconscious semantic processing have been reported using binocular rivalry as a tool (for a review, see Blake, 1998). When different images are shown to the left and right eyes, the conscious percept is characterized by alternating periods of left-eye dominance and right-eye dominance. The question is: how is information processed that is normally visible, but suppressed from conscious awareness while the other eye is dominant? Many experiments have shown that some aspects of information processing are unimpaired, such as certain visual aftereffects. However, once again, all semantic processing is completely disrupted. Thus, words erased from consciousness by rivalry suppression failed to improve performance in a subsequent lexical decision task (Zimba & Blake, 1983).
However, some authors claim they have obtained positive evidence for USP. Positive evidence has typically been obtained in conditions where the target follows the prime by a fraction of a second. For instance, in Greenwald et al. (1996), the influence of a prime on the liking judgment of a semantically related target was obtained only when the prime-target interval did not exceed 100 ms. To the best of our knowledge, the fact that the effect of semantic, i.e. deep, encoding may be extremely short-lived has never been considered to be an objection, even though it runs counter to the most established findings in the memory field. Whatever the case, even tagged with this astonishing characteristic, the existence of USP, if confirmed, would rule out our claim that the only cognitive representations are those that form the phenomenal experience.
Our argument is that a compelling demonstration of USP has not yet been provided. All the alleged demonstrations of the phenomenon have been followed by devastating criticisms from skeptics. A hallmark of this recurrent scenario is the BBS target paper by Holender (1986), who concluded from an impressive re-analysis of the available data that "none of these studies has included the requisite controls to ensure that semantic activation was not accompanied by conscious identification of the stimulus at the time of presentation... On the basis of the current evidence, it is most likely that these stimuli were indeed consciously identified." (Holender, 1986, p. 1). Since then, new papers for and against the argument have been published. For instance, Draine and Greenwald (1998) recently presented a new methodology to demonstrate USP, a methodology that Merikle and Reingold, in a subsequent comment, found "compromised by the same issues concerning the measurement of awareness that have plagued all previous attempts to use the dissociation paradigm to demonstrate unconscious perception in the complete absence of conscious perception" (Merikle & Reingold, 1998, p. 304; see also Miller, 2000).
8.2.2. An Illustration. The presentation of the criticisms made by Holender (1986), Merikle and Reingold (1988), and a few others, goes well beyond the scope of this paper. However, to illustrate, a detailed analysis of a new example may be useful. The recently claimed experimental evidence for USP provided by Dehaene et al. (1998) is of special interest because, to the best of our knowledge, it has not yet been challenged. The choice of this study was also prompted by both its recency and its publication in a high-impact journal, a status suggesting that it provides an existence proof of USP that addresses all the earlier criticisms. Dehaene and co-workers used a task in which participants had to press one key if a target number was larger than 5, and another key if the target was smaller than 5. The target was immediately preceded by a masked prime number, which could also be either larger or smaller than 5. Thus, by crossing the values of the primes and the values of the targets, the experiment comprised four conditions, with half of them being congruent and half incongruent with regard to the expected motor response. The authors observed that congruent conditions elicited a reaction time 24 ms shorter than incongruent conditions. Interestingly, this positive priming effect was obtained even when (a) the numerical prime was presented as an Arabic digit and the target as a spelled-out number (or vice-versa) and (b) the trials with repeated displays were removed from the analysis. This suggested that participants were not influenced by the surface similarity between the prime and the target, but instead performed a comparison between the numerical prime and 5, a task that undoubtedly taps the semantic level. On the other hand, because the authors obtained a non-significant discrimination performance for the prime, as measured by d', they argued that the effect was unconscious.
At first glance, the study does indeed support the authors' conclusion that "a large amount of cerebral processing, including perception, semantic categorization and task execution, can be performed in the absence of consciousness" (Dehaene et al., 1998, p. 599).
Unfortunately, further scrutiny of the paper leads to far less clear-cut conclusions. On the one hand, there are serious reasons to doubt that the task tapped the semantic level. Indeed, the whole study in fact involved only four numbers: two lower than 5 (1 and 4) and two larger than 5 (6 and 9). Moreover, for each target number in a specific format (e.g. 4 or NINE), the key-pressing task was repeated over 64 trials. In these conditions, after a few training trials, it appears quite unlikely that participants actually performed a comparison with 5 when the numbers were displayed, whether as prime or as target. Studies on automatism (see Section 7.3) strongly suggest that participants quickly shifted from an algorithmic mode to direct memory retrieval, linking 1 and 4 to the left key and 6 and 9 to the right key (or vice-versa depending on the group). In other words, after minimal experience with the task, it is very likely that participants were no longer performing a comparison task, and proceeded with the numbers as they would have proceeded with meaningless visual patterns or sounds, that is to say without any semantic involvement.
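The point can be made concrete with a minimal sketch. It assumes one possible key assignment (small numbers to the left key, large numbers to the right key), ignores the digit/word format of the stimuli, and uses function and variable names that are ours, not those of Dehaene et al. It simply shows that, with only four stimuli, a memorized stimulus-response table classifies every prime-target pair as congruent or incongruent exactly as a genuine comparison with 5 would, so that the congruency effect cannot by itself tell the two modes apart.

```python
# Illustrative sketch only; names and key assignment are assumptions.
STIMULI = [1, 4, 6, 9]  # the four numbers used in the study


def respond_by_comparison(n, reference=5):
    """Algorithmic mode: compare the number with the reference value."""
    return "right" if n > reference else "left"


# Direct memory retrieval: a learned stimulus-response table,
# with no numerical comparison involved.
RESPONSE_TABLE = {1: "left", 4: "left", 6: "right", 9: "right"}


def respond_by_retrieval(n):
    """Retrieval mode: look up the response associated with the stimulus."""
    return RESPONSE_TABLE[n]


def is_congruent(prime, target, respond):
    """A trial is congruent when prime and target call for the same key."""
    return respond(prime) == respond(target)


# All prime-target pairs obtained by crossing the four stimuli.
trials = [(p, t) for p in STIMULI for t in STIMULI]

# Do the two modes ever disagree on which trials are congruent?
modes_agree = all(
    is_congruent(p, t, respond_by_comparison)
    == is_congruent(p, t, respond_by_retrieval)
    for p, t in trials
)
n_congruent = sum(is_congruent(p, t, respond_by_retrieval) for p, t in trials)
print(modes_agree, n_congruent, len(trials))
```

Because the two response functions agree on every trial, any reaction-time advantage for congruent trials is equally compatible with either mode of responding.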
On the other hand, the claim for unconsciousness is also questionable. The masked prime was displayed for 43 ms. Such a duration may be sufficient for identifying a stimulus such as a word. The hypothesis that the prime could have been consciously detected is strengthened by the fact that (a) the choice was limited to four highly familiar primes; (b) the primes were short: half of them were one character long (the numbers in their digit format) and the others (the numbers written as words) comprised only a few letters; and (c) each prime was repeated 64 times (in a specific format). This last point is of major importance. Indeed, it has been shown that the simple repetition of a subliminal stimulus in the same conditions of exposure greatly improves detection. For instance, Bar and Biederman (1998) report that the rate of correct identification of a familiar object presented for an average of 47 ms (range: 42 to 56 ms) increases from 13.5 percent in the first presentation to 34.5 percent in the second, 15 min later. Finally, (d) the stimuli serving as primes were perceived as targets under normal exposure conditions throughout the experiment. Although the study was intended to capture the effect of "prime" on "target", we should also consider the possibility that, despite the labels given to the stimuli by the experimenters, the participants may also have become sensitive to the priming effect that the target had on a subsequent prime.
Although these conditions make the non-detection of the prime highly unlikely a priori, the authors report a non-significant d' in two additional experiments devised to assess the rate of detection (Exp. 1) or discrimination (Exp. 2) of the prime. However, each prime was presented only 12 times (instead of 64 in the main experiment). Still more damaging to the authors' conclusion is the fact that there was a clear descriptive trend for an effect, at an even shorter duration of presentation than the 43 ms used in the main experiment. For instance, a prime presented for 29 ms elicited 10.4 percent of hits versus 7.3 percent of false alarms. Despite this contradictory evidence, the authors concluded in favor of unconsciousness, relying on statistical non-significance. It is worth adding that the well-known shortcomings of conclusions drawn from non-significant results are especially relevant here, given the small number of observations on which the tests were based (these additional experiments were run with a smaller number of subjects than the main experiment: N = 6 and 7 respectively, instead of 12).
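To give an idea of the size of this descriptive trend, the hit and false-alarm rates quoted above can be converted into a sensitivity index. The sketch below assumes the standard equal-variance signal detection formula, d' = z(hits) - z(false alarms); we do not know whether Dehaene et al. computed d' in exactly this way (for instance, per participant or with a correction for extreme rates), so the value is only indicative.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance signal detection index: z(hits) - z(false alarms)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Rates quoted in the text for a prime presented for 29 ms.
d = d_prime(0.104, 0.073)
print(round(d, 2))
```

The resulting index is positive, which is what a descriptive trend toward detection looks like; whether it differs reliably from zero depends on the number of observations, which is precisely the issue raised above.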
8.2.3. Concluding Comments. To conclude, the Dehaene et al. study is no more conclusive than the many earlier attempts that have flourished in the literature for two decades or so. Far from demonstrating unconscious semantic priming, this study describes an effect that is neither unconscious nor semantic. Parenthetically, our reappraisal highlights the fact that even very recent alleged evidence for unconscious semantic priming may contain a substantial number of conceptual and methodological flaws, the nature of which was pointed out many years ago. This phenomenon reveals the depth of the commitment of most investigators to the zeitgeist, and illustrates how the prevalent view may reinforce itself circularly. For the purposes of this paper, the Dehaene et al. study, when considered in conjunction with earlier studies, does not require us to reject the conclusion that there is to date no compelling evidence for unconscious semantic priming. The only effects for which there is a solid empirical basis concern, on the one hand, unconscious priming attributable to the processing of some surface property of the prime and, on the other, semantic priming associated with the conscious processing of the prime. We will return to these phenomena shortly, but, for the moment, it is sufficient to note that neither of these phenomena requires the postulate that a symbolic representation can be created, stored, and used outside of the subject's phenomenal experience.
We focused above on the phenomenon of unconscious semantic priming, because it has been the most widely used argument in favor of the unconscious perception and representation of words. However, certain other studies have made use of the Stroop effect. For instance, Marcel (1983) and Cheesman and Merikle (1986) reported a Stroop effect without color word detection, thus suggesting unconscious access to the meaning of the color word. Unfortunately, such an effect has also been criticized (Holender, 1986; Dulany, 1997), and has been found to be very difficult to replicate. Thus Tzelgov, Porat, and Henik (1997) showed that when the color word is displayed near the threshold, the Stroop effect is only observed in trials in which participants correctly identified the word, and in participants who identified the words above chance level.
It may obviously be argued that the question of whether the meaning of a non-identified word can influence behavior is still open. We agree. However, those who are tempted to see arguments for a cognitive unconscious in this literature should be aware of at least two points. First, isolated positive reports cannot be considered demonstrative. This is because inference is probabilistic; experiments aimed at demonstrating unconscious semantic priming are presumably widespread, and a few of them ought to report statistically significant effects by chance alone. To be reliable, a demonstration must define a set of specific conditions in which the effect is reproducible. No one can argue that this condition is currently fulfilled. Second, available experiments provide at least one firm conclusion: if an effect does occur, it is weak and short-lived. To rely on such marginal phenomena to support the idea of a cognitive unconscious is hardly serious. The concept of a cognitive unconscious is overly costly, and if it turns out that its only effect lies in barely detectable phenomena, the adaptive function of which is questionable, all the principles of evolutionary biology would need to be reconsidered! As pointed out by Dulany (1999): "If claims for the power of a cognitive unconscious were correct, the experimental effects would be too strong and replicable for these literatures even to be controversial. No one can claim that." To conclude, we believe that available studies on the processing of unconscious semantic information fail to constitute a challenge to our framework.
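The probabilistic argument about isolated positive reports can be illustrated with a short computation. The sketch assumes independent experiments, each tested at the conventional .05 level, in a world where the effect does not exist; the number of experiments is purely hypothetical and is not an estimate of the actual size of the literature.

```python
def prob_at_least_one_false_positive(n_experiments, alpha=0.05):
    """Chance that at least one of n independent tests of a true null
    hypothesis reaches significance at level alpha."""
    return 1 - (1 - alpha) ** n_experiments


# Hypothetical number of independent attempts (an assumption).
p_some_positive = prob_at_least_one_false_positive(20)
print(round(p_some_positive, 2))
```

Under these assumptions, a handful of significant reports is what should be expected even if no effect exists, which is why a reproducible set of conditions matters more than isolated positive findings.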
8.3. Blindsight and Other Neuropsychological Disorders
In some circumstances, brain-damaged people may give adapted responses to stimuli of which they deny any conscious perceptual experience. Although several syndromes fall under this heading, we focus here on one of the most intensively investigated, namely blindsight. According to the standard description, patients with lesions in the primary visual cortex are able to respond appropriately to stimuli presented in regions of the visual field formerly represented by the lesioned cortex, without this responding being accompanied by a conscious visual experience. Among the preserved abilities are, for instance, the ability to detect the localization of a light point, the presence of a motion, and even the direction of this motion (e.g. Weiskrantz, 1997). This pattern of findings runs counter to our framework, because it suggests that visual representations can be processed unconsciously, as indicated by patients' preserved performance on some visual tasks. Blindsight performance lends support to the postulate grounding the prevalent cognitive approach, namely that consciousness provides an optional access to phenomena which occur in the cognitive unconscious.
Our proposal is that the available data, however, do not provide such clear-cut evidence of dissociation as the above description suggests (for a critical examination of the blindsight literature, see Campion, Latto, and Smith, 1983). On the one hand, it is worth stressing that performance involving the damaged cortical region of blindsight patients remains deeply impaired. To borrow Marcel’s (1986) famous example, no one has ever seen a thirsty patient grasping a glass of water placed in the visual field formerly represented by the lesioned cortex. Even in the limited sample of tasks in which above-chance performance has been reported, blindsight performance does not equal that of normals. This is also true of other, related syndromes. In the conclusion to a tutorial review covering blindsight, prosopagnosia, neglect, and alexia, Farah (1994) noted: "Among all of the syndromes, there is none for which visual perception, in its totality, has been convincingly demonstrated to be normal or near normal. Therefore, there is no reason to view these syndromes as consisting of normal perception with conscious awareness merely stripped away... There is currently no evidence for a dedicated conscious awareness system, distinct from the systems that perform specific perceptual or cognitive functions." (Farah, 1994, p. 72). On the other hand, claiming that patients lack conscious experience is also an overstatement. Blindsight patients often report some feelings that guide their response. Surprisingly, this feeling is sometimes described as not specifically visual in nature, a fact that may have prompted observers to overemphasize the dissociation. But having abnormal subjective experience does not equal having no subjective experience at all. Overall, the neuropsychological disorders mentioned above certainly provide intriguing findings. However, the question of whether they illustrate more than above-chance, but degraded performance, accompanied by distorted, but still present phenomenal awareness, remains a point of debate.
9. Conclusion
9.1. Summary
The prevalent computational view of mind necessarily relies on the postulate of a sophisticated cognitive unconscious. Indeed, most psychological models require the existence of unconscious representations, and the possibility of performing various unconscious operations on these representations, such as rule abstraction, analysis, reasoning, or inference. This paper has explored the possibility of an alternative framework, originally proposed by Dulany (1991, 1997), which avoids any presupposition concerning an unconscious counterpart to our conscious mental life. In this so-called mentalistic framework, all mental life consists of nonconscious operations on conscious states, each of them doing nothing on its own independently of the other.
The challenge therefore consists in accounting for the relative isomorphism between the content of our phenomenal experience and the world structure, without calling for unconscious operations on unconscious representational contents. Our solution is based on the progressive transformation of phenomenal experiences as a result of self-organizing processes. Thanks to ubiquitous properties of the words of the language, and more generally of the objects of the environment, basic associative principles, when they are allowed to operate on successive conscious contents, appear sufficient to shape perceptual units that match words and objects. Moreover, with more extensive interaction with the world structure, the same processes turn out to be able to generate highly complex representations isomorphic with the world structure. These representations are themselves able to fulfill the function generally assigned to unconscious rule-governed thinking, as suggested in various recent research domains. Thus, when learning processes are given their full place, what seemed to be straightforward evidence for a cognitive unconscious turns out to be explicable through the self-organizing properties of conscious representations. In addition, we have shown that a wide range of phenomena generally thought to provide direct support for the existence of a cognitive unconscious, such as subliminal semantic activation, and various alleged examples of implicit/explicit dissociations, can also be encompassed within a mentalistic view.
9.2. Looking Towards the Future
Although much of the literature has been covered, our treatment of many issues has been somewhat cursory. We make no claim to have been exhaustive. It is possible that we may unintentionally have ignored some robust and replicable phenomena that may represent a challenge to a mentalistic account. It is incumbent primarily upon the opponents of this framework to identify these phenomena as empirical counterarguments. Pending further challenges, our provisional conclusion is that the self-organizing properties of conscious contents offer a way of triggering a far-reaching reappraisal of mainstream cognitive psychology. If this conclusion is accepted, a promising direction of research will be to further explore the SOC model and its implications in various domains, in order to consolidate and expand the scope of the mentalistic framework.
One issue of special importance relates to human development. Although a few indications of the implications of our view with regard to child development have been provided in Section 4.2 (for a more extensive treatment, see Perruchet & Vinter, 1998a), it is obvious that further work is needed. It is crucial for the SOC model to assess whether its underlying principles can be generalized across the entire life span, and notably at a very early stage of infant development. If the SOC model is able to succeed in this task, this model could provide the basis for a new developmental model which challenges most of the current theories. The theories that rely most extensively on nativism are obviously those that are the most concerned. But such a model would also be in sharp contrast to constructivist approaches, such as Karmiloff-Smith's (e.g. 1992) model, and the recent ideas inspired by connectionist modeling (e.g. Elman et al., 1996), notably concerning the status and the role of consciousness. Research aimed at exploring these aspects is currently in progress in our lab.
Another issue that deserves consideration concerns the neurobiological implementation of the SOC model. At first glance, our model, because it is based on the formation of associations, should be well-suited for translation into biological mechanisms. But this simplicity may be a matter of appearance only. Indeed, it is worth recalling once more that associations bear on complex representations, which form the components of the phenomenal experience. Although associative mechanisms are fairly well understood in cases where simple stimuli are involved, associations involving the complex content of conscious experiences would appear to require other explanatory schemas. One relevant approach may be the neural interactionism proposed by Roger Sperry (e.g. Sperry, 1965), in which consciousness, conceived of as an emergent property of brain functioning, continually feeds back into the system from which it has emerged, thus resulting in the principle of downward causation.
Besides these issues, we have not commented on many other potentially relevant aspects, such as the implications of our view for neuropsychological disorders, or for clinical and applied psychology. Our hope is that our proposals will appear suggestive enough to researchers working directly in those different areas to motivate them to investigate these aspects.
Footnotes
1- It could be argued that unconscious representations also fulfill a simple function of storage. Each of us has the strong intuition that if the mental picture of a pencil (or any other of the countless objects or events not currently on view) can be evoked at any moment, the representation of the pencil is stored somewhere in the brain independently of its conscious instantiation. This reasoning is questionable, however. By way of illustration, let us consider the physical picture of a pencil on a slide. Everybody would agree that the picture fulfills its representative function for a human perceiver when it is projected on screen. But what about this picture when it is not displayed? It is only a pattern of colored pixels on a film. Interestingly, the picture may be stored in a format that does not preserve its analogical relation with the original scene, such as in the series of binary digits obtained after compression of the digitized picture. The storage format does not matter, because a stored picture has no other function than making possible subsequent generations of the picture. What is kept over time is the possibility of generating the picture again through appropriate procedures, material, or decoding mechanisms, which are not embedded in the stored picture. A more biologically relevant analogy can be found in the way the information needed to synthesize proteins is coded in the genes. The proteins are not stored in a ready-to-use format: what is stored in the DNA are the assembly instructions to generate a specific protein when the appropriate signals and conditions are present. These illustrations make it clear that the possibility of generating the conscious representation of a past experience does not mean in any way that this representation has enduring existence as such (i.e. serves its function) in a putative unconscious system outside of its conscious and momentary instantiation. The same reasoning obviously holds for any form of knowledge, whether episodic or semantic.
2- There is at least one finding inconsistent with this hypothesis. Baeyens, Eelen, Crombez, & Van Den Bergh (1992) observed conditioned reactions in subjects who were unaware of the S1-S2 relationships in an evaluative conditioning procedure. The point is that these responses were presumably due to the knowledge of these relationships, and not to a change in the intrinsic properties of the conditioned stimulus, because they were affected by a post-conditioning revaluation of the unconditioned stimulus. Detailed methodological considerations are necessary here to suggest how this contradiction can be resolved (see Shanks & St. John, 1994, for a critical analysis of the Baeyens et al. results).
3- For most cognitive scientists, introducing consciousness into psychological modeling seemingly increases complexity. Taking for granted that unconscious mental activities are the basic stuff of the mind, the fact that some proportion of our cognitive activities appears to be conscious adds unwanted complexity, and confronts us with many difficult problems. Consciousness appears as the piece left over when the jigsaw has been completed. However, it is worth emphasizing that this line of reasoning holds only within the metatheoretical framework that we challenge.
4- The traditional collapsing of consciousness and language in many areas of research makes it necessary to emphasize that the contents of phenomenal experiences cannot be identified as verbalizable knowledge. As defined by Baars (1995, p.6): "The content of consciousness includes the immediate perceptual world; inner speech and visual imagery; the fleeting present and its fast-fading traces in immediate memory; bodily feelings like pleasure, pain, and excitements; surges of emotional feelings; autobiographical memories; clear and immediate intentions, expectations, and actions; explicit beliefs about oneself and the world; and concepts that are abstract but focal".
5- Provocative as this proposal sounds, we believe that taking the contents of phenomenal consciousness as a research target represents only a minimal departure, if any, from the current practice of experimental researchers. Indeed, a quick survey of the literature shows that the dependent variables used in most laboratory experiments are a direct reflection of the participants' phenomenal consciousness. Let us consider, for instance, the variables used in two fields that cannot be suspected of overemphasizing consciousness, namely the fields of implicit memory and implicit learning. A typical request made to participants in implicit memory tasks is to say the first word that comes to mind. Likewise, a typical task in implicit learning research is to assess whether a string of letters sounds consistent with an artificial grammar. Obviously, such requests capture aspects of the phenomenal experience of the participants. Along the same lines, studies on perception rely heavily on what people report they see, hear, or feel.
6- In keeping with a standard usage of the notion of self-organization, this terminology is not intended to mean that the structure of the world plays no role in the increasing representativeness of conscious experiences. We thank Don Dulany for warning us about this misinterpretation. Claiming that phenomenal experience is endowed with self-organizing properties means that the properties of conscious perceptions and representations, when considered jointly with the properties of the external world, are sufficient to account for the growing consistency of these perceptions and representations with the world structure. What is excluded from the causal sketch in the notion of self-organizing consciousness is not the environmental structure, but the cognitive unconscious.
7- For the sake of simplicity, we make a rigid distinction between the notion of primitives (a set of elements that are perceived as a whole by the system) and the notion of chunks (the momentary content of the phenomenal experience, which may include one or more primitives). It should be clear that this distinction may depend on contextual features. For instance, a word, in the usual linguistic behavior of adult readers, may be considered as the relevant primitive; however, the primitives may also be the letters when the task consists of checking the spelling of a word.
8- Applying PARSER's principles to infants rests on the premise that consciousness is not limited to humans endowed with language. Admittedly, infants' conscious experience is presumably different from that of adult humans. But the fact that the phenomenal experience differs between individuals, because it is built throughout one's life, is precisely one of the key tenets of the present paper.
9- Taking this speculation one step further, it may be noted that our model is also consistent with an evolutionary approach. Certainly it is unrealistic to assume that the processing constraints linked to consciousness evolved from their function in learning language because, presumably, the emergence of consciousness did not follow the arrival of language in the phylogenetic chronology. However, as we discuss in the next section, the explanatory sketch outlined above for the words of the language also applies to the other units (objects, animals, etc.) of the real world. It is possible that the specific properties of conscious thought emerged by natural selection thanks to their efficiency in dealing with these natural units. The properties of languages could be due to the fact that language has evolved in such a way that it could be learned easily, given the human abilities, as pointed out by Newport (1990).
10- The very same argument could also be illustrated in the literature dealing with the implicit learning of invariant features. In McGeorge and Burton (1990), the task was to perform arithmetic computations on strings of four numbers. Unbeknown to subjects, every string contained the number 3. In a subsequent test phase, the subjects had to indicate which of two displayed strings had previously been presented. In fact, both strings of each pair were new, but one of them contained the number 3 while the other did not. Subjects preferentially selected the strings containing the number 3, although they were not conscious of this regularity. This finding was originally interpreted as evidence of unconscious rule abstraction. In fact, Wright and Burton (1995; see also Stadler, Warren, & Lesch, 2000) have shown that the presence of a fixed number within a string substantially decreased the probability of this string including a repeated digit. Different experimental findings have provided evidence that the subjects’ performance was in fact due to their sensitivity to this remote by-product of the experimenter’s rule.
11- The demonstration of the existence, in various species of animal, of abilities similar to those we consider to be responsible for human performance in complex transfer tasks suggests that our account successfully avoids any recourse to genuine abstraction. But the debate is not over, because it may be argued that animals also perform rule abstraction. Discussing the animal conditioning literature, and explicitly borrowing their claim from Spence (1937), Wills and Mackintosh (1999) observed that "relational theories are not all alike. In one version, they seem to be talking of a conceptual process that abstract relationships of taller than, brighter than, same as, different from, and so forth. In another, they are appealing to a much lower-level, sensory process that allows the contrast between the two neighboring stimuli to enhance the perceived difference between them" (Wills & Mackintosh, 1999, p. 48). These authors favor the simpler sort of mechanism on the basis of the limitations inherent to the phenomenon of relational coding. The argument is as follows. If relational learning is the product of a sophisticated process of abstraction operating on the basic properties of the stimuli, it looks reasonable to anticipate that abstraction will occur irrespective of the nature of these stimuli, provided they are successfully encoded. By contrast, if relational coding is the end-result of low-level, hardwired mechanisms, there is no reason to expect generality over modalities and features. In support of the latter account, Wills and Mackintosh showed that pigeons have no difficulty in learning relations between rectangles differing in brightness, whereas they fail to learn the relations between stars with different numbers of vertices, although the two problems were formally similar.
12- Results concerning the implicit learning of invariant features (see footnote 7b) are more complex because positive transfer was initially reported (e.g. McGeorge and Burton, 1990). However, Stadler et al. (2000) have shown that transfer disappears when the response strategy initially discovered by Wright and Burton (1995) is denied to subjects. They concluded that "this form of learning, like many other forms of implicit learning and memory, is hyperspecific" (Stadler et al. 2000, p.235).
References
Adolph, K.E. (2000). Specificity of learning: Why infants fall over a veritable cliff. Psychological Science, 11, 290-295.
Altmann, G., & Dienes, Z. (1999). Rule learning by seven-month-old infants and neural networks. Science, 284, 875.
Anderson, J.R. (1994). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, J.R., & Milson, R. (1989). Human memory: An adaptive perspective. Psychological Review, 96, 703-719.
Aslin, R.N., Woodward, J.Z., LaMendola, N.P., & Bever, T.G. (1996). Models of word segmentation in fluent maternal speech to infants. In J.L. Morgan & K. Demuth (Eds.), Signal to Syntax (pp. 117-134). Mahwah, NJ: Erlbaum Associates.
Baars, B.J. (1995). Psychology in a world of sentient, self-knowing beings: A modest utopian fantasy. In R.L. Solso (Ed.). Mind and Brain in the 21st Century. Cambridge, MA: MIT Press.
Baars, B.J. (1997). In the theater of consciousness: The workspace of the mind. Oxford University Press.
Baars, B.J. (1998). Metaphors of consciousness and attention in the brain. Trends in Neurosciences, 21, 51-89.
Baeyens, F., Eelen, P., Crombez, G., & Van Den Bergh, O. (1992). Human evaluative conditioning: Acquisition trials, presentation schedule, evaluative style, and contingency awareness. Behavior Research and Therapy, 30, 133-142.
Bar, M., & Biederman, I. (1998). Subliminal visual priming. Psychological Science, 9, 464-469.
Berry, D.C., & Broadbent, D.E. (1988). Interactive tasks and the implicit-explicit distinction. British Journal of Psychology, 79, 251-272.
Berry, D.C., & Dienes, Z. (1993). Implicit Learning: Theoretical and empirical issues. Hillsdale, N.J.: Lawrence Erlbaum Associates (pp. 197).
Besner, D., Stolz, J.A., & Boutilier, C. (1997). The Stroop effect and the myth of automaticity. Psychonomic Bulletin and Review, 4, 221-225.
Blake, R. (1998). What can be "perceived" in the absence of visual awareness? Psychological Science, X,157-162
Bornstein, M.H., & Krinsky, S.J. (1985). Perception of symmetry in infancy: The salience of vertical symmetry and the perception of pattern wholes. Journal of Experimental Child Psychology, 39, 1-19
Bower, T.G.R. (1979). Human development. San Francisco: Freeman.
Braine, M.D.S. (1978). On the relationship between the natural logic of reasoning and standard logic. Psychological Review, 85, 1-21.
Brent, M.R. (1996). Advances in the computational study of language acquisition. Cognition, 61, 1-38.
Brent, M.R., & Cartwright, T.A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.
Broadbent, D. E. (1977). Levels, hierarchies and the focus of control. Quarterly Journal of Experimental Psychology, 29, 181-201.
Bronson, G.W. (1982). The scanning patterns of human infants: Implications for visual learning. Norwood, N.J.: Ablex.
Brooks, L. R. (1978). Nonanalytic concept formation and memory for instances. In E. Rosch & B. B. Lloyd (Eds.), Cognition and Categorization (pp. 169-215). Hillsdale, NJ: Erlbaum.
Brooks, L. R. & Vokey, J. R. (1991). Abstract analogies and abstracted grammars: A comment on Reber, and Mathews et al. Journal of Experimental Psychology: General, 120, 316-323.
Bryant, P., Nunes, T., & Snaith, R. (2000). Children learn an untaught rule of spelling. Nature, 403, 157-158.
Buchner, A. & Erdfelder, E. (1996). On assumptions of, relations between, and evaluations of some process dissociation measurement models. Consciousness and cognition, 5, 581-594.
Butler, L.T., & Berry, D.C. (2001). Implicit memory: Intention and awareness revisited. Trends in Cognitive Sciences, 5, 192-197.
Campion, J., Latto, R., & Smith, Y.M. (1983). Is blindsight an effect of scattered light, spared cortex, and near-threshold vision? Behavioral and Brain Sciences, 6, 423-486.
Carlson, R.A., & Dulany, D.D. (1985). Conscious attention and abstraction in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition. 11, 45-58.
Case, R. (1993). Theories of learning and theories of development. Educational Psychologist, 28, 219-233.
Chater, N., & Oaksford, M. (1999). Ten years of the rational analysis of cognition. Trends in Cognitive Sciences, 3, 57-65
Cheng, P.W., & Holyoak, K.J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391-416.
Cheesman, J., & Merikle, P.M. (1986). Distinguishing conscious from unconscious perceptual processes. Canadian Journal of Psychology, 40, 343-367.
Christiansen, M.H., Allen, J., & Seidenberg, M.S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13, 221-268.
Christiansen, M.H., Conway, C.M. & Curtin, S. (2000). A connectionist single-mechanism account of rule-like behavior in infancy. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society (pp. 83-88). Mahwah, NJ: Lawrence Erlbaum.
Clark, A., & Thornton, C. (1997). Trading spaces: Computation, representation and the limits of uninformed learning. Behavioral and Brain Sciences, 20, 57-90.
Cleeremans, A. (1993). Mechanisms of Implicit Learning: A connectionist model of sequence processing. MIT Press: Bradford Books (pp.227)
Cleeremans, A., Destrebecqz, A., & Boyer, M. (1998). Implicit learning: News from the front. Trends in Cognitive Sciences, 2, 406-416.
Cleeremans, A., & Jimenez, L. (1998). Implicit sequence learning: The truth is in the details. In M. Stadler & P. Frensch (Eds), Handbook of implicit learning (pp. 323-364). Thousand Oaks, CA: Sage Publications.
Cleeremans, A. & McClelland, J.L. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235-253.
Clement, C. (1994). Effect of structural embedding on analogical transfer: Manifest versus latent analogs. American Journal of Psychology. 107, 1-38.
Cohen, A., Ivry, R.I. & Keele, S.W. (1990). Attention and structure in sequence learning, Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 17-30.
Cowan, N. (1995). Attention and memory: An integrated framework. New-York: Oxford University Press.
Cowan, N. (in press). The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behavioral and Brain Sciences.
Crabb, B.T., & Dark, V. (1999). Perceptual implicit memory requires attentional encoding. Memory and Cognition, 27, 267-275
Craik, F.I.M., & Lockhart, R.S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.
Dehaene, S., Naccache, L., Le Clec'h, G., Koechlin, E, Mueller, M., Dehaene-Lambertz, G, Van De Moortele, P-F, & Le Bihan, D. (1998). Imaging unconscious semantic priming. Nature, 395, 597-600
De Houwer, J., Hendrickx, H., & Baeyens, F. (1997). Evaluative learning with "subliminally" presented stimuli. Consciousness and Cognition, 6, 87-107.
Dienes, Z., & Altmann, G. (1997). Transfer of implicit knowledge across domains: How implicit and how abstract? In D. Berry (Ed.). How implicit is implicit learning. Oxford: Oxford University Press.
Draine, S.C., & Greenwald, A.G. (1998). Replicable unconscious semantic priming. Journal of Experimental Psychology: General, 127, 286-303.
Dulany, D. E. (1991). Conscious representation and thought systems. In R. S. Wyer & T. K. Srull (Eds.) Advances in social cognition, Vol. 4, Hillsdale, N.J.: Erlbaum, pp. 97-120.
Dulany, D. E. (1997). Consciousness in the explicit (deliberative) and implicit (evocative). In J.D. Cohen & J.W. Schooler (Eds.) Scientific approaches to the study of consciousness (pp. 179-212). Mahwah, NJ: Erlbaum.
Dulany, D.E. (1999). Consciousness, connectionism, and intentionality. Behavioral and Brain Sciences, 22, 154-155.
Dulany, D.E., Carlson, A. & Dewey, G.I. (1984). A case of syntactical learning and judgment: How conscious and how abstract? Journal of Experimental Psychology: General, 113, 541-555.
Eich, E. (1984). Memory for unattended events: Remembering with and without awareness. Memory and Cognition, 12, 105-111.
Elman, J.L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Elman, J.L., Bates, E.A., Johnson, M.H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1996). Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Evans, J.St.B., Newstead, S.E., & Byrne, P. (1993). Human reasoning. Hillsdale, N.J.: Erlbaum.
Fahlman, S.E., & Lebiere, C. (1990). The Cascade-Correlation learning architecture. In D. S. Touretzky (Ed.). Advances in Neural Information Processing Systems: Vol 2 (pp. 524-532). San Mateo, California: Morgan Kaufmann Publishers Inc.
Farah, M.J. (1994). Visual perception and visual awareness after brain damage: A tutorial overview. In C. Umilta and M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing, Cambridge, MA: MIT Press (pp. 37-76).
Fischer, K., & Granott, N. (1995). Beyond one-dimensional change: Parallel, concurrent, socially distributed processes in learning and development. Human Development, 38, 302-314.
Fodor, J. (1983). The modularity of mind. Cambridge, MA.: MIT Press.
Fong, G.T., Krantz, D.H., & Nisbett, R.E. (1986). The effects of statistical training on thinking about everyday problems. Cognitive Psychology, 18, 253-292.
Frensch, P.A., Buchner, A., & Lin, J. (1994). Implicit learning of unique and ambiguous serial transitions in the presence and absence of a distractor task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 567-584.
Frensch, P.A. & Miner, C.S. (1994). Effects of presentation rate and individual differences in short-term memory capacity on an indirect measure of serial learning. Memory and Cognition, 22, 95-110.
Fulgosi, A., & Guilford, J.P. (1968). Short term incubation in divergent production. American Journal of Psychology, 81, 241-248.
Gabrieli, J.D.E. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 49, 87-115.
Ganor-Stern, D., Seamon, J.G., & Carrasco, M. (1998). The role of attention and study time in explicit and implicit memory for unfamiliar visual stimuli. Memory and cognition, 26, 1187-1195.
Garcia, J., Rusiniak, K.W., & Brett, L.P. (1977). Conditioning food-illness aversions in wild animals: caveant canonici. In H. Davis & H.M.B. Hurwitz (Eds.). Operant-Pavlovian interactions. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Gasser, M., & Smith, L.B. (1998). Learning nouns and adjectives: A connectionist account. Language and Cognitive Processes, 13, 269-306.
Gick, M.L., & Holyoak, K.J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1-38.
Gomez, R.L., & Gerken, L.A. (1999). Artificial grammar learning by one-year-olds leads to specific and abstract knowledge. Cognition, 70, 109-135.
Gomez, R.L., Gerken, L.A., & Schvaneveldt, R.W. (2000). The basis of transfer in artificial grammar learning, Memory and Cognition, 28, 253-263.
Green, R.E.A., & Shanks, D.R. (1993). On the existence of independent learning systems: An examination of some evidence. Memory and Cognition, 21, 304-317.
Greenwald, A.G., Draine, S.C., & Abrams, R.H. (1996). Three cognitive markers of unconscious semantic activation, Science, 273, 1699-1702.
Grodzinsky, Y. (in press) The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences.
Gruber, R.P., Reed, D.R., & Block, J.D. (1968). Transfer of the conditioned GSR from drug to nondrug state without awareness, Journal of Psychology, 70, 149-155.
Haith, J. (1978). Visual competence in early infancy. In R. Held, H. Leibowitz, & H.L. Teuber (Eds.). Handbook of sensory physiology, Vol 3: Perception (pp. 311-356). Berlin: Springer-Verlag.
Hsiao, A.T., & Reber, A. (1998). The role of attention in implicit sequence learning: Exploring the limits of the cognitive unconscious. In M. Stadler & P. Frensch (Eds), Handbook of implicit learning (pp. 471-494). Thousand Oaks, CA: Sage Publications.
Hinton, G.E. (1990). Mapping part-whole hierarchies into connectionist networks. Artificial Intelligence, 46, 47-75.
Holender, D. (1986). Semantic activation without conscious identification in dichotic listening, parafoveal vision, and visual masking: A survey and appraisal. Behavioral and Brain Sciences, 9, 1-23.
Holland, P.C. (1980). CS-US interval as a determinant of the form of the Pavlovian conditioned response. Journal of Experimental Psychology: Animal Behavior Process, 6, 155-174.
Jacoby, L.L., Ste-Marie, D., & Toth, J.P. (1993). Redefining automaticity: Unconscious influences, awareness and control. In A.D. Baddeley & L. Weiskrantz (Eds), Attention, selection, awareness and control. A tribute to Donald Broadbent. Oxford, England: Oxford University Press. (pp. 261-282).
Jiang, Y., & Chun, M.M. (2001). Selective attention modulates implicit learning. Quarterly Journal of Experimental Psychology, 54, 1105-1124.
Jimenez, L., & Mendez, C. (1999). Which attention is needed for implicit sequence learning? Journal of Experimental Psychology: Learning, Memory, and Cognition., 25, 236-259.
Johnson-Laird, P.N. (1983). Mental Models. Cambridge, MA: Harvard University Press.
Jusczyk, P.W. (1997). The discovery of spoken language. Cambridge, MA: MIT Press.
Jusczyk, P.W. & Aslin, R.N. (1995). Infants' detection of the sound patterns of words in fluent speech. Cognitive Psychology, 29, 1-23.
Kagan, J. (1971). Change and continuity in infancy. New York: Wiley.
Karmiloff-Smith, A. (1992). Beyond modularity: A developmental perspective on cognitive science. Cambridge, MA: Bradford/ MIT press.
Kihlstrom, J.F. (1987). The cognitive unconscious. Science, 237, 1445-1452.
Konorski, J. (1967). Integrative activity of the brain. Chicago: University of Chicago Press.
LaBerge, D., & Samuels, S.J. (1974). Toward a theory of automatic information processing in reading. Cognitive Psychology, 6, 293-323.
Lewicki, P., Hill, T. & Bizot, E. (1988). Acquisition of procedural knowledge about a pattern of stimuli that cannot be articulated. Cognitive Psychology, 20, 24-37.
Lewicki, P., Hill, T., & Czyzewska, M. (1992). Nonconscious acquisition of information. American Psychologist, 47, 796-801.
Logan, G.D. (1988). Towards an instance theory of automatization. Psychological Review, 76, 165-178.
Logan, G.D. (1990). Repetition priming and automaticity: Common underlying mechanisms? Cognitive Psychology, 22, 1-35.
Logan, G.D., & Etherton, J.L. (1994). What is learned during automatization? The role of attention in constructing an instance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1022-1050.
Ludden, D., & Gupta, P. (2000). Zen in the art of language acquisition: Statistical learning and the Less is More hypothesis. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the 22nd Annual Conference of the Cognitive Science Society, pp. 812-817. Hillsdale, NJ: Lawrence Erlbaum.
MacDonald, P., & MacLeod, C.M. (1998). The influence of attention at encoding on direct and indirect remembering. Acta Psychologica, 98, 298-310.
Mackintosh, N.J. (1975). A theory of attention: Variations in the associability of stimuli with reinforcement. Psychological Review, 82, 276-298.
Mackintosh, N.J. (1997). Has the wheel turned full circle? Fifty years of learning theory, 1946- 1996. The Quarterly Journal of Experimental Psychology, 50A, 879-898.
MacWhinney, B. (1995). The CHILDES project: Tools for analyzing talk. Hillsdale, NJ: Erlbaum.
Mandler, G. (1975). Consciousness: Respectable, useful, and probably necessary. In R. Solso (Ed.), Information processing and cognition: The Loyola symposium. Hillsdale, NJ: Erlbaum (pp. 229-254).
Mandler, G. (1994). Hypermnesia, incubation and mind popping: On remembering without really trying. In C. Umilta and M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing, Cambridge, MA: MIT Press (pp. 3-33).
Manza, L., & Reber, A.S. (1997). Representing artificial grammar: Transfer across stimulus forms and modalities. In D. Berry (Ed.). How implicit is implicit learning. Oxford: Oxford University Press.
Marcel, A.J. (1983). Conscious and unconscious perception: Experiments in visual masking and word recognition. Cognitive Psychology, 15, 197-237.
Marcel, A.J. (1986). Consciousness and processing: Choosing and testing a null hypothesis. Behavioral and Brain Sciences, 9, 40-41.
Marcus, G.F., Vijayan, S., Rao, S.B., & Vishton, P.M. (1999). Rule learning by seven-month-old infants. Science, 283, 77-80.
Marescaux, P-J, Dejean, K, & Karnas, G. (1990). Acquisition of specific or general knowledge at the control of a dynamic simulated system: an evaluation through a static situations questionnaire and a transfer control task. Report 2PR2GK of the KAUDYTE project (ESPRIT BRA #3219).
Markman, E.M. (1990). Constraints children place on word meanings. Cognitive science, 14, 57-77.
Mathews, R.C., Buss, R.R., Stanley, W.B., Blanchard-Fields, F., Cho, J.-R., & Druhan, B. (1989). Role of implicit and explicit processes in learning from examples: A synergistic effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1083-1100.
McClelland, J.L., & Plaut, D.C. (1999). Does generalization in infant learning implicate abstract algebra-like rules? Trends in Cognitive Science, 3, 166-168.
McCloskey, M. (1992). Cognitive mechanisms in numerical processing: Evidence from acquired dyscalculia. Cognition, 44, 107-157.
McDonald, J.L. (1997). Language acquisition: The acquisition of linguistic structure in normal and special populations. Annual Review of Psychology, 48, 215-241.
McDonald, S., & Ramscar, M. (2001). Testing the distributional hypothesis: The influence of context on judgments of semantic similarity. Proceedings of the 23rd Annual Conference of the Cognitive Science Society. University of Edinburgh.
McGeorge, P., & Burton, A.M. (1990). Semantic processing in an incidental learning task. Quarterly Journal of Experimental Psychology, 42A, 597-609.
McKoon, G., & Ratcliff, R. (1998). Memory-based language processing: Psycholinguistic research in the 1960s. Annual Review of Psychology, 49, 25-32.
McLeod, P., & Dienes, Z. (1993). Running to catch the ball. Nature, 362, 23.
Merikle, P.M., & Joordens, S. (1997). Parallels between perception without attention and perception without awareness. Consciousness and Cognition, 6, 219-236.
Merikle, P.M., & Reingold, E.M. (1998). On demonstrating unconscious perception: Comment on Draine and Greenwald (1998). Journal of Experimental Psychology: General, 127, 304-310.
Miller, G.A. (1958). Free recall of redundant strings of letters. Journal of experimental Psychology, 56, 485-491.
Miller, G.A. (1962). Psychology: The science of mental life. New-York, Harper & Row.
Miller, J. (2000). Measurement error in subliminal perception experiments: Simulation analyses of two regression models. Journal of Experimental Psychology: Human Perception and Performance, 26, 1461-1477.
Neal, A., & Hesketh, B. (1997). Episodic knowledge and implicit learning. Psychonomic Bulletin and Review, 4, 24-37.
Neumann, O. (1984). Automatic processing: A review of recent findings and a plea for an old theory. In W. Prinz and A.F. Sanders (Eds.), Cognition and motor processes. Berlin: Springer, (pp. 255-293).
Newport, E. (1990). Maturational constraints on language learning. Cognitive Science, 14, 11-28.
Noë, A., Pessoa, L. & Thompson, E. (2000). Beyond the grand illusion: what change blindness really teaches us about vision. Visual Cognition. 7, 93-106.
O'Brien, G., & Opie, J. (1999a). A connectionist theory of phenomenal experience. Behavioral and Brain Sciences, 22, 127-148.
O'Brien, G., & Opie, J. (1999b). Putting content into a vehicle theory of consciousness. Behavioral and Brain Sciences, 22, 175-196.
O'Regan, J.K. (1992). Solving the "real" mysteries of visual perception: The world as an outside memory. Canadian Journal of Psychology, 46, 461-488.
Parkin, A.J., & Russo, R. (1990). Implicit and explicit memory and the automatic/ effortful distinction. European Journal of Psychology, 2, 71-80.
Pacton, S., Perruchet, P., Fayol, M., & Cleeremans, A. (2001). Implicit learning out of the lab: The case of orthographic regularities. Journal of experimental Psychology: General, 130, 401-426.
Perruchet, P. (1984). Dual nature of anticipatory classically conditioned reactions. In S. Kornblum and J. Requin (Eds.). Preparatory states and processes (pp. 179-198). Hillsdale, N.J.: Lawrence Erlbaum Associates.
Perruchet, P. (1994). Learning from complex rule-governed environments: On the proper functions of nonconscious and conscious processes. In C. Umilta and M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 811-835).
Perruchet, P. & Amorim, M. A. (1992). Conscious knowledge and changes in performance in sequence learning: Evidence against dissociation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 785-800.
Perruchet, P., & Gallego, J. (1997). A subjective unit formation account of implicit learning. In D. Berry (Ed.). How implicit is implicit learning. (pp. 124-161). Oxford: Oxford University Press.
Perruchet, P., Gallego, J. & Savy, I. (1990). A critical reappraisal of the evidence for unconscious abstraction of deterministic rules in complex experimental situations. Cognitive Psychology, 22, 493-516.
Perruchet, P. & Pacteau, C. (1990). Synthetic grammar learning: Implicit rule abstraction or explicit fragmentary knowledge? Journal of Experimental Psychology: General, 119, 264-275.
Perruchet, P., Pacteau, C., & Gallego, J. (1997). Abstraction of covariation in incidental learning and covariation bias. British Journal of Psychology, 88, 441-458.
Perruchet, P., & Vinter, A. (1998 a). Learning and development: The implicit knowledge assumption reconsidered. In M. Stadler & P. Frensch (Eds), Handbook of implicit learning (pp. 495-531). Thousand Oaks, CA: Sage Publications.
Perruchet, P., & Vinter, A. (1998 b). PARSER: A model for word segmentation. Journal of Memory and Language, 39, 246-263.
Perruchet, P., Vinter, A, & Gallego, J. (1997). Implicit learning shapes new conscious percepts and representations. Psychonomic Bulletin and review, 4, 43-48.
Perruchet, P., Vinter, A, Pacteau, C., & Gallego, J. (in press). The formation of structurally relevant units in artificial grammar learning. Quarterly Journal of Experimental Psychology.
Piaget, J. (1985). The equilibration of cognitive structures: The central problem of intellectual development. Chicago: University of Chicago Press.
Pinker, S. (1999). Out of the minds of babes. Science, 283, 40-41.
Posner, M.I., & Boies, S.J. (1971). Components of attention. Psychological Review, 78, 391-408.
Pribram, K.H. (1980). Mind, Brain, and Consciousness, in
Reber, A.S. (1967). Implicit learning of artificial grammars. Journal of Verbal Learning and Verbal Behavior, 6, 855-863.
Reber, A. S. (1969). Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 81, 115-119.
Reber, A.S. (1976). Implicit learning of synthetic languages: The role of instructional set. Journal of Experimental Psychology: Human Learning and Memory. 2, 88-94.
Reber, A.S. (1993). Implicit learning and tacit knowledge: an essay on the cognitive unconscious. Oxford University Press, New York.
Redington, M., Chater, N., & Finch, S. (1998). Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science, 22, 425-469.
Redington, M., & Chater, N. (in press). Knowledge Representation and Transfer in Artificial Grammar Learning. In R.A. French & A. Cleeremans (Eds), Implicit Learning. London: Routledge, Psychology Press.
Reeves, L. M., & Weisberg, R.W. (1994). The role of content and abstract information in analogical transfer. Psychological Bulletin, 115, 381-400.
Reingold, E.M., & Toth, J.P. (1996). Process dissociation versus task dissociations: A controversy in progress. In G. Underwood (Ed.), Implicit Cognition, (pp 159-202). Oxford: Oxford University Press.
Rey, G. (1991). Reasons for doubting the existence of even epiphenomenal consciousness. Behavioral and Brain Sciences, 14, 691-692.
Richardson-Klavehn, A., Gardiner, J.M., & Java, R.I. (1996). Memory: Task dissociation, process dissociations, and dissociations of consciousness. In G. Underwood (Ed.), Implicit Cognition, (pp 85-158). Oxford: Oxford University Press.
Roediger, H.L., & McDermott, K.B. (1993). Implicit memory in normal human subjects. In F. Boller & I. Grafman (Eds.), Handbook of neuropsychology, Vol. 8. Amsterdam: Elsevier.
Ross, B.H., & Kennedy, P.T. (1990). Generalizing from the use of earlier examples in problem solving. Journal of Experimental Psychology: Human Learning and Memory. 16, 42-55.
Saffran, J.R., Aslin, R.N., & Newport, E.L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
Saffran, J.R., Johnson, E.K., Aslin, R.N., & Newport, E.L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70, 27-52.
Saffran, J.R., Newport, E.L., & Aslin, R.N. (1996). Word segmentation: The role of distributional cues. Journal of Memory and language, 35, 606-621.
Saffran, J.R., Newport, E.L., Aslin, R.N., Tunick, R.A., & Barrueco, S. (1997). Incidental language learning. Psychological Science, 8, 101-105.
Schmidt, P.A., & Dark, V.J. (1998). Attentional processing of "unattended" flankers: Evidence for a failure of selective attention. Perception and Psychophysics, 60, 227-238.
Schyns, P.G., Goldstone, R.L., & Thibaut, J-P. (1998). The development of features in object concepts. Behavioral and Brain Sciences, 21, 1-53.
Seidenberg, M.S., & Elman, J.L. (1999). Do infants learn grammar with algebra or statistics? Science, 284, 433.
Seligman, M.E.P. (1970). On the generality of the laws of learning. Psychological Review, 77, 406-418.
Servan-Schreiber, D., & Anderson, J.R. (1990). Learning artificial grammars with competitive chunking. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 592-608.
Shanks, D.R., Johnstone, T., & Staggs, L. (1997). Abstraction processes in artificial grammar learning. Quarterly Journal of Experimental Psychology, 50A, 216-252.
Shanks, D.R. & St.John, M.F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367-447.
Shastri, L., & Ajjanagadde, V. (1993). From simple associations to systematic reasoning. Behavioral and Brain Sciences, 16, 417-494.
Shevrin, H. & Dickman, S.(1980). The psychological unconscious. American Psychologist, 35, 421-434.
Shiffrin, R.M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attention and a general theory. Psychological Review, 84, 127-190.
Singley, M.K., & Anderson, J.R. (1989). Transfer of Cognitive Skill. Cambridge, MA: Harvard University Press.
Sloman, S.A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3-22.
Smith, E.E., Langston, C., & Nisbett, R.E. (1992). The case for rules in reasoning. Cognitive Science, 16, 1-40.
Smith, S.M., & Blankenship, S.E. (1989). Incubation effects. Bulletin of the Psychonomic Society, 27, 311-314.
Smith, S.M., & Blankenship, S.E. (1991). Incubation and the persistence of fixation in problem solving. American Journal of Psychology, 104, 61-87.
Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-74.
Spelke, E.S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605-632.
Spence, K.W., (1937). The differential response in animals to stimuli varying within a single dimension. Psychological Review, 44, 430-444.
Sperry, R. (1965). Mind, brain, and humanist values. In J.R. Platt (Ed.), New views of the nature of man, Chicago: University of Chicago Press.
Squire, L.R., & Frambach, M. (1990). Cognitive skill learning in amnesia. Psychobiology, 18, 109-117.
Stadler, M.A. (1989). On learning complex procedural knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1061-1069.
Stadler, M.A. (1992). Statistical structure and implicit serial learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 318-327.
Stadler, M.A. (1995). Role of attention in implicit learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 674-685.
Stadler, M.A., Warren, J.L., & Lesch, S.L. (2000). Is there cross-format transfer in implicit invariance learning? The Quarterly Journal of experimental Psychology, 53A, 235-245.
Tolman, E.C. (1932). Purposive behaviors in animals and man. New-York, N.Y.: Appleton-Century.
Tomasello, M. (2000). The item-based nature of children's early syntactic development. Trends in Cognitive Sciences, 4, 156-164.
Treisman, A.M., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136.
Tunney, R.J., & Altmann, G.T.M. (1999). The transfer effect in artificial grammar learning: Re-appraising the evidence of transfer of sequential dependencies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 1322-1333.
Tzelgov, J. (1997). Automatic but conscious: that is how we act most of the time. In R.S. Wyer, The automaticity of everyday life, Advances in Social Cognition, Vol. X, Mahwah, N.J.: Lawrence Erlbaum Associates. (p.217-230)
Tzelgov, J., Porat, Z, & Henik, A. (1997). Automaticity and consciousness: Is perceiving the word necessary for reading it? American Journal of Psychology, 110, 429-448.
Velmans, M. (1991). Is human information processing conscious? Behavioral and Brain Sciences, 14, 651-726.
Velmans, M. (1998). Goodbye to reductionism. In S.R. Hameroff, A.W. Kaszniak, & A.C. Scott (Eds.) Toward a Science of Consciousness II, Cambridge, MA: MIT Press.
Velmans, M. (1999). When perception becomes conscious. British Journal of Psychology, 90, 543-566.
Vinter, A. (1986). The role of movement in eliciting early imitations. Child Development, 57, 66-71.
Wagner, A.R. (1981). SOP: A model of automatic memory processing in animal behavior. In N.E. Spear & R.R. Miller (Eds.), Information processing in animals: Memory mechanisms (pp. 5-47). Hillsdale, NJ.: Lawrence Erlbaum Associates Inc.
Wallas, G. (1926). The art of thought. New York: Harcourt, Brace.
Weiskrantz, L. (1997). Consciousness Lost and Found: A Neuropsychological Exploration. Oxford: Oxford University Press. (272pp)
Whittlesea, B.W.A, & Dorken, M.D. (1993). Incidentally, things in general are incidentally determined: An episodic-processing account of implicit learning. Journal of Experimental Psychology: General, 122, 227-248.
Whittlesea, B.W.A., & Wright, R.L. (1997). Implicit (and explicit) learning: Acting adaptively without knowing the consequences. Journal of Experimental Psychology: Learning, Memory and Cognition, 23, 181-200.
Willingham, D.B., Nissen, M.J. & Bullemer, P. (1989). On the development of procedural knowledge, Journal of Experimental Psychology: Learning, Memory and Cognition, 15, 1047-1060.
Wills, S.J., & Mackintosh, N.J. (1999). Relational learning in pigeons? Quarterly Journal of Experimental Psychology, 52B, 31-52.
Wood, N.L., Stadler, M.A., & Cowan, N. (1997). Is there implicit memory without attention? A reexamination of task demands in Eich's (1984) procedure. Memory and Cognition, 25, 772-779.
Wright, R. L. & Burton, A. M. (1995). Implicit learning of an invariant: Just say no. Quarterly Journal of Experimental Psychology, 48A, 783-796.
Wulf, G., & Schmidt, R.A. (1997). Variability of practice and implicit motor learning, Journal of Experimental Psychology: Learning, Memory and Cognition, 23, 987-1006
Zbrodoff, N.J. (1999). Effects of counting in alphabet arithmetic: Opportunistic stopping and priming of intermediate steps. Journal of Experimental Psychology: Learning, Memory and Cognition, 25, 299-317.
Zimba, L., & Blake, R. (1983). Binocular rivalry and semantic processing: Out of sight, out of mind. Journal of Experimental Psychology: Human Perception and Performance, 9, 807-815.
APPENDIX A: PARSER
PARSER is centered on a single vector, called the Percept Shaper (PS). At the start, PS contains only the primitives composing the material, namely a few syllables. Learning proceeds through the iterative processing of small parts of the linguistic corpus, which, depending on the simulation, may be in immediate succession or separated by a variable amount of unprocessed material. Each part is composed of 1 to 3 processing primitives (the number is determined randomly for each percept), thus simulating the successive attentional focuses of a human subject processing the same corpus. Each perceived part is added to PS, and can itself serve as a new primitive for the shaping of subsequent inputs, as the syllables initially did. This simulates the fact that perceptual contents change throughout the task. Finally, if learning has been successful, PS contains all the words, and only the words, of the language.
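The way a percept is assembled from the current content of PS can be illustrated with a minimal sketch. It is not the actual PARSER code: the function name `read_percept`, the representation of units as tuples of syllables, the toy syllables, and the rule of preferring the longest matching unit are all assumptions made for illustration. Only the random draw of 1 to 3 processing units per percept comes from the description above.

```python
import random

def read_percept(syllables, start, shaping_units, rng):
    """Read one percept made of 1 to 3 processing units, beginning at `start`.

    `shaping_units` is a set of tuples of syllables (the units of PS that can
    currently shape perception). Returns (percept, next_position).
    """
    n_units = rng.randint(1, 3)  # drawn anew for each percept, as in the text
    pos = start
    percept = []
    for _ in range(n_units):
        if pos >= len(syllables):
            break  # end of corpus
        # Default: a single syllable is the processing unit.
        best = (syllables[pos],)
        # Assumption (not in the text): prefer the longest known unit.
        for unit in shaping_units:
            if len(unit) > len(best) and tuple(syllables[pos:pos + len(unit)]) == unit:
                best = unit
        percept.extend(best)
        pos += len(best)
    return tuple(percept), pos


# Hypothetical corpus and PS content, for illustration only.
corpus = ["ba", "du", "ki", "po", "ba", "du", "ki", "po"]
known = {("ba", "du")}
rng = random.Random(0)
pos = 0
while pos < len(corpus):
    percept, pos = read_percept(corpus, pos, known, rng)
    print(percept)
```

In this sketch, a unit already present in PS counts as a single processing unit, so a percept built from known units can span more syllables than one built from syllables alone. This is one way of rendering the idea that earlier percepts shape later ones.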
Why does PS not become encumbered with an innumerable set of irrelevant and increasingly lengthy units? It is because the future of a unit depends on its weight, which represents trace strength. The weight of a given unit is incremented each time this unit is perceived (increment = +1), and decremented each time another unit is perceived (decrement = -0.05). The decrement simulates forgetting. In order to fulfill its shaping function, any unit of PS needs to reach a threshold value (threshold = 1). As a consequence, a unit needs to be perceived repeatedly and regularly in order to continue fulfilling a shaping function. In contrast, when the frequency of perception of a given element is not high enough to counteract the effects of forgetting, this element is removed from PS when its weight becomes zero.
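The mechanism described in the two paragraphs above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only what this appendix states (percepts of 1 to 3 primitives, increment of +1, decrement of 0.05, threshold of 1, removal at zero) and leaves out any other feature of the full model. The function names (`next_primitive`, `update`, `parser`), the representation of units as tuples of syllables, the longest-match rule for choosing a shaping unit, the permanent availability of single syllables as fallback primitives, and the toy language in the usage lines are all assumptions made for illustration.

```python
import random

INCREMENT = 1.0   # weight gained by a unit each time it is perceived
DECREMENT = 0.05  # weight lost by every other unit (simulated forgetting)
THRESHOLD = 1.0   # minimum weight for a unit to shape perception
EPS = 1e-9        # tolerance for floating-point round-off


def next_primitive(stream, pos, ps):
    """Return the longest above-threshold unit matching the stream at pos.

    Assumption: a single syllable is always available as a fallback
    primitive, and the longest matching unit wins.
    """
    best = (stream[pos],)
    for unit, weight in ps.items():
        if (weight >= THRESHOLD - EPS and len(unit) > len(best)
                and tuple(stream[pos:pos + len(unit)]) == unit):
            best = unit
    return best


def update(ps, percept):
    """Decrement every other unit, drop exhausted ones, reinforce the percept."""
    for unit in list(ps):
        if unit != percept:
            ps[unit] -= DECREMENT
            if ps[unit] <= EPS:  # weight has reached zero: forget the unit
                del ps[unit]
    ps[percept] = ps.get(percept, 0.0) + INCREMENT


def parser(stream, max_primitives=3, seed=0):
    """Process a list of syllables and return PS as {unit: weight}."""
    rng = random.Random(seed)
    ps = {}
    pos = 0
    while pos < len(stream):
        percept = ()
        # Each percept groups 1 to max_primitives primitives, chosen randomly.
        for _ in range(rng.randint(1, max_primitives)):
            if pos >= len(stream):
                break
            primitive = next_primitive(stream, pos, ps)
            percept += primitive
            pos += len(primitive)
        update(ps, percept)
    return ps


if __name__ == "__main__":
    # Hypothetical two-word toy language, concatenated without boundaries.
    words = [("ba", "bu"), ("pu", "ti", "do")]
    rng = random.Random(1)
    stream = [s for _ in range(500) for s in rng.choice(words)]
    top = sorted(parser(stream).items(), key=lambda kv: -kv[1])[:5]
    for unit, weight in top:
        print("".join(unit), round(weight, 2))
```

With these parameter values, a unit gains 1 each time it is perceived and loses 0.05 on every other percept, so its weight drifts upward on average only if it accounts for more than 0.05/1.05, that is, about 1 in 21, of the percepts. A unit created once and never perceived again stops shaping perception after the very next percept and is removed after 20 further percepts.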
It must be understood that the details of the functioning of the model are not intended to provide a realistic picture of the processes that are actually involved. As a case in point, forgetting is simulated through the linear decrement of a weight, whereas there is evidence that the forgetting curve fits only moderately well with a linear trend. Needless to say, whatever the mathematical function, forgetting is certainly not biologically implemented as the decrement of a numerical value. We believe that the use of this artificial means is of little importance, given that the simulated result, forgetting, corresponds to a ubiquitous phenomenon.
More importantly, it may be argued that the general architecture of PARSER is not compatible with the mentalistic view. Indeed, PS may be thought of as a memory store or a mental lexicon, in which symbolic representations exist independently of the current phenomenal experience of the subject. This possibility is not actually permitted in our general framework. Still more worrying is the fact that the items in PS with a weight lower than 1 could be viewed as instantiating "deeply unconscious representations", since they are stored without being able to shape the content of the phenomenal experience.
The contradiction is indeed evident, but, we believe, not detrimental to the demonstration provided by PARSER of the power of the general principles it implements. Indeed, the representations stored in PS, whatever their weight, play a role only when they match the external input. They perform no function except when they enter as a component of the current phenomenal experience. As argued in the main text (see Footnote 1), the same result should have been obtained had the memory of the system been simulated as a capacity to build an on-line representation in the presence of a given input, without directly storing the representation itself. Low-weighted items could have been replaced by a procedure in which, instead of creating new traces ex nihilo, a given input excites the same processing path as a prior, identical input, thus reinforcing this processing path. High-weighted items could have been replaced by a procedure in which the path excited by the processing of these items is strong enough to guide the formation of the current percept.
In fact, neural network modeling would certainly have been more in keeping with our approach, because it naturally implements the idea that the memory of the system is not necessarily a list of symbolic tokens. However, in most connectionist models, the representations embedded in the connection weights between units are not formatted to serve as new coding primitives. This makes it difficult to implement the idea that associations apply to increasingly complex representations. This is nevertheless a technical difficulty that does not require us to discard connectionist modeling altogether. For instance, the Cascade Correlation architecture (Fahlman & Lebiere, 1990) may help solve the problem, due to its ability to dynamically build high-order feature detectors by adding extra units to small networks. Such algorithms, or other so-called constructive methods, seem to represent a promising way to implement the principles underlying PARSER in the form of a connectionist network.
ACKNOWLEDGMENTS
This research was supported by the Centre National de la Recherche Scientifique (UMR CNRS 5022), the University of Bourgogne, and the Region of Bourgogne (Contrat AAFE).