Preprint of: Joel Norman: Two Visual Systems and Two Theories of Perception: An Attempt to Reconcile the Constructivist and Ecological Approaches

Behavioral and Brain Sciences 24 (6): XXX-XXX.





This is the unedited final draft of a BBS target article that has been accepted for publication (Copyright 2000: Cambridge University Press) and is currently being circulated for Open Peer Commentary.

This preprint is for inspection only, to help prospective commentators decide whether or not they wish to prepare a formal commentary.

Please do not prepare a commentary unless you have received a formal invitation indicating that it has been possible to include you in the final list of invited commentators.

For information on becoming a commentator on this or other BBS target articles, write to bbs@soton.ac.uk

For information about subscribing or purchasing offprints of the published version, with commentaries and author's response, write to:

journals_subscriptions@cup.org (North America)
journals_subscriptions@cup.cam.ac.uk (All other countries).



Two Visual Systems and Two Theories of Perception:

An Attempt to Reconcile the Constructivist and Ecological Approaches

 

Joel Norman

Department of Psychology

University of Haifa

Haifa, Israel

jnorman@psy.haifa.ac.il

 

“Only connect”

Epigraph to E. M. Forster’s Howards End

 

ABSTRACT

The two contrasting theoretical approaches to visual perception, the constructivist and the ecological, are briefly presented and illustrated through their analyses of space perception and size perception.  Earlier calls for their reconciliation and unification are reviewed.  Neurophysiological, neuropsychological, and psychophysical evidence for the existence of two quite distinct visual systems, the ventral and the dorsal, is presented.  These two perceptual systems differ in their functions; the ventral system’s central function is that of identification, while the dorsal system is mainly engaged in the visual control of motor behavior.  The strong parallels between the ecological approach and the functioning of the dorsal system and between the constructivist approach and the functioning of the ventral system are noted.  It is also shown that the experimental paradigms used by the proponents of these two approaches match the functions of the respective visual systems.  A dual-process approach to visual perception emerges from this analysis, with the ecological-dorsal process transpiring mainly without conscious awareness, while the constructivist-ventral process is normally conscious.  Some implications of this dual-process approach to visual-perceptual phenomena are presented, with emphasis on space perception.  

 

KEYWORDS:  Visual perception theories, ecological, constructivist, two visual systems, space perception, size perception, dual-process approach


1.  Introduction

 

Two contrasting theoretical approaches to visual perception are currently predominant.  One consists of variants on the classical Helmholtzian constructivist-inferential approach (e.g., Rock, 1983, 1997; Gregory, 1993), and the other is the newer Gibsonian ecological-direct approach (Gibson, 1979).  On the face of it, these two theories seem quite incompatible, espousing rather contradictory views of how visual perception transpires.  However, I will try to demonstrate that each of these seemingly contradictory theoretical approaches addresses a somewhat different aspect of visual perception, and that both can co-exist.  These two aspects have been delimited by recent neurophysiological, neuropsychological, and psychophysical research indicating the existence of two parallel visual systems, labeled here the dorsal and the ventral systems.  The central tenet to be presented here is that these two visual systems parallel the ecological and constructivist approaches to perception, respectively, in their function.  In other words, it is being suggested that both of these visual systems contribute to our pickup of visual information and our perception of the visual world.  The ventral system is seen to function in a manner commensurate with the Helmholtzian constructivist approach, and the dorsal system in a manner much more similar to Gibson's (1979) ecological approach.

 

Before starting, it is important to clarify the usage of the term perception in this paper.  Perception can be defined in more than one way.  It is often defined narrowly as the conscious awareness of the objects and events in the perceiver’s environment.  Such definitions are in line with the constructivist approach, and almost totally exclude dorsal system functions from "perception", leaving only the ventral system to partake in perception.  This is the tack taken by Milner and Goodale (1995) in their interpretation of their very important findings concerning the two visual systems.  I will argue for a broader definition, in which perception is seen to encompass both conscious and unconscious effects of sensory stimulation on behavior.  This broader definition is more commensurate with the attempt made here to include both approaches, the constructivist and the ecological, under a common framework.  It is also necessitated by the findings that indicate that many perceptual activities can be carried out by both systems and that often they interact synergistically in these perceptual activities.  But, to indicate to the reader which system I believe to be involved in a given perception, I will refer to the dorsal system as “picking up” (information), following Gibson, and to the ventral system as “perceiving” the stimulation in question.

 

Section 2 will begin with a brief review of the two theoretical approaches, the constructivist and the ecological, followed by a look at some previous claims that the two approaches are not incompatible.  This will be followed by a summary, in Section 3, of some of the more relevant findings concerning the two visual systems, and their currently assumed functions.  These two sections are essentially literature reviews, and some readers might want to skip them moving directly to the more central theses of this paper.  Section 4 will take a second look at the two theories and some of the research carried out under their aegis and try to demonstrate the parallels between ecological theory and its research and the functions of the dorsal system, and between constructivist theory and its research and the functions of the ventral system.  In most instances the examples will be from the domain of space perception with emphasis on size perception or size constancy, that is the invariance of perceived size with distance variant.  Hopefully, choosing examples from a single domain will yield a more coherent presentation, but, of course, there will remain the question of generalization to other domains.  Finally, Section 5 will try to summarize the emergent dual-process approach, and show how it sheds new light on some topics in visual perception, and point to some of its relations to other theoretical accounts.  A brief look at the conclusions will appear in Section 6.

 

2.     Two competing theories of perception, the Constructivist and the Ecological

 

The two competing theories, or variants on these theories, have been given a wide assortment of labels.  The older theory, which I will in the main refer to as constructivist or indirect, has also been called Helmholtzian, cognitive, algorithmic, mediational, among other labels.  The newer theory, which I will usually refer to as ecological, direct, or Gibsonian, has also been called sensory, proximal, and immediate, among others.  There are those who equate the constructivist approach with a computational theory of vision.  But as has been pointed out (e.g., Epstein, 1980; Hatfield, 1990b) both theories can be seen as computational, the differences between them depending on what type of information those computations process.  The constructivist approach is seen to process information beyond that found in the sensory stimulation while the ecological approach limits itself only to information in the stimulation.  No attempt will be made to give a thorough and complete review of these theories but simply my, hopefully unbiased, understanding of them. The constructivist view has taken on several somewhat different stances over the years, and Epstein (1995) has recently briefly reviewed several of these.  The much newer ecological approach (but see Lombardo, 1987) is mainly the product of the life's work of Gibson, as spelled out in his last book (Gibson, 1979).  Here I will simply try to point to some of the central themes of these theories, especially those that are relevant to what constructivists might call "space perception" and Gibson would have probably called "the pick up of information about the affordances of the ambient environment." 

 

Let me start by stating in very general terms what I believe to be the major differences between the two approaches to perception.  These relate to two interrelated topics: the richness of the stimulation reaching our sensory apparatus, and the involvement of "higher" mental processes in the apprehension of our environment.  The constructivists see the stimulation reaching our senses as inherently insufficient, necessitating an "intelligent" perceptual system that relies on inferential types of mechanisms to overcome this inherent equivocality of stimulation.  The ecologically oriented theorists argue that the information in the ambient environment suffices and is not equivocal, and thus no "mental processes" are needed to enable the pickup of the relevant information.  The constructivists see perception as multistage, with mediational processes intervening between stimulation and percept, i.e., perception is indirect.  The ecological theorists see perception as a single-stage process, i.e., it is direct and immediate.  For the constructivists, memory, stored schemata, and past experience play an important role in perception.  The ecologically oriented approach sees no role for memory and related phenomena in perception.  Finally, the two approaches differ in the aspects of perception they emphasize: the constructivists excel at analyzing the processes and mechanisms underlying perception, while the ecological approach excels at the analysis of the stimulation reaching the observer.  This is clearly a very oversimplified view of the differences between the two views, but I believe that it contains the gist of the main differences between them.  Let us look at the two approaches in somewhat greater detail.

 

2.1    The Constructivist Approach

 

Of the two competing theoretical approaches, the constructivist approach is the older, more "classical" approach.  Although its roots are much older, many see Helmholtz as its modern forefather, often citing his notion of “unconscious inference” as the forerunner of current constructivist thinking (see e.g., Rock, 1977).  In reality the Helmholtzian notion of unconscious inference was more encompassing than its current equation with the “taking-into-account” notion (see below).  It intertwined perceptual processes with the nativism-empiricism debate, with Helmholtz utilizing it to reinforce his empiricist stance (see Hatfield, 1990a, ch. 7).

 

More recently Boring (1946) borrowed and sharpened Titchener’s (1914) distinction between core and context to explicate the results of the classic Holway and Boring (1941) experiment.  In that experiment observers judged the size of a disk, presented at varying distances, under conditions of increasing "reduction", i.e., where more and more distance cues were eliminated.  Their finding was that the more cues “reduced”, the poorer the size constancy, i.e., the more the judgments were of the proximal size and not of the distal size.  Boring (1946) writes:

“For descriptive purposes it is convenient to say that the sensory data that contribute to a perception can be divided into a core and its context.  The core is the basic sensory excitation that identifies the perception, that connects it most directly with the object of which it is a perception.  The context consists of all the other sensory data that modify or correct the data of the core as it forms the perception.  The context also includes certain acquired properties of the brain, properties that are specific to the particular perception and contribute to the modification of its core.  In other words, the context includes knowledge about the perceived object as determined by past experience, that is, by all the brain habits which affect perceiving.

            In visual perception the core is the retinal excitation, that is to say, the total optical pattern, specified with respect to the wavelengths and energies involved and the spatial distribution and temporal changes of each.  Thus in the visual perception of size with distance variant, the core is the size of the retinal image.  The context includes all the clues to the distance of the perceived object – clues of binocular parallax and convergence, and of lenticular accommodation and perspective, as well as the other monocular clues to the awareness of distance….” [note 1]

 

Boring’s is a strong constructivist stance, where the process of perception consists first of a core stimulus, a proximal image of the disk subtending 1° on the retina, and this core is modified by the context, by all the cues (clues) that yield information about the distance of the disk.  This modification process mediates between the core and the final percept, the more complete the information about distance (the fewer the cues “reduced”) the more the percept matches the distal stimulus.  In other words the perceptual process takes into account the perceived distance in attempting to assess the true, distal, size of the disk.  This “taking into account” formulation of the perceptual constancies was elucidated in an article by Epstein (1973) where he spelled out the underlying common mechanism for seven constancies.  That mechanism consists of a combinatorial process where, for each of the constancies, two independent variables yield the distal attribute.  In the case of shape constancy, for example, the variables are the projective shape (the local retinal attribute in Epstein’s usage or the core in Boring’s) and the concomitant variable, the apparent slant, which together yield the perception of the distal attribute, in this case the apparent shape.  The elegance of a common mechanism for all these constancies is somewhat marred, as Epstein points out, by the fact that empirical tests have not always yielded results consistent with it.  Epstein points to several reasons for this, one of these is based on the distinction between perceived and registered variables.  This refers to the fact that there may be a difference between the perception of the concomitant variable and its registration in the nervous system.  Taking size constancy as an example, which relies on distance information according to this view, it is being suggested that the perceived or reported distance differs from that registered by the nervous system.  
Experiments attempting to verify the “taking into account” hypothesis have utilized the reported distance, but this is different from the registered distance, and it is possible that it is the latter that combines with the core size in yielding the size percept.  I will return to this topic in section 5.2.3.
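
The combinatorial logic of “taking into account” can be illustrated with a small numerical sketch.  It assumes only the standard geometric relation between visual angle, distance, and frontal size; the function names, the numerical values, and the idea of feeding in a “registered” distance that differs from the true one are illustrative assumptions, not a formalism taken from Boring or Epstein.

```python
import math

def visual_angle_deg(distal_size, distance):
    """Visual angle (degrees) subtended by a frontal object (the 'core')."""
    return math.degrees(2.0 * math.atan(distal_size / (2.0 * distance)))

def inferred_size(angle_deg, registered_distance):
    """Combine the core (visual angle) with a distance estimate to yield a size."""
    return 2.0 * registered_distance * math.tan(math.radians(angle_deg) / 2.0)

# Hypothetical example: a 1 m disk viewed from 10 m.
angle = visual_angle_deg(1.0, 10.0)

# With accurate distance information the distal size is recovered.
full_cue_size = inferred_size(angle, 10.0)

# With "reduced" cues, suppose the registered distance is only 5 m:
# the same visual angle now yields a smaller inferred size.
reduced_cue_size = inferred_size(angle, 5.0)

print(round(angle, 2), round(full_cue_size, 3), round(reduced_cue_size, 3))
```

On this sketch, any mismatch between the reported distance and the distance actually entering the combination would show up as an apparent failure of the hypothesis, which is the point of the perceived/registered distinction.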

           

The most prolific proponent of the constructivist approach in recent years has been Irvin Rock (e.g., 1977, 1983, 1997).  His The Logic of Perception (1983) is a treatise devoted in its entirety to documenting the evidence in favor of the constructivist view, and his recent Indirect Perception (1997) is a collection of papers seen to support that view accompanied by his introductory chapter and his comments at the beginning of the sections.  The first sentence in the former book is:  “The thesis of this book is that perception is intelligent in that it is based on operations similar to those that characterize thought.” (1983, p.1).  A little later in the book he makes it clear that this thought-like process occurs unconsciously.  Equating perception and thought processes is adopting a rather extreme position, both because it is difficult to envision the exact parallel between the two, and because it is very difficult to empirically verify its validity.  In the introduction to the first section of his later book,  Rock (1997) takes a somewhat different tack, explaining that indirect perception means “that perception is based on prior perception, implying a perception-perception chain of causality.”  This interdependency of perceptual processes is something that can be examined empirically, and, indeed, the studies reprinted in that book clearly evidence that chain of causality.  Actually, one senses a transition from the stance of the strong opening sentence in Rock’s (1983) book as one reaches the tenth chapter of the book, entitled “Perceptual Interdependencies” where Rock develops this idea based on the writings of several notable students of perception (Epstein, 1982; Gogel, 1973; Hochberg, 1956, 1974).

 

Epstein (1982) cites several examples of such perceptual interdependencies.  These he labels “percept-percept couplings” (after Hochberg, 1974), where the perception of one stimulus dimension is altered by changes in a different stimulus dimension.  One well-known example is the set of experiments by Gilchrist (1977, 1980) demonstrating that the perceived lightness of a reflecting patch can be changed drastically by manipulations of the stimulus situation affecting where it is perceived to be (e.g., in a dimly vs. a well-lit room) or what its physical slant is (e.g., facing the light source or not).  Percept-percept couplings, according to Epstein and Rock, are anathema to direct theory in that “the cardinal tenet of direct theory cannot be sustained.  The percept in question will have been removed from direct control by information in stimulation.”  (Epstein, 1982).  Epstein also presents evidence favoring a causal interpretation rather than a correlational interpretation of such percept-percept couplings.  One such body of evidence is a set of studies by Gogel and Tietz (1973, 1974, 1977, 1979) showing that completely independent stimulus manipulations, such as changes in oculomotor convergence or motion parallax, affect perceived distance in a similar manner.  It should be noted that in most of the examples of such percept-percept couplings presented by Epstein (1982) and Rock (1997), the second, concomitant, variable manipulated (not the core) consists of some manipulation of the stimulus situation affecting the subject’s perception of three-dimensional space.  I shall return to this topic in Section 5.2.2.

 

Returning once again to the question of size perception, Rock (1983) specifically invokes a syllogistic inferential mechanism:

“I will argue that the process of achieving constancy is one of deductive inference where the relevant ‘premises’ are immediately known.  That is to say, in the case of a specific constancy such as that of size, two aspects of the proximal stimulus are most relevant, one being the visual angle subtended by the object and the other being information about the object’s distance.” (p.240).

Like Boring and Epstein, Rock sees size perception as depending on two perceptions, that of proximal size and that of distance, together leading through a syllogism to the veridical distal percept.  While there are slight differences in emphases between these three researchers, all three call for some sort of combination of proximal size information and distance information in the achievement of size constancy.  In a similar manner, the same combinatorial process holds for all the constancies, according to the constructivist view.  Those who adopt the ecological view, as will be seen in the next section, do not accept this view.

 

2.2    The Ecological Approach

 

Gibson’s ecological theory as expounded in his The Ecological Approach to Visual Perception (1979) evolved over his entire career (see the fascinating account in Reed, 1988).  In that book Gibson presented an exciting new approach to the study of visual perception that included many new concepts and new ways of looking at perception.  The entire first half of the book is devoted to a novel analysis of the ambient environment and the information it proffers the observer.  Gibson finds the classical approach of describing the stimuli for perception in terms of stimulus energies impinging upon the receptors completely unsatisfactory.  He points to the differences between these energies and the optical information available in the ambient optic array.  That information is picked up by a stationary or moving observer.  Gibson, like Johansson (e.g., 1950), calls attention to the fact that perception consists of perceiving events; i.e., perceiving changes over time and space in the optic array.

 

Perhaps one of Gibson’s most important contributions is the concept of affordances.  Gibson writes: “The affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or for ill” (1979 p. 127).  Mark (1987) defines affordances as "the functional utility of certain environmental objects or object complexes taken with reference to individuals and their action capabilities".  Gibson gives examples of the various affordances of surfaces, such as “stand-on-able”, “climb-on-able”, or “sit-on-able”, and writes:

“The psychologists assume that objects are composed of their qualities.  But I now suggest that what we perceive when we look at objects are their affordances, not their qualities.” (1979, p. 134)

and:

“…… the basic affordances of the environment are perceivable and usually perceivable directly, without an excessive amount of learning.  The basic properties of the environment that make an affordance are specified in the structure of ambient light, and hence the affordance itself is specified in ambient light. Moreover, an invariant variable that is commensurate with the body of the observer himself is more easily picked up than one not commensurate with his body.” (1979, p. 143).

Quite a few experimental studies of affordances have been published, focusing on a variety of topics such as the affordance of stairs for climbing, the affordance of chairs for sitting, or the affordance of apertures for walking through.  I will return to one of these and to the concept of affordances once again in section 4.1.

 

Gibson’s is a theory of direct perception and he describes it thusly:

“So when I assert that perception of the environment is direct, I mean that it is not mediated by retinal pictures, neural pictures, or mental pictures.  Direct perception is the activity of getting information from the ambient array of light.  I call this a process of information pickup that involves the exploratory activity of looking around, getting around, and looking at things.” (1979, p. 147)

What sort of information is picked up in direct perception?  Gibson suggests that there exist higher-order invariants [note 2] in the optic array that serve to supply the observer with unequivocal information.  He musters a great deal of evidence to support this point.  Among the items of evidence he presents is a study of size perception he performed during World War II.  In that study he presented aviation cadets with the task of matching the height of stakes planted at various distances in a very large plowed field with a set of stakes of varying size nearby.  His finding was that size perception remained invariant no matter how far away the stake was planted: “The judgments became more variable with distance but not smaller.  Size constancy did not break down.” (1979, p. 160).  Unlike the constructivists, Gibson does not ascribe this size constancy to the taking-into-account of distance, but rather:

“The implication of this result, I now believe, is that certain invariant ratios were picked up unawares by the observers and that the size of the retinal image went unnoticed.  No matter how far away the object was, it intercepted or occluded the same number of texture elements of the ground.  This is an invariant ratio.  For any distance the proportion of the stake extending above the horizon to that extending below the horizon was invariant.  These invariants are not cues but information for direct size perception….” (1979, p. 160).

Gibson is suggesting that size constancy results from the direct pickup of invariant ratios in the ambient array.  He proposes two such invariant ratios, the amount of texture intercepted and the horizon ratio.  It is also noteworthy that he claims that these invariant ratios are picked up "unawares".  There is no need, according to his view, for perceived distance to be involved here, nor for the inferential mental processes that the constructivists purport to underlie size perception.

“…. both size and distance are perceived directly.  The old theory that the perceiver allows for the distance in perceiving the size of something is unnecessary.” (1979, p. 162).
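
The invariance of the horizon ratio can be checked with a short geometric sketch.  It assumes a flat ground plane, an object standing upright on that ground, an observer whose eye is at a fixed height, and a simple projection onto a frontal picture plane at unit distance; the heights, distances, and function names are hypothetical values chosen for illustration and are not taken from Gibson's study.

```python
def projected_height(object_height, distance):
    """Projected (proximal) height on a picture plane at unit distance."""
    return object_height / distance

def horizon_ratio(object_height, eye_height, distance):
    """Proportion of the object projecting above the horizon to that below it.

    On flat ground the horizon cuts an upright object at the observer's
    eye height, whatever the viewing distance.
    """
    below = eye_height / distance
    above = (object_height - eye_height) / distance
    return above / below

# Hypothetical stake of 2.4 m seen from an eye height of 1.6 m.
distances = (5.0, 20.0, 80.0)
projected_heights = [projected_height(2.4, d) for d in distances]
ratios = [horizon_ratio(2.4, 1.6, d) for d in distances]

for d, p, r in zip(distances, projected_heights, ratios):
    print(f"distance {d:5.1f} m  projected height {p:.3f}  horizon ratio {r:.3f}")
```

The projected height shrinks with distance while the horizon ratio stays put, so under these assumptions the ratio specifies the stake's height relative to eye height without any estimate of distance.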

 

Gibson’s conception is one of an active perceiver exploring his environment.  Eye-, head-, and body-movements are part and parcel of the perceptual process.  Perception transpires continuously over both time and space.  “Space” here refers not to an empty space but to the many surfaces that make up the environment, the most important being the terrain that at times reaches the horizon.  The horizon is of importance as it serves as an important reference standard, and when it is occluded Gibson speaks in terms of an implicit horizon, presumably similar to what architects and others have called the eye-level plane.  With such a conception Gibson is totally averse to the reductionist experimental paradigms.  Brief exposures or looks through monocular “peep-holes” do not represent true perception in his view.  In discussing the famous Ames demonstrations of the trapezoidal room or window, he writes:

“An observer who looks with one eye and a stationary head misperceives the trapezoidal surfaces and has the experience of a set of rectangular surfaces, a ‘virtual’ form or window. …. The eye has been fooled.

            The explanation is that, in the absence of information, the observer has presupposed (assumed, expected, or whatever) the existence of rectangular surfaces causing the solid angles at the eye.” (1979, p.167)

Gibson also eschews the idea that a perceptual system has a memory.  He claims that “there is no dividing line between the present and the past, between perceiving and remembering” (1979, p.253).

 

In his book Gibson almost totally refrains from discussing the processes underlying perception.  Perception is simply the pickup of information from invariants in the ambient environment.  His only allusions to underlying processes are in terms of resonance:

“In the case of the persisting thing, I suggest, the perceptual system simply extracts the invariants from the flowing array; it resonates to the invariant structure or is attuned to it.  In the case of substantially distinct things, I venture, the perceptual system must abstract the invariants.  The former process seems to be simpler than the latter, more nearly automatic.” (1979, p. 249).

In their explication of Gibson’s approach, Michaels and Carello (1981) are somewhat more explicit about what they call “The Resonance Model”.  They refer back to Gibson’s (1966) radio metaphor for perception, pointing out that “the recognition or detection of radio waves is based on principles of resonance.”  They suggest that the environment “broadcasts” information and that information must be “tuned in”.  They also point out that the radio metaphor is lacking on two counts.  First it accounts for only the perceptual part of the perception-action continuum, and second a radio needs someone to tune it, while a perceptual system is a self-tuning device.


 

2.3    Calls for reconciliation and unification

 

In spite of the sharp contrasts between the constructivist and ecological approaches, there were those who, not long after the publication of Gibson’s (1979) last book, called for seeking out ways to reconcile the two approaches.  My own “awakening” came from the results of three experiments on size perception (Norman, 1980).  Those experiments, somewhat naively, aimed at pitting the two approaches, the constructivist and the ecological, against each other by examining the effects of object distance on size perception.  The participants in the three experiments were presented with a monocular view, through a “peephole” containing an electronic shutter, of two square pieces of red Plexiglas standing erect on a surface covered with a highly textured cloth (Experiments 1 and 2) or with a dull gray textureless cloth (Experiment 3).  The two red squares were never the same physical size and in most instances they were not at the same distance from the participant.  The task was to judge which of the two squares was physically bigger and press an appropriate button.  Response times were measured from the opening of the shutter till the correct response was made.  The shutter was closed immediately after the response was made, and the stimuli were changed.

 

The idea behind this research paradigm was to try to determine if the constructivists are right in positing that the perception of size-at-a-distance entails a taking-into-account of object distance, or whether, in contrast, the ecological approach is correct in claiming that perceived distance is irrelevant, with size information being picked up with the aid of invariant ratios available in the ambient array, such as texture occlusion or the horizon ratio.  The analysis of the first experiment examined the response time data in terms of two stimulus parameters, the “distal ratios”, the ratios of the objective (physical) sizes of the two red squares being judged, and the “proximal ratios”, the ratios of the proximal (retinal) sizes of those squares.  If distance is taken into account, as the constructivists or indirect theorists claim, then the response times should be affected by the proximal ratios, which are determined by the relative distance of the two squares from the observer.  But if the ecological or direct theorists’ claim that distance does not play a role in the perception of size is correct, then only the distal ratios should affect the response times.  The results of the first experiment indicated that the proximal ratios affected response times to a greater extent than did the distal ratios.  This finding is more in line with the predictions of the indirect theory.  But there also was evidence of an independent effect of the distal ratios on response times, and the results of Experiments 2 and 3 further elucidated this effect.  Briefly, the results of those two experiments showed that the effect of the proximal ratios on response times was contingent on the distal ratios: the smaller the distal ratio (i.e., the more different in size the two squares being compared), the smaller the effect of the proximal size.  In other words, the greater the difference between the physical sizes of the squares being compared, the more direct the perception of their size.  In fact, in the stimulus conditions with the greatest difference between the distal sizes (smallest distal ratios) the manipulation of distance had no effect on the response times at all.  Thus, there appeared to be an indication that size-at-a-distance could be picked up without the involvement of perceived distance under certain conditions.  I summarized the implications of these findings thus:

“To sum up, it is being suggested that both direct and indirect perception occur, that they do not define a dichotomy but a continuum, and that the location of a perceptual act on that continuum is determined by some interaction of the difficulty of the perceptual discrimination required and the richness of the stimulus conditions….. The challenge facing the perceptual theorist is not to choose between the two theories, but to incorporate the two approaches into a common framework with the aim of delineating the conditions under which direct and indirect processes emerge.” (Norman, 1983).

It is being suggested here that such a common framework does exist. It is based on the findings concerning the existence of two visual systems each with its specific modes of functioning, each with its complementary contribution to the organism’s ability to utilize the impinging sensory stimulation in coping and behaving in its environment.

 

The “richness of the stimulus conditions” in the previous quote refers to the fact that in spite of my using a highly textured and well-illuminated surface, the experimental setup was not really conducive to what ecologically oriented researchers would consider a “fair” assessment of perception.  The participants were given a very brief monocular view of the stimulus array; a far cry from what Gibson would consider an adequate setup allowing true perception.  Yet, in spite of these limitations evidence for direct perception of size seemed to emerge.  The possibility exists that had the experimental conditions allowed binocular rather than monocular vision, and much longer exposures, perhaps entailing movement by the participants, direct perception of size might have been found for other conditions as well.

 

At about the same time others also called for the amalgamation of the two theoretical approaches, the constructivist and the ecological.  Haber (1985) reviewed 100 years of research on perception in a paper presented in 1979 at the APA convention to celebrate the centennial of experimental psychology (and printed very much later).  In that paper he noted that while the Gibsonian approach excels in its analysis of the stimulation reaching the organism, it needs to be supplemented by an adequate theory of the underlying processes along Helmholtzian lines.  His conclusion: “I feel that as soon as we create a truly Gibholtzian theory of space perception, this merger will produce a breakthrough in our understanding of space”.  In an early review of the contributions of developments in computer vision to perceptual theory, McArthur (1982) found that both bottom-up and top-down processing (not his terms, but commonly used today) are required for efficient computer vision models, leading him to write:

“We can identify hypotheses about the kinds of knowledge and uses of knowledge in perception that could be regarded as ‘more or less’ Gibsonian or constructivist.  More generally, we might regard the extreme Gibsonian and constructivist views as end points on a continuum, or space, of possible theoretical positions concerning the role of knowledge in perception.”

It should also be mentioned that Rock (1983) foresaw the possibility of a unified theory.  In his discussion of the various theories of perception he wrote:  “Varieties of each of these are of course possible, and one might develop an overall theory that combines features of each.”  (p. 28).

 

Others have noted that one of the problems in trying to find a middle way between the two approaches is that they are very different in the conceptualizations they adopt.  A means for ameliorating this problem was suggested and thoroughly analyzed by Hatfield (1988, 1990b).  He proposed that a connectionist analysis of perception can serve as the bridge between the two approaches.  Very briefly, he showed how a connectionist model can satisfy the claims of the constructivists that rule-like behavior underlies perception, but by being rule-instantiating without being rule-following the model can also satisfy the Gibsonian strictures against cognitive mediation.  In other words, the connectionist network with its hidden units and the connection weights among them can respond (or resonate) as if it is making inferences without implementation of anything more than some changes in the weights in the model.  These weight changes are the “representations” of the system.  Hatfield suggested that representations of this sort are commensurate with Gibson’s approach.

 

Another attempt at bridging the gap between the two theoretical views was offered by Bennett, Hoffman, and Prakash (1989, 1991).  They presented a mathematical theory of perception that they suggested can also serve as a rapprochement between the ecological and constructivist views (see also Banks & Krajicek, 1991; and Braunstein, 1994).  Their theory is built around the concept of an "observer".  An observer is not necessarily a perceiver but "each perceptual capacity can be described as an observer" (Bennett et al., 1991).  These observers perform inductive rather than deductive inferences and such inferences can serve in both what appear to be direct or ecological perceptual processes and in the type of processes proposed by the constructivists.  By transforming the two approaches to inferences of a similar nature they suggested that the gap between the two can be bridged.

 

More recently Neisser (1994) has proposed a tripartite division of perception, consisting of three perceptual systems:

“1.  Direct perception/action, which enables us to perceive and act effectively on the local environment.

2.  Interpersonal perception/reactivity, which underlies our immediate social interactions with other human beings.

3.  Representation/recognition, by which we identify and respond appropriately to familiar objects and situations.”

While Neisser does not go into very much detail concerning the three systems, it would appear that the first and third systems above are very similar to the two systems being suggested here, the dorsal and the ventral, respectively.  The second system, the one dealing with social interactions, while very interesting is beyond the scope of the topics being dealt with here.

 

In a recent guest editorial in the journal Perception, Heller (1997), whose central interest is in the sense of touch, also calls attention to the fact that “An important gap in theoretical positions exists between the ecological and traditional points of view”.  He uses the term traditional as synonymous with the “constructionist (representational) viewpoints”.  He then writes: “It is very possible that the ecological position and the inferential hypothesis testing views of perception are both correct, within limits… Thus, the distinction between the ‘what’ and ‘where’ functions of perception may help to resolve the apparent conflict between the ecological and other, constructionist approaches.”  This statement also bears much similarity to the thesis set forth in this paper.

 

 

3.  The two visual systems, the ventral and the dorsal

 

The idea of two visual systems is far from new (see reviews in Jeannerod, 1997, Ch. 2; and Milner & Goodale, 1995, Ch. 1).  In the late sixties, a group of studies produced evidence for this idea.  One of the better known studies was carried out by Schneider (1967, 1969), who described experiments on hamsters where ablation of the cortical visual system (areas 17 and 18) left the hamsters incapable of demonstrating pattern discrimination but still capable of orienting toward objects.  In contrast, in a second group of hamsters, undercutting the tectum, thus disconnecting the superior colliculus, had the opposite effect: these animals could make pattern discriminations but could not orient themselves in space.  Schneider (1969) saw these findings as indicating that the hamster had two visual systems, one a cortical system answering the question 'What is it?', and the second a subcortical system answering the question 'Where is it?'.  At about the same time, Trevarthen (1968), who had been studying the behavior of split-brain monkeys, also came to the conclusion that there were two visual systems, one a subcortical system that he called 'ambient' and one a cortical system that he called 'focal'.  The former was primarily subserved by peripheral vision and the latter by foveal vision.  Quite a few other studies during that period also pointed to the existence of two visual systems.  For example, Ingle (1973) provided evidence for the existence of two visual systems in the frog.  Also, Held (1970) published a review of a wide variety of studies of perceptual adaptation, all consistent with the idea that there exist two modes of visual analysis, a “contour-specific” mode and a “locus-specific” mode.

 

This focal-ambient nomenclature was adopted by quite a few researchers, including Leibowitz and Post (1982), who summarized the implications of these two modes for several quite diverse topics in vision and visual perception.  Among the studies these authors summarized was an earlier study of theirs (Leibowitz, Wilcox, & Post, 1978) where they examined the effect of inducing refractive error (blur) on both size constancy and shape constancy.  They found that increasing blur decreased shape constancy, but, in contrast, increasing blur had no effect on the degree of size constancy.  Leibowitz and Post (1982) suggested that these differences were due to the differences between the focal and ambient systems.  The focal system is very sensitive to decreases in spatial frequency while the ambient system functions efficiently over a large range of spatial frequencies.  It is suggested that the focal-ambient distinction as used many years ago by Leibowitz, Held, and others is quite similar, if not identical, to the ventral-dorsal distinction to be elaborated here.  The Leibowitz, Wilcox, and Post (1978) study is an early finding indicating that the dorsal system is involved in the pickup of size information.

 

The general consensus during the 60s and 70s was that the focal system was under cortical control while the ambient system was subcortical (e.g., Perenin & Jeannerod, 1979).  But in 1982 Ungerleider and Mishkin presented evidence that in the visual cortex of the monkey there were two separate pathways, one they labeled the ventral stream, leading from the occipital cortex to the inferior temporal cortex, and the second, the dorsal stream, leading to the posterior parietal cortex.  Lesioning the inferior temporal cortex left the monkeys unable to discriminate between objects of different shapes, while lesioning the posterior parietal cortex left them unable to perform a landmark discrimination task.  These findings led Ungerleider and Mishkin to suggest that the ventral pathway dealt with object identification, and the dorsal pathway dealt with object location.  Somewhat like Schneider they called the ventral pathway a "what" system and the dorsal pathway a "where" system, but unlike Schneider's systems, both of theirs were cortical.

 

More recently a somewhat different interpretation of this dichotomy has been suggested by Goodale and Milner (1992; see also Milner & Goodale, 1995).  Their interpretation of the functions of the ventral stream does not differ markedly from that of Ungerleider and Mishkin.  They also see it as mainly involved in the processes of recognition and identification.  Their main innovation is the functions they attribute to the dorsal stream.  Rather than mapping the location of objects, they see it as a system for the visual control and guidance of motor behavior.  They present a great deal of evidence showing that the dorsal stream is capable of utilizing visual information for the control of movement, and that it is dissociated from the ventral stream.  According to Goodale and Milner the major difference between the two streams is not in the visual information they process, but in the transformations they perform on the available visual information.  In other words, the ventral stream transforms visual information into an exocentric (also labeled "allocentric") framework allowing the perception of the object as it relates to the visual world.  The dorsal system, on the other hand, transforms visual information into an egocentric framework allowing the actor to grasp or otherwise bodily manipulate the object.

 

The labels "dorsal system" and "ventral system" will be used here to denote the two systems3.  At the end of this section I will try to summarize what is known about the functions of the two systems and the differences between them.  This will follow a review of  neurophysiological studies, referring mainly to physiological studies on monkeys, but also some of the recent imaging work (PET and fMRI) corroborating these findings in humans, a review of neuropsychological studies of brain-damaged patients, and finally a review of psychophysical studies on healthy humans.

 

3.1 Neurophysiological Studies

 

The labels “dorsal system” and “ventral system” are used in this article to connote two theoretical entities, but these labels are borrowed from, and have their roots in, two anatomical-physiological entities usually labeled the dorsal and ventral streams.  These streams are located in different parts of the cortex.  The dorsal stream is located in the main in the posterior parietal cortex and adjacent areas and includes areas such as MT (middle temporal or V5), MST (medial superior temporal), LIP (lateral intraparietal), among others.  The ventral stream is located in the main in the inferotemporal cortex and adjacent areas and also includes area V4.  Both the ventral and dorsal streams receive input from V1, but the dorsal stream also receives direct subcortical inputs, via the superior colliculus and pulvinar.  It is this subcortical pathway that was once thought to serve the ambient visual system.  Of the two streams, the ventral appears to receive its major input from the parvocellular retinocortical pathway, although it also receives considerable magnocellular input, while the dorsal stream receives its main, if not total, input from the magnocellular retinocortical pathway (see Merigan & Maunsell, 1993).  The differences between the parvocellular and the magnocellular pathways are important for gaining a better initial understanding of the functions of these two visual systems.  Recent textbooks on vision (e.g., Wandell, 1995) give detailed information on the parvocellular and magnocellular pathways, and I shall only describe them briefly here.

 

The two pathways are seen as originating in the ganglion cells of the retina with the parvocellular pathway in the much smaller and more plentiful midget ganglion cells and the magnocellular pathway in the much larger parasol ganglion cells.  (Evidence exists for a third type of ganglion cell, the w cells, feeding into a third pathway, the koniocellular pathway described by Casagrande, 1994, but not enough is known about this pathway to include it here). The two pathways are still segregated at the lateral geniculate nuclei, the axons from the parasol ganglion cells reaching the two magnocellular layers and those from the midget cells the four parvocellular layers.  This segregation continues in V1 as well as in extrastriate visual areas, with the pathways seen as splitting into three (e.g., DeYoe & Van Essen, 1988; Livingstone & Hubel, 1988) or even four (Zeki, 1993) different pathways.  It has been suggested that these pathways serve different visual/perceptual functions, but more recent evidence has indicated that these proposals of clearly segregated pathways are inaccurate, both at a physiological level and a functional (visual perception) level (see e.g., Bullier & Nowak, 1995; Schiller, 1996).  Today, the consensus seems to be that the major difference between the two pathways is in their relative spatial and temporal sensitivities, the parvocellular pathway capable of processing information at higher spatial frequencies and the magnocellular pathway at higher temporal frequencies.  It is also claimed that the contrast sensitivity of the magnocellular system is greater at low spatial frequencies (see e.g., Schiller, 1996).  One further important point is the fact that the magnocellular pathway is the faster of the two with response latencies  about 20 ms shorter than the parvocellular pathway (see Bullier & Nowak, 1995).  The magnocellular pathway has also been seen to be highly implicated in the processing of motion information (Logothetis, 1994).

 

The brunt of motion analysis is carried out in the dorsal system, mainly in areas MT and MST (Logothetis, 1994).  It has also been shown that in macaques dorsal system inputs are from areas dealing with spatial or motion analysis and from peripheral representations of the retina, while those of the ventral system are from areas dealing with form and color analysis from more central representations of the retina (e.g., Baizer, Ungerleider, & Desimone, 1991).  But this simple view of the ventral system dealing with form and color perception and the dorsal system dealing with motion and spatial analysis is an oversimplification.  For example, area V4 is considered to be part of the ventral system but also possesses cells that are motion sensitive (Ferrera, Rudolph, & Maunsell, 1994; Logothetis, 1994).  On the other hand, there is evidence for the involvement of the dorsal system in some type of shape or form analysis.  Features necessary for object identification, such as shape and size, are processed by the ventral system, but the dorsal system also has access to information about the shape and size of objects, albeit to serve a different purpose, that of performing motor movements vis-à-vis those objects, and utilizing a different framework, egocentric rather than allocentric.  Sakata, Taira, Kusunoki, Murata, and Tanaka (1997) have recently summarized a large group of studies indicating that the parietal cortex of monkeys contains at least five types of cells relevant to depth perception and the visual control of hand movements.  Many of these cells were found to be sensitive to the 3-D features of objects, such as shape, orientation, and size.  There is also evidence from PET imaging studies that this is true in humans as well (Baker, Frith, Frackowiak, & Dolan, 1996; Faillenot, Toni, Decety, Gregoire, & Jeannerod, 1997).  The studies by Sakata and his colleagues also show that cells in the parietal cortex respond to binocular inputs, including sensitivity to binocular disparity.

 

Clearly the input into the two systems must combine at some point, and recent studies have also been focusing on what becomes of the information in the two systems and where it is integrated.  Evidence for continued segregation of the two systems in the frontal lobe (frontal eye fields) has been reported (Bullier, Schall, & Morel, 1996; Schall, Morel, King, & Bullier, 1995).  Owen, Evans, and Petrides (1996) report similar findings in a PET study of humans.  More recently Rao, Rainer, and Miller (1997) reported a study of neurons in the monkey’s prefrontal cortex, where both object-oriented and location-tuned tasks were used.  Some of the neurons showed specific object- or location-tuning, but 52% of the cells showed tuning to both dimensions, leading these researchers to suggest that: “These neurons may contribute to the linking of object information with the spatial information needed to guide behavior.”

 

To sum up, physiological research on monkeys and imaging studies on humans have produced evidence for the existence of two cortical visual systems, the ventral system that processes pattern, form, and color information, and the dorsal system that processes motion and spatial information.  It would seem that recent neurophysiological findings concur with the neuropsychological and psychophysical findings reviewed below in that both systems overlap somewhat in the type of visual input they process, but process this information for quite different purposes.

 

3.2 Neuropsychological Studies

 

Many insights into the functions of the two visual systems and their dissociation have come from studies on patients where apparently one of the two systems is damaged due to some localized injury to the brain.  Many of these studies have been thoroughly reviewed in Milner and Goodale’s (1995) book, and I will only mention some highlights.  On the one hand, there are patients who have incurred damage in their parietal lobe, and presumably some of their dorsal system functions are defective.  Some of these patients suffer from what is called optic ataxia, manifesting great difficulties in making correct motor movements towards visually displayed targets, but have no trouble discriminating and identifying visual stimuli of all sorts.  In a word, these patients presumably have an intact ventral system, but a damaged dorsal system.  Patients suffering from optic ataxia have been described quite often in the literature (e.g., Perenin & Vighetto, 1988).

 

In a recent study Milner, Paulignan, Dijkerman, Michel, and Jeannerod (1999) presented evidence for the dissociation of the two systems in a visual localization task.  They compared the pointing accuracy of a patient suffering from optic ataxia (who can be seen as having a deficient dorsal system) to that of three normal subjects.  All were required to point at one of seven target positions under two conditions: no delay in pointing, and a 5 sec delay.  The normal subjects, as might be expected, were better at pointing when there was no delay, but the optic ataxia patient's pointing errors were greater in the no delay condition than in the delay condition.  The authors note that "The data are consistent with a dual processing theory whereby motor responses made directly to visual stimuli are guided by a dedicated system in the superior parietal and premotor cortices, while responses to remembered stimuli depend on perceptual processing and may thus crucially involve processing within the temporal neocortex."  In other words, the optic ataxia patient lacking a functional dorsal system could make use of her ventral system, which comes into play after a few seconds.

 

Goodale, Milner, and their colleagues have carried out a large number of studies on a visual agnosic patient, DF, who suffered extreme carbon monoxide poisoning that apparently disconnected the V1 input into the inferotemporal cortex.  In other words, this patient is apparently incapable of using her ventral system for analyzing visual input; i.e., she is suffering from an extreme type of visual form agnosia.  Not only can she not recognize faces and objects, but she is incapable of making much simpler discriminations such as between a triangle and a circle.  She is capable of drawing objects fairly well from memory but cannot copy pictures nor can she recognize the objects she has drawn.  But DF appears to have an intact dorsal system, and is capable of carrying out visuomotor activities.  Goodale, Milner, Jakobson, and Carey (1991) reported a study of orientation and size perception on patient DF.  When asked to insert a card into a slot presented at varying angles, she had no trouble in orienting her hand to match the correct orientation in spite of the fact that she was incapable of reporting in any manner what the orientation of the slot was.  As for size perception, she was unable to tell if two small plaques were the same or different widths, nor was she able to indicate the widths of the plaques by adjusting the distance between her index finger and thumb.  Both these tasks were very simple for the two control subjects.  But when DF was asked to pick up the plaques, the aperture between her fingers in preparation for picking up the plaques was highly correlated with the width of the plaques, similar to the control subjects.  In other words, this subject who apparently has an intact dorsal system, but a completely dysfunctional ventral system is incapable of demonstrating perceptual cognizance of the size of the plaques, but when asked to pick them up demonstrates that size information is available to her.

 

Quite a few other studies of DF’s visual and perceptual capacities have been carried out.  She has been shown to possess color vision and can utilize this capacity to recognize natural objects (Humphrey, Goodale, Jakobson, & Servos, 1994).  Utilizing this capacity it was shown that she manifests the McCollough effect, another indication that her visual system is capable of picking up orientation information (Humphrey, Goodale, & Gurnsey, 1991; Humphrey, Goodale, Corbetta, & Aglioti, 1995).  Two studies have demonstrated that she is incapable of utilizing Gestalt principles of organization of shape information (Goodale, Jakobson, Milner, et al., 1994; Milner, Perrett, Johnston, Benson, Jordan, Heeley, Bettucci, Mortara, Mutani, Terazzi, & Davidson, 1991).  Carey, Harvey, and Milner (1996) have shown that DF is capable of grasping tools and utensils quite proficiently but has difficulty in visually recognizing the right part of the object to grab (e.g., handle).  This study also showed that she is capable of responding concurrently to both size and orientation information.

 

Of special interest here are studies of DF’s capacities to adapt to the contingencies of her spatial environment.  She has been shown (Patla & Goodale, 1996) to be able to negotiate obstacles during locomotion as well as control subjects, even though when asked to estimate their height she does this much more poorly than control subjects.  She has also been shown not to differ from normal controls in the effects of the pitch of the visual field on her perceived eye level, but she could not report that pitch, an easy task for the control subjects (Servos, Matin, & Goodale, 1995).  DF is highly proficient at grasping objects when she views them binocularly, but this ability is disrupted when she is allowed only monocular vision (Dijkerman, Milner, & Carey, 1996; Marotta, Behrmann, & Goodale, 1997).  When allowed to move her head during monocular viewing, yielding motion parallax, her grasping improves considerably (Dijkerman, Milner, & Carey, 1999).  It has also been shown that without binocular vision DF manifests serious disruptions in the size-constancy of grip aperture (Marotta, Behrmann, & Goodale, 1997).  The authors suggest that this is due to the fact that she cannot use pictorial cues to assess the objects’ distance, not allowing the further assessment of the object’s size.  In a related study (Humphrey, Symons, et al., 1996) it was shown that DF could discriminate apparent three-dimensional structure and orientation of shapes only on the basis of shading gradient cues and not when the edges were depicted as lines or as luminance discontinuities.  A broader analysis of DF’s abilities to pick up information about space will be undertaken in Section 5.2.1.

 

3.3 Psychophysical Studies 

 

In the search for a dissociation between the ventral and dorsal systems in healthy subjects a fairly large number of psychophysical studies have compared judgmental responses to motor responses to the same stimuli.  The judgmental responses can be seen as mainly based on ventral system function, the motor responses mainly on dorsal system function.  Among the first to carry out such studies were Bridgeman and his colleagues.  They utilized three methods to demonstrate this dissociation, studies of saccadic suppression, studies of induced movement, and studies of the Roelofs effect.  Bridgeman, Lewis, Heit, and Nagle (1979) utilized the phenomenon of saccadic suppression to show that when targets are moved slightly during a saccade, these small displacements are not reportable by either verbal responses or button presses, while both eye-movements and pointing behavior are influenced by the change in location.  Bridgeman, Kirch, and Sperling (1981) showed that the induced movement illusion affected verbal reports, but the pointing responses were veridical.  In a related study Wong and Mack (1981) utilized the induced movement illusion to cause the target to be reported as moving in the direction opposite to its actual movement.  In contrast, the subjects’ eye-movements followed the actual movement direction and not the illusory direction.  When a delay was introduced the eye-movements followed the illusory displacement, suggesting that memory of the movement was stored in the ventral system.

 

Smeets and Brenner (1995) carried out a study that led them to propose that the findings of Bridgeman et al. (1981) resulted not from the dissociation of perception and action systems, but rather from independent processing of velocity and position.  In response Bridgeman has recently demonstrated that similar dissociations between the two systems can occur with stationary stimuli, utilizing a phenomenon known as the Roelofs effect (Bridgeman, Peery, & Anand, 1997).  This effect causes target position to be misperceived when it is surrounded by a frame presented asymmetrically.  Targets tend to be misperceived in the direction opposite to the offset of the frame.  When no delay was introduced between stimulus exposure and the cue to either make a judgment or point to where the target had been, all 10 subjects evidenced the effect in their judgments, but five did not with the pointing response.  Thus, at least for some of the subjects the surrounding frame did not affect the motor response.  In the 4- or 8-sec delay conditions this dissociation was not found, all subjects showing the effect also with the pointing response.  This, once again, suggests that the dorsal system has a very limited short-term memory.  In a subsequent study Bridgeman and Huemer (1998) used an auditory cue immediately prior to a motor response in a Roelofs effect setup.  The auditory cue indicated which of two targets should be jabbed.  In spite of the fact that the motor response was preceded by a cognitive analysis of the auditory cue, the motor response was not susceptible to the Roelofs effect, indicating that a prior cognitively processed cue can still prime the dorsal system response.

 

Several studies have compared verbal responses and motor responses in the perception of distance.  Some of these have focused on short distances, where the motor responses have usually been reaching movements, while others have focused on somewhat longer distances, where the motor responses have been pointing or walking (without vision).  Gentilucci and Negrotti (1994) studied exocentric distance4 perception using two response methods, a pointing response and a visual reproduction response.  The stimuli were presented frontally and close to the subjects with the distances between them ranging between 5 and 17.5 cm.  The two response modes yielded different patterns of constant errors, with those for the pointing responses decreasing with distance and those for the reproduction increasing.  These results led the authors to conclude that their findings “support the hypothesis that perception and visuo-motor transformations are two separate processes”.  In a second study these researchers (Gentilucci & Negrotti, 1996) required subjects not to reproduce the distance but to reproduce a double distance.  Here the results were similar for both response modes, indicating that the doubling instruction involved the ventral system for both response modes.  Related findings have been reported by Pagano and Bingham (1998) who studied the monocular perception of egocentric distance given by optic flow generated by head movements towards a target.  Two response measures were used to assess the perception of distance, verbal reports and reaches.  It was found that verbal and reaching errors were uncorrelated, leading, once again, to the suggestion that this was due to the independent functioning of the two systems.

 

Other studies have looked at distance perception for distances beyond arm’s reach.  Some of these studies have used judgmental estimates of distance, usually egocentric distance, while others have used motor responses to distance such as blindfolded walking or pointing.  The studies using judgmental estimates have yielded inconsistent results, in some cases yielding quite veridical estimates, but in other cases yielding quite systematic underestimates (see review in Bingham & Pagano, 1998).  In contrast, the studies using motor responses have yielded veridical distance responses.  Among these are studies by Loomis and his colleagues (see review in Loomis, Da Silva, Philbeck, & Fukusima, 1996) who compared blind walking to distance estimates of distances up to 12 m.  For example, in one experiment (Loomis, Da Silva, Fujita, & Fukusima, 1992, Exp. 2) frontal exocentric distances and sagittal depth-interval distances were shown to be judged quite differently, with the sagittal intervals having to be set 50% to 90% longer than the frontoparallel intervals in order to appear equal.  In contrast, blind walking to the endpoints of the two types of intervals yielded equal responses.  Loomis et al. (1996) ascribe these differences to a dissociation between egocentric (sagittal) and exocentric (frontal) distance perception.  This claim can be interpreted in terms of the two visual systems, where the dorsal system deals with egocentric measures and the ventral system with exocentric (or relative) measures.  Thus, the estimates differ because the dorsal system is less involved in the frontal estimates than in the depth-intervals, while the walking responses rely in both cases mainly on the dorsal system.

 

Dissociations in the perception of size have also been examined in many recent studies comparing motor and judgmental responses to stimuli presented within the context of well-known visual size illusions.  These studies have yielded conflicting results, possibly related to the lack of an adequate understanding of the processes underlying these illusions.  In an early, much cited, study Aglioti, DeSouza, and Goodale (1995) utilized the Ebbinghaus (or Titchener) illusion, where the reported size of a central circle is influenced by the circle-size of a group of circles surrounding it.  In their study these researchers replaced the drawing of the inner circle with a thin poker-chip like token.  When asked to judge the size of the target tokens the subjects manifested the illusion throughout the experiment, but when asked to manually pick up the central target token, manual grip size during the grasping movement was much less influenced by the illusion.  This was seen to indicate that the ventral system is influenced by the illusion and the dorsal system is not.

 

Haffenden and Goodale (1998) replicated the findings of the Aglioti et al. (1995) study, adding further control conditions, such as not letting the subject view her hand as it moved (open-loop conditions) and having the subjects indicate the judged size with a manual response of the distance between the thumb and forefinger.  Marotta, DeSouza, Haffenden, and Goodale (1998) also replicated the findings of no or little illusion with a motor response in a study that compared binocular and monocular presentations of the illusion (see Section 5.2.1).  In another recent study (Westwood, Chapman, & Roy, 2000) that compared pantomimed and natural actions, these findings were also replicated.  A study by Franz, Gegenfurtner, Buelthoff, and Fahle (2000), however, did not replicate these findings, reporting very similar effects of the illusion on both perceptual judgments and grip apertures.  These researchers point out that in the previous studies the perceptual judgments were carried out by comparing two circles, one surrounded by small circles, the other by large circles.  In contrast, the manual responses were made towards only one of the circles.  They show that when the illusion’s perceptual effects are studied with single-circle presentations there are no differences between the two types of responses.  Pavani, Boscagli, Benvenuti, Rabufetti, and Farne (1999) have also reported similar results.  Haffenden and Goodale (2000) have recently suggested that the discrepancy between the results of these two studies and theirs is due to the size of the gaps between the central and surrounding circles used in the latter two studies.  To add to the current confusion, Donkelaar (1999) has shown that a different motor response, a pointing response, is affected by the Ebbinghaus illusion.

 

Judgmental and motor responses have also been compared with other visual size illusions.  Post and Welch (1996) utilized an open loop reaching task with the Müller-Lyer and two other illusions.  In the case of the Müller-Lyer illusion they did indeed find that the illusion did not affect the reaching responses but did affect the judgments.  But in two additional experiments using other illusions they demonstrated that these results need not be explained in terms of a dissociation between the two systems, and can be seen to depend on the subjects’ egocentric localization.  In a study that looked only at motor responses Gentilucci, Chieffi, Daprati, Saetti, and Toni (1996) studied pointing responses to a vertex of the Müller-Lyer figure.  There were four conditions: full vision of the stimulus and the pointing hand, vision of the stimulus but not of the hand, no vision of either (0 sec delay), and no vision of either with a 5 sec delay before pointing.  The illusion had an effect in all conditions, but it was relatively small in the full vision condition, and increased in size over the other three conditions.  In other words, the more the pointing was based on memory the greater the effect of the illusion.  In terms of the two visual systems these results indicate a growing reliance on the ventral system as memory became more and more involved.  In a subsequent study (Daprati & Gentilucci, 1997) the motor reaching task was supplemented by two tasks of length reproduction.  Grip aperture for the length of a Müller-Lyer shaft was influenced by the illusion but this effect was smaller than that found with the two reproduction tasks.

 

Brenner and Smeets (1996) utilized a converging line variant of the Ponzo illusion to examine its effects on grasping responses.  Disks were placed on the background that yields the illusion and subjects were asked to lift them.  These researchers also found that grip aperture was not influenced by the illusory size, but they did show that the illusion did influence the force used to lift the disks. More force was applied to the perceptually larger disks.  Similar results have been reported by Jackson and Shaw (2000).  In a recent study Ellis, Flanagan, and Lederman (1999) compared verbal estimates and grasping responses for the center of a steel bar placed on two illusory backgrounds: the same variant of the Ponzo illusion as used by Brenner and Smeets, and for the Judd illusion (a variant on the Müller-Lyer illusion, where both arrows point in the same direction).  They found that the two illusions affected both types of responses but the errors in the grasping responses were significantly smaller than in the verbal estimates.  They see these results as indicative of a partial dissociation between the two systems.  But Mon-Williams and Bull (2000) have recently reported a study that appears to show that the Judd illusion results "may be due to occlusion of the illusory background during the transport phase of the movement."

 

Servos, Carnahan, and Fedwick (in press) have reported similar results to those of Aglioti et al. (1995) for another size illusion, the horizontal-vertical illusion.  In this illusion two equal-length lines are presented as an inverted "T" (⊥), but the vertical line is perceived to be considerably longer.  The illusion affected subjects' judgments but did not affect their grip aperture.  Vishton, Rea, Cutting, and Nuñez (1999) also studied the horizontal-vertical illusion in a set of four experiments.  While the results of their first experiment are similar to those of Servos et al. (in press), the second and third experiments showed that when subjects directed their perceptual judgments to only a single element (line) in the display, their judgments were as accurate as in the motor response.  Their fourth experiment further showed that when the grip response requires taking both elements into account it is as susceptible to the illusion as the judgmental response.  These findings led the authors to suggest that the differences found in studies of this type are "best described as a dissociation between relative and absolute size perception, rather than a dissociation between perception and action."  Recalling that dorsal system responses to visual size are normally based on absolute size, while ventral system responses are normally based on relative size, these findings are consistent with the general claim of differential processing by the two systems, but also show that both systems can mimic the functions of the other when this is called for.

 

The studies of distance perception reviewed above appear to strengthen the hypothesis of the dissociation of the two visual systems, but the results of the studies on size perception are somewhat equivocal and difficult to interpret.  Perhaps the reason for this difference is that in the studies of distance perception the subjects were requested to carry out more natural and more ecologically valid tasks than those in the studies of size perception, all of which utilized size illusions.  When faced with a novel task utilizing a visual illusion the ventral system might at times override the functions of the dorsal system.  Perhaps a better way to study the dissociation between the two systems in the perception of size would be to use techniques like those of Warren and Whang (1987) described below (see Section 4.1).

 

Two studies have extended the range of the applicability of the two systems notion.  These studies appear to indicate that the dissociation can also be meaningful for much more distant stimuli than those used in the laboratory studies reviewed above.  Proffitt, Bhalla, Gossweiler and Midgett (1995) had subjects judge the inclination or steepness of hills, both out of doors and in a simulated virtual environment.  The angle judgments were obtained with three response measures: verbal estimates, adjustments of a representation of the hill's cross-section, and haptic adjustments of a tilt board with an unseen hand.  The first two measures yielded large overestimations of hill incline while the haptic adjustments were close to the veridical.  They “propose that the radically different pitch estimates obtained with verbal and visual reports versus haptic adjustments are due both to the dissociation in the visual pathways that inform these two sorts of responses and to the calibration mechanisms that coordinate their functioning”.  In a subsequent study Creem and Proffitt (1998) examined the effects of delays between viewing the hills and responding both verbally and haptically.  With short delays the haptic responses remained veridical as in the previous study, but with longer delays they were seen to be influenced by the ventral system.  It should be noted that the short delays in this study were considerably longer than those used in the previous studies (see above), reaching two minutes in comparison with only a few seconds in the earlier studies.  The authors suggest that the length of the dorsal system memory might relate in some way to the amount of time necessary to carry out the motor task in question.

 

Recent studies have presented evidence for the dissociation of the two systems in other domains.  Neurophysiological findings indicate that the ventral system receives its main input from the central portions of the retina while the dorsal system is attuned to the entire retina, leading Goodale and Murphy (1997) to test the hypothesis that judgmental responses would be more affected by retinal eccentricity than motor responses.  They asked subjects to carry out two tasks, a grasping task and a categorization task, using blocks of different sizes at 5° to 70° in the periphery.  They found that in the grasping task the correlation between maximum aperture and block size is maintained in the far periphery, although the amplitude of the grasp increases with eccentricity.  In contrast, the categorization judgments decreased with eccentricity.  More important, the variability of the grasp size did not increase with eccentricity as it did with the categorization judgments.  Goodale and Murphy see these results as indicating that dorsal system motor responses to peripheral inputs are much more reliable than perceptual judgments of peripheral stimuli.

 

Dijkerman and Milner (1998) recently examined subjects’ ability to discriminate the orientation of a square plaque tilted in depth, using two modes of response, grasping and perceptual matching.  While both response modes yielded high correlations between tilt and the response extent, there were differences between the matching and grasping data.  The grasping data yielded a linear function, while the matching data showed a consistent curvature.  The authors ascribe these differences to the operation of the two different systems in the analysis of orientation in depth.  The dorsal system requires information about the absolute stimulus properties leading to the linear function, while the ventral system can do with more categorical information for processing the relative orientation, yielding the curved function that somewhat de-emphasizes the differences between the orientations close to either the horizontal or the vertical.  This study also compared monocular and binocular viewing, but no differences were found.  Other studies of a similar nature have found differences between binocular and monocular viewing (see Section 5.2.1).

 

Finally, if it is true that the two systems function independently and that the dorsal system functions can be carried out with little or no conscious awareness, it is possible that the two systems will be capable of simultaneously processing two different sources of visual information with very little interference.  Does any evidence exist for the possibility that subjects can carry out two tasks simultaneously, one dorsal in nature and one ventral, without interference between them?  Ideally such an experiment would consist of requiring subjects to identify a visual stimulus presented foveally and at the same time give a motor response to a visual stimulus presented to the visual periphery.  Little research of this exact nature has been carried out.  While there is a very extensive literature on “dual-task” performance, it invariably emphasizes the allocation of attention between two tasks of a ventral nature.

 

Among the very few relevant studies is a study by Castiello, Paulignan, and Jeannerod (1991).  These researchers examined the timing of responses to the sudden displacement of a visual object, comparing a grasping response with a simple vocal utterance (Tah!).  They found the mean vocal response latencies to be 420 ms, more than 300 ms after the motor response.  What is relevant to the question at hand is the comparison of the results of the testing of both responses simultaneously and the control experiments where each response was examined separately.  The results showed that the latencies of both types of response when executed alone were not any faster than those executed simultaneously.  In other words, the dorsal grasping response and the ventral vocal response did not interfere with each other.  But this study focused on response times of the two systems to a sudden and singular change in stimulation.  It did not really deal with the question as to whether two continuous tasks, one dorsal and the second ventral, can be undertaken simultaneously without detriment to the performance of each.

 

Deubel, Schneider, and Paprotta (1998) examined this question in a study aimed at examining the sharing of attention between a ventral and a dorsal task.  They utilized a dual task paradigm where the primary task was a reaching response for a designated location and the secondary task called for the discrimination between “E” and “$”.  The reaching response was seen by the authors to be carried out by the dorsal system and the discrimination response by the ventral system.  The results indicated superior performance when the discrimination task appeared at the same location as the aim point of the reaching response.  These results were interpreted as arguing “for an obligatory coupling of (ventral) selection-for-perception and (dorsal) selection-for-action.”  While this study would appear to yield a negative answer to the possibility of independent functioning of the two systems, it might be argued that a different interpretation is possible.  The aim point for each reach response was changed between trials.  The subjects were informed where to reach by a pointing triangle whose direction specified the side to reach and whose color specified the exact location.  According to the depiction of the two systems presented here, identifying the direction of the triangle and its color are both ventral system responses.  Thus, it can be argued that the dorsal response used in their study also activated the ventral system.  But it should also be pointed out that Bridgeman and Huemer (1998) (see above) showed that dorsal motor responses can follow from decisions based on ventral activity.

 

Ho (1998) recently reported a study that appears to indicate that ventral and dorsal system tasks can be undertaken simultaneously without detriment to the performance of each.  The two tasks were a motion processing task and an RSVP (rapid serial visual presentation) letter-recognition task.  The motion-processing task was presented in an annulus that surrounded the area where the RSVP task was presented.  The motion stimulus was ambiguous in that it could be interpreted by the subjects as either rotating clockwise or counterclockwise, depending on whether they employed a second-order (texture-defined) motion algorithm or a third-order (pattern-tracking) motion algorithm.  The participants split into two groups, depending on their natural tendencies to see either second- or third-order motion.  Briefly, the findings showed no interference between second-order motion perception and the letter-recognition task, but interference was found between third-order motion perception and letter recognition.  Second-order motion tasks are thought to be processed by the dorsal system (e.g., O’Keefe & Movshon, 1998), and these were shown not to interfere with the ventral letter-recognition task.  Ho suggests that third-order motion processing requires ventral processing, but there does not seem to be any study corroborating this.

 

3.4   Contrasting the two systems

 

To summarize the discussion of the two visual systems, let me briefly list some of the differences between them:

 

3.4.1 Function

 

While both systems analyze the visual input, this analysis is carried out for different purposes.  The primary function of the ventral system is the recognition and identification of the visual input.  Recognition and identification must depend on some comparison with some stored representation.  In contrast, the primary function of the dorsal system is analysis of the visual input in order to allow visually guided behavior vis-à-vis the environment and objects in it (e.g., pointing, reaching, grasping, walking towards or through, climbing, etc.).  While these are the primary functions of the two systems, it would seem that they also participate in other functions.  Thus, for example, the dorsal system would seem to be involved in the identification of moving objects, while the ventral system has capacities that parallel those of the dorsal system, such as size perception, albeit a somewhat different type of size perception.

 

3.4.2 Sensitivity

 

The two visual systems differ with respect to their sensitivities in the spatial and the temporal domains.  The ventral system is more sensitive to high spatial frequencies while the dorsal system is more sensitive to high temporal frequencies.  In other words, the ventral system is superior at seeing fine details, while the dorsal system is better at seeing motion.  Comparing the two systems with respect to contrast sensitivity we find that the dorsal system has the higher contrast sensitivity, i.e., it responds to very low contrasts at relatively coarse spatial frequencies.  Some qualifications are needed here as well: there is evidence that certain complex motions are processed by the ventral system (e.g., Ferrera, Rudolph, & Maunsell, 1994).  It is also clear that the dorsal system responds to static shapes, albeit in less detail; witness the ability of DF to shape her grasp to fit the shape of an object before touching it.

 

3.4.3 Memory

 

The ventral system is the memory-based system, utilizing stored representations to recognize and identify objects and events.  In contrast the dorsal system appears not to have a long-term storage of information, but only very short-term storage allowing the execution of the motor behavior in question.  Presumably the duration of this short-term memory varies with the motor behavior in question, being shorter for reaching and grasping movements than, say, walking through some aperture such as a door.5

 

3.4.4 Speed

 

Of the two visual systems, the dorsal system is the faster.  This statement is based on both physiological and psychophysical evidence.  The dorsal system receives magnocellular input while the ventral system receives a great deal of parvocellular input as well as magnocellular input, and the magnocellular system has been shown to respond faster than the parvocellular system.  Psychophysical studies have also shown this to be the case, where motor responses to sudden visual changes were found to be considerably faster than verbal responses to those same changes.  It should be noted, however, that there are perceptual activities that clearly include a ventral component, such as reading, that appear to be carried out with extreme speed.

 

3.4.5 Consciousness

 

It is probably fair to say that, in the normal everyday functioning of the two systems, we are much more conscious of ventral system functioning and hardly conscious of dorsal system functioning.  Evidence for this comes from all the psychophysical studies reported above of the dissociation of the two systems where subjects report awareness of the ventral processing, but simultaneously manifest different dorsal processing.  What is more, patient DF described above is capable of carrying out visuomotor tasks with the aid of her dorsal system, but is unaware of the features of the stimuli that made the carrying out of those tasks possible (see Milner, 1995, 1997).  But there also exist examples of apparent awareness of dorsal system functions and of the opposite, unconscious ventral functions (see Section 5.1 below).

 

3.4.6 Frame of reference and metrics

 

Both visual systems process information about objects in our environment, but for different purposes.  Ventral system functions aim at recognizing and identifying the object and for this purpose all that is needed is object-centered information.  In other words, the ventral system utilizes an allocentric frame of reference.  In contrast, the dorsal system must perform some action on, or in relation to, the object, such as grasping it.  For this purpose it needs to know the dimensions of the object in body-centered terms; e.g., how large should the gap between the thumb and forefinger be in order to pick up that block.  Thus, the dorsal system must utilize an egocentric frame of reference.  In order to be able to pick up the object the dorsal system must utilize absolute metrics, while functions of the ventral system only require relative metrics.

 


 

3.4.7 Visual input

 

Two aspects relating to the sensitivity to visual inputs differentiate the two systems.  The ventral system is mainly attuned to foveal or parafoveal visual input.  Its sensitivity falls off sharply with retinal eccentricity.  In contrast, the dorsal system (with its magnocellular inputs) is much less affected by retinal eccentricity.  The two visual systems also appear to differ in their ability to cope with a transition from normal binocular vision to monocular vision.  While dorsal system function suffers considerably when forced to rely solely on monocular vision (without concomitant motion parallax), the ventral system is much less affected (see Section 5.2.1).

 

3.4.8 Similarities and synergistic interactions

 

The points above have all dealt with differences between the two systems, but it should also be mentioned that the two systems appear to perform many ostensibly similar functions, albeit for quite different purposes and using quite different mechanisms.  Thus, for example, both systems deal with object shapes, sizes, and distances.  A more detailed look at the parallel processing of size information will appear below.  What is more, it should also be noted that in normal, non-brain damaged, people the two systems obviously function synergistically.  Thus, when one picks up a hammer, the control and monitoring of the actual movements is by the dorsal system but there also occurs intervention of the ventral system that recognizes the hammer as such and directs the movement towards picking up the hammer by the handle and not by the head.

 

4.  Making Connections

 

Having reviewed some of what is known about the two visual systems, I should now like to return to the two theoretical approaches and look at them once again pointing to parallels between the ecological approach and the dorsal system and between the constructivist approach and the ventral system.  In addition I will present a few examples of the research carried out under the aegis of each approach trying to show how the methodology employed is commensurate with the functions of the system in question.

 

4.1 Ecological theory and research and its relation to the dorsal system

 

Towards the end of his Ecological Approach Gibson proposes “a redefinition of perception”:

“Perceiving is an achievement of the individual, not an appearance in the theater of his consciousness.  It is a keeping-in-touch with the world, an experiencing of things rather than a having of experiences.  It involves awareness-of instead of just awareness.  It may be awareness of something in the environment or something in the observer or both at once, but there is no content of awareness independent of that of which one is aware.” (1979, p. 239).

In this redefinition we discern Gibson’s conception of the perceiver as active.  Perception is an achievement, a keeping-in-touch, not a passive experiencing of one’s conscious responses to stimulation.  This view contrasts with the constructivist perspective of a perceiver who passively examines her/his conscious awareness of the stimulation impinging on her/his senses.  This view of perception as resulting from an active perceiver is, of course, consonant with what we know about dorsal system functions.  It is the system that picks up information for or through action.  The notion of an active perceiver will be dealt with again in what follows, but first an examination of Gibson’s claims concerning “awareness” appearing in the above passage.

 

Gibson makes a distinction between the “content of awareness” and “awareness-of”, and I would suggest that the former might be equated with what is usually called “consciousness” and the latter refers to the pick up of information about our environment.  This dissociation between the usages of awareness and consciousness becomes clearer as one reads on.  Gibson is more specific in his “Summary of the theory of pickup” when he writes: “The term awareness is used to imply a direct pickup of information, not necessarily to imply consciousness.” (1979, p.250).  When discussing what are clearly cognitive processes such as conveying information through speech and language, Gibson writes:

“Knowledge that has been put into words can be said to be explicit instead of tacit.  The human observer can verbalize his awareness, and the result is to make it communicable.  But my hypothesis is that there has to be an awareness of the world before it can be put into words.  You have to see it before you can say it.  Perceiving precedes predicating.” (1979, p. 260).

Recalling that “awareness of” in the above passage need not imply consciousness, it appears that Gibson is implying that perception, or to remain consistent with my usage, pick up of information precedes conscious awareness.  This interpretation is bolstered by a sentence from the passage on size perception quoted above (see Section 2.2):  “The implication of this result, I now believe, is that certain invariant ratios were picked up unawares by the observers and the size of the retinal image went unnoticed” (1979, p. 160). 

 

The implication then is that Gibsonian pickup of information involves little or no consciousness.  This is consistent with the understanding of the functioning of the dorsal system where conscious awareness plays a very minor or no role at all.  But how does this claim of lack of conscious awareness jibe with our phenomenal experience of clearly being conscious of all aspects of our environment, including, say, the size of objects in it?  There are two somewhat speculative answers to this question.  One is that the ventral system has the ability to monitor the dorsal system by bringing into conscious awareness the relevant information picked up.  This, it is suggested, normally only occurs when that information is insufficient for the execution of some action, or when there is some sort of conflicting information in the stimulus situation.  The other is that the ventral system has its own parallel mechanism for perceiving the environment.  Thus, in the case of size perception while the dorsal system would be engaged in picking up size information in body-scaled terms enabling motor interaction with the object in question, the ventral system would be engaged in perceiving size in relative, object-centered, terms enabling better recognition of that object and its comparison with other objects.  Of course, a very viable possibility is that both these occur together.

 

In the initial brief review of the ecological approach above (see Section 2.2) the concept of affordances was introduced.  This concept can also be seen to tie in with the idea of dorsal system processing.  As was noted above Gibson’s examples of affordances include “climb-on-able”, “sit-on-able”, and others.  All these require some action by the observer, climbing, sitting, etc.  Gibson notes that “the affordances of things for an observer are specified in the stimulus information.  They seem to be perceived directly because they are perceived directly” (1979, p. 140).  Following from the previous discussion it is then suggested that affordances are picked up with little or no conscious awareness.  This idea also ties in with what we know about the dorsal system.  In the review of the neuropsychological evidence for the dissociation of the two visual systems above (see Section 3.2) the studies by Goodale, Milner, and their colleagues on patient DF were reviewed.  This patient was shown to be able to perform visuomotor tasks without being able to report anything about the stimuli that she manipulated or reacted to.  It was suggested that this patient’s ventral system was disconnected and she relied totally on her intact dorsal system.  Much of the initial research on this patient focused on her ability to grasp objects, and Gibson also touches upon the affordance of graspability:

“To be graspable, an object must have opposite surfaces separated by a distance less than the span of the hand.  A five-inch cube can be grasped, but a ten-inch cube cannot (Gibson, 1966b, p. 119).  A large object needs a “handle” to afford grasping.  Note that the size of an object that constitutes a graspable size is specified in the optic array.  If this is true, it is not true that the tactual sensation of size has to become associated with the visual sensation of size in order for the affordance to be perceived.” (1979, p.133).

The last sentence is, of course, a gibe at Berkeleyan empiricism, one of the forerunners of Helmholtzian constructivism.  More to the point is the fact that Gibson’s description of the affordance of grasping is consistent with the findings concerning patient DF, who is capable of picking up the size, shape, or orientation information concerning an object without conscious awareness and utilizing that information to act upon the object.

 

To sum up, the concept of affordances serves to tie together the connection between the visual information in the ambient array and the actions taken by the observer with respect to the objects in that array.  This tie between perception and action fits in nicely with what we know about the functions of the dorsal system, a system that picks up information relevant for actions.  Gibson reiterates the connection between perception (information pickup) and action many times in his book.  For example, when comparing knowledge and perception he writes:

"The direct perception of a distance is in terms of whether one can jump it.  The direct perception of a mass is in terms of whether one can lift it.  Indirect knowledge of the metric dimensions of the world is a far extreme from direct perception of the affordance dimensions of the environment. Nevertheless they are both cut from the same cloth" (1979, p. 260).

Thus, Gibson is saying that the direct perception of the affordances of objects enables the organism to act appropriately with regard to those objects, and that this occurs without any mediational mechanisms such as recognition of the object.  Some of Gibson’s writings on this topic have been criticized as indicating that objects are recognized directly.  Recognition without recourse to representations in memory is indeed hard to fathom.  While Gibson was very explicit in stating that “To perceive an affordance is not to classify an object” (1979, p.134), some of his statements are indeed problematic.  Examples are his writing that apples afford eating or postboxes afford letter mailing (1979, p.139).  What is more, in the beginning of his chapter on affordances he writes: “This is a radical hypothesis, for it implies that the ‘values’ and ‘meaning’ of things in the environment can be directly perceived” (1979, p.127).  In terms of the dual-process approach proposed in this paper, it is suggested that only what Neisser (1989) labeled "physical affordances" (see Palmer, 1999, p. 411) are perceived directly.  These are only the functional properties of objects and not their "meanings".  In other words, when one directly picks up the affordance of a chair, one does not directly recognize it as a type of furniture labeled "chair", but rather one directly picks up the information that that object contains a surface on which one can sit.  In a similar manner, it is suggested that rather than saying that the postbox affords letter-mailing, it would be better to say that the slot in the postbox affords inserting an object of appropriate size and shape (note 6).

 

In the previous review of the ecological approach (Section 2.2) it was pointed out that it, in contrast to the constructivist approach, does not deal in any depth with the processes underlying perception.  This is not simply an omission on Gibson’s part.  It stems on the one hand from Gibson’s dissatisfaction with the mentalistic mediational processes invoked by the constructivists, but more importantly, to my mind, from his very different conceptualization of the underlying processes of perception.  The only allusion to something resembling underlying processes is to resonance or to attunement (see Section 2.2).  But resonance is not really a “process” in the sense of a taking-into-account constructivist process.  A body or system resonates to some impinging energy due to its internal structure; it does not process that energy in any way.  What is more, resonance does not depend on memory other than the built-in features that resonate to something.

 

It is in this sense that Gibson prefers to talk of a perceptual system that functions without recourse to memory.  It is not a cognitive mechanism that is called up when a familiar stimulus occurs.  Presumably the Gibsonian perceptual system picks up invariants in the ambient array by resonating to the features of that array.  No “cognitive” memory mechanisms, such as, say, schemata, need be invoked.  How does such a conception match what we know about the functions of the dorsal system?  First of all, it is claimed that the dorsal system has no representational memory to speak of, certainly nothing more than a few seconds or minutes to allow some action to be performed.  Thus, the lack of memory posited by the ecological approach matches what is known about the dorsal system.  What of the concept of resonance?  Is there any way in which the functioning of the dorsal system can be said to be resonating to the visual input reaching it?  An attempt at an initial answer to this question will be made below (see Section 5.2.1).

 

It is enlightening to compare the research methods used by those adhering to the ecological approach with those used by adherents of the constructivist approach.  To this purpose I will briefly describe some studies of visual size perception in this and in the following section, each carried out in the “tradition” of one of the two approaches.  The emphasis will not be on the results of these studies but more on the methods, aiming to show that the methods chosen are appropriate for the study of the visual system of relevance to each approach.  Relatively little research on visual perception of size has been carried out by ecologically oriented researchers, as they have preferred to focus on the haptic perception of size.  It is probably not fortuitous that this group has chosen to study haptic perception (see e.g., Turvey, 1996), as the sense of touch requires a great deal of motor behavior that is controlled by the dorsal system.  What is more, the haptic system is much less representational in its nature than vision.  In their studies of haptic perception of size (e.g., Barac-Cikoja & Turvey, 1991, 1993, 1995) subjects were required to assess the size of gaps between two blocks by wielding unseen rods.  This is not a “judgmental” response about size, but a motoric manipulative response where the subject adjusts the gap between two visually presented blocks to be equivalent to the felt size of the gap.  In other words, an attempt is made to limit the involvement of judgmental or ventral mechanisms.  These researchers succeeded in arriving at an equation that depicts the very systematic relations between haptic perception of size and the physical parameters of stimulation.  Importantly, that equation only contains physical measures of the rod wielding, without “mentalistic” conceptualizations such as “taking distance into account” (see Barac-Cikoja & Turvey, 1995).

 

One study by this group did investigate the visual perception of size.  Garrett, Barac-Cikoja, Carello, and Turvey (1996) sought parallels between visual and haptic perception of size.  Could a similar equation to that found for the haptic perception of size be found for vision?  The method used to study visual size perception was based on the method used to study haptic perception.  Pairs of blocks were placed at one of three distances from the observer.  The gaps between the blocks were adjusted to one of three gap sizes and the subject had to match the seen gap with a manual motor response of adjusting the gap between a near pair of blocks to the observer’s left.  The subjects were allowed to look back and forth between the far and near displays.  The experimental method in this study differs from the constructivist size perception experiments to be described in the next section.  Subjects were given a binocular view and allowed to move their heads and no time limitations were imposed.  What is more, they responded with a motor response rather than a judgmental response.  All these conditions, it is suggested, are conducive to inducing dorsal system function in preparing the response to the gap size.

           

            Quite a few studies have attempted to test and validate Gibson’s concept of affordance.  These have been carried out, of course, in the Gibsonian tradition and dealt with such topics as the affordances of stair-climbing, sitting, or ball-catching, among others.  One especially interesting study (Warren & Whang, 1987) focused on the affordance of apertures for walking-through.  The first experiment in this study, in my estimation, is the most direct examination of pick-up of size information by the dorsal system.  The size information picked up was the width of an aperture the subjects had to walk through.  The subjects were asked to walk through apertures of differing widths and the extent of their shoulder rotation was measured.  As might be expected these authors found that the smaller the aperture the greater the shoulder rotation.  In order to better understand this relation they chose two groups of subjects, one large (taller and broader shoulders) and one small.  When the relation between aperture width and rotation was plotted for each of these two groups separately, it was found that the two groups yielded parallel but distinct functions with the large group rotating their shoulders to a greater extent for each aperture size.  But when rotation angle was plotted not as a function of aperture width but of the ratio of aperture width to shoulder width the functions overlapped.  This was seen by the authors as evidence that aperture width is picked up in body-scaled terms.
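The body-scaling argument can be made concrete with a small numerical sketch.  The shoulder and aperture widths below are hypothetical values chosen purely for illustration; they are not data from Warren and Whang (1987), and the function name is likewise an assumption.  The sketch shows only that one and the same absolute aperture corresponds to different aperture-to-shoulder ratios for a large and a small walker, whereas apertures scaled to each walker's shoulders yield the same ratio.

```python
def body_scaled_ratio(aperture_cm, shoulder_cm):
    """Aperture width expressed in units of the walker's shoulder width."""
    if shoulder_cm <= 0:
        raise ValueError("shoulder width must be positive")
    return aperture_cm / shoulder_cm


# Hypothetical shoulder widths (cm), for illustration only.
LARGE_SHOULDERS = 48.0
SMALL_SHOULDERS = 40.0

# The same absolute aperture gives different body-scaled ratios.
same_aperture = 60.0
ratio_large = body_scaled_ratio(same_aperture, LARGE_SHOULDERS)
ratio_small = body_scaled_ratio(same_aperture, SMALL_SHOULDERS)

# Apertures set to 1.5 shoulder widths for each walker give the same ratio,
# even though the absolute widths (72 cm vs. 60 cm) differ.
scaled_large = body_scaled_ratio(1.5 * LARGE_SHOULDERS, LARGE_SHOULDERS)
scaled_small = body_scaled_ratio(1.5 * SMALL_SHOULDERS, SMALL_SHOULDERS)

print(ratio_large, ratio_small)
print(scaled_large, scaled_small)
```

On this reading, plotting a response against the ratio rather than against absolute width is what would allow the functions of differently sized groups to coincide.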

 

Why is this a direct examination of pickup of size information by the dorsal system?  Because the subjects were required to act vis-à-vis a given stimulus situation, i.e., a given width of the aperture.  They were not required to make any perceptual judgments that would have involved the ventral system in the task.  The task occurred over time and the subjects were not limited in any way in time or space in performing the task, with the exception that there was a fast walking condition.  In contrast, in the second and third experiments in this study subjects made passability judgments about the apertures without actually walking through them.  The second experiment compared such judgments in two conditions, static, with a reduction screen, and moving, allowing head movements.  The authors point out that “the results of the two studies do not offer striking convergence between the two tasks [walking and judging]”, with the subjects in the second study judging narrower apertures as passable.  This, I would suggest, is due to the fact that in the first experiment dorsal system pickup of size information was mainly involved, while the second experiment entailed a combined effort of both systems, with the ventral system playing the major role.  The reason for the latter claim is that no significant differences were found between the static and moving conditions, and I will try to show that the dorsal system relies quite heavily on movement (see Section 5.2.1 below).

 

The third experiment also utilized passability judgments.  Its purpose was to compare a condition with a normal flat floor with one in which the floor was raised a bit.  The latter condition biased the pickup of eye-level plane information and yielded the expected overestimation of aperture width.  The subjects were also asked to give distance estimates, and raising the floor did not bias these.  The authors suggest that “this casts doubt on the explanation that the shift in passability judgments is due to a shift in the perceived absolute distance of the aperture”.  While the latter interpretation is a clear possibility, these results can also be interpreted in terms of the distinction between perceived and registered distance (see Section 5.2.3).  In a word, there is a difference between the reported perceived distance (ventral) and that picked up by the dorsal system.

 

4.2     Constructivist theory and research and its relation to the ventral system

 

In contrast to the theoretical concepts and experimental methods of the ecological approach, outlined in the previous section, those of the constructivists parallel what we know about the functions of the ventral system.  In this section I will try to point to some of these parallels.  For example, at the beginning of his book Rock (1983) discusses perceptual theories and says:

“… a summary statement of the kind of theory I propose to advance in the remainder of the book.  My view follows Helmholtz’s (1867) that perceptual processing is guided by the effort or search to interpret the proximal stimulus, i.e., the stimulus impinging on the sense organ, in terms of what object or event in the world it represents, what others have referred to as the “effort after meaning”.” (p. 16)

In other words Rock is conceiving of perception as an effortful, but unconscious, attempt at identifying an object or event.  As was pointed out above, it is the ventral system that has the capacity to identify objects and events.  Identification must be based on some information stored in some representational system.  Once again, it is only the ventral system that has a representational memory; the dorsal system has been shown to lack more than a very brief memory needed to carry out some given action.

 

In his characterization of theories of perception, Rock suggests three types of theory.  One he labels “stimulus theory” which is akin to some of Gibson’s earlier thinking, and two versions of “constructive theories”.  One he labels “spontaneous interaction theory” where “the determinant of perception is not the stimulus but spontaneous interactions between the representations of several stimuli or interaction between the stimulus and more central representations” (p. 31).  Rock sees the Gestalt theory of perception (Koffka, 1935) as fitting this rubric.  An example along the lines of this theoretical approach is Wallach’s (1948) attempt to explain lightness constancy in terms of stimulus ratios.  As Rock notes “there is a great deal of similarity between modified stimulus theory and the spontaneous interaction theory” (p. 34), and I would venture to add that it is in some ways compatible with ecological theory.  The two theoretical approaches are similar in that they ascribe much of perception to information in the stimulus, but the Gestalt approach also adds mentalistic processes, such as the effects of familiarity on perception.  Rock finds the spontaneous interaction theory lacking in its ability to explain certain phenomena: “… perceptual constancy cannot adequately be explained on the basis of higher-order features such as relationships, ratios, or the interactions to which they give rise” (p. 36).  It is for this reason that Rock opts for the second constructive theory that he labels “cognitive theory”, a theory that maintains “that the correlate of perception is not the stimulus per se but interpretations or inferences made from it concerning what the object or event is in the world that produced it” (p. 32).  Rock sees this approach to perception as incorporating a homunculus, or executive agency where

“… the better explanatory model here would seem to be one of a higher agency of mind comparing a percept with a specific memory on the basis of certain criteria of what constitutes an adequate match after isolating the latter by some process of internal scanning.” (p. 39)

Note that the comparison to an item in memory is the type of function carried out by the ventral system.

 

The previous section (4.1) included descriptions of two studies relating to size perception in the ecological vein with the aim of showing that they are commensurate with dorsal system functioning.  In a similar manner, I should like to look at two studies relating to size perception in the constructivist vein.  The first is a study that Rock (1983) chose to describe in the section on size constancy in his chapter on unconscious inference (Ch. 9).  In that study Rock, Wheeler, Shallo, and Rotunda (1982) created the illusion of a receding plane using drawings of three-dimensional cubes and their appropriate shadows on a set of three upright textureless boards.  The cubes were drawn so as to yield equal sized proximal images.  They also saw to it that the edges of the tops of the boards were blurred and could not be discriminated.  The subjects were asked to report the arrangement of the display, and nearly all reported seeing a flat receding plane (although somewhat tilted upward).  They were also asked to compare the size of the top (far) and bottom (near) cubes.  The results indicated partial size constancy.  Looking at the methods these researchers used, it should be noted that they created a very unnatural stimulus situation, one that probably could not occur in a natural scene.  The textureless environment severely limited the available information.  By having the subjects look through a peephole, they prevented head movements.  These manipulations made the pickup of size information by the dorsal system very difficult.  The dorsal system normally requires movement and/or binocular viewing for it to function adequately.  Movement, binocular viewing, and textures were all missing from the Rock et al. (1982) setup.  When dorsal system functioning is limited by “special” laboratory conditions, the ventral system is called on for help.  This, together with the fact that the subjects had to make verbal comparisons, which also called the ventral system into play, led to an analysis of size perception by the ventral system in this study, with very limited intervention of dorsal mechanisms.

 

In his Indirect Perception, Rock (1997) chose to reprint many studies that yielded evidence of percept-percept couplings.  None of the studies chosen was a direct study of size perception per se, but one was a study that looked at both speed constancy and size constancy (Rock, Hill, & Fineman, 1968).  Its purpose was to lend support to an indirect theory of speed constancy, in contrast to the Gestalt theory (Wallach, 1939), which, notably, Rock says (p. 206) “might be thought of as direct”.  Speed constancy refers to the fact that perceived speed does not appear to change with changes in the viewing distance of the moving object.  The Gestalt theory suggests that this occurs because the speed is judged as relative to some frame of reference, and the ratio between the speed of the object and its frame of reference remains constant over varying distances.  In contrast, the indirect theory claims that speed constancy is a function of size constancy, that is, constancy of the distance traversed by the moving object.  In other words, “speed must be perceived by taking distance into account” (p. 206).  In the experiments the subjects made both speed and size judgments, and for both tasks the results appear to show that distance is taken into account when constancy is achieved.  But, once again, the exact results are not of primary interest here, rather the methods used.  In the speed judgment task the subjects judged the speed of luminous circles and in the size judgment task the size of luminous triangles.  Both tasks were carried out in complete darkness, in order to preclude the subject having a frame of reference.  These tasks were presented under two conditions, a binocular condition and an artificial pupil condition.  In the binocular condition it was presumed that accommodation and convergence would supply distance information.  In the artificial pupil condition the subjects wore patches over both eyes with a 1-mm pinhole in the right-eye patch, precluding input of information from both accommodation and convergence.  As a whole the experimental setup is one that leads to much ventral involvement.  First, judging speed and size in total darkness with no additional background is a very impoverished and unnatural situation.  Second, the subjects were required to make verbal judgments, which would call the ventral system into play.  Thus, once again it is claimed that the very experimental paradigm used here leads to the involvement of the ventral system, whereas in a natural information-rich environment speed would be processed in the main by the dorsal system.
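The phrase “taking distance into account” can be illustrated with elementary viewing geometry.  The sketch below is not Rock's model; it shows only the geometric relation that such an account presupposes, namely that linear extent and linear speed are recovered from their angular counterparts by scaling with distance.  The function names and the numerical values are assumptions made for illustration.

```python
import math


def linear_size(visual_angle_deg, distance):
    """Extent of a frontoparallel object subtending the given visual angle."""
    return 2.0 * distance * math.tan(math.radians(visual_angle_deg) / 2.0)


def linear_speed(angular_speed_deg_s, distance):
    """Small-angle approximation: linear speed = angular speed (rad/s) * distance."""
    return math.radians(angular_speed_deg_s) * distance


# Hypothetical case: the far object is twice as distant as the near one
# and sweeps across the eye at half the angular speed.
near_speed = linear_speed(angular_speed_deg_s=10.0, distance=1.0)
far_speed = linear_speed(angular_speed_deg_s=5.0, distance=2.0)

print(near_speed, far_speed)
```

In this hypothetical case the two angular speeds differ by a factor of two, yet scaling each by its distance returns the same linear speed, which is the sense in which constancy would depend on the available distance information.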

 

To sum up this section on the two theories and their experimental methods, it was seen that the theoretical stance of each theory parallels what is known about the functions of a given visual system: the ecological theory parallels the dorsal system, and the constructivist theory the ventral system.  What is more, the experimental methods used by the adherents of the two theories are commensurate with the functioning of the respective visual systems.  The constructivists, in their attempts at isolating the effects of single variables, use highly reduced laboratory conditions, and these in turn favor the predominance of ventral system functioning.  Followers of Gibson, on the other hand, in trying to create ecologically valid experimental conditions, present their subjects with much richer stimulus conditions.  Together with the fact that they often opt for motor responses rather than verbal judgments, this leads to a much greater involvement of dorsal system functioning.

 

5.     The emergent dual-process approach and some of its implications

 

5.1  A dual-process approach

 

This article has put forth the hypothesis that both approaches to perception, the ecological and the constructivist, are valid descriptions of perception, but of different aspects of perception.  This hypothesis leads to what I have labeled “the dual-process approach”, an approach that bears a great deal of similarity to previous suggestions (e.g., Bridgeman, 1992; Neisser, 1994).  The hypothesis is that perception consists of two systems functioning more or less in parallel.  One system, similar in function to Gibson’s (1979) direct perception, is labeled dorsal here; the second, similar to Rock’s (1983, 1997) indirect perception, is labeled ventral here.  The first, the dorsal system, picks up visual information mainly to allow the organism to function in its environment.  It does this more quickly than the ventral system, and in the main without much involvement of conscious awareness, and as such does not encumber the cognitive system with the task of “interpreting” the stimulus input.  It is suggested that nearly all the information pickup for, or enabling, the performance of well-ingrained actions or behaviors is carried out by the dorsal system.  In contrast, the ventral system primarily serves in the recognition and identification of objects and events in one’s environment.  It compares visual inputs to stored information in a quest for a meaningful interpretation of those inputs.  When needed the ventral system also participates in other perceptual activities, such as different aspects of space perception like the perception of size and distance.  As it is the system of which we are normally conscious it has, in a fashion, “the last word” as to our judgmental interpretation of stimulation reaching our senses.

 

While the two systems have different functions, it should be emphasized that there is a great deal of cross talk between them and they normally function in synergy.  At times dorsal system processing can enter consciousness via the ventral system after the event.  What is more, the ventral system is often involved in what appear to be dorsal functions.  Some examples: 1) When the dorsal system is faced with difficulties in picking up the necessary information, due to, say, insufficient information or conflicting information, the ventral system can be turned to for help (see Norman, 1980, 1983); 2) When the visuomotor behavior in question is complex and yet not well learned, as in the case of, say, novice tennis players, many functions that are later performed solely by the dorsal system are supported by the ventral system (see Williams, Davids, & Williams, 1999); 3) When some visuomotor activity leads to some type of judgmental or comparative response, or simply some verbal response is required, then the ventral system participates as well; 4) When there is some time delay between the visual input and the required motor output, the ventral system is called upon to temporarily store the visual information as the dorsal system is incapable of bridging that delay.

 

The question of the relation between the two systems and consciousness is a thorny one.  On the one hand, it seems fair to say that the dorsal system functions without much involvement of consciousness, and that the functioning of the ventral system is normally accompanied by consciousness.  This generalization is also in accord with the differences between the two theoretical approaches as they were outlined above.  The review of Gibson’s ecological approach above (see Section 4.1) indicated that Gibson did not see the direct pickup of information as demanding consciousness.  The suggestion then is that the Gibsonian pickup of information is carried out without consciousness by the dorsal system, and that the apparent conscious awareness of certain dorsal system processes often is an after-the-fact epiphenomenon resulting from the transfer of the information to the ventral system for registration or assistance when needed.  But it should be emphasized that there are reasons to believe that not al