Below is the unedited preprint (not a quotable final draft) of:
Wertheim, A.H. (1994). Motion perception during self-motion: The direct versus inferential controversy revisited. Behavioral and Brain Sciences 17 (2): 293-355.
The final published draft of the target article, commentaries and Author's Response are currently available only in paper.

For information about subscribing or purchasing offprints of the published version, with commentaries and author's response, write to: journals_subscriptions@cup.org (North America) or journals_marketing@cup.cam.ac.uk (All other countries).

MOTION PERCEPTION DURING SELF-MOTION: The Direct versus Inferential controversy revisited

Alexander H. Wertheim
TNO Institute for Perception
P.O. Box 23
3769 ZG
Soesterberg
The Netherlands
wertheim@izf.tno.nl

Keywords

motion perception, velocity perception, self-motion, extraretinal signal, efference copy, direct perception, visual-vestibular interactions.

Abstract

According to the traditional inferential theory of perception, percepts of object motion or stationarity stem from an evaluation of afferent retinal signals (which encode image motion) with the help of extraretinal signals (which encode eye movements). Direct perception theory, on the other hand, assumes that the percepts derive from retinally conveyed information only. Neither view is compatible with a special perceptual phenomenon which occurs during visually induced sensations of ego-motion (vection). A modified version of inferential theory yields a model in which the concept of an extraretinal signal is replaced by that of a reference signal. Reference signals do not encode how the eyes move in their orbits, but how they move in space. Hence reference signals are produced not only during eye movements but also during ego-motion (i.e., in response to vestibular stimulation and to retinal image flow, which may induce vection). The present theory describes how self-motion and object motion percepts interface. Empirical tests (using an experimental paradigm that allows quantitative measurement of the magnitude and gain of reference signals and the size of the Just Noticeable Difference (JND) between retinal and reference signals) reveal that the distinction between direct and inferential theories largely depends on: (1) a mistaken belief that perceptual veridicality is evidence that extraretinal information is not involved, and (2) a failure to distinguish between (the perception of) absolute object motion in space and relative motion of objects with respect to each other. The new model corrects these errors, thus providing a new, unified framework for interpreting many phenomena in the field of motion perception.

1. INFERENTIAL VERSUS DIRECT PERCEPTION

How do we maintain the visual percept of a stable world while images of our environment move across the retinae during eye movements? Answers to this question fall into two main theoretical approaches. According to the traditional view, here called Inferential theory, we perceive motion or stationarity of an object, or of the visual world itself, depending on the outcome of a comparison process between two neural signals (see e.g. Helmholtz, 1910; Von Holst and Mittelstaedt, 1950; Sperry, 1950; MacKay, 1972; Jeannerod et al., 1979; Mittelstaedt, 1990). One signal, here to be called the "retinal signal", consists of retinal afferents encoding the movement characteristics of the object's image across the retina. The other signal, encoding the concurrent eye movement characteristics, is usually termed the "extraretinal signal" because it does not derive from visual afferents (Matin et al., 1969; see also Matin, 1982, 1986; Mack, 1986). The comparison mechanism treats the two signals as vectors (see e.g. Wallach et al., 1985; Mateeff et al., 1991) and applies a simple rule: when they differ, object motion is perceived; when they are equal, object stationarity is perceived. Wertheim (1981) showed that when a smooth pursuit eye movement is made across a visual stimulus pattern, the magnitude of the retinal signal corresponds to the velocity of the retinal image flow of the pattern. Similarly, the magnitude of the extraretinal signal corresponds to the velocity of the concurrent eye movements as "estimated" within the perceptual apparatus (see section 5.1). In the present paper eye movements are mainly of the smooth pursuit type. Hence, the terms "magnitude" or "size" of retinal and extraretinal signals will refer to these velocity vectors. We see a stable world during eye movements because retinal and extraretinal signals are equal: the velocity of the image of the world across the retinae equals the velocity of the eyes.

The alternative theoretical view, here called the theory of Direct Perception, which originated from Gibson (1966, 1979), has no need for the concept of an extraretinal signal (Gibson, 1968, 1973) as it assumes that the perception of motion derives exclusively from retinal afferent information. Its point of departure is that in normal everyday circumstances perception is veridical (it should be: the organism's chances of survival depend on it - for this reason the approach is also called the ecological theory of perception). Hence the perceptual mechanism functions as an unbiased sampling of external information from the real world (see Lombardo, 1987). According to this theory, the visual world manifests itself as the particular pattern of light that hits an observer's eye, called the optic array. The informational content of the scene is given in ("specified" by) particular invariant structural features of this light pattern. To perceive is to "pick up" such invariants. Thus movement of an object may be specified by an invariant described as: the concurrent appearance and disappearance of part of the array - specifying the background - along the two opposite borderlines of another part of the array - specifying the object. When the eyes move across the visual world, as during (combined) eye-, head- or ego-movements, a coherent streaming motion of the optic array relative to the retinae usually occurs. The resulting retinal flow pattern has, in recent years, become the focus of research in the literature of Direct Perception theory. The basic assumption is that the brain is able to "pick up" from retinal flow those flow characteristics which are caused by invariants of the optic array such as the one mentioned above (they may be called "optic flow invariants"). However, a retinal flow pattern may also contain characteristics which stem from movements of the eyes in space (caused by eye-, head- or ego-motion). But these invariants only specify how the eyes move (or are moved in space), and when "picked up" we perceive (i.e. become aware of) these particular self- or ego-motions (Footnote 1). This is called "visual kinaesthesis". For example, an invariant which specifies eye movements in the head could be: motion of the dark middle area of the array - specifying the nose - relative to the outer boundaries of the optic flow field. Other invariants specify head- or ego-movements (Footnote 2).

Since the optic array stems from a stable world, retinal flow never holds optic flow invariants that could specify motion of the world. Consequently, the visual world cannot be perceived as moving. Recently the question has been raised whether the visual system always needs to distinguish between optic flow invariants and self-motion invariants (Cutting et al., 1992). Although strictly speaking, this reflects a deviation from the original point of departure of Direct Perception theory, this does not affect its fundamental principle to be discussed in this paper, that the perception of motion or stationarity stems only from retinal afferent information and not from a comparison process between retinal and extraretinal information.

In neurophysiological research, the awareness of ego-motion is usually associated with the output activity of cells in particular areas of the brain, notably the vestibular nuclei and the vestibular cortex. These cells are driven by afferents from the equilibrium system and the somato-sensory kinaesthetic system (together here to be called vestibular afferents). Many of these cells are also driven by visual (image flow) afferents. One important pathway through which these visual afferents are conducted is known as the Accessory Optic pathway (see e.g., Dichgans et al., 1973; Henn et al., 1974; Dichgans and Brandt, 1978; Büttner and Büttner, 1978; Henn et al., 1980; Büttner and Henn, 1981; Cohen and Henn, 1988). These visual afferents are complementary to vestibular afferents. Their function is to generate or sustain sensations of ego-motion when the equilibrium system remains silent, i.e., in the absence of an accelerating force acting on the equilibrium system (e.g. when traveling at constant velocity in a train). In the literature concerned with research in this area of so-called "visual-vestibular interactions", a visually induced sensation of ego-motion is termed "vection", and the particular features of retinal flow that generate vection are not called "invariants that specify ego-motion", but "optokinetic". The stimuli that generate them will here be termed "optokinetic stimuli", and the term "Optokinetic pathway" will be used to denote in general terms the combined neural channels that convey the optokinetic afferents which generate vection and interact with vestibular afferents. To be optokinetic, a visual pattern must be large, have relatively low spatial frequency characteristics, move (not too fast) across the retinae and remain visible for more than a very brief interval (see e.g., Brandt et al., 1973; Berthoz et al., 1975; Dichgans and Brandt, 1978; Berthoz and Droulez, 1982; De Graaf et al., 1990).

It is the purpose of the present paper to show that, within the domain of motion perception, an adapted version of Inferential theory, in combination with knowledge from the research area of visual-vestibular interactions and ego-motion, resolves the differences of opinion between Inferential and Direct theories of perception.

2. PROBLEMS FOR BOTH THEORIES

If vection is generated in the laboratory, some perceptual phenomena may occur that are incompatible with both Direct and Inferential theory. As an example consider vection created with an "optokinetic drum", a large drum with vertical black and white stripes painted on its inside wall, that can be rotated around an observer seated inside on a stationary chair. For the present purpose let us assume that the drum rotates with an angular velocity of 60 deg/sec around a stationary observer whose body, head and eyes are fixed in space (using a small stationary fixation point attached to the stationary chair). Let us further assume that the lights inside the drum are extinguished, i.e. the observer sits in the dark and does not know that the drum rotates. If we now suddenly switch on the lights inside the drum, the observer will initially perceive the drum correctly as rotating and experience no ego-rotation. However, within a few seconds an illusory sensation of ego-rotation in the direction opposite to that of the drum (called circularvection) gradually develops. During this period ego-velocity is experienced as increasing and the rotation of the drum appears to slow down. Finally, the drum is perceived as completely stationary in space and ego-velocity does not seem to increase any further. Circularvection is then said to be saturated. The whole process - from the moment the lights inside the drum are switched on to the saturation of vection - may last between 4 and 6 seconds, depending on the velocity of the drum. At very low drum velocities saturated vection may even be immediate, but in the case of the present example, where drum velocity is considerably higher, it may take as much as 6 seconds or more before vection is completely saturated (for more details about the dynamics of circularvection see e.g. Dichgans and Brandt, 1978; Wong and Frost, 1978; Mergner and Becker, 1990).

The question that raises the theoretical problems for both Direct and Inferential theory is: Why, during saturated circularvection, is the drum perceived as stationary in space?

Direct Perception theory has a simple answer: a coherent retinal flow of the entire environment is an invariant that normally specifies ego-motion. When picked up, this yields a percept (an awareness) of ego-motion, not of drum motion. But this reasoning poses two problems. First, how could the drum initially have been perceived as moving? That suggests the presence of an invariant which specifies environmental motion. Second, this anomalous invariant seems to dissipate in time (as drum rotation appears to slow down gradually) and disappears completely upon saturation, even though the optic array and the retinal flow characteristics remain physically identical (Footnote 3).

Inferential theory can explain why the drum is initially perceived as moving: Its moving retinal image generates a substantial retinal signal, but the stationary eyes (focussed on the fixation point) generate a zero extraretinal signal. Therefore the two signals differ and the drum is seen to move. Hence for Inferential theory the problem is that the drum appears to be stationary once vection is saturated.

3. AN ALTERNATIVE MODEL

The problems can be solved within the framework of Inferential theory by reconsidering the concept of an extraretinal signal. This signal is usually defined as encoding ocular velocity and serves to determine to what extent retinal image motion is an eye movement artefact. The remaining image motion then reflects real object motion in external space. However, this reasoning only holds if the signal encodes eye velocity relative to external space, not relative to the head. The logic of this point has been recognized by many authors (see e.g. Wallach, 1987; Swanston et al., 1987; Swanston and Wade, 1988), but its consequences for the nature of extraretinal signals have not been recognized to the full extent.

Formally speaking, eye velocity in space (Veyes.s) corresponds to the vectorial addition of eye velocity in the head (Veyes.h) and head velocity in space (Vhead.s). Thus it is here proposed that extraretinal signals actually consist of the vector sum of a Veyes.h and a Vhead.s velocity vector. The Veyes.h vector may derive from what is known as the "efference copy" - a neural corollary to the efferent oculomotor commands (Von Holst and Mittelstaedt, 1950) (footnote 4) - while the Vhead.s vector most likely derives from vestibular afferents which result from head movements.

The implication of this reasoning is, that during ego-motion extraretinal signals must also be generated: although the eyes may not move in their orbits during ego-motion, they do move in space and thus create artefactual retinal image motion (footnote 5). How are these extraretinal signals generated? First, they most likely derive from the already mentioned vestibular afferents which encode Vhead.s during ego-motion. However, there must be another component. The point is that in cases where the awareness of ego-motion is sustained visually (vection), there are no such vestibular afferents: their function is taken over by the visual afferents that are induced by optokinetic image flow and that pass through such channels as the Accessory Optic pathway. Henceforth, these pathways will be referred to with the general term "Optokinetic pathway". Thus, it is here proposed that such particular visual afferents may also generate (part of) an extraretinal signal. Obviously, this renders the term "extraretinal signal" incorrect. Therefore, from here on, the term "reference signal" will be used, which emphasizes only the evaluative function of the signal with respect to retinal image motion.

In summary then, the present model holds that reference signals are compound signals, which may include (any combination of) an efference copy, a vestibular, and a visual component. Fig. 1 illustrates how such reference signals may be generated.
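The composition of the reference signal and the comparison rule can be made concrete with a short numerical sketch. It is an illustration only, under stated assumptions: velocities are treated as signed one-dimensional values in deg/s, the three components are simply summed (a simplification for illustration, not a claim of the model), and the class name, function name and `tolerance` parameter (standing in for the smallest detectable difference between the signals) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReferenceSignal:
    """Hypothetical container for the three proposed components (deg/s, signed)."""
    efference_copy: float = 0.0  # encodes eye velocity in the head
    vestibular: float = 0.0      # encodes head velocity in space
    visual: float = 0.0          # optokinetic component, present during vection

    def magnitude(self) -> float:
        # Plain summation is an illustrative simplification only.
        return self.efference_copy + self.vestibular + self.visual


def object_motion_perceived(v_ret: float, ref: ReferenceSignal, tolerance: float) -> bool:
    """Motion is perceived when retinal and reference signals differ detectably."""
    return abs(v_ret - ref.magnitude()) > tolerance


# Pursuit at 10 deg/s across a stationary pattern, head fixed:
# only the efference copy contributes, and it matches the retinal signal.
pursuit = ReferenceSignal(efference_copy=10.0)
print(object_motion_perceived(10.0, pursuit, tolerance=1.0))   # prints False

# Eyes fixed in the head, drum image moving at 60 deg/s, no reference signal yet:
# the signals differ, so the drum is seen to move.
no_reference = ReferenceSignal()
print(object_motion_perceived(60.0, no_reference, tolerance=1.0))  # prints True
```

In this sketch any of the three components can be zero, which corresponds to the statement that a reference signal may include any combination of them.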

------------------------- Fig. 1 about here -------------------------

The gating mechanism in the Optokinetic pathway determines what aspects of visual afferents generate vection and thus generate or affect reference signals. The features that make a visual stimulus (its retinal flow) optokinetic have already been mentioned. They suggest that the gating mechanism acts as a low band pass spatio-temporal filter.

A warning should be made here: the addition of a visual component to the reference signal is not meant to imply strict linear additivity. In fact, it is quite likely that the interaction between retinal and vestibular afferent information at the level of the estimator of head velocity in space is of a non-linear nature (see e.g. Probst et al., 1985; Barthélémy et al., 1988; Xerri et al., 1988; Borah et al., 1988; Fletcher et al., 1990).

The theoretical significance of the visual component in the reference signal - it may be conceptualized in cybernetic terms as a kind of feedforward signal - is that it implies a self-referential circularity or "strange loop" (Hofstadter, 1980) within the perceptual system: Retinal image motion may create (part of) a reference signal to determine its own perceptual interpretation. This circularity solves the problems associated with the development and saturation of circularvection: When the optokinetic drum starts rotating, the moving image of its stripes immediately generates a retinal signal (in the present example the eyes do not move in the head, as they remain focussed on the stationary fixation point). But in the present example (in which the drum rotates at 60 deg/s) vection develops only gradually, due to the low temporal bandpass characteristics of the gating mechanism in the Optokinetic pathway. Therefore, a (visually induced) reference signal is not immediately present. Hence, initially the drum is correctly perceived as moving. However, when vection begins to build up, so does the reference signal. The difference between the (unchanged) retinal signal and this growing reference signal thus decreases gradually. If perceived object velocity is determined by this difference - as shown in section 5.2 - drum velocity will be seen as slowing down until saturation is reached, i.e. until the reference signal has become approximately equal to the retinal signal. The drum is then perceived as stationary in space.
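The build-up described above can be illustrated numerically. The following is a minimal sketch, not an implementation taken from the model itself: it assumes that the gating mechanism behaves like a first-order low-pass filter with a hypothetical time constant of 1.5 s (the text states only that the gating has low temporal bandpass characteristics), it works with signal magnitudes so that sign conventions can be ignored, and it takes perceived drum velocity to be the difference between the retinal and the reference signal. The function name and all parameter values are illustrative assumptions.

```python
def simulate_vection(drum_velocity=60.0, time_constant=1.5, dt=0.01, duration=8.0):
    """Return (times, reference_signals, perceived_drum_velocities) in s and deg/s."""
    v_ret = drum_velocity  # eyes fixed in space: retinal signal stays constant
    v_ref = 0.0            # no reference signal when the lights are switched on
    times, refs, perceived = [], [], []
    steps = int(round(duration / dt))
    for i in range(steps + 1):
        times.append(i * dt)
        refs.append(v_ref)
        perceived.append(v_ret - v_ref)  # perceived drum velocity
        # Assumed first-order low-pass: the reference signal creeps toward
        # the retinal signal that drives it (forward Euler step).
        v_ref += (v_ret - v_ref) * dt / time_constant
    return times, refs, perceived


times, refs, perceived = simulate_vection()
for second in range(0, 9, 2):
    index = second * 100  # dt = 0.01 s, so 100 samples per second
    print(f"t = {times[index]:.0f} s: reference = {refs[index]:5.1f}, "
          f"perceived drum velocity = {perceived[index]:5.1f} deg/s")
```

With these assumed values the perceived drum velocity starts at 60 deg/s and has fallen below 2 deg/s after six seconds, which is of the same order as the saturation times mentioned in section 2. A different time constant would shift this accordingly.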

The relevance of this model for the discussion between Direct and Inferential theories of motion perception is, that it provides a view which to a large extent agrees with both these theories, i.e. it creates a compatibility between the basic presumptions of both Inferential and Direct theory: On the one hand, it agrees with the main Inferential premise that information about how the eyes move (in space) is always necessary to perceive object motion or stationarity. On the other hand, it also agrees with three main assumptions of Direct Perception theory: First, the percept of object motion or stationarity may indeed stem exclusively from visual afferents (i.e. when reference signals only consist of a visual component). Second, retinal flow patterns may indeed specify ego-motion, and do not specify motion of the visual environment. Third, the gating mechanism in the Optokinetic pathway (see Fig. 1) can be viewed as the mechanism responsible for "picking up" invariants from retinal image flow. Hence, in the light of the present model the fundamental postulates of Direct and Inferential theory are not any more contradictory.

In the remainder of this paper it will be shown that this also holds for the empirical database which has given rise to the controversies between Direct and Inferential theory, as well as to theoretical attempts to find a compromise between the two approaches (i.e. theories which propose that Direct and Inferential perception are not mutually exclusive but reflect two distinct modes of perception. In this paper, such theories will be termed "Dual Mode theories"; see section 5). To make this clear, empirical tests will be reviewed of predictions that derive from the present model, but which do not follow from Dual Mode theory, or from either of the original two rival approaches themselves. However, first an experimental paradigm must be outlined, to serve as the frame of reference in terms of which the data obtain their significance.

4. EXPERIMENTAL PARADIGM

Imagine a subject, looking at a screen in front of the eyes. On the screen a visual stimulus is projected. The stimulus can move in both horizontal directions with a fixed velocity, set by the experimenter. Assume also that the subject's head is fixed in space, but that the eyes pursue a small fixation point sweeping horizontally (with another fixed velocity) across the moving stimulus. If we synchronize the beginning and termination of the motions of the stimulus and the fixation point, we can study the perception of stimulus motion or stationarity during a (pursuit) eye movement - made across the stimulus - of any given velocity. We will then use the following conventions: First, the terms "retinal image" or "retinal signal" will always be used to refer to the image of the stimulus, not the image of the fixation point. Second, retinal image velocity will be defined as the velocity of the eyes in space minus the velocity of the stimulus in space. This means that the directional sign given to the retinal image velocity vector (i.e. to the retinal signal, Vret) will be such that in the case of a stationary stimulus it is the same as the sign given to the direction in which the eyes move in space (Veyes.s).

Thus when, in the present example, the stimulus is stationary, the velocity of its retinal image equals Veyes.s. If the stimulus is indeed perceived as stationary, retinal and reference signals must be equal too. Now imagine that we move the stimulus slightly in the same direction as the eyes. This reduces retinal image velocity, and thus decreases the size of the retinal signal which then becomes slightly smaller than the reference signal. If we further increase stimulus velocity, the difference between retinal and reference signals increases further until it becomes detectable within the perceptual apparatus. At that point the threshold is reached for perceiving stimulus motion during a pursuit eye movement. The retinal signal is then exactly one Just Noticeable Difference (JND) smaller than the reference signal (see Wallach and Kravitz, 1965; MacKay, 1973; Wertheim, 1981). This may be expressed as:

\[ V_{retW} = V_{ref} - JND \tag{1} \]

where VretW is retinal signal size at the threshold for stimulus motion with the eyes (with-threshold), and Vref is the magnitude of the reference signal induced by the eye movement. Conversely, if the stimulus moves in the direction opposite to the eyes, retinal image velocity increases. The threshold for perceiving stimulus motion in that direction (against-threshold) is then reached when

\[ V_{retA} = V_{ref} + JND \tag{2} \]

where VretA is retinal signal size at the against-threshold. It thus follows that

\[ V_{retA} - V_{retW} = 2\,JND \tag{3} \]

Since retinal image velocity can be calculated as Veyes.s - Vstim.s (where Vstim.s is stimulus velocity in space), this may also be written as:

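\[ V_{stimW.s} - V_{stimA.s} = 2\,JND \tag{4} \]

where VstimW.s and VstimA.s are the stimulus velocities in space at the with- and against-thresholds respectively.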

Hence, half the difference between the stimulus velocities at the two opposite thresholds for perceiving object-motion, can be used as an operational measure of the magnitude of one JND between retinal and reference signals (footnote 6).

At the exact midpoint between the two opposite thresholds - which in this paper will be called the Point of Subjective Stationarity (PSS) - retinal image velocity (Vret.PSS) corresponds to Vref because

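\[ V_{ret.PSS} = \frac{V_{retW} + V_{retA}}{2} = \frac{(V_{ref} - JND) + (V_{ref} + JND)}{2} = V_{ref} \tag{5} \]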

Thus at the PSS retinal image velocity is not only proportional to the retinal signal, but also to the concurrent reference signal. Therefore, we may take retinal image velocity at the PSS as an operational measure of reference signal size.

The gain of a reference signal (Gref) is the extent to which it registers the actual velocity of the eyes in space (Veyes.s). It can be expressed as:

\[ G_{ref} = \frac{V_{ref}}{V_{eyes.s}} \tag{6} \]

Since Vref was operationalized as Vret.PSS, Gref may also be expressed as:

\[ G_{ref} = \frac{V_{ret.PSS}}{V_{eyesPSS.s}} \tag{7} \]

VeyesPSS.s being the velocity of the eyes in space at the PSS. Since retinal image velocity equals Veyes.s - Vstim.s, this may also be expressed as:

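\[ G_{ref} = \frac{V_{eyesPSS.s} - V_{stimPSS.s}}{V_{eyesPSS.s}} = \frac{V_{headPSS.s} + V_{eyesPSS.h} - V_{stimPSS.s}}{V_{headPSS.s} + V_{eyesPSS.h}} \tag{8} \]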

where VheadPSS.s is head velocity in space at the PSS, and VeyesPSS.h is eye velocity in the head at the PSS. Note that the PSS is the midpoint between two opposite thresholds. If they are equal, VstimPSS.s is zero. Gref then equals 1, which means that eye velocity in space is correctly registered in the reference signal.

What would unequal thresholds mean? Assume that the with-threshold is higher than the against-threshold. VstimPSS.s then differs from zero and is in the same direction (has the same sign) as VeyesPSS.s. According to equation 8, Gref is then smaller than one, which means that the reference signal is too small, i.e. that eye velocity in space is underregistered in the reference signal (to the extent of 1-Gref). Conversely, if the against-threshold is higher than the with-threshold, the stimulus moves at the PSS in the direction opposite to VeyesPSS.s. Gref is then larger than 1, and Gref-1 then indicates the extent to which eye velocity in space is overrepresented in the reference signal. Hence asymmetric thresholds indicate an under- or overregistration of eye velocity in space in the reference signal, dependent on which threshold is higher, i.e. on whether the PSS has shifted in the direction with or against the eyes.
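The measures defined in this section can be computed from two threshold measurements, as the following sketch illustrates. It rests on assumptions that should be kept in mind: eye velocity in space is taken to be the same at both thresholds, stimulus velocities are signed so that positive means "in the direction of the eyes", gain is taken as the ratio of reference signal size (retinal image velocity at the PSS) to eye velocity in space, and the function name, the result type and the example numbers are hypothetical rather than data from any experiment.

```python
from typing import NamedTuple


class ThresholdAnalysis(NamedTuple):
    jnd: float         # one JND between retinal and reference signals
    v_ref: float       # reference signal size, operationalized as Vret.PSS
    v_stim_pss: float  # stimulus velocity in space at the PSS
    g_ref: float       # gain of the reference signal


def analyse_thresholds(v_eyes_s: float, v_stim_with: float,
                       v_stim_against: float) -> ThresholdAnalysis:
    """Derive JND, Vref and Gref from the two motion thresholds (deg/s)."""
    # Retinal image velocity = eye velocity in space - stimulus velocity in space.
    v_ret_with = v_eyes_s - v_stim_with
    v_ret_against = v_eyes_s - v_stim_against
    jnd = (v_ret_against - v_ret_with) / 2.0        # half the threshold distance
    v_ret_pss = (v_ret_with + v_ret_against) / 2.0  # midpoint between thresholds
    v_stim_pss = (v_stim_with + v_stim_against) / 2.0
    g_ref = v_ret_pss / v_eyes_s
    return ThresholdAnalysis(jnd, v_ret_pss, v_stim_pss, g_ref)


# Hypothetical measurements during pursuit at 10 deg/s with the head fixed.
print(analyse_thresholds(10.0, v_stim_with=2.0, v_stim_against=-2.0))  # symmetric
print(analyse_thresholds(10.0, v_stim_with=3.0, v_stim_against=-1.0))  # with-threshold higher
print(analyse_thresholds(10.0, v_stim_with=1.0, v_stim_against=-3.0))  # against-threshold higher
```

In the three hypothetical cases the JND is the same (2 deg/s), while the gain comes out at 1, below 1 and above 1 respectively, matching the interpretation of symmetric and asymmetric thresholds given above.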

5. EMPIRICAL TESTS OF THE MODEL AND THEIR RELEVANCE FOR DIRECT AND INFERENTIAL THEORY

5.1 Thresholds for motion perception

As mentioned in section 3, there have been some attempts to bridge the gap between the Direct and Inferential approaches in the form of a Dual Mode theory. This basically is the assumption that there exist two modes of visual perception: a Direct mode, in which extraretinal signals play no role, and which yields veridical percepts, and an Inferential mode, which makes use of extraretinal signals, and which may yield illusions. For example, it is claimed that when a visual pattern is very large and covers most, or all, of the visual field, a particular mode of perception, called "visual capture" becomes dominant. This mode needs no extraretinal signals and creates veridical percepts (see e.g. Stark and Bridgeman, 1983). Hence, it can be viewed as a Direct Perceptual mode (e.g. Mack 1978). (It is also possible to view visual capture as a cognitive influence on perception, assuming that such patterns evoke a cognition of environmental stationarity because we know that normally our environment is stationary.)

Dual Mode theory (Mack 1978, 1986; see also Matin 1986) has developed from concepts originally formulated by Wallach (see e.g. Wallach 1959) to explain the phenomenon of center-surround induced motion (a stationary stimulus is seen to move when its surrounding background moves, irrespective of whether the eyes fixate the stimulus or track the surround; see e.g. Shulman 1979). According to Wallach, there are two kinds of cues that may generate a percept of motion: "object-relative" and "subject-relative" cues (see also Shaffer and Wallach 1966). The "object-relative" cues stem from motion of objects relative to each other (i.e. from motion of object images relative to each other on the retina; see Matin 1986). These "object-relative" cues presumably overrule or suppress what Wallach called "subject-relative" cues, which stem from object motion relative to the observer. Center-surround induced motion is then explained as follows: the percept of surround motion, which is "subject-relative", is overruled by the percept of motion that stems from the "object-relative" cue of surround motion relative to the center stimulus. The impression of motion is, however, attributed to the small center stimulus, because - according to a Gestalt-like principle, called the "stationarity tendency of large stimuli" (Duncker, 1929) - a surround tends to act as a perceptual frame of reference (see e.g. Mack and Herman, 1978; Wallach, 1972).

According to Dual Mode theory "object-relative" and "subject-relative" cues somehow force the visual system to operate in a Direct or in an Inferential perceptual mode respectively. The dominant Direct mode is always operative in normal circumstances, because objects usually move relative to a full field visually structured background - which implies the presence of "object-relative" motion cues - and the Gestalt principle mentioned above attributes the impression of motion always to the smaller objects. The Inferential mode, on the other hand, is seen as a kind of backup system, which uses extraretinal signals. It becomes operative if no "object-relative" cues are present (e.g. when objects move in a totally darkened environment). This mode produces illusions because of the underregistration of eye velocity in the efference copy.

Dual Mode theory may be criticized on the basis of the argument that illusions of motion of the visual world often occur in situations where they should be prevented by capture (e.g. when dizzy, or when gently pressing a finger against the eyeball). But in the present section we will take a different approach: A series of experiments will be reviewed, the results of which show that the logic of Dual Mode theories is flawed, because the empirical criterion for distinguishing between the two modes is questionable.

The experiments concern predictions about the thresholds for motion during eye movements. According to the new model, the difference between the with- and against-the-eyes thresholds corresponds to twice the JND between the retinal and the reference signal (equation 4). As JNDs increase linearly with signal size - Weber's law - the distance between the two thresholds should increase linearly with eye velocity (in space). Wertheim (1981) measured these thresholds for a large stimulus pattern (head fixed in space) and showed this to be true (Fig. 2). The dependency of the thresholds on eye movement velocity (rather than amplitude) implied that, during pursuit eye movements, the magnitude of retinal and reference signals corresponds to the encoded velocity of eye and image movements (Footnote 7).
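The predicted linear growth of the threshold distance can be sketched as follows. This is an illustration under explicit assumptions, not a fit to the reported data: the JND is taken to be strictly proportional to reference signal size, the Weber fraction of 0.2 is a hypothetical value, the gain of the reference signal is set to 1, and the function name is illustrative.

```python
def predicted_stimulus_thresholds(v_eyes_s: float, weber_fraction: float = 0.2):
    """Return (with-threshold, against-threshold) stimulus velocities in space (deg/s).

    Positive values mean stimulus motion in the direction of the eyes.
    """
    v_ref = v_eyes_s                   # assumed gain of 1
    jnd = weber_fraction * abs(v_ref)  # Weber's law, assumed strictly proportional
    # Thresholds are reached when the retinal signal is one JND below or above Vref;
    # stimulus velocity = eye velocity in space - retinal image velocity.
    v_stim_with = v_eyes_s - (v_ref - jnd)
    v_stim_against = v_eyes_s - (v_ref + jnd)
    return v_stim_with, v_stim_against


for v_eyes in (5.0, 10.0, 20.0):
    with_thr, against_thr = predicted_stimulus_thresholds(v_eyes)
    print(f"eye velocity {v_eyes:4.1f} deg/s: "
          f"threshold distance {with_thr - against_thr:.1f} deg/s")
```

Under these assumptions, doubling eye velocity doubles the distance between the two thresholds, which is the linear relation the model predicts.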

------------------------- Fig. 2 about here -------------------------

------------------------- Fig. 3 about here -------------------------

In Fig. 3 the data from the same experiment are plotted in terms of a relation between retinal image velocity and eye velocity (in space). The dotted line in this graph divides the vertical distance between the two threshold lines in half. It thus represents retinal image velocity at the midpoints between the two opposite thresholds, or Vret.PSS, i.e. it gives the magnitude of Vref at any eye velocity (in space), and according to equation 7, its slope reflects Gref.

In this particular experiment Gref was approximately 1, i.e. eye velocity in space was encoded more or less correctly in the reference signal. It should be noted that in this study the stimulus pattern was present on the screen throughout each pursuit eye movement that was made across it. Hence, during the eye movements there was always retinal flow. Therefore, the reference signal must - apart from its efference copy component - have contained a (relatively small) visual component. Had the stimulus been very small and visible only briefly during each pursuit eye movement, no such visual component would have been generated, because with such stimuli retinal afferents are too small and too short-lived to pass through the Optokinetic pathway (given its low spatio-temporal band pass gating characteristics). Consequently, it is predicted that with small and briefly visible stimuli the reference signal (its size and gain) should be less than with large stimuli that remain visible for a longer period.

Experiments with such small and briefly visible stimuli (performed in total darkness) have been reported by Mack and Herman (1978). They do indeed indicate the presence of undersized reference signals (Gref < 1), because they yield high with- and low against-thresholds: at the PSS the stimuli always moved slightly in the same direction as the eyes. Since in these experiments reference signals could have consisted only of an efference copy, this evidences that during smooth pursuit eye movements eye velocity in the head is underregistered in the efference copy.

In the Mack and Herman study the asymmetry between the with- and against-thresholds was quite strong. The against-threshold was often so low that it actually became "negative", i.e. when stationary, the stimuli were still perceived as moving above threshold against the eyes (to reach the against-threshold they must be moved slightly with the eyes). This phenomenon is known as the Filehne illusion (Filehne, 1922; Mack and Herman, 1973; Wertheim, 1987; De Graaf and Wertheim, 1988). Its occurrence always implies a significantly undersized reference signal (Footnote 8).

However, the Wertheim 1981 study does not necessarily prove the existence of reference signals which include a visual component. Since the stimulus was quite large (38 x 20 deg), the absence of a Filehne illusion could be explained as an instance where, according to Dual Mode theory, a Direct mode of perception has occurred: visual capture may have happened, or the "stationarity tendency" of large stimuli may have counteracted the Filehne illusion.

To test these hypotheses against the present one, the Wertheim 1981 study was replicated with a large, but briefly visible stimulus pattern, flashed on the screen for only 300 ms during the pursuit eye movement (Wertheim and Bles 1984; Wertheim 1985). Since briefly visible stimuli, whatever their size, cannot be optokinetic (do not pass the low temporal band pass gating in the Optokinetic pathway) they cannot generate a visual component in the reference signal (see Fig. 1). Hence the Filehne illusion should reappear. But according to a visual capture or stationarity tendency hypothesis, no such illusion should occur with such a large stimulus. As shown in Fig. 4, however, the illusion was observed.

------------------------- Fig. 4 about here -------------------------

Nevertheless, the support for the present model is still not definitive, because visual capture or a stationarity tendency might need more than 300 ms to build up. To test the model against this possibility, the experiment was repeated, but now with stimuli varying in optokinetic potential (Wertheim 1976). A very powerful optokinetic stimulus should induce such a large visual component that reference signals may become oversized (Gref > 1). In terms of equation 7, this means that to reach the PSS such a pattern should be moved against the eyes. If such an effect is strong enough an inverted Filehne illusion should be observed (the stimulus would, when stationary, seem to move with the eyes). No visual capture or stationarity tendency hypothesis can be compatible with such a result. Various stimulus patterns were used. Each one consisted of a large sinusoidal grating of a particular spatial frequency. Low spatial frequency patterns have a stronger optokinetic potential than high spatial frequency patterns (Berthoz and Droulez, 1982; Bonnet, 1982; De Graaf et al., 1990). Hence the former should create a larger visual component in the reference signal than the latter, and with very low spatial frequencies the reference signal might become oversized.

And indeed this happened: when the patterns were made visible long enough (1 sec) to generate a visual component in the reference signal, the lowest spatial frequency pattern created an inverted Filehne illusion, and increasing spatial frequency reduced Gref. At the highest spatial frequency Gref even became less than 1 again (Footnote 9). Interestingly, when the gratings were presented only briefly (300 ms) during the pursuit eye movement, the normal Filehne illusion was again always observed (Gref being approximately 0.8) and spatial frequency had no effect. This was in line with expectations, because such briefly visible stimuli, whatever their spatial frequency characteristics, have no optokinetic potential.

The conclusion that reference signal gain can actually be modulated invalidates the empirical basis on which the compromise of Dual Mode theory rests. The point is that the empirical criterion, which makes it possible to identify whether a percept is Direct or Inferential, depends on the issue of perceptual veridicality, an issue closely tied to the idea that extraretinal signals are always undersized.

The traditional claim of Direct Perception theory is that perceptual deviations from reality indicate a lack of information in the optic array, i.e. particular invariants are absent, incomplete or have changed structurally. Such instances do not reflect (deficient) characteristics of the perceptual picking-up mechanism, but "impoverished" visual information in the environment, often believed to be an artefact of laboratory conditions. Normal, ecologically relevant, percepts are thought to be veridical (for some discussions of the central role of veridicality in Direct Perception theory, see Gyr, 1972; Ullman, 1980; Lombardo, 1987).

To Inferential theory, the extent to which percepts deviate from reality reflects the extent to which the gain of extraretinal signals deviates from 1. Since the Mack and Herman (1973) studies on the Filehne illusion (see above) it has been assumed that extraretinal signals have a gain less than 1. Consequently, Inferential theory has always found it difficult to explain instances of really veridical perception (see e.g. Matin, 1982).

These contradictory views have (implicitly) led to the decision rule of Dual Mode theory: If a percept is not veridical, this evidences that it must have been mediated Inferentially, i.e. with the help of (insufficient) extraretinal information, and if the percept is veridical, it must have been mediated Directly (see for some examples of this reasoning: Mack 1978; Matin 1982; Stark and Bridgeman 1983; Bridgeman and Graziano 1989). The evidence from the threshold experiments mentioned above shows the flaw in this argumentation: it is the implicit, but mistaken, belief that Inferential perception should always be biased because the reference signal is always undersized. This is not true. Reference signal gain is not a constant. Hence Inferential perception may or may not be veridical. Perceptual veridicality thus becomes a matter of degree and depends on whether or not (and how much) Gref deviates from 1. The present conclusion that reference signal gain is not fixed, but can be modulated by retinal flow, thus destroys the criterion for distinguishing between Direct and Inferential perceptual modes, and thereby invalidates the empirical base of Dual Mode theory.

The present model - the notion of a visual component in reference signals - provides a new explanation (without the need for Dual Mode theory) of why in normal daylight circumstances no illusory motion of the world occurs during an eye movement: such illusions only happen if Gref differs significantly from 1. Although efference copy components in reference signals are indeed too small, the reference signals themselves usually are not: eye movement induced retinal image flow generates an additional compensatory visual component (the compensation need not be very precise: Vref must only be enhanced enough to make its difference with Vret less than one JND). Actually, the reason that efference copies associated with pursuit eye movements are undersized, may be that if they were not undersized, an eye movement induced visual component would oversize the reference signal, and this could create illusory motion of the world.

The present model is also able to explain center-surround induced motion without using the concepts of "object-relative" and "subject-relative" motion: When the stationary center stimulus is fixated with the eyes, the moving surround induces image flow across the retinae, and this generates a (relatively small) reference signal. But the image of the center stimulus does not move on the retinae, and thus generates a zero retinal signal. Hence, the center stimulus is perceived as moving in space. When the surround is pursued with the eyes, the illusion corresponds to the Filehne illusion: the small stationary stimulus seems to move against the eyes during a pursuit eye movement (see section 5.3 for a quantitative treatment of induced motion).

5.2 Velocity perception

We are now in a position to investigate some basic assumptions of Direct Perception theory. To this purpose we will begin with a closer look at Fig. 3. Imagine a horizontal line crossing this graph. Along this line Vret remains constant, which means that we always have the same retinal image flow characteristics (invariants): those present in the retinal image flow at the intersection between the vertical axis and the horizontal line. But when we move from left to right along this horizontal line, the percept varies. First the stimulus is seen to move against the eyes. But then, with increasing eye velocity, the perceived velocity of the stimulus is reduced until, at a certain eye velocity, the (against-) threshold is reached. After this point the stimulus is seen as stationary across a certain range of eye velocities. At the end of that range the with-threshold is reached. Now the stimulus is again perceived as moving, but in the other direction (with the eyes) and now its perceived velocity increases with eye velocity. In other words, all percepts of motion, stationarity, direction and velocity depend on the ratio between retinal image velocity and eye velocity (in space). This means that, contrary to the claims of Direct Perception theory, the invariants present in a particular instance of image flow have themselves no fixed perceptual significance. In defence of Direct Perception theory, it might be postulated that the invariant which must be "picked up" to perceive object motion could be a "higher order" invariant (similar to the one mentioned in footnote 2), consisting of the ratio between a normal invariant present in the retinal image flow (Vret) and eye velocity information. But that would be contradictory to the basic idea of Direct Perception theory that the percept of object motion derives exclusively from retinal information. The point is that such a "higher order" invariant actually represents the main Inferential principle: next to retinal information eye movement information is always necessary.

The claim that the perceived above threshold velocity of a visual stimulus depends on the relation between retinal image velocity and eye velocity (in space) is incompatible with Direct Perception theory for another reason as well. According to this theory, eye movements are considered exploratory information sampling activities, necessary to "pick up" invariants. They do not (i.e. should not) affect percepts of object motion. If anything, they might enhance the quality of such percepts, but they do not define them (see e.g. Gibson 1979, p. 219).

In terms of the present model, the claim that perceived stimulus velocity depends on both how the image moves across the eyes and how the eyes move (in space), can be formalized as follows: perceived stimulus velocity depends on how much retinal and reference signals differ, minus the JND between them, or

(Formula 9)

where Vest.s signifies the subjectively perceived velocity of the stimulus in space, and Vref and Vret the magnitudes of the concurrent reference and retinal signals respectively. The threshold is represented by the additional requirement that Vest.s remains zero as long as |Vref - Vret| ≤ JND. Note that when the eyes move faster across a stimulus, Vref and Vret increase equally, so their difference remains the same. However the JND grows (Weber's law), reducing Vest.s. Hence the present model predicts that during (faster) pursuit eye movements we should underestimate stimulus velocity in proportion to the increased JND, or, stated differently, Vest.s should depend on eye movement induced changes in the thresholds for motion.
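One reading of equation 9 that is consistent with this verbal definition and with the threshold requirement is given below. It is offered as a reconstruction under that assumption, not as the published formula.

```latex
V_{est.s} =
\begin{cases}
0, & |V_{ref} - V_{ret}| \le \mathit{JND} \\[4pt]
(V_{ref} - V_{ret}) - \mathit{JND}, & |V_{ref} - V_{ret}| > \mathit{JND}
\end{cases}
```

Under this reading, an equal increase in Vref and Vret leaves the difference term unchanged, so any reduction in Vest.s during faster pursuit comes from the growth of the JND alone.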

To test this prediction, a velocity magnitude estimation experiment was carried out, in which stimulus velocity was judged while pursuit eye movements - of various velocities - were made across the stimulus pattern (Wertheim and Van Gelder 1990). The results showed that when the stimulus moved in the same direction as the eyes, Vest.s was indeed underestimated as much as the with-threshold for motion was elevated.

When stimuli moved against the eyes the underestimation of Vest.s was less pronounced and with high stimulus velocities it was even absent. One explanation is that the high retinal image velocity afferents which occur in against-the-eyes conditions, may not so easily pass the low temporal bandpass gating mechanism in the Optokinetic pathway (see Fig. 1). This would decrease the (visual component in the) reference signal, i.e. reduce Vref in equation 9. Vest.s then increases, because the difference between Vret and Vref increases (Vret is always larger than Vref when stimuli are perceived as moving against the eyes - see Fig. 3). That counteracts the underestimation effect. Another explanation could be as follows: When a stimulus is perceived as moving in the same direction as the eyes, Vret is always smaller than Vref (see Fig. 3). Hence, in equation 9, (Vref - Vret) is positive. As soon as it grows larger than one JND, Vest.s increases from its initial zero level. But when stimuli are perceived as moving against the eyes, Vret is larger than Vref (see Fig. 3), which means that the factor (Vref - Vret) is negative. As long as the absolute value of the factor (Vref - Vret) remains less than one JND, Vest.s remains zero, i.e. below threshold. But as soon as it grows larger than one JND, the absolute value of Vest.s in equation 9 becomes larger than 2JND. Thus a discontinuity may occur immediately above the against-threshold: Vest.s does not gradually increase from zero but jumps to a higher level, canceling the velocity underestimation effect of the increased threshold.
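The asymmetry described in the second explanation can be made concrete with a small numerical sketch. It assumes the literal form suggested by the verbal definition of equation 9, namely Vest.s = (Vref - Vret) - JND above threshold and zero otherwise. The function name and the numbers are hypothetical.

```python
def perceived_velocity(v_ref, v_ret, jnd):
    """Perceived stimulus velocity in space under the assumed form of equation 9.

    Positive values mean motion with the eyes, negative values motion
    against the eyes; 0.0 means the stimulus is below threshold.
    """
    difference = v_ref - v_ret
    if abs(difference) <= jnd:
        return 0.0
    return difference - jnd


# Hypothetical signals (deg/s) with a JND of 1 deg/s.
just_above_with = perceived_velocity(v_ref=10.0, v_ret=8.9, jnd=1.0)
just_above_against = perceived_velocity(v_ref=10.0, v_ret=11.1, jnd=1.0)
print("just above the with-threshold:   ", just_above_with)
print("just above the against-threshold:", just_above_against)
```

Just above the with-threshold the perceived velocity starts close to zero, whereas just above the against-threshold its absolute value already exceeds twice the JND, which is the discontinuity referred to in the text.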

An effect opposite to the threshold related underestimation of stimulus velocity with stimuli moving in the same direction as the eyes, should happen when the eye movement is stopped abruptly (e.g. when the fixation point sweeping across the stimulus pattern is suddenly arrested). This reduces the threshold, and the stimulus should thus suddenly be perceived as accelerating, i.e. as moving faster than when the eyes were still moving. This acceleration illusion was also reported by Wertheim and Van Gelder (1990), who showed it to be independent of other factors, such as the sudden change in Vret itself or in relative velocity between the (images of) the stimulus pattern and the fixation point.

The underestimation phenomenon with stimuli that move in the same direction as the eyes, explains the so called Aubert-Fleischl phenomenon: the perceived velocity of a stimulus is less when it is pursued with the eyes, than when it moves - with the same speed - across stationary eyes (Fleischl, 1882; Aubert, 1886, 1887; Gibson et al., 1957; Dichgans et al., 1969; Mack and Herman, 1972; Dichgans et al., 1975). The phenomenon also occurs in a visually "rich" environment and has been recognized as anomalous to Direct Perception theory (Gibson et al., 1957). The present model explains the phenomenon as being identical to the velocity underestimation phenomenon during pursuit eye movements: when a stimulus is tracked with the eyes, it moves in the same direction as the eyes, and thus its velocity is underestimated. The fact that the stimulus is actually tracked with the eyes is irrelevant (for a quantitative analysis of this claim, see Wertheim and Van Gelder, 1990).

This explanation obviates another slightly different version of Dual Mode theory, one originally designed to explain the Aubert-Fleischl phenomenon (Dichgans and Brandt, 1972). Accordingly, we perceive motion either in an "afferent mode" from image motion across (stationary) eyes, or in an "efferent mode" by identifying object motion with ocular motion, i.e. during ocular pursuit of the stimulus (actually the "efferent mode" has also been considered as one of three modes of visual perception - see e.g. Wallach et al., 1982; Wallach, 1987 - the other two being related to retinal image motion cues and to object-relative motion cues). Presumably, the "efferent mode" is less precise, yielding slower velocity percepts. The modes have been identified with the Direct and Inferential modes mentioned earlier (Mack and Herman, 1972, Mack, 1986), the slower percepts of the "efferent mode" being explained as caused by the underregistration of eye velocity in the efference copy.

Interestingly, Dichgans et al. (1975) reported the Aubert-Fleischl phenomenon to be less pronounced with low than with high spatial frequency stimuli. The reason was that the perceived velocity of gratings moving across stationary eyes was reduced with lower spatial frequencies, and this did not happen when the gratings were pursued with the eyes (see also Diener et al., 1976). In terms of the present model this is explained as follows: When gratings move across stationary eyes they generate retinal flow, which induces a reference signal that consists only of a visual component. But low spatial frequency gratings are more optokinetic than high spatial frequency ones. Hence, the former should induce larger reference signals than the latter, i.e. larger JND's (Weber's law), and thus higher thresholds. Since, as explained above, higher thresholds create slower perceived velocities, low spatial frequency stimuli will appear to move slower across stationary eyes than high spatial frequency ones. When the gratings are tracked with the eyes, spatial frequency has no effect, because there is no image flow across the retinae, i.e. no visual modulation of reference signals.

This also explains the stationarity tendency of large stimuli: they are simply more optokinetic than small ones. Therefore they have higher motion thresholds and their perceived above threshold velocities are correspondingly reduced. Thus, there is no need to assume that large stimuli tend to act as a perceptual frame of reference (Mack and Herman, 1978) - an assumption which is questionable anyway: a frame of reference does not define its own motion or stationarity.

Patterns moving across the retinal periphery also seem to have more optokinetic potential than when they move centrally (Dichgans and Brandt, 1978). Thus, when a stimulus moves continuously across the retinal periphery of stationary eyes, it may gradually produce quite a large reference signal (composed of only a visual component). Hence, the difference between retinal and reference signal is gradually reduced, which should result in a decrease of perceived stimulus velocity. In some cases the difference may even become less than one JND, causing the stimulus to appear as stationary. Such phenomena have indeed been reported (Cohen, 1965; MacKay, 1982; Hunzelmann and Spillmann, 1984).

5.3 Absolute versus relative motion perception

So far, when referring to the present model, the terms "stimulus velocity", "threshold for motion" or "perceived motion" meant motion of objects relative to external space (i.e. 3-D "Newtonian" space, as defined by the horizontal surface of the earth and its gravitational field). Henceforth, this will be termed absolute motion. Now let us consider the perception of motion of objects relative to each other, which will be called relative motion (see Kinchla, 1971, for a similar use of the terms absolute and relative motion).

Assume that the eyes sweep across two stimuli, S1 and S2, moving relative to each other. According to equation 9 (section 5.2), the subjectively estimated absolute velocity of S1 (Vest1.s) equals the difference between the eye movement induced reference signal and the retinal signal (Vret1), minus the JND:

(Formula 10)

Similarly, with respect to S2 we may write:

(Formula 11)

The subjectively perceived velocity of S1 relative to S2 (Vest1~2) equals the difference between Vest1.s and Vest2.s. Hence:

(Formula 12)

This means that the perceived velocity of two stimuli relative to each other should be independent of how the eyes move (i.e. of reference signals) and only depend on the difference between the two associated retinal image velocities minus a noise factor (Footnote 10).
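The step from the two absolute estimates to the relative one can be written out as follows. This is a sketch reconstructed from the verbal definitions in this section, not the published formulas: the subscripted JND terms and the symbol N for the residual noise factor are labels assumed here.

```latex
\begin{align}
V_{est1.s} &= V_{ref} - V_{ret1} - \mathit{JND}_1 \\
V_{est2.s} &= V_{ref} - V_{ret2} - \mathit{JND}_2 \\
V_{est1\sim2} &= V_{est1.s} - V_{est2.s}
              = (V_{ret2} - V_{ret1}) - N,
\qquad N = \mathit{JND}_1 - \mathit{JND}_2
\end{align}
```

Because both stimuli are viewed during the same eye movement, the same Vref appears in the first two lines and cancels in the third, which is why the relative estimate does not depend on the reference signal.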

Equation 12 is of course subject to the condition that Vest1~2 remains zero (below threshold) whenever |Vret2 - Vret1| ≤ JND~2. In terms of Weber's law this means that, at the threshold for relative motion between S1 and S2,

(Formula 13)

This prediction was tested (Wertheim and Niessen, 1986) by measuring the threshold for relative motion between two identical stimulus patterns, while subjects tracked a fixation point sweeping (at various velocities) across both stimuli. The results (Fig. 5) confirm equation 13.

------------------------- Fig. 5 about here -------------------------

This finding is theoretically important for the debate between Direct and Inferential theory. The point is that, since retinal image velocity is always equal to the difference between eye velocity in space (Veyes.s) and absolute stimulus velocity (Vstim.s), equation 12 can be written as

(Formula 14)

Hence, not only does the percept of relative motion between objects depend exclusively on retinal afferents (equation 12), it is also always veridical, because it corresponds to the physical description of how the objects move in space - apart from a noise factor (equation 14). These conclusions agree with the basic claims of Direct Perception theory, even though they follow from Inferential reasoning. Hence, with respect to relative motion, there is no disagreement whatsoever between the two approaches. It seems that the debate between the two theories actually reflects a failure to distinguish between percepts of relative motion (which are independent of reference signals) and percepts of absolute motion (which depend on reference signals). To state that both theories concern the perception of "motion" is to cause confusion. We should separate the concept of "motion" into absolute and relative motion, and correspondingly distinguish between percepts of absolute and relative motion (e.g. between seeing whether a car moves on the road and seeing whether it moves relative to another car).
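The independence of relative motion percepts from eye velocity can be checked with a short numerical sketch. It assumes the relation stated above, retinal image velocity = Veyes.s - Vstim.s, and treats the noise factor as a single hypothetical parameter set to zero by default. Function names and velocities are illustrative only.

```python
def retinal_velocity(v_eyes, v_stim):
    """Retinal image velocity as the difference between eye and stimulus velocity in space."""
    return v_eyes - v_stim


def perceived_relative_velocity(v_eyes, v_stim1, v_stim2, noise=0.0):
    """Perceived velocity of S1 relative to S2 from the two retinal signals alone.

    The noise parameter is a hypothetical stand-in for the noise factor in
    the text; differences not exceeding it are treated as below threshold.
    """
    v_ret1 = retinal_velocity(v_eyes, v_stim1)
    v_ret2 = retinal_velocity(v_eyes, v_stim2)
    difference = v_ret2 - v_ret1
    if abs(difference) <= noise:
        return 0.0
    return difference - noise


# The same two stimuli (3 and 1 deg/s) viewed with stationary and with moving eyes.
for v_eyes in (0.0, 6.0, 12.0):
    print(v_eyes, perceived_relative_velocity(v_eyes, v_stim1=3.0, v_stim2=1.0))
```

In each case the result equals the physical velocity difference between the two stimuli, since the eye velocity term cancels when the two retinal velocities are subtracted.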

In retrospect this makes sense: Inferential theory always concerns percepts of absolute motion, even if not mentioned explicitly (as for example in the literature on the Filehne illusion). Therefore it refers to illusions caused by properties of reference signals. Direct Perception theory is concerned with perception in natural "ecologically relevant" environments, i.e. with the perception of relative motion of objects moving against a visual background. If the background is seen as stationary in space, the relative motion of an object against the background equals its absolute motion in space. Hence all percepts of motion become veridical. To illustrate this, let S1 be an object moving against a visual background S2. The subjectively perceived absolute velocity of the object S1 can be expressed as:

(Formula 15) or (Formula 16) or (Formula 17)

Equation 16 shows that if a background is stationary (Vstim2.s = 0) and is also perceived as such (Vest2.s = 0), the absolute motion of the object, Vest1.s, is perceived veridically (apart from a noise factor). Equation 17 shows that this is even true in cases where the gain of reference signals differs from 1, if only the JND between Vref and Vret2 is large enough to maintain a percept of background stationarity. Note that this is an example of a veridical percept of absolute motion in the presence of an inappropriately sized reference signal (visual capture).
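The chain of substitutions behind these statements can be sketched as follows. This is a reconstruction from the definitions given in this section, with N as an assumed label for the noise factor, and should not be read as the published form of equations 15 to 17.

```latex
\begin{align}
V_{est1.s} &= V_{est1\sim2} + V_{est2.s} \\
           &= (V_{stim1.s} - V_{stim2.s}) - N + V_{est2.s} \\
           &= (V_{stim1.s} - V_{stim2.s}) - N + (V_{ref} - V_{ret2} - \mathit{JND})
\end{align}
```

With a stationary background that is also perceived as stationary, the second line reduces to Vstim1.s minus the noise factor. In the third line the bracketed background term is zero whenever the difference between Vref and Vret2 stays within one JND, whatever the gain of the reference signal.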

If, on the other hand, the background moves in space, its estimated absolute velocity, Vest2.s, is usually unequal to Vstim2.s (e.g. because of a size or spatial frequency induced stationarity tendency, or because it is perceived during an eye movement). Equation 16 shows that the percept of absolute object motion, Vest1.s, then becomes less veridical, i.e. unequal to Vstim1.s. This provides a quantitative description of center-surround induced motion: A stationary object (Vstim1.s = 0), seen against a moving surround (Vest2.s < Vstim2.s), is perceived as moving in space (Vest1.s ≠ 0).

Note that this view on induced motion differs from the one given by Wallach or Dual Mode theory (see section 5.1), according to which the crucial element of induced motion is the dominance of "object-relative" motion cues. Seen from the present perspective, however, induced motion is an illusion of absolute motion (see also Kinchla, 1971). The illusion is not that the center dot seems to move relative to its surround (this is seen correctly), but that the center dot seems to move in space. This is illustrated by the fact that we can also express induced motion formally by substituting (Veyes.s - Vstim1.s) for Vret1.s in equation 10:

(Formula 18)

Thus, if we fixate the stationary stimulus (Vstim1.s = Veyes.s = 0), it is seen to move in space with a velocity proportional to the visually induced Vref, created by the surround image flow across the retinae (minus the JND). Note that in such circumstances the illusion should develop gradually, because the induction of a visual reference signal is a gradual process (actually we should expect the duration of this process to become shorter with slower surround motion; see the discussion about the generation of vection in sections 2 and 3). If, on the other hand, the eyes track the surround, induced motion should be immediate, because Vref then consists of just an efference copy component (no image flow across the retinae), which is about 20% smaller than Veyes.s. Vest1.s is then proportional to (Vref - Veyes.s). If this is larger than one JND, induced motion (the Filehne illusion) occurs. There is indeed some empirical evidence (see Reinhardt-Rutland, 1992) which supports this claim, that induced motion develops gradually when the eyes fixate the stationary center stimulus, but is immediate when the eyes fixate the moving surround. A related prediction would be that no induced motion should occur if Vref approximates Veyes.s (see equation 18), i.e. if reference signal gain is close to 1. That may happen if the eye movement sweeps across the whole induced motion display in a normally illuminated environment, generating a visual component next to the efference copy.
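The three cases discussed in this paragraph can be worked through numerically. The sketch below assumes that equation 18 has the form Vest1.s = Vref - (Veyes.s - Vstim1.s) - JND above threshold and zero otherwise, which follows from the substitution described in the text. The function name, the JND of 1 deg/s and the velocities are hypothetical.

```python
def induced_motion(v_eyes, v_stim1, v_ref, jnd):
    """Perceived absolute velocity of the center stimulus under the assumed equation 18.

    Returns 0.0 when the difference between the reference signal and the
    retinal signal of the center stimulus stays within one JND.
    """
    v_ret1 = v_eyes - v_stim1
    difference = v_ref - v_ret1
    if abs(difference) <= jnd:
        return 0.0
    return difference - jnd


# Case 1: eyes fixate the stationary center; a visually induced Vref has built up.
fixating = induced_motion(v_eyes=0.0, v_stim1=0.0, v_ref=3.0, jnd=1.0)

# Case 2: eyes track the surround at 10 deg/s; Vref is an efference copy
# about 20% smaller than eye velocity.
tracking = induced_motion(v_eyes=10.0, v_stim1=0.0, v_ref=0.8 * 10.0, jnd=1.0)

# Case 3: reference signal gain close to 1, so no induced motion is expected.
unity_gain = induced_motion(v_eyes=10.0, v_stim1=0.0, v_ref=10.0, jnd=1.0)

print(fixating, tracking, unity_gain)
```

In the tracking case the result is negative, i.e. the stationary center appears to move against the eyes, which corresponds to the Filehne illusion. With a gain close to 1 the difference stays within one JND and the center is perceived as stationary.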

Although relative motion between objects is not affected by reference signals, it may be affected by eye movements for another reason: Eye movements made across various moving stimuli may increase retinal image velocities, i.e. retinal signals. That would not affect the differences between these retinal signals, but it would increase the JND's between them (Weber's law). According to equation 12, this should elevate the threshold for relative motion of objects with respect to each other (see Murphy 1978; Nakayama 1981) and reduce perceived relative velocities. This may cause a "freezing illusion": Imagine a screen on which various stimuli move relative to each other with different but not too high velocities. An eye movement across the screen will then increase the relative motion thresholds so much that the display seems to become motionless, as if suddenly frozen. Nakayama (1981) actually predicted that such a phenomenon should cause the disappearance of kinetic depth perception, which depends on the detection of small differences in relative velocity between many stimuli on a screen (e.g. Wallach and O'Connell, 1953; Braunstein, 1976).
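The "freezing illusion" can be illustrated with a minimal sketch. It assumes, for illustration only, that the JND between two retinal signals is a fixed Weber fraction of their mean magnitude. The function name, the Weber fraction of 0.1 and the velocities are hypothetical.

```python
def relative_motion_visible(v_eyes, v_stim1, v_stim2, weber_fraction=0.1):
    """Return True if the relative motion between two stimuli exceeds its threshold.

    Assumption of this sketch: the JND between the two retinal signals is a
    fixed fraction of their mean magnitude (Weber's law).
    """
    v_ret1 = abs(v_eyes - v_stim1)
    v_ret2 = abs(v_eyes - v_stim2)
    jnd = weber_fraction * (v_ret1 + v_ret2) / 2
    return abs(v_ret2 - v_ret1) > jnd


# Two stimuli moving at 1 and 2 deg/s, viewed with stationary eyes
# and during a 20 deg/s eye movement across the display.
print("stationary eyes:", relative_motion_visible(0.0, 1.0, 2.0))
print("moving eyes:    ", relative_motion_visible(20.0, 1.0, 2.0))
```

The difference between the two retinal velocities is the same in both cases, but during the eye movement the retinal signals are much larger, so the assumed JND grows beyond that difference and the relative motion falls below threshold.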

5.4 Interfacing ego- and object motion perception and visual-vestibular interactions

The main function of the vestibular apparatus is to signal head movements to the brain. According to Direct Perception theory, this is only confirmative information because the visual system does the same through visual kinaesthesis. No particular interaction between these two kinds of information is postulated (but see footnote 2) and the visual perception of object motion is thought to be independent of vestibular stimulation. As mentioned earlier, eye and head movements are viewed as exploratory information sampling activities, which, if anything, should only improve perception. In the literature of Direct Perception there is only one exception. This is the case of overstimulation of the vestibular apparatus. Such overstimulation yields a percept, or awareness, of self-motion which differs from that of visual kinaesthesis: orderly percepts are disturbed and the observer experiences a sense of disorientation, part of which consists of perceiving the visual world as moving. However, since Direct Perception theory is mainly interested in normal (ecologically relevant) perceptual conditions, it has no formal model for what happens in such cases, apart from the assumption that such conditions make retinal events "obtrusive" (Gibson 1968).

As shown above, the present model differs from this view. Although it agrees that percepts of relative motion may indeed be independent of vestibular stimulation, this is not the case with percepts of absolute motion. Here reference signals are always involved, and they may include a vestibular component. The idea of a vestibularly induced kind of efference copy was first proposed by Sperry (1950), who called it a "corollary discharge". This term often features in the Inferential literature (see e.g. Jeannerod et al., 1979). To a certain extent the present model agrees with this idea. The difference is, however, that the present model assumes that vestibular stimulation does not generate an independent signal, but a component in the reference signal. This is not just a matter of semantics, because the further assumption that reference signals may also include a visual component now introduces a new element: it implies an interaction between visual and vestibular information within the reference signal (i.e. within the brain's estimate of how the eyes move in space). As a result the neurophysiological literature on visual-vestibular interactions, which mainly stems from research on ego-motion perception, now becomes relevant to the study of the visual perception of object motion.

Although this literature is much too large to review in the present paper, it should be mentioned that it often includes speculations about possible neural substrates of what we have called reference signals. For example, physical movements of the eyes in space - irrespective of whether they are caused by eye movements in the head, head movements, or both - have been recognized in the output activity of cells in the Vestibular Nuclei (Berthoz et al., 1981; Cohen, 1981; Fuchs and Kim, 1975; McCrea et al., 1981; Yoshida et al., 1981), in the Flocculo-Nodal lobe of the Cerebellum (Lisberger and Fuchs, 1987a,b; Cohen, 1981; Stone and Lisberger, 1990a,b) and in the vestibular cortex (Büttner and Büttner, 1978; Büttner and Henn, 1981). The activity of some of these cells is in fact modified by visual stimulation, i.e., by retinal image motion or optic flow (see e.g. Waespe et al., 1981; Waespe and Henn, 1981; Watanabe, 1984; Noda, 1986; Nagao, 1988). The time course of this modulation differs among cells, but seems to be slowest in the vestibular cortex. Hence the output activity of cells in that area might represent the neurological substrate of reference signals (see also Straube and Brandt, 1987). The neural networks of which these cells are part have largely been charted out (e.g., Waespe and Henn, 1979, 1981; Henn et al., 1980; Ito, 1982; Precht, 1982; Berthoz and Melvill Jones, 1985; Straube and Brandt, 1987; Xerri et al., 1987, 1988; Barthélémy et al., 1988; Cohen and Henn, 1988). They are sufficiently complex to allow for a subsystem such as described in Fig. 1 (or Fig. 7 below).

To illustrate how closely object motion perception is linked with self- and ego-motion, let us analyze the occurrence of saturated vection, not as described earlier for the case of circularvection in an optokinetic drum (section 3), but as it occurs in a normal everyday kind of situation. Imagine a train engineer, seated at the front of a train, who looks straight ahead and makes no head movements. When the train starts moving it accelerates. The vestibular apparatus, which only reacts to accelerations, responds. Integration of the response produces information about head velocity in space (recognizable at the level of single cell activity - see e.g. Benson, 1990), which according to the present model is used to generate a reference signal that provides the visual system with an estimate of how fast the eyes move in space (see Fig. 1). If the estimate is not too much in error, the reference signal will be approximately equal to the retinal signal evoked by the moving image of the visual world. Since small differences are masked by the JND, this keeps the world perceptually stable. When the train reaches a constant velocity, the vestibular apparatus becomes silent, but now vection takes over to maintain the sensation of ego-motion, i.e., the decreasing vestibular component in the reference signal is replaced by a gradually growing visual one. The reference signal thus maintains its size and the percept of a stable world remains. Without this visual-vestibular interaction, the reference signal would decrease with the decrease of vestibular reactivity, and the world would lose its stability; it would seem to "rush" towards the observer. This illustrates an important ecological function of the visual-vestibular interaction taking place within reference signals: to interface the perception of a stationary world with the perception of ego-motion.
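
The hand-over from the vestibular to the visual component in the train example can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the exponential time courses, the time constants (`tau_vest`, `tau_vis`) and the JND fraction are hypothetical values chosen for the sketch, not parameters reported in this paper. Signals are expressed as fractions of the velocity of the eyes in space, so that the retinal signal of the stationary world equals 1.

```python
import math

def reference_gain(t, tau_vest=10.0, tau_vis=20.0, with_visual=True):
    """Reference signal, as a fraction of ego-velocity, t seconds after the
    train has reached constant velocity (assumed exponential time courses)."""
    vestibular = math.exp(-t / tau_vest)                 # decaying vestibular component
    visual = (1.0 - math.exp(-t / tau_vis)) if with_visual else 0.0  # growing visual component
    return vestibular + visual

def world_seen_stable(gain, jnd_fraction=0.3):
    """The stationary world yields a normalised retinal signal of 1; it is seen
    as stable while its difference with the reference signal stays within one
    JND (hypothetical JND fraction)."""
    return abs(1.0 - gain) <= jnd_fraction

# Compare the reference signal with and without a visual component.
for t in (0, 15, 60):
    with_vis = reference_gain(t)
    without_vis = reference_gain(t, with_visual=False)
    print(t, round(with_vis, 2), world_seen_stable(with_vis),
          round(without_vis, 2), world_seen_stable(without_vis))
```

With these assumed values the summed reference signal never drifts more than one JND away from the retinal signal, whereas the vestibular component alone falls far below it once the acceleration has ended.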

However, such interactions do raise a problem for the present model: it is a well established fact that the time course of development of a vestibular response differs from that of an optokinetic one (see e.g. Dichgans and Brandt, 1972; Henn et al., 1980). Hence the development of the various reference signal components is not always synchronous. For example, the vestibular apparatus reacts fast to relatively high frequency self-movements, but it may take longer before a visual component is fully grown. Another problem is that the integration of vestibular information into velocity information is not perfect and depends on the frequency range within which the vestibular system responds (Benson, 1990). Thus, it is unlikely that, during activities like running or dancing, Gref is continuously close to 1. Nevertheless, we usually do not experience illusory motion of the visual world when engaged in such activities. The answer is probably that the JND's of reference signals which include a vestibular component are very large. That would mask quite large unwanted differences between retinal and reference signals.

To investigate this issue, Wertheim and Bles (1984) measured the JND of reference signals during ego-motion. They rotated subjects sinusoidally (0.05 Hz, various amplitudes) on a rotating chair inside a totally darkened optokinetic drum, which could be rotated independently around the subject. The inside of the vertically striped drum could be illuminated briefly (400 ms). This was done at the point where the subject rotated at peak velocity. Thus the drum wall served as a (full field) stimulus pattern that could be moved in space with or against the direction of the subject's ego-rotation in space. The two opposite thresholds for absolute motion of the drum wall were measured at various ego-velocities. This yielded JND's of 35% of ego velocity. This is similar to findings of Wallach (1985), who reported that the distance between the with- and against-thresholds for perceiving object motion in space is very large when measured with subjects walking alongside the stimulus. His results suggest JND's that amount to 40% of ego-velocity.
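
How a JND of this kind follows from two opposite thresholds can be illustrated with a small Python sketch. It assumes that the two thresholds lie one JND on either side of the point of subjective stationarity (PSS), so that the PSS is their midpoint and the JND is half their distance. The function name and the threshold values are hypothetical, chosen only so that the example lands on 35% of ego-velocity; they are not data from the experiment.

```python
def pss_and_jnd(threshold_with, threshold_against):
    """Return (PSS, JND) from two opposite motion thresholds, both expressed
    as retinal image velocities on the same signed axis (deg/s).
    Assumes the thresholds lie one JND on either side of the PSS."""
    pss = (threshold_with + threshold_against) / 2.0
    jnd = abs(threshold_against - threshold_with) / 2.0
    return pss, jnd

ego_velocity = 20.0                      # deg/s, hypothetical
pss, jnd = pss_and_jnd(13.0, 27.0)       # hypothetical thresholds
print(pss, jnd, jnd / ego_velocity)      # PSS, JND, JND as fraction of ego-velocity
```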

Such large JND's should indeed facilitate a smooth interfacing of ego-motion with percepts of environmental stationarity. However, the price is a dramatic increase of the perceptual thresholds for absolute object motion during ego-motion and, because of that, a strong underestimation of absolute object velocity during ego-motion. Such effects are indeed well documented (Pavard and Berthoz, 1977; Büchele et al., 1980; Probst et al., 1980; Berthoz and Droulez, 1982; Probst et al., 1984, 1986).

But the mechanism which serves the ecological function of interfacing percepts of ego-motion and environmental stability has more drawbacks. Percepts may be produced that are exactly opposite to what they should be: a really moving scene can erroneously be seen as stationary. This happens when we see a moving train close to the window of our own stationary train: the moving train acts as an optokinetic stimulus and creates a sensation of ego-motion. It thus generates a reference signal that grows in size until its difference with the retinal signal (encoding the retinal image velocity of the moving train) becomes less than one JND. The moving train is then erroneously seen as stationary in space. Basically, this is the same phenomenon as the development of saturated circularvection in an optokinetic drum.

The opposite, illusory motion of an actually stationary scene, may also occur. A common example is what happens after a period of extreme vestibular stimulation: After cessation of vestibular stimulation, neural activity of cells in the central areas upon which the vestibular afferents converge dies out only gradually (as evidenced by a continuation of reflexive nystagmus eye movements, called "afternystagmus"; see e.g. Henn et al., 1980). Hence, a residual vestibular component remains present in reference signals and oversizes them, causing illusions of environmental motion. Note that this explains the perception of environmental motion during dizziness. Here the present model differs from the traditional Inferential view that such percepts are caused by an absence of efference copies during such reflexive nystagmus eye movements.

Similar reasoning may apply to the Movement After Effect (MAE): when a stimulus pattern is suddenly stopped after having moved for a while across stationary eyes, it is perceived as moving slightly in the opposite direction. The illusion may last many seconds, during which the threshold for object motion in the original direction is elevated. Its most common, but still somewhat controversial, explanation is in terms of fatigued direction selective cells (see Denton, 1977; Favreau, 1976; Moulden, 1975; Sekuler et al., 1982). The present explanation is different: when the lights in an optokinetic drum are suddenly put out, vection decays only slowly and reflexive nystagmus eye movements continue for a while. This suggests a continuation of central neural activity upon cessation of retinal flow (Henn et al., 1980). Hence visually induced (components in) reference signals may also decay gradually after retinal flow stops. But as long as they last, a stationary stimulus, viewed with stationary eyes, will be seen as moving in space (see equation 18). The JND associated with that residual reference signal explains the elevated threshold for motion in the direction of the original retinal flow (i.e. in the vectorial direction of the reference signal).

These examples show that, according to the present model, appreciation of the research field of visual-vestibular interactions is a necessary requirement if we want to explain phenomena in the field of visual motion perception. However, the inverse is also true: the present experimental paradigm can serve as a research tool in the area of ego-motion perception and visual-vestibular interactions. The method of measuring reference signal magnitude (and gain) by measuring retinal image velocity at the PSS can be used to measure the gain of the response of the various parts of the equilibrium system (the semicircular canals, which react to angular accelerations, and the otoliths, which respond to linear accelerations of the head in space).

An example of such a study is the above mentioned Wertheim and Bles (1984) experiment, with subjects who were rotated inside an optokinetic drum. That experiment was not only designed to measure the JND between retinal signals and vestibularly induced reference signals. It also served as an attempt to measure the response of the semicircular canals (neglecting possible kinaesthetic feedback) and its interaction with reflexive nystagmus eye movements during ego-rotation in darkness. According to the present model, such ego-rotation should induce reference signals which consist of the vectorial sum of Vhead.s (the response of the semicircular canals) and a Veyes.h component. The presence of a Veyes.h component stems from the reflexive nystagmus eye movements which happen during stimulation of the semicircular canals (in a normally illuminated environment nystagmus eye movements serve to stabilize the visual gaze in space during ego-motion, but they also happen in darkness). When a subject is rotated around the vertical axis on a rotating chair, nystagmus consists of slow phase smooth compensatory eye movements in the direction opposite to head rotation, alternating with fast phase recuperating saccades in the same direction as head rotation. Thus, during slow phase nystagmus eye movements, reference signal magnitude should be smaller than during the suppression of nystagmus (suppression of nystagmus happens when we ask the rotating subject to fixate the eyes on a small head-stationary fixation point), because Veyes.h then approximates zero.

Wertheim and Bles tested this hypothesis by performing their drum experiment with and without the suppression of nystagmus. They showed that (at the 0.05 Hz ego-rotation frequency used in this experiment)

(Formula 19)

Hence, Vref was indeed reduced by slow phase nystagmoid eye movements (during slow phase nystagmus eye movements the sign of Veyes.h is opposite to that of Vhead.s) and increased when they were suppressed. Note that this means that slow phase nystagmus eye movements do in fact generate efference copies in reference signals in which only 72% of Veyes.s is registered, just as in the case of pursuit eye movements. This finding is at variance with the traditional view, mentioned earlier, that nystagmoid eye movements do not generate efference copy signals (Howard and Templeton, 1966; Johnstone and Mark, 1970, 1971, 1973; Kornhuber, 1974; Leibowitz et al., 1982; Raymond et al., 1984; but see Bedell et al., 1989, and Mittelstaedt, 1990, for experimental findings and theoretical views which agree with the present observation).

Since in this experiment subjects were rotated around their vertical axis in total darkness, the Vhead.s term in equation 19 actually reflects the gain of semicircular canal afferents (although some kinaesthetic feedback may also have been present). The small (7%) overregistration of head velocity in these afferents explains the Oculogyral illusion (when an observer is rotated in complete darkness, and nystagmus is suppressed with a head stationary fixation point, this fixation point, rotating with the observer, seems to move slightly faster than the observer; see e.g., Graybiel and Hupp, 1946; Whiteside et al., 1965; Elsner, 1971; Ross, 1974; Howard, 1982): the velocity of the fixation point in space is overestimated, because it corresponds to the difference between a zero retinal and a slightly oversized reference signal.
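
The arithmetic behind this account of the Oculogyral illusion can be made explicit with a short Python sketch. It rests on loudly stated assumptions: equation 19 is not reproduced in this text, so the sketch simply takes the two percentages mentioned above (a 7% overregistration of head velocity and a 72% registration of eye velocity) as weights in a linear sum of Vhead.s and Veyes.h. The 72% weight is applied to Veyes.h, the eye-in-head term named earlier in the text. The function names, the sign convention and the example velocities are hypothetical, and the JND is ignored.

```python
G_HEAD = 1.07   # assumed weight of Vhead.s (7% overregistration, from the text)
G_EYES = 0.72   # assumed weight of Veyes.h (72% registration, from the text)

def reference_signal(v_head_space, v_eyes_head):
    """Assumed linear sum of head-in-space and eye-in-head components (deg/s)."""
    return G_HEAD * v_head_space + G_EYES * v_eyes_head

def perceived_velocity_in_space(v_ref, v_ret):
    """Difference between reference and retinal signal. v_ret uses the same
    sign convention as v_ref, so a stationary object viewed with a perfect
    reference signal gives zero. The JND is ignored in this sketch."""
    return v_ref - v_ret

v_head = 30.0                                   # deg/s, hypothetical
# Nystagmus suppressed: the fixation point is head-stationary, so Veyes.h = 0
# and there is no retinal image motion of the fixation point.
v_ref_suppressed = reference_signal(v_head, 0.0)
print(perceived_velocity_in_space(v_ref_suppressed, 0.0))   # slightly above v_head

# Slow phase nystagmus: eyes move opposite to the head, which reduces Vref.
print(reference_signal(v_head, -20.0))
```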

Recently, the characteristics of reference signals created by linear accelerations of the head in space - the characteristics of the otolith afferent response - have also been investigated in a series of experiments at our laboratory (Zeppenfeldt, 1991; Wertheim, 1992a,b; Wertheim and Mesland, 1993). Here Vret.PSS was measured with subjects moving forward or backward on a linear track sled between two screens on which the stimulus (a checkerboard pattern) was flashed (300 ms). The subjects looked straight ahead (a fixation point was placed several meters in front of the endpoint of the sled's track), and they thus perceived the stimulus patterns peripherally. The sled moved sinusoidally (at 0.15 Hz and with a 109.5 cm/s peak velocity) and the experimental room was completely darkened to prevent the creation of a visual component in the reference signal (no retinal flow from the environment). Reference signals were measured with the monitors placed at various positions along the sled's track (i.e. at various phases of the sinusoidal sled motion) and the best fitting sinusoid through these data was calculated.

------------------------- Fig. 6 about here -------------------------

As can be seen in Fig. 6, the results showed that this particular ego-motion profile created undersized reference signals with a gain of 0.76 and a small phase lead of 3.8 deg. (a similar phase lead - of approximately 6 deg. - can be calculated on the basis of a mathematical model of the otolith system; see Grant and Best, 1987; Marcus, 1992). If such experiments are performed with linear ego-motion sinusoids of other frequencies and amplitudes, the full transfer function of the otoliths may become known (again, under the assumption of negligible kinaesthetic feedback).
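
A best fitting sinusoid of known frequency can be obtained by ordinary linear least squares on a sine and a cosine term, from which gain and phase follow. The Python sketch below illustrates this on synthetic, noise-free values generated from the gain (0.76) and phase lead (3.8 deg.) reported above. It is not the fitting procedure actually used in the experiments, and the eight sample times and the function name `fit_sinusoid` are assumptions made for the example.

```python
import math

def fit_sinusoid(times, values, freq_hz):
    """Least-squares fit of values ~ a*sin(wt) + b*cos(wt).
    Returns (amplitude, phase lead in degrees relative to sin(wt))."""
    w = 2.0 * math.pi * freq_hz
    s = [math.sin(w * t) for t in times]
    c = [math.cos(w * t) for t in times]
    sss = sum(x * x for x in s)
    scc = sum(x * x for x in c)
    ssc = sum(x * y for x, y in zip(s, c))
    sys_ = sum(v * x for v, x in zip(values, s))
    syc = sum(v * x for v, x in zip(values, c))
    det = sss * scc - ssc * ssc            # 2x2 normal equations, Cramer's rule
    a = (sys_ * scc - syc * ssc) / det
    b = (syc * sss - sys_ * ssc) / det
    return math.hypot(a, b), math.degrees(math.atan2(b, a))

FREQ = 0.15      # Hz, sled frequency from the text
PEAK = 109.5     # cm/s, peak sled velocity from the text
period = 1.0 / FREQ
times = [k * period / 8 for k in range(8)]          # assumed sample phases
lead = math.radians(3.8)
# Synthetic reference signal magnitudes (not measured data).
refs = [0.76 * PEAK * math.sin(2 * math.pi * FREQ * t + lead) for t in times]

amplitude, phase_deg = fit_sinusoid(times, refs, FREQ)
gain = amplitude / PEAK          # ratio of fitted amplitude to peak sled velocity
print(round(gain, 2), round(phase_deg, 1))
```

Because the frequency is fixed by the sled, the fit is linear in the two unknown coefficients, so no iterative optimisation is needed.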

In a similar experiment (see also Wertheim and Mesland 1993) we measured Gref at the point of maximum sled velocity (109.5 cm/s) in darkness, but now we compared it to a condition with the lights on in the experimental room. In darkness the reference signal was again undersized (Gref being 0.8), but when the lights were on, allowing for the generation of a compensatory visual component in the reference signal, Gref became 1. This pattern of results is remarkably similar to the one discussed in connection with the Filehne illusion (see section 5.1).

The same logic is used in a current research project, in which we investigate whether the otolith response changes after adaptation of the equilibrium system (adaptation is induced by rotating subjects in the gondola of a centrifuge such that they sustain a force of 3G for periods between 1 and 2 hours; see Bles et al., 1989; Ockels et al., 1989, 1990; Wertheim et al., 1989; Wertheim, 1992a; Wertheim, 1993).

The present paradigm might even prove useful in the clinical diagnosis of vestibular deficiencies. For example, in one study (Wertheim et al., 1985) a hypothesis was tested, according to which the resting level activity of the central vestibular system is abnormally noisy in Schizophrenia. Functionally, this implies very noisy reference signals, i.e. abnormally large JND's between retinal and reference signals, even if no head movements are made. Very high thresholds for motion were indeed observed with such patients. Similar findings were obtained with patients who were not schizophrenic, but who had been diagnosed otolaryngologically as having a noisy vestibular apparatus.

5.6 Conclusions

The controversies between Direct and Inferential theories of motion perception may, at least partly, have originated from different and sometimes contradictory philosophical views (Gibson 1973, Lombardo 1987). However, on the empirical level, most of the debate stems from the puzzling observation that the data gathered in "normal" or "everyday" kind of situations often differ from those gathered in strictly controlled laboratory conditions. The present model provides a theoretical alternative to the two approaches for two reasons. First, it explains the "puzzling" differences, by showing that the two approaches actually reflect research on different topics: Direct perception theory is concerned with the perception of relative motion and Inferential theory with the perception of absolute motion. Second, it describes (quantitatively) how the two topics relate to each other.

As a result a certain compatibility is created between premises from Direct and Inferential theories, premises that have traditionally been considered contradictory. Thus it is agreed with Direct perception theory that the perception of object motion may indeed stem exclusively from visual afferents, and that retinal flow itself only holds information about ego-motion and not about motion of the visual world. There is also agreement with the Inferential assumption that information about how the eyes move in space is necessary for perceiving absolute object motion. However, on other issues the model diverges from Direct and Inferential theories. Thus it disagrees with the Direct Perception assumption that self-motion is basically explorative and only serves to upgrade perception. It also disagrees with the assumption that to perceive absolute motion, the brain needs no estimate of how the eyes move in space. With respect to Inferential theory, the present model replaces the concepts of efference copy and corollary discharge with that of a (compound) reference signal. As it includes a visual component, the common assumptions that it should be considered extraretinal and has a fixed gain are also abandoned. Finally, since the present model actually describes how percepts of self-motion and of object motion interface, it broadens the scope of the literature relevant to the study of visual object motion perception to include that of visual-vestibular interactions.

So far this paper has been devoted to the description of the model and of an empirical paradigm which can be used to quantify its parameters and to test its predictions. The results of these empirical tests appear to support the model, and their theoretical implications are shown to resolve most of the controversies between Direct and Inferential theory, and seem to invalidate the theoretical rationale of Dual Mode theory. At the same time, new explanations have been given for many well known phenomena in the field of motion perception. What now remains is to review theories and data that may point to deficiencies of the present model and directions for further research.

6. PROBLEMS AND SPECULATIONS

6.1 The Post and Leibowitz model

Post and Leibowitz (1985) have proposed a version of Inferential theory which is at odds with the present model for two reasons. First, it assumes that a very large moving stimulus pattern always induces vection, irrespective of whether its image moves across the retinae (according to the present model vection develops only through retinal image flow) and, in addition, that such stimuli always cause reflexive optokinetic nystagmus. Second, according to Post and Leibowitz, efference copies - which, for the purpose of comparing their model with the present one, may be treated as reference signals - are proportional not to eye velocity, but to the effort invested in voluntary control of oculomotor activity. No effort is invested when eye movements are reflexive. Thus optokinetic nystagmus generates no efference copies, but its suppression (by focussing the eyes on a head-stationary fixation point) takes effort, and this does evoke efference copies. Presumably, with stronger optokinetic stimuli, it takes more effort to suppress optokinetic nystagmus. Consequently, just as in the present model, efference copies are proportional to the force of vestibular and optokinetic stimulation. Therefore, many predictions derived from the present model also follow from the Post and Leibowitz model.

However, the two models can be tested against each other, because they predict opposite effects when nystagmus is not suppressed. Consider a large stimulus pattern which moves sinusoidally in front of the observer, slow enough and with an amplitude small enough, to maintain continuous sinusoidal slow phase optokinetic nystagmoid eye movements without any fast phase saccadic eye movements. According to the Post and Leibowitz model, such conditions evoke sinusoidal vection, and since there is no efference copy (the eye movements are reflexive) and a zero retinal signal (no retinal image motion), the pattern should be seen as stationary in space. The present model predicts the opposite: First, in the absence of image flow across the retina no vection can develop. Second, the absence of retinal image motion implies a zero retinal signal, but the slow phase eye movements generate non-zero (efference copy composed) reference signals. Hence, the pattern should be perceived as moving in space.

Such an experiment was recently reported (Mergner and Becker, 1990). The stimulus consisted of a full field shadow pattern moving sinusoidally across a semi-circular screen. Subjects fixated a small fixation point, which was also projected on the screen and which could move independently. It moved synchronously with the shadow pattern, having the same frequency but a different amplitude, i.e. a different velocity. In their experiment Mergner and Becker started out with the fixation point moving much slower than the shadow pattern, causing retinal image motion of the pattern. In this situation sinusoidal vection always developed to saturation (at which point the shadow pattern appeared as stationary in space). They then gradually increased the velocity of the fixation point. Vection remained. However, at a certain moment fixation point velocity became equal to the velocity of the pattern, that is, it became part of the pattern. This is the critical condition because in terms of the Post and Leibowitz model the slow phase reflexive nystagmus eye movements are now completely unobstructed by any voluntary effort to track the target. At this moment all subjects experienced a sudden drop out of vection, perceived themselves as stationary and saw the pattern move in space, whatever the duration of the trial. This supports the present model and is contrary to the predictions of the Post and Leibowitz model.

6.2 Retinal image flow and vection

The Mergner and Becker (1990) experiment showed indeed that vection does not develop in the absence of retinal slip (see also Fig. 1). But this poses a problem: in an optokinetic drum circularvection always occurs, i.e. also when nystagmus is not suppressed. There may be two reasons for this.

First, if a full field stimulus pattern is tracked with the eyes from the extreme right to the extreme left (as during the slow phase of optokinetic nystagmus) its image does not move across the retinae, but illuminates different parts of the retinae. Maybe to the visual system this is also a vection inducing cue. Mergner and Becker used a stimulus that consisted of a shadow pattern with low contrast values. That may have reduced the salience of this cue. We tried to test this idea with some pilot measurements: with a sinusoidally moving optokinetic drum (high contrast black and white stripes) vection always developed, also when the stripes were tracked with the eyes. However, one may also track the stripes of the drum with the head (eyes stationary in the head). The retinal image of the stripes then always illuminates the same retinal area. It appeared that in such cases vection did indeed not develop.

Second, it is possible that the repetition of brief instances of image flow during the fast phases of normal optokinetic nystagmus has the potential to induce vection. If so, the gating mechanism in the Optokinetic pathway could be viewed as a velocity storage mechanism (Raphan et al., 1977) which can be loaded by brief repetitive instances of image flow across the retinae.

Another problem which should be mentioned here is that in section 3 it was argued that an optokinetic stimulus generates (a visual component in) reference signals because during vection the visual system assumes that the eyes move in space. But, if so, should not vection always occur when an optokinetic reference signal (component) is generated? Clearly this does not always happen. For example, when pursuit eye movements across a visual background generate image flow across the retinae, vection usually does not occur, not even if the background consists of a highly optokinetic stimulus pattern (such as used by Wertheim (1987) to invert the Filehne illusion). This suggests that the common pathway on which Optokinetic afferents and vestibular afferents converge branches off in two directions, one branch generating ego-motion, the other - its corollary - converging on the reference signal. Different gating mechanisms, i.e. different thresholds, may then be associated with the two branches (see Fig. 7).

------------------------- Fig. 7 about here -------------------------

6.3 A "visual efference copy"?

Ehrenstein et al. (1986a,b; 1987), using a briefly visible point stimulus, reported that the Filehne illusion increased dramatically with stimulus presentation times below 300 ms, implying a very strong reduction of reference signal size (in some cases even to zero). This poses a problem: without its visual component reference signal magnitude should remain constant, as it still contains an efference copy component, encoding about 80% of eye velocity in the head. Ehrenstein et al. measured the PSS with a forced choice method of constant stimuli using only two response alternatives (motion with or against the eyes), excluding "no motion" responses. Since with extremely brief stimulus presentations, motion perception may become ambiguous or impossible (see e.g. Henderson, 1971; Johnson and Leibowitz, 1976; Bonnet, 1982; Algom and Cohen-Raz, 1984), this may have caused a response bias. We failed to replicate Ehrenstein's finding with a larger stimulus pattern (Wertheim and Bekkering, 1991, 1992) using our standard staircase method of limits, in which the two opposite thresholds are measured separately and which thus always allows for "no motion" responses. Reducing presentation times to 150 ms never yielded large Filehne illusions (Gref remained approximately 0.8). However, the JND increased dramatically (suggesting that such brief retinal afferents are quite noisy). With presentation times below 150 ms, the JND became so high that retinal image velocity at the against-the-eyes threshold reached the upper limit for detecting image motion. In such cases subjects never perceive motion against the eyes, which means that the PSS can no longer be measured.

However, in another experiment (De Graaf and Wertheim 1988) with a very high spatial frequency stimulus pattern (a little cloud of dots), made visible for 300 ms in the retinal periphery, we did observe a very large Filehne illusion, suggesting a 0.5 reference signal gain. This is difficult to explain. A speculative solution would be to postulate that the efference copy is not (as traditionally assumed) a neural corollary of the efferent command signals to the oculomotor musculature (footnote 4), but stems from visual afferents (i.e., from visual kinaesthesis) that converge on the reference signal through a fast visual pathway with high spatio-temporal frequency band pass gating characteristics. In the retinal periphery such a "visual efference copy" would be smaller than in the foveal area, because the retinal periphery is much less sensitive to high spatial frequencies. This could also explain the Ehrenstein effect: retinal afferents from a single point stimulus that is extremely briefly visible might not generate such a "visual efference copy" at all, because they may not even pass this high spatio-temporal frequency band pass gating.

Anatomically, such a fast visual channel could be included in the Optokinetic pathway. There are some indications (Stone and Lisberger, 1990a,b) that the Accessory optic pathway contains both a low and a high spatio-temporal band pass gating mechanism, as reflected in the different temporal characteristics of the simple and complex spikes of floccular Purkinje cells.

Although the idea of such a fast visual channel in the Optokinetic (Accessory Optic) pathway is quite speculative, the idea is attractive because it further blurs the distinction between Direct and Inferential theory: a "visual efference copy" would make the concept of visual kinaesthesis compatible with Inferential theory. It might also add to the explanation why vection can be instantaneous when an optokinetic drum is very slowly set into motion and optokinetic nystagmus is not suppressed.

On the other hand, the idea of a "visual efference copy" should not be embraced too easily, because it also creates a serious problem: when a moving stimulus is properly tracked with the eyes in total darkness, there would be no efference copy, i.e. no reference signal. Since there is also (almost) no retinal image motion, i.e. no retinal signal, such a stimulus should be seen as stationary in space. This does not happen: we do see stimulus motion in such conditions. This seems quite incompatible with the concept of a "visual efference copy".

6.4 Signal magnitude

Recently, we observed that the Filehne illusion is age dependent (Wertheim and Bekkering, 1991, 1992): With very brief stimulus presentation times (150 ms), the usual illusion occurred with normal student subjects, but with older subjects it disappeared and with subjects over 50 years of age it was inverted (the correlation between Gref and age was approximately 0.7; n=38). With longer stimulus presentations Gref gradually returned to approximately 1 for all ages. Thus it appeared as if reference signals, i.e. efference copies, grow (beyond proportion) with increasing age. However, a more plausible explanation is that when people age, it takes more time to register image velocity in the retinal signal. Extremely brief stimulus presentations would then yield undersized retinal signals. Very high retinal image velocities are then needed to augment retinal signals enough to make them larger than reference signals, i.e. to reach the against-the-eyes thresholds. Hence stimulus velocity at the against-the-eyes threshold would become very high, and consequently Vret.PSS would increase, creating the impression of a very large reference signal.

This illustrates a particular complexity of the present model: In section 4 reference signal magnitude was operationalized as retinal image velocity at the PSS. However, this presupposes a proper encoding of retinal image velocity in the retinal signal. If retinal signals underregister image velocity (i.e. if their gain is less than 1), an inverted Filehne illusion should occur, which creates the impression that reference signals are oversized (see section 5.1). This means that the magnitude of retinal and reference signals cannot be assessed absolutely, but only relative to each other. Hence, to decide whether a particular condition really creates an in- or decrease in reference signal size, arguments additional to those mentioned in section 5 must be considered. For example, the occurrence of a Filehne illusion with briefly visible stimuli can only evidence undersized reference signals, not oversized retinal signals, because briefer stimulus presentations are unlikely to increase retinal signals.
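
The point that only the relative magnitude of the two signals can be assessed may be illustrated with a minimal Python sketch. It assumes, for the purpose of illustration only, that the PSS is reached when the retinal signal (retinal image velocity times a retinal gain) equals the reference signal (eye velocity times Gref). The symbol `g_ret` for the gain of the retinal signal and the function names are hypothetical, not notation used elsewhere in this paper.

```python
def retinal_velocity_at_pss(v_eyes, g_ref, g_ret):
    """Retinal image velocity at which the retinal signal g_ret * v_ret
    equals the reference signal g_ref * v_eyes (assumed linear encoding)."""
    return g_ref * v_eyes / g_ret

def apparent_reference_gain(v_eyes, g_ref, g_ret):
    """Gain as operationalized in the paradigm: Vret.PSS divided by eye velocity."""
    return retinal_velocity_at_pss(v_eyes, g_ref, g_ret) / v_eyes

# A veridical reference signal combined with an undersized retinal signal
# looks like an oversized reference signal.
print(apparent_reference_gain(10.0, g_ref=1.0, g_ret=0.5))

# Different pairs of gains with the same ratio cannot be told apart.
print(apparent_reference_gain(10.0, g_ref=0.8, g_ret=1.0),
      apparent_reference_gain(10.0, g_ref=0.4, g_ret=0.5))
```

Under this assumption the measurement yields only the ratio of the two gains, which is why additional arguments are needed to attribute a change to one signal or the other.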

6.5 The vectorial nature of retinal and reference signals

The present experimental paradigm is based on stimulus motion collinear with self- or ego-movements. Therefore it uses simple subtraction and addition of retinal and reference signals and of the components within reference signals. The only exception in this respect is the visual component in reference signals, which is likely to show non-linear interactions with the other components. The other additivities in the present model could be viewed, however, as a minimum requirement, at least as long as we do not discover evidence to the contrary. Thus, at present, the model considers calculations concerning these components as basically vectorial.

The somewhat more complex calculations, which ensue when retinal and reference signals are not collinear (i.e. when the stimulus and the eyes do not move collinearly), have recently been described by Mateeff et al. (1991). They could be extended to include 3D motion in space: since vestibular afferents encode 3D ego-motion, they may induce 3D (components in) reference signals.

Inferential theory was originally designed to describe the perception of position of stimuli in space as a function of eye position in the head (see e.g. Helmholtz, 1910; Matin and Matin, 1969; Mittelstaedt, 1990). Since velocity relates mathematically to position, it might be possible to extend the present model to include the subjective perception of the position of stimuli in space, and maybe also the perception of direction and orientation of stimuli in space. In addition, a model similar to the present one could be developed to describe perception during saccadic eye movements (see footnote 7).

6.6 Other sensory domains

Formally, the reasoning behind the present model applies to any perceptual system with which object motion can be perceived. For example, take the tactile system: when our fingertips move across a tactile stimulus (e.g., a rough surface), its shearing velocity across the skin is encoded in an afferent tactile velocity signal (Vskin). To determine its perceptual significance, a reference signal (Vref), encoding finger velocity in space (Vfing.s), should be created. The stimulus will then be felt to move in space if the difference between Vref and Vskin exceeds one JND. Because of Weber's law, the tactile thresholds for stimulus motion with and against finger movements should grow wider apart when finger velocity increases, just as the opposite thresholds for visually perceived motion grow wider apart with increasing eye velocity (Fig. 2). The shearing velocity of the stimulus across the skin at the midpoint between these thresholds (Vskin.PSS) would then indicate the magnitude of the tactile reference signal, and its ratio to Vfing.s would express its gain.
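
The tactile reasoning can be sketched numerically as follows. The sketch rests on assumptions that are not part of the model as stated: the reference signal is a fixed gain times finger velocity, the JND grows linearly with the reference signal (one simple way to express Weber's law), and shear velocity is expressed so that a stationary surface gives Vskin equal to Vfing.s. The function names and parameter values are hypothetical, and velocities are in arbitrary units.

```python
def tactile_thresholds(v_fing, gain=0.8, jnd0=0.5, weber_k=0.1):
    """Shear velocities at which the surface is just felt to move in space.

    Assumptions (illustrative): v_ref = gain * v_fing, and the JND grows
    linearly with the reference signal (a Weber-type rule).
    Returns (lower, upper) thresholds in the same units as v_fing.
    """
    v_ref = gain * v_fing
    jnd = jnd0 + weber_k * abs(v_ref)
    return v_ref - jnd, v_ref + jnd


def estimate_gain(lower, upper, v_fing):
    """Gain of the reference signal from the midpoint of the two thresholds."""
    v_skin_pss = (lower + upper) / 2.0  # corresponds to Vskin.PSS
    return v_skin_pss / v_fing


for v_fing in (10.0, 20.0):
    lower, upper = tactile_thresholds(v_fing)
    # Print finger velocity, threshold separation and recovered gain.
    print(v_fing, upper - lower, estimate_gain(lower, upper, v_fing))
```

With these assumed parameters the two thresholds move further apart at the higher finger velocity, while the midpoint divided by finger velocity recovers the assumed gain at both velocities.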

Although no such experiments have been reported (footnote 11), there is some evidence supportive of such a tactile model: tactile vibrational thresholds are elevated with increased velocity of the skin surface in space (Coquery and Amblard, 1973; Coquery, 1978, 1981; Dyhre-Poulsen, 1978; Paillard et al., 1978; Angel and Malenka, 1982; Rauch et al., 1985; see also MacKay, 1973). Interestingly, recent data from our lab (Bles et al., 1993) show that skin stimulation may cause (illusory) sensations of ego-motion. Hence, in the tactile domain there may also exist a self-referential (component in the) reference signal, analogous to the visual one in the present model.

Similar experiments can be envisaged in the auditory domain, by measuring thresholds for hearing motion of a sound source in space during self- and ego-motion. The present model thus provides a theoretical frame of reference in terms of which the perception of object motion (and stationarity) can be investigated in any sensory domain, i.e. within any perceptual system that has a sensory surface which can move in space.

ACKNOWLEDGEMENTS

This paper was written with support of the Royal Dutch Air Force (contract A86/KLu/0481). The helpful comments of Bruce Bridgeman of the University of California, and of Rik Warren of the Armstrong Aerospace Medical Research Laboratory, Ohio, on earlier drafts are gratefully acknowledged.

REFERENCES

Algom, D. and Cohen-Raz, L., 1984. Visual velocity input-output functions: The integration of distance and duration onto subjective velocity. Journal of Experimental Psychology: Human Perception and Performance 4, 486-501.

Andersen, G.J., 1990. Segregation of optic flow into object and self-motion components: Foundations of a general theory. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-motion. Lawrence Erlbaum, Hillsdale, NJ. pp 127-142.

Angel, R.W. and Malenka, R.C., 1982. Velocity dependent suppression of cutaneous sensitivity during movement. Experimental Neurology 77, 266-274

Aubert, H., 1886. Die Bewegungsempfindung. Pflügers Archiv 39, 347-370

Aubert, H., 1887. Die Bewegungsempfindung. Zweiter Mitteilung. Pflügers Archiv 40, 459-480

Barthélémy, J., Xerri, L., Borel, L., and Lacour, M., 1988. Neuronal coding of linear motion in the vestibular nuclei of the alert cat. II: response characteristics to vertical optokinetic stimulation. Experimental Brain Research 70, 287-298.

Bedell, H., Klopfenstein, J.F., and Yuan, N., 1989. Extraretinal information about eye position during involuntary eye movement: Optokinetic afternystagmus. Perception and Psychophysics 46, 579-586.

Benson, A.J., 1990. Sensory functions and limitations of the vestibular system. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-motion. Lawrence Erlbaum, Hillsdale, NJ., pp 145-170.

Berthoz, A. and Droulez, J., 1982. Linear self-motion perception. In: Tutorials on Motion Perception. A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.). Plenum, New York

Berthoz, A. and Melvill-Jones, G. (eds.), 1985. Adaptive Mechanisms in Gaze Control. Elsevier, New York

Berthoz, A., Pavard, B. and Young, L., 1975. Perception of linear horizontal self-motion induced by peripheral vision. Experimental Brain Research 23, 471-489

Berthoz, A., Yoshida, K., and Vidal, P.P., 1981. Horizontal eye movement signals in second-order vestibular nuclei neurons in the cat. In: Vestibular and Oculomotor Physiology. B. Cohen (ed.). International meeting of the Barany Society; Annals of the New York Academy of Sciences. 374, 144-156

Bles, W., Bos, J.E., Furrer, R., Graaf, B. de, Hosman, R.J.A.W., Kortschot, H.W., Krol, J., Kuipers, A., Marcus, J.T., Messerschmid, E., Ockels, W.J., Oosterveld, W.J., Smit, J., Wertheim, A.H., and Wientjes, C.J.E., 1989. Space Adaptation Syndrome induced by a long duration +3Gx centrifuge run. Institute for Perception Technical Report, IZF-1989-25, TNO Institute for Perception, Soesterberg, The Netherlands.

Bles, W., Jelmorini, M., Bekkering, H. and De Graaf, B., 1993. Arthrokinetic information affects linear self-motion perception. (in prep)

Bonnet, C., 1982. Thresholds of Motion Perception. In: Tutorials on Motion Perception. A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.). Plenum, New York

Borah, J., Young, L., and Curry, R.E., 1988. Optimal estimator model for human spatial orientation. In: B. Cohen and V. Henn (eds), Representation of Three-Dimensional Space in the Vestibular, Oculomotor, and Visual Systems. Annals of the New York Academy of Sciences 545.

Brandt, T., Dichgans, J. and Koenig, E., 1973. Differential effects of central versus peripheral vision on egocentric and exocentric motion perception. Experimental Brain Research 16, 476-491

Braunstein, M.L. 1976. Depth perception through motion. Academic Press, NY.

Bridgeman, B. and Graziano, J.A., 1989. Effect of context and efference copy on visual straight ahead. Vision Research 29, 12, 1729-1736.

Bridgeman, B., Hendry, D. and Stark, L., 1975. Failure to detect displacement of the visual world during saccadic eye movements. Vision Research, vol. 15, 719-722

Büchele, W., Degner, D., and Brandt, T., 1980. Thresholds for object motion perception raised by concurrent head movements. Pflügers Archiv Supplement 384, R33

Büttner, U. and Büttner, U.W., 1978. Parietal cortex (2 V) neuronal activity in the alert monkey during natural vestibular and optokinetic stimulation. Brain Research 153, 392-397

Büttner, U. and Henn, V., 1981. Circularvection: Psychophysics and single-unit recordings in the monkey. Annals of the New York Academy of Sciences, Vol. 374, 274-283.

Cohen, B. (ed). 1981. Vestibular and Oculomotor Physiology: International meeting of the Barany Society. Annals of the New York Academy of Sciences 374

Cohen, B. and Henn, V. (eds), 1988. Representation of Three-Dimensional Space in the Vestibular, Oculomotor, and Visual Systems. Annals of the New York Academy of Sciences 545.

Cohen, R.L., 1965. Adaptation effects and aftereffects of moving patterns viewed in the periphery of the visual field. Scandinavian Journal of Psychology 6, 257-264.

Coquery, J.M., 1978. Role of active movement in control of afferent input from skin in cat and man. In: Active Touch. G. Gordon (ed.). Pergamon Press, Oxford

Coquery, J.M., 1981. Changes in somaesthetic evoked potentials during movement. Brain Research 31, 361-378

Coquery, J.M. and Amblard, B., 1973. Backward and Forward masking in the perception of cutaneous stimuli. Perception and Psychophysics 13 (2), 161-163

Cutting, J.E., Springer, K., Baren, P.A., and Johnson, S.H., 1992. Wayfinding on foot from information in retinal, not optical, flow. Journal of Experimental Psychology (General), 121, 1, 41-72

Denton, G.G., 1977. Visual motion after effect induced by simulated rectilinear motion. Perception 6, 711-718

Dichgans, J. and Brandt, T., 1972. Visual-Vestibular Interaction and Motion Perception. Bibl. ophthal. 82, 327-338.

Dichgans, J., Schmidt, C.L. and Graf, W., 1973. Visual input improves the speedometer function of the vestibular nuclei in the goldfish. Experimental Brain Research, 18, 319-322.

Dichgans, J. and Brandt, T., 1978. Visual-vestibular interaction: Effects on self-motion perception and postural control. In: Handbook of Sensory Physiology Vol. VIII: Perception, R. Held, H.W. Leibowitz and H.L. Teuber (eds.). Springer Verlag, Berlin

Dichgans, J., Korner, F. and Voigt, K., 1969. Vergleichende Skalierung des afferenten und efferenten Bewegungssehens beim Menschen: Lineare Funktionen mit verschiedener Anstiegssteilheit. Psychologische Forschung 32, 277-295

Dichgans, J., Wist, E., Diener, H.C. and Brandt, T., 1975. The Aubert-Fleischl phenomenon: A temporal frequency effect on perceived velocity in afferent motion perception. Experimental Brain Research 23, 529-533

Diener, H.C., Wist, E., Dichgans, J. and Brandt, T., 1976. The spatial frequency effect on perceived velocity. Vision Research 16, 169-176

Duncker, K., 1929. Ueber induzierte Bewegung. Psychologische Forschung 12, 180-259

Dyhre-Poulsen, P., 1978. Perception of tactile stimuli before ballistic and during tracking movements. In: Active Touch. G. Gordon (ed.). Pergamon Press, Oxford, 1978

Ehrenstein, W.H., Mateeff, S. and Hohnsbein, J., 1986(a). Zeitliche Aspekte der Ortskonstanz bei Augenfolgbewegungen. Paper presented at the 63rd annual meeting of the German Physiological Society, March, Berlin.

Ehrenstein, W.H., Mateeff, S. and Hohnsbein, J., 1986(b). Temporal aspects of position constancy during ocular pursuit. Pflügers Archiv, 406, R15;47

Ehrenstein, W.H., Mateeff, S. and Hohnsbein, J., 1987. Influence of exposure duration on the strength of the Filehne Illusion. Perception 16, 253 (A29b)

Elsner, W., 1971. Power laws for the perception of rotation and the oculogyral illusion. Perception and Psychophysics 9 (5), 418-420

Favreau, O.E., 1976. Motion after effects: Evidence for parallel processing in motion perception. Vision Research 16, 181-186

Filehne, W., 1922. Ueber das optische Wahrnehmen von Bewegungen. Zeitschrift für Sinnesphysiologie 53, 134-145

Fleischl, E. von., 1882. Physiologisch-optische Notizen. Sitzungsbez. Akad. Wissensch. III (86), 7-25

Fletcher, W.A., Hain, T.C. and Zee, D.S., 1990. Optokinetic nystagmus and afternystagmus in human beings: relationship to nonlinear processing of information about retinal slip. Experimental Brain Research 81, 46-52.

Fuchs, A.F. and Kim, J., 1975. Unit activity in vestibular nucleus of the alert monkey during horizontal angular acceleration and eye movement. Journal of Neurophysiology 38, 1140-1161

Gibson, J.J., 1966. The senses considered as perceptual systems. Houghton Mifflin Company, Boston.

Gibson, J.J., 1968. What gives rise to the perception of motion? Psychological Review 75, 335-345

Gibson, J.J., 1973. Direct visual perception: a reply to Gyr. Psychological Bulletin, Vol. 79, 6, 396-397

Gibson, J.J., 1979. The ecological approach to visual perception. Houghton Mifflin Company, Boston

Gibson, J.J., Smith, O.W., Steinschneider, A. and Johnson, C.W., 1957. The relative accuracy of visual perception of motion during fixation and pursuit. American Journal of Psychology 70, 64-68

Graaf, B. de, and Wertheim, A.H., 1988. The perception of object motion during smooth pursuit eye movements: Adjacency is not a factor contributing to the Filehne illusion. Vision Research 28, 497-502.

Graaf, B. de, Wertheim, A.H., Bles, W. and Kremers, J.J.M., 1990. Angular velocity, not temporal velocity determines circularvection. Vision Research 30, 4, 637-646.

Grant, W., and Best, W., 1987. Otolith-organ mechanics: lumped parameter model and dynamic response. Aviation Space and Environmental Medicine 58, 970-976.

Graybiel, A., and Hupp, E.D., 1946. The oculogyral illusion: A form of apparent motion which may be observed following stimulation of the semi-circular canals. Journal of Aviation Medicine 17, 3-27

Gyr, J.W., 1972. Is a theory of direct visual perception adequate? Psychological Bulletin 77, 246-261.

Helmholtz, H., 1910. Handbuch der physiologischen Optik. Vol. 3, Leipzig: Voss.

Henderson, D.C., 1971. The relationship among time, distance, and intensity as determinants of motion discrimination. Perception and Psychophysics 10, 310-320.

Henn, V., Cohen, B., and Young, L., 1980. Visual-vestibular interaction in motion perception and the generation of nystagmus. Neurosciences Research Program Bull. Vol. 18 No.4, MIT Press, Cambridge, USA

Henn, V., Young, L., and Finley, C., 1974. Vestibular nucleus units in alert monkeys are also influenced by moving visual fields. Brain Research, 71, 144-149.

Hofstadter, D., 1980. Gödel, Escher, Bach. Vintage Press, New York

Holst, E. von, 1954. Relations between the central nervous system and peripheral organs. British Journal of Animal Behaviour 2, 89-94

Holst, E. von, and Mittelstaedt, H., 1950. Das Reafferenzprinzip. Naturwissenschaften 37, 464-476

Howard, I.P., 1982. Human Visual Orientation. John Wiley, New York

Howard, I.P. and Templeton, W., 1966. Human spatial orientation. John Wiley and Sons, New York

Hunzelmann, N., and Spillmann, L., 1984. Movement adaptation in the peripheral retina. Vision Research 24, 12, 1765-1769.

Ito, M., 1982. Cerebellar control of the vestibulo-ocular reflex: around the flocculus hypothesis. Annual Review of Neuroscience 5, 275-296

Jeannerod, M., Kennedy, H. and Magnin, M. 1979. Corollary discharge: its possible implications in visual and oculomotor interactions. Neuropsychologia 17, 241-258.

Johnson, C.A., and Leibowitz, H.W., 1976. Velocity-time reciprocity in the perception of motion: foveal and peripheral determinations. Vision Research 16, 177-180.

Johnstone, J. and Mark, R.F., 1970. Two classes of eye movement and their perceptual consequences. Proc. Aust. Physiol. Pharmacol. Soc, 1 (2), 46-47

Johnstone, J. and Mark, R.F., 1971. The efference copy neurone. Journal of Experimental Biology 54, 403-414

Johnstone, J. and Mark, R.F., 1973. Corollary Discharge. Vision Research 13, 1621.

Kinchla, R.A., 1971. Visual movement perception: a comparison between absolute and relative movement discrimination. Perception and Psychophysics 9, 2A, 165-171.

Koenderink, J.J., 1990. Some theoretical aspects of optic flow. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-motion. Lawrence Erlbaum, Hillsdale, NJ. pp 53-68.

Koenderink, J.J. and Van Doorn, A.J., 1987. Facts on optic flow. Biological Cybernetics 56, 247-254.

Kornhuber, H.H., 1974. Nystagmus and related phenomena in man: An outline of Otoneurology. In: Handbook of Sensory Physiology Vol. VI/2 Vestibular System, part 2, Psychophysics. Applied Aspects and General Interpretations, Springer Verlag, New York

Leibowitz, H.W., Post, R., Brandt, T. and Dichgans, J., 1982. Implications of Recent Developments in Dynamic Spatial Orientation and Visual Resolution for Vehicle Guidance. In: Tutorials on Motion Perception. A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.) pp. 231-260. Plenum Press, New York

Lombardo, T.J., 1987. The Reciprocity of Perceiver and Environment; the Evolution of J.J. Gibson's Ecological Psychology. Lawrence Erlbaum, Hillsdale, New Jersey.

Lisberger, S.G., and Fuchs, A.F., 1978a. Role of primate flocculus during rapid behavioral modification of vestibulo-ocular reflex I. Purkinje cell activity during visually guided horizontal smooth pursuit eye movements and passive head rotation. Journal of Neurophysiology 41 (3), 733-763

Lisberger, S.G., and Fuchs, A.F., 1978b. Role of primate flocculus during rapid behavioral modification of vestibulo-ocular reflex II. Mossy fiber firing patterns during horizontal head rotation and eye movement. Journal of Neurophysiology 41 (3), 764-777

Mack, A., 1978. Three modes of visual perception. In: Models of Perceiving and Processing Information. H.L. Pick and E. Saltzman (eds.) Lawrence Erlbaum Ass., Hillsdale, New Jersey

Mack, A., 1986. Perceptual aspects of motion in the frontal plane. In: Boff, K., Kaufman, L. and Thomas, J.P. (eds.) Handbook of Perception and Human Performance, Vol I: Sensory processes and Perception. New York: John Wiley, pp 17-1 to 17-38

Mack, A. and Herman, E., 1972. A new illusion: The underestimation of distance during smooth pursuit eye movements. Perception and Psychophysics 12 (6), 471-473

Mack, A. and Herman, E., 1973. Position constancy during pursuit eye movement: An investigation of the Filehne illusion. Quarterly Journal of Experimental Psychology 25, 71-84

Mack, A. and Herman, E., 1978. The loss of position constancy during pursuit eye movements. Vision Research 18, 55-62

MacKay, D.M., 1972. Voluntary eye movements as questions. In: Cerebral control of eye movements and motion perception; Bibl. Ophthal., Vol 82, 369-376. Karger, Basel.

MacKay, D.M., 1973. Visual stability and voluntary eye movements. In: Handbook of Sensory Physiology VII/3A. R. Jung (ed.). Springer Verlag, Berlin, 307-331

MacKay, D.M., 1982. Anomalous perception of extrafoveal motion. Perception 11, 359-360

Marcus, J.T., 1992. Vestibulo-ocular responses in man to gravito-inertial forces. PhD thesis. TNO Institute for Perception, Soesterberg, The Netherlands.

Mateeff, S., Yakimoff, N., Hohnsbein, J. and Ehrenstein, W.H., 1991. Perceptual constancy during ocular pursuit: a quantitative estimation procedure. Perception and Psychophysics, 49, 4, 390-392

Matin, L., 1982. Visual localization and eye movements. In: Tutorials on Motion Perception. A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.). Plenum, New York, pp 101-156

Matin, L., 1986. Visual localization and eye movements. In: Boff, K., Kaufman, L. and Thomas, J.P. (eds.) Handbook of Perception and Human Performance Vol. I: Sensory Processes and Perception. New York: John Wiley, pp 20-1 to 20-43

Matin, L., Matin, E. and Pearce, D.G., 1969. Visual perception when voluntary saccades occur I. Relation of visual direction of a fixation target extinguished before a saccade to a flash presented during the saccade. Perception and Psychophysics 5, 65-80

McCrea, R.A., Yoshida, K., Evinger, C., and Berthoz, A., 1981. The location, axonal arborization and termination sites of eye movement related secondary vestibular neurons demonstrated by intra-axonal HRP injection in the alert cat. In: Fuchs, A.F. and Becker, W. (eds.). Progress in Oculomotor Research. Elsevier North-Holland, New York

Mergner, T. and Becker, W., 1990. Perception of horizontal self-rotation: Multisensory and cognitive aspects. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-Motion. Lawrence Erlbaum, Hillsdale, NJ. pp 219-264.

Mittelstaedt, H., 1990. Basic solutions to the problem of head-centric visual localization. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-Motion. Lawrence Erlbaum, Hillsdale, NJ. pp 267-288.

Moulden, B., 1975. Eye movements and the movement after effect. Vision Research 15, 1169-1170.

Murphy, B.J., 1978. Pattern thresholds for moving and stationary gratings during smooth pursuit eye movement. Vision Research 18, 521-530.

Nagao, S., 1988. Behavior of floccular Purkinje cells correlated with adaptation of horizontal optokinetic eye movement response in pigmented rabbits. Experimental Brain Research 73, 489-497

Nakayama, K., 1981. Differential motion hyperacuity under conditions of common image motion. Vision Research 21, 1475-1482.

Noda, H., 1986. Mossy fibers sending retinal-slip, eye, and head velocity signals to the flocculus of the monkey. Journal of Physiology, 379, 39-60

Ockels, W.J., Furrer, R. and Messerschmid, E., 1990. Space Sickness on Earth. Experimental Brain Research 79, 3, 661-663.

Ockels, W.J., Furrer, R. and Messerschmid, E., 1989. Space sickness on Earth. Nature vol 340, August 1989, 681-682.

Owen, D.H., 1990. Perception and control of changes in self-motion: a functional approach to the study of information and skill. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-Motion. Lawrence Erlbaum, Hillsdale, NJ. pp 289-322.

Paillard, J., Brouchon-Viton, M. and Jordan, P., 1978. Differential encoding of location cues by active and passive touch. In: Active Touch, G. Gordon (ed.). Pergamon Press, Oxford.

Pavard, B. and Berthoz, A., 1977. Linear acceleration modifies the perception of a moving visual scene. Perception 6, 529-540

Post, R. and Leibowitz, H.W., 1985. A revised analysis of the role of efference in motion perception. Perception 14, 631-643

Precht, W., 1982. Anatomical and functional organization of optokinetic pathways. In: Functional Basis of Ocular Motility Disorders. Lennerstrand, G., Zee, D.S. and Keller, E.L. (eds.). Pergamon, New York

Probst, T., Brandt, T. and Degner, D., 1986. Object-motion detection affected by concurrent self-motion perception: Psychophysics of a new phenomenon. Behavioural Brain Research, 22, 1-11

Probst, T., Degner, D. and Brandt, T., 1980. Object motion perception affected by concurrent self-motion. Paper held at the third European Conference on Visual Perception. Brighton, UK, August

Probst, T., Straube, A., and Bles, W., 1985. Differential effects of ambient visual-vestibular-somatosensory stimulation on the perception of self-motion. Behavioural Brain Research 16, 71-79.

Probst, T., Krafczyk, S., Brandt, T. and Wist, E., 1984. Interaction between perceived self-motion and object-motion impairs vehicle guidance. Science 225, 536-538

Raphan, T., Cohen, D. and Matsuo, V., 1977. A velocity storage mechanism responsible for OKN, OKAN and vestibular nystagmus. In: Baker, R. and Berthoz, A. (eds.). Control of gaze by brainstem neurons, Developments in Neuroscience, Vol I, Elsevier North Holland, Biomedical Press, New York

Rauch, R., Angel, R.W. and Boylls, C.C., 1985. Velocity-dependent suppression of somatosensory evoked potentials during movement. Electroencephalography and Clinical Neurophysiology 62, 421-425.

Raymond, J.W., Shapiro, K.L. and Rose, D.J. 1984. Optokinetic backgrounds affect perceived velocity during ocular tracking. Perception and Psychophysics, 36, 3, 221-224

Reinhardt-Rutland, A.H., 1992. Does the type of eye motion determine whether induced motion is diminished or enhanced? Perceptual and Motor Skills 74, pp 882.

Ross, H., 1974. Behavior and perception in strange environments. George Allen and Unwin, London

Sekuler, A., 1990. Motion segregation from speed differences: evidence for nonlinear processing. Vision Research 30, 5, 785-795.

Sekuler, R., Ball, K., Tynan, P. and Machamer, J., 1982. Psychophysics of motion perception. In: Tutorials on Motion Perception. A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.). Plenum, New York

Shaffer, O. and Wallach, H., 1966. Extent-of-motion thresholds under subject-relative and object-relative conditions. Perception and Psychophysics 1, 447-451.

Shulman, P.H., 1979. Eye movements do not cause induced motion. Perception and Psychophysics 26, 381-383

Skavenski, A.A., 1972. Inflow as a source of extraretinal eye position information. Vision Research 12, 221-229

Sperry, R.W., 1950. Neural basis of the spontaneous optokinetic response produced by visual inversion. J. Comp. Physiol. Psychol. 43, 482-489

Stark, L. and Bridgeman, B., 1983. Role of corollary discharge in space constancy. Perception and Psychophysics 34, 4, 371-380.

Steinbach, M.J., 1987. Proprioceptive knowledge of eye position. Vision Research 27, 10, 1737-1744.

Stoffregen, T.A. and Riccio, G.E., 1988. An ecological theory of orientation and the vestibular system. Psychological Review 95, 1, 3-14.

Stone, L.S. and Lisberger, S.G., 1990 (a). Visual responses of Purkinje cells in the cerebellar Flocculus during smooth pursuit eye movements in monkeys. I: simple spikes; II: complex spikes. Journal of Neurophysiology 63, 5, 1241-1275

Stone, L.S. and Lisberger, S.G., 1990 (b). Synergistic action of complex and simple spikes in the monkey flocculus in the control of smooth pursuit eye movement. Experimental Brain Research, 17, 299-312

Straube, A. and Brandt, T., 1987. Importance of the visual and vestibular cortex for self-motion perception in man (circularvection). Human Neurobiology, 6: 211-218.

Swanston, M.T., and Wade, N.J., 1988. The perception of visual motion during movements of the eyes and of the head. Perception and Psychophysics 43, 6, 559-566.

Swanston, M.T., Wade, N.J., and Day, R.H., 1987. The representation of uniform motion in vision. Perception 16, 143-159.

Ullman, S., 1980. Against direct perception. The Behavioral and Brain Sciences 3, 373-415.

Waespe, W., Büttner, U. and Henn, V., 1981. Visual-vestibular interaction in the flocculus of the alert monkey I. Input activity. Experimental Brain Research 43, 336-348

Waespe, W. and Henn, V., 1979. The velocity response of vestibular nucleus neurons during vestibular, visual and combined angular acceleration. Experimental Brain Research 37, 337-347.

Waespe, W. and Henn, V., 1981. Visual-vestibular interaction in the flocculus of the alert monkey II. Purkinje cell activity. Experimental Brain Research 43, 349-360

Wallach, H. and O'Connell, D.N., 1953. The kinetic depth effect. Journal of Experimental Psychology 45, 205-217.

Wallach, H., 1959. The perception of motion. Scientific American, 201, 56-60.

Wallach, H., 1982. Eye movement and motion perception. In A.H. Wertheim, W.A. Wagenaar and H.W. Leibowitz (eds.), Tutorials on Motion Perception. Plenum, NY. pp 1-19.

Wallach, H., 1985. Perceiving a stable environment. Scientific American, 252, 4, 92-98.

Wallach, H., 1987. Perceiving a stable environment when one moves. Annual Review of Psychology 38, 1-27

Wallach, H., Becklen, R. and Nitzberg, D., 1985. The perception of motion during collinear eye movements. Perception and Psychophysics 38, 18-22

Wallach, H. and Kravitz, J.H., 1965. The measurement of the constancy of visual direction and of its adaptation. Psychonomic Science 2, 217-218

Wallach, H., O'Leary, A., and McMahon, M.L., 1982. Three stimuli for visual motion perception compared. Perception and Psychophysics 32, 1, 1-6.

Warren, R., 1990. Preliminary questions for the study of ego-motion. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-Motion. Lawrence Erlbaum, Hillsdale, NJ., pp 1-32.

Watanabe, E., 1984. Neuronal events correlated with long-term adaptation of the horizontal vestibulo-ocular reflex in the primate Flocculus. Brain Research 297, 169-174.

Wertheim, A.H., 1981. On the relativity of perceived motion. Acta Psychologica 48 (special volume on the perception of motion), 97-110

Wertheim, A.H., 1985. How extraretinal is extraretinal? Perception 14, 1 A8

Wertheim, A.H., 1987. Retinal and extraretinal information in movement perception: how to invert the Filehne illusion. Perception 16, 3, 277-414

Wertheim, A.H., 1992(a). A psychophysical method to assess the gain of the otolith response. Paper presented at the satellite symposium of the XVIIth meeting of the Barany Society: "Vestibular-Proprioceptive interaction for body orientation in space". Smolenice, Czecho-Slovakia, 5-7 June 1992.

Wertheim, A.H., 1992(b). Motion perception during ego-motion: measuring the otolith response. Perception 21, supp. 2, 49-50.

Wertheim, A.H., 1993. Pilot studies on object motion perception during linear self-motion after long duration centrifugation of human subjects. Institute for Perception Technical Report IZF-1993-B-3. TNO Institute for Perception, Soesterberg, The Netherlands.

Wertheim, A.H. and Bekkering, H., 1991. The Filehne illusion is age dependent. Perception, 20, 1, 85-86.

Wertheim, A.H. and Bekkering, H., 1992. Motion thresholds of briefly visible stimuli increase asymmetrically with age. Vision Research, 32, 12, 2379-2384.

Wertheim, A.H. and Bles, W., 1984. A reevaluation of cancellation theory: Visual, vestibular and oculomotor contributions to perceived object motion. Institute for Perception Technical Report IZF-1984-8. TNO Institute for Perception, Soesterberg, The Netherlands.

Wertheim, A.H. and Van Gelder, P., 1990. An acceleration illusion caused by underestimation of stimulus velocity during pursuit eye movements: the Aubert-Fleischl phenomenon revisited. Perception, 19, 4, 471-482 (erratum in: Perception 19, 5, pp 700)

Wertheim, A.H., Van Gelder, P., Lautin, A., Peselow, E. and Cohen, N., 1985. High thresholds for motion perception in schizophrenia may indicate extraneous noise levels of central vestibular activity. Biological Psychiatry 20, 1197-1210

Wertheim, A.H., Hosman, R.J.A.W., Graaf, B. de, Bles, W. and Krol, J., 1989. Visual motion perception during simulated space sickness on earth. Proc. XVth. ann. meeting, European Undersea Biomedical Society, Eilat, Israel, pp 406-411.

Wertheim, A.H. and Niessen, M.W., 1986. The perception of relative motion between objects during pursuit eye movements. Perception 15, 1, A9

Wertheim, A.H. and Mesland, B., 1993. Motion perception during linear ego-motion. Institute for Perception Technical Report IZF-1993-3, TNO Institute for Perception, Soesterberg, The Netherlands.

Whiteside, T.C.D., Graybiel, A. and Niven, J.I., 1965. Visual illusions of movement. Brain 88, 13-210

Wolpert, L., 1990. Field-of-view information for self-motion perception. In: Warren, R. and Wertheim, A.H. (eds.) Perception and Control of Self-Motion. Lawrence Erlbaum, Hillsdale, NJ. pp 101-126.

Wong, S.C.P. and Frost, B.F., 1978. Subjective motion and acceleration induced by the movement of the observer's entire visual field. Perception and Psychophysics 24 (2), 115-120

Xerri, C., Barthélémy, F., Harlay, F., Borel, L. and Lacour, M., 1987. Neuronal coding of linear motion in the vestibular nuclei of the alert cat. I: response characteristics to vertical otolith stimulation. Experimental Brain Research 65, 569-581

Xerri, C., Barthélémy, F., Borel, L. and Lacour, M., 1988. Neuronal coding of linear motion in the vestibular nuclei of the alert cat. III: Dynamic characteristics of visual-otolith interactions. Experimental Brain Research 70, 299-309

Yoshida, K., Berthoz, A., Vidal, P.P. and McCrea, R., 1981. Eye movement related activity of identified second order vestibular neurons in the cat. In: Progress in Oculomotor Research. Fuchs, A.F. and Becker, W., (eds.). Elsevier, North-Holland. New York

Zeppenfeldt, P., 1991. Bepaling van de drempelwaarden voor het waarnemen van verschillen tussen visuele en vestibulaire stimulatie tijdens eigenbeweging (Determination of thresholds for the detection of differences between visual and vestibular stimulation). Thesis, Technical University of Delft, The Netherlands.

FOOTNOTES

Footnote 1: In this paper the term "self-motion" denotes movement of parts of the body of the observer. The term "ego-motion" denotes whole body movement.

Footnote 2: Originally, Gibson (1966, pp 283-284) recognized that vestibular (and somatosensory) afferents may also generate or contribute to percepts of head- or ego-motion. He proposed that such percepts derive from the covariation of visual and vestibular afferents, their correlation serving as a special kind of invariant. However, in a later paper Gibson seems to let go of this idea, as he suggests that information about