What is the difference in retinal images that plays a key role in our ability to perceive depth?

Perception. Author manuscript; available in PMC 2014 Jun 17.

Published in final edited form as:

PMCID: PMC4060807

NIHMSID: NIHMS584965

Abstract

Motion parallax provides a dynamic, unambiguous, monocular visual depth cue. However, the lateral image motion in computer-generated motion parallax displays is depth-sign ambiguous. While mounting evidence indicates that the visual system uses an extra-retinal signal from the pursuit system to disambiguate depth, vertical perspective is a potential confound because it co-varies with the stimulus translation that produces the pursuit signal. Here the role of an extra-retinal pursuit signal and the role of vertical perspective in disambiguating depth from motion parallax were investigated. Through the careful isolation of each cue, the results indicate that observers have excellent depth discrimination with an extra-retinal pursuit cue alone, but have poor discrimination with vertical perspective alone. The conclusion is that vertical perspective does not play a role in the disambiguation of depth in small computer-generated motion parallax displays.

An important task for the human visual system is to interpret the three-dimensional world using information from a two-dimensional retinal image. To perform this task accurately, the visual system often relies on multiple visual depth cues. One important monocular depth cue, motion parallax (MP), is provided by relative movement of objects in the visual scene, relative to a stable point of fixation, during observer translation. Another important source of depth information is provided by perspective, the optical geometry that produces the apparent convergence of parallel lines as they extend away from the observer. Observer translation, in addition to creating conditions of MP, can also create perspective information, a change in the relative retinal image size of various parts of an object as an observer moves past the object. The study presented here investigates whether perspective information, generated by translation conditions, could have confounded results in previous studies of MP.

A fundamental issue in understanding MP is determining the source of information for the unambiguous perception of relative depth. While early research revealed that MP was capable of producing the perception of depth, this research suggested that determining which points or portions of objects were nearest and farthest was often unreliable (Gibson, Gibson, Smith & Flock, 1959). This early research used a broad variety of visual stimuli. Various studies used illuminated points or needles (Bourdon, 1898; 1902; Graham et al., 1948), wires in frames (Tschermak, 1939), points of light (Gogel & Tietz, 1973), shadow projections of opaque objects (Gibson et al, 1959) and paint or powder splattered panels (Smith & Smith, 1963). Following the success of computer-generated random-dot stereograms in the study of binocular stereopsis (Julesz, 1960; 1963; 1971), the development of random-dot MP stimuli helped show “…that motion parallax can be a sufficient cue to the shape and depth of three-dimensional surfaces, in the absence of all other depth cues” (Rogers & Graham, 1979). These other depth cues could include relative size, familiar size, interposition, relative brightness, and perspective, and each has the potential to confound an experimental study of MP. However, with random-dot MP stimuli, individual stimulus dots can be shifted on the display screen to recreate the appropriate retinal image motion to simulate a three-dimensional shape during observer head translation or during display translation (Rogers & Graham, 1979), free from many other potential depth cues. This random-dot methodology has been very important in the study of depth perception by allowing the isolation of particular stimulus variables, especially those required for the unambiguous perception of relative depth from MP.

Using these same computer-generated, random-dot MP stimuli, recent psychophysical studies have shown that the underlying neural mechanisms rely on an extra-retinal signal from the pursuit eye-movement system for the disambiguation of depth from MP (Nawrot, 2003a; Nawrot, 2003b; Naji & Freeman, 2004; Nawrot & Joyce, 2006). Regardless of whether the observer or stimulus translates, eye movements are generated to maintain foveation of a certain point in the scene. An extra-retinal signal derived from the pursuit component of these compensatory eye movements is used to disambiguate the depth-from-motion. Reversing the direction of the eye movement, or of the retinal image motion, reverses the perceived depth, but a reversal of both (such as with a reversal in observer head movement) maintains depth constancy. Subsequent psychophysical and theoretical work has shown that the visual system appears to use the ratio of retinal image velocity to pursuit eye velocity, together with fixation distance, to recover the relative depth of objects in a scene (Nawrot & Stroyan, 2009; Stroyan & Nawrot, 2012), much like how binocular disparity and fixation distance are used to recover relative depth in binocular stereopsis (Stroyan, 2010).

Recent neurophysiological research, using similar random-dot stimuli presented on a flat display surface, provides additional support for the use of the extra-retinal pursuit signal for unambiguous depth perception from MP. Neurons in area MT modulate their activity with respect to an extra-retinal signal regarding pursuit direction, not head-movement direction (Nadler et al., 2008; 2009), giving these neurons depth selectivity for a MP stimulus.

While there is growing neurophysiological evidence for an extra-retinal pursuit signal used to disambiguate depth from MP (Moeeny & Cumming, 2011; Kim et al., 2011), one potential confound, suggested by Rogers (2010), is the vertical perspective produced by the relative observer-stimulus translation. Like MP, perspective is undeniably a very important depth cue. Almost every visual landscape with depth, or simulated depth (e.g. a painted landscape or cityscape), contains visual perspective cues as one of the dominant cues to depth. Consider the vertical perspective cues depicted in Figure 1. Assuming all of these signs depict regular rectangular shapes, perspective provides important information for the perception of spatial relationships in the scene. For instance, signs that subtend a smaller visual angle are perceived as farther away. Consider how the signs to the left and right of the observer appear as (isosceles) trapezoids with the vertical edge farthest from the observer subtending a smaller angle than the nearer vertical edge. This can be called vertical perspective, and is an important and powerful depth cue as seen in the figure. Due to shape constancy, this change in shape may or may not be apparent to the observer, but the cue is present in the retinal image and is therefore available to the visual system. Moreover, as depicted in Figure 1, as the observer translates, producing MP, the translation also produces changes in the vertical perspective information. Therefore, as suggested by Rogers (2010), vertical perspective is a potential confound in our understanding of MP.

Figure 1. A depiction of vertical perspective as a cue to depth. As the observer translates rightward, the farther (left) vertical edge of the left and center rectangles will subtend a smaller visual angle, due to their larger distance from the observer, than the nearer vertical edge.

Rogers and Rogers (1992) examined the role of perspective information in disambiguating depth from MP. In their study, perspective information was manipulated by rotating the viewing monitor (real perspective), altering the stimulus so that one of the vertical edges was shorter than the other (simulated vertical perspective), and altering both the vertical edges and the overall width of the stimulus (simulated vertical perspective plus width change). While observers were not significantly above chance at disambiguating depth from MP in the two vertical perspective conditions, the role of extra-retinal pursuit information was not controlled or manipulated, so it is possible that any observed disambiguation was due to an uncontrolled extra-retinal pursuit signal. For instance, if observers more often visually pursued dot movement near the response “target band” (the portion of their stimulus upon which observers were to make their relative depth judgment), this could produce a systematic response bias linked to pursuit direction, not perspective. Moreover, it is unclear how (in a phenomenological sense) vertical perspective would disambiguate depth from MP. That is, which direction of retinal image motion would be perceived nearer or farther than the fixation point in what form of relation to the change in vertical perspective? In what cases does vertical perspective cause the rightward retinal image motion to be perceived nearer than fixation, and when does vertical perspective cause the leftward retinal image motion to be perceived nearer? While the relationship between retinal image motion, pursuit direction, and perceived depth sign is described by the pursuit theory of MP (Nawrot & Joyce, 2006), and quantitatively detailed with the motion/pursuit law (Nawrot & Stroyan, 2011), it is unknown how depth sign is influenced by vertical perspective, or what the relationships with retinal image motion and perceived depth might be. 
Therefore, it has remained unresolved whether the effects of pursuit and vertical perspective are confounded in computerized MP displays.

Indications that vertical perspective changes are unnecessary are seen in two recent studies. Nawrot and Stroyan (2012) showed that unambiguous depth from MP may be recovered with very brief, 30 msec, presentations over which the MP stimulus travels only fractions of a degree of visual angle. Such a small translation is unlikely to create a detectable change in vertical perspective. Similarly, Nawrot and Joyce (2006; see also Nawrot, 2003a) eliminated the vertical perspective cue in a MP stimulus by linking stimulus window movement to observer head movement. As the observer’s head translated laterally, the MP stimulus translated the identical distance on the display screen. This fixed the MP stimulus size and position on the observer’s retina, and prevented a change in vertical perspective. Because the translational vestibulo-ocular response (TVOR) is reflexively generated to shift the eyes in the direction opposite the head translation, this TVOR required the observer’s oculomotor system to generate a countermanding pursuit signal to maintain stable fixation, and keep the observer’s eye stable in the orbit. This internal pursuit signal disambiguates the perception of depth from MP in the absence of any change in vertical perspective. However, it remains important to investigate the issue of a potential confound between pursuit and vertical perspective in a study where both cues are precisely controlled and manipulated.

To address this issue of a potential confound between two putative cues, the current study was designed to test whether vertical perspective (vertical perspective hypothesis) or pursuit eye movement signals (pursuit hypothesis) provide a disambiguating cue in MP studies using computerized displays. As outlined in Figure 2, the two hypotheses predict different patterns of results across the four stimulus conditions used here. If the vertical perspective hypothesis is correct and changes in vertical perspective disambiguate depth from MP, then participants should have stable depth percepts across trials, regardless of pursuit eye movements, and the results should look similar to Figure 2a. However, if the pursuit hypothesis is correct and a pursuit eye movement signal disambiguates depth from MP, then participants should have stable depth percepts across trials, regardless of vertical perspective information, and the results should look similar to Figure 2b. The two control conditions, one having neither cue (leftmost bar in each panel) and one having both cues (rightmost bar in each panel), provide a comparison for chance and perfect behavior in the two experimental conditions. Of course, if both cues disambiguate depth from MP, only the “No Pursuit, No Vertical Perspective” condition would show chance performance. Such a result would indicate that both cues have an independent role in the disambiguation of depth from MP in the particular conditions studied here.

Figure 2. Hypothetical results comparing the vertical perspective hypothesis of depth from motion parallax and the pursuit hypothesis of depth from motion parallax. (a) If change in vertical perspective is the cue that disambiguates depth from motion parallax, then participants should perform much better on trials that contain changes in vertical perspective, regardless of pursuit eye movements. (b) If pursuit eye movements are required to disambiguate depth from motion parallax, then participants should perform significantly better on trials that require pursuit eye movements to track a moving stimulus, regardless of changes in vertical perspective.

Methods

Apparatus

The experiment was conducted on a Macintosh computer running OS X 10.6.8. Stimuli were presented on a 21-inch NEC CRT with a resolution of 1920 × 1440 pixels (50 × 38 deg) and a 75 Hz refresh rate. Software was written in Matlab (Rev. 2010a) using Psychophysics Toolbox-3 (Brainard, 1997; Pelli, 1997; Kleiner et al., 2007). Participant keyboard response data were recorded for subsequent analysis.

To accurately control vertical perspective, the observer’s head was immobilized in a headrest mounted at a viewing distance of 41 cm from the display. Prior to every block of trials, the participant’s right eye was carefully aligned with the center of the display using a T-square with an attached level. This apparatus was used to ensure that the participant’s viewing eye was precisely centered, perpendicular, level, and at the correct viewing distance from the central “origin” point on the display.

Stimuli

Random-dot MP stimuli depicted one cycle of a sinusoidal corrugated surface with the contours oriented horizontally (Rogers & Graham, 1979). Each stimulus was composed of 5000 white dots, each 1.6 × 1.6 min, within a stimulus graphics window that was initially an 8.9 × 8.9 deg square. This is the average pixel size, as the angular size of pixels varied across the screen from approximately 1.8 min (center) to 1.4 min (edges). White stimulus dots were presented only within the stimulus window, which was black and was not otherwise delineated from the black screen background.

To create the MP cue, each dot was translated horizontally at a particular fixed velocity within the stimulus window. The particular velocities varied along the vertical dimension of the stimulus window, from top-to-bottom, with a sinusoidal profile with a peak velocity of ± 2.0 deg/s and a minimum speed of 0 deg/s at the top, bottom, and horizontal meridian. The sinusoid could have one of two opposite phases corresponding to opposing directions of dot motion. This produced a stimulus with “a peak” appearing either above or below the central fixation point with “a valley”, having the opposite depth-sign, appearing on the opposite side of the fixation point. This MP cue was present in every trial.
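The velocity profile described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' Matlab code: the function name `dot_velocity`, the choice of a sine centered on the horizontal meridian, and the mapping of `phase` sign to a peak above or below fixation are all assumptions. Only the 2.0 deg/s peak speed and the 8.9 deg window height come from the text.

```python
import math

PEAK_SPEED_DEG_S = 2.0   # peak dot speed, from the text
WINDOW_HEIGHT_DEG = 8.9  # vertical extent of the stimulus window, from the text


def dot_velocity(y_deg, phase=1):
    """Horizontal dot velocity (deg/s) at vertical offset y_deg from the meridian.

    One full sinusoid cycle spans the window height, so speed is zero at the
    top edge, the bottom edge, and the horizontal meridian. `phase` is +1 or -1
    and selects one of the two opposite phases; which sign produces a peak
    above fixation is an assumption of this sketch.
    """
    if phase not in (1, -1):
        raise ValueError("phase must be +1 or -1")
    return phase * PEAK_SPEED_DEG_S * math.sin(2 * math.pi * y_deg / WINDOW_HEIGHT_DEG)


half = WINDOW_HEIGHT_DEG / 2
for y in (half, half / 2, 0.0, -half / 2, -half):
    print(f"y = {y:+.3f} deg -> v = {dot_velocity(y):+.3f} deg/s")
```

Flipping `phase` reverses the direction of every dot, which corresponds to the two opposite depth phases of the corrugation.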

To create the pursuit cue, the stimulus window was translated laterally, leftward or rightward from an initial position at screen center, at 12 deg/s. Observers were instructed to maintain fixation on the spot at the center of the stimulus window, moving their eyes along with the stimulus window motion. In pursuit conditions the stimulus window translated a distance of 20.6 deg.

A geometric description of depth (d) in this translating dynamic stimulus is provided by the motion/pursuit ratio (Nawrot & Stroyan, 2009; Stroyan & Nawrot, 2012), which takes into account retinal image velocity (dθ), pursuit velocity (dα), and viewing distance (f) with the formula:

\[ d \approx \frac{d\theta}{d\alpha}\, f \]

This stimulus has a peak motion/pursuit ratio of 0.167 (2.0 deg/s peak dot velocity divided by 12 deg/s pursuit velocity). With a viewing distance of 41 cm, the peak stimulus depth corresponds to ±6.8 cm from the fixation point.
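As a quick arithmetic check, the quoted values can be reproduced by taking the motion/pursuit relation in its simple first-order form, d ≈ (dθ/dα) · f. That form is an assumption of this sketch, chosen because it matches the numbers given here; the function names are hypothetical.

```python
def motion_pursuit_ratio(retinal_velocity_deg_s, pursuit_velocity_deg_s):
    """Ratio of retinal image velocity to pursuit velocity (dimensionless)."""
    return retinal_velocity_deg_s / pursuit_velocity_deg_s


def relative_depth_cm(retinal_velocity_deg_s, pursuit_velocity_deg_s, viewing_distance_cm):
    """First-order depth estimate d = (d_theta / d_alpha) * f (assumed form)."""
    ratio = motion_pursuit_ratio(retinal_velocity_deg_s, pursuit_velocity_deg_s)
    return ratio * viewing_distance_cm


# Values from the text: 2.0 deg/s peak dot velocity, 12 deg/s pursuit, 41 cm viewing distance.
ratio = motion_pursuit_ratio(2.0, 12.0)
depth = relative_depth_cm(2.0, 12.0, 41.0)
print(round(ratio, 3), round(depth, 1))  # 0.167 6.8
```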

To control the vertical perspective cue during the horizontal translation of the stimulus window, the vertical position of stimulus dots, and therefore the vertical dimensions of the stimulus window, was varied. Overall, to produce a translating stimulus window with unchanging vertical perspective, the vertical dimensions of the stimulus window were increased, as the window translated laterally, to maintain a constant 8.9 deg vertical visual angle. That is, the stimulus window slowly transformed from square to trapezoidal as it translated laterally across the monitor face. This was accomplished by recalculating the vertical position of each stimulus dot for each frame of the stimulus presentation, after that dot’s position had been shifted laterally to produce the appropriate MP cue. That is, to maintain each dot’s fixed vertical angle (φ) from the horizontal meridian (x-axis) as that dot’s horizontal distance (dh) from the origin changes, each dot’s appropriate vertical distance (dv) from the horizontal meridian, is calculated using dv = tan φ * dO, where dO is the calculated distance from the fixed observer to the point (dh, 0). Dot position, given by dh,dv, is then rounded to the pixel resolution of the monitor. To illustrate, a hypothetical stimulus dot positioned in the upper-right corner of the stimulus window would, at farthest extent of the 20.6 deg stimulus window translation to the right, have shifted vertically 12 pixels, 22 min arc, or 8% (< 3 mm). At the same time, a hypothetical dot in the upper-left corner of the stimulus window, would have shifted vertically 5 pixels, 9 min arc, or 2% (~ 1mm). The result is that the vertical angular position of each stimulus dot, and the vertical angular dimension of the entire stimulus window, remained constant on the observer’s retina during the lateral translation. 
However, observers maintaining fixation at the center of the stimulus could not discern whether this vertical modulation of dot position was present or absent.
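The per-frame vertical correction, dv = tan φ * dO, can be sketched as follows. This is a minimal illustration under stated assumptions: positions are in centimeters on a flat screen, the eye sits 41 cm from the screen origin, and the rounding to monitor pixels described in the text is omitted. The function names are hypothetical, and the sketch does not attempt to reproduce the pixel shifts quoted for the corner dots.

```python
import math

VIEWING_DISTANCE_CM = 41.0  # from the text


def vertical_offset_cm(phi_deg, dh_cm, viewing_distance_cm=VIEWING_DISTANCE_CM):
    """Screen height dv that keeps a dot at vertical angle phi above the meridian.

    dO is the distance from the eye to the screen point (dh, 0).
    """
    d_o = math.hypot(viewing_distance_cm, dh_cm)
    return math.tan(math.radians(phi_deg)) * d_o


def vertical_angle_deg(dv_cm, dh_cm, viewing_distance_cm=VIEWING_DISTANCE_CM):
    """Vertical visual angle of a dot at (dh, dv), measured from the meridian."""
    d_o = math.hypot(viewing_distance_cm, dh_cm)
    return math.degrees(math.atan2(dv_cm, d_o))


phi = 4.45  # half of the 8.9 deg window height, i.e. the top edge
for dh in (0.0, 5.0, 10.0, 15.0):
    dv = vertical_offset_cm(phi, dh)
    print(f"dh = {dh:5.1f} cm -> dv = {dv:.3f} cm, angle = {vertical_angle_deg(dv, dh):.3f} deg")
```

The printed angle stays at 4.45 deg while dv grows with horizontal eccentricity, which is the sense in which the window becomes trapezoidal on the screen yet keeps a constant vertical extent on the retina.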

To create a stationary stimulus window with the same changing vertical perspective as the translating case, the vertical stimulus dimension changes previously used to eliminate vertical perspective in the translating stimulus were applied to the stationary stimulus. When presented to the observer, this depicted dots within a stimulus window that transitioned from a square to a trapezoid, duplicating the vertical perspective ordinarily present in the translating stimulus.

Experimental design

Four conditions were used to determine whether vertical perspective or pursuit is used to disambiguate depth sign in MP.

No-pursuit/no-vertical perspective (NPNV): Here the stimulus window remained centered on the computer monitor. The shape of the stimulus window did not change. Within the stationary stimulus window, dots translated laterally producing the MP cue. This is similar to stimulus condition 5 in Nawrot and Joyce (2006). In that study the perceived depth-phase of the stimulus was ambiguous.

No-pursuit/vertical perspective (NPV): Here the stimulus window remained centered on the computer monitor. The vertical perspective cue was applied to the stimulus window in either rightward or leftward trapezoidal direction (one side of the stimulus becoming larger, while the other side becomes smaller). Within the stationary stimulus window, dots translated laterally producing the MP cue. This is similar to the simulated vertical perspective condition in Rogers and Rogers (1992).

Pursuit/no-vertical perspective (PNV): Here the stimulus window translated laterally leftward or rightward. The vertical dimension of the stimulus was manipulated during translation to maintain a fixed retinal size and eliminate a changing vertical perspective cue. Within the translating stimulus window, dots translated laterally producing the MP cue.

Pursuit/vertical perspective (PV): Here the stimulus window translated laterally leftward or rightward. The vertical dimension of the stimulus was fixed during translation, creating a vertical perspective cue as the stimulus window translated laterally away from the observer. Within the translating stimulus window, dots translated laterally producing the MP cue. This is similar to stimulus condition 2 in Nawrot and Joyce (2006). In that study the perceived depth phase of the stimulus was unambiguous.

Participants and Procedure

Six participants with normal or corrected-to-normal vision completed this study. All but one participant were naïve to the purpose of the experiment, and all had previous experience in psychophysical experiments. All stimuli were viewed monocularly with the right eye, with an eye patch placed over the left eye. Each trial began with the presentation of a fixation point at the center of the computer monitor. Participants fixated the point with their right eye and initiated the trial with a key press. Observers maintained fixation on the point, which remained at the center of the stimulus window, throughout the trial, both when the stimulus window translated and when it remained stationary. Every stimulus presentation had a duration of 1.6 s. Following stimulus presentation, the screen was blanked and participants indicated with a key press whether they perceived the peak of the stimulus above or below the fixation point.¹ After the participant entered a response, the fixation point reappeared at the center of the screen and participants could initiate the next trial with a key press.

Each participant completed a total of 20 blocks of 80 trials. Each block consisted of 20 trials from each of the 4 conditions. Trials from each of the 4 conditions were presented randomly within a block.
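The block structure works out to 1600 trials per participant, 400 in each condition. A minimal sketch of such a schedule is given below, assuming simple shuffling within each block. The function names and the seed are hypothetical, and the sketch does not model how window direction or corrugation phase were balanced, since the text does not specify this.

```python
import random
from collections import Counter

CONDITIONS = ("NPNV", "NPV", "PNV", "PV")
TRIALS_PER_CONDITION = 20     # per block, from the text
BLOCKS_PER_PARTICIPANT = 20   # from the text


def make_block(rng):
    """Return one block: 20 trials of each condition in random order."""
    trials = [cond for cond in CONDITIONS for _ in range(TRIALS_PER_CONDITION)]
    rng.shuffle(trials)
    return trials


def make_session(seed=0):
    """Return all blocks for one participant, using a seeded generator."""
    rng = random.Random(seed)
    return [make_block(rng) for _ in range(BLOCKS_PER_PARTICIPANT)]


session = make_session(seed=1)
totals = Counter(cond for block in session for cond in block)
print(sum(totals.values()), dict(totals))
```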

All participants volunteered and gave informed consent for their participation in this study. This study was carried out in accordance with, and oversight by, the institutional IRB and in accordance with national regulations and legislation and with the World Medical Association Helsinki Declaration as revised in October 2008 (http://www.wma.net/en/30publications/10policies/b3/index.html).

RESULTS

To determine whether pursuit or vertical perspective change in a standard computer-generated MP display is sufficient to disambiguate depth sign, d-prime (d′) values were computed for each participant, in each of the four stimulus conditions (Macmillan & Creelman, 1991). If either of these cues disambiguates depth from MP, observers should be able to correctly report when the cue information indicates that the upper portion of the sinusoidal stimulus appeared to protrude out nearer than the fixation point and when it did not. Correct identification of this was classified as a “hit”. Correct identification that the lower portion appeared nearer than fixation was classified as a “correct rejection”. Trials in which the upper stimulus portion was incorrectly reported as nearer were classified as “false alarms” and trials in which the lower stimulus portion was incorrectly reported as nearer were classified as “misses”.
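The sensitivity and bias measures used in this analysis can be computed from the four response counts. The sketch below uses the standard equal-variance Gaussian definitions, d′ = z(H) − z(F) and c = −[z(H) + z(F)]/2; the function name and the example counts are hypothetical and are not data from this study.

```python
from statistics import NormalDist


def d_prime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) under the equal-variance Gaussian signal detection model."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    if not (0.0 < hit_rate < 1.0 and 0.0 < fa_rate < 1.0):
        # z-scores are infinite for rates of exactly 0 or 1
        raise ValueError("hit and false-alarm rates must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical counts: 90% hits and 10% false alarms.
d, c = d_prime_and_criterion(hits=180, misses=20, false_alarms=20, correct_rejections=180)
print(round(d, 3), round(abs(c), 3))

# Hypothetical chance performance: equal hit and false-alarm rates.
d0, c0 = d_prime_and_criterion(100, 100, 100, 100)
print(round(d0, 3), round(abs(c0), 3))
```

A d′ near zero, as in the second call, is what chance-level depth-sign discrimination looks like in this framework.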

In pursuit conditions (PV and PNV) the motion/pursuit law (Nawrot & Stroyan, 2009) unambiguously defines the relationship between direction of stimulus dot motion and the direction of pursuit (stimulus window motion) allowing the classification of responses. That is, if the pursuit theory is correct, the upper portion of the stimulus should appear nearer than fixation when the dots translate in the direction of the stimulus window translation. In the NPNV condition, which was devoid of either cue, pursuit was defined within the program as being leftward or rightward at 0 deg/s allowing the classification of responses (e.g., see condition 5 in Nawrot & Joyce, 2006).

As mentioned in the introduction, the relationship between depth sign and perspective is not as clear as the lawful relationship with pursuit. For the purpose of this analysis, it is necessary to establish a reasonable hypothetical relationship that could potentially explain the unambiguous depth sign in the PV condition (disregarding any role that pursuit has in disambiguating depth from MP), which would then also apply to the analysis of the NPV condition. In the PV condition (e.g., the conditions of most previous studies of MP), horizontal stimulus dot motion in the direction of the “farther” or “smaller visual angle” side of the shape transformed by vertical perspective would be perceived as nearer in depth than the opposite horizontal direction of stimulus dot motion. This relationship provides a categorization of the responses in the PV condition that is the same as the pursuit theory. Therefore, while performance in the PV condition does not differentiate between the two hypotheses (Figure 2), if vertical perspective provides a lawful relationship between the direction of retinal image motion and perceived relative depth, then this relationship should hold in the absence of pursuit, as tested in the NPV condition.

Individual d′ values were averaged, providing a measure of the participants’ ability to correctly discriminate perceived depth sign, while taking response bias into account. Figure 3 shows average d′ values plotted as a function of condition. In the NPNV condition, very low values were observed with a mean d′ = 0.003, SE = 0.063. The same was true for the NPV condition with a mean d′ = 0.065, SE = 0.054.

Figure 3. Average d′ scores for each of the 4 stimulus conditions. The vertical axis represents average d′ scores. The horizontal axis contains each of the 4 stimulus conditions. Average d′ scores in conditions that required pursuit eye movements were significantly greater than in those that did not require pursuit eye movements, regardless of changes in vertical perspective.

In the PNV condition, one participant correctly responded to all 400 trials and another observer committed no false alarms. Because d′ cannot be computed when the hit rate or false-alarm rate is exactly 0 or 1, the participant with perfect discrimination had a single trial removed from the ‘hits’ category and added to the ‘miss’ category, thereby indicating slightly worse performance than actually achieved. Likewise, a single trial was removed from the ‘correct rejections’ category and placed into the ‘false alarms’ category. The same procedure was applied for the participant who had committed no ‘false alarms’. This approach created usable data points for those participants, though it also produced slightly more conservative average d′ values. Regardless, participants in the PNV condition showed excellent discrimination of depth sign with a mean d′ = 3.389, SE = 0.605.
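The effect of this adjustment on a perfect score can be illustrated as follows. The sketch assumes 200 trials of each corrugation phase within the 400 trials of a condition, which the text does not state, and it moves a trial only out of categories that would otherwise leave a rate at 0 or 1, which is one reading of the procedure. The function names are hypothetical.

```python
from statistics import NormalDist


def adjust_perfect_counts(hits, misses, false_alarms, correct_rejections):
    """Move a single trial so that neither misses nor false alarms is zero."""
    if misses == 0:
        hits, misses = hits - 1, 1
    if false_alarms == 0:
        correct_rejections, false_alarms = correct_rejections - 1, 1
    return hits, misses, false_alarms, correct_rejections


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hits / (hits + misses)) - z(false_alarms / (false_alarms + correct_rejections))


# Assumed split: 200 "peak above" and 200 "peak below" trials, all answered correctly.
adjusted = adjust_perfect_counts(200, 0, 0, 200)
print(adjusted, round(d_prime(*adjusted), 3))
```

Under these assumptions the adjusted counts give a d′ of roughly 5.15, which acts as the ceiling for a participant who made no errors in a condition.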

In the PV condition, one participant committed no misses and a second participant committed no false alarms. The same process that was used in the PNV condition was used in the PV condition in order to compute d′ values. Participants showed similar performance in the pursuit/vertical perspective condition (PV), d′ = 3.374, SE = 0.604.

The difference in depth discrimination performance between the conditions that included a pursuit signal (PV, d′ = 3.374; PNV, d′ = 3.389) and the conditions lacking the pursuit signal (NPV, d′ = 0.065; NPNV, d′ = 0.003) cannot be explained by criterion shifts. The mean criterion values (Macmillan & Creelman, 1990) for participants in both the pursuit conditions (PV, c = 0.336; PNV, c = 0.329) and the non-pursuit conditions (NPV, c = 0.178; NPNV, c = 0.134) were all very close to zero.

Although there is no evidence for criterion shifts, one alternative hypothesis relates to static levels of observer sensitivity, motivation, or strategy while participating in the experiment. Perhaps the strong effect of the pursuit signal, available in 50% of the interleaved trials, made observers less sensitive or less attentive to any subtle influence of perspective information in non-pursuit (NP) trials. To address this alternative hypothesis, the experiment was repeated using only the NPV and NPNV interleaved conditions. If vertical perspective were having some subtle effect, one might expect the d′ value in the NPV condition to increase. In this control experiment, 10 new naïve observers, who had not participated in the previous experiment but had prior experience in psychophysical experiments, completed 40 trials in each condition. However, in the absence of any interleaved pursuit trials, depth-sign discriminability remained very low. For the perspective-containing NPV condition the mean d′ = 0.098 (SE = 0.122, c = 0.184), which is not different from the NPNV condition, with a mean d′ = 0.091 (SE = 0.152, c = 0.281), or from the NPV condition of the main experiment, with a mean d′ = 0.065 (SE = 0.054, c = 0.178). Therefore, the interleaving of pursuit and non-pursuit trials had no effect upon the utility of vertical perspective information to disambiguate depth from motion in these display conditions.

The pattern of results is very similar to that predicted by the pursuit hypothesis outlined in Figure 2. Performance in the NPNV and PV conditions does not help discriminate between the two hypotheses, but the NPV and PNV conditions indicate that observers can discriminate depth sign with pursuit alone (without vertical perspective) and cannot with vertical perspective alone (without pursuit). Participants performed at very low levels of depth discrimination when vertical perspective information was the sole source of information available to determine depth sign.

DISCUSSION

The results of this experiment add to the growing experimental evidence that a pursuit signal is used by the visual system to recover depth sign information from MP. Moreover, these results provide strong evidence that vertical perspective is not responsible for disambiguating depth from MP in these particular computerized display conditions. To be certain, the issue under investigation here is not whether vertical perspective provides an important cue to depth: It does! Rather, in the case of MP stimuli presented on a flat computer display, the results of the current experiment provide no evidence that vertical perspective is used to disambiguate depth-from-motion. This means vertical perspective was not an under-appreciated confounding variable in MP studies using these computerized displays. Indeed, an extra-retinal pursuit signal appears to be the crucial source of information required to disambiguate depth from MP in these particular experimental conditions (Nawrot & Joyce, 2006).

One possible reason that vertical perspective did not have an effect here is that the amount of vertical perspective created by pure stimulus translation on a flat display surface was too small to influence perception. Indeed, in debriefing, no study participants reported being able to see any stimulus window shape transformation. It is possible that a larger vertical transformation, for instance, one simulating a larger stimulus rotation (e.g. Braunstein & Payne, 1962), might have some demonstrable effect. Indeed, Papathomas and colleagues (Papathomas, 2002; 2007; Papathomas & Bono, 2004; see also Rogers & Gyani, 2010) have shown how large or exaggerated perspective cues can dominate the unambiguous perception of depth from MP, just as these perspective cues, and also face cues (Yellott & Kaiwi, 1979; Hill & Johnston, 2007), can dominate the retinal disparity cues for binocular stereopsis. It is important to recognize that while this is not an argument against the utility of retinal disparity signals for binocular stereopsis, it is similarly not an argument against the utility of pursuit signals for MP (Rogers, 2010). Instead, the shifting dominance of various cues to depth illustrates the importance of reductive studies of visual cues in isolation, before the conditions of cue conflict and cue combination can be properly understood.

Interestingly, the pattern of results here is similar to that reported for the vertical disparity conditions (3d and 3e) in Rogers and Rogers (1992). That is, the disambiguation of perceived depth in their vertical perspective conditions was not significantly different from their control condition, which lacked vertical perspective and other cues to disambiguate depth from motion. This was despite the vertical perspective being “slightly exaggerated” because “…it was not possible to simulate these changes [in vertical perspective] accurately…” (p. 449) in their experimental procedure. In the present study the NPV (vertical perspective) and the NPNV (control) conditions were not different from each other, and neither showed evidence of disambiguating depth from motion.

In conclusion, MP provides important information about the three-dimensional world. Pursuit eye movements are necessary for the visual system to properly interpret depth from MP. This study clearly shows that when pursuit eye movements are not present, depth from MP is ambiguous, even in cases where small amounts of vertical perspective are introduced. It is crucial, then, that we understand the mechanisms used by each of these depth cues alone in order to comprehend how the visual system uses them collectively to interpret the three-dimensional world.

Acknowledgments

This project was supported by grants from the National Center for Research Resources (NCRR; P20 RR020151) and the National Institute of General Medical Sciences (NIGMS; P20 GM103505) from the National Institutes of Health (NIH). The authors thank Huan Gu and the programming core of the Center for Visual and Cognitive Neuroscience for programming assistance.

Footnotes

1. The task was originally conceived as a traditional yes/no detection task in which observers indicated whether the corrugation peak was above the horizontal meridian or not. However, our observers were experienced and practiced at performing a peak discrimination task. Since both tasks used the same two-button responses to report the same percept of stimulus depth, we considered the two tasks equivalent and had observers continue performing the peak discrimination task.

The contents of this report are solely the responsibility of the authors and do not necessarily reflect the official views of the NIH, NCRR, or NIGMS.

References

  • Bourdon B. La perception monoculaire de la profondeur [Monocular depth perception]. Revue Philosophique de la France et de l’Etranger. 1898;46:124–145.
  • Bourdon B. La perception visuelle de l’espace [Visual perception of space]. Paris: Schleicher Frères; 1902.
  • Brainard DH. The psychophysics toolbox. Spatial Vision. 1997;10:433–436.
  • Braunstein ML, Payne JW. Perspective and the rotating trapezoid. Journal of the Optical Society of America. 1968;58:399–403.
  • Gibson EJ, Gibson JJ, Smith OW, Flock H. Motion parallax as a determinant of perceived depth. Journal of Experimental Psychology. 1959;58:40–51.
  • Gogel WC, Tietz JD. Absolute motion parallax and the specific distance tendency. Perception & Psychophysics. 1973;13:284–292.
  • Graham CH, Baker KE, Hecht M, Lloyd VV. Factors influencing thresholds for monocular movement parallax. Journal of Experimental Psychology. 1948;38:205–223.
  • Hill H, Johnston A. The hollow-face illusion: Object-specific knowledge, general assumptions or properties of the stimulus? Perception. 2007;36:199–223.
  • Julesz B. Binocular depth perception of computer-generated patterns. Bell System Technical Journal. 1960;39:1125–1162.
  • Julesz B. Stereopsis and binocular rivalry of contours. Journal of the Optical Society of America. 1963;53:994–999.
  • Julesz B. Foundations of Cyclopean Perception. University of Chicago Press; 1971.
  • Kim HR, Angelaki DE, DeAngelis GC. Neural correlates of depth perception from motion parallax in macaque area MT. Program No. 577.08. 2011 Neuroscience Meeting Planner. Washington, DC: Society for Neuroscience; 2011. Online.
  • Kleiner M, Brainard D, Pelli D. What’s new in psychtoolbox-3? Perception. 2007;36 ECVP Abstract Supplement.
  • Macmillan NA, Creelman CD. Response bias: Characteristics of detection theory, threshold theory and “nonparametric” indexes. Psychological Bulletin. 1990;107:401–413.
  • Macmillan NA, Creelman CD. Detection theory: A user’s guide. Cambridge: Cambridge University Press; 1991.
  • Moeeny A, Cumming B. Effect of pursuit eye movement on responses of MT neurons systematically depends on disparity and motion direction preference. Program No. 578.12. 2011 Neuroscience Meeting Planner. Washington, DC: Society for Neuroscience; 2011. Online.
  • Nadler JW, Angelaki DE, DeAngelis GC. A neural representation of depth from motion parallax in macaque visual cortex. Nature. 2008;452:642–645.
  • Nadler JW, Nawrot M, Angelaki DE, DeAngelis GC. MT neurons combine visual motion with a smooth eye movement signal to code depth-sign from motion parallax. Neuron. 2009;63:523–532.
  • Naji JJ, Freeman TCA. Perceiving depth order during pursuit eye movements. Vision Research. 2004;44:3025–3034.
  • Nawrot M. Eye movements provide the extra-retinal signal required for the perception of depth from motion parallax. Vision Research. 2003a;43:1553–1562.
  • Nawrot M. Depth from motion parallax scales with eye movement gain. Journal of Vision. 2003b;3:841–851.
  • Nawrot M, Joyce L. The pursuit theory of motion parallax. Vision Research. 2006;46:4709–4725.
  • Nawrot M, Stroyan K. The motion/pursuit law for visual depth perception from motion parallax. Vision Research. 2009;49:1969–1978.
  • Nawrot M, Stroyan K. Integration time for the perception of depth from motion parallax. Vision Research. 2012;59:64–71.
  • Papathomas TV. Experiments on the role of painted cues in Hughes’s reverspectives. Perception. 2002;31:521–530.
  • Papathomas TV. Art pieces that ‘move’ in our minds: an explanation of illusory motion based on depth reversal. Spatial Vision. 2007;21:79–95.
  • Papathomas TV, Bono LM. Experiments with a hollow mask and reverspective: Top-down influences in the inversion effect for 3-D stimuli. Perception. 2004;33:1129–1138.
  • Pelli DG. The video toolbox software for visual psychophysics: Transforming numbers into movies. Spatial Vision. 1997;10:437–442.
  • Rogers BJ. The role of visual and extraretinal information in disambiguating motion parallax transformations: A test of Nawrot’s (2003) pursuit eye movement hypothesis. Perception. 2010;39:107. ECVP Abstract Supplement.
  • Rogers B, Graham M. Motion parallax as an independent cue for depth perception. Perception. 1979;8:125–134.
  • Rogers B, Gyani A. Binocular disparities, motion parallax, and geometric perspective in Patrick Hughes’s ‘reverspectives’: Theoretical analysis and empirical findings. Perception. 2010;39:330–348.
  • Rogers S, Rogers BJ. Visual and nonvisual information disambiguate surfaces specified by motion parallax. Perception & Psychophysics. 1992;52:446–452.
  • Smith OW, Smith PC. On motion parallax and perceived depth. Journal of Experimental Psychology. 1963;65:107–108.
  • Stroyan K. Motion parallax is asymptotic to binocular disparity. 2010. http://arxiv.org/abs/1010.0575
  • Stroyan K, Nawrot M. Visual depth from motion parallax and eye pursuit. Journal of Mathematical Biology. 2011;64:1157–1188.
  • Tschermak-Seysenegg A. Ueber parallaktoskopie. Pfluegers Archiv fur Physiologie. 1939;241:455–469.
  • Yellott JI, Kaiwi JL. Depth inversion despite stereopsis: The appearance of random dot stereograms on surfaces seen in reverse perspective. Perception. 1979;8:135–142.

What is the difference in retinal images that plays a key role in our ability to perceive depth?

Retinal disparity is the slight difference between the images formed in the left and right eyes when both are focused on a single object. It is a binocular visual cue that allows people to perceive depth and distance.

How does retinal disparity influence depth perception?

Retinal disparity provides a binocular cue that facilitates depth perception: the distance between the eyes creates the two different images needed for good depth perception.

When we use retinal disparity to perceive depth what do we compare?

Retinal disparity is a binocular cue used to perceive depth between two near objects. It does so by comparing the different images from the two retinas. Each eye receives a different image because the eyes are usually around two and a half inches apart.

What is the difference between retinal disparity and convergence?

Retinal disparity increases as an object gets closer to the eyes. The brain uses retinal disparity to estimate the distance between the viewer and the object being viewed. Convergence is the inward turning of the eyes to look at an object close up.
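The geometry behind these two cues can be sketched numerically. The Python example below is only an illustration under simplifying assumptions that do not come from the text: the object lies straight ahead on the midline, the eyes are about 6.35 cm (two and a half inches) apart, and the function names `vergence_angle_deg` and `relative_disparity_deg` are hypothetical. It computes the convergence angle needed to fixate a point, and the disparity between two points as the difference of their convergence angles.

```python
import math

# Assumed interocular distance: about two and a half inches, in meters.
IOD_M = 0.0635


def vergence_angle_deg(distance_m, iod_m=IOD_M):
    """Angle (degrees) between the two lines of sight when both eyes
    fixate a point straight ahead at the given distance."""
    return math.degrees(2.0 * math.atan(iod_m / (2.0 * distance_m)))


def relative_disparity_deg(fixation_m, object_m, iod_m=IOD_M):
    """Disparity (degrees) of an object relative to the fixated point,
    taken as the difference of the two vergence angles.
    Positive values mean the object is nearer than fixation."""
    return vergence_angle_deg(object_m, iod_m) - vergence_angle_deg(fixation_m, iod_m)


if __name__ == "__main__":
    # Convergence: the eyes turn inward more for closer objects.
    for d in (0.25, 0.5, 1.0, 2.0):
        print(f"fixate at {d:4.2f} m -> vergence {vergence_angle_deg(d):6.3f} deg")

    # Disparity: the same 10 cm depth step gives a larger disparity up close.
    for fix in (0.5, 1.0, 2.0):
        disp = relative_disparity_deg(fix, fix - 0.10)
        print(f"fixate at {fix:4.2f} m, object 10 cm nearer -> disparity {disp:6.3f} deg")
```

Under these assumptions, both the convergence angle and the disparity produced by a fixed depth step grow as viewing distance shrinks, which is why both cues are most informative for near objects.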