3D Shape Perception Lab

Introduction to Spatial/Temporal Integration

In our daily life, we move in the environment, grab objects and perform a number of actions that are fundamental to our survival.  These actions may seem very natural and effortless despite the fact that the brain performs an extremely complex analysis of the light pattern that falls on our eyes in order to determine the structure and shape of the surrounding objects.  This problem is very difficult to solve since objects are three-dimensional but our eyes only register their two-dimensional projection (also called retinal image), like the film in a camera.

Most vision scientists have approached this problem by asking the following question: How does the brain derive the 3D structure of objects from the information that is present in a certain instant of time in a certain region of the retinal image? Because the image on the retina is two-dimensional and time can be added as a third dimension, the visual stimuli can be represented in a three-dimensional space.  The research conducted so far has focused on the problem of how local regions of this space-time domain are analyzed by the visual system, while the problem of how the visual system is capable of integrating the information contained in different spatial-temporal regions has been neglected.

The overall objective of the present research project is to investigate the spatial-temporal integration of information in the recovery of 3D shape from retinal projections.  In particular, three goals will be pursued.  First, the research will investigate how local visual processing is affected by interactions with stimulus information present at different space-time locations.  Second, the research will explore the stimulus conditions that are responsible for spatial and temporal organization.  Third, the research will determine whether spatial and temporal interactions occur among different sources of depth information.  Understanding how the human visual system solves this problem will not only be a valuable advance in the study of visual perception but could also produce novel insights toward the building of machines that mimic our behavior and interactions.

Some Cues to Depth

Convergence

Convergence is the degree to which the eyes are turned inward in order to fixate on an object.  The eyes are said to fixate on something when they are both aimed directly at the same point in space.  The lines of sight from the two eyes meet at that point to form an angle called the convergence angle, which varies with the distance of the object from the viewer: smaller angles occur at farther distances and larger angles at closer distances.  The convergence angle can theoretically provide information about the absolute distance of an object.
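The relationship between distance and convergence angle can be sketched with simple geometry.  The example below is a minimal illustration, not a model used by the lab: it assumes the fixated point lies straight ahead of the viewer and uses an interpupillary distance of 6.3 cm, a typical adult value chosen here only for illustration.  The function name convergence_angle_deg is hypothetical.

```python
import math

def convergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when fixating a point straight ahead.

    distance_m: distance from the midpoint between the eyes to the point (meters).
    ipd_m: interpupillary distance in meters (assumed value, for illustration).
    """
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

# The angle shrinks as the fixated point moves farther away.
for d in (0.25, 0.5, 1.0, 4.0):
    print(f"{d:4.2f} m -> {convergence_angle_deg(d):5.2f} deg")
```

Because the angle changes very little once the object is more than a few meters away, this cue is most informative at near distances.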

Stereo Vision / Binocular Disparity

Because the eyes are displaced laterally on the head, the same point in space projects to slightly different positions on the two retinas.  The way in which these images are displaced depends on the distance of the object from the viewer.  Points located nearer to the observer than the fixation point will have a greater binocular disparity than those located farther away.  Binocular disparity thus provides relative depth information and theoretically could provide absolute depth information, if the disparities are scaled by the known distance to the fixation point.
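One way to make this concrete is to express the disparity of a point as the difference between the convergence angle it would require and the convergence angle at fixation.  The sketch below rests on stated assumptions: points lie straight ahead of the viewer, the interpupillary distance is 6.3 cm, and disparity is signed so that points nearer than fixation are positive.  The function names are hypothetical.

```python
import math

def vergence_rad(distance_m, ipd_m=0.063):
    """Convergence angle (radians) needed to fixate a point straight ahead."""
    return 2.0 * math.atan(ipd_m / (2.0 * distance_m))

def disparity_arcmin(point_m, fixation_m, ipd_m=0.063):
    """Signed disparity of a point relative to fixation, in minutes of arc.

    Positive for points nearer than fixation, negative for farther points,
    zero at the fixation distance (sign convention assumed for illustration).
    """
    delta = vergence_rad(point_m, ipd_m) - vergence_rad(fixation_m, ipd_m)
    return math.degrees(delta) * 60.0

# The same 10 cm depth step yields a smaller disparity at a farther fixation distance.
print(disparity_arcmin(0.9, 1.0))
print(disparity_arcmin(1.9, 2.0))
```

The second value is much smaller than the first even though the physical depth interval is the same, which is why a disparity must be scaled by the fixation distance before it can be read as a depth.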

A good way to see binocular disparity in action is to close one eye and line your two index fingers up, one about six inches and the other about a foot away from your face.  When the farther finger appears to be completely blocked by the closer one, switch eyes.  You will notice that the scene looks noticeably different: the farther finger is no longer hidden, because each eye views the fingers from a slightly different position.

Below is a set of stereo images.  To see the scene in stereo, aim your eyes at a point either behind the screen (parallel method) or in front of the screen (cross-eyed method) while keeping the images on the screen in focus.  Many people find this difficult to do at first, but it gets easier with practice.

[Images: view of hallway from left eye; view of hallway from right eye]
View with the parallel method

[Images: view of hallway from right eye; view of hallway from left eye]
View with the cross-eyed method

Motion

Motion is an important cue to depth; the information comes from motion parallax or from object motion.  Motion parallax refers to the different rates of motion of stationary objects as they vary in distance from a moving observer.  As the standpoint of the observer changes, objects at different distances will have different retinal velocities.  The images of closer objects move farther and faster than those of more distant objects.  The different velocities therefore provide cues about the relative distance of points in the environment.
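The dependence of retinal velocity on distance can be illustrated with a simplified case.  The sketch below assumes an observer translating sideways at constant speed, an eye that does not rotate, and a stationary point that is directly abeam of the observer at the instant considered; under those assumptions the angular velocity of the point is the observer's speed divided by the point's distance.  The function name is hypothetical.

```python
import math

def retinal_speed_deg_per_s(observer_speed_m_s, distance_m):
    """Angular speed of a stationary point directly abeam of a translating observer.

    Assumes the eye does not rotate and the point lies perpendicular to the
    direction of travel (simplifying assumptions for illustration).
    """
    return math.degrees(observer_speed_m_s / distance_m)

# Walking at 1 m/s: nearby points sweep across the retina far faster than distant ones.
for d in (1.0, 5.0, 25.0):
    print(f"{d:5.1f} m -> {retinal_speed_deg_per_s(1.0, d):6.2f} deg/s")
```

The ratio of two such speeds equals the inverse ratio of the distances, which is the sense in which parallax signals relative rather than absolute distance.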

Texture

The systematic and gradual changes in the texture of a surface also provide cues to depth.  Texture elements of equal physical size project smaller images as distance increases (imagine sitting on a tiled kitchen floor; the tiles close to you appear larger than those farther away).  Relative distance across the surface can therefore be judged based on the size of these texture items.

The projection of the individual texture elements on the retina also deforms in shape due to changes in depth.  Consider again the tiled kitchen floor; the parallel lines formed by the rows of tiles converge somewhat as they stretch off into the distance, and individual tiles look more like trapezoids than squares.  Because the visual system assumes that the texture is regular, however, it interprets these deformations as changes in distance and depth.
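Both texture effects, shrinking size and trapezoidal deformation, follow from perspective projection.  The sketch below is an illustration under stated assumptions: a pinhole eye looking horizontally, a flat floor, square tiles 30 cm on a side, and an eye height of 1.2 m.  The function names and parameter values are hypothetical.

```python
def project(x, y, z, focal=1.0):
    """Pinhole projection of a 3D point (eye at the origin, looking along +z)."""
    return (focal * x / z, focal * y / z)

def tile_edge_widths(near_z, tile_size=0.3, eye_height=1.2):
    """Image widths of the near and far edges of a square floor tile.

    near_z: distance along the floor to the tile's near edge (meters).
    The floor lies eye_height below the eye (assumed values, for illustration).
    """
    half = tile_size / 2.0
    far_z = near_z + tile_size
    near_w = project(half, -eye_height, near_z)[0] - project(-half, -eye_height, near_z)[0]
    far_w = project(half, -eye_height, far_z)[0] - project(-half, -eye_height, far_z)[0]
    return near_w, far_w

# Tiles farther away project smaller, and each tile's far edge is narrower than its near edge.
for z in (1.0, 2.0, 4.0):
    near_w, far_w = tile_edge_widths(z)
    print(f"tile at {z:3.1f} m: near edge {near_w:.3f}, far edge {far_w:.3f}")
```

The far edge of every tile is narrower than its near edge, which is the trapezoid described above, and both widths shrink as the tile moves away.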

Accommodation

Accommodation is the process by which the eye alters the curvature of the lens to view objects at different distances clearly.  Viewing closer objects requires the lens to be more curved, while viewing distant objects requires it to be flatter.  The brain then factors this physical information into our judgments of distance.
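The amount of focusing required is conventionally expressed in diopters, the reciprocal of the viewing distance in meters.  The sketch below uses that standard optical relation and assumes an eye that is in focus at infinity when fully relaxed; the function name is hypothetical.

```python
def accommodative_demand_diopters(distance_m):
    """Extra optical power (diopters) needed to focus at distance_m.

    Assumes the relaxed eye is focused at infinity (simplifying assumption).
    """
    return 1.0 / distance_m

# Demand changes steeply at near distances and hardly at all at far ones.
for d in (0.25, 0.5, 1.0, 5.0, 10.0):
    print(f"{d:5.2f} m -> {accommodative_demand_diopters(d):4.2f} D")
```

Moving an object from 25 cm to 50 cm changes the demand by 2 diopters, whereas moving it from 5 m to 10 m changes it by only 0.1 diopter, so accommodation can only be informative about distance for nearby objects.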

Current Research

Cue Combination & Virtual Reality

Current work performed in the 3D Shape Perception Lab concerns the ways in which people combine any number of cues to depth in order to extract three-dimensional structure as accurately and efficiently as possible.  In everyday experience, one encounters many of these cues at once, and it is difficult to interpret the individual influence of any single cue.  This is where virtual display becomes crucial; in a virtual environment, the presence (or absence) of these cues can be controlled.  The ability to generate, for example, only one cue at a time, or even cues in conflict with one another, allows for insight that would be otherwise unavailable in natural environments.
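To illustrate what a cue-conflict experiment can reveal, the sketch below implements reliability-weighted averaging, one widely used textbook model of cue combination.  It is offered only as an example and is not a description of the model the lab itself tests or adopts.  Each cue is assumed to give an independent depth estimate with a known variance; the function name and the numbers are hypothetical.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted average of independent depth estimates.

    Each cue is weighted by the inverse of its variance, so more reliable
    cues count for more. Returns the combined estimate and its variance.
    Illustrative model only, not the lab's own method.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    return combined, 1.0 / total

# Hypothetical conflict: stereo signals 10 cm of depth, motion signals 20 cm,
# and stereo is assumed to be four times more reliable than motion.
depth, variance = combine_cues([10.0, 20.0], [1.0, 4.0])
print(depth, variance)
```

Under this model, the perceived depth in a conflict display falls between the two cue values and lies closer to the more reliable cue, so measuring where it falls indicates how much weight each cue receives.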

The majority of research undertaken by the 3D Shape Perception Lab uses virtual stimuli.  Current experiments include:

  • Perceptual and motor-based response to conflicting depth cues
  • Haploscope-aided study of the effect of convergence angle on perception of individual depth cues
  • Investigating the processing location(s) of depth cues and cue combination using fMRI
  • Adaptation following prolonged heightened exposure to individual depth cues

Real-World Replication

Three-dimensional perception studies conducted with virtual stimuli have long drawn criticism.  A new series of studies in the 3D Shape Perception Lab seeks to put some of these criticisms to rest.  The recent construction of a large-scale, real-world version of our most prominent virtual display will allow us to test whether people perceive and behave more similarly in response to virtual stimuli on a computer screen and tangible stimuli in everyday surroundings than critics believe.


In the experiment pictured above, subjects (at a distance from the setup) are asked to adjust the front pole to create a right angle with the back two poles.  This replicates directly one of our virtual reality experiments.

Haploscope

The haploscope, pictured below, is a device used to control the convergence angle of the eyes when viewing an image.  A haploscope consists of two adjustable monitors that are rigidly attached to adjacent mirrors, allowing for the projection of a different image onto each retina.  The rigidity of this setup ensures that the projections on the two eyes (whether similar or identical) will always be able to fuse, appearing as a single two- or three-dimensional image.  Currently, the 3D Shape Perception Lab uses a haploscope to study how perceptual experience at various convergence angles affects depth perception from multiple depth cues, namely stereo and motion.


Haploscope in the 3D Shape Perception Lab

fMRI (functional magnetic resonance imaging)

In collaboration with the Badre Lab, the 3D Shape Perception Lab is beginning a series of fMRI studies.  fMRI is a type of MRI scan that maps dynamic blood flow throughout the brain or spinal cord as a representation of neural activity.  This relatively new form of neuroimaging is especially appealing due to its detailed and noninvasive nature.  A new study in our lab seeks to examine how and where the brain processes different depth cues - specifically stereo and motion - and whether or not these cues are combined before a depth interpretation is assigned to them.

Perception & Action

Another topic of great interest to the 3D Shape Perception Lab is perceptual- versus motor-based understanding of visual stimuli.  For example, does a person's brain perceive an object in front of them differently depending on whether or not they need to reach out and grasp it? Several experiments conducted over the past few years suggest that the answer to this question is yes, with differing ventral and dorsal streams of processing offered as a tentative explanation.  Current experiments in the lab on this topic require individuals to first make perceptual judgments about surfaces defined by conflicting cue combinations and then make motor movements towards these surfaces in a virtual environment.  We are interested in the potential differences in depth and slant perception between these two conditions.

Haptic feedback - the experience of actual touch in conjunction with visual perception - also serves an important purpose here.  We are currently working on a design to include a random haptic feedback system in our Perception & Action apparatus, a change that will be implemented in the near future to give us further insight into motor-based visual processing.