Particularly Exciting Experiments in Psychology

August 24, 2018

Factors That Influence Audio-Visual Integration

When a car speeds past, you perceive the sight of the car and the sound of its engine as part of a single event, rather than two separate experiences. Such audiovisual integration is impressive considering that the input from each modality is processed by different sensory organs.

Given that we generally encounter a single visual object associated with a single auditory stimulus (e.g., one car makes one engine sound), it has been suggested that the upper limit on the number of visual and auditory events that can be integrated together is one.

Wilbiks and Dyson (2018, Journal of Experimental Psychology: Human Perception and Performance) tested whether perceptual manipulations can increase audiovisual integration capacity beyond this presumed limit.

On each trial, arrays of 8 black or white dots were presented in 10 sequential frames, with a subset of dots changing from black to white (or white to black) in each frame. On the 9th frame, dot array onset was accompanied by an auditory tone. After the 10th frame and a 1000 ms blank screen, the 9th array was presented again, and participants had to indicate whether the dot at a probed location had changed in that frame.
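To make the trial structure concrete, here is a minimal sketch of how one such trial could be generated. It is an illustration only, not the authors' stimulus code: the function names, the `n_changing` parameter (how many dots flip per frame), and the choice to treat the first frame as a random starting array are all assumptions that the description above does not specify.

```python
import random

N_DOTS = 8        # dots per array, as described in the text
N_FRAMES = 10     # sequential frames per trial
TONE_FRAME = 9    # 1-indexed frame whose onset is accompanied by the tone


def flip(color):
    """Return the opposite polarity of a dot."""
    return "white" if color == "black" else "black"


def make_trial(n_changing, rng):
    """Build one hypothetical trial.

    Assumption: frame 1 is a random array, and every later frame flips
    `n_changing` randomly chosen dots relative to the previous frame.
    Returns (frames, changed_on_tone_frame, probe, answer).
    """
    frames = [[rng.choice(("black", "white")) for _ in range(N_DOTS)]]
    changed_per_frame = [set()]  # nothing "changed" on the starting frame
    for _ in range(N_FRAMES - 1):
        changed = set(rng.sample(range(N_DOTS), n_changing))
        previous = frames[-1]
        frames.append(
            [flip(c) if i in changed else c for i, c in enumerate(previous)]
        )
        changed_per_frame.append(changed)

    changed_on_tone_frame = changed_per_frame[TONE_FRAME - 1]
    probe = rng.randrange(N_DOTS)            # location probed after the blank
    answer = probe in changed_on_tone_frame  # True if that dot changed with the tone
    return frames, changed_on_tone_frame, probe, answer
```

Calling `make_trial(2, random.Random(0))` returns ten frames of eight dots each, the set of dots that changed on the tone frame, the probed location, and the correct response for that probe.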

Estimates of audiovisual capacity (K) exceeded 1 if the target dot changed from black to white in synchrony with a high-pitched tone, or from white to black in synchrony with a low-pitched tone, that is, if there was structural congruency between auditory pitch and visual brightness. However, this effect was evident only at the slow presentation rate (700 ms between frames), not at the fast rate (200 ms between frames).

Estimates of audiovisual capacity (K) exceeded 1 for both slow and fast presentation rates when the dots that changed in each frame were connected by lines overlaid on the dot array (Experiment 2).

The authors conclude that audiovisual integration capacity is flexible, and depends on stimulus factors.

Wilbiks and Dyson tested how stimulus factors influence audiovisual integration. However, audiovisual integration may also be influenced by participant-level factors.

In Parker and Robinson (2018, Psychology and Aging), participants had to report the number of beeps (auditory response) or flashes (visual response) in cross-modal conditions where both beeps and flashes were presented together.

On congruent trials, there was the same number of beeps and flashes (e.g., 3 beeps and 3 flashes); on incongruent trials, there was a mismatch between the number of beeps and flashes (e.g., 4 beeps and 3 flashes).

The difference in accuracy for visual responses between congruent vs. incongruent trials was largest for young adults (18–21 years old), followed by older adults (62–89 years old), with children (average 5–13 years old) showing the smallest effect of congruency. In contrast, the presence of flashes did not influence beep perception at all in young adults, whereas there were congruency effects in auditory responses for children and older adults.
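The congruency effect compared across age groups is simply accuracy on congruent trials minus accuracy on incongruent trials. The sketch below shows that arithmetic; the function names and the trial data are made up for illustration and are not values from Parker and Robinson.

```python
def accuracy(trials, condition):
    """Proportion of correct responses among trials of one condition."""
    outcomes = [correct for cond, correct in trials if cond == condition]
    return sum(outcomes) / len(outcomes)


def congruency_effect(trials):
    """Accuracy on congruent trials minus accuracy on incongruent trials."""
    return accuracy(trials, "congruent") - accuracy(trials, "incongruent")


# Hypothetical responses from one participant: (condition, was_correct)
made_up_trials = [
    ("congruent", True), ("congruent", True),
    ("congruent", True), ("congruent", False),
    ("incongruent", True), ("incongruent", False),
    ("incongruent", False), ("incongruent", False),
]

print(congruency_effect(made_up_trials))  # 0.75 - 0.25 = 0.5
```

A larger value means the other modality had more influence on the reported count; a value near zero means responses were largely unaffected by it.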

This suggests that auditory information dominates audiovisual perception in young adults: auditory information influenced visual perception but not vice versa. Children and older adults showed more symmetrical effects of auditory and visual information, indicating that the contributions of visual and auditory information to audiovisual integration vary across the life span.


  • Wilbiks, J. M. P., & Dyson, B. J. (2018). The contribution of perceptual factors and training on varying audiovisual integration capacity. Journal of Experimental Psychology: Human Perception and Performance, 44(6), 871–884.
  • Parker, J. L., & Robinson, C. W. (2018). Changes in multisensory integration across the life span. Psychology and Aging, 33(3), 545–558.

Author Commentary

In addition to identifying cars (or birds, or people), we also find it interesting to consider how audiovisual integration capacity would be implicated in a potentially life-threatening situation. For example, if you were surrounded by a lion, a tiger, and a bear (oh my!), how would you identify which one is making that "I'm so hungry" sound?

Our research shows that you would prioritize temporal information — whichever mouth was open during the hungry sound — over trying to identify based on the type of sound.

We also find that if temporal information is ambiguous, you will integrate as many potential threatening stimuli as possible in the moment, and then try to narrow down your choices once the threat has passed.


Jonathan Wilbiks is an assistant professor of experimental psychology at the University of New Brunswick. In addition to research on the integration of auditory and visual information, he conducts research examining the effects of musical training on memory, as well as other musical perceptual processes. If you are interested in discussing any research ideas further, you can contact Dr. Wilbiks via email.

Many situations require the simultaneous processing and integration of multisensory information. To the best of our knowledge, this is the first study to use the same task and stimuli across children, young adults, and older adults to examine developmental changes in multisensory integration.

The differences between the three age groups suggest that the underlying mechanisms may change with age, affecting the simultaneous processing of auditory and visual information.

Further research is needed to determine what factors facilitate and inhibit integration of auditory and visual information and to better understand the underlying factors that account for developmental change.


Jessica Parker recently graduated from the Ohio State University with a Bachelor of Science in Psychology. Chris Robinson is an assistant professor at Ohio State Newark, where he studies (a) how infants, children, and adults process and integrate multisensory information and (b) how this ability subserves various cognitive tasks such as word learning, categorization, individuation, and statistical learning.

Discussion Questions

  1. Describe the psychological concept of "chunking". How does chunking explain the results from Wilbiks & Dyson, Experiment 2, where connecting the dots that changed in each frame increased audiovisual integration capacity?
  2. Parker and Robinson also included baseline trials where only beeps or only flashes were presented. Explain why these baseline trials are necessary for distinguishing between facilitation effects on congruent trials and interference effects on incongruent trials.
  3. Explain one audio-visual illusion (e.g., McGurk effect, sound-induced flash illusion).