DOI: 10.2741/S417

Frontiers in Bioscience-Scholar (FBS) is published by IMR Press from Volume 13 Issue 1 (2021). Previous articles were published by another publisher on a subscription basis, and they are hosted by IMR Press as a courtesy and upon agreement with Frontiers in Bioscience.

How do we recognise who is speaking?
1 MPRG Neural Mechanisms of Human Communication, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1, 04103 Leipzig, Germany
2 Center for Computational Neuroscience and Neurotechnology, Boston University, 677 Beacon Street, Boston, MA 02215, USA
3 Department of Psychology, Humboldt University of Berlin, Rudower Chaussee 18, 12489 Berlin, Germany

*Author to whom correspondence should be addressed.


Front. Biosci. (Schol Ed) 2014, 6(1), 92–109
Published: 1 January 2014

The human brain effortlessly extracts a wealth of information from natural speech, which allows the listener to both understand the speech message and recognise who is speaking. This article reviews behavioural and neuroscientific work that has attempted to characterise how listeners achieve speaker recognition. Behavioural studies suggest that the action of a speaker's glottal folds and the overall length of their vocal tract carry important voice-quality information. Although these cues are useful for discriminating and recognising speakers under certain circumstances, listeners may use virtually any systematic feature for recognition. Neuroscientific studies have revealed that speaker recognition relies upon a predominantly right-lateralised network of brain regions. Specifically, the posterior parts of the superior temporal sulcus appear to perform some of the acoustical analyses necessary for the perception of speaker and message, whilst anterior portions may play a more abstract role in perceiving speaker identity. This voice-processing network is supported by direct, early connections to non-auditory regions, such as the visual face-sensitive area in the fusiform gyrus, which may serve to optimise person recognition.

Speaker Recognition
Voice Perception