

Sound is the sensation produced when longitudinal vibrations of the molecules in the external environment (that is, alternate phases of condensation and rarefaction of the molecules) strike the tympanic membrane. A plot of these movements as changes in pressure on the tympanic membrane per unit of time is a series of waves (Figure 10–8); such movements in the environment are generally called sound waves. The waves travel through air at a speed of approximately 344 m/s (770 mph) at 20°C at sea level. The speed of sound in air increases with temperature; because air is generally colder at higher altitude, the speed of sound tends to fall with altitude in the lower atmosphere. Other media can also conduct sound waves, but at a different speed. For example, the speed of sound is 1450 m/s at 20°C in fresh water and is even greater in salt water. It is said that the whistle of the blue whale is as loud as 188 dB and is audible for 500 miles.
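
The speeds quoted above can be checked with a short calculation. The sketch below is illustrative only: the linear formula for the speed of sound in dry air (331.3 + 0.606 × temperature in °C) is a common approximation that does not come from the text, and the function names are hypothetical.

```python
# Minimal sketch; the linear approximation is an assumption, not from the text.

def speed_of_sound_air(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c


def ms_to_mph(speed_ms):
    """Convert metres per second to miles per hour (1 mile = 1609.344 m)."""
    return speed_ms * 3600.0 / 1609.344


def travel_time_s(distance_m, speed_ms):
    """Time in seconds for sound to cover distance_m at speed_ms."""
    return distance_m / speed_ms


if __name__ == "__main__":
    # Speed in air at 20 degrees C, close to the 344 m/s quoted in the text.
    print(round(speed_of_sound_air(20.0), 1), "m/s")
    # The text's 344 m/s expressed in mph.
    print(round(ms_to_mph(344.0)), "mph")
    # Time to cover 1 km in air versus fresh water (1450 m/s from the text).
    print(round(travel_time_s(1000.0, 344.0), 2), "s in air")
    print(round(travel_time_s(1000.0, 1450.0), 2), "s in fresh water")
```

The comparison of travel times shows why sound reaches a listener more than four times sooner through fresh water than through air over the same distance.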


FIGURE 10–8

Characteristics of sound waves. A is the record of a pure tone. B has a greater amplitude and is louder than A. C has the same amplitude as A but a greater frequency, and its pitch is higher. D is a complex wave form that is regularly repeated. Such patterns are perceived as musical sounds, whereas waves like that shown in E, which have no regular pattern, are perceived as noise.

In general, the loudness of a sound is directly correlated with the amplitude of a sound wave. The pitch of a sound is directly correlated with the frequency (number of waves per unit of time) of the sound wave. Sound waves that have repeating patterns, even though the individual waves are complex, are perceived as musical sounds; aperiodic nonrepeating vibrations cause a sensation of noise. Most musical sounds are made up of a wave with a primary frequency that determines the pitch of the sound plus a number of harmonic vibrations (overtones) that give the sound its characteristic timbre (quality). Variations in timbre permit us to identify the sounds of the various musical instruments even though they are playing notes of the same pitch.

Although the pitch of a sound depends primarily on the frequency of the sound wave, loudness also plays a part; low tones (below 500 Hz) seem lower and high tones (above 4000 Hz) seem higher as their loudness increases. Duration also affects pitch to a minor degree. The pitch of a tone cannot be perceived unless it lasts for more than 0.01 s, and with durations between 0.01 and 0.1 s, pitch rises as duration increases. Finally, the pitch of complex sounds that include harmonics of a given frequency is still perceived even when the primary frequency (missing fundamental) is absent.
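
The missing-fundamental effect can be illustrated with a deliberately simple model: the pitch implied by a set of harmonics is taken to be the highest frequency of which every component is a whole-number multiple. This is an assumption made for illustration, not a description of how the auditory system computes pitch, and the function name is hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonics_hz):
    """Return the largest frequency (Hz) that divides every component exactly.

    Assumes the components are integer frequencies in Hz.
    """
    return reduce(gcd, harmonics_hz)


if __name__ == "__main__":
    # Harmonics of a 200 Hz tone with the 200 Hz component itself removed.
    components = [400, 600, 800, 1000]
    print(implied_fundamental(components))
```

Under this model the components 400, 600, 800, and 1000 Hz imply a 200 Hz fundamental even though no energy is present at 200 Hz.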

The amplitude of a sound wave can be expressed in terms of the maximum pressure change at the eardrum, but a relative scale is more convenient. The decibel scale is such a scale. The intensity of a sound in bels is the logarithm of the ratio of the intensity of that sound and a standard sound. A decibel (dB) is 0.1 bel. The standard sound reference level adopted by the Acoustical Society of America corresponds to 0 dB at a pressure level of 0.000204 dyne/cm², a value that is just at the auditory threshold for the average human. A value of 0 dB does not mean the absence of sound but a sound level of an intensity equal to that of the standard. The 0–140 dB range from threshold pressure to a pressure that is potentially damaging to the organ of Corti actually represents a 10⁷ (10 million)-fold variation in sound pressure. Put another way, atmospheric pressure at sea level is 15 lb/in² or 1 bar, and the range from the threshold of hearing to potential damage to the cochlea is 0.0002–2000 μbar.
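
The decibel arithmetic can be sketched as follows. Because a decibel is 0.1 bel, the level in dB is 10 times the base-10 logarithm of the intensity ratio. The pressure form (20 times the logarithm of the pressure ratio) relies on sound intensity being proportional to the square of sound pressure, a standard relation that the paragraph above does not state explicitly. The function names are hypothetical.

```python
import math

# Reference pressure quoted in the text (dyne/cm^2), taken as 0 dB.
REFERENCE_PRESSURE_DYNE_CM2 = 0.000204


def db_from_intensity_ratio(ratio):
    """Level in dB for a given ratio of sound intensity to the standard."""
    return 10.0 * math.log10(ratio)


def db_from_pressure_ratio(ratio):
    """Level in dB for a given ratio of sound pressure to the standard.

    Assumes intensity is proportional to pressure squared.
    """
    return 20.0 * math.log10(ratio)


def pressure_ratio_from_db(level_db):
    """Inverse of db_from_pressure_ratio."""
    return 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    # A sound equal to the standard sits at 0 dB, not at silence.
    print(db_from_pressure_ratio(1.0))
    # Pressure ratio spanned by the 0 to 140 dB range.
    ratio = pressure_ratio_from_db(140.0)
    print(ratio)
    # Corresponding pressure at 140 dB in dyne/cm^2.
    print(ratio * REFERENCE_PRESSURE_DYNE_CM2)
```

A pressure ratio of ten million corresponds to an intensity ratio of 10 to the 14th power, which is why the logarithmic scale is the more convenient one.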

A range of 120–160 dB (eg, firearms, jackhammer, and jet plane on takeoff) is classified as painful; 90–110 dB (eg, subway, bass drum, chain saw, and lawn mower) is classified as extremely high; 60–80 dB (eg, alarm clock, busy traffic, dishwasher, and conversation) is classified as very loud; 40–50 dB (eg, moderate rainfall and normal room noise) is moderate; and 30 dB (eg, whisper and library) is faint. Prolonged or frequent exposure to sounds greater than 85 dB can cause hearing loss.
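
The categories above can be written as a small lookup. This is a sketch only: the function and constant names are hypothetical, and levels that fall between the listed ranges (for example 85 dB) are deliberately left unclassified because the text does not assign them a category.

```python
# Ranges and labels as listed in the text: (low dB, high dB, label).
CATEGORIES = [
    (120, 160, "painful"),
    (90, 110, "extremely high"),
    (60, 80, "very loud"),
    (40, 50, "moderate"),
    (30, 30, "faint"),
]

# Level above which prolonged or frequent exposure can cause hearing loss.
HEARING_RISK_DB = 85


def classify(level_db):
    """Return the category label for level_db, or None if the text gives none."""
    for low, high, label in CATEGORIES:
        if low <= level_db <= high:
            return label
    return None


def prolonged_exposure_risk(level_db):
    """True if prolonged or frequent exposure at this level can cause hearing loss."""
    return level_db > HEARING_RISK_DB


if __name__ == "__main__":
    for level in (30, 45, 70, 100, 140):
        print(level, classify(level), prolonged_exposure_risk(level))
```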

The sound frequencies audible to humans range from about 20 to a maximum of 20,000 cycles per second (cps, Hz). In bats and dogs, much higher frequencies are audible. The threshold of the human ear varies with the pitch of the sound (Figure 10–9), the greatest sensitivity being in the 1000- to 4000-Hz range. The pitch of the average male voice in conversation is about 120 Hz and that of the average female voice about 250 Hz. The number of pitches that can be distinguished by an average individual is about 2000, but trained musicians can improve on this figure considerably. Pitch discrimination is best in the 1000- to 3000-Hz range and is poor at high and low pitches.


FIGURE 10–9

Human audibility curve. The middle curve is that obtained by audiometry under the usual conditions. The lower curve is that obtained under ideal conditions. At about 140 dB (top curve), sounds are felt as well as heard.

The presence of one sound decreases an individual’s ability to hear other sounds, a phenomenon known as masking. It is believed to be due to the relative or absolute refractoriness of previously stimulated auditory receptors and nerve fibers to other stimuli. The degree to which a given tone masks others is related to its pitch. The masking effect of the background noise in all but the most carefully soundproofed environments raises the auditory threshold by a definite and measurable amount.


The ear converts sound waves in the external environment into action potentials in the auditory nerves. The waves are transformed by the eardrum and auditory ossicles into movements of the footplate of the stapes. These movements set up waves in the fluid of the inner ear (Figure 10–10). The action of the waves on the organ of Corti generates action potentials in the nerve fibers.

FIGURE 10–10

Schematic representation of the auditory ossicles and the way their movement translates movements of the tympanic membrane into a wave in the fluid of the inner ear. The wave is dissipated at the round window. The movements of the ossicles, the membranous labyrinth, and the round window are indicated by dashed lines. The waves are transformed by the eardrum and auditory ossicles into movements of the footplate of the stapes. These movements set up waves in the fluid of the inner ear. In response to the pressure changes produced by sound waves on its external surface, the tympanic membrane moves in and out to function as a resonator that reproduces the vibrations of the sound source. The motions of the tympanic membrane are imparted to the manubrium of the malleus, which rocks on an axis through the junction of its long and short processes, so that the short process transmits the vibrations of the manubrium to the incus. The incus moves so that the vibrations are transmitted to the head of the stapes. Movements of the head of the stapes swing its footplate.

In response to the pressure changes produced by sound waves on its external surface, the tympanic membrane moves in and out. The membrane therefore functions as a resonator that reproduces the vibrations of the sound source. It stops vibrating almost immediately when the sound wave stops. The motions of the tympanic membrane are imparted to the manubrium of the malleus. The malleus rocks on an axis through the junction of its long and short processes, so that the short process transmits the vibrations of the manubrium to the incus. The incus moves in such a way that the vibrations are transmitted to the head of the stapes. Movements of the head of the stapes swing its footplate to and fro like a door hinged at the posterior edge of the oval window. The auditory ossicles thus function as a lever system that converts the resonant vibrations of the tympanic membrane into movements of the stapes against the perilymph-filled scala vestibuli of the cochlea (Figure 10–10). This system increases the sound pressure that arrives at the oval window, because the lever action of the malleus and incus multiplies the force 1.3 times and the area of the tympanic membrane is much greater than the area of the footplate of the stapes. Some sound energy is lost as a result of resistance, but it has been calculated that at frequencies below 3000 Hz, 60% of the sound energy incident on the tympanic membrane is transmitted to the fluid in the cochlea.
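
The pressure gain described above is the product of the lever ratio and the area ratio. The text gives the lever ratio (1.3) but not the areas, so the two area values in the sketch below are illustrative assumptions (commonly quoted approximate figures), and the function names are hypothetical.

```python
import math

LEVER_RATIO = 1.3          # force multiplication by malleus and incus (from the text)
TYMPANIC_AREA_MM2 = 55.0   # assumed effective area of the tympanic membrane
FOOTPLATE_AREA_MM2 = 3.2   # assumed area of the stapes footplate


def pressure_gain(lever_ratio, area_in_mm2, area_out_mm2):
    """Pressure at the oval window relative to pressure at the tympanic membrane.

    Pressure is force per unit area, so the gain is the lever ratio
    multiplied by the ratio of the collecting area to the output area.
    """
    return lever_ratio * (area_in_mm2 / area_out_mm2)


def gain_in_db(gain):
    """Express a pressure gain in decibels."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    gain = pressure_gain(LEVER_RATIO, TYMPANIC_AREA_MM2, FOOTPLATE_AREA_MM2)
    print(round(gain, 1), "fold")
    print(round(gain_in_db(gain), 1), "dB")
```

With these assumed areas, most of the gain comes from the area ratio rather than from the lever action.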

Contraction of the tensor tympani and stapedius muscles of the middle ear causes the manubrium of the malleus to be pulled inward and the footplate of the stapes to be pulled outward (Figure 10–2). This decreases sound transmission. Loud sounds initiate a reflex contraction of these muscles called the tympanic reflex. Its function is protective, preventing strong sound waves from causing excessive stimulation of the auditory receptors. However, the reaction time for the reflex is 40–160 ms, so it does not protect against brief intense stimulation such as that produced by gunshots.


Conduction of sound waves to the fluid of the inner ear via the tympanic membrane and the auditory ossicles, the main pathway for normal hearing, is called ossicular conduction. Sound waves also initiate vibrations of the secondary tympanic membrane that closes the round window. This process, unimportant in normal hearing, is air conduction. A third type of conduction, bone conduction, is the transmission of vibrations of the bones of the skull to the fluid of the inner ear. Considerable bone conduction occurs when tuning forks or other vibrating bodies are applied directly to the skull. This route also plays a role in transmission of extremely loud sounds.


The movements of the footplate of the stapes set up a series of traveling waves in the perilymph of the scala vestibuli. A diagram of such a wave is shown in Figure 10–11. As the wave moves up the cochlea, its height increases to a maximum and then drops off rapidly. The distance from the stapes to this point of maximum height varies with the frequency of the vibrations initiating the wave. High-pitched sounds generate waves that reach maximum height near the base of the cochlea; low-pitched sounds generate waves that peak near the apex. The bony walls of the scala vestibuli are rigid, but Reissner membrane is flexible. The basilar membrane is not under tension, and it is also readily depressed into the scala tympani by the peaks of waves in the scala vestibuli. Displacements of the fluid in the scala tympani are dissipated into air at the round window. Therefore, sound produces distortion of the basilar membrane, and the site at which this distortion is maximal is determined by the frequency of the sound wave. The tops of the hair cells in the organ of Corti are held rigid by the reticular lamina, and the hairs of the outer hair cells are embedded in the tectorial membrane (Figure 10–4). When the stapes moves, both membranes move in the same direction, but they are hinged on different axes, so a shearing motion bends the hairs. The hairs of the inner hair cells are not attached to the tectorial membrane, but they are apparently bent by fluid moving between the tectorial membrane and the underlying hair cells.

FIGURE 10–11

Traveling waves. Top: The solid and the short-dashed lines represent the wave at two instants of time. The long-dashed line shows the “envelope” of the wave formed by connecting the wave peaks at successive instants. Bottom: Displacement of the basilar membrane by the waves generated by stapes vibration of the frequencies shown at the top of each curve.


The inner hair cells are the primary sensory receptors that generate action potentials in the auditory nerves and are stimulated by the fluid movements noted above. The outer hair cells, on the other hand, respond to sound like the inner hair cells, but depolarization makes them shorten and hyperpolarization makes them lengthen. They do this over a very flexible part of the basilar membrane, and this action somehow increases the amplitude and clarity of sounds. Thus, outer hair cells amplify sound vibrations entering the inner ear from the middle ear. These changes in outer hair cells occur in parallel with changes in prestin, a membrane protein, and this protein may well be the motor protein of outer hair cells.

The olivocochlear bundle is a prominent bundle of efferent fibers in each auditory nerve that arises from both ipsilateral and contralateral superior olivary complexes and ends primarily around the bases of the outer hair cells of the organ of Corti. The activity in this nerve bundle modulates the sensitivity of these hair cells via the release of acetylcholine. The effect is inhibitory, and it may function to block background noise while allowing other sounds to be heard.


The frequency of the action potentials in single auditory nerve fibers is proportional to the loudness of the sound stimuli. At low sound intensities, each axon discharges to sounds of only one frequency, and this frequency varies from axon to axon depending on the part of the cochlea from which the fiber originates. At higher sound intensities, the individual axons discharge to a wider spectrum of sound frequencies, particularly to frequencies lower than that at which threshold stimulation occurs.

The major determinant of the pitch perceived when a sound wave strikes the ear is the place in the organ of Corti that is maximally stimulated. The traveling wave set up by a tone produces peak depression of the basilar membrane, and consequently maximal receptor stimulation, at one point. As noted above, the distance between this point and the stapes is inversely related to the pitch of the sound, with low tones producing maximal stimulation at the apex of the cochlea and high tones producing maximal stimulation at the base. The pathways from the various parts of the cochlea to the brain are distinct. An additional factor involved in pitch perception at sound frequencies of less than 2000 Hz may be the pattern of the action potentials in the auditory nerve. When the frequency is low enough, the nerve fibers begin to respond with an impulse to each cycle of a sound wave. The importance of this volley effect, however, is limited; the frequency of the action potentials in a given auditory nerve fiber determines principally the loudness, rather than the pitch, of a sound.


The afferent fibers in the auditory division of the eighth cranial nerve end in dorsal and ventral cochlear nuclei (Figure 10–12). From there, auditory impulses pass by various routes to the inferior colliculi, the centers for auditory reflexes, and via the medial geniculate body in the thalamus to the auditory cortex located on the superior temporal gyrus of the temporal lobe. Information from both ears converges on each superior olive, and beyond this, most of the neurons respond to inputs from both sides. In humans, low tones are represented anterolaterally and high tones posteromedially in the auditory cortex.

FIGURE 10–12

Simplified diagram of main auditory (left) and vestibular (right) pathways superimposed on a dorsal view of the brainstem. Cerebellum and cerebral cortex have been removed. For the auditory pathway, eighth cranial nerve afferent fibers from the cochlea end in dorsal and ventral cochlear nuclei. From there, most fibers cross the midline and terminate in the contralateral inferior colliculus. From there, fibers project to the medial geniculate body in the thalamus and then to the auditory cortex located on the superior temporal gyrus of the temporal lobe. For the vestibular pathway, the vestibular nerve terminates in the ipsilateral vestibular nucleus. Most fibers from the semicircular canals terminate in the superior and medial divisions of the vestibular nucleus and project to nuclei controlling eye movement. Most fibers from the utricle and saccule terminate in the lateral division, which then projects to the spinal cord. They also terminate on neurons that project to the cerebellum and the reticular formation. The vestibular nuclei also project to the thalamus and from there to the primary somatosensory cortex. The ascending connections to cranial nerve nuclei are concerned with eye movements.

The responses of individual second-order neurons in the cochlear nuclei to sound stimuli are like those of the individual auditory nerve fibers. The frequency at which sounds of the lowest intensity evoke a response varies from unit to unit; with increased sound intensities, the band of frequencies to which a response occurs becomes wider. The major difference between the responses of the first- and second-order neurons is the presence of a sharper “cutoff” on the low-frequency side in the medullary neurons. This greater specificity of the second-order neurons is probably due to an inhibitory process in the brainstem. In the primary auditory cortex, most neurons respond to inputs from both ears, but strips of cells are stimulated by input from the contralateral ear and inhibited by input from the ipsilateral ear.

The increasing availability of positron emission tomography (PET) scanning and functional magnetic resonance imaging (fMRI) has greatly improved the level of knowledge about auditory association areas in humans. The auditory pathways in the cortex resemble the visual pathways in that increasingly complex processing of auditory information takes place along them. An interesting observation is that although the auditory areas look very much the same on the two sides of the brain, there is marked hemispheric specialization. For example, Wernicke area (see Figure 8–7) is concerned with the processing of auditory signals related to speech. During language processing, this area is much more active on the left side than on the right side. Wernicke area on the right side is more concerned with melody, pitch, and sound intensity. The auditory pathways are also very plastic, and, like the visual and somatosensory pathways, they are modified by experience. Examples of auditory plasticity in humans include the observation that in individuals who become deaf before language skills are fully developed, sign language activates auditory association areas. Conversely, individuals who become blind early in life are demonstrably better at localizing sound than individuals with normal eyesight.

Musicians provide additional examples of cortical plasticity. In these individuals, the size of the auditory areas activated by musical tones is increased. In addition, violinists have altered somatosensory representation of the areas to which the fingers they use in playing their instruments project. Musicians also have larger cerebellums than nonmusicians, presumably because of learned precise finger movements.

A portion of the posterior superior temporal gyrus known as the planum temporale is located between Heschl gyrus (transverse temporal gyrus) and the sylvian fissure (Figure 10–13). It has an almost triangular shape that is larger in the left than in the right cerebral hemisphere in about three-quarters of the population. It coincides in part with Wernicke area in the left hemisphere and likely is involved in language-related auditory processing. A curious observation is that the planum temporale is larger than normal on the left side in musicians and others who have perfect pitch. Also curiously, this size asymmetry appears to be absent in individuals with schizophrenia.

FIGURE 10–13

Asymmetries in the size of the left and right planum temporale as shown in a brain sectioned horizontally along the plane of the sylvian fissure. The triangular shape of the planum temporale is larger in the left than in the right cerebral hemisphere in about three-quarters of the population. This region coincides in part with Wernicke area in the left hemisphere and is involved in language-related auditory processing. Plane of section shown in the insert at the bottom. (Reproduced with permission from Kandel ER, Schwartz JH, Jessell TM [editors]: Principles of Neural Science, 3rd ed. New York, NY: McGraw-Hill; 1991.)


Determination of the direction from which a sound emanates in the horizontal plane depends on detecting the difference in time between the arrival of the stimulus in the two ears and the consequent difference in phase of the sound waves on the two sides; it also depends on the fact that the sound is louder on the side closest to the source. The detectable time difference, which can be as little as 20 μs, is said to be the most important factor at frequencies below 3000 Hz and the loudness difference the most important at frequencies above 3000 Hz. Neurons in the auditory cortex that receive input from both ears respond maximally or minimally when the time of arrival of a stimulus at one ear is delayed by a fixed period relative to the time of arrival at the other ear. This fixed period varies from neuron to neuron.
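
The size of the interaural time difference can be estimated with a simple path-length model, in which the extra distance to the far ear is the ear separation multiplied by the sine of the azimuth. This is a sketch under stated assumptions: the 0.2 m ear separation is an illustrative value that does not come from the text, diffraction around the head is ignored, and the function names are hypothetical. The speed of sound in air (344 m/s) and the 20 μs detectable difference are the figures given in this chapter.

```python
import math

SPEED_OF_SOUND_M_S = 344.0   # speed of sound in air at 20 degrees C (from the text)
EAR_SEPARATION_M = 0.2       # assumed distance between the ears
DETECTABLE_ITD_S = 20e-6     # smallest detectable time difference (from the text)


def interaural_time_difference(azimuth_deg,
                               ear_separation_m=EAR_SEPARATION_M,
                               speed_m_s=SPEED_OF_SOUND_M_S):
    """Arrival-time difference (s) for a source at azimuth_deg from straight ahead."""
    return ear_separation_m * math.sin(math.radians(azimuth_deg)) / speed_m_s


def smallest_detectable_azimuth_deg(itd_s=DETECTABLE_ITD_S,
                                    ear_separation_m=EAR_SEPARATION_M,
                                    speed_m_s=SPEED_OF_SOUND_M_S):
    """Azimuth (degrees) at which the time difference reaches itd_s."""
    return math.degrees(math.asin(itd_s * speed_m_s / ear_separation_m))


if __name__ == "__main__":
    # Largest difference, for a source directly to one side, in microseconds.
    print(round(interaural_time_difference(90.0) * 1e6), "microseconds")
    # Offset from the midline that produces a 20 microsecond difference.
    print(round(smallest_detectable_azimuth_deg(), 1), "degrees")
```

Under these assumptions a 20 μs difference corresponds to a source only about 2 degrees off the midline, which gives a sense of how fine the timing comparison must be.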

Sounds coming from directly in front of the individual differ in quality from those coming from behind because each pinna (the visible portion of the exterior ear) is turned slightly forward. In addition, reflections of the sound waves from the pinnal surface change as sounds move up or down, and the change in the sound waves is the primary factor in locating sounds in the vertical plane. Sound localization is markedly disrupted by lesions of the auditory cortex.


Deafness can be divided into two major categories: conductive (or conduction) and sensorineural hearing loss. Conductive deafness refers to impaired sound transmission in the external or middle ear and impacts all sound frequencies. Among the causes of conduction deafness are plugging of the external auditory canals with wax (cerumen) or foreign bodies, otitis externa (inflammation of the outer ear, “swimmer’s ear”) and otitis media (inflammation of the middle ear) causing fluid accumulation, perforation of the eardrum, and otosclerosis in which bone is resorbed and replaced with sclerotic bone that grows over the oval window.

Sensorineural deafness is most commonly the result of loss of cochlear hair cells but can also be due to problems with the eighth cranial nerve or within central auditory pathways. It often impairs the ability to hear certain pitches while others are unaffected. Aminoglycoside antibiotics such as streptomycin and gentamicin obstruct the mechanosensitive channels in the stereocilia of hair cells (especially outer hair cells) and can cause the cells to degenerate, producing sensorineural hearing loss and abnormal vestibular function. Damage to the hair cells by prolonged exposure to noise is also associated with hearing loss (Clinical Box 10–1). Other causes include tumors of the eighth cranial nerve and cerebellopontine angle and vascular damage in the medulla.

Auditory acuity is commonly measured with an audiometer. This device presents the subject with pure tones of various frequencies through earphones. At each frequency, the threshold intensity is determined and plotted on a graph as a percentage of normal hearing. This provides an objective measurement of the degree of deafness and a picture of the tonal range most affected.

Conduction and sensorineural deafness can be differentiated by simple tests with a tuning fork. Three of these tests, named for the individuals who developed them, are outlined in Table 10–1. The Weber and Schwabach tests demonstrate the important masking effect of environmental noise on the auditory threshold.

TABLE 10–1 Common tests with a tuning fork to distinguish between sensorineural and conduction deafness.

Test | Weber | Rinne | Schwabach
Method | Base of vibrating tuning fork placed on vertex of skull | Base of vibrating tuning fork placed on mastoid process until subject no longer hears it, then held in air next to ear | Bone conduction of patient compared with that of healthy subject
Normal | Hears equally on both sides | Hears vibration in air after bone conduction is over |
Conduction deafness (one ear) | Sound louder in diseased ear because masking effect of environmental noise is absent on diseased side | Vibrations in air not heard after bone conduction is over | Bone conduction better than normal (conduction defect excludes masking noise)
Sensorineural deafness (one ear) | Sound louder in normal ear | Vibration heard in air after bone conduction is over, as long as nerve deafness is partial | Bone conduction worse than normal
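
The decision logic of Table 10–1 for one-sided deafness can be sketched as three small functions. This is an illustration of how the table reads, not a clinical tool; the function names and the input labels ("equal", "affected", "normal", "better", "worse") are hypothetical.

```python
def interpret_weber(louder_in):
    """Fork on vertex of skull. louder_in is 'equal', 'affected', or 'normal' (ear)."""
    findings = {
        "equal": "normal",
        "affected": "conduction deafness",
        "normal": "sensorineural deafness",
    }
    return findings[louder_in]


def interpret_rinne(air_heard_after_bone):
    """Fork on mastoid until no longer heard, then held in air next to the ear."""
    if air_heard_after_bone:
        # The table lists this finding for normal hearing and for
        # sensorineural deafness, provided the nerve deafness is partial.
        return "normal or partial sensorineural deafness"
    return "conduction deafness"


def interpret_schwabach(bone_conduction):
    """Bone conduction compared with a healthy subject: 'better' or 'worse'."""
    findings = {
        "better": "conduction deafness",
        "worse": "sensorineural deafness",
    }
    return findings[bone_conduction]


if __name__ == "__main__":
    print(interpret_weber("affected"))
    print(interpret_rinne(False))
    print(interpret_schwabach("better"))
```

In this sketch all three calls point to conduction deafness, which is the pattern the table describes when a conduction defect excludes the masking effect of environmental noise.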

