Chapter 2:
The perception of loudness
-
Introduction.
-
The human ear has incredible absolute sensitivity and
dynamic range.
-
The most intense sound we can hear without immediate
damage to the ear is at least 120 dB above the faintest sound we can just
detect.
-
This corresponds to an intensity ratio of 1,000,000,000,000
: 1.
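The step from 120 dB to an intensity ratio of a trillion to one follows from the definition of the decibel as ten times the base-10 logarithm of an intensity ratio. A minimal sketch of the conversion (the function names are illustrative, not taken from the text):

```python
import math

def db_to_intensity_ratio(level_db: float) -> float:
    """Intensity ratio corresponding to a level difference in decibels."""
    return 10 ** (level_db / 10)

def intensity_ratio_to_db(ratio: float) -> float:
    """Level difference in decibels corresponding to an intensity ratio."""
    return 10 * math.log10(ratio)

# The 120-dB range quoted above, expressed as an intensity ratio.
print(f"{db_to_intensity_ratio(120):,.0f} : 1")
```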
-
How could such a range be encoded?
-
How does the loudness of sounds depend on frequency
and intensity?
-
How can the loudness (as opposed to the intensity) of
sounds be measured?
-
What adaptation and fatigue occur in the auditory system?
-
How can hearing disorders be diagnosed?
-
Absolute thresholds.
-
The absolute threshold of a sound is the minimum detectable
level of that sound in the absence of other external sounds.
-
Two methods of measuring physical intensity of the threshold
stimulus yield slightly different results.
-
Minimum audible pressure (MAP) uses a small microphone
at or inside the ear canal and sound is usually presented with earphones.
-
Minimum audible field (MAF) presents sound via loudspeakers
in an anechoic chamber and sound pressure is measured by placing a microphone
where the center of the listener's head would be.
-
Average results for these two methods in young listeners
are shown in Fig. 2.1.
-
Both MAP and MAF curves are lowest in the middle frequencies.
-
Outer ear enhances sound level at the eardrum by as
much as 15 dB between 1.5-6 kHz.
-
Transmission by the middle ear is most efficient at
middle frequencies.
-
At low frequencies, MAPs are 5-10 dB higher than MAFs
due to physiological noise of vascular origin.
-
The highest audible frequency may be as high as 20 kHz but
decreases and becomes more variable with age, a condition known as presbyacusis.
-
The low frequency limit for true hearing is about 16
Hz.
-
In most practical situations, detection depends more
on masked threshold than absolute threshold.
-
In clinical measurement of hearing, thresholds are specified
relative to the average threshold for young healthy listeners with 'normal'
hearing.
-
Thresholds specified this way have units dB HL (hearing
level) or dB HTL (hearing threshold level).
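Converting a threshold measured in dB SPL to dB HL is a simple subtraction of the average normal threshold at the same frequency. A minimal sketch follows; the numerical values are made up for illustration and are not standardized reference thresholds.

```python
def spl_to_hl(threshold_db_spl: float, reference_db_spl: float) -> float:
    """Hearing level (dB HL): a listener's threshold expressed relative to
    the average threshold of young, normal-hearing listeners at the same
    frequency (both given in dB SPL)."""
    return threshold_db_spl - reference_db_spl

# Illustrative numbers only (assumptions, not standardized values).
reference = 7.0   # assumed average normal threshold at one frequency, dB SPL
listener = 32.0   # assumed measured threshold for one listener, dB SPL
print(spl_to_hl(listener, reference), "dB HL")
```

A listener whose threshold equals the reference has a hearing level of 0 dB HL at that frequency; positive values indicate a hearing loss.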
-
Fig. 2.1b shows typical audiograms for a young graduate
student and an old professor.
-
Equal-loudness contours.
-
It is often useful to have a subjective scale for the
loudness of a sound. Since sounds are often analyzed in terms of their
individual frequency components, a useful first step is to devise such
a scale for pure tones.
-
One way to do this is the loudness level, which tells us
not how loud a tone is, but how intense a 1000 Hz tone must be to sound
equally loud.
-
The loudness level in phons of a 1000 Hz pure tone is
defined to be equal to its sound pressure level in dB SPL.
-
The loudness level in phons of any other pure tone is
the sound pressure level in dB SPL of a 1000 Hz pure tone judged to be
equally loud.
-
If subjects are alternately presented various pure tones
with a 1000 Hz pure tone, and asked to adjust one or the other of the tones
until they have the same loudness, the equal loudness contours shown in
Fig. 2.3 result.
-
The equal loudness contours are similar in shape to
threshold functions, but become flatter at higher loudness levels.
-
This means that the rate of growth of loudness with
intensity differs for different frequencies.
-
Specifically, loudness level grows faster with intensity
at low frequencies (and to some extent high frequencies) than at middle
frequencies.
-
Practical implications of equal loudness contours for
the reproduction of sounds.
-
The relative loudness of different frequency components
in a sound changes as a function of overall level, thus affecting the tonal
balance.
-
At low levels, we are less sensitive to the very low
and very high frequencies.
-
Many amplifiers have a 'loudness' control which boosts
bass and treble at low listening levels.
-
Sound level meters have weighting scales (A, B and C)
to crudely correct for the effect of overall sound level on the contribution
of different frequencies to overall loudness.
-
The scaling of loudness.
-
Development of scales relating the physical magnitude
of sounds to their subjective loudness was pioneered by S. S. Stevens,
primarily using two methods.
-
In magnitude estimation, sounds with different levels
are presented and the subject is asked to assign a number to each one according
to its perceived loudness. Sometimes a reference sound is also provided.
-
In magnitude production, the subject is asked to adjust
the level of a test sound until it has a specified loudness, either in
absolute terms, or relative to a reference sound.
-
Results of such studies indicate that loudness (L) is
a power function of intensity (I): $L = kI^{0.3}$, where k is a constant
depending on the subject and the units used.
-
A simple approximation is that a two-fold change in loudness
is produced by a 10-dB change in level.
-
The sone is a unit of loudness equivalent to the loudness
of a 1000 Hz pure tone at 40 dB SPL. Fig. 2.5 shows the relationship between
sones and phons for a 1000 Hz pure tone.
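The link between the power law and the 10-dB rule can be checked numerically: a 10-dB change is a tenfold change in intensity, and ten raised to the power 0.3 is very nearly 2. The sketch below also converts phons to sones under the assumption that loudness doubles for every 10 phons above the 40-phon reference; that formula is an approximation implied by the rule above, not a result stated in the text, and it is not expected to hold near threshold. Function names are illustrative.

```python
def loudness_ratio(level_change_db: float, exponent: float = 0.3) -> float:
    """Ratio of loudnesses predicted by L = k * I**exponent for a given
    change in level (dB). The constant k cancels in the ratio."""
    intensity_ratio = 10 ** (level_change_db / 10)
    return intensity_ratio ** exponent

def phons_to_sones(phons: float) -> float:
    """Approximate loudness in sones, ASSUMING loudness doubles per
    10 phons above the 40-phon (1 sone) reference."""
    return 2 ** ((phons - 40) / 10)

print(round(loudness_ratio(10), 3))   # close to a doubling of loudness
for p in (40, 50, 60, 80):
    print(p, "phons ->", phons_to_sones(p), "sones")
```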
-
Criticisms of loudness scaling.
-
Susceptible to bias caused by a number of factors.
-
The range of stimuli presented.
-
The first stimulus presented.
-
The instructions to the subject.
-
The range of permissible responses.
-
Symmetry of the response range.
-
Other factors related to experience, motivation, training
and attention.
-
Large individual differences and within-subject variability
require the averaging of many subjects and responses for consistent results.
-
We are used to judging the loudness of sources, but
not of sensations.
-
Scaling assumes that the relationship between sensation
and response is linear, but that assumption is not independently verifiable.
-
Models of loudness.
-
As an alternative to the scaling of loudness, models
have been constructed which are fairly successful in predicting the loudness
of simple and complex sounds from their physical parameters.
-
Treatment of these models is beyond the scope of this
course.
-
Temporal integration.
-
For tone durations in excess of 500 ms, sound intensity
at threshold is independent of duration.
-
For durations less than about 200 ms, the sound intensity
necessary for detection increases as duration decreases.
-
Over a reasonable range of durations, the ear appears
to integrate the energy of the stimulus over time in the detection of short
duration tones.
-
If this were exactly true, then the threshold intensity
(I) times the tone's duration (t) would be a constant for a particular frequency.
-
A better representation of actual results is that
$(I - I_L) \times t = I_L \times \tau$ = constant, where $I_L$ is the
threshold intensity for a long-duration tone, t is the duration of the tone,
and $\tau$ is the integration time of the auditory system.
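Writing $\tau$ for the integration time of the auditory system and t for the tone duration, the relation $(I - I_L) \times t = I_L \times \tau$ rearranges to $I = I_L (1 + \tau / t)$. The sketch below expresses the resulting threshold elevation in dB relative to the long-duration threshold. The default value of 200 ms for $\tau$ is an assumption chosen only for illustration.

```python
import math

def threshold_elevation_db(duration_ms: float, tau_ms: float = 200.0) -> float:
    """Threshold for a tone of the given duration, in dB re the threshold
    for a long-duration tone, from I = I_L * (1 + tau / t).
    tau_ms is an assumed, illustrative integration time."""
    return 10 * math.log10(1 + tau_ms / duration_ms)

for t in (10, 20, 50, 100, 200, 500, 1000):
    print(f"{t:5d} ms: +{threshold_elevation_db(t):.1f} dB")
```

For durations much shorter than $\tau$ the elevation approaches the pure energy-integration prediction (about 3 dB per halving of duration), and for durations much longer than $\tau$ it approaches 0 dB, consistent with the two regimes described above.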
-
The auditory system almost certainly integrates neural
activity, rather than stimulus energy.
-
The auditory system may not actually perform the operation
of integration, but simply have more opportunities to detect the stimulus
as its duration increases (multiple looks).
-
Some investigators have found that the time constant
of integration $\tau$ decreases
with increasing frequency, but others have found it to remain relatively
constant.
-
The limits of energy integration have been studied using
tones of various durations, but constant energy (I x t). Typical results
are shown in Fig. 2.7.
-
Using a constant-energy, 1 kHz tone with durations from
5 to 500 ms, it was found that detectability (d') was constant for durations
from 15 to 150 ms, but fell off for durations longer and shorter than this.
-
Other investigators have found that the plateau occurs
at longer durations for lower frequencies and shorter durations for higher
frequencies.
-
The fall in detectability at very short durations may
indicate that energy can only be integrated over a narrow range of frequencies.
-
The detection of intensity changes and the coding of
loudness.
-
The smallest detectable change in intensity (difference
threshold) has been measured for many types of stimuli by a variety of
methods.
-
Most of these methods use two-interval, two-alternative
forced-choice (2AFC). Two stimuli differing in intensity are presented
successively in random order. The threshold is usually defined as the intensity
difference which yields 75% correct responding.
-
Modulation detection. In one interval, the stimulus
is unmodulated and in the other it is amplitude modulated at a low rate.
Subjects must indicate which interval contained the modulation.
-
Increment detection. A continuous stimulus is presented
and an increment in level is imposed in one of two intervals. Subjects
must indicate which interval contained the increment.
-
Intensity discrimination of pulsed stimuli. Two pulses
of sound are presented successively, one being more intense than the other.
Subjects must indicate which was more intense.
-
Results for these three methods are similar and are
usually specified in decibels such that
$\Delta L = 10\log_{10}\{(I + \Delta I)/I\}$.
-
For wideband or bandpass-filtered noise, Weber's law
holds.
-
The smallest detectable change is a constant fraction
of the intensity of the stimulus, i.e.,
$\Delta I/I$ (the Weber fraction) is constant.
-
If $\Delta L$ is expressed in decibels, it too is constant at 0.5-1 dB.
-
For pure tones a 'near miss' to Weber's law is obtained.
-
If $\Delta I$ is plotted against I (both in dB), a line of slope 0.9 is
obtained instead of the slope 1.0 predicted by Weber's law.
-
Discrimination, as measured by the Weber fraction, improves
at high levels.
-
For a 1000 Hz pure tone, $\Delta L$ ranges from 1.5 dB at 20 dB to 0.3 dB
at 80 dB.
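The difference threshold in decibels and the Weber fraction are two ways of writing the same quantity, related by $\Delta L = 10\log_{10}(1 + \Delta I/I)$. A minimal sketch of the conversion in both directions, applied to the pure-tone values quoted above (function names are illustrative):

```python
import math

def delta_l_db(weber_fraction: float) -> float:
    """Difference threshold in dB for a given Weber fraction (delta I / I)."""
    return 10 * math.log10(1 + weber_fraction)

def weber_fraction(delta_l: float) -> float:
    """Weber fraction (delta I / I) for a difference threshold given in dB."""
    return 10 ** (delta_l / 10) - 1

# Difference thresholds quoted for a 1000 Hz tone at low and high levels.
for dl in (1.5, 0.3):
    print(f"delta L = {dl} dB  ->  delta I / I = {weber_fraction(dl):.3f}")
```

The Weber fraction falls from roughly 0.41 to roughly 0.07 between the two levels, which is the improvement in discrimination that the near miss describes; under strict Weber's law the fraction would be the same at both levels.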
-
A suitable account of the physiological encoding of
intensity in the auditory system must thus account for a 120-dB dynamic
range, Weber's law for noise bursts, and improved discrimination with level
up to about 100 dB for pure tones.
-
The dynamic range of the auditory system.
-
If intensity discrimination were based on the firing
rates of neurons with center frequencies close to the frequency of the
stimulus, we might expect discrimination to worsen at sound levels above
about 60 dB SPL since most of these neurons would be saturated.
-
Since discrimination does not worsen above 60 dB SPL,
there might be other mechanisms for coding of intensity changes at high
intensities.
-
One possibility is that when neurons at the center of
the excitatory pattern are saturated, changes in intensity could still
be signaled by changes in the firing rates of neurons at the edges of the
pattern.
-
Attempts to selectively mask the edges of the excitatory
pattern with noise show that this information may play a role in intensity
discrimination, but is not necessary to the wide dynamic range of the auditory
system.
-
Another possibility is that even when neurons are saturated,
an increase in intensity increases phase locking to the stimulus (quantity
and quality).
-
Studies using stimuli containing only frequencies above
the range where phase locking occurs indicate that although changes in
phase-locking may play a role, they are also not necessary for the wide
dynamic range of the auditory system.
-
More recent studies show that individual neurons carry information
about intensity changes in both the shape of the rate vs. level function and
its variability at each level, which changes the nature of the problem.
-
Simulations show that such information from about 100
neurons is sufficient to account for intensity discrimination.
-
There are about 30,000 neurons in the auditory nerve,
so the question arises as to why intensity discrimination is not finer
than it is.
-
The problem seems to be more one of understanding the
limited capacity of more central parts of the auditory system to use information
carried in the firing rates of neurons, than one of how intensity changes
can be coded in those firing rates.
-
Weber's law.
-
At one time it was thought that Weber’s law held for
bands of noise because the statistical fluctuations in the noise limited
performance.
-
In intensity discrimination, a device that chooses the
noise burst containing the greater energy on each trial can be shown to
conform to Weber’s law.
-
However, even noise without random fluctuations from
trial to trial produces results conforming to Weber’s law, indicating that
it must arise instead from the operation of the auditory system.
-
The information conveyed by a single neuron is optimal
over a small range of sound levels.
-
Levels close to or below threshold result in minimal
changes in firing rate with level.
-
Poor coding at high levels is the result of saturation.
-
Thus, if discrimination were based on information from
single neurons, it would not conform to Weber’s law.
-
Weber’s law for bands of noise can be predicted by models
which combine the firing rate information from a small number of neurons
(about 100) whose thresholds and dynamic ranges are appropriately staggered
so as to cover the dynamic range of the auditory system.
-
Such models assume that information from a number of
neurons with similar center frequencies is combined.
-
It is further assumed that there are many independent
channels, each responding to a limited range of center frequencies.
-
Weber’s law is assumed to hold for each of these channels.
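The benefit of staggering thresholds can be illustrated with a toy calculation. The sketch below does not reproduce any published model and does not by itself demonstrate Weber's law; it only shows that a single saturating neuron signals a 1-dB increment over a limited range of levels, whereas a pool with staggered thresholds responds to the increment across a much wider range. The sigmoidal rate-level function and every parameter value are assumptions made for illustration.

```python
import math

def rate(level_db, threshold_db, dynamic_range_db=30.0, max_rate=200.0):
    """Hypothetical sigmoidal rate-level function (spikes/s) that rises
    from near zero at threshold to near saturation one dynamic range above."""
    midpoint = threshold_db + dynamic_range_db / 2
    slope = dynamic_range_db / 8
    return max_rate / (1 + math.exp(-(level_db - midpoint) / slope))

def rate_change(level_db, thresholds, step_db=1.0):
    """Summed change in firing rate across a set of neurons when the
    stimulus level is raised by step_db."""
    return sum(rate(level_db + step_db, t) - rate(level_db, t)
               for t in thresholds)

single = [0.0]                              # one neuron, threshold at 0 dB
pool = [10.0 * i for i in range(10)]        # thresholds staggered 0-90 dB

for level in (20, 60, 100):
    print(level, "dB:",
          f"single = {rate_change(level, single):.3f},",
          f"pool = {rate_change(level, pool):.3f} spikes/s per dB")
```

In this toy version the single neuron's response to the increment vanishes once it saturates, while the pooled response remains substantial at every level tested.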
-
In later chapters we shall consider this notion that
there are many frequency channels in the auditory system, each conforming
to Weber’s law, in more detail and see that there is some evidence to the
contrary.
-
The near miss to Weber’s law.
-
If Weber’s law reflects the normal mode of operation
of a given frequency channel, we need to explain why intensity discrimination
of pure tones and very narrow bands of noise deviates from it.
-
There are probably at least two factors contributing
to the improvement in intensity discrimination of pure tones and narrow
bands of noise at high sound levels.
-
Zwicker, who studied modulation detection for pure tones,
described the first factor.
-
He assumed that Weber’s law holds for all frequency
channels and that the Weber fraction is about 1 dB.
-
The high-frequency side of excitation patterns (estimated
from masking studies, Chapter 3) grows in a nonlinear way with increasing
intensity, as shown in Fig. 2.9.
-
Thus, for example, a 1-dB change in stimulus level produces
greater than a 1-dB change on the high-frequency side of the pattern at
high sound levels and the Weber fraction appears to decrease.
-
This idea is supported by Zwicker’s demonstration that
addition of a highpass noise to mask the high-frequency side of the excitation
pattern produces discrimination results very close to Weber’s law.
-
The second factor contributing to the near miss to Weber’s
law is Florentine and Buus’s suggestion that subjects combine information
across a number of frequency channels, i.e., the whole excitation pattern.
-
As the level of a tone is increased, more channels become
excited which allows for improved intensity discrimination.
-
A model based on this idea is capable of predicting
both the near miss to Weber’s law and the effects of highpass masking noise
on intensity discrimination.
-
In summary, the near miss to Weber’s law for pure tones
can be accounted for by the nonlinear growth of the high-frequency side
of the excitation pattern; and the ability of subjects to combine information
from different parts of the excitation pattern.
-
Loudness adaptation, fatigue and damage risk.
-
The distinction between adaptation and fatigue.
-
In all sensory systems, exposure to a stimulus of sufficient
duration and intensity produces reductions in responsiveness.
-
Auditory fatigue results from application of a stimulus
which is usually much in excess of that required to sustain the normal
physiological response of the receptor and is measured after the fatiguing
stimulus has been removed.
-
Auditory adaptation refers to a decline in response
of a receptor to a steady stimulus as a function of time until it reaches
a steady value.
-
Post-stimulatory auditory fatigue.
-
The most common measure of auditory fatigue is a temporary
threshold shift (TTS).
-
The subject’s absolute threshold at a particular frequency
is measured.
-
A fatiguing stimulus is presented for a specified time,
and then removed.
-
The threshold is again measured and any increase would
be taken as a measure of fatigue.
-
There are five major factors that influence the size
of the TTS.
-
The TTS generally increases with the intensity of the
fatiguing stimulus.
-
At low intensities of the fatiguing stimulus TTS changes
slowly with intensity and only occurs for test tones with frequencies close
to that of the fatiguing stimulus.
-
As intensity increases, both the TTS and the range of frequencies
affected increase. At very high intensities, the maximum
TTS may occur for test frequencies a half octave or more above the frequency
of the fatiguing stimulus, as shown in Fig. 2.10.
-
For fatiguing stimuli above 90-100 dB, there is a precipitous
increase in TTS that may represent the transition between fatigue that
is physiological and transient and fatigue that is more permanent and
pathological.
-
TTS generally increases with duration of exposure to
the fatiguing stimulus and is often linearly related to the log of the
duration.
-
TTS generally increases with frequency of the fatiguing
stimulus, at least up to 4-6 kHz. This is also the range in which permanent
hearing loss resulting from exposure to intense sounds or from presbyacusis
tends to be greatest.
-
TTS generally decreases with time since the fatiguing
stimulus was removed, but is often diphasic at higher test frequencies,
as shown in Fig. 2.11.
-
There is some suggestion that high levels of sound may
be less permanently damaging if the sound is pleasant (e.g., music) than
unpleasant (e.g. industrial noise). However, long enough exposure to sufficiently
intense sound will produce permanent damage.
-
Exposure to a sound level of 85 dB SPL for 8 hours a
day is currently considered safe.
-
If the exposure duration is halved, permissible intensity
is doubled (i.e., increased by 3 dB).
-
Sound levels over 110 dB can produce permanent damage
very quickly.
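The trading rule above (85 dB SPL for 8 hours, with 3 dB more allowed each time the duration is halved) can be written as a formula for the permissible daily exposure time. A minimal sketch, with the function and parameter names chosen for illustration:

```python
def permissible_hours(level_db: float,
                      base_level_db: float = 85.0,
                      base_hours: float = 8.0,
                      exchange_db: float = 3.0) -> float:
    """Permissible daily exposure time (hours) at a given sound level,
    halving the time for every exchange_db above the base level."""
    return base_hours / 2 ** ((level_db - base_level_db) / exchange_db)

for level in (85, 88, 91, 100, 109):
    hours = permissible_hours(level)
    print(f"{level} dB: {hours * 60:.1f} minutes")
```

Under this rule the permissible time shrinks to a couple of minutes by about 109 dB, in line with the statement that levels over 110 dB can cause damage very quickly.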
-
Auditory adaptation.
-
Early studies of auditory adaptation used a simultaneous
dichotic loudness balance test in which a continuous tone is applied through
earphones to one ear and the subject adjusts the level of a similar tone
applied occasionally to the other ear until it sounds equally loud.
-
Such studies found large amounts of auditory adaptation
in that the test tone was adjusted to lower levels as time passed.
-
Newer techniques that eliminate binaural interaction
between the adapting and the comparison tones either use very different
frequencies for the two tones, or use just an adapting tone that the subject
adjusts to maintain constant loudness.
-
Such techniques find little if any auditory adaptation
for tones well above absolute threshold (50-90 dB SPL).
-
Significant amounts of auditory adaptation only appear
to occur for low level, high frequency sounds, and even then there are
large individual differences in the results obtained.
-
Abnormalities of loudness perception in impaired hearing:
Loudness recruitment and pathological adaptation.
-
Types of hearing loss.
-
Conductive hearing loss refers to a defect in the outer
or middle ear that reduces transmission of sound to the inner ear.
-
Could be produced by conditions such as ossification
of the ossicles, growth of bone over the oval window, build up of fluid
due to middle ear infection, or wax in the ear canal.
-
Produces a simple attenuation of the incoming sound
so that the difficulty experienced by the sufferer can be well predicted
from the elevation in absolute threshold (audiogram).
-
Usually amenable to treatment by a hearing aid to amplify
sound or surgery to remove the obstruction.
-
Sensorineural hearing loss (sometimes inaccurately called
nerve deafness) refers to a defect in the cochlea (cochlear loss) or, less
commonly, in the auditory nerve or higher centers in the nervous system
(retrocochlear loss).
-
Could be produced by such things as birth defects, anoxia,
traumatic injury, prolonged exposure to intense sounds, age or certain
drugs.
-
Often the extent of loss increases with frequency and
difficulties experienced by the sufferer are not always well predicted
from the audiogram.
-
Sufferers often experience difficulty in understanding
speech in noisy settings.
-
The condition is usually not completely alleviated by
conventional hearing aids, nor is it usually treatable by surgery.
-
Loudness recruitment.
-
Cochlear hearing loss is almost always accompanied by
loudness recruitment, which is an unusually rapid growth of loudness as
the sensation level of a tone is increased.
-
Thus, absolute thresholds on an audiogram would be elevated,
but loudness levels at high sound levels (perhaps as indicated by loudness
discomfort levels) would be similar to those for normal ears.
-
Loudness recruitment occurs in normal ears for very
low and very high frequencies.
-
When only one ear is affected, loudness recruitment
can be measured with the alternate binaural loudness balance test.
-
A tone of a given level and frequency is presented to
one ear and alternated with a variable tone of the same frequency to the
other ear.
-
The level of the variable tone is adjusted to have the
same loudness.
-
If this is repeated for a number of different levels,
the rate of growth of loudness in the normal and impaired ear can be compared,
as shown in Fig. 2.12.
-
There are other clinical tests for loudness recruitment
based on the assumption that if loudness is increasing more rapidly than
normal as the stimulus intensity is increased, then a smaller than normal
intensity change should be required for a just-noticeable difference in
loudness.
-
Intensity discrimination is affected not only by the
slope of the loudness-growth function, but also by internal variability
that also tends to increase in impaired ears and may offset the gain from
the steeper growth.
-
These tests are of questionable validity and should
not be relied on.
-
The simplest and best clinical test for loudness recruitment
at this time is to look for a combination of elevated absolute thresholds
(audiogram) and normal loudness discomfort levels. This is a very reliable
indicator of cochlear damage.
-
Without going into detail, it appears that loudness
recruitment is caused by a steepening of the input-output function (velocity
of movement as a function of sound level) of the basilar membrane when
the cochlea (probably the outer hair cells) is damaged.
-
Pathological adaptation.
-
Abnormal processes in the auditory nerve (much less
commonly in the cochlea) sometimes result in very rapid decreases in neural
responses after a nearly normal onset response.
-
Perceptually this leads to more extreme and rapid than
normal adaptation and can be used to diagnostically identify the source
of a hearing loss as retrocochlear.
-
Methods of measurement.
-
The simultaneous dichotic loudness balance procedure,
which is also used to study normal adaptation, can be used.
-
The most common clinical procedure is called a tone
decay test and simply measures threshold (audiogram) for continuous vs.
interrupted tones. For persons with retrocochlear hearing loss, the threshold
for continuous tones may be 20-30 dB higher than for interrupted tones.
-
In summary, recruitment is the hallmark of cochlear
hearing losses, while pathological adaptation is usually indicative of
a retrocochlear loss.