An Overview of Psychoacoustics and Auditory Perception

NEAL F. VIEMEISTER
Department of Psychology, University of Minnesota, 75 East River Road, Minneapolis, Minnesota 55455
This paper surveys experimental and theoretical work on the psychology of hearing, particularly
those aspects that are, or may be, relevant to audio reproduction. The general areas considered
include: (1) auditory sensitivity and dynamic range; (2) temporal aspects of hearing; (3) frequency
and pitch perception; (4) intensity and loudness perception. Current work and directions will be
discussed and particular attention will be devoted to the neural and mechanical correlates of these
psychological phenomena.
INTRODUCTION
Psychoacoustics, broadly defined, is the study of the psychology of hearing. It is concerned with how organisms respond behaviorally to sound. This ranges from research on basic auditory capabilities, such as the detection and
discrimination of pure tones, to more "psychological" research on how sounds are recognized and interpreted. Although the term was coined only recently, psychoacoustics
has a very long history and now has many manifestations:
one finds research in psychoacoustics being conducted
in psychology, physics, engineering, audiology, and physiology. There are subspecialties of clinical psychoacoustics,
animal psychoacoustics, musical psychoacoustics, speech
psychoacoustics, and, of course, the psychoacoustics of audio reproduction.
Before surveying the basic concepts in this broad area, I
would like to make some comments about how we study
hearing, specifically about our choice of stimuli and about
the methods we use. In psychoacoustics we often use "unnatural", simple stimuli such as pure tones and noise. Indeed, much of the data I will be presenting were obtained
using such stimuli. We have been criticized, particularly by
some psychologists, for concentrating on these stimuli,
stimuli that seem to have no relationship to real-life sounds.
There is some justification for this criticism -- some of us
have become so preoccupied with the psychoacoustics of
simple stimuli that we lose track of the general goal of trying to understand auditory perception. But the criticism
misses a crucial point, namely that we use simple stimuli
as tools or probes to tell us how the auditory system
works, to give us information about the basic mechanisms
of hearing. Few of us are interested in the perception of
pure tones, but it is clear that by using pure tones we have
learned a lot about how the system works and about the processing
underlying the perception of complex, real-life sounds.
AES 8th INTERNATIONAL CONFERENCE
The second comment is about the methods we use to "measure" hearing, the so-called psychophysical methods. It is useful to divide these psychophysical methods into two general
categories: objective methods and subjective methods. The
distinguishing feature of an objective method is that the subject's response can be classified as being either right or wrong.
For example, we can define a certain temporal interval for the
subject, using a light, perhaps, and during this interval, the so-called observation interval, we present a signal or we do not.
The subject is to respond either "yes" or "no" to indicate
whether he or she thought a signal was presented. The response can be scored as being correct or incorrect because we
know whether or not the signal was presented. In contrast,
with a subjective technique, there is no correct or incorrect response; there is only a response. In the "method of magnitude
estimation", for example, the subject is to report a number to
indicate the loudness of a sound. We are tapping a subjective
attribute and there is no correct or incorrect response. The distinction between objective and subjective psychophysical
methods is important because there appears to be a considerable difference in the intrinsic validity of data obtained with
these methods. Specifically, the validity of data obtained using
subjective methods generally is far more questionable than
that from objective techniques. There has been continuing debate about whether the numbers that the subject reports in the
magnitude estimation procedure are pure, true, valid indications of subjective magnitude. And this is a relatively straightforward case in the sense that there is general agreement
about what "loudness" means. When one gets to less well-defined subjective attributes, like "quality", the question of validity becomes even more pressing -- is there a dimension of
quality and can we validly measure it? This is not to say that
the validity of the objective methods is beyond question. Simply because we measure a threshold using an objective technique does not mean that threshold is a valid measure of the
subject's ability to hear. It has been repeatedly demonstrated
that factors that are unrelated to hearing can affect thresholds.
These include learning and attentional effects and response biases. A fairly recent development in psychophysics has been
the application of Signal Detection Theory. Among other
things this has provided measures of performance that are
more valid in the sense that they are relatively free of contamination by non-sensory factors.
Figure 1. The lower curve is the audibility curve based upon
minimum audible field measurements. This curve is a composite of data from Corso [The Experimental Psychology of
Sensory Behavior, Holt, Rinehart and Winston, New York, p.
280, 1970] and data from Robinson and Dadson [Brit. J.
Appl. Phys. 7, 166-181, 1956]. The upper curve is the threshold for pain or "tickle" and is based upon data from Wegel
[Ann. Otol., Rhinol., Laryngol. 41, 740-779, 1932]. [Abscissa: Frequency (Hz).]
The point of all this is to be skeptical of the numbers -- they may not mean what we think they do. We are trying to
measure some aspect of the behavior of a very complex biological system. The strategies to measure behavior are deceptively simple. However, with objective psychophysical
techniques and great care in subject training, the reliability
and validity of psychophysical measurement are close to
those for physical measurement.
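As a rough illustration of the kind of measure Signal Detection Theory provides, the sketch below computes the sensitivity index d' and a response criterion from the hit and false-alarm rates of a yes/no task. It assumes the standard equal-variance Gaussian model; the function names and the two sets of rates are hypothetical and are not data from this paper.

```python
from statistics import NormalDist

# Inverse of the standard normal cumulative distribution function.
_z = NormalDist().inv_cdf


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(H) - z(F); rates must lie strictly in (0, 1)."""
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate: float, false_alarm_rate: float) -> float:
    """Response criterion c = -(z(H) + z(F)) / 2; positive means a cautious observer."""
    return -0.5 * (_z(hit_rate) + _z(false_alarm_rate))


# Hypothetical observers (illustrative rates, not measured data).
cautious = (0.69, 0.07)  # seldom responds "yes"
liberal = (0.93, 0.31)   # often responds "yes"

for name, (h, f) in (("cautious", cautious), ("liberal", liberal)):
    print(f"{name}: d' = {d_prime(h, f):.2f}, c = {criterion(h, f):+.2f}")
```

The two observers differ in hit rate and in response bias, yet d' comes out the same for both. That is the sense in which such a measure is relatively free of non-sensory factors.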
I. ABSOLUTE SENSITIVITY AND THE DYNAMIC
RANGE OF HEARING
A. Absolute sensitivity: the audibility curve
Figure 1 shows the familiar audibility curve for human
hearing together with one measure of the upper intensity
limit of hearing. These curves bound the so-called auditory
area. The audibility curve represents the threshold in dB
SPL for a pure tone presented in quiet as a function of frequency. The curve shown is for young adults with normal
hearing and the tones are presented via a loudspeaker -- these are "Minimum Audible Field" measurements and, unlike headphone measurements, reflect, in part, the acoustic
properties of the head, torso, and external ear. I would like
to comment on three aspects of the audibility curve.
The first is the general U-shaped form of the function.
What accounts for this? There are many factors but the
most important -- for normal hearing people -- is the
transfer function of the middle ear. The human middle ear
consists of three small bones, the ossicles, and their supporting structures. It transfers sound energy from the
eardrum to the cochlea, where the hair cell receptors are located, and serves essentially as an impedance matching device. There is a beautiful story of the evolution of the
middle ear but there is not enough time in this talk to do it
justice. The pressure transfer function of the middle ear
shows a bandpass characteristic that is roughly similar to
an inverted audibility curve.
The second aspect concerns the effects of hearing loss.
When we talk of a hearing loss we are referring to a
change in the audibility curve, specifically an elevation in
thresholds above the normal threshold. Hearing loss can be
produced in many ways -- exposure to intense sounds can
irreversibly damage hair cells, the receptor cells within the
cochlea, as can exposure to certain drugs (ototoxic drugs).
High frequencies are generally more susceptible to damage
and as we age we tend to first lose our sensitivity to high
frequencies. I would like to emphasize that there generally
is more to hearing loss than a simple reduction in sensitivity. In a frequency region of hearing loss there often are additional changes that can affect perception. Thus, we generally cannot restore normal perception by restoring
normal sensitivity using a hearing aid or using equalizers
or tone controls on an audio reproduction system. The study
of the perceptual consequences of hearing loss is a very active research area of psychoacoustics and audiology.
Finally, I would like to remark on the incredible absolute
sensitivity of our auditory system. At 3 kHz, where we are
most sensitive, a sound at threshold produces a displacement of the eardrum that is about 1/100 of the diameter of a
hydrogen molecule! One can speculate on why we are so
sensitive, but I won't. A more tractable question is what determines our absolute sensitivity. One possibility is that
there is a true sensory threshold, a "barrier" in our auditory
system which requires a certain energy to be exceeded. The
notion of a true sensory threshold, as opposed to the operationally defined thresholds we usually talk about such as for
the audibility curve, is generally held in disrepute. Very
weak signals, even those below "threshold", convey some
information -- they are not filtered out by the operation of
a sensory threshold. (If there is no threshold, or limit on
perception, then the issue of subliminal perception becomes
moot.) Another possible explanation for our absolute sensitivity is that thermal agitation of the air molecules near the
eardrum -- Brownian motion -- provides a noise floor that
limits our ability to detect a tone in "quiet". This also appears not to be correct, at least for humans. The current
consensus is that our sensitivity is limited by noise, but not
Brownian noise at the eardrum. Rather it is the "noise" that
is characteristic of sensory transmission. Transmission of
information through our auditory system is inherently
stochastic. For example, most auditory nerve fibers are
spontaneously active -- they show responses in the absence
of auditory stimulation. "Internal noise" such as this must
limit sensitivity, not only our absolute sensitivity but also
our sensitivity to changes in frequency and amplitude.
B. The dynamic range of hearing
The upper curve in Figure 1 shows the "threshold" for pain
and is one measure of the upper intensity limit of hearing.
It is clear from this figure that the dynamic range of hearing, the range between threshold and pain, is enormous. At
3 kHz the dynamic range is about 120 dB. Of course, the
"every day" dynamic range, the one in realistic listening
situations, is smaller because the lower limit, the audibility
curve, will be raised by ambient noise. Nevertheless, we
can hear over a remarkable range, particularly remarkable
and wonderful because the "front end" of the auditory system is essentially mechanical. It is also remarkable because
this dynamic range is available to us almost instantly: after
listening to a very loud sound, e.g. a pure tone at 115 dB,
our threshold has recovered to the quiet threshold within
about 500 ms. This and other evidence indicates that the
auditory system does not appear to maintain its dynamic
range by using a gain-control mechanism -- one which adjusts a limited dynamic range to an operating point determined by the ambient level. This is the "trick" used by the
visual system -- the system adapts to the ambient illumination and works around that point. The auditory system
does not appear to work that way. How does the auditory
system maintain its dynamic range? This is a fundamentally important question because it concerns the question of
how auditory information is coded and processed, something we must know if we are to claim that we understand
hearing. At the level of the auditory nerve we know that
somehow information over a 120 dB range is represented
in the activity of a population of nerve fibers each of which
has a typical dynamic range of only 30-40 dB. At present
the best hypothesis is that sub-populations of nerve fibers
convey information over different intensity ranges. At levels above about 40 dB only a small population, about 15-20% of the 30000 nerve fibers, is coding the sounds we
communicate with.
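To put these decibel figures in linear terms, the short sketch below converts a level difference in dB into the corresponding intensity and pressure ratios, using the usual definitions (10 log10 of an intensity ratio, 20 log10 of a pressure ratio). The function names are illustrative only.

```python
def intensity_ratio(level_difference_db: float) -> float:
    """Intensity (power) ratio corresponding to a level difference in dB."""
    return 10 ** (level_difference_db / 10)


def pressure_ratio(level_difference_db: float) -> float:
    """Sound-pressure (amplitude) ratio corresponding to a level difference in dB."""
    return 10 ** (level_difference_db / 20)


# Ranges quoted in the text: about 120 dB for hearing at 3 kHz,
# and 30-40 dB for a typical single auditory nerve fiber.
for label, db in (("hearing at 3 kHz", 120),
                  ("single fiber, lower figure", 30),
                  ("single fiber, upper figure", 40)):
    print(f"{label}: {db} dB -> intensity x{intensity_ratio(db):.0e}, "
          f"pressure x{pressure_ratio(db):.0e}")
```

On this arithmetic a 120 dB range is a factor of a million million in intensity, whereas a 40 dB fiber covers a factor of only ten thousand, which is the gap the sub-population hypothesis is meant to bridge.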
II. TEMPORAL ASPECTS OF HEARING
A. Temporal resolution
The auditory system is extremely fast, at least when
compared to other sensory systems. By this I mean that we
can detect or resolve relatively brief changes in sounds.
For example, it has been shown that a periodically interrupted white noise sounds continuous -- the interruptions
are no longer audible -- only when the interruption rate is
more than 2-4 thousand interruptions per second. Another
way of looking at this is that we can resolve or "follow"
very brief intensity changes. In contrast, an interrupted
light appears continuous once the interruption rate exceeds
only about 60 interruptions per second. In this sense the
auditory system is almost two orders of magnitude faster
than the visual system.
There are many manifestations of the fine temporal resolution of hearing. The threshold for detecting a gap in a
continuous sound is about 2 msec, we can detect amplitude
modulation up to modulation frequencies of about 2 kHz,
and, as mentioned above, we recover from exposure to intense sounds quite quickly. The auditory system seems to
have been "built" to process rapidly changing dynamic signals. It is not surprising that the rapid dynamic changes
present in speech and in music are perceptually important.
Clearly, the information transmitted via the auditory
nerve must preserve temporal information with a fairly
high degree of accuracy. One manifestation of this is
"phase-locking". This refers to the fact that the spike discharges of a single auditory nerve fiber tend to coincide
with a certain phase of a pure-tone stimulus -- the fiber
fires synchronously with the input. The degree of synchrony is best for low frequencies, below about 1 kHz, and
rolls off to an upper frequency limit of about 4 kHz (in
cat). This synchrony to the temporal fine structure of sound
also implies that the fiber can follow fairly rapid intensity
changes, such as that which occurs when a gap is introduced in the stimulus -- unless the gap is too brief, the
fiber will decrease its firing rate during the gap. Thus, information will be available to higher centers for detecting
the presence of the gap.
More controversial is the role of timing information,
phase-locking, in other types of auditory coding. An endless argument in psychoacoustics is how pitch is coded. I
will discuss this more fully later but one hypothesis is that
pitch is coded at the level of the auditory nerve in the timing of spike discharges. It has also been suggested that
phase locking to the fine structure of speech plays a crucial
role in speech perception. The psychophysical evidence
supporting these hypotheses is equivocal. What is certain,
however, is that phase locking does play a fundamental
role in binaural hearing and sound localization. This will
be discussed later in the conference. And, as I have argued,
phase locking is a direct correlate of the fine temporal resolution of monaural hearing.
There is another aspect of temporal resolution that
should be discussed. That is the distinction between cross-channel and within-channel temporal resolution. Cross-channel
resolution refers to the ability to detect temporal
differences that occur over wide frequency spacings, typically 500 Hz or more. For example, it has been shown that
we can detect a difference in the onsets of a 1 kHz tone and
a 4 kHz tone of about 3 ms. Within-channel temporal resolution refers to the ability to detect changes that occur
within a circumscribed frequency band. These changes are
detected as changes in the envelope of the waveform. An
example would be detection of a gap in narrowband noise.
Again, the threshold is about 3 ms although the underlying
processes are almost certainly different. The distinction between these two types of temporal resolution is important
because it may help to clarify situations in which phase
distortion is audible. Specifically, phase changes that occur
over a narrow frequency region may be audible as changes
in the envelope. Phase changes that cause a delay between
widely spaced frequencies may be audible if the delay exceeds 3 ms.
Finally, no discussion of temporal factors would be complete without a discussion of temporal integration. Temporal integration refers to the decrease in the detection threshold with increases in the duration of the signal. It is as if
the system sums or integrates power over time. This integration occurs over relatively long durations -- of the order of hundreds of milliseconds. This poses an interesting
problem -- if the auditory system sums or integrates over
hundreds of milliseconds, how can we resolve events in the
millisecond range, two orders of magnitude shorter? This
is a rather basic paradox and is one that has not been adequately resolved. My own belief is that temporal integration is not a basic property of hearing; rather it reflects a
cognitive strategy applicable only in certain circumstances.
That is, we listen to the world through a relatively small
temporal window -- about 3 ms long -- and information
from the "looks" is stored in memory. Temporal integration, according to this account, simply reflects an increased
number of looks as the signal duration is increased.
Figure 2. Basilar membrane tuning curve for chinchilla obtained using the Mossbauer technique. The ordinate is the
level of a tone necessary to produce a vibration velocity of
0.1 mm/s. Data from Robles et al. [J. Acoust. Soc. Am. 80,
1364-1374, 1986]. [Abscissa: FREQUENCY (KHZ).]
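The "as if the system integrates power" description of temporal integration can be made concrete with a small idealized sketch. It assumes perfect energy integration up to a fixed limit, so that threshold falls by about 3 dB for each doubling of duration and then stops improving. The 200 ms limit and the function name are assumptions chosen for illustration (the text says only "hundreds of milliseconds"); this is not a fit to data and it does not implement the multiple-looks account.

```python
import math


def threshold_re_long_signal_db(duration_ms: float,
                                integration_limit_ms: float = 200.0) -> float:
    """Threshold elevation (dB) relative to a long signal, assuming an ideal
    integrator that sums power over at most `integration_limit_ms`.

    The 200 ms default is a hypothetical value used only for illustration.
    """
    effective_ms = min(duration_ms, integration_limit_ms)
    return 10 * math.log10(integration_limit_ms / effective_ms)


for duration in (3, 20, 100, 200, 400):
    shift = threshold_re_long_signal_db(duration)
    print(f"{duration:>4} ms signal: threshold {shift:4.1f} dB above the long-signal value")
```

Under these assumptions a 100 ms signal needs about 3 dB more level than a 200 ms signal, and nothing is gained beyond the limit.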
III. FREQUENCY AND PITCH PERCEPTION
A. Frequency selectivity
It is well-known that the auditory system behaves roughly
as a Fourier analyzer -- it decomposes a complex spectrum into its constituent frequency components. At the
physiological/biomechanical level this is referred to as
tonotopic organization: different frequencies stimulate different places in the auditory system. This organization begins at the cochlea and is preserved at the highest levels of
the auditory system. The discovery and delineation of
tonotopicity and the underlying mechanisms is, especially
at the cochlear level, one of science's beautiful stories,
complete with a Nobel prize and truly elegant experimental and theoretical work by many scientists.
Figure 2 presents data from an experiment that measured
the vibration of the basilar membrane -- the structure within the cochlea upon which the hair cell receptors sit. The
figure shows a mechanical tuning curve obtained by measurement of the velocity of a very small portion of the basilar membrane. We are looking at one spot on the membrane and finding the SPL of a tone that produces a fixed
rms velocity of that spot. (The spot vibrates at the same
frequency as the stimulating tone.) The tuning curve shows
a high degree of frequency selectivity -- only frequencies
within a narrow range vibrate that spot on the membrane.
If we were to look at another spot close by we would find a
similar tuning curve but shifted in frequency -- the "best
frequency" changes with changes in location. There is
much more to the cochlear story but I can just mention in
passing that it is an area of research in hearing that has recently exploded. The explosion was caused primarily by
observations that indicate that we can no longer think of
the cochlea as a passive mechanical system. There are active processes that are occurring and these play a fundamental role in cochlear frequency selectivity. An example
of the evidence for active processes is "cochlear emissions" -- sounds that are produced within the cochlea and
that can be measured in the external ear canal.
There are many psychoacoustical manifestations of frequency selectivity. The notion of the "critical band" has
been around since Fletcher's classic work in 1940. The notion rests on the demonstration that when a pure tone is
masked by noise, only frequency components of the noise
that are near the frequency of the tone are effective in
masking the tone. A more current concern in psychoacoustics is the measurement of the filter characteristic that underlies the critical band and, more generally, auditory frequency selectivity. An example is a "psychophysical
tuning curve" such as shown in Fig. 3. The subject's task is
to detect a signal whose level and frequency are fixed. We
choose a masker frequency (the abscissa) and find the level
of the masker that just masks the signal. This level is the
ordinate of the figure. We see that for masker frequencies
that are close to the signal frequency, a low intensity
masker will mask the signal, whereas for remote masker
frequencies a much higher masker level is required. The
interpretation is that the subject is listening through a filter centered at the signal frequency and the amount of masker required is determined by how much the filter attenuates the
masker. Things aren't that simple, of course, and there is
considerable debate over whether these psychophysical
tuning curves are a completely valid measure of frequency
selectivity. I use this example because of the remarkable
similarity of the psychophysical tuning curve and the mechanical tuning curve shown in the previous figure. Both
are clear indications of auditory frequency selectivity.
Auditory frequency selectivity, in its general sense, is a
fundamental property of hearing and is the most studied
single aspect of hearing. Indeed, at one time "theories of
hearing" meant theories of cochlear frequency selectivity.
B. Pitch perception
Pitch is a subjective attribute of sound and does not bear
any simple relationship with the physical attributes of
sound. Frequency and pitch are not synonymous terms and
we do not have a physical measuring instrument that measures "pitch".
Pitch is surprisingly hard to rigorously define but is usually related, somewhat circularly, to music -- it is, roughly,
that attribute which permits sounds to be ordered on a musical scale. The pitch of a sound is usually measured by
asking the subject to adjust the frequency of a tone (or the
fundamental frequency of a periodic waveform) such that
the pitch of the tone matches the pitch of the test sound.
Assuming that the subject knows what "pitch" means and
is matching the sounds on that basis -- in some cases debatable
assumptions -- we can say that the pitch of the test
sound is equivalent to that of a tone of frequency X. The
shorthand, somewhat misleading, expression is that the
pitch is X Hz.
There is a very large literature on pitch perception and I
can only highlight certain aspects. We are exquisitely sensitive to frequency changes. For example, we can discriminate between a 1000 Hz tone and a 1002 Hz tone: we say that
the difference threshold, or DL, is 2 Hz at 1 kHz. The DL
increases as the frequency increases: at 8 kHz, for example,
the DL is about 70 Hz. It is typically assumed and it seems
reasonable that these frequency differences are perceived as
pitch changes, although, of course, we cannot be sure.
How do we account for our incredible sensitivity to frequency and pitch changes? This gets us into an area which
has been hotly debated in psychoacoustics for several
decades. The debate is whether pitch is coded peripherally
in terms of the place of stimulation or in terms of the timing of neural responses. It is clear from the physiology that
both codes are possible -- a place code based upon tonotopic organization, a timing code based upon phase locking. Furthermore, there appears to be sufficient information
at the level of the auditory nerve to account for our small
difference thresholds using either of these codes. This issue, a rather fundamental one for understanding hearing, is
still not resolved.
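To give a feel for how small these difference thresholds are, the sketch below expresses the two DLs quoted above (2 Hz at 1 kHz, about 70 Hz at 8 kHz) as relative frequency changes and, as an added convenience not used in the text, in musical cents (1200 cents per octave). The function names are illustrative.

```python
import math


def relative_dl(dl_hz: float, frequency_hz: float) -> float:
    """Difference threshold as a fraction of the reference frequency."""
    return dl_hz / frequency_hz


def dl_in_cents(dl_hz: float, frequency_hz: float) -> float:
    """Difference threshold as a musical interval in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2((frequency_hz + dl_hz) / frequency_hz)


# (DL in Hz, reference frequency in Hz) as quoted in the text.
for dl, f in ((2, 1000), (70, 8000)):
    print(f"{f} Hz: DL = {dl} Hz = {100 * relative_dl(dl, f):.2f}% "
          f"= {dl_in_cents(dl, f):.1f} cents")
```

Both values are a small fraction of a semitone, and the relative DL, not just the absolute DL, is larger at 8 kHz than at 1 kHz.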
Some other important facts and observations regarding
the pitch of pure tones deserve mention. A tone of a certain frequency
can elicit different pitches in the two ears of a normal-hearing subject. This is called diplacusis and is most easily explained using place theory -- one can imagine that slight
differences in the dimensions and structure between the two
ears cause maximal stimulation at slightly different places.
The pitch of a pure tone of a fixed frequency can change
with changes in the level of the tone. There appear to be
large differences between individuals in how it changes,
however. For this reason, and many others, it would be inadvisable to provide "pitch compensation" (analogous to
loudness compensation) in audio equipment.
Two aspects of pitch should be distinguished: tone
height and chroma. Tone height is that aspect which increases continuously as the frequency of a tone is increased. Tone height is probably what is reflected in the
mel scale. Chroma is cyclical -- two tones that are an octave apart have the same chroma. This captures the "sameness" of, say, the note A regardless of its octave. A related
point is that the subjective octave does not correspond exactly to the physical octave -- the subjective octave is
slightly larger than a 2:1 frequency ratio and depends on
the frequency.
There is a rich literature on the perception of pitch of
complex sounds. The most important phenomenon is
"residue pitch", sometimes called the pitch of the missing
fundamental, low pitch, or periodicity pitch. This refers to
the fact that the pitch of a harmonic complex is the same as
the pitch of its fundamental frequency even when there is
no energy at the frequency of the fundamental. For example, the pitch of a complex of 1800, 2000, and 2200 Hz is
200 Hz, the fundamental frequency. It has been clearly
shown that the 200 Hz pitch does not result from the generation of a 200 Hz distortion product in the peripheral auditory system. Various models have been proposed to account
for the basic phenomenon of residue pitch and for the many
related phenomena. These models generally propose fairly
extensive central processing and can involve cognitive,
learning-related factors. The models have evolved to the
point that many can accurately predict the pitch (or pitches)
of very complex sounds such as bell-strikes.
Figure 3. Psychophysical tuning curves for three signal frequencies. The ordinate is the level of the masker that is necessary to just mask the signal. Data from Wightman et al. [In
Psychophysics and Physiology of Hearing. Evans, E. F. and
Wilson, J. P. (eds.). Academic Press, London. 1977]. [Abscissa: Frequency (Hz).]
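The arithmetic behind the 1800, 2000, 2200 Hz example can be sketched as follows: for components that are exact integer harmonics, the missing fundamental is their greatest common divisor. This is only the arithmetic of the example, under the assumption of exactly harmonic integer frequencies; it is not one of the pitch models referred to above, which handle inharmonic and ambiguous cases. The function name is hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental_hz(component_frequencies_hz: list[int]) -> int:
    """Greatest common divisor of exactly harmonic, integer-valued components."""
    return reduce(gcd, component_frequencies_hz)


components = [1800, 2000, 2200]
f0 = implied_fundamental_hz(components)
harmonic_numbers = [f // f0 for f in components]
print(f"components {components} Hz -> fundamental {f0} Hz, harmonics {harmonic_numbers}")
```

The three components are the 9th, 10th, and 11th harmonics of 200 Hz, even though the complex contains no energy at 200 Hz itself.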
IV. INTENSITY PERCEPTION AND LOUDNESS
A. Intensity discrimination
Under optimal circumstances a 1 dB change in sound
intensity can be detected. That is, we can just detect a 1 dB
intensity difference between two bursts of sound, we can
detect a 1 dB increment in a continuous sound, and we can
detect a 1 dB "bump" in the spectrum of a spectrally flat
sound. This is true, approximately, over a very wide range
of sound levels. Thus, for example, we can just detect the
intensity difference between a 20 and a 21 dB sound and
between a 120 and 121 dB sound. There are several aspects
of this that deserve comment. First of all, the fact that relatively small intensity differences can be detected over such
a wide range -- a range of over 100 dB -- is another manifestation of the remarkable dynamic range of the auditory
system. Secondly, it should be emphasized that the decibel
is a relative measure: a 1 dB change at 120 dB is a much
larger absolute intensity change than a 1 dB change at 20
dB. A 1 dB change corresponds to a change of about 26%.
Constant relative intensity changes are just detectable. This
fact is known as Weber's Law. Specifically, Weber's Law
states that ΔI = kI, where ΔI is the absolute intensity
change that is just detectable, and I is the absolute intensity
of the reference. Weber's Law is one of the great "laws" of
experimental psychology and dates back to the work of
Weber and Fechner in the early 1800's. Weber's Law holds,
at least approximately, for a wide variety of auditory stimuli and also holds for intensity discrimination in most of
the other senses. Weber's Law, or a version of it, also holds
for detecting signals in noise: this version states that a constant signal-to-noise ratio yields constant detectability and is
one reason why we often use this specification. The fundamental question is why does Weber's Law hold? Why are
relative intensity changes so important in hearing? What is
it about auditory processing that makes relative changes
important? We have theories, of course, including that the
auditory system employs logarithmic compression, but
none has proven completely satisfactory.
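The relation between a just-detectable change of 1 dB and Weber's Law can be checked with a few lines of arithmetic. The sketch below converts a level change in dB into the Weber fraction k = ΔI/I and compares the absolute increments at 20 dB and 120 dB. Intensities are expressed relative to an arbitrary reference intensity, and the function names are illustrative.

```python
def weber_fraction(delta_level_db: float) -> float:
    """Relative intensity change k = delta_I / I for a level change in dB."""
    return 10 ** (delta_level_db / 10) - 1


def relative_intensity(level_db: float) -> float:
    """Intensity relative to the reference intensity for a level in dB."""
    return 10 ** (level_db / 10)


def just_detectable_increment(level_db: float, delta_level_db: float = 1.0) -> float:
    """Absolute increment delta_I = k * I, in units of the reference intensity."""
    return weber_fraction(delta_level_db) * relative_intensity(level_db)


k = weber_fraction(1.0)
print(f"1 dB corresponds to k = {k:.3f}, i.e. about {100 * k:.0f}%")

ratio = just_detectable_increment(120) / just_detectable_increment(20)
print(f"the increment at 120 dB is {ratio:.0e} times the increment at 20 dB")
```

The relative change is the same 26% at both levels, while the absolute increment differs by ten orders of magnitude, which is what "the decibel is a relative measure" amounts to.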
Finally, I would like to put intensity discrimination in the
broader context of how complex sounds are processed and
ultimately perceived. Intensity discrimination tells us
something about how changes in amplitude or intensity
that occur within a limited spectral region are detected and
processed. More generally, it has given us valuable hints
about how the spectral characteristics of a sound might be
coded, particularly at the level of the auditory nerve. A recent and exciting development in psychoacoustics addresses the closely related problem of how we discriminate and
perceive spectral shape or spectral "profiles". The important difference is that in this research the subjects must
make a comparison across frequency, not just what happens within a single frequency region. It seems clear that
subjects can do this quite well. It is also clear that such capability is crucial for real-world auditory perception.
B. Loudness
Loudness is, of course, one of the fundamental attributes of
auditory perception.
It is the subjective magnitude of
sound. It is, like pitch, not a physical property of sound. At
the risk of belaboring the obvious: it is almost always incorrect to say that: "the loudness of the sound was 90 dB
SPL". The 90 dB SPL is a physical measurement and is
only indirectly related to the loudness of the sound. A 90
dB sound could be, depending on its spectrum, loud or
quite soft.
I will not attempt a thorough review of loudness but will
mention several highlights. As you are well aware, "equal
loudness contours" have been measured for tones and for
narrow bands of noise. These measurements are based
upon loudness matches and from these measurements we
can determine the "loudness level" (in phons) of a sound.
When we say that the loudness level of a sound is 50 phons
we mean that it is judged equal in loudness to a 1 kHz tone
presented at 50 dB SPL. The growth of loudness with intensity has been extensively studied, typically using magnitude estimation procedures, and we know that for sounds
above threshold a 10 dB increase in level will produce approximately a doubling of loudness. Finally, there are several fairly successful schemes for calculating the loudness
of complex sounds.
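The rule of thumb that a 10 dB increase roughly doubles loudness can be written as a simple power relation. The sketch below assumes that the doubling holds exactly and at all levels, which is an idealization: the text says only "approximately" and "for sounds above threshold". The function name is illustrative.

```python
def loudness_ratio(level_change_db: float) -> float:
    """Approximate loudness ratio for a level change, assuming loudness
    doubles for every 10 dB increase (an idealized rule of thumb)."""
    return 2 ** (level_change_db / 10)


for change in (-10, 3, 10, 20, 40):
    print(f"{change:+3d} dB -> loudness x{loudness_ratio(change):.2f}")
```

On this rule a 40 dB increase, a factor of 10000 in intensity, corresponds to only about a sixteenfold increase in loudness.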
I am minimizing a discussion of loudness because in my
opinion loudness is not particularly important in hearing,
While it is a primary auditory attribute, loudness, in itself,
is not important for auditory communication, speech and
music included. It is important if a sound is too loud or too
soft, but within this vast range we can communicate about
equally effectively regardless of loudness. What is important, crucially important, for auditory communication are
the intensity changes that occur over frequency and over
time. This is where the information is and we must
understand how these changes are processed if we are to understand auditory perception. Loudness has little, if anything,
to do with it. Yes, dynamics are important in music, at least
certain types of music, but far more important are the spectral shapes of the sounds and their temporal characteristics.
V. SUMMARY AND CONCLUSIONS
In psychoacoustics we are concerned with the behavior of a
very complex system and, despite the stories I've told you,
there are many potential pitfalls in trying to measure hearing
and in drawing valid conclusions from our measurements. I
discussed the distinction between objective and subjective
psychophysical measurements. The question of validity is
not as pressing with objective methods, and the data can be
much more directly related to underlying physiological processes. But, there are many aspects of perception, including
those related to the evaluation of audio reproduction devices,
that simply are not amenable to objective psychophysical
measurement. We must use subjective methods in some cases, but considerable caution should be exercised in interpreting the results of such measurements.
The dynamic range of hearing is the intensity range between absolute threshold and a somewhat arbitrary upper
limit, often taken as the "threshold" for pain. Absolute
thresholds (measured in quiet) are determined by internal
noise, by the transfer function of the acoustic system up to
the cochlea, and by many other factors. Hearing loss is defined by an elevation in absolute threshold. There generally
are perceptual consequences of hearing loss in addition to a
loss in sensitivity. Thus, simple compensation for the loss
in sensitivity generally does not restore normal hearing.
The dynamic range of hearing is spectacular and it is not
yet clear how the system maintains such a large range. This,
the so-called "dynamic range problem", is fundamental to
an understanding of how we hear. In discussing this problem, I mentioned that this huge range is available to us almost instantly -- our ears do not slowly adjust their gain to
operate over a restricted range. Clearly, audio reproduction
that does not audibly degrade the signal must somehow preserve a large dynamic range. If this is accomplished by using gain-adjustment devices, careful attention must be devoted to the temporal characteristics of those devices.
The auditory system seems to have been designed to
process rapidly changing sounds -- sounds whose amplitude and/or frequency changes over time. I distinguished
between two types of temporal resolution: within-channel
and cross-channel. Within-channel resolution reflects sensitivity to envelope changes that occur over a relatively
small portion of the spectrum, a bandwidth of the order of
20% of the center frequency. Cross-channel resolution
refers to sensitivity to temporal differences that occur over
widely spaced frequency regions. For both types of resolution, the approximate auditory time constants are about 3
ms. Phase disparities in reproduction equipment may be
audible if they exceed these times.
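To relate a phase disparity to the time constants mentioned above, the phase shift at a given frequency can be converted to an equivalent time delay. The following is a minimal sketch; the function names and the use of the 3 ms figure as a simple comparison threshold are illustrative assumptions, and actual audibility depends on much more than this single number.

```python
# Approximate auditory time constant quoted in the text, in milliseconds.
AUDITORY_TIME_CONSTANT_MS = 3.0


def phase_to_delay_ms(phase_degrees: float, frequency_hz: float) -> float:
    """Time delay (ms) equivalent to a phase shift at a single frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    cycles = phase_degrees / 360.0          # fraction of one period
    return cycles / frequency_hz * 1000.0   # seconds converted to ms


def exceeds_time_constant(phase_degrees: float, frequency_hz: float) -> bool:
    """Crude comparison against the 3 ms figure (not a perceptual model)."""
    delay = abs(phase_to_delay_ms(phase_degrees, frequency_hz))
    return delay > AUDITORY_TIME_CONSTANT_MS


if __name__ == "__main__":
    for freq in (100.0, 1000.0):
        delay = phase_to_delay_ms(180.0, freq)
        print(f"180 degrees at {freq:.0f} Hz = {delay:.2f} ms, "
              f"exceeds 3 ms: {exceeds_time_constant(180.0, freq)}")
```

The same phase shift corresponds to very different delays at different frequencies: a half-cycle shift is 5 ms at 100 Hz but only 0.5 ms at 1 kHz.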
A fundamental fact about hearing is that the auditory system is tonotopically organized. At any given level, different
frequencies stimulate different places. This organization
begins in the cochlea and, at this level, shows a very high
degree of frequency selectivity. There are direct psychoacoustical manifestations of frequency selectivity -- the notion of critical bands, of psychophysical tuning curves, and
of auditory filtering capture these. Although it is clear that
the auditory system performs a type of frequency-to-place
analysis, it is equally clear that timing information is preserved. Timing information, or phase locking, is clearly important in binaural hearing and also underlies the high degree of temporal resolution of monaural hearing. Whether it
plays a basic role in other types of auditory coding, in pitch
perception, for example, is not clear.
Pitch is a very important subjective attribute of sound,
particularly in musical perception. A long-standing issue is
how pitch is coded at the periphery -- this is the place vs.
time issue I have just mentioned. The more general issue is
how we extract the pitch of complex sounds. It is clear that
the pitch, or pitches, of such sounds is not simply determined by the physical characteristics of the sounds -- extensive "computation", perhaps including stored or learned
strategies, seems to be involved.
AES 8th INTERNATIONAL CONFERENCE
I briefly considered loudness and intensity perception.
We are quite sensitive to intensity changes over the entire
dynamic range of hearing. It is relative intensity changes
that are important, a fact indicated by Weber's Law. This,
essentially, is also why signal-to-noise ratio is a perceptually relevant specification. An important question in psychoacoustics is why Weber's Law holds. This is part of the
more general, and fundamentally important, question of
how we hear and extract information from the spectral
shape of sounds.
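Weber's Law can be stated compactly: the just-detectable intensity increment is a constant fraction of the baseline intensity, so the just-detectable change expressed in decibels is the same at every level. The sketch below illustrates this; the Weber fraction of 0.26 (roughly 1 dB) is an illustrative assumption rather than a value taken from this paper, and the function names are hypothetical.

```python
import math


def jnd_intensity(base_intensity: float, weber_fraction: float) -> float:
    """Just-detectable increment under Weber's Law: delta_I = k * I."""
    return weber_fraction * base_intensity


def jnd_level_db(base_intensity: float, weber_fraction: float) -> float:
    """Level change in dB that corresponds to the just-detectable increment."""
    if base_intensity <= 0:
        raise ValueError("base intensity must be positive")
    increment = jnd_intensity(base_intensity, weber_fraction)
    return 10.0 * math.log10((base_intensity + increment) / base_intensity)


if __name__ == "__main__":
    k = 0.26  # assumed Weber fraction, for illustration only
    # Baseline intensities spanning eight orders of magnitude (arbitrary units).
    for base in (1e-10, 1e-6, 1e-2):
        print(f"I = {base:.0e}: delta_I = {jnd_intensity(base, k):.2e}, "
              f"delta_L = {jnd_level_db(base, k):.3f} dB")
```

The absolute increment grows in proportion to the baseline, while the increment in decibels stays fixed. This is the sense in which relative, not absolute, intensity changes matter.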
VI. SUGGESTED INTRODUCTORY TREATMENTS:
Moore, B.C.J. (1989). An Introduction to the Psychology
of Hearing, 3rd Edition. Academic Press, London.
Green, D.M. (1976). An Introduction to Hearing. Erlbaum Associates, Hillsdale, New Jersey.
Pickles, J.O. (1988). An Introduction to the Physiology
of Hearing, 2nd Edition. Academic Press, London.
Yost, W.A. and Nielsen, D.W. (1985). Fundamentals of
Hearing, 2nd Edition. Holt, Rinehart and Winston, New York.