
Consonance and Dissonance - Getting Our Terms Right

The term "consonance" comes from the Latin consonare, which means simply "sounding together". In early Western music theory the term became synonymous with a harmonic interval. However, later theorists used the term to refer to particularly euphonious or harmonious intervals. From there, the term was generalized to triads, tetrads and then to sonorities generated by any number of tones. In both theoretical and experimental writings, researchers have used a wide variety of terms. These have included terms such as: pleasant, unpleasant, euphonious, beautiful, ugly, rough, smooth, fused, pure, diffuse, tense, and relaxed. Some researchers have treated such terms as synonymous. Others have assumed that the differences are minor -- that is, when asked to judge how ______ a sound is, listeners will respond in roughly the same manner. Other researchers have assumed that each of these terms generates an entirely different response.

Van de Geer, Levelt & Plomp (1962)


Van de Geer, Levelt & Plomp (1962) carried out an important study where they asked Dutch listeners to judge tone pairs according to ten different scales:
English                                        Dutch
high-low                                       hoog-laag
sharp-round                                    scherp-rond
beautiful-ugly                                 mooi-lelijk
active-passive                                 actief-passief
consonant-dissonant                            consonant-dissonant
euphonious-diseuphonious                       welluidend-onwelluidend
wide-narrow                                    wijd-nauw
sounds like one tone-sounds like more tones    klinkt als een toon-klinkt als meer tonen
tense-quiet                                    gespannen-rustig
rough-smooth                                   ruw-glad

Non-musician listeners judged each harmonic interval using a 7-point scale for each semantic term. Using factor analysis, van de Geer, Levelt and Plomp found that the responses grouped into three independent, statistically significant factors. One factor (dubbed pitch) included the scales high, sharp, tense, narrow, and active. A second factor (dubbed pleasantness) included the scales euphonious, consonant, and beautiful. A third factor (dubbed fusion) included the scales rough, more tones and fusion.

The first factor (pitch) was found to correlate directly with the mean frequency of the pitches used in the interval. Van de Geer, Levelt and Plomp (1962) drew the following three conclusions from their study:

1. Musical intervals are judged using three basic dimensions: pitch height, pleasantness, and fusion.
2. Musicians and non-musicians use the term "consonant" differently. Musicians typically consider unisons, octaves, fifths and fourths as the most consonant, whereas non-musicians typically experience thirds and sixths as being more consonant. Non-musicians conceive of "consonance" primarily in terms of pleasantness.
3. There is no straightforward relationship between consonance and fusion.

The main lesson from Van de Geer, Levelt and Plomp (1962) is that care must be taken when instructing listeners to judge intervals. Some terms are largely synonymous (such as euphonious and pleasant), whereas other terms are not interchangeable (such as pleasant and fused).

Some Notes Regarding Tuning and Temperament


Suppose we wanted to create a scale that permitted only justly tuned intervals. All octaves would have a frequency ratio of 2:1, all fifths would have a frequency ratio of 3:2, all major thirds would have a frequency ratio of 5:4, and so on. We might also require that all major seconds have a frequency ratio of 9:8. It can be shown mathematically that such an "ideal" tuning system is impossible. Consider, for example, the goal of having both perfect fifths and perfect octaves. Let's begin with the tone A-440 Hz, and tune a series of 12 fifths:
A4-E5:    440 × 3/2 = 660 Hz
E5-B5:    660 × 3/2 = 990 Hz; transposed down one octave = 495 Hz
B4-F#5:   495 × 3/2 = 742.5 Hz
F#5-C#6:  742.5 × 3/2 = 1113.75 Hz; transposed down one octave = 556.875 Hz
C#5-G#5:  556.875 × 3/2 = 835.3125 Hz; transposed down one octave = 417.6563 Hz
G#4-D#5:  417.6563 × 3/2 = 626.4844 Hz
D#5-A#5:  626.4844 × 3/2 = 939.7266 Hz; transposed down one octave = 469.8633 Hz
A#4-F5:   469.8633 × 3/2 = 704.7949 Hz
F5-C6:    704.7949 × 3/2 = 1057.1924 Hz; transposed down one octave = 528.5962 Hz
C5-G5:    528.5962 × 3/2 = 792.8943 Hz
G5-D6:    792.8943 × 3/2 = 1189.3414 Hz; transposed down one octave = 594.6707 Hz
D5-A5:    594.6707 × 3/2 = 892.0061 Hz; transposed down one octave = 446.0030 Hz
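The chain of fifths above can be checked with a short script. The following is a minimal sketch, not part of the original notes: the function names are illustrative, and the octave-folding rule (halve a frequency whenever it reaches twice the starting pitch) is an assumption that differs slightly from the hand calculation, so one intermediate value (G#) comes out an octave higher. The final pitch is unaffected. The second function reports how far any number of just fifths lands from a whole number of octaves.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def chain_of_fifths(start_hz=440.0, n_fifths=12):
    """Tune n just fifths (3:2) upward from start_hz.

    Assumption: a note is folded down an octave whenever it reaches
    twice the starting frequency, keeping every note within one octave.
    """
    freqs = [start_hz]
    f = start_hz
    for _ in range(n_fifths):
        f *= 3 / 2
        if f >= 2 * start_hz:
            f /= 2
        freqs.append(f)
    return freqs


def comma_cents(n_fifths):
    """Signed gap, in cents, between n just fifths and the nearest
    whole number of octaves."""
    total = n_fifths * cents(3 / 2)
    return total - 1200.0 * round(total / 1200.0)


if __name__ == "__main__":
    final = chain_of_fifths()[-1]
    print(f"After 12 fifths: {final:.4f} Hz")
    for n in (12, 41, 53):
        print(f"{n} fifths miss the octave by {comma_cents(n):+.2f} cents")
```

Run as a script, this prints a final pitch of about 446.003 Hz and a 12-fifth gap of about 23.46 cents, the Pythagorean comma.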

Notice that we began with A-440 Hz and ended with A-446 Hz. This difference is known as the Pythagorean comma and has been known since ancient times. The comma amounts to about 23 hundredths of a semitone (23 cents). The Pythagorean comma arises from the fact that the powers of 2 and the powers of 3 never intersect. That is, the series 2 × 2 × 2 ... can never coincide with the series 3 × 3 × 3 ... Of course, it is possible to continue tuning more fifths than just 12. The "circle" of fifths approaches the starting point after 12, 41, and 53 tunings of fifths. As the number of tones increases, the size of the comma gets smaller, but it never goes away entirely. No number of tunings will return you to the precise starting point.

It is therefore impossible to create a scale that provides a just octave interval from every note as well as a just fifth from every note. A similar case can be made for just major thirds (5:4 ratio): no scale can be created where every pitch provides both a just octave and a just major third.

There are four general ways of dealing with the Pythagorean comma: (1) ignore it, (2) minimize it, (3) adapt to it, or (4) hide it. The first approach is to simply abandon the goal of creating music using just intervals. Indeed, in many musical cultures, such as in the musics of Bali and Java, the common tuning systems bear little similarity to just intervals, so it is possible that there is no underlying preference for just intervals.

Expanded Pitch Set

There are two ways of minimizing the Pythagorean comma. One is to create a tuning system with so many notes per octave that the mistuning is small. This approach is evident in the music of Harry Partch, who created his own musical instruments capable of playing 43 notes per octave.

Reduced Pitch Set

A second way of minimizing the Pythagorean comma is to limit the music-making to a small number of notes. It is possible to tune a pentatonic scale so that most intervals are fairly close to their just values. Even better, one might limit music-making to a couple of drone tones tuned a fifth apart. Before the classical period, modulation to different keys was uncommon. Consequently, it was possible to use tuning systems optimized for particular keys (like C major).

Dynamic Tuning

A third approach is to continuously adapt the tuning as the music unfolds. This can be done with a computer, where the tuning of each successive note in a sequence is adjusted so that just intervals are always used. Unfortunately, this adaptive approach causes the "tonic" pitch to vary over the course of a melody: the "doh" you end with will not necessarily match the "doh" you begin with. For many melodies, most listeners find such adaptive tuning to be unpleasant. Another problem with this approach is that it only works for single-note melodies. Once you add harmony, it is impossible to ensure that both the harmonic and melodic intervals are just. A final disadvantage is that adaptive tuning is really only practical using a computer. It would be a significant challenge for human performers to adopt adaptive tuning.

Masking the Comma

Western music has historically relied almost exclusively on the fourth approach: mask or hide the Pythagorean comma in some way. There are several ways to mask the effect of non-just tunings. A simple approach is to add vibrato so that mistunings are difficult to hear. Research has also shown that listeners have greater difficulty hearing mistunings of tones that are short in duration. So composing music using durations that are predominantly less than half a second will mask the effects of mistuning.

Normally, the Pythagorean comma is hidden by using a "compromise" tuning system. The most popular compromise tunings spread the comma over a number of notes. These compromise systems are referred to as "temperaments". Many temperaments have been advocated over the centuries. The following table characterizes just five. The numerical values in the table indicate how much a given interval deviates from the just interval in cents (hundredths of a semitone).
Name of Tuning System    Fifth    Major Third    Minor Third    Major Second    Minor Second
Pythagorean Tuning         0.0       +21.5          -21.5            0.0           -21.5
Equal Temperament         -2.0       +13.7          -15.6           -3.9           -11.7
Silbermann Tuning         -3.9        +5.9           -9.8           -7.8            -2.0
Meantone Tuning           -5.4         0.0           -5.4          -10.8            +5.4
Salinas Tuning            -7.2        -7.2            0.0          -14.3           +14.3
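The deviations in the table above can be reproduced from a single number per temperament: the size of its fifth. The sketch below is illustrative rather than a statement of how the table was originally computed. It assumes that each interval is reached by stacking fifths (major second = 2 fifths up, major third = 4 up, minor third = 3 down, minor second = 5 down, all reduced to one octave), that the just references are 3:2, 5:4, 6:5, 9:8 and 16:15, and that the five temperaments narrow the just fifth by the fractions of a comma shown in `FIFTH_SIZES`. Those fractions are assumptions chosen because they match the tabulated values.

```python
import math


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200.0 * math.log2(ratio)


JUST_FIFTH = cents(3 / 2)
SYNTONIC_COMMA = cents(81 / 80)             # about 21.5 cents
PYTHAGOREAN_COMMA = cents(3**12 / 2**19)    # about 23.5 cents

# Assumed fifth size (in cents) for each temperament in the table.
FIFTH_SIZES = {
    "Pythagorean Tuning": JUST_FIFTH,
    "Equal Temperament": 700.0,
    "Silbermann Tuning": JUST_FIFTH - PYTHAGOREAN_COMMA / 6,
    "Meantone Tuning": JUST_FIFTH - SYNTONIC_COMMA / 4,
    "Salinas Tuning": JUST_FIFTH - SYNTONIC_COMMA / 3,
}

# Interval name -> (number of stacked fifths, just frequency ratio).
INTERVALS = {
    "Fifth": (1, 3 / 2),
    "Major Third": (4, 5 / 4),
    "Minor Third": (-3, 6 / 5),
    "Major Second": (2, 9 / 8),
    "Minor Second": (-5, 16 / 15),
}


def deviation_from_just(fifth_cents, n_fifths, just_ratio):
    """Tempered interval (stacked fifths folded into one octave)
    minus the just interval, in cents."""
    tempered = (n_fifths * fifth_cents) % 1200.0
    return tempered - cents(just_ratio)


def deviation_table():
    """Deviations rounded to 0.1 cent; adding 0.0 turns -0.0 into 0.0."""
    return {
        name: {
            interval: round(deviation_from_just(fifth, n, ratio), 1) + 0.0
            for interval, (n, ratio) in INTERVALS.items()
        }
        for name, fifth in FIFTH_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in deviation_table().items():
        print(name, row)
```

Describing each temperament by its fifth alone makes the trade-off visible: narrowing the fifth improves the thirds, and each temperament simply stops at a different point along that trade-off.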

Is there any way to judge whether one temperament is better than another? Can listeners hear the difference? If they can hear the difference, does the difference matter?

There are four main reasons why modern scholars have lost interest in the question of what is the best tuning system. First, in the 1930s, Carl Seashore measured the pitch accuracy of real performers and showed that singers and violinists are remarkably inaccurate. For non-fixed-pitch instruments, the pitch accuracy is on the order of 25 cents. Yet Western listeners (and musicians) are not noticeably disturbed by the intonation of professional performers. Secondly, on average, professional piano tuners fail to tune notes more accurately than about 8 cents. This means that even if performers could perform very accurately, they would find it difficult to find suitable instruments. Thirdly, listeners seemingly adapt to whatever system they have been exposed to. Most Western listeners find just intonation "weird" sounding rather than "better". Moreover, professional musicians appear to prefer equally tempered intervals to their just counterparts. See the results of Vos (1986). Finally, the perception of pitch has been shown to be categorical in nature. In vision, many shades of red will be perceived as "red". Similarly, listeners tend to mentally "re-code" mistuned pitches so they are experienced as falling in the correct category. Mistunings must be remarkably large (>50 cents) before they draw much attention. This insensitivity is especially marked for short-duration sounds -- which tend to dominate music-making.
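The categorical "re-coding" described above can be illustrated with a toy model. The sketch below is only an illustration under strong assumptions: it treats the pitch categories as the twelve equal-tempered semitones referenced to A4 = 440 Hz, and it places each category boundary exactly midway between neighbours, which is where the 50-cent figure comes from. Real listeners' category boundaries are learned and need not be this tidy. The function name is hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_category(freq_hz, a4_hz=440.0):
    """Return (note name, mistuning in cents) for the nearest
    equal-tempered pitch category.

    Assumption: categories are 12-tone equal-tempered pitches with
    boundaries midway (50 cents) between adjacent categories.
    """
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    mistuning_cents = 100.0 * (semitones_from_a4 - nearest)
    midi_number = 69 + nearest          # MIDI note 69 is A4
    name = NOTE_NAMES[midi_number % 12] + str(midi_number // 12 - 1)
    return name, mistuning_cents


if __name__ == "__main__":
    for offset in (0, 25, 40, 60):
        freq = 440.0 * 2 ** (offset / 1200.0)
        name, mistuning = nearest_category(freq)
        print(f"A4 {offset:+d} cents -> heard as {name} ({mistuning:+.1f} cents)")
```

In this model a tone 40 cents sharp of A4 is still coded as A4, while a tone 60 cents sharp is coded as a flat A#4.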

Consonance and Dissonance - Effect of Culture


What is the role of culture in the phenomena of consonance and dissonance? We might expect that judgements of consonance and dissonance rely to some extent on exposure to a musical culture -- that is, on learning. Although many people have speculated about the effect of culture on judgements of consonance/dissonance, depressingly few pertinent experiments have been carried out.

The poverty of experimental work in this area is a sad indictment of those who tend to pre-judge the issues. On the one hand, many psychoacousticians have tended to assume that the auditory periphery plays the preeminent role in consonance/dissonance perception. Since the human hearing organ changes little around the world, psychoacousticians assume there is no need to repeat experiments with people from different cultures. By contrast, many ethnomusicologists have tended to assume that the differences in perception between cultures are patently obvious. Once again, they assume there is no need to carry out experiments to test their intuitions.

The existing experimental evidence is mixed. Some experiments imply that judgements of consonance/dissonance are sensitive to cultural background, whereas other evidence implies that such judgements are not especially sensitive to cultural background.

Enculturation of Acceptable Tuning


By way of illustration, consider the following data. In the Western equal temperament tuning system, perfect fifths are mistuned by two cents (two hundredths of a semitone). Specifically, equally-tempered fifths are smaller than just fifths by two cents. Joos Vos (1987) had 18 Western musicians judge the acceptability of tunings for various tempered perfect fifths. The graph below shows the results for tone durations of 0.25 seconds (fast presentation) and 0.50 seconds (slow presentation).

Notice that the results are skewed to the left of the perfectly just fifth (0 cents deviation). In general, it appears that the Western musicians judged the slightly small interval as more acceptable than the just perfect fifth. This result is consistent with a learned preference arising from cultural exposure. Notice, however, that Vos had his musicians judge the acceptability of the intervals rather than the consonance, pleasantness or some other criterion.

Butler and Daston


In 1968, Janet Butler and Paul Daston carried out a study of consonance and dissonance where they compared American and Japanese listeners. Both American and Japanese listeners were asked to judge the consonance of various equally-tempered intervals. In addition, each listener was given a discrimination task where they were required to make same/different judgments for six paired dyads. Finally, Butler and Daston asked their Japanese listeners to indicate whether they preferred Western music or traditional Japanese music.

The following table shows the Spearman rank-order correlations between the various subject groups. The asterisk (*) indicates listeners who performed perfectly on the interval discrimination task. The cross (+) indicates Japanese listeners who stated that they preferred traditional Japanese music over Western music.
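[Table: Spearman rank-order correlations between six listener groups (American*, American, Japanese*, Japanese, Japanese*+, Japanese+). The cell alignment is not recoverable; the values appear in the following order.]
0.85 0.98 0.98 0.89 0.98 0.997 0.96 0.99 0.99 0.96 0.88
0.88 0.95 0.92 0.86 0.83 0.92 0.94 0.93 0.99 0.97 0.91 0.90 0.95 0.99 0.99
Japanese* 0.88; Japanese 0.88; Japanese*+ 0.88; Japanese+ 0.82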
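For readers unfamiliar with the statistic, a Spearman rank-order correlation is simply a Pearson correlation computed on ranks rather than raw scores. The sketch below shows the calculation in plain Python. The two rankings in the example are hypothetical values invented for illustration. They are not Butler and Daston's data.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical consonance rankings of six intervals by two groups.
    group_a = [1, 2, 3, 4, 5, 6]
    group_b = [1, 3, 2, 4, 5, 6]
    print(f"rho = {spearman(group_a, group_b):.3f}")
```

Because only the ordering matters, two groups that use a rating scale very differently can still correlate highly so long as they order the intervals similarly.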

The results show evidence of strong similarities between Japanese and American listeners, and also some evidence of cultural difference. The lowest correlation (0.82) occurs between those American listeners who scored best on the interval discrimination task and the Japanese listeners who preferred Japanese music. In general, the results show a broad consistency between Japanese and American listeners. If there are significant differences between the music listening of Japanese and American listeners, those differences are not to be found in judgments of the consonance of isolated dyads. Lundin (1947) carried out an experiment where Japanese and Western listeners were contrasted.

Cazden's Expectation Dissonance


Music theorist Norman Cazden wrote a number of articles on consonance and dissonance spanning the period 1942-1980. Unaware of the work of Plomp & Levelt (1965) and Kameoka & Kuriyagawa (1969), Cazden (1980) argued that the evidence in support of acoustic and physiological accounts of consonance and dissonance is weak. He did not doubt that a low-level sonorous unpleasantness (what he called euphony) exists, but he regarded the musical significance of this phenomenon to be marginal:

"... in principle, euphony refers rather to the overall psychoacoustic quality of a sonority isolated from any musical context. ... With euphony thus distinguished, and defined as a composite of all those psychoacoustic criteria capable of affecting a gradation of isolated sonorities, the terms consonance and dissonance proper may be reserved instead for those particular musical distinctions observed in the practice of Western tonal music" (Cazden, 1980; p.155)

Cazden's distinction between euphony and consonance/dissonance is echoed in the writings of many scholars. Cazden (1980) provides the following table of distinctions and their scholarly origins:
Sonorous/Static                       Functional/Dynamic                     Source
Euphonie                              Dynamie                                Choron, de la Fage
Eufonia                               Dinamia                                Basevi
Consonance physique                   Consonance esthétique                  Renaud
Konsonanz - Dissonanz                 Harmonie - Disharmonie                 Wundt
Konsonanz - Dissonanz                 Konkordanz - Diskordanz                Stumpf
Konsonanzempfindung                   Harmoniegefühl                         Stumpf
Konsonanz Objectiv, Konsonanzgefühl   Harmonie Subjektiv, Harmoniegefühl     Hennig
Dissonanz akustischer                 Auflösungbedürfnis                     Jonquière
Konsonanz - Dissonanz                 Konsonanz - Dissonanz                  Louis, Thuille
Dissonanzkonstatieren                 Dissonanzbehandlung                    Kurth
consonanza acustica                   consonanza armonica                    Gentili
Sensorial consonance                  Aesthetic consonance                   Guernsey
Einfach - Kompliziert                 Harmonieführung                        Deutsch
akustische Konsonanz                  musikalische Konsonanz                 Nüll

More important than euphony in Cazden's view is what we have called expectation dissonance:

"[Dissonance] identifies rather the functional moment of any sonorous event that is expected to resolve, while the moment to which it ultimately resolves is then deemed consonant. Should the framework for the normative expectations of this kind not be present, or should the apparent resolution tendencies and outcomes be thwarted consistently, as may happen in some compositional styles of twentieth century art music, neither consonance nor dissonance can be said to exist." (Cazden, 1980; p.157)

Cazden properly noted that "even a single tone may engender that urgent expectation of resolution that is the essence of dissonance." (p.157). Cazden argued that there are three levels of expectation-related dissonance:

1. Dissonating Tone. A nonharmonic or non-chordal tone has a tendency to resolve within the framework of an underlying chord or harmony.
2. Dissonant Chord Moment. A chord may be dissonant to the extent that it arouses the expectation of resolving to another chord within a harmonic progression.
3. Tonal Center Dissonance. A passage may retain a tonic or dominant tonal center, and the dissonance that arises is resolved when the dominant tonal area ultimately moves to the original tonic area.

Note that Cazden refers to his theory as a "systemic approach" to consonance/dissonance. Although he invokes the psychology of expectation to explain the origins of dissonance, Cazden says rather little about this important concept. For example, without explicitly saying so, Cazden implies that expectations are largely learned -- and so reflective of cultural milieu.

"The cultural relativity of the systemic approach may ... account negatively for some cross-cultural effects. For example, it may explain why listeners conditioned to Western music sense a somewhat directionless indecision of harmonic moment when they attend to the heterophonic gamelan music of Bali or to the highly mannered Gagaku music of Japan. Conversely, systemic habits undoubtedly engender bewilderment at the seeming irrelevance of Western harmonic relationships to the classical Karnatic music of India or to the drum ensemble music of Ghana." (Cazden, 1980; p.161)

Finally, Cazden argues that "the raw psychoacoustic or sonorous properties of the musical signal can provide at most certain limiting natural conditions for the art of music, just as there are broad natural limits and conditions for language" (p.161).

Other Research

The principal published literature on this question includes the following:

Cazden, N. (1945). Musical consonance and dissonance: A cultural criterion. Journal of Aesthetics and Art Criticism, Vol. 4, No. 1, pp. 3-11.
Cazden, N. (1960). Sensory theories of musical consonance. Journal of Aesthetics and Art Criticism, Vol. 20, No. 3, pp. 301-319.
Cazden, N. (1980). The definition of consonance and dissonance. International Review of the Aesthetics and Sociology of Music, Vol. 2, pp. 123-168.
Guernsey, M. (1928). The role of consonance and dissonance in music. American Journal of Psychology, Vol. 40, pp. 173-204.
Lundin, R.W. (1947). Toward a cultural theory of consonance. Journal of Psychology, Vol. 23, pp. 45-49.
Moran, H., & Pratt, C.C. (1926). Variability of judgments of musical intervals. Journal of Experimental Psychology, Vol. 9, pp. 492-500.

The work of Moran & Pratt (1926) and Guernsey (1928) documents the differences between musicians and non-musicians. Guernsey showed that musicians make a distinction between consonance and pleasantness.

Musicians vs. Non-Musicians

An argument can be made that the major learned difference between groups of listeners is not to be found in differences of cultural milieu. The most consistent difference between groups of listeners is between musicians and non-musicians.

Consonance and Dissonance - Effect of Personality

In the second instance, some individuals tend to seek out high levels of stimulation or arousal. Heavy Metal may be the musical equivalent of skydiving or eating highly spiced food. Physiologically, the body continues to exhibit symptoms (either slight or marked) of discomfort, disgust, or fear. However, the dangers involved are known by the listener to be less than implied by the alarm-bells ringing in the sensory system. In short, sensory dissonance may be a form of thrill-seeking.

It is true that people have different responses to identical stimuli. Some of these differences have been shown to be related to personality. For example, within roughly 4 seconds of hearing an unexpected tone, a listener's heart-rate typically does one of two things: (1) increases, or (2) decreases and then increases. The first response is indicative of a startle response or defense reflex. The second response is characteristic of an orienting response -- a response that typifies interest or openness to the stimulus (Graham, 1979). For two different individuals, the same stimulus might tend to evoke one or another of these responses.

Two studies have shown that such responses correlate with personality characteristics -- as measured by standard personality tests. Both Orlebeke and Feij (1979) and Ridgeway and Hare (1981) found that heart-rate deceleration-acceleration responses tend to occur most commonly for those listeners who score high on "sensation-seeking" personality characteristics. That is, those individuals exhibiting "thrill-seeking" or "sensation-seeking" personal dispositions are less likely to be startled or irritated by a sound, and more likely to be "open" or "inquisitive". Similar personality-linked differences have been observed in auditory evoked potentials -- see Zuckerman (1994).

As a final observation, we may note that thrill-seeking in sports is known to be linked to certain aspects of personality. There is even some evidence supporting a genetic basis for thrill-seeking dispositions. It is possible that responses to sensory dissonance (such as tolerance or seeking-out) may also be linked to personality. The carefree and rough image of the Heavy Metal fan may have concrete links to the nature of the musical sonorities. If personality is important in such musical tastes, we might predict that sky-divers would tend to favour music that has higher levels of sensory dissonance.

Another, more technical prediction arises from the above view. Recall that Simpson (1994) found a neural correlate in cochlear models that accounted for 58% of the variance in dissonance perceptions. There is room to account for further variance. We might predict that a better neural correlate would include a neural path that encodes information related to the magnitude of possible masking.

Why are Some Sounds Ugly?


As we have seen, there are many ways of conceptualizing dissonance: as pleasant, unpleasant, euphonious, beautiful, ugly, rough, smooth, fused, diffuse, tense, relaxed, etc. Dissonance is foremost a negative valence emotional response. We hear some sound and experience a sense of ugliness or repulsiveness. We might well understand why various sounds (like the cries of someone in pain) would evoke certain emotional responses. But how is it that combinations of simple sine tones can evoke experiences of pleasantness or unpleasantness? How is it that some sounds can be experienced as sounding "bad"?

Nesse (1991) draws our attention to why pain (both physical and psychological) is important from an evolutionary point of view. From a survival point of view, tissue damage is one of the worst things that can happen to an organism. The physical pain that ensues (such as from stubbing one's toe) provides a strong incentive for an individual to be careful in situations where physical damage is possible. Similarly, Nesse points out that psychological pain (such as feelings of sadness due to loss) is also an important evolutionary adaptation. A person who does not feel some anxiety in the presence of their boss is in danger of behaving in inappropriate and maladaptive ways. Fear, sadness, anxiety, anger, and other emotions serve important functions.

Research on consonance and dissonance has tended to focus solely on the causes. Accordingly we ask questions such as: does dissonance arise from complex frequency ratios? From beating? From critical band interactions? An alternative approach is to focus on the purpose of dissonance: what is dissonance for? Why do we experience some sounds as more pleasant than others? Why isn't our experience the reverse? Why don't we experience nominally dissonant sounds as pleasant and nominally consonant sounds as unpleasant?

Gibsonian Interpretations of Dissonance


A Gibsonian approach to dissonance allows us to reinterpret possible sources (causes) of dissonance:

1. Tonotopic Dissonance. The component of dissonance that arises when pure tones are separated by roughly 40% of a critical band. (Greenwood, 1961; Plomp & Levelt, 1965; Kameoka & Kuriyagawa, 1969a, 1969b).
2. Temporal Dissonance. The component of dissonance that arises due to rapid beating or amplitude fluctuations. (Helmholtz, 1877).
3. Virtual Pitch Dissonance. The component of dissonance that arises from competing (unclear) virtual pitches. (Terhardt, 1974).
4. Expectation Dissonance. The component of dissonance that arises due to the thwarting or delaying of a (learned) expectation. According to this view, "even a single tone may engender that urgent expectation of resolution that is the essence of dissonance." (Cazden, 1980; p.157) Cazden argued that there are three levels of expectation-related dissonance: Dissonating Tone, where a nonharmonic or non-chordal tone has a tendency to resolve within the framework of an underlying chord or harmony; Dissonant Chord Moment, where a chord may be dissonant to the extent that it arouses the expectation of resolving to another chord within a harmonic progression; and Tonal Center Dissonance, where a passage may retain a tonic or dominant tonal center, and the dissonance that arises is resolved when the dominant tonal area ultimately moves to the original tonic area.
5. Interval Category Dissonance. The component of dissonance that arises when two pitches form an interval that is categorically ambiguous for a listener. That is, where the interval lies near a learned categorical boundary.
6. Absolute Pitch Category Dissonance. The component of dissonance that arises when a pitch is categorically ambiguous for a listener possessing absolute pitch. That is, where the pitch lies close to a learned categorical boundary.
7. Stream Incoherence Dissonance. The component of dissonance that arises due to confusion regarding streaming. (Wright & Bregman, 1987).
8. Relative Dissonance. The component of dissonance that arises from the context of dissonant successions. A sonority might sound relatively consonant when it is preceded by other sonorities that are highly dissonant.
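The first two items in the list above lend themselves to a small numerical illustration. The sketch below estimates, for a given centre frequency, the separation between two pure tones at which tonotopic dissonance would peak, using the text's figure of roughly 40% of a critical band. It assumes the critical band can be approximated by the equivalent rectangular bandwidth (ERB) polynomial attributed to Moore & Glasberg (1983), ERB = 6.23f² + 93.39f + 28.52 Hz with f in kHz. Both that formula's coefficients and the function names are assumptions for illustration and should be checked against the original paper. The beat rate between two pure tones is simply their frequency difference.

```python
def erb_hz(center_hz):
    """Approximate critical bandwidth (ERB) in Hz at a centre frequency.

    Assumption: polynomial attributed to Moore & Glasberg (1983),
    with the centre frequency expressed in kHz.
    """
    f_khz = center_hz / 1000.0
    return 6.23 * f_khz ** 2 + 93.39 * f_khz + 28.52


def peak_dissonance_separation_hz(center_hz, fraction=0.4):
    """Frequency separation at which two pure tones would be maximally
    rough, taking the text's figure of roughly 40% of a critical band."""
    return fraction * erb_hz(center_hz)


def beat_rate_hz(f1_hz, f2_hz):
    """Beating (amplitude fluctuation) rate between two pure tones."""
    return abs(f1_hz - f2_hz)


if __name__ == "__main__":
    for center in (125.0, 250.0, 500.0, 1000.0, 2000.0):
        sep = peak_dissonance_separation_hz(center)
        print(f"{center:7.1f} Hz: ERB {erb_hz(center):6.1f} Hz, "
              f"peak roughness near {sep:5.1f} Hz separation")
```

Because the bandwidth grows with frequency, the same musical interval spans a larger fraction of a critical band in the bass than in the treble, which is one reading of why close intervals sound rougher in low registers.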

Consonance and Dissonance - Effect on Musical Organization


In outlining the various theories of consonance and dissonance above, we have overlooked musical practice. If consonance and dissonance are important in music, we ought to be able to use music itself as a test for the various theories. The extant literature shows that music is indeed organized in a manner consistent with the empirical research on consonance and dissonance.

Preference for Clear Pitches. If composers endeavor to avoid virtual pitch dissonance, then this avoidance should be reflected in the tendency to use harmonic complex tones and sonorities approximating the harmonic series.

Frequency of Occurrence of Harmonic Intervals. If composers endeavor to avoid sensory dissonance, then this avoidance should be reflected in the frequency of occurrence for various harmonic intervals. We would expect that dissonant intervals (such as minor seconds and major sevenths) would be less common than consonant intervals (such as perfect octaves and major sixths). Huron (1991) showed that there is a strong negative correlation between frequency of occurrence and degree of dissonance for concurrent dyads in the music of J.S. Bach.

Chordal-tone Spacing. If composers endeavor to avoid tonotopic dissonance, then multitone chords ought to have wider intervals in the bass region. More specifically, the spacing between chordal tones ought to typically result in an even spacing of spectral components with respect to critical bands. Huron and Sellmer (1992) compared three scales (frequency, log frequency, and critical bands) and showed that chordal tone spacing correlates best with critical band spacing.

Preferred Chord Arrangements. If composers endeavor to avoid sensory dissonance, then this avoidance should be reflected in the preference for certain chord arrangements. Hutchinson and Knopoff calculated the dissonance for various arrangements of four-note major chords. Some arrangements are less dissonant than others. Huron (MS) counted the number of various major chord structures in J.S. Bach's chorale harmonizations. A significant negative correlation was found between the dissonance of chord arrangements and the frequency of occurrence for these arrangements. This correlation is independent of pitch height and so largely unaffected by critical bands.

Preferred Musical Scales. If composers endeavor to avoid sensory dissonance, then this avoidance should be reflected in the preference for certain types of musical scales. Aggregate dissonance for common scales is low compared with other possible scales. Common scales exhibit optimum sensory consonance (Huron, 1994).

Preferred Drone Tones. If composers endeavor to avoid sensory dissonance, then this avoidance should be reflected in the preference for particular drone tones -- given a particular scale. The most consonant "drone" pitches appear to occur more frequently. Examples include Korean music (Huron MS, Nam) and Gregorian chant (Huron MS).

Evaluating the Existing Research

Although anecdotal and case-based research can provide an important source of hypotheses, the phenomenon of consonance and dissonance has developed a sufficient complexity that the only way forward is through careful experimentation. Here we will focus on evaluating different experiments. Over the past century or so, a number of experiments have been carried out regarding consonance and dissonance. In making sense of these experiments, we might consider asking the following questions:

1. How many subjects were used?
2. What is the cultural background of the subjects?
3. Were the subjects musicians or non-musicians? Male or female?
4. Did the experiment use simple or complex tones? If simple tones were used, were they likely to have low distortion? If complex tones were used, is the spectral content well described? Did the experiment use complex non-harmonic tones?
5. Did the experiment use only tone dyads? Or were triads and tetrads used?
6. Did the stimuli span a wide frequency range?
7. Was the effect of loudness measured?
8. If a wide frequency range was used, was there any attempt to compensate for threshold changes?
9. How did the subjects rate the stimuli? That is, what question were they asked?
10. Were the subjects' responses examined for reliability? Were test/retest correlations calculated?
11. How is the data presented? Do we see means only? Median values? Any indication of the data spread or variance?

Work Remaining
Further work on the psychoacoustics and physiology of dissonance is needed. All of the important psychoacoustic theories (such as Tonotopic Dissonance, Difference Tones, and Temporal Dissonance) need extensive testing on non-Western listeners. All of the important exposure or learning-based theories (such as Expectation Dissonance, Interval Category Dissonance, and Absolute Pitch Category Dissonance) need to demonstrate the effect of different cultures. Historically, there has been a strong tendency for researchers to accept only a single theory regarding consonance/dissonance. New research programs are needed that begin from the premise that more than one theory is correct. For example, we need to investigate a possible duplex perception of dissonance where both tonotopic and temporal (e.g. beating) factors contribute to dissonance.

Relationship to personality, etc. Contrary to popular opinion, most theories predict several "good" scales as optimum for consonant music. More in-depth studies are required of scales in different cultures so that proper comparisons with theory can be made. Since timbre is known to affect consonance/dissonance, theories should be able to predict the kinds of timbres that are commonly used in music. We should be able to test whether the preferred musical timbres coincide with particular theoretical predictions. Similarly, we should be able to test whether instrument combinations or patterns of orchestration coincide with particular theoretical predictions. Working out the effect of streaming on dissonance (Wright/Bregman hypothesis).

Consonance and Dissonance - Tonotopic Theory


Greenwood (1961) was the first to observe a relationship between critical bandwidth and judgements of consonance/dissonance. Specifically, Greenwood (1961) plotted data from Mayer (1894) against the estimated size of the critical band. Mayer had collected data where listeners were asked to judge the smallest consonant interval between two pure tones. That is, they were asked to judge the minimum frequency difference where no dissonance was perceived. Greenwood pointed out that dissonance is judged as absent when the distance between pure tones is roughly the size of a critical bandwidth.

In Figure 9 of Plomp & Levelt's 1964 article, Mayer's 1894 data is again plotted against the critical bandwidth. In this case, Plomp & Levelt used an estimate of the critical bandwidth given by Zwicker, Flottorp & Stevens (1957). There are some discrepancies evident in both the higher and lower frequency ranges. However, the size of the critical band proposed by Zwicker et al. is now considered excessively large -- especially in the bass region. A better estimate of the size of critical bands is given in Moore & Glasberg (1983). However, it is now thought that Greenwood's original 1961 estimate of the critical bandwidth is slightly better than even Moore and Glasberg's ERB estimates.

There are a number of questions arising from Mayer's 1894 data. Mayer used tuning forks to generate his stimuli. This raises questions about the purity of the purported pure tones. In addition, Mayer's lowest tuning fork had a frequency of 256 Hz. Mayer asked a friend, R. König, to carry out some additional measures at low frequencies. Mayer's data was collected from 12 subjects, whereas König's data was collected from a single subject.

In 1968, Plomp and Steeneken reported on a modern replication of Mayer's (1894) measures. They collected data from 20 listeners. Subjects adjusted the frequency of the higher of two sine-wave oscillators. In one condition, subjects were asked to adjust the higher tone as close as possible to the fixed tone so that "the two tones did not interfere and could be heard separately." The graph below plots the median values against Greenwood's equation estimating the width of critical bands.

If the tonotopic theory of sensory dissonance is correct, then one would predict that intervals formed by pitches presented to separate ears would evoke no dissonance. Sandig (1939) found that playing each tone of an interval to a separate ear results in a more "neutral" sounding interval.

Hutchinson and Knopoff


William Hutchinson and Leon Knopoff (UCLA Music Dept.) published two papers (1978, 1979) on consonance and dissonance. Their goal was to generate numerical tables estimating the perceived dissonance for typical tone pairs (1978) and triads (1979). The first thing to note about Hutchinson and Knopoff is that they were completely unaware of the work of Kameoka and Kuriyagawa (1969a,b).

Hutchinson and Knopoff begin with a fundamental misconception of Plomp and Levelt. They regard their own work as "an extension of the Helmholtz-Plomp and Levelt model of beating as the cause of dissonance." (1978, p.1) They misinterpret Plomp and Levelt as follows: "Of the many extensions of Helmholtz's research, perhaps the most recent and comprehensive is that of Plomp and Levelt (1965). These authors reject a variety of other descriptions of consonance and dissonance and reaffirm that the absence of rapid beating is the physical correlate of Western common practice consonance." (1978, p.2) There is nothing in the Plomp and Levelt approach that takes into account beating.

Another misunderstanding arises with respect to the origin of the critical band. Hutchinson and Knopoff write: "By methods of psychological testing, Plomp and Levelt have determined that the critical bandwidth, within which one hears dissonances, is not a constant fraction of the mean frequency of the two tones. Instead, this fraction is smaller for high mean frequencies and larger for lower mean frequencies." (1978, pp.4-5) The critical band was posited by S.S. Stevens, and measured by Zwicker.

In calculating the dissonance for dyads and triads, Hutchinson and Knopoff miss an opportunity to note that virtually all triads are more dissonant than virtually all dyads. They probably didn't discover this because their equations normalize the dissonance values. "As a matter of computational convenience, we have "fudged" the frequencies of the overtones of any fundamental to the well-tempered scale." (1978, p.7) Fortunately, they avoid the Zwicker CBW curve and use the data from Cross and Goodwin, and from Mayer.
They fit their own curve to the CBW as follows: CBW = 1.72 (f)^0.65 (compare with Greenwood). They generate dissonance values for various dyads using their equation. They use complex tones consisting of 10 equally tempered harmonics.
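As a rough illustration of the fitted curve, the sketch below evaluates CBW = 1.72 f^0.65 at a few frequencies and reports the bandwidth as a fraction of f. It assumes f is the mean frequency of the two tones in Hz, which the text implies but does not state; the function names are hypothetical.

```python
def hk_critical_bandwidth(f_hz):
    """Hutchinson and Knopoff's fitted curve, CBW = 1.72 * f**0.65.

    Assumption: f_hz is the mean frequency of the tone pair, in Hz.
    """
    return 1.72 * f_hz ** 0.65


def cbw_fraction(f_hz):
    """Critical bandwidth expressed as a fraction of the frequency itself."""
    return hk_critical_bandwidth(f_hz) / f_hz


if __name__ == "__main__":
    for f in (100, 250, 500, 1000, 2000, 4000):
        print(f"{f:5d} Hz  CBW = {hk_critical_bandwidth(f):7.1f} Hz  "
              f"fraction = {cbw_fraction(f):.3f}")
```

Because the exponent is below 1, the fraction CBW/f shrinks as frequency rises, which matches the quoted remark that the fraction is smaller for high mean frequencies and larger for low ones.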

In Hutchinson and Knopoff (1979), the authors turn their attention to triads. They show that a first inversion chord is more dissonant than a root position chord (having the same average pitch). The second inversion chord is slightly more consonant than the root position chord. (Terhardt might have an explanation for why the 2nd inversion chord sounds less "good".) "All other things being equal, the acoustic rank ordering from the most consonant to the most dissonant for a major triad is: second inversion, root position and first inversion; for the minor triad it is: first inversion, root position, and second inversion." (1979, p.9) Notice that these orderings are simply a consequence of larger critical bandwidths for lower frequencies. In general, wider intervals between the lowest notes of a chord will generate less sensory dissonance.

Simpson (1994)
Further evidence in support of the tonotopic theory of sensory dissonance is found in the work of Jasba Simpson (1994). Simpson made use of contemporary computer models of the operation of the cochlea. Into these models, Simpson input the stimuli used in five perceptual experiments where listeners judged the degree of consonance or dissonance. Simpson then explored the outputs of the cochlear models to determine whether there were any neurophysiological responses that correlated with the consonance/dissonance judgments. Simpson found that the squared distance of the maximums and minimums from the mean of the maximum and minimums accounted for 58% of the variance in the stimuli used in the five experiments. In effect, the squared distance measure used by Simpson amounts to a measure of tonotopic spread. Simpson concluded that his findings suggest that dissonance cues are available at the periphery of the auditory system -- in keeping with other extant experimental literature (Sandig, 1939). Simpson's complete thesis is available online. An implementation of the Kameoka and Kuriyagawa method for estimating sensory dissonance for any arbitrary spectrum is available.

Relating Tuning and Timbre

by William A. Sethares

This is the full text of the article (more or less) as it first appeared in Experimental Musical Instruments. It was the catalyst for much of the work that resulted in Tuning Timbre Spectrum Scale, and it contains links to computer programs that will make it easy for you to draw dissonance curves yourself. James Forrest has recently created a Java applet for interactive exploration of dissonance curves.

"Clearly the timbre of an instrument strongly affects what tuning and scale sound best on that instrument" - W. Carlos
Introduction

If you've ever attempted to play music in weird tunings (where "weird" means anything other than 12 tone equal temperament), then you've probably noticed that certain timbres (or tones) sound good in some scales and not in others. 17 and 19 tone equal temperament are easy to play in, for instance, because many of the standard timbres in synthesizers sound fine in these tunings. I remember when I first played in 16 tone. I had to audition hundreds of sounds before I found a few good timbres. When I tried to play in 10 tone, though, none of the timbres in my synthesizers sounded good. This article explains why this happens, and shows how to design timbres and scales that complement each other. This suggests a way to design new musical instruments with unusual timbres that can play consonantly in unusual scales. The principle of local consonance describes a relationship between the timbre of a sound and a tuning (or scale) in which the timbre will sound most consonant. The principle answers two complementary questions. Given a timbre, what scale should it be played in? Given a scale, how can consonant timbres be chosen? The ability to answer such questions will likely impact the way we design new musical instruments. The presentation begins in the next section with an overview of the work of several acousticians, who have shown that people consistently judge the consonance of intervals composed of pure sine waves. These judgements are averaged into a "consonance curve" which is used to calculate the consonance of complex timbres. The results of such calculations agree well with the normal (musical) notion of consonance when applied to harmonic timbres.

Thus unisons, octaves, fifths and fourths are highly consonant while seconds and sevenths are relatively dissonant. Of course, this measure of consonance can also be applied to other (nonharmonic) timbres, and the succeeding sections show how to design timbres and scales. Several concrete examples follow, including finding scales for nonharmonic timbres (the natural resonances of a uniform beam, "stretched" and "compressed" timbres, FM timbres with noninteger carrier to modulation ratios), and finding timbres for equal tempered scales. This article is a less technical presentation of my paper "Local Consonance and the Relationship Between Timbre and Scale," which contains the mathematical details.
What Exactly is Consonance?

The standard musicological definition (see your favorite dictionary) is that a musical interval is consonant if it sounds pleasant or restful; a consonant interval has little or no musical tension or tendency to change. Dissonance, on the other hand, is the degree to which an interval sounds unpleasant or rough; dissonant intervals generally feel tense and unresolved. In On the Sensations of Tone, Helmholtz offers a physiological explanation for consonance that is based on the phenomenon of beats. If two tones are sounded at almost the same frequency, then a distinct beating occurs that is due to interference between the two tones (piano tuners use this effect regularly). The beating becomes slower as the two tones move closer together, and completely disappears when the frequencies are identical. Typically, slow beats are perceived as a pleasant vibrato while fast beats tend to be rough and annoying. Recalling that any timbre can be decomposed into sine wave components, Helmholtz theorized that dissonance between two tones is caused by the rapid beating of various sine wave components. Consonance, according to Helmholtz, is the absence of such dissonant beats. More recently, Plomp and Levelt (full references are at the end) examined consonance experimentally, by generating pairs of sine waves and asking volunteers to rate them in terms of their relative consonance. Despite considerable variability among the responses, there was a simple and clear trend. At unison, the consonance was maximum. As the interval increased, it was judged less and less consonant until at some point a minimum was reached. After this, the consonance increased towards, but never quite reached, the consonance of the unison. Plomp and Levelt called this tonal consonance, to distinguish it from musical consonance and from Helmholtz' beat theory.

The above figure shows an averaged version of the dissonance curve (which is simply the consonance curve flipped upside-down) in which dissonance begins at zero (at an "interval" of a unison) increases rapidly to a maximum, and then falls back towards zero. The most surprising feature of this curve is that the musically consonant intervals are undistinguished - there is no dip in the curve at the fourth, fifth, or even the octave.
Figure 2: The standard harmonic timbre used to generate the dissonance curve of figure 3. Amplitudes fall at a rate of 0.88. The frequency axis is normalized so that the root frequency is unity.

To explain perceptions of musical intervals, Plomp and Levelt note that most traditional musical tones have a spectrum consisting of a root or fundamental frequency, and a series of sine wave partials that occur at integer multiples of the fundamental. Figure 2 depicts one such timbre. If this timbre is sounded at various intervals, the dissonance of the intervals can be calculated by adding up all of the dissonances between all pairs of partials. Carrying out this calculation for a range of intervals leads to the dissonance curve. For example, the dissonance curve formed by the timbre of figure 2 is shown below in figure 3.

Observe that this curve contains major dips at many of the intervals of the 12 tone equal tempered scale. The most consonant interval is the unison, followed closely by the octave. Next is the fifth, followed by the fourth, the major third, the major sixth, and the minor third. These agree with standard musical usage and experience. Looking at the data more closely shows that the minima do not occur at exactly the scale steps of the 12 tone equal tempered scale. Rather, they occur at the "nearby" simple ratios 1:1, 2:1, 3:2, 4:3, 5:4, 5:3, and 6:5 respectively, which are exactly the locations of notes in the "justly intoned" scales (see Wilkinson). Thus an argument based on tonal consonance is consistent with the use of just intonation (scales based on intervals with simple integer ratios), at least for harmonic timbres. Perhaps the most striking aspect of figure 3 is that most of the scale steps of the major scale are roughly coincident with local minima of the dissonance curve. Thus the ear perceives intervals which occur at points of local minima in the dissonance curve as relatively consonant. This observation forms the basis of the principle of local consonance: A timbre and a scale are said to be related if the timbre generates a dissonance curve whose local minima occur at scale positions.

This notion of relatedness of scales and timbres suggests two interesting avenues of investigation. Given an arbitrary timbre T (perhaps one whose spectrum does not consist of a standard harmonic series), it is straightforward to draw the dissonance curve generated by T. The local minima of this curve occur at values which are good candidates for notes of a scale, since they are local points of minimum dissonance (i.e. maximum consonance). This might be useful to the experimental musician. Imagine being in the process of creating a new instrument with an unusual (i.e., non-harmonic) tonal quality. How should the instrument be tuned? To what scale should the finger holes (or frets, or whatever) be tuned? The principle of local consonance answers this question in a concrete way.

Alternatively, given a desired scale (perhaps one which divides the octave into n equal pieces, or one which is not based on the octave), there are timbres which will generate a dissonance curve with local minima at precisely the scale degrees. This is useful to musicians and composers who wish to play in nonstandard scales such as 10 tone equal temperament.

As the opening quote indicates, this is not the first time that the relationship between timbre and scale has been explored. Pierce's brief note reported synthesizing a timbre designed specifically to be played in an 8 tone equal tempered scale. Pierce concludes, "... by providing music with tones having accurately specified but nonharmonic partials, the digital computer can release music from the tyranny of 12 tones without throwing consonance overboard." Slaymaker investigated timbres with stretched (and compressed) partials, and Mathews and Pierce explored their potential musical uses. Recently, Mathews and Pierce examined a scale with steps based on the thirteenth root of three, rather than the standard twelfth root of two, which is designed to be played with timbres containing only odd partials. Carlos investigated scales for nonharmonic timbres by overlaying their spectra and searching for intervals in which partials coincide, thus minimizing the beats (or roughness) of the sound. This is similar to the present approach, but we provide a systematic technique that can be used to find scales for a given timbre, or to find timbres for a given scale.

It would be naive to suggest that truly musical properties can be measured as a simple tonal consonance. Even in the realm of harmony (and ignoring musically essential aspects such as melody and rhythm), consonance is not the whole story. Indeed, a harmonic progression that was uniformly consonant would likely be boring. Harmonic interest arises from a complex interplay of dissonance (restlessness) and consonance (rest). Perhaps the most important use of the principle of local consonance is to provide guidelines for exploring new tonalities and tunings.
How to Calculate Dissonance Curves

If you're thinking that there must be a lot of calculations necessary to draw dissonance curves, you're right. It's an ideal job for a computer. Those familiar with MATLAB, BASIC or related computer languages may wish to look at the program. The program works by encapsulating the Plomp-Levelt consonance curve into a mathematical function that consists of a sum of exponentials. The i and j loops calculate the dissonance of the timbre at a particular interval alpha, and the alpha loop runs through all the intervals of interest. The first few lines set up the frequencies and amplitudes of the timbre. The variable n must be equal to the number of frequencies in the timbre. Running the program as is generates the dissonance data of figure 3 for the timbre of figure 2. To change the start and end points of the intervals, use startint and endint. To make the intervals further apart, increase inc. All the dissonance values are stored in the vector diss. Don't change dstar or any of the variables with numbers.

Fortunately, there are some general patterns in the ways that dissonance curves can look. Let's examine a simple timbre with just two partials. As shown in figure 4, the dissonance curve can have three different contours: if the partials are very close together then there are no points of local consonance, if the partials are widely separated then there are two local minima, and if they are in between then there is just one. Using the program, you can reproduce these curves (or, of course, generate your own). Set n=2 and freq(1)=500, freq(2)=505, amp(1)=10, amp(2)=10. This gives figure 4(a), where the partials are too close to allow a point of local consonance. Setting freq(2)=1.15*500 shows that the point of local consonance occurs at an interval of 1.15, as in 4(b). Finally, setting freq(2)=1.86*500 gives 4(c), with two points of local consonance. The steep minimum occurs at an interval of 1.86. Notice that the second minimum is shallow, and is a result of the large distance between the partials of the timbre.
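The program itself is not reproduced in this text, so the following Python sketch only mirrors its described structure: a sum-of-exponentials fit to the Plomp-Levelt curve, i and j loops over partials, and an outer loop over alpha. The variable names freq, amp, startint, endint, inc, diss and dstar come from the description above. The numeric constants, and the choice to weight each pair by the product of its amplitudes, are assumptions recalled from Sethares' published parameterization rather than values stated here.

```python
import math

# Assumed curve-fit constants (not given in this text); dstar is the point
# of maximum dissonance, s1 and s2 scale it with frequency.
DSTAR, S1, S2 = 0.24, 0.0207, 18.96
C1, C2, A1, A2 = 5.0, -5.0, -3.51, -5.75


def pair_dissonance(f1, f2, amp1, amp2):
    """Dissonance of two sine partials as a difference of two exponentials."""
    s = DSTAR / (S1 * min(f1, f2) + S2)
    fdif = abs(f1 - f2)
    # Assumption: the pair is weighted by the product of its amplitudes.
    return amp1 * amp2 * (C1 * math.exp(A1 * s * fdif)
                          + C2 * math.exp(A2 * s * fdif))


def dissonance_curve(freq, amp, startint=1.0, endint=2.3, inc=0.01):
    """Return (alphas, diss): dissonance of the timbre against itself at each interval."""
    n = len(freq)
    steps = int(round((endint - startint) / inc)) + 1
    alphas = [startint + k * inc for k in range(steps)]
    diss = []
    for alpha in alphas:                      # the alpha loop
        d = 0.0
        for i in range(n):                    # the i and j loops
            for j in range(n):
                d += pair_dissonance(alpha * freq[i], freq[j], amp[i], amp[j])
        diss.append(d)
    return alphas, diss


def local_minima(alphas, diss):
    """Intervals whose dissonance is lower than both grid neighbours."""
    return [alphas[k] for k in range(1, len(diss) - 1)
            if diss[k] < diss[k - 1] and diss[k] < diss[k + 1]]


if __name__ == "__main__":
    # Harmonic timbre in the spirit of figure 2: seven partials whose
    # amplitudes fall by a factor of 0.88 (root frequency is an assumption).
    freq = [500.0 * k for k in range(1, 8)]
    amp = [0.88 ** k for k in range(7)]
    alphas, diss = dissonance_curve(freq, amp)
    print([round(a, 2) for a in local_minima(alphas, diss)])
```

To try the two-partial experiment described above, call `dissonance_curve([500.0, 575.0], [10.0, 10.0])` and inspect `local_minima`; with these assumed constants a minimum should appear at the interval 1.15.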

You can listen to figure 4 with a synthesizer or tone generator. First, find a tone that is as close to a sine wave as possible. (If using a sample based machine without such a humble waveform, try an organ or flute sample). Assign two tones to each keypress, one at frequency f, and one at a major seventh above f. (A major 7th is an interval of 1.86, just as in 4(c)). Listen to the consonance of the various intervals in this timbre. The first few are very rough. The next few are somewhat aharmonic, but not unpleasant. Then the dissonance rises and plummets quickly, at the interval of 1.86. The octave, at an interval of 2, sounds very dissonant and unoctavelike. For this timbre, the major 7th plays the role normally occupied by the octave, at least in terms of consonance. This is something you can hear for yourself.
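For readers without a synthesizer, the listening test can be approximated in software. This is a minimal sketch using only the Python standard library; the root frequency, duration, file names and helper names are all illustrative assumptions.

```python
import math
import struct
import wave


def two_partial_tone(f, ratio=1.86, seconds=1.0, rate=44100):
    """Samples in [-1, 1] of a tone with sine partials at f and ratio * f."""
    n = int(seconds * rate)
    return [0.5 * (math.sin(2 * math.pi * f * t / rate)
                   + math.sin(2 * math.pi * ratio * f * t / rate))
            for t in range(n)]


def write_wav(path, samples, rate=44100):
    """Write mono 16-bit PCM audio."""
    frames = b"".join(struct.pack("<h", int(32767 * 0.8 * s)) for s in samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)


if __name__ == "__main__":
    root = 220.0  # assumed root frequency
    for name, interval in (("interval_1_86", 1.86), ("interval_2_00", 2.0)):
        lower = two_partial_tone(root)
        upper = two_partial_tone(root * interval)
        write_wav(name + ".wav", [0.5 * (a + b) for a, b in zip(lower, upper)])
```

Playing the two files back to back lets you compare the interval of 1.86 with the true octave for this timbre.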
Properties of Dissonance Curves

Here are some general properties of dissonance curves. Suppose that the timbre F has n partials located at frequencies (f1, f2, ... , fn).
Property 1: The unison is the global minimum (the lowest possible value of the dissonance curve). All other minima are local.
Property 2: As the interval grows larger, the dissonance must approach a value that is no more than the intrinsic dissonance of the timbre.
Property 3: The dissonance curve generated by F has at most 2n(n-1) local minima which are located symmetrically (on a logarithmic scale) so that half occur for intervals between 0 and 1, and half occur for intervals between 1 and infinity.
Property 4: Up to half of the local minima occur at intervals a for which a=fi/fj where fi and fj are arbitrary partials of F. Up to half of the local minima are the shallow type of figure 4(c).

The fourth property is particularly interesting because it says that points of local consonance tend to occur at intervals which are simply defined by the partials of the timbre. In figures 4(b) and 4(c), for instance, local minima are found at a=1.15 and a=1.86 respectively, which is the ratio between the two partials. The musically useful information is usually contained in intervals between 1/m and m for some small m. The shallow minima tend to vanish for timbres with more than a few partials. Figure 3, for instance, consists exclusively of local minima caused by coinciding partials. Thus, dissonance curves usually have fewer than 2n(n-1) local minima. In figure 3, for instance, there are only 7 local minima within the octave of interest, considerably fewer than the theoretical maximum of 84. It is possible to achieve the bound. For instance, the timbre (f,2f,3f) over the range 0<a<6 exhibits all 12 possible minima.
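Properties 3 and 4 can be checked with a few lines of arithmetic. The sketch below computes the bound 2n(n-1) and lists the candidate intervals fi/fj that fall within one octave; the function names are hypothetical, and partials are given as harmonic numbers so that ratios stay exact.

```python
from fractions import Fraction


def max_local_minima(n):
    """Upper bound from property 3 for a timbre with n partials."""
    return 2 * n * (n - 1)


def candidate_intervals(partials, low=1, high=2):
    """Distinct ratios fi/fj (property 4) lying in [low, high], sorted."""
    ratios = {Fraction(fi, fj) for fi in partials for fj in partials}
    return sorted(r for r in ratios if low <= r <= high)


if __name__ == "__main__":
    print(max_local_minima(7), max_local_minima(3))
    for r in candidate_intervals(range(1, 8)):
        print(f"{r.numerator}:{r.denominator} = {float(r):.3f}")
```

For seven harmonic partials the bound is 84 and for three it is 12, as quoted above. The candidate list is only where minima may occur; whether each ratio produces an actual dip depends on the amplitudes.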
From Timbre to Scale

This section constructs examples of scales appropriate for a variety of timbres, and explains various consonance related phenomena in terms of the principle of local consonance.

Harmonic Timbres

The points of local consonance for the harmonic timbre with partials at (f, 2f, ... , 7f ) are located at simple integer ratios. The results of the previous section explain this elegantly. Candidate points of local consonance are at intervals a for which fi = a fj. Since the partials are at integer multiples of f, a=n/m for integers n and m between 1 and 7. The principle of local consonance says that the most appropriate scale tones for harmonic timbres are located at such a, and indeed, all the points of local consonance of figure 3 occur at such values. The following table compares intervals in the 12-tet scale, intervals in the just major scale, and minima of a dissonance curve drawn for a timbre with nine harmonic partials.
Notes of the equal tempered musical scale compared to minima of the dissonance curve for a 9 partial harmonic timbre, and compared to the Just Intonation major scale from Wilkinson

Note   12-tet Interval   Minima of Dissonance Curve                               Just Intervals
C      1.0               1.0                                                      1:1    unison
C#     1.059                                                                      16:15  just semitone
D      1.122             1.14 (8:7 = sept. maj. 2)                                9:8    just whole tone
D#     1.189             1.17 (7:6 = sept. min. 3), 1.2 (6:5)                     6:5    just min. 3
E      1.26              1.25 (5:4)                                               5:4    just maj. 3
F      1.335             1.33 (4:3)                                               4:3    just perfect 4
F#     1.414             1.4 (7:5 = sept. tritone)                                45:32  just tritone
G      1.498             1.5 (3:2)                                                3:2    perfect 5
G#     1.587             1.6 (8:5)                                                8:5    just min. 6
A      1.682             1.67 (5:3)                                               5:3    just maj. 6
A#     1.782             1.75 (7:4 = sept. min. 7), 1.8 (9:5 = large just min. 7) 16:9   just min. 7
B      1.888                                                                      15:8   just maj. 7
C      2.0               2.0                                                      2:1    octave

In a sense, this provides a psychoacoustic basis for justly intoned scales. In terms of tonal consonance, the ear is fairly insensitive to small deviations in frequency, and the 12 tone equal tuning can be viewed as an acceptable compromise between the consonance based desire to play in justly intoned scales and the practicalities of instrument standardization.

Stretched and Compressed Timbres

Slaymaker and Mathews and Pierce have investigated timbres with partials at fj = f A^(log2 j), where the log is taken base 2. When A=2, this is simply a harmonic timbre, since fj = f 2^(log2 j) = j f. When A<2, the frequencies of the timbre are compressed, while when A>2, the partials are stretched. The most striking aspect of compressed and stretched timbres is the lack of a real octave. This can be seen clearly from the dissonance curves, which are plotted in the four panels of the figure for A=1.87, 2.0, 2.1, and 2.2 respectively. In each case, the frequency ratio A plays the role of the octave, which Mathews and Pierce call the pseudo octave. Real octaves sound dissonant and unresolved when A is different from 2 while the pseudo octaves are highly consonant. More importantly, each curve has a similar contour. Points of local consonance occur at (or near) the twelve equal steps of the pseudo octaves. "Pseudo-fifths," "pseudo-fourths," and "pseudo-thirds" are readily discernible. This suggests that much of music theory and practice can be transferred to compressed and stretched timbres, when played in compressed and stretched scales.
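The stretched-partial formula and the matching pseudo-octave scale are easy to tabulate. The sketch below assumes twelve equal steps per pseudo octave, as described above; the function names and the root frequency are illustrative.

```python
import math


def stretched_partials(f, A, n):
    """Partials at f * A**log2(j) for j = 1..n; A = 2 gives the harmonic series."""
    return [f * A ** math.log2(j) for j in range(1, n + 1)]


def pseudo_octave_scale(A, steps=12):
    """Interval ratios dividing the pseudo octave A into equal steps."""
    return [A ** (k / steps) for k in range(steps + 1)]


if __name__ == "__main__":
    for A in (1.87, 2.0, 2.1, 2.2):
        partials = stretched_partials(100.0, A, 6)
        print(A, [round(p, 1) for p in partials])
    print([round(r, 4) for r in pseudo_octave_scale(2.1)])
```

Note that the second partial always sits at A times the root, which is why the pseudo octave, not the true octave, is the interval at which partials line up.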

A Xylophone Tuning

It is well known that xylophones, and other instruments which consist of beams with free ends, have partials which are not harmonically related. The principle of local consonance suggests that there is a natural scale, defined by the timbre of the xylophone, in which it will sound most consonant. The first seven frequencies of an ideal beam which is free to vibrate at both ends are given by Fletcher and Rossing as f, 2.758f, 5.406f, 8.936f, 13.35f, 18.645f, 24.82f. Two octaves of the dissonance curve for this timbre are shown below. The curve has numerous minima which are spaced unevenly at the frequencies shown.

This suggests that these would be the most natural sounding tunings for a xylophone, at least in terms of consonance.

Tuning for FM Timbres

One common method of sound synthesis is frequency modulation (FM) (see Chowning). Noninteger ratios of the carrier and modulating frequencies give nonharmonic timbres that are typically relegated to percussive or bell patches because they sound dissonant when played in traditional 12 tone harmonies. The principle of local consonance suggests that such sounds can be played more harmoniously in scales which are determined by the timbres themselves. A Java applet by James Forrest allows immediate and hands-on exploration of FM timbres and their dissonance curves. The program also maps the sounds onto your computer keyboard so that they are easy to play. For example, consider a simple FM tone with carrier to modulator ratio c:m of 1:1.4 and modulating index I=2. The frequencies and amplitudes of the resulting timbre are given in the following figure.
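The partials of such a tone can be estimated from standard simple-FM theory, which this text cites but does not spell out. The sketch below assumes a single sine carrier modulated by a single sine: sidebands then lie at c + k m for integer k, with magnitudes given by the Bessel function J_k(I), and negative frequencies fold back to positive ones. Frequencies are expressed relative to the carrier, signs and phases are discarded, and the helper names are hypothetical.

```python
import math


def bessel_j(k, x, terms=30):
    """Bessel function of the first kind J_k(x), integer k >= 0, by power series."""
    return sum((-1) ** m / (math.factorial(m) * math.factorial(m + k))
               * (x / 2) ** (2 * m + k)
               for m in range(terms))


def fm_spectrum(carrier=1.0, modulator=1.4, index=2.0, max_order=5):
    """Sorted (frequency, magnitude) pairs for the sidebands up to max_order.

    Negative sideband frequencies are reflected to positive values. Sidebands
    that land on the same frequency are listed separately rather than summed,
    since summing correctly would require their signs.
    """
    partials = []
    for k in range(-max_order, max_order + 1):
        freq = round(abs(carrier + k * modulator), 6)
        partials.append((freq, abs(bessel_j(abs(k), index))))
    return sorted(partials)


if __name__ == "__main__":
    for freq, mag in fm_spectrum():
        print(f"{freq:5.2f}  {mag:.3f}")
```

The resulting frequency and magnitude lists can be used directly as the timbre when drawing a dissonance curve for the FM tone.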

Three octaves of the dissonance curve for this FM timbre are plotted below. The appropriate scale notes for this timbre occur at the minima of the dissonance curve, which can be read directly from the figure.

From Scale to Timbre

The optimal scale for a given timbre is found simply by locating the local minima of the dissonance curve. The complementary problem of finding an optimal timbre for a given scale is not as simple. There is no single "best" timbre for a given scale. But it is often possible to find "locally best" timbres which can be specified as the solution to a certain optimization problem. For certain classes of scales (such as the m-tone equal tempered scales) the properties of the dissonance curve can be exploited to solve the problem efficiently.

Timbre Selection as an Optimization Problem

Any set of m scale tones specifies a set of m-1 intervals a1, a2, ... , am-1. The naive approach to the problem of timbre selection is to choose a set of n partials (f1, f2, ... , fn) and volumes (or amplitudes) (v1, v2, ... , vn) to minimize the sum of the dissonances over the m-1 intervals. Unfortunately, this can lead to "trivial" timbres in two ways. Zero dissonance can be achieved by setting all the amplitudes to zero, or by allowing the ai to become arbitrarily large (recall property 2). To avoid such trivial solutions, some constraints are necessary:
Constraint 1: Don't allow the amplitudes to change.
Constraint 2: Force all frequencies to lie in a predetermined region.

The revised (constrained) optimization is then: With the amplitudes fixed, select a set of n frequencies (f1, f2, ... , fn) lying in the range of interest so as to minimize the cost C = w1 (sum of dissonances over the m-1 intervals) + w2 (number of points of local minima), where w1 and w2 are weighting factors. Minimizing this cost C tends to place the scale steps at local minima as well as to minimize the value of the dissonance curve. Experimentally, we have found that weightings of about w1/w2 = 1000/1 seem to give reasonable results.

Minimizing the cost C is an n-dimensional optimization problem with a highly complex error surface. Fortunately, such problems can be solved adequately (though not necessarily optimally) using a variety of "random search" methods such as "simulated annealing" (see Kirkpatrick) or the "genetic algorithm" (see Goldberg). The genetic algorithm (GA) seems to work well. The GA requires that the problem be coded in a finite string called the "gene" and that a "fitness" function be defined. Genes for the timbre selection problem are formed by concatenating binary representations of the fi. The fitness function of the gene (f1, f2, ... , fn) is measured as the value of the cost C above, and timbres are judged "more fit" if the cost C is lower. The GA searches n-dimensional space measuring the fitness of timbres. The most fit are combined (via a "mating" procedure) into "child timbres" for the next generation. As generations pass, the algorithm tends to converge, and the most fit timbre is a good candidate for the minimizer of C. Indeed, the GA tends to return timbres which are well matched to the desired scale in the sense that scale steps tend to occur at points of local consonance and the total dissonance at scale steps is low. For example, when the 12 tone equal tempered scale is specified, the GA converges near harmonic timbres quite often. This is a good indication that the algorithm is functioning and that the free parameters have been chosen sensibly.

Timbres for an Arbitrary Scale

As an example of the application of the genetic algorithm to the timbre selection problem, a desired scale was chosen with scale steps at 1, 1.1875, 1.3125, 1.5, 1.8125, and 2. A set of amplitudes was chosen as 10, 8.8, 7.7, 6.8, 5.9, 5.2, 4.6, 4.0, and the GA was allowed to search for the most fit timbre. The frequencies were coded as 8 bit binary numbers with 4 bits for the integer part and 4 bits for the fractional part. The best three timbres out of 10 trial runs of the algorithm were
(f, 1.8f, 4.9f, 14f, 9.87f, 14.81f, 6.4f, 12.9f)
(f, 1.5f, 3.3f, 10.3f, 7.8f, 7.09f, 3.52f, 3.87f)
(f, 2.39f, 9.9275f, 7.56f, 11.4f, 4.99f, 6.37f, 10.6f)
The dissonance curve of the best timbre is shown below. Clearly, these timbres are related to the specified scale, since points of local consonance occur precisely at the scale steps.
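The gene coding described above can be read as plain fixed-point arithmetic. The sketch below is one such reading, assuming each partial ratio is stored as an unsigned 8-bit value in units of 1/16; the function names are hypothetical. Some of the reported partials (9.87f, for example) are not multiples of 1/16, so the coding actually used presumably differed in detail.

```python
def encode_frequency(ratio):
    """Encode a partial ratio as 8 bits: 4 integer bits, then 4 fractional bits."""
    value = round(ratio * 16)
    if not 0 <= value <= 255:
        raise ValueError("ratio must lie in [0, 15.9375]")
    return format(value, "08b")


def decode_frequency(bits):
    """Inverse of encode_frequency for an 8-character bit string."""
    return int(bits, 2) / 16


def decode_gene(gene):
    """Split a concatenated gene into 8-bit fields and decode each partial."""
    return [decode_frequency(gene[i:i + 8]) for i in range(0, len(gene), 8)]


if __name__ == "__main__":
    gene = "".join(encode_frequency(r) for r in (1.0, 1.5, 3.3125))
    print(gene, decode_gene(gene))
```

Under this reading the frequency resolution is 0.0625, and the scale steps 1.1875, 1.3125 and 1.8125 are all exactly representable.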

Timbres for Equal Temperaments

For certain scales, such as the m-tone equal tempered scales, properties of the dissonance curve can be exploited to quickly and easily design timbres, thus bypassing the need to run an optimization program. Recall that the ratio between successive scale steps in 12 tone equal temperament is the twelfth root of 2 (about 1.0595). Similarly, m-tone equal temperament has a ratio of b = the mth root of 2 between successive scale steps. Consider timbres for which successive partials are ratios of powers of b. Each partial of such a timbre, when transposed into the same octave as the fundamental, lies on a note of the scale. Such a timbre is said to be induced by the m-tone equal tempered scale. For example, harmonic timbres are induced timbres for the justly intoned scale.

Induced timbres are good candidate solutions to the optimization problem. Recall from property 4 that points of local consonance tend to be located at intervals a for which fi = a fj, where fi and fj are partials of the timbre F. Since the ratio between any pair of partials in an induced timbre is b^k for some integer k, the dissonance curve will tend to have points of local consonance at such ratios: these ratios occur precisely at steps of the scale. Such timbres tend to minimize the cost C.

This insight can be exploited in two ways. First, it can be used to reduce the search space of the optimization routine. Instead of searching over all frequencies in a bounded region, the search need only be done over induced timbres. More straightforwardly, the timbre selection problem for equal tempered scales can be solved by careful choice of induced timbres.

As an example, consider the problem of designing timbres to be played in 10 tone equal temperament. 10-tone is often considered one of the worst temperaments for harmonic music, since the steps of the ten tone scale are distinct from the (small) integer ratios, implying that harmonic timbres are very dissonant. The principle of local consonance asserts that these intervals will become more consonant if played with correctly designed timbres. Here are three timbres induced by the 10 tone equal tempered scale. Let b = the 10th root of 2.

(f, b^10 f, b^17 f, b^20 f, b^25 f, b^28 f, b^30 f)
(f, b^7 f, b^16 f, b^21 f, b^24 f, b^28 f, b^37 f)
(f, b^7 f, b^13 f, b^17 f, b^23 f, b^28 f, b^30 f)

The dissonance curves of these timbres are shown below.

These timbres really are consonant when played in the 10 tone equal tempered scale. Not surprisingly, the same tones sound quite dissonant when played in a standard 12 tone scale. Analogous arguments suggest that the consonance of 12-tone equal tempered tuning can be maximized by moving the partials away from the harmonic series to a series based on b = the twelfth root of 2.
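The defining property of an induced timbre (every partial, folded into one octave, lands on a scale step) can be checked numerically. The sketch below uses the exponents of the first 10 tone timbre listed above; the function names are hypothetical, and the check covers only where the partials fall, not the dissonance curve itself.

```python
import math


def induced_timbre(exponents, m):
    """Partial ratios b**k, where b is the m-th root of 2."""
    b = 2.0 ** (1.0 / m)
    return [b ** k for k in exponents]


def scale_step_of(ratio, m):
    """Fold a ratio into one octave of m-tone equal temperament.

    Returns (nearest step index, distance from that step in scale steps).
    """
    steps = m * math.log2(ratio)
    nearest = round(steps)
    return nearest % m, abs(steps - nearest)


# Exponents of the first 10 tone timbre in the text (fundamental = b**0).
exponents = [0, 10, 17, 20, 25, 28, 30]
partials = induced_timbre(exponents, 10)

steps_10 = [scale_step_of(p, 10) for p in partials]  # same timbre, 10 tone scale
steps_12 = [scale_step_of(p, 12) for p in partials]  # same timbre, 12 tone scale

print([step for step, _ in steps_10])
print(max(err for _, err in steps_10))  # essentially zero
print(max(err for _, err in steps_12))  # a sizeable fraction of a step
```

In the 10 tone scale every partial sits on a scale step, while in the 12 tone scale several partials fall well between steps. This is consistent with the observation that the same tones sound dissonant in standard tuning.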
New Instruments, Anyone?

Any arbitrary timbre (set of frequencies and amplitudes) can be realized with the aid of a computer. Is it always possible to design acoustic instruments that will have a given timbre? How about brasses? Fletcher and Rossing proclaim that "If the flaring part of the horn extends over a reasonable fraction of the total length, for example around one third, then there is still enough geometrical flexibility to allow the frequencies of all modes to be adjusted to essentially any value desired." With stringed instruments, the trick is to find a variable thickness string that will vibrate with partials at the desired frequencies. The partials of a drumhead can be tuned by weighting or layering sections of the drumhead. The partials of reed instruments can be manipulated by the contour of the bore as well as the shape and size of the tone holes. Bells can be tuned by changing the shape and thickness of the walls. Exactly how to engineer acoustic instruments with specified timbres is an interesting issue.

An easier approach is to synthesize the timbres. In the figure above, a harmonic waveform (which may be a sample of an acoustic instrument) is transformed into its constituent frequencies. The frequencies are changed in a systematic way that maps the partials into the specified timbre, and then transformed back into a useable waveform. The result is a nonharmonic timbre with much of the character of the original instrument. This is the key idea behind spectral mappings. The principle of local consonance shows how to imagine a number of differently tuned orchestras, digital or acoustic, each with instruments designed with a particular timbre and played in the related tuning. How about a band of instruments tuned to stretched or compressed tunings? An orchestra optimized for seven or ten tone equal temperaments? A wind instrument with the timbre of a drum? A trumpet with the harmonic structure of a steel beam?

The consonance curve shows how to properly tune the instrument. Using a computer to generate the timbres gives the ability to audition the design before building, saving time in the design and specification of nontraditional instruments.
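A heavily simplified version of the synthesis idea can be sketched with additive synthesis: keep the amplitudes of the source partials, move each partial to its destination frequency, and sum sinusoids. This sketch assumes the partial amplitudes are already known, so it skips the analysis step that transforms a real waveform into its constituent frequencies. The function names are hypothetical, and the pairing of amplitudes with the first 10 tone timbre is for illustration only.

```python
import math


def map_partials(amplitudes, dest_ratios, f0):
    """Pair each source amplitude with a destination frequency in Hz."""
    if len(amplitudes) != len(dest_ratios):
        raise ValueError("need one destination ratio per source partial")
    return [(f0 * r, a) for a, r in zip(amplitudes, dest_ratios)]


def synthesize(partials, num_samples, sample_rate=44100):
    """Sum of sinusoids, one per (frequency, amplitude) pair."""
    return [
        sum(a * math.sin(2.0 * math.pi * f * n / sample_rate)
            for f, a in partials)
        for n in range(num_samples)
    ]


# Illustrative amplitudes: the first seven values from the GA example.
amplitudes = [10, 8.8, 7.7, 6.8, 5.9, 5.2, 4.6]

# Destination ratios: the first timbre induced by 10 tone equal temperament.
b = 2.0 ** (1.0 / 10)
dest_ratios = [b ** k for k in (0, 10, 17, 20, 25, 28, 30)]

mapped = map_partials(amplitudes, dest_ratios, f0=220.0)
samples = synthesize(mapped, num_samples=441)  # 10 ms at 44.1 kHz
print(len(samples), round(mapped[1][0], 3))
```

A fuller spectral mapping would start from the Fourier transform of a recorded sound and would also have to deal with phase and the time-varying envelope of each partial, which this sketch ignores.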
Conclusions

The principle of local consonance shows how to relate timbres and tunings. Two complementary computational techniques were proposed: a way to find consonant scales given a specified timbre, and a way to find consonant timbres given a specified scale. One implication is that the musical notion of consonance of intervals such as the octave and fifth can be viewed as a result of the timbre of the instruments we typically use. The justly intoned scales can similarly be viewed as a consequence of the harmonic timbres of musical instruments. The advent of inexpensive musical synthesizers capable of realizing arbitrary sounds allows exploration of nonharmonic acoustic spaces. The principle of local consonance provides guidelines on how to sensibly relate tuning and timbre. More ambitiously, it is easy to imagine new nonharmonic instruments capable of playing consonant music. The computational techniques of this paper allow specification of timbres and tunings for such instruments.
