Documentos de Académico
Documentos de Profesional
Documentos de Cultura
2.1 Introductory
2. 1. 1 'Sounds' and 'words'
If we were to ask a non-linguist what are the ultimate units of
language, the building-blocks, so to speak, out of which utterances
are constructed, he might well reply that the ultimate units of
language are 'sounds' and 'words'. He might add that words are
made up of sequences of sounds, each sound being represented,
ideally, by a particular letter of the alphabet (in the case of languages
customarily represented by a system of alphabetic writing); and that,
whereas the words of a language have a meaning, the sounds do not
(their sole function being to form words). These several propositions
underlie the traditional view of language reflected in most grammars
and dictionaries: the grammar gives rules for the construction of
sentences out of words, and the dictionary tells us what the individual
words mean. In the following chapters, we shall have occasion to
examine critically the terms 'sound', 'word', 'meaning' and 'sen-
tence' which figure in these general statements about language.
However, for the purpose of the present preliminary discussion of the
structure of language, we may leave these terms undefined. Certain
distinctions will be indicated in the course of this chapter and will
be made explicit later.
A a b c d e
B f g h i j
c p q r s
1 2 3 4 5 6 7 8 9 10
Fig. 1.
terms: red, orange, yellow, green, blue. We will assume that the same
area is covered by the five words a, b, c, d and e in A, by the five words
f, g, h, i and j in B and by the four words p, q, r and s in C (see
Fig. 1 ). From the diagram, it is clear that language A is semantically
isomorphic with English (in this part of its vocabulary): it has the
same number of colour-terms, and the boundaries between the area of
the spectrum covered by each of them coincide with the boundaries
of the English words. But neither B nor C is isomorphic with English.
Although B has the same number of terms as English, the boundaries
come at different places in the spectrum; and C has a different number
of terms (with the boundaries in different places). In order to
appreciate the practical implications of this, let us imagine that we
have ten objects (numbered 1 to 10 in Fig. 1), each of which reflects
light at a different wavelength, and that we wish to group them
according to their colour. In English, object 1 would be described as
'red' and object 2 as 'orange' -they would differ in colour; in
language A they would also differ in colour, being described as a and
b, respectively. But in B and C they would be described by the same
colour-term, f or p. On the other hand, objects 2 and 3 would be
distinguished by B (as/ andg), but brought together by English and
by A and C (as 'orange', b and p ). From the diagram it is clear that
58 2. THE STRUCTURE OF LANGUAGE
Table 3
Expression-elements
The same argument applies with respect to the spoken language (but
with certain limitations which we will introduce presently). Suppose
that the expression-element /1/ were realized in phonic substance as
[p], /2/ as [i], and so on: cf. column (v). Then the word that is now
written bet (and might still be written bet, since there is obviously no
intrinsic connexion between the letters and the sounds) would be
pronounced like the word that is now written dip (although it would
still mean 'bet'); and so on for all other words: cf. column (x).
Once again, the language itself is unchanged by this alteration in its
substantial realization.
••• (i)
Distributional
equivalence
(ii)
Complementary distribution
• (iii)
Distributional inclusion
(iv)
Overlapping distribution
Fig. 2. Distributional relations (x occurs in the set of contexts A,
and Bis the set of contexts in which y occurs).
logic and set-theory. The fact that this is so is highly relevant to the
study of the logical foundations of linguistic theory. What one might
refer to loosely as 'mathematical' linguistics is now a very important
part of the subject. Although we cannot go into the details of the
various branches of 'mathematical linguistics' in the present ele-
mentary treatment of linguistic theory, we shall make reference to
some of the more important points of contact as the occasion arises.)
It should be emphasized that the term 'distribution' is applied to
the range of contexts in which a linguistic unit occurs only in so far as
that range of contexts can be brought within the scope of a systematic
statement of the restrictions determining the occurrence of the unit
in question. What is here meant by 'systematic' may be explained by
2. THE STRUCTURE OF LANGUAGE
way of an example. The elements /I/ and /r/ of English are at least
partially equivalent in distribution (for the convention of oblique
strokes, cf. 2.2.5): both may occur in a number of otherwise phono-
logically-identical words (cf. light: right, lamb: ram, blaze: braise,
climb: crime, etc.). But many of the words in which one of the
elements occurs cannot be matched with otherwise phonologically-
identical words in which the other occurs: there is no word srip to
match slip, no tlip to match trip, no brend to match blend, no blick to
match brick, and so on. However, there is an important difference
between the non-occurrence of words like srip and tlip, on the one
hand, and of words like brend and blick, on the other. The first pair
(and others like them) are excluded by certain general principles which
govern the phonological structure of English words: there are no
words which begin with /tl/ or /sr/ (the statement could be made in
more general terms, but this formulation of the principle will suffice
for our present purpose). By contrast, no such systematic statement
can be made about the distribution of /I/ and /r/ which would account
for the non-occurrence of blick and brend. Both elements are found
elsewhere in the contexts /b-i ..• /and /b-e .•• /: cf. blink: brink, blessed:
breast, etc. From the point of view of their phonological structure
brend and blick (unlike tlip and srip) are acceptable words of English.
It is a matter of' chance', as it were, that they have not been given a
grammatical function and a meaning, and put to use in the language.
The point that has just been illustrated by means of a phonological
example applies also at the grammatical level. Not all combinations of
words are acceptable. Of the unacceptable combinations, some can be
accounted for in terms of a general distributional classification of the
words of the language, whereas others must be explained by reference
to the meaning of the particular words or to some other fact specific
to them as individual words. We shall return to this question in a later
chapter (cf. 4.2.9). For the purpose of the present argument, it is
sufficient to observe that distributional equivalence, total or partial,
does not imply absolute identity in the range of contexts in which the
units in question occur: what it implies is identity in so far as the
contexts are specified by the phonological and grammatical regu-
larities of the language.
anomalous, the one being tautological and the other contradictory. This
notion of 'marking' within paradigmatic oppositions is extremely
important at all levels of language-structure.
contrast between these two elements has a high functional load. Other
contrasts have a lower functional load. For example, there are
relatively few words that are kept apart from one another in their
substantial realization by the occurrence of one rather than the other
of the two consonants that occur in the final position of wreath and
wreathe (the symbols for these two sounds in the International
Phonetic Alphabet are [0] and [8] respectively: cf. 3.2.8); and there are
very few, if any, that are kept apart by the occurrence of the initial
sound of ship rather than the second consonantal sound of measure or
leisure (these two sounds are symbolized as m and [3] respectively in
the International Phonetic Alphabet). The functional load of the
contrasts between 101 and ll'JI and between III and 131 is therefore
much lower than that of the contrast lpl: lbl.
The importance of functional load is obvious. Misunderstanding
will tend to occur if the speakers of a language do not consistently
maintain those contrasts which serve to distinguish utterances which
differ in meaning. Other things being equal (and we will return to this
condition in a moment), the higher the functional load the more
important it is that the speakers should learn the particular contrast
as part of their 'speech-habits' and should subsequently maintain it
in their use of the language. It is to be expected therefore that
children will tend to learn first those contrasts which have the highest
functional load in the language which they hear about them; and, as
a consequence of this fact, that contrasts with a high functional load
will be correspondingly resistant to disappearance in the transmission
of the language from one generation to the next. Investigation of the
ease with which children masterthe contrasts in their native language
and the study of the historical development of particular languages
lend some empirical support to these expectations. In each case,
however, there are additional factors which interact with, and are
difficult to isolate from, the operation of the principle of functional
load. We shall not go into these other factors here.
The precise quantification of functional load is complicated, if not
made absolutely impossible, by considerations excluded above under
the proviso 'other things being equal'. First of all, the functional load
of a particular contrast between expression-elements will vary
according to the structural position they occupy in the word. For
example, two elements might contrast frequently at the beginning of
a word, but only very rarely at the end of a word. Do we simply take
an average over all the positions of contrast? The answer to this
question is not clear.
2+ STATISTICAL STRUCTURE
2+ 7 Diachronic implications
Languages, as they develop through time and 'evolve' to meet the
changing needs of the societies that use them for communication, can
be regarded as homeostatic (or 'self-regulating') systems, the state of a
language at any one time being 'regulated' by two opposing principles.
The first of these (which is sometimes referred to as the principle of
'least effort') is the tendency to maximize the efficiency of the system
(in the particular sense of 'efficiency' explained above): its effect is to
bring the syntagmatic length of words and utterances closer to the
theoretical ideal. The other principle is 'the desire to be understood':
this inhibits the shortening effect of the principle of 'least effort' by
introducing redundancy at various levels. As a consequence, it is to
be expected that the changing conditions of communication will tend
to keep the two tendencies in balance. If the average amount of noise
is constant for different languages and different stages of the same
language, it would follow that the degree of redundancy in language
is constant. Unfortunately, it is impossible (at present at least) to
verify the hypothesis that languages keep these two opposing
principles in 'homeostatic equilibrium'. The reasons for this will
occupy us presently. Nevertheless, the hypothesis is suggestive. Its
general plausibility is supported by 'Zipf's law'; and also by the fact
(pointed out long before the era of information-theory) that words
tend to be replaced by longer (and more 'colourful') synonyms,
especially in colloquial usage, when frequent use has robbed them of
their 'force' (by reducing their information-content). The extreme
rapidity with which' slang' expressions are replaced can be accounted
for in these terms.
So too can the phenomenon of 'homonymic conflict' and its dia-
chronic resolution (which has been abundantly illustrated by Gillieron
and his followers). 'Homonymic conflict' may arise when the opera-
tion of the principle of 'least effort' may concur with other factors
determining sound-change to reduce, or eliminate, the 'safety
margin' between the substantial realizations of two words and so
2.4. STATISTICAL STRUCTURE 91
position after vowels. It will be observed that there are some striking
differences between the frequency with which particular consonants
occur in different positions of the word. For example, of the units
listed [v] is the least frequent in word-initial position, but the third
most frequent in word-final position; on the other hand, [b] is the
third most frequent in word-initial position, but the least frequent in
word-final position (apart from [h], which does not occur at all
finally: n.b. we are talking of sounds, not letters). Others (like [t])
have a high probability, or (like [g] and [p]) a low probability for both
positions. It will also be observed that the range of variation between
the highest and the lowest probability is greater for the end of the word
than it is for the beginning. Facts of this kind find their place in
a description of the statistical structure of phonological words in
English.
It was said earlier (with reference to 'Zipf's law': cf. 2+6.) that
the number of sounds or letters in a word is not a direct measure of its
syntagmatic length in terms of information-theory. The reason is of
course that not all sounds or letters are equiprobable in the same
context. If the probability of a phonological or orthographic word
were directly related to the probabilities of its constituent expression-
elements, it would be possible to derive the probability of the word by
2.4. STATISTICAL STRUCTURE 95
multiplying together the probabilities of the expression-elements for
each of the structural positions in the word. For example, given that
x is twice as probable as y in initial position and a is twice as probable
as b in final position, one would expect xpa to be twice as frequent as
either ypa or xpb and four times as frequent as ypb. That this expecta-
tion is not fulfilled in particular instances is evident from the con-
sideration of a few words in English. The expression-elements
realized by [k] and [f] are more or less equally probable initially, but
the word call is far more frequent than fall (according to various
published frequency-lists for English words); although the clement
realized by [t] has about fifty times the probability of that realized
by [g] in word-final position, the word big occurs with about four
times the frequency of bit; and so on.
The probabilities for initial and final position used in these calcu-
lations (cf. Table 4) are probabilities based on the analysis of con-
tinuous text. This means that the occurrence of a particular consonant
in relatively few high-frequency words may outweigh the occurrence
of another in very many low-frequency words (cf. the remarks made
in 2.4. 1, in connexion with the concept of 'functional load'). The
consonant [a], which occurs initially in such words as the, then, their,
them, etc., in English, illustrates the effect of this weighting. In initial
position, it is the most frequent of all consonants, with a probability
of about 0·10 (compared with 0·072 for [t], 0·046 for [k], etc.). But it
occurs in only a handful of different words (less than thirty in modern
usage). By contrast initial [k] is found in many hundreds of different
words, although its probability in continuous text is less than half
that of [a]. Comparison of all the words in English realized as
consonant+vowel+consonant (which in itself is a very common
structure for English phonological words) shows that in general
there are more words with a high-frequency initial and final con-
sonant than there are words with a low-frequency initial and final
consonant, and also that the former tend to be of more frequent
occurrence. At the same time, it must be stressed that certain words
are far more frequent or far less frequent than one might predict from
the probabilities of their constituent expression-elements.