National Resources Committee
Most projects involving the collection and analysis of statistical
data have for one of their major aims the isolation of factors
which account for variation in the variable studied. The statistical tool
ordinarily employed for this purpose is the analysis of variance.
Frequently, however, the data are sufficiently extensive to indicate that
the assumptions necessary for the valid application of this technique
are not justified. This is especially apt to be the case with social and
economic data where the normal distribution is likely to be the
exception rather than the rule. This difficulty can be obviated, however, by
arranging each set of values of the variate in order of size, numbering
them 1, 2, and so forth, and using these ranks instead of the original
quantitative values. In this way no assumption whatsoever need be
made as to the distribution of the original variate.
The utilization of ranked data is thus frequently a desirable device
to avoid normality assumptions; in addition, however, it may be
inescapable either because the data available relate solely to order, or
because we are dealing with a qualitative characteristic which can be
ranked but not measured.
The possibility of using ranked data in problems involving simple
correlation and thereby avoiding assumptions of normality has
recently been emphasized in an article by Harold Hotelling and
Margaret Richards Pabst.¹ It is the purpose of the present article to
outline a procedure whereby the analysis of ranked data can be employed
in place of the ordinary analysis of variance when there are two (or
more) criteria of classification. This procedure has two major
advantages. As already indicated, it is applicable to a wider class of cases
than the ordinary analysis of variance. In addition, it is less arduous
than the latter technique, requiring but a fraction as much time. The
loss of information through utilizing the procedure outlined below
when the analysis of variance could validly be applied may thus be
more than compensated for by its greater economy. This consideration
is likely to be especially important with those large scale collections of
social and economic data which have become increasingly frequent in
recent years and for which the funds available for analysis are limited.
1 "Rank Correlation and Tests of Significance Involving No Assumption of
Normality," Annals of Mathematical Statistics, VII (1936), 29-43.
676 AMERICAN STATISTICAL ASSOCIATION*
THE PROCEDURE
                                            Annual family income
Category of expenditure     $750-   $1,000-  $1,250-  $1,500-  $1,750-  $2,000-  $2,250-
                            1,000   1,250    1,500    1,750    2,000    2,250    2,500
Housing                     5       1        3        2        4        6        7
Household operation         1       3        4        6        2        5        7
Food                        1       2        7        3        5        4        6
Clothing                    1       3        2        4        5        6        7
Furnishings and equipment   2       1        6        3        7        5        4
Transportation              1       2        3        6        5        4        7
Recreation                  1       2        3        4        7        5        6
Personal care               1       2        3        6        4        7        5
Medical care                1       2        4        5        7        3        6
Education                   1       2        4        5        3        6        7
Community welfare           1       5        2        3        7        6        4
Vocation                    1       5        2        4        3        6        7
Gifts                       1       2        3        4        5        6        7
Other                       5       4        7        2        6        1        3
a. Total                    23      36       53       57       70       70       83
b. Mean rank                1.643   2.571    3.786    4.071    5.000    5.000    5.929
c. Deviation                -2.357  -1.429   -.214    .071     1.000    1.000    1.929
pa 1 P(p + 1)j=
sampling distribution of the ratio of two variances is not symmetrical unless both variances are based
on the same number of degrees of freedom. The mean value of the ratio is approximately unity, but the
median is not equal to one; it is less than one if the numerator is based on fewer degrees of freedom
than the denominator, and conversely. In ranking two standard deviations, therefore, the one based
on the smaller number of cases would receive a rank of 1 more than half the time. When more than two
standard deviations are ranked this tendency is somewhat compensated for by the greater probability
that those based on the fewest cases will receive relatively high ranks, and thus the average rank will
be less affected. This difficulty does not, however, affect the validity of the illustrative analysis presented
here, since the two highest income classes contain the smallest numbers of families but have the highest
average ranks.
More generally, when the entries in different columns of the same row come from symmetrical
universes with the same mean but different variances, the several ranks will have the same expected
value, but the probability distribution for each cell will not be exactly rectangular. This condition of
symmetry is a sufficient condition for the ranks to have the same expected value; it is, however, more
stringent than is necessary. This difficulty clearly calls for further analysis.
5 The sum of the numbers from 1 to p is $\tfrac{1}{2}p(p+1)$. The mean is therefore $\tfrac{1}{2}(p+1)$. The sum of the
squares of the numbers from 1 to p is $(2p+1)(p+1)p/6$. The variance is, therefore,
$(2p+1)(p+1)/6 - \tfrac{1}{4}(p+1)^2 = (p^2-1)/12$.
6 That the sampling distribution of samples drawn from a rectangular universe approaches
normality quite rapidly is, of course, well known. The distribution of means for samples of two is a
triangle; for samples of three it is made up of three parabolic segments, the first and third concave
upwards, and the middle one concave downward. An empirical distribution for samples of ten is given by
Hilda Frost Dunlap, "An Empirical Determination of the Distribution of Means, Standard Deviations
and Correlation Coefficients Drawn from Rectangular Populations," Annals of Mathematical Statistics,
II (1931), 66-81. The universe sampled was a discontinuous rectangular universe, including the integers
from 1 to 6. The empirical distribution shows extremely close conformity to the normal curve.
7 This follows from the fact that the variance of a mean of n observations of equal weight is 1/n
times the variance of an individual observation.
-THE USE OF RANKS 679
So long as the number of rows and columns is not too small, $\chi_r^2$
computed in this way will be distributed according to the usual $\chi^2$
distribution with $p-1$ degrees of freedom.⁸ If, now, $\chi_r^2$ is significantly greater
than might reasonably have been expected from chance, the
implication is that the mean ranks differ significantly, i.e., that the size of the
standard deviation depends on the income level.
The computation of $\chi_r^2$ is extremely simple. The mean of the seven
mean ranks is, of necessity, equal to the true mean of 4. The difference
between the mean rank for each column and 4 is given on line c of
Table II. The sum of the squares of these differences is 13.3692 and
$\chi_r^2 = 40.1076$.
This illustrative computation has been made using a formula that
makes clear the nature of $\chi_r^2$. In actual practice the following
alternative formula, which involves only integers and makes unnecessary the
computation of the actual mean ranks, will be found more convenient:
$$\chi_r^2 = \frac{12}{np(p+1)} \sum_{j=1}^{p} \left( \sum_{i=1}^{n} r_{ij} \right)^2 - 3n(p+1),$$
where $r_{ij}$ is the rank entered in the i-th row and j-th column.
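As an illustration, the two routes to $\chi_r^2$ can be sketched in Python for the ranks of Table II. The function names are hypothetical, and the first function uses the form $\frac{12n}{p(p+1)}\sum_j(\bar r_j - \tfrac{1}{2}(p+1))^2$, which is inferred here from the worked figures (3 × 13.3692 = 40.1076) rather than quoted from the text.

```python
# Ranks from Table II: one row per category of expenditure,
# one column per income class (n = 14 rows, p = 7 columns).
RANKS = [
    [5, 1, 3, 2, 4, 6, 7],  # Housing
    [1, 3, 4, 6, 2, 5, 7],  # Household operation
    [1, 2, 7, 3, 5, 4, 6],  # Food
    [1, 3, 2, 4, 5, 6, 7],  # Clothing
    [2, 1, 6, 3, 7, 5, 4],  # Furnishings and equipment
    [1, 2, 3, 6, 5, 4, 7],  # Transportation
    [1, 2, 3, 4, 7, 5, 6],  # Recreation
    [1, 2, 3, 6, 4, 7, 5],  # Personal care
    [1, 2, 4, 5, 7, 3, 6],  # Medical care
    [1, 2, 4, 5, 3, 6, 7],  # Education
    [1, 5, 2, 3, 7, 6, 4],  # Community welfare
    [1, 5, 2, 4, 3, 6, 7],  # Vocation
    [1, 2, 3, 4, 5, 6, 7],  # Gifts
    [5, 4, 7, 2, 6, 1, 3],  # Other
]


def chi_r2_from_mean_ranks(ranks):
    """Assumed definitional form: scaled sum of squared deviations of mean ranks."""
    n, p = len(ranks), len(ranks[0])
    grand_mean = (p + 1) / 2
    mean_ranks = [sum(row[j] for row in ranks) / n for j in range(p)]
    ss = sum((m - grand_mean) ** 2 for m in mean_ranks)
    return 12 * n / (p * (p + 1)) * ss


def chi_r2_from_totals(ranks):
    """Alternative form using only the integer column totals."""
    n, p = len(ranks), len(ranks[0])
    totals = [sum(row[j] for row in ranks) for j in range(p)]
    return 12 / (n * p * (p + 1)) * sum(t * t for t in totals) - 3 * n * (p + 1)


by_means = chi_r2_from_mean_ranks(RANKS)
by_totals = chi_r2_from_totals(RANKS)
print(round(by_means, 3), round(by_totals, 3))
```

Both functions agree to floating-point accuracy. Working from unrounded mean ranks gives a value near 40.10, so the small difference from the 40.1076 quoted above presumably reflects the rounding of the mean ranks to three decimals.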
The number of degrees of freedom on which this estimate is based
is $p-1 = 6$. For six degrees of freedom the value of $\chi^2$ which would be
exceeded by chance once in 20 times is 12.592, and once in a hundred
times, 16.812.⁹ The probability of a value greater than 40 is .000001.¹⁰
There can thus be little question that the observed mean ranks differ
significantly, i.e., that the standard deviation is related to the income
level. From the mean ranks it is seen that with but one minor exception
the standard deviations consistently increase with income.
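The tabled critical values can be checked without tables. The sketch below relies on the standard closed form for the upper tail of $\chi^2$ when the number of degrees of freedom is even; that closed form is not taken from the article, and the function name is hypothetical.

```python
import math


def chi2_tail_even_df(x, df):
    """Upper-tail probability P(chi^2 > x) for an even number of degrees of freedom.

    Uses exp(-x/2) * sum_{i < df/2} (x/2)**i / i!, valid only when df is even.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2
    term = 1.0   # (x/2)**i / i!, starting at i = 0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


for value in (12.592, 16.812, 40.0):
    print(value, chi2_tail_even_df(value, 6))
```

For six degrees of freedom the first two values should come out close to .05 and .01, and the tail beyond 40 should be below one in a million.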
Since the value of $\chi_r^2$ is invariant under transpositions of the
columns of ranks under their captions, this information (that the ranks
increase with income) has not been utilized. Whenever the columns
themselves can be ranked, the additional information supplied by the
relationship between the order of the mean ranks and the order of the
columns can be used by computing a rank difference correlation
between the two corresponding sets of ranks, determining the probability
that the correlation coefficient obtained would have been equalled or
exceeded by chance, converting this probability into the value of $\chi^2$
8 For a justification of the formula for $\chi_r^2$ and of the statement that $\chi_r^2$ tends to be distributed
like $\chi^2$, as well as for some indication of the number of columns and rows necessary, see pp. 687-694
and the mathematical appendix.
9 Fisher, R. A., Statistical Methods for Research Workers, Table III.
10 Pearson, Karl, Tables for Statisticians and Biometricians, 3rd Edition, London, 1930, Part I,
Table XII.
which corresponds to it for two degrees of freedom, and pooling the
resultant value of $\chi^2$ with $\chi_r^2$.¹⁰ᵃ In the present illustrative example the
evidence is so clear that this additional information will obviously not
affect the conclusion. It will, however, serve to exemplify the
procedure. The rank difference correlation between the mean rank and the
income level is .991. (In deriving this coefficient the tied ranks were
treated in the manner suggested below, i.e., they were assigned the
average value of the ranks for which they were tied.) The probability
of securing a value as great as or greater than this is between .00277
and .00040. The value of $\chi^2$ corresponding to the larger of these figures
for two degrees of freedom is $-2\log_e .00277 = 11.77$. Adding this to
$\chi_r^2$ gives 51.88 as the value to be entered in the $\chi^2$ table for 8 degrees of
freedom. The probability associated with this value is smaller than
that for $\chi_r^2$ and, indeed, is so small that it cannot be determined from
the published tables.
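The arithmetic of this pooling step can be sketched as follows. The ranks of the mean ranks are written out by hand (the two tied values of 5.000 each receive 5.5), and the usual rank-difference formula $1 - 6\sum d^2 / (p(p^2-1))$ is assumed; the variable names are illustrative only.

```python
import math

# Ranks of the seven mean ranks from line b of Table II; the two tied
# values (5.000 and 5.000) each receive the average rank 5.5.
mean_rank_order = [1, 2, 3, 4, 5.5, 5.5, 7]
income_order = [1, 2, 3, 4, 5, 6, 7]

p = len(income_order)
sum_d2 = sum((a - b) ** 2 for a, b in zip(mean_rank_order, income_order))
r_prime = 1 - 6 * sum_d2 / (p * (p * p - 1))   # rank-difference correlation

# Convert the larger of the two bounding probabilities into chi^2 with
# two degrees of freedom, then pool it with chi_r^2 (6 + 2 = 8 d.f.).
chi2_from_probability = -2 * math.log(0.00277)
pooled = 40.1076 + chi2_from_probability

print(round(r_prime, 3), round(chi2_from_probability, 2), round(pooled, 2))
```

The correlation reproduces the .991 of the text. The converted value comes to about 11.78, which the text gives as 11.77, so the pooled figure agrees with 51.88 up to the last digit.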
In order to test whether the standard deviations are related to the
type of expenditure it is only necessary to repeat the above analysis,
this time, however, treating the columns in the way in which the rows
were previously treated, and vice versa. Thus the standard deviations
would be ranked for each income level, and the mean ranks obtained
for each type of expenditure.
It might appear offhand as if the procedure used to study the
relation between standard deviations and income level does not make
use of all of the information provided by Table II, that it neglects the
distribution of the ranks within the columns, and that this supplies
additional information about the consistency of the ranking. This,
however, is not the case. Since Table II must contain n 1's, n 2's, ...,
n p's, the total sum of the squared deviations from the grand mean
is the same no matter what the arrangement of the ranks within the
table; it is, in fact, equal to $np(p^2-1)/12$. The sum of squares within
columns plus the sum of squares between columns must add up to
this total. Knowledge of one of these sums of squares thus implies
knowledge of the other. In the above example we have used the sum
of squares between columns; no additional information is thus supplied
by the sum of squares within columns.
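The fixed total can be verified on a small table. The sketch below uses a hypothetical 3 × 4 table of ranks (not data from the article) and exact fractions, so that the identity holds without rounding.

```python
from fractions import Fraction


def sums_of_squares(ranks):
    """Return (total, between-columns, within-columns) sums of squares."""
    n, p = len(ranks), len(ranks[0])
    grand_mean = Fraction(p + 1, 2)
    col_means = [Fraction(sum(row[j] for row in ranks), n) for j in range(p)]
    total = sum((r - grand_mean) ** 2 for row in ranks for r in row)
    between = n * sum((m - grand_mean) ** 2 for m in col_means)
    within = sum((row[j] - col_means[j]) ** 2 for row in ranks for j in range(p))
    return total, between, within


# Hypothetical table: n = 3 rows, each a ranking of p = 4 columns.
table = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [1, 3, 2, 4],
]
total, between, within = sums_of_squares(table)
n, p = len(table), len(table[0])
print(total, between, within, Fraction(n * p * (p * p - 1), 12))
```

Rearranging the ranks within any row changes `between` and `within` but leaves `total` at $np(p^2-1)/12$, which is 15 for this table.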
It should be noted that in testing the significance of the differences
among the columns no assumption whatsoever needs to be made as to
the similarity of the distribution of the original variate for the different
rows. The test takes the form of comparing the mean ranks for the
several columns; essentially, however, the null hypothesis tested is
10a See Hotelling and Pabst, op. cit., pp. 35 and 40, and Fisher, op. cit., art. 21.1.
that the original entries in each row are from the same universe;
whether or not this universe is the same for the different rows is
entirely irrelevant to the validity of the test.
The method of ranks does not provide for testing "interaction." It
is of the very nature of the method that it cannot do so. Without exact
quantitative measurement, "interaction," in the sense used in the
ordinary analysis of variance, is meaningless.
It should further be noted that the method of ranks may not provide
a test of the influence of a factor if there is reason to suspect that this
influence is in a different direction for the different rows; if, for example,
the standard deviation increases with income for certain types of
expenditure and decreases with income for others. For in such a case the
mean ranks of the p columns may all have the same expected value,
although the p ranks for each of the rows do not. Thus, if $\chi_r^2$ is
significant, the conclusion is that the ranking is not random. But $\chi_r^2$ may not
be significant, not because the ranking is random, or because the
differences in the mean ranks are too small for the observed sample to
display significance, but because the influence of the factor tested is
different in direction for the different rows. In this connection, however,
the general point should be emphasized that non-significant results do
not establish the validity of the null hypothesis in the same way that
significant results tend to contradict it.
In some cases two (or more) of the values of the variate in a row will
be identical, i.e., there will be "tied" ranks. Two procedures can be
followed: first, the ranks tied for can be assigned to the two (or more)
values at random; or second, each value can be given the average
value of the ranks tied for (e.g., if two values are tied for the ranks 2
and 3 each can be given the rank of 2.5). In general, the second of these
procedures seems to be preferable, since it uses slightly more of the
information provided by the data.¹¹ The substitution of the average
rank for the tied values does not affect the validity of the $\chi_r^2$ test.¹²
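The second procedure, assigning each tied value the average of the ranks tied for, can be sketched as below. The function name `average_ranks` and the sample values are illustrative and do not come from the article.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all values equal to the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Positions start..end correspond to ranks start+1..end+1; take their mean.
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


# Two values tied for ranks 2 and 3 each receive 2.5.
print(average_ranks([10, 20, 20, 30]))
```

Because the average rank is shared, the ranks in a row still sum to $p(p+1)/2$, so the grand mean rank is unchanged by ties.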
THE EFFICIENCY OF THE METHOD OF RANKS RELATIVE
TO THE ANALYSIS OF VARIANCE
It is evident that the method of ranks does not utilize all of the
information furnished by the data, since it relies solely on order and
11 This alternative method of handling tied ranks and its advantages were brought to my attention
by Mr. W. Allen Wallis, who has developed a simple adjustment to the usual formula for the
rank-difference correlation to allow for the treatment of tied ranks in this fashion.
12 Its only effect is to change very slightly the "true" value of the variance. In the extreme case
when tied ranks are as probable as untied ranks, the variance of an individual observation is changed
from $(p^2-1)/12$ to $p(p-1)/12$, i.e., it is reduced by $(p-1)/12$ or in the ratio of 1 to $p+1$. The reduction
is thus relatively small when p is moderately large.
                                   Ratio of variances        $\chi_r^2$
                                   Income    Family type     Income     Family type
Major categories of expenditure³
  Food                             15.33**   5.75**          27.02**    19.09**
  Household operation              9.95**    1.01            24.24**    4.94
  Housing                          9.50**    1.63            21.94**    6.17
  Clothing                         9.40**    1.38            25.54**    9.46
  Recreation                       4.25**    1.98            23.83**    11.89*
  Personal care                    4.10**    .80             21.11**    4.14
  Transportation                   3.78**    1.97            24.00**    10.06*
  Gifts                            3.36**    .96             21.17**    3.74
  Community welfare                2.95**    .45             17.04**    .49
  Education                        2.93**    1.79            17.31**    8.11
  Medical care                     2.51*     .80             18.69**    6.51
  Vocation                         .69       1.01            4.71       1.51
  Furnishings and equipment        .42       .37             6.96       3.69
  Other                            .25       .30             5.74       5.40
Sub-groups of items
 Food⁴:
  Dairy products                   6.71**    9.41**          23.66**    21.83**
  Fruit                            4.87**    .38             12.69*     3.31
  Food away from home              3.49**    3.94**          17.34**    10.09*
  Meat                             2.59*     2.02            9.34       3.77
  Miscellaneous foods              2.01      1.21            15.00*     5.49
  Fish                             .98       2.43*           4.11       1.91
  Vegetables                       .73       2.11            6.69       8.80
  Grain products                   .71       4.76**          3.26       9.71*
  Sweets                           .20       1.05            3.96       9.94*
  Poultry                          .20       .99             .30        1.89
 Personal care:
  Personal service                 4.31**    .70             19.80**    4.71
  Personal supplies                3.38**    .75             14.34*     1.49
 Household operation:
  Fuel and light⁵                  7.26**    1.56            23.25**    6.74
* Indicates that observed figure is "significant," i.e., greater than the value which would be
exceeded by chance once in twenty times. For the ratios of variances this value is 2.14 for income and
2.42 for family type. For $\chi_r^2$ it is 12.592 for income and 9.488 for family type. The difference between
the values for income and family type is a result of a difference in the number of degrees of freedom on
which the respective estimates are based.
** Indicates that observed figure is "highly significant," i.e., greater than the value which would
be exceeded but once in a hundred times by chance. For the ratios of variances this value is 2.89 for
income and 3.41 for family type. For $\chi_r^2$ it is 16.812 for income and 13.277 for family type.
1 The figures in this table are based on schedules collected by the Cost of Living Division of the
U. S. Bureau of Labor Statistics. These schedules were loaned to the National Resources Committee
for special analyses, one of which is presented here.
2, 3, 4, 5 See next page.
groups: those which would have been exceeded by chance (a) in more
than five per cent of random samples, (b) in between five per cent and
one per cent of random samples, and (c) in less than one per cent of
random samples. An indication of the relative efficiency of the two
methods is provided by Table IV, which gives a comparison of the two
classifications.
From the entries in the diagonal of Table IV, it is seen that for 45
out of the 56 analyses the two methods lead to similar conclusions. In
no case does one of the methods indicate a probability of less than .01
while the other indicates a probability greater than .05.
TABLE IV
COMPARISON OF RESULTS OF ANALYSIS OF VARIANCE AND METHOD OF RANKS
                               Analysis of variance:
                               number of F's with probability
Method of ranks:               Greater     Between        Less
probability of $\chi_r^2$      than .05    .05 and .01    than .01    Total
Greater than .05               28          2              0           30
Total                          32          4              20          56
TABLE VI
EXACT DISTRIBUTION OF $\chi_r^2$ FOR TABLES WITH FROM 2 TO 4 SETS OF FOUR RANKS
(p = 4; n = 2, 3, 4)
P is the probability of obtaining a value of $\chi_r^2$ as great as or greater than the corresponding
value of $\chi_r^2$
n = 2    n = 3    n = 4
[Figure: distributions of $\chi^2$ (2 degrees of freedom) and of $\chi_r^2$ for a 3×9 table, with P on the vertical axis. Panel B: tails of the distributions of $\chi^2$ and of $\chi_r^2$ on a logarithmic scale, showing the $\chi^2$ distribution for 2 degrees of freedom and the $\chi_r^2$ distributions for 3×3, 3×5, 3×7, and 3×9 tables.]
[Figure: distributions of $\chi^2$ (3 degrees of freedom) and of $\chi_r^2$ for a 4×4 table, with P on the vertical axis. Panel B: tails of the distributions of $\chi^2$ and of $\chi_r^2$ on a logarithmic scale, showing the $\chi^2$ distribution for 3 degrees of freedom and the $\chi_r^2$ distributions for 4×2, 4×3, and 4×4 tables.]
from using the $\chi^2$ distribution as an approximation to the $\chi_r^2$
distribution are likely to be in the proper direction; that is, the significance of
results will be understated rather than exaggerated. This tendency
toward understatement is compensated (indeed, in some cases
overcompensated) by the fact that the values of $\chi_r^2$ which can be observed
(i.e., the values of $\chi_r^2$ not adjusted for discontinuity) are always greater
than the adjusted values. This factor is of minor significance, however,
since the number of possible values of $\chi_r^2$ increases very rapidly (and
hence the interval between them decreases very rapidly) as either p
or n increases. Even for p and n both as small as 4, the difference
between the adjusted and unadjusted values of $\chi_r^2$ is, for practical
purposes, negligible. It is .15 in all but four cases, .3 in three of these, and
.45 in the remaining one.
A comparison of the $\chi^2$ and $\chi_r^2$ distributions at the critical points
sheds further light on this problem. For p = 3 the value of $\chi^2$
corresponding to P = .05 is 5.991. From Table V, the nearest value which
$\chi_r^2$ can have for p = 3, n = 9 is 6, and this has a probability of .057
associated with it. Thus, by using the $\chi^2$ distribution we should be led
to overestimate slightly the significance of a value of $\chi_r^2 = 6$. The next
higher value of $\chi_r^2$ is 6.22, and its significance we should estimate
properly, since the probability associated with it is .048. The value of $\chi^2$
corresponding to P = .01 is 9.21. From Table V, the nearest values
which $\chi_r^2$ can have are 8.67, with a probability of .0103, and 9.55 with
a probability of .0060. In this case, the use of the $\chi^2$ distribution would
yield the correct results; 8.67 would be attributed a probability greater
than .01 and 9.55 one less than .01.
For p = 4, the value of $\chi^2$ corresponding to P = .05 is 7.815. The
nearest values of $\chi_r^2$ for p = 4 and n = 4, as given in Table VI, are 7.5 with
a probability of .052, and 7.8 with a probability of .036. The .01 value
of $\chi^2$ is 11.341. From the table, 9.3 has a probability of .012, 9.6 of
.0069, and 11.1 of .00094. Here, the use of $\chi^2$ would in each case
understate the significance of $\chi_r^2$.
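Exact tail probabilities of this kind can be recomputed for small tables. The sketch below assumes that, under the null hypothesis, each row is an independent and equally likely permutation of the ranks 1 to p; it accumulates column totals row by row instead of listing every table. The function names are hypothetical and this is not necessarily how Tables V and VI were computed.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def exact_distribution(p, n):
    """Exact null distribution of chi_r^2 for n rows of p ranks each."""
    rows = list(permutations(range(1, p + 1)))
    states = Counter({(0,) * p: 1})  # column totals -> number of ways to reach them
    for _ in range(n):
        nxt = Counter()
        for totals, ways in states.items():
            for row in rows:
                nxt[tuple(t + r for t, r in zip(totals, row))] += ways
        states = nxt
    n_tables = len(rows) ** n
    dist = Counter()
    for totals, ways in states.items():
        # Integer-only formula for chi_r^2, kept exact with fractions.
        chi = Fraction(12 * sum(t * t for t in totals), n * p * (p + 1)) - 3 * n * (p + 1)
        dist[chi] += Fraction(ways, n_tables)
    return dist


def tail_probability(dist, value):
    """P(chi_r^2 >= value) under the exact distribution."""
    return sum(prob for chi, prob in dist.items() if chi >= value)


dist_3x9 = exact_distribution(3, 9)   # p = 3, n = 9
dist_4x4 = exact_distribution(4, 4)   # p = 4, n = 4
p_6 = tail_probability(dist_3x9, 6)
p_622 = tail_probability(dist_3x9, Fraction(56, 9))   # 6.22 in the text
p_12 = tail_probability(dist_4x4, 12)                 # largest possible value
print(float(p_6), float(p_622), float(p_12))
```

The first two results can be compared with the probabilities .057 and .048 quoted from Table V. The last is the chance that all four rows of a 4 × 4 table carry the same ranking, 1/13824.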
While no definitive conclusions can be drawn from these comparisons
they suggest that for p = 3, the use of the $\chi^2$ distribution is likely to
give sufficiently accurate results for n greater than 9; while for p = 4,
the use of the $\chi^2$ distribution is likely to understate the significance of
large values of $\chi_r^2$ unless n is somewhat larger than 4. In view of the
apparent rapidity with which the $\chi_r^2$ distribution approaches the $\chi^2$
distribution when p = 4, it seems reasonable that for n equal to or
greater than 6, the $\chi^2$ distribution will give sufficiently accurate results.
For p greater than 4 it is more difficult to make any general statement;
but it seems safe to say that the $\chi^2$ distribution will give fairly accurate
results for n equal to or greater than 6.²⁴ A procedure that seems
applicable when p is quite large and n less than 6 is discussed below.
Hotelling and Pabst have shown that $r'$ tends to become normally
distributed as p increases. It follows that for n = 2, $\chi_r^2$ tends also to
become normally distributed as p increases.
When n is large the distribution of $\chi_r^2$ approaches that of $\chi^2$ and the
latter approaches normality as the number of degrees of freedom
increases.
Since, for the smallest value of n as well as for large values, the
distribution of $\chi_r^2$ tends to normality as p increases, it seems reasonable
to assume that for intermediate values of n it behaves similarly. As-
24 It is worth recalling that the rapidity with which the variance of the $\chi_r^2$ distribution approaches
the variance of the $\chi^2$ distribution depends solely on n and not at all on p. On the other hand, the
number of distinct values of $\chi_r^2$ depends on both p and n.
25 This is the usual notation except that the number of pairs of ranks is ordinarily designated as n.
The present notation is used in order to preserve consistency with the preceding analysis.
26 Hotelling and Pabst, op. cit., p. 36.
suming this to be the case, then for small values of n and large values
of p the significance of $\chi_r^2$ can be tested by considering
$$\frac{\chi_r^2 - (p-1)}{\sqrt{2\,\dfrac{n-1}{n}\,(p-1)}}$$
as a normally distributed variate with zero mean and unit standard
deviation.
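A minimal sketch of this normal approximation follows. The function name and the example figures (n = 3, p = 20, $\chi_r^2$ = 30) are hypothetical, and the upper-tail probability is obtained from the complementary error function in the standard library.

```python
import math


def normal_approximation(chi_r2, n, p):
    """Standardize chi_r^2 with mean p - 1 and variance 2 (n - 1) (p - 1) / n."""
    mean = p - 1
    variance = 2 * (n - 1) / n * (p - 1)
    z = (chi_r2 - mean) / math.sqrt(variance)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z) for a unit normal
    return z, upper_tail


# Hypothetical case: few rows (n = 3) and many columns (p = 20).
z, tail = normal_approximation(30.0, n=3, p=20)
print(round(z, 3), round(tail, 4))
```

As the text cautions, how quickly this approximation becomes adequate for small n has not been established, so the result is best read as a rough guide.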
Further study is clearly needed on this point, both in order to obtain
a rigorous proof that for small values of n the $\chi_r^2$ distribution tends to
normality as p increases, and also to determine the rapidity with which
it approaches normality.
CONCLUSION
(6) = E1
E
(F
+ Z
-1
Oiri'+ 2n
j2 (P
fOiri) 2
+
1
- R'
or
{E 1+ ZOiri'
n j.,
(7)
(2
+ n2
P-1
E 0O2ri'2+
p-2
2Z
p-i
E j 1)r/r'i'+
) + kRI J n
But since $r'_j$ takes all of the p values differing by unity from
$-\tfrac{1}{2}(p-1)$ to $\tfrac{1}{2}(p-1)$ with equal probability
(8) $E\,r'_j = 0,$
(9) $E\,r'^2_j = \dfrac{1}{p} \sum_{r'_j=-(p-1)/2}^{(p-1)/2} r'^2_j = (p^2-1)/12.$
Further
(10) $E\,r'_j r'_{j'} = -(p+1)/12$
since
(11) $\left( \sum_{r'_j=-(p-1)/2}^{(p-1)/2} r'_j \right)^2 = \sum_{r'_j=-(p-1)/2}^{(p-1)/2} r'^2_j + 2 \sum_{r'_j=-(p-1)/2}^{(p-3)/2} \; \sum_{r'_{j'}=r'_j+1}^{(p-1)/2} r'_j r'_{j'} = 0$
and hence
(12) $p(p-1)\,E\,r'_j r'_{j'} = -p\,E\,r'^2_j = -p(p^2-1)/12.$
arranging the order of the summation signs we have
(19) $\chi_r^2 = (p-1) + \dfrac{24}{p(p+1)n} \sum_{i=1}^{n-1} \sum_{i'=i+1}^{n} \sum_{j=1}^{p} r'_{ij} r'_{i'j}.$
Taking the expected value of both sides gives
(20) $E\chi_r^2 = (p-1) + \dfrac{24}{p(p+1)n} \sum_{i=1}^{n-1} \sum_{i'=i+1}^{n} \sum_{j=1}^{p} E(r'_{ij} r'_{i'j}).$
The k-th moment of $\chi_r^2$ about its mean value can therefore be
obtained by evaluating the expected value of the k-th power of the right
hand side of (22).
To determine the variance of $\chi_r^2$ first note that $\sum_{j=1}^{p} r'_{ij} r'_{i'j}$ is in-
(23) EE1 iZ
(23) r r
1~~r11 r11- = E Z r r'i
ir '
~~~~~Ii=1
\G i'=i+1
-
il j= /
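But
$$E\left( \sum_{j=1}^{p} r'_{ij} r'_{i'j} \right)^2 = E\left( \sum_{j=1}^{p} r'^2_{ij} r'^2_{i'j} + 2 \sum_{j=1}^{p-1} \sum_{j'=j+1}^{p} r'_{ij} r'_{i'j} r'_{ij'} r'_{i'j'} \right)$$
$$= \sum_{j=1}^{p} E\,r'^2_{ij}\; E\,r'^2_{i'j} + 2 \sum_{j=1}^{p-1} \sum_{j'=j+1}^{p} E(r'_{ij} r'_{ij'})\, E(r'_{i'j} r'_{i'j'}).$$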
EL~ l
i'-i= 1 2 122
(26) $\sigma^2 = 2\,\dfrac{n-1}{n}\,(p-1).$
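For very small tables the mean $p-1$ and the variance $2(n-1)(p-1)/n$ can be confirmed by brute force. The sketch below enumerates every table, assuming all rankings of a row are equally likely and independent between rows; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product


def exact_mean_and_variance(p, n):
    """Mean and variance of chi_r^2 over all (p!)**n equally likely tables."""
    rows = list(permutations(range(1, p + 1)))
    values = []
    for table in product(rows, repeat=n):
        totals = [sum(column) for column in zip(*table)]
        chi = Fraction(12 * sum(t * t for t in totals), n * p * (p + 1)) - 3 * n * (p + 1)
        values.append(chi)
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


for p, n in [(3, 2), (3, 3), (4, 2)]:
    mean, variance = exact_mean_and_variance(p, n)
    print(p, n, mean, variance, Fraction(2 * (n - 1) * (p - 1), n))
```

The enumeration grows as $(p!)^n$, so this check is practical only for the smallest tables; it is offered as a sanity check on the formula rather than as a derivation.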
To determine the third moment of $\chi_r^2$ about its mean note that the
only term in the expansion of
n-1 n p 3
(27) 6ZE
i=1 j'=i+1
Ei''=i'+1L ErIiixr:,j
j=i j=
rij rj,,j
j
r'isr'ji,"iJ
P-2 p-1 P
+ 6E E -
j=1 j'=j+l j"=j'+i
E E r1ijr'j S
(29) --+ j
, ,
Zr i r 2j 1
(32) r' = =,rlir
'F" r12
_ r/2l
r'22 p( 1)