Author(s): Jacob Cohen
Source: Current Directions in Psychological Science, Vol. 1, No. 3 (Jun., 1992)
Published by: Sage Publications, Inc. on behalf of Association for Psychological Science
Stable URL: http://www.jstor.org/stable/20182143
Sage Publications, Inc. and Association for Psychological Science are collaborating with JSTOR to digitize,
preserve and extend access to Current Directions in Psychological Science.
98 VOLUME 1, NUMBER 3, JUNE 1992
Statistical Power Analysis

Jacob Cohen

The power of a statistical test of a null hypothesis (H₀) is the probability that the H₀ will be rejected when it is false, that is, the probability of obtaining a statistically significant result. Statistical power depends on the significance criterion (α), the sample size (N), and the population effect size (ES).

The importance of power analysis arises from the fact that most empirical research in the social and behavioral sciences proceeds by formulating and testing H₀s that the investigators hope to reject as a means of establishing facts about the phenomena under study.

A typical H₀ is that a population product-moment correlation, r, is zero, to be tested at the two-sided (α₂ =) .05 level. When this H₀ is tested on a sample of N cases randomly drawn from a population in which r indeed does equal zero, researchers risk mistakenly rejecting the H₀ when it is true, a Type I error, whose rate (.05) is controlled by the α criterion. They also risk mistakenly accepting the H₀ as tenable when it is false, a Type II error, whose probability is called β. Power is thus 1 − β, the probability of not accepting the H₀ when it is false, that is, the probability of successfully rejecting the H₀.

The outcome of a statistical test depends on the degree to which the H₀ is false, that is, on the magnitude of the population ES, which in this case is the absolute size of the population r: the larger the r, the greater the likelihood that the H₀ will be rejected. It is also true that the outcome depends on N, a larger sample being more likely to result in rejection of a false H₀ than a smaller one. Thus, at α₂ = .05, for example, if the population r is .30, when N is 40, the power of the standard t test of a sample r turns out to equal .48, whereas when N is 80, power is .78. If the population r is .40, when N is 40, power is .74, but when N is 80, power is .96. Finally, the test outcome depends also on α, the risk of a Type I error. A smaller and therefore more stringent α criterion, say, α₂ = .01, for any given population r and N, would result in smaller power. For example, with population r = .30 and N = 80, while power at α₂ = .05 is .78, power at α₂ = .01 is only .56.¹ Note also that at any given value of α, a two-sided test is more stringent than a one-sided test.

Statistical power analysis exploits the mathematical relationship among these four variables in statistical inference: power, α, N, and ES. The relationship is such that when any three of them are fixed, the fourth is determined. Two forms of power analysis are most useful: One is the determination of the N that is necessary to attain a specified degree of power to detect as significant (at specified α) a hypothesized ES. This form of power analysis is used in research planning. The second is the determination of power to detect a hypothesized ES (for specified N and α), the form used in meta-analytic power reviews of research areas or journals.

EFFECT SIZE

I noted that in testing a sample r, the ES is simply the population r. More generally, in the Neyman-Pearson system of statistical induction,² whence the concept of power is derived, the ES is the discrepancy between the null hypothesis, H₀, and the alternate hypothesis of interest, H₁. For testing a sample r, the H₀ is that the population r is zero, and the H₁ posits a specific nonzero value, for example, .30. Thus, the ES in this example is simply the difference: .30 − .00. Every statistical test has its own ES index, a continuous value that runs from zero,

Jacob Cohen, Professor of Psychology at New York University, is the author of Statistical Power Analysis for the Behavioral Sciences (2nd ed., 1988) and co-author with Patricia Cohen of Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (2nd ed., 1983), both published by Lawrence Erlbaum Associates. Address correspondence to Jacob Cohen, Department of Psychology, New York University, 6 Washington Place, 5th Floor, New York, NY 10003.
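
The power figures quoted on this page for the test of a sample r can be reproduced approximately with a short calculation. The sketch below is not the procedure behind the published values. It assumes the Fisher z transformation of r, a normal approximation to the test statistic, and a small-sample adjustment of r / (2(N − 1)); the function name `power_r` is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def power_r(r, n, alpha2=0.05):
    """Approximate power of the two-sided test that a population r is zero.

    Assumptions (not taken from the article): Fisher z transformation,
    normal approximation, and the small-sample adjustment r / (2 * (n - 1)).
    """
    std_normal = NormalDist()
    # Expected value of the test statistic under the hypothesized population r.
    z_effect = (atanh(r) + r / (2 * (n - 1))) * sqrt(n - 3)
    # Two-sided critical value for the chosen significance criterion.
    z_crit = std_normal.inv_cdf(1 - alpha2 / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


cases = [(0.30, 40, 0.05), (0.30, 80, 0.05), (0.40, 40, 0.05),
         (0.40, 80, 0.05), (0.30, 80, 0.01)]
for r, n, alpha2 in cases:
    print(f"r={r:.2f}  N={n}  alpha2={alpha2:.2f}  power={power_r(r, n, alpha2):.2f}")
```

Under these assumptions the five results should agree with the quoted values (.48, .78, .74, .96, and .56) to about two decimal places. The example also makes the three dependencies visible: power rises with the population r and with N, and falls when α₂ is tightened from .05 to .01.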
research costs are at least approximately linear in the number of subjects, cost-effectiveness demands that this decision be appropriate.

When asked in connection with a particular investigation what α and power are desired, a neophyte researcher might suggest α₂ = .01 and some very large value for power, say, .99. Power analysis quickly determines that these specifications necessitate a sample size that is likely beyond the available resources. For example, for a test of the difference between means, if a medium ES (d = .5) exists in the population, these specifications require 194 cases in each of the two samples. Similarly, they require that if population r = .30, a test of the significance of a sample r have 254 cases. For α₂ = .05 and .99 power, the N requirements are, respectively, 148 and 195.

To determine the necessary sample size, one needs to posit the α, ES, and desired power. I have proposed as a convention that in the absence of any other basis for setting the value for desired power, .80 be used.¹ In scientific research, it is typically more serious to make a false positive claim (Type I error) than a false negative one (Type II error). Because the implicit convention for significance is α = .05, the use of the convention .80 for desired power (hence, β = .20) makes the Type II error 4 times as likely as the Type I error, an arbitrary but reasonable reflection of their relative importance.⁴

A useful aid in determining the necessary sample size is a sample size planning table. To prepare such a table, the investigator selects values or ranges of values for α, ES, and power and then determines the N for each combination. This table provides the basis for a judicious choice of specifications or leads to the useful discovery that the research as conceived is not viable.³

Recall the investigator pursuing the question of a sex difference in the incidence of dyslexia. If in a population of dyslexic children half are boys, there is no sex difference, so H₀ is P = .50. Departure from .50 would render H₀ false. The ES index for this test is g = P − .50, the departure of the proportion from one half. If the investigator's resources are such that she could obtain an N of 90 to 100, and her expectation is a value of g in the range .10 to 1/6, she might compile the sample size planning table shown in Table 1 by looking up various combinations of α₂ and g that would result in Ns within the desired range and noting the resulting power. From this table, she could choose a set of specifications.

Table 1. A sample size planning table

α₂     g      Power   N
.01    1/6    .75     92
.02    .15    .75     98
.02    1/6    .85     98
.05    .10    .50     96
.05    .15    .85     97
.10    .10    .60     90
.10    .15    .90     91
.10    1/6    .95     92
.20    .15    .95     90

DETERMINING POWER

There is a useful role for power analysis in assessing completed research, particularly research in which nonsignificant results were obtained. Given the N employed and α, one needs only to posit the population ES to determine power. The sample ES found, or one or more ES values posited by the assessor, may serve this purpose. It is a common finding that power was poor for plausible ESs, usually because of small N.

In 1962, I reviewed the articles in the 1960 volume of the Journal of Abnormal and Social Psychology from the perspective of power.⁵ I determined power for each statistical test in each article using the N employed at α₂ = .01, .05, and .10 for the conventional definitions of small, medium, and large ES. I found, for example, that the median power to detect a medium ES at α₂ = .05 was .46. The many power surveys done in the biosocial sciences since that time have had similar results. For example, a similar review by Sedlmeier and Gigerenzer of the 1984 Journal of Abnormal Psychology⁶ found the median power under the same conditions to be a little worse (.44), and it was lower still (.37) when an experimentwise α criterion was employed. Even worse was the finding that in 11% of the studies, the H₀ was taken as the research hypothesis and nonsignificance taken as confirmation: The median power of these studies to detect a medium ES at α₂ = .05 was .25!

CONCLUSION

There has been no disagreement among research methodologists about the desirability of power analysis in research planning and assessment, yet progress in application of this method over the last quarter century has been slow. There have, however, been some rays of hope in the past few years. The popularity of meta-analysis has served to emphasize the size of effects and by thus raising the consciousness of behavioral scientists has promoted the cause of power analysis.³ More directly, both graduate and undergraduate statistics textbooks have begun to feature chapter-length treatments of power analysis.⁷ Finally, in addition to the reference works already noted,¹,⁴ there are available computer programs for power analysis and sample size determination.⁸