SPECIAL REPORT
Medical research relies on clinical trials to assess therapeutic benefits. Because of the effort and cost involved in these studies, investigators frequently use analyses of subgroups of study participants to extract as much information as possible. Such analyses, which assess the heterogeneity of treatment effects in subgroups of patients, may provide useful information for the care of patients and for future research. However, subgroup analyses also introduce analytic challenges and can lead to overstated and misleading results.[1-7] This report outlines the challenges associated with conducting and reporting subgroup analyses, and it sets forth guidelines for their use in the Journal. Although this report focuses on the reporting of clinical trials, many of the issues discussed also apply to observational studies.

SUBGROUP ANALYSES AND RELATED CONCEPTS

Subgroup Analysis

By “subgroup analysis,” we mean any evaluation of treatment effects for a specific end point in subgroups of patients defined by baseline characteristics. The end point may be a measure of treatment efficacy or safety. For a given end point, the treatment effect (a comparison between the treatment groups) is typically measured by a relative risk, odds ratio, or arithmetic difference. The research question usually posed is this: Do the treatment effects vary among the levels of a baseline factor?

A subgroup analysis is sometimes undertaken to assess treatment effects for a specific patient characteristic; this assessment is often listed as a primary or secondary study objective. For example, Sacks et al.[8] conducted a placebo-controlled trial in which the reduction in the incidence of coronary events with the use of pravastatin was examined in a diverse population of persons who had survived a myocardial infarction. In subgroup analyses, the investigators further examined whether the efficacy of pravastatin relative to placebo in preventing coronary events varied according to the patients’ baseline low-density lipoprotein (LDL) levels.

Subgroup analyses are also undertaken to investigate the consistency of the trial conclusions among different subpopulations defined by each of multiple baseline characteristics of the patients. For example, Jackson et al.[9] reported the outcomes of a study in which 36,282 postmenopausal women 50 to 79 years of age were randomly assigned to receive 1000 mg of elemental calcium with 400 IU of vitamin D3 daily or placebo. Fractures, the primary outcome, were ascertained over an average follow-up period of 7.0 years; bone density was a secondary outcome. Overall, no treatment effect was found for the primary outcome; that is, the active treatment was not shown to prevent fractures. The effect of calcium plus vitamin D supplementation relative to placebo on the risk of each of four fracture outcomes was further analyzed for consistency in subgroups defined by 15 characteristics of the participants.

Heterogeneity and Statistical Interactions

The heterogeneity of treatment effects across the levels of a baseline variable refers to the circumstance in which the treatment effects vary across the levels of the baseline characteristic. Heterogeneity is sometimes further classified as being either quantitative or qualitative. In the first case, one treatment is always better than the other, but by various degrees, whereas in the second case, one treatment is better than the other for one subgroup of patients and worse than the other for
another subgroup of patients. Such variation, also called “effect modification,” is typically expressed in a statistical model as an interaction term or terms between the treatment group and the baseline variable. The presence or absence of interaction is specific to the measure of the treatment effect.

The appropriate statistical method for assessing the heterogeneity of treatment effects among the levels of a baseline variable begins with a statistical test for interaction.[10-13] For example, Sacks et al.[8] showed the heterogeneity in pravastatin efficacy by reporting a statistically significant (P = 0.03) result of testing for the interaction between the treatment and baseline LDL level when the measure of the treatment effect was the relative risk. Many trials lack the power to detect heterogeneity in treatment effect; thus, the inability to find significant interactions does not show that the treatment effect seen overall necessarily applies to all subjects. A common mistake is to claim heterogeneity on the basis of separate tests of treatment effects within each of the levels of the baseline variable.[6,7,14] For example, testing the hypothesis that there is no treatment effect in women and then testing it separately in men does not address the question of whether treatment differences vary according to sex. Another common error is to claim heterogeneity on the basis of the observed treatment-effect sizes within each subgroup, ignoring the uncertainty of these estimates.

Multiplicity

It is common practice to conduct a subgroup analysis for each of several (and often many) baseline characteristics, for each of several end points, or for both. For example, the analysis by Jackson and colleagues[9] of the effect of calcium plus vitamin D supplementation relative to placebo on the risk of each of four fracture outcomes for 15 participant characteristics resulted in a total of 60 subgroup analyses.

When multiple subgroup analyses are performed, the probability of a false positive finding can be substantial.[7] For example, if the null hypothesis is true for each of 10 independent tests for interaction at the 0.05 significance level, the chance of at least one false positive result exceeds 40%. Thus, one must be cautious in the interpretation of such results. There are several methods for addressing multiplicity that are based on the use of more stringent criteria for statistical significance than the customary P<0.05.[7,15] A less formal approach for addressing multiplicity is to note the number of nominally significant interaction tests that would be expected to occur by chance alone. For example, after noting that 60 subgroup analyses were planned, Jackson et al.[9] pointed out that “Up to three statistically significant interaction tests (P<0.05) would be expected on the basis of chance alone,” and then they incorporated this consideration in their interpretation of the results.

Prespecified Analysis versus Post hoc Analysis

A prespecified subgroup analysis is one that is planned and documented before any examination of the data, preferably in the study protocol. This analysis includes specification of the end point, the baseline characteristic, and the statistical method used to test for an interaction. For example, the Heart Outcomes Prevention Evaluation 2 investigators[16] conducted a study involving 5522 patients with vascular disease or diabetes to assess the effect of homocysteine lowering with folic acid and B vitamins on the risk of a major cardiovascular event. The primary outcome was a composite of death from cardiovascular causes, myocardial infarction, and stroke. In the Methods section of their article, the authors noted that “Prespecified subgroup analyses involving Cox models were used to evaluate outcomes in patients from regions with folate fortification of food and regions without folate fortification, according to the baseline plasma homocysteine level and the baseline serum creatinine level.” Post hoc analyses refer to those in which the hypotheses being tested are not specified before any examination of the data. Such analyses are of particular concern because it is often unclear how many were undertaken and whether some were motivated by inspection of the data. However, both prespecified and post hoc subgroup analyses are subject to inflated false positive rates arising from multiple testing. Investigators should avoid the tendency to prespecify many subgroup analyses in the mistaken belief that these analyses are free of the multiplicity problem.
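
The arithmetic behind the multiplicity figures quoted in this section, and the basic idea of a test for interaction on the relative-risk scale, can be illustrated with a short sketch. It is not the method used in any of the cited trials. The function names are hypothetical, the event counts are made up for illustration, the Bonferroni threshold is only one example of a "more stringent criterion," and the interaction test shown is a simple large-sample z-test comparing log relative risks in two subgroups, which assumes that all event counts are positive.

```python
import math


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of nominally significant tests under the null."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """One example of a more stringent per-test significance criterion."""
    return alpha / n_tests


def log_relative_risk(events_trt, n_trt, events_ctl, n_ctl):
    """Log relative risk and its large-sample standard error.
    Assumes every event count is greater than zero."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return math.log(rr), se


def interaction_test(sub_a, sub_b):
    """Two-sided z-test of equal log relative risk in two subgroups.
    Each argument is (events_trt, n_trt, events_ctl, n_ctl)."""
    log_rr_a, se_a = log_relative_risk(*sub_a)
    log_rr_b, se_b = log_relative_risk(*sub_b)
    z = (log_rr_a - log_rr_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Multiplicity: 10 independent interaction tests, and 60 subgroup analyses.
print(familywise_error(10))          # about 0.401, i.e. exceeds 40%
print(expected_false_positives(60))  # about 3 tests by chance alone
print(bonferroni_threshold(60))      # per-test threshold for 60 tests

# Interaction: made-up counts, (events_trt, n_trt, events_ctl, n_ctl).
subgroup_a = (50, 500, 75, 500)
subgroup_b = (60, 500, 62, 500)
z, p = interaction_test(subgroup_a, subgroup_b)
print(z, p)
```

With these made-up counts the observed relative risks differ (about 0.67 in one subgroup and about 0.97 in the other), yet the interaction test does not reach P<0.05. This is the situation the text warns about: differences in observed effect sizes, or one subgroup test being significant and the other not, do not by themselves establish heterogeneity.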
[Figure: bar charts with the vertical axis "Trials (no.)". Category labels recoverable from the panels: 1–4, 5–8, >8, Unclear; Never, Sometimes, Always; Never, Sometimes, Always, Inconsistent; and Heterogeneity Not Claimed versus Heterogeneity Claimed, including "Multiplicity Issues Addressed and Heterogeneity Claimed". Panel titles and bar values are not recoverable from the extracted text.]
ods used in order to increase the clarity and completeness of the information reported. As always, these are guidelines and not rules; additions and exemptions can be made as long as there is a clear case for such action.

No potential conflict of interest relevant to this article was reported.

We thank Doug Altman, John Bailar, Colin Begg, Mohan Beltangady, Marc Buyse, David DeMets, Stephen Evans, Thomas Fleming, David Harrington, Joe Heyse, David Hoaglin, Michael Hughes, John Ioannidis, Curtis Meinert, James Neaton, Robert O’Neill, Ross Prentice, Stuart Pocock, Robert Temple, Janet Wittes, and Marvin Zelen for their helpful comments.

1. Yusuf S, Wittes J, Probstfield J, Tyroler HA. Analysis and interpretation of treatment effects in subgroups of patients in randomized clinical trials. JAMA 1991;266:93-8.
2. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000;355:1064-9.
3. Pocock SJ, Assmann SF, Enos LE, Kasten LE. Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med 2002;21:2917-30.
4. Hernández A, Boersma E, Murray G, Habbema J, Steyerberg E. Subgroup analyses in therapeutic cardiovascular clinical trials: are most of them misleading? Am Heart J 2006;151:257-64.
5. Parker AB, Naylor CD. Subgroups, treatment effects, and baseline risks: some lessons from major cardiovascular trials. Am Heart J 2000;139:952-61.
6. Rothwell PM. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation. Lancet 2005;365:176-86.
7. Lagakos SW. The challenge of subgroup analyses — reporting without distorting. N Engl J Med 2006;354:1667-9. [Erratum, N Engl J Med 2006;355:533.]
8. Sacks FM, Pfeffer MA, Moye LA, et al. The effect of pravastatin on coronary events after myocardial infarction in patients with average cholesterol levels. N Engl J Med 1996;335:1001-9.
9. Jackson RD, LaCroix AZ, Gass M, et al. Calcium plus vitamin D supplementation and the risk of fractures. N Engl J Med 2006;354:669-83. [Erratum, N Engl J Med 2006;354:1102.]
10. Pocock SJ. Clinical trials: a practical approach. Chichester, England: John Wiley, 1983.
11. Halperin M, Ware JH, Byar DP, et al. Testing for interaction in an I×J×K contingency table. Biometrika 1977;64:271-5.
12. Simon R. Patient subsets and variation in therapeutic efficacy. Br J Clin Pharmacol 1982;14:473-82.
13. Gail M, Simon R. Testing for qualitative interactions between treatment effects and patient subsets. Biometrics 1985;41:361-72.
14. Brookes ST, Whitely E, Egger M, Smith GD, Mulheran PA, Peters T. Subgroup analyses in randomized trials: risks of subgroup-specific analyses; power and sample size for the interaction test. J Clin Epidemiol 2004;57:229-36.
15. Bailar JC III, Mosteller F, eds. Medical uses of statistics. 2nd ed. Waltham, MA: NEJM Books, 1992.
16. Lonn E, Yusuf S, Arnold MJ, et al. Homocysteine lowering with folic acid and B vitamins in vascular disease. N Engl J Med 2006;354:1567-77. [Erratum, N Engl J Med 2006;355:746.]
17. Lees KR, Zivin JA, Ashwood T, et al. NXY-059 for acute ischemic stroke. N Engl J Med 2006;354:588-600.
18. Al-Marzouki S, Roberts I, Marshall T, Evans S. The effect of scientific misconduct on the results of clinical trials: a Delphi survey. Contemp Clin Trials 2005;26:331-7.
19. Moher D, Schulz KF, Altman DG, et al. The CONSORT Statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. (Accessed November 1, 2007, at http://www.consort-statement.org/.)
20. International Conference on Harmonisation (ICH). Guidance for industry: E9 statistical principles for clinical trials. Rockville, MD: Food and Drug Administration, September 1998. (Accessed November 1, 2007, at http://www.fda.gov/cder/guidance/ICH_E9-fnl.PDF.)
21. Cuzick J. Forest plots and the interpretation of subgroups. Lancet 2005;365:1308.
22. Wactawski-Wende J, Kotchen JM, Anderson GL, et al. Calcium plus vitamin D supplementation and the risk of colorectal cancer. N Engl J Med 2006;354:684-96.

Copyright © 2007 Massachusetts Medical Society.