Copyright 1992 by The Johns Hopkins University School of Hygiene and Public Health
Printed in U.S.A.
All rights reserved
Received for publication May 8, 1991, and in final form February 11, 1992.
Abbreviations: AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus.
1 Biostatistics Branch, National Cancer Institute, Bethesda, MD.
2 Department of Environmental and Occupational Health, School of Public Health, University of Minnesota, Minneapolis, MN.
Reprint requests to Dr. Sholom Wacholder, Biostatistics Branch, National Cancer Institute, 6130 Executive Blvd., EPN 403, Rockville, MD 20892.
The authors thank Dr. Robert Hoover, Dr. Peter Inskip, Dr. Mitchell Gail, Dr. William Blot, Dr. Patricia Hartge, Dr. Jack Siemiatycki, and Dr. Olli Miettinen for their comments on earlier versions of the manuscripts in this set.

The purpose of this series of papers is to present a theoretical framework for control selection in case-control studies and show how practical issues can be addressed within this framework. We discuss controversial areas of control selection using the framework and attempt to offer advice when there is relevant empiric information or experience to guide us. For the most part, issues of analysis will not be addressed in the review.

In this paper, the first of three, the principles underlying control selection are developed. These principles also apply to the design of cohort studies, as would be expected since the case-control design is simply an efficient sampling technique to measure exposure-disease associations in a cohort or study base. In theory, every case-control study takes place within a cohort, although in practice it can be difficult to characterize the cohort or study base. The identification of the appropriate study base from which to select controls is the primary challenge in the design of case-control studies.

In our second paper (1), we apply the principles presented in this paper to the selection of control groups used in case-control studies, including population controls, hospital controls, medical practice controls, friend controls, and relative controls. We also discuss the use of proxy respondents and deceased controls.

In the third paper of the series (2), we focus on issues encountered after a particular control group has been selected. Some of
the areas discussed are matching, ratio of controls to cases, number of control groups, nested case-control studies, two-stage sampling designs, and issues relating to information bias such as contemporaneity of cases and controls.

We do not intend the principles described and illustrated in these papers to be used for determining whether a study is up to standard. Perfect adherence to a principle can be as difficult to achieve as perfect experimental conditions in a laboratory. Sometimes, one principle can conflict with another. Indeed, tolerating a minor violation of a principle is often the only way to study a particular exposure-disease association. Such a study can still provide valuable information, particularly when the impact of the violation can be evaluated or bounded.

COMPARABILITY PRINCIPLES

Three basic tenets of comparability underlie attempts to minimize bias in control selection. These are the principles of study base, deconfounding, and comparable accuracy.

Study base principle. Cases and controls should be "representative of the same base experience" (3, p. 545). The base is the set of persons or person-time, depending on the context, in which diseased subjects become cases. The base can also be thought of as the members of the underlying cohort or source population for the cases during the time periods when they are eligible to become cases (4). Typically in chronic disease epidemiology, membership in the base is dynamic in the sense that a subject may be in the base at certain times and out of it at other times. The simplest way to satisfy this principle is to choose a random sample of individuals from the same source as the cases; if comparability of time, e.g., age or calendar time, is essential, the sampling should be from the members of the base at risk at the same time as the case's diagnosis. Immigration and emigration from the catchment area affect whether someone is in the study base at a particular time; a subject is in the base only when he or she would be enrolled as a case if diagnosed with disease at the time. A useful paradigm with an explicitly defined study base is the "nested case-control study" (2, 5-7) where controls are selected randomly from the "risk set," the subjects in the cohort who are at risk at the time of diagnosis of each case.

Deconfounding principle. Confounding should not be allowed to distort the estimation of effect. Confounders that are measured can be controlled in the analysis. Unknown or unmeasured confounders should have as little variability as possible. Since this variability is measured conditionally on the levels of other variables being studied, the use of stratification or matching can, in effect, reduce or eliminate the variability of the confounder. For example, using siblings as matched controls in a study of environmental risk factors may result in less variability for genetic risk factors within the matched set and, hence, less confounding than using controls who are not siblings. The extent of bias from an unmeasured or uncontrolled confounder depends on the strengths of the associations between it and the study exposure and disease risk.

Comparable accuracy principle. The degree of accuracy in measuring the exposure of interest for the cases should be equivalent to the degree of accuracy for the controls, unless the effect of the inaccuracy can be controlled in the analysis.

We believe that the results of a case-control study become more credible to the extent that these three principles are met. Strict adherence to the principles of comparability outlined here ensures that an apparent effect is not due to 1) differences in the way cases and controls are selected from the base; 2) distortion of the effect by other, unmeasured, risk factors related to exposure; or 3) differences in the accuracy of the information obtained from cases and controls. The aim of the principles is to reduce or eliminate, respectively, selection bias, confounding bias, and information bias.

However, there is an additional practical principle that constrains attempts at comparability.

Efficiency principle. The study should be
Principles of Control Selection 1021
subpopulation may be of interest in itself; in a representative population, an association that is limited to one group may be obscured because the effect is weaker in other groups or because of differences in the distribution of the exposure. On the other hand, detection of variability of the strength of association (effect modification) can be missed if the study base is narrowly defined. If there is reason to believe that an effect is strongest in one particular subgroup, exclusion of other subgroups might be the best strategy for demonstrating that effect; thus, a study of the effect of a possible risk factor for myocardial infarction might restrict the base to subjects who had a previous one. The power of a study targeted at a subgroup can even be greater than the power of a study of the entire population, despite the reduced number of subjects, when the effect is larger in the subgroup (26). Other grounds for exclusions that may increase statistical or economic efficiency include 1) inconvenience (e.g., subjects likely to be too hard to reach); 2) anticipated low or inaccurate responses (e.g., exclusion of subjects who do not speak the language of the interview); 3) lack of variability in the exposure (27, 28) (e.g., a study of the effects of oral contraceptives on subsequent risk of breast cancer should probably exclude women who were past reproductive age when oral contraceptives were introduced into common use); or 4) subjects at increased risk of disease due to other causes (e.g., subjects at high risk for leukemia as a result of chemotherapy for Hodgkin's disease), because cases from the treated group are likely to be attributable to the treatment and therefore may not contribute much to the understanding of other risk factors.

An exclusion rule that applies equally to cases and controls is valid (29) because it simply refines the scope of the study base. One that applies to one but not the other violates the study base principle. For example, a study design that excludes potential controls who had changed their residence between the time of diagnosis of the matched case and the time of selection but places no analogous restriction on the residential mobility of the cases (30) violates the study base principle, and the estimate of effect for an exposure associated with such residential mobility could be biased (30).

Nonrandom selection from the study base. In theory, choosing the controls to be a random sample from the base ensures that the controls are representative of the base. When random selection is not practical, as when identification of the base is difficult, a nonrandom subset can be selected if a representativeness assumption regarding the study exposure is met: that the distributions of the exposures of interest are the same in the control series as in a random sample of the (secondary) base (3, 9).

For example, hospital controls are a nonrandom subset of the study base rather than a random sample from the study base; the validity of a hospital-based study rests on the (perhaps tenuous) assumption that the distribution of exposure among the chosen hospital controls is the same as in the base itself or differs because of measurable factors (1, 9). This assumption is reasonable when the following two conditions apply.

Identical catchment populations. Subjects who are admitted to the hospital for the case disease would have been admitted to the same hospital for the control disease, and, conversely, subjects who are admitted for the control disease would have been admitted for the case disease. Thus, determinants of hospitalization and the choice of hospital must be considered carefully in studies with hospital controls.

Exposure independent of admission. The exposure is unrelated to the reason for admission of the control.

In the male infertility example considered above, a control series consisting of men whose wives have been identified as infertile at an infertility clinic (11) would be a nonrandom sample of the appropriate secondary base that would have the same determinants of seeking medical attention as the cases. However, it could introduce selection bias for male correlates of causes of female infertility, such as sexually transmitted disease in the husbands of women with pelvic inflammatory disease (11).
mates of effect without affecting validity (2, 35, 36). Generally, matching on variables that are not risk factors is also overmatching, since the matching may reduce the variability in the exposure of interest without controlling for any confounding (2, 36, 37). On the other hand, reduced precision might be inevitable in the presence of confounding, since it can be a consequence of control for confounding in the design and analysis.

Comparable accuracy principle

Error in the measurement of variables is unavoidable in epidemiologic studies, particularly when information is obtained retrospectively. When the bias due to measurement error can be removed in the analysis, as when the relations between the observed and true exposure measurements are known for cases and controls or an appropriate validation study can be used (38, 39), this principle need not influence control selection. For example, measurements made using both "gold standard" and error-prone methods on some study subjects can allow unbiased estimation of the effects of a poorly measured exposure (38-40). Even when cases' information was obtained from one clinic and that of controls from another, subjects for whom information from both clinics was available can be used as a validation study and can yield unbiased estimates under the assumption that being interviewed in both clinics is unrelated to the responses given (41).

When no correction is possible in the analysis, the comparable accuracy principle calls for all measurement errors that result in distortion of the estimates of effect to be nondifferential; i.e., the error distributions should be the same for cases and controls, as seems reasonable when the mechanisms generating the errors for both groups are the same and are not influenced by disease status. In control selection, one needs to consider the accuracy of information that can be obtained from the controls, e.g., whether recollection of past exposures is better if hospital controls are used rather than healthy population controls.

With nondifferential errors, the bias is typically (but not always) in a predictable direction (toward lack of association) and, unless the measurement is so bad as to be negatively correlated with the truth, seldom reverses the direction of the association (42, 43). On the other hand, the effect of differential measurement error on estimates of association is usually unpredictable.

Thus, adherence to the comparable accuracy principle does not eliminate its corresponding bias, information bias. Only elimination of errors (or correction for bias in the analysis using additional information or assumptions (39)) can remove bias entirely. Adherence to this principle may not even reduce bias, as in the hypothetical example presented in table 1. The true odds ratio is 6. When the exposure of the cases is misclassified with specificity and sensitivity both equal to 80 percent, the observed odds ratio from controls with 100 percent specificity and sensitivity will be 3.2 (table 2), which is less biased than the 2.7 that would be observed from controls with 80 percent sensitivity and specificity (table 3). So why make this a principle if adhering to it can increase bias? The rationale is to ensure that a positive finding cannot be induced simply by differences in the accuracy of information about cases and controls. While recent work (42, 44) indicates that equal accuracy does not guarantee bias toward the null, a reversal of the direction of the association seems unlikely.

Differential errors can be hard to avoid in case-control studies in which exposure information is obtained from interviews with the subjects. Even when interviewers can be blinded to the disease status of a subject, the case generally knows the diagnosis at the time of interview. The disease itself and hospitalization and treatment of the disease may change actual habits as well as perception of current and past habits.

The comparable accuracy principle should not be taken to mean that creating strata within which the errors are equal will be helpful. In fact, stratification designed to achieve nondifferential error within strata can increase bias (45). Thus, creating a stra-