
American Journal of Epidemiology, Vol. 135, No. 9
Copyright 1992 by The Johns Hopkins University School of Hygiene and Public Health. Printed in U.S.A.
All rights reserved
Selection of Controls in Case-Control Studies

I. Principles
Sholom Wacholder,1 Joseph K. McLaughlin,1 Debra T. Silverman,1 and Jack S. Mandel2
A synthesis of classical and recent thinking on the issues involved in selecting controls for case-control studies is presented in this and two companion papers (S. Wacholder et al. Am J Epidemiol 1992;135:1029-50). In this paper, a theoretical framework for selecting controls in case-control studies is developed. Three principles of comparability are described: 1) study base, that all comparisons be made within the study base; 2) deconfounding, that comparisons of the effects of the levels of exposure on disease risk not be distorted by the effects of other factors; and 3) comparable accuracy, that any errors in measurement of exposure be nondifferential between cases and controls. These principles, if adhered to in a study, can reduce selection, confounding, and information bias, respectively. The principles, however, are constrained by an additional efficiency principle regarding resources and time. Most problems and controversies in control selection reflect trade-offs among these four principles. Am J Epidemiol 1992;135:1019-28.

bias (epidemiology); epidemiologic methods; prospective studies; retrospective studies

Received for publication May 8, 1991, and in final form February 11, 1992.
Abbreviations: AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus.
1 Biostatistics Branch, National Cancer Institute, Bethesda, MD.
2 Department of Environmental and Occupational Health, School of Public Health, University of Minnesota, Minneapolis, MN.
Reprint requests to Dr. Sholom Wacholder, Biostatistics Branch, National Cancer Institute, 6130 Executive Blvd., EPN 403, Rockville, MD 20892.
The authors thank Dr. Robert Hoover, Dr. Peter Inskip, Dr. Mitchell Gail, Dr. William Blot, Dr. Patricia Hartge, Dr. Jack Siemiatycki, and Dr. Olli Miettinen for their comments on earlier versions of the manuscripts in this set.

The purpose of this series of papers is to present a theoretical framework for control selection in case-control studies and show how practical issues can be addressed within this framework. We discuss controversial areas of control selection using the framework and attempt to offer advice when there is relevant empiric information or experience to guide us. For the most part, issues of analysis will not be addressed in the review.

In this paper, the first of three, the principles underlying control selection are developed. These principles also apply to the design of cohort studies, as would be expected since the case-control design is simply an efficient sampling technique to measure exposure-disease associations in a cohort or study base. In theory, every case-control study takes place within a cohort, although in practice it can be difficult to characterize the cohort or study base. The identification of the appropriate study base from which to select controls is the primary challenge in the design of case-control studies.

In our second paper (1), we apply the principles presented in this paper to the selection of control groups used in case-control studies, including population controls, hospital controls, medical practice controls, friend controls, and relative controls. We also discuss the use of proxy respondents and deceased controls.

In the third paper of the series (2), we focus on issues encountered after a particular control group has been selected. Some of
the areas discussed are matching, ratio of controls to cases, number of control groups, nested case-control studies, two-stage sampling designs, and issues relating to information bias such as contemporaneity of cases and controls.

We do not intend the principles described and illustrated in these papers to be used for determining whether a study is up to standard. Perfect adherence to a principle can be as difficult to achieve as perfect experimental conditions in a laboratory. Sometimes, one principle can conflict with another. Indeed, tolerating a minor violation of a principle is often the only way to study a particular exposure-disease association. Such a study can still provide valuable information, particularly when the impact of the violation can be evaluated or bounded.

COMPARABILITY PRINCIPLES

Three basic tenets of comparability underlie attempts to minimize bias in control selection. These are the principles of study base, deconfounding, and comparable accuracy.

Study base principle. Cases and controls should be "representative of the same base experience" (3, p. 545). The base is the set of persons or person-time, depending on the context, in which diseased subjects become cases. The base can also be thought of as the members of the underlying cohort or source population for the cases during the time periods when they are eligible to become cases (4). Typically in chronic disease epidemiology, membership in the base is dynamic in the sense that a subject may be in the base at certain times and out of it at other times. The simplest way to satisfy this principle is to choose a random sample of individuals from the same source as the cases; if comparability of time, e.g., age or calendar time, is essential, the sampling should be from the members of the base at risk at the same time as the case's diagnosis. Immigration and emigration from the catchment area affect whether someone is in the study base at a particular time; a subject is in the base only when he or she would be enrolled as a case if diagnosed with disease at the time. A useful paradigm with an explicitly defined study base is the "nested case-control study" (2, 5-7) where controls are selected randomly from the "risk set," the subjects in the cohort who are at risk at the time of diagnosis of each case.

Deconfounding principle. Confounding should not be allowed to distort the estimation of effect. Confounders that are measured can be controlled in the analysis. Unknown or unmeasured confounders should have as little variability as possible. Since this variability is measured conditionally on the levels of other variables being studied, the use of stratification or matching can, in effect, reduce or eliminate the variability of the confounder. For example, using siblings as matched controls in a study of environmental risk factors may result in less variability for genetic risk factors within the matched set and, hence, less confounding than using controls who are not siblings. The extent of bias from an unmeasured or uncontrolled confounder depends on the strengths of the associations between it and the study exposure and disease risk.

Comparable accuracy principle. The degree of accuracy in measuring the exposure of interest for the cases should be equivalent to the degree of accuracy for the controls, unless the effect of the inaccuracy can be controlled in the analysis.

We believe that the results of a case-control study become more credible to the extent that these three principles are met. Strict adherence to the principles of comparability outlined here ensures that an apparent effect is not due to 1) differences in the way cases and controls are selected from the base; 2) distortion of the effect by other, unmeasured, risk factors related to exposure; or 3) differences in the accuracy of the information obtained from cases and controls. The aim of the principles is to reduce or eliminate, respectively, selection bias, confounding bias, and information bias.

However, there is an additional practical principle that constrains attempts at comparability.

Efficiency principle. The study should be
implemented so as to learn as much as possible about the questions being investigated for a fixed expenditure of time and resources.

Study base principle

The importance of defining the study base in epidemiologic investigations has been recognized for a long time (8). Miettinen (3, 9, 10) distinguishes between a primary base and a secondary base. In a study with a primary base, the base is defined by the population experience that the investigator wishes to target, with the cases being subjects within the base who develop the disease. A population-based case-control study is an example of a study that uses a primary base, where population experience is defined geographically and temporally. However, particularly when ascertainment of all cases in a primary base is difficult or impractical, it may be preferable to use a secondary base, where the cases are defined before the base is identified. In this approach, the base is defined as the source of the cases, and controls are individuals who would have become study cases if they had developed disease during the time of the investigation (9). For example, in a hospital-based study, the cases might be all patients diagnosed with the study disease at one hospital; the individuals contributing to the (secondary) base would be all subjects who would be diagnosed at that hospital had they developed the study disease.

Thus, while the major challenge with a primary base is complete case identification in the base, the major challenge with a secondary base is definition of the study base. Sometimes it may not be possible to resolve definitively whether and when a particular person is in the secondary base. Whether the base is primary or secondary, the critical point is that the base and the cases need to be defined so that the cases consist, exclusively, of all (or a random sample of) subjects experiencing the study outcome in the base, and that the controls are derived from the base and can be used to estimate the exposure distribution in that base.

The fundamental trade-off between a primary base and a secondary base is that it is easier to sample for controls from a well-defined primary base than from a secondary base, where it may not be obvious whether or not an individual is a member of the base; on the other hand, case ascertainment is complete by definition in a secondary base but can be problematic with a primary base. Selection factors affecting which cases are ascertained and included in the study or the accuracy of identification of the base can cause bias in either a primary or secondary base setting. Identification of a setting where no selection factor operates on the cases or on the sample of the base is often a major challenge in case-control studies, as in the three following examples.

Referral hospital. In a study where the cases are subjects who were treated at a referral hospital, the (secondary) base consists of those individuals who would have been treated at that hospital had they been diagnosed with the study disease. The difficulty, of course, is in identifying exactly who would have been referred to that hospital had they developed the study disease.

Underascertainment of cases. Incomplete case identification can be substantial for diseases with mild symptoms and for those that do not require medical attention; hence, there could be a spurious association with variables related to utilization of medical services in a study using self-identified cases. A primary base would be unworkable in a study of male infertility, since infertile men will not become cases unless they are attempting to have children and seek medical help (11). A secondary base approach would restrict controls to men who, if they were infertile, would seek help, just as all the cases have. Failure to restrict the secondary base accordingly, and thereby failure to exclude controls who would not seek medical advice, could result in a misleading association with correlates of seeking medical attention.

Temporal differences. When cases are diagnosed long before controls are selected, it can be difficult to reconstruct the base that was contemporaneous with disease incidence.

Problems in identifying the base sometimes make it very difficult to choose the study base that would be the most scientifically informative. This is particularly true when becoming a case is contingent on a previous condition, as in the following examples.

Screening. A simple and powerful approach to evaluate the efficacy of screening for breast cancer would compare mortality in screened and unscreened women in a base of women who had developed early stage breast cancer (12, 13). However, it would be difficult for a case-control study to use this approach because of the problem in identifying members of the base for the denominator of the mortality rate, particularly in unscreened women. Thus, the standard but less efficient approach for case-control studies is to choose controls from a broader base consisting of women at risk for breast cancer (14, 15).

Prenatal survival. Exposure to human teratogens may affect prenatal survival and, thus, the opportunity to observe a congenital malformation. This can lead to misleading estimates of effects in case-control studies using livebirths as controls (16).

Spontaneous abortion. The ideal base for a case-control study of previous therapeutic abortion on risk of ectopic (tubal) pregnancy would be women who conceive. However, identification of women with intrauterine pregnancies who spontaneously abort would be incomplete, since the women themselves may never become aware of the conception (17). If women who had a previous therapeutic abortion are at extra risk for unnoticed spontaneous abortions, the proportion of missed intrauterine conceptions will differ by exposure, and use of this base could be prone to bias. If the base is women who are trying to conceive, it would be difficult to separate the effects of factors related to conception itself, such as contraceptive use, from those leading to ectopic pregnancy in women who do conceive.

Acquired immunodeficiency syndrome (AIDS) after human immunodeficiency virus (HIV) infection. In studies of progression to AIDS after HIV infection, the time of seroconversion is typically unknown. Thus, defining a base of HIV-positive subjects is difficult (18).

Sampling from the study base. In simple random sampling, controls are selected randomly from the base. Therefore, each eligible individual has the same probability of selection as a control, and the sampling is independent; i.e., the presence of a specific subject in the sample does not make the presence of any other more or less likely. In stratified sampling and frequency matching, the base is subdivided into strata determined by factors such as age and sex, and the sampling fraction is allowed to vary across strata. More complex sampling schemes, such as two-stage (19-21) and cluster sampling plans (22), can be used as long as the joint distribution of the exposures of interest in the base can be estimated without bias; generally these require knowledge of the relative sampling fractions and a nonstandard analysis.

Selection bias can be introduced when the sampling fractions for individuals in the base depend on an exposure variable in an unknown way. This dependence is typically indirect and inadvertent, such as when control selection by telephone tends to exclude poor people without phones. However, an analysis of the effects of other variables will be unbiased when the source of the dependence can be identified and handled in the analysis as if it were a confounder (23, 24). Unfortunately, recognizing the presence of selection bias can be quite difficult, and this solution requires identification of the selection factor. As with confounding, there is no bias when the selection probability depends on a factor that is unrelated to the exposure.

The study base principle entails the requirement of representativeness of the base but not necessarily of the general population. Representativeness of the general population is crucial in estimating the prevalence of disease, the attributable risk, or the distribution of a variable in a population based on a sample (25). But representativeness, per se, is not needed in analytical studies of the relation between an exposure and disease (9, 25). An association found in any
subpopulation may be of interest in itself; in a representative population, an association that is limited to one group may be obscured because the effect is weaker in other groups or because of differences in the distribution of the exposure. On the other hand, detection of variability of the strength of association (effect modification) can be missed if the study base is narrowly defined. If there is reason to believe that an effect is strongest in one particular subgroup, exclusion of other subgroups might be the best strategy for demonstrating that effect; thus, a study of the effect of a possible risk factor for myocardial infarction might restrict the base to subjects who had a previous one. The power of a study targeted at a subgroup can even be greater than the power of a study of the entire population, despite the reduced number of subjects, when the effect is larger in the subgroup (26). Other grounds for exclusions that may increase statistical or economic efficiency include 1) inconvenience (e.g., subjects likely to be too hard to reach); 2) anticipated low or inaccurate responses (e.g., exclusion of subjects who do not speak the language of the interview); 3) lack of variability in the exposure (27, 28) (e.g., a study of the effects of oral contraceptives on subsequent risk of breast cancer should probably exclude women who were past reproductive age when oral contraceptives were introduced into common use); or 4) subjects at increased risk of disease due to other causes (e.g., subjects at high risk for leukemia as a result of chemotherapy for Hodgkin's disease), because cases from the treated group are likely to be attributable to the treatment and therefore may not contribute much to the understanding of other risk factors.

An exclusion rule that applies equally to cases and controls is valid (29) because it simply refines the scope of the study base. One that applies to one but not the other violates the study base principle. For example, a study design that excludes potential controls who had changed their residence between the time of diagnosis of the matched case and the time of selection but places no analogous restriction on the residential mobility of the cases (30) violates the study base principle, and the estimate of effect for an exposure associated with such residential mobility could be biased (30).

Nonrandom selection from the study base. In theory, choosing the controls to be a random sample from the base ensures that the controls are representative of the base. When random selection is not practical, as when identification of the base is difficult, a nonrandom subset can be selected if a representativeness assumption regarding the study exposure is met: that the distributions of the exposures of interest are the same in the control series as in a random sample of the (secondary) base (3, 9).

For example, hospital controls are a nonrandom subset of the study base rather than a random sample from the study base; the validity of a hospital-based study rests on the (perhaps tenuous) assumption that the distribution of exposure among the chosen hospital controls is the same as in the base itself or differs because of measurable factors (1, 9). This assumption is reasonable when the following two conditions apply.

Identical catchment populations. Subjects who are admitted to the hospital for the case disease would have been admitted to the same hospital for the control disease, and, conversely, subjects who are admitted for the control disease would have been admitted for the case disease. Thus, determinants of hospitalization and the choice of hospital must be considered carefully in studies with hospital controls.

Exposure independent of admission. The exposure is unrelated to the reason for admission of the control.

In the male infertility example considered above, a control series consisting of men whose wives have been identified as infertile at an infertility clinic (11) would be a nonrandom sample of the appropriate secondary base that would have the same determinants of seeking medical attention as the cases. However, it could introduce selection bias for male correlates of causes of female infertility, such as sexually transmitted disease in the husbands of women with pelvic inflammatory disease (11).

Use of deterministic (nonrandom) schemes for control selection, such as choosing the case's best friend or neighbors, can avoid the need for a representativeness assumption for exposure if 1) the base is divided into nonoverlapping strata and 2) all members of the base in the stratum that includes the case are selected as controls (31). Thus, instead of random selection, a 100 percent sample (31) from a (typically very small) stratum of the study base is chosen. (Strictly speaking, this would not be a case-control study, since no sampling is involved; it is a cohort study where all the strata with no cases can be ignored.) Together, these two requirements imply reciprocity (31). If A is included as a control for B, then B would have to have been included as a control for A, if A had become the case; this is exactly what is done in a cohort study. In practice, selection of a subset of the stratum deterministically would not produce bias, unless the selection were related to exposure (31). But the possibility of bias does exist with any scheme that allows control selection to be determined by the case or the case's physician.

Controls from outside the study base. A proxy control series from outside the base can be used as an "indirect way to probe the base" (9, p. 82), if the representativeness of exposure assumption is met. For example, in a study where blood group is the exposure of interest, use of females as controls when the actual base consists only of males would be theoretically acceptable, under the assumption that blood group distribution does not vary by sex (32). (Of course, published rates on the distribution of blood group might obviate the need for any controls.) In more common situations, it may not be known whether the representativeness assumption actually holds for a given exposure. The validity of the assumption for each exposure studied needs to be assessed individually.

Controls currently living in a neighborhood who are chosen to match cases diagnosed several years earlier should be excluded since they are outside the study base. Excluding controls who have moved into the neighborhood since diagnosis of the case reduces the problem but does not solve it, since people who moved out of the neighborhood will still be missed.

Deconfounding principle

While the study base principle clarifies who can be entered into the study, the deconfounding principle addresses the problem created when the study exposure is associated with other risk factors. The principle applies to control selection with respect to unmeasured confounders, since measured confounders can be handled in the analysis.

Confounding can bias the results of any epidemiologic study. Complete assurance of control of confounding is achieved (in theory) by eliminating the variability in the confounding factor. Thus, if the study base consists entirely of males, there can be no confounding by sex. Some control for confounding by genotype might be achieved by the use of relatives of the cases as matched controls. Similarly, controls are sometimes selected to match the neighborhood of the case in order to control for unknown risk factors relating to socioeconomic and ethnic variables or, particularly, access to medical care, which is difficult to control for otherwise. However, controlling for the confounding effects of a risk or selection factor by matching on its correlate or proxy does not eliminate confounding bias (33).

This principle, however, can conflict with the efficiency principle. Selecting controls to have the same values of confounders as cases results in controls who are likely to be more similar to cases with respect to exposure (34); i.e., restricting the variability of the confounding variable will also reduce the conditional variability of the exposure of interest when the exposure and confounder are highly correlated. Studying a population that is almost uniform with respect to unmeasured confounders but also nearly uniform on the exposures of interest is not an effective strategy (35); it is a form of overmatching (in the sense that subjects are effectively, if not deliberately, "matched" on the exposure) that can reduce the precision of estimates of effect without affecting validity (2, 35, 36). Generally, matching on variables that are not risk factors is also overmatching, since the matching may reduce the variability in the exposure of interest without controlling for any confounding (2, 36, 37). On the other hand, reduced precision might be inevitable in the presence of confounding, since it can be a consequence of control for confounding in the design and analysis.

Comparable accuracy principle

Error in the measurement of variables is unavoidable in epidemiologic studies, particularly when information is obtained retrospectively. When the bias due to measurement error can be removed in the analysis, as when the relations between the observed and true exposure measurements are known for cases and controls or an appropriate validation study can be used (38, 39), this principle need not influence control selection. For example, measurements made using both "gold standard" and error-prone methods on some study subjects can allow unbiased estimation of the effects of a poorly measured exposure (38-40). Even when cases' information was obtained from one clinic and that of controls from another, subjects for whom information from both clinics was available can be used as a validation study and can yield unbiased estimates under the assumption that being interviewed in both clinics is unrelated to the responses given (41).

When no correction is possible in the analysis, the comparable accuracy principle calls for all measurement errors that result in distortion of the estimates of effect to be nondifferential; i.e., the error distributions should be the same for cases and controls, as seems reasonable when the mechanisms generating the errors for both groups are the same and are not influenced by disease status. In control selection, one needs to consider the accuracy of information that can be obtained from the controls, e.g., whether recollection of past exposures is better if hospital controls are used rather than healthy population controls.

With nondifferential errors, the bias is typically (but not always) in a predictable direction (toward lack of association) and, unless the measurement is so bad as to be negatively correlated with the truth, seldom reverses the direction of the association (42, 43). On the other hand, the effect of differential measurement error on estimates of association is usually unpredictable.

Thus, adherence to the comparable accuracy principle does not eliminate its corresponding bias, information bias. Only elimination of errors (or correction for bias in the analysis using additional information or assumptions (39)) can remove bias entirely. Adherence to this principle may not even reduce bias, as in the hypothetical example presented in table 1. The true odds ratio is 6. When the exposure of the cases is misclassified with specificity and sensitivity both equal to 80 percent, the observed odds ratio from controls with 100 percent specificity and sensitivity will be 3.2 (table 2), which is less biased than the 2.7 that would be observed from controls with 80 percent sensitivity and specificity (table 3). So why make this a principle if adhering to it can increase bias? The rationale is to ensure that a positive finding cannot be induced simply by differences in the accuracy of information about cases and controls. While recent work (42, 44) indicates that equal accuracy does not guarantee bias toward the null, a reversal of the direction of the association seems unlikely.

TABLE 1. Hypothetical example: exposure classified correctly

Measured exposure    No. of cases    No. of controls    Observed odds ratio
Present              800             400                6.00
Absent               200             600                1.00
Total                1,000           1,000

TABLE 2. Hypothetical example: exposure misclassified for cases only*

Measured exposure    No. of cases    No. of controls    Observed odds ratio
Present              680             400                3.19
Absent               320             600                1.00
Total                1,000           1,000

* Specificity and sensitivity are 80% for cases and 100% for controls.

TABLE 3. Hypothetical example: exposure misclassified for cases and controls*

Measured exposure    No. of cases    No. of controls    Observed odds ratio
Present              680             440                2.70
Absent               320             560                1.00
Total                1,000           1,000

* Specificity and sensitivity are 80% for both cases and controls.

Differential errors can be hard to avoid in case-control studies in which exposure information is obtained from interviews with the subjects. Even when interviewers can be blinded to the disease status of a subject, the case generally knows the diagnosis at the time of interview. The disease itself and hospitalization and treatment of the disease may change actual habits as well as perception of current and past habits.

The comparable accuracy principle should not be taken to mean that creating strata within which the errors are equal will be helpful. In fact, stratification designed to achieve nondifferential error within strata can increase bias (45). Thus, creating a stratum of direct-interview cases and controls and another for proxy-interview cases and controls does not necessarily reduce bias. Examining the interaction, however, may be helpful since the bias will be greatest in the strata with poorest classification (46).

Comparable opportunity for exposure?

Since the focus of a study should be on whether the risk of disease is related to the level of exposure actually received, cases and controls do not need to have equal opportunity to be exposed (3, 25, 47). Thus, in a study of cancer treatment on subsequent risk of leukemia, a case who received a treatment could be matched to a control whose physician never prescribed that treatment. Of course, when it is easy to identify subsets of subjects without exposure opportunity, they can be excluded on efficiency grounds. For example, in a study of oral contraceptive use and risk of myocardial infarction, it would be foolish to include males since sex is a confounder and since there is no variability in exposure in the male stratum.

Comment

The use of the term "comparability" in the principles delineated above does not necessarily entail equality. Instead, it means that the study results should be as valid as those that would be obtained under equality. Therefore, our framework of comparability principles, under certain assumptions, allows controls to be selected from outside the study base (1, 9); allows external information to be used to correct for an unmeasured confounder (48); and allows for the use of separate validation studies of the exposure for cases and controls to correct for unequal accuracy (49, 50). Thus, violations of "equality" do not always violate the comparability principles.

EFFICIENCY PRINCIPLE

Savings in money and time are two motivations for choosing a case-control design. These factors also affect decisions about other aspects of design, such as the ratio of controls to cases, whether and on which variables to match (3), the source of controls (2), and how they will be recruited. The efficiency principle calls for consideration of costs as well as validity in selection of controls. Statistical efficiency refers to the amount of information obtained per subject; more broadly, efficiency encompasses the time and energy needed to complete the study. For example, even when matching can improve statistical efficiency, the payoff may not be worth the extra effort needed to recruit subjects (51).

We have already seen how the efficiency principle can conflict with the deconfounding principle. When control of confounding is essential for bias reduction, the efficiency principle must be subordinated. However, the principle is important in choosing
among control selection strategies, for example, whether to match or to control in the analysis for each potential confounder (3, 52). Precision of the estimates of effect of a given exposure depends on the variance of the exposure, conditional on the matching factors and the other variables that are adjusted for in the model, regardless of whether or not they are confounders. When a second risk factor is strongly related to exposure and there is a need to control for its confounding effect, any strategy for controlling the effects of the confounder will reduce the conditional variance of the exposure and can reduce efficiency substantially. In a matched-pairs study, this phenomenon is manifested as a reduction in the number of discordant pairs.

SUMMARY

In this paper, we have presented and described what we believe are the major principles underlying control selection in case-control studies. The principles of study base, deconfounding, and comparable accuracy all address the issue of comparability between cases and controls. Perhaps the key concept is that of the study base. If the study base is identified correctly and if controls are chosen from it properly, the exposure experience of the controls should be representative of the individuals who compose the base. At times, however, the pragmatic principle of efficiency limits the investigator's ability to achieve comparability, reflecting the tension between efficiency and comparability inherent in epidemiologic research.

REFERENCES

1. Wacholder S, Silverman DT, McLaughlin JK, et al. Selection of controls in case-control studies. II. Types of controls. Am J Epidemiol 1992;135:
2. Wacholder S, Silverman DT, McLaughlin JK, et al. Selection of controls in case-control studies. III. Design options. Am J Epidemiol 1992;135:1042-50.
3. Miettinen OS. The "case-control" study: valid selection of subjects. J Chronic Dis 1985;38:543-8.
4. Miettinen OS. Response. (Letter). J Clin Epidemiol 1986;39:567.
5. Breslow NE, Lubin JH, Marek P, et al. Multiplicative models and cohort analysis. J Am Stat Assoc 1983;78:1-12.
6. Mantel N. Synthetic retrospective studies and related topics. Biometrics 1973;29:479-86.
7. Liddell FDK, McDonald JC, Thomas DC. Methods of cohort analysis: appraisal by application to asbestos mining (with discussion). J R Stat Soc A 1977;140:469-91.
8. Dorn HF. Some problems arising in prospective and retrospective studies of the etiology of disease. N Engl J Med 1959;261:571-9.
9. Miettinen OS. Theoretical epidemiology: principles of occurrence research in medicine. New York: John Wiley & Sons, Inc, 1985.
10. Miettinen OS. The concept of secondary base. J Clin Epidemiol 1990;43:1016-17.
11. Savitz DA, Pearce N. Control selection with incomplete case ascertainment. Am J Epidemiol 1988;127:1109-17.
12. Tarone RE, Gart JJ. Significance tests for cancer screening trials. Biometrics 1989;45:883-90.
13. Chu KC, Smart CR, Tarone RE. Analysis of breast cancer mortality and stage distribution by age for the health insurance plan clinical trial. J Natl Cancer Inst 1988;80:1125-32.
14. Morrison AS. Case definition in case-control studies of the efficacy of screening. Am J Epidemiol 1982;115:6-8.
15. Weiss NS. Control definition in case-control studies of the efficacy of screening and diagnostic testing. Am J Epidemiol 1983;118:457-60.
16. Khoury MJ, Flanders WD, James LM, et al. Human teratogens, prenatal mortality, and selection bias. Am J Epidemiol 1989;130:361-70.
17. Weiss NS, Daling JR, Chow WH. Control definition in case-control studies of ectopic pregnancy. Am J Public Health 1985;75:67-8.
18. Brookmeyer R, Gail MH. Biases in prevalent cohorts. Biometrics 1987;41:739-49.
19. White JE. A two stage design for the study of the relationship between a rare exposure and a rare disease. Am J Epidemiol 1982;115:119-28.
20. Weinberg CR, Wacholder S. The design and analysis of case-control studies with biased sampling. Biometrics 1990;46:963-75.
21. Breslow NE, Cain KC. Logistic regression for two-stage case-control data. Biometrika 1988;75:11-20.
22. Graubard B, Fears TR, Gail MH. Effects of cluster sampling on epidemiologic analysis in population-based case-control studies. Biometrics 1989;45:1053-71.
23. Breslow N. Design and analysis of case-control studies. Annu Rev Public Health 1982;3:29-54.
24. Breslow NE, Day NE, eds. Statistical methods in cancer research. Vol 1. The analysis of case-control studies. Lyon: International Agency for Research on Cancer, 1980. (IARC scientific publication no. 32).
25. Rothman KJ. Modern epidemiology. Boston: Little, Brown & Company, 1986.
26. Rosenbaum PR. Case definition and power in case-control studies. Stat Med 1984;3:27-34.
27. McKeown-Eyssen GE, Thomas DC. Sample size determination in case-control studies: the influence of the distribution of exposure. J Chronic Dis 1985;38:559-68.
28. Lubin JH, Samet JM, Weinberg C. Design issues in epidemiologic studies of indoor exposure to Rn and risk of lung cancer. Health Phys 1990;59:807-17.
29. Lubin JH, Hartge P. Excluding controls: misapplications in case-control studies. Am J Epidemiol 1984;120:791-3.
30. Savitz DA, Wachtel H, Barnes FA, et al. Case-control study of childhood cancer and exposure to 60-Hz magnetic fields. Am J Epidemiol 1988;128:21-38.
31. Robins J, Pike M. The validity of case-control studies with nonrandom selection of controls. Epidemiology 1990;1:273-84.
32. Miettinen OS, Cook EF. Confounding: essence and detection. Am J Epidemiol 1981;114:593-603.
33. Greenland S. The effect of misclassification in the presence of covariates. Am J Epidemiol 1980;112:564-9.
34. Cole P. The evolving case control study. J Chronic Dis 1979;32:15-27.
35. Cole P. Introduction. In: Breslow NE, Day NE, eds. Statistical methods in cancer research. Vol 1. The analysis of case-control studies. Lyon: International Agency for Research on Cancer, 1980:14-40. (IARC scientific publication no. 32).
36. Miettinen OS. Matching and design efficiency in retrospective studies. Am J Epidemiol 1970;91:111-18.
37. Day NE, Byar DP, Green SB. Overadjustment in case-control studies. Am J Epidemiol 1980;112:696-706.
38. Greenland S. Variance estimation for epidemiologic effect estimates under misclassification. Stat Med 1988;7:745-57.
39. Armstrong BG. The effects of measurement errors on relative risk regressions. Am J Epidemiol 1990;132:1176-84.
40. Armstrong BG, Whittemore AS, Howe GR. Analysis of case-control data with covariate measurement error: application to diet and colon cancer. Stat Med 1989;8:1151-65.
41. Elton RA, Duffy SW. Correcting for the effect of misclassification bias in a case-control study using data from two different questionnaires. Biometrics 1983;39:659-65.
42. Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol 1990;132:746-8.
43. Dosemeci M, Wacholder S, Lubin JH. The authors clarify and reply. (Letter). Am J Epidemiol 1991;134:441-2.
44. Wacholder S, Dosemeci M, Lubin JH. Blind assignment of exposure does not always prevent differential misclassification. Am J Epidemiol 1991;134:433-7.
45. Greenland S, Robins JM. Confounding and misclassification. Am J Epidemiol 1985;122:495-506.
46. Walker AM, Velema JP, Robins JM. Analysis of case-control data derived in part from proxy respondents. Am J Epidemiol 1988;127:905-14.
47. Poole C. Exposure opportunity in case-control studies. Am J Epidemiol 1986;123:352-8.
48. Axelson O. Aspects of confounding in occupational health epidemiology. Scand J Work Environ Health 1978;4:98-102.
49. Greenland S, Kleinbaum DG. Correcting for misclassification in two-way tables and matched pair studies. Int J Epidemiol 1983;12:93-7.
50. Espeland MA, Hui SL. A general approach to analyzing epidemiologic data that contain misclassification error. Biometrics 1987;43:1001-12.
51. Thompson WD, Kelsey JL, Walter SD. Cost and efficiency in the choice of matched and unmatched case-control study designs. Am J Epidemiol 1982;116:840-51.
52. Thomas DC, Greenland S. The relative efficiencies of matched and independent sample designs for case-control studies. J Chronic Dis 1983;36:685-97.
