Documentos de Académico
Documentos de Profesional
Documentos de Cultura
108 111 1 PB PDF
108 111 1 PB PDF
RESEARCH
Carmen Cadarso-Surez
Biostatistics Unit, Department of Statistics and Operations
Research, University of Santiago de Compostela, Spain
Wenceslao Gonzlez-Manteiga
Department of Statistics and Operations Research,
University of Santiago de Compostela, Spain
RESUMEN: La Bioestadstica es hoy en da una componente cientfica ABSTRACT: The discipline of biostatistics is nowadays a fundamen-
fundamental de la investigacin en Biomedicina, salud pblica y servi- tal scientific component of biomedical, public health and health ser-
cios de salud. Las reas tradicionales y emergentes de aplicacin inclu- vices research. Traditional and emerging areas of application include
yen ensayos clnicos, estudios observacionales, fisologa, imgenes, y clinical trials research, observational studies, physiology, imaging,
genmica. Este artculo repasa la situacin actual de la Bioestadstica, and genomics. The present article reviews the current situation of
considerando los mtodos estadsticos usados tradicionalmente en in- biostatistics, considering the statistical methods traditionally used
vestigacin biomdica, as como los recientes desarrollos de nuevos in biomedical research, as well as the ongoing development of new
mtodos, para dar respuesta a los nuevos problemas que surgen en methods in response to the new problems arising in medicine. Clear-
Medicina. Obviamente, la aplicacin fructfera de la estadstica en ly, the successful application of statistics in biomedical research
investigacin biomdica exige una formacin adecuada de los bioes- requires appropriate training of biostatisticians. This training should
tadsticos, formacin que debera tener en cuenta las reas emergentes aim to give due consideration to emerging new areas of statistics,
en estadstica, cubriendo al mismo tiempo los fundamentos de la teora while at the same time retaining full coverage of the fundamentals
estadstica y su metodologa. Es importante, adems, que los estudian- of statistical theory and methodology. In addition, it is important
tes de bioestadstica reciban formacin en otras disciplinas biomdicas that students of biostatistics receive formal training in relevant bio-
relevantes, como epidemiologa, ensayos clnicos, biologa molecular, medical disciplines, such as epidemiology, clinical trials, molecular
gentica y neurociencia. biology, genetics, and neuroscience.
PALABRAS CLAVE: Estadstica, Epidemiologa, Ensayos clnicos, KEY WORDS: Statistics, Epidemiology, Clinical Trials, Bionformatics,
Bioinformtica, Programas de formacin. training programs.
In fact, biostatistics has become a defined branch of sci- The important role that biostatistics and biostatisticians play
ence that uses an intricate combination of statistics, prob- in the field of medical research has always been widely rec-
ability, mathematics and computing to resolve problems in ognized by the biomedical community, and today statistics
the biomedical sciences. Because research questions in bi- applied to medicine can be considered a successful model
ology and medicine are diverse, biostatistics has expanded for the introduction of statistics into scientific practice. As
its domain to include any quantitative, not just statistical, an indication of its importance, the International Committee
model that may be used to answer these questions. As a of Medical Journal Editors (ICMJE, http://www.icmje.org) is
discipline designed to yield information, biostatistics may advised by biostatisticians of recognized prestige like DG
Altman, and includes in its Uniform Requirements for 2. WHY IS STATISTICS NECESSARY IN MEDICINE?
Manuscripts Submitted to Biomedical Journals a series of
N 725
recommendations for the correct application and explana- As we have noted, the growth of quantitative methods in
tion of statistical methods used. Similarly, the biomedical the biomedical sciences has made biostatistics a key com-
journals of highest prestige such as The Lancet, the British ponent in many research areas. Let us here consider why
STATISTICS IN BIOMEDICAL RESEARCH
Medical Journal, the Journal of the American Medical Asso- two major fields of medical research, namely epidemiology
ciation and the American Journal of Epidemiology include and clinical trials, depend on statistics as a fundamental
biostatisticians (notably specialists in epidemiology and tool in the achievement of their goals.
clinical trials) among their associate editors and reviewers.
In addition, journals of this type typically have a Theory and 2.1. Epidemiology
Methods section admitting articles about methodology, of-
ten statistical methodology. This situation is spreading to Epidemiology is the study of how often diseases occur,
other biomedical journals. and why, in different groups of people. Epidemiological
information is used to plan and evaluate strategies to
Biostatistics has developed enormously in recent years, due prevent illness, and serves as a guide for the management
to continuing advances in diverse biomedical areas and of patients in whom disease has already developed. Like
fields. For example, new problems in biomedical research the clinical findings and pathology, the epidemiology of
have led to the development of new statistical method- a disease is an integral part of its basic description. The
ologies that would not otherwise have arisen, and at the subject has its own special techniques for data collection
same time have favoured ingenious adaptations of clas- and interpretation.
sical statistical techniques to new contexts of application
(DeMets et al., 1994, and Zelen, 2006). The connection between biostatistics and epidemiology
has always been close. The early epidemiologists were
So biostatistics has over the years become a discipline in physicians interested in the way in which diseases oc-
itself, enriching not only medicine but statistics in general. cur in populations, their causes, and their relationships
We only need to look at the current issues of prestigious with different medical and non-medical factors. The
statistical journals, such as the Journal of the American Sta- problems tackled by these pioneers were not confined to
tistical Association, or the Journal of the Royal Statistical the study of epidemics, but extended to the evaluation
Society, to see the influence that biostatistics has on new of therapies. Many were skilled in quantitative reasoning
statistical methodology. In addition, specialized statistical and were knowledgeable about the statistical methods of
journals like Biostatistics, Biometrics, Biometrika, Statistics their day. Then, from the 1930s on, epidemiology began
in Medicine and Statistical Methods in Medical Research to turn its attention to the study of chronic diseases. It
are considered highly prestigious within statistics, and became impracticable to use the same prospective re-
occupy high positions in the Journal Citation Index rank- search strategies that had been so obviously appropriate
ing for the category Statistics and Probability. In recent in the study of infectious diseases. And here it was stat-
years, new journals have appeared in this ranking, such as isticians, primarily Cornfield and Mantel, who provided
the Biometrical Journal, the Journal of Agricultural Biologi- a rationale for approximately valid inference based on
cal and Environmental Statistics, Stochastic Environmental case-control data. Biostatisticians became more involved
Research and Risk Assessment (SERRA), and Environmet- in elaborating on the conditions for valid inference, with
rics, covering applications of statistics in environmental concerns about bias due to possible confounding factors.
sciences and public health. It seems likely that the number Furthermore, they began exploring other issues related
of journals in this ranking will increase in coming years. to epidemiologic research, such as models called dose-
Finally, it is worth noting the publication by Wiley of the response models for evaluating the effects of possible
8-volume Encyclopedia of Biostatistics (2005), whose edi- risk factors for disease. These effects are quantified by
tors are biostatisticians of high repute. This work extends measures of association such as the odds ratio or relative
and broadens the biostatistical content of Wileys highly risk, i.e. probabilistic concepts that need to be estimated
respected Encyclopedia of Statistical Sciences. appropriately, according to the type of study (case-con-
Several emerging and recurrent issues relating to the drug In 1997, Houwelingen likewise suggested that the fu-
development process merit particular mention. Emerging ture would be marked by new biomedical applications
issues include ongoing changes to the FDA (Food and Drug (in epidemiology, historical data on oncological patients
Administration, http://www.fda.gov) regulatory environ- and their families; in ecology, spatial data); by new phi-
ment, increasing internationalization of drug development, losophies (causal models instead of randomized clinical
advances in computer technology and visualization tools, trials; prediction versus prognostic modelling); new mod-
and efforts to incorporate meta-analytical methodology. els (graphical chain models, random effects models); new
Recurrent issues include the continuing development of computational facilities (with an impact on the other as-
statistical methods for handling subgroups in the design pects); new techniques (graphic techniques, exact meth-
Houwelingen are already a reality, and many of the new Since 1959, the Kaplan-Meier curve has been a well-
statistical techniques they argue for have already been known estimator of the survival function, and it is ex-
applied in studies published in prestigious biomedical jour- tensively used in epidemiological and clinical research.
nals. However, while these new methods are already being The classical proportional hazards model of Cox (1972)
used in biomedical research, not all are being widely used. is also widely used whenever the goal is to study how
One possible explanation for this situation may be the lack covariates affect survival. A generalization of the survival
of standard software implementing these new techniques, process arises when survival is the ultimate outcome but
so that most biomedical researchers view their practical intermediate states are identified. In this situation, a se-
application as difficult. It is certainly clear that new tech- quence of events is observed, leading to more than one
niques may take some while to find their way into medical observation per individual. Intermediate states might be
research, although there is some evidence to suggest that based on categorical time-dependent covariates such as
the speed of transfer has picked up in recent decades. transplantation, clinical symptoms (e.g. bleeding episodes),
or a complication in the course of the illness (e.g. me-
tastasis), or alternatively on biological markers (like CD4
T-lymphocyte levels). A classical approach for the analysis
4. SOME RECURRENT AND EMERGING of data of this type is the timedependent Cox regression
ISSUES IN BIOSTATISTICS model (TDCM). Advantages of Coxs regression model in-
clude its easy interpretability and its availability in the
Modern biostatistics presents a number of challenges in majority of statistical packages.
terms of both the continuing development of classical
techniques and the creation of new techniques to re- In the 1990s, so-called multi-state models (MSMs) became
solve new problems. In this section we focus first on the available: these offer a better understanding of the disease
methodology of survival analysis, as an example of the process, leading to a better knowledge of how the time-
advances being made in classical biostatistics. We then dependent covariate affects the evolution of the disease.
turn our attention to various emerging fields that merit These modern models have several advantages over Coxs
further research by biostatisticians: specifically bioinfor- regression model. They offer a better understanding of the
matics, spatial statistics, neural networks, and functional disease process, providing the hazard for movement out of
data analysis. one state into another (i.e. transition intensities), as well
as many other types of information, including the mean
4.1. New issues in survival analysis time of sojourn in each state, and survival rates for each
state. Covariates on transition intensities can also explain
In many biomedical fields, time-to-event data must fre- differences in the course of the illness among subjects in
quently be analysed. In oncology, for example, interest the population. Notably, MSMs can reveal how different
typically centres on the patients time of survival follow- covariates affect different transitions, something that is
ing a surgical intervention. The analysis of this type of not possible with other models like the TDCM. In fact, it is
survival experiment is complicated by issues of censoring very unlikely that the risk of death in patients who have re-
and truncation (Klein and Moeschberger, 2003). Censoring ceived different treatments will be the same. Furthermore,
occurs when we do not fully observe the patients survival, the prognostic factors associated with the risk of death
due to death unrelated to the cancer under study, or dis- may differ depending, e.g., on the treatment received.
appearance from the study for some reason. Truncation
basically occurs when some patients cant be observed A considerable literature is nowadays available on the
for some reasons related to the survival itself. A common analysis of MSMs: see for example the books of An-
(b) Ecological studies are concerned with associations environmental health issues, including the US Centers
between the observed incidence of disease and for Disease Control and Prevention (CDC), Environmental
potential risk factors, as measured on groups rather Protection Agency (EPA), and National Research Centre for
than individuals, the groups typically being defined Statistics and the Environment (NRCSE), as well as the Pan
by geographical area. Such studies are valuable in American Health Organization (PAHO), the World Health
investigating the aetiology of disease, and may help Organization (WHO), and various European Community
to identify future lines of research, and possibly pre- government agencies.
ventative measures.
(c) Disease clustering studies focus on identifying geo- 4.4. Neural networks in medicine
graphical areas with a significantly elevated risk of
disease, or on assessing the evidence of elevated risk Neural networks (NN) approaches in medicine have at-
around putative sources of hazard. Uses include the tracted many researchers, and these approaches have been
targeting of follow-up studies to ascertain reasons implemented in several biomedical applications (see Ohno-
for observed clustering in disease occurrence, or the Machado, 1996), including diagnostic systems, biochemi-
investigation of control measures where the aetiology cal analysis, image analysis, and drug development. Neural
of observed clustering has been established. networks, which simulate the function of human neuron
(d) Environmental assessment and monitoring is con- networks, have potentially useful implementations in
cerned with ascertaining the spatial distribution of many applications domains. Unlike human decision-mak-
environmental factors relevant to health, and expo- ers, NNs are of course unaffected by factors like fatigue,
sure to them, so as to establish necessary controls or working conditions and emotional state.
take preventative action.
NNs have been applied in various areas of medicine: they
Given the breadth and importance of the concerns in are widely used in diagnostic systems, for the detection
geographical epidemiology, it is not surprising that there of cancer and heart problems, and for the analysis of di-
has been considerable interest in this area in recent years. verse types of medical image (including tumour detection
Much of this interest has been in the development of rel- in ultrasonograms, classification of chest x-rays, tissue
evant statistical methods and techniques, and there is no and vessel classification in magnetic resonance images,
doubt that this particular vein of research has been, and estimation of skeletal age from x-ray images, and assess-
continues to be, a fruitful source of interesting statistical ment of brain maturation). NNs are used experimentally to
problems, motivating successful methodological develop- model the human cardiovascular system: diagnosis can be
ments within the discipline. Several issues of major sta- achieved by building a model of the cardiovascular system
tistical journals have been devoted to spatial statistical of an individual and comparing it with the real-time physi-
methods in health applications (e.g. the American Journal ological measurements taken from that patient. NNs are
of Epidemiology, the Journal of the Royal Statistical So- also used as tools in the development of drugs for treating
ciety, and Statistics in Medicine). There has also been a cancer and AIDS.
considerable volume of papers in the field published sepa-
rately, in key journals including the American Journal of Neural networks are increasingly being seen as an exten-
Epidemiology, Biometrics, Biometrika, Environmetrics, the sion to general statistical methodology, to be given full
Journal of the American Statistical Association, and Statis- consideration alongside classical and modern statistical
tics in Medicine. In addition, a number of significant recent methods. Reviews arguing this viewpoint include those of
texts have been devoted to this subject area (see Cressie, Ripley (1993, 1996) and Cheng and Titterington (1994),
1993; Lawson and Cressie, 2000; Lawson 2006). and certainly it is a viewpoint that is being widely ac-
trainees spending time in biomedical laboratories to gain cialist statistical training. This is turn will favour the
first-hand experience and insight into the nature of the more rapid transfer of new statistical methods into the
real problems faced by these researchers. A major goal mainstream of biomedical research.
must be to train students to become independent re-
searchers, advancing the field of statistical research and Also, since the goal is to train students not only to be
its application to biomedical research, both basic and clini- excellent members of research teams, but ultimately to
cal; and at the same time to be team workers intimately be independent biomedical researchers in their own right,
involved in the design and data analysis of collaborative students must gain some fundamental knowledge of a
biomedical projects. biological specialty area. This knowledge can be gained
through formal course work or hands-on laboratory expe-
Since many biomedical problems will likely require the rience, or indeed by a combination of the two. For example,
development of new statistical methods, students should courses for a student focusing on statistical genetics might
be capable of critically reading the theoretical and include statistical genetics, human genetics, and popula-
methodological literature in statistics, and of develop- tion genetics, as well as courses on areas of computer
ing new methods as appropriate for the problems that science like database structures, and bioinformatics.
they encounter. Students without such rigorous training
are unlikely to be able to develop new methods or fully Concern for the current situation of biostatistics in Spain
understand the strengths and limitations of those meth- led in 1999 to the organization of a Round Table on Sta-
ods developed by others. The traditional component of tistics in Medicine, as part of IVth Congress of the Galician
biomedical courses will probably focus on areas of math- Society for the Promotion of Statistics and Operations Re-
ematical statistics including probability theory, inference, search. Under the direction of Professor Guadalupe Gmez,
resampling methods (e.g. bootstrap), linear regression, this round table constituted an interesting reflection on
analysis of variance, generalized linear models, survival various questions of relevance to the profession of bi-
analysis (including multi-state models), nonparametric ostatistics, summarized in a valuable article in the journal
methods, and data analysis. In addition, new methodolo- Questio (Abraira et al., 2001). Specific issues discussed
gies like spatial statistics, neural networks, and smoothing included the need to promote postgraduate courses in bi-
regression methods (such as generalized additive models) ostatistics, aimed at training biostatisticians of the highest
are strongly recommended. rank within Spain. In line with this need, the Department
of Statistics and Operations Research of the University of
As we have noted, in order that biostatisticians can offer Santiago de Compostela launched in 2005 a new Master
new statistical methodologies useful for the biomedi- in Biostatistics (http://eio.usc.es/pub/master_bio), with the
cal research community (in other words, methods that aim of training students in modern statistical methods for
are not already incorporated into standard statistical biomedical research, and at the same time giving them a
software), they need to be competent at software crea- solid grounding in the major areas of biomedical research,
tion. It is thus of interest for training programmes to notably epidemiology, clinical trials, genetics and bioin-
include components dealing with relevant programming formatics.