7071

Validation of predictive regression models
Ewout W. Steyerberg, PhD

Clinical epidemiologist
Frank E. Harrell, PhD

Biostatistician
Personal background
Ewout Steyerberg: Erasmus MC, Rotterdam, the Netherlands
Frank Harrell: Health Evaluation Sciences, Univ of Virginia, Charlottesville, VA, USA
Validation of predictions from regression models is of paramount importance
Learning objectives: knowledge of

common types of regression models
fundamental assumptions of regression models
performance criteria of predictive models
principles of different types of validation
Performance objectives
To be able to explain why validation is necessary for predictive models
To be able to judge the adequacy of a validation procedure
Predictive models provide quantitative estimates of an outcome, e.g.

Quality of life one year after surgery
Death at 30 days after surgery

Long term survival
Predictive models are often based on regression analysis

y ~ a + sum(bi*xi)
y: outcome variable
a: intercept
bi: regression coefficient i
xi: predictor variable i

i in [1,many], usually 2 to 20
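The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `linear_predictor` and the numeric values are hypothetical and do not come from any fitted model.

```python
def linear_predictor(intercept, coefs, xs):
    """Return a + sum(b_i * x_i) for one subject."""
    if len(coefs) != len(xs):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * x for b, x in zip(coefs, xs))


# Hypothetical values: a = 1.0, b = (0.5, -2.0), x = (4.0, 1.0)
lp = linear_predictor(1.0, [0.5, -2.0], [4.0, 1.0])
print(lp)  # 1.0 + 0.5*4.0 - 2.0*1.0 = 1.0
```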
3 examples of regression
Quality of life one year after surgery:
continuous outcome, linear regression

Death at 30 days after surgery:
binary outcome, logistic regression

Long term survival:
time-to-outcome, Cox regression
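In standard notation, the three models relate the outcome to the same weighted sum of predictors in different ways. The symbols a, b_i and x_i follow the earlier slide; the baseline hazard h_0(t) is introduced here and is not in the original slides. The Cox model has no separate intercept because it is absorbed into h_0(t).

```latex
\begin{align*}
\text{Linear:}   \quad & E[y \mid x] = a + \sum_i b_i x_i \\
\text{Logistic:} \quad & \log\frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = a + \sum_i b_i x_i \\
\text{Cox:}      \quad & h(t \mid x) = h_0(t)\,\exp\Bigl(\sum_i b_i x_i\Bigr)
\end{align*}
```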
Predictive models make assumptions

Distribution
Linearity of continuous variables
Additivity of effects
Example: a simple logistic regression model

30-day mortality ~ a + b1*sex + b2*age
Assumptions:
Distribution of 30-day mortality is binomial

Age has a linear effect
The effects of sex and age can be added
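As a minimal sketch of how such a model turns sex and age into a predicted probability, the linear predictor is passed through the inverse logit. The coefficient values below are invented for illustration and are not estimates from the slides; the 0/1 coding of sex is also an assumption.

```python
import math


def predicted_risk(a, b_sex, b_age, sex, age):
    """Predicted probability of 30-day mortality from a logistic model.

    sex is assumed to be coded 0/1 and age is in years.
    """
    lp = a + b_sex * sex + b_age * age   # additive effects, linear in age
    return 1.0 / (1.0 + math.exp(-lp))   # inverse logit maps lp to (0, 1)


# Hypothetical coefficients, for illustration only
risk = predicted_risk(a=-6.0, b_sex=0.4, b_age=0.05, sex=1, age=70)
print(f"predicted 30-day mortality: {risk:.3f}")
```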
Assessing model assumptions

Examine model residuals
Perform specific tests

add nonlinear terms, e.g. age + age^2
add interaction terms, e.g. sex*age
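One way to picture these tests: extend each patient's predictors with the extra terms, refit the model, and check whether the added coefficients differ from zero. The sketch below shows only the first step; the helper name `expand_terms` is hypothetical.

```python
def expand_terms(sex, age):
    """Return the predictors [sex, age, age^2, sex*age] for one patient.

    age^2 probes the linearity assumption for age;
    sex*age probes the additivity assumption for sex and age.
    """
    return [sex, age, age ** 2, sex * age]


print(expand_terms(1, 60))  # [1, 60, 3600, 60]
```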
Model assumptions and predictions

Better predictions if assumptions are met
Some violation inherent in empirical data
Evaluate predictions in new data
Evaluation of predictions
Calibration
average of predictions correct?
low and high predictions correct?
Discrimination
distinguish low risk from high risk patients?
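Both criteria can be computed directly from observed outcomes and predicted probabilities. The following is a minimal sketch, not the authors' code: `c_statistic` computes the concordance over all pairs, which equals the area under the ROC curve for a binary outcome, and `mean_calibration` checks only whether the average prediction is correct. Function names and toy data are hypothetical.

```python
def c_statistic(y_true, y_prob):
    """Fraction of (event, non-event) pairs in which the event has the
    higher predicted probability; ties count as one half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


def mean_calibration(y_true, y_prob):
    """Observed event rate minus mean predicted probability (0 is ideal)."""
    n = len(y_true)
    return sum(y_true) / n - sum(y_prob) / n


# Toy data, invented for illustration
y = [0, 0, 1, 0, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
print(c_statistic(y, p), mean_calibration(y, p))
```

Checking whether low and high predictions are correct needs more than the mean, for example a plot of observed against predicted risk within groups of patients.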
Example: predicted probabilities

[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4). Area under ROC: 0.77. Calibration: OK.]
3 types of validation
Apparent: performance on sample used to develop model
Internal: performance on population underlying the sample
External: performance on related but slightly different population
Apparent validity
Easy to calculate
Results in optimistic performance estimates
Apparent estimates optimistic since same data used for:

Definition of model structure: e.g. selection and coding of variables
Estimation of model parameters: e.g. regression coefficients
Evaluation of model performance: e.g. calibration and discrimination
Internal validity
More difficult to calculate
Test model in new data, random from underlying population
Why internal validation?

Honest estimate of performance should be obtained, at least for a population similar to the development sample
Internally validated performance sets an upper limit to what may be expected in other settings (external validity)
External validity
Moderately easy to calculate when new data are available
Test model in new data, different from development population
Why external validation?

Various factors may differ from the development population, including
different selection of patients
different definitions of variables

different diagnostic or therapeutic procedures
Internal validation techniques

Split-sample: development / validation
Cross-validation: alternating development / validation
extreme: n-1 develop / 1 validate (jack-knife)
Bootstrap
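To make the alternating development / validation idea concrete, here is a small sketch that partitions n patients into k folds; each fold serves once for validation while the remaining patients are used for development. Setting k equal to n gives the extreme n-1 / 1 case. The function name and the seed parameter are hypothetical.

```python
import random


def cross_validation_folds(n, k, seed=0):
    """Return k (develop, validate) pairs of patient indices 0..n-1."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)       # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k roughly equal parts
    splits = []
    for i in range(k):
        validate = sorted(folds[i])
        develop = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        splits.append((develop, validate))
    return splits


for develop, validate in cross_validation_folds(n=10, k=5):
    print(len(develop), "develop /", len(validate), "validate")
```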
Bootstrap is the preferred internal validation technique

bootstrap sample for model development: n patients drawn with replacement
original sample for validation: n patients
difference: optimism
efficiency: development and validation on n patients
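The bootstrap procedure described above can be sketched generically: develop a model in each bootstrap sample, measure its performance in that sample and in the original sample, and average the difference. The sketch below assumes the caller supplies `fit` and `performance` functions (both hypothetical names). For brevity the toy usage swaps in a straight-line fit scored by R-squared rather than a logistic model scored by area under the ROC curve; the resampling logic is the same.

```python
import random


def bootstrap_validate(data, fit, performance, n_boot=200, seed=0):
    """Return (apparent, optimism, optimism-corrected) performance."""
    rng = random.Random(seed)
    n = len(data)
    apparent = performance(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Bootstrap sample: n patients drawn with replacement
        sample = [data[rng.randrange(n)] for _ in range(n)]
        model = fit(sample)
        # Performance in the bootstrap sample minus performance when the
        # same model is tested in the original sample
        total_optimism += performance(model, sample) - performance(model, data)
    optimism = total_optimism / n_boot
    return apparent, optimism, apparent - optimism


def fit_line(rows):
    """Least-squares intercept and slope for (x, y) rows."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def r_squared(model, rows):
    """1 - SSE/SST of the model's predictions on rows."""
    a, b = model
    my = sum(y for _, y in rows) / len(rows)
    sse = sum((y - (a + b * x)) ** 2 for x, y in rows)
    sst = sum((y - my) ** 2 for _, y in rows)
    return 1.0 - sse / sst if sst > 0 else 0.0


# Toy data, invented for illustration: y = 0.5 * x + noise
toy_rng = random.Random(1)
toy = []
for _ in range(40):
    x = toy_rng.uniform(0.0, 10.0)
    toy.append((x, 0.5 * x + toy_rng.gauss(0.0, 2.0)))

apparent, optimism, corrected = bootstrap_validate(toy, fit_line, r_squared)
print(f"apparent={apparent:.3f} optimism={optimism:.3f} corrected={corrected:.3f}")
```

In a real analysis every modelling step, including any variable selection, would need to be repeated inside `fit` so that the optimism estimate reflects the whole development process.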
Example: bootstrap results for logistic regression model

30-day mortality ~ a + b1*sex + b2*age
Apparent area under the ROC curve: 0.77
Mean area of 200 bootstrap samples: 0.772

Mean area of 200 tests in original: 0.762
Optimism in apparent performance: 0.01

Optimism-corrected area: 0.76
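The arithmetic behind these numbers, written out as a short check. The variable names are chosen here for illustration; the values are those reported on the slide.

```python
apparent_auc = 0.77          # model developed and tested in the original sample
mean_auc_bootstrap = 0.772   # bootstrap models tested in their own bootstrap samples
mean_auc_original = 0.762    # bootstrap models tested in the original sample

optimism = mean_auc_bootstrap - mean_auc_original
corrected_auc = apparent_auc - optimism

print(round(optimism, 3), round(corrected_auc, 2))  # 0.01 0.76
```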
External validation techniques

Temporal validation: same investigators, validate in recent years
Spatial validation (other place): same investigators, cross-validate in centers
Fully external: other investigators, other centers
Example: external validity of logistic regression model

30-day mortality ~ a + b1*sex + b2*age
Apparent area in 785 patients: 0.77
Tested in 20,318 other patients: 0.74

Tested by other investigators: ?
Example: external validation

[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4). Area under ROC: 0.74. Calibration: reasonable.]
Summary
Apparent validity gives an optimistic estimate of model performance
Internal validity may be estimated by bootstrapping
External validity should be determined in other populations
Key references
tutorial and book on multivariable models
(Harrell 1996: Stat Med 15:361-87; Harrell 2001: Regression Modeling Strategies, Springer)
empirical evaluations of strategies
(Steyerberg 2000: Stat Med 19:1059-79)
internal validation
(Steyerberg 2001: JCE 54:774-81)
external validation
(Justice 1999: Ann Intern Med 130:515-24; Altman 2000: Stat Med 19:453-73)
Links
Interactive text book on predictive modeling
http://www.neri.org/symptom/mockup/Chapter_8/
Harrell's Regression Modeling Strategies

http://hesweb1.med.virginia.edu/biostat/rms/
