
Gynaecological oncology

DOI: 10.1111/j.1471-0528.2012.03297.x
www.bjog.org

Triaging women with ovarian masses for surgery: observational diagnostic study to
compare RCOG guidelines with an International
Ovarian Tumour Analysis (IOTA) group protocol
B Van Calster,a,b D Timmerman,a,c L Valentin,d A McIndoe,e S Ghaem-Maghami,b
AC Testa,f I Vergote,c,g T Bourne,c,e
a Department of Development and Regeneration, KU Leuven University of Leuven, Leuven, Belgium
b Department of Cancer and Surgery, Imperial College, London, UK
c Department of Obstetrics and Gynaecology, University Hospitals KU Leuven, Leuven, Belgium
d Department of Obstetrics and Gynaecology, Skåne University Hospital Malmö, Lund University, Malmö, Sweden
e Queen Charlotte's and Chelsea Hospital, Imperial College NHS Trust, London, UK
f Department of Obstetrics and Gynaecology, Università Cattolica del Sacro Cuore, Rome, Italy
g Department of Oncology, KU Leuven University of Leuven, Leuven, Belgium
Correspondence: Dr B Van Calster, Department of Development and Regeneration, KU Leuven University of Leuven, Herestraat 49 box 7003,
B-3000 Leuven, Belgium. Email ben.vancalster@med.kuleuven.be

Accepted 21 January 2012. Published Online 6 March 2012.

Objective To compare guidelines from the Royal College of
Obstetricians and Gynaecologists (RCOG) based on the Risk of
Malignancy Index (RMI) with a protocol based on logistic
regression model LR2 developed by the International Ovarian
Tumour Analysis (IOTA) group for triaging women with an
ovarian mass as low, moderate, or high risk of malignancy.

Design and setting Observational diagnostic study conducted
between 2005 and 2007 at 21 oncology referral centres, referral
centres for ultrasonography and general hospitals.

Sample In all, 1938 women undergoing surgery for an ovarian
mass.

Methods RCOG guidelines use the RMI to triage women as low
(RMI < 25), moderate (25–250), or high (>250) risk. The
IOTA protocol uses LR2's estimated probability of malignancy
(<0.05 indicates low risk, ≥0.05 but <0.25 moderate risk, and
≥0.25 high risk).

Main outcome measure Percentages of benign, borderline and
invasive tumours classified as low, moderate or high risk.

Results The IOTA and RCOG protocols classified 71.1% and
62.1% of benign tumours as low risk, respectively (difference 9.0;
95% CI 6.2–11.9, P < 0.0001). Of invasive tumours, 88.6% and
73.6% were labelled high risk (difference 15.0; 95% CI 10.6–19.4,
P < 0.0001), and 3.0% and 5.2% were labelled low risk (difference
−2.2; 95% CI −4.6 to 0.2, P = 0.07) respectively by each protocol. Similar
results were found after stratification for menopausal status.

Conclusions The IOTA protocol was more accurate for triage than
the RCOG protocol. The IOTA protocol would avoid major
surgery for more women with benign tumours while still
appropriately referring more women with an invasive tumour to
a gynaecological oncologist.

Keywords Diagnostic tests, IOTA, ovarian neoplasms, Risk of
Malignancy Index, triage, ultrasonography.

Please cite this paper as: Van Calster B, Timmerman D, Valentin L, McIndoe A, Ghaem-Maghami S, Testa A, Vergote I, Bourne T. Triaging women with
ovarian masses for surgery: observational diagnostic study to compare RCOG guidelines with an International Ovarian Tumour Analysis (IOTA) group
protocol. BJOG 2012;119:662–671.

Introduction
Ovarian cancer remains a significant problem with 6500
cases and 4400 deaths in 2008 among UK women.1 In the
whole of Europe the figures are 66 700 and 41 900, respectively.2 A significant contributor to prognosis is the quality
of the primary surgery, as survival from the disease is
related to residual tumour mass after debulking.3 Less


radical treatment options require adequate surgical staging


to ensure that women are not under-treated. Consequently,
treatment in specialised centres dealing with large numbers
of women improves survival in women with ovarian cancer
and is more cost-effective.4–8 On the other hand, benign
ovarian lesions are more common than carcinoma, and in
this group misdiagnosis may lead to unnecessary levels of
intervention. Therefore, to optimise care, referring the right

2012 The Authors BJOG An International Journal of Obstetrics and Gynaecology 2012 RCOG


women to specialist centres is crucial. This hinges on an


accurate preoperative assessment of the likely pathology.
It has been suggested that decisions on how to manage
women with an adnexal mass are taken on the basis of the
Risk of Malignancy Index (RMI).9–13 For example, guidelines from the Royal College of Obstetricians and Gynaecologists (RCOG) in the UK suggest using the RMI to
categorise women with an adnexal mass into three groups:
low, moderate and high risk of ovarian cancer.14 For
tumours classified as low risk, the proposed management is
expectant management or laparoscopic surgery by a generalist in a gynaecology unit. If at moderate risk, laparoscopic
surgery in a cancer unit by a surgeon with a special interest
is suggested. If at high risk, referral of the woman to a cancer centre for a full staging procedure by a subspecialist
gynaecological oncologist is advised.
Recently, the International Ovarian Tumour Analysis
(IOTA) group collected a large database of women with an
adnexal mass and developed logistic regression models to
calculate the risk of malignancy in adnexal masses using
clinical information and features derived from ultrasonography.15 Two logistic regression models (LR1 containing
twelve variables and LR2 containing six variables) were
developed and validated internally (IOTA phase 1,
n = 1066). These models then underwent temporal validation (IOTA phase 1b, n = 507) and external validation
(IOTA phase 2, n = 1938) with excellent performance,16,17
also in comparison with RMI and other models from the
literature as evaluated using the area under the receiver
operating characteristics curve.18 On external validation in
12 centres LR2 and RMI had areas under the curve of 0.95
and 0.91, respectively, and the advantage of LR2 over RMI
in terms of area under the curve was larger for premenopausal women than for postmenopausal women.18
The aim of this study is to compare the RCOG protocol
with a protocol based on the IOTA logistic regression
model LR2 with respect to the classification of adnexal
masses as being at low, moderate or high risk of malignancy. The key objective is to investigate whether the IOTA
protocol improves the selection of women referred for
potentially radical surgery by labelling more benign
tumours as low risk, more invasive tumours as high risk,
and classifying fewer tumours as moderate risk. A secondary objective is the comparison of the two protocols for
specific subgroups of women. The most important subgroups are premenopausal and postmenopausal women.

Methods
Design and setting
This is a multicentre observational diagnostic study to evaluate two triaging protocols on women presenting with an
adnexal (ovarian, para-ovarian, or tubal) mass who later

underwent surgery. We used the data from phase 2 of the


IOTA study, which have also been used in previous publications.17–20 Patients were recruited between November
2005 and October 2007 in 11 oncology referral centres,
three referral centres for ultrasonography, and five general
hospitals in eight countries (Belgium, Sweden, Italy, UK,
Czech Republic, Poland, China and Canada). The centres
are listed in Table 1. The principal investigators were
gynaecologists or radiologists specialised in gynaecological
ultrasonography and with a special interest in adnexal
masses. The research protocols were ratified by the local
Ethics Committee at each recruitment centre.

Participants
Women presenting to one of the recruitment centres with
at least one persistent adnexal mass that was selected for
surgical intervention by the managing clinicians were eligible for inclusion conditional on oral informed consent
before the ultrasound scan and surgery. In the event of
multiple masses, the mass with the most complex ultrasound morphology was used to collect information on
tumour characteristics for statistical analysis. When masses
with similar morphology were observed we included the
larger of the two masses or the one most easily visible by
ultrasonography. Exclusion criteria were pregnancy, refusal
of transvaginal ultrasonography, and surgical removal of
the mass more than 120 days after the ultrasound examination.

Data collection
A dedicated, secure data collection system was developed
for the study (IOTA 2 study screen; astraia GmbH,
Munich, Germany). An automatically generated unique
identifier was made for each woman's record. Clinicians
could only view or update the records from their own centre. Data security was ensured by encrypting all data communication. Client-side checks in the astraia system and
manual checks by one biostatistician and two experienced
ultrasound examiners were used to ensure data integrity
and completeness.
Immediately before the ultrasound examination a standardised history was taken, including the patient's age and
menopausal status, information on personal history of
ovarian and breast cancer, number of first-degree relatives
with ovarian or breast cancer, current hormonal therapy,
and previous gynaecological surgery. Women aged 50 years
or more who had undergone hysterectomy were defined as
postmenopausal. A standardised approach was used to
carry out transvaginal ultrasonography in all the women.
The examination technique and the ultrasound terms and
definitions used to describe the ultrasound findings have
been described elsewhere.21 In the event that a large mass
could not be seen in its entirety using a transvaginal probe,

transabdominal ultrasonography was used. All centres used
high-quality ultrasound equipment with sensitive colour
Doppler functions. Information on more than 40 grey scale
and Doppler ultrasound variables was collected to characterise each adnexal mass.
The participating centres were encouraged to measure
the level of serum CA125 in peripheral blood from all
women in the study, but the availability of CA125 results
was not a requirement for inclusion in the IOTA study.
The decision to measure CA125 or not was a reflection of
routine clinical practice in the different centres: some centres measured serum CA125 levels in every woman whereas
others did not. Second-generation immunoradiometric
assay kits for CA-125 II (ref. 22) from the following companies
were used: Roche Diagnostics, Basel, Switzerland; Centocor,
Malvern, PA, USA; Cis-Bio, Gif-sur-Yvette, France; Abbott
Laboratories Diagnostic Division, Abbott Park, IL, USA;
Bayer Diagnostics, Tarrytown, NY, USA; bioMérieux,
Marcy l'Etoile, France. All kits used the OC 125 antibody.
Serum CA125 levels are expressed in units per millilitre
(U/ml).

Table 1. Pathological tumour diagnosis stratified by recruitment centre

Recruitment centre                                                  Benign       Borderline   Invasive    Total
                                                                    n (%)        n (%)        n (%)       n
Oncology referral centres
  University Hospitals K.U.Leuven, Belgium                          155 (62)     24           73          252
  Ospedale San Gerardo, Università di Milano-Bicocca, Monza, Italy  199 (79)     17           35          251
  Medical University Lublin, Poland                                 101 (66)     3            50          154
  Università Cattolica del Sacro Cuore, Rome, Italy                 54 (44)      11           57          122
  Istituto Europeo di Oncologia, Milan, Italy                       41 (44)      10           43          94
  General Faculty Hospital, Charles University, Prague, Czech Rep.  39 (43)      15           36          90
  Chinese PLA General Hospital, Beijing, China                      57 (78)      1            15          73
  King's College Hospital, London, UK                               40 (62)      9            16          65
  Skåne University Hospital Lund, Lund University, Sweden           31 (82)      1            6           38
  Università degli Studi di Udine, Italy                            10 (59)      0            7           17
  Istituto Nazionale dei Tumori, Fondazione Pascale, Naples, Italy  4 (44)       0            5           9
Referral centres for ultrasonography
  University of Bologna, Italy                                      124 (92)     3            8           135
  DCS Sacco University of Milan, Italy                              45 (90)      0            5           50
  Università degli Studi di Napoli, Naples, Italy                   51 (80)      2            11          64
General hospitals
  Ziekenhuis Oost-Limburg, Genk, Belgium                            173 (87)     5            22          200
  Ospedale San Giovanni di Dio, Cagliari, Italy                     134 (87)     3            17          154
  Skåne University Hospital Malmö, Lund University, Sweden          110 (80)     6            21          137
  Macedonia Melloni Hospital, University of Milan, Italy            17 (81)      1            3           21
  St. Joseph's Hospital, McMaster University, Hamilton, Canada      11 (92)      0            1           12
Summary
  Oncology referral centres                                         731 (63)     91 (8)       343 (29)    1165
  Referral centres for ultrasonography                              220 (88)     5 (2)        24 (10)     249
  General hospitals                                                 445 (85)     15 (3)       64 (12)     524
  Total                                                             1396 (72)    111 (6)      431 (22)    1938

Reference standard: pathological tumour outcome


In this study we focus on the pathological categorisation of
removed tissues as benign, borderline or invasive. Surgery
was performed by laparoscopy or laparotomy according to
the surgeon's judgment, and the subsequent tissue examination was performed at the local centre by a dedicated
gynaecological pathologist. In case of a borderline or invasive tumour, surgical stage was recorded according to the
criteria recommended by the International Federation of
Gynaecology and Obstetrics.23 We evaluate borderline and
invasive tumours separately. In many studies, including
those in which RMI and LR2 were developed,9,15 borderline
and invasive tumours are combined into a single group of
malignant tumours for statistical analysis because borderline tumours require similar preoperative, perioperative
and postoperative precautions as invasive tumours.

The triaging protocols


The RCOG guideline number 34 (ref. 14) suggests management of
adnexal masses using the RMI.9,24 The RMI contains one
clinical variable, five ultrasound variables and one biochemistry variable. It is obtained as the product of the
menopausal status score M, the ultrasound score U, and
the serum CA125 level in U/ml: RMI = M × U × CA125. M has value
3 if the woman is postmenopausal and 1 if the woman is
premenopausal. U has score 0, 1 or 3 depending on
whether none, one or more than one of the following five
ultrasound characteristics are present: multilocular cyst,
evidence of solid components, presence of ascites, bilateral
masses and presence of metastases. If the RMI is below 25
the guideline recommends considering the woman as at
low risk of malignancy, if the RMI is between 25 and 250
the guideline assumes moderate risk, and if the RMI is
above 250 high risk is assumed. According to Davies
et al.,24 the likelihood of having a malignant tumour in the
low, moderate and high risk groups is 2%, 21%, and 75%,
respectively.
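The RMI calculation and the RCOG cut-offs described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names (`ultrasound_score`, `rmi`, `rcog_triage`) and the choice to pass the number of ultrasound features as an integer are assumptions made for the example, not part of the guideline or of the study software.

```python
def ultrasound_score(n_features: int) -> int:
    """Ultrasound score U: 0, 1 or 3 for none, one, or more than one of the
    five characteristics (multilocular cyst, solid components, ascites,
    bilateral masses, metastases)."""
    if not 0 <= n_features <= 5:
        raise ValueError("n_features must be between 0 and 5")
    if n_features == 0:
        return 0
    return 1 if n_features == 1 else 3


def rmi(postmenopausal: bool, n_features: int, ca125_u_per_ml: float) -> float:
    """RMI = M x U x CA125, with M = 3 if postmenopausal and 1 otherwise."""
    m = 3 if postmenopausal else 1
    return m * ultrasound_score(n_features) * ca125_u_per_ml


def rcog_triage(rmi_value: float) -> str:
    """Low below 25, moderate from 25 to 250, high above 250."""
    if rmi_value < 25:
        return "low"
    if rmi_value <= 250:
        return "moderate"
    return "high"


# Hypothetical patient: postmenopausal, two ultrasound features, CA125 of 40 U/ml.
example = rmi(postmenopausal=True, n_features=2, ca125_u_per_ml=40)
print(example, rcog_triage(example))
```

Because U can be 0, a mass with none of the five ultrasound features has an RMI of 0 whatever the CA125 level, and is therefore always triaged as low risk under this rule.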
The IOTA protocol is based on the logistic regression
model LR2.15 The data that were used to develop LR2 were
collected between 1999 and 2002 in nine centres (IOTA
phase 1), seven of which also participated in IOTA phase
2. LR2 contains one clinical variable (a), four grey scale
ultrasound variables (b, c, d, e) and one Doppler ultrasound variable (f): (a) patient age (years), (b) maximal
diameter of the solid components (mm), (c) presence of
ascites (yes = 1, no = 0), (d) irregular internal cyst walls
(yes = 1, no = 0), (e) presence of acoustic shadows
(yes = 1, no = 0) and (f) the presence of papillary structures with detectable flow (yes = 1, no = 0). The risk of
malignancy is derived as $1/(1 + e^{-z})$, with $z = -5.3718 +
0.0354\,(a) + 0.0697\,(b) + 1.6159\,(c) + 0.9586\,(d) -
2.9486\,(e) + 1.1768\,(f)$. A priori, we determined the risk
thresholds for triaging women as low, moderate or high
risk to be 0.05 and 0.25, i.e. 5% (1 in 20) and 25% (1 in
4) estimated risk of malignancy according to LR2: a risk
below 0.05 corresponds to low risk; a risk of at least 0.05
but less than 0.25 corresponds to moderate risk; a risk of
0.25 or higher corresponds to high risk.
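As a worked illustration of the LR2 formula and the two risk thresholds, the following Python sketch evaluates the model for a single patient. The coefficients are those given in the text; the function and parameter names are hypothetical and chosen only for readability, and the example patients are invented.

```python
import math


def lr2_risk(age_years: float, max_solid_mm: float, ascites: int,
             irregular_walls: int, acoustic_shadows: int,
             papillary_flow: int) -> float:
    """Estimated probability of malignancy according to LR2.

    Binary variables are coded 1 = yes, 0 = no, as in the text.
    """
    z = (-5.3718
         + 0.0354 * age_years          # (a) patient age
         + 0.0697 * max_solid_mm       # (b) maximal diameter of solid components
         + 1.6159 * ascites            # (c) presence of ascites
         + 0.9586 * irregular_walls    # (d) irregular internal cyst walls
         - 2.9486 * acoustic_shadows   # (e) presence of acoustic shadows
         + 1.1768 * papillary_flow)    # (f) papillary structures with flow
    return 1.0 / (1.0 + math.exp(-z))


def iota_triage(risk: float) -> str:
    """Low below 0.05, moderate from 0.05 to below 0.25, high at 0.25 or above."""
    if risk < 0.05:
        return "low"
    if risk < 0.25:
        return "moderate"
    return "high"


# Invented example: 60-year-old woman, 30 mm solid component, ascites present.
risk = lr2_risk(60, 30, 1, 0, 0, 0)
print(f"{risk:.3f}", iota_triage(risk))
```

Acoustic shadows are the only variable with a negative coefficient, so their presence lowers the estimated risk, whereas every other variable raises it.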

Statistical analysis
All statistical analyses were performed using sas version 9.2
(SAS Institute, Cary, NC, USA).
Patients were cross-tabulated based on the true outcome
(benign, borderline, invasive) and the assigned risk (low,
moderate, high), and the percentages of women classified
in each risk group were calculated for each tumour type to
assess the impact of the two protocols. Another index of a
protocol's impact is the percentage of all women classified
as moderate risk and therefore not receiving a clear diagnosis. Differences in paired proportions were analysed using
95% confidence intervals obtained by the standard asymptotic method and the McNemar test. We compared the
protocols by assessing whether the IOTA protocol reclassified women to a more appropriate risk group than the
RCOG protocol using a measure called the Net Reclassification Improvement.25 For benign and invasive tumours the
measure computes the proportion of tumours that are
reclassified to a more appropriate risk group minus the
proportion of tumours reclassified to less appropriate risk
groups to give the net percentage of tumours with
improved reclassification.
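The net percentage described above can be sketched as follows. This is a minimal illustration, assuming that any move to a lower risk group counts as more appropriate for a benign tumour and any move to a higher group counts as more appropriate for an invasive tumour; the function name and the toy data are invented for the example and are not taken from the study.

```python
RISK_ORDER = {"low": 0, "moderate": 1, "high": 2}


def net_reclassification(old, new, higher_is_better: bool) -> float:
    """Proportion reclassified to a more appropriate group minus the
    proportion reclassified to a less appropriate group.

    old, new: risk groups assigned to the same tumours by two protocols.
    higher_is_better: True for invasive tumours, False for benign tumours.
    """
    if len(old) != len(new) or not old:
        raise ValueError("old and new must be non-empty and of equal length")
    up = sum(RISK_ORDER[n] > RISK_ORDER[o] for o, n in zip(old, new))
    down = sum(RISK_ORDER[n] < RISK_ORDER[o] for o, n in zip(old, new))
    better, worse = (up, down) if higher_is_better else (down, up)
    return (better - worse) / len(old)


# Toy data for four invasive tumours (invented, not study data).
old_groups = ["moderate", "moderate", "high", "low"]
new_groups = ["high", "high", "moderate", "low"]
print(net_reclassification(old_groups, new_groups, higher_is_better=True))
```

In this toy example two tumours move up and one moves down, giving a net improvement of one tumour out of four for invasive tumours; the same moves would count as a net deterioration if the tumours were benign.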
The serum CA125 level was missing in 22% of the
women, and was more often missing in women with a
benign tumour (28%) than in women with a borderline or
invasive tumour (10%). The missing values are highly likely
to have occurred for two reasons. First, different management practices caused some centres to be more committed
than others to measure CA125. Second, investigators sometimes did not measure CA125 given the overall clinical
picture and the ultrasound appearance of the tumour.
Rather than discarding cases with a missing value, which
would probably result in a biased patient sample, we
handled the missing values using multiple imputation.26,27
In this approach, the missing CA125 values are estimated
(i.e. imputed) multiple times to acknowledge the uncertainty in the imputed values. To estimate the missing values, we used predictive mean matching regression28 using
variables that were related to either the level of CA125 or
the unavailability of CA125 (this is a binary indicator with
value 1 if CA125 is missing and 0 otherwise). We generated
100 imputations of the missing values such that 100 completed datasets were obtained. The RCOG protocol, which
relies on the CA125 level, was analysed on each of the 100
datasets generated by imputation, and the results were
averaged. For cross-tabulations of women with respect to
true outcome and risk group assignment, results were
rounded to integer values while respecting the (fixed) number of women in each true outcome category. P-values for
the McNemar test for each completed dataset were combined using the method from Li et al.29
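The averaging and rounding step for the cross-tabulations can be illustrated with a small sketch. The text does not say which rounding rule was used, so the largest-remainder rule below is an assumption, as are the function names and the three toy imputed tables; it is only one simple way to round averaged counts so that they still add up to the fixed number of women in an outcome category.

```python
import math


def average_counts(tables):
    """Element-wise mean of per-imputation count vectors (low, moderate, high)."""
    n = len(tables)
    return [sum(column) / n for column in zip(*tables)]


def round_preserving_total(values, total: int):
    """Round to integers that sum to `total` (largest-remainder rule).

    Assumed rule: take the floor of every value, then add 1 to the values
    with the largest fractional parts until the total is reached.
    """
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)
    if not 0 <= shortfall <= len(values):
        raise ValueError("values are inconsistent with the requested total")
    by_remainder = sorted(range(len(values)),
                          key=lambda i: values[i] - floors[i],
                          reverse=True)
    for i in by_remainder[:shortfall]:
        floors[i] += 1
    return floors


# Toy example: benign tumours (n = 1396) in three invented completed datasets.
imputed = [[866, 455, 75], [867, 454, 75], [867, 454, 75]]
print(round_preserving_total(average_counts(imputed), total=1396))
```

The toy tables were chosen so that the rounded result coincides with the benign row reported for the RCOG protocol; they are not the study's imputed datasets.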
One of the variables of the RMI is the presence of metastases. This was not a mandatory variable in the IOTA
study, and therefore information on metastases was sometimes missing (23%). To address this, we analysed the ability of the RMI to discriminate between benign and
malignant tumours when 1) the variable metastases was
not used, 2) the variable was imputed by 0 in case of missing information, and 3) multiple imputation was used. The
resulting receiver operating characteristics curves were
virtually identical for all three approaches. We decided to
impute the variable by 0 in case of missing information,
because it is likely that in most cases where information on
metastases was missing there were no ultrasound signs of
metastases.
In addition to reporting the results for the whole study
population we report results for specific subgroups. First,
the classification performance of the protocols is examined
for invasive tumours of varying histology. Next, results for
all tumours are examined for the following subgroups: premenopausal women, postmenopausal women, women with
available CA125 information, women with information on
both the CA125 level and presence of metastases, women
from oncology referral centres, women from referral centres
for ultrasonography, and women from general hospitals.

Figure 1. Flow diagram of the RCOG and IOTA protocols as triage systems for women with adnexal masses.
  Eligible patients (n = 1970)
  Patients excluded (n = 32): no surgical removal of the mass within 120 days (n = 15);
    patient pregnant at time of examination (n = 12); errors in data entry (n = 4);
    protocol violation (n = 1)
  Included patients (n = 1938)
  Imputation of missing values: serum CA-125 level (n = 434); presence of metastases (n = 443)
  Index tests: RCOG protocol (n = 1938); IOTA protocol (n = 1938)
  Reference standard by index test result:
    Triage result   Protocol   All (n)   Benign (n)   Borderline (n)   Invasive (n)
    Low risk        RCOG       906       867          17               22
    Low risk        IOTA       1024      993          18               13
    Moderate risk   RCOG       609       454          63               92
    Moderate risk   IOTA       345       280          29               36
    High risk       RCOG       423       75           31               317
    High risk       IOTA       569       123          64               382
Results
There were 1970 women enrolled in the study (see Figure 1
for a flow diagram). Thirty-two women were excluded for
the following reasons: 15 did not undergo surgical removal
of the mass within 120 days after the ultrasound examination, 12 were pregnant at the time of the examination, four
were excluded because of errors in data entry, and one was
excluded because of a protocol violation. The analysis dataset contained the remaining 1938 women, of whom 1396
had benign (72%), 111 borderline (6%), and 431 invasive
(22%) tumours. Tables 1–3 describe the pathological and
demographic details of the women. Table 1 presents the
distribution of the outcome (benign, borderline, invasive)
for the total sample and for different centres. Oncology
referral centres had more borderline (8%) and invasive
(29%) tumours than general hospitals or referral centres
for ultrasonography (2–3% borderline tumours and 10–12%
invasive tumours). Table 2 summarises the specific histology
of the masses in the dataset. The most common benign
tumours were endometriomas, serous cystadenomas and

teratomas. Borderline tumours were almost all stage I
tumours, whereas primary invasive tumours were most often
stage III. Table 3 presents the demographic characteristics,
CA125 level, RMI, and risk of malignancy estimated by
logistic regression model LR2.

Table 2. Specific pathological tumour diagnoses

Pathological diagnosis                  n (%)
Benign                                  1396 (72.0)
  Endometrioma                          400 (20.6)
  Serous cystadenoma                    236 (12.2)
  Teratoma                              226 (11.7)
  Mucinous cystadenoma                  138 (7.1)
  Simple cyst or parasalpingeal cyst    131 (6.8)
  Fibroma                               86 (4.4)
  Functional cyst                       77 (4.0)
  Hydrosalpinx or salpingitis           49 (2.5)
  Abscess                               24 (1.2)
  Rare benign                           18 (0.9)
  Peritoneal pseudocyst                 11 (0.6)
Borderline                              111 (5.7)
  Stage I                               99 (5.1)
  Stage II                              3 (0.2)
  Stage III                             8 (0.4)
  Stage IV                              1 (0.1)
Invasive                                431 (22.2)
  Primary invasive Stage I              70 (3.6)
  Primary invasive Stage II             30 (1.6)
  Primary invasive Stage III            202 (10.4)
  Primary invasive Stage IV             30 (1.6)
  Rare primary invasive                 41 (2.1)
  Metastatic invasive                   58 (3.0)

Table 3. Demographic characteristics, CA125 level, RMI, and the risk of malignancy estimated by logistic regression model LR2 for the women in
the study stratified by tumour type

Variable                           Statistics     Benign            Borderline         Invasive
                                                  (n = 1396)        (n = 111)          (n = 431)
Age in years                       Median (IQR)   41 (31–52)        48 (35–64)         57 (50–66)
Postmenopausal                     n (%)          382 (27)          49 (44)            311 (72)
Nulliparous                        n (%)          651 (47)          44 (40)            98 (23)
CA125 in U/ml                      Median (IQR)   18 (11–44)        33 (18–91)         218 (58–772)
Risk of Malignancy Index (RMI)     Median (IQR)   12 (0–47)         95 (41–312)        1232 (214–5423)
LR2 estimated risk of malignancy   Median (IQR)   2.3% (1.3–5.8)    30.7% (8.4–74.4)   69.8% (48.8–87.0)

IQR, interquartile range.
Table 4 presents the cross-tabulation of the outcome
(benign, borderline, invasive) and the risk group assignments for the RCOG and IOTA triage protocols. The IOTA
protocol classified fewer women as moderate risk: 17.8%
versus 31.4% for the RCOG protocol (difference −13.6%,
95% CI −16.1 to −11.1; P < 0.0001). The Net Reclassification Improvement measure demonstrated improved reclassification of benign and invasive tumours by the IOTA
protocol relative to the RCOG protocol: the net percentages of tumours with improved reclassification were 5.8%
(95% CI 2.6–9.0; P = 0.0004) and 15.5% (95% CI 10.6–20.4;
P < 0.0001), respectively. Benign tumours were more
often classified as low risk by the IOTA protocol (71.1%
versus 62.1%; difference 9.0%, 95% CI 6.2–11.9; P <
0.0001). Invasive tumours were more often classified as
high risk by the IOTA protocol (88.6% versus 73.6%; difference 15.0%, 95% CI 10.6–19.4; P < 0.0001), and less
often as low risk (3.0% versus 5.2%; difference −2.2%,
95% CI −4.6 to 0.2; P = 0.07). On the other hand, the
IOTA protocol classified more benign tumours as high risk
(8.8% versus 5.3%; difference 3.5%, 95% CI 1.7–5.2;
P = 0.0001). With respect to borderline tumours, the difference between the protocols was that the IOTA protocol
classified more tumours as high risk whereas the RCOG
protocol classified more tumours as moderate risk. Of the
tumours classified as low, moderate and high risk by the
RCOG protocol, 2.5%, 15.0% and 75.0% were invasive
tumours, respectively. The corresponding figures for the
IOTA protocol were 1.3%, 10.4% and 67.1%. We can also
summarise Table 4 in classical terms of sensitivity and
specificity for malignancy. Using an RMI cut-off of 25 and
an LR2 cut-off of 0.05 to predict malignancy (i.e. low risk
versus moderate/high risk), the RCOG protocol achieved
a sensitivity of 92.8% and a specificity of 62.1% whereas
the IOTA protocol achieved 94.3% sensitivity and 71.1%
specificity.

Table 4. Triage results for women when using the RCOG protocol
or the IOTA protocol depending on tumour type

Triage result      All women      Benign        Borderline    Invasive
                   n (%)          n (%)         n (%)         n (%)
RCOG protocol
  Low risk         906 (46.7)     867 (62.1)    17 (14.9)     22 (5.2)
  Moderate risk    609 (31.4)     454 (32.6)    63 (57.0)     92 (21.2)
  High risk        423 (21.8)     75 (5.3)      31 (28.1)     317 (73.6)
IOTA protocol
  Low risk         1024 (52.8)    993 (71.1)    18 (16.2)     13 (3.0)
  Moderate risk    345 (17.8)     280 (20.1)    29 (26.1)     36 (8.4)
  High risk        569 (29.4)     123 (8.8)     64 (57.7)     382 (88.6)
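The sensitivity and specificity quoted above can be reproduced from the counts in Table 4, as the short Python sketch below shows. It assumes that "malignancy" groups borderline and invasive tumours together, which is the grouping under which the counts give the reported percentages; the dictionary layout and function name are choices made for this example.

```python
# Counts from Table 4, ordered as (low, moderate, high) risk.
TABLE4 = {
    "RCOG": {"benign": (867, 454, 75),
             "borderline": (17, 63, 31),
             "invasive": (22, 92, 317)},
    "IOTA": {"benign": (993, 280, 123),
             "borderline": (18, 29, 64),
             "invasive": (13, 36, 382)},
}


def sensitivity_specificity(counts):
    """Low risk versus moderate/high risk as the test for malignancy.

    Assumption: borderline and invasive tumours together count as malignant.
    """
    benign = counts["benign"]
    malignant_total = sum(counts["borderline"]) + sum(counts["invasive"])
    malignant_low = counts["borderline"][0] + counts["invasive"][0]
    sensitivity = (malignant_total - malignant_low) / malignant_total
    specificity = benign[0] / sum(benign)
    return sensitivity, specificity


for protocol, counts in TABLE4.items():
    sens, spec = sensitivity_specificity(counts)
    print(f"{protocol}: sensitivity {100 * sens:.1f}%, specificity {100 * spec:.1f}%")
```

For the RCOG protocol this gives 503 of 542 malignant tumours above the cut-off and 867 of 1396 benign tumours below it; for the IOTA protocol the corresponding counts are 511 of 542 and 993 of 1396.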
The RCOG and IOTA triage protocols disagreed in risk
assignment for 662 women (34.2% of the total sample). In
this group, the IOTA protocol classified 53% of benign
tumours as low risk and 74% of invasive tumours as high
risk, compared with 27% and 16% for the RCOG protocol.
Table 5 presents the results for women with an invasive
tumour after stratification for tumour histology. For all
histological types of primary invasive ovarian cancer except
epithelial not otherwise specified tumours the IOTA protocol classified more invasive tumours as high risk than the
RCOG protocol. Irrespective of the specific histology, the
IOTA protocol classified between 70% and 100% of the
invasive tumours as high risk, whereas the RCOG protocol
classified around 50% of the mucinous, non-epithelial primary invasive, and metastatic invasive tumours as high
risk. The higher number of invasive tumours classified as
low risk by the RCOG protocol (Table 4) is explained
mainly by the misclassification of non-epithelial primary
invasive and metastatic invasive tumours.
A summary of results for all women and for the
subgroups is presented in Table 6. The difference in
performance between the two protocols was similar for all
subgroups, including the subgroup of women with available
CA125 level (n = 1504, 77.6% of all tumours) and the subgroup of women with available information on the CA125
level and on the presence of metastases (n = 1147, 59.2%
of all tumours). The only exception was that in general
hospitals the number of invasive tumours classified as low
risk was lower for the RCOG protocol (two out of 64) than
for the IOTA protocol (three out of 64), but this difference
was based on only one woman.

Table 5. Triage results for women with an invasive tumour when using the RCOG protocol or IOTA protocol depending on tumour histology

                   Primary invasive histologies                                                             Metastatic
Triage result      Serous       Mucinous     Endometrioid   Clear cell   Epithelial NOS   Non-epithelial   invasive
                   n (%)        n (%)        n (%)          n (%)        n (%)            n (%)            n (%)
RCOG protocol
  Low risk         4 (1.9)      1 (3.6)      0 (0)          0 (0)        2 (14.3)         9 (20.0)         6 (10.5)
  Moderate risk    22 (10.3)    12 (43.4)    12 (27.2)      7 (27.2)     2 (14.3)         17 (36.7)        20 (33.4)
  High risk        190 (87.8)   15 (53.1)    33 (72.8)      17 (72.8)    10 (71.4)        20 (43.3)        32 (56.2)
IOTA protocol
  Low risk         3 (1.4)      1 (3.6)      0 (0)          0 (0)        2 (14.3)         4 (8.7)          3 (5.2)
  Moderate risk    11 (5.1)     4 (14.3)     3 (6.7)        0 (0)        2 (14.3)         9 (19.6)         7 (12.1)
  High risk        202 (93.5)   23 (82.1)    42 (93.3)      24 (100)     10 (71.4)        33 (71.7)        48 (82.8)
Total              216          28           45             24           14               46               58

NOS, not otherwise specified.

Table 6. Classification of tumours by the RCOG and IOTA protocols in all women and in various subgroups

                 Percentage of      Percentage of     Percentage of     Percentage of      Percentage of
                 all tumours        benign tumours    benign tumours    invasive tumours   invasive tumours
                 classified as      classified as     classified as     classified as      classified as
                 moderate risk      low risk          high risk         low risk           high risk
All women (n = 1938)
  RCOG           31.4               62.1              5.3               5.2                73.6
  IOTA           17.8               71.1              8.8               3.0                88.6
Premenopausal women (n = 1196; 61.7%)
  RCOG           27.5               71.6              3.9               10.1               56.7
  IOTA           14.1               82.1              5.1               6.7                78.3
Postmenopausal women (n = 742; 38.3%)
  RCOG           37.8               36.8              9.2               3.3                80.1
  IOTA           23.7               42.2              18.6              1.6                92.6
Available CA125 level (n = 1504; 77.6%)
  RCOG           32.4               59.6              6.0               5.6                73.3
  IOTA           18.4               68.3              10.4              3.3                88.0
Available information on CA125 and metastasis (n = 1147; 59.2%)
  RCOG           33.6               56.2              6.9               4.7                74.0
  IOTA           18.6               65.9              11.7              3.2                88.2
Oncology referral centres (n = 1165; 60.1%)
  RCOG           31.5               59.3              6.8               4.8                75.5
  IOTA           17.8               67.6              11.5              2.9                88.0
Referral centres for ultrasonography (n = 249; 12.8%)
  RCOG           35.4               62.1              3.3               16.7               47.3
  IOTA           16.5               78.2              4.1               0                  100
General hospitals (n = 524; 27.0%)
  RCOG           29.4               66.6              3.9               3.1                73.2
  IOTA           18.5               73.5              6.7               4.7                87.5

Discussion
This study has shown that the IOTA protocol is more
accurate than the RMI-based protocol recommended by
the RCOG for classifying adnexal masses as being at low,
moderate or high risk of malignancy. The IOTA protocol
resulted in a substantial increase in the number of benign
ovarian masses classified as low risk and in the number of
invasive tumours classified as high risk. It was also associated with a reduction in the number of invasive tumours
classified as low risk. The difference between the protocols
was similar in all subgroups considered. Further, the IOTA
protocol performed fairly consistently in all histological
types of invasive tumours, whereas the RCOG protocol
showed poorer performance in mucinous, non-epithelial
primary invasive, and metastatic invasive tumours. This is
likely to be explained by the lower expression of CA125 in
these tumour types (data not shown).
The major strengths of this study are that the protocols
were applied to a large, multicentre, international database.
This means that the results are likely to be robust. Also,
the performance of the RMI and the LR2 logistic regression
model was evaluated in a way that corresponds to their use
in clinical practice instead of, as in many other evaluation
studies, relying on the area under the receiver operating
characteristic curve.
Our study also has limitations, one being the missing
values for CA125. This issue is appropriately tackled with
the technique of multiple imputation,27 and with a comparison of the results based on multiple imputation
(Table 4) with those based only on cases with available
CA125 levels. It is reassuring that the results based on
women with available information were very similar to
those obtained after multiple imputation. Another weakness is that information on the presence of metastases
was sometimes missing. As described in the Methods section, different strategies to manage the missing values
were compared, and this did not affect the conclusion of
this study. Results based on women with available CA125
levels as well as information on the presence of metastases
(Table 6) were similar to the main results presented in
Table 4. A third criticism could be that both the RMI
and the LR2 were used in the hands of experienced ultrasound operators with a specific interest in adnexal pathology, and that therefore, the results of this study might
not be generalisable to less experienced ultrasound examiners. However, we believe that it should be possible for
any qualified ultrasound practitioner to obtain reliable
information on all the ultrasound variables required for
both models.
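
How results from several imputed data sets are commonly combined can be illustrated with Rubin's rules. The sketch below is only an illustration under stated assumptions: the text above does not state which pooling formulae were used, and the function name `pool_rubin` and the example numbers are hypothetical and do not come from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical proportions classified as low risk in three imputed data sets,
# each with a hypothetical sampling variance of 0.0004.
pooled, total = pool_rubin([0.70, 0.72, 0.74], [0.0004, 0.0004, 0.0004])
print(round(pooled, 4), round(total, 6))
```

The total variance adds the between-imputation spread to the average within-imputation variance, so uncertainty caused by the missing CA125 values is carried into the pooled result rather than ignored.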

According to our results, the IOTA protocol would result
in more women appropriately undergoing minimally invasive surgery, whereas most invasive cancers would still be
correctly referred to gynaecological oncologists. As a consequence, the use of the IOTA triage system would have a
beneficial impact on the management of women with
adnexal tumours. Additionally, because the IOTA protocol
labelled fewer women as moderate risk, fewer women
would receive an uncertain diagnosis. This might be a psychological advantage. The IOTA protocol may, however,
lead to an increase in the number of women with a benign
tumour being unnecessarily referred to a cancer centre. We
believe that this disadvantage is of limited importance relative to the benefits of the IOTA protocol.
The relative performance of the two models with respect
to borderline tumours merits a comment. The IOTA
protocol classified more borderline tumours as high risk
than the RCOG protocol. This implies that with the IOTA
protocol more women with borderline tumours would be
managed in gynaecological oncology centres. We argue that
this is an advantage because, even though there may be more
conservative fertility-sparing treatment options in these
women, decisions about such treatments are likely to be
better handled by subspecialist gynaecological oncologists.
Some may argue that the RMI is a simpler method to use
in everyday clinical practice than the LR2, because the RMI
can be calculated without the use of a computer. However,
nowadays computers are routinely used in most, if not
all, hospitals and outpatient clinics, and the LR2 risk can
be easily calculated using an Excel file. It should also be
easy to incorporate the LR2 risk calculation model in a
database handling patient data.
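
To illustrate how little computation either triage rule needs once its inputs are available, the following is a minimal sketch, not the authors' software. It assumes the commonly cited form of the RMI from reference 9 (ultrasound score multiplied by menopausal score multiplied by serum CA125). It treats the LR2 risk as an already computed probability, because the model coefficients are not given in this section. The function names, the handling of values that fall exactly on a cut-off, and the cut-off values in the usage lines are assumptions for illustration only.

```python
def rmi(ultrasound_features: int, postmenopausal: bool, ca125: float) -> float:
    """Risk of Malignancy Index as the product U x M x CA125.

    U is 0, 1 or 3 for zero, one, or two or more suspicious ultrasound
    features; M is 3 for postmenopausal women and 1 otherwise.
    """
    if ultrasound_features < 0 or ca125 < 0:
        raise ValueError("inputs must be non-negative")
    u = 0 if ultrasound_features == 0 else 1 if ultrasound_features == 1 else 3
    m = 3 if postmenopausal else 1
    return u * m * ca125


def triage(value: float, low_cutoff: float, high_cutoff: float) -> str:
    """Map an index or a predicted risk to low, moderate or high risk.

    Values below low_cutoff are low risk, values above high_cutoff are
    high risk, and everything in between is moderate risk (an assumed
    convention for values exactly on a cut-off).
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    if value < low_cutoff:
        return "low"
    if value > high_cutoff:
        return "high"
    return "moderate"


# Illustrative cut-offs only (assumed here, not restated in this section).
print(triage(rmi(2, True, 40.0), low_cutoff=25, high_cutoff=250))
print(triage(0.03, low_cutoff=0.05, high_cutoff=0.25))
```

The same `triage` function serves both protocols because each one reduces to comparing a single number with two cut-offs; only the number being compared differs.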
LR2 was developed on a large dataset including women
from nine clinical centres in five countries. This should
make the model generally applicable. LR2 has been shown
to have only slightly lower performance than LR1 (a
model with 12 predictors) on internal, temporal, and
external validation.15-18 Moreover, LR2 and LR1 have been
shown to perform as well as advanced algorithms based on
neural networks and support vector machines.16,30 An
interesting issue is whether adding information on CA125
to the LR2 model would improve patient triage. We therefore developed a model where CA125 was added to the six
predictor variables used in LR2, using the same data as
those on which LR2 was developed.15 The triaging protocol
based on this model (labelled the IOTA + CA125 protocol)
is based on the same risk thresholds as the IOTA protocol.
Triage results of the IOTA + CA125 protocol are shown in
Table S1. When comparing the IOTA protocol with the
IOTA + CA125 protocol using the Net Reclassification
Improvement measure, the addition of CA125 slightly
improved triage of invasive tumours (P = 0.12) but also
slightly deteriorated triage of benign tumours (P = 0.44).
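
The Net Reclassification Improvement used in this comparison (reference 25) can be sketched for a three-level triage as follows. This is a minimal illustration of the category-based measure, computed separately for tumours with and without the outcome of interest, in line with the separate results quoted above for invasive and benign tumours. The function name, the coding of the categories and the toy data are assumptions, and the significance tests are not reproduced.

```python
LEVELS = {"low": 0, "moderate": 1, "high": 2}


def net_reclassification(old, new, is_event):
    """Category-based net reclassification for paired classifications.

    Returns (nri_events, nri_nonevents): for events, the proportion moved
    to a higher category minus the proportion moved lower; for non-events,
    the proportion moved lower minus the proportion moved higher.
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for o, n, event in zip(old, new, is_event):
        shift = LEVELS[n] - LEVELS[o]
        if event:
            n_e += 1
            up_e += 1 if shift > 0 else 0
            down_e += 1 if shift < 0 else 0
        else:
            n_n += 1
            up_n += 1 if shift > 0 else 0
            down_n += 1 if shift < 0 else 0
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e, (down_n - up_n) / n_n


# Toy data: four invasive tumours (events) followed by four benign tumours.
old = ["moderate", "moderate", "high", "high",
       "moderate", "high", "low", "low"]
new = ["high", "high", "high", "moderate",
       "low", "moderate", "low", "moderate"]
event = [True, True, True, True, False, False, False, False]
print(net_reclassification(old, new, event))
```

Positive values mean that the second protocol moves more tumours in the clinically desirable direction than in the undesirable one; the overall measure is the sum of the two components.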


Conclusion
In summary, we have shown that a management protocol
based on triaging women using the IOTA logistic regression model LR2 performs significantly better than the
RMI-based protocol that is currently proposed by the
RCOG. LR2 has been developed on a large database of
adnexal masses and undergone external validation in several different centres. We believe that the LR2 model
should be considered as an alternative to the RMI for
inclusion in triaging protocols for adnexal pathology.

Disclosure of interest
The authors declare that they have no competing interests.

Contribution to authorship
BVC, DT, LV, AM and TB designed the study. DT, LV,
ACT and IV were involved in data acquisition. BVC performed the statistical analysis. BVC, DT, LV, AM, SGM
and TB interpreted the results and BVC and TB drafted the
manuscript. BVC, DT, LV, AM, SGM, ACT, IV, TB revised
the manuscript and approved the final version of the manuscript.

Details of ethics approval


The University Hospitals K.U.Leuven is the coordinating
centre. The protocol of IOTA phase 2 was approved by the
Ethics Committee of the University Hospitals K.U.Leuven
(Commissie Medische Ethiek) on 23 November 2005 with
reference number ML1248. The research protocol was further ratified by the local Ethics Committee at each recruitment centre.

Supporting Information
Additional Supporting Information may be found in the
online version of this article.
Table S1. Triage results for patients when using the
RCOG protocol, the IOTA protocol, and the IOTA+CA125
protocol.
Please note: Wiley-Blackwell is not responsible for the
content or functionality of any supporting information
supplied by the authors. Any queries (other than missing
material) should be directed to the corresponding
author.

Funding
This work was supported by the Research Foundation
Flanders (FWO) (grants 1251609N, 1251612N, G049312N);
Flanders Agency for Innovation by Science and Technology
(IWT) (grant IWT-TBM070706-IOTA3); Swedish Medical
Research Council: grant numbers K2001-72X-11605-06A,
K2002-72X-11605-07B, K2004-73X-11605-09A and K2006-73X-11605-11-3; funds administered by Malmo University
Hospital; and two Swedish governmental grants: ALF-medel and Landstingsfinansierad Regional Forskning. Tom
Bourne is supported by Imperial Healthcare NHS Trust
NIHR Biomedical Research Centre.

Acknowledgements
B. Van Calster is a postdoctoral fellow of the Research Foundation-Flanders (FWO). We acknowledge the free use of a dedicated
study screen developed by astraia GMBH. The Steering Committee of the International Ovarian Tumor Analysis (IOTA) study
group consists of Dirk Timmerman (University Hospitals
K.U.Leuven, Leuven, Belgium), Lil Valentin (Skane University
Hospital Malmo, Lund University, Sweden), Tom Bourne (Imperial College Hammersmith Campus, London, UK), Antonia C
Testa (Università Cattolica del Sacro Cuore, Rome, Italy), Sabine
Van Huffel (Katholieke Universiteit Leuven, Leuven, Belgium) and
Ignace Vergote (University Hospitals K.U.Leuven, Leuven, Belgium). The IOTA principal investigators are (in alphabetical
order) Jean-Pierre Bernard (Maurepas, France), Artur Czekierdowski (Lublin, Poland), Elisabeth Epstein (Lund and Stockholm,
Sweden), Enrico Ferrazzi (Milano, Italy), Daniela Fischerova
(Prague, Czech Republic), Dorella Franchi (Milano, Italy), Robert
Fruscio (Monza, Italy), Stefano Greggi (Napoli, Italy), Stefano
Guerriero (Cagliari, Italy), Jingzhang (Beijing, China), Davor Jurkovic (London, UK), Fabrice Lecuru (Paris, France), Francesco
PG Leone (Milano, Italy), Andrea A Lissoni (Monza, Italy), Ulrike Metzger (Paris, France), Henry Muggah (Hamilton, ON,
Canada), Dario Paladini (Napoli, Italy), Alberto Rossi (Udine,
Italy), Luca Savelli (Bologna, Italy), Antonia C Testa (Roma,
Italy), Dirk Timmerman (Leuven, Belgium), Diego Trio (Milano,
Italy), Lil Valentin (Malmo, Sweden), Caroline Van Holsbeke
(Genk, Belgium), Gerardo Zanetta [deceased] (Monza, Italy).

References
1 Cancer Research UK. Ovarian cancer statistics UK. 2011. [Available at: http://info.cancerresearchuk.org/cancerstats/types/ovary/]. Last accessed 13 February 2012.
2 Ferlay J, Parkin DM, Steliarova-Foucher E. Estimates of cancer incidence and mortality in Europe in 2008. Eur J Cancer 2010;46:765-81.
3 Bristow RE, Tomacruz RS, Armstrong DK, Trimble EL, Montz FJ. Survival effect of maximal cytoreductive surgery for advanced ovarian carcinoma during the platinum era: a meta-analysis. J Clin Oncol 2002;20:1248-59.
4 Greving JP, Vernooij F, Heintz APM, van der Graaf Y, Buskens E. Is centralization of ovarian cancer care warranted? A cost-effectiveness analysis. Gynecol Oncol 2009;113:68-74.
5 Vergote I, De Brabanter J, Fyles A, Bertelsen K, Einhorn N, Sevelda P, et al. Prognostic importance of degree of differentiation and cyst rupture in stage I invasive epithelial ovarian carcinoma. Lancet 2001;357:176-82.
6 Bristow RE, Berek JS. Surgery for ovarian cancer: how to improve survival. Lancet 2006;367:1558-60.
7 Earle CC, Schrag D, Neville BA, Yabroff KR, Topor M, Fahey A, et al. Effect of surgeon specialty on process of care and outcomes for ovarian cancer patients. J Natl Cancer Inst 2006;98:172-80.


8 Engelen MJA, Kos HE, Willemse PHB, Aalders JG, de Vries EGE, Schaapveld M, et al. Surgery by consultant gynecologic oncologists improves survival in patients with ovarian carcinoma. Cancer 2006;106:589-98.
9 Jacobs I, Oram D, Fairbanks J, Turner J, Frost C, Grudzinskas JG. A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer. Br J Obstet Gynaecol 1990;97:922-9.
10 Geomini P, Kruitwagen R, Bremer GL, Cnossen J, Mol BWJ. The accuracy of risk scores in predicting ovarian malignancy. A systematic review. Obstet Gynecol 2009;113:384-94.
11 Clarke SE, Grimshaw R, Rittenberg P, Kieser K, Bentley J. Risk of malignancy index in the evaluation of patients with adnexal masses. J Obstet Gynaecol Can 2009;31:440-5.
12 Bailey J, Tailor A, Naik R, Lopes A, Godfrey K, Hatem HM, et al. Risk of malignancy index for referral of ovarian cancer cases to a tertiary center: does it identify the correct cases? Int J Gynecol Cancer 2006;16(Suppl. 1):30-4.
13 National Collaborating Centre for Cancer. Recognition and initial management of ovarian cancer: NICE clinical guideline 122. London, UK: National Institute for Health and Clinical Excellence, 2011. Available from: www.nice.org.uk/CG122. Last accessed 13 February 2012.
14 Rufford BD, Jacobs IJ. Green-top Guideline No. 34. Ovarian cysts in postmenopausal women. London, UK: Royal College of Obstetricians and Gynaecologists, 2003. Available from: http://www.rcog.org.uk/files/rcog-corp/GTG3411022011.pdf. Last accessed 13 February 2012.
15 Timmerman D, Testa AC, Bourne T, Ferrazzi E, Ameye L, Konstantinovic ML, et al. A logistic regression model to distinguish between the benign and malignant adnexal mass before surgery: a multicenter study by the International Ovarian Tumor Analysis (IOTA) group. J Clin Oncol 2005;23:8794-801.
16 Van Holsbeke C, Van Calster B, Testa AC, Domali E, Lu C, Van Huffel S, et al. Predicting malignancy of ovarian tumors: prospective evaluation of models from the IOTA study. Clin Cancer Res 2009;15:684-91.
17 Timmerman D, Van Calster B, Testa AC, Guerriero S, Fischerova D, Lissoni AA, et al. Ovarian cancer prediction by logistic regression models: a prospective evaluation of diagnostic accuracy. Ultrasound Obstet Gynecol 2010;36:226-34.
18 Van Holsbeke C, Van Calster B, Bourne T, Ajossa S, Testa AC, Guerriero S, et al. External validation of diagnostic models to estimate the risk of malignancy in adnexal masses. Clin Cancer Res 2012;18:815-825.
19 Van Calster B, Valentin L, Van Holsbeke C, Testa AC, Bourne T, Van Huffel S, et al. Polytomous diagnosis of ovarian tumors as benign, borderline, primary invasive or metastatic: development and validation of standard and kernel-based risk prediction models. BMC Med Res Methodol 2010;10:96.
20 Timmerman D, Ameye L, Fischerova D, Epstein E, Melis GB, Guerriero S, et al. Simple ultrasound rules to distinguish between benign and malignant adnexal masses before surgery: prospective validation by IOTA group. BMJ 2011;341:c6839.
21 Timmerman D, Valentin L, Bourne TH, Collins WP, Verrelst H, Vergote I. Terms, definitions and measurements to describe the sonographic features of adnexal tumors: a consensus opinion from the International Ovarian Tumor Analysis (IOTA) group. Ultrasound Obstet Gynecol 2000;16:500-5.
22 Kenemans P, van Kamp GJ, Oehr P, Verstraeten RA. Heterologous double-determinant immunoradiometric assay CA 125 II: reliable second-generation immunoassay for determining CA 125 in serum. Clin Chem 1993;39:2509-13.
23 Heintz APM, Odicino F, Maisonneuve P, Quinn MA, Benedet JL, Creasman WT, et al. Carcinoma of the ovary. Int J Gynaecol Obstet 2006;95(Suppl. 1):S161-92.
24 Davies AP, Jacobs I, Woolas R, Fish A, Oram D. The adnexal mass: benign or malignant? Evaluation of a risk of malignancy index. Br J Obstet Gynaecol 1993;100:927-31.
25 Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under ROC curve to reclassification and beyond. Stat Med 2008;27:157-72.
26 Little JR, Rubin D. Statistical analysis with missing data, 2nd edn. New York, NY: Wiley; 2002.
27 Sterne JAC, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ 2009;338:b2393.
28 Schenker N, Taylor JMG. Partially parametric techniques for multiple imputation. Comput Stat Data Anal 1996;22:425-46.
29 Li K-H, Meng X-L, Raghunathan TE, Rubin DB. Significance levels from repeated p-values with multiply-imputed data. Stat Sin 1991;1:65-92.
30 Van Calster B, Timmerman D, Lu C, Suykens JAK, Valentin L, Van Holsbeke C, et al. Preoperative diagnosis of ovarian tumors using Bayesian kernel-based methods. Ultrasound Obstet Gynecol 2007;29:496-504.
