Documentos de Académico
Documentos de Profesional
Documentos de Cultura
www.elsevier.com/locate/eswa
Abstract
The medical diagnosis by nature is a complex and fuzzy cognitive process, and soft computing methods, such as neural networks, have
shown great potential to be applied in the development of medical decision support systems (MDSS). In this paper, a multiplayer perceptron-
based decision support system is developed to support the diagnosis of heart diseases. The input layer of the system includes 40 input
variables, categorized into four groups and then encoded using the proposed coding schemes. The number of nodes in the hidden layer is
determined through a cascade learning process. Each of the 5 nodes in the output layer corresponds to one heart disease of interest. In the
system, the missing data of a patient are handled using the substituting mean method. Furthermore, an improved back propagation algorithm
is used to train the system. A total of 352 medical records collected from the patients suffering from five heart diseases have been used to train
and test the system. In particular, three assessment methods, cross validation, holdout and bootstrapping, are applied to assess the
generalization of the system. The results show that the proposed MLP-based decision support system can achieve very high diagnosis
accuracy (O90%) and comparably small intervals (!5%), proving its usefulness in support of clinic decision process of heart diseases.
q 2005 Elsevier Ltd. All rights reserved.
Keywords: Medical decision support system; Multilayer perceptron; Back propagation algorithm; Performance of classifier; Heart disease
1. Introduction with many human organs other than the heart, and very
often heart diseases may exhibit various syndromes. At the
Statistics have consistently shown that heart disease is same time, different types of heart diseases may have
one of the leading causes of deaths in US and all over the similar symptoms.
world (CDCs report). Significant life saving, however, can To reduce the diagnosis time and improve the diagnosis
be achieved if an accurate diagnosis decision can be accuracy, it has become more of a demanding issue to
promptly made to patients suffering various types of heart develop reliable and powerful medical decision support
diseases, after which an appropriate treatment can immedi- systems (MDSS) to support the yet and still increasingly
ately follow. Unfortunately, accurate diagnosis of heart complicated diagnosis decision process. The medical
diseases has never been an easy task. As a matter of fact, diagnosis by nature is a complex and fuzzy cognitive
many factors can complicate the diagnosis of heart diseases, process, hence soft computing methods, such as neural
often causing the delay of a correct diagnosis decision. For networks, have shown great potential to be applied in the
instance, the clinic symptoms, the functional and the development of MDSS of heart diseases. In (Long, Naimi &
pathologic manifestations of heart diseases are associated Criscitello, 1992), a probability network-based heart failure
program was developed to assist physicians in reasoning
about patients, which produced appropriate diagnoses about
* Corresponding author. Tel.: 702 895 2533; fax: 702 895 4075. 90% of the time on the training set. Azuaje et al. (Azuaje
E-mail addresses: hmyan@uestc.edu.cn (H. Yan), yingtao@egr.unlv. et al., 1999) employed artificial neural networks (ANN) to
edu (Y. Jiang), zheng@cs.qc.edu (J. Zheng). recognize Poincare-plot-encoded heart rate variability
0957-4174/$ - see front matter q 2005 Elsevier Ltd. All rights reserved. patterns related to the risk of the coronary heart disease.
doi:10.1016/j.eswa.2005.07.022 Tkacz et al. (Tkacz & Kostka, 2000) demonstrates how
H. Yan et al. / Expert Systems with Applications 30 (2006) 272281 273
wavelet neural networks (WNN) can be applied for disease 2. The MLP architecture and the BP algorithm
classification useful to diagnose coronary artery disease at
different levels. For the diagnosis of congenital heart In this section, the theoretical background of the neural
diseases, Reategui et al. (Reategui et al., 1997) proposed a network architecture and the learning algorithm pertaining
model by integrating case-based reasoning with neural to our study are reviewed. Improvements to these
networks. In (Tsai & Watanabe, 1998), fuzzy reasoning algorithms are also discussed.
optimized by genetic algorithm was used for the classifi-
cation of myocardial heart disease. 2.1. The multilayer perceptron (MLP)
All the above studies, with different types of soft
computing methods being applied, deal with only one MLP is one of the most frequently used neural network
specific type of heart disease individually. In fact, of all the architectures in MDSS (Bishop, 1995; Hand, 1997; Ripley,
known heart diseases, hypertension, coronary heart disease, 1996), and it belongs to the class of supervised neural
rheumatic valvular heart disease, chronic cor pulmonale, networks. The multilayer perceptron consists of a network
and congenital heart disease have been identified as the five of nodes (processing elements) arranged in layers. A typical
most common ones (http://www.ifcc.org/ejifcc/vol14- MLP network consists of three or more layers of processing
no2/140206200308n.htm; Fuster et al., 2001). No study, nodes: an input layer that receives external inputs, one or
however, has yet attempted to consider a system to more hidden layers, and an output layer which produces the
differentiate the diagnosis of all these five major heart classification results (Fig. 1). Note that unlike other layers,
diseases simultaneously, which is essential to assist no computation is involved in the input layer. The principle
physicians for their diagnosis. of the network is that when data are presented at the input
Multilayer perceptron (MLP) is one of the most layer, the network nodes perform calculations in the
popular neural network models due to its clear successive layers until an output value is obtained at each
architecture and comparably simple algorithm. In this of the output nodes. This output signal should be able to
paper, we demonstrate that such a complex system to indicate the appropriate class for the input data. That is, one
support heart disease diagnosis can be developed can expect to have a high output value on the correct class
following a computational model based on a multilayer node and low output values on all the rest.
perceptron network consisting of three layers: (i) one A node in MLP can be modeled as an artificial neuron
input layer, (ii) one hidden layer, and (iii) one output (Fig. 2), which computes the weighted sum of the inputs at
layer. The input layer of the system takes in 40 variables, the presence of the bias, and passes this sum through the
extracted from an experimental study on a large number activation function. The whole process is defined as follows:
of patient cases. The number of nodes in the hidden
X
p
layer is determined by a cascade learning process. The nj Z wji xi C qj
output layer has 5 nodes corresponding to the heart iZ1
(1)
diseases of interest. The whole system is trained using an yj Z fj nj
improved back-propagation (BP) algorithm. Moreover,
where vj is the linear combination of inputs x1 ; x2 ; .; xp , qj
we have applied cross validation, holdout and boot-
is the bias, wji is the connection weight between the input xi
strapping methods to assess the generalization of the
system. The experimental results indicate that our Output Classes yj Level
proposed system has a strong capability to classify the
five heart diseases with high accuracy (O90%) and
comparable small intervals (!5%), proving its usefulness Output
Layer Connection
... j
in support of clinic diagnosis decision process. As
pointed out in (David & Vivian, 2000), one has to Weights
recognize that the purpose of our system, like many
other MDSS systems currently employed or under
development, is to augment, not to replace the human Hidden
Layer(s)
... k
diagnosticians in the complex process of medical
diagnosis.
In what follows, we first review the architecture of MLP
network and the basic BP algorithm in Section 2. The
improvement of the learning/training strategy is also Input ... i
Layer
discussed in that section. In Section 3, the methods of
building the decision support system based on an MLP,
are presented. The experimental results are reported Input Pattern Feature Value xi
in Section 4. Finally, the conclusions are summarized in
Section 5. Fig. 1. Architecture of a multilayer perceptron network.
274 H. Yan et al. / Expert Systems with Applications 30 (2006) 272281
the QuasiNewton method, and the conjugate gradients Basic Age [0, 1]
information Sex (0, 1)
method. In this study, the conjugate gradients method is
Cough (-1, 0.5, 1)
adopted, as it has a low computation cost and exhibits good Emptysis/Expectoration (-1, 0.5, 1)
results (Polak, 1971). The connection weights thus can be
Dyspnea (-1, 0.5, 1)
expressed by:
Gasping/Panting (-1, 0.5, 1)
Wt C 1 Z Wt C htdt (8) Symptom
Nausea/vomiting (-1, 0.5, 1)
Dizziness (-1, 0.5, 1)
Headache (-1, 0.5, 1)
dt ZKVEWt C btdtK1 (9) Fever (-1, 0.5, 1)
Cyanosis (-1, 0.5, 1)
d0 ZKVEW0 (10) Palpitations (-1, 0.5, 1)
Weakness or fatigue (-1, 0.5, 1)
where PE is the gradient, d(t) is conjugate gradient, h(t) is Heart Disease Discomfort, pressure, (-1, 0.5, 1)
the step wide, b(t) is determined given by PolakRibiere Variables heaviness in the chest
function shown in Eq. (11). Chest pain (precordial
(-1, 0.5, 1)
region, posterior sternal)
T
VEWtKVEWtK1 VEWt Edema of lower limb (-1, 0.5, 1)
bt Z (11) Known factors of
VEWtK1T VEWtK1 inducement
(-1, 0, 1)
In this section, we will present the details regarding each of Sound of breath in pulmonary (-1, 0.5, 1)
these layers. Moist rales (-1, 0.5, 1)
diagnosis. A physician reaches a correct diagnosis or Second heart sound (-1, 0.5, 1)
treatment decision based upon observations, the patients Liver and kidney tenderness (-1, 0.5, 1)
answers to questions, and physical examinations or lab
Hyperresonant note (-1, 0.5, 1)
results. A MDSS system simulates the diagnostic process by Eminence in
(-1, 0.5, 1)
transforming the patient information into mathematical precordial region
Cardiac rhythm in
variables. These variables then can be processed using good order or not
(-1, 0.5, 1)
accuracy rate
if it outbreaks chronically for a long time. 0.8
(iv) The rest variables are encoded using the three-value 0.6
0.4
ordinal scales, with K1 representing the absence of
0.2
the attribute, 1 representing the largest presence, and
0.0
0.5 representing intermediate level of the variables. 0 5 10 15 20 25 30
For instance, headache can be encoded as 1 if a Number of hidden layer nodes
patient feels an acute headache, and K1 if he/she
Fig. 4. Accuracy rate (curve 1) and learning time (curve 2) vs. number of
doesnt feel headache at all. In between, the situation hidden nodes.
is coded as 0.5.
Fig. 4 shows the result of the accuracy rate (curve 1) and
The complete encoding values of all 40 variables are the learning time (curve 2) with respect to the numbers of
listed in Fig. 3. Note that the encoded values of the input hidden nodes. Here one can see that, the network can
variables are mainly for numerically setting the connections achieve a high accuracy rate with a comparable learning
of neurons in the network. time when the number of hidden nodes is between 10 and
15. When the number of hidden nodes reaches 15, the
3.2. Hidden layer network achieves the highest accuracy ratio of 91.5%. As a
result, in our system, the hidden layer is determined to
Kavzoglu (Kavzoglu, 2001) has shown that the number consist of 15 nodes.
of neurons in a hidden layer will significantly influence the
networks ability to generalize from the training data to the 3.3. Output layer
unknown examples. A network with small hidden nodes
cannot fully identify the structures present in the training The output layer comprises of 5 nodes and each node
data (known as underfitting), while a network with large corresponds to one heart disease of interest. The architecture
hidden nodes may determine decision boundaries in vector of the overall decision support system is illustrated in Fig. 5.
space that are unduly influenced by the specific properties of
the training data. The latter phenomenon is known as
3.4. Summery about the system
overfitting. Some experimental formulas have been pre-
sented to decide the number of neurons needed in a hidden
In this section, we have shown that the proposed decision
layer (Kavzoglu, 2001), but they can only provide very
support system is a three-layer MLP with 40 input variables,
coarse estimation values. In fact, accuracy and convergence
speed are the two main parameters of a network, and we thus rheumatic
coronary valvular chronic cor congenital
determine the number of the hidden nodes through a cascade Hypertension
heart disease heart disease pulmonale heart disease
learning process to seek a balance between these two.
In the cascade learning process, a four-fold cross
validation method (more details are provided in Section 4) Output l=5
layer
is adopted. The whole process works as follows:
Table 1 Table 2
The training parameters The distribution of the heart diseases records
complete input data. However, one common problem in where n is the size of the dataset D, xi is the instance of D, yi
medical databases is the lack of information in the form of is the label of xi, and Di is the possible label of xi by
missing data values. For instance, some medical examin- the classifier. Here,
ations may not be appropriate for certain patients or the (
doctor failed to record some symptoms. Different methods 1 if i Z j
have been suggested to handle this missing data problem di; j Z (13)
0 otherwise
(Heckerling et al., 1991; Doyle et al., 1995). The simplest
method is to ignore the cases with missing data. A more In this study, we have also applied a four-fold cross-
complex approach may involve computing a complex validation method for the performance assessment. The data
structure like an ANN (Erkki et al., 1998). In our study, listed in Table 2 are thus equally divided into four subsets
278 H. Yan et al. / Expert Systems with Applications 30 (2006) 272281
Table 3
Number of cases with missing data values and the mean value of each diagnostic variable
Diagnostic variables Number of cases with missing data Mean of each variable
Basic Information Age 0 0.47
Sex 0 0.43
Symptoms Cough 0 0.18
Emptysis/expectoration 0 K0.59
Dyspnea 0 K0.36
Gasping/panting 0 0.18
Nausea/vomiting 0 K0.79
Dizziness 0 K0.19
Headache 0 K0.55
Fever 0 K0.90
Cyanosis 5 K0.79
Palpitations 0 0.17
Weakness or fatigue 2 K0.6
Discomfort, pressure, heaviness in 0 K0.37
the chest
Chest pain (precordial region, pos- 0 K0.50
terior sternal)
Edema of lower limb 12 K0.57
Inducement and history Known factors of inducement 0 K0.05
Upper respiratory infection 146 K0.69
Chronic pulmonary disease history 0 K0.67
Hypertension family history 43 K0.67
Coronary heart disease family his- 56 K0.65
tory
Physical examination Blood pressure 0 0.45
Cervical venous engorgement 0 K0.66
Barrel chest 0 K0.65
Sound of breath in pulmonary 0 K0.43
Moist rales 0 K0.71
Boundary of heart sonant 0 K0.20
Systolic murmur 0 K0.13
Diastolic murmur 0 0.43
Systolic thrill 41 K0.59
Second heart sound 87 K0.57
Liver and kidney tenderness 0 K0.51
Hyperresonant note 0 K0.65
Cardiac rhythm in good order or 0 K0.46
not
Eminence in precordial region 41 K0.66
Neck venous return 0 0.83
Pulmonic P-wave 74 K0.79
Cardiac enlargement 0 K0.07
Electrocardiogram 15 0.17
ST-T alteration 53 K0.55
(F1, F2, F3, F4) and four different sets of experiments have The experimental results of the four test sets have also
been performed: been presented as a confusion matrix (Table 5). Generally, a
confusion matrix contains information about the actual and
(i) Training: F1CF2CF3; Testing: F4 the predicted classes. In the confusion matrix, the columns
(ii) Training: F1CF2CF4; Testing: F3 represent the test data, while the rows represent the labels
(iii) Training: F1CF3CF4; Testing: F2 assigned by the classifier. Several indices of classification
(iv) Training: F2CF3CF4; Testing: F1 accuracy can be derived from the confusion matrix.
The cross-validation classification accuracy thus far can
be determined as
4.2.2. Results
75 C 66 C 77 C 56 C 48 322
The cross-validation estimated accuracy of each test run, Z Z 91:5% (14)
the interval of estimated accuracy and the mean accuracy 352 352
are listed in Table 4. Here the first column CV1 refers to the From the confusion matrix shown in Table 5, we can
first test based on the cross validation method. readily calculate the procedure accuracy and the user
H. Yan et al. / Expert Systems with Applications 30 (2006) 272281 279
Table 4
Accuracy result: cross-validation
CV 1 CV 2 CV 3 CV 4 Interval
Acch (%) 88.6 92.0 92.0 93.2 [88.6, 93.2]
Mean Acch (%) 91.5
Table 5
The confusion matrix of the diagnostic result of the heart diseases: cross-validation
Illness Coronary heart Rheumatic valvu- Hypertension Chronic cor Congenital heart Row sum
disease lar heart disease pulmonale disease
Coronary heart 75 3 3 1 0 82
disease
Rheumatic valvu- 1 66 1 0 3 71
lar heart disease
Hypertension 4 2 77 2 1 86
Chronic cor 2 0 2 56 0 60
pulmonale
Congenital heart 1 2 2 0 48 53
disease
Column sum 83 73 85 59 52 352
280 H. Yan et al. / Expert Systems with Applications 30 (2006) 272281
Table 6 Acknowledgements
Accuracy result: holdout
Holdout Holdout Holdout Holdout Interval This work was supported by NSFC 30400105. The
1 2 3 4 authors would like to thank Southwest Hospital and Dajiang
Acch (%) 93.2 92.0 93.2 89.8 [89.8, 93.2] Hospital, Chongqing, China, for their support in the heart
Mean Acch 92.0 disease database and valubale suggestions.
(%)
Table 7 References
Accuracy result: bootstrapping
CDCs report, http://www.cdc.gov/nccdphp/overview.htm.
Mean Interval Long, W. J., Naimi, S., & Criscitello, M. G. (1992). Development of a
Accs 0.921 [0.902, 0.946] knowledge base for diagnostic reasoning in cardiology. Computers in
xI 0.906 [0.893, 0.927] Biomedical Research, 25, 292311.
Accboot (%) 91.1 Azuaje, F., Dubitzky, W., Lopes, P., Black, N., & Adamsom, K. (1999).
Predicting coronary disease risk based on short-term RR interval
measurements: A neural network approach. Artificial Intelligence in
where b is the number of bootstrap samples, xi is the Medicine, 15, 275297.
accuracy estimate for bootstrap sample i, and Accs is the Tkacz, E. J., & Kostka, P. (2000). An application of wavelet neural
resubstitution accuracy estimate or apparent error rate. network for classification patients with coronary artery disease based
on HRV analysis. Proceedings of the Annual International
In this study, we have generated 5 bootstrapping Conference on IEEE Engineering in Medicine and Biology ,
samples from the full samples summarized in Table 2. 13911393.
The network then has been trained with each boot- Reategui, E. B., Campbell, J. A., & Leao, B. F. (1997). Combining a neural
strapping sample and the accuracy estimate has been network with case-based reasoning in a diagnostic system. Artificial
determined from the corresponding test set. The results Intelligence in Medicine, 9, 527.
Tsai, D. Y., & Watanabe, S. (1998). Method optimization of
are provided in Table 7. fuzzy reasoning by genetic algorithms and its application to
discrimination of myocardial heart disease. Proceedings of IEEE
Nuclear Science Symposium and Medical Imaging Conference ,
4.5. Summary 17561761.
http://www.ifcc.org/ejifcc/vol14no2/140206200308n.htm
Three different evaluation methods have been adopted to Fuster, V., Alexander, R. W., & ORourke, R. A. (2001). Hursts the heart.
assess the performance of the proposed MLP-based decision USA: McGraw Hill Professional.
support system. From the results reported in Tables 47, one David, W., & Vivian, W. (2000). Improving diagnostic accuracy using a
hierarchical neural network to model decision subtasks. International
can see that the system has a strong capability to accurately Journal of Medical Informatics, 57, 4155.
recognize all the five heart diseases (O90%) with Bishop, C. M. (1995). Neural networks for pattern recognition. Oxford:
comparable small intervals (5%). Oxford University Press.
Hand, D. J. (1997). Construction and assessment of classification rules.
New York: Wiley.
Ripley, B. D. (1996). Pattern recognition and neural networks. Cambridge:
Cambridge University Press.
5. Conclusion Duda, R. O., Hart, P. E., & Strok, D. G. (2001). Pattern classification. New
York: Wiley.
In this paper, we have presented a medical decision Shigetoshi, S., Toshio, F., & Takanori, S. (1995). A neural network
support system based on the MLP neural network architecture for incremental learning. Neurocomputing, 9,
111130.
architecture for heart disease diagnosis. In particular, Jang, J. S. R., Sun, C. T., & Mizutani, E. (1997). Neuro-fuzzy and soft
we have identified the 40 input variables critical to the computing. USA: Prentice Hall.
diagnosis of the heart diseases of interest and encoded Baba, N. (1989). A new approach for finding the global minimum of error
them accordingly. The system is trained by employing an function for neural networks. Neural Networks, 2, 367373.
improved BP algorithm. A heart diseases database Takeshi, F. (2001). Fusion of fuzzy/neuro/evolutionary computing
for knowledge acquisition. Proceedings of the IEEE, 89,
consisting of 352 cases has been used in this study. 12661274.
Three assessment methods, cross validation, holdout and Gill, P. E., Murray, W., & Wright, M. H. (1981). Practical optimization.
bootstrapping, have been applied to assess the general- London: Academic Press.
ization of the system. The results show that the proposed Polak, E. (1971). Computational methods in optimization. London:
system can achieve very high diagnosis accuracy Academic Press.
Kavzoglu, T. (2001). An investigation of the design and use of feed-
(O90%) and comparably small intervals (!5%), proving forward artificial neural networks in the classification of remotely
its usefulness in support of clinic diagnosis decision of sensed images, PhD Thesis, School of Geography, University of
heart diseases. Nottingham.
H. Yan et al. / Expert Systems with Applications 30 (2006) 272281 281
Heckerling, P., Elstein, A., Terzian, C., & Kushner, M. (1991). The effect of INNSIEEE International Joint Conference on Neural Networks,
incomplete knowledge on diagnoses of a computer consultant system. 2478, 2482.
Medical Informatics, 16, 363370. Kohavi, R., (1995). A study of cross-validation and bootstrap for accuracy
Doyle, H., Parmanto, B., Marino, I., Aldrighetti, L., & Doria, C. (1995). estimation and model selection. Proceedings of International Joint
Building clinical classifiers using incomplete observations-a neural Conference on Artificial Intelligence.
network ensemble for hepatoma detection in patients with cirrhosis. Penrod, C., & Wagner, T. (1979). Risk estimation for nonparametric
Methods of information in medicine, 34, 253258. discrimination and estimation rules: A simulation study. IEEE
Erkki, P., Matti, E., & Matti, J. (1998). Treatment of missing data values in Transactions on Information Theory, 25, 753758.
a neural network based decision support system for acute abdominal Jain, A. K., Dubes, R. C., & Chen, C. (1987). Bootstrap techniques for error
pain. Artificial Intelligence in Medicine, 13, 139164. estimation. IEEE Transactions Pattern Analysis and Machine Intelli-
Embrechts, M. J., Arciniegas, F. A., Breneman, C. M., Bennett, K. P., & gence, 5, 628632.
Lockwood, L. (2001). Bagging neural network sensitivity analysis Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap.
for feature reduction for In-Silico drug design. Proceedings of London: Chapman and Hall.