A comparative study of image features for classification of breast microcalcifications

2011 Meas. Sci. Technol. 22 114005 (http://iopscience.iop.org/0957-0233/22/11/114005)


IOP PUBLISHING Meas. Sci. Technol. 22 (2011) 114005 (9pp)

MEASUREMENT SCIENCE AND TECHNOLOGY

doi:10.1088/0957-0233/22/11/114005

I I Andreadis¹, G M Spyrou² and K S Nikita¹
¹ Biomedical Simulations and Imaging Laboratory, Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
² Biomedical Research Foundation, Academy of Athens, Athens, Greece

E-mail: iandr@biosim.ntua.gr, gspyrou@bioacademy.gr and knikita@cc.ece.ntua.gr

Received 15 December 2010, in final form 3 March 2011
Published 14 October 2011
Online at stacks.iop.org/MST/22/114005

Abstract
Computer-aided diagnosis systems for mammography have been developed to assist radiologists in the diagnostic process by providing reliable and objective discrimination between benign and malignant mammographic findings. The effectiveness of such systems depends on the image features extracted from mammograms, which are mainly related to the morphology, texture and optical density of the suspicious abnormality. Many methodologies reported in the literature provide a mathematical description of a mammographic lesion. In this paper, we apply various feature extraction methodologies to cases containing clusters of microcalcifications. Our purpose is to compare their performance on a large scale in terms of classification accuracy and to investigate their potential for discriminating benign from malignant clusters. Experiments were performed on 1715 cases (882 benign and 833 malignant) extracted from the Digital Database for Screening Mammography, which is the largest publicly available database of mammograms. The results of our study indicated that texture features outperformed the other categories considered, while the combination of the best features optimized the classification results, leading to an area under the receiver operating characteristic curve equal to 0.82.
Keywords: microcalcifications, feature extraction, CAD

1. Introduction
According to the American Cancer Society, excluding cancers of the skin, breast cancer is the most common cancer among women, accounting for nearly 1 in 4 cancers diagnosed in US women³. The causes of the disease remain unknown, but efficient and early diagnosis gives a patient a better chance of recovery. Currently, despite the introduction of new imaging techniques for breast cancer screening such as magnetic resonance imaging (MRI) or ultrasound examinations, mammography is still considered the most effective screening tool [1]. However, inherent limitations of the method led to the development of computer-aided diagnosis (CAD) systems, whose role is to interpret mammographic images objectively and assist radiologists in the diagnostic process by providing them with a reliable second opinion.
³ American Cancer Society, Breast Cancer Facts and Figures 2009-2010.

CAD systems can be categorized into two main types: computer-aided detection (CADe) systems, which focus on the detection of suspicious lesions, and computer-aided diagnosis (CADx) systems, which target the characterization of the lesion [2]. Extended reviews of the existing systems of each category may be found in [3] and [4], respectively. The most important types of mammographic lesions related to the existence of breast cancer are masses and clustered microcalcifications (MCs). In this study, we focus on the latter finding. MCs are tiny deposits of calcium which can be found anywhere in breast tissue [5]. The subtle nature of these findings makes their automated characterization a very challenging task. CADx systems which focus on classifying benign and malignant clusters should extract suitable and objective features through image analysis techniques, which, after proper selection, can contribute to the correct discrimination of the clusters. Many studies have focused on the development of algorithms for the classification of clustered MCs and various
feature extraction methods are reported [6-15]. The majority of researchers investigated the usefulness of morphological and shape features of individual MCs or features related to the distribution and the morphology of the cluster [6-10, 14, 15]. Kallergi [6] developed a classification scheme incorporating individual features of MCs, features of the MC clusters and the age of the patient. Nakayama et al [7] calculated five shape features and performed experiments on a database of magnification mammograms, while Papadopoulos et al [8] implemented features mainly related to the shape of the whole cluster of MCs. Betal et al [9] developed a set of features related to the cluster representation and its distribution, as well as to the irregularities and number of in-foldings of individual MCs. Veldkamp et al [10] focused on features mainly related to the shape of the cluster and its distribution, but they also incorporated features concerning the location of the cluster in the breast.

Several approaches exploit textural features for the discrimination between benign and malignant clusters [11-15]. Karahaliou et al [11] studied the texture properties of the tissue surrounding the cluster of MCs using four different categories of textural features: first-order statistics (FOS), gray-level co-occurrence matrices (GLCMs), gray-level run length matrices and Laws' texture energy measures. Pereira et al [12] applied features extracted from the spatial gray-level dependence matrix and the wavelet transform. A few studies have incorporated features extracted with different methodologies in order to obtain combined information or even to compare different groups of features in terms of classification accuracy. Chan et al [14] implemented both morphological and texture features of MCs, while Soltanian-Zadeh et al [15] compared the performance of four feature sets (co-occurrence-matrix-based, shape, wavelet and multiwavelet features).
Although a broad variety of features for characterizing MCs exists, there is no clear evidence as to which methodology is best suited to the task of discriminating benign and malignant clusters. Additionally, while many studies have focused on the development of algorithms for the classification of clustered MCs, no straightforward comparison can be made, as in most cases experiments are conducted on different datasets. As a result, we believe that there is a need for a comparative study of different image feature types that may indicate the group best suited to the classification task. To facilitate future comparisons, we perform this study on the largest publicly available mammogram database, the DDSM [16].

The rest of the paper is organized as follows: first, in section 2, we describe the dataset used in our study, the extracted features, the corresponding categories and the classification methodology used for our measurements. Next, in section 3, we describe in detail the structure of our experiments and report the results achieved. Finally, in sections 4 and 5, we discuss the outcome of our study and report conclusions that may form a proper baseline for our future work.

2. Methods and materials


2.1. Data collection
As mentioned above, we used cases provided by the DDSM, which contains digitized mammograms of approximately 2600 cases. Almost all cases contain images from both the mediolateral oblique (MLO) and craniocaudal (CC) views, together with information on the boundaries of the annotated regions where the lesion has been detected and a biopsy has been performed. Each case is also accompanied by the patient's demographic data and information on the digitization process. All mammograms of the DDSM have been digitized by one of the following scanners: LUMISYS 200 scanner at 12 bits per pixel and 50 µm spatial resolution, Howtek MultiRad850 and Howtek 960 at 43.5 µm and DBA digitizer (M2100 ImageClear) at 42 µm. Some cases contain more than one region of interest (ROI). The outline of the suspicious region of each ROI is provided as a chain code. We extracted all ROIs from the DDSM where clusters of MCs were detected. We excluded only cases with an unproven pathology result and a few cases whose ROI included almost the whole breast area. We ended up with a dataset of 1715 ROIs from both views.

For each case in the DDSM, the density of the breast is provided. The density of the breast is a very important factor which has to be considered when reading a mammogram. According to the BI-RADS standard system [17], there are four ratings of the breast based on its density: (1) entirely fat, (2) scattered fibroglandular densities, (3) heterogeneously dense and (4) extremely dense. In clinical practice, radiologists pay attention to the density of the breast, as it may be a good predictor of a woman's breast-cancer risk. Many studies have shown that most misclassifications occur in dense mammograms [18, 19]. Dense breasts have a greater proportion of glandular tissue, which can obscure clusters of MCs, making them difficult to detect. To this end, we made the following assumption.
Because our primary goal was not to assess the detection phase of MCs, but to perform a comparison between different feature types, we separated the mammograms into two categories and studied them individually: fatty tissues, comprising the first two ratings, and dense tissues, comprising the other two. We assumed that a categorization of mammograms based on their density is required in order to assess the different feature types in a more objective way. As a result, we ended up with two individual datasets: the first included 653 clusters (338 benign and 315 malignant) and the second included 1062 clusters (544 benign and 518 malignant).

2.2. ROI segmentation
Before extracting certain feature types, such as shape features of MCs, the MCs must be segmented from the surrounding breast tissue. We applied a detection algorithm presented in a previous work [20]. The output of this layer is a segmented image, where the image background is omitted and only the MCs remain. This process is depicted in figure 1.
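A segmented image of this kind is essentially a binary mask, and per-particle measurements first require separating it into individual MCs. The following is a minimal sketch of connected-component labeling with 8-connectivity. The function name `label_8_connected` and the toy mask are illustrative assumptions and do not reproduce the implementation of [20].

```python
from collections import deque


def label_8_connected(mask):
    """Label truthy pixels of a 2D list using 8-connectivity.

    Returns (labels, count): labels has the shape of mask, with 0 for
    background and 1..count for each connected particle.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    # The eight neighbours of a pixel: horizontal, vertical and diagonal.
    offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
               if (dr, dc) != (0, 0)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            # Unlabeled foreground pixel: start a new particle and flood it.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and not labels[nr][nc]:
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


# Hypothetical 4x5 binary image; diagonal pixels join under 8-connectivity.
toy_mask = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
]
toy_labels, toy_count = label_8_connected(toy_mask)
```

With 4-connectivity the two diagonal pixels in the top-left corner would be counted as separate particles, which matters for thin, irregular MCs.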


Figure 1. (a) Original ROI (DDSM: volume benign_02, case A_1265, RIGHT_MLO), (b) segmented image with detected MCs.

Afterward, we labeled each individual MC using 8-connectivity. Using these labels, we handled each MC in the next layer, in order to extract image features for each particle.

2.3. Feature extraction methods
Probably the most important step in the whole process of characterizing ROIs that include clusters of MCs is the feature extraction layer. In this step, the cluster has to be analyzed in an objective way in order to extract features that may prove to be crucial for the discrimination between malignancy and benignity. Several different feature extraction methodologies have been reported in the literature. In general, the features extracted should be associated with the strategy that each radiologist follows when performing his/her diagnosis. The most important parameters related to the malignancy of a cluster have been indicated to be the shape of the whole cluster, the number of MCs detected in the cluster and the shape of the MCs in the cluster. We tried to collect exhaustively the image features that have been used in previous works and are considered useful for the analysis of a cluster. Finally, we divided these features into six main categories, which are presented below.

2.3.1. Shape features of individual MCs. According to the BI-RADS system, the morphology of MCs is a substantial factor for their discrimination. In general, regular particles, such as round and oval ones with lucent centers, are considered benign, while thin, linear and small MCs are considered an important indicator of malignancy. For each individual MC, we extracted features representing its shape, such as the area, perimeter, compactness, circularity and elongation. We also computed shape descriptors based on the theory of moments. The theory of moments for the shape analysis of MCs has already been investigated in other studies [6, 15], showing promising results. Because our initial goal was not to compare shape extraction methods but different groups of features, we believed that an already tested method should be preferred. Additionally, the moment-based shape descriptors are concise, robust and of low computational complexity. In this way, we extracted features such as the eccentricity, spread and F3_F1 metric. Finally, statistics of each feature, such as the mean value, median and standard deviation, are obtained for the whole cluster.

2.3.2. Cluster shape features. Radiologists do not analyze exclusively the shape of the particles but also take into consideration the morphology of the whole cluster. To extract the features representing the shape of the cluster, we based our measurements on the convex hull of the centroids of all the MCs in the cluster. Features such as the number of MCs in the cluster, the calcification coverage, area, circularity, eccentricity and the perimeter of the cluster have been implemented.

2.3.3. Distribution features. Another group of features considered in this study consists of distribution modifiers that describe the arrangement of MCs within the cluster. According to the BI-RADS standard system, clustered MCs are considered less suspicious of malignancy, while linear, segmental and diffuse MCs are types of distribution of greater concern. Once again, we based our measurements on the convex hull of the cluster and the centroids of each particle to implement the features describing the spatial distribution of the MCs within a cluster, such as the distances between individual MCs or the distances from the cluster centroid.

2.3.4. Optical density of individual MCs. These features are related to the gray values of each individual particle. For each MC, we obtain a measure of its brightness and estimate its contrast to the surrounding tissue. Statistics of these features are obtained for the whole cluster.

As already mentioned, all the above groups of features are calculated from the images containing only the MCs segmented from the background tissue. As a result, these features are highly dependent on the segmentation phase. Consequently, the importance of certain features may be overestimated or underestimated depending on the preprocessing and segmentation phase. This is one of the most important drawbacks of the above feature types. To overcome it, many studies have incorporated in the feature extraction phase features that represent the texture of the surrounding tissue. Textural features represent the spatial distribution of gray levels, so their use allows us to investigate whether the presence of MCs alters the breast tissue. The most important advantage of this group is its robustness to the segmentation phase, as no segmentation of MC particles is needed.

2.3.5. Textural features. In this study, two categories of textural features were extracted: first-order statistics (FOS) and gray-level co-occurrence matrix (GLCM) features.

FOS. These statistics depend only on single pixel values and represent properties of the intensity histogram of the ROI. A total of 16 features were extracted for this group, containing statistics for the gray values of the ROI, kurtosis, skewness and statistics related to the distribution of the histogram.
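To make the idea of first-order statistics concrete, the sketch below computes a few histogram-based descriptors from the gray values of a ROI. It is an illustration only: the paper does not list its 16 FOS features individually, so the selection here (mean, standard deviation, skewness, kurtosis, histogram entropy), the function name and the toy ROI are assumptions.

```python
import math


def first_order_statistics(pixels, levels=256):
    """A few histogram-based statistics of a flat list of integer gray values."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of the intensity distribution.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    std = math.sqrt(m2)
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    # Normalized histogram and its Shannon entropy in bits.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    probs = [h / n for h in hist if h > 0]
    entropy = -sum(q * math.log2(q) for q in probs)
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


# Hypothetical ROI flattened to a list of 8-bit gray values.
toy_roi = [10, 10, 20, 20, 30, 30, 40, 40]
fos = first_order_statistics(toy_roi)
```

Because these quantities depend only on the histogram, shuffling the pixel positions of the ROI leaves them unchanged, which is what distinguishes them from the second-order (co-occurrence) statistics.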


GLCMs. GLCM features, proposed by Haralick et al [21], are used to characterize texture patterns and describe how often different combinations of pixel values occur in an image. These second-order statistics are derived from the GLCM. An element $P_{d,\theta}(i, j)$ describes the relative frequency of occurrence of a pair of gray levels $(i, j)$ separated by a distance $d$ in the direction $\theta$. In this work, four angles (0°, 45°, 90° and 135°) were used to obtain four co-occurrence matrices at distance $d = 1$. First, 13 textural measurements were derived from each of these matrices, and then their mean value, standard deviation and range across the four directions were computed.

The last category of features considered in this study is related to BI-RADS descriptors, which can be extracted from the annotation files of the mammographic images in the DDSM. They have been used by several authors and have contributed to high levels of classification accuracy [22, 23]. However, these features are not derived from image analysis of the ROI and cannot easily be adapted to clinical practice, because they reflect the radiologist's subjective diagnosis. Although these features are beyond the scope of this study, they have been incorporated for comparison.
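The co-occurrence computation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: pairs are counted symmetrically (each pair in both directions), gray values are already quantized to `levels` integers, rows are indexed downward (so 45° is "up and to the right"), and the summary uses the population standard deviation. Only two of the 13 Haralick measurements are shown, and all function names are hypothetical.

```python
def glcm(image, dr, dc, levels):
    """Normalized, symmetric co-occurrence matrix for pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                i, j = image[r][c], image[nr][nc]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Haralick contrast: sum of (i - j)^2 * P(i, j)."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def angular_second_moment(p):
    """Haralick angular second moment: sum of P(i, j)^2."""
    return sum(v * v for row in p for v in row)


# (row, column) offsets for distance d = 1 at 0, 45, 90 and 135 degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def summarize_over_angles(image, feature, levels):
    """Mean, standard deviation and range of one feature over the four angles."""
    values = [feature(glcm(image, dr, dc, levels)) for dr, dc in OFFSETS.values()]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {"mean": mean, "std": std, "range": max(values) - min(values)}


# Small 4-level test image.
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p0 = glcm(toy_image, *OFFSETS[0], levels=4)
contrast_summary = summarize_over_angles(toy_image, contrast, levels=4)
```

For the horizontal direction, the 12 neighboring pairs of the toy image give 24 symmetric counts, so `contrast(p0)` evaluates to 14/24 and `angular_second_moment(p0)` to 84/576.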

Table 1. Evaluation metrics per feature type for fatty tissues.

Feature type          ACC     SN      SP      Az
Shape                 0.6677  0.4127  0.9053  0.722
Cluster               0.6784  0.5524  0.7959  0.747
Distribution          0.6845  0.473   0.8817  0.73
OD                    0.6279  0.4921  0.7544  0.671
GLCM                  0.7014  0.6381  0.7604  0.776
FOS                   0.6554  0.546   0.7574  0.714
BI-RADS descriptors   0.8943  0.8667  0.9201  0.891
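As a reading aid for table 1, the sketch below shows how accuracy (ACC), sensitivity (SN) and specificity (SP) follow from confusion-matrix counts, with malignant taken as the positive class. The counts used in the example are not reported in the paper; they are back-calculated here, as an assumption, from the Shape row and the size of the fatty-tissue dataset (338 benign, 315 malignant).

```python
def evaluation_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of malignant clusters detected
    specificity = tn / (tn + fp)   # fraction of benign clusters recognized
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts for the Shape row: 130 of 315 malignant and 306 of 338 benign
# clusters classified correctly (inferred from the rates, not reported).
acc, sn, sp = evaluation_metrics(tp=130, fn=185, tn=306, fp=32)
```

Rounded to four decimals, these counts reproduce the Shape row (0.6677, 0.4127, 0.9053), which also confirms that the SN column refers to the malignant class.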

3. Results
3.1. SVM parameterization
To this end, the LIBSVM library [26] was used for our measurements. Because many kernels may be used for the training of an SVM classifier, we first compared the performance of several kernels to choose the most suitable one for our experiments with the DDSM. We evaluated three well-known kernels: the Gaussian RBF kernel, the linear kernel and the polynomial kernel [25, 26]. We observed that the latter two kernels were slower during training and achieved lower performance than the RBF kernel. For this reason, all experiments hereafter use the RBF kernel. The leave-one-out (LOO) method was applied for performance evaluation. To apply the SVM training algorithm, two parameters have to be adjusted: the regularization parameter $C$ and the kernel parameter $g$. A grid search was performed over the values $C \in \{2^{-3}, 2^{-2}, \ldots, 2^{15}\}$ and $g \in \{2^{-12}, 2^{-11}, \ldots, 2^{4}\}$, followed by a finer grid search in local neighborhoods.

3.2. Fatty tissues
The first part of our study focused on the image analysis of the ROIs which belong to fatty tissues, according to the density rating found in the DDSM annotation files. After completing the feature extraction process for the 653 clusters of this dataset, we evaluated each of the different groups of features. For each group, using the SVM classification scheme described above, the best feature set was selected with respect to overall classification accuracy by means of an exhaustive search. Table 1 provides the values of accuracy (ACC), sensitivity (SN), specificity (SP) and Az for each feature category. We note that the greatest accuracy was achieved by the BI-RADS descriptors provided by the DDSM. We expected this performance, as such features have already been used in previous studies, achieving high levels of accuracy.
Of course, because these features cannot easily be adapted to clinical practice, it is of great importance to compare the remaining image feature types. We observe that the GLCM textural features slightly outperform the other groups of features in terms of the Az value achieved. As expected, cluster and distribution features appear to be relatively important, while features related to the OD appear to be less relevant to the

2.3.6. BI-RADS descriptors. According to the BI-RADS standard system, we extracted features concerning the subtlety of an image, the assessment of the radiologist, the description of the shape and the distribution of the MCs in the cluster and the age of the patient. The corresponding features were encoded into numerical values, following a rank ordering system proposed by Lo et al [24]. The statistics of all the above-described features led to a set of 195 different features.

2.4. Classification scheme
Many classifiers, such as neural networks, linear discriminant analysis and k-nearest neighbors, have been used in the past for this specific two-class classification problem [6, 8, 10-15]. Because there is no consensus about which classifier is best suited for the discrimination between benign and malignant clusters, and in order to perform a direct comparison between different feature types, we chose the support vector machine (SVM) classifier as the classification tool in our experiments. SVMs [25] are learning machines based on intuitive geometric principles, aiming at the definition of an optimal hyperplane (maximal margin hyperplane, MMH), which linearly separates the training data so that a minimum expected risk is achieved. Suppose we have a training set $S$ of $m$ training points, $S = \{(x_i, y_i),\ i = 1, 2, \ldots, m\}$, where $x_i \in \mathbb{R}^n$ is the input vector and the scalar $y_i \in \{-1, 1\}$ denotes its class label. Using the SVM approach, we construct a decision function $df(s)$ that can correctly classify an input pattern $s$. After training, the decision function depends only on a small subset of the training vectors, which are called support vectors.
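For readers less familiar with kernel SVMs, the sketch below evaluates the standard dual-form decision function, a weighted sum of kernel values over the support vectors plus a bias, for a Gaussian RBF kernel. It is not a trained model: the support vectors, multipliers and bias are hand-picked hypothetical values, and training itself (done with LIBSVM in the paper) is not shown. The two grids at the end enumerate the coarse parameter search of section 3.1. Their exponent signs are reconstructed from a damaged source and should be treated as an assumption.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_function(s, support_vectors, labels, alphas, bias, gamma):
    """Signed score of input s; its sign gives the predicted class in {-1, +1}."""
    return sum(
        alpha * y * rbf_kernel(x, s, gamma)
        for x, y, alpha in zip(support_vectors, labels, alphas)
    ) + bias


# Hypothetical, hand-picked values (not the result of any training run).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0
gamma = 0.5

score_near_negative = decision_function(
    (0.1, 0.0), support_vectors, labels, alphas, bias, gamma)
score_near_positive = decision_function(
    (2.0, 1.9), support_vectors, labels, alphas, bias, gamma)

# Coarse search grids; exponent ranges are an assumption (see lead-in).
C_GRID = [2.0 ** k for k in range(-3, 16)]
GAMMA_GRID = [2.0 ** k for k in range(-12, 5)]
```

Only the support vectors enter the sum, which is why the decision function stays cheap to evaluate even when the training set is large.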


classification task. Figure 2 shows the ROC curves for the best feature set of each of the best four groups.

It is also important to investigate whether the combination of features from different groups can provide improved classification results. To this end, we worked in two different ways. The first approach involved retraining the SVM classifier on the whole set of features and evaluating its performance after proper feature selection. The second approach involved the application of a simple majority voting rule, where the class assigned to each sample is selected according to the majority of the outputs of the individual feature categories.

For the first approach, we created a 653 × 188 dataset matrix containing all image features, excluding only the BI-RADS descriptors, which are not extracted through image analysis. Due to the large number of features used in this phase, feature selection is needed as a preprocessing step in order to find a satisfactory feature subset, eliminating features that may be irrelevant to the classification task. A previous work [27] has indicated the potential of the recursive feature elimination (RFE) selection method proposed by Guyon et al [28] for the selection of an optimal subset. The RFE method generates a ranking of features through iterative, multivariate backward feature elimination. The ranking is based on the square of the weight that an SVM classifier assigns to each feature during the training phase. After normalization, the training set undergoes the feature selection stage, where the features are ranked according to the selection method. The top-ranked features are those considered by the method to be highly significant for the discrimination between benign and malignant clusters. We then keep an increasing number of features in top-to-bottom order, creating a different training set each time. Afterward, to evaluate the classification performance, we used the Az metric, which is considered to be one of the most reliable metrics in two-class classification problems. The Az values achieved by the RFE method are depicted in figure 3.

Figure 2. Comparison of ROC curves between the best four groups of features.

Figure 3. Az values by the RFE method for different subsets of top-ranked features in the training set.

Table 2. Evaluation metrics for the RFE method.

        ACC    SN     SP     Az     Number of features for max Az
RFE     0.735  0.651  0.814  0.826  23
No FS   0.744  0.727  0.76   0.79   188
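The backward-elimination loop behind RFE can be sketched as follows. The loop itself follows the description above (rank by squared weight, drop the weakest feature, refit), but the weight function is a deliberately simple stand-in: instead of a linear SVM it uses the difference between class means, which is an assumption made to keep the example self-contained. With a real SVM the weights would change after every elimination, which is what makes the method multivariate.

```python
def class_mean_difference_weights(X, y):
    """Stand-in for a linear SVM: w = mean(class +1) - mean(class -1)."""
    n_features = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(n_features)
    ]


def rfe_ranking(X, y, fit_weights):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []  # filled from the least important feature upward
    while remaining:
        # Refit on the surviving features only.
        X_sub = [[row[j] for j in remaining] for row in X]
        weights = fit_weights(X_sub, y)
        # Ranking criterion: squared weight; remove the smallest one.
        worst = min(range(len(remaining)), key=lambda k: weights[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


# Hypothetical normalized data: feature 0 separates the classes well,
# feature 2 weakly, feature 1 not at all.
toy_X = [
    [0.9, 0.5, 0.1],
    [0.8, 0.4, 0.3],
    [0.1, 0.5, 0.0],
    [0.2, 0.4, 0.2],
]
toy_y = [1, 1, -1, -1]
ranking = rfe_ranking(toy_X, toy_y, class_mean_difference_weights)
```

Nested training sets of the kind evaluated in figure 3 are then obtained by taking the first k entries of `ranking` for increasing k.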

We may note that the method is robust as far as classification is concerned, as the results do not change drastically beyond a specific number of features. The best classification results provided by the selection method are shown in table 2, together with the results with no feature selection (No FS). We see that the method achieves the maximum Az value and provides the best classification results for a small subset of 23 features. It is of great importance to examine which image features appear to be highly significant for the discrimination task. In table 3, we present the features considered most significant by the feature selection method. During the feature extraction process we considered six basic feature types. It is noticeable that the majority of the features ranked highest by RFE are related to the GLCM, the distribution and the morphology and shape of the cluster. This was expected, as these three feature categories showed the best classification performance. Many of them evidently play an important role in the discrimination task, which is why RFE ranks them highly. The combination of such features seems beneficial, as the Az value increases from 0.79 to 0.8259. In contrast, FOS and features representing OD appear to be less relevant to this discrimination task.

The second approach for combining different feature types involved a new classification scheme using a majority voting rule. To obtain a ROC curve for this scheme, we define for each case of the dataset the number of benign and malignant votes, both ranging from 0 to 6. Because a malignancy threshold must be defined to generate a ROC curve, we used the number of malignant votes, as proposed in [11]. The Az value obtained by this classification scheme was 0.757. In figure 4,


Table 4. Evaluation metrics per feature type for dense tissues.

Feature type   ACC     SN      SP      Az
Shape          0.5631  0.537   0.588   0.554
Cluster        0.6083  0.3803  0.8254  0.631
Distribution   0.6083  0.334   0.8695  0.606
Optical        0.5612  0.3938  0.7206  0.578
GLCM           0.6083  0.612   0.6048  0.636
FOS            0.5678  0.4653  0.6654  0.608

Figure 4. ROC curves for the two developed classification schemes.

Table 3. Top-ranked features.

Rank  Feature                                       Feature type
1     Range of distances between neighboring MCs    Distribution
2     Cluster's perimeter                           Cluster's shape
3     Minimum value of MCs' brightness              OD
4     Minor axis length                             Cluster's shape
5     Max. compactness                              MCs' shape
6     Std of sum variance                           GLCM
7     Difference entropy (90°)                      GLCM
8     Contrast (135°)                               GLCM
9     Equivalent diameter of cluster                Cluster's shape
10    Std of MCs' compactness                       MCs' shape
11    Cluster's eccentricity                        Cluster's shape
12    Std of contrast                               GLCM
13    Sum of squares of variance (90°)              GLCM
14    Maximal correlation coefficient (135°)        GLCM
15    Mean value of MCs' size                       MCs' shape
16    Range of MCs' distances from the centroid     Distribution
17    Contrast (0°)                                 GLCM
18    Minimum distance of neighboring MCs           Distribution
19    Difference entropy (45°)                      GLCM
20    Std of difference entropy                     GLCM
21    Range of sum entropy                          GLCM
22    Angular second moment (0°)                    GLCM
23    Mean value of difference entropy              GLCM

we present the two ROC curves achieved by the SVM classifier trained with the optimal reduced subset and by the voting-scheme-based classifier. The SVM classifier clearly outperforms the voting scheme.

3.3. Dense tissues
In the previous phase we compared the considered feature types and tried to find a subset of features able to combine the information content of each group and enhance the classification results, using only mammograms with fatty tissue. Our initial assumption was that mammograms of different density should be handled differently, because the detection of MCs in dense mammograms is a nontrivial problem of its own. As a result, the image measurements are strongly influenced by the segmentation algorithm, and the discrimination ability of certain feature types may be underestimated. As the scope of this study was not to investigate the potential of the segmentation algorithm, for direct comparison we applied the same segmentation algorithm and the same groups of features to a new dataset containing only dense tissues. As mentioned in the data collection section, this new set contained a much larger number of cases. Working as in the previous phase, we first compared the different groups. The corresponding results are presented in table 4. As expected, the results are much worse than in the case of fatty tissues. Observing the Az values achieved by each group, we may draw the same conclusion as in the case of fatty tissues regarding the greater effectiveness of GLCM features in comparison with the other groups. However, because of the low values achieved, no safe conclusions can be drawn. This experiment indicates that the combination of the segmentation algorithm and the features implemented is not adequate for discriminating between benign and malignant clusters of MCs in dense tissues. The reported Az values are close to 0.5, the value that indicates random performance in two-class classification problems. We again evaluated the dataset which included all the implemented features. The Az value was 0.6455. Again, the overall accuracy was slightly improved, which indicates that each feature type contains significant features which, if combined, may improve the classification performance. The ROC curve presented in figure 5 leads to the same conclusion: in the case of dense tissues, no significant discrimination results can be produced.
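Returning to the majority-voting scheme of section 3.2, the sketch below shows how a vote count per case can serve both as a class decision and as a score from which Az is estimated. Az is computed here with the rank (Mann-Whitney) statistic, which equals the area under the empirical ROC curve. The tie-breaking rule for a 3-3 vote, the function names and the toy predictions are assumptions. The paper does not state how ties were resolved.

```python
def area_under_roc(scores, labels):
    """Az via the rank statistic; tied positive/negative pairs count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def malignant_votes(per_category_predictions):
    """Number of feature categories voting 'malignant' (1) for one case."""
    return sum(per_category_predictions)


def majority_class(per_category_predictions):
    """1 (malignant) if more than half of the categories vote malignant."""
    votes = malignant_votes(per_category_predictions)
    # A tie is resolved toward benign here, an arbitrary choice.
    return 1 if votes > len(per_category_predictions) / 2 else 0


# Hypothetical outputs of six per-category classifiers for five cases.
toy_predictions = [
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
toy_truth = [1, 1, 0, 0, 0]  # 1 = malignant, 0 = benign

vote_counts = [malignant_votes(p) for p in toy_predictions]
decisions = [majority_class(p) for p in toy_predictions]
az = area_under_roc(vote_counts, toy_truth)
```

Because the score can take only seven distinct values (0 to 6 votes), the resulting ROC curve has at most seven operating points, which limits how finely such a scheme can be tuned compared with a continuous SVM output.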

4. Discussion
In this paper, we present a large-scale comparative study for breast cancer diagnosis, based on the analysis of clusters of MCs. We worked on two main objectives. The first objective of our study was to evaluate the performance of as many state-of-the-art feature extraction methodologies as possible on a large number of cases provided by the DDSM. We considered six basic groups of image features, including features related to the shape of MCs, the morphology of the cluster, the optical density and distribution of MCs, and first-order and second-order statistics for the texture of the ROI's tissue. The last two feature types have the advantage that they are robust to the segmentation phase. The results from the classification experiments on a dataset of 653 clusters obtained from fatty tissues indicate that the GLCM features slightly outperform the other groups, achieving an Az value of 0.77. Both cluster and distribution features provided a sufficient classification performance, with the rest of the groups demonstrating poorer performance.

Afterward, we combined the features in a complete dataset to investigate whether the combined information of different groups may lead to improvement of the classification results. We worked in two different ways. First, we applied a feature selection step and then classified again with the SVM scheme, finding that there is a small subset of features able to maximize the classification results. The best Az value reported is 0.826. Then, we applied a majority voting rule which combined the classification outputs of all individual feature categories. The Az value achieved by the voting scheme is 0.757. The conclusion that we may draw from these computational experiments is that a combination of different methodologies may perform better than individual ones for the classification of clusters. Additionally, the GLCM features appear to be the most powerful group for discriminating benignity and malignancy. However, we should recall that the first four groups considered are highly dependent on the preprocessing phase. It is possible that the same features would perform better or worse on images segmented by different segmentation algorithms. In any case, newly developed methodologies and features able to characterize a cluster should first be compared to the existing ones, as many factors may influence their performance.

Figure 5. ROC curve combining all features for dense tissues.

The second objective of our study was to examine whether the proposed methodologies are adequate for all types of breast tissue. For this reason, we performed the same measurements on a new set, containing cases that have been rated as density 3 or density 4 according to the BI-RADS standard.
We found that, while the classification performance is satisfactory in the case of fatty tissues, the same features fail to discriminate benign from malignant clusters in the case of dense tissues. The same conclusion has also been reported in previous studies [18, 19] and is probably attributable to the similarity between normal and tumor tissue, which makes the task of detecting the clusters of MCs hard. As a result, there seems to be a need for pre-screening of the density in each case and for a refinement of the methodology in order to handle cases with dense tissue. Refining the image enhancement and segmentation steps should improve the classification performance and would allow us to investigate the discriminative power of the developed features under better conditions.

Many studies focusing on the task of classifying clusters of MCs have been reported previously in the literature. However, despite the vast number of such studies, no straightforward comparison can be carried out because a different dataset is used in each study. Especially in the case of the DDSM, because of the great number of cases included, many researchers choose to conduct their measurements on a random subset of these cases. To the best of our knowledge, few studies have incorporated all the cases of the DDSM containing clusters of MCs. A typical example is the study of Pereira et al [12], who reported an Az value of 0.607 for the classification between benign and malignant clusters of MCs. Yoon and Kim [22] used almost all cases from the DDSM, reporting a value of Az equal to 0.9. However, as mentioned earlier, they incorporated BI-RADS descriptors, which in the current study have performed similarly (Az = 0.891) and have been excluded from our measurements in order to investigate only image features. The majority of the remaining studies use other databases, either public (MIAS, Nijmegen) or private. Representative studies in terms of classification accuracy and database used are provided hereafter.
Kallergi [6] reported a value of Az as high as 0.98, using a dataset of 100 mammograms selected from the patient files of the H Lee Moffitt Cancer Center & Research Institute at the University of South Florida and digitized with a DBA digitizer. Nakayama et al [7] performed experiments on a database of 58 magnification mammograms, achieving an accuracy of 96%. Karahaliou et al [11] reported a high value of Az equal to 0.96, using 85 dense mammograms from the DDSM digitized with the LUMISYS scanner. However, they did not focus on the individual MCs but studied the texture properties of the tissue surrounding a cluster of MCs. Chan et al [14] achieved an Az of 0.89 on a dataset containing 78 mammograms from the University of Michigan. Soltanian-Zadeh et al [15] achieved an Az of 0.89, using the Nijmegen database for their experiments. Betal et al [9] reported Az = 0.84, using a dataset of 38 digitized mammograms. Veldkamp et al [10] reported a value of Az equal to 0.83, using the cases from the Nijmegen dataset. Papadopoulos et al [8] reported a value of Az equal to 0.79 using the Nijmegen database and Az = 0.81 using the MIAS database. Comparing the method and the results of our study to previous ones, we believe that there are two main points that differentiate our work.

First, the majority of the previous studies implement only one feature extraction approach, while in the current study we compared and combined different groups of features. Only the studies of Chan et al [14] and Soltanian-Zadeh et al [15] exploit combined information from different methodologies. In the former study, shape and texture features are combined; in the latter, the authors compared four different feature sets (co-occurrence matrix based, shape, wavelet and multiwavelet features), indicating that the multiwavelet features outperformed the other three feature sets. Second, the above studies were performed on sets of about 100 mammograms or fewer. We believe that the great number of cases incorporated in our study strengthens the classification procedure and that the results provided may become a proper baseline for future comparisons.
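
The Az values compared throughout this discussion are areas under the ROC curve. As a minimal sketch of how such a figure of merit can be obtained from classifier outputs, the Python snippet below computes the empirical area under the ROC curve through the Mann-Whitney statistic. This is an assumption made for illustration: the text does not state how Az was estimated in this study or in the cited ones, and a fitted (for example binormal) ROC curve would generally give a somewhat different value. The function name and the scores are hypothetical.

```python
def empirical_auc(scores, labels):
    """Empirical area under the ROC curve via the Mann-Whitney statistic.

    scores: classifier outputs, higher meaning more suspicious (malignant).
    labels: 1 for malignant, 0 for benign.
    Note: an illustrative stand-in, not necessarily the Az estimator of the study.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    # Count malignant/benign pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical decision values for six clusters (not data from the study).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = empirical_auc(scores, labels)
print(round(auc, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of benign and malignant cases, which is the scale on which the figures quoted above should be read.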

5. Conclusions
The aim of this study was to compare different methodologies existing in the literature and to examine their potential to discriminate benign from malignant clusters. Feature extraction is a pipeline of measurements regarding various characteristics acting as object descriptors. Our study shows that different ways of performing morphological and textural measurements have a strong influence on the final classification performance. It is difficult to compare our method with others reported in the literature, as in most cases computational experiments are conducted on different datasets. Many previous studies have reported high values of accuracy in classification performance. However, in most cases small datasets have been used, and as a result it is quite difficult to examine the generalization ability of the methodologies developed. Although the classification results of our study are not very high, especially in the case of dense tissues, there are several valuable conclusions concerning the feature extraction process and the different behavior of the classifier depending on the density of the breast. First, we noticed that there is a specific group of textural features which outperformed the rest. Second, we showed that the combination of all features may improve the classification accuracy and perform better than individual groups. Finally, it seems that the investigated features, while performing well in fatty tissues, fail to perform efficiently when applied to cases with dense tissues. These conclusions provide us with a proper baseline to work toward the optimization of the classification procedure, focusing on the refinement of image preprocessing and on the investigation of new features or feature combinations.
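
The combination of per-group classifier outputs by majority voting, one of the two combination strategies summarized above, can be sketched in a few lines of Python. This is a hedged illustration only: the group names, the votes, the tie-breaking rule (ties resolved as benign) and the use of the vote fraction as a continuous score are all assumptions, since the text does not specify these details.

```python
def majority_vote(votes_per_group):
    """Combine binary labels (1 = malignant, 0 = benign) from several classifiers.

    votes_per_group: dict mapping a feature-group name to a list of labels,
                     one label per case, all lists of equal length.
    Returns (labels, fractions): the voted label and the fraction of
    malignant votes for each case.
    """
    groups = list(votes_per_group.values())
    if not groups:
        raise ValueError("at least one feature group is required")
    n_cases = len(groups[0])
    if any(len(g) != n_cases for g in groups):
        raise ValueError("all groups must label the same number of cases")
    labels, fractions = [], []
    for case_votes in zip(*groups):
        frac = sum(case_votes) / len(case_votes)
        fractions.append(frac)
        # Assumption: a strict majority is needed for "malignant"; ties go to benign.
        labels.append(1 if frac > 0.5 else 0)
    return labels, fractions


# Hypothetical votes of three feature groups on three cases (not study data).
votes = {
    "glcm": [1, 0, 1],
    "cluster": [1, 0, 0],
    "distribution": [0, 1, 0],
}
labels, fractions = majority_vote(votes)
print(labels)
```

Using the vote fraction as a score is one simple way to obtain a threshold-dependent output from which an ROC curve could be traced; other choices, such as averaging the decision values of the individual classifiers, are equally plausible.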

References
[1] Ng H H and Muttarak M 2003 Advances in mammography have improved early detection of breast cancer J. HK Coll. Radiol. 6 (3) 126–31
[2] Giger M, Chan H and Boone J 2008 Anniversary paper: history and status of CAD and quantitative image analysis: the role of medical physics and AAPM Med. Phys. 35 5799–820
[3] Nishikawa R M 2007 Current status and future directions of computer-aided diagnosis in mammography Comput. Med. Imaging Graph. 31 224–35
[4] Elter M and Horsh A 2009 A CADx of mammographic masses and clustered microcalcifications: a review Med. Phys. 36 2052–68
[5] Lanyi M 1985 Microcalcifications in the breast - a blessing or a curse? A critical review Diagn. Imaging Clin. Med. 54 126–45
[6] Kallergi M 2004 Computer-aided diagnosis of mammographic microcalcification clusters Med. Phys. 31 314–26
[7] Nakayama R, Uchiyama Y, Watanabe R, Katsuragawa S, Namba K and Doi K 2004 Computer-aided diagnosis scheme for histological classification of clustered microcalcifications on magnification mammograms Med. Phys. 31 789–99
[8] Papadopoulos A, Fotiadis D I and Likas A 2005 Characterization of clustered microcalcifications in digitized mammograms using neural networks and support vector machines Artif. Intell. Med. 34 141–50
[9] Betal D, Roberts N and Whitehouse G H 1997 Segmentation and numerical analysis of microcalcifications on mammograms using mathematical morphology Br. J. Radiol. 70 903–17
[10] Veldkamp W J, Karssemeijer N, Otten J D and Hendricks J H 2000 Automated classification of clustered microcalcifications into malignant and benign types Med. Phys. 27 2600–8
[11] Karahaliou A, Skiadopoulos S, Boniatis I, Sakellaropoulos P, Likaki E, Panayiotakis G and Costaridou L 2007 Texture analysis of tissue surrounding microcalcifications on mammograms for breast cancer diagnosis Br. J. Radiol. 80 648–56
[12] Pereira R R, Azevedo Marques P M, Honda M O, Kinoshita S K, Engelmann R, Muramatsu C and Doi K 2007 Usefulness of texture analysis for computerized classification of breast lesions on mammograms J. Digit. Imaging 20 248–55
[13] Fu J, Lee S, Wong S, Yeh J, Wang A and Wu H 2005 Image segmentation feature selection and pattern classification for mammographic microcalcifications Comput. Med. Imaging Graph. 29 419–29
[14] Chan H, Sahiner B, Lam K, Petrick N, Helvie M, Goodsitt M and Adler D 1998 Computerized analysis of mammographic microcalcifications in morphological and texture feature spaces Med. Phys. 25 2007–19
[15] Soltanian-Zadeh H, Rafiee-Rad F and Pourabdollah-Nejad S 2004 Comparison of multiwavelet, wavelet, haralick, and shape features for microcalcification classification in mammograms Pattern Recognit. 37 1973–86
[16] Heath M, Bowyer K, Kopand D, Moore R and Kegelmeyer W 2001 The digital database for screening mammography Proc. 5th IWDM (Yaffe M Medical Physics Publishing) pp 212–18
[17] American College of Radiology (ACR) 2003 Illustrated Breast Imaging Reporting and Data System (BI-RADS) 4th edn (Reston, VA: American College of Radiology)
[18] Li L, Wu Z, Salem A, Chen Z, Chen L, George F, Kallergi M and Berman C 2006 Computerized analysis of tissue density effect on missed cancer detection in digital mammography Comput. Med. Imaging Graph. 30 291–97
[19] Sampat M, Markey M and Bovik A 2005 Computer-aided detection and diagnosis in mammography Handbook of Image and Video Processing ed A C Bovik (New York: Academic) pp 1195–217
[20] Spyrou G, Kapsimalakou S, Frigas A, Koufopoulos K, Vassilaros S and Ligomenides P 2006 Hippocrates-mst: a prototype for computer-aided microcalcification analysis and risk assessment for breast cancer Med. Biol. Eng. Comput. 44 1007–15

[21] Haralick R, Shanmugam K and Dinstein I 1973 Textural features for image classification IEEE Trans. Syst. Man Cybern. 3 610–21
[22] Yoon S and Kim S 2009 Mutual information-based SVM-RFE for diagnostic classification of digitized mammograms Pattern Recognit. Lett. 30 1489–95
[23] Verma B, McLeod P and Klevansky A 2010 Classification of benign and malignant patterns in digital mammograms for the diagnosis of breast cancer Expert Syst. Appl. 37 3344–51
[24] Lo J, Gavrielides M, Markey M and Jesneck J 2003 Computer-aided classification of breast microcalcification clusters: merging of features from image processing and radiologists Proc. SPIE 5032 882–89
[25] Burges C J 1998 A tutorial on support vector machines for pattern recognition Data Min. Knowl. Discovery 2 121–67
[26] Chang C C and Lin C J 2001 LIBSVM: a library for support vector machines Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
[27] Andreadis I, Antaraki A, Spyrou G, Ligomenides P and Nikita K 2010 Investigating the image features landscape for the classification of breast microcalcifications Proc. Int. Conf. Imaging Systems and Techniques (Thessaloniki) pp 139–43
[28] Guyon I, Weston J, Barnhill S and Vapnik V 2002 Gene selection for cancer classification using support vector machines Mach. Learn. 46 389–422
