Executive Summary
Chapter one presents the most important and widely used metrics in software engi-
neering for quality evaluation. The area of software engineering metrics is always
under study; researchers continue to validate the metrics. The metrics presented
were selected after studying software engineering literature, yielding only those
metrics that are widely accepted. We must stress that we have not presented any
models for evaluating quality, only metrics that can be used for quality evaluation.
Quality evaluation models will be presented in the appropriate deliverable.
The metrics presented are categorised, according to a taxonomy accepted among researchers, into three sections: process metrics, product metrics and resource metrics. We have also included a section for metrics specific to Open Source software
development. The presentation of the metrics is brief, allowing for a straightforward
application and tool development. We have included both metrics that are consid-
ered classic (e.g. program length and McCabe’s cyclomatic complexity) and modern
metrics (e.g. the Chidamber and Kemerer metrics suite and object oriented design
heuristics). While we present some metrics for Open Source software development,
this topic will be presented at length elsewhere.
Chapter two presents tools for acquiring the metrics presented in chapter one. The tools presented are both Open Source and proprietary. Many metrics tools are available, and we have tried to present a representative sample of them; specifically, we present those tools that are likely to be useful for our own system and that could potentially be included in it (especially the Open Source ones). We tried to install and test each tool ourselves. For each tool we present its functionality and also include some screenshots of it. Although we tried to include all possible tools that might be helpful to our project, future work will accommodate further tools as they become available.
Chapter three introduces empirical Open Source Software studies from several viewpoints. The first part details the historical perspectives of the evolution of five popular Open Source Software systems (Linux, Apache, Mozilla, GNOME, and FreeBSD). This is followed by horizontal studies, in which researchers examine several projects collectively. A model for the simulation of the evolution of Open Source Software projects and results from early studies are also presented. The evolution of Open Source Software projects is directly linked with the evolution of the code and of the communities around the project. Thus, the fourth viewpoint in this chapter considers code quality studies of Open Source Software, applying laws of software evolution to Open Source development in order to study how code evolves and how this evolution affects the quality of the software. The chapter concludes with community studies of mailing lists, in which a research methodology for the extraction and analysis of community activities in mailing lists is proposed.
Chapter four introduces the concept of data mining and its significance in the
Document Information
Deliverable Number: 2
Due Date: 22nd January 2007
Deliverable Date: 22nd January 2007
Approvals
Role                   Name                 Organisation   Date
Coordinator            Georgios Gousios     AUEB/SENSE     10/09/2006
Technical Coordinator  Ioannis Samoladas    AUTH/PLaSE
WP leader              Ioannis Antoniades   AUTH/PLaSE
Quality Reviewer 1
Quality Reviewer 2
Quality Reviewer 3
Revisions
Revision Date Modification Authors
0.1 05/10/2006 Initial version AUTH
Contents
1 Software Metrics and Measurement
  1.1 Software Metrics Taxonomy
  1.2 Process Metrics
    1.2.1 Structure Metrics
    1.2.2 Design Metrics
    1.2.3 Product Quality Metrics
  1.3 Productivity Metrics
  1.4 Open Source Development Metrics
  1.5 Software Metrics Validation
    1.5.1 Validation of prediction measurement
    1.5.2 Validation of measures
2 Tools
  2.1 Process Analysis Tools
    2.1.1 CVSAnalY
    2.1.2 GlueTheos
    2.1.3 MailingListStats
  2.2 Metrics Collection Tools
    2.2.1 ckjm
    2.2.2 The Byte Code Metric Library
    2.2.3 C and C++ Code Counter
    2.2.4 Software Metrics Plug-In for the Eclipse IDE
  2.3 Static Analysis Tools
    2.3.1 FindBugs
    2.3.2 PMD
    2.3.3 QJ-Pro
    2.3.4 Bugle
  2.4 Hybrid Tools
    2.4.1 The Empirical Project Monitor
    2.4.2 HackyStat
    2.4.3 QSOS
  2.5 Commercial Metrics Tools
  2.6 Process metrics tools
    2.6.1 MetriFlame
    2.6.2 Estimate Professional
    2.6.3 CostXpert
    2.6.4 ProjectConsole
    2.6.5 CA-Estimacs
    2.6.6 Discussion
  2.7 Product metrics tools
These two main goals will be delivered through a plug-in based quality assessment platform. In order to achieve these goals, the project's consortium has to answer specific questions derived from those goals. Thus, for the goals presented, the following questions have to be answered:
1. How can the quality of Open Source software be evaluated and improved?
These questions can be answered if we examine and measure both the process of
creating Open Source software and the product itself, i.e. the code. Both entities
can be measured with the help of software metrics. This section presents software metrics and an overview of how useful they are for software evaluation.
• Process metrics are metrics that refer to the software development activities
and processes. Measuring defects per testing hour, time, number of people,
etc. falls under this category.
• Product metrics are metrics that refer to the products of the software development process (e.g. code, but also documents etc.).
• Resource metrics are metrics that refer to any input to the development process (e.g. people and methods).
Each one of these categories contains metrics that are further distinguished as
either internal or external metrics.
Apart from the formal categories presented, we shall also include some metrics de-
rived directly from the Open Source development process.
In the following sections the most important (from our own perspective) metrics in each of the above categories shall be presented. Note that the metrics presented have been studied and used extensively in traditional closed source software development. At the end we present metrics for Open Source software that have appeared in recent years, as researchers started studying Open Source software. Although these metrics can be classified according to the above taxonomy, we prefer to present them separately.
This is a very useful metric and can be applied at any phase of the software develop-
ment process.
One other metric which can be derived from defect density is system spoilage [FP97], a metric rather useful for assessing the effectiveness of the development team. This metric is defined as

System Spoilage = Time To Fix Post-Release Defects / Total System Development Time
As mentioned, this metric reflects the ability of the development team to respond to
defects found.
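As a minimal illustration of how such process metrics could be computed, the following sketch (the input values are illustrative and not tied to any particular defect-tracking system) calculates defect density and system spoilage:

    # Minimal sketch: defect density and system spoilage.
    # Inputs are assumed to come from a defect-tracking system and time sheets.
    def defect_density(defects_found, kloc):
        """Defects per thousand lines of code."""
        return defects_found / kloc

    def system_spoilage(hours_fixing_post_release_defects, total_development_hours):
        """Share of total development time spent fixing post-release defects."""
        return hours_fixing_post_release_defects / total_development_hours

    # Example: 42 hours of post-release fixing against 1200 development hours.
    print(system_spoilage(42, 1200))   # 0.035, i.e. 3.5% of the effort is spoilage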
LOC: Code can be measured in several ways. The first and most common metric in the area of software engineering is the number of lines of code (LOC). Although it may seem easy to measure the lines of code of a computer program, there is controversy about what we mean by LOC. Most researchers refer to LOC as Source Lines Of Code (SLOC), which can be either physical SLOC or logical SLOC.
Specific definitions of these two measures vary in the sense that what is actually
measured is not explicit. One needs to consider whether what is measured refers to
any one of the following:
• Blank lines.
• Comment lines.
• Data declarations.
definition, and logical SLOC can often be significantly different from physical SLOC.
For the purpose of our research, a physical source line of code (SLOC) will be defined
as:
Using the definition above, we have to stress that this size metric does not represent the actual size of the source code of the program, since it excludes comment lines. Thus the total length of the program is represented as the sum of the source lines of code and the comment lines. The number of comment lines is also a useful metric when we refer to other aspects of software, e.g. documentation.
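A minimal sketch of a physical SLOC counter follows; it assumes a language whose line comments start with '#' and does not handle block comments, so it is only an illustration of the counting decisions discussed above:

    # Minimal sketch: count physical SLOC, comment lines and blank lines.
    # Assumes '#' line comments; block comments are not handled.
    def count_sloc(path):
        sloc = cloc = blank = 0
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                stripped = line.strip()
                if not stripped:
                    blank += 1
                elif stripped.startswith("#"):
                    cloc += 1
                else:
                    sloc += 1
        return sloc, cloc, blank

    # Total program length as discussed above: code lines plus comment lines.
    # sloc, cloc, blank = count_sloc("module.py"); total_length = sloc + cloc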
Halstead Software Science: Apart from counting lines of code, there are also other kinds of metrics that try to measure the length of a computer program. One of the earliest of these was introduced by Halstead [Hal77] in the late '70s. Halstead's measures are based on four quantities that are directly derived from the source code:
• μ1, the number of distinct operators.
• μ2, the number of distinct operands.
• N1, the total number of occurrences of operators.
• N2, the total number of occurrences of operands.
Halstead further introduced some metrics based upon the previous measures, including the program length N = N1 + N2, the vocabulary μ = μ1 + μ2, the volume V = N log2 μ, and estimates of the difficulty and the effort required to write the program.
In order for these metrics to be measured, one has to decide how to identify the
operators and operands. Halstead also used his metrics to estimate the length and
the effort for a given program. For more on Halstead estimations see [Hal77].
Halstead Software Science metrics have been criticised a great deal over the years, and opinions regarding them remain controversial, especially concerning the volume, difficulty and the other estimation metrics. These opinions vary from “no corresponding consensus” [FP97] to “strongest measures of maintainability” [OH94]. However, the value of N as a program length, as well as the volume of a program as proposed by Halstead, does not contradict any relations we expect between a program and its length. Thus, we choose to include Halstead metrics in our research [FP97].
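A minimal sketch of how the derived Halstead measures follow from the four base counts is shown below; classifying tokens into operators and operands is the language-specific part and is assumed to have been done already:

    import math

    # Minimal sketch: Halstead's derived measures from the four base counts.
    # mu1/mu2: distinct operators/operands; N1/N2: total occurrences of each.
    def halstead(mu1, mu2, N1, N2):
        vocabulary = mu1 + mu2
        length = N1 + N2                           # N, the program length
        volume = length * math.log2(vocabulary)    # V
        difficulty = (mu1 / 2) * (N2 / mu2)        # D
        effort = difficulty * volume               # E
        return {"N": length, "V": volume, "D": difficulty, "E": effort}

    # Example with assumed counts for a small function:
    # print(halstead(mu1=10, mu2=7, N1=22, N2=15))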
Function Points: The size measures presented previously count physical size: lines, operators and operands. Many researchers argue that this kind of measurement
could be misleading since it does not capture the notion of functionality, i.e. the
amount of function inside the source code of a given program. Thus, they propose
the use of functionality metrics.
One of the first such metrics was proposed by Albrecht in 1977 and it was called
Function Point Analysis (FPA) [Alb79] as a means of measuring size and productivity
(and later on also complexity). It uses functional, logical entities such as inputs,
outputs, and inquiries that tend to relate more closely to the functions performed by
the software as compared to other measures, such as lines of code. Function point
definition and measurement have evolved substantially; the International Function Point User Group (IFPUG, http://www.ifpug.com/), formed in 1986, actively exchanges information on function point analysis (FPA).
In order to compute Function Points (FP), one first needs to compute the Unadjusted Function Point Count (UFC). To calculate this, one further needs to count the following:
• External inputs: every input provided by the user (data and UI interactions), but not inquiries.
• External outputs: every output to the user (i.e. reports and messages).
• External inquiries: interactive inputs that require a response.
• External interface files: machine-readable interfaces to other systems.
• Internal files: files that the system uses for its own purposes.
Each counted item is further classified according to its complexity as:
• Simple.
• Average.
• Complex.
Then a weight is assigned to each item according to standard tables (e.g. for a simple external input the weight is 3 and for a complex external inquiry it is 6; in total there are 15 weights). So, the UFC is calculated as

UFC = Σ (i = 1..15) (Number Of Items Of Variety i) × weight_i

The adjusted function point count is then obtained by multiplying the UFC by a Technical Complexity Factor (TCF):

FP = UFC × TCF
There is a very large user community for function points; IFPUG has more than
1200 member companies, and they offer assistance in establishing a FPA program.
The standard practices for counting and using function points can be found in the
IFPUG Counting Practices Manual. Without some standardisation of how the func-
tion points are enumerated and interpreted, consistent results can be difficult to
obtain. Successful application seems to depend on establishing a consistent method
of counting function points and keeping records to establish baseline productivity
figures for specific systems. Function measures tend to be independent of language,
coding style, and software architecture, but environmental factors such as the ratio
of function points to source lines of code will vary, although there have been some
attempts to map LOCs to FPs [Jon95]. Limitations of function points include the subjectivity of the TCF and of the other subjective measures used, including the weights. Also, their application is rather time consuming and demands well trained staff. Taking its limitations into account, the method can be rather useful as an estimator of size and of other metrics that take size into account.
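A minimal sketch of the UFC/FP computation under the definitions above follows. The two weights shown are the values quoted in the text; the rest of the fifteen-entry weight table and a proper TCF calculation are omitted and would have to be supplied from the IFPUG counting rules:

    # Minimal sketch: Unadjusted Function Point count and adjusted FP.
    # WEIGHTS maps (item type, complexity) -> weight; only a few of the
    # fifteen entries are shown here for illustration.
    WEIGHTS = {
        ("external input", "simple"): 3,
        ("external inquiry", "complex"): 6,
        # ... the remaining (type, complexity) weights go here
    }

    def unadjusted_fp(counts):
        """counts maps (item type, complexity) -> number of such items."""
        return sum(n * WEIGHTS[key] for key, n in counts.items())

    def function_points(counts, tcf):
        """tcf is the Technical Complexity Factor."""
        return unadjusted_fp(counts) * tcf

    # Example: 10 simple external inputs and 2 complex external inquiries.
    counts = {("external input", "simple"): 10, ("external inquiry", "complex"): 2}
    print(function_points(counts, tcf=1.0))   # 10*3 + 2*6 = 42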
Object Oriented Size Metrics: In object oriented development, classes and meth-
ods are the basic constructs. Thus, apart from the metrics presented above, in object
oriented technology we can use the number of the classes and methods as an aspect
of size. These metrics are straightforward:
• Number of classes.
• Number of methods.
It is obvious that metrics from other sections also apply to object oriented develop-
ment, but in relation to classes and objects (for example, for the complexity metrics
presented later in this document, we have average complexity per class or method).
Reuse: With the term reuse we mean the amount of code which is reused in a future release of the software. Although it may sound simple, reuse cannot be counted in a straightforward manner, because it is difficult to define what we mean by code reuse. So there are different notions of reuse that take into account the extent of reuse [FP97]: we have straight reuse (copy and paste of the code) and modified reuse (take a module and change the appropriate lines in order to implement new features). In addition, in object oriented programming, reuse extends to the reuse
or inheritance of certain classes.
Reuse also affects the size measurement of successive releases: if the present release of a piece of software contains a large amount of code identical to the previous one, what is its actual size? For example, IBM uses a metric called shipped source instructions (SSI) [Kan03], which is expressed as the shipped source instructions of the previous release, plus the new and changed code for the current release, minus deleted code, minus changed code. The final term adjusts for changed code, which would otherwise be counted twice.
This metric encapsulates reuse in its definition and it is rather useful.
Apart from size, there are other internal product attributes that are useful to software engineering measurement practice. Since the early stages of the science of software metrics, researchers have pointed out a link between the structure of the product (i.e. the code) and certain quality aspects. The corresponding metrics are called structural metrics, and we present them here; we believe these metrics will be useful for our research.
McCabe's Complexity Metrics: One of the earliest and most widely used complexity metrics is McCabe's Cyclomatic Complexity [McC76]. McCabe proposed that a program's cyclomatic complexity can be measured by applying principles of graph theory. He represented the program structure as a graph G. So, for a program whose structure is represented by a connected graph G,
v(G) = e − n + 1
where e is the number of edges of G and n the number of nodes. In addition, McCabe
has given some other definitions such as the cyclomatic number, which is
v(G) = e − n + 2
ev(G) = v(G) − m
where m is the number of sub-flowgraphs, that is, the number of connected components of the graph. In the literature there is also the definition:
v(G) = e − n + p
(where e is the number of edges, n the number of nodes and p the number of nodes
that are exit points — last instruction, exit, return, etc.). So for the graph in Figure 1.2.1, the cyclomatic complexity is v(G) = 3.
Although the cyclomatic complexity metric was developed in the mid '70s, it has evolved and been calibrated over the years, and it has become a mature, objective and useful metric for measuring a program's complexity. It is also considered to be a good maintainability metric.
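A minimal sketch of the calculation from a control-flow graph follows; it assumes the graph has already been extracted from the source (the hard, language-specific step) and is described simply by its edge and node counts:

    # Minimal sketch: cyclomatic complexity from a control-flow graph.
    # edges/nodes describe the flowgraph; components is the number of
    # connected components (1 for a single routine).
    def cyclomatic_complexity(edges, nodes, components=1):
        return edges - nodes + 2 * components

    # Example: a routine whose flowgraph has 9 edges and 8 nodes gives v(G) = 3.
    print(cyclomatic_complexity(edges=9, nodes=8))   # 3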
The above metrics (LOC, McCabe’s Cyclomatic Complexity and Halstead’s Soft-
ware Science) treat each module separately. The metrics below try to take into
account the interaction between the modules and quantify this interaction.
• Stamp coupling relation: x and y accept the same record type (i.e. in database
systems) as a parameter, which may cause interdependency between otherwise
not related modules.
• Common coupling relation: x and y refer to the same global data. This is the kind of coupling we want to avoid.
Of the above coupling relations, common coupling has been explored in the case of the Linux kernel in order to assess its maintainability [YSCO04].
modules that call module m plus the number of data structures that are retrieved
by m. With fan out we mean the number of modules that are called from m plus the
number of data structures that are updated by m. The definition of the metric for a
module m is:
Information Flow Complexity(m) = length(m) × (Fan In(m) × Fan Out(m))^2
Other researchers have proposed omitting the length factor, thus simplifying the metric.
Since its introduction, Henry and Kafura’s metric has been validated and con-
nected with maintainability [FP97], [Kan03]. Modules with high information flow
complexity tend to be error prone, while, on the other hand, low values of the metric
correlate with fewer errors.
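A minimal sketch of the information flow computation, assuming the per-module fan-in, fan-out and length values have already been collected:

    # Minimal sketch: Henry and Kafura's information flow complexity.
    # fan_in/fan_out follow the definitions above; length is e.g. the module's LOC.
    def information_flow_complexity(length, fan_in, fan_out):
        return length * (fan_in * fan_out) ** 2

    # Simplified variant mentioned above: omit the length factor.
    def information_flow_simplified(fan_in, fan_out):
        return (fan_in * fan_out) ** 2

    # Example: a 120-line module with fan-in 3 and fan-out 4.
    print(information_flow_complexity(120, 3, 4))   # 120 * 144 = 17280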
Object Oriented Complexity Metrics: With the rise of object oriented program-
ming, software metrics researchers tried to figure out how to measure the complex-
ity of such applications. One of the most widely used complexity metrics for object
oriented systems is the Chidamber and Kemerer’s metrics suite [CK76]:
• Metric 1: Weighted Methods per Class (WMC). WMC is the sum of the complexities of the methods, where complexity is measured by cyclomatic complexity:
WMC = c1 + c2 + ... + cn
where n is the number of methods and ci is the complexity of the i-th method.
We have to stress here that measuring the complexity is difficult to implement
because, due to inheritance, not all methods are assessable in the class hier-
archy. Therefore, in empirical studies, WMC is just the number of methods in
a class and the average of WMC is the average number of methods per class
[Kan03].
• Metric 2: Depth of Inheritance Tree (DIT) This metric represents the length of
the maximum path from the node to the root of the inheritance tree.
• Metric 5: Response For a Class (RFC). This metric represents the number of methods that can be executed in response to a message received by an object of that class. It equals the number of local methods plus the number of methods called by local methods.
Several studies show that the CK metrics suite assists in measuring and predicting an object oriented system's maintainability [FP97], [Kan03]. In particular, studies show that certain CK metrics are linked to faulty classes and help predict them [Kan03].
Along with object oriented programming, the notion of object oriented design was introduced too. The programmer has to use some kind of modelling with classes and objects in order to design the application first; after the design is completed, the programmer goes on with coding. One of the questions a programmer asks is whether or not the design is of good quality. An experienced programmer can answer that question by applying to the design a number of rules based on experience, looking for bad choices that may have been made or for violations of intuitive rules of his or her own. If the design passes these checks, then it is considered of good quality and the programmer continues with coding. Of course, with big applications, manual inspection of a design is rather difficult, so a tool is needed.
These intuitive rules are called “design heuristics”. They are based on experience. They are like design patterns, but rather than proposing a certain design for certain problems, heuristics are rules that help the designer check the validity of a design. Design heuristics are validations for the object oriented design and warn the programmer about certain design mistakes. This advice should be taken into account by the programmer, who has to do some investigation to correct things. Of course, a heuristic violation does not always mean a design mistake, but it is a point for further investigation by the development team. A well known set of such object oriented design heuristics was first introduced by Arthur Riel. In his seminal work [Rie96], Riel defined a set of more than 60 design heuristics, a result of his experience. His work has helped many people to improve their designs and the way they program. Before Riel, other researchers had addressed similar issues, including Coad and Yourdon [YC91]. Additionally, there is ongoing research in the field of design heuristics: researchers are investigating the impact of applying object oriented design heuristics, and the evaluation and validation of these heuristics [DSRS03, DSA+ 04]. As an example, consider the object oriented design heuristics listed below, taken from Riel [Rie96].
2. Do not use global data. Class variables or methods should be used instead.
10. The number of public methods of a class should be no more than seven.
11. The number of classes with which a class collaborates should not be more than four.
12. Classes that concentrate too much information should be avoided. We consider that a class fits this description when it is associated with more than four classes, has more than seven methods and more than seven attributes.
13. The fan out of a class should be minimised. The fan out is the product of the number of methods defined by the class and the number of messages they send. This number should be no more than nineteen.
17. A class should not have only methods whose names are variations of set, get and print.
18. If a class contains objects of another class, then the containing class should be sending messages to the contained objects; if it does not, we have a violation of the heuristic.
19. If a class contains objects of other classes, these objects should not be associated with each other.
20. If a class has only one method apart from set, get and print methods, then there is a violation.
21. The number of messages between a class and its collaborators should be minimal; if this number is more than fifteen, we have a violation.
One should note here that the above heuristics can be validated with the use of a
tool.
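As a rough illustration of such tool support, the sketch below checks heuristic 10 (no more than seven public methods) for a Python class via reflection; the notion of "public" used here (no leading underscore) and the fixed threshold are simplifying assumptions:

    import inspect

    # Minimal sketch: check heuristic 10 (at most seven public methods) for a class.
    MAX_PUBLIC_METHODS = 7

    def public_method_count(cls):
        methods = inspect.getmembers(cls, predicate=inspect.isfunction)
        return sum(1 for name, _ in methods if not name.startswith("_"))

    def check_heuristic_10(cls):
        count = public_method_count(cls)
        ok = count <= MAX_PUBLIC_METHODS
        status = "ok" if ok else "violation: investigate further"
        return ok, f"{cls.__name__}: {count} public methods ({status})"

    class Example:
        def a(self): pass
        def b(self): pass

    print(check_heuristic_10(Example)[1])   # Example: 2 public methods (ok)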
Before Riel, Lorenz [LK94] proposed similar rules derived from industrial experience (including metrics for the development process):
4. Class hierarchy nesting level (or DIT of CK metrics) should be less than 6.
10. Number of times class is reused (a class should be reused in other projects,
otherwise it might need redesign).
11. Number of classes and methods thrown away (this should occur in a steady
rate).
As mentioned before, all these “rules of thumb” are derived from the experience gained during multiple development processes and reflect practical knowledge. For example, a large average method size may indicate a poor OO design and function oriented coding [Kan03]. A class holding too much responsibility (many methods) indicates that there should be a separate class for some of the methods. The list goes on, reflecting the practical knowledge mentioned.
The previous sections discussed development and design quality. These are the qual-
ity metrics that can be applied to a software product early in the product lifecycle:
before the product is released these metrics may already be calculated. The follow-
ing metrics are post-release metrics and apply to a finished software product.
• Corrective maintenance, which is the main maintenance task and involves cor-
recting defects that are reported by users.
• Adaptive maintenance is the maintenance that has to do with adding new func-
tionality to the system.
Average Code Lines per Module: This is a very simple metric: the average number of lines of code per module (e.g. function or class). This metric shows how easily the code can be maintained, or how easily someone can understand part of the code and correct it. With this metric there are some considerations regarding comment lines, considerations that also apply later to the Maintainability Index metric. For instance, consideration needs to be given to how well the comment lines reflect the code (are there useless comment lines?), and to whether the comment lines contain copyright notices and other legal notices, etc.
is efficient and closes bugs faster than their arrival rate. If it is less than 100%, it means that the development team has efficiency problems and lags behind in the defect fixing process.
Maintainability Index: Several metrics have been proposed to describe the internal measurement of maintainability [FP97]. Most of them try to correlate the structural metrics presented before with maintainability, and in certain cases a particular metric has been linked with maintainability. For example, McCabe categorised programs into maintenance risk categories, stating that any program with a McCabe metric larger than 20 has a high risk of causing problems.
One interesting model, derived from regression analysis and based on metrics presented before, is the Maintainability Index (MI) proposed by Welker and Oman [WO95]. The MI combines Halstead Software Science metrics, McCabe's cyclomatic complexity, LOC, and the number of comments in the code. There are two expressions of MI, one using three of the previous metrics and another using all four:
Three-Metric MI equation

MI = 171 − 5.2 ln(aveV) − 0.23 aveV(g) − 16.2 ln(aveLOC)

where aveV is the average Halstead Volume per module, aveV(g) is the average extended cyclomatic complexity per module, and aveLOC is the average lines of code per module.

Four-Metric MI equation

MI = 171 − 5.2 ln(aveV) − 0.23 aveV(g) − 16.2 ln(aveLOC) + 50 sin(√(2.4 · perCM))

where aveV, aveV(g) and aveLOC are as before, and perCM is the average percent of lines of comments per module.
In their article, Welker and Oman proposed rules on how to choose which equation (the 3-metric or the 4-metric one) is appropriate to use [WO95]. They proposed three criteria; if any one of them is true, then it is better to use the 3-metric equation, otherwise the 4-metric one. The criteria are:
• The comments do not accurately match the code. Unless considerable attention
is paid to comments, they can become out of synchronisation with the code and
thereby make the code less maintainable. The comments could be so far off as
to be of dubious value.
• There are large sections of code that have been commented out. Code that has
been commented out creates maintenance difficulties.
Calculating MI is simple because there are tools (we examine such tools in Chapter 2) that measure the metrics it uses. As the authors suggest, MI is useful for periodic assessment of the code in order to track its maintainability.
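A minimal sketch of the MI calculation, assuming the per-module averages have already been collected by the kinds of tools examined in Chapter 2 (the units assumed for perCM, here a fraction between 0 and 1, vary between sources):

    import math

    # Minimal sketch: Maintainability Index (Welker and Oman).
    # ave_v: average Halstead Volume, ave_vg: average extended cyclomatic
    # complexity, ave_loc: average LOC per module, per_cm: average fraction of
    # comment lines per module (assumed in [0, 1] here; sources differ).
    def maintainability_index(ave_v, ave_vg, ave_loc, per_cm=None):
        mi = 171 - 5.2 * math.log(ave_v) - 0.23 * ave_vg - 16.2 * math.log(ave_loc)
        if per_cm is not None:                     # four-metric variant
            mi += 50 * math.sin(math.sqrt(2.4 * per_cm))
        return mi

    # Example with assumed averages for a medium-sized module set.
    print(maintainability_index(ave_v=450.0, ave_vg=4.5, ave_loc=35.0, per_cm=0.12))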
Volume of mailing lists: This metric is rather useful for evaluating the health of a project and the support it provides [SSA06]. It is a direct measurement of the number of messages sent to a project's list per month (or another fixed time period). A healthy project has an active mailing list, while a soon to be abandoned one has lower activity. The volume of the users' mailing list is also an indicator of how well the project is supported and documented.
Volume of available documentation: Along with the previous metric, this one
is an indicator of the available support. When we refer to the volume of available
documentation, we mean the available documents, like the installation guide or the
administrator’s guide.
Freshmeat User Rating: The Freshmeat.net hosting service uses a user rating metric which works as follows, according to its website (http://freshmeat.net/faq/view/31/): every registered user of freshmeat may rate a project featured on the website. Based on these ratings, they build a top 20 list, and users may sort their search results by rating as well. Be aware that unless a project has received 20 or more ratings, it will not be
considered for inclusion in the top 20. The formula gives a true Bayesian estimate,
the weighted rank (WR):
WR = (v / (v + m)) × R + (m / (v + m)) × C
where:
R = average rating for the project
v = number of votes for the project
m = minimum votes required to be listed in the top 20 (currently 20)
C = the mean vote across the whole report
Freshmeat Vitality: The second metric that Freshmeat uses is the project's vitality. Again, according to Freshmeat, the vitality score for a project is calculated as:

vitality = (announcements × age) / (days since last announcement)
which is the number of announcements multiplied by the number of days the application has existed, divided by the days passed since the last release. This way, applications with many announcements that have been around for a long time and have recently come out with a new release earn a high vitality score, while old applications that have only been announced once get a low vitality score. The vitality score is available through the project page and can be used as a sort key for the search results (definable in the user preferences).
Freshmeat Popularity: From the Freshmeat site: the popularity score superseded the old counters for record hits, URL hits and subscriptions. Popularity is calculated as

popularity = √((record hits + URL hits) × (subscriptions + 1))
Again we have to stress here that these metrics are used by Freshmeat and of
course they need further investigation and validation.
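A minimal sketch of the three Freshmeat formulas above; the inputs (ratings, announcement counts, hit counters) are passed in here as plain numbers and would, in practice, come from the site's data:

    import math

    # Minimal sketch: the three Freshmeat scores described above.
    def weighted_rank(R, v, m, C):
        """R: project's average rating, v: votes, m: minimum votes for the
        top 20, C: mean vote across the whole report."""
        return (v / (v + m)) * R + (m / (v + m)) * C

    def vitality(announcements, age_days, days_since_last_announcement):
        return announcements * age_days / days_since_last_announcement

    def popularity(record_hits, url_hits, subscriptions):
        return math.sqrt((record_hits + url_hits) * (subscriptions + 1))

    # Example with assumed values for a small project.
    print(weighted_rank(R=8.2, v=35, m=20, C=6.9))
    print(vitality(announcements=12, age_days=900, days_since_last_announcement=30))
    print(popularity(record_hits=5000, url_hits=1200, subscriptions=40))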
software engineering research, and it is the reason why metrics are questioned by researchers and why there is much discussion about them.
According to Fenton [FP97] the way we want to validate metrics depends on
whether we want just to measure an attribute or we want to measure in order to
predict. Prediction of an attribute of a system (e.g. code quality or cost) is a core
issue in software engineering. So, in order to perform metrics validation we must
distinguish between these two types:
Of these two approaches, empirical validation is the one which is widely used. It essentially tries to correlate a measure with some external attribute of the software, for example complexity with defects.
2 Tools
Many publications mention measurement tool support and automation as important
success factors for software measurement efforts and quality assurance [KSPR01],
providing frameworks and general approaches [KRSZ00], or giving more specific
solution architectures [JL99]. There is a great variety of research tools to support
software metric creation, handling, and analysis; an overview on different types of
software metrics tools is given in [Irb]. Wasserman [A.I89] introduces the concept
of tools with vertical and horizontal architecture, with the former supporting ac-
tivities in a single life cycle phase, such as UML design tools or change request
databases, and the latter supporting activities over several life cycle phases, such as
project management and version control tools. Fuggetta [Fug93], on the other hand, classifies tools as single tools, workbenches supporting a few development activities, and environments supporting a large part of the development process. These ideas about different kinds of metrics tools have certainly affected the functionality that commercial tools offer, but the most popular categorisation still classifies metrics tools as either product metrics tools or process metrics tools. Product
metrics tools measure the software product at any stage of its development, from
requirements to installed system. Product metrics tools may measure the complex-
ity of the software design, the size of the final program (either source or object
code), or the number of pages of documentation produced. Process metrics tools,
on the other hand, measure the software development process, such as overall de-
velopment time, type of methodology used, or the average level of experience of the
programming staff. In this chapter we are going to present tools, both Open Source and commercial, that support and automate the measurement process.
• The code itself, along with historical data (changes, additions, etc).
All these are stored in a repository, and one can find tools, available as Open Source software, to extract all the useful information from the repository.
2.1.1 CVSAnalY
CVSAnalY (CVS Analysis) is one of the first tools that access a repository in order to find information regarding an open source project. It has been developed by the Libresoft Group at the Universidad Rey Juan Carlos in Spain and has already produced results used in research on open source software [RKGB04]. The tool is licenced under the GNU General Public Licence.
Specifically, CVSAnalY is a tool that extracts statistical information out of CVS and Subversion repository logs and transforms it into SQL database format. The main tool is a command line tool. The presentation of the results is done with a web interface — called CVSAnalYweb — where the results can be retrieved and analysed in an easy way (after someone has run the main command line tool, CVSAnalY). The tool produces various results and statistics regarding the evolution of a project over time. A general view of the tool is shown in Figure 2. The tool stores historical data such as:
• Committers.
• Commits.
• Files.
• Aggregated Lines.
• Removed Lines.
• Changed Lines.
• Final Lines.
• File type.
• Modules.
• Commits.
• Files.
• Lines Changed.
• Lines Added.
• Lines Removed.
• Removed files.
• External.
• CVS flag.
• First commit.
• Last commit.
The tool also logs the inactivity rate for modules and committers, committers per module, and the Herfindahl-Hirschman Index for modules; as mentioned before, it also produces helpful graphs.
2.1.2 GlueTheos
GlueTheos [RGBG04] has been developed to coordinate other tools used to analyse Open Source repositories. The tool is a set of scripts used to download data (source code) from Open Source repositories, analyse it with external tools (developed by third parties) and store the results in a database for further investigation. The parts which comprise GlueTheos are:
• The core scripts act as a user interface interacting with the user and handle
details like repository configuration, periods of analysis (the periodic snapshots
from a repository), storage details, third party tools details and parameters.
• The analysis module. Here the user describes further details of the external tools used for source code analysis. These details include instructions on how to invoke each tool, its parameters, and the details of its output. The module is also responsible for running these external tools.
• The storage module. This module is responsible for the storage of the results
created by the previous module. It takes the output of an analysis tool and
formats it into an appropriate SQL command, suitable to store the result into a
database.
1. The user chooses which project to analyse (e.g. GNOME) and which periods to
analyse (e.g. every month from December 2003 until September 2005).
2. Then the user chooses an analysis tool (e.g. sloccount, which counts physical source lines of code; http://www.dwheeler.com/sloccount/). The integration of the tool with the main set of scripts includes a description of how to call the tool, parameter passing, and a description of its output.
3. The program retrieves the code of the project analysed for the configured
dates, then it analyses the code with the external tool and stores the output
in a database.
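A rough sketch of this kind of checkout-measure-store pipeline is shown below. It illustrates the approach rather than GlueTheos' actual code: the sloccount invocation, the database schema and the use of SQLite are all assumptions made for the example:

    import sqlite3
    import subprocess

    # Illustrative checkout -> measure -> store pipeline (not GlueTheos itself).
    def measure_snapshot(checkout_dir, snapshot_date, db_path="metrics.db"):
        # Run an external analysis tool (here: sloccount) on the checkout.
        output = subprocess.run(["sloccount", checkout_dir],
                                capture_output=True, text=True).stdout
        # Store the raw tool output keyed by the snapshot date.
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE IF NOT EXISTS analysis "
                     "(snapshot_date TEXT, tool TEXT, output TEXT)")
        conn.execute("INSERT INTO analysis VALUES (?, ?, ?)",
                     (snapshot_date, "sloccount", output))
        conn.commit()
        conn.close()

    # Example: measure a monthly checkout that has already been downloaded.
    # measure_snapshot("/tmp/gnome-2004-01", "2004-01-01")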
The database table that contains the analysis results has the output of the external tool as a column. Figure 5 shows a table created by GlueTheos, which contains the output of sloccount (SLOC — source lines of code — and language type) for the files of the GNOME core project at a specific date. GlueTheos is released under the GNU General Public Licence.
2.1.3 MailingListStats
MailingListStats (http://libresoft.urjc.es/Tools/MLStats) analyses Mailman archives (and, in the future, those of other mailing list manager software) in order to extract statistical data from them. The statistical data is transformed into XML and SQL to allow further analysis and research. This tool also includes a web interface.
2.2.2 The Byte Code Metric Library
The Byte Code Metric Library (BCML, http://csdl.ics.hawaii.edu/Tools/BCML) is a collection of tools that calculate metrics of Java byte code classes or JAR files in directories, output the results into XML files, and report the results in HTML format.
2.2.3 C and C++ Code Counter
CCCC (http://cccc.sourceforge.net/) is a tool which analyses C/C++ files and generates a report on various metrics. The tool was developed as an MSc thesis by Tim Littlefair and is copyrighted by him. It is a command line tool which analyses a list of input files and generates HTML and XML reports containing the results. The metrics measured are the most common ones; specifically they are:
• Summary table of high level metrics summed over all files processed in the
current run.
• Structural metrics based on the relationships of each module with others. In-
cludes fan-out (i.e. number of other modules the current module uses), fan-in
(number of other modules which use the current module), and the Information
Flow measure suggested by Henry and Kafura, which combines these to give a
measure of coupling for the module.
• Lexical counts for parts of submitted source files which the analyser was unable
to assign to a module. Each record in this table relates to either a part of the
code which triggered a parse failure, or to the residual lexical counts relating
to parts of a file not associated with a specific module.
Figure 6 shows the report for procedural metrics for an Open Source project, while
Figure 7 shows the report for Object Oriented Metrics of the same project.
2.2.4 Software Metrics Plug-In for the Eclipse IDE
The Software Metrics Plug-In for the Eclipse IDE is a powerful add-on for the popular Open Source IDE Eclipse. It is installed, as its name denotes, as a plug-in to Eclipse and is distributed under the same licence as the Eclipse IDE itself. The tool measures Java code against a long list of metrics:
• Lines of Code (LOC): Total lines of code in the selected scope. Only counts
non-blank and non-comment lines inside method bodies.
• Number of Static Methods (NSM): Total number of static methods in the se-
lected scope.
• Method Lines of Code (MLOC): Total number of lines of code inside method
bodies, excluding blank lines and comments.
• Weighted Methods per Class (WMC): Sum of the McCabe Cyclomatic Complex-
ity for all methods in a class.
indicates a lack of cohesion and suggests the class might better be split into a
number of (sub)classes.
• Efferent Coupling (CE): The number of classes inside a package that depend
on classes outside the package.
• Depth of Inheritance Tree (DIT): Distance from class Object in inheritance hi-
erarchy.
The user can also set ranges and thresholds for each metric in order to track code
quality. Examples of these ranges can be:
• Nested Block Depth (Method Level): Max 5 - If a block of code has over 5
nested loops, break up the method.
• Lines of Code (Class Level): Max 750 — if a class has over 750 lines of code, split up the class and delegate its responsibilities.
As one can see from this list, the tool is rather extensive and the set of metrics measured is exhaustive. A view of the plugin displaying the results of a measurement is shown in Figure 8.
The tool also displays the dependency connections among the various packages
and classes of a project analysed as a connected graph. An example of this graph is
shown in Figure 9.
2.3.1 FindBugs
FindBugs (http://findbugs.sourceforge.net) looks for bugs in Java programs. It is based on the concept of bug patterns.
2.3.2 PMD
PMD scans source code and looks for potential problems: possible bugs, unused and suboptimal code, over-complicated expressions and duplicate code.
2.3.3 QJ-Pro
QJ-Pro is a tool-set for static analysis of Java source code: a combination of automatic code review and automatic coding standards enforcement.
2.3.4 Bugle
2.4.1 The Empirical Project Monitor
The Empirical Project Monitor (EPM) provides a tool for automated collection and analysis of project data. The current version uses CVS, GNATS, and Mailman as data sources.
2.4.2 HackyStat
2.4.3 QSOS
QSOS is a method designed to qualify, select and compare free and Open Source software in an objective, traceable and argued way. It is publicly available under the terms of the GNU Free Documentation License.
Platform The first step in utilising any tool is to install it on an operating system.
In the worst case, a tool's platform requirements cannot be fulfilled by an existing environment, which means a new OS would have to be added, i.e. bought, installed and maintained. Another platform issue is the database support: some
tools are based on a metric repository and have to rely on some sort of rela-
tional database. The range of supported databases affects the tools’ platform
interoperability. As some of the tools have both server and client components
(for data storage and collection/reporting purposes, respectively), one has to
distinguish these components’ platform interoperability separately.
Input/output Software project quality tracking and estimation tools heavily rely on
data from external sources such as UML modelling tools, source code anal-
ysers, work effort or change request databases etc. The ease of connecting
to these applications through interfaces or file input substantially influences
a metric tool’s efficiency and error-proneness. On the other hand, data often
has to be exported for further processing in spread-sheets, project manage-
ment tools or slide presentations. Reports and graphs have to be created and
possibly viewed, posted on the Web, or printed.
2.6.1 MetriFlame
which are not part of the MetriFlame tool, but separate programs. These programs
convert the data and generate structured text files, which can then be imported into
MetriFlame. New data can also be entered manually. The process of data collection
cannot be automated. Project data can only be saved in a MetriFlame project file;
no other file format is available. Reports (graphs) can be saved as WMF, EMF, BMP,
JPEG or structured text. MetriFlame does not feature an estimation model.
by choosing subtypes for parts of the project or tuning the estimation by changing
productivity drivers like database size or programmer capability. A screenshot of the
tool is presented in Figure 11. Estimate Professional supports MS Windows 95/98, NT 4.0 and 2000. For installation on NT systems, administrator rights are required.
Project data can be imported from a Microsoft Project file or from a CSV file. The
process of data collection cannot be automated. Project data can be exported to a
Microsoft Project file; project metrics can be exported into a CSV file.
2.6.3 CostXpert
The software cost estimation tool CostXpert (http://www.costxpert.com/) produces estimates of project duration, costs, staff effort, labour costs etc., using software size, labour costs, risk factors and other input variables. The tool features mappings of source lines of code
equivalents for more than 600 different programming languages. The main menu
of the tool is presented in Figure 12. Import of project data is limited to manual
entry. Data connectors to tools processing software artifacts do not exist. Data can
be exchanged between different copies of CostXpert via CostXpert project files. The
process of data collection cannot be automated. Regarding the estimation process, CostXpert integrates multiple software sizing methods and is compliant with COCOMO and over 32 lifecycles and standards. CostXpert is designed to aid project control, facilitate process improvement and earn a greater return on investment
(ROI). Especially for COTS products the tool is able to estimate the portion of the
package that needs no modification but should be configured and parameterised,
what portion of the package needs to be modified and the amount of functionality
that should be added to the system. Project data in a work breakdown structure
can be exported to Microsoft Project or Primavera TeamPlay. The expected labour
distribution can be exported to a CSV file. Customised project types, standards and
lifecycles can be exported to so-called customised data files. Reports can be printed
or exported as PDF, RTF or HTML files. Graphs can be exported as BMP, WMF or
JPEG files. CostXpert integrates more than 40 different estimation models based on data from over 25,000 software projects. CostXpert supports MS Windows 95 and all
later versions. CostXpert does not feature a project database; project data is stored
in a project file in proprietary format.
2.6.4 ProjectConsole
ProjectConsole (http://www-128.ibm.com) is a Web-based tool for project control that offers project reporting capabilities to software development teams. Project information can be extracted from Rational tools or other third-party tools, is stored in a database and can be accessed through a Web site. Rational ProjectConsole makes it easy to monitor the
status of development projects, and utilise objective metrics to improve project pre-
dictability. Rational ProjectConsole greatly simplifies the process of gathering met-
rics and reporting project status by creating a project metrics Web site based on
data collected from the development environment. This Web site, which Rational ProjectConsole updates on demand or on schedule, gives all team members a complete, up-to-date view of the project environment. Rational ProjectConsole collects
metrics from the Rational Suite development platform and from third-party products,
and presents the results graphically in a customisable format to help the assessment
of the progress and quality. Rational ProjectConsole supports MS Windows XP; Win-
dows NT 4.0 Server or Workstation, SP6a or later and Windows 2000 Server or
Professional, SP1 or later. All the data is stored in a database, the so-called metric
data warehouse. Supported databases include SQL Server, Oracle and IBM DB2.
ProjectConsole needs a Web server (IIS or Apache Tomcat) to publish its data over
a network (local network or the Internet). The project Web site can be viewed with
any browser. ProjectConsole can extract metrics directly from Rational Clear-Quest,
Requisite Pro, Rose, and Microsoft Project repositories. In addition, ProjectConsole
provides so-called collection agents that can parse Rational Purify, Quantify, Cover-
age, and ClearCase data files. Automatic collection tasks can be scheduled to run
daily, weekly or monthly at a specified date and time. The data is extracted from the
source programs and stored in the metric data warehouse. The project Web site is
automatically updated. Graphs are stored in PNG files. Data can be published in
tables and exported into HTML format. MS Excel 2000 or later can be used to im-
port the HTML table format. ProjectConsole does not feature an estimation model.
Figure 13 depicts the multi chart display of Project Console.
2.6.5 CA-Estimacs
Rubin has developed a proprietary software estimating model (http://www.ca.com/products/estimacs.htm) that utilises gross business specifications for its calculations. The model provides estimates of total development effort, staff requirements, cost, risk involved, and portfolio effects. The ESTIMACS model addresses three important aspects of software management: estimation, planning, and control. The ESTIMACS system includes five modules.
The first module is the System development effort estimator. This module requires
responses to 25 questions regarding the system to be developed, development envi-
ronment, etc. It uses a database of previous project data to calculate an estimate of
the development effort. Another module is the staffing and cost estimator; its inputs are the effort estimation from above, data on employee productivity, and the salary for each skill level. Again, a database of project information is used to compute the estimate of project duration, cost, and staffing required. The hardware configuration estimator requires as input information on the operating environment for the software product, total expected transaction volume, generic application type, etc.; its output is an estimate of the required hardware configuration. The risk estimator module calculates risk using answers to some 60 questions on project size, structure, and technology; some of the answers are computed automatically from other information already available. Finally, the portfolio analyser provides information on the effect of this project on the total operations of the development organisation, giving the user some understanding of the total resource demands of the projects.
2.6.6 Discussion
The tools evaluated provide a broad variety of analysis capabilities, and different
degrees of explicit estimation support. However, they all allowed storing and com-
paring project measures in a structured way. Certain conclusions can be drawn on
whether the tools can integrate seamlessly in an existing and heterogeneous soft-
ware development environment. All of the evaluated tools are only available on one
operating system (MS Windows). This is particularly problematic for server compo-
nents, as many times a dedicated server would have to be added to an otherwise
Unix-based server farm. Some tools only work with particular database engines, for
example Project Console. In addition to manual data entry, the tools generally are
restricted to a few input file formats (e.g. Estimate Professional only reads Microsoft
Project and CSV files). While communication with spreadsheet applications is usu-
ally supported, few tools can access development tools like integrated development
environments (IDE) or requirement databases directly. Tools with advanced metric
data collection capability (like MetricCenter) offer only a limited set of connectors to
specific development tools, which have to be purchased separately. Their communication protocol is not disclosed. Automation support is either not available (MetriFlame,
Estimate Professional, CostXpert) or limited to pull operations (MetricCenter). The
degree of flexibility with respect to defining new metrics and changing reports dif-
fers greatly, however all tools provide only basic reporting flexibility. This would not
be a problem itself, if the tools would allow unrestricted data access for online analyt-
ical processing (OLAP) reporting tools, but this is not possible with most of the tools
either. Data output for further processing is sometimes limited to CSV files and a pro-
prietary file format (Metri-Flame). Tools often don’t support common reporting file
formats like PDF. Output automation is supported by few of the evaluated tools (MetricCenter, ProjectConsole). Some tools, instead of supporting integration, seem to duplicate features which are normally already available in medium and large-scale IT environments: some tools introduce a proprietary file format (MetriFlame), or are limited to a particular database system instead of accessing the company's reliable database infrastructure. Some basic graphical reporting and Web-publishing
features are provided, instead of feeding advanced OLAP reporting tools, whose use
would also automatically eliminate the need of duplicating features for the handling
of user access rights. Finally, the difficulties in getting access to some tools pro-
vide an additional cost barrier in integrating them in existing IT environments and
seem to indicate that at least some of these tools do not provide user interfaces
with a low learning curve. Altogether, process engineers and portfolio managers
operating in highly dynamic environments must still expect substantial costs when
evaluating, integrating, customising, operating and continuously adapting planning
and monitoring tools. Even tools with advanced architectures such as MetricCenter offer only a limited set of supported development tools, restricted customisation capabilities due to undisclosed data protocols, and platform restrictions. Proprietary approaches to security and user access control further complicate integration. Much work
needs to be done to lower the technological barrier for collecting software metrics
in a varying and changing environment. Possible approaches to some of the current problems include support for modern file formats such as XML and lightweight data communication using, for example, the SOAP protocol.
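As an illustration of the direction suggested here, the following minimal Python sketch serialises a set of project measurements as an XML document that external reporting or OLAP tools could consume. The element names and metric values are purely illustrative assumptions and do not correspond to the format of any of the evaluated tools.

import xml.etree.ElementTree as ET

def metrics_to_xml(project, version, metrics):
    # Serialise a dictionary of metric name/value pairs into a small XML document.
    root = ET.Element("measurement", attrib={"project": project, "version": version})
    for name, value in metrics.items():
        metric = ET.SubElement(root, "metric", attrib={"name": name})
        metric.text = str(value)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(metrics_to_xml("example-project", "1.0",
                         {"sloc": 12450, "cyclomatic_complexity_avg": 4.2}))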
2.7.1 CT C++, CMT++ and CTB
CT C++ , CMT++ and CTB 27 are all tools developed by the Finnish company Testwell
and available from Verifysoft for Microsoft Windows, Solaris, HP-UX and Linux. They
focus on test coverage (CT C++ ), metric analysis (CMT++) and unit testing (CTB)
for C/ C++ source code. CT C++ is a coverage tool supporting testing and tuning
of programs written in C and C++ . This coverage-analyser supports coverage for
function, decision, statement, condition and multi-condition presenting the result
in a text or HTML report. The analyser is available for coverage measuring in the
host, for operating systems as well as for embedded systems. The tool is integrated
in Microsoft Visual C++ , the Borland compiler and WindRiver Tornado. CMT++ is
a tool for assessing code complexity. Code complexity affects how difficult it is to test and maintain an application, and complex code is more likely to contain errors. Metrics such as McCabe's cyclomatic complexity, Halstead's software metrics and lines-of-code measures are supported by the tool. The tool can be customised by the user
for company coding standards. CMT++ identifies complex and error-prone code.
As there is usually too little time to inspect all the code carefully, it is an important
step to select the most error-prone modules. CMT++ also gives an estimate of the number of test cases needed to test all paths of a function and an indication of how many bugs should be found before the code can be considered “clean”. CTB is a module-testing tool for the C programming language that allows code to be tested at a very early development stage, thereby helping to prevent bugs. As soon as a module compiles, a test bed can be generated for it without any additional programming. The tool supports a specification-based (black-box) testing approach
from “ad-hoc” trials to systematic script-based regression tests. Tests can be run interactively through a C-like command interface, or script- or file-based and automated. Test execution behaves as if the test driver read the test main program and immediately executed it command by command, showing what happens. CTB works together with coverage analysis tools such as CT C++.
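To make the test-case estimate mentioned above concrete, the following Python sketch approximates McCabe's cyclomatic complexity of a C function by counting decision points and adding one; the resulting value is a lower bound on the number of test cases needed for basis-path coverage. This is only a naive textual approximation for illustration, not a description of how CMT++ computes the metric.

import re

# Decision points counted: if, for, while, case, &&, || and the ternary operator.
DECISION_TOKENS = re.compile(r"\b(if|for|while|case)\b|&&|\|\||\?")

def cyclomatic_complexity(c_source):
    # Approximate McCabe's V(G) as (number of decision points) + 1.
    return len(DECISION_TOKENS.findall(c_source)) + 1

example = '''
int classify(int x) {
    if (x < 0) return -1;        /* decision 1 */
    for (int i = 0; i < x; i++)  /* decision 2 */
        if (i % 2 && x > 10)     /* decisions 3 and 4 */
            x--;
    return x;
}
'''
vg = cyclomatic_complexity(example)
print("V(G) =", vg, "=> at least", vg, "basis-path test cases")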
2.7.2 Cantata++
Cantata++ 28 is a commercial tool for unit and integration testing, coverage and
static analysis. The tool is built on the Eclipse v3.2 Open Source development platform, including the C Development Tools (CDT). The unit and integration testing capabilities of
the environment support automated test script generation by parsing source code to
derive parameter and data information with stubs and wrappers automatically gen-
erated into the test script. Stubs provide programmable dummy versions of external
software while wrappers are used for establishing programmable interceptions to
the real external software. The building and the execution of tests, black and white
27 http://www.verifysoft.com
28 http://www.ipl.com/products/tools/pt400.uk.php
Figure 14: Cantata++ V5 - a fully integrated Test Development and Analysis Envi-
ronment
box, is supported both by the tool itself and via the developer's build system. Verification of the code is also supported through sequential execution of test cases based on wrappers and stubs. The test cases defined in verification can be reused
for inherited classes and template instantiation. Figures 14 and 15 present the en-
vironment of the tool.
Coverage analysis provides measurement of how effective testing has been in ex-
ecuting the source code. Configurable coverage requirements are defined in rule
sets that are integrated into dynamic tests resulting in Pass/Fail for coverage re-
quirements. The coverage metrics used by the tool are the following:
• Entry points
• Call Returns
• Statements
• Basic Blocks
• Decisions (branches)
Cantata has certain features that support coverage especially for applications de-
veloped in Java such as reuse of JUnit tests with coverage by test case, and builds
with ANT. Static analysis generates over 300 source code metrics. The results of these metrics are stored in reports that can be used to help enforce code quality standards. The metrics defined are both procedural and product metrics. Procedural
metrics involve code lines, comments, functions and counts of code constructs. Product metrics include the Myers, MOOSE, McCabe, MOOD, Halstead, QMOOD, Hansen, Robert Martin, Object Oriented and Bansiya's Class Entropy metrics. Can-
tata++ can be integrated with many development tools including debuggers, simula-
tors/emulators, UML modelling, Project Management and Code execution profilers.
2.7.3 TAU/Logiscope
ate the code are ISO 9126 compliant. Templates as mentioned can be customised to
fit project-specific requirements. Logiscope TestChecker measures structural code
coverage and shows uncovered source code paths, helping to uncover bugs hidden in untested source code. TestChecker is based on a source code in-
strumentation technique that is adaptable to test environment constraints. Figure
16 shows the way results are depicted by Logiscope. All three functions of the tool are based on internationally recognised standards and models such as SEI/CMM, DO-178B and ISO/IEC 9126 and 9001. Several techniques that methodically track software quality for organisations at SEI/CMM Level 2 (repeatable) that want to
reach Level 3 (defined) and above are supported. “Reviews and Analysis of the
Source Code” and the “Structural Coverage Analysis” as required by the avionics
standard, DO-178B, for software systems from Levels E to A are partially supported
by Logiscope as well as “Quality Characteristics” as defined by ISO/IEC 9126. The
Logiscope product line is available for both UNIX and Windows.
2.7.4 McCabe IQ
• Integration Complexity
• Lines of Code
• Halstead
By using the above metrics, complex code is identified. Figure 17 shows an example
of how complex code identification is presented to the user. The Battlemap uses
colour coding to show which sections of code are simple (green), somewhat complex
(yellow), and very complex (red). Figure 18 presents the metric statistics that the
tool calculates. Another function supported is the tracking of redundant code by
using a module comparison tool. This tool allows the selection of predefined search
criteria or the establishment of new criteria for finding similar modules. After the search criteria have been selected, the process is as follows: select the modules to be used for matching, specify the programs or repositories to be searched, and finally locate the modules that are similar to the matching set according to the selected criteria. It is then determined whether there is any redundant code; if redundant code is found, it is evaluated and, if needed, reengineered. The tool also provides a series of data metrics. The parser analyses the data
declarations and parameters in the code. The result of this analysis is the produc-
tion of metrics based on data. There are two kinds of data-related metrics: global
data and specified data. Global data refers to those data variables that are declared
as global in the code. Based on the result of the parser’s data analysis reports are
produced that show how global data variables are tied to the cyclomatic complexity
of each module in code. As cyclomatic complexity and global data complexity in-
crease, so does the likelihood that the code contains errors. Specified data refers
to the data variables that are specified as a so-called specified data set in the data dictionary. In general, a data set is specified in the data dictionary when one or more variables have to be located in the code in order to analyse their association with the complexity of the modules in which they appear. The tool includes a host of
tools and reports for locating, tracking, and testing code containing specified data,
as well as for enforcing naming conventions. The tool is platform independent and
supports Ada, ASM86, C, C++ .NET, C++, COBOL, FORTRAN, JAVA, JSP, Perl, PL1, VB and VB.NET.
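The following short Python sketch illustrates the kind of report described above, in which references to global data are shown against the cyclomatic complexity of each module; the module names, values and ranking rule are assumptions made for the example and not McCabe IQ's actual output format.

# (module, cyclomatic complexity, number of references to global variables)
modules = [
    ("sched.c",  38, 25),
    ("init.c",    7,  3),
    ("buffer.c", 21, 14),
]

# Rank modules so that those combining high complexity with heavy use of global
# data - the combination linked above to error-proneness - appear first.
for name, cc, grefs in sorted(modules, key=lambda m: m[1] * m[2], reverse=True):
    print(f"{name:10s}  V(G)={cc:3d}  global data references={grefs:3d}")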
sion for Siebel Test Automation accelerates the process of system test creation, exe-
cution and analysis to ensure the early capture and repair of application errors. IBM
Rational Manual Tester is a manual test authoring and execution tool for testers and
business analysts. The tool enables test step reuse to reduce the impact of software
change on manual test maintenance activities and supports data entry and verifi-
cation during test execution to reduce human error. IBM Rational TestManager is
a tool for managing all aspects of manual and automated testing from iteration to
iteration. It is the central console for test activity management, execution and re-
porting supporting manual test approaches, various automated paradigms including
unit testing, functional regression testing, and performance testing. Rational Test-
Manager is meant to be accessed by all members of a project team, ensuring the
high visibility of test coverage information, defect trends, and application readiness.
IBM Rational Functional Tester Extension for Terminal-based Applications allows
the testers to apply their expertise to the mainframe environment while continuing
to use the same testing tool used for Java, VS.NET and Web applications.
2.7.6 Safire
The component that is most involved in quality assurance is the Campaigner that
supports automated execution of tests. This component creates, edits, manages and
executes test campaigns allowing the configuration of parameters. Campaigner also
32 http://www.safire-team.com/products/index.htm
produces test reports in the form of quality pass or fail for modules. The tool also allows automated repetition of certain tests. The quality rules that are used during the design and the testing of the code are the following:
• System structure
• Naming conventions-existence
• Naming conventions-properties
• SDL simplicity
• Uniqueness
• Modularity
• Proper-functionality
• Comments
• Communication
• Events
• Behaviour
2.7.7 Metrics 4C
Metrics4C33 calculates software metrics for individual modules or for the entire
project. These tools run interactively or in the background on a daily, weekly, or
monthly basis. The software metrics calculated for an individual module include:
• Lines of code
• Cyclomatic complexity
• Fan out
33 http://www.plus-one.com
The above values are then summed to provide their respective project metrics. In
addition, other project metrics are calculated. For example, the Integration Test Percentage (ITP) provides a numeric value indicating how much
of the project’s source code has been tested and can be used to better prepare for
Formal Qualification Testing (FQT). Output from Metrics4C can easily be imported
into a spreadsheet program to graphically display the data. Metrics4C can also flag
warnings if the lines of code or the cyclomatic complexity value exceeds a specified
maximum.
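A minimal Python sketch of the two behaviours just described, namely summing per-module metrics into project totals and flagging modules that exceed specified maxima, is given below; the module names, values and limits are illustrative assumptions rather than Metrics4C output.

modules = {
    "parser.c":  {"loc": 850,  "cyclomatic": 22},
    "lexer.c":   {"loc": 310,  "cyclomatic": 9},
    "codegen.c": {"loc": 1900, "cyclomatic": 35},
}
limits = {"loc": 1000, "cyclomatic": 30}   # user-specified maxima

project_totals = {"loc": 0, "cyclomatic": 0}
for name, metrics in modules.items():
    for key, value in metrics.items():
        project_totals[key] += value
    for key, maximum in limits.items():
        if metrics[key] > maximum:
            print(f"WARNING: {name}: {key}={metrics[key]} exceeds maximum {maximum}")

print("Project totals:", project_totals)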
2.7.8 Resource Standard Metrics
Resource Standard Metrics (RSM) 34 generates source code metrics for C/C++ and Java on any operating system. Source code quality metrics and complexity are measured by this tool from the written source code, with the aim of evaluating the project's performance. Source code metric differentials can be determined between base-
lines using RSM code differential work files. Source code metrics (SLOC, KSLOC,
LLOC) from this tool can provide line of code derived function point metrics. RSM is
compliant with ISO9001, CMMI and AS9100. Typical functionality of RSM enables:
• The determination of source code LOC, SLOC, KSLOC for C, C++ and Java
• Measurement of software metrics for each baseline and determine metrics dif-
ferentials between baselines
34 http://msquaredtechnologies.com/
• Performance of source code static analysis, best used for code peer reviews
• Creation of user defined code quality notices with regular expressions or utili-
sation of the 40 predefined code quality notices.
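As an illustration of a user-defined, regular-expression-based quality notice, the following Python sketch scans source text against two hypothetical rules; the rules shown are assumptions for the example and are not among RSM's predefined notices.

import re

quality_notices = [
    ("avoid strcpy (unbounded copy)", re.compile(r"\bstrcpy\s*\(")),
    ("TODO left in source",           re.compile(r"\bTODO\b")),
]

source = 'strcpy(dest, src); /* TODO: check buffer size */'

for line_no, line in enumerate(source.splitlines(), start=1):
    for description, pattern in quality_notices:
        if pattern.search(line):
            print(f"line {line_no}: quality notice: {description}")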
2.7.9 Discussion
Most of the testing and product metrics tools provide the online capability to record
defect information including severity, class, origin, phase of detection, and phase
introduced. Several tools automate the testing procedure by providing estimation
of error prone code and automatically generating results and reports. Metrics tools
provide a variety of metrics reports or transport data into spreadsheets or report
generators. Query and search capabilities are also provided. Users have the ca-
pability to customise tools to meet their organisation’s unique requirements. For
example, users can customise quality rules, workflow, queries, reports, and access
controls. Other common features of the tools studied include:
• Graphical user interface.
• Measurement analysis.
• Data reporting.
Back in 1971, in his book titled “The Psychology of Computer Programming,” Gerald
M. Weinberg was probably the first who analysed the so-called “egoless program-
ming,” meaning non-selfish, altruistic programming. This term was used in order to
describe the function of a software development environment in which volunteers
participate actively by discovering and fixing bugs, contributing new code, express-
ing ideas etc. These activities are without any direct material reward. Weinberg
subsequently observed that when developers are not territorial about their code and
encourage other people to look for bugs and potential improvements, then improve-
ment happens much faster [Wei71].
Several years later, Frederick P. Brooks, in his classic “The Mythical Man-Month:
Essays on Software Engineering,” predicted that OSS developers would play a significant role in software engineering in the future. In addition, he claimed that maintaining a widely used program typically costs 40% or more of the cost of developing it.
This cost is strongly affected by the number of users or developers of the specific
project. As more people find more bugs and other flaws, the overall cost of the software is reduced. Brooks concluded [Bro75] that this is why OSS can be competitive with, and sometimes even better than, conventionally built software.
In his influential article, “The Cathedral and the Bazaar,” Eric Steven Raymond gathered and presented the main features of OSS development. Starting with the analysis of his own OSS project, Fetchmail, he distinguished the classical “Cathedral-like” way of developing commercial software from the new, “Bazaar-like” world of
Linux and other FOSS projects. Eventually, he came up with a series of lessons to be
learned, which can very well serve as principles that make a FOSS project successful
[Ray99].
According to the history of OSS written by Peter H. Salus [Sal], there are indications that OSS development has its roots in the 1980s or even earlier. Raymond's article, however, was the first attempt at a systematic approach to OSS and its methods. His work has met considerable opposition, both in the FOSS community [DOS99] and in academic circles [Beza, Bezb], for being too simplistic and shallow. No matter how controversial Raymond's article is, its main contribution is that it raised widespread interest in empirical OSS studies. Since the dawn of the new millennium, a considerable number of research studies on this subject have been published. Some of their findings are described below, in order to gain a deeper understanding of the evolution of several famous OSS projects.
3.1.2 Linux
The Linux operating system kernel is the best-known FOSS project worldwide, and it is therefore a case worthy of closer study. The Linux project started in 1991 as a private research project by a 22-year-old Finnish student named Linus Torvalds. Being dis-
satisfied with the existing operating systems, he started programming a kernel him-
self, based on code and ideas from Minix, a tiny Unix-like operating system. Linux’s
first official release version 1.0 occurred in March 1994.
Today, Linux is one of the dominant computer operating systems, enjoying worldwide acceptance. It is a large system: it contains over four million lines of code and new versions are released very often. It has engaged hundreds of developers, who have willingly dedicated a lot of their time to fixing bugs, developing new code and reporting their ideas for its evolution. According to the relevant Wikipedia article, it is estimated that Linus Torvalds himself has contributed only about 2 per cent of Linux's code, but he remains the ultimate authority on what new code is incorporated into the Linux kernel. This is a fine example of how a FOSS community can work successfully by harnessing the efforts of a large, geographically distributed community of software specialists.
The growth of the Linux Operating System began by following two parallel paths:
the stable and the development releases. The stable release contains features that
have been already tested, showing a proven stability, ease of use and lack of bugs.
The development release contains more features that are still in an experimental
phase, therefore it lacks stability and it contains more bugs. As one would expect,
there are more development releases than stable ones. Also, the features of develop-
ment releases that have been adequately tested are incorporated in the next stable
release. This development concept has played a big part in the project's success, as it provides conventional users with a reliable operating system (the stable release) while at the same time giving software developers the freedom to experiment and try new features (the development release).
Following Raymond’s analysis on development method of the Linux operating sys-
tem, Godfrey and Tu presented a research of Linux’s evolution over the years from
1994 till 1999 [GT00]. As they say, most might think that as Linux got bigger and
more complex, its growing pace should slow down. This is also what the well-known
Lehman’s laws of software evolution suggest: “as systems grow in size and com-
plexity, it becomes increasingly difficult to insert new code” [LRW+ 97]. In the same
context, Turski analysed several large software systems - that were all created and
maintained by small, predefined teams of developers using traditional management
techniques. From his study, Turski posits that system growth is usually sub-linear; that is, a system's growth slows down as it increases in volume and complexity [Tur96]. Parnas also touched on this subject, comparing software aging with
human aging [Par94].
But the findings of Godfrey and Tu, after studying the evolution of Linux, indicated
Figure 19: Growth of the compressed tar file for the full Linux kernel source release,
([GT00], p.135).
a different trend. The methodology that they employed was to examine Linux both
at the overall system level and at each one of the major subsystems. In this way,
they were able to study not just the size evolution of the whole system, but the volume of each major subsystem as well. This approach provides more information, as it is not necessarily the case that every subsystem follows the same evolution pattern as the overall system. A sample of 96 kernel versions was selected, including 34 stable releases and 62 development releases. Two main metrics were used in this research: the size of tar files and the number of lines of code (LOC). A tar file includes all the source artifacts of the kernel, such as documentation, scripts and other material, but no binary files. LOC were counted in two ways: with the Unix command
wc -l (that included blank lines and comments) and with an awk script (that ignored
blank lines and comments).
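The two counting approaches can be reproduced with a few lines of code. The Python sketch below computes both a raw line count (equivalent to wc -l) and a count that skips blank and comment-only lines, roughly in the spirit of the awk script; it is a simplification that only recognises '//' and whole-line '/* ... */' comments.

import sys

def count_loc(path):
    raw, effective = 0, 0
    with open(path, errors="replace") as f:
        for line in f:
            raw += 1
            stripped = line.strip()
            if not stripped:
                continue  # blank line
            if stripped.startswith(("//", "/*", "*", "*/")):
                continue  # comment-only line (approximation)
            effective += 1
    return raw, effective

if __name__ == "__main__":
    total_raw = total_eff = 0
    for filename in sys.argv[1:]:
        raw, eff = count_loc(filename)
        total_raw += raw
        total_eff += eff
    print(f"raw LOC (wc -l style): {total_raw}, without blanks/comments: {total_eff}")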
Regarding the overall system’s growth, the results of this research show that the
development releases grew at a super-linear rate over time, while the stable releases
grew at a much slower rate (Figures 19 and 20). These tendencies are common for
both metrics that were used. It is therefore clear that Linux's development releases follow an evolution pattern that differs from Lehman's laws of software evolution. A plausible explanation lies in the way development releases are built: they attract capable developers who are willing to contribute to the system's growth. As the project's popularity rises, more developers are attracted to it and more code is contributed. The stable releases, which follow a more conservative development path and do not accept new contributions as easily, show a slower rate of size growth.
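One simple way to check whether such growth is super-linear is to fit both a linear and a quadratic model to the size-over-time data and compare the residuals, as in the Python sketch below; the data points are synthetic and serve only to illustrate the idea, not to reproduce the published measurements.

import numpy as np

days = np.array([0, 200, 400, 600, 800, 1000, 1200], dtype=float)
loc  = np.array([100, 160, 260, 400, 590, 830, 1120], dtype=float)  # KLOC, made up

def residual_sum(degree):
    coeffs = np.polyfit(days, loc, degree)
    fitted = np.polyval(coeffs, days)
    return float(np.sum((loc - fitted) ** 2))

linear, quadratic = residual_sum(1), residual_sum(2)
print(f"linear fit residual: {linear:.1f}, quadratic fit residual: {quadratic:.1f}")
if quadratic < linear:
    print("the quadratic model fits better, i.e. growth looks super-linear")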
As for the growth of major subsystems, Godfrey and Tu selected 10 of these
Figure 20: Growth in the number of lines of code measured using two methods: the
Unix command wc -l, and an awk script that removes comments and blank lines,
([GT00], p.135).
subsystems:
• arch: contains the kernel code that is specific to particular hardware architec-
tures/CPU’s
Figure 21 shows the evolution of each one of these subsystems in terms of LOC.
We notice that the drivers subsystem is both the biggest and the one with the
fastest growth. In Figure 22, a comparative analysis of each subsystem’s LOC versus
the overall system’s LOC is presented. We can see that drivers occupy more than 60
per cent of the total system’s size and this percentage is continuously growing. This
fact can be explained by Linux's rising popularity: more users wish to run it with many different types of devices, and therefore the respective drivers have to be included in the system.
A recent observation of Linux’s evolution was published by [Rob05]. He employed
a methodology similar to this of Godfrey and Tu, but examined all the available re-
leases of Linux (both stable and development) till December 2004, instead of picking
a sample. The metric that was used in this research was the SLOCCount tool, which
counts source lines of code written in identified source code files. The kernel had
grew a lot in comparison to the previous survey: the number of SLOC and the size
of tar file were more than double. This trend is visible in Figures 23 and 24: the
super-linearity of Linux’s evolution is even more remarkable during the last years.
Like Godfrey and Tu, Robles also examined the evolution of Linux’s major sub-
systems, as we can see in Figures 25 and 26. The results were similar, as drivers is
still the biggest subsystem, though its share of the total Linux kernel has decreased,
mainly due to the removal of sound subsystem in early 2002.
All in all, we conclude that the OSS communities’ power can push a project to
super-linear growth, in contrast to the typical software evolution rules. Voluntary
participation in a software’s development ensures that the participants are really
interested on it both as developers and as users. In this case, software isn’t treated
merely as a commercial product, but as a means of improving people’s lives. Linux
is a very good example of such a case.
Figure 22: Percentage of SLOC for each major subsystem of Linux (development
releases), ([GT00], p.138).
Figure 23: Growth of SLOC of Linux for all the stable and development releases,
([Rob05], p.89).
Figure 24: Growth of the tar file (right) and the number of files (left) for the full
Linux kernel source release, ([Rob05], p.90).
Figure 25: Growth of SLOC of the major subsystems of Linux (development re-
leases), ([Rob05], p.91).
Figure 26: Percentage of SLOC for each major subsystem of Linux (development
releases), ([Rob05], p.93).
3.1.3 Apache
Another famous OSS project is the Apache web server. It began in early 1995 by
Rob McCool, a software developer and architect who was 22 years old at that time.
Apache was initially an effort to coordinate the improvement of NCSA (National
Center for Supercomputing Applications) HTTPd program, by creating patches and
adding new features. Actually this was the initial explanation of the project’s name:
it was “a patchy” server. Later, though, the project's official website claimed that the Apache name was given as a sign of respect to the Native American Apache tribe.
Apache quickly attracted the attention of an initial core team of developers, who
formed the “Apache Group,” and it was first launched in early 1996, as Apache HTTP
version 1.0. At that time, it was actually the only workable Open Source alternative to
the Netscape web server. Since April 1996, it has reportedly been the most popular
HTTP server on the internet, as it hosts over half of all websites globally.
One of the most comprehensive studies of the Apache server was conducted by Audris Mockus, Roy T. Fielding and James Herbsleb in 2002 [MFH02]. In this research, they discuss the way Apache development occurred and present some quantitative results on Apache's development evolution. The following information is
based on this article.
As we mentioned earlier, the “Apache Group” was formed at the initial stage of
the project and was charged with the project's coordination. It was an informal organisation of people, consisting entirely of volunteers, who all had other full-time jobs. They therefore decided to employ a decentralised, distributed development concept that supported asynchronous communication. This was achieved through the
Figure 27: The cumulative distribution of contributions to the code base, ([MFH02],
p.321)
use of e-mailing lists, newsgroups and the problem reporting system (BUGDB). Ev-
ery developer may take part in the project, submit his contributions and then the
“Apache Group” decides on the inclusion of any code change. Apache core develop-
ers are free to choose the project area that most attracts them and to leave it when they are no longer interested in it.
Mockus, Fielding and Herbsleb studied several aspects of Apache’s development.
Firstly, they examined the participation of the project's development community, which comprises almost 400 individuals, in the two main parts of the software's devel-
opment: code generation and bug fixes. In Figure 27, we can see the cumulative
proportion of code changes (on the vertical axis) versus the top N contributors to
the code base (on the horizontal axis), which are ordered by the number of Modifi-
cation Requests (MRs) from largest to smallest. Code contribution is measured by
4 factors: MRs, Delta, Lines Added and Lines Deleted. The Figure shows that the
top 15 developers contributed more than 83 per cent of MRs and deltas, 88 per cent
of lines added, and 91 per cent of deleted lines. Similarly, Figure 28 shows the cu-
mulative proportion of bug fixes (vertical axis) versus the top N contributors to bug
fixing. This time, the core of 15 developers produced only 66 per cent of the fixes.
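A cumulative contribution curve of this kind is straightforward to compute once the contributions per developer are known: developers are ordered by their number of Modification Requests and their share of the total is accumulated, as in the Python sketch below. The MR counts used here are invented for illustration.

mrs_per_developer = {"dev_a": 420, "dev_b": 260, "dev_c": 150, "dev_d": 90,
                     "dev_e": 40, "dev_f": 25, "dev_g": 10, "dev_h": 5}

total = sum(mrs_per_developer.values())
cumulative = 0.0
ranked = sorted(mrs_per_developer.items(), key=lambda kv: kv[1], reverse=True)
for rank, (developer, mrs) in enumerate(ranked, start=1):
    cumulative += mrs
    print(f"top {rank} developer(s): {100 * cumulative / total:5.1f}% of all MRs")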
These two figures show that the participation of a wide development community
is more important in defect repair than in new code submission. We notice that, despite the broad overall participation in the project, almost all new functionality is created by the core developers. A broad developer community, though, is essential for bug fixing. Mockus, Fielding and Herbsleb made a comparative analysis
of these findings to several commercial projects’ data. This study’s outcome was
that in commercial projects, core developers’ contribution in the project’s evolution
3.1.4 Mozilla
Mockus, Fielding and Herbsleb [MFH02] present an analysis of another OSS project,
the Mozilla web browser. Mozilla was initially created as a commercial project by
Netscape Corporation, which (in January 1998) decided to distribute its Communicator free of charge and to give free access to the source code as well, therefore turning it into an OSS project. Netscape was actually so impressed by Linux's evo-
lution, that they were attracted by the idea of developing an Open Source web
browser. The project’s management was assigned to the “Mozilla Organisation,”
now named “Mozilla Foundation.” Nowadays, the foundation coordinates and main-
tains the Mozilla Firefox browser and the Mozilla Thunderbird e-mail application,
among others.
Mockus, Fielding and Herbsleb investigate the size of Mozilla’s development
community. By examining the project’s repository, they found 486 code contribu-
tors and 412 bug fixes contributors. In Figure 29, we can see the project’s external
participation over time. The vertical axis represents the fraction of external devel-
opers and the horizontal axis represents time. It is clear that participation gradually
increases over time, as a result of widespread interest and improved documentation.
As an example, it is mentioned that 95 per cent of the people who created problem
reports were external, and they committed 53 per cent of the total number of prob-
lem reports. Figure 30 shows the cumulative distribution of code contribution for
seven Mozilla modules. In this case, the developer contribution does not seem to vary as much as in the Apache project.
Mozilla represents a way in which commercial and Open Source development
approaches can be combined. The interdependence among Mozilla modules is high, and so is the effort dedicated to code inspections. Therefore, Mozilla's core teams are bigger than Apache's, employing more formal means of coordinating
the project. But the fact is that, despite its commercial development roots, Mozilla
managed to leverage the OSS community, achieve high participation and result in a
high-quality product.
3.1.5 GNOME
GNOME is also one of the biggest and most famous OSS projects. It is a desktop
environment for Unix systems and its name was formed as an acronym of the words
“GNU Network Object Model Environment.” In 2004, Daniel M. German published a study of GNOME, in order to examine how global software development can lead
to success [Ger04b]. The discussion below is based on that article.
The GNOME project was started by Miguel de Icaza, a Mexican software pro-
grammer. Its first version was released in 1997 and contained one simple appli-
cation and a set of libraries. Today, GNOME has turned into a large project, with more than two million LOC and hundreds of developers worldwide. In 2000, the
GNOME Foundation (similar to Apache’s Software Foundation) was established. It is
composed of four entities: the Board of Directors, the Advisory Board, the Executive
Director and the members. Many of the participants in the Board of Directors are
Figure 30: The cumulative distribution of contributions to the code base for seven
Mozilla modules, ([MFH02], p.336).
3.1.6 FreeBSD
FreeBSD is an open-source operating system that is derived from BSD, the version of
Unix that has been developed by the University of California. The project started in
Figure 31: FreeBSD stable release growth by release number, ([IB06], p.207)
1993 and its current (end of 2006) stable version is 6.1. It is run by the FreeBSD de-
velopers that have commit access to the project’s CVS. As it is considered a success-
ful OSS project, it has attracted scientific interest in its evolutionary process. The most recent publication on FreeBSD's evolution is by Clemente Izurieta and James Bieman [IB06]. Based on an earlier study by Trung Dinh-Trong
and James Bieman [DTB04] that praised the system’s organizational structure, Izuri-
eta and Bieman focused on examining the growth rate of FreeBSD stable releases
since its inception, by employing metrics such as LOCs, number of directories, total
size in Kbytes, average and median LOC for header (dot-h) and source (dot-c) files,
and number of modules for each sub-system and for the system as a whole.
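Two of the metrics listed above, the average and median LOC for dot-c and dot-h files, can be collected from a source tree along the lines of the Python sketch below; the path is a placeholder and the simple line count stands in for whatever LOC definition the study actually used.

import os
from statistics import mean, median

def loc_by_extension(root, extensions=(".c", ".h")):
    sizes = {ext: [] for ext in extensions}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext in sizes:
                with open(os.path.join(dirpath, name), errors="replace") as f:
                    sizes[ext].append(sum(1 for _ in f))
    return sizes

if __name__ == "__main__":
    for ext, counts in loc_by_extension("/path/to/freebsd/src").items():
        if counts:
            print(f"{ext}: {len(counts)} files, mean {mean(counts):.1f} LOC, "
                  f"median {median(counts):.1f} LOC")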
This study indicates that FreeBSD follows a linear (and sometimes sub-linear) rate of growth, as demonstrated in Figures 31 to 35. We observe that dot-c and
dot-h files (figure 34) show a very slight growth in size, which is due to the fact
that the system does not evolve in an uncontrolled manner, as Izurieta and Bieman
explain. It also has to be clarified that in figure 35, contrib subsystem contains soft-
ware contributed by users, and sys subsystem is the system’s kernel. As one could
expect, sys is smaller in size and grows at a slower pace than contrib, because its
content goes through a stricter validation process before its inclusion in the system.
In recent years, some horizontal studies of OSS projects have been published, in which several projects are examined collectively. One such example is an article by Andrea Capiluppi, Patricia Lago and Maurizio Morisio [CLM04], in which they selected 12 projects from the Freshmeat Open Source portal. These projects were
Figure 34: FreeBSD average and median values of dot-c and dot-h files, ([IB06],
p.209)
all “alive,” meaning that they had shown significant growth over time and there
were still developers working on them at the date of the research. In fact, the authors report that during their research on the Freshmeat portal they discovered that a significant percentage of the hundreds of accessible OSS projects were not evolving anymore, having had no developers and no growth for a considerable amount of time. The authors concluded that the mortality of OSS projects is quite high.
After an initial observation of the sample, they clustered the 12 projects into three categories: large, medium and small projects, as follows:
• Large projects: Mutt, ARLA
• Medium projects: GNUPARTED, Weasel, Disc-cover, XAutolock, Motion, Bubblemon
• Small projects: Dailystrips, Calamaris, Edna, Rblcheck
The authors analysed some basic attributes of these projects, such as size, mod-
ules and number of developers. According to their findings, all projects had grown
at a linear rate over time, both in terms of size and in terms of the number of de-
velopers. Some periodic fluctuations of the code’s size were noticed, mainly caused
by internal redesigns of the software, but the long-term view has been upward in
all cases. In large and medium projects, the core teams had grown as well, but in a
limited way, which suggests that there is always a ceiling to the core project teams' expansion. The same patterns of linear or sub-linear growth have been discovered
for the number of modules, too. In a later study, Andrea Capiluppi, Maurizio Morisio
and Juan Ramil proceeded to a further examination of the ARLA project, reaching
similar conclusions [CMR04].
Finally, another interesting study was carried out by James W. Paulson, Giancarlo Succi and Armin Eberlein [PSE04]. In order to test the effectiveness of the OSS development process, they investigated the evolutionary patterns of three ma-
jor OSS projects (Linux, GCC and Apache) in comparison to three closed-source
software projects, the names of which were kept confidential. According to their
findings, OSS development structure fosters creativity and constructive communica-
tion among the developers more effectively than traditional ways of software devel-
opment, because the new functions and features that were added to OSS projects
were bigger in number and in volume than the ones added to closed-source software
projects. In addition, OSS projects fix bugs and other defects faster, because of the greater number of developers and testers that contribute to them.
However, the evidence presented in this research does not support the arguments
that OSS systems are more modular and grow faster than closed-source competitors.
The authors in [ASAB02] and later on in [ASS+ 05] described a general framework
for F/OSS dynamical simulation models and the extra difficulties that have to be
confronted relative to analogous models of the closed-source process. The main difficulties they identify are the following:
1. Much unlike closed source projects, in F/OSS projects, the number of contrib-
utors greatly varies in time and is based on the interest that the specific F/OSS
project attracts. It cannot be directly controlled and cannot be predetermined
by project coordinators. Therefore, an F/OSS model should a) contain an ex-
plicit mechanism for determining the flow of new contributors as a function of
time and b) relate this mechanism to specific project-dependent factors that
affect the overall “interest” in the project.
2. In any F/OSS project, any particular task at any particular moment in time
can be performed either by a new contributor or an old one. In addition, al-
most all F/OSS projects have a dedicated team of “core” programmers that
perform most of the contributions, while their interest in the project stays ap-
proximately the same. Therefore, the F/OSS simulation model must contain
a mechanism that determines the number of contributions that will be under-
taken per category of contributors (e.g. new, old or core contributors) at each
time interval.
3. In F/OSS projects, there is also no direct central control over the number of
contributions per task type or per project module. Anyone may choose any task (e.g. code writing, defect correction, etc.) and any project module to work on. The allocation of contributions per task type and per project module depends on the following sets of factors:
(a) Programmer profile (e.g. some programmers may prefer code testing to defect correction). These factors can be further categorised as follows:
i. constant in time (e.g. a programmer's preference for code writing) and
ii. variable with time (e.g. a programmer's interest in contributing to any task or module may vary based on the frequency of past contributions).
(b) Project-specific factors (e.g. a contributor may wish to write code for a specific module, but there may be nothing interesting left to write for that module).
Therefore, the F/OSS model should (a) identify and parameterise the dependence of a programmer's interest in contributing to a specific task/module on (i)
programmer profile, (ii) project evolution and (b) contain a quantitative mech-
anism to allocate contributions per task type and per project module.
5. In F/OSS projects there is no specific time plan for project deliverables. There-
fore, the number of calendar days for the completion of a task varies greatly.
Also, delivery times should depend on project specific factors such as the amount
of work needed to complete the task. Therefore, task delivery times should be
determined in a stochastic manner on the one hand, while average delivery
times should follow certain deterministic rules.
The authors concluded that the core of any F/OSS simulation model should be
based upon a specific behavioural model that must be properly quantified in order to
model the behaviour of project contributors in deciding a) whether to contribute to
the project or not, b) which task to perform, c) which module to contribute to and d)
how often to contribute. The behavioural model should then define the way that the
above four aspects depend on a) programmer profile and b) project-specific factors.
The formulation of a behavioural model must be based on a set of qualitative
rules. Fortunately, previous case studies have already pinpointed such rules either
by questioning a large sample of F/OSS contributors or by analysing publicly avail-
able data in F/OSS project repositories. As previous case studies identified many
common features across several F/OSS project types, one certainly can devise a be-
havioural model general enough to describe at least a large class of F/OSS projects.
Selecting a suitable equation that describes a specific qualitative rule is largely an arbitrary task in the beginning; however, a particular choice may be subsequently
justified by the model’s demonstrated ability to fit actual results. Once the be-
havioural model equations and intrinsic parameters are validated, then the model
may be applied to other F/OSS projects.
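To make the structure of such a model more tangible, the following deliberately simplified Python sketch lets new contributors arrive stochastically as a function of project "interest" and lets each active contributor add a random amount of code per time step. All distributions and parameter values are arbitrary assumptions chosen for illustration; they are not the calibrated behavioural model of [ASAB02] and [ASS+ 05].

import random

random.seed(1)
contributors, loc = 5, 1000
history = []

for week in range(200):
    interest = 0.05 + loc / 2000000           # interest grows slowly with project size
    newcomers = sum(1 for _ in range(20) if random.random() < interest)
    contributors += newcomers
    active = max(1, int(contributors * 0.3))  # only a fraction contributes each week
    loc += sum(random.randint(0, 120) for _ in range(active))
    history.append((week, contributors, loc))

for week, c, size in history[::50]:
    print(f"week {week:3d}: {c:4d} contributors, {size:8d} LOC")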
Figure 36: Structure of a generic F/OSS dynamic simulation model: project-specific input and fixed behavioural model parameters feed the behavioural model and its probability distributions, and the output is the time evolution of the dynamic variables. Figure was reproduced from [ASS+ 05].
may attempt to provide rough estimates for these values based on results of other
(similar) real-world F/OSS projects. However, these values may be readjusted in the
course of the evolution of the simulated project as real data becomes available. If
the simulation does not become more accurate in predicting the future evolution of the project despite this continuous re-adjustment of parameters, it means that either a) some of the behavioural model's qualitative rules are based on wrong assumptions for the specific type of project studied, or b) the project-independent values of the behavioural model must be re-adjusted.
• F/OSS project risk management. F/OSS projects are risky, in the sense that many factors that are not easily anticipated may negatively affect their evolution. Simulation models may help in quantifying the impact of such factors, taking into
account their probability of occurrence and the effect they may have, in case
they occur.
• F/OSS process evaluation. The nature of F/OSS guarantees that in the fu-
ture we will observe new types of project organisation and evolution patterns.
Researchers may be particularly interested in understanding the dynamics of
F/OSS development and simulation models may provide a suitable tool for that
purpose.
In conclusion, the authors in both [ASAB02] and [ASS+ 05] claimed that existing
case studies do not contain the complete set of data necessary for a full-scale cali-
bration and validation of their simulation model. Despite this, the simulation results qualitatively demonstrated super-linear project growth at the initial stages, the saturation of project growth at later stages where a project reached a level of functional completion (Apache) and effective defect correction, findings that agree with known studies.
Figure 37: Simulation results for the Apache project: cumulative LOC difference vs. time. The bold line is the average of the 100 runs. The gray lines are one standard deviation above and below the average. The dashed vertical line shows the end of the time period for which data was collected in the Apache case study [MFH02]. Figure was reproduced from [ASS+ 05].
Figure 38: LOC evolution in gtk+ module of GNOME project: Cumulative LOC dif-
ference vs. time. The bold line is the expectation (average) value of LOC evolution.
The gray lines are one standard deviation above and below the average. The dashed
vertical line shows approximately the end of the time period for which data was
collected in the GNOME case study. Figure was reproduced from [ASS+ 05].
One of the most evident intrinsic limitations of the F/OSS simulation models, the
authors claimed, comes from the very large variances of the probability distribu-
tions used. On output, this leads to large variances in the evolution of key project
variables, a fact that naturally limits the predictive power of the model.
Finally, the authors concluded that despite the aforementioned intrinsic and ex-
trinsic limitations, their “first attempt” simulation runs fairly demonstrated the
model’s ability to capture reported qualitative and quantitative features of F/OSS
evolution.
of a project). The two size metrics that relate to satisfaction are the “Number of statements” and “Program Length”. This relation is negative, i.e. the bigger a component is, the worse its “external quality” performs.
The authors ultimately suggest that Open Source performs no worse than a standard implied by an industrial tool, and they emphasise the need for more empirical studies in order to clarify Open Source quality performance. The authors suggest (in 2002) that in an Open Source project programmers should follow a programming standard and have a quality assurance plan, leading to high quality code. This suggestion has recently been adopted by large Open Source projects like KDE35 .
Another study from the same group that assesses the maintainability of open
source software is that of Samoladas et al. [SSAO04]. In this paper, the authors studied the maintainability of five Open Source software projects and one closed source project, a comparison that is not frequent in the Open Source literature. The measurement was conducted on successive versions, allowing the study of the evolution of maintainability and how it behaves over time. The maintainability was measured using the
Maintainability Index described in section 1.3.3 and the measurement was done with
the help of a metrics package found in the Debian r3.0 distribution, which contains
tools from Chris Lott’s page, and a set of Perl scripts to coordinate the whole process.
The projects under study had certain characteristics: two of them were pure open source projects (initiated as Open Source and continuing to evolve as such), the third was an academic project that gave birth to an Open Source project, the fourth was a closed source project that opened its code and continued as open source, the fifth was an Open Source project that was forked into a commercial one while itself continuing as Open Source, and the last one was the latter closed source fork, whose code is available under a commercial, non-modifiable licence. The result of the study was
that in all cases the maintainability of all projects deteriorates over time. When they
compared the evolution of the maintainability of the closed source one versus its
35 http://www.englishbreakfastnetwork.org/
Figure 39: Maintainability Index evolution for an Open Source project and its closed
source “fork” (Samoladas et al.).
counterpart, the closed source project performs worse than the Open Source project. The authors conclude that open source code quality, as expressed by maintainability, suffers from the same problems that have been observed in closed source software studies. They also point out that further empirical studies are needed in order to produce safe results about Open Source code quality.
Another study of maintainability of Open Source projects and particularly the
maintainability of the Linux kernel, was conducted by Yu et al. [YSCO04]. Here, the authors study the number of instances of common coupling between the 26 kernel modules and all the other non-kernel modules. By coupling they mean the degree of interaction between modules and thus the dependency between them. In this document, coupling was also explained in section 1.3.1. Additionally, for kernel-based software, they also consider couplings between the kernel and non-kernel modules. The reason they studied coupling as a measure of maintainability is that, as the authors suggest and explain, common coupling is connected to fault proneness, and thus to maintainability.
The specific study is a follow-up to previous ones conducted by the same team. In these previous studies, they examined 400 successive versions of the
Linux kernel and tried to find relations between the size, as it is expressed by the
lines of code, and the number of instances of coupling. Their findings showed that
the number of lines of code in each kernel module increases linearly with the version
number, but that the number of instances of common coupling between kernel mod-
ules and all others shows an exponential growth. In this new study they perform an
in depth analysis of the notion of coupling in the Linux kernel. In order to perform
Figure 40: Maintainability Index evolution for three Open Source projects (Samoladas et al.).
their new study, the authors first refined the definition of coupling and defined different expressions of it (e.g. global variables inside the Linux kernel, global variables outside the kernel, etc.), separating coupling into five categories and characterising them as “safe” and “unsafe”. Then, they constructed an analysis technique and met-
ric for evaluating coupling and applied it to analyse the maintainability of the Linux
kernel.
The application of this classification of coupling to the Linux 2.4.20 kernel showed that for a total of 99 global variables (the common expression of coupling) there are 15,110 instances of them, of which 1,908 are characterised as “unsafe”. Along with the re-
sults from their previous study (the exponential growth of instances) they conclude
that the maintainability of the Linux kernel will face serious problems in the long
term.
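A crude textual approximation of this kind of counting is shown in the Python sketch below: given a list of global variable names, it counts how often each is referenced in kernel and non-kernel source files. The variable names and paths are illustrative assumptions, and a real analysis like that of Yu et al. would of course need proper parsing rather than pattern matching.

import os
import re
from collections import Counter

GLOBALS = ["jiffies", "current", "system_state"]   # assumed global variable names
patterns = {name: re.compile(r"\b%s\b" % re.escape(name)) for name in GLOBALS}

def count_references(root):
    counts = Counter()
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.endswith((".c", ".h")):
                continue
            with open(os.path.join(dirpath, fname), errors="replace") as f:
                text = f.read()
            for name, pattern in patterns.items():
                counts[name] += len(pattern.findall(text))
    return counts

if __name__ == "__main__":
    print("kernel:    ", count_references("/path/to/linux/kernel"))
    print("non-kernel:", count_references("/path/to/linux/drivers"))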
A more recent paper from the same group compares the maintainability, as expressed by coupling, of the Linux kernel with that of the FreeBSD, OpenBSD and NetBSD kernels [YSC+ 06]. They applied a similar analysis to that in [YSCO04] and compared the performance of Linux against the BSD family (as formal statistical hypotheses). The results showed that Linux contains considerably more instances of common coupling than the BSD family kernels, making it more difficult to maintain and more fault prone to changes. The authors suggest that the big difference between Linux and the BSD family kernels indicates that it is possible to design a kernel without a lot of global variables and, thus, that the Linux kernel development team does not take maintainability into account as much.
A more recent study is that of Güneş Koru and Jeff Tian [KT05]. Here the two
authors try to correlate change proneness with structural metrics, such as size, coupling, cohesion and inheritance metrics. They suggest, based on previous studies, that change-prone modules are also defect prone and that these modules can be spotted by measuring their structural characteristics. In short, the authors measured two large Open Source projects, namely Mozilla and OpenOffice, using a large set of structural measures which fit into the categories mentioned before. The measurement was done with the Columbus36 tool. In addition, with the help of custom-made Perl scripts, they counted the differences for each application from its immediately preceding revision. The class was considered as the smallest software unit. This measurement involved 800 KLOC and 51 measures for Mozilla and 2,700 KLOC and 46 measures for OpenOffice.
With the results obtained, they questioned whether high-change modules were the same as the modules with the highest measurement values, considering each metric individually. They also tried to compare the results with an older, similar study of their own, conducted on six large-scale industrial projects (IBM and Nortel). In order to answer these questions they created appropriate formal statistical hypotheses and tests. The results showed strong evidence that the modules which had the most changes did not have the highest measurement values, a fact that was also true for the previous industrial study. The authors also performed a similar analysis with clustering techniques. The second analysis resulted in the same statement, but it also pointed out that the high-change modules were not the modules with the highest measurement values but those with fairly high measurement values.
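The kind of comparison made in this study can be illustrated with a small Python sketch that checks whether the classes with the most changes are also the classes with the highest value of a structural metric such as CBO; the class names and numbers below are invented for the example.

classes = {
    "Parser":    {"changes": 42, "cbo": 18},
    "Renderer":  {"changes": 35, "cbo": 31},
    "Cache":     {"changes": 30, "cbo": 12},
    "Logger":    {"changes":  4, "cbo":  3},
    "Allocator": {"changes":  2, "cbo": 40},
}

def top_n(metric, n=2):
    ranked = sorted(classes, key=lambda c: classes[c][metric], reverse=True)
    return set(ranked[:n])

high_change = top_n("changes")
high_metric = top_n("cbo")
print("highest-change classes:", high_change)
print("highest-CBO classes:   ", high_metric)
print("overlap:", high_change & high_metric or "none")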
The latter was the main outcome of the paper and, as the authors indicate, the same is true for the six industrial applications. Trying to explain this, the authors suggest that it holds because expert programmers in Open Source take on the difficult tasks and novices the easier ones. This might result in the modules with the highest structural measures, which solve complex tasks, not being the most problematic ones. Of course, as they suggest, this needs further investigation and is a central issue in their future studies.
A very interesting paper, although not directly an Open Source code quality study, is that of Gyimóthy, Ferenc and Siket [GFS05]. The study has as its main goal the validation of the Object Oriented Metrics Suite of Chidamber and Kemerer (the CK suite, as described in section 1.3.1) with the help of open source software, not the assessment of the quality of Open Source software per se. In particular, they validated the CK metrics suite on an Open Source project, Mozilla, with the help of the framework and metrics collection tool named Columbus, which was mentioned previously. In
order to perform their analysis, in addition to using Columbus to extract the metrics, they also collected information about bugs in Mozilla from the Bugzilla database,
the system that Mozilla uses for bug reporting and tracking. The validation of the
36 http://www.frontendart.com
Figure 41: Changes in the mean value of CK metrics for 7 versions of Mozilla (Gyimóthy et al.)
metrics was done with statistical methods such as logistic and linear regression, but
also with machine learning techniques, like decision trees and neural networks. The
latter techniques were used to predict fault proneness of the code.
The methodology followed can be summarised as:
1. Extraction of the metrics with the Columbus tool and association of the bugs reported in the Bugzilla database with the classes found in the source code
2. Application of the four techniques (logistic and linear regression, decision trees and neural networks) to predict the fault proneness of the code
The methodology is well described in the paper. As the authors admit, the challenge of the whole process was to associate the bugs from the Bugzilla database with the classes found in the source code. This association was complicated and demanded a lot of iterative work, which is described in the paper.
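As a hedged illustration of one of the four techniques (logistic regression) on this kind of data, the Python sketch below fits a model that predicts fault proneness from a few CK metric values; the tiny data set is invented purely for illustration and is not the Mozilla data used in the study.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: CBO, WMC, DIT for a handful of hypothetical classes.
X = np.array([[ 2,  5, 1],
              [ 4,  9, 2],
              [12, 30, 3],
              [18, 45, 2],
              [ 3,  7, 1],
              [15, 38, 4]])
y = np.array([0, 0, 1, 1, 0, 1])   # 1 = at least one bug associated with the class

model = LogisticRegression().fit(X, y)
new_class = np.array([[14, 33, 3]])
print("predicted fault-proneness probability:", model.predict_proba(new_class)[0, 1])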
From the “pure” software engineering part of the study, the validation of the
metrics and the models' predictiveness, the most interesting result is that the CBO metric (Coupling Between Object classes) seems to be the best at predicting the fault-proneness of classes. It is easy to notice that, again, the notion of coupling is strongly related to bugs and, thus, to maintainability. This fact demands further investigation and has to be on our project's research agenda. Regarding the “Open Source” part of the study, the authors observed a significant growth in the values of 5 out of 7 CK metrics (the seventh is LCOMN - Lack of Cohesion on Methods allowing Negative values, a metric not included in the CK metrics suite). The authors assume that this growth happened because of a big reorganisation of the Mozilla source code in version 1.2. Of course, this explanation needs further investigation. Figure 41 shows the changes of the metrics for the seven versions of the Mozilla suite. To conclude, we could say that, although this study does not directly assess
Free and Open Source Software (F/OSS) development not only exemplifies a viable software development approach, but it is also a model for the creation of self-learning and self-organising communities in which geographically distributed individuals contribute to building a particular piece of software. The Bazaar model [Ray99], as opposed to the Cathedral model of developing F/OSS, has produced a number of successful applications (e.g. Linux, Apache, Mozilla, MySQL). However, the initial phase of most F/OSS projects does not operate at the Bazaar level, and only successful projects make the transition from the Cathedral to the Bazaar style of software development [Mar04].
Participants who are motivated by a combination of intrinsic and extrinsic mo-
tives congregate in projects to develop software on-line, relying on extensive peer
collaboration. Some project participants often augment their knowledge on coding
techniques by having access to a large code base. In many projects epistemic com-
munities of volunteers provide support services [BR03], act as distributing agents
and help newcomers or users. The F/OSS windfall is such that there is increased
motivation to understand the nature of community participation in F/OSS projects.
Substantial research on Open Source software projects has focused on software repositories such as mailing lists to study developer communities, with the ultimate aim of informing our understanding of core software development activities. Mundane project activities, which are not explicit in most developer lists, have also received attention [SSA06], [LK03a]. Many researchers focus on mailing lists in conjunction with other software repositories [KSL03], [Gho04], [LK03a], [HM05]. These studies have provided great insight into the collaborative software development process that characterises F/OSS projects. F/OSS community studies in mailing lists are important because, on the one hand, mailing lists are one of the major pieces of technical infrastructure that F/OSS projects require. On the other hand, F/OSS projects are symbiotic cognitive systems where ongoing interactions among project participants generate valuable software knowledge - a collection of shared and publicly reusable knowledge - that is worth archiving [SSA06]. One form of knowledge repository where archiving of public knowledge takes place is the project's mailing list.
Lists are active and complex living repositories of public discussions among F/OSS
participants on issues relating to project development and software use. They con-
However, expert software developers and project and package maintainers take part in mundane activities in non-developer mailing lists. They interact with participants and help answer questions others have posted. Sometimes they encounter useful issues which help them to further plan and improve the code or the overall software quality and functionality. In addition, although mundane activities display a low level of innovativeness, they are fundamental for the adoption of F/OSS [BR03].
Figure 42: Methodological Outline to Extract Data from Mailing Lists Archives. Mod-
ified from [SSA06] (p.1027).
externalising knowledge into the mailing lists. In any project's mailing list, these posters may assume the role of knowledge seekers and/or knowledge providers [SSA06]. The posting and replying activities of the participants are two variables that can be compared, measured and quantified. The affiliations an individual participant has with others as a result of the email messages they exchange within the same list, or across lists in different projects, can be mapped and visualised using Social Network Analysis (SNA). For the construction of such an affiliation network or 'mailing list network' see [SSA06], pp. 130-131.
The two main goals of data mining are prediction and description. Prediction aims at estimating the future value or predicting the behaviour of some variables of interest based on the behaviour of other variables. Description concentrates on the discovery of patterns that represent the data of a complicated database in a comprehensible and exploitable way. A good description can suggest a good explanation of the data's behaviour. The relative importance of prediction and description varies across data mining applications; in knowledge discovery, description tends to be more important than prediction, whereas in pattern recognition and machine learning applications prediction matters more. A number of data mining methods have been proposed to satisfy the requirements of different applications. However, all of them accomplish a set of data mining tasks to identify and describe interesting patterns of knowledge extracted from a data set. The main data mining tasks are as follows:
• Association rules extraction. Mining association rules is one of the main tasks in the data mining process. It has attracted considerable interest because the rules provide a concise way to state potentially useful information that is easily understood by end-users. Association rules reveal underlying “correlations” between the attributes in the data set. These correlations are presented in the form A → B, where A and B refer to sets of attributes in the underlying data.
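A minimal sketch of how such A → B rules can be extracted from a transactional data set follows; the toy transactions and the support and confidence thresholds are illustrative assumptions only.

    from itertools import combinations

    # Toy transactions: each row is the set of attributes observed together
    # (e.g. quality flags attached to a module or entities changed in a revision).
    transactions = [
        {"high_coupling", "bug_prone"},
        {"high_coupling", "bug_prone", "large_class"},
        {"large_class"},
        {"high_coupling", "bug_prone"},
    ]
    MIN_SUPPORT, MIN_CONFIDENCE = 0.5, 0.8  # assumed thresholds

    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    items = sorted({i for t in transactions for i in t})
    # Single-antecedent, single-consequent rules A -> B, for brevity.
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            sup = support({lhs, rhs})
            if sup >= MIN_SUPPORT:
                conf = sup / support({lhs})
                if conf >= MIN_CONFIDENCE:
                    print(f"{lhs} -> {rhs}  (support={sup:.2f}, confidence={conf:.2f})")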
• Classification. Classification is a function that maps (classifies) a data item into one of several predefined classes. One of the most widely used classification techniques is decision trees. They can be used to discover classification rules for a chosen attribute of a data set by systematically subdividing the information contained in that data set. Decision trees have been one of the tools of choice for building classification models in the software engineering field. Figure 43 shows a classification tree that has been built to provide a mechanism for identifying risky software modules based on attributes of the module and its system. Based on this decision tree we can extract the following rule that assists with making decisions about errors in a module:
IF (# of data bindings > 10) AND (it is part of a non real-time system)
THEN the module is unlikely to have errors
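As an indicative sketch of this kind of model (not the tree of Figure 43), a decision tree classifier could be trained on module attributes to flag risky modules; the feature names and the training data below are purely hypothetical.

    from sklearn.tree import DecisionTreeClassifier, export_text

    # Hypothetical module attributes: [# of data bindings, is_real_time (0/1)]
    X = [[3, 0], [12, 0], [15, 1], [8, 1], [20, 1], [2, 0], [11, 0], [18, 1]]
    # Labels: 1 = module had errors, 0 = error-free (toy data for illustration).
    y = [0, 0, 1, 0, 1, 0, 0, 1]

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["data_bindings", "is_real_time"]))
    print("prediction for (14 bindings, real-time):", tree.predict([[14, 1]])[0])

The printed tree can then be read off as IF-THEN rules of the same form as the example above.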
A number of approaches have been proposed in the literature which, based on the above data mining techniques, aim to assist with some of the main software engineering tasks, namely software maintenance and testing. We provide an overview of these approaches in the following section; Table 2 summarises their main features. Data mining, due to its capability to deal with large volumes of data and its efficiency in identifying hidden patterns of knowledge, has been proposed in a number of research works as a means to support industrial-scale software maintenance.
Figure 43: Classification tree for identifying risky software modules [MN99]
• Entity type and granularity they use (e.g. file, function, statement, etc.).
In the sequel, we introduce the main concepts used in MSR and then briefly present some of the best-known MSR approaches proposed in the literature.
Fundamental Concepts in MSR. The basic concepts with respect to MSR involve
the level of granularity of the software entity investigated, the changes, and the underlying nature of a change. The most widely used concepts can be summarised as follows:
• The semantics of a change is a high-level yet concise description of the change in the entity's semantics or feature space: for instance, a class interface change, a bug fix, or a new feature added to the GUI.
MSR via CVS annotations. One approach is to utilise CVS annotation information. Gall et al. [GHJ98] propose an approach for detecting common semantic (logical and hidden) dependencies between classes on account of the addition or modification of a particular class. The approach is based on the version history of the source code, in which a sequence of release numbers is recorded for each class in which it changed. Classes that have been changed in the same release are compared in order to identify common change patterns based on the author name and time stamp from the CVS annotations. Classes that are changed with the same time stamp are inferred to have dependencies.
Specifically, this approach can assist with answering questions such as which classes change together, how many times a particular class was changed, and how many class changes occurred in a subsystem (files in a particular directory). An approach that studies the file-level changes in software is presented in [Ger04a]. The CVS annotations are utilised to group subsequent changes into what is termed a modification request (MR). This approach focuses on studying bug-MRs and comment-MRs to address issues such as the new functionality that may be added or the bugs that may be fixed by MRs, the different stages of evolution to which MRs correspond, and the relation between developers and the modification of files.
MSR via Data Mining. Data mining provides a variety of techniques with potential application to MSR. One of these techniques is association rules. The work proposed by Zimmermann et al. [ZWDZ04] exploits the association rules extraction technique to identify co-occurring changes in a software system. For instance, we may want to discover relations between the modifications of software entities. Then we aim
MSR via Heuristics. CVS annotation analysis can be extended by applying heuristics that include information from the source code or source code models. Hassan et al. [HH04] proposed a variety of heuristics (developer-based, history-based, code-layout-based (file-based)) which are then used to predict the entities that are candidates for a change when a given entity is changed. CVS annotations are lexically analysed to derive the set of changed entities from the source-code repositories. The research in both [ZWDZ04] and [HH04] uses source-code version history to identify and predict software changes. The questions that they answer are quite interesting with respect to testing and impact analysis.
ified syntactic elements based on the encoded AST. The types and prevalence of syntactic changes can be easily computed. Specifically, the approach supports the following questions:
According to the above discussion on MSR, we can conclude that the types of questions that MSR can answer can be classified into two categories:
• Attributes that describe the entities (such as class name, superclass, method name, etc.).
The above elements specify the data input model of the framework. Another part of the framework is an extraction process which aims to extract elements and metrics from the source code. The extracted information is then stored in a relational
database so that data mining techniques can be applied. In this specific approach, clustering techniques are used to analyse the input data and provide the maintenance engineer with a rough grasp of the software system. Clustering produces overviews of systems by creating mutually exclusive groups of classes, member data and methods based on their similarities. Moreover, it can assist with discovering programming patterns and outlier cases (unusual cases) which may require attention.
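A minimal sketch of this idea, grouping classes by the similarity of their metric values with k-means clustering; the metric vectors, class names and the choice of three clusters are assumptions for illustration.

    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical per-class metric vectors: [methods, attributes, coupling].
    class_metrics = np.array([
        [5, 3, 2], [6, 4, 3], [40, 20, 15], [38, 18, 14], [7, 2, 1], [45, 25, 30],
    ])
    class_names = ["Parser", "Lexer", "Engine", "Scheduler", "Token", "Kernel"]

    kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(class_metrics)
    for name, label in zip(class_names, kmeans.labels_):
        print(f"{name}: cluster {label}")

Classes falling into the same cluster have similar metric profiles; very small or isolated clusters point to the outlier cases mentioned above.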
• Generate candidate feature-sets and use each one to create and train a
pattern classifier to distinguish failures from the successful executions.
• Select the features of the classifier that give the best results.
3. The profiles of reported failures are analysed using cluster analysis, in order to
group together failures whose profiles are similar with respect to the features
selected in phase 2.
Figure 45: Merging two clusters. The new cluster A contains the clusters repre-
sented by the two homogeneous sub-trees A1 and A2
large homogeneous subtrees containing failures with different causes. Such a cluster should be split at the level where its large homogeneous subtrees are connected, so that these subtrees become siblings, as Figure 46 shows. If the classification is too fine, siblings may be clusters containing failures with the same causes. Such sibling clusters should be merged at the level of their parent, as Figure 45 depicts.
Based on these definitions, the strategy that has been proposed for refining an initial classification of failures using dendrograms has three phases:
1. Select the number of clusters into which the dendrogram will be divided.
2. Examine the individual clusters for homogeneity by choosing the two executions in the cluster with maximally dissimilar profiles. If the selected executions have the same or related causes, it is likely that all of the other failures in the cluster do as well. If the selected executions do not have the same or related causes, the cluster is not homogeneous and should be split.
3. If neither the cluster nor its sibling is split by step 2, and the failures that were examined have the same cause, then the two clusters are merged.
Clusters that have been generated from merging or splitting are analysed in the same way, which allows for recursive splitting or merging.
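The dendrogram side of this strategy can be sketched with SciPy's hierarchical clustering: failure profiles are linked agglomeratively and then cut into an initial number of clusters (phase 1). The profiles, the linkage method and the cut level below are assumed; the homogeneity checks of phases 2 and 3 would still be performed by a human examining the selected executions.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster

    # Hypothetical failure profiles: each row is a feature vector describing
    # one failing execution (e.g. selected function call counts).
    profiles = np.array([
        [1, 0, 5, 2], [1, 1, 4, 2], [9, 8, 0, 1], [8, 9, 1, 1], [1, 0, 6, 3],
    ])

    # Build the dendrogram (agglomerative clustering, average linkage).
    dendrogram_links = linkage(profiles, method="average", metric="euclidean")

    # Phase 1: cut the dendrogram into an initial number of clusters (assumed: 2).
    initial_clusters = fcluster(dendrogram_links, t=2, criterion="maxclust")
    print("initial cluster assignment per failure:", initial_clusters)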
Figure 46: Splitting a cluster: The two new clusters (subtrees with roots A11 and
A12) correspond to the large homogeneous subtrees in the old cluster.
L = \{(x_1, j_1), \ldots, (x_N, j_N)\}

d(t) = \frac{1}{N_t} \sum_{i \in t} \left( j_i - \bar{j}(t) \right)^2

• Each node t is split into two children t_R and t_L. The split is chosen that maximises the reduction in deviance. That is, from the set of possible splits S, the optimal split is found by:

s^* = \operatorname{argmax}_{s \in S} \left[ d(t) - \frac{N_{t_R}}{N_t}\, d(t_R) - \frac{N_{t_L}}{N_t}\, d(t_L) \right]
• The predicted value for a leaf is the average value of j among the executions in
that leaf.
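A small sketch of how this deviance-based split could be computed for a single numeric feature follows; the (x, j) execution data are invented and only one feature is searched, so this illustrates the formula rather than a full tree builder.

    # Deviance-based split selection for a regression tree node, following
    # the formulas above; the (x, j) pairs are hypothetical execution data.
    data = [(1.0, 0), (2.0, 0), (3.0, 1), (8.0, 4), (9.0, 5), (10.0, 5)]

    def deviance(pairs):
        """d(t): mean squared deviation of j from its mean within the node."""
        if not pairs:
            return 0.0
        mean_j = sum(j for _, j in pairs) / len(pairs)
        return sum((j - mean_j) ** 2 for _, j in pairs) / len(pairs)

    n_t, d_t = len(data), deviance(data)
    best_split, best_reduction = None, float("-inf")
    for threshold, _ in sorted(data)[:-1]:
        left = [(x, j) for x, j in data if x <= threshold]
        right = [(x, j) for x, j in data if x > threshold]
        # Reduction in deviance: d(t) - (N_L/N_t) d(t_L) - (N_R/N_t) d(t_R)
        reduction = (d_t
                     - len(left) / n_t * deviance(left)
                     - len(right) / n_t * deviance(right))
        if reduction > best_reduction:
            best_split, best_reduction = threshold, reduction

    print(f"best split: x <= {best_split}  (deviance reduction {best_reduction:.3f})")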
• Warnings are flagged for return values that are completely ignored or for return values that are stored but never used.
• Warnings are also flagged for return values that are used in a calculation before being tested in a control flow statement.
Any return value passed as an argument to a function before being tested is flagged, as well as any pointer return value that is dereferenced without being tested.
However, there are types of functions that lead the static analysis procedure to produce false positive warnings. Without prior knowledge, it is difficult to tell which functions do not need their return values checked. Mining techniques for the source code repository can assist with improving static analysis results. Specifically, the data mined from the source code repository and from the current version of the software are used to determine the actual usage pattern of each function.
In general terms, it has been observed that the bugs catalogued in bug databases and those found by inspecting source code change histories differ in type and level of abstraction. Software repositories record all the bugs fixed, at every step in the development process, and thus provide much useful information. Therefore, a bug finding system proves more effective when it automatically mines data from source code repositories.
that are involved in a potential bug fix in a CVS commit and those that are not. Next, within each group, the functions are ranked by how often their return values are tested before being used in the current version of the software.
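To make the ranking step concrete, the sketch below counts, for each function, the fraction of its call sites in which the return value is tested before use; the call-site records (function name, checked flag) are a hypothetical stand-in for what a static analyser would emit.

    from collections import defaultdict

    # Hypothetical call-site observations emitted by a static analyser:
    # (function_name, return_value_checked_before_use)
    call_sites = [
        ("malloc", True), ("malloc", True), ("malloc", False),
        ("printf", False), ("printf", False),
        ("open", True), ("open", True), ("open", True),
    ]

    totals, checked = defaultdict(int), defaultdict(int)
    for func, was_checked in call_sites:
        totals[func] += 1
        checked[func] += was_checked

    # Rank functions by how often their return values are tested before use.
    ranking = sorted(totals, key=lambda f: checked[f] / totals[f], reverse=True)
    for func in ranking:
        print(f"{func}: {checked[func]}/{totals[func]} call sites check the return value")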
The evaluation of software is based on tests that are designed by software testers. Thus, the evaluation of test outputs involves considerable effort by human testers, who often have imperfect knowledge of the requirements specification. This manual approach to testing software is costly and error-prone. The interest of researchers has therefore focused on the development of automated techniques that induce functional requirements from execution data. Data mining approaches can be used to extract useful information from the tested software which can assist with software testing. Specifically, the induced data mining models of the tested software can be used for recovering missing and incomplete specifications, designing a set of regression tests, and evaluating the correctness of software outputs when testing new releases of the system.
In developing a large system, the testing of the entire application (system testing) follows the stages of unit testing and integration testing. The activities of system testing include function testing, performance testing, acceptance testing and installation testing. Function testing aims to verify that the system performs its functions as specified in the requirements and that there are no undiscovered errors left. A test set is considered adequate if it causes all incorrect versions of the program to fail. The selection of tests and the evaluation of their outputs are therefore crucial for improving the quality of the tested software at a lower cost. Assuming that requirements can be re-stated as logical relationships between inputs and outputs, test cases can be generated automatically by techniques such as cause-effect graphs [Pfl01] and decision tables [LK03b]. In order to stay useful, a software system has to undergo continual change. The most common maintenance activities in the software life-cycle include bug fixes, minor modifications, improvements of basic functionality and the addition of brand new features.
The purpose of regression testing is to identify new faults that may have been introduced into the basic features as a result of enhancing software functionality or correcting existing faults. A regression test library is a set of test cases that are run automatically whenever a new version of the software is submitted for testing. Such a library should include a minimal number of tests that cover all possible aspects of system functionality. A standard way to design a regression test library is to identify equivalence classes for every input and then use only one value from each edge (boundary) of every class. One of the main problems is the generation of a minimal test suite which covers as many cases as possible. Ideally such a test suite would be generated from a complete and up-to-date specification of the functional requirements. However, frequent changes make the original requirements specifications hardly relevant to the new versions of the software. To ensure effective design of new regression test cases, one therefore has to recover the actual requirements of the existing system. Thus, a tester can analyse system specifications, perform structural analysis of the system's source code and observe the results of system execution in order to define input-output relationships in the tested software.
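As a small illustration of the equivalence class and boundary value idea described above, the sketch below derives candidate regression test inputs from assumed equivalence classes of a single numeric input; the class names and ranges are invented for the example.

    # Hypothetical equivalence classes for a numeric input "age", given as
    # inclusive (low, high) ranges; boundary values are taken from each edge.
    equivalence_classes = {"child": (0, 12), "teen": (13, 17), "adult": (18, 120)}

    test_inputs = set()
    for name, (low, high) in equivalence_classes.items():
        # One value per boundary of every class, plus just-outside values.
        test_inputs.update({low, high, low - 1, high + 1})

    print("candidate regression test inputs:", sorted(test_inputs))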
An approach that aims to automate the input-output analysis of execution data based on a data mining methodology is proposed in [LFK05]. This methodology relies on the info-fuzzy network (IFN), which has an 'oblivious' tree-like structure. The network components include the root node, a variable number of hidden layers (one layer for each selected input) and the target (output) layer representing the possible output values. The same input attribute is used across all nodes of a given layer (level), while each target node is associated with a value (class) in the domain of a target attribute. If the IFN model is aimed at predicting the values of a continuous target attribute, the target nodes represent disjoint intervals in the attribute's range.
A hidden layer l consists of nodes representing conjunctions of values of the first l input attributes, which is similar to the definition of an internal node in a standard decision tree. The final (terminal) nodes of the network represent non-redundant conjunctions of input values that produce distinct outputs. Considering that the network is induced from execution data of a software system, each interconnection between a terminal and a target node represents a possible output of a test case. Figure 47 presents an IFN structure where the internal nodes include the nodes (1,1), (1,2), 2, (3,1), (3,2), and the connection (1, 1) → 1 implies that the expected output value for a test case where both input variables are equal to 1 is also 1. The connectionist nature of IFN resembles the structure of a multi-layer neural network. Therefore, the IFN model is characterised as a network and not as a tree.
A separate info-fuzzy network is constructed to represent each output variable.
Thus we present below the algorithm for building an info-fuzzy network of a single
output variable.
Network Induction Algorithm. The induction procedure starts with defining the target layer (one node for each target interval or class) and the “root” node. The root node represents an empty set of input attributes; attributes are then selected incrementally to maximise a global decrease in the conditional entropy of the target attribute. Unlike algorithms for building decision trees such as CART and C4.5, the IFN algorithm is based on the pre-pruning approach: when no attribute causes a statistically significant decrease in the entropy, the network construction is stopped. The algorithm performs discretisation of continuous input attributes “on-the-fly” by recursively finding a binary partition of an input attribute that minimises the conditional entropy of the target attribute [FI93]. The search for the best partition of an attribute is dynamic and is performed each time a candidate input attribute is evaluated. Each hidden node in the network is associated with an interval of a discretised input attribute. The estimated conditional mutual information between the partition of the interval S at the threshold Th and the target attribute T given the node z is defined as follows:
MI(Th; T/S, z) = \sum_{t=0}^{M_T - 1} \; \sum_{y=1,2} P(S_y; C_t; z) \cdot \log \frac{P(S_y; C_t / S, z)}{P(S_y / S, z) \cdot P(C_t / S, z)}
where
• Specification of Application Inputs and Outputs (SAIO). Basic data on each in-
put and output variable in the Legacy System.
• Test bed (TB). This module feeds training cases generated by the RTG module
to the LS.
The IFN algorithm is trained on inputs provided by the RTG and on outputs obtained from the legacy system by means of the Test Bed module. A separate IFN model is built for each output variable. The information derived from each IFN model can be summarised as follows:
• Logical (if... then ...) rules expressing the relationships between the selected
input attributes and the corresponding output. The set of rules appearing at
each terminal node represents the distribution of output values at that node.
• A set of test cases. The terminal nodes in the network are converted into test
cases, each representing a non-redundant conjunction of input values / equiva-
lence classes and the corresponding distribution of output values.
The IFN algorithm takes as input the training cases that are randomly generated by the RTG module and the outputs produced by the LS for each test case. The IFN algorithm runs repeatedly to find a subset of input variables relevant to each output and the corresponding set of non-redundant test cases. Actual test cases are generated from the automatically detected equivalence classes by using an existing testing policy.
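The "on-the-fly" discretisation step described above searches for the binary threshold of a continuous input that minimises the conditional entropy of the target. A rough sketch of that threshold search (not the full IFN induction, and with invented execution records) follows.

    import math
    from collections import Counter

    # Hypothetical execution records: (continuous input value, target class).
    records = [(0.1, "ok"), (0.4, "ok"), (0.6, "fail"), (0.9, "fail"), (0.3, "ok")]

    def conditional_entropy(split_value):
        """H(target | input <= split_value), weighted over the two partitions."""
        total, entropy = len(records), 0.0
        for side in (True, False):
            part = [t for x, t in records if (x <= split_value) == side]
            if not part:
                continue
            counts = Counter(part)
            h = -sum(c / len(part) * math.log2(c / len(part)) for c in counts.values())
            entropy += len(part) / total * h
        return entropy

    candidates = sorted({x for x, _ in records})[:-1]
    best = min(candidates, key=conditional_entropy)
    print(f"best threshold: {best}  (conditional entropy {conditional_entropy(best):.3f})")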
Text mining is the process of extracting knowledge and patterns from unstructured document text. It is a young interdisciplinary research field under the wider area of data mining, drawing on information retrieval, machine learning and computational linguistics. The methods deployed in text mining, depending on the application, usually require the transformation of the texts into an intermediate structured representation, for example the storage of the texts in a database management system according to a specific schema. In many approaches, though, there is also benefit in keeping a semi-structured intermediate form of the texts, for example a graph representation of the documents, on which social analysis and graph techniques can be applied.
Independently of the task objective, text mining requires preprocessing techniques, usually involving qualitative and quantitative analysis of the documents' features. The diagram in Figure 48 depicts the most important phases of the preprocessing analysis, as well as the most important text mining techniques.
Preprocessing assumes a preselected document representation model, usually the vector space model, though the boolean and the probabilistic models are other options. According to the representation model, documents are parsed, and text terms are weighted according to weighting schemes such as TF-IDF (Term Frequency - Inverse Document Frequency), which is based on the frequency of occurrence of terms in the text. Several other options are described in [Cha02, BYRN99]. Natural language processing techniques are also applied, the state of the art of which is well described in [Mit05, MS99]. Often, stop-word removal and stemming are applied.
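A brief sketch of TF-IDF weighting on a toy corpus is given below, using scikit-learn's vectoriser; the three example documents are invented and merely stand in for, e.g., commit messages or mailing list posts.

    from sklearn.feature_extraction.text import TfidfVectorizer

    documents = [
        "fix memory leak in parser",
        "add unit tests for parser module",
        "update documentation for release",
    ]

    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(documents)

    # Print the TF-IDF weight of each term in the first document.
    terms = vectorizer.get_feature_names_out()
    for idx in tfidf[0].nonzero()[1]:
        print(f"{terms[idx]}: {tfidf[0, idx]:.3f}")

Terms that occur in many documents (here "parser") receive lower weights than terms that are distinctive for a single document.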
In favour of the use of natural language processing techniques in text mining, it has been shown in the past that the use of semantic linguistic features, mainly derived from a language knowledge base such as the WordNet word thesaurus [Fel98], can help text retrieval [Voo93] and text classification [MTV+05]. Furthermore, the use of word sense disambiguation (WSD) techniques [IV98] is important in several natural language processing and text mining tasks, like machine translation, speech processing and information retrieval. Lately, state-of-the-art approaches in unsupervised WSD [TVA07, MTF04] have pointed the way towards the use of semantic networks generated from texts, enhanced with semantic information derived from word thesauri. These approaches are to be applied to the text retrieval task.
Figure 48: The most important phases of preprocessing (storage and indexing, feature extraction, term weighting, dimensionality reduction, and natural language processing over structured and semi-structured document representations) and the main text mining techniques (clustering, classification, retrieval, social analysis).
Text mining techniques are frequently used in many applications. For example, clustering has already been used in information retrieval and is applied in popular web search engines, such as Vivisimo 37. Text classification is widely used in spam filtering. Text retrieval is a core task with an unrestricted range of applications, varying from search engines to desktop search. Social analysis can be applied whenever any type of link between documents is available, for example publications and references, or posts in forums and their replies, and is widely used for authority and hub detection (i.e. finding the most important people in the graph). Finally, domain ontology evolution is a task where, through the use of other text mining techniques such as clustering or classification, an ontology describing a specific domain can be evolved and enhanced with term features of new documents pertaining to the domain. This is really important in cases where the respective domain evolves fast, prohibiting the manual update of the ontology with new concepts and instances.
Method | Text Input Source | Text Mining Technique | Output
[BGD+06] | E-mail archives of OSS software | Entity Resolution, Social Network Analysis | Weighting of OSS participants; relationship of e-mail activity and commit activity
[VT06] | CVS repositories | Text Clustering | Patterns in the development of large software projects (history analysis, major contributions)
[CC05] | CVS commit notes, set of fixed bugs | Text Retrieval | Similarity between new bug reports and source code files; prediction
[WK05] | CVS repositories, source code | Text analysis, retrieval, classification | Predictions of source code bugs
[JS04] | OSSD Web repositories (Web pages, mailing lists, process entity taxonomy) | Text Extraction, Entity Resolution, Social Network Analysis | Transformation of data into process events; ordering of processing events
[GM03] | Mailing lists, CVS logs, Change Log files | Text Summarization and Validation | Statistical measures for code changes and developers
source file, they used the set of fixed bugs data and the respective CVS commit notes as descriptors. With the use of a probabilistic text retrieval model they measure the similarity between the descriptors of each source file and the new bug description. In this way they predict which parts of the code are likely to be affected by future bug fixing. The same method could also have been viewed from a supervised learning perspective, and classification along with predictive modelling techniques would have been a good baseline for their predictions.
Following the same goal, in [WH05] the authors mined CVS repositories to obtain categories of bug fixes. Using a static analysis tool, they inspected every source code change in the software repository and predicted whether a potential bug in the code had been fixed. These predictions are then ranked using the contemporary context information in the source code (i.e. checking the percentage of the invocations of a particular function whose return value is tested before being used). The whole mining procedure is based on text analysis of the CVS commit changes. They conducted experiments on the Apache web server source code and the Wine source code, in which they showed that the data mined from the software repositories produced good precision, clearly better than a naive baseline technique.
From another perspective, text mining has been used in software engineering to validate the data from mailing lists, CVS logs, and change log files of Open Source software. In [GM03a] the authors created a set of tools, namely SoftChange38, that implements data validation for the aforementioned text sources of Open Source software. Their tools retrieve, summarise and validate these types of data for Open Source projects. Part of their analysis can mark out the most active developers of an Open Source project. The statistics and knowledge gathered by the SoftChange analysis have not been fully exploited, though, since further predictive methods could be applied with regard to fragments of code that may change in the future, or associative analysis between the changes' importance and the individuals (i.e. were all the changes committed by the most active developer as important as the rest, in scale and in practice?).
Text mining has also been applied in software engineering for discovering development processes. Software processes are composed of events such as relations of agents, tools, resources, and activities organised by control flow structures dictating that sets of events execute serially, in parallel, iteratively, or that one of the set is selectively performed. Software process discovery takes as input artifacts of development (e.g. source code, communication transcripts, etc.) and aims to elicit the sequence of events characterising the tasks that led to their development. In [JS04] an innovative method for discovering software processes from open source software Web repositories is presented. Their method combines text extraction techniques, entity resolution and social network analysis, and it is based on process entity taxonomies for entity resolution. Automatic means of evolving the taxonomy using text mining tasks could have been employed, so that the method would not depend strictly on the taxonomy's actions, tools, resources and agents. An example could be text clustering on the open software text resources and the extraction of new candidate items for the taxonomy arising from the clusters' labels.
Text clustering has also been used in software engineering to discover patterns in the history and the development process of large software projects. In [VT06] the authors used CVSgrab to analyse the ArgoUML and PostgreSQL repositories. By clustering the related resources, they visualised the evolution of the projects based on the clustered file types. Useful conclusions can be drawn by careful manual analysis of the generated visualised project development histories. For example, they discovered that in both projects there was only one author for each major initial contribution. Furthermore, they came to the conclusion that PostgreSQL did not start from scratch, but was built on top of some previous project. An interesting evolution of this work could be a more automated way of drawing conclusions from the development history, for example extracting cluster labels, mapping them to a taxonomy of development processes and automatically extracting the development phases with comments emerging from taxonomy concepts.
38 Publicly available at http://sourcechange.sourceforge.net/
• Social network analysis, for the purpose of discovering the important cluster of individuals in a software project, using more sophisticated graph processing techniques such as PageRank or Spreading Activation. Social network analysis is in fact a set of long-established algorithms that have been applied in other contexts; the 'future direction' is to extend and apply them in the context of SQO-OSS, aiming at ranking relevant entities appearing in software development.
• Text clustering of the bug reports, together with cluster labelling, can be used to automatically create a taxonomy of bugs in the software. Metrics on that taxonomy can be defined to show the influence of bugs belonging to one category on other categories. This can also be read as a metric of bug influence across the software project.
We can also consider graphs created from the existing OSS software and the communication data. This implies a graph G(V, E), where each node in V represents a user and each edge in E represents a communication event, e.g. an email exchange. By applying mining techniques to this graph we can extract useful information, predict individual actions (i.e. what and when the next action of a user will be) and calculate aggregate measures regarding software quality.
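A minimal sketch of this direction, assuming the communication graph has already been extracted: participants are ranked with PageRank over the email exchange graph. The edges below are invented, and PageRank is only one of the ranking techniques mentioned above.

    import networkx as nx

    # Hypothetical communication graph: an edge u -> v means u sent an email to v.
    G = nx.DiGraph()
    G.add_edges_from([
        ("alice", "bob"), ("bob", "alice"), ("carol", "alice"),
        ("dave", "alice"), ("dave", "bob"), ("carol", "bob"),
    ])

    # Rank participants by PageRank; higher scores suggest more central contributors.
    scores = nx.pagerank(G, alpha=0.85)
    for user, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{user}: {score:.3f}")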
5.1 CALIBRE
CALIBRE was an EU FP6 Co-ordination Action project that involved the leading au-
thorities on libre/Open Source software. CALIBRE brought together an interdis-
ciplinary consortium of 12 academic and industrial research teams from France,
Ireland, Italy, the Netherlands, Poland, Spain, Sweden, the UK and China.
The two-year project managed to:
• Foster the effective transfer of Open Source best practice to European industry
• Integrate and coordinate European Open Source software research and prac-
tice
CALIBRE aimed to coordinate the study of the characteristics of open source soft-
ware projects, products and processes; distributed development; and agile meth-
ods. This project integrated and coordinated these research activities to address
key objectives for open platforms, such as transferring lessons derived from open
source software development to conventional development and agile methods, and
vice versa.
CALIBRE also examined hybrid models and best practices to enable innovative
reorganisation of both SMEs and large institutions, and aimed to construct a com-
prehensive research road-map to guide future Open Source software research. To
secure long-term impact, an important goal of CALIBRE was to establish a Euro-
pean Open Source Industry Forum, CALIBRATION, to coordinate policy making into
the future. The CALIBRATION Forum and the results of the CALIBRE project were
disseminated through a series of workshops and international conferences in the
various partner countries.
The first public deliverable of CALIBRE was to present an initial gap-analysis of
the academic body of knowledge of Libre Software, as represented by 155 peer-
reviewed research artifacts. The purpose of this work was to support the wider
CALIBRE project goal of articulating a road map for Libre Software research and
exploitation in a European context. For the gap-analysis, a representative collection
of 155 peer-reviewed Libre Software research artifacts was examined and it was
attempted to answer three broad questions about each:
• Libre software projects can be categorised, first, between small (I-Mode) and large (C-Mode) projects in the context of an entrepreneurial analysis of Libre software, and second, thanks to a dynamic and open meta-maintenance forum which would provide a standard quality assessment model to all software-enabled industries, and especially to the secondary software sector.
Agile Methods (AMs) and Libre Software push for a less formal and hierarchical, and more human-centric, development approach, with a major emphasis on the ultimate goal of development: producing a running system with the right amount of functionality. This deliverable presented an attempt to deepen the understanding of the analogies between the two approaches and to identify how such analogies may help in gaining a deeper understanding of both. The relationships were analysed theoretically and experimentally, with a final, concrete case study of a company adopting both the XP development process and Libre Software tools.
Other deliverables of CALIBRE reported on the groundwork for future research
within the CALIBRE project, leading towards the overall project goal of articulating
a road-map for Libre Software in the European context. The research was shaped by
the concerns expressed by the CALIBRE industry partners in the various CALIBRE
events to-date. Specifically, industry partners, notably Paul Everett of the Zope Eu-
rope Association (ZEA) have identified that the primary challenge for Libre software
businesses was effectively delivering the whole product in a manner that takes ac-
count of, and in fact leverages, the unique business model dynamics associated with
Libre software licensing and processes. The document described a framework for
analysing Libre software business models, an initial taxonomy of model categories,
and a discussion of organisational and network agility based on ongoing research
within the ZEA membership.
Another deliverable of the CALIBRE project presented a selection of product and process metrics defined in various suites, frameworks and categorisations to date. Each metric was analysed for citations and applications to both agile and Libre development approaches. Opportunities for migration and knowledge transfer between these areas were stressed and outlined. The document also summarised the product maturity models available for Open Source software and emphasised the need for alternative approaches to shaping Open Source process maturity models.
The CALIBRE project also produced the CALIBRE Working Environment (CWE); a deliverable described its first version. The requirements for the system were described, and the way in which the CWE addresses these requirements was identified. The CWE requirements were identified collaboratively, in consultation with its users, and the system as it stands largely meets the needs of the users. The software and hardware used to implement the CWE were described, and areas for further work were identified. The current CWE is located at http://hemswell.lincoln.ac.uk/calibre/ and allows registered members to prepare content with varying levels of dissemination (public, restricted to registered members, and private), upload documents and files, add events to a shared calendar and archive mailing list information.
The last publicly available deliverable of CALIBRE focused on education and training on Libre (Free, Open Source) software. In this report, a scenario which could be considered the second generation in Libre software training was presented: a compendium of the knowledge and experience needed to deal with the many facets of the Libre software phenomenon. For this goal, higher education was considered the best possible framework, and the main guidelines of such a programme on Libre software were proposed. In summary, the studies designed in this report were aimed at providing students with the knowledge and expertise that would make them experts in Libre software. The programme provided capabilities and enhanced skills to the point that students could deal with problems ranging from the legal or economic areas to the more technically oriented ones. It did not (intentionally) focus on a set of technologies, but approached the Libre software phenomenon from a holistic point of view. However, it was also designed to provide practical and real-world knowledge. It could be offered jointly by several universities across Europe, within the framework of the ESHE, or adapted to the specific needs of a single one. In addition, it could also be adapted for non-formal training.
5.2 EDOS
EDOS stands for Environment for the Development and Distribution of Open Source software. It is a research project funded by the European Commission as a STREP project under the IST activities of the 6th Framework Programme. The project involves universities (Paris 7, Tel Aviv, Zurich and Geneva), research institutes (INRIA) and private companies (Caixa Magica, Nexedi, Nuxeo, Edge-IT and CSP Torino).
The project aims to study and solve problems associated with the production, management and distribution of Open Source software packages. Software packages are files in the RPM or Debian packaging format that contain executable programs or libraries and their files, along with metadata describing what is in the package and what conditions are needed to use it.
There are several problems associated with software packages.
The focus was mainly on the issues related to dependency management for large sets of software packages, with particular attention to what must be done to maintain the consistency of a software distribution on the repository side, as opposed to maintaining a set of packages on a client machine. This choice is justified by the fact that maintaining the consistency of a distribution of software packages is essential to making sure that current distributions will scale up, yet it is also an invisible task: the smooth working it ensures on the end-user side tends to be considered as normal and obvious as the smooth working of routing on the Internet. In other words, the project was tackling an essential infrastructure problem, which is perfectly suited to a European Community funded action. Over the first year and a half of its existence, the Work Package 2 team of the EDOS project has done an extensive analysis of the whole set of problems in its focus, ranging from upstream tracking to thinning, rebuilding, and dependency management for F/OSS distributions.
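As a rough illustration of repository-side consistency (not the formal methods used by EDOS), the sketch below checks whether every package in a hypothetical, greatly simplified repository has all of its declared dependencies present, ignoring version constraints and conflicts that a real dependency solver would have to handle.

    # Hypothetical, simplified repository: package name -> set of dependencies.
    repository = {
        "editor": {"libgui", "libc"},
        "libgui": {"libc"},
        "libc": set(),
        "game": {"libgui", "libsound"},   # libsound is missing from the repository
    }

    def missing_dependencies(repo):
        """Report, per package, declared dependencies not present in the repository."""
        available = set(repo)
        return {pkg: deps - available for pkg, deps in repo.items() if deps - available}

    problems = missing_dependencies(repository)
    if problems:
        for pkg, missing in problems.items():
            print(f"{pkg}: missing {', '.join(sorted(missing))}")
    else:
        print("repository is dependency-consistent")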
• Metrics: Following the “release early, release often” philosophy, Free and Open Source software is in constant development, and any serious project has many versions floating around: older but stable versions, and newer versions with new features but more bugs. Free software can be of wildly varying quality. Quality metrics are defined, their relevance is assessed and they are implemented. Work Package 5 handles these issues.
The goal of Work Package 5 is to develop technology and products that will improve the efficiency of two key processes and one system. The two processes are the generation of a new version of a distribution from the previous version and the production of a customised distribution from an existing one. The system is the current inefficient mechanism of mirroring the Cooker data, which needs to be replaced by a more efficient one. In the end, a demonstration that the processes have indeed been improved and the system replaced will take place. Thus, the goal is to define a set of metrics to measure the efficiency of the processes in question. These metrics will include man power, as measured in man-months, and elapsed time.
The EDOS project attempts to solve these problems by using formal methods, coming from the academic research groups in the project, to address three outstanding problems in a novel way:
• The efficient distribution of large software systems, using peer-to-peer and distributed database technology.
These problems were studied and various technical reports were produced explaining their importance and giving ways of expressing them mathematically, algorithms for solving associated problems, and real-world statistics. A certain amount of software was also produced, which is, of course, Free and Open Source:
• The day-to-day evolution of the Debian packages, that is, their detailed history, can be browsed using anla. This also gives, for every day, reports on installable software packages and a global installability index (the Debian weather).
• Ara is a search engine for Debian packages that allows arbitrary boolean combinations of field-limited regular expressions, and that ranks results by popularity (again in OCaml).
5.3 FLOSSMETRICS
FLOSSMetrics stands for Free/Libre Open Source Software Metrics.
Industry, SMEs, public administrations and individuals are increasingly relying
on Libre (Free, Open Source) software as a competitive advantage in the globalis-
ing, service-oriented software economy. But they need detailed, reliable and com-
plete information about Libre software, specifically about its development process,
its productivity and the quality of its results. They need to know how to benchmark
individual projects against the general level. And they need to know how to learn
from, and adapt, the methods of collaborative, distributed, agile development found
in Libre software to their own development processes, especially within industry.
FLOSSMETRICS addresses those needs by analysing a large quantity (thousands)
of Libre software projects, using already proven techniques and tools. This analy-
sis will provide detailed quantitative data about the development process, develop-
ment actors, and developed artifacts of those projects, their evolution over time, and
benchmarking parameters to compare projects. Several aspects of Libre software
development (software evolution, human resources coordination, effort estimation,
productivity, quality, etc.) will be studied in detail. The main objective of FLOSSMETRICS is to construct, publish and analyse a large-scale database with information and metrics about Libre software development coming from several thousand software projects, using existing methodologies and tools already developed. The
project will also provide a public platform for validation and industrial exploitation
of results.
The FLOSSMetrics targets are to:
• Integrate already available tools to extract and process such data into a com-
plete platform (WP2).
The main results of FLOSSMETRICS will be: a huge database with factual details
about all the studied projects; some higher level analysis and studies which will help
to understand how Libre software is actually developed; and a sustainable platform
for continued, publicly available benchmarking and analysis beyond the lifetime of
this project. With these results, European industry, SMEs, as well as public adminis-
trations and individuals will be able to take informed decisions about how to benefit
from the competitive advantage of Libre software, either as a development process
or in the evaluation and choosing of individual software applications. The project
methodologies and findings go well beyond Libre software with implications for evo-
lution, productivity and development processes in software and services in general.
FLOSSMETRICS is scheduled in three main phases (running partially in parallel).
The first one will set up the infrastructure for the project, and the first version of the
database with factual data. During the second phase most of the studies and analysis
will be performed, and the contents of the database will be enlarged and improved.
During the third phase the results of the project will be validated and adapted to the
needs of the target communities.
The usability of the results of the project (datasets and studies) will be targeted
to several different users: SMEs developing or using Libre software (or even in-
terested in it), industrial players developing Libre software, and the Libre software
community at large. Based on the feedback obtained in these contexts, a complete
exploitation strategy will also be designed.
Dissemination to these communities will be performed using the project website,
specific presentations at conferences, and by organising a series of workshops. Wide
impact of the results will be supported by using open access licenses for all output
documents.
The data is also expected to be useful for the scientific community, which could
use it for their research lines, thus helping to improve the general understanding of
Libre software development.
The impact of the project is expected to be large in the Libre software develop-
ment realm (and in the whole software development landscape). FLOSSMETRICS
will produce the most complete and detailed view of the current landscape of Libre
software, providing not only a static snapshot of how projects are performing now,
but also historical information about the last ten years of Libre software develop-
ment.
5.4 FLOSSWORLD
FLOSSWorld stands for Free, Libre and Open Source Software - Worldwide Impact Study. The FLOSSWorld project aims to strengthen Europe's leadership in research into FLOSS and open standards, building a global constituency with partners from Argentina, Brazil, Bulgaria, China, Croatia, India, Malaysia and South Africa. FLOSSWORLD is a European Union funded project involving 17 institutions from 12 countries spanning Europe, Africa, Latin America and Asia, undertaking a worldwide study of the impact of selected issues in the context of Free/Libre Open Source Software (FLOSS).
The problem. Empirical data on the impact of FLOSS, its use and development are still quite limited. The FP5 FLOSS project and the FP6 FLOSSPOLS project have helped fill in the gaps in knowledge about why and how FLOSS is developed and used, but were necessarily focused on Europe. FLOSS is a global phenomenon, particularly relevant in developing countries, and thus more knowledge on FLOSS outside Europe is needed.
to sustain a focus. FLOSSWorld will perform three global empirical studies of proven relevance to Europe and third countries, which will provide a foundation for FLOSSWorld's regional and international workshops. The studies will cover topics such as the impact of participation in a FLOSS community on career growth and prospects, motivational factors in the choice of FLOSS, user community perspectives on FLOSS, and inter-regional differences in FLOSS development methodology.
Goals of workshops. During the workshops all consortium partners (17 in all) are brought together with additional participants from their countries and observers from the organisations that provided letters of support to the FLOSSWorld project. Workshop participants are experts representing the interests of the Open Source community, government, business, researchers and higher education institutes, as appropriate for the workshop questions. Some participants will take a more active role as specific questions are addressed, but in principle all three research tracks will be treated in each workshop.
1. Private sector
2. Government sector
5.5 PYPY
The PyPy project has been an ongoing Open Source Python language implementation
since 2003. In December 2004 PyPy received EU-funding within the Framework
Programme 6, second call for proposals ("Open development platforms and services"
IST).
PyPy is an implementation of the Python programming language written in Python
itself, flexible and easy to experiment with. The long-term goals of this project are
to target a large variety of platforms, small and large, by providing a compiler tool
suite that can produce custom Python versions. Platform, memory and threading
models are to become aspects of the translation process - as opposed to encoding
low level details into the language implementation itself. Eventually, dynamic opti-
misation techniques - implemented as another translation aspect - should become
robust against language changes.
A consortium of 8 (12) partners in Germany, France and Sweden is working to achieve the goal of an open run-time environment for the Open Source programming language Python. The scientific aspect of the project is to investigate novel techniques (based on aspect-oriented programming, code generation and abstract interpretation) for the implementation of practical dynamic languages.
A methodological goal of the project is also to showcase a novel software engineering process, Sprint Driven Development. This is an agile methodology providing a dynamic and adaptive environment, suitable for co-operative and distributed development.
The project is divided into three major phases: phase 1 focuses on developing the actual research tool, the self-contained compiler; phase 2 focuses on optimisations (core, translation and dynamic); and phase 3 covers the actual integration of efforts and dissemination of the results. The project has an expected deadline in November 2006.
PyPy is still, though EU-funded, heavily integrated in the Open Source community
of Python. The methodology of choice is the key strategy to make sure that the com-
munity of skilled and enthusiastic developers can contribute in ways that wouldn’t
have been possible without EU-funding.
5.6 QUALIPSO
Goals. The Integrated Project QualiPSo aims to make a major contribution to the state of the art and practice of Open Source Software. The goal of the QualiPSo integrated project is:
The need to sustain and advance the QualiPSo solutions in the future requires an
open sustainability approach. QualiPSo is open in the following ways:
• its use of open standards and the Open Source software development approach
• Legal Issues: This activity addresses the need for a clear legal context in which
OSS will be able to evolve within the European Union.
• Business Models: This activity addresses the need to incorporate new software
development models that can cope with the OSS peculiarities.
• Interoperability: This activity addresses the needs of the software industry for
standards based interoperable software.
• Trustworthy Results: This activity addresses the need for the definition of
clearly identified and tested quality factors in OSS products.
• Trustworthy Processes: This activity addresses the need for the definition of an
OSS-aware standard software development methodology.
• Project activities. The project activities are cross-cutting activities that take
the results generated by the problem activities, integrate them in a coherent
framework and assess and improve their applicability using the selected ap-
plication scenarios. Project activities also include all issues related to indus-
trialisation, dissemination, standardisation, and exploitation of the resulting
framework. These activities are the following:
• QualiPSo Factory: This activity integrates the results achieved in the prototyp-
ing phase of the problem activities to create the QualiPSo environment.
• QualiPSo Competence Centre: this activity aims to develop the means for con-
tinuous and sustainable (beyond the scope of the project) centralisation of ref-
erence information concerning quality OSS development.
• Promotion and support: This activity aims to develop awareness of the QualiPSo results within the global OSS community.
• Demonstration
• Training: This activity will focus on providing training services, both in the
classroom and over the Internet, in order to evangelise the results of QualiPSo.
Coordination To achieve its ambitious goal, QualiPSo will pursue the following
objectives:
• Define methods, development processes, and business models for the
implementation and deployment of Open Source Software systems, to assure
intensive software consumers that Open Source projects conform to the standards
required to provide industry-level software.
• Design and implement a specific environment where different tools are inte-
grated to facilitate and support the development of viable industrial OSS sys-
tems. This environment will include a secure collaborative platform able to
• Understand the legal conditions by which OSS products are protected and
recognised, without violating the OSS spirit.
5.7 QUALOSS
The strategic objective of this project is to enhance the competitive position of
the European software industry by providing methodologies and tools for improving
its productivity and the quality of its software products.
To achieve this goal, the proposal aims to build a high-level methodology to
benchmark the quality of Open Source software, in order to ease the strategic
decision of integrating adequate F/OSS components into software systems. The
results of the QUALOSS project directly address strategic objective 2.5.5 of
providing methodologies to use Open Source software in industrial development, to
enable its benchmarking, and to support its development and evolution.
The two main outcomes of the QUALOSS project achieve the strategic objectives by
delivering an assessment methodology for gauging the evolvability and robustness of
Open Source software, and a tool that largely automates the application of the
methodology. Unlike current assessment techniques, the QUALOSS approach combines
data about software products (source code, documentation, etc.) with data about the
developer community.
5.8 SELF
SELF will be a web-based, multi-language, free content knowledge base written
collaboratively by experts and interested users. The SELF Platform aims to be the
central platform for high-quality educational and training materials about Free
Software and Open Standards. It is based on world-class Free Software technologies
that permit both reading and publishing free materials, and is driven by a worldwide
community.
The SELF Platform is a repository with free educational and training materials
on Free Software and Open Standards and an environment for the collaborative
creation of new materials. Inspired by Wikipedia, the SELF Platform provides the
materials in different languages and forms. The SELF Platform is also an instrument
for evaluation, adaptation, creation and translation of these materials. Most impor-
tantly, the SELF Platform is a tool to unite community and professional efforts for
public benefit.
The general strategic objectives of the SELF project are:
• Centralise, transmit and enlarge the available knowledge on Free Software and
Open Standards by creating a platform for the development, distribution and
use of information, educational and training programmes about Free Software
and its main applications.
• Raise awareness and contribute to the building of critical mass for the use of
Free Software and Open Standards.
• Research the state of the art of currently available Free Software educational
and training programmes and detect the potential gaps.
• Create an open platform for the development, distribution and use of informa-
tion, educational and training programmes on Free Software and Open Stan-
dards.
• Develop educational and training materials concerning Free Software and Open
Standards. The project aims to include information on at least 50 software
applications in the initial period.
While the SELF platform will be started by the members of the consortium, its final
goal is to become a community of different interested parties (from governments and
educational institutes to companies) that can not only exploit the SELF materials
but also participate in their production. The commercial and educational interest
in exploiting the SELF materials will assure the self-sustainable character of the
SELF Platform beyond the EC funding period. The SELF Project aims to involve at
least 150 members in the SELF community by the end of the project.
This project starts from three main assumptions:
1. Free Software and Open Standards are crucial to support the competitive po-
sition of the European software industry.
2. The real and long-term technological change from proprietary to Free Software
can only come about by investing in education and training.
That is why the SELF platform will have two main functions. It will be simulta-
neously a knowledge base and a collaborative production facility. On the one hand,
it will provide information, educational and training materials that can be presented
in different languages and forms: from course texts, presentations, e-learning pro-
grammes and platforms to tutor software, e-books, instructional and educational
videos and manuals. On the other hand, it will offer a platform for the evaluation,
adaptation, creation and translation of these materials. The production process of
such materials will be based on the organisational model of Wikipedia. In short,
SELF will be a web-based, multi-language, free content knowledge base written col-
laboratively by experts and interested users.
5.9 TOSSAD
Europe, as a whole, has a stake in improving the usage of F/OSS in all branches of
IT and public life, in general. F/OSS communities throughout Europe can achieve
better results through co-ordination of their research activities/programmes that
reflect the current state-of-the-art.
The main objective of the tOSSad project is to start integrating and exploiting
already formed methodologies, strategies, skills and technologies in the F/OSS
domain, in order to help governmental bodies, educational institutions and SMEs to
share research results, establish synergies, build partnerships and innovate in an
enlarged Europe.
More precisely, the tOSSad project aims at improving the outcomes of the F/OSS
communities throughout Europe by supporting the coordination and networking of
these communities by means of state-of-the-art studies, national programme
initiations, usability cases, curriculum development, and the implementation of a
collaborative information portal and web-based groupware.
The main tOSSad coordination activities are:
Work package 1 will produce a report detailing both the current status of F/OSS
adoption in European countries and the barriers that future adoption might face.
The main goal is to give a clear picture of the current status (usage,
implementation, adoption, penetration, government policies, etc.) of F/OSS related
to the following topics:
• Educational weakness
• Cultural readiness
• F/OSS training and certification solutions for IT people, developers and users
making use of existing or new training institutions
• Building partnerships within the public and private sectors and civil society, as
well as regionally within Europe.
• Preparing not only high-level case histories, but also all the details needed to
copy and implement F/OSS solutions locally.
The major objectives of Work package 3, the usability work package of tOSSad, are
to tackle the obstacles to usability in F/OSS and to lead to a breakthrough, by
ensuring that usability receives more attention in F/OSS in the future. To reach
these objectives, besides the intensive spreading of awareness, the following three
major areas will be addressed within Work package 3:
• State of the art of usability, based on both in-depth desk research and an
empirical survey in F/OSS. If appropriate, the survey will be integrated into the
empirical investigations conducted in Work package 1.
• Usability testing of selected F/OSS components, with a specific focus on desktop
applications, personal information management (PIM) and office applications.
• Detection of F/OSS usability gaps, based on the test results and on research into
tomorrow's usability requirements (mobile end devices, voice interaction,
wearables). From these gaps, recommendations for future research directions will be
derived.
• A guideline taking into account both attention to usability aspects during F/OSS
development and the conduct of usability testing. The focus will be on recurrent
user involvement for usability assurance during shared development, via mock-ups
included in the F/OSS development environment.
• Courses and curricula about using the most popular F/OSS desktop applications
(F/OSS office automation software, mail applications, web browsers, wikis, etc.),
even on proprietary operating systems.
• Courses and curricula about F/OSS server applications and management: the Linux
operating system, application servers (Tomcat), web servers (Apache), databases,
middleware, and related system applications.
• Courses and curricula about F/OSS software development tools: IDEs (Eclipse),
versioning systems, and related tools.
• Courses and curricula about how to develop and take advantage of F/OSS software,
and about the software engineering of F/OSS. These are related to ongoing research
on methodologies and tools for F/OSS development, and aim to train software
developers able to build, customise and consult on F/OSS applications as active
members of the F/OSS development community.