
Introduction

[Software Testing is the process of executing a program or system with the intent of finding
errors. [Myers79] Or, it involves any activity aimed at evaluating an attribute or capability of a
program or system and determining that it meets its required results.] [Hetzel88] Software is not
unlike other physical processes where inputs are received and outputs are produced. Where
software differs is in the manner in which it fails. Most physical systems fail in a fixed (and
reasonably small) set of ways. By contrast, software can fail in many bizarre ways. Detecting all
of the different failure modes for software is generally infeasible. [Rstcorp]

Unlike most physical systems, most of the defects in software are design errors, not
manufacturing defects. Software does not suffer from corrosion or wear and tear; generally it will
not change until it is upgraded or becomes obsolete. So once the software is shipped, the design
defects -- or bugs -- will remain buried and latent until they are activated.

Software bugs will almost always exist in any software module of moderate size: not because
programmers are careless or irresponsible, but because the complexity of software is generally
intractable -- and humans have only limited ability to manage complexity. It is also true that for
any complex system, design defects can never be completely ruled out.

Discovering the design defects in software is equally difficult, for the same reason: complexity.
Because software and other digital systems are not continuous, testing boundary values alone is
not sufficient to guarantee correctness. All the possible values would need to be tested and
verified, but complete testing is infeasible. Exhaustively testing a simple program that adds just
two 32-bit integer inputs (yielding 2^64 distinct test cases) would take hundreds of millions of
years, even if tests were performed at a rate of thousands per second. Obviously, for a realistic
software module, the complexity can be far beyond the example mentioned here. If inputs from the
real world are involved, the problem gets worse, because timing, unpredictable environmental
effects, and human interactions are all possible input parameters under consideration.
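
As a rough sanity check on the figures above, here is a small back-of-the-envelope calculation in Python; the testing rate of one thousand tests per second is an assumed figure for illustration only.

    # Back-of-the-envelope estimate for exhaustively testing a function
    # that adds two 32-bit integers. The test rate is an assumed figure.

    total_cases = 2 ** 64              # every combination of two 32-bit inputs
    tests_per_second = 1_000           # assumed testing rate
    seconds_per_year = 365 * 24 * 3600

    years = total_cases / (tests_per_second * seconds_per_year)
    print(f"{total_cases:.3e} cases -> about {years:.1e} years")
    # roughly 5.8e8 years, i.e. hundreds of millions of years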

A further complication has to do with the dynamic nature of programs. If a failure occurs during
preliminary testing and the code is changed, the software may now work for a test case that it
didn't work for previously. But its behavior on pre-error test cases that it passed before can no
longer be guaranteed. To account for this possibility, testing should be restarted. The expense of
doing this is often prohibitive. [Rstcorp]

An interesting analogy parallels the difficulty in software testing with pesticides, in what is
known as the Pesticide Paradox [Beizer90]: every method you use to prevent or find bugs leaves a
residue of subtler bugs against which those methods are ineffectual. But this alone will not
guarantee to make the software better, because the Complexity Barrier [Beizer90] principle states:
software complexity (and therefore that of bugs) grows to the limits of our ability to manage that
complexity. By eliminating the (previous) easy bugs you allow another escalation of features
and complexity, but this time you have subtler bugs to face, just to retain the reliability you had
before. Society seems to be unwilling to limit complexity because we all want that extra bell,
whistle, and feature interaction. Thus, our users always push us to the complexity barrier, and
how close we can approach that barrier is largely determined by the strength of the techniques
we can wield against ever more complex and subtle bugs. [Beizer90]

Regardless of the limitations, testing is an integral part of software development. It is broadly
deployed in every phase of the software development cycle. Typically, more than 50 percent of
the development time is spent in testing. Testing is usually performed for the following purposes:

• To improve quality.

As computers and software are used in critical applications, the outcome of a bug can be severe.
Bugs can cause huge losses. Bugs in critical systems have caused airplane crashes, allowed space
shuttle missions to go awry, halted trading on the stock market, and worse. Bugs can kill. Bugs
can cause disasters. The so-called year 2000 (Y2K) bug has given birth to a cottage industry of
consultants and programming tools dedicated to making sure the modern world doesn't come to a
screeching halt on the first day of the next century. [Bugs] In a computerized embedded world,
the quality and reliability of software is a matter of life and death.

Quality means conformance to the specified design requirements. Being correct, the minimum
requirement of quality, means performing as required under specified circumstances. Debugging,
a narrow view of software testing, is performed heavily by the programmer to find design defects.
The imperfection of human nature makes it almost impossible to make a moderately complex
program correct the first time. Finding the problems and getting them fixed [Kaner93] is the
purpose of debugging in the programming phase.

• For Verification & Validation (V&V)

As the name verification and validation indicates, another important purpose of testing is
verification and validation (V&V). Testing can serve as a metric and is heavily used as a tool in
the V&V process. Testers can make claims based on interpretations of the testing results: either
the product works under certain situations, or it does not. We can also compare the quality of
different products under the same specification, based on results from the same test.

We cannot test quality directly, but we can test related factors to make quality visible. Quality
has three sets of factors -- functionality, engineering, and adaptability. These three sets of factors
can be thought of as dimensions in the software quality space. Each dimension may be broken
down into its component factors and considerations at successively lower levels of detail. Table
1 illustrates some of the most frequently cited quality considerations.
 

Functionality (exterior quality)   Engineering (interior quality)   Adaptability (future quality)
Correctness                        Efficiency                       Flexibility
Reliability                        Testability                      Reusability
Usability                          Documentation                    Maintainability
Integrity                          Structure

Table 1. Typical Software Quality Factors [Hetzel88]
 

[Good testing provides measures for all relevant factors. The importance of any particular factor
varies from application to application. Any system where human lives are at stake must place
extreme emphasis on  reliability and integrity. In the typical business system usability and
maintainability are the key factors, while for a one-time scientific program neither may be
significant. Our testing, to be fully effective, must be geared to measuring each relevant factor
and thus forcing quality to become tangible and visible. [Hetzel88]

Tests with the purpose of validating that the product works are named clean tests, or positive
tests. The drawback is that they can only validate that the software works for the specified test
cases. A finite number of tests cannot validate that the software works for all situations. On the
contrary, only one failed test is sufficient to show that the software does not work. Dirty tests, or
negative tests, refer to tests aimed at breaking the software, or showing that it does not work. A
piece of software must have sufficient exception handling capabilities to survive a significant
level of dirty tests.

A testable design is a design that can be easily validated, falsified and maintained. Because
testing is a rigorous effort and requires significant time and cost, design for testability is also an
important design rule for software development.]

• For reliability estimation [Kaner93] [Lyu95]

Software reliability has important relations with many aspects of software, including its
structure and the amount of testing it has been subjected to. Based on an operational profile (an
estimate of the relative frequency of use of various inputs to the program [Lyu95]), testing can
serve as a statistical sampling method to gain failure data for reliability estimation.
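
To make the sampling idea concrete, here is a minimal sketch, assuming a made-up operational profile and a placeholder run_once function standing in for the system under test; a real study would use observed usage frequencies and an actual test harness.

    import random

    # Hypothetical operational profile: relative frequency of each input class.
    profile = {"typical_query": 0.7, "bulk_upload": 0.2, "admin_action": 0.1}

    def run_once(input_class):
        """Placeholder for executing the system on one input from the class.
        Returns True if the run succeeds, False if it fails."""
        return random.random() > 0.01   # stand-in for the real system under test

    classes, weights = zip(*profile.items())
    runs, failures = 10_000, 0
    for _ in range(runs):
        if not run_once(random.choices(classes, weights=weights)[0]):
            failures += 1

    # Simple point estimate of reliability (probability of a failure-free run).
    print(f"Estimated reliability: {1 - failures / runs:.4f}")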

Software testing is not mature. It still remains an art, because we still cannot make it a science.
We are still using the same testing techniques invented 20-30 years ago, some of which are
crafted methods or heuristics rather than good engineering methods. Software testing can be
costly, but not testing software is even more expensive, especially in cases where human lives are
at stake. Solving the software-testing problem is no easier than solving the Turing halting
problem. We can never be sure that a piece of software is correct. We can never be sure that the
specifications are correct. No verification system can verify every correct program. We can
never be certain that a verification system is correct either.

Key Concepts
Taxonomy

[There is a plethora of testing methods and testing techniques, serving multiple purposes in
different life cycle phases. Classified by purpose, software testing can be divided into:
correctness testing, performance testing, reliability testing and security testing. Classified by
life-cycle phase, software testing falls into the following categories: requirements phase testing,
design phase testing, program phase testing, evaluating test results, installation phase testing,
acceptance testing and maintenance testing. By scope, software testing can be categorized as
follows: unit testing, component testing, integration testing, and system testing.]

Correctness testing

Correctness is the minimum requirement of software, and the essential purpose of testing. Correctness
testing needs some type of oracle to tell the right behavior from the wrong one. The tester may or
may not know the inside details of the software module under test (e.g. control flow, data flow).
Therefore, either a white-box or a black-box point of view can be taken in testing software.
Note that the black-box and white-box ideas are not limited to correctness testing only.

• Black-box testing

The black-box approach is a testing method in which test data are derived from the specified
functional requirements without regard to the final program structure. [Perry90] It is also termed
data-driven, input/output driven [Myers79], or requirements-based [Hetzel88] testing. Because
only the functionality of the software module is of concern, black-box testing also mainly refers
to functional testing -- a testing method that emphasizes executing the functions and examining
their input and output data. [Howden87] The tester treats the software under test as a black
box -- only the inputs, outputs and specification are visible, and the functionality is determined
by observing the outputs for corresponding inputs. In testing, various inputs are exercised and the
outputs are compared against the specification to validate the correctness. All test cases are derived
from the specification. No implementation details of the code are considered.

It is obvious that the more of the input space we cover, the more problems we will find, and
therefore the more confident we will be about the quality of the software. Ideally we would be
tempted to exhaustively test the input space. But as stated above, exhaustively testing the
combinations of valid inputs is impossible for most programs, let alone considering invalid
inputs, timing, sequence, and resource variables. Combinatorial explosion is the major
roadblock in functional testing. To make things worse, we can never be sure whether the
specification is correct or complete. Due to limitations of the language used in the
specifications (usually natural language), ambiguity is often inevitable. Even if we use some type
of formal or restricted language, we may still fail to write down all the possible cases in the
specification. Sometimes the specification itself becomes an intractable problem: it is not
possible to specify precisely every situation that can be encountered using a limited number of
words. And people can seldom specify clearly what they want -- they usually can tell whether a
prototype is, or is not, what they want only after it has been finished. Specification problems
contribute approximately 30 percent of all bugs in software. [Beizer95]

Research in black-box testing mainly focuses on how to maximize the effectiveness of testing
with minimum cost, usually measured by the number of test cases. It is not possible to exhaust
the input space, but it is possible to exhaustively test a subset of the input space. Partitioning is
one of the common techniques. If we partition the input space and assume that all the input
values in a partition are equivalent, then we only need to test one representative value in each
partition to sufficiently cover the whole input space. Domain testing [Beizer95] partitions the
input domain into regions, and considers the input values in each domain an equivalence class.
Domains can be exhaustively tested and covered by selecting one or more representative values
in each domain. Boundary values are of special interest. Experience shows that test cases that
explore boundary conditions have a higher payoff than test cases that do not. Boundary value
analysis [Myers79] requires one or more boundary values to be selected as representative test
cases. The difficulty with domain testing is that incorrect domain definitions in the specification
cannot be efficiently discovered.
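
As a concrete illustration of partitioning and boundary value analysis, the sketch below derives test values for a hypothetical validator that accepts ages from 0 to 120; the function and the range are assumptions chosen for the example.

    # Equivalence partitioning and boundary value analysis for a hypothetical
    # validator that accepts ages in the range 0..120 inclusive.

    def is_valid_age(age: int) -> bool:
        return 0 <= age <= 120

    # One representative value per equivalence class of the specification.
    representatives = {"below range": -5, "in range": 35, "above range": 200}

    # Values on and immediately around each boundary of the valid domain.
    boundary_values = [-1, 0, 1, 119, 120, 121]

    for name, value in representatives.items():
        print(f"{name:12} {value:>4} -> {is_valid_age(value)}")
    for value in boundary_values:
        print(f"boundary     {value:>4} -> {is_valid_age(value)}")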

Good partitioning requires knowledge of the software structure. A good testing plan will not only
contain black-box testing, but also white-box approaches, and combinations of the two.

• White-box testing

Contrary to black-box testing, software is viewed as a white box, or glass box, in white-box
testing, as the structure and flow of the software under test are visible to the tester. Testing plans
are made according to the details of the software implementation, such as programming
language, logic, and styles. Test cases are derived from the program structure. White-box testing
is also called glass-box testing, logic-driven testing [Myers79] or design-based testing
[Hetzel88].

There are many techniques available in white-box testing, because the problem of intractability is
eased by specific knowledge of, and attention to, the structure of the software under test. The
intention of exhausting some aspect of the software is still strong in white-box testing, and some
degree of exhaustion can be achieved, such as executing each line of code at least once
(statement coverage), traversing every branch (branch coverage), or covering all the possible
combinations of true and false condition predicates (multiple condition coverage). [Parrington89]
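
The difference between these criteria can be seen on a small hypothetical function: a single test case already achieves statement coverage, but branch coverage requires a second case that skips the if-branch.

    def apply_discount(price: float, is_member: bool) -> float:
        # Hypothetical function used only to illustrate coverage criteria.
        if is_member:
            price = price * 0.9
        return price

    # Statement coverage: this single call executes every line of the function.
    assert apply_discount(100.0, True) == 90.0

    # Branch coverage additionally requires the case where the 'if' is false.
    assert apply_discount(100.0, False) == 100.0

    # Multiple condition coverage would matter if the predicate combined several
    # conditions (e.g. 'is_member and price > 50'): all true/false combinations
    # of the individual conditions would then have to be exercised.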

Control-flow testing, loop testing, and data-flow testing all map the corresponding flow
structure of the software onto a directed graph. Test cases are carefully selected so that all the
nodes or paths are covered or traversed at least once. By doing so we may discover unnecessary
"dead" code -- code that is of no use, or never gets executed at all -- which cannot be discovered
by functional testing.

In mutation testing, the original program code is perturbed and many mutated programs are
created, each containing one fault. Each faulty version of the program is called a mutant. Test
data are selected based on their effectiveness in failing the mutants. The more mutants a test case
can kill, the better the test case is considered. The problem with mutation testing is that it is too
computationally expensive to use.

The boundary between the black-box approach and the white-box approach is not clear-cut. Many
of the testing strategies mentioned above may not be safely classified as either black-box testing
or white-box testing. The same is true for transaction-flow testing, syntax testing, finite-state
testing, and many other testing strategies not discussed in this text. One reason is that all the
above techniques need some knowledge of the specification of the software under test. Another
reason is that the idea of specification itself is broad -- it may contain any requirement, including
the structure, programming language, and programming style, as part of the specification content.
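
To make the mutation-testing idea above concrete, here is a minimal sketch on a trivial function; the mutants are written by hand here, whereas a real mutation tool would generate them automatically.

    # Minimal illustration of mutation testing on a trivial function.

    def original(x, y):
        return x + y

    # Hand-written mutants, each containing one seeded fault.
    mutants = [
        lambda x, y: x - y,      # operator replaced
        lambda x, y: x + y + 1,  # off-by-one constant
        lambda x, y: x,          # second operand dropped
    ]

    def kills(test_input, mutant):
        """A test input kills a mutant if the mutant's output differs
        from the original (the original serves as the oracle)."""
        return mutant(*test_input) != original(*test_input)

    for test_input in [(0, 0), (2, 3)]:
        killed = sum(kills(test_input, m) for m in mutants)
        print(f"input {test_input} kills {killed} of {len(mutants)} mutants")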

We may be reluctant to consider random testing as a testing technique. The test case selection is
simple and straightforward: test cases are randomly chosen. The study in [Duran84] indicates that
random testing is more cost-effective for many programs. Some very subtle errors can be
discovered at low cost. Nor is it inferior in coverage to other, carefully designed testing
techniques. One can also obtain a reliability estimate using random testing results based on
operational profiles. Effectively combining random testing with other testing techniques may
yield more powerful and cost-effective testing strategies.
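
A minimal sketch of random testing, assuming a simple oracle is available for the function under test (here a stand-in sorting routine): inputs are drawn at random and every output is checked against the oracle.

    import random

    def sort_under_test(values):
        # Stand-in for the implementation being tested.
        return sorted(values)

    def oracle_ok(inputs, outputs):
        # Oracle: output must be a permutation of the input, in ascending order.
        return sorted(inputs) == outputs and all(
            a <= b for a, b in zip(outputs, outputs[1:]))

    random.seed(1)
    for _ in range(1_000):
        data = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        assert oracle_ok(data, sort_under_test(data)), data
    print("1000 random test cases passed")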

Performance testing

Not all software systems have explicit performance specifications, but every system has implicit
performance requirements. The software should not take infinite time or infinite resources to
execute. The term "performance bug" is sometimes used to refer to design problems in software
that cause the system performance to degrade.

Performance has always been a great concern and a driving force of computer evolution.
Performance evaluation of a software system usually includes: resource usage, throughput,
stimulus-response time and queue lengths detailing the average or maximum number of tasks
waiting to be serviced by selected resources. Typical resources that need to be considered
include network bandwidth requirements, CPU cycles, disk space, disk access operations, and
memory usage [Smith90]. The goal of performance testing can be performance bottleneck
identification, performance comparison and evaluation, etc. The typical method of doing
performance testing is using a benchmark -- a program, workload or trace designed to be
representative of the typical system usage. [Vokolos98]
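
The following sketch illustrates the benchmark idea on a hypothetical unit of work: the workload is executed repeatedly and throughput and response-time figures are reported. A real performance test would use a workload representative of actual system usage.

    import time
    import statistics

    def handle_request(n: int) -> int:
        # Hypothetical unit of work standing in for the system under test.
        return sum(i * i for i in range(n))

    latencies = []
    start = time.perf_counter()
    for _ in range(1_000):
        t0 = time.perf_counter()
        handle_request(10_000)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    print(f"throughput: {len(latencies) / elapsed:.1f} requests/s")
    print(f"mean latency: {statistics.mean(latencies) * 1000:.2f} ms, "
          f"max: {max(latencies) * 1000:.2f} ms")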

Reliability testing

Software reliability refers to the probability of failure-free operation of a system. It is related to
many aspects of software, including the testing process. Directly estimating software reliability
by quantifying its related factors can be difficult. Testing is an effective sampling method to
measure software reliability. Guided by the operational profile, software testing (usually black-
box testing) can be used to obtain failure data, and an estimation model can be further used to
analyze the data to estimate the present reliability and predict future reliability. Therefore, based
on the estimation, the developers can decide whether to release the software, and the users can
decide whether to adopt and use the software. Risk of using software can also be assessed based
on reliability information. [Hamlet94] advocates that the primary goal of testing should be to
measure the dependability of tested software.

There is general agreement on the intuitive meaning of dependable software: it does not fail in
unexpected or catastrophic ways. [Hamlet94] Robustness testing and stress testing are variants
of reliability testing based on this simple criterion.

The robustness of a software component is the degree to which it can function correctly in the
presence of exceptional inputs or stressful environmental conditions. [IEEE90] Robustness
testing differs from correctness testing in the sense that the functional correctness of the software
is not of concern. It only watches for robustness problems such as machine crashes, process
hangs or abnormal termination. The oracle is relatively simple, therefore robustness testing can
be made more portable and scalable than correctness testing. This research has drawn more and
more interest recently, most of it using commercial operating systems as the target, such as
the work in [Koopman97] [Kropp98] [Ghosh98] [Devale99] [Koopman99].
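
In the spirit of such robustness harnesses, the sketch below feeds exceptional inputs to a hypothetical function and records only robustness failures (uncaught exceptions), deliberately ignoring whether the answers are functionally correct.

    # Robustness-style test: only crashes matter, not correct answers.

    def parse_config_line(line):
        # Hypothetical function under test.
        key, value = line.split("=", 1)
        return key.strip(), value.strip()

    exceptional_inputs = ["", "=", "no separator", None, "a=" * 10_000, 42]

    failures = []
    for value in exceptional_inputs:
        try:
            parse_config_line(value)      # result is ignored on purpose
        except Exception as exc:          # uncaught exception = robustness failure
            failures.append((repr(value), type(exc).__name__))

    for value, exc_name in failures:
        print(f"robustness failure on {value}: {exc_name}")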

Stress testing, or load testing, is often used to test the whole system rather than the software
alone. In such tests the software or system is exercised at or beyond the specified limits. Typical
stresses include resource exhaustion, bursts of activity, and sustained high loads.

Security testing

Software quality, reliability and security are tightly coupled. Flaws in software can be exploited
by intruders to open security holes. With the development of the Internet, software security
problems are becoming even more severe.

Many critical software applications and services have integrated security measures against
malicious attacks. The purposes of security testing of these systems include identifying and
removing software flaws that may potentially lead to security violations, and validating the
effectiveness of security measures. Simulated security attacks can be performed to find
vulnerabilities.

Testing automation

Software testing can be very costly. Automation is a good way to cut down time and cost.
Software testing tools and techniques usually suffer from a lack of generic applicability and
scalability. The reason is straightforward: in order to automate the process, we have to have
some way to generate oracles from the specification, and to generate test cases to test the target
software against the oracles to decide their correctness. Today we still do not have a full-scale
system that has achieved this goal. In general, a significant amount of human intervention is still
needed in testing. The degree of automation remains at the automated test script level.

The problem is lessened in reliability testing and performance testing. In robustness testing, the
simple specification and oracle ("doesn't crash, doesn't hang") suffices. Similarly simple metrics
can also be used in stress testing.

When to stop testing?

Testing is potentially endless. We cannot test until all the defects are unearthed and removed --
that is simply impossible. At some point, we have to stop testing and ship the software. The
question is when.

Realistically, testing is a trade-off between budget, time and quality. It is driven by profit
models. The pessimistic, and unfortunately most often used, approach is to stop testing whenever
some, or any, of the allocated resources -- time, budget, or test cases -- are exhausted. The
optimistic stopping rule is to stop testing when either reliability meets the requirement, or the
benefit from continued testing cannot justify the testing cost. [Yang95] This will usually require
the use of reliability models to evaluate and predict the reliability of the software under test. Each
evaluation requires repeated running of the following cycle: failure data gathering -- modeling --
prediction. This method does not fit well for ultra-dependable systems, however, because the real
field failure data will take too long to accumulate.
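
A toy illustration of the optimistic stopping rule, assuming failure data are gathered in rounds and a very crude point estimate of reliability is recomputed after each round; a real application would fit a proper reliability growth model instead, and the target and failure behaviour here are invented for the example.

    import random

    # Toy stopping rule: keep testing until the estimated reliability meets the
    # target or the testing budget is exhausted.
    TARGET_RELIABILITY = 0.999
    MAX_ROUNDS = 50
    runs = failures = 0

    for round_no in range(1, MAX_ROUNDS + 1):
        for _ in range(1_000):              # one round of test executions
            runs += 1
            if random.random() < 0.0005:    # stand-in for an observed failure
                failures += 1
        reliability = 1 - failures / runs   # crude point estimate
        if reliability >= TARGET_RELIABILITY:
            print(f"stop after round {round_no}: "
                  f"estimated reliability {reliability:.4f}")
            break
    else:
        print(f"budget exhausted: estimated reliability {1 - failures / runs:.4f}")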

Alternatives to testing

Software testing is more and more considered a problematic method toward better quality. Using
testing to locate and correct software defects can be an endless process. Bugs cannot be
completely ruled out. Just as the complexity barrier indicates, chances are that testing and fixing
problems may not necessarily improve the quality and reliability of the software. Sometimes
fixing a problem may introduce much more severe problems into the system, as happened with
the telephone outage in California and on the eastern seaboard in 1991: the disaster occurred
after changing three lines of code in the signaling system.

In a narrower view, many testing techniques may have flaws. Take coverage testing, for example:
is code coverage or branch coverage really related to software quality? There is no definite
proof. As early as [Myers79], the so-called "human testing" -- including inspections,
walkthroughs, and reviews -- was suggested as a possible alternative to traditional testing methods.
[Hamlet94] advocates inspection as a cost-effective alternative to unit testing. The experimental
results in [Basili85] suggest that code reading by stepwise abstraction is at least as effective as
on-line functional and structural testing in terms of the number and cost of faults observed.

Using formal methods to "prove" the correctness of software is also an attractive research
direction. But this method cannot surmount the complexity barrier either. For relatively simple
software, it works well. It does not scale well to complex, full-fledged, large software systems,
which are more error-prone.

In a broader view, we may start to question the ultimate purpose of testing. Why do we need more
effective testing methods anyway, if finding defects and removing them does not necessarily
lead to better quality? An analogy can be drawn with the car manufacturing process. In the
craftsmanship epoch, we made cars and hacked away at the problems and defects. But such methods
were washed away by the tide of pipelined manufacturing and good quality engineering processes,
which make the car defect-free in the manufacturing phase. This suggests that engineering the
design process (such as clean-room software engineering) to make the product have fewer defects
may be more effective than engineering the testing process. Testing is then used solely for quality
monitoring and management, or "design for testability". This is the leap for software from
craftsmanship to engineering.
Available tools, techniques, and metrics
An abundance of software testing tools exists. The correctness testing tools are often
specialized to certain systems and have limited ability and generality. Robustness and stress
testing tools are more likely to be made generic.

Mothora [DeMillo91] is an automated mutation testing tool-set developed at Purdue University.
Using Mothora, the tester can create and execute test cases, measure test case adequacy,
determine input-output correctness, locate and remove faults or bugs, and control and document
the test.

NuMega's BoundsChecker [NuMega99] and Rational's Purify [Rational99] are run-time
checking and debugging aids. They can both check for and protect against memory leaks and
pointer problems.

Ballista COTS Software Robustness Testing Harness [Ballista99]. The Ballista testing harness is
a full-scale automated robustness testing tool. The first version supports testing up to 233
POSIX function calls in UNIX operating systems. The second version also supports testing of
user functions, provided that the data types are recognized by the testing server. The Ballista
testing harness gives quantitative measures of robustness comparisons across operating systems.
The goal is to automatically test and harden Commercial Off-The-Shelf (COTS) software against
robustness failures.

Manual testing is the process of manually testing software for defects. It requires a tester to play the
role of an end user and use most or all features of the application to ensure correct behavior. To ensure
completeness of testing, the tester often follows a written test plan that leads them through a set of
important test cases.

A key step in the process of software engineering is testing the software for correct behavior
prior to release to end users.

For small scale engineering efforts (including prototypes), exploratory testing may be sufficient.
With this informal approach, the tester does not follow any rigorous testing procedure, but rather
explores the user interface of the application using as many of its features as possible, using
information gained in prior tests to intuitively derive additional tests. The success of exploratory
manual testing relies heavily on the domain expertise of the tester, because a lack of knowledge
will lead to incompleteness in testing. One of the key advantages of an informal approach is the
intuitive insight it gives into how it feels to use the application.

Large scale engineering projects that rely on manual software testing follow a more rigorous
methodology in order to maximize the number of defects that can be found. A systematic
approach focuses on predetermined test cases and generally involves the following steps.[1]
1. Choose a high level test plan where a general methodology is chosen, and resources such
as people, computers, and software licenses are identified and acquired.
2. Write detailed test cases, identifying clear and concise steps to be taken by the tester,
with expected outcomes.
3. Assign the test cases to testers, who manually follow the steps and record the results.
4. Author a test report, detailing the findings of the testers. The report is used by managers
to determine whether the software can be released, and if not, it is used by engineers to
identify and correct the problems.

A rigorous test case based approach is often traditional for large software engineering projects
that follow a Waterfall model.[2] However, at least one recent study did not show a dramatic
difference in defect detection efficiency between exploratory testing and test case based testing.[3]

Test automation is the technique of testing software using software rather than people. A test
program is written that exercises the software and identifies its defects. These test programs may
be written from scratch, or they may be written utilizing a generic Test automation framework
that can be purchased from a third party vendor. Test automation can be used to automate the
sometimes menial and time consuming task of following the steps of a use case and reporting the
results.

Test automation may be able to reduce or eliminate the cost of actual testing. A computer can
follow a rote sequence of steps more quickly than a person, and it can run the tests overnight to
present the results in the morning. However, the labor that is saved in actual testing must be
spent instead authoring the test program. Depending on the type of application to be tested and
the automation tools that are chosen, this may require more labor than a manual approach. In
addition, some testing tools present a very large amount of data, potentially creating a time-
consuming task of interpreting the results. From a cost-benefit perspective, test automation
becomes more cost-effective when the same tests can be reused many times over, such as for
regression testing and test-driven development, and when the results can be interpreted quickly.
If future reuse of the test software is unlikely, then a manual approach is preferred.[4]

Software that does not have a graphical user interface tends to be tested by automated methods.
Things such as device drivers and software libraries must be tested using test programs. In
addition, testing of large numbers of users (performance testing and load testing) is typically
simulated in software rather than performed in practice.

Conversely, graphical user interfaces whose layout changes frequently are very difficult to test
automatically. There are test frameworks that can be used for regression testing of user
interfaces. They rely on recording of sequences of keystrokes and mouse gestures, then playing
them back and observing that the user interface responds in the same way every time.
Unfortunately, these recordings may not work properly when a button is moved or relabeled in a
subsequent release. An automatic regression test may also be fooled if the program output varies
significantly (e.g. the display includes the current system time). In cases such as these, manual
testing may be more effective.[5]
Software Testing

It is the process used to help identify the correctness, completeness, security, and quality of
developed computer software. Testing is a process of technical investigation, performed on
behalf of stakeholders, that is intended to reveal quality-related information about the product
with respect to the context in which it is intended to operate. This includes, but is not limited to,
the process of executing a program or application with the intent of finding errors. Quality is not
an absolute; it is value to some person. With that in mind, testing can never completely establish
the correctness of arbitrary computer software; testing furnishes a criticism or comparison that
compares the state and behaviour of the product against a specification. An important point is
that software testing should be distinguished from the separate discipline of Software Quality
Assurance (SQA), which encompasses all business process areas, not just testing.

White box and black box testing are terms used to describe the point of view a test engineer
takes when designing test cases: black box is an external view of the test object and white box is
an internal view. Software testing is partly intuitive, but largely systematic. Good testing involves
much more than just running the program a few times to see whether it works. Thorough analysis
of the program under test, backed by a broad knowledge of testing techniques and tools, is a
prerequisite to systematic testing. Software testing is the process of executing software in a
controlled manner in order to answer the question "Does this software behave as specified?"
Software testing is used in association with verification and validation. Verification is the
checking or testing of items, including software, for conformance and consistency with an
associated specification. Software testing is just one kind of verification, which also uses
techniques such as reviews, inspections, and walk-throughs. Validation is the process of checking
that what has been specified is what the user actually wanted.

WHITE BOX TESTING

UNIT TESTING

The developer carries out unit testing in order to check whether a particular module or unit of
code is working correctly. Unit testing comes at the most basic level, as it is carried out as and
when a unit of code is developed or a particular piece of functionality is built. Unit testing deals
with testing a unit as a whole. This would test the interaction of many functions but confine the
test within one unit. The exact scope of a unit is left to interpretation. Supporting test code,
sometimes called scaffolding, may be necessary to support an individual test. This type of testing
is driven by the architecture and implementation teams. This focus is also called black-box
testing because only the details of the interface are visible to the test. Limits that are global to a
unit are tested here.

In the construction industry, scaffolding is a temporary, easy-to-assemble-and-disassemble frame
placed around a building to facilitate the construction of the building. The construction workers
first build the scaffolding and then the building. Later the scaffolding is removed, exposing the
completed building. Similarly, in software testing, one particular test may need some supporting
software. This software establishes an environment around the test; only when this environment
is established can a correct evaluation of the test take place. The scaffolding software may
establish state and values for data structures as well as provide dummy external functions for
the test. Different scaffolding software may be needed from one test to another. Scaffolding
software is rarely considered part of the system. Sometimes the scaffolding software becomes
larger than the system software being tested. Usually the scaffolding software is not of the same
quality as the system software and frequently is quite fragile. A small change in the test may lead
to much larger changes in the scaffolding.

Internal and unit testing can be automated with the help of coverage tools. A coverage tool
analyzes the source code and generates a test that will execute every alternative thread of
execution. It is still up to the programmer to combine these tests into meaningful cases to
validate the result of each thread of execution. Typically, the coverage tool is used in a slightly
different way: first the coverage tool is used to augment the source by placing informational
prints after each line of code; then the testing suite is executed, generating an audit trail; this
audit trail is analyzed to report the percentage of the total system code executed during the test
suite. If the coverage is high and the untested source lines are of low impact to the system's
overall quality, then no additional tests are required.
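
The audit-trail idea can be sketched with Python's built-in tracing hook: instead of inserting informational prints, the hook records which lines of a small hypothetical unit execute under a deliberately incomplete test suite, and the covered fraction is reported. This is only a stand-in for a real coverage tool.

    import sys

    def triangle_kind(a, b, c):
        # Hypothetical unit under test.
        if a == b == c:
            return "equilateral"
        if a == b or b == c or a == c:
            return "isosceles"
        return "scalene"

    def run_with_trace(tests):
        """Run the given test inputs and return the set of executed line numbers."""
        executed = set()
        def trace(frame, event, arg):
            if event == "line" and frame.f_code is triangle_kind.__code__:
                executed.add(frame.f_lineno)
            return trace
        sys.settrace(trace)
        for args in tests:
            triangle_kind(*args)
        sys.settrace(None)
        return executed

    partial = run_with_trace([(3, 3, 3), (3, 4, 5)])          # incomplete suite
    full = run_with_trace([(3, 3, 3), (3, 4, 5), (3, 3, 5)])  # exercises every line
    print(f"lines covered by the partial suite: {len(partial)} of {len(full)} "
          f"({100 * len(partial) / len(full):.0f}%)")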

STATIC & DYNAMIC ANALYSIS

Static analysis involves going through the code, without executing it, in order to find possible defects.
Dynamic analysis involves executing the code and analyzing the output.

STATEMENT COVERAGE

In this type of testing the code is executed in such a manner that every statement of the
application is executed at least once. It helps in assuring that all statements execute without any
unintended side effects.

BRANCH COVERAGE
No software application can be written in a purely sequential manner; at some point we need to branch
the code in order to perform a particular piece of functionality. Branch coverage testing helps in
validating all the branches in the code and making sure that no branch leads to abnormal behavior of the
application.
SECURITY TESTING

Security testing is carried out in order to find out how well the system can protect itself from
unauthorized access, hacking, cracking, code damage, and other threats that deal with the code of
the application. This type of testing needs sophisticated testing techniques.

MUTATION TESTING
A kind of testing in which the application's code is deliberately modified with small faults (mutants), and
the existing tests are run to see whether they detect the changes. It helps in finding out how effective the
test cases are and where they need to be strengthened.

BLACK BOX TESTING

• FUNCTIONAL TESTING

In this type of testing, the software is tested against the functional requirements. The tests are written
in order to check whether the application behaves as expected. Although functional testing is often
done toward the end of the development cycle, it can -- and should -- be started much earlier.
Individual components and processes can be tested early on, even before it is possible to do
functional testing on the entire system. Functional testing covers how well the system executes
the functions it is supposed to execute -- including user commands, data manipulation, searches
and business processes, user screens, and integrations. Functional testing covers the obvious
surface type of functions, as well as the back-end operations (such as security and how upgrades
affect the system).

Software testing is an investigation conducted to provide stakeholders with information about
the quality of the product or service under test.[1] Software testing also provides an objective,
independent view of the software to allow the business to appreciate and understand the risks at
implementation of the software. Test techniques include, but are not limited to, the process of
executing a program or application with the intent of finding software bugs.

Software testing can also be stated as the process of validating and verifying that a software
program/application/product:

1. meets the business and technical requirements that guided its design and development;
2. works as expected; and
3. can be implemented with the same characteristics.

Software testing, depending on the testing method employed, can be implemented at any time in
the development process. However, most of the test effort occurs after the requirements have
been defined and the coding process has been completed. As such, the methodology of the test is
governed by the software development methodology adopted.
Different software development models will focus the test effort at different points in the
development process. Newer development models, such as Agile, often employ test driven
development and place an increased portion of the testing in the hands of the developer, before it
reaches a formal team of testers. In a more traditional model, most of the test execution occurs
after the requirements have been defined and the coding process has been completed.

Contents

• 1 Overview
• 2 History
• 3 Software testing topics
   o 3.1 Scope
   o 3.2 Functional vs non-functional testing
   o 3.3 Defects and failures
   o 3.4 Finding faults early
   o 3.5 Compatibility
   o 3.6 Input combinations and preconditions
   o 3.7 Static vs. dynamic testing
   o 3.8 Software verification and validation
   o 3.9 The software testing team
   o 3.10 Software quality assurance (SQA)
• 4 Testing methods
   o 4.1 The box approach
      ▪ 4.1.1 White box testing
      ▪ 4.1.2 Black box testing
      ▪ 4.1.3 Grey box testing
• 5 Testing levels
   o 5.1 Unit testing
   o 5.2 Integration testing
   o 5.3 System testing
   o 5.4 System integration testing
   o 5.5 Regression testing
   o 5.6 Acceptance testing
   o 5.7 Alpha testing
   o 5.8 Beta testing
• 6 Non-functional testing
   o 6.1 Software performance testing and load testing
   o 6.2 Stability testing
   o 6.3 Usability testing
   o 6.4 Security testing
   o 6.5 Internationalization and localization
   o 6.6 Destructive testing
• 7 The testing process
   o 7.1 Traditional CMMI or waterfall development model
   o 7.2 Agile or Extreme development model
   o 7.3 A sample testing cycle
• 8 Automated testing
   o 8.1 Testing tools
   o 8.2 Measurement in software testing
• 9 Testing artifacts
• 10 Certifications
• 11 Controversy
• 12 See also
• 13 References
• 14 External links

Overview
Testing can never completely identify all the defects within software. Instead, it furnishes a
criticism or comparison that compares the state and behavior of the product against oracles—
principles or mechanisms by which someone might recognize a problem. These oracles may
include (but are not limited to) specifications, contracts,[2] comparable products, past versions of
the same product, inferences about intended or expected purpose, user or customer expectations,
relevant standards, applicable laws, or other criteria.

Every software product has a target audience. For example, the audience for video game
software is completely different from banking software. Therefore, when an organization
develops or otherwise invests in a software product, it can assess whether the software product
will be acceptable to its end users, its target audience, its purchasers, and other stakeholders.
Software testing is the process of attempting to make this assessment.

A study conducted by NIST in 2002 reports that software bugs cost the U.S. economy $59.5
billion annually. More than a third of this cost could be avoided if better software testing was
performed.[3]

History
The separation of debugging from testing was initially introduced by Glenford J. Myers in 1979.[4]
Although his attention was on breakage testing ("a successful test is one that finds a bug"[4][5]),
it illustrated the desire of the software engineering community to separate fundamental
development activities, such as debugging, from that of verification. Dave Gelperin and William
C. Hetzel classified in 1988 the phases and goals in software testing in the following stages:[6]

• Until 1956 - Debugging oriented[7]
• 1957–1978 - Demonstration oriented[8]
• 1979–1982 - Destruction oriented[9]
• 1983–1987 - Evaluation oriented[10]
• 1988–2000 - Prevention oriented[11]

Software testing topics


Scope

A primary purpose for testing is to detect software failures so that defects may be discovered and
corrected. This is a non-trivial pursuit. Testing cannot establish that a product functions properly
under all conditions but can only establish that it does not function properly under specific
conditions.[12] The scope of software testing often includes examination of code as well as
execution of that code in various environments and conditions, and also examination of the
aspects of code: does it do what it is supposed to do and do what it needs to do. In the current culture of
software development, a testing organization may be separate from the development team. There
are various roles for testing team members. Information derived from software testing may be
used to correct the process by which software is developed.[13]

Functional vs non-functional testing

Functional testing refers to tests that verify a specific action or function of the code. These are
usually found in the code requirements documentation, although some development
methodologies work from use cases or user stories. Functional tests tend to answer the question
of "can the user do this" or "does this particular feature work".

Non-functional testing refers to aspects of the software that may not be related to a specific
function or user action, such as scalability or security. Non-functional testing tends to answer
such questions as "how many people can log in at once".

Defects and failures

Not all software defects are caused by coding errors. One common source of expensive defects is
requirement gaps, e.g., unrecognized requirements, that result in errors of omission by the
program designer.[14] A common source of requirements gaps is non-functional requirements
such as testability, scalability, maintainability, usability, performance, and security.

Software faults occur through the following processes. A programmer makes an error (mistake),
which results in a defect (fault, bug) in the software source code. If this defect is executed, in
certain situations the system will produce wrong results, causing a failure.[15] Not all defects will
necessarily result in failures. For example, defects in dead code will never result in failures. A
defect can turn into a failure when the environment is changed. Examples of these changes in
environment include the software being run on a new hardware platform, alterations in source
data or interacting with different software.[15] A single defect may result in a wide range of failure
symptoms.

Finding faults early


It is commonly believed that the earlier a defect is found the cheaper it is to fix it.[16] The
following table shows the cost of fixing the defect depending on the stage it was found.[17] For
example, if a problem in the requirements is found only post-release, then it would cost 10–100
times more to fix than if it had already been found by the requirements review.

                     Time detected
Time introduced      Requirements   Architecture   Construction   System test   Post-release
Requirements         1×             3×             5–10×          10×           10–100×
Architecture         -              1×             10×            15×           25–100×
Construction         -              -              1×             10×           10–25×

Compatibility

A common cause of software failure (real or perceived) is a lack of compatibility with other
application software, operating systems (or operating system versions, old or new), or target
environments that differ greatly from the original (such as a terminal or GUI application intended
to be run on the desktop now being required to become a web application, which must render in
a web browser). For example, in the case of a lack of backward compatibility, this can occur
because the programmers develop and test software only on the latest version of the target
environment, which not all users may be running. This results in the unintended consequence
that the latest work may not function on earlier versions of the target environment, or on older
hardware that earlier versions of the target environment were capable of using. Sometimes such
issues can be fixed by proactively abstracting operating system functionality into a separate
program module or library.

Input combinations and preconditions

A very fundamental problem with software testing is that testing under all combinations of
inputs and preconditions (initial state) is not feasible, even with a simple product.[12][18] This
means that the number of defects in a software product can be very large and defects that occur
infrequently are difficult to find in testing. More significantly, non-functional dimensions of
quality (how it is supposed to be versus what it is supposed to do)—usability, scalability,
performance, compatibility, reliability—can be highly subjective; something that constitutes
sufficient value to one person may be intolerable to another.

Static vs. dynamic testing

There are many approaches to software testing. Reviews, walkthroughs, or inspections are
considered as static testing, whereas actually executing programmed code with a given set of test
cases is referred to as dynamic testing. Static testing can be (and unfortunately in practice often
is) omitted. Dynamic testing takes place when the program itself is used for the first time (which
is generally considered the beginning of the testing stage). Dynamic testing may begin before the
program is 100% complete in order to test particular sections of code (modules or discrete
functions). Typical techniques for this are either using stubs/drivers or execution from a
debugger environment. For example, spreadsheet programs are, by their very nature, tested to a
large extent interactively ("on the fly"), with results displayed immediately after each calculation
or text manipulation.

Software verification and validation

Software testing is used in association with verification and validation:[19]

• Verification: Have we built the software right? (i.e., does it match the specification).
• Validation: Have we built the right software? (i.e., is this what the customer wants).

The terms verification and validation are commonly used interchangeably in the industry; it is
also common to see these two terms incorrectly defined. According to the IEEE Standard
Glossary of Software Engineering Terminology:

Verification is the process of evaluating a system or component to determine whether the
products of a given development phase satisfy the conditions imposed at the start of that phase.

Validation is the process of evaluating a system or component during or at the end of the
development process to determine whether it satisfies specified requirements.

The software testing team

Software testing can be done by software testers. Until the 1980s the term "software tester" was
used generally, but later it was also seen as a separate profession. Regarding the periods and the
different goals in software testing,[20] different roles have been established: manager, test lead,
test designer, tester, automation developer, and test administrator.

Software quality assurance (SQA)

Though controversial, software testing may be viewed as an important part of the software
quality assurance (SQA) process.[12] In SQA, software process specialists and auditors take a
broader view on software and its development. They examine and change the software
engineering process itself to reduce the amount of faults that end up in the delivered software:
the so-called defect rate.

What constitutes an "acceptable defect rate" depends on the nature of the software; a flight
simulator video game would have a much higher defect tolerance than software for an actual
airplane.

Although there are close links with SQA, testing departments often exist independently, and
there may be no SQA function in some companies.
Software testing is a task intended to detect defects in software by contrasting a computer
program's expected results with its actual results for a given set of inputs. By contrast, QA
(quality assurance) is the implementation of policies and procedures intended to prevent defects
from occurring in the first place.

Testing methods


The box approach

Software testing methods are traditionally divided into white- and black-box testing. These two
approaches are used to describe the point of view that a test engineer takes when designing test
cases.

White box testing

Main article: White box testing

White box testing is when the tester has access to the internal data structures and algorithms,
including the code that implements these.

Types of white box testing

The following types of white box testing exist:

• API testing (application programming interface) - testing of the application using public
  and private APIs
• Code coverage - creating tests to satisfy some criteria of code coverage (e.g., the test
  designer can create tests to cause all statements in the program to be executed at least
  once)
• Fault injection methods - improving the coverage of a test by introducing faults to test
  code paths
• Mutation testing methods
• Static testing - White box testing includes all static testing

Test coverage
White box testing methods can also be used to evaluate the completeness of a test suite that
was created with black box testing methods. This allows the software team to examine parts of
a system that are rarely tested and ensures that the most important function points have been
tested.[21]

Two common forms of code coverage are:

• Function coverage, which reports on functions executed
• Statement coverage, which reports on the number of lines executed to complete the test

They both return a code coverage metric, measured as a percentage.

Black box testing

Main article: Black box testing

Black box testing treats the software as a "black box"—without any knowledge of internal
implementation. Black box testing methods include: equivalence partitioning, boundary value
analysis, all-pairs testing, fuzz testing, model-based testing, traceability matrix, exploratory
testing and specification-based testing.

Specification-based testing: Specification-based testing aims to test the functionality of
software according to the applicable requirements.[22] Thus, the tester inputs data into, and only
sees the output from, the test object. This level of testing usually requires thorough test cases to
be provided to the tester, who then can simply verify that for a given input, the output value (or
behavior) either "is" or "is not" the same as the expected value specified in the test case.

Specification-based testing is necessary, but it is insufficient to guard against certain risks. [23]

Advantages and disadvantages: The black box tester has no "bonds" with the code, and a
tester's perception is very simple: the code must have bugs. Using the principle, "Ask and you shall
receive," black box testers find bugs where programmers do not. But, on the other hand, black
box testing has been said to be "like a walk in a dark labyrinth without a flashlight," because the
tester doesn't know how the software being tested was actually constructed. As a result, there
are situations when (1) a tester writes many test cases to check something that could have been
tested by only one test case, and/or (2) some parts of the back-end are not tested at all.

Therefore, black box testing has the advantage of "an unaffiliated opinion," on the one hand, and
the disadvantage of "blind exploring," on the other. [24]

Grey box testing

Grey box testing (American spelling: gray box testing) involves having knowledge of internal
data structures and algorithms for purposes of designing the test cases, but testing at the user, or
black-box level. Manipulating input data and formatting output do not qualify as grey box,
because the input and output are clearly outside of the "black-box" that we are calling the system
under test. This distinction is particularly important when conducting integration testing between
two modules of code written by two different developers, where only the interfaces are exposed
for test. However, modifying a data repository does qualify as grey box, as the user would not
normally be able to change the data outside of the system under test. Grey box testing may also
include reverse engineering to determine, for instance, boundary values or error messages.

Testing levels


Tests are frequently grouped by where they are added in the software development process, or by
the level of specificity of the test.

Unit testing

Main article: Unit testing

Unit testing refers to tests that verify the functionality of a specific section of code, usually at
the function level. In an object-oriented environment, this is usually at the class level, and the
minimal unit tests include the constructors and destructors.[25]

These types of tests are usually written by developers as they work on code (white-box style), to
ensure that the specific function is working as expected. One function might have multiple tests,
to catch corner cases or other branches in the code. Unit testing alone cannot verify the
functionality of a piece of software, but rather is used to assure that the building blocks the
software uses work independently of each other.
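
A minimal developer-written unit test in this style, using Python's standard unittest module on a hypothetical word_count function, with separate cases for an ordinary input and corner cases.

    import unittest

    def word_count(text: str) -> int:
        # Hypothetical unit under test.
        return len(text.split())

    class WordCountTest(unittest.TestCase):
        def test_ordinary_sentence(self):
            self.assertEqual(word_count("unit tests catch regressions"), 4)

        def test_empty_string_corner_case(self):
            self.assertEqual(word_count(""), 0)

        def test_surrounding_whitespace(self):
            self.assertEqual(word_count("  spaced   out  "), 2)

    if __name__ == "__main__":
        unittest.main()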

Unit testing is also called component testing.

Integration testing

Main article: Integration testing

Integration testing is any type of software testing that seeks to verify the interfaces between
components against a software design. Software components may be integrated in an iterative
way or all together ("big bang"). Normally the former is considered a better practice since it
allows interface issues to be localised more quickly and fixed.

Integration testing works to expose defects in the interfaces and interaction between integrated
components (modules). Progressively larger groups of tested software components
corresponding to elements of the architectural design are integrated and tested until the software
works as a system.[26]
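
A minimal sketch of the idea, assuming two hypothetical components whose public interfaces are exercised together rather than in isolation:

```python
# A minimal integration-test sketch: a hypothetical parser and evaluator are
# combined through their interfaces; the defect of interest is a mismatch at
# that interface (e.g. the parser returning strings where ints are expected).
def parse_expression(text: str) -> tuple[int, str, int]:
    left, op, right = text.split()
    return int(left), op, int(right)

def evaluate(parsed: tuple[int, str, int]) -> int:
    left, op, right = parsed
    return left + right if op == "+" else left - right

def test_parser_and_evaluator_integrate():
    assert evaluate(parse_expression("2 + 3")) == 5
    assert evaluate(parse_expression("7 - 4")) == 3

test_parser_and_evaluator_integrate()
```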

[edit] System testing

Main article: System testing

System testing tests a completely integrated system to verify that it meets its requirements.[27]

[edit] System integration testing

Main article: System integration testing

System integration testing verifies that a system is integrated to any external or third party
systems defined in the system requirements.[citation needed]
[edit] Regression testing

Main article: Regression testing

Regression testing focuses on finding defects after a major code change has occurred.
Specifically, it seeks to uncover software regressions, or old bugs that have come back. Such
regressions occur whenever software functionality that was previously working correctly stops
working as intended. Typically, regressions occur as an unintended consequence of program
changes, when the newly developed part of the software collides with the previously existing
code. Common methods of regression testing include re-running previously run tests and
checking whether previously fixed faults have re-emerged. The depth of testing depends on the
phase in the release process and the risk of the added features. It can range from complete, for
changes added late in the release cycle or deemed to be risky, to very shallow, consisting of positive
tests on each feature, if the changes are made early in the release or deemed to be of low risk.
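
As an illustration of re-running tests for previously fixed faults, the sketch below keeps one check per fixed defect; the bug identifiers and the normalize_name function are hypothetical.

```python
# A minimal regression-test sketch: each previously fixed fault keeps a test
# that is re-run after every change, so a re-emerging bug is caught at once.
def normalize_name(name: str) -> str:
    return " ".join(name.split()).title()

regression_cases = {
    "BUG-101 leading/trailing spaces": ("  ada lovelace ", "Ada Lovelace"),
    "BUG-205 repeated inner spaces":   ("grace   hopper", "Grace Hopper"),
}

for bug_id, (given, expected) in regression_cases.items():
    actual = normalize_name(given)
    assert actual == expected, f"{bug_id} has re-emerged: got {actual!r}"
```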

[edit] Acceptance testing

Main article: Acceptance testing

Acceptance testing can mean one of two things:

1. A smoke test is used as an acceptance test prior to introducing a new build to the main testing
process, i.e. before integration or regression.
2. Acceptance testing performed by the customer, often in their lab environment on their own
hardware, is known as user acceptance testing (UAT). Acceptance testing may be performed as
part of the hand-off process between any two phases of development. [citation needed]

[edit] Alpha testing

Alpha testing is simulated or actual operational testing by potential users/customers or an
independent test team at the developers' site. Alpha testing is often employed for off-the-shelf
software as a form of internal acceptance testing, before the software goes to beta testing.
[van Veenendaal, Erik. "Standard glossary of terms used in Software Testing".
http://www.astqb.org/educational-resources/glossary.php#A. Retrieved 17 June 2010.]

[edit] Beta testing

Beta testing comes after alpha testing. Versions of the software, known as beta versions, are
released to a limited audience outside of the programming team. The software is released to
groups of people so that further testing can ensure the product has few faults or bugs. Sometimes,
beta versions are made available to the open public to increase the feedback field to a maximal
number of future users.[citation needed]

[edit] Non-functional testing


Special methods exist to test non-functional aspects of software. In contrast to functional testing,
which establishes the correct operation of the software (correct in that it matches the expected
behavior defined in the design requirements), non-functional testing verifies that the software
functions properly even when it receives invalid or unexpected inputs. Software fault injection,
in the form of fuzzing, is an example of non-functional testing. Non-functional testing, especially
for software, is designed to establish whether the device under test can tolerate invalid or
unexpected inputs, thereby establishing the robustness of input validation routines as well as
error-handling routines. Various commercial non-functional testing tools are linked from the
software fault injection page; there are also numerous open-source and free software tools
available that perform non-functional testing.
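
The following sketch illustrates one simple form of fuzzing as described above: random byte strings are fed to a hypothetical parse_config function, and the only expectation is that it rejects bad input gracefully rather than crashing.

```python
# A minimal fuzzing sketch (non-functional robustness testing).
import random

def parse_config(data: bytes) -> dict:
    """Hypothetical routine under test: parses 'key = value' lines."""
    text = data.decode("utf-8")            # UnicodeDecodeError is a ValueError
    pairs = {}
    for line in text.splitlines():
        if "=" not in line:
            raise ValueError(f"malformed line: {line!r}")
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs

random.seed(0)
for _ in range(1000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(64)))
    try:
        parse_config(blob)
    except ValueError:
        pass          # rejected input: acceptable, robust behavior
    # Any other exception escaping here would indicate a robustness defect.
```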

[edit] Software performance testing and load testing

Performance testing is executed to determine how fast a system or sub-system performs under a
particular workload. It can also serve to validate and verify other quality attributes of the system,
such as scalability, reliability and resource usage. Load testing is primarily concerned with
testing that the system can continue to operate under a specific load, whether that be large
quantities of data or a large number of users. This is generally referred to as software scalability.
The related load testing activity, when performed as a non-functional activity, is often referred to
as endurance testing.

Volume testing is a way to test functionality. Stress testing is a way to test reliability. Load
testing is a way to test performance. There is little agreement on what the specific goals of load
testing are. The terms load testing, performance testing, reliability testing, and volume testing,
are often used interchangeably.

[edit] Stability testing

Stability testing checks to see if the software can continue to function well over an acceptable
period of time. This activity of non-functional software testing is often referred to as load (or
endurance) testing.

[edit] Usability testing

Usability testing is needed to check if the user interface is easy to use and understand.

[edit] Security testing

Security testing is essential for software that processes confidential data to prevent system
intrusion by hackers.

[edit] Internationalization and localization

The internationalization and localization aspects of software also need to be tested, for which a
pseudolocalization method can be used. This testing verifies that the application still works, even
after it has been translated into a new language or adapted for a new culture (such as different
currencies or time zones).
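
A minimal sketch of one possible pseudolocalization transform: strings are expanded and decorated with accented characters so that hard-coded or truncation-prone text stands out. The specific markers and expansion factor are illustrative choices, not a standard.

```python
# A minimal pseudolocalization sketch: expand and decorate UI strings so that
# untranslated or truncation-prone text is easy to spot during testing.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")

def pseudolocalize(message: str, expansion: float = 1.3) -> str:
    # Pad to roughly simulate longer translated strings, then add markers.
    padded = message + "~" * max(0, int(len(message) * expansion) - len(message))
    return "[!! " + padded.translate(ACCENTED) + " !!]"

print(pseudolocalize("Save file"))   # e.g. [!! Sávé fílé~~ !!]
```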

[edit] Destructive testing

Main article: Destructive testing

Destructive testing attempts to cause the software or a sub-system to fail, in order to test its
robustness.

[edit] The testing process


[edit] Traditional CMMI or waterfall development model

A common practice of software testing is that testing is performed by an independent group of
testers after the functionality is developed, before it is shipped to the customer.[28] This practice
often results in the testing phase being used as a project buffer to compensate for project delays,
thereby compromising the time devoted to testing.[29]

Another practice is to start software testing at the same moment the project starts and it is a
continuous process until the project finishes.[30]

[edit] Agile or Extreme development model

In counterpoint, some emerging software disciplines such as extreme programming and the agile
software development movement, adhere to a "test-driven software development" model. In this
process, unit tests are written first, by the software engineers (often with pair programming in the
extreme programming methodology). Of course these tests fail initially, as they are expected to.
Then as code is written it passes incrementally larger portions of the test suites. The test suites
are continuously updated as new failure conditions and corner cases are discovered, and they are
integrated with any regression tests that are developed. Unit tests are maintained along with the
rest of the software source code and generally integrated into the build process (with inherently
interactive tests being relegated to a partially manual build acceptance process). The ultimate
goal of this test process is to achieve continuous deployment where software updates can be
published to the public frequently. [31] [32]
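
A minimal test-driven sketch of this workflow, using hypothetical names: the tests are written first (and initially fail), and the leap_year implementation is then written to make them pass.

```python
# A minimal test-driven development sketch. The tests below are assumed to have
# been written before the implementation and to have failed at first.
import unittest

def leap_year(year: int) -> bool:
    # Implementation written after (and driven by) the tests below.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

class LeapYearTest(unittest.TestCase):
    def test_divisible_by_four(self):
        self.assertTrue(leap_year(2024))

    def test_century_not_leap(self):
        self.assertFalse(leap_year(1900))

    def test_four_hundred_is_leap(self):
        self.assertTrue(leap_year(2000))

if __name__ == "__main__":
    unittest.main()
```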

[edit] A sample testing cycle

Although variations exist between organizations, there is a typical cycle for testing[33]. The
sample below is common among organizations employing the Waterfall development model.

 Requirements analysis: Testing should begin in the requirements phase of the software
development life cycle. During the design phase, testers work with developers in determining
what aspects of a design are testable and with what parameters those tests work.
 Test planning: Test strategy, test plan, testbed creation. Since many activities will be carried out
during testing, a plan is needed.
 Test development: Test procedures, test scenarios, test cases, test datasets, test scripts to use
in testing software.
 Test execution: Testers execute the software based on the plans and test documents then
report any errors found to the development team.
 Test reporting: Once testing is completed, testers generate metrics and make final reports on
their test effort and whether or not the software tested is ready for release.
 Test result analysis: Also called defect analysis, this is done by the development team, usually
along with the client, in order to decide which defects should be treated: fixed, rejected (i.e. the
software is found to be working properly) or deferred to be dealt with later.
 Defect Retesting: Once a defect has been dealt with by the development team, it is retested by
the testing team. This is also known as resolution testing.
 Regression testing: It is common to have a small test program built of a subset of tests, for each
integration of new, modified, or fixed software, in order to ensure that the latest delivery has
not ruined anything, and that the software product as a whole is still working correctly.
 Test Closure: Once the test meets the exit criteria, the activities such as capturing the key
outputs, lessons learned, results, logs, documents related to the project are archived and used
as a reference for future projects.

[edit] Automated testing


Main article: Test automation

Many programming groups are relying more and more on automated testing, especially groups
that use test-driven development. There are many frameworks to write tests in, and continuous
integration software will run tests automatically every time code is checked into a version control
system.

While automation cannot reproduce everything that a human can do (and all the strange ways
they think of doing it), it can be very useful for regression testing. However, it does require a
well-developed test suite of testing scripts in order to be truly useful.
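
As an illustration, the test step of a continuous integration job often reduces to something like the sketch below: run the whole suite and report failure through the exit code. The tests/ directory layout is an assumption.

```python
# A minimal sketch of the automation step a continuous-integration job performs:
# run the unit-test suite and fail the build on any error.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-m", "unittest", "discover", "-s", "tests", "-v"],
)
sys.exit(result.returncode)   # a non-zero exit code marks the build as failed
```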

[edit] Testing tools

Program testing and fault detection can be aided significantly by testing tools and debuggers.
Testing/debug tools include features such as:

 Program monitors, permitting full or partial monitoring of program code including:
o Instruction set simulator, permitting complete instruction level monitoring and trace facilities
o Program animation, permitting step-by-step execution and conditional breakpoint at source level or in machine code
o Code coverage reports
 Formatted dump or symbolic debugging tools, allowing inspection of program variables on error or at chosen points
 Automated functional GUI testing tools are used to repeat system-level tests through the GUI
 Benchmarks, allowing run-time performance comparisons to be made
 Performance analysis (or profiling tools) that can help to highlight hot spots and resource usage

Some of these features may be incorporated into an Integrated Development Environment (IDE).

[edit] Measurement in software testing

Usually, quality is constrained to such topics as correctness, completeness, security,[citation needed]
but can also include more technical requirements as described under the ISO standard ISO/IEC
9126, such as capability, reliability, efficiency, portability, maintainability, compatibility, and
usability.

There are a number of frequently-used software measures, often called metrics, which are used to
assist in determining the state of the software or the adequacy of the testing.

[edit] Testing artifacts


Software testing process can produce several artifacts.

Test plan 

A test specification is called a test plan. The developers are well aware of what test plans will be
executed, and this information is made available to management and the developers. The idea is
to make them more cautious when developing their code or making additional changes. Some
companies have a higher-level document called a test strategy.

Traceability matrix 

A traceability matrix is a table that correlates requirements or design documents to test
documents. It is used to change tests when the source documents are changed, or to verify that
the test results are correct.
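
A minimal sketch of such a matrix as a simple mapping, with hypothetical requirement and test-case identifiers; it also shows how coverage gaps can be spotted:

```python
# A minimal traceability-matrix sketch: requirements mapped to covering tests.
traceability = {
    "REQ-1 user can log in":               ["TC-001", "TC-002"],
    "REQ-2 password reset by email":       ["TC-010"],
    "REQ-3 account lock after 3 failures": [],   # gap: no covering test yet
}

uncovered = [req for req, tests in traceability.items() if not tests]
print("Requirements with no test coverage:", uncovered)
```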

Test case 

A test case normally consists of a unique identifier, requirement references from a design
specification, preconditions, events, a series of steps (also known as actions) to follow, input,
output, expected result, and actual result. Clinically defined, a test case is an input and an
expected result.[34] This can be as pragmatic as 'for condition x your derived result is y', whereas
other test cases describe in more detail the input scenario and what results might be expected.
It can occasionally be a series of steps (but often steps are contained in a separate test
procedure that can be exercised against multiple test cases, as a matter of economy) but with
one expected result or expected outcome. The optional fields are a test case ID, test step, or
order of execution number, related requirement(s), depth, test category, author, and check
boxes for whether the test is automatable and has been automated. Larger test cases may also
contain prerequisite states or steps, and descriptions. A test case should also contain a place for
the actual result. These steps can be stored in a word processor document, spreadsheet,
database, or other common repository. In a database system, you may also be able to see past
test results, who generated the results, and what system configuration was used to generate
those results. These past results would usually be stored in a separate table.
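
As an illustration only, the fields described above might be recorded in a structure such as the following; the field names and values are hypothetical, not a formal standard.

```python
# A minimal sketch of a test-case record whose fields mirror the description above.
from dataclasses import dataclass

@dataclass
class TestCase:
    case_id: str
    requirement_refs: list[str]
    preconditions: str
    steps: list[str]
    input_data: str
    expected_result: str
    actual_result: str = ""
    automated: bool = False

tc = TestCase(
    case_id="TC-042",
    requirement_refs=["REQ-7.3"],
    preconditions="User is logged in",
    steps=["Open the transfer form", "Enter amount", "Submit"],
    input_data="amount = 100.00",
    expected_result="Balance decreases by 100.00",
)
```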

Test script 

The test script is the combination of a test case, test procedure, and test data. Initially the term
was derived from the product of work created by automated regression test tools. Today, test
scripts can be manual, automated, or a combination of both.

Test suite 

The most common term for a collection of test cases is a test suite. The test suite often also
contains more detailed instructions or goals for each collection of test cases. It definitely
contains a section where the tester identifies the system configuration used during testing. A
group of test cases may also contain prerequisite states or steps, and descriptions of the
following tests.

Test data 

In most cases, multiple sets of values or data are used to test the same functionality of a
particular feature. All the test values and changeable environmental components are collected
in separate files and stored as test data. It is also useful to provide this data to the client along
with the product or project.

Test harness 

The software, tools, samples of data input and output, and configurations are all referred to
collectively as a test harness.

[edit] Certifications
Several certification programs exist to support the professional aspirations of software testers and
quality assurance specialists. No certification currently offered actually requires the applicant to
demonstrate the ability to test software. No certification is based on a widely accepted body of
knowledge. This has led some to declare that the testing field is not ready for certification.[35]
Certification itself cannot measure an individual's productivity, their skill, or practical
knowledge, and cannot guarantee their competence, or professionalism as a tester.[36]

Software testing certification types

 Exam-based: Formalized exams, which need to be passed; can also be learned by self-
study [e.g., for ISTQB or QAI][37]
 Education-based: Instructor-led sessions, where each course has to be passed [e.g.,
International Institute for Software Testing (IIST)].

Testing certifications

 Certified Associate in Software Testing (CAST) offered by the Quality Assurance Institute
(QAI)[38]
 CATe offered by the International Institute for Software Testing [39]
 Certified Manager in Software Testing (CMST) offered by the Quality Assurance Institute
(QAI)[38]
 Certified Software Tester (CSTE) offered by the Quality Assurance Institute (QAI)[38]
 Certified Software Test Professional (CSTP) offered by the International Institute for
Software Testing[39]
 CSTP (TM) (Australian Version) offered by K. J. Ross & Associates[40]
 ISEB offered by the Information Systems Examinations Board
 ISTQB Certified Tester, Foundation Level (CTFL) offered by the International Software
Testing Qualification Board [41][42]
 ISTQB Certified Tester, Advanced Level (CTAL) offered by the International Software
Testing Qualification Board [41][42]
 TMPF TMap Next Foundation offered by the Examination Institute for
Information Science[43]

Quality assurance certifications

 CMSQ offered by the Quality Assurance Institute (QAI)[38].


 CSQA offered by the Quality Assurance Institute (QAI)[38]
 CSQE offered by the American Society for Quality (ASQ)[44]
 CQIA offered by the American Society for Quality (ASQ)[44]

[edit] Controversy
Some of the major software testing controversies include:

What constitutes responsible software testing? 

Members of the "context-driven" school of testing [45] believe that there are no "best practices"
of testing, but rather that testing is a set of skills that allow the tester to select or invent testing
practices to suit each unique situation. [46]

Agile vs. traditional 

Should testers learn to work under conditions of uncertainty and constant change or should
they aim at process "maturity"? The agile testing movement has grown in popularity
since 2006, mainly in commercial circles [47][48], whereas government and military [49] software
providers have been slower to embrace this methodology in favour of traditional test-last
models (e.g. in the Waterfall model).
Exploratory test vs. scripted[50] 

Should tests be designed at the same time as they are executed or should they be designed
beforehand?

Manual testing vs. automated 

Some writers believe that test automation is so expensive relative to its value that it should be
used sparingly.[51] In particular, test-driven development states that developers should
write unit tests of the xUnit type before coding the functionality. The tests can then be
considered as a way to capture and implement the requirements.

Software design vs. software implementation[52] 

Should testing be carried out only at the end or throughout the whole process?

Who watches the watchmen? 

The idea is that any form of observation is also an interaction—the act of testing can also affect
that which is being tested[53].

SDLC models

Introduction

There are various software development approaches defined and designed which are
used during the development process of software; these approaches are also referred to as
"Software Development Process Models".

Each process model follows a particular life cycle in order to ensure success in process of
software development.

Waterfall Model

The waterfall approach was the first process model to be introduced and followed widely in
software engineering to ensure the success of a project. In the waterfall approach, the whole
process of software development is divided into separate process phases.

The phases in the waterfall model are: Requirement Specifications, Software Design,
Implementation and Testing, and Maintenance. All these phases are cascaded to each other, so that
the second phase is started only when the defined set of goals for the first phase has been achieved
and signed off; hence the name "Waterfall Model". All the methods and processes undertaken in
the waterfall model are more visible.

Waterfall Model

The stages of "The Waterfall Model" are:

Requirement Analysis & Definition: All possible requirements of the system
to be developed are captured in this phase. Requirements are the set of
functionalities and constraints that the end-user (who will be using the system)
expects from the system. The requirements are gathered from the end-user by
consultation; these requirements are analyzed for their validity, and the
possibility of incorporating them in the system to be developed is also studied.
Finally, a Requirement Specification document is created which serves as a
guideline for the next phase of the model.

System & Software Design: Before starting the actual coding, it is highly
important to understand what we are going to create and what it should look
like. The requirement specifications from the first phase are studied in this phase
and the system design is prepared. System design helps in specifying hardware
and system requirements and also helps in defining the overall system
architecture. The system design specifications serve as input for the next
phase of the model.

Implementation & Unit Testing: On receiving the system design documents, the
work is divided into modules/units and actual coding is started. The system is
first developed in small programs called units, which are integrated in the next
phase. Each unit is developed and tested for its functionality; this is referred to
as Unit Testing. Unit testing mainly verifies that the modules/units meet their
specifications.

Integration & System Testing: As specified above, the system is first
divided into units which are developed and tested for their functionality. These
units are integrated into a complete system during the Integration phase and tested
to check whether all modules/units coordinate with each other and the system as
a whole behaves as per the specifications. After successfully testing the
software, it is delivered to the customer.

Operations & Maintenance: This phase of the waterfall model is a
virtually never-ending (very long) phase. Generally, problems with the developed
system (which are not found during the development life cycle) come up
after its practical use starts, so the issues related to the system are solved after
deployment of the system. Not all the problems come to light immediately;
they arise from time to time and need to be solved, hence this process is referred to as
Maintenance.

Advantages and Disadvantages

Advantages

The advantage of waterfall development is that it allows for
departmentalization and managerial control. A schedule can
be set with deadlines for each stage of development and a
product can proceed through the development process like a
car in a carwash, and theoretically, be delivered on time.
Development moves from concept, through design,
implementation, testing, installation, troubleshooting, and
ends up at operation and maintenance. Each phase of
development proceeds in strict order, without any
overlapping or iterative steps.

Disadvantages

The disadvantage of waterfall development is that it does
not allow for much reflection or revision. Once an
application is in the testing stage, it is very difficult to go
back and change something that was not well thought out
in the concept stage. Alternatives to the waterfall model
include joint application development (JAD), rapid
application development (RAD), synch and stabilize,
build and fix, and the spiral model.

Advantages of Prototyping

There are many advantages to using prototyping in software development, some tangible, some
abstract.

Reduced time and costs: Prototyping can improve the quality of requirements and specifications
provided to developers. Because changes cost exponentially more to implement as they are
detected later in development, the early determination of what the user really wants can result in
faster and less expensive software.

Improved and increased user involvement: Prototyping requires user involvement and allows
them to see and interact with a prototype allowing them to provide better and more complete
feedback and specifications. The presence of the prototype being examined by the user prevents
many misunderstandings and miscommunications that occur when each side believes the other
understands what they said. Since users know the problem domain better than anyone on the
development team does, increased interaction can result in a final product that has greater tangible
and intangible quality. The final product is more likely to satisfy the users' desire for look, feel
and performance.

Volume Testing belongs to the group of non-functional tests, which are often
misunderstood and/or used interchangeably. Volume testing refers to testing a
software application with a certain amount of data. This amount can, in generic
terms, be the database size or it could also be the size of an interface file that is the
subject of volume testing. For example, if you want to volume test your application
with a specific database size, you will expand your database to that size and then test
the application's performance on it. Another example could be when there is a
requirement for your application to interact with an interface file (could be any file
such as .dat, .xml); this interaction could be reading and/or writing on to/from the
file. You will create a sample file of the size you want and then test the application's
functionality with that file in order to test the performance.
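
A minimal volume-testing sketch along these lines: generate an interface file of a chosen size and time a stand-in import routine against it. The file name, record format and 100 MB target are illustrative assumptions.

```python
# A minimal volume-testing sketch: build a large interface file, then time a
# hypothetical import routine against it.
import os
import time

def generate_interface_file(path: str, target_bytes: int) -> None:
    with open(path, "w") as f:
        row = 0
        while f.tell() < target_bytes:
            f.write(f"{row},customer_{row},ACTIVE\n")
            row += 1

def import_records(path: str) -> int:
    with open(path) as f:
        return sum(1 for _ in f)        # stand-in for the real import logic

generate_interface_file("volume_test.dat", target_bytes=100 * 1024 * 1024)
start = time.perf_counter()
count = import_records("volume_test.dat")
print(f"imported {count} records in {time.perf_counter() - start:.1f} s")
os.remove("volume_test.dat")
```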

Stress testing

Stress testing is a form of testing that is used to determine the stability of a given system or
entity. It involves testing beyond normal operational capacity, often to a breaking point, in order
to observe the results. Stress testing may have a more specific meaning in certain industries, such
as fatigue testing for materials.


[edit] Computer software


Main article: stress testing (software)

In software testing, a system stress test refers to tests that put a greater emphasis on robustness,
availability, and error handling under a heavy load, rather than on what would be considered
correct behavior under normal circumstances. In particular, the goals of such tests may be to
ensure the software does not crash in conditions of insufficient computational resources (such as
memory or disk space), unusually high concurrency, or denial of service attacks.

Examples:

 A web server may be stress tested using scripts, bots, and various denial of service tools to
observe the performance of a web site during peak loads.

Stress testing may be contrasted with load testing:

 Load testing examines the entire environment and database, while measuring the response
time, whereas stress testing focuses on identified transactions, pushing to a level so as to break
transactions or systems.
 During stress testing, if transactions are selectively stressed, the database may not experience
much load, but the transactions are heavily stressed. On the other hand, during load testing the
database experiences a heavy load, while some transactions may not be stressed.
 System stress testing, also known as stress testing, is loading the concurrent users over and
beyond the level that the system can handle, so it breaks at the weakest link within the entire
system.
[edit] Hardware
Reliability engineers often test items under expected stress or even under accelerated stress. The
goal is to determine the operating life of the item or to determine modes of failure.[1]

Stress testing, in general, should put the hardware under exaggerated levels of stress in order to
ensure stability when used in a normal environment.

[edit] Computer processors

When modifying the operating parameters of a CPU, such as in overclocking, underclocking,
overvolting, and undervolting, it may be necessary to verify that the new parameters (usually CPU
core voltage and frequency) are suitable for heavy CPU loads. This is done by running a CPU-
intensive program (usually Prime95) for extended periods of time, to test whether the computer
hangs or crashes. CPU stress testing is also referred to as torture testing. Software that is
suitable for torture testing should typically run instructions that utilise the entire chip rather than
only a few of its units.

Stress testing a CPU over the course of 24 hours at 100% load is, in most cases, sufficient
to determine that the CPU will function correctly in normal usage scenarios, where CPU
usage fluctuates at low levels (50% and under), such as on a desktop computer.

Security testing

Security testing is a process to determine that an information system protects data and maintains
functionality as intended.

The six basic security concepts that need to be covered by security testing are: confidentiality,
integrity, authentication, authorization, availability and non-repudiation.


[edit] Confidentiality
 A security measure which protects against the disclosure of information to parties other than
the intended recipient; this is by no means the only way of ensuring security.

[edit] Integrity
 A measure intended to allow the receiver to determine that the information which it is provided
is correct.
 Integrity schemes often use some of the same underlying technologies as confidentiality
schemes, but they usually involve adding additional information to a communication to form the
basis of an algorithmic check, rather than encoding all of the communication.

[edit] Authentication
 The process of establishing the identity of the user.
 Authentication can take many forms including but not limited to: passwords, biometrics, radio
frequency identification, etc.

[edit] Authorization
 The process of determining that a requester is allowed to receive a service or perform an
operation.
 Access control is an example of authorization.

[edit] Availability
 Assuring information and communications services will be ready for use when expected.
 Information must be kept available to authorized persons when they need it.

[edit] Non-repudiation
 A measure intended to prevent the later denial that an action happened, or a communication
that took place etc.
 In communication terms this often involves the interchange of authentication information
combined with some form of provable time stamp.

Sanity testing

A sanity test or sanity check is a basic test to quickly evaluate the validity of a claim or
calculation. In arithmetic, for example, when multiplying by 9, using the divisibility rule for 9 to
verify that the sum of digits of the result is divisible by 9 is a sanity test.
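
A minimal sketch of that arithmetic sanity check:

```python
# After multiplying by 9, the digit sum of the result must itself be divisible by 9.
def digit_sum(n: int) -> int:
    return sum(int(d) for d in str(abs(n)))

n = 12345
product = n * 9
assert digit_sum(product) % 9 == 0, "sanity check failed: result cannot be correct"
print(product, digit_sum(product))    # 111105, 9
```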

In computer science, a sanity test is a very brief run-through of the functionality of a
computer program, system, calculation, or other analysis, to assure that the system or
methodology works as expected, often prior to a more exhaustive round of testing.

Software development

In software development, the sanity test (a form of software testing which offers "quick, broad,
and shallow testing"[1]) determines whether it is reasonable to proceed with further testing.

Software sanity tests are commonly conflated with smoke tests. [2] A smoke test determines
whether it is possible to continue testing, as opposed to whether it is reasonable[citation needed]. A
software smoke test determines whether the program launches and whether its interfaces are
accessible and responsive (for example, the responsiveness of a web page or an input button). If
the smoke test fails, it is impossible to conduct a sanity test. In contrast, the ideal sanity test
exercises the smallest subset of application functions needed to determine whether the
application logic is generally functional and correct (for example, an interest rate calculation for
a financial application). If the sanity test fails, it is not reasonable to attempt more rigorous
testing. Both sanity tests and smoke tests are ways to avoid wasting time and effort by quickly
determining whether an application is too flawed to merit any rigorous testing. Many companies
run sanity tests and unit tests on an automated build as part of their development process.[3]

The Hello world program is often used as a sanity test for a development environment. If Hello
World fails to compile or execute, the supporting environment likely has a configuration
problem. If it works, the problem likely lies in the real application being diagnosed.

Another, possibly more common usage of 'sanity test' is to denote checks which are performed
within program code, usually on arguments to functions or returns therefrom, to see if the
answers can be assumed to be correct. The more complicated the routine, the more important that
its response be checked. The trivial case is checking to see that a file opened, written to, or
closed, did not fail on these activities – which is a sanity check often ignored by programmers.
But more complex items can also be sanity-checked for various reasons.

Examples of this include bank account management systems which check that withdrawals are
sane in not requesting more than the account contains, and that deposits or purchases are sane in
fitting in with patterns established by historical data – large deposits may be more closely
scrutinized for accuracy, and large purchase transactions may be double-checked with a card
holder for validity against fraud; these are "runtime" sanity checks, as opposed to the
"development" sanity checks mentioned above.
Smoke testing

Smoke testing is a term used in plumbing, woodwind repair, electronics, computer software
development, infectious disease control, and the entertainment industry. It refers to the first test
made after repairs or first assembly to provide some assurance that the system under test will not
catastrophically fail. After a smoke test proves that "the pipes will not leak, the keys seal
properly, the circuit will not burn, or the software will not crash outright," the assembly is ready
for more stressful testing.

In computer programming and software testing, smoke testing is a preliminary to further
testing, which should reveal simple failures severe enough to reject a prospective software
release. In this case, the smoke is metaphorical.

Smoke testing in software development

A subset of all defined/planned test cases that cover the main functionality of a component or
system, for ascertaining that the most crucial functions of a program work, but not bothering
with finer details. A daily build and smoke test is among industry best practices. Smoke testing is
done by testers before accepting a build for further testing. Microsoft claims[1] that, after code
reviews, smoke testing is the most cost-effective method for identifying and fixing defects in
software.

In software engineering, a smoke test generally consists of a collection of tests that can be
applied to a newly created or repaired computer program. Sometimes the tests are performed by
the automated system that builds the final software. In this sense a smoke test is the process of
validating code changes before the changes are checked into the larger product’s official source
code collection or the main branch of source code.

In software testing, a smoke test is a collection of written tests that are performed on a system
prior to being accepted for further testing. This is also known as a build verification test. This is a
"shallow and wide" approach to the application. The tester "touches" all areas of the application
without getting too deep, looking for answers to basic questions like, "Can I launch the test item
at all?", "Does it open to a window?", "Do the buttons on the window do things?"

The purpose is to determine whether or not the application is so badly broken that testing
functionality in a more detailed way is unnecessary. These written tests can either be performed
manually or using an automated tool. When automated tools are used, the tests are often initiated
by the same process that generates the build itself. This is sometimes referred to as "rattle"
testing - as in "if I shake it does it rattle?".

Smoke tests can be broadly categorized as functional tests or unit tests. Functional tests exercise
the complete program with various inputs. Unit tests exercise individual functions, subroutines,
or object methods. Both functional testing tools and unit testing tools tend to be third party
products that are not part of the compiler suite. Functional tests may be a scripted series of
program inputs, possibly even an automated mechanism for controlling mouse movements. Unit
tests may be separate functions within the code itself, or a driver layer that links to the code
without altering the code being tested.
Ad hoc testing

Ad hoc testing is a commonly used term for software testing performed without planning and
documentation (but can be applied to early scientific experimental studies).


The tests are intended to be run only once, unless a defect is discovered. Ad hoc testing is a part
of exploratory testing, being the least formal of test methods. In this view, ad hoc testing has
been criticized because it isn't structured, but this can also be a strength: important things can be
found quickly. It is performed with improvisation: the tester seeks to find bugs by any means
that seem appropriate. It contrasts with regression testing, which looks for a specific issue with
detailed reproduction steps and a clear expected result. Ad hoc testing is most often used as a
complement to other types of testing.
