
Ethiopian Civil Service University

Center for Public Policy Studies


Course: Advanced Research Methods

Identifier: M-001
Title: Handout (Reader)

Prepared by:
Atakilt Hagos

March, 2014
Addis Ababa


Summary of the Didactic Design for the Module


General Data

Module Number: PPS-562
Module Title: Advanced Research Methods
Module Level: Masters
Abbreviation: EPP
Duration in Semesters: Two
Frequency: Once a year
Language: English
Mode of Delivery: Face-to-face
ECTS: 7

Module Description:
The module Advanced Research Methods is offered to masters students of public policy studies. Its aim is to equip students with the basic knowledge and skills to do research on public policy issues, processes and outcomes/impacts. The module therefore provides students with some conceptual and theoretical background to advanced research methods. It will enable participants to identify and apply appropriate research methodologies in order to plan, conduct and evaluate basic research in organizations. In addition, the module will enhance students' ability to understand public policy problems from the perspectives of various fields, to identify appropriate research designs, and to understand the techniques of writing a basic research report. It will, furthermore, enable participants to understand the ethical issues in scientific research while laying the foundation for research skills at higher levels. As part of the assessment, participants will individually select a topic and prepare a research proposal, on which they will receive feedback so that they are prepared for the tasks they face in Thesis I and II. A statistical package (e.g., SPSS) will be used wherever applicable.

The module introduces participants to the meaning of scientific research and the research process. It also sheds some light on types of research design as well as sampling design, and drills down into data types/sources and methods of data collection. Participants will be introduced to the basics of proposal writing, including referencing, to qualitative and quantitative research strategies, and to methods of analyzing the data. They will also acquire skills in using SPSS to analyze quantitative data; specifically, they will apply SPSS to generate descriptive statistics from a data set and to conduct correlation analysis, hypothesis testing and regression analysis.

The module uses lecture, individual work, collaborative work, tutorial, and presentation as teaching and learning methods. The assessment methods for this module are short tests/quizzes, individual assignments, group assignments and a final exam.

Workload

Contact Hours: 70 hrs
Non-Contact Hours: 140 hrs
Total Hours: 210 hrs

Assessment

Description: The module follows continuous assessment, where students are given individual assignments, group assignments, short quizzes and a final exam:
Individual assignments (20%)
Group assignment (20%)
Short quizzes (20%)
Final examination (40%)
Total 100%

Examination Type: Written
Examination Duration: 180 minutes
Assignments: Both individual and group assignments
Repetition: Once only, in cases of certified health problems or maternity leave, and for students who scored an F grade in the module.

Learning Outcomes

At the end of the module, students will be able to:
1. Apply the concepts, approaches and methods of scientific research in developing a research proposal and in designing data collection instruments
2. Review the literature, research proposals and research reports prepared by other researchers
3. Analyze qualitative/quantitative data using statistical packages
4. Conduct scientific research individually or as a member of a research group
5. Understand the meaning, characteristics, and steps of scientific research and the various types of research design and sampling design

Prerequisites

Basic mathematics & statistics

Content

PART ONE: Research Meaning, Process, and Proposal Writing
1. Scientific Research and the Research Process: Meaning and objectives of scientific research; Characteristics of scientific research; Criteria of good research; The research process: an overview

2. Research Approaches and Sampling Design: Types of research and research approaches/designs; Determining the sample design (probability and non-probability sampling)

3. Data Sources and Methods of Data Collection: Data types and sources; Methods of collecting primary data; Sources of secondary data; Guideline for designing a questionnaire and other instruments
4. Research Proposal, Referencing, Reporting Results and Ethical Considerations: The research proposal; Referencing styles; Report writing: communicating the results; Preparing and delivering a presentation; Ethical issues and precautions
5. The survey method and case studies
PART TWO: PRESENTING AND ANALYZING QUANTITATIVE
DATA AND HYPOTHESIS TESTING (With SPSS Application)
6. Presentation and analysis of quantitative data - Descriptive
Statistics: Basic concepts in statistics; Classification and
Presentation of Statistical Data (bar chart, pie chart, histogram);
Measures of central tendency and dispersion (mean, median,
mode, mean deviation, variance, standard deviation, covariance,
Z-score); Exercise with SPSS Application
7. Tests of hypothesis concerning means and proportions: Tests
of hypotheses concerning means; Tests concerning the difference
between two means (independent samples); Tests of mean
difference between several populations (independent samples);
Paired-samples t-test (Differences between dependent groups);
Tests of association (the Pearson coefficient of correlation and
test of its significance, The Spearman rank correlation
coefficient and test of its significance); Nonparametric
Correlations (The Chi-square test); Hypothesis test for the
difference between two proportions; Exercise with SPSS
Application
8. The simple linear regression model and statistical inference: the simple linear regression model; estimation of regression coefficients and interpreting results; hypothesis testing; Exercise with SPSS application
9. The multiple linear regression model and statistical inference: the multiple linear regression model; estimation of regression coefficients and interpreting results; hypothesis testing; Exercise with SPSS application
Learning & Teaching Methods

Lecture, tutorial, individual work, collaborative work, presentation

Media

Face-to-face lecture, notes, PowerPoint slides, internet, books, journal articles

Literature

Electronic books:
Kothari, C.R. (2004). Research Methodology: Methods and Techniques. New Age International (P) Limited Publishers, New Delhi.
Dawson, Catherine (2002). Practical Research Methods. How To Books, Oxford, UK.
Punch, Keith F. (2006). Developing Effective Research Proposals.
Marczyk, DeMatteo and Festinger (2005). Essentials of Research Design and Methodology.
Gray, David E. (2004). Doing Research in the Real World.
Greener, Sue (2008). Business Research Methods.
Flick, von Kardorff and Steinke (2004). A Companion to Qualitative Research.
Singh, Yogesh Kumar (2006). Fundamentals of Research Methodology and Statistics.
Dowdy, Wearden and Chilko (2004). Statistics for Research, 3rd ed.
Yin, Robert K. (1994). Case Study Research.

Books from the library:
Hill, R. Carter (1997). Undergraduate Econometrics. Wiley, New York.
Gujarati, Damodar N. (1994). Basic Econometrics.


The Reader
Part I: Research Meaning and Process
Unit One: Scientific Research and the Research Process
As students of the master's program, and as professionals after graduation, you will be engaged in scientific research. As decision makers, you may be provided with information on the progress and findings of a research project sponsored by your organization or another agency. One way or another, you are likely to be involved in research, so it is essential for you to know what research is and how it is carried out. Research requires passion, knowledge and skills. So what is research? Why do we conduct research? What are the building blocks of scientific research? What process should you follow in conducting scientific research? We will address these and other questions in this chapter.
Learning Objectives: After reading this chapter, you should be able to:
Explain the meaning and objectives of research
Discuss the characteristics of scientific research
Describe the criteria of good research
Distinguish between inductive and deductive research
Discuss the nine steps of the research process

1.1 Meaning and objectives of scientific research

According to the Merriam-Webster Online Dictionary, the word "research" is derived from the French "recherche", which means "to go about seeking". Research has been defined in a number of different ways. The Merriam-Webster Online Dictionary defines research in more detail as "a studious inquiry or examination; especially: investigation or experimentation aimed at the discovery and interpretation of facts, revision of accepted theories or laws in the light of new facts, or practical application of such new or revised theories or laws".
The Market Research Society (in the UK) defines (social science) research as "the application of scientific research methods to obtain objective information on people's attitudes and behavior, based usually on representative samples of the relevant populations" (Yvonne McGivern, 2003).

Objectives of Research:
To gain familiarity with a phenomenon or to achieve new insights into it (via exploratory or formative research studies)
To portray/describe the characteristics of a particular individual, situation or group (via descriptive research studies)
To determine the frequency with which something occurs or with which it is associated with something else (via diagnostic research studies)
To test a hypothesis of a causal relationship between variables (via hypothesis-testing research studies)

1.2 Characteristics of scientific research

Generally, scientific research has the following characteristics:

It is empirical: science is based on observation and measurement, and the vast majority of research involves some type of practical experimentation.

It relies upon data: quantitative and qualitative.

It is intellectual and visionary: science requires vision and the ability to observe the implications of results. The visionary part of science lies in relating the findings back to the real world. This process of relating findings to the real world is known as induction, or inductive reasoning, and is a way of relating the findings to the universe around us.

It uses experiments to test predictions: the process of induction and generalization allows scientists to make predictions about how they think something should behave, and to design an experiment to test those predictions, whether in a laboratory or by observing the natural world.

It is systematic and methodical: it follows certain steps that are repeatable.

1.3 Criteria of good research

Whatever the type of research work or study, all meet on the common ground of the scientific method they employ. One expects scientific research to satisfy the following criteria:
i) Good research is systematic: research is structured with specified steps to be taken in a specified sequence, in accordance with a well-defined set of rules. Being systematic does not rule out creative thinking, but it certainly rejects the use of guessing and intuition in arriving at conclusions.
ii) Good research is logical: research is guided by the rules of logical reasoning, and the logical processes of induction and deduction are of great value in carrying it out. Induction is the process of reasoning from a part to the whole, whereas deduction is the process of reasoning from some premise to a conclusion which follows from that very premise. Logical reasoning makes research more meaningful in the context of decision making.
iii) Good research is empirical: research relates basically to one or more aspects of a real situation and deals with concrete data, which provides a basis for the external validity of research results.
iv) Good research is replicable: this characteristic allows research results to be verified by replicating the study, thereby building a sound basis for decisions.

1.4 Deductive vs Inductive Research

Scientific research follows logical reasoning, which could be deductive or inductive.

Induction: moving from the specific to the general. Based on observed facts, researchers develop principles or theories, which could later be used as the basis for deductive research. In induction, the conclusion is based on reasons, with proof and evidence for a fact. For example, observing in survey after survey that households with more education report higher incomes, and generalizing this into a principle, is inductive reasoning.

Deduction: moving from the general to the specific. Arguments are based on laws, rules and accepted principles, and conclusions follow from premises. For example, starting from an accepted theory that education raises income and predicting that more-educated households in a particular city will earn more is deductive reasoning.

1.5 The Research Process: An Overview


The research process depends on the type of the research logic (deductive or inductive).

1.5.1 The Deductive Research Process


The research process involves several steps. The number of steps may differ from author to author; however, most authors divide the entire research process into NINE steps. In this course, we will discuss these nine steps. Please bear in mind, though, that not all of these steps are equally applicable to all types of research.

Step One: Find a Research Topic and State Your Problem


This is the first step in the research process. From my experience as an academic staff member, from my exposure to different research methodology trainings, and from my own research experience, I have seen two approaches or practices regarding the selection of a research topic.

The first practice is that masters and even PhD students are required to make their topic as broad as possible and incorporate as many aspects of the problem as possible. The advantage of this approach is that the student gains some wider knowledge of the problem; what it lacks is depth, or it takes more time to make the research both wide and deep at the same time. In the absence of sufficient time and resources:
o The literature survey becomes too broad.
o Many of the variables in this type of research are not well identified or not well defined, and relevant indicators are not sufficiently included.
o As a result, the questions in a questionnaire or interview are too general and shallow, and the research methods adopted tend in most cases to be descriptive.

The second practice, an increasingly dominant one in western countries, is that students are obliged to narrow down their research topic, make it specific, and dig deep into the issues. In this case, the researcher can do a better job given the limited time and resources he/she has. Specifically:
o The literature survey becomes targeted.
o The variables or assessment issues are well identified.
o The specific indicators for each variable are also identified.
o The data collection instruments will be sharper and to the point.
o Students will have the opportunity to go beyond simple description of phenomena and attempt an analysis of causes and effects, hypothesis testing, etc.

With the quality of research in mind, the second approach is more appealing. Accordingly, you can follow this guideline while identifying your research problem/topic:

Decide on the general area of interest or aspect of a subject matter that you would like to enquire into, and consider the feasibility of a particular solution. Pick a smaller part of a bigger problem; do not try to address a big problem in one piece of research.

Understand the problem by discussing it with friends/colleagues, with those who have some expertise in the matter, or with agencies working on the issue.

Narrow the problem down based on the general discussion, and phrase the problem in operational terms. This process of narrowing down the problem is iterative.

Examine all available literature to acquaint yourself with the selected problem. There are two types of literature: the conceptual (concepts and theories) and the empirical. This can also help the researcher know what data are available.

Carefully verify the validity and objectivity of the background facts concerning the problem.

Define pertinent terms:
o What are the key variables in your study?
o What relationships do you investigate?


Checklist for a good research topic
Is the topic something in which you are really interested?
Does the topic have a clear link to theory?
Do you have, or are you able to develop, the necessary research skills to undertake the topic?
Is your topic socially relevant?
Avoid subjects that are overdone.
A controversial subject should not be the choice of an average researcher.
Avoid problems that are too narrow or too vague.
The subject should be familiar and feasible.
Consider the importance of the area/subject, the capacity of the researcher, cost and time requirements, accessibility of necessary cooperation, etc.
Note:
i) Your problem statement must be specific to the issue at hand and often ends with research questions. Within a given research topic, different researchers could formulate different research questions. It is therefore very important to write down your research questions at the end of the problem statement; the research questions indicate specifically what your study is about.
ii) While stating the problem, you may need to provide some data or information to express the magnitude of the problem. This may require preliminary data gathering.

Step Two: Literature Survey: Theoretical and Conceptual Framework

Once you have chosen your research topic and narrowed it down, you have to carry out an extensive survey of the academic literature (journal articles, conference proceedings, books, unpublished materials, etc.) to learn more about the theories and debates surrounding the topic and to understand the specific nature of the problem as discussed in the academic literature. You also need to gather some preliminary information regarding the topic and the study area from government and non-government documents, reports, etc.
Expected Outputs from the Literature Survey
The theoretical and conceptual framework is what is ultimately expected as the product of your literature survey.

i) The Theoretical Framework

The theoretical framework is a summary of the theories that you will refer to in your study. You will refer to these theories when developing the hypotheses and the conceptual framework, when preparing the research design, and when doing the data analysis and generalization. Your conceptual framework will indicate the important issues to be assessed or the variables to be measured, their possible indicators, the type and direction of the relationships among the variables, and so on.
What you summarize as part of the theoretical framework has to be very relevant to the topic, and particularly to the research problem and the research questions. While conducting the literature survey, students often throw in whatever literature is in one way or another related to the research area, but not necessarily to the research problem. Failing to prepare the theoretical framework properly has at least the following disadvantages:
a) You will not have the basis to define relevant concepts, or to identify the assessment issues or variables and define them.
b) You don't know what relationships to expect.
c) It will not be easy for you to choose the appropriate research design.
d) Your data collection instruments will be ill-designed.
e) During analysis, you will not have any theory to compare your results with.
f) The contribution of your research to the existing theory will be blurred.
ii) The Conceptual Framework

Based on the theoretical framework, you are expected to develop your conceptual framework. In this part, you will define the concepts you will use in your research. In the literature, concepts may be defined in different ways, and you will have to make a choice here. In your study, how are the concepts defined operationally? What are the variables and indicators that you will use to measure the concepts? How do the different concepts relate to each other? These and other questions should be answered via your conceptual framework.
Conceptual frameworks are best presented graphically rather than in text. The diagram depicts the concepts/issues and how they relate to each other. You may create your own illustrative diagram or adapt one from the literature. In the latter case, you have to clearly cite the sources for your diagram.

Step Three: Development of Working Hypotheses

If you are conducting a deductive type of research, you will be required to develop hypotheses based on the theories you encountered while doing your literature survey. The question is: what do the theories say about the phenomenon you are investigating and the relationships among the various variables involved?
A hypothesis:

Is a tentative assumption made in order to draw out and test its logical or empirical consequences; it should be specific and pertinent to the piece of research in hand.

Provides the focal point for the research: it delimits the area, sharpens thinking, and keeps the researcher on the right track.

Determines the data types required, the data collection and sampling methods to be used, and the tests that must be conducted during data analysis.

Results from a priori thinking about the subject and examination of the available data and material.

Step Four: Preparing the Research Design

Once you have developed your hypotheses, the next step is to craft your research design. A research design is like the blueprint for house construction. If you start building a house without first having the design (consisting of the architectural, electrical, sanitary, etc. designs), you do not know what type of house you will end up with; the work will be costly and time-consuming, often involving construction and then demolition of what has been constructed. Most importantly, the house will lack quality and may be prone to risks. Likewise, research conducted without a research design at hand is aimless, ambiguous, time-consuming, costly, and may be totally irrelevant and unacceptable in light of the requirements of a scientific investigation.
A research design refers to the crafting of the conceptual structure within which the research will be conducted in a way that is as efficient as possible, collecting relevant evidence with minimal expenditure of effort, time and money. More explicitly, the design decisions are made in respect of:
i. What is the study about?
ii. Why is the study being made?
iii. Where will the study be carried out?
iv. What type of data is required?
v. What periods of time will the study include?
vi. What will be the sample design?
vii. What techniques of data collection will be used?
viii. How will the data be analysed?
ix. In what style will the report be presented?

Step Five: Collecting the Data and Administering Data Collection

Once you have completed the research design and had it endorsed by the concerned parties, you can proceed to data collection using the data collection instruments you developed as part of your research design. We have seen that scientific research is empirical; therefore, you need to gather empirical data to answer your research questions and meet your research objectives. This involves collecting the data through observation, personal interviews, telephone interviews, mailed questionnaires, schedules, etc.
Administering Data Collection (Managing the Project)
During the data collection process, the researcher must address the possible problem of bias in information collection. Possible sources of bias during data collection:
Defective instruments, such as questionnaires, weighing scales or other measuring equipment, etc.
Observer bias
Effect of the interview on the informant
Information bias

These sources of bias can be prevented by carefully planning the data collection process and by pre-testing the data collection tools. All these potential biases threaten the validity and reliability of your study; by being aware of them, it is possible, to a certain extent, to prevent them.
Managing the project (the data collection process) involves the following:
o Organizing fieldwork
o Briefing interviewers (enumerators) and coordinators
o Developing an analysis plan (e.g., coding)
o Organizing data processing (e.g., entering the coding of the questionnaire items into an Excel or SPSS spreadsheet; data entry can begin before data collection has been completed)
o Starting the analysis
o Checking and reporting progress of data collection

Step Six: Analysis of Data

A. Data Preparation
Once all the data required to answer the research questions and meet the research objectives have been collected, you can start analyzing the data. First, you start with data preparation. Data preparation involves the establishment of categories, the application of these categories to the raw data through coding, tabulation, and then the drawing of statistical inferences. Coding, data editing, tabulation, computation of different statistics, etc. follow.
Editing: Editing is the process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct them where possible.
Coding: Coding is the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes; such classes should be appropriate to the research problem.
Classification: The raw data must be grouped into classes on the basis of common characteristics; data having a common characteristic are placed in one class, and in this way the entire data set gets divided into a number of groups or classes.
Tabulation/Compilation: This is the process of summarizing and displaying raw data in compact form (statistical tables) for further analysis. In a broader sense, tabulation is an orderly arrangement of data in columns and rows.
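
To make coding and tabulation concrete, here is a minimal sketch in Python (the module itself uses SPSS or Excel for this; the responses and the coding scheme below are hypothetical):

import pandas as pd

# Raw responses to a single questionnaire item (hypothetical data).
responses = pd.Series([
    "agree", "disagree", "agree", "neutral", "agree", "disagree",
])

# Coding: assign numerals to the answer categories.
codes = {"disagree": 1, "neutral": 2, "agree": 3}
coded = responses.map(codes)

# Tabulation: summarize the coded data in a compact frequency table.
table = coded.value_counts().sort_index()
print(table)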
B. Data Analysis
This is a very important part of your study. Depending on the methods you selected during your research design, which in turn depend on the type of research (exploratory, descriptive, etc.) and the research approach (qualitative, quantitative, or both), you analyze the data using those methods. What is expected as the result of your analysis is the set of findings pertinent to the research questions and objectives.

Step Seven: Hypothesis Testing

For those types of research in which hypotheses must be developed in advance based on theory, this is the time to test the hypotheses against the findings of your analysis. There are different types of statistical tests of hypotheses: the chi-square test, t-test, F-test, ANOVA, etc.
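
As an illustration of one such test, here is a minimal sketch of an independent-samples t-test using Python's scipy library (the module uses SPSS for such tests; the two groups of scores below are hypothetical):

from scipy import stats

group_a = [12.1, 14.3, 13.5, 15.0, 12.8]  # hypothetical scores, group A
group_b = [10.9, 11.5, 12.2, 11.0, 13.1]  # hypothetical scores, group B

# H0: the two population means are equal.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(t_stat, p_value)  # reject H0 at the 5% level if p_value < 0.05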

Step Eight: Generalization and Interpretation

Your analysis and hypothesis testing are of little use unless they lead you to certain generalizations. After a hypothesis has been tested several times, you may arrive at generalizations. Depending on the type of research, your generalization could take different forms. For example:

In exploratory research, it could take the form of proposing a hypothesis to be tested by several other studies in the future.

In explanatory studies, your generalization could take the form of statements regarding which factors explain the dependent variable and whether this accords with what is stated in theory.

If you had no hypothesis at the beginning, you explain the findings on the basis of some theory; this is known as interpretation. The process of interpretation may trigger new questions, which will serve as a basis for further research.

Step Nine: Preparation of the Research Report

Your research effort and output are summarized and presented in the research report or thesis. If your language and writing style are not up to the expected standard, the reader may find your work annoying and less interesting to read, no matter how good your topic and research questions may be. If it is your masters thesis and the report is not well written, you are likely to face a problem when your advisor and examiners read your paper. If it is for publication in a journal, the editor or the blind reviewers will reject the paper unless it is well written in light of some standard or expected style.

1.5.2 The Inductive Research Process

[Flow diagram in the original: the inductive research process moves from a research topic, through observation and artifacts (interviews, "hanging out", focus groups, field notes) and tentative research questions, to field work (data collection), data analysis, tentative working hypotheses, and the refining of hypotheses (collecting additional data if necessary), and ends with developing a theory and writing the report.]

Unit Two
Types of Research Design and the Sampling Design

2.1 Research Approaches
2.1.1 Types of research based on the nature of the enquiry
Based on the nature of the research enquiry, research approaches are classified as exploratory, descriptive, explanatory (causal), or predictive.
A. Exploratory research
Research undertaken to explore an issue or topic in order to identify a problem, clarify the nature of the problem, or define the issue involved. It can be used to develop propositions (hypotheses) for further research and to gain new insights and a greater understanding of the issue, especially when little is known about it.
Characteristics of exploratory research:
It is more qualitative than quantitative.
It is carried out on a small scale rather than a large scale.
It provides answers to the questions what, how and why.
It is concerned with hypothesis development rather than hypothesis testing.
It can be conducted as a pre-study: literature review, focus group, experience survey, brainstorming.
It can also be conducted as a main study: observation, case study/ies.

Exploratory studies can be carried out based on a literature search (review), an experience survey, or the analysis of selected cases. Observation, focus groups, and interviews are useful methods of data collection for exploratory research.
B. Descriptive research (ex-post facto research)
Fact-finding enquiries describing the state of affairs as it exists. The researcher has no control over the variables and can only report what happened or what is happening, using survey or correlational methods. Descriptive research aims at answering the questions Who? What? Where? When? How? and How many?, and is carried out to answer more clearly defined research questions.

Classification of descriptive studies: Descriptive studies can be longitudinal or cross-sectional.

i) Longitudinal: studying units (e.g., households) over time (e.g., over several years). Such a study could be based on either a true panel or an omnibus panel.
True panel: the units of analysis included in the sample (e.g., households) are consistently studied over time. For example, if the study is about the consumption patterns of households and Ato Ayele's household is part of the study, Ato Ayele's household will be studied throughout the time period.
Omnibus panel: the members of the sample may change. In the previous example, Ato Ayele's household could be part of the study in year I but not in year II.

ii) Cross-sectional: studying different units (e.g., households, sub-cities, regions, etc.) at a given point in time.
C. Causal or explanatory research (hypothesis-testing/experimental)
Helps to develop causal explanations about variables/factors by addressing "why" questions: Why do people choose brand A and not brand B? Why are some customers, and not others, satisfied with a firm's product? Why do some celebrities, and not others, use drugs? Explanatory research may involve experiments (laboratory or field experiments).
D. Predictive research
Predicts the likely future effects of current actions using "if...then" propositions.

2.1.2 Types of research based on the mode of data collection
a) Continuous (longitudinal) research: collects data over many years, following the subjects over time.
b) Ad hoc (one-time) research.

2.1.3 Types of research based on the type of data

a. Quantitative: applied to phenomena that can be expressed quantitatively. It involves the generation of data in quantitative form, which can be subjected to rigorous quantitative analysis in a formal and rigid fashion. It can be further classified into inferential, experimental and simulation approaches:
o Inferential approach: the purpose is to form a database from which to infer characteristics of, or relationships within, a population, e.g., survey research.
o Experimental approach: characterized by much greater control over the research environment and the manipulation of some variables to observe their effect on other variables.
o Simulation approach: involves the construction of an artificial environment to permit observation of the dynamic behaviour of a system (or its sub-systems) under controlled conditions.

b. Qualitative: applied to quality or kind, for example to describe the underlying motives of human behavior. Motivation research, for instance, investigates why people think or do certain things, using depth interviews. Other techniques include the word association test, the sentence completion test, the story completion test, etc.
o Concerned with subjective assessment of attitudes, opinions and behaviour.
o The research is a function of the researcher's insights and impressions (judgements).
o The result is either in non-quantitative form or in forms that are not subjected to rigorous quantitative analysis.
o Utilizes techniques such as focus group interviews, projective techniques and depth interviews.


2.1.4 Types of research based on the use of the research output
a) Applied: to solve immediate problems of society.
b) Fundamental/basic/pure: to develop theories, for knowledge's sake.

2.1.5 Types of research based on the degree of theorization: conceptual vs empirical
Conceptual: related to some abstract ideas or theory; generally used by philosophers and thinkers to develop new concepts or reinterpret existing ones.
Empirical: relies on experience or observation alone; data-based research that starts with a working hypothesis or guess, then collects data, then proves or disproves the hypothesis; characterized by the control and manipulation of variables.

2.1.6 Types of research based on the environment in which it is carried out
a) Field-setting research
b) Laboratory/simulation research

2.1.7 Types of research based on the cause of the research
a) Conclusion-oriented (the researcher is free to pick a problem according to his/her wishes)
b) Decision-oriented (the research problem emanates from the needs of a decision maker, e.g., operations research)

2.2 Determining Sample Design

2.2.1 Key elements of the sampling design
While explaining the research design as the fourth step in the research process, we saw that the sampling design is part of the research design or the research methodology. When you deal with the sample design, you are deciding how a sample will be selected before data collection actually takes place. While developing a sampling design, you must pay attention to the following points:
(i) Type of universe: finite or infinite universe.
(ii) Sampling unit: a sampling unit may be a geographical one such as a state, district or village; a construction unit such as a house or flat; a social unit such as a family, club or school; or an individual. The researcher will have to decide on one or more of such units to select for the study.
(iii) Source list (sampling frame): the frame from which the sample is to be drawn. It contains the names of all items of the universe (in the case of a finite universe only). If a source list is not available, the researcher has to prepare one.
(iv) Size of sample: an optimum sample is one which fulfills the requirements of efficiency, representativeness, reliability and flexibility. Furthermore, the desired precision, an acceptable confidence level for the estimate, the parameters of interest and the budgetary constraint must invariably be taken into consideration when deciding the sample size.
(v) Parameters of interest: in determining the sample design, one must consider the specific population parameters that are of interest (e.g., mean, median, mode).
(vi) Budgetary constraint: cost considerations; these can even lead to the use of a non-probability sample.
(vii) Sampling procedure: there are several sample designs, out of which you must choose one for your study. Obviously, you must select the design which, for a given sample size and a given cost, has the smaller sampling error.
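
As an illustration of how precision and confidence level drive the sample size, the textbook formula for estimating a population mean is n = (z * sigma / e)^2. A minimal sketch in Python, with hypothetical values for sigma and the margin of error:

import math

z = 1.96      # standard-normal value for a 95% confidence level
sigma = 12.0  # assumed population standard deviation (hypothetical)
e = 2.0       # desired precision (margin of error, hypothetical)

# Round up: a fractional respondent is not possible.
n = math.ceil((z * sigma / e) ** 2)
print(n)      # 139 respondents in this hypothetical case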

2.2.2 Criteria for selecting a sampling procedure (mainly for deductive research)
While preparing your sampling design, you must remember that two costs are involved:
i) the cost of collecting the data, and
ii) the cost of an incorrect inference resulting from the data.

There are two causes of incorrect inferences: systematic bias and sampling error.

Systematic bias
i) A systematic bias results from errors in the sampling procedures, and it cannot be reduced or eliminated by increasing the sample size.
ii) At best, the causes responsible for these errors can be detected and corrected.
Usually a systematic bias is the result of one or more of the following factors:
1. Inappropriate sampling frame
2. Defective measuring device
3. Non-respondents
4. Indeterminacy principle: individuals sometimes act differently when kept under observation than they do in non-observed situations.
5. Natural bias in the reporting of data: people in general understate their incomes if asked about them for tax purposes, but overstate them if asked for social status or to show their affluence. In psychological surveys, people tend to give what they think is the correct answer rather than revealing their true feelings.
Sampling errors
Sampling errors:

Are the random variations in the sample estimates around the true population parameters (e.g., the population mean).

Occur randomly and are equally likely to be in either direction; their nature is therefore compensatory, and the expected value of such errors is zero.

Decrease as the size of the sample increases, and are of smaller magnitude for a homogeneous population.

Can be reduced by increasing the sample size, thereby improving precision. But increasing the sample size has its own limitations: a large sample increases the cost of collecting data and can also enhance the systematic bias.

Thus the effective way to increase precision is usually to select a better sampling design, one with a smaller sampling error for a given sample size at a given cost. In practice, however, people often prefer a less precise design because it is easier to adopt and because systematic bias can be controlled better in such a design.

In brief, while selecting a sampling procedure, the researcher must ensure that the procedure causes a relatively small sampling error and helps to control the systematic bias.
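
The relationship between sample size and sampling error can be made concrete with the standard error of the sample mean, sigma / sqrt(n). A minimal sketch (with an assumed population standard deviation) showing that quadrupling the sample size only halves the sampling error:

import math

sigma = 10.0  # assumed population standard deviation (hypothetical)
for n in (25, 100, 400):
    # Standard error of the mean shrinks with the square root of n.
    print(n, sigma / math.sqrt(n))  # 2.0, 1.0, 0.5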

Characteristics of a good sampling design

From what has been stated above, we can list the characteristics of a good sample design as follows:
(a) The sample design must result in a truly representative sample.
(b) The sample design must result in a small sampling error.
(c) The sample design must be viable in the context of the funds available for the research study, and practical for picking the selected elements of the sample.
(d) The sample design must be such that systematic bias can be well controlled.
(e) The sample should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.

2.2.3 Different Types of Sample Designs: Probability and Non-probability

Sample design involves probability or non-probability sampling. Sampling is necessary if we are not going for a census. A census is a complete enumeration of the entire population; the population and housing census conducted in Ethiopia in 1986 E.C. can be taken as an example.

There are several reasons for taking a sample (and hence preparing a sampling design) instead of a complete enumeration of the whole population (a census). These include:
a) A census may be very expensive.
b) A census may require too much time.
c) A carefully obtained sample may be more accurate than a census. For example, in a large inventory census or in a complete audit, errors due to fatigue or carelessness on the part of the census taker may introduce a serious bias into the results.
Broadly speaking, there are two types of sampling techniques: random sampling and non-random sampling. In random sampling, the elements to be included in the sample depend entirely on chance. Random sampling techniques often yield samples that are representative of the population from which they are drawn. In non-random sampling, the units in the sample are chosen by the investigator based on his/her personal convenience and beliefs.

Probability sampling includes techniques such as simple random sampling, systematic random sampling, stratified sampling, and cluster/area sampling.

Non-probability (purposive) sampling includes techniques such as convenience, judgemental, and quota sampling.

A. Random or Probability Sampling Techniques

Simple Random Sampling: a method of sampling in which every member of the population has the same chance of being included in the sample.

Systematic Random Sampling: in some instances, the most practical way of sampling is to select, say, every 20th name on a list, every 12th house on one side of a street, every 50th piece coming off a production line, and so on. This is called systematic sampling; an element of randomness can be introduced into this kind of sampling by using random numbers to pick the unit with which to start.
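
A minimal sketch of systematic random sampling in Python (the population list and the sampling interval are hypothetical):

import random

population = list(range(1, 201))  # 200 hypothetical units, numbered 1-200
k = 20                            # sampling interval

start = random.randint(0, k - 1)  # random start introduces randomness
sample = population[start::k]     # every k-th unit from the random start
print(sample)                     # 10 sampled units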
Stratified Random Sampling: stratified sampling tends to be economically desirable if the population to be sampled can be divided into relatively homogeneous subdivisions, or strata. Stratified random sampling is the procedure of dividing the population into relatively homogeneous groups, called strata, and then taking a simple random sample from each stratum. If the population elements are already homogeneous, there is no need to apply this technique.
Example: If our interest is the income of households in a city, then our strata may be:
low income households
middle income households
high income households
To obtain a sample from each stratum, we may follow one of three ways:
i. Taking a sample of size proportional to the sub-population (stratum) size, i.e., drawing a large sample from a large stratum and a small sample from a small sub-population. This is known as proportional allocation.
ii. Selecting a sample from each stratum so that the variation due to sampling is minimized. This is known as optimum allocation.
iii. Selecting an equal number of units from each stratum. This is known as equal allocation.
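
As an illustration of proportional allocation, a minimal Python sketch with hypothetical stratum sizes:

# Each stratum's sample size is proportional to its share of the population.
strata = {"low income": 5000, "middle income": 3000, "high income": 2000}
total = sum(strata.values())
n = 200  # overall sample size we can afford (hypothetical)

allocation = {name: round(n * size / total) for name, size in strata.items()}
print(allocation)  # {'low income': 100, 'middle income': 60, 'high income': 40}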
Cluster Sampling: a method of sampling in which the total population is divided into relatively small subdivisions, called clusters, and then some of these clusters are randomly selected using simple random sampling. Once the clusters are selected, one possibility is to use all the elements in the selected clusters. However, if the elements within the selected clusters give similar results, it seems uneconomical to measure them all. In such cases, we take a random sample of elements from each of the selected clusters (called two-stage sampling).
Example: Suppose we want to survey the attitude and awareness of households about solid waste management (SWM) in Addis Ababa. Collecting information on each and every household is impractical from the point of view of cost and time. What we do is divide the city into a number of relatively small subdivisions, say, Kebeles. So the Kebeles are our clusters. Then we randomly select, say, 20 Kebeles using simple random sampling. To collect information about individual households, we have two options:
1) We visit all households in these 20 Kebeles, or
2) We randomly select households from each of the 20 selected Kebeles using simple random sampling. This method is called two-stage sampling, since simple random sampling is applied twice (first to select a sample of Kebeles and second to select a sample of households from the selected Kebeles).
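
A minimal sketch of two-stage cluster sampling in Python (the sampling frame of Kebeles and households is hypothetical):

import random

# Hypothetical frame: 50 Kebeles, each with a list of 100 household IDs.
kebeles = {f"Kebele-{i}": [f"HH-{i}-{j}" for j in range(100)] for i in range(50)}

selected_kebeles = random.sample(list(kebeles), 20)                    # stage 1
sample = {k: random.sample(kebeles[k], 10) for k in selected_kebeles}  # stage 2
print(sum(len(v) for v in sample.values()))                            # 200 households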

B. Non-random Sampling Techniques

Convenience, haphazard or accidental sampling: members of the population are chosen based on their relative ease of access.
Judgmental or purposive sampling: the researcher chooses the sample based on who he/she thinks would be appropriate for the study. Purposive sampling starts with a purpose in mind, and the sample is selected to include people or objects of interest and exclude those who do not suit the purpose. Purposive sampling can be subject to bias and error.
Case study: the research is limited to one group, often with a similar characteristic or of small size.
Ad hoc quotas: a quota is established, and researchers are free to choose any respondent they wish as long as the quota is met.
Snowball sampling: the first respondent refers a friend; the friend also refers a friend, and so on.
Comparison: Probability and non-probability sampling

Probability sampling (or random sampling) is a sampling technique in which the probability of
getting any particular sample may be calculated. Non-probability sampling does not meet this
criterion and should be used with caution. Non-probability sampling techniques cannot be used
to infer from the sample to the general population. Performing non-probability sampling is
considerably less expensive than doing probability sampling, but the results are of limited value.
The difference between non-probability (accidental or purposive) and probability sampling is
that non-probability sampling does not involve random selection and probability sampling does.
Does that mean non-probability samples aren't representative of the population? Not necessarily.
But it does mean that non-probability samples cannot depend upon the rationale of probability
theory. At least with a probabilistic sample, we know the odds or probability that we have
represented the population well. We are able to estimate confidence intervals for the statistic.
With non-probability samples, we may or may not represent the population well, and it will often
be hard for us to know how well we've done so. In general, researchers prefer probabilistic or
random sampling methods over non-probabilistic ones, and consider them to be more accurate
and rigorous.
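
As an illustration of the last point, with a probability sample we can attach a confidence interval to an estimate. A minimal sketch in Python, using the normal approximation on hypothetical measurements:

import math, statistics

sample = [42, 38, 51, 45, 39, 47, 44, 40, 46, 43]  # hypothetical measurements
n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error

low, high = mean - 1.96 * se, mean + 1.96 * se  # approximate 95% interval
print(f"95% CI for the mean: ({low:.1f}, {high:.1f})")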


Unit Three

Data Types/Sources and Methods of Data Collection


3.1 Data sources/types: primary and secondary
Generally, the data you need for your research are classified into two categories: primary data and secondary data.
Primary data: data that you collect for the first time, yourself, for your own purpose. For example, you may measure the heights of students in a class using a meter tape.
Secondary data: data that have been collected by others for their own purpose or for a general purpose. In this case, you had no control over the design and the data collection. Examples include government data (economic and demographic), media reports (TV, newspapers, internet), etc.
As a general rule, primary data sources are preferred to secondary sources, since the primary source contains much pertinent information about the collection methods and the limitations associated with the data. If the information is derived from a secondary source, it is possible that the data have been altered for some reason. However, it is also common for a particular research project to employ both primary and secondary data.

3.2 Methods of Collecting Primary Data


3.2.1 Observation Method
Description of the Method: In the observation method, researchers record the behavioural patterns of people, objects and events in a systematic manner. Observational methods may be participant or non-participant; structured or unstructured; disguised or undisguised; and personal or mechanical.
Advantages of the Observation Method:
It helps in overcoming issues of validity, bias, etc.
It is useful when the subject cannot provide information.
It is also useful when it is feared that the subject may provide inaccurate information.
The researcher can have first-hand experience of the study.
It allows the recording and reporting of all findings that are true to the topic at hand (depending, of course, on what is being studied and on the variables of the project).
Disadvantages of the Observation Method:
Past events cannot be studied.
Attitudes or opinions are difficult to measure.
Selecting the sample is tricky.
Time and costs are high (though parts of the process can be automated).
It raises ethical issues.
There may be too few trials/studies or objects observed to draw a final conclusion from the study.


3.2.2 Interview Method

Description of the Method: Interviewing is a technique primarily used to gain an understanding of the underlying reasons and motivations for people's attitudes, preferences or behaviour. Interviews can be undertaken on a personal, one-to-one basis or, where possible, by telephone. They can be conducted at work, at home, in the street, in a shopping center, or at some other agreed location.
There are three types of interview:
i. Structured interview: the questions are developed ahead of time, with some opportunity to ask pre-planned, open-ended, probing questions.
ii. Semi-structured interview: the interviewer has some set questions but can also ask spontaneous questions.
iii. Unstructured interview: also called an in-depth interview. The interviewer begins by asking a general question and then encourages the respondent to talk freely.

A. Personal interview:
A personal interview is the process taking place between an interviewer (the person asking the questions) and an interviewee (the person answering the questions).

Advantages of Personal Interview:
Good response rate
In-depth questions possible
Can investigate motives and feelings
Can use recording equipment
Can be completed in a set time, with immediate responses
Interviewer is in control and can give help if there is a problem
Serious approach by the respondent, resulting in accurate information
Characteristics of the respondent can be assessed: tone of voice, facial expression, hesitation, etc.

Disadvantages of Personal Interview:
Need to set up interviews
Time consuming
Geographic limitation
Can be expensive
Normally needs a set of questions
Embarrassment possible if questions are personal
If there are many interviewees, interviewer training is required
Respondent bias: a tendency to please or impress, to create a false personal image, or to end the interview quickly

Steps in conducting a personal interview:
List the areas in which you require information.
Decide on the type of interview: structured, semi-structured, or unstructured.
Transform the areas into actual questions.
Make an appointment with the respondent(s), discussing details of why and how long.
Try to fix a venue and time at which you will not be disturbed.

B. Telephonic interview:
This is an alternative to the personal, face-to-face interview. Telephonic interviews are less time consuming and less expensive, and the researcher has ready access to anyone on the planet who has a telephone.

Advantages of Telephone Interview:
Relatively cheap and quick
Can cover reasonably large numbers of people or organizations
Wide geographic coverage
No waiting; spontaneous response
Help can be given to the respondent
Can tape answers

Disadvantages of Telephone Interview:
A questionnaire is required
Not everyone has a telephone
Repeat calls may be inevitable
Straightforward questions are required
The respondent has little time to think on an issue
A good telephone manner is required

3.2.3 Questionnaire Method

Description: A questionnaire is a series of written questions on a topic about which the subjects' opinions are sought. In this method of data collection, the questionnaire is sent to respondents by post or e-mail, and the respondents are asked to fill in the questionnaire and send it back to the researcher. A questionnaire consists of a number of well-formulated questions, printed or typed in a definite order, to probe and obtain responses from respondents. The form and content of a questionnaire therefore vary from situation to situation.
Advantages of the Questionnaire Method:
Can be used as a method in its own right, or as a basis for interviewing or a telephone survey
Can be posted, e-mailed or faxed
Can cover a large number of people or organizations
Wide geographic coverage
Relatively cheap; no prior arrangements are needed
Avoids embarrassment on the part of the respondent
Respondents can consider their responses
Possible anonymity of the respondent
No interviewer bias

Disadvantages of the Questionnaire Method:
Design problems can hamper the research
Questions have to be relatively simple
Historically low response rates (although inducements may help)
Time delay whilst waiting for responses to be returned
A return deadline is required
Several reminders may be needed
Assumes no literacy problems
No control over who completes it
Problems with incomplete questionnaires

3.2.4 Focus Group Discussion Method

A focus group can be defined as a group of interacting individuals (8-12 people in one group) having some common interest or characteristic, brought together by a researcher who uses the group and its interaction as a way to gain information about a specific or focused issue. FGDs are facilitated by a moderator.
Advantages of Focus Groups:
Take advantage of the fact that people naturally interact and are influenced by others (high face validity).
Provide data more quickly and at lower cost than interviewing individuals separately.
Generally require less preparation and are comparatively easy to conduct.
The researcher can interact directly with respondents (allowing clarification, follow-up questions, and probing).
The data use respondents' own words: one can obtain deeper levels of meaning, make important connections, and identify subtle nuances.
Very flexible; can be used with a wide range of topics, individuals, and settings.
Disadvantages of Focus Groups:
The researcher has less control over the group, and is less able to control what information will be produced.
Produce relatively chaotic data, making data analysis more difficult.
Small numbers and convenience sampling severely limit the ability to generalize to larger populations.
Require a carefully trained interviewer who is knowledgeable about group dynamics.
The researcher may knowingly or unknowingly bias results by providing cues about what types of responses are desirable.
There is uncertainty about the accuracy of what participants say; results may be biased by the presence of a very dominant or opinionated member, while more reserved members may hesitate to talk.

3.3 Secondary data


In our modern world there is an unbelievable mass of data that is routinely collected by governments,
businesses, colleges, and other national and international organizations. Much of this information is stored
in electronic databases that can be accessed and analyzed. Secondary data can also be obtained from
published researches, government and non-government policy documents and reports, internal records,
and so on. Secondary data is taken by the researcher from secondary sources, internal or external.
The researcher must thoroughly search secondary data sources before commissioning any efforts for collecting primary data; there are many advantages in searching for and analyzing secondary data before attempting the collection of primary data.

Usually the cost of gathering secondary data is much lower than the cost of organizing primary
data. Moreover, secondary data has several supplementary uses.
It also helps to plan the collection of primary data, in case, it becomes necessary.
Advantages of secondary data analysis:
 It makes use of data that were already collected by someone else.
 It often allows you to extend the scope of your study considerably.
 It saves time that would otherwise be spent collecting data.
 It provides a larger database (usually) than would be possible to collect on one's own.
 In many small research projects it is impossible to consider taking a national sample because of the costs involved. Many archived databases are already national in scope and, by using them, a researcher can leverage a relatively small budget into a much broader study than if the data were collected first-hand.
Disadvantages of secondary data:
 You may have less control over how the data were collected.
 There may be biases in the data that you don't know about.
 Its answers may not exactly fit your research questions.
 The data may be obsolete; old secondary data can distort the results of the research.
 Secondary data can also raise issues of authenticity and copyright.

3.4 Guideline for Designing a Questionnaire and other Instruments


3.4.1 Guideline for Choice/Design of Data Collection Instruments in general
In the design of data collection instruments, the decisions about question content, wording and order are the result of a process that considers the following:
i) What is the research problem? The problem definition and objectives of the research.
ii) What type(s) of evidence is needed to address it? Exploratory, descriptive, causal or explanatory.
iii) What ideas, concepts and variables are we measuring? Content, definition and indicators.
iv) What type(s) of data is (are) appropriate? Qualitative, quantitative, or both.
v) From whom should we collect the data? The nature of the target population or sample (e.g., their education level, cultural background, etc.).
vi) What method of data collection is most suitable? Observation, interviews, questionnaire or schedule; face-to-face or telephone, e-mail, web or postal.
vii) Where will the data be collected? In the street/shopping centre, or at the respondent's office or home.
viii) How will responses be captured? Pen and paper, computer, audio and/or video recording, photograph.
ix) What are the constraints? Time and/or budget.
x) How will the responses be analyzed? By computer and/or manually.

3.4.2 Designing a Questionnaire


Masters students often conduct surveys and use questionnaires to collect data. However, they face difficulties in designing the questionnaire. The way you design the questionnaire or schedule has a big role to play in helping you or the enumerator gather the data accurately and effectively, and in helping the respondents provide accurate, complete and reliable data.
Why worry about the quality of the questionnaire/schedule?
i) A poorly designed questionnaire can result in an unpleasant experience for the respondents and adversely affect their perception of research, reducing their willingness to cooperate in future research.
ii) A poor introduction and description of the research (e.g., its purpose) can lead to a high level of non-response, adversely affecting the representativeness of the sample.
iii) Poorly conceived questions that do not measure what they claim to measure mean the data collected are not valid.
iv) Questions that are beyond the knowledge of the respondent, or that require recalling distant past events, result in inaccurate and unreliable data.
v) Poorly worded questions (using ambiguous, vague, difficult, unusual or technical jargon) can be misunderstood, misinterpreted or interpreted differently by different people, resulting in unreliable and invalid data.
vi) A badly structured questionnaire (one that begins with difficult, sensitive or personal questions) can result in refusal to answer or complete the questionnaire.
vii) Poor question order can result in order bias or contamination of later responses by earlier questions.
viii) Long, boring or repetitive questions may result in a loss of interest or produce inaccurate responses.
ix) A questionnaire that is too long results in respondent fatigue and loss of interest.
x) Poor layout can lead to errors in recording, coding and data processing.

Guideline to the Questionnaire Design Process:


i) Decide on the question content: This is done by clarifying the research objectives (the information requirements) and what exactly it is that the question needs to measure.
o Some questions require standard answer options. For example, marital status has standard answer options (single or never married, married, living as married, separated, divorced, widowed). While developing the content of a questionnaire, clarify the meaning (concepts, definitions and indicators). If you are not clear about the concepts and their indicators, it is difficult to craft the questions with the right wording.

ii) Ensure proper wording of the questions:
Each question should be worded so that the following hold:
 It measures what it claims to measure
 It is relevant, meaningful and acceptable to the respondent
 It is understandable to the enumerator as well as the respondent
 It is interpreted in the way in which you intend by all respondents
 It elicits a clear, unambiguous, accurate and meaningful response

Examples of vaguely worded questions:
 How much money do you earn? What type of earning is this question referring to (from work? investment? remittances? social benefits?)? What time period is it referring to (daily? weekly? annually?)?
 Do you have a personal computer? Is this question referring to ownership of the computer or the type of the computer? What is meant by "you" (myself? my household?)? What is meant by "have" (own, or have access to?)?

Other pitfalls in wording of questions:
 Using technical jargon and abbreviations: e.g., refurbishments, fiscal policy, monetary policy, UNHCR, UNESCO, etc.
 Using words that are difficult to read out or pronounce: e.g., "in an anonymous form".
 Use of double-barrelled questions: e.g., Do you like using e-mail and the web?; Would you like to be rich and famous?
 Use of negatively phrased questions: e.g., Do you agree that it is NOT the job of the government to take decisions about the following?
 Use of very long questions.
 Use of questions that challenge the respondent's memory: e.g., How many hours of television did you watch last month?; List the books you have read in the last year.
 Including leading questions: e.g., Public speeches against racism should not be allowed. Do you agree or disagree?
 Wording questions using sensitive or loaded, non-neutral terms: e.g., What do you think of welfare for the poor?
 Questions that make assumptions: e.g., How often do you travel to rural areas? (this wording assumes that the respondent travels to rural areas); When did you stop beating your wife?
 Questions with overlapping response categories: e.g., How many hours did you spend in the library yesterday? (response categories: 0-1 hours, 1-2 hours, 2-3 hours, etc.). A minimal category check is sketched after this list.
 Questions with insufficient response categories: e.g., How do you travel to work each day? (response categories: by my own car, on foot, by public bus, on a bicycle). In this case other modes are missing (e.g., by service bus, by a friend's/colleague's car, by motorbike, by Bajaj, by cart, etc.) and the possibility to choose more than one mode is not provided.
 Questions on sensitive topics: e.g., Which political party did you vote for during the May 7 election?
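
To make the overlapping/insufficient-categories pitfall concrete, here is a minimal sketch (in Python; the categories and the checking function are illustrative, not a standard tool) that verifies a set of numeric response categories is mutually exclusive and exhaustive before the questionnaire is printed:

```python
# Hypothetical response categories for "hours spent in the library yesterday",
# written as half-open intervals [low, high) so that 1 hour falls in exactly one category.
categories = [(0, 1), (1, 2), (2, 3), (3, 24)]

def check_categories(cats, domain=(0, 24)):
    """Return a list of problems found in the category boundaries."""
    problems = []
    cats = sorted(cats)
    if cats[0][0] > domain[0] or cats[-1][1] < domain[1]:
        problems.append("categories do not cover the whole domain")
    for (lo1, hi1), (lo2, hi2) in zip(cats, cats[1:]):
        if hi1 > lo2:
            problems.append(f"[{lo1}, {hi1}) and [{lo2}, {hi2}) overlap")
        elif hi1 < lo2:
            problems.append(f"gap between {hi1} and {lo2}")
    return problems

print(check_categories(categories))  # [] means mutually exclusive and exhaustive
```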

iii) Follow the right question order:
 Put your questions into an effective and logical order. Don't ask sensitive and difficult questions too early. It is also preferable to ask personal questions (e.g. classification questions such as those on age, income, etc.) at the end.
 Classify your questions into groups and provide a brief introduction to each group. Within each group (module), begin with general questions and then move on to specific questions.
 In case a particular question is not relevant to some respondents, indicate that they can skip it and show which question they should jump to.

iv) Make good layout and appearance:
The layout should enable the respondent to fill in the questionnaire easily. Instructions to the interviewer should be in CAPITALIZED and BOLD format, while the question text and answer categories are in lower case, not bold.

v) Optimize questionnaire length:
The questionnaire must be long enough to cover the research objectives but not too long, considering the research cost and the time demanded of the respondent. Recommended maximum lengths:
 For an in-home face-to-face questionnaire (schedule): 45-60 minutes
 For a telephone interview: about 20 minutes
 For a street interview: 5 to 10 minutes

vi) Conduct a pilot study and make necessary changes:
Test the questionnaire to identify its pitfalls and correct them before you go for the full-scale survey.


Unit Four
Research Proposal, Referencing, Reporting Results and Ethical
Considerations
4.1 The Research Proposal
4.1.1 The Need for Research Proposal
Before you embark upon your research for the master's thesis, you will be required to submit a research proposal. Most journals and calls for conference papers also require the submission of a research proposal, abstract or synopsis.

4.1.2 Structure of Research Proposal


A research proposal should include the following sections:
1) Cover page
2) Introduction
3) Statement of the research problem
4) Research objectives/hypotheses/key concepts
5) Research methods/preliminary survey of literature
6) References
7) Timetable/time schedule or research plan
8) Budget

4.1.3 Content of research proposal: a brief Guide


This brief guideline is prepared to help you in the process of preparing your research proposal for the research area and research topic you have chosen (as reflected in the research title you have submitted).
Cover page:
 The title: the title should be as short as possible but should adequately represent the topic and the research problem/objective.
1. The introduction
1.1 General Background

Your research proposal should have an introduction. The introduction should give a general background in relation to the research area/topic and enhance the interest of readers. Also indicate the debate/controversy in the literature over the topic or issue you intend to deal with in your research. You should demonstrate the relevance of your research to theory (to the debate) and/or to practice (policy).

1.2 Background to the study area


Provide a brief background to the geographic area in which the study will be conducted. Make your
description relevant to the research topic. Avoid unnecessary descriptions.
1.3 Statement of the problem
In this part you are expected to state the problem that your research aims at addressing. The problem can
be stated from theoretical point of view and/or practical point of view. As part of the problem statement,
you should provide a justification as to why this research has to be done. If similar research has been done
by others, you have to indicate their gaps. Your problem statement should end by specifying the general
and specific research questions that the research aims at answering.
1.4 Scope of the study:
Indicate or delimit the thematic and geographic scope of the study.
1.5 Significance of the study
Indicate the theoretical and/or practical significance of your research.
1.6 Limitations of the study
Though a complete enumeration of the limitations will be done after you have gathered the data and done the analysis, you should anticipate the possible limitations or weaknesses of your study. Shortage of time and money are not limitations. The limitations could, for instance, be related to sampling procedures and sample size, quality of the data, data analysis techniques used, etc.

2. Theoretical/conceptual framework
This is the part where you will provide a summary of the literature review you have conducted. Your
review should lead to a clear definition of the theoretical framework and the conceptual framework.
2.1 Theoretical framework
The theoretical framework refers to a summary of the theories that you will refer to in your study. Review
the relevant literature and identify what the theory says about the issue/topic you are addressing.
Elaborate the different perspectives. Also provide alternative definitions (if any) of the important concepts
and variables that you will use in your research. The literature review will help you identify the relevant
variables that could potentially be used as indicators/measures for your concepts. If you are going to
investigate the relationship among variables, show what the theories state about the relationships (i.e.
about the direction of relationships and significance).
What you summarize as part of the theoretical framework has to be very relevant to the topic and particularly to the research problem and the research questions. While conducting the literature survey, students often throw in whatever literature is in one way or another related to the research area but not necessarily to the research problem. Failing to prepare the theoretical framework properly has at least the following disadvantages:

You will not have the basis to define relevant concepts, identify the assessment issue or the
variables and define them.

You don't know what relationships to expect.

It will not be easy for you to choose the appropriate research design.

Your data collection instruments will be ill designed.

During analysis, you will not have any theory to compare your results with.

The contribution of your research to the existing theory will be blurred.

2.2 Conceptual Framework

Based on the theoretical framework, you are expected to develop your conceptual framework. This is the part in which you are going to define the concepts operationally (in the way you would like your readers to understand the concepts) and where you will select your factors/variables. In the literature, concepts may be defined in different ways, and you will have to make a choice here. In your study, how are the concepts defined operationally? What are the variables and indicators that you will use in your study to measure the concepts? How do the different concepts relate to each other? These and other questions should be answered via your conceptual framework. For instance, education quality can be measured using a number of indicators. Among these indicators (which you must have identified and defined in the conceptual framework), clearly show which ones you are going to use in your research. Consider the usefulness/appropriateness of the indicator as well as data availability and feasibility in your choice of the variables.
If you are doing qualitative research that involves, for instance, assessments, you should make clear the assessment themes/issues and the indicators. Adding a diagrammatic representation of the conceptual framework will add a visual effect to your operationalization. The diagram depicts the concepts/issues and how they relate to each other. You may create your own illustrative diagram or adapt it from the literature. In the latter case, you have to clearly cite the sources for your diagram.
2.3 Hypothesis
In deductive research designs, it is necessary that you formulate some hypotheses. Once you are clear about the theory and your conceptual framework, you can state some hypotheses. A hypothesis is a tentative assumption made in order to draw out and test its logical or empirical consequences. It should be specific and pertinent to the piece of research in hand. The hypothesis will provide the focal point for your research: it delimits the area, sharpens thinking and keeps the researcher on the right track. Also remember that your hypotheses determine the data type required; the data collection and sampling methods to be used; and the tests that must be conducted during data analysis.
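For instance (an illustrative example, not part of the template; the variables are hypothetical): if your conceptual framework suggests that class size affects education quality, you might state H0: μ_small = μ_large (the mean student test scores of small and large classes are equal) against H1: μ_small ≠ μ_large. Stated this way, the hypothesis pair immediately tells you that you need test-score data for both groups, a sampling method that covers both class types, and a two-sample test of means during the analysis.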
3. Methodology (Research design):

Once you have developed your hypotheses, the next step is to craft your research design. A research design is like the blueprint for house construction. If you start building a house without first having the design (consisting of the architectural, electrical, sanitary, etc. designs), the result is that you don't know what type of house you will end up with; it will be costly and time-consuming, often involving construction and demolition of what has already been built. Most importantly, the house lacks quality and may be prone to risks. Likewise, research conducted without a research design at hand is aimless, ambiguous, time-consuming, costly and may be totally irrelevant and unacceptable in light of the requirements for a scientific investigation. Research design refers to the crafting of the conceptual structure within which research will be conducted in a way that is as efficient as possible, collecting the relevant evidence with minimal expenditure of effort, time and money.
3.1 Research type:
Describe the type of your research based on some commonly known criteria. More specifically, describe
the type of your research based on the nature of the research enquiry (e.g., exploratory, descriptive, etc);
the mode of data collection; the type of the data; and so on.

3.2 Data type, source and data collection techniques


In this part, describe:
 The data type (primary and/or secondary)
 The sources of data (primary and secondary sources)
 The methods of data collection (which method for which data type: interview, questionnaire, FGD, observation, etc.)
 The data collection procedure you are going to follow

3.3 Sampling Design


In this part, indicate your sample population, sampling frame and sampling unit; determine the sample size; and show the sampling techniques and procedures. A sketch of one common sample-size calculation follows.
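
As an illustration of sample size determination, the following minimal sketch (in Python) applies Cochran's formula for estimating a proportion, which is one common approach; the confidence level, margin of error and population size below are hypothetical:

```python
import math

def cochran_sample_size(z=1.96, p=0.5, e=0.05):
    """Cochran's sample size for estimating a proportion.

    z: z-value for the desired confidence level (1.96 for 95%)
    p: expected proportion (0.5 is the most conservative choice)
    e: desired margin of error
    """
    return math.ceil(z**2 * p * (1 - p) / e**2)

def finite_population_correction(n0, N):
    """Adjust the sample size n0 for a finite population of size N."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))

n0 = cochran_sample_size()                  # 385 respondents for 95% confidence, 5% error
n = finite_population_correction(n0, 2000)  # about 323 for a population of 2,000
print(n0, n)
```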

3.4 Method of data analysis


Indicate how you are going to analyze the data you will gather. Specify the techniques and
statistical packages you will use (if applicable).
4. Timeline and budget
Identify the major activities in your research and put a timeline (a start and finish period) on them. The Gantt chart is a useful tool in this respect. Then, attach a budget to the activities.
5. List of References
Include a list of references. References from journal articles and books are more reliable. Also include and clearly indicate other resources such as government policy documents, reports, magazine articles, conference papers and proceedings, legal rules/regulations, theses, online databases and other unpublished works.
Annexes:
Additional information that is not directly part of the proposal, but which is considered to be
relevant for the understanding of the project, should be attached to the research proposal as an
annex.

4.2 Referencing Styles: The APA Referencing Style

The American Psychological Association reference style uses the Author-Date format.

Refer to the Publication Manual of the American Psychological Association (6th ed.) for
more information. Check the Library Catalogue for call number and location(s).

When quoting directly or indirectly from a source, the source must be acknowledged in
the text by author name and year of publication. If quoting directly, a location reference
such as page number(s) or paragraph number is also required.

IN-TEXT CITATION
Direct quotation: use quotation marks around the quote and include page numbers.
Samovar and Porter (1997) point out that "language involves attaching meaning to symbols" (p. 188).
Alternatively: "Language involves attaching meaning to symbols" (Samovar & Porter, 1997, p. 188).
Indirect quotation/paraphrasing: no quotation marks.
Attaching meaning to symbols is considered to be the origin of written language (Samovar & Porter, 1997).
N.B. Page numbers are optional when paraphrasing, although it is useful to include them (Publication Manual, p. 171).

Citations from a secondary source


As Hall (1977) asserts, culture also defines boundaries of different groups (as cited in Samovar & Porter, 1997, p. 14).
At the end of your assignment, you are required to provide the full bibliographic information
for each source.
References must be listed in alphabetical order by author.
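
To make the author-date pattern concrete, the following sketch (in Python; illustrative only, not an official APA tool) assembles a parenthetical in-text citation from its parts:

```python
def in_text_citation(authors, year, page=None):
    """Build an APA-style parenthetical citation, e.g. (Samovar & Porter, 1997, p. 188)."""
    cite = f"{' & '.join(authors)}, {year}"
    if page is not None:  # a page number is required when quoting directly
        cite += f", p. {page}"
    return f"({cite})"

print(in_text_citation(["Samovar", "Porter"], 1997, page=188))  # (Samovar & Porter, 1997, p. 188)
print(in_text_citation(["Hall"], 1977))                         # (Hall, 1977)
```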

EXAMPLES OF REFERENCES BY TYPE


(Retrieved on Jan. 14, 2013, from
http://www.libraries.psu.edu/psul/lls/students/apa_citation.html)
Books
Important Elements:
 Author (last name, initials only for first & middle names)
 Publication date
 Title (in italics; capitalize only the first word of title and subtitle, and proper nouns)
 Place of publication
 Publisher

Citing Books
Book by a single author:
Rollin, B. E. (2006). Science and ethics. New York, NY: Cambridge University Press.
Book by two authors:
Sherman, C., & Price, G. (2001). The invisible web: Uncovering information sources search engines can't see. Medford, NJ: CyberAge Books.
Book by three or more authors:
Goodpaster, K. E., Nash, L. L., & de Bettignies, H. (2006). Business ethics: Policies and persons (3rd ed.). Boston, MA: McGraw-Hill/Irwin.
Book by a corporate author:
American Medical Association. (2004). American Medical Association family medical guide (4th ed.). Hoboken, NJ: Wiley.
Article or chapter within an edited book:
Winne, P. H. (2001). Self-regulated learning viewed from models of information processing. In B. J. Zimmerman & D. H. Schunk (Eds.), Self-regulated learning and academic achievement (2nd ed., pp. 160-192). Mahwah, NJ: Lawrence Erlbaum Associates.
Translation:
Tolstoy, L. (2006). War and peace (A. Briggs, Trans.). New York, NY: Viking. (Original work published 1865)

Articles from Print Periodicals (magazines, journals, and newspapers)
Important Elements:
 Author (last name, initials only for first & middle names)
 Date of publication of article (year and month for monthly publications; year, month and day for daily or weekly publications)
 Title of article (capitalize only the first word of title and subtitle, and proper nouns)
 Title of publication in italics (i.e., Journal of Abnormal Psychology, Newsweek, New York Times)
 Volume and issue number
 Page numbers of article

Citing Articles from Print Periodicals
Article in a monthly magazine (include volume # if given):
Swedin, E. G. (2006, May/June). Designing babies: A eugenics race with China? The Futurist, 40, 18-21.
Article in a weekly magazine (include volume # if given):
Will, G. F. (2004, July 5). Waging war on Wal-Mart. Newsweek, 144, 64.
Article in a daily newspaper:
Dougherty, R. (2006, January 11). Jury convicts man in drunk driving death. Centre Daily Times, p. 1A.
Rimer, S. (2003, September 3). A campus fad that's being copied: Internet plagiarism seems on the rise. New York Times, p. B7.
Article in a scholarly journal:
Stock, C. D., & Fisher, P. A. (2006). Language delays among foster children: Implications for policy and practice. Child Welfare, 85(3), 445-462.
Book review:
Rifkind, D. (2005, April 10). Breaking their vows. [Review of the book The mermaid chair, by S. M. Kidd]. Washington Post, p. T6.

Electronic Resources - including online articles, websites, and blogs
The following guidelines for electronic sources follow the recommendations in the sixth edition (2009) of the Publication Manual of the American Psychological Association.

Articles from the Library's Online Subscription Databases
Important Elements:
 Publication information (see Print Periodicals, above)
 DOI number (if available). More information about DOI numbers is available on the American Psychological Association's APA Style page.
 If the DOI number is not available, APA recommends giving the URL of the publication. If the URL is not known, include the database name and accession number, if known: Retrieved from ERIC database (ED496394).

Citing Articles from the Library's Online Subscription Databases
Magazine article with URL:
Poe, M. (2006, September). The hive. Atlantic Monthly, 298, 86-95. Retrieved from http://www.theatlantic.com
Journal article with DOI:
Blattner, J., & Bacigalupo, A. (2007). Using emotional intelligence to develop executive leadership and team and organizational development. Consulting Psychology Journal: Practice and Research, 59(3), 209-219. doi:10.1037/1065-9293.59.3.209

Articles in Online Journals, Magazines and Newspapers
Important Elements:
 Author (last name, initials only for first & middle names)
 Date of publication of article
 Title of article
 Title of publication (in italics)
 Volume and issue number (for scholarly journals, if given)
 Page numbers, if given
 DOI number, if given. More information about DOI numbers is available on the American Psychological Association's APA Style page.
 If the DOI is not available, give the URL (Web address) of the article.

Citing Articles in Online Journals, Magazines and Newspapers
Article in an online scholarly journal:
Overbay, A., Patterson, A. S., & Grable, L. (2009). On the outs: Learning styles, resistance to change, and teacher retention. Contemporary Issues in Technology and Teacher Education, 9(3). Retrieved from http://www.citejournal.org/vol9/iss3/currentpractice/article1.cfm
Article in an online magazine:
Romm, J. (2008, February 27). The cold truth about climate change. Salon.com. Retrieved from http://www.salon.com
Article in an online newspaper:
McCarthy, M. (2004, May 24). Only nuclear power can now halt global warming. Earthtimes. Retrieved from http://www.earthtimes.org

Web Sites: Important Elements
 Author (if known)
 Date of publication, copyright date, or date of last update
 Title of Web site
 Date you accessed the information (APA recommends including this if the information is likely to change)
 URL (Web address) of the site

Citing Web Sites
Web site with author:
Kraizer, S. (2005). Safe child. Retrieved February 29, 2008, from http://www.safechild.org/
Web site with corporate author:
Substance Abuse and Mental Health Services Administration (SAMHSA). (2008, February 15). Stop underage drinking. Retrieved February 29, 2008, from http://www.stopalcoholabuse.gov
Web site with unknown author:
Penn State myths. (2006). Retrieved December 6, 2011, from http://www.psu.edu/ur/about/myths.html
Page within a Web site (unknown author):
Global warming 101. (2012). In Union of Concerned Scientists. Retrieved December 14, 2012, from http://www.ucsusa.org/global_warming/global_warming_101/

Electronic Books
Important Elements:
 Author (last name, initials only for first & middle names)
 Publication date
 Title (in italics; capitalize only the first word of title and subtitle, and proper nouns)
 Place of publication
 Publisher
 URL (Web address) of the site from which you accessed the book

Citing Electronic Books
Electronic book:
McKernan, B. (2005). Digital cinema: The revolution in cinematography, postproduction, and distribution. New York, NY: McGraw-Hill. Retrieved from www.netlibrary.com
Post, E. (1923). Etiquette in society, in business, in politics, and at home. New York, NY: Funk & Wagnalls. Retrieved from http://books.google.com/books

Multimedia Resources - including motion pictures and television
Motion Picture (film, video, DVD): Important Elements
 Director
 Date of release
 Title (in italics)
 Country where motion picture was made
 Studio

Citing Films, Videos, DVDs
Motion picture:
Johnston, J. (Director). (2004). Hidalgo [Motion picture]. United States: Touchstone/Disney.

Television Program: Important Elements
 Producer
 Date of broadcast
 Title of television episode
 Title of series (in italics)
 Location of network and network name

Citing Television Programs
Television program in series:
Buckner, N., & Whittlesey, R. (Writers, Producers & Directors). (2006). Dogs and more dogs [Television series episode]. In P. Apsell (Senior Executive Producer), NOVA. Boston: WGBH.

Government Publications: Important Elements
 Government agency
 Date of publication
 Title of document (in italics)
 Place of publication
 Publisher

Citing Government Publications
Government document:
U.S. Dept. of Housing and Urban Development. (2000). Breaking the cycle of domestic violence: Know the facts. Washington, DC: U.S. Government Printing Office.

Citing Indirect Sources


If you refer to a source that is cited in another source, list only the source you consulted directly (the secondary source) in your reference list. Name the original source in the text of your paper, and cite the secondary source in parentheses: "Wallace argues that ..." (as cited in Smith, 2009). In this example, only the Smith source would be included in the reference list.
Whenever possible, try to find and consult the original source. If the Penn State University
Libraries does not have the original source, we can try to get it for you through interlibrary loan.

4.3 Report Writing: Communicating the Results


4.3.1 Structure of your research report (Thesis):
It depends on the type of study. Generally, it includes:
1. Title page, Table of contents, acronyms, abstract, etc
2. Main body:
Chapter 1: Introduction
Chapter 2: Theoretical and Conceptual Framework
Chapter 3: Methodology
Chapter 4: Results and Discussion
Chapter 5: Conclusion and Recommendations
3. Annexes

4.3.2 Your sentences and paragraphs


Pay attention to your sentences and paragraphs:
 Sentences: grammar, spelling, mechanics, sentence construction (subject-verb agreement, flow), etc.

Paragraphs:
 not too long, not too short;
 convey one idea in one paragraph;
 usually a paragraph has introductory and concluding sentences;
 check for flow, coherence, economy, etc.;
 use connecting words/phrases; avoid repetition of words/phrases.

4.3.3 While presenting the results (findings):
 Do not report one result in different formats (e.g. table, text, graph): use one!
 Do not repeat the results that are presented in a table or graph in your paragraphs; write what you observe (trends, patterns, averages, etc.)

4.3.4 When discussing results/findings:
 Do not report results again! Focus on the "why" part: give reasons/explanations for the findings from your data or from theory, or give your own interpretation
 Link with theory/other studies and with your hypotheses
 Use the specific objectives of your study as a guide
4.3.5 When writing the conclusions and recommendations:
 The conclusions should be based on your findings
 While concluding, answer your research questions/address your specific objectives
 Recommendations should be based on your conclusions
 While recommending solutions, indicate how your recommendations could be put into practice (the "how" part)

4.4 Preparing and Delivering a Presentation


What do the advisor and the examiner(s) expect from you? Content-wise, be focused! Don't present everything! You could focus on:
The problem, research questions
Your conceptual model & variables
Methodology (in short)
Summary of Key findings, results of hypothesis testing, and comparison with theory (or
interpretation) and other studies; Main Conclusion

What else do they need from you?


 Dress elegantly: dress for the audience!
 Language fluency: correct grammar and pronunciation
 Eye contact: don't look at the ceiling or the floor, or at only one person!
 Voice: loud enough, not noisy, an attractive sound
 Speed: medium
 Self-confidence: show that it is your work
 Openness (transparency): be receptive to comments
 Honesty: say "I don't know" if you really don't know something; don't pretend or be too defensive

4.5 Ethical Issues/Considerations


4.5.1 Ethical Issues related to Purpose of Research:
However lofty the stated purposes of research, the product of research in the public sector may
be to provide tools for manipulation and control for some segments of society at the expense of
others.

For example, the tendency to describe some populations as deviant leads away from
focusing on larger problems of the distribution of political, economic, and social power.

Social scientists need to be aware of the possible uses to which their research may be put.

Research should not only enhance the researcher's career, but also benefit the group,
organization, or population studied.
Those who fund and conduct research also reap its benefits.

4.5.2 Ethical Issues related to the Subject Matter


What populations can be studied with little risk or harm?
Are there some populations which are routinely subjects of research, while others are
ignored?
Populations with little social or political power (e.g., children, the elderly, the poor, the mentally disabled, students, parents, criminals, delinquents, addicts, the military, etc.) are often targets of research, while those with substantial power are not.
Those who "own" or "run" organizations are usually in charge of the research that goes
on in them.

The people who are the "subjects" of the research may have neither the power to shape
the research nor the ability to refuse to participate.
Is participation voluntary?
1. Participation must be voluntary and not coerced.
2. Participants cannot be threatened with a loss of other, unrelated benefits (e.g., food stamps, bilingual education).
3. Participants cannot be offered unreasonably large inducements to participate (e.g., prisoners).
4. Information must be provided about all risks or potential risks of participation, including physical harm, pain, discomfort, embarrassment, loss of privacy, exposure to illness, etc.
5. Information must be provided about all benefits or potential benefits of participation, for example, free health care, monetary incentives, the value of the research to science, etc.
6. The ratio of risks to benefits should be stated.
7. Are the benefits sufficient to allow participants to put themselves at risk? Should the study be done at all?
8. Are the participants' rights and well-being sufficiently protected?
9. Are the means of obtaining informed consent adequate and appropriate?
10. Participants can withdraw from the study at any time, refuse to comply with any part of the study and refuse to answer any questions.
4.5.3 Ethical Issues related to the Methods
Most ethical violations correspond to illegitimate use of the investigator's power. Researchers need to be trained to be concerned, as social scientists, with people as well as with research design, methodology, etc.
Ethical concerns include:
1. Involvement without consent:
- through participant observation or covert observation;
- through unknown intervention in ongoing programs or operations;
- through field experiments.
2. Disguising the true nature or purpose of the research:
- the way it will be used is not revealed;
- information is withheld that would affect informed consent.
3. Deceiving the research participant:
- to conceal the purpose of the research;
- to conceal the true function of the participant's actions;
- to conceal the experiences the participants will have to undergo.
4. Leading participants to commit acts that lessen their self-esteem:
- cheating, lying, stealing, harming others;
- yielding to social pressure contrary to one's ideas;
- prohibiting the rendering of aid when needed;
- behavioral control or character change;
- denial of the right to self-determination.
5. Coercion that abridges freedom of choice:
- research is linked to participation in organizational or institutional programs;
- "requests for participation" are worded in such a way that it is difficult to say no;
- participation is made a requirement of a college course.
6. Physical or mental stress:
- horror, threat to identity, failure, fear, emotional shock.
7. Invasion of privacy:
- covert observation;
- unnecessary questions of a personal nature in interviews or questionnaires;
- disguised, indirect, or projective tests;
- using third-party information without consent.
It is also the ethical responsibility of the researcher to ensure that the data are accurately collected, coded, entered, analyzed, and interpreted, so as not to perform a disservice to the subject population.
After the project is over, the researcher should:
- remove any harmful after-effects from the participants;
- maintain anonymity and/or confidentiality;
- publish the findings in reports and articles;
- store the data for use by other researchers in the future;
- inform participants of the results if they so choose;
- inform colleagues and professional associates of the research.


UNIT FIVE:
The Survey Method and Case Studies
5.1 The Survey Method
5.1.1 What is a survey?
A survey is a detailed and quantified description of a population. Surveys attempt to identify something about a population, that is, a set of objects about which we wish to make generalizations. A population is frequently a set of people, but organizations, institutions or even countries can comprise the unit of analysis.
Surveys involve the systematic collecting of data, whether this be by interview, questionnaire or
observation methods, so at the very heart of surveys lies the importance of standardization.
Precise samples are selected for surveying, and attempts are made to standardize and eliminate
errors from survey data gathering tools.
A particular form of survey, a census, is a study of every member of a given population. For example, the Central Statistical Agency of Ethiopia conducts the population and housing census every ten years. A census provides essential data for government policy makers and planners, but is also useful, for example, to businesses that want to know about trends in consumer behavior, such as ownership of durable goods and demand for services.

5.1.2 Characteristics of Survey Methods


 The survey method involves a team of enumerators going into urban/rural areas and eliciting data via answers to questions on a structured form.
 A large number of observations can be collected within a certain period of time from a relatively large number of respondents.
 The sample can be spread over a wide area, which has statistical value, thereby making the study somewhat more generalizable.
 At the same time, the principal researcher need not spend too much time in the field.
 Surveys require respondents to answer questions about their opinions, attitudes, or preferences and about their socio-demographic characteristics.
 It is essentially cross-sectional (conducted during a particular time).
 It is concerned with the aggregate characteristics of a population rather than with particular individuals.
 It involves a clearly defined problem.
 It requires expert, imaginative planning.
 It involves definite objectives.
 It requires careful analysis and interpretation of the data gathered.
 It requires logical and skilful reporting of the findings.

5.1.4 Types of Survey Studies

There are three criteria for classifying survey research:
(a) Nature of variables: i) status survey, or ii) survey research
(b) Group measured: i) sample, or ii) population
(c) Sources of data collection: i) questionnaire, ii) interview, or iii) controlled observation survey

5.1.5 Stages of the survey method

[Figure: Stages of the survey method. Source: Gray (2004)]

5.2 Case Studies


Surveys are used where large amounts of data have to be collected, often from a large, diverse
and widely distributed population. In contrast, case studies tend to be much more specific in
focus. While surveys tend to collect data on a limited range of topics but from many people, case
studies can explore many themes and subjects, but from a much more focused range of people,
organizations or contexts. The case study method can be used for a wide variety of issues,
including the evaluation of training programmes (a common subject), organizational
performance, project design and implementation, policy analysis and relationships between
different sectors of an organization or between organizations. Case studies, then, explore subjects
and issues where relationships may be ambiguous or uncertain. But, in contrast to methods such
as descriptive surveys, case studies are also trying to attribute causal relationships and are not
just describing a situation. The approach is particularly useful when the researcher is trying to
uncover a relationship between a phenomenon and the context in which it is occurring. For
example, a business might want to evaluate the factors that have made a recent merger a success
(to prepare the ground for future mergers). The problem here, as with all case studies, is that the
contextual variables (timing, global economic circumstances, cultures of the merging
organizations, etc.) are so numerous that a purely experimental approach revealing causal
associations would simply be unfeasible.
The case study approach requires the collection of multiple sources of data but, if the researcher
is not to be overwhelmed, these need to become focused in some way. Therefore case studies
benefit from the prior development of a theoretical position to help direct the data collection and
analysis process. Note that the case study method often (but not always) tends to be deductive
rather than inductive in character.
The case study is both a method and a tool for research. A case study can lead to very novel ideas that are no longer limited to the particular individual studied. In a case study, the investigator tries to collect bits of evidence in support of a proposition. Methodologically, a case study is not a longitudinal study, but it depends on gathering as much information about the unit of study as possible.

Therefore, a case study is conducted only for a specific case. A case study means a study in depth, where depth means exploring all the peculiarities of the case. It gives detailed knowledge about the phenomenon, but this knowledge cannot be generalized beyond the case. In the physical sciences every unit is a true representative of the population, but in the social sciences the units may not be true representatives of the population. This is because there are individual differences as well as intra-individual differences. Therefore, predictions cannot be made on the basis of knowledge obtained from a case study, and no statistical inferences can be drawn from the exploration of a phenomenon.

Here a case does not necessarily mean an individual. A case means a unit: it may be an institution, a nation, a religion, an individual or a concept.

WHEN SHOULD WE USE CASE STUDIES?


The case study method is ideal when a "how" or "why" question is being asked about a contemporary set of events over which the researcher has no control.

Source: Yin (1994)


SOURCES OF DATA IN CASE STUDY

[Figure: Sources of data in case study. Source: Adapted from Yin (1994) by Gray (2004).]

PART TWO
Presenting and Analyzing Quantitative Data
(With SPSS Application)
Contents:
Unit 6: Analyzing Quantitative Data - Descriptive Statistics:
Basic concepts in statistics;
Classification and Presentation of Statistical Data (bar chart, pie chart, histogram);
Measures of central tendency and dispersion (mean, median, mode, mean deviation,
variance, standard deviation, covariance, Z-score);
Exercise with SPSS Application
Unit 7: Analyzing Quantitative Data - Tests of hypotheses concerning means and proportions:
Tests of hypotheses concerning means; tests concerning the difference between two means (independent samples);
Tests of mean differences between several populations (independent samples); paired-samples t-test (differences between dependent groups);
Tests of association (the Pearson coefficient of correlation and the test of its significance; the Spearman rank correlation coefficient and the test of its significance); nonparametric correlations (the chi-square test);
Hypothesis test for the difference between two proportions; exercise with SPSS application
Unit 8: Analyzing Quantitative Data - The simple linear regression model and Statistical
Inference;
The simple linear regression model, estimation of regression coefficients and interpreting
results;
Hypothesis testing;
Exercise with SPSS application
Unit 9: Analyzing Quantitative Data - The multiple linear regression model and Statistical
Inference;
The multiple linear regression model, estimation of regression coefficients and
interpreting results;
Hypothesis testing;
Exercise with SPSS application


Unit Six
Analyzing Quantitative Data: Descriptive Statistics
6.1 Basic concepts in statistics
6.1.1 What is Statistics?
Statistics is a science pertaining to the collection, presentation, analysis and interpretation or explanation
of data. Data can then be subjected to statistical analysis, serving two related purposes: description and
inference.

Descriptive statistics summarize the population data by describing what was observed in the
sample numerically or graphically.
Inferential statistics uses patterns revealed through analysis of sample data to draw inferences
about the population represented.
For a sample to be used as a guide to an entire population, it is important that it is truly representative of that overall population. Appropriate and scientific sampling procedures assure that the inferences and conclusions can be safely extended from the sample to the population as a whole.
The raw materials for any statistical analysis are the data. Once data are collected, we have to organize and describe these data in a concise manner so that they become meaningful. In order to determine their significance, we must display the data in the form of tables, graphs and charts (so that we can have a good overall picture of the data). Then, we have to analyze the data, i.e., we calculate summary measures such as the mean and standard deviation; assess the extent of relationship (correlation) between two (or more) variables; and the like. Finally, based on the analysis, we have to make generalizations and arrive at reasonable decisions.
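
As a small illustration of the descriptive step (sketched here in Python; SPSS produces the same summaries through its menus, and the income figures are hypothetical):

```python
import statistics

# Hypothetical sample: monthly incomes (in Birr) of 8 respondents
incomes = [3200, 4100, 2800, 5000, 3600, 4400, 3900, 3100]

print("mean:  ", statistics.mean(incomes))             # arithmetic mean
print("median:", statistics.median(incomes))           # middle value
print("stdev: ", round(statistics.stdev(incomes), 2))  # sample standard deviation
```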

6.1.2 Limitations of statistics


Although statistics is widely applied and has shown its merit in planning, policy making, marketing
decisions, quality control, medical studies, etc., it has some limitations:
(a) Statistical laws are not exact. They are probabilistic in nature, and inferences based on them are only
approximate.
(b) Statistics is liable to be misused. It deals with figures which are innocent by themselves, but which can be easily distorted and manipulated.
Example: Information released from the President's Office of a certain university concerning minority students states that their number has increased from 10 to 20 in this academic year. The release also stated that the student population of the university has increased from 1,000 to 2,500. Based on this information, a newspaper headline reads:
"Number of minority students doubled"
The newspaper headline strayed from the content of the release. Focusing on one aspect of the data (the number of minority students has increased from 10 to 20), the newspaper ignored the other fact, that is, that the student population of the university has increased from 1,000 to 2,500 this year. The fact is that the percentage of minority students has decreased: from 1% last year to 0.8% this year.
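
The percentages can be checked directly (a trivial sketch in Python):

```python
last_year = 10 / 1000  # minority share last year  -> 0.010
this_year = 20 / 2500  # minority share this year  -> 0.008
print(f"{last_year:.1%} -> {this_year:.1%}")  # 1.0% -> 0.8%
```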

6.1.3 Some Basic Terms in Statistics


In collecting data concerning the characteristics of a group of individuals or objects, it is often impossible
or impractical (from the point of view of time and cost) to observe the entire group. In such cases, instead
of examining the entire group, called population, we examine only a small part of the population, called
sample.
Definition: A population is the set of all elements that belong to a certain defined group. A sample is a
part (or a subset) of the population.
Definition: A numerical characteristic of a population is called a parameter. A numerical characteristic of a sample is called a statistic.
6.2 Classification and Presentation of Statistical Data
6.2.1 Introduction
We can apply various sampling techniques and methods of data collection to obtain the data of interest. In its original form, such a data set is a mere aggregate of numbers and hence is not very helpful in extracting information. So, we need to summarize and display the information in a readily digestible form. This may take various forms, such as ordering the data according to their magnitude, compiling them into tables, or graphing them to form a visual image. By doing so, a good overall picture and sufficient information can often be attained.

6.2.2 Scales of measurement and types of classification of data


A. Scales of measurement of data
i. Nominal Scale: The nominal scale assigns numbers as a way to label or identify characteristics. The numbers assigned have no quantitative meaning beyond indicating the presence or absence of the characteristic under investigation. In other words, the numbers are not obtained as a result of a counting or measurement process.
 For example, we can record the gender of respondents as 0 and 1, where 0 stands for male and 1 stands for female. The numbers we assign to the various categories are purely arbitrary, and any arithmetic operation applied to these numbers is meaningless.
ii. Ordinal Scale: The ordinal scale is the next higher level of measurement precision. It ensures that the possible categories can be placed in a specific order (rank) or in some natural way. Again, the numbers are not obtained as a result of a counting or measurement process, and consequently, arithmetic operations are not allowed.
 For example, responses on health service provision can be coded as 1, 2, 3 and 4: 1 for poor, 2 for moderate, 3 for good and 4 for excellent. It is quite obvious that there is some natural ordering: the category 'excellent' (which is coded as 4) indicates better health service provision than the category 'moderate' (which is coded as 2) and, thus, order relations are meaningful.
iii. Interval Scale: The interval scale is the second highest level of measurement precision. Unlike the nominal and ordinal scales of measurement, the numbers in an interval scale are obtained as a result of a measurement process and have some unit of measurement. Also, the differences between any two adjacent points on any part of the scale are meaningful. However, a point cannot be considered to be a multiple of another; that is, ratios have no meaningful interpretation.
 For example, the Celsius temperature scale, which subdivides the distance between the freezing and boiling points into 100 equally spaced parts, is an interval scale. There is a meaningful difference between 30 degrees Celsius and 12 degrees Celsius. However, a temperature of 20 degrees Celsius cannot be interpreted as twice as hot as a temperature of 10 degrees Celsius.
iv. Ratio Scale: The ratio scale represents the highest form of measurement precision. In addition to the properties of all lower scales of measurement, it possesses the additional feature that ratios have a meaningful interpretation. Furthermore, there is no restriction on the kind of statistics that can be computed for ratio-scaled data.
 For example, the height of individuals (in centimeters), the annual profit of firms (in Birr) and plot elevation (in meters) represent ratio scales. The statement "the annual profit of Firm X is twice as large as that of Firm Y" has a meaningful interpretation.

Why is level of measurement important?
a) First, knowing the level of measurement helps you decide how to interpret the data. For example, if you know that a measure is nominal, then you know that the numerical values are just short codes for longer names.
b) Second, knowing the level of measurement helps you decide what statistical analysis is appropriate for the values that were assigned. If a measure is nominal, for instance, then you know that you would never average the data values.
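
For instance (a sketch in Python/pandas rather than SPSS; the variable and its values are hypothetical), an ordinal variable can be stored as an ordered categorical, so that order comparisons are allowed while averaging the codes is avoided:

```python
import pandas as pd

# Ordinal data: health service ratings have a natural order but no meaningful arithmetic
ratings = pd.Series(pd.Categorical(
    ["good", "poor", "excellent", "moderate", "good"],
    categories=["poor", "moderate", "good", "excellent"],
    ordered=True,
))

print((ratings >= "good").sum())  # order comparison is meaningful: 3 ratings are 'good' or better
print(ratings.mode()[0])          # the mode is an appropriate summary; a mean of the codes is not
```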

6.2.3 Types of classification of data


The word variable is often used in the study of statistics, so it is important to understand its meaning. A
variable is a characteristic that may assume more than one set of values to which a numerical measure can be assigned. Sex, age, amount of income, region or country of birth, grades obtained at school and mode of transportation to work are all examples of variables.
There are broadly three types of data that can be employed in quantitative analysis: time series data,
cross-sectional data, and panel data.
i) Time series data: Time series data, as the name suggests, are data that have been collected over a
period of time on one or more variables. Time series data have associated with them a particular
frequency of observation or collection of data points. The frequency is simply a measure of the interval
over, or the regularity with which, the data are collected or recorded.

Examples: the daily Dow Jones stock market average close for the past 90 days, a firm's quarterly sales over the past 5 years, etc.


The data may be quantitative (e.g. exchange rates, prices, number of shares outstanding), or qualitative
(e.g. the day of the week, the number of the financial products purchased by private individuals over a
period of time, etc.).
ii) Cross-sectional data: Cross-sectional data are data on one or more variables collected at a single point
in time. Such data do not have a meaningful sequence. For example, the data might be on:
Sales of 30 companies
Productivity of each sales division
A cross-section of stock returns on the New York Stock Exchange (NYSE)
iii) Panel data: Panel data have the dimensions of both time series and cross-sections, e.g. the daily
prices of a number of blue chip stocks over two years.
Note:
i) For time series data, it is usual to denote the individual observation numbers using the index t, and the total number of observations available for analysis by T. For cross-sectional data, the individual observation numbers are indicated using the index i, and the total number of observations available for analysis by N.
ii) In contrast to the time series case, there is no natural ordering of the observations in a cross-sectional sample. For example, the observations i might be on the price of bonds of different firms at a particular point in time, ordered alphabetically by company name. On the other hand, in a time series context, the ordering of the data is relevant since the data are usually ordered chronologically.
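
The three data structures can be sketched side by side (in Python/pandas; all values are hypothetical):

```python
import pandas as pd

# Time series: one firm's quarterly sales, indexed by time (t = 1, ..., T)
ts = pd.Series([120, 135, 128, 150],
               index=pd.period_range("2013Q1", periods=4, freq="Q"))

# Cross-section: sales of several firms at a single point in time (i = 1, ..., N)
cs = pd.Series({"Firm A": 120, "Firm B": 95, "Firm C": 210})

# Panel: both dimensions at once (firm i observed in quarter t)
panel = pd.DataFrame(
    {"sales": [120, 135, 95, 101]},
    index=pd.MultiIndex.from_product(
        [["Firm A", "Firm B"], ["2013Q1", "2013Q2"]], names=["firm", "quarter"]),
)

print(ts, cs, panel, sep="\n\n")
```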
6.2.4 Continuous and discrete variables
As well as classifying data as being of the time series or cross-sectional type, we could also distinguish it
as being either continuous or discrete.
i) A quantitative variable that has a connected string of possible values at all points along the number
line, with no gaps between them, is called a continuous variable. In other words, a variable is said to
be continuous if it can assume an infinite number of real values within a certain range. It can take on
any value and is not confined to take specific numbers. The values of such variables are often obtained
by measuring. Examples of a continuous variable are distance, age and daily revenue.

The measurement of a continuous variable is restricted by the methods used, or by the accuracy of the measuring instruments. For example, the height of a student is a continuous variable because a student may be 1.6321748755... meters tall. However, when the height of a person is measured, it is usually measured to the nearest centimeter. Thus, this student's height would be recorded as 1.63 m.
ii) A quantitative variable that has separate values at specific points along the number line, with gaps
between them, is called a discrete variable. Such variables can only take on certain values, which are
usually integers (whole numbers), and are often defined to be count numbers (i.e., obtained by
counting).

The number of people in a particular shopping mall per hour or the number of shares traded during a day are examples of discrete variables. These can take on values such as 0, 1, 2, 3, ... In these cases, having 86.3 people in the mall or 585.7 shares traded would not make sense.
6.2.5 Methods of Summarizing and Presenting Quantitative Data



A. Grouped Frequency Distribution


Raw data in its original form is a mere aggregate of numbers and, hence, is not very helpful in extracting information. So, we need to summarize the data and display the information they contain in a readily digestible form. One such method is the grouped frequency distribution.

Definition: A grouped frequency distribution is a table in which the observed values of a variable are grouped into classes, together with the number of observed values falling into each class. The number of observed values that belong to a particular predefined interval (or class) is called its frequency.

Example: The following data represent the average monthly number of road fatalities (human injuries due to traffic accidents) for a total of 50 major roads in a city:

46.6  55.2  55.7  48.8  48.8  61.5  56.0  59.3  60.5  58.0
43.2  54.5  54.6  56.8  50.3  51.2  50.6  55.4  50.9  52.6
47.1  57.6  57.0  57.8  53.9  58.8  53.6  63.8  49.6  53.3
57.9  53.9  52.7  52.4  47.4  53.0  55.2  58.3  59.1  56.8
56.0  59.2  57.1  53.3  52.4  47.8  45.8  57.3  56.1  51.8
These data may be summarized into a grouped frequency distribution as:

Table: Frequency distribution of the average monthly number of road fatalities

Average monthly number of road    Frequency (number of
fatalities (class limits)         major roads)
43.2 - 46.6                              3
46.7 - 50.1                              6
50.2 - 53.6                             13
53.7 - 57.1                             15
57.2 - 60.6                             11
60.7 - 64.1                              2
Total                                   50
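For readers who want to verify the grouping, the table above can be reproduced with a minimal Python sketch (the module's lab work uses SPSS; this is only an illustrative alternative). The bin edges are the class boundaries implied by the class limits.

    import numpy as np

    fatalities = [46.6, 55.2, 55.7, 48.8, 48.8, 61.5, 56.0, 59.3, 60.5, 58.0,
                  43.2, 54.5, 54.6, 56.8, 50.3, 51.2, 50.6, 55.4, 50.9, 52.6,
                  47.1, 57.6, 57.0, 57.8, 53.9, 58.8, 53.6, 63.8, 49.6, 53.3,
                  57.9, 53.9, 52.7, 52.4, 47.4, 53.0, 55.2, 58.3, 59.1, 56.8,
                  56.0, 59.2, 57.1, 53.3, 52.4, 47.8, 45.8, 57.3, 56.1, 51.8]

    # Class boundaries lie midway between the upper limit of one class and
    # the lower limit of the next (class width 3.5).
    edges = [43.15, 46.65, 50.15, 53.65, 57.15, 60.65, 64.15]
    freq, _ = np.histogram(fatalities, bins=edges)
    for lo, hi, f in zip(edges[:-1], edges[1:], freq):
        print(f"{lo:.2f} - {hi:.2f}: {f}")   # prints 3, 6, 13, 15, 11, 2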

B. Graphical presentation of data


Graphs are effective visual tools because they present information quickly and easily. It is not surprising
then, that graphs are commonly used by print and electronic media. Often data are better understood when
presented by a graph than by a table because the graph can easily reveal a trend (the rise or decline of a variable over time) and is a simpler visual aid for comparison purposes. Some of the reasons why we use graphs when presenting data include: they are quick and direct; they facilitate understanding of the data; they can convince readers; and they can be easily remembered.
If you have decided that using a graph is the best method to relay your message, then some of the
guidelines to follow are:
1. Define your target audience.
Ask yourself the following questions to help you understand more about your audience and what their
needs are: Who is your target audience? What do they know about the issue? What do they expect to see?
What do they want to know? What will they do with the information?
2. Determine the message(s) to be transmitted.
Ask yourself the following questions to figure out what your message is and why it is important: What do
the data show? Is there more than one main message? What aspect of the message(s) should be
highlighted? Can all of the message(s) be displayed on the same graphic?
Knowing what type of graph to use with what type of information is crucial. Depending on the nature of
the data some graphs might be more appropriate than others. There are many different types of graphs that
can be used to convey information. These include vertical line graphs, bar graphs (charts), pie charts and
histograms, among others.
The presentation of data in the form of tables, graphs and charts is an important part of the process of data analysis and report writing. While the results can be expressed within the text of a report, data are usually more digestible if they are presented in the form of a table or graphical display. Graphs and charts can quickly convey to the reader the essential points or trends in the data.

Some general recommendations to follow when presenting data:
- The presentation should be as simple as possible; avoid the trap of adding too much information. A good rule of thumb is to present only one idea, or to have only one purpose, for each graph or chart you create.
- The presentation should be self-explanatory.
- The title should be clear and concise, indicating what, when and where the data were obtained.
- Codes, legends and labels should be clear and concise, following standard formats if possible.
- The use of footnotes is advised to explain essential features of the data that are critical for the correct interpretation of the graph or chart.

Data Presentation Tools


Several types of statistical/data presentation tools exist, including:
1. charts displaying frequencies (bar, pie and Pareto charts),
2. charts displaying trends (run and control charts),
3. charts displaying distributions (histograms), and
4. charts displaying associations (scatter diagrams).
Different types of data require different kinds of statistical tools. There are two types of data:
- Attribute data are countable data or data that can be put into categories: e.g., the number of people willing to pay, the number of complaints, or the percentage who want blue/the percentage who want red/the percentage who want yellow.
- Variable data are measurement data, based on some continuous scale: e.g., length, time and cost.
To Show                              Use                Data Needed
Frequency of occurrence:             Bar chart          Tallies by category (data can be
simple percentages or                Pie chart          attribute data or variable data
comparisons of magnitude             Pareto chart      divided into categories)
Trends over time                     Line graph         Measurements taken in chronological
                                     Run chart          order (attribute or variable data
                                     Control chart      can be used)
Distribution: variation not          Histogram          Forty or more measurements (not
related to time (distributions)                         necessarily in chronological order;
                                                        variable data)
Association: looking for a           Scatter diagram    Forty or more paired measurements
correlation between two things                          (measures of both things of
                                                        interest; variable data)

i) The bar graph (chart)

Bar graphs are one of the many techniques used to present data in a visual form so that the reader may
readily recognize patterns or trends. Bar graphs usually present categorical (qualitative) variables or
numeric (discrete) variables grouped in class intervals.
Example: The following data are on the bed sizes (that is, the total number of beds available for patient use) in three hospitals for the years 2003-2005.

Table: The bed sizes of three hospitals from 2003 to 2005

Hospital    2003    2004    2005
A             40      45      45
B             25      60      60
C             35      45      75
Total        100     150     180

The simple bar chart does not consider the contribution of each hospital to the total bed size. It simply
provides information on the aggregate bed sizes in the three hospitals for the years 2003-2005.

Figure 1: A simple bar chart for the total bed size in three hospitals

A multiple bar chart (double (or group) bar graph) is another effective means of comparing sets of data.
This type of vertical bar graph gives two or more pieces of information for each item on the x-axis instead
of just one as in Figure 1. In this particular example it allows you to make direct comparisons of values
across categories on the same graph, where the values are bed sizes and the categories are the years 2003-2005. The graph is shown in Figure 2 below.

Figure 2: A multiple bar chart for the bed sizes in three hospitals

From Figure 2, for example, we can see that the total number of beds available for patient use in Hospital
C has consistently increased from 2003 to 2005. On the other hand, the bed size of Hospital B has
65

Didactic Design: <Title of Module>

66

increased from 2003 to 2004, but remained the same in 2005. When comparison is made between
hospitals, we can see that Hospital A had the highest bed size in 2003, but gave way to Hospital B in 2004
and to Hospital C in 2005.
ii) The pie chart
A pie chart is a chart that is used to summarize a set of categorical data or to display the different values
of a given variable by means of percentage distribution. This type of chart is a circle divided into a series
of segments (or sectors) each representing a particular category. The area of each segment is the same
proportion of a circle as the category is of the total data set. The use of the pie chart is quite popular since
the circle provides a visual concept of the whole (100%).
Example: A sample of 100 adults was asked what they feel is the most important issue facing today's youth among: unemployment, youth violence, rising school fees, drugs in schools, and career counselling. The results are shown below:
Table: Adults' opinion on the most important issue facing the youth

Issue                 Number of adults
Unemployment                38
Youth violence               8
Rising school fees          12
Drugs in schools            22
Career counselling          20
Total                      100

First we have to find the percentage contribution of each category (issue), and then the angle measures of the sectors representing each category of responses have to be calculated.

Issue                 Percentage share    Angle measure of sector
Unemployment          38/100 = 38%        38% x 360° = 136.8°
Youth violence         8/100 =  8%         8% x 360° =  28.8°
Rising school fees    12/100 = 12%        12% x 360° =  43.2°
Drugs in schools      22/100 = 22%        22% x 360° =  79.2°
Career counselling    20/100 = 20%        20% x 360° =  72.0°
Total                 100%                360°

Finally, we partition the circle into sectors based on the above angle measures.

Figure 3: A pie chart of the opinion of adults as to the most important issue facing today's youth

iii) The histogram
The most common form of graphical presentation of a grouped frequency distribution is the histogram. It
is used to summarize variables whose values are numerical and measured on an interval scale. It divides
up the range of possible values in a data set into classes or groups. For each group, a rectangle is
constructed with a base length equal to the range of values in that specific group (or class width), and an
area proportional to the number of observations falling into that group.
Example: Considering the grouped frequency distribution of the average monthly number of road fatalities for the 50 major roads in the city given above, construct a histogram.

Table: Frequency distribution of the average monthly number of road fatalities

Average monthly number of road    Class boundaries    Frequency (number of
fatalities (class limits)                             major roads)
43.2 - 46.6                       43.15 - 46.65              3
46.7 - 50.1                       46.65 - 50.15              6
50.2 - 53.6                       50.15 - 53.65             13
53.7 - 57.1                       53.65 - 57.15             15
57.2 - 60.6                       57.15 - 60.65             11
60.7 - 64.1                       60.65 - 64.15              2
Total                                                       50


Figure 4: Histogram of the mean monthly number of road fatalities in a city

Summary
After being collected and processed, data need to be organized to produce useful information or output.
Output is usually governed by the need to communicate specific information to a specific audience. The
only limit to the different forms of output you can produce is the different types of output devices
currently available. To help determine the best output type for the information you have produced, you
need to ask yourself these questions: For whom is the output being produced? How will the audience best
understand it?
Generally, we have two types of output: tables and graphs. Grouping variables and presenting them as a grouped frequency distribution is part of the process of organizing data so that they become useful information. If a variable is continuous or takes a large number of values, then it is easier to present and handle the data by grouping the values into class intervals. Discrete variables, on the other hand, may or may not be grouped into class intervals.
The other type of output is graphs. Graphs are effective visual tools because they present information quickly and easily. If you have decided that using a graph is the best method to relay your message, then the guidelines to remember are: define your target audience (understand more about your audience and what their needs are); determine the message to be transmitted (figure out what your message is and why it is important); and experiment with different types of graphs and select the most appropriate. Note that it is not appropriate to use a graph when there are too few data (one, two or three data points) or when the data show little or no variation.

6.3 Measures of central tendency and dispersion
6.3.1 Measures of central tendency
One of the most important objectives of statistical analysis is to determine various numerical measures
which describe the inherent characteristics of the data. In other words, it is often necessary to represent a
data set by means of a single number which is descriptive of the entire set. Such values usually tend to lie
centrally within a set of data arranged in increasing or decreasing order of magnitude. Thus, we refer to
these as measures of central tendency. These include the mean, median and mode.

A. The Mean
i) The arithmetic mean
The arithmetic mean, or simply the mean, of a set of n observations $X_1, X_2, \ldots, X_n$, denoted by $\bar{X}$, is defined as:

$\bar{X} = \dfrac{X_1 + X_2 + \cdots + X_n}{n} = \dfrac{1}{n}\sum_{j=1}^{n} X_j$  .......... (1)

ii) The weighted mean
Sometimes some of the observations in a data set may have greater weight or importance than the others. For instance, the final examination in a course is often given more weight than the mid-term examination or tests in determining the overall score. In such cases, we associate weights $w_1, w_2, \ldots, w_k$ with the observations $X_1, X_2, \ldots, X_k$, respectively, depending on their importance. The mean obtained in this way is called the weighted mean, and is defined as:

$\bar{X}_w = \dfrac{w_1 X_1 + w_2 X_2 + \cdots + w_k X_k}{w_1 + w_2 + \cdots + w_k} = \dfrac{\sum_j w_j X_j}{\sum_j w_j}$  .......... (3)

Example: Portfolio rate of return
The expected return of a portfolio (an interest rate, indicating performance) is the weighted average of the expected rates of return of the assets in the portfolio, weighted by the dollars invested in each.
Suppose a portfolio contains three stocks. One ($1,000 invested) is expected to return 20%, another ($1,800 invested) 15%, and the third ($2,200 invested) 30%.
The total invested is 1,000 + 1,800 + 2,200 = $5,000, so the weights are:
w1 = $1,000/$5,000 = 0.20
w2 = $1,800/$5,000 = 0.36
w3 = $2,200/$5,000 = 0.44
The weighted average is:
0.20(20%) + 0.36(15%) + 0.44(30%) = 22.6%
This is the expected (mean) return for the portfolio. Note that each stock is represented in proportion to the dollars invested.
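The same computation can be written as a short Python sketch (illustrative only; the figures are those of the example above):

    import numpy as np

    invested = np.array([1000.0, 1800.0, 2200.0])  # dollars invested in each stock
    returns = np.array([0.20, 0.15, 0.30])         # expected rates of return

    weights = invested / invested.sum()            # 0.20, 0.36, 0.44
    portfolio_return = np.average(returns, weights=weights)
    print(portfolio_return)                        # 0.226, i.e. 22.6%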


iii) The grand mean (mean of means)
If the mean of a set of $n_1$ numbers is $\bar{X}_1$, of $n_2$ numbers is $\bar{X}_2$, ..., of $n_k$ numbers is $\bar{X}_k$, then we define the grand mean (or mean of means) as:

$\bar{X}_G = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + \cdots + n_k\bar{X}_k}{n_1 + n_2 + \cdots + n_k} = \dfrac{\sum_j n_j\bar{X}_j}{\sum_j n_j}$  .......... (4)

Example: In a company, the average salary of a group of 50 male employees is 700 Birr and that of a
group of 30 female employees is 500 Birr. Find the average salary of male and female employees
combined.
Solution:
Let $n_M$, $n_F$ denote the number of male and female employees, respectively, and let $\bar{X}_M$, $\bar{X}_F$ denote the average salary of male and female employees, respectively. Then the average salary of male and female employees combined is:

$\bar{X}_G = \dfrac{n_M\bar{X}_M + n_F\bar{X}_F}{n_M + n_F} = \dfrac{50(700) + 30(500)}{50 + 30} = 625$ Birr.

B. The median
The median, denoted by $\tilde{X}$, is a single value that divides a set of data into two equal parts. It is the middle-most or most central item in the data set.
Note: Data values which are by far smaller or larger than the bulk of the data are called extreme values or outliers. Whenever such extreme values exist, the mean may give a distorted picture of the data. The median of such data, on the other hand, gives a good overall picture of the data.
Example: income
- The average (mean) income for a country equally divides the total, which may include some very high incomes.
- The median income chooses the middle person (half earn less, half earn more), giving less influence to high incomes (if any).

C. Quartiles
The median divides a set of data into two equal parts. The values that divide a set of data into four equal parts are called quartiles, and are denoted by $Q_1$, $Q_2$ and $Q_3$. If the data are arranged in increasing order, the positions of the quartiles are:

min --- $Q_1$ --- $Q_2$ --- $Q_3$ --- max

where min is the minimum observation and max is the maximum observation. Note that the second quartile ($Q_2$) is the median.

D. The mode
The mode, denoted by $\hat{X}$, of a set of numbers is the value which occurs with the greatest frequency. A data set is called uni-modal, bi-modal or multi-modal depending on whether it has one mode, two modes or more than two modes, respectively.
6.3.2 Measures of dispersion

The measures of central tendency help us describe a set of data by a single number or typical value. However, they do not provide any information about the extent to which the values differ from one another or from an average value. This bit of information is essential, as illustrated in the following example.
Example: Suppose we are told that the mean of two numbers is 1000. Then these two numbers may be 4 and 1996, or 990 and 1010. Definitely, 1000 is not a good descriptive measure for the numbers 4 and 1996, while it is a good representative figure for 990 and 1010. The reason is that there is a considerable difference between the numbers 4 and 1996, while the difference between 990 and 1010 is relatively small.
Thus, the dispersion (spread or variability) of a data set gives us additional information that enables us to judge the reliability of our measure of central tendency: if data are widely dispersed, then the mean (median or mode) is less representative of the data as a whole than it would be for data with small dispersion. The measures of dispersion also enable us to compare several samples with similar averages.
Examples:
1. Financial analysts are concerned about the dispersion of a firm's earnings. Widely dispersed earnings, those varying from extremely high to low or even negative levels, indicate a higher risk to stockholders and creditors than do earnings that remain relatively stable.
2. Quality control experts analyze the dispersion of a product's quality levels. A drug that is average in purity but ranges from very pure to highly impure may endanger lives.

A. The range
A quick and easy indication of dispersion is the range. The range of a set of data is the difference between the largest and smallest observed values, i.e.,
Range = max - min,
where max = largest observation and min = smallest observation.
B. The interquartile range
In case extreme values exist, we use another measure of dispersion, called the interquartile range (Q), which is defined as:

Q = Q3 - Q1
where Q3 and Q1 are the third and first quartiles, respectively.
Identifying outliers
Outliers are observations that are far from the center of the distribution. These are defined as observations which are either:
- greater than $Q_3 + 1.5Q$, or
- less than $Q_1 - 1.5Q$.
These are shown in the figure below. This figure is known as the box-plot.

C. Mean (average) deviation
If $X_1, X_2, \ldots, X_n$ are n observations of a variable X, then the mean deviation (MD) about the mean is defined as:

$MD = \dfrac{1}{n}\sum_{j=1}^{n} |X_j - \bar{X}|$,

where $\bar{X}$ is the sample mean.

D. Variance and standard deviation
Suppose a population consists of the values $X_1, X_2, \ldots, X_N$. We define the population variance, denoted by $\sigma^2$, as:

$\sigma^2 = \dfrac{1}{N}\sum_{j=1}^{N} (X_j - \mu)^2$

where $\mu$ is the population mean defined by:

$\mu = \dfrac{1}{N}\sum_{j=1}^{N} X_j$

The positive square root of the population variance is called the population standard deviation, i.e.,

$\sigma = \sqrt{\dfrac{1}{N}\sum_{j=1}^{N} (X_j - \mu)^2}$

Usually, information about the entire population is not available. The reason for this is that collecting data about the entire population is time consuming, expensive, and sometimes impossible. Thus, we often take a sample and infer something about the population based on sample statistics. If we have a sample of size n comprising the values $X_1, X_2, \ldots, X_n$, we calculate the sample variance as:

$S^2 = \dfrac{\sum_{j=1}^{n} (X_j - \bar{X})^2}{n-1}$

where $\bar{X}$ is the sample mean defined by:

$\bar{X} = \dfrac{1}{n}\sum_{j=1}^{n} X_j$

The sample standard deviation is defined as the positive square root of the sample variance, i.e.,

$S = \sqrt{\dfrac{\sum_{j=1}^{n} (X_j - \bar{X})^2}{n-1}}$
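A minimal Python sketch of the distinction (the data are hypothetical, used only to show that the population formula divides by N while the sample formula divides by n - 1):

    import numpy as np

    x = np.array([4.0, 8.0, 6.0, 5.0, 3.0])  # hypothetical sample

    pop_var = np.var(x, ddof=0)    # divides by N   (population formula)
    samp_var = np.var(x, ddof=1)   # divides by n-1 (sample formula)
    samp_sd = np.std(x, ddof=1)    # sample standard deviation
    print(pop_var, samp_var, samp_sd)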

Properties of the standard deviation
a) If we add (subtract) a constant to (from) each value of a data set, then the standard deviation remains unchanged.
b) If we multiply each value of a data set by a constant, then the new standard deviation will be the original standard deviation multiplied by the absolute value of that constant.
Symbolically, if the standard deviation of the observations $X_1, X_2, \ldots, X_n$ is S, then the standard deviation of:
a) $X_1 + k, X_2 + k, \ldots, X_n + k$ will be S;
b) $kX_1, kX_2, \ldots, kX_n$ will be $|k|S$,
where k is a constant.
E. The coefficient of variation
The standard deviation is an absolute measure of dispersion that expresses the variation in the same units as the original data. But it cannot be the sole basis for comparing two distributions. If we have a standard deviation of 10 and a mean of 5, the values vary by an amount twice as large as the mean itself. If, on the other hand, we have a standard deviation of 10 and a mean of 5,000, the variation relative to the mean is insignificant. Therefore, we cannot fully determine the dispersion of a set of data unless we know the standard deviation, the mean, and how the standard deviation compares with the mean.
Example: If the weights of certain objects or individuals have a standard deviation of 1 Kg, then this figure alone does not tell us whether there is a great deal of variation or not: if the data are weights of new-born babies, S = 1 Kg shows a great deal of variability; whereas if the data are the weights of adult elephants, S = 1 Kg shows a small variability.
In such cases, what we need is a relative measure that will give us a feel for the magnitude of the deviation relative to the magnitude of the mean. One such relative measure of dispersion is the coefficient of variation, V, defined by:

$V = \dfrac{S}{\bar{X}} \times 100\%$

where S = standard deviation and $\bar{X}$ = mean.

F. The standard score
Suppose a student scored 65% in a statistics test and 70% in a mathematics test. In which subject is his performance better? In order to answer such questions, we need to compare the score of the student with the average score of all students who sat for these exams in both subjects (simply comparing 65 and 70 may lead us to a wrong conclusion). For such purposes, we define the standard score (or Z-score), which is given by:

$Z = \dfrac{X - \bar{X}}{S}$

The standard score measures the deviation of each value of a data set from the mean in units of the standard deviation. It is used to compare the relative standing of values.
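A small numerical sketch of the idea, using hypothetical class averages and standard deviations (these figures are not from the handout): suppose the statistics test had mean 60 and standard deviation 10, while the mathematics test had mean 68 and standard deviation 8.

    z_statistics = (65 - 60) / 10   # 0.50 standard deviations above the mean
    z_mathematics = (70 - 68) / 8   # 0.25 standard deviations above the mean
    print(z_statistics, z_mathematics)
    # Under these assumed class statistics, the 65% in statistics is the
    # relatively better performance, despite the lower raw score.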

6.3.3 General shape of distributions


One of the main features of a distribution is the extent to which it is symmetric.
1) A perfectly symmetric curve is one in which both sides of the distribution would exactly match the
other if the figure were folded over its central point. An example is shown below:

A symmetric, uni-modal, bell-shaped distribution is called a normal distribution. In such distributions, the central values have the highest frequency. As we move to both tails, the frequency keeps on decreasing. Many phenomena, such as the weight, height, intelligence quotient (IQ), etc. of individuals, the daily output of a production line, and the like, can be approximately described by a normal distribution.

2) If a distribution is lop-sided, that is, if most of the data are concentrated either on the left or the right
end, it is said to be skewed.
a) A distribution is said to be skewed to the right, or positively skewed, when most of the data are
concentrated on the left of the distribution.
Income provides one example of a positively skewed distribution. Most people have small incomes, but some make quite a bit more, with a smaller number making many millions of dollars a year. Therefore, the positive (right) tail on the line graph for income extends out quite a long way, whereas the negative (left) tail stops at zero. The right tail clearly extends farther from the distribution's centre than the left tail, as shown below:

b) A distribution is said to be skewed to the left, or negatively skewed, if most of the data are
concentrated on the right of the distribution. The left tail clearly extends farther from the distribution's
centre than the right tail as shown below:

Example: The following data are the scores of 41 students on a math test (rounded to the nearest integer):
31, 49, 19, 62, 50, 24, 45, 23, 51, 32, 48, 55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 50, 55, 18, 53, 41, 50, 34,
67, 56, 44, 4, 54, 57, 39, 52, 45, 35, 51, 63, 42
The histogram and the frequency polygon superimposed on it are shown in the Figure below.

Figure: Histogram of the score of students on math test


The amount of spread in the distribution can be spotted on the graph. For example, the figure reveals that most students scored in the interval between 50 and 59, while only a few students scored less than 20. The distribution has a single peak within the 50-59 interval. It also shows that most data are clustered at the right. The left tail extends farther from the data centre (median = 48) than the right tail. Therefore, the distribution is skewed to the left, or negatively skewed.
Note: For normal distributions, the mean and standard deviation are the best measures of central tendency
and dispersion (spread), respectively. However, the standard deviation is not a good measure of spread in
highly skewed distributions. It is also highly influenced by outliers (extreme values). A single outlier can
raise the standard deviation and in turn, distort the picture of spread.

6.4 Exercise with SPSS Application (Computer Lab)


Unit Seven
Tests of hypothesis concerning means and proportions
7.1 Tests of hypotheses concerning means
1. Parametric and non-parametric statistics (tests)
Parametric tests are statistical tests which make certain assumptions about the parameters of the full
population from which the sample is taken (e.g., a normal distribution). If those assumptions are correct,
parametric methods can produce more accurate and precise estimates (they are said to have more
statistical power). However, if those assumptions are incorrect, parametric methods can be very
misleading. These tests normally involve data expressed in absolute numbers (interval or ratio) rather than
ranks and categories (nominal or ordinal). Such tests include analysis of variance (ANOVA), t-tests, etc.
Consider the frequency distribution shown below. It can easily be observed that the distribution deviates
substantially from the normal distribution (the bell-shaped distribution).

This may also be the case for many variables of interest. For example, is income distributed normally in the population? Probably not. The incidence rates of rare diseases are not normally distributed in the population, and neither are many other variables in which a researcher might be interested. With a small sample at hand, analyzing such variables using parametric tests might be misleading!
Note: We can apply parametric tests even if we are not sure that the distribution of the variable under
investigation in the population is normal as long as our sample is large enough. If our sample is very
small, however, then those tests can be used only if we are sure that the variable is normally distributed.
Applications of tests that are based on the normality assumptions are further limited by a lack of precise
measurement. For example, course grade (A, B, C, D, F) is a crude measure of scholastic
accomplishments that only allows us to establish a rank ordering of students from "good" students to
"poor" students. Most common statistical techniques assume that the underlying measurements are at least
of interval scale. However, as in our example, this assumption is very often not tenable, and the data represent a rank ordering of observations (ordinal) rather than precise measurements.
Thus, the need is evident for statistical procedures that allow us to process data of low quality (nominal
or ordinal), from small samples on variables about which nothing is known (concerning their
distribution). Nonparametric methods have been developed to be used in cases when the researcher
knows nothing about the parameters of the variable of interest in the population (hence the name
nonparametric).
In more technical terms, nonparametric methods do not rely on the estimation of parameters (such as the
mean or the standard deviation) describing the distribution of the variable of interest in the population.
Therefore, these methods are also sometimes (and more appropriately) called parameter-free methods or
distribution-free methods.
Non-parametric methods are widely used for studying populations that take on a ranked order. The use of
non-parametric methods may be necessary when data have a ranking but no clear numerical
interpretation, such as when assessing preferences; in terms of levels of measurement, for data on an
ordinal scale.
When to use which method
Basically, there is at least one nonparametric equivalent for each parametric general type of test. In general, these tests fall into the following categories:
- tests of differences between groups (independent samples);
- tests of differences between variables (dependent samples);
- tests of relationships between variables.

Differences between independent groups: Usually, when we have two samples that we want to compare
concerning their mean value for some variable of interest, we would use the t-test for independent
samples. A nonparametric alternative for this test is the Mann-Whitney U test. If we have multiple groups,
we would use (the parametric) analysis of variance; a nonparametric equivalent to this method is the
Kruskal-Wallis analysis of ranks test.
Differences between dependent groups: If we want to compare two variables measured in the same
sample we would customarily use the t-test for dependent samples (if we want to compare students' math
skills at the beginning of the semester with their skills at the end of the semester). Nonparametric
alternatives to this test are the Sign test and Wilcoxon's matched pairs test. If the variables of interest are
dichotomous in nature (i.e., "pass" vs. "no pass") then McNemar's Chi-square test is appropriate.
Relationships between variables: To express a relationship between two variables one usually computes
the correlation coefficient. A nonparametric equivalent to the standard correlation coefficient is the
Spearman rank correlation coefficient. If the two variables of interest are categorical in nature (e.g.,
"passed" vs. "failed" by "male" vs. "female") an appropriate nonparametric test of the relationship
between the two variables is the Chi-square test.

2. Tests concerning the difference between two means (independent samples)

The null and alternative hypotheses are:
$H_0: \mu_1 = \mu_2$
$H_A: \mu_1 \neq \mu_2$
where $\mu_1$ is the mean of population 1 and $\mu_2$ is the mean of population 2. The null hypothesis ($H_0$) simply states that the two populations under consideration have equal means. Note that our conclusion is about the means of the two populations (i.e., the true means); it is not about the samples (or sample means)!

a) The t-test
Assumptions:
1) The samples come from two normally distributed populations.
2) $\sigma_1^2$ and $\sigma_2^2$ are assumed equal but not known.
Under these assumptions, we can use the t-distribution with $(n_1 + n_2 - 2)$ degrees of freedom to find the critical values $t_{\alpha/2}(n_1 + n_2 - 2)$ for a given level of significance ($\alpha$). The test statistic is given by:

$t_{cal} = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

where $S_p$ is the pooled standard deviation defined as:

$S_p = \sqrt{\dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$

Here $S_1^2$ and $S_2^2$ are the variances computed from the samples.

Decision rule: Reject $H_0$ if $|t_{cal}| > t_{\alpha/2}(n_1 + n_2 - 2)$.

Example 1: The following summary statistics are on the annual household income (in thousands of dollars) of individuals who previously defaulted (group 1) and who did not default (group 2) on their bank loans (data obtained from the SPSS package: bankloan.sav).

                 Defaulted (group 1)        Not defaulted (group 2)
Mean             X-bar_1 = 41.2131          X-bar_2 = 47.1547
Variance         S_1^2 = 1858.949           S_2^2 = 1171.019
Sample size      n_1 = 183                  n_2 = 517

Test if there is a significant difference in the mean income of defaulters and non-defaulters at the 5% level of significance.
Solution
The null and alternative hypotheses are:
$H_0: \mu_1 = \mu_2$
$H_A: \mu_1 \neq \mu_2$
The level of significance is $\alpha$ = 0.05.
The pooled standard deviation is computed as:

$S_p = \sqrt{\dfrac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}} = \sqrt{\dfrac{(183-1)(1858.949) + (517-1)(1171.019)}{183+517-2}} = 36.7477$

The test statistic is thus:

$t_{cal} = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \dfrac{41.2131 - 47.1547}{36.7477\sqrt{\frac{1}{183}+\frac{1}{517}}} = -1.88$

For $\alpha$ = 0.05, $t_{\alpha/2}(n_1+n_2-2) = t_{0.025}(183+517-2) = t_{0.025}(698) = 1.960$.

Decision: Since the absolute value of the test statistic, $|t_{cal}|$ = 1.88, does not exceed the critical value, we do not reject $H_0$.
Conclusion: There is no significant difference in the mean income of defaulters and non-defaulters at the 5% level of significance.
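The same pooled-variance test can be reproduced from the summary statistics alone with SciPy (an illustrative alternative to SPSS; the standard deviations are the square roots of the variances given above):

    import numpy as np
    from scipy import stats

    t, p = stats.ttest_ind_from_stats(
        mean1=41.2131, std1=np.sqrt(1858.949), nobs1=183,
        mean2=47.1547, std2=np.sqrt(1171.019), nobs2=517,
        equal_var=True)            # pooled-variance (equal variances) t-test
    print(t, p)                    # compare |t| with the critical value 1.960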

Remark: The nonparametric equivalent - the Mann-Whitney U test
The nonparametric equivalent of the t-test is the Mann-Whitney U test. This nonparametric test is virtually identical to performing an ordinary parametric two-sample t-test on the data after ranking over the combined samples. It requires the two samples to be independent, and the observations to be ordinal or continuous measurements. Unlike the parametric t-test, this nonparametric test makes no assumptions about the distribution of the data (e.g., normality).
Example 2: Consider the data above on the annual household income (in thousands of dollars) of individuals who previously defaulted (group 1) and did not default (group 2) on their bank loans. To apply the Mann-Whitney U test for the difference in the mean income of defaulters and non-defaulters, the procedure in SPSS is as follows:

Analyze → Nonparametric Tests → 2 Independent Samples
  Test Variable List: Household income [income]
  Grouping Variable: default(? ?) → Define Groups: Group 1: 1; Group 2: 0
  OK
The output is as shown below:
The output is as shown below:

Mann-Whitney Test

Ranks
                                Previously defaulted      N     Mean Rank    Sum of Ranks
Household income in thousands   No                      517        368.83      190685.50
                                Yes                     183        298.71       54664.50
                                Total                   700


Test Statistics(a)
                          Household income in thousands
Mann-Whitney U                       37828.500
Wilcoxon W                           54664.500
Z                                       -4.032
Asymp. Sig. (2-tailed)                    .000
a. Grouping Variable: Previously defaulted

Decision: Since the p-value is less than 1%, we reject $H_0$. Thus, we conclude that there is a significant difference in the mean income of defaulters and non-defaulters at the 1% level of significance.
Question: Why did the two tests lead to different conclusions?
This is unexpected, and we have to look for the source of this discrepancy. The box plot of the data is shown below:

Figure: A box plot of the mean income of defaulters and non-defaulters


We see from the box plot that the item in the 445th row is an outlier (extreme value). The discrepancy is probably due to this value. After removing this row, the independent-samples t-test yields the following result:
Independent Samples Test

                              Levene's Test
                              for Equality
                              of Variances     t-test for Equality of Means
                                                                                                            95% Confidence Interval
                                                              Sig.        Mean         Std. Error           of the Difference
                              F       Sig.    t      df       (2-tailed)  Difference   Difference           Lower        Upper
Household   Equal variances
income in   assumed           3.517   .061   -2.836  697      .005        -8.16573     2.87926              -13.81879    -2.51266
thousands   Equal variances
            not assumed                      -2.975  347.531  .003        -8.16573     2.74484              -13.56432    -2.76713

Remark
The independent-samples t-test is of two types: equal variances assumed ($H_0: \sigma_1^2 = \sigma_2^2$) and equal variances not assumed ($H_1: \sigma_1^2 \neq \sigma_2^2$). In order to identify which of the two tests is appropriate, we use Levene's test for equality of variances. If the p-value for this test is less than 5%, then we reject $H_0$ and consider the result in the 'Equal variances not assumed' row; otherwise, we use the result in the 'Equal variances assumed' row.
In our case, Levene's test for equality of variances has a p-value of 0.061, which is greater than 5%. Thus, we do not reject $H_0: \sigma_1^2 = \sigma_2^2$, and consequently, we refer to the result in the 'Equal variances assumed' row. The p-value there is 0.005, which is less than 1%. Thus, we reject the hypothesis of equality of the means of the two groups.
3. Tests of mean difference between several populations (independent samples)
We have seen above how to apply hypothesis testing procedures to test the null hypothesis of no difference between two population means. It is not unusual for the investigator to be interested in testing the null hypothesis of no difference among several population means.


Assumptions: Random samples of size n are taken from each of the k populations, which are independently and normally distributed with means $\mu_1, \mu_2, \ldots, \mu_k$ and common variance $\sigma^2$ (i.e., the variability in each group is the same). Also, all observations are continuous.
Under this general principle we want to test:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
against the alternative:
$H_1$: at least two of them are different.

ANOVA
For such a test of hypothesis we use a method called analysis of variance. Analysis of variance is a method of splitting the total variation into meaningful components that measure different sources of variation.
In other words, we split the total sum of squares ($SS_{total}$) into the between-groups (samples) sum of squares ($SS_{between}$) and the within-groups (samples) sum of squares ($SS_{within}$). The test statistic for testing $H_0$ versus $H_1$ is given by the variance ratio:

$F_{cal} = \dfrac{SS_{between}/(k-1)}{SS_{within}/(k(n-1))}$

where k is the number of groups and n is the sample size from each group. In order to decide whether the null hypothesis has to be rejected or not, we compare the above test statistic with $F_\alpha(k-1, k(n-1))$, the value from the F-distribution with $(k-1)$ and $k(n-1)$ degrees of freedom for a given level of significance $\alpha$.

Decision rule:
If the calculated value (test statistic) exceeds $F_\alpha(k-1, k(n-1))$, we reject $H_0$ and conclude that the group means are not all equal.


The analysis of variance (ANOVA) table for testing such hypotheses is as shown below.

ANOVA table for a one-way classification

Source of        Sum of        d.f.      Mean square              F-ratio
variation        squares
Between groups   SS_between    k-1       SS_between/(k-1)         [SS_between/(k-1)] /
Within groups    SS_within     k(n-1)    SS_within/(k(n-1))       [SS_within/(k(n-1))]
Total            SS_total      kn-1
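As a sketch of how the F-ratio is used in practice, the one-way ANOVA can be run in Python on hypothetical data for k = 3 groups of n = 5 observations each (illustrative only; the handout's worked examples use SPSS):

    from scipy import stats

    # Hypothetical measurements for three independent groups.
    group1 = [23, 25, 21, 24, 22]
    group2 = [30, 28, 31, 27, 29]
    group3 = [24, 26, 25, 23, 27]

    F, p = stats.f_oneway(group1, group2, group3)
    print(F, p)  # reject H0 if p < alpha, i.e. if F exceeds F_alpha(k-1, k(n-1))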

Example 3: This example uses data from SPSS:
Files\SPSSInc\Statistics17\Samples\English\car_insurance_claims.sav
The data are on: insurance policy holders' age, vehicle group, vehicle age, average cost of claims and number of claims.
Our aim is to see if there is a significant difference in the average cost of claims for vehicles of age: 0-3, 4-7, 8-9 and 10+ years.
Before we test whether the average cost of claims for the four groups of vehicles (based on age) is the same, we have to test the existence of a common variance $\sigma^2$ (i.e., that the variability in each group is the same).
The SPSS output for One-Way ANOVA is shown below:


Test of Homogeneity of Variances

Average cost of claims
Levene Statistic    df1    df2    Sig.
1.353                 3    119    .261

Levene's test of:
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$ versus
$H_1$: at least two of the variances are different
has a p-value of 0.261. Since this figure is greater than 5%, we do not reject the null hypothesis. Thus, we can say that the variability in the cost of claims is the same regardless of the age of a vehicle. However, such parametric tests are highly influenced by the existence of outliers (if any). The figure below shows the box plot of the average cost of claims.

Figure: A box plot of the average cost of claims

It can be seen that the 13th, 14th and 48th items are outliers. After removing these items, Levene's test of equality of variances yields the following result.


Test of Homogeneity of Variances

Average cost of claims
Levene Statistic    df1    df2    Sig.
4.696                 3    116    .004

Here the p-value is less than 1%. Thus, we reject the null hypothesis and conclude that the variability in the cost of claims differs depending on the age of a vehicle. Now we can test whether the average cost of claims for the four groups of vehicles (based on age) is the same.
The ANOVA table for testing:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ versus
$H_1$: at least two of the means are different
is shown below:
ANOVA

Average cost of claims
                    Sum of Squares     df     Mean Square        F        Sig.
Between Groups        462109.712        3      154036.571      49.721     .000
Within Groups         359371.613      116        3098.031
Total                 821481.325      119

Since the p-value is less than 1%, we conclude that there is a significant difference in the mean
cost of claims for vehicles of different ages at the one percent level of significance.
Question: which groups of means are different?
To answer this question, we apply pair-wise comparison of means. Since the equality of
variances assumption is rejected, the appropriate tests are those listed under Equal Variances
Not Assumed in SPSS. The output of Post Hoc Tests (Multiple comparisons) is shown below:

Multiple Comparisons

                                                                      95% Confidence Interval
(I) Vehicle   (J) Vehicle   Mean Difference   Std. Error    Sig.     Lower Bound   Upper Bound
age           age           (I-J)
0-3           4-7                34.774         15.546      .164         -7.68          77.23
              8-9                98.677*        16.000      .000         55.04         142.31
              10+               165.570*        15.113      .000        124.17         206.97
4-7           0-3               -34.774         15.546      .164        -77.23           7.68
              8-9                63.903*        13.206      .000         27.97          99.84
              10+               130.796*        12.115      .000         97.75         163.84
8-9           0-3               -98.677*        16.000      .000       -142.31         -55.04
              4-7               -63.903*        13.206      .000        -99.84         -27.97
              10+                66.892*        12.693      .000         32.26         101.52
10+           0-3              -165.570*        15.113      .000       -206.97        -124.17
              4-7              -130.796*        12.115      .000       -163.84         -97.75
              8-9               -66.892*        12.693      .000       -101.52         -32.26

*. The mean difference is significant at the 0.05 level.


Thus, there is a significant difference in the mean cost of claims between vehicles of age:
a) 0 3 versus 8 9 and 10+
b) 4 7 versus 8 9 and 10+
c) 8 9 versus 10+
We can see from the results that vehicles of age 10+ do have the lowest mean cost of claims,
followed by those with ages 8 9 years.
That is, mean cost of claims is statistically significantly greater in those vehicle whose age is
lower (0-3 and 4-7) as compared to 8 9 and 10+; The lower the vehicle age, the higher the cost
of claim.
Remark:

The nonparametric equivalent of the one-way ANOVA is the Kruskal-Wallis test. The Kruskal-Wallis one-way analysis of variance by ranks is a nonparametric method for testing the equality of population medians among groups. It is identical to a one-way analysis of variance with the data replaced by their ranks. It is an extension of the Mann-Whitney U test to three or more groups. Since it is a nonparametric method, the Kruskal-Wallis test does not assume a normal population, unlike the analogous one-way analysis of variance.
Example 4: Consider the data above on the average cost of vehicle insurance claims and vehicle age. To apply the Kruskal-Wallis one-way analysis of variance for the difference in the mean cost of claims for vehicles of different ages, the procedure in SPSS is as follows:

Analyze → Nonparametric Tests → K Independent Samples
  Test Variable List: Average cost of claims
  Grouping Variable: vehicleage(? ?) → Define Range: Minimum: 1; Maximum: 4
  OK
The output is as shown below:
Kruskal-Wallis Test


Ranks
                          Vehicle age      N     Mean Rank
Average cost of claims    0-3             31         89.69
                          4-7             31         79.68
                          8-9             31         48.58
                          10+             27         18.65
                          Total          120

Test Statistics(a,b)
               Average cost of claims
Chi-Square             73.989
df                          3
Asymp. Sig.              .000
a. Kruskal-Wallis Test
b. Grouping Variable: Vehicle age
Since the p-value is less than 1%, we again conclude that there is a significant difference in the
mean cost of claims for vehicles of different ages at the one percent level of significance.
4. Paired-samples t-test (differences between dependent groups)
If we want to compare two variables measured in the same sample, we would customarily use the t-test for dependent samples. For example, we might be interested to see if there is a significant difference in the mean output per individual worker before and after an intensive training. In this case, we take a random sample of workers and record the amount of output each produced before the training and again after the training. We then compute the difference between the two variables (output before training and output after training) for each case, and test whether the average difference is significantly different from zero.
Let $X_j$ and $Y_j$ be the measurements before treatment and after treatment for the j-th individual, respectively. We compute the differences as:

$D_j = X_j - Y_j$,   j = 1, 2, ..., n

We then calculate the mean and sample variance of the differences $D_j$ as:

$\bar{D} = \dfrac{1}{n}\sum_{j=1}^{n} D_j$,   $S_D^2 = \dfrac{1}{n-1}\sum_{j=1}^{n} (D_j - \bar{D})^2$

Assumption: The differences ($D_j$) are normally distributed with mean $\mu$ and variance $\sigma_d^2$.

The hypotheses to be tested are:
$H_0: \mu = 0$ (no significant difference in the before-after mean)
$H_1: \mu \neq 0$

The test statistic for this test is given by:

$t_{cal} = \dfrac{\bar{D} - 0}{S_D/\sqrt{n}}$

Decision rule: Reject $H_0$ if $|t_{cal}| > t_{\alpha/2}(n-1)$.
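A minimal sketch of the paired test on hypothetical before/after data (n = 6 workers; the figures are invented for illustration, and the handout's own example below uses SPSS):

    from scipy import stats

    before = [12, 15, 11, 14, 13, 10]  # output before training
    after = [14, 17, 12, 15, 16, 11]   # output after training

    # ttest_rel computes t = (D-bar - 0) / (S_D / sqrt(n)) on the paired differences.
    t, p = stats.ttest_rel(after, before)
    print(t, p)  # reject H0 if p is below the chosen significance level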

Example 5: This example uses data from SPSS:
Files\SPSSInc\Statistics17\Samples\English\property_assess.sav
The data are on the current sale value of n = 1000 houses ($X_j$) and the value of the same houses at the last appraisal ($Y_j$). Our aim is to see if there is a significant difference in the mean sale value.
The SPSS output for the paired-samples t-test is shown below:


Paired Samples Statistics
                                     Mean        N      Std. Deviation    Std. Error Mean
Pair 1   Sale value of house       161.4920    1000        55.44955            1.75347
         Value at last appraisal   134.9620    1000        44.79421            1.41652

Paired Samples Test
         Paired Differences
                                                 95% Confidence Interval
         Mean       Std.        Std. Error       of the Difference                         Sig.
         (D)        Deviation   Mean             Lower        Upper          t      df     (2-tailed)
Pair 1   26.53000   31.74022    1.00371          24.56037     28.49963     26.432   999    .000
.000

Since the p-value is less than 0.01, we reject the null hypothesis and conclude that there is a
significant difference between the current mean sale value of houses and the mean sale value in
the last appraisal at the 1% level of significance (the sale value has appreciated, on average).
Remark: The nonparametric equivalent to the paired-samples t-test is the Wilcoxon Signed
Ranks Test. This test does not assume a normal population.
Example 6: For the data in example 5, the Wilcoxon Signed Ranks Test results are shown below:
Wilcoxon Signed Ranks Test


Ranks
                                             N       Mean Rank    Sum of Ranks
Value at last appraisal -  Negative Ranks   839a       548.34       460056.50
Sale value of house        Positive Ranks   160b       246.52        39443.50
                           Ties               1c
                           Total           1000
a. Value at last appraisal < Sale value of house
b. Value at last appraisal > Sale value of house
c. Value at last appraisal = Sale value of house

Test Statistics(b)
                          Value at last appraisal - Sale value of house
Z                                        -23.055a
Asymp. Sig. (2-tailed)                       .000
a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test

Since the p-value is less than 0.01, we again reject the null hypothesis and conclude that there is
a significant difference between the current mean sale value of houses and the mean sale value in
the last appraisal at the 1% level of significance.
5. Tests of association
a) The Pearson coefficient of correlation and the test of its significance
For two continuous variables X and Y, a measure of the strength of their linear relationship is provided by the Pearson coefficient of correlation, which is defined as:

$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}}$

The coefficient of correlation r ranges from -1 to 1, inclusive; i.e., $-1 \le r \le 1$.

The sign of r indicates the direction of the relationship between the two variables X and Y. If an inverse relationship exists, then r will fall between 0 and -1. Likewise, if there is a direct relationship, then r will be a value within the range 0 to 1.
To see if this value of r is of sufficient magnitude to indicate that the two variables of interest are correlated, we test the hypothesis:
$H_0: \rho = 0$
$H_A: \rho \neq 0$
where $\rho$ is the true (population) coefficient of correlation. The test statistic is:

$t_{cal} = r\sqrt{\dfrac{n-2}{1-r^2}}$

Decision: Reject the null hypothesis if $|t_{cal}| > t_{\alpha/2}(n-2)$.


Example 7: The following data are on the advertising spending and sales of a company recorded over a period of n = 24 months (obtained from the SPSS package: C:\Program Files\SPSSInc\Statistics17\Samples\English\advert.sav). Is there a significant correlation between advertising spending and sales?
Month   Advertising    Detrended     Month   Advertising    Detrended
        spending (X)   sales (Y)             spending (X)   sales (Y)
  1         4.69         12.23        13         5.15         12.27
  2         6.41         11.84        14         5.25         12.57
  3         5.47         12.25        15         1.72          8.87
  4         3.43         11.10        16         3.04         11.15
  5         4.39         10.97        17         4.92         11.86
  6         2.15          8.75        18         4.85         11.07
  7         1.54          7.75        19         3.13         10.38
  8         2.67         10.50        20         2.29          8.71
  9         1.24          6.71        21         4.90         12.07
 10         1.77          7.60        22         5.75         12.74
 11         4.46         12.46        23         3.61          9.82
 12         1.83          8.47        24         4.62         11.51

Solution
Summary statistics:
n = 24,  $\sum x$ = 89.28,  $\sum y$ = 253.65,  $\sum xy$ = 1001.954,  $\sum x^2$ = 386.58,  $\sum y^2$ = 2755.299
Plugging in these values, the sample coefficient of correlation is:

$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}} = 0.91627$

The hypothesis of interest is:
$H_0: \rho = 0$
$H_A: \rho \neq 0$
The test statistic is:

$t_{cal} = r\sqrt{\dfrac{n-2}{1-r^2}} = (0.91627)\sqrt{\dfrac{24-2}{1-(0.91627)^2}} = 10.729$

For $\alpha$ = 0.01, $t_{\alpha/2}(n-2) = t_{0.005}(22) = 2.819$. Since the calculated value t = 10.729 is greater than 2.819, we reject $H_0$ and conclude that there is a significant positive (or direct) correlation between advertising spending and sales at the one percent level of significance. The SPSS output is shown below:

Correlations
                                               Advertising spending    Detrended sales
Advertising spending   Pearson Correlation            1                     .916**
                       Sig. (2-tailed)                                      .000
                       N                             24                       24
Detrended sales        Pearson Correlation            .916**                1
                       Sig. (2-tailed)                .000
                       N                             24                       24
**. Correlation is significant at the 0.01 level (2-tailed).
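The same result can be verified in Python with the 24 data pairs from the table above (an illustrative alternative to the SPSS output):

    from scipy import stats

    x = [4.69, 6.41, 5.47, 3.43, 4.39, 2.15, 1.54, 2.67, 1.24, 1.77, 4.46, 1.83,
         5.15, 5.25, 1.72, 3.04, 4.92, 4.85, 3.13, 2.29, 4.90, 5.75, 3.61, 4.62]
    y = [12.23, 11.84, 12.25, 11.10, 10.97, 8.75, 7.75, 10.50, 6.71, 7.60, 12.46, 8.47,
         12.27, 12.57, 8.87, 11.15, 11.86, 11.07, 10.38, 8.71, 12.07, 12.74, 9.82, 11.51]

    r, p = stats.pearsonr(x, y)  # returns r and the two-tailed p-value
    print(r, p)                  # r is about 0.916, with p < 0.01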

b) The Spearman rank correlation coefficient and the test of its significance
The Pearson coefficient of correlation requires precise numerical values (i.e., continuous data) for the variables. However, in many instances such numerical measurements may not be possible (for instance, job performance, taste, intelligence, etc.). In such cases, we can compute a nonparametric measure of association that is based on ranks. This measure is known as the Spearman rank correlation coefficient ($r_s$), and is given by:

$r_s = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$

where n = number of paired observations and d = difference between the ranks for each pair of observations.

The steps involved in computing $r_s$ are as follows:
Step 1: Rank the x's among themselves, giving rank 1 to the largest (or smallest) observation, rank 2 to the second largest (or second smallest) observation, and so on.
Step 2: Rank the y's similarly.
Step 3: Find d = (rank of x) - (rank of y) for each pair of observations.
Step 4: Find $\sum d^2$ (the sum of squares of the differences between each pair of ranks).
Step 5: Compute the rank correlation coefficient using the above formula.
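The five steps can be traced in a short Python sketch on hypothetical paired ratings (the data contain no ties, so the d-squared formula and SciPy's built-in routine agree exactly):

    import numpy as np
    from scipy import stats

    x = np.array([8.2, 6.1, 9.0, 7.4, 5.5, 6.8])  # hypothetical ratings, judge 1
    y = np.array([7.9, 6.4, 8.8, 7.1, 5.9, 6.2])  # hypothetical ratings, judge 2

    d = stats.rankdata(x) - stats.rankdata(y)     # Steps 1-3: rank, then difference
    n = len(x)
    rs = 1 - 6 * np.sum(d**2) / (n * (n**2 - 1))  # Steps 4-5: apply the formula
    print(rs, stats.spearmanr(x, y).correlation)  # both give about 0.943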

Example 8: For the data on the advertising spending and sales of a company recorded over a
period of n = 24 months, the SPSS output of nonparametric correlations is shown below:


Nonparametric Correlations

Correlations
                                                               Advertising spending   Detrended sales
Spearman's rho   Advertising spending   Correlation Coefficient      1.000                 .889**
                                        Sig. (2-tailed)                                    .000
                                        N                               24                   24
                 Detrended sales        Correlation Coefficient       .889**               1.000
                                        Sig. (2-tailed)               .000
                                        N                               24                   24
**. Correlation is significant at the 0.01 level (2-tailed).

Since the p-value is less than 0.01, we reject $H_0$ and conclude that there is a significant correlation between advertising spending and sales at the one percent level of significance.
c) The Chi-square test
If the two variables whose degree of association we want to test are categorical in nature (for example, job satisfaction versus income), the appropriate nonparametric statistic for testing such a relationship is the Chi-square test.
Example 9: Here we use the data in the SPSS package:
Files\SPSSInc\Statistics17\Samples\English\customer_dbase.sav
Suppose we want to check if there is a relationship between the level of income of employees (categorized into five groups) and job satisfaction (categorized from highly dissatisfied to highly satisfied). Here job satisfaction is not continuous, and hence we cannot apply the Pearson coefficient of correlation.
Before going to the test, it is a good idea to see what the data look like using graphs. The multiple bar chart for the said variables is shown below:


The chart gives us some idea about the relationship between the two variables. For example, the frequency of highly dissatisfied employees keeps on decreasing as income increases. However, we should not come to a final judgement before we apply objective statistical tests. In tests of independence, the null and alternative hypotheses are of the form:
H0: The two classifications are independent.
HA: The two classifications are dependent.
The null hypothesis can also be written as 'there is no association between the two classifications.' The SPSS procedure for testing such hypotheses is:

Analyze → Nonparametric Tests → Chi-Square
  Test Variable List: Income category; Job satisfaction
  OK

The SPSS output is as shown below:


Chi-Square Test

Test Statistics
               Income category in thousands    Job satisfaction
Chi-Square             1252.834a                    24.426a
df                            4                          4
Asymp. Sig.                .000                       .000
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 1000.0.

Since the p-value is less than 0.01, we reject H0 and conclude that income and job satisfaction are
dependent or associated. A cross-tabulation of income category versus job satisfaction is shown
below.
                                               Job satisfaction
Income category      Highly         Somewhat                  Somewhat    Highly
(in thousands)       dissatisfied   dissatisfied   Neutral    satisfied   satisfied   Total
Under $25    Count        413            303          242        219         153       1330
             Percentage  31.1%          22.8%        18.2%      16.5%       11.5%     100.0%
$25 - $49    Count        365            413          440        348         227       1793
             Percentage  20.4%          23.0%        24.5%      19.4%       12.7%     100.0%
$50 - $74    Count        119            173          182        168         177        819
             Percentage  14.5%          21.1%        22.2%      20.5%       21.6%     100.0%
$75 - $124   Count         53            100          143        189         183        668
             Percentage   7.9%          15.0%        21.4%      28.3%       27.4%     100.0%
$125+        Count         17             52           85         90         146        390
             Percentage   4.4%          13.3%        21.8%      23.1%       37.4%     100.0%


It can be seen that for income category Under $25, more than 50% of employees are highly
dissatisfied or somewhat dissatisfied. On the other hand, for the income category $125+, about
60% of employees are either somewhat satisfied or highly satisfied. In general, the higher the
income level, the more likely are the employees to be satisfied with their job.
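For raw cross-tabulated counts, a test of independence can also be sketched in Python with a contingency-table chi-square (the 2x3 table below is hypothetical, used only to show the mechanics):

    from scipy import stats

    # Hypothetical counts: two income groups (rows) by three satisfaction
    # levels (columns).
    table = [[40, 30, 10],
             [15, 35, 50]]

    chi2, p, dof, expected = stats.chi2_contingency(table)
    print(chi2, dof, p)  # reject independence if p is below the chosen level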

6. Hypothesis test for the difference between two proportions


Here our aim is to conduct a hypothesis test to determine whether the difference between two
population proportions is significant or not. The test procedure, called the two-proportion z-test,
is appropriate when the two samples are independent.
When the null hypothesis states that there is no difference between the two population proportions $\pi_1$ and $\pi_2$, the null and alternative hypotheses for a two-tailed test are often stated in the following form:

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \neq \pi_2$

Denoting the two populations by the subscripts 1 and 2, we take a random sample of size $n_1$ from population 1 and compute the sample proportion $P_1$ of individuals that possess a specific characteristic. Similarly, we take a random sample of size $n_2$ from population 2 and compute the sample proportion $P_2$. Using the sample data, we compute the following:

Pooled sample proportion:

$P = \frac{n_1 P_1 + n_2 P_2}{n_1 + n_2}$

Standard error (SE) of the sampling distribution of the difference between two proportions:

$SE(P_1 - P_2) = \sqrt{P(1-P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

where P is the pooled sample proportion.

The test statistic is given by:

$Z_{cal} = \frac{P_1 - P_2}{SE(P_1 - P_2)} = \frac{P_1 - P_2}{\sqrt{P(1-P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$

We then compare this statistic with the critical value from the standard normal distribution for a given level of significance $\alpha$.
Decision: Reject the null hypothesis if $|Z_{cal}| > Z_{\alpha/2}$.
Example 10: Consider the data in Example 9 (the level of income of employees and job satisfaction). Now let us compare employees who earn under $25 (thousand per year) with those who earn $25 - $49.
Income                      Highly        Somewhat                 Somewhat    Highly
category                    dissatisfied  dissatisfied  Neutral    satisfied   satisfied   Total
Under $25    Count          413           303           242        219         153         1330
             Percentage     31.1%         22.8%         18.2%      16.5%       11.5%       100.0%
$25 - $49    Count          365           413           440        348         227         1793
             Percentage     20.4%         23.0%         24.5%      19.4%       12.7%       100.0%

Is there a significant difference between the proportion of those who are highly dissatisfied in the
two income groups?
Solution
The relevant sample sizes are the two group totals, $n_1$ = 1330 and $n_2$ = 1793, and the sample proportions of highly dissatisfied employees are $P_1$ = 413/1330 = 0.311 and $P_2$ = 365/1793 = 0.204. The pooled sample proportion is:

$P = \frac{n_1 P_1 + n_2 P_2}{n_1 + n_2} = \frac{413 + 365}{1330 + 1793} = 0.2491$

The standard error $SE(P_1 - P_2)$ is calculated as:

$SE(P_1 - P_2) = \sqrt{P(1-P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.2491(1 - 0.2491)\left(\frac{1}{1330} + \frac{1}{1793}\right)} = 0.01565$

The test statistic is thus:

$Z_{cal} = \frac{P_1 - P_2}{SE(P_1 - P_2)} = \frac{0.311 - 0.204}{0.01565} = 6.84$

The critical value from the standard normal distribution for $\alpha$ = 0.01 is $Z_{\alpha/2}$ = 2.576.
Decision: Since $|Z_{cal}| > Z_{\alpha/2}$, we reject the null hypothesis. Thus, there is a significant difference between the proportions of highly dissatisfied employees in the two income groups at the one percent level of significance.
Note that $P_1 - P_2 > 0$. This indicates that a significantly higher proportion of employees who earn under $25 are highly dissatisfied with their job compared with those who earn $25 - $49.
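The arithmetic can be checked with a short Python sketch (scipy assumed as tooling; the counts are those from the cross-tabulation above):

# Minimal sketch: two-proportion z-test for Example 10.
import math
from scipy.stats import norm

x1, n1 = 413, 1330   # highly dissatisfied, Under $25
x2, n2 = 365, 1793   # highly dissatisfied, $25 - $49

p1, p2 = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * (1 - norm.cdf(abs(z)))   # two-tailed p-value

print(f"z = {z:.2f}, p = {p_value:.3g}")   # |z| > 2.576, so reject H0 at the 1% level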

Exercise with SPSS Application


Unit Eight:
The simple linear regression model and Statistical Inference
8.1 What is a regression model?
What is regression analysis? In very general terms, regression is concerned with describing and
evaluating the relationship between a given variable and one or more other variables on which
the given variable depends. More specifically, regression is an attempt to explain movements in a
variable by reference to movements in one or more other variables.
The given variable is referred to as the dependent (or response) variable (denoted by Y), while the variables thought to affect it are referred to as independent (explanatory or regressor) variables (denoted by $X_1, X_2, X_3, \ldots, X_k$). The case where we have just one explanatory variable is called simple linear regression. If we have two or more explanatory variables, we have the multiple linear regression model.
8.2 Regression versus correlation
The correlation between two variables measures the degree of linear association between them. If X and Y are correlated, then there is evidence of a linear relationship between the two variables. However, it is not implied that changes in X cause changes in Y, or that changes in Y cause changes in X. The degree of linear relationship between these two variables is measured by the coefficient of correlation.
In regression, the dependent variable and the independent variable are treated very differently.
• The dependent variable is assumed to be random or stochastic in some way, i.e. to have a certain probability distribution.
• The independent variables are assumed to have fixed (non-stochastic) values in repeated samples.

8.3 Simple linear regression


For simplicity, suppose for now that it is believed that Y depends on only one X variable.
Examples of the kind of relationship that may be of interest include:
• How asset returns vary with their level of market risk
• Measuring the long-term relationship between stock prices and dividends

Suppose that a researcher has some idea that there should be a relationship between two variables Y and X, and that economic, financial, etc. theory suggests that an increase in X will lead to an increase in Y. A sensible first stage in testing whether there is indeed an association between the variables would be to form a scatter plot of them. Suppose that the outcome of this plot is as shown in Figure 1.

Figure 1: Scatter plot of two variables Y and X


In this case, it appears that there is an approximate positive linear relationship between X and Y,
which means that increases in X are usually accompanied by increases in Y, and that the
relationship between them can be described approximately by a straight line.
It would therefore be of interest to determine to what extent this relationship can be described by an equation that can be estimated using a defined procedure. It is possible to use the general equation for a straight line:

$Y = \alpha + \beta X$    (1)

to get the line that best fits the data. The researcher would then be seeking to find the values of the parameters or coefficients, α and β, which would place the line as close as possible to all of the data points taken together.
However, this equation is an exact one. Assuming that it is appropriate, if the values of α and β had been calculated, then, given a value of X, it would be possible to determine with certainty what the value of Y would be. Imagine: a model which says with complete certainty what the value of one variable will be given any value of the other!
Clearly this model is not realistic. Statistically, it would correspond to the case where the model fitted the data perfectly, that is, all of the data points lay exactly on a straight line. To make the model more realistic, a random disturbance or error term, denoted by $u_t$, is added to the equation. Thus, we have:
$Y_t = \alpha + \beta X_t + u_t$    (2)

where the subscript t (= 1, 2, 3, . . .) denotes the observation number.

Reasons for the inclusion of the error term
• Even in the general case where there is more than one explanatory variable, some determinants of $Y_t$ will always in practice be omitted from the model. This might, for example, arise because the number of influences on Y is too large to place in a single model, or because some determinants of Y may be unobservable or not measurable.
• There may be errors in the way that Y is measured which cannot be modelled.
• There are bound to be random outside influences on Y that again cannot be modelled. For example, a terrorist attack, a hurricane or a computer failure could all affect financial asset returns in a way that cannot be captured in a model and cannot be forecast reliably. Similarly, many researchers would argue that human behaviour has an inherent randomness and unpredictability.

So how are the appropriate values of α and β determined?

α and β are chosen so that the (vertical) distances from the data points to the fitted line are collectively minimised, so that the line fits the data as closely as possible.
The most common method used to fit a line to the data is known as ordinary least squares (OLS). This approach forms the workhorse of econometric model estimation.
Suppose now, for ease of exposition, that the sample of data contains only five observations. The method of OLS entails taking each vertical distance from the point to the line, squaring it and then minimising the total sum of the squared distances (hence 'least squares'); see Figure 2.


Figure 2: The estimating line together with the associated errors


Let $Y_t$ denote the actual data point for observation t and let $\hat{Y}_t$ denote the fitted value from the regression line; in other words, for the given value of X for this observation t, $\hat{Y}_t$ is the value for Y which the model would have predicted. Note that a hat (ˆ) over a variable or parameter denotes a value estimated by the model. Finally, let $\hat{u}_t$ denote the residual, which is the difference between the actual (observed) value of Y and the value fitted by the model for this data point, i.e. $\hat{u}_t = Y_t - \hat{Y}_t$. What is done is to minimise the sum of the $\hat{u}_t^2$.

Note: The reason that the sum of the squared distances is minimised rather than, for example, making the sum of the $\hat{u}_t$ as close to zero as possible, is that in the latter case some points will lie above the line while others lie below it. When such a sum is formed, points above the line count as positive values while those below count as negative values, so the distances would largely cancel each other out and the sum could be zero. Taking the squared distances ensures that all deviations entering the calculation are positive and therefore do not cancel out.
So the sum of squared distances to be minimised is given by:

$\hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \ldots + \hat{u}_T^2 = \sum_{t=1}^{T} \hat{u}_t^2$    (3)


This sum is known as the residual sum of squares (RSS) or the sum of squared residuals. But what is $\hat{u}_t$? Again, it is the difference between the actual point and the fitted line, that is, $\hat{u}_t = Y_t - \hat{Y}_t$. So minimising $\sum_{t=1}^{T} \hat{u}_t^2$ is equivalent to minimising $\sum_{t=1}^{T} (Y_t - \hat{Y}_t)^2$.

Letting $\hat{\alpha}$ and $\hat{\beta}$ denote the values of α and β selected by minimising the RSS, the equation for the fitted line is given by $\hat{Y}_t = \hat{\alpha} + \hat{\beta} X_t$. Now let L denote the RSS, which is also known as a loss function. Taking the summation over all of the observations, i.e. from t = 1 to T, where T is the number of observations:

$L = \sum_{t=1}^{T} (Y_t - \hat{Y}_t)^2 = \sum_{t=1}^{T} (Y_t - \hat{\alpha} - \hat{\beta} X_t)^2$    (4)

To find the values of α and β which minimise the residual sum of squares (equivalently, to find the equation of the line that is closest to the data), L is minimised with respect to $\hat{\alpha}$ and $\hat{\beta}$. This is achieved by differentiating L with respect to $\hat{\alpha}$ and $\hat{\beta}$ and setting the first derivatives to zero. The resulting estimators for the slope and the intercept are:

$\hat{\beta} = \frac{T\sum X_t Y_t - (\sum X_t)(\sum Y_t)}{T\sum X_t^2 - (\sum X_t)^2} = \frac{\sum (X_t - \bar{X})(Y_t - \bar{Y})}{\sum (X_t - \bar{X})^2} = \frac{\sum X_t Y_t - T\bar{X}\bar{Y}}{\sum X_t^2 - T\bar{X}^2}$    (5)

$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$    (6)
Thus, given only the sets of observations $Y_t$ and $X_t$, it is always possible to calculate the estimated values of the two parameters so that the line $\hat{Y}_t = \hat{\alpha} + \hat{\beta} X_t$ is the best fit to the set of data. This method of finding the optimum is known as OLS.
Note (estimator and estimate)
Estimators are the formulae used to calculate the coefficients (or parameters in general). For example, the expressions given above for $\hat{\alpha}$ and $\hat{\beta}$ are estimators. Estimates, on the other hand, are the actual numerical values for the coefficients obtained from the sample.
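These estimators translate directly into code. The following minimal Python sketch implements equations (5) and (6); the function name and the toy data are hypothetical placeholders, not part of the original handout.

# Minimal sketch: OLS slope and intercept from equations (5) and (6).
import numpy as np

def ols_fit(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Return (alpha_hat, beta_hat) for the line y = alpha + beta*x."""
    T = len(x)
    beta = (T * np.sum(x * y) - np.sum(x) * np.sum(y)) / (T * np.sum(x**2) - np.sum(x)**2)
    alpha = np.mean(y) - beta * np.mean(x)   # equation (6)
    return alpha, beta

# Toy illustration (hypothetical data):
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
alpha_hat, beta_hat = ols_fit(x, y)
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")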
Example 1: The following data are the excess returns on a given asset (Y) together with the excess returns on a market index (market portfolio) (X) from January 2009 to December 2010, recorded on a monthly basis:

Year/Month    Market excess return (X)    Asset excess return (Y)
2009/01       -7.93                       -17.75
2009/02       -9.93                       -14.39
2009/03        8.83                        12.90
2009/04       10.17                        16.11
2009/05        5.33                         7.21
2009/06        0.44                        -1.96
2009/07        7.76                         9.08
2009/08        3.23                         7.70
2009/09        4.15                         3.17
2009/10       -2.49                        -5.38
2009/11        5.64                         5.65
2009/12        2.80                         0.76
2010/01       -3.51                        -0.61
2010/02        3.39                         3.63
2010/03        6.30                         8.09
2010/04        2.13                         1.85
2010/05       -7.86                        -8.73
2010/06       -5.67                        -6.50
2010/07        7.27                         6.81
2010/08       -4.81                        -7.06
2010/09        9.56                         8.48
2010/10        4.02                         2.05
2010/11        0.63                         0.58
2010/12        6.77                         9.15

The idea here is to check if there is a linear relationship between X and Y. The first stage could
be to form a scatter plot of the two variables. This is shown in Figure 3 below. Clearly, there
appears to be a positive, approximately linear, relationship between X and Y.

Figure 3: Scatter plot of X and Y


The next step is to estimate the parameters α and β using the formulae above. The necessary calculations are shown below:
  X        Y         XY           X²
 -7.93   -17.75    140.7575     62.8849
 -9.93   -14.39    142.8927     98.6049
  8.83    12.90    113.9070     77.9689
 10.17    16.11    163.8387    103.4289
  5.33     7.21     38.4293     28.4089
  0.44    -1.96     -0.8624      0.1936
  7.76     9.08     70.4608     60.2176
  3.23     7.70     24.8710     10.4329
  4.15     3.17     13.1555     17.2225
 -2.49    -5.38     13.3962      6.2001
  5.64     5.65     31.8660     31.8096
  2.80     0.76      2.1280      7.8400
 -3.51    -0.61      2.1411     12.3201
  3.39     3.63     12.3057     11.4921
  6.30     8.09     50.9670     39.6900
  2.13     1.85      3.9405      4.5369
 -7.86    -8.73     68.6178     61.7796
 -5.67    -6.50     36.8550     32.1489
  7.27     6.81     49.5087     52.8529
 -4.81    -7.06     33.9586     23.1361
  9.56     8.48     81.0688     91.3936
  4.02     2.05      8.2410     16.1604
  0.63     0.58      0.3654      0.3969
  6.77     9.15     61.9455     45.8329
TOTAL:  ΣX = 46.22, ΣY = 40.84, ΣXY = 1164.7554, ΣX² = 896.9532

Plugging these values into the above formulae, we get the estimates:

$\hat{\beta} = \frac{T\sum X_t Y_t - (\sum X_t)(\sum Y_t)}{T\sum X_t^2 - (\sum X_t)^2} = \frac{24(1164.7554) - (46.22)(40.84)}{24(896.9532) - (46.22)^2} = 1.344286$

$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = \frac{40.84}{24} - (1.344)\frac{46.22}{24} = -0.8872$

The fitted line is thus:

$\hat{Y}_t = -0.887 + 1.344 X_t$

where $X_t$ is the excess return of the market portfolio over the risk-free rate (i.e. $R_m - R_f$), also known as the market risk premium.
Interpretation of $\hat{\alpha}$ and $\hat{\beta}$
The slope estimate $\hat{\beta}$ is interpreted as follows: if X increases by 1 unit, Y is expected to increase by $\hat{\beta}$ units, everything else being equal. If $\hat{\beta}$ is negative, a rise in X would on average cause a fall in Y. The intercept estimate $\hat{\alpha}$ is interpreted as the value that would be taken by the dependent variable Y if the independent variable X took a value of zero.
In Example 1, the slope estimate of 1.344 is interpreted as: if the excess return of the market portfolio over the risk-free rate increases by 1%, the excess return of this particular asset is expected to increase by 1.344%, everything else being equal.
If an analyst tells you that he expects the market to yield a return 10% higher than the risk-free rate next month, what would you expect the excess return on this asset to be? To answer this, plug X = 10 into the estimated equation:

$\hat{Y}_t = -0.887 + 1.344(10) = 12.553$

Thus, for a given expected market risk premium of 10%, this fund would be expected to earn an excess over the risk-free rate of 12.553%.
Note: Caution should be exercised when producing predictions for Y using values of X that are a long way outside the range of values in the sample. In Example 1, X takes values between -9.93% and 10.17% in the available data. So, it would not be advisable to use this model to determine the expected excess return on the fund if the expected excess return on the market were, say, 20% or -15% (i.e. if the market was expected to fall sharply).

Precision and standard errors

Any set of regression estimates $\hat{\alpha}$ and $\hat{\beta}$ is specific to the sample used in their estimation. In other words, if a different sample of data were selected from within the population, the data points (the $X_t$ and $Y_t$) would be different, leading to different values of the OLS estimates.
Recall that the OLS estimators $\hat{\alpha}$ and $\hat{\beta}$ are given by equations (5) and (6). It would be desirable to have an idea of how good these estimates are, in the sense of having some measure of their reliability or precision. It is thus useful to know whether one can have confidence in the estimates, and whether they are likely to vary much from one sample to another within the given population. An idea of the sampling variability, and hence of the precision of the estimates, can be calculated using only the sample of data available. This is given by the standard errors. Valid estimators of the standard errors of $\hat{\alpha}$ and $\hat{\beta}$ are given by:

$SE(\hat{\alpha}) = s\sqrt{\frac{\sum X_t^2}{T[\sum X_t^2 - T\bar{X}^2]}}$    (10)

$SE(\hat{\beta}) = s\sqrt{\frac{1}{\sum X_t^2 - T\bar{X}^2}}$    (11)

where s is the estimated standard deviation of the residuals. It is worth noting that the standard errors give only a general indication of the likely accuracy of the regression parameters; they do not show how accurate a particular set of coefficient estimates is. If the standard errors are small, it shows that the coefficients are likely to be precise on average, not how precise they are for this particular sample. Thus, standard errors give a measure of the degree of uncertainty in the estimated values of the coefficients. It can be seen that they are a function of the actual observations on the explanatory variable X, the sample size T, and another term, s.

$s^2$ is an estimate of the variance of the disturbance term. The actual variance of the disturbance term $u_t$ is denoted by $\sigma^2$. An estimator of $\sigma^2$ is given by:

$s^2 = \frac{1}{T-2}\sum_{t=1}^{T} \hat{u}_t^2$    (12)

where the $\hat{u}_t$ are the OLS residuals, i.e. $\hat{u}_t = Y_t - \hat{Y}_t = Y_t - \hat{\alpha} - \hat{\beta} X_t$. The square root of this estimator, namely s, is known as the residual standard deviation. It is sometimes used as a broad measure of the fit of the regression equation. Everything else being equal, the smaller this quantity is, the closer is the fit of the line to the actual data.
Example 2: Consider the data in Example 1. The sample size was T = 24, and the OLS estimates were $\hat{\beta}$ = 1.344 and $\hat{\alpha}$ = -0.887. Thus, the equation of the fitted line is:

$\hat{Y}_t = -0.887 + 1.344 X_t$

The OLS residuals are obtained as:

$\hat{u}_t = Y_t - \hat{\alpha} - \hat{\beta} X_t = Y_t + 0.887 - 1.344 X_t$

The OLS residuals for each time point, their squares and the residual sum of squares are shown in the following table.
Year/Month     X         Y        $\hat{u}_t$    $\hat{u}_t^2$
2009/01       -7.93    -17.75     -6.2026        38.4723
2009/02       -9.93    -14.39     -0.1540         0.0237
2009/03        8.83     12.90      1.9172         3.6755
2009/04       10.17     16.11      3.3258        11.0610
2009/05        5.33      7.21      0.9322         0.8689
2009/06        0.44     -1.96     -1.6643         2.7698
2009/07        7.76      9.08     -0.4645         0.2157
2009/08        3.23      7.70      4.2452        18.0214
2009/09        4.15      3.17     -1.5216         2.3152
2009/10       -2.49     -5.38     -1.1455         1.3122
2009/11        5.64      5.65     -1.0446         1.0911
2009/12        2.80      0.76     -2.1168         4.4808
2010/01       -3.51     -0.61      4.9957        24.9565
2010/02        3.39      3.63     -0.0399         0.0016
2010/03        6.30      8.09      0.5082         0.2583
2010/04        2.13      1.85     -0.1261         0.0159
2010/05       -7.86     -8.73      2.7233         7.4163
2010/06       -5.67     -6.50      2.0093         4.0373
2010/07        7.27      6.81     -2.0758         4.3088
2010/08       -4.81     -7.06      0.2932         0.0860
2010/09        9.56      8.48     -3.4842        12.1395
2010/10        4.02      2.05     -2.4668         6.0852
2010/11        0.63      0.58      0.6203         0.3848
2010/12        6.77      9.15      0.9364         0.8768
TOTAL                              0.0000       144.8748

Note that the sum of the residuals ($\sum \hat{u}_t$) is zero. The residual variance is computed as:

$s^2 = \frac{1}{T-2}\sum_{t=1}^{T} \hat{u}_t^2 = \frac{1}{24-2}(144.8748) = 6.585217$

The residual standard deviation is the square root of this variance:

$s = \sqrt{6.585217} = 2.566168$
The estimated standard errors of $\hat{\alpha}$ and $\hat{\beta}$ are:

$SE(\hat{\alpha}) = s\sqrt{\frac{\sum X_t^2}{T[\sum X_t^2 - T\bar{X}^2]}} = (2.566168)\sqrt{\frac{896.9532}{24[896.9532 - 24(1.925833)^2]}} = 0.551918$

$SE(\hat{\beta}) = s\sqrt{\frac{1}{\sum X_t^2 - T\bar{X}^2}} = (2.566168)\sqrt{\frac{1}{896.9532 - 24(1.925833)^2}} = 0.090281$

With the standard errors calculated, the results are written as:

$\hat{Y}_t = -0.887 + 1.344 X_t$
            (0.551918)  (0.090281)

The standard error estimates are usually placed in parentheses under the relevant coefficient estimates.
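As a cross-check on the hand calculations above, the following Python sketch reproduces the estimates and standard errors from the 24 observations in the data table (numpy is an assumed tool here):

# Minimal sketch: reproduce the OLS estimates and standard errors of Example 2.
import numpy as np

# Market excess returns (X) and asset excess returns (Y) from the table above.
X = np.array([-7.93, -9.93, 8.83, 10.17, 5.33, 0.44, 7.76, 3.23, 4.15, -2.49,
              5.64, 2.80, -3.51, 3.39, 6.30, 2.13, -7.86, -5.67, 7.27, -4.81,
              9.56, 4.02, 0.63, 6.77])
Y = np.array([-17.75, -14.39, 12.90, 16.11, 7.21, -1.96, 9.08, 7.70, 3.17, -5.38,
              5.65, 0.76, -0.61, 3.63, 8.09, 1.85, -8.73, -6.50, 6.81, -7.06,
              8.48, 2.05, 0.58, 9.15])

T = len(X)
beta = (T * (X * Y).sum() - X.sum() * Y.sum()) / (T * (X**2).sum() - X.sum()**2)
alpha = Y.mean() - beta * X.mean()

u = Y - alpha - beta * X                       # OLS residuals
s = np.sqrt((u**2).sum() / (T - 2))            # residual standard deviation, eq. (12)
denom = (X**2).sum() - T * X.mean()**2
se_alpha = s * np.sqrt((X**2).sum() / (T * denom))
se_beta = s * np.sqrt(1.0 / denom)

print(f"alpha = {alpha:.4f} (SE {se_alpha:.6f}), beta = {beta:.4f} (SE {se_beta:.6f})")
# Expected: alpha ~ -0.887 (0.551918), beta ~ 1.344 (0.090281)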

8.4 An introduction to statistical inference


Often, economic or financial theory will suggest that certain coefficients should take particular values, or values within a given range. It is thus of interest to determine whether the relationships expected from theory are upheld by the data to hand. Estimates of α and β have been obtained from the sample, but these values are not of any particular interest in themselves; the population values that describe the true relationship between the variables would be of more interest, but they are never available. In practice, inferences are made about the likely population values from the regression parameters estimated from the sample of data to hand. In doing this, the aim is to determine whether the differences between the coefficient estimates actually obtained and the expectations arising from theory are far from one another in a statistical sense.

Example 3: Suppose the following regression results have been obtained:

$\hat{Y}_t = 10.5 + 0.6091 X_t$
           (8.5519)  (0.2109)

$\hat{\beta}$ = 0.6091 is a single (point) estimate of the unknown population parameter β. As stated above, the reliability of the point estimate is measured by the coefficient's standard error. The information from the sample coefficients ($\hat{\alpha}$ and $\hat{\beta}$) and their standard errors ($SE(\hat{\alpha})$ and $SE(\hat{\beta})$) can be used to make inferences about the population parameters (α and β). The estimate of the slope coefficient is $\hat{\beta}$ = 0.6091, but this number is likely to vary to some degree from one sample to the next. Thus, it might be of interest to answer questions such as: 'Is it plausible, given this estimate, that the true population parameter β could be 0.5? Is it plausible that β could be 1?' Answers to these questions can be obtained through hypothesis testing.

8.5 Hypothesis testing: some concepts


In the hypothesis testing framework, there are always two hypotheses that go together, known as the null hypothesis (denoted $H_0$) and the alternative hypothesis (denoted $H_1$ or $H_A$). The null hypothesis is the statement or statistical hypothesis that is actually being tested, whereas the alternative hypothesis represents the remaining outcomes of interest. For example, suppose that, given the regression results above, it is of interest to test the hypothesis that the true value of β is in fact 0.5. The following notation is used:

$H_0: \beta = 0.5$
$H_1: \beta \neq 0.5$

This states that the hypothesis that the true but unknown value of β is 0.5 is being tested against the alternative that β is significantly different from 0.5. This is known as a two-sided test, since the outcomes β < 0.5 and β > 0.5 are both subsumed under the alternative hypothesis.
Sometimes, prior information may be available suggesting, for example, that β > 0.5 would be expected rather than β < 0.5. In this case, β < 0.5 is no longer of interest, and hence a one-sided test would be conducted:

$H_0: \beta = 0.5$
$H_1: \beta > 0.5$

Here the null hypothesis that the true value of β is 0.5 is being tested against a one-sided alternative that β is more than 0.5. On the other hand, one could envisage a situation where there is prior information that β < 0.5 is expected. For example, suppose that an investment bank bought a piece of new risk management software that is intended to better track the riskiness inherent in its traders' books, and that β is some measure of the risk that previously took the value 0.5. Clearly, it would not make sense to expect the risk to have risen, and so the hypothesis β > 0.5, corresponding to an increase in risk, is not of interest. In this case, the null and alternative hypotheses would be specified as:

$H_0: \beta = 0.5$
$H_1: \beta < 0.5$

This prior information should come from the (financial) theory of the problem under consideration, and not from an examination of the estimated value of the coefficient.

Note that there is always an equality sign under the null hypothesis. So, for example, β < 0.5 would not be specified under the null hypothesis.

There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach. Both methods centre on a statistical comparison of the estimated value of the coefficient with its value under the null hypothesis. In very general terms, if the estimated value is a long way from the hypothesised value, the null hypothesis is likely to be rejected; if the two are close to one another, the null hypothesis is less likely to be rejected. For example, consider $\hat{\beta}$ = 0.6091 as above. A null hypothesis that the true value of β is 5 ($H_0: \beta = 5$) is more likely to be rejected than a null hypothesis that the true value of β is 0.5 ($H_0: \beta = 0.5$). What is required now is a statistical decision rule that will permit the formal testing of such hypotheses.

The probability distribution of the least squares estimators

In order to test hypotheses, we assume that the error terms $u_t$ are normally distributed with mean zero and variance $\sigma^2$; this is written $u_t \sim N(0, \sigma^2)$. The normal distribution is a convenient one to use since it makes the algebra involved in statistical inference considerably simpler.
a) Since $Y_t$ is a linear combination of $u_t$, it can be stated that if $u_t$ is normally distributed then $Y_t$ will also be normally distributed.
b) The least squares estimators are linear combinations of the random variables $Y_t$. For instance:

$\hat{\beta} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{\sum (X - \bar{X})Y - \bar{Y}\sum (X - \bar{X})}{\sum (X - \bar{X})^2} = \frac{\sum (X - \bar{X})Y}{\sum (X - \bar{X})^2} = \sum W_t Y_t$

(since $\sum (X - \bar{X}) = 0$)

where $W_t = \frac{X_t - \bar{X}}{\sum (X_t - \bar{X})^2}$ are the weights. Since a weighted sum of normal random variables is also normally distributed, it can be said that the coefficient estimates will also be normally distributed. Thus:

$\hat{\alpha} \sim N(\alpha, \mathrm{var}(\hat{\alpha}))$ and $\hat{\beta} \sim N(\beta, \mathrm{var}(\hat{\beta}))$

where:

$se(\hat{\alpha}) = \sigma\sqrt{\frac{\sum X_t^2}{T[\sum X_t^2 - T\bar{X}^2]}}$    (13)

$se(\hat{\beta}) = \sigma\sqrt{\frac{1}{\sum X_t^2 - T\bar{X}^2}}$    (14)

Thus, inferences about the true regression coefficients α and β can be made based on the normal distribution (or similar distributions). Note that relations (13) and (14) involve the unknown residual standard deviation σ.


Will the coefficient estimates still follow a normal distribution if the errors do not follow a normal distribution? The answer is yes, provided that the sample size is sufficiently large; this is due to the central limit theorem (CLT). The normal distribution is plotted below.

Figure 4: The normal distribution


For inferential purposes, we often deal with standard normal random variables, whose mean is zero and whose variance is 1 (denoted N(0, 1)). Standard normal variables can be constructed from $\hat{\alpha}$ and $\hat{\beta}$ by subtracting the mean and dividing by the square root of the variance:

$\frac{\hat{\alpha} - \alpha}{\sqrt{\mathrm{var}(\hat{\alpha})}} \sim N(0,1)$ and $\frac{\hat{\beta} - \beta}{\sqrt{\mathrm{var}(\hat{\beta})}} \sim N(0,1)$

The square roots of the coefficient variances are the standard errors given by relations (13) and (14) above. Unfortunately, the true standard errors of the regression coefficients are never known, since the true residual standard deviation σ is unknown; all that is available are their sample counterparts, the calculated standard errors $SE(\hat{\alpha})$ and $SE(\hat{\beta})$ defined by relations (10) and (11), respectively. Replacing the true standard errors with their sample estimates induces another source of uncertainty, and also means that the standardised statistics are no longer normally distributed. They instead follow the Student's t-distribution with (T - 2) degrees of freedom. That is:

$\frac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t(T-2)$ and $\frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t(T-2)$

A note on the t and the normal distributions

The normal distribution is bell shaped and symmetric about its mean (or about zero for a standard normal distribution). A normal variate can be scaled to have zero mean and unit variance (standardised) by subtracting its mean and dividing by its standard deviation. What does the t-distribution look like? It looks similar to a normal distribution, but with fatter tails and a smaller peak at the mean (shown in Figure 5 below). In addition to the mean and variance, the t-distribution has a further parameter, its degrees of freedom.

Figure 5: The t-distribution versus the normal


Some examples of the percentiles from the normal and t-distributions taken from the statistical
tables are given in the table below. When used in the context of a hypothesis test, these
percentiles become critical values. The values presented in the table would be those critical
values appropriate for a one-sided test of the given significance level. It can be seen that as the
number of degrees of freedom for the t-distribution increases from 5 to 40, the critical values fall
substantially. It can also be seen that the critical values for the t-distribution are larger than those
from the standard normal. This arises from the increased uncertainty associated with the situation
where the error variance must be estimated. So now the t-distribution is used, and for a given
statistic to constitute the same amount of reliable evidence against the null hypothesis, it has to
be bigger in absolute value than in circumstances where the normal is applicable.
Significance level (%)   N(0,1)   t(5)    t(40)
5%                       1.645    2.015   1.684
2.5%                     1.96     2.571   2.021
1%                       2.33     3.365   2.423

Table: Critical values from the standard normal versus the t-distribution
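These percentiles can be reproduced with statistical software. The short Python sketch below uses scipy's distribution functions (an assumed tool, not part of the original handout) to generate the one-sided critical values shown in the table:

# Minimal sketch: reproduce the one-sided critical values in the table above.
from scipy.stats import norm, t

for level in (0.05, 0.025, 0.01):
    z = norm.ppf(1 - level)           # standard normal critical value
    t5 = t.ppf(1 - level, df=5)       # t with 5 degrees of freedom
    t40 = t.ppf(1 - level, df=40)     # t with 40 degrees of freedom
    print(f"{level:>5.1%}: N(0,1)={z:.3f}, t(5)={t5:.3f}, t(40)={t40:.3f}")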

There are broadly two approaches to testing hypotheses under regression analysis: the test of
significance approach and the confidence interval approach. Each of these will now be
considered in turn.
a) The test of significance approach
Assume that the regression equation is given by:

$Y_t = \alpha + \beta X_t + u_t, \quad t = 1, 2, \ldots, T$

The steps involved in doing a test of significance are as follows:
1. Estimate $\hat{\alpha}$ and $\hat{\beta}$, and $SE(\hat{\alpha})$ and $SE(\hat{\beta})$, using the relations discussed earlier.
2. Calculate the test statistic. If the null hypothesis is $H_0: \beta = \beta^*$ and the alternative hypothesis is $H_1: \beta \neq \beta^*$ (for a two-sided test), the test statistic is given by:

$t = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}$    (15)

3. A tabulated distribution with which to compare the estimated test statistic is required. Test statistics derived in this way can be shown to follow a t-distribution with (T - 2) degrees of freedom.
4. Choose a significance level, often denoted by α (not the same as the regression intercept coefficient). It is conventional to use a significance level of 5% or 1%.
5. Given a significance level, a rejection region and a non-rejection region can be determined. If a 5% significance level is employed, 5% of the total distribution (5% of the area under the curve) will be in the rejection region. That rejection region can either be split in half (for a two-sided test) or fall entirely on one side (for a one-sided test). For a two-sided test, the 5% rejection region is split equally between the two tails, as shown in Figure 6(a). For a one-sided test, the 5% rejection region is located solely in one tail of the distribution, as shown in Figures 6(b) and 6(c), for a test where the alternative is of the 'less than' form and of the 'greater than' form, respectively.

$H_0: \beta = \beta^*$;  $H_1: \beta \neq \beta^*$
Figure 6(a): Rejection regions for a two-sided 5% test of hypothesis

$H_0: \beta = \beta^*$;  $H_1: \beta < \beta^*$
Figure 6(b): Rejection region for a one-sided (left-tailed) 5% test of hypothesis

$H_0: \beta = \beta^*$;  $H_1: \beta > \beta^*$
Figure 6(c): Rejection region for a one-sided (right-tailed) 5% test of hypothesis
6. Use the t-tables to obtain a critical value or values with which to compare the test statistic. The critical value will be the value that puts 5% of the distribution into the rejection region. In Figures 6(a)-6(c), c1, c2, c3 and c4 denote such critical values.
7. Finally, perform the test: if the test statistic lies in the rejection region, reject the null hypothesis ($H_0$); otherwise, do not reject $H_0$.
Steps 2-7 require further comment. In step 2, the estimated value of β is compared with the value under the null hypothesis, and this difference is normalised (scaled) by the standard error of the coefficient estimate. The standard error is a measure of how confident one is in the coefficient estimate obtained in the first stage. If the standard error is small, the value of the test statistic will be large relative to the case where the standard error is large.

For a small standard error, it would not require the estimated and hypothesised values to be far
away from one another for the null hypothesis to be rejected.
In this context, the number of degrees of freedom can be interpreted as the number of pieces of additional information beyond the minimum requirement. If two parameters are estimated (α and β, the intercept and the slope of the line, respectively), a minimum of two observations is required to fit the line to the data. As the number of degrees of freedom increases, the critical values in the tables decrease in absolute terms, since less caution is required and one can be more confident that the results are appropriate.
The significance level is also sometimes called the size of the test (note that this is completely different from the size of the sample), and it determines the region where the null hypothesis under test will be rejected or not rejected. Remember that the distributions in Figures 6(a)-6(c) are for a random variable. Purely by chance, a random variable will occasionally take on extreme values (either large positive or large negative values). More specifically, a significance level of 5% means that a result as extreme as this, or more extreme, would be expected only 5% of the time as a consequence of chance alone. To give one illustration, if the 5% critical value for a one-sided test is 1.68, the test statistic would be expected to exceed this value only 5% of the time by chance alone. There is nothing magical about the test: all that is done is to specify an arbitrary cut-off value for the test statistic that determines whether the null hypothesis would be rejected or not. It is conventional to use a 5% size of test, but 10% and 1% are also commonly used.
However, one potential problem with the use of a fixed (e.g. 5%) size of test is that if the sample size is sufficiently large, any null hypothesis can be rejected. This is particularly worrisome in finance, where tens of thousands of observations or more are often available. What happens is that the standard errors shrink as the sample size increases, thus leading to an increase in the value of all t-test statistics. This problem is frequently overlooked in empirical work, but some econometricians have suggested that a lower size of test (e.g. 1%) should be used for large samples.
Note also the use of terminology in connection with hypothesis tests: it is said that the null hypothesis is either rejected or not rejected. It is incorrect to state that if the null hypothesis is not rejected, it is 'accepted' (although this error is frequently made in practice), and it is never said that the alternative hypothesis is accepted or rejected. One reason why it is not sensible to say that the null hypothesis is 'accepted' is that it is impossible to know whether the null is actually true or not. In any given situation, many null hypotheses will not be rejected. For example, suppose that $H_0: \beta = 0.5$ and $H_0: \beta = 1$ are separately tested against the relevant two-sided alternatives and neither null is rejected. Clearly it would not make sense to say that both $H_0: \beta = 0.5$ and $H_0: \beta = 1$ are 'accepted', since the true (but unknown) value of β cannot be both 0.5 and 1. So, to summarise, the null hypothesis is either rejected or not rejected on the basis of the available evidence.
Example 4: Consider the data in Example 1 (the excess returns of a given asset (Y) together with the excess returns on a market portfolio (X) from January 2009 to December 2010, recorded on a monthly basis). The regression result was:

$\hat{Y}_t = -0.887 + 1.344 X_t$
            (0.551918)  (0.090281)

where the figures in parentheses are the standard error estimates. Using the test of significance approach, test the hypothesis that β = 0 against a two-sided alternative at the 5% level of significance.
Solution
The test of significance approach
The null and alternative hypotheses are:

$H_0: \beta = 0$
$H_1: \beta \neq 0$

The test statistic is calculated as:

$t = \frac{\hat{\beta} - 0}{SE(\hat{\beta})} = \frac{1.344 - 0}{0.090281} = 14.887$

We now need the critical value from the t-distribution with (T - 2) = (24 - 2) = 22 degrees of freedom at the 5% level. This means that 5% of the total distribution will be in the rejection region, and since this is a two-sided test, 2.5% of the distribution must be contained in each tail. From the t-distribution table:

$t_{\alpha/2}(T-2) = t_{0.025}(22) = 2.074$

By the symmetry of the t-distribution around zero, the critical values in the upper and lower tails are equal in magnitude but opposite in sign. Thus, the rejection (critical) regions are as shown below:


Figure 7: Rejection regions for a two-sided 5% test of hypothesis

Decision: Reject $H_0: \beta = 0$, since the test statistic lies within the rejection region. Thus, the CAPM beta is significantly different from zero: movements in the given asset (Y) are significantly related to movements in the market (X).
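The same test can be verified numerically. The sketch below (Python with scipy, assumed tooling) computes the test statistic, the 5% two-sided critical value and the p-value for Example 4:

# Minimal sketch: two-sided t-test for H0: beta = 0 in Example 4.
from scipy.stats import t

beta_hat, se_beta, T = 1.344, 0.090281, 24
t_stat = (beta_hat - 0) / se_beta
df = T - 2
t_crit = t.ppf(1 - 0.05 / 2, df)               # two-sided 5% critical value
p_value = 2 * (1 - t.cdf(abs(t_stat), df))     # two-sided p-value

print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}, p = {p_value:.2e}")
# t = 14.887 far exceeds 2.074, so H0 is rejected.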
Some more terminology
If the null hypothesis is rejected at the 5% level, it would be said that the result of the test is
statistically significant. If the null hypothesis is not rejected, it would be said that the result of
the test is not significant, or that it is insignificant. Finally, if the null hypothesis is rejected at
the 1% level, the result is termed highly statistically significant.

Classifying the errors that can be made in tests of hypotheses


$H_0$ is usually rejected if the test statistic is statistically significant at the chosen significance level. There are two possible errors that could be made:
1. Rejecting $H_0$ when it is really true; this is called a type I error.
2. Not rejecting $H_0$ when it is in fact false; this is called a type II error.
The possible scenarios can be summarised in the following table:

                                      True condition (reality)
Result of test                        H0 is true          H0 is false
Significant (reject H0)               Type I error        Correct decision
Not significant (do not reject H0)    Correct decision    Type II error

The probability of a type I error is just α, the significance level or size of the test chosen. To see this, recall what is meant by significance at the 5% level: it is only 5% likely that a result as extreme as this, or more extreme, could have occurred purely by chance. Or, to put it another way, it is only 5% likely that the null would be rejected when it was in fact true.

Note that there is no chance of a free lunch (i.e. a costless gain) here! What happens if the size of the test is reduced (e.g. from a 5% test to a 1% test)? The chances of making a type I error would be reduced, but so would the probability that the null hypothesis would be rejected at all, thus increasing the probability of a type II error. So there always exists a direct trade-off between type I and type II errors when choosing a significance level. The only way to reduce the chances of both is to increase the sample size or to select a sample with more variation, thus increasing the amount of information upon which the results of the hypothesis test are based. In practice, up to a certain level, type I errors are usually considered more serious, and hence a small size of test is usually chosen (5% or 1% are the most common).

Steps involved in formulating a model


Although there are of course many different ways to go about the process of model building, a logical and valid approach would be to follow the steps described below.
1. General statement of the problem: This will usually involve the formulation of a theoretical model, based on a certain theory, that two or more variables should be related to one another in a certain way. The model is unlikely to be able to capture every relevant real-world phenomenon completely, but it should present a sufficiently good approximation to be useful for the purpose at hand.
2. Collection of data relevant to the model: The data required may be available electronically from a data provider or from published government figures. Alternatively, the required data may be available only via a survey after distributing a set of questionnaires, i.e. primary data.
3. Choice of estimation method relevant to the model proposed: For example, is a single-equation or a multiple-equation technique to be used?
4. Statistical evaluation of the model: What assumptions were required to estimate the parameters of the model optimally? Were these assumptions satisfied by the data or the model? Also, does the model adequately describe the data? If the answer is yes, proceed to step 5; if not, go back to steps 1-3 and either reformulate the model, collect more data, or select a different estimation technique that has less stringent requirements.
5. Evaluation of the model from a theoretical perspective: Are the parameter estimates of the sizes and signs that the theory or intuition from step 1 suggested? If the answer is yes, proceed to step 6; if not, again return to steps 1-3.
6. Use of the model: When a researcher is finally satisfied with the model, it can then be used for testing the theory specified in step 1, or for formulating forecasts or suggested courses of action. This suggested course of action might be for an individual, or an input to government policy.
It is important to note that the process of building a robust empirical model is an iterative one, and it is certainly not an exact science. Often, the final preferred model could be very different from the one originally proposed, and it need not be unique, in the sense that another researcher with the same data and the same initial theory could arrive at a different final specification.
SPSS Application


Unit Nine:
The Multiple linear regression model and Statistical Inference
9.1 Introduction

So far we have seen the basic statistical tools and procedures for analyzing relationships between
two variables. But in practice, economic models generally contain one dependent variable and
two or more independent variables. Such models are called multiple regression models.
Example 1:
a) In demand studies we study the relationship between the demand for a good (Y) and the price of the good ($X_1$), the prices of substitute goods ($X_2$) and consumers' income ($X_3$). Here, Y is the dependent variable and $X_1$, $X_2$ and $X_3$ are the explanatory (independent) variables. The relationship is estimated by a multiple linear regression equation (model) of the form:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3$

where $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$ are estimated regression coefficients.
b) In a study of the amount of output (product), we are interested in establishing a relationship between output (Q) and labour input (L) and capital input (K). The equations are often estimated in log-linear form as:

$\log(Q) = \beta_0 + \beta_1 \log(L) + \beta_2 \log(K)$

c) In a study of the determinants of the number of children born per woman (Y), possible explanatory variables include the woman's years of schooling ($X_1$), the woman's (or husband's) earnings at marriage ($X_2$), the woman's age at marriage ($X_3$) and the survival probability of children at age five ($X_4$). The relationship can thus be expressed as:

$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3 + \hat{\beta}_4 X_4$

9.2 Estimation of regression coefficients


Example: Consider the following model with two independent variables $X_1$ and $X_2$:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, 2, \ldots, n$

Expressing all variables in deviations form, that is, $y_i = Y_i - \bar{Y}$, $x_{1i} = X_{1i} - \bar{X}_1$ and $x_{2i} = X_{2i} - \bar{X}_2$, the OLS estimators of the parameters $\beta_0$, $\beta_1$ and $\beta_2$ are given by:

$\hat{\beta}_1 = \frac{\sum y_i x_{1i} \sum x_{2i}^2 - \sum y_i x_{2i} \sum x_{1i} x_{2i}}{\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2}$

$\hat{\beta}_2 = \frac{\sum y_i x_{2i} \sum x_{1i}^2 - \sum y_i x_{1i} \sum x_{1i} x_{2i}}{\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2}$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2$

where $\bar{Y}$, $\bar{X}_1$ and $\bar{X}_2$ are the mean values of the variables Y, $X_1$ and $X_2$, respectively.
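For readers who want these formulae in executable form, here is a minimal Python sketch of the deviations-form estimators; the function name is a hypothetical placeholder. It can be applied to the food-consumption data of Example 2 below.

# Minimal sketch: OLS with two regressors via the deviations-form formulae above.
import numpy as np

def ols_two_regressors(y, x1, x2):
    """Return (b0, b1, b2) for y = b0 + b1*x1 + b2*x2."""
    yd, x1d, x2d = y - y.mean(), x1 - x1.mean(), x2 - x2.mean()
    s11, s22, s12 = (x1d**2).sum(), (x2d**2).sum(), (x1d * x2d).sum()
    s1y, s2y = (x1d * yd).sum(), (x2d * yd).sum()
    denom = s11 * s22 - s12**2
    b1 = (s1y * s22 - s2y * s12) / denom
    b2 = (s2y * s11 - s1y * s12) / denom
    b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()
    return b0, b1, b2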
An estimator of the variance of the errors, $\sigma^2$, is given by:

$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n-3} = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-3}$

where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$.
The standard errors of the estimated regression coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ are estimated as:

$SE(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{(1 - r_{12}^2)\sum x_{1i}^2}}$ and $SE(\hat{\beta}_2) = \sqrt{\frac{\hat{\sigma}^2}{(1 - r_{12}^2)\sum x_{2i}^2}}$

where $r_{12}$ is the coefficient of correlation between $X_1$ and $X_2$, that is:

$r_{12} = \frac{\sum x_{1i} x_{2i}}{\sqrt{\sum x_{1i}^2 \sum x_{2i}^2}}$

Example 2: Consider the following data on per capita food consumption (Y), the price of food ($X_1$) and per capita income ($X_2$) for the years 1927-1941 in the United States. The retail price of food and per capita disposable income are deflated by the Consumer Price Index.

Year    Y      X1     X2       Year    Y      X1     X2
1927    88.9   91.7   57.7     1935    85.4   88.1   52.1
1928    88.9   92.0   59.3     1936    88.5   88.0   58.0
1929    89.1   93.1   62.0     1937    88.4   88.4   59.8
1930    88.7   90.9   56.3     1938    88.6   83.5   55.9
1931    88.0   82.3   52.7     1939    91.7   82.4   60.3
1932    85.9   76.3   44.4     1940    93.3   83.0   64.1
1933    86.0   78.3   43.8     1941    95.1   86.2   73.7
1934    87.1   84.3   47.8

We want to fit the multiple linear regression model:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, 2, \ldots, 15$

To simplify the calculations, it is better to work with deviations: $y_i = Y_i - \bar{Y}$, $x_{1i} = X_{1i} - \bar{X}_1$ and $x_{2i} = X_{2i} - \bar{X}_2$. The transformed values are shown in the following table.

Year     Y        X1      X2        y        x1       x2
1927     88.9     91.7    57.7     -0.007    5.800    1.173
1928     88.9     92.0    59.3     -0.007    6.100    2.773
1929     89.1     93.1    62.0      0.193    7.200    5.473
1930     88.7     90.9    56.3     -0.207    5.000   -0.227
1931     88.0     82.3    52.7     -0.907   -3.600   -3.827
1932     85.9     76.3    44.4     -3.007   -9.600  -12.127
1933     86.0     78.3    43.8     -2.907   -7.600  -12.727
1934     87.1     84.3    47.8     -1.807   -1.600   -8.727
1935     85.4     88.1    52.1     -3.507    2.200   -4.427
1936     88.5     88.0    58.0     -0.407    2.100    1.473
1937     88.4     88.4    59.8     -0.507    2.500    3.273
1938     88.6     83.5    55.9     -0.307   -2.400   -0.627
1939     91.7     82.4    60.3      2.793   -3.500    3.773
1940     93.3     83.0    64.1      4.393   -2.900    7.573
1941     95.1     86.2    73.7      6.193    0.300   17.173
Total    1333.6   1288.5  847.9
Mean     88.90667 85.90   56.52667

The necessary calculations using the transformed variables are shown below:
   y        x1        x2        x1·y      x2·y       x1·x2      x1²       x2²       y²
-0.007    5.800     1.173     -0.039    -0.008      6.805     33.640     1.377    4.45E-05
-0.007    6.100     2.773     -0.041    -0.018     16.917     37.210     7.691    4.45E-05
 0.193    7.200     5.473      1.392     1.058     39.408     51.840    29.957    0.037
-0.207    5.000    -0.227     -1.033     0.047     -1.133     25.000     0.051    0.043
-0.907   -3.600    -3.827      3.264     3.470     13.776     12.960    14.643    0.822
-3.007   -9.600   -12.127     28.864    36.461    116.416     92.160   147.056    9.040
-2.907   -7.600   -12.727     22.091    36.992     96.723     57.760   161.968    8.449
-1.807   -1.600    -8.727      2.891    15.766     13.963      2.560    76.155    3.264
-3.507    2.200    -4.427     -7.715    15.523     -9.739      4.840    19.595   12.297
-0.407    2.100     1.473     -0.854    -0.599      3.094      4.410     2.171    0.165
-0.507    2.500     3.273     -1.267    -1.658      8.183      6.250    10.715    0.257
-0.307   -2.400    -0.627      0.736     0.192      1.504      5.760     0.393    0.094
 2.793   -3.500     3.773     -9.777    10.540    -13.207     12.250    14.238    7.803
 4.393   -2.900     7.573    -12.741    33.272    -21.963      8.410    57.355   19.301
 6.193    0.300    17.173      1.858   106.360      5.152      0.090   294.923   38.357
TOTAL                         27.630   257.397    275.900    355.140   838.289   99.929

Summary statistics:

n = 15, $\sum x_{1i} y_i$ = 27.63, $\sum x_{2i} y_i$ = 257.397, $\sum x_{1i} x_{2i}$ = 275.9, $\sum x_{1i}^2$ = 355.14, $\sum x_{2i}^2$ = 838.289, $\sum y_i^2$ = 99.929, $\bar{Y}$ = 88.90667, $\bar{X}_1$ = 85.9, $\bar{X}_2$ = 56.52667
The OLS estimates of the regression coefficients are:

$\hat{\beta}_1 = \frac{\sum y_i x_{1i} \sum x_{2i}^2 - \sum y_i x_{2i} \sum x_{1i} x_{2i}}{\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2} = \frac{27.63(838.289) - 257.397(275.9)}{355.14(838.289) - (275.9)^2} = -0.21596$

$\hat{\beta}_2 = \frac{\sum y_i x_{2i} \sum x_{1i}^2 - \sum y_i x_{1i} \sum x_{1i} x_{2i}}{\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i} x_{2i})^2} = \frac{257.397(355.14) - 27.63(275.9)}{355.14(838.289) - (275.9)^2} = 0.378127$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2 = 88.90667 - (-0.21596)(85.9) - (0.378127)(56.52667) = 86.08318$

Hence, the estimated model is:

$\hat{Y} = 86.08318 - 0.21596 X_1 + 0.378127 X_2$

Estimation of standard errors of the estimated coefficients

The estimated errors (residuals) are:

$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - 86.08318 + 0.21596 X_{1i} - 0.378127 X_{2i}$

The error sum of squares is ESS = $\sum \hat{u}_i^2$ = 8.567271. Thus, an estimator of the error variance $\sigma^2$ is:

$\hat{\sigma}^2 = \frac{\sum \hat{u}_i^2}{n-3} = \frac{8.567271}{15-3} = 0.713939$

The coefficient of correlation between $X_1$ and $X_2$ is computed as:

$r_{12} = \frac{\sum x_{1i} x_{2i}}{\sqrt{\sum x_{1i}^2 \sum x_{2i}^2}} = \frac{275.9}{\sqrt{355.14 \times 838.289}} = 0.505656$
The standard errors of the estimated regression coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ are estimated as:

$SE(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{(1 - r_{12}^2)\sum x_{1i}^2}} = \sqrt{\frac{0.713939}{[1 - (0.505656)^2][355.14]}} = 0.05197$

$SE(\hat{\beta}_2) = \sqrt{\frac{\hat{\sigma}^2}{(1 - r_{12}^2)\sum x_{2i}^2}} = \sqrt{\frac{0.713939}{[1 - (0.505656)^2][838.289]}} = 0.033826$
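As a cross-check on the hand calculations above, the following sketch fits the same model with numpy's least-squares routine (an assumed tool); the arrays are the 15 observations from the data table.

# Minimal sketch: cross-check the food-consumption regression with numpy.
import numpy as np

Y = np.array([88.9, 88.9, 89.1, 88.7, 88.0, 85.9, 86.0, 87.1,
              85.4, 88.5, 88.4, 88.6, 91.7, 93.3, 95.1])    # consumption
X1 = np.array([91.7, 92.0, 93.1, 90.9, 82.3, 76.3, 78.3, 84.3,
               88.1, 88.0, 88.4, 83.5, 82.4, 83.0, 86.2])   # food price
X2 = np.array([57.7, 59.3, 62.0, 56.3, 52.7, 44.4, 43.8, 47.8,
               52.1, 58.0, 59.8, 55.9, 60.3, 64.1, 73.7])   # income

A = np.column_stack([np.ones_like(Y), X1, X2])    # design matrix with intercept
b, ess, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(f"b0 = {b[0]:.5f}, b1 = {b[1]:.5f}, b2 = {b[2]:.6f}")
# Expected: b0 ~ 86.08318, b1 ~ -0.21596, b2 ~ 0.378127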

10. Evaluating the regression equation

Is the estimated equation a useful one? To answer this, an objective measure of some sort is desirable. Such an objective measure, called the coefficient of determination, is available. First let us define some measures of dispersion or variability.

The total sum of squares (TSS) is a measure of the dispersion of the observed values of Y about their mean. It is computed as:

$TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2 = \sum_{i=1}^{n} y_i^2$

The regression (explained) sum of squares (RSS) measures the amount of the total variability in the observed values of Y that is accounted for by the linear relationship between the observed values of the X's and Y. It is computed as:

$RSS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$

The error (residual or unexplained) sum of squares (ESS) is a measure of the dispersion of the observed values of Y about the regression line. It is computed as:

$ESS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$

Note: It can be shown that the total sum of squares is the sum of the regression sum of squares and the error sum of squares, i.e. TSS = RSS + ESS.

If a regression equation does a good job of describing the relationship between the dependent variable and the independent variables, the explained sum of squares (RSS) should constitute a large proportion of the total sum of squares (TSS). It is therefore of interest to determine the magnitude of this proportion, namely the ratio of the explained sum of squares to the total sum of squares. This proportion is called the sample coefficient of determination, $R^2$:

$R^2 = \frac{RSS}{TSS} = 1 - \frac{ESS}{TSS}$

$R^2$ measures the proportion of variation in the dependent variable that is explained by the independent variables (i.e. by the linear regression model). It is a goodness-of-fit statistic. The proportion of total variation in the dependent variable that is accounted for by factors other than the X's (for example, excluded variables, chance, etc.) is equal to $(1 - R^2) \times 100\%$.


Example 3: Consider the data on per capita food consumption (Y), the price of food ($X_1$) and per capita income ($X_2$). Calculate the coefficient of determination and interpret it.

Solution
The variation in the dependent variable Y (food consumption) can be decomposed into:

Total sum of squares: $TSS = \sum (Y_i - \bar{Y})^2 = \sum y_i^2 = 99.929$

Error sum of squares: $ESS = \sum (Y_i - \hat{Y}_i)^2 = \sum \hat{u}_i^2 = 8.567271$

Regression sum of squares: RSS = TSS - ESS = 91.362

The coefficient of determination is thus:

$R^2 = \frac{RSS}{TSS} = \frac{91.362}{99.929} = 0.914$

• $R^2$ = 0.914 indicates that 91.4% of the variation (change) in food consumption is attributed to the effects of food price and/or consumer income.
• $1 - R^2$ = 0.086 indicates that 8.6% of the variation in food consumption is due to factors (variables) not included in our specification (such as habit persistence, geographical and time variation, etc.).

Analysis of variance (ANOVA)

$R^2$ measures the proportion of variation in the dependent variable Y that is explained by the explanatory variables (i.e. by the multiple linear regression model). The largest value that $R^2$ can assume is 1 (in which case all observations fall on the regression line), and the smallest it can assume is zero. A small value of $R^2$ casts doubt on the usefulness of the regression equation. We do not, however, pass final judgement on the equation until it has been subjected to an objective statistical test.
A test of the significance of $R^2$ (i.e. of the adequacy of the multiple linear regression model) is equivalent to testing the hypotheses:

$H_0: \beta_1 = \beta_2 = \ldots = 0$ (all slope coefficients are simultaneously zero)
$H_A: H_0$ is not true

The null hypothesis $H_0$ states that all regression coefficients are insignificant (none of the regressors explains the dependent variable). Not rejecting $H_0$ means that the model is inadequate, and is useless for prediction or inferential purposes.
The above test is accomplished by means of analysis of variance (ANOVA), which enables us to test the significance of $R^2$. The ANOVA table for the multiple linear regression model is given below:
ANOVA table for multiple linear regression

Source of variation   Sum of squares   Degrees of freedom   Mean square      Variance ratio
Regression            RSS              k - 1                RSS/(k - 1)      Fcal = [RSS/(k - 1)] / [ESS/(n - k)]
Residual              ESS              n - k                ESS/(n - k)
Total                 TSS              n - 1

Here k is the number of parameters estimated from the sample data (including the intercept) and n is the sample size. To test the significance of $R^2$, we compare the variance ratio with $F_\alpha(k-1, n-k)$, the critical value from the F distribution with (k - 1) and (n - k) degrees of freedom in the numerator and denominator, respectively, for a given significance level α.

Decision rule: Reject $H_0$ if:

$F_{cal} = \frac{RSS/(k-1)}{ESS/(n-k)} > F_\alpha(k-1, n-k)$

If $H_0$ is rejected, we conclude that $R^2$ is significant (i.e. that the fitted model is adequate and useful for prediction purposes).
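The F critical values used in this decision rule can be read from tables or computed in software; for instance, with scipy (assumed tooling) and the dimensions of the food-consumption example that follows:

# Minimal sketch: F critical values for testing model adequacy.
from scipy.stats import f

k, n = 3, 15                       # parameters estimated, sample size
for alpha in (0.01, 0.05):
    crit = f.ppf(1 - alpha, k - 1, n - k)
    print(f"F_{alpha}({k-1},{n-k}) = {crit:.2f}")
# Expected: roughly 6.93 at the 1% level and 3.89 at the 5% level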
Note:
As the number of explanatory (independent) variables increases, $R^2$ always increases. This implies that the goodness of fit of an estimated model depends on the number of explanatory variables, regardless of whether they are important or not. To eliminate this dependency, we calculate the adjusted $R^2$ (denoted $\bar{R}^2$) as:

$\bar{R}^2 = 1 - (1 - R^2)\frac{n-1}{n-k}$

Unlike $R^2$, $\bar{R}^2$ may increase or decrease when new variables are added to the model.

Example 4: Consider the multiple regression model of per capita food consumption (Y) on the price of food ($X_1$) and per capita income ($X_2$):

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$

The fitted multiple regression model from the sample data was:

$\hat{Y} = 86.08318 - 0.21596 X_1 + 0.378127 X_2$

Is the model adequate?

Solution
Here, k = 3 (since we estimated three regression coefficients $\beta_0$, $\beta_1$ and $\beta_2$), n = 15, TSS = 99.929, ESS = 8.567271 and RSS = 91.362.

A test of model adequacy is accomplished by testing the hypothesis:

$H_0: \beta_1 = \beta_2 = 0$
$H_A: H_0$ is not true
The ANOVA table is:

Source of variation   Sum of squares   Degrees of freedom   Mean square   Variance ratio
Regression            91.362           3 - 1 = 2            45.681        Fcal = 63.984
Residual              8.567            15 - 3 = 12          0.714
Total                 99.929           15 - 1 = 14

We then compare this F-ratio with $F_\alpha(k-1, n-k) = F_\alpha(2, 12)$:
• For α = 0.01, $F_{0.01}(2, 12) = 6.93$
• For α = 0.05, $F_{0.05}(2, 12) = 3.89$

Since the test statistic is greater than both tabulated values, the ratio is significant at the conventional levels of significance (1% and 5%). Thus, we reject the null hypothesis and conclude that the model is adequate; that is, variation (change) in per capita food consumption is significantly attributed to the effects of food price and/or per capita disposable income.
11. Tests on the regression coefficients
Once we conclude that the model is adequate, the next step is to test the significance of each of the coefficients in the model. To test whether each coefficient is significant or not, the null and alternative hypotheses are:

$H_0: \beta_j = 0$
$H_A: \beta_j \neq 0$   for j = 1, 2.

The test statistic is the ratio of each estimated regression coefficient to its estimated standard error, that is:

$t_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}, \quad j = 1, 2, \ldots, k$

Decision rule: If $|t_j| > t_{\alpha/2}(n-k)$, we reject $H_0$ and conclude that $\beta_j$ is significant, that is, the regressor variable $X_j$ significantly affects the dependent variable Y.


Example 5: Consider our fitted multiple regression model of per capita food consumption (Y) on
price of food (X₁) and per capita income (X₂):

Ŷ = 86.08318 - 0.21596 X₁ + 0.378127 X₂

We have already calculated the standard errors of the estimated regression coefficients β̂₁ and β̂₂ as:

s.e.(β̂₁) = 0.05197,  s.e.(β̂₂) = 0.033826
a) Does food price significantly affect per capita food consumption?
The hypothesis to be tested is:

H₀: β₁ = 0
H_A: β₁ ≠ 0
The test statistic is calculated as:

t₁ = β̂₁ / s.e.(β̂₁) = -0.21596 / 0.05197 = -4.155

For significance level α = 0.01 and degrees of freedom (n - 3) = (15 - 3) = 12, the critical value
from the Student's t-distribution is:
t_α/2(n - 3) = t_0.005(12) = 3.055
Decision: Since |t₁| = 4.155 > 3.055, we reject the null hypothesis and conclude that food
price significantly affects per capita food consumption at the 1% level of significance.
b) Does disposable income significantly affect per capita food consumption?
The hypothesis to be tested is:
H₀: β₂ = 0
H_A: β₂ ≠ 0
The test statistic is calculated as:

t₂ = β̂₂ / s.e.(β̂₂) = 0.378127 / 0.033826 = 11.179

The 1% critical value from the Student's t-distribution is again 3.055.


Decision: Since |t₂| = 11.179 > 3.055, we reject the null hypothesis and conclude that
disposable income significantly affects per capita food consumption at the 1% level of
significance.
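Both t-ratios can be checked mechanically (my own sketch; the values are taken from the text above):

# Check of Example 5: t ratios for price and income against t_0.005(12).
from scipy import stats

b1, se1 = -0.21596, 0.05197        # food price coefficient and its standard error
b2, se2 = 0.378127, 0.033826       # income coefficient and its standard error
t1, t2 = b1 / se1, b2 / se2        # -4.155 and 11.179
t_crit = stats.t.ppf(1 - 0.01 / 2, df=15 - 3)  # 3.055
print(abs(t1) > t_crit, abs(t2) > t_crit)      # True True: both significant at 1%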
Generally, we have the following:

- Food price significantly and negatively affects per capita food consumption, while disposable
  income significantly and positively affects per capita food consumption.
- The estimated coefficient of food price is -0.21596. Holding disposable income constant, a
  one dollar increase in food price results in a 0.216 dollar decrease in per capita food
  consumption.
- The estimated coefficient of disposable income is 0.378127. Holding food price constant, a one
  dollar increase in disposable income results in a 0.378 dollar increase in per capita food
  consumption.

12. Fitting a multiple linear regression model using computer software


In a multiple linear regression analysis involving a large number of explanatory variables, the
computations are complicated and tedious. Fortunately, there are a number of computer packages
readily available for such analysis, and thus, one does not need to go through the details of the
calculations involved. The SPSS output for the above data is shown below.
Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .956a   .914       .900                .84495

a. Predictors: (Constant), income, price

R² = 0.914 indicates that 91.4% of the variation (change) in food consumption is attributed to
the effect of food price and/or consumer income.

ANOVA(b)

Model 1       Sum of Squares   df   Mean Square   F        Sig.
Regression    91.362           2    45.681        63.984   .000a
Residual      8.567            12   .714
Total         99.929           14

a. Predictors: (Constant), income, price
b. Dependent Variable: consumption

A test of model adequacy is accomplished by means of analysis of variance (ANOVA), which
enables us to test the null hypothesis of no linear relationship between the dependent variable
and the set of explanatory variables. In this particular example, the ANOVA table is used to test
the hypothesis:

H₀: β₁ = β₂ = 0
H_A: H₀ is not true
Definition (p-value):
A p-value is the smallest level of significance, that is, the smallest value of α, for which the
null hypothesis is rejected.

Examples:

If p-value = 0.002, then we can reject the null hypothesis for all values of α greater than
0.002 (such as α = 0.01 or α = 0.05).
If p-value = 0.03, then we can reject the null hypothesis for all values of α greater than 0.03
(such as α = 0.05). However, we cannot reject the null hypothesis at α = 0.01.
If p-value = 0.07, then we cannot reject the null hypothesis at either the 5% or the 1% level of
significance.
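All three cases reduce to a single comparison, as the small sketch below shows (my own illustration):

# Sketch: the p-value decision rule, p < alpha => reject H0.
def decide(p_value, alpha):
    return "reject H0" if p_value < alpha else "do not reject H0"

print(decide(0.002, 0.01))  # reject H0
print(decide(0.03, 0.01))   # do not reject H0
print(decide(0.07, 0.05))   # do not reject H0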

Note: In SPSS output, p-values are displayed in the column labeled Sig.
In this particular example, Sig. = 0.000, which is less than 0.01 or 0.05. Thus, we can conclude
that the model is adequate at the 1% level of significance. That is, there is a significant linear
relationship between food consumption and food price and/or consumer income. This means that,
based on food price and consumer income, we can make valid inferences about per capita food
consumption at the 99% level of confidence.
Coefficients(a)

                Unstandardized Coefficients     Standardized Coefficients
Model 1         B         Std. Error            Beta            t         Sig.
(Constant)      86.083    3.873                                 22.226    .000
price           -.216     .052                  -.407           -4.155    .001
income          .378      .034                  1.095           11.178    .000

a. Dependent Variable: consumption

As can be seen from the table above, the p-values for price and income are both less than 0.01.
Thus, we can conclude that both variables significantly affect consumption at the 1% level of
significance. From the signs of the estimated regression coefficients we can see that the direction
of influence is opposite: price affects consumption negatively while income affects consumption
positively. The constant term (intercept) is also significant.

Note: In general, if Sig. > 0.05, we doubt the importance of the variable!
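Although the handout uses SPSS, similar output can be produced in Python with statsmodels. The sketch below is my own illustration: since the food-consumption data are not reprinted in this section, it generates synthetic placeholder data purely to make it runnable.

# Sketch: the same kind of multiple regression fitted with statsmodels instead of SPSS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.uniform(80, 120, size=15)                           # placeholder "price" series
x2 = rng.uniform(100, 200, size=15)                          # placeholder "income" series
y = 86 - 0.22 * x1 + 0.38 * x2 + rng.normal(0, 1, size=15)   # placeholder response

X = sm.add_constant(np.column_stack([x1, x2]))  # add the intercept column
results = sm.OLS(y, X).fit()
print(results.summary())  # R-squared, F statistic, coefficients, std. errors,
                          # t values and p-values (the SPSS "Sig." column)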
Multicollinearity
Introduction
In the construction of an econometric model, it may happen that two or more variables conveying
the same piece of information are included, that is, we may have redundant information or
unnecessarily included closely related variables. This is what we call a multicollinearity (MC)
problem.

Such MC is very common in macroeconomic time-series data (such as GNP, money supply and
income) since economic variables tend to move together over time.

Consequences of a high degree of MC (moderate to strong MC)

Consider the case when there is a high degree of (moderate to strong) MC but not perfect MC.
What happens to the parameter estimates? Again consider the model in deviations form (K = 3):

yᵢ = β₂x₂ᵢ + β₃x₃ᵢ + εᵢ

A high degree of MC means that r₂₃, the correlation coefficient between X₂ and X₃, tends to +1 or -1.

We have seen earlier that the variances of β̂₂ and β̂₃ are estimated by:

V(β̂₂) = σ̂² / [(1 - r₂₃²) Σx₂ᵢ²]  and  V(β̂₃) = σ̂² / [(1 - r₂₃²) Σx₃ᵢ²]

Now, as r₂₃ tends towards ±1:

r₂₃² approaches one
1 - r₂₃² approaches zero
both (1 - r₂₃²)Σx₂ᵢ² and (1 - r₂₃²)Σx₃ᵢ² approach zero
both V(β̂₂) and V(β̂₃) become very large (or will be inflated)

In particular, if r₂₃ = ±1, then the variances become infinite.


Recall that to test whether each of the coefficients is significant or not, that is, to test H₀: βⱼ = 0
versus H_A: βⱼ ≠ 0, the test statistic is:

tⱼ = β̂ⱼ / s.e.(β̂ⱼ),  j = 2, 3,  where s.e.(β̂ⱼ) = √V(β̂ⱼ)
Thus, under a high degree of MC, the standard errors will be inflated and the test statistic will be
a very small number. This often leads to incorrectly accepting (not rejecting) the null hypothesis
when in fact the parameter is significantly different from zero!
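The inflation is easy to see numerically. The sketch below (my own illustration; σ² and Σx₂ᵢ² are hypothetical values) evaluates V(β̂₂) as r₂₃ grows:

# Sketch: V(beta2_hat) = sigma2 / ((1 - r23**2) * sum_x2_sq) blows up as r23 -> 1.
sigma2, sum_x2_sq = 1.0, 100.0       # hypothetical values, for illustration only
for r23 in (0.0, 0.9, 0.99, 0.999):
    var_b2 = sigma2 / ((1 - r23**2) * sum_x2_sq)
    print(r23, round(var_b2, 4))     # 0.01, 0.0526, 0.5025, 5.0025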
Major implications of a high degree of MC

1. OLS coefficient estimates are still unbiased.
2. OLS coefficient estimates will have large variances (or the variances will be inflated).
3. There is a high probability of accepting the null hypothesis of a zero coefficient (using the
t-test) when in fact the coefficient is significantly different from zero.
4. The regression model may do well, that is, R² may be quite high.
5. The OLS estimates and their standard errors may be quite sensitive to small changes in the
data.
Example: Consider the following data on imports (Y), GDP (X₂), stock formation (X₃) and
consumption (X₄) for the years 1949–1967.

Year   Y      X₂      X₃    X₄
1949   15.9   149.3   4.2   108.1
1950   16.4   161.2   4.1   114.8
1951   19.0   171.5   3.1   123.2
1952   19.1   175.5   3.1   126.9
1953   18.8   180.8   1.1   132.1
1954   20.4   190.7   2.2   137.7
1955   22.7   202.1   2.1   146.0
1956   26.5   212.4   5.6   154.1
1957   28.1   226.1   5.0   162.3
1958   27.6   231.9   5.1   164.3
1959   26.3   239.0   0.7   167.6
1960   31.1   258.0   5.6   176.8
1961   33.3   269.8   3.9   186.6
1962   37.0   288.4   3.1   199.7
1963   43.3   304.5   4.6   213.9
1964   49.0   323.4   7.0   223.8
1965   50.3   336.8   1.2   232.0
1966   56.6   353.9   4.5   242.9
1967   59.9   369.7   5.0   252.0

Applying OLS, we obtain the following results (using SPSS):

                  Coefficient   Standard error   t-ratio
Constant          -19.982       4.372            -4.570
GDP               0.100         0.194            0.515
Stock formation   0.447         0.341            1.309
Consumption       0.149         0.297            0.501

R² = 0.975,  F = 197.873 (p-value < 0.001)


The value of R 2 is close to 1, meaning GDP, stock formation and consumption together explain
97.5% of the variation in imports. Also the F-statistic is significant at the 1% level of
142

Didactic Design: <Title of Module>

143

significance. Thus, the linear regression model is adequate. However, all of the estimated
regression coefficients (save the constant term) are insignificant at the conventional levels of
significance. This is an indication that the standard errors are inflated due to MC. Since an
increase in GDP is often associated with an increase in consumption, they have a tendency to
grow up together over time leading to MC. The coefficient of correlation between GDP and
consumption is 0.999. Thus, it seems that the problem of MC is due to the joint appearance of
these two variables.
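The quoted correlation of 0.999 can be verified directly from the data table above (my own check):

# Check: correlation between GDP (X2) and consumption (X4), data from the table above.
import numpy as np

gdp = np.array([149.3, 161.2, 171.5, 175.5, 180.8, 190.7, 202.1, 212.4, 226.1, 231.9,
                239.0, 258.0, 269.8, 288.4, 304.5, 323.4, 336.8, 353.9, 369.7])
cons = np.array([108.1, 114.8, 123.2, 126.9, 132.1, 137.7, 146.0, 154.1, 162.3, 164.3,
                 167.6, 176.8, 186.6, 199.7, 213.9, 223.8, 232.0, 242.9, 252.0])
print(np.corrcoef(gdp, cons)[0, 1])  # approx. 0.999, as stated in the text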
Methods of detection of MC
Multicollinearity almost always exists in applied work. So the question is not whether it is
present or not; it is a question of degree! Also, MC is not a statistical problem; it is a data
(sample) problem. Therefore, we do not test for MC, but rather measure its degree in any particular
sample (using some rules of thumb).

Some of the methods of detecting MC are:


1. A high R² but few (or no) significant t-ratios.
2. High pair-wise correlations among regressors. Note that this is a sufficient but not a necessary
condition; that is, small pair-wise correlations for all pairs of regressors do not guarantee the
absence of MC.
3. Variance inflation factor (VIF)
Consider the regression model:

Yᵢ = β₁ + β₂X₂ᵢ + β₃X₃ᵢ + . . . + β_K X_Kᵢ + εᵢ    (*)

The VIF of β̂ⱼ is defined as:

VIF(β̂ⱼ) = 1 / (1 - Rⱼ²),  j = 2, 3, . . ., K

where Rⱼ² is the coefficient of determination obtained when the variable Xⱼ is regressed on
the remaining explanatory variables (called an auxiliary regression). For example, the VIF of β̂₂
is defined as:

VIF(β̂₂) = 1 / (1 - R₂²)

where R₂² is the coefficient of determination of the auxiliary regression:

X₂ᵢ = α₁ + α₃X₃ᵢ + α₄X₄ᵢ + . . . + α_K X_Kᵢ + uᵢ

Rule of thumb:
a) If VIF(β̂ⱼ) exceeds 10, then β̂ⱼ is poorly estimated because of MC (or the j-th regressor
variable Xⱼ is responsible for MC).
b) (Klein's rule) MC is troublesome if any of the Rⱼ² exceeds the overall R² (the
coefficient of determination of the regression equation (*)).

Example: Consider the data on imports (Y), GDP (X₂), stock formation (X₃) and
consumption (X₄) for the years 1949–1967. The coefficient of determination of the
auxiliary regression of GDP (X₂) on stock formation (X₃) and consumption (X₄):

X₂ᵢ = α₁ + α₃X₃ᵢ + α₄X₄ᵢ + uᵢ

is (using SPSS) R₂² = 0.998203. The VIF of β̂₂ is thus:

VIF(β̂₂) = 1 / (1 - R₂²) = 1 / (1 - 0.998203) = 556.5799

Since this figure far exceeds 10, we can conclude that the coefficient of GDP is poorly
estimated because of MC (or that GDP is responsible for MC). It can also be shown that
VIF(β̂₄) = 555.898, indicating that consumption is also responsible for MC.
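For readers working outside SPSS, here is a minimal sketch of the VIF computation via auxiliary regressions (my own illustration; X is assumed to be an n × 3 numpy array holding GDP, stock formation and consumption as columns):

# Sketch: VIF of regressor j via its auxiliary regression on the other regressors.
import numpy as np

def vif(X, j):
    """Regress column j of X on the remaining columns (plus a constant)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])   # constant + other regressors
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    r2_aux = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2_aux)

# vif(X, 0) should reproduce a value of roughly 556.6 for GDP with the data above.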

Remedial measures
To circumvent the problem of MC, some of the possibilities are:
1. Include additional observations while maintaining the original model so that a reduction in the
correlation among variables is attained.
2. Drop a variable.
This may result in an incorrect specification of the model (called specification bias). If we
consider our example, we expect both GDP and consumption to have an impact on imports; by
dropping one or the other, we would introduce specification bias.
Exercise with SPSS Application