Research Process Outline

1. Define Research.
Research is a careful and detailed study into a specific problem, concern, or issue using the
scientific method. It's the adult form of the science fair projects back in elementary school,
where you try and learn something by performing an experiment. This is best accomplished by
turning the issue into a question, with the intent of the research to answer the question.
Research can be about anything, and we hear about all different types of research in the news.
Medical research produces headlines such as 'Breakthrough Cancer-Killing Treatment Has No Side
Effects in Mice' and 'Baby Born with HIV Cured.' Each of these began with an issue or a problem
(such as cancer or HIV) and a question, like 'Does medication X reduce cancerous tissue or HIV
infections?'
To begin researching something, you have to have a problem, concern, or issue that has turned
into a question. These can come from observing the world, prior research, professional
literature, or from peers. Research really begins with the right question, because your question
must be answerable. A question like 'How can I cure cancer?' isn't really answerable with a
study: it is too vague and not testable.
2. What are the objectives of research?
The objective of any research is to find answers to questions by applying scientific
procedures. The broad aim of all research is to confirm the reliability of existing knowledge
and to contribute new knowledge to it. The objectives of research may be
categorized into four groups:
a) Formulative research studies: To gain familiarity with a phenomenon or to achieve new
insights into it.
b) Descriptive research studies: To accurately portray the characteristics of a particular
individual, group or situation, e.g. to describe the habitat of the giant panda in China.
c) Diagnostic research studies: To determine the frequency with which something occurs.
d) Hypothesis-testing research studies: To test a hypothesis of a causal relationship
between two variables.
3. State the significance of research.
Research plays a very significant role in making progress possible in all facets related to mankind.
Research encourages scientific and inductive thinking, besides promoting the development of
logical habits of thinking and organization. Some of the areas where research plays a significant
role are summarized below:
Gather Necessary Information: Research provides you with the necessary information in your field
of work, study or operation before you begin working in it. For example, most companies do
research before beginning a project in order to get a basic idea of the things they will need
to do for the project. Research also helps them get acquainted with the processes and resources
involved and the likely reception from the market. This information contributes to the successful
outcome of the project.
To Make Changes
Sometimes there are built-in problems in a process or a project that are hard to discover.
Research helps us find the root cause and associated elements of a process. The end result of
such research creates a demand for change and sometimes succeeds in producing changes
as well. For example, many UN studies have paved the way for changes in environmental
policies.
Improving Standard Of Living

Only through research can new inventions and discoveries come to life. It was Heinrich Hertz's
research on electromagnetic waves that paved the way for the invention of radio communication.
Imagine how you would have communicated had Alexander Graham Bell not come out with the first
practical telephone! Forget telephones, what would have happened if Martin Cooper had not
presented the world with the concept of mobile phones! Addicted as we are to mobile phones, we
need to understand that all the luxuries and amenities now available to us are the result of
research done by someone. And with the world facing more and more crises each day, we need
researchers to find new solutions to tackle them.
For A Safer Life

Research has led to groundbreaking discoveries and developments in the fields of health,
nutrition, food technology and medicine. These have improved the life expectancy and
health of the human race in all parts of the world, helped eradicate smallpox completely, and
brought diseases like polio close to eradication. Many diseases that were once untreatable can
now be managed, as continuing research in the field of medicine has led to drugs that not only
treat once-incurable diseases, but also prevent them from recurring.
To Know The Truth

It has been shown time and again that some supposedly established facts and known truths turn
out to be cover-ups, falsehoods or rumors. Research is needed to investigate and expose these and
bring out the truth.
Explore Our History

Research into our planet's history and human history has enabled us to learn and understand
more about our forefathers, and has helped us learn from their mistakes and absorb the good things
from their lives. Research into the planet's history has also told us a lot about how
things may shape up in years to come, and how we need to respect our planet and work closely
together to stop global warming and other scenarios of destruction.
4. What is the importance of knowing how to do research?

Importance of knowing the methodology of research or how research is done stems from the
following considerations:
(i) The knowledge of research methodology provides good training to the new researchers and
enables them to do better research. It helps them to develop disciplined thinking or a bent of
mind to observe the field objectively.
(ii) Knowledge of how to do research will inculcate the ability to evaluate and use research
results with reasonable confidence.
(iii) When one knows how research is done, it enables him to make intelligent decisions
concerning problems faced in practical life at different points of time. Thus, the knowledge of
research methodology provides tools to look at things in life objectively.
(iv) The knowledge of methodology helps the consumer of research results to evaluate them and
enables him to take rational decisions.
5. Briefly outline research process.
The major steps in research are as follows:
Formulation of the Research Problem: This is the first stage of the research process. In this
stage the researcher singles out or identifies the problem he wants to study. This means that he
must decide the general area of interest or aspect of a subject matter that he would like to
inquire into. Essentially, two steps are involved in formulating the research problem:
Understanding the problem theoretically
Rephrasing it in analytical terms.
The best way of understanding the problem is to discuss it with one's own colleagues or with those
having expertise in the matter. In an academic institution the researcher can seek help from a
guide, who is usually an experienced person and has several research problems in mind. The guide
puts forth the problem in general terms, and it is up to the researcher to narrow it down and
phrase the problem in operational terms.
Extensive Literature Survey (Review): After selection of the research problem, the second step is
to review the literature connected with the problem. The available literature may help in the
research process. For this purpose academic journals, government reports, etc. must be studied.
Development of Working Hypothesis: In this stage the researcher states in clear terms the working
hypothesis or hypotheses. A working hypothesis is a tentative assumption made in order to draw
out and test its logical or empirical consequences. A hypothesis narrows down the area of
research and keeps the researcher on the right path.
Preparing the Research Design: In this stage the researcher prepares a research design, i.e. the
conceptual structure within which the research will be conducted. It contains:
Methodology of the research work
Sampling plan
Tools of gathering data
Geographical area to be covered
Scope of the study
Operational definition of the study
Conceptual model of study
Reference period
Budget
Determination of Sampling Design: In this stage the researcher decides the way of selecting a
sample, which is popularly known as the sample design. It is a definite plan, determined before
any data are actually collected, for obtaining a sample from a given population. Samples can be
either probability samples or non-probability samples. With probability samples each element
has a known probability of being included in the sample, but non-probability samples do not
allow the researcher to determine this probability.
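The difference between the two kinds of sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 200 respondents and all function names are hypothetical, and simple random sampling stands in for probability sampling in general.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Probability sample: every element has the same known chance (n/N) of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def inclusion_probability(population_size, n):
    """Chance that any one element is included under simple random sampling."""
    return n / population_size

def convenience_sample(population, n):
    """Non-probability sample: take whoever is easiest to reach (here, the first n).
    In real fieldwork the chance of any element being chosen is unknown."""
    return population[:n]

# Hypothetical sampling frame of 200 respondents.
population = [f"respondent_{i:03d}" for i in range(1, 201)]

srs = simple_random_sample(population, 20, seed=42)
print(len(srs), inclusion_probability(len(population), 20))
print(convenience_sample(population, 3))
```

Because the inclusion probability is known for the random sample, its results can be weighted and generalized to the population; no such calculation is possible for the convenience sample.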
Collection of Data: In this stage the researcher collects data. Data can be collected in several
ways, e.g. by survey, observation, interview or experiment. The researcher should select
a method of data collection taking into consideration the nature of the investigation,
the objectives and scope of the inquiry, financial resources, available time and the desired
degree of accuracy.
Analysis of Data: The analysis of data requires a number of closely related operations, such as
the establishment of categories, the application of these categories to raw data through coding,
tabulation and then drawing statistical inferences. The researcher classifies the raw data into
some purposeful and usable categories. Coding is usually done at this stage; through
it the categories of data are transformed into symbols that may be tabulated and counted.
Editing is the procedure that improves the quality of the data for coding. Tabulation is a part of
the technical procedure wherein classified data are put in the form of tables. Analysis work after
tabulation is generally based on the computation of various coefficients and measures used to
obtain results.
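As a rough sketch of how editing, coding and tabulation fit together, consider the Python example below. The survey answers, the codebook and the function names are all invented for illustration; real studies would define their own categories and codes.

```python
from collections import Counter

# Hypothetical codebook: each response category is mapped to a numeric code.
CODEBOOK = {"agree": 1, "neutral": 2, "disagree": 3}

def edit_responses(raw):
    """Editing: normalise case and whitespace, and drop blank answers before coding."""
    cleaned = [r.strip().lower() for r in raw]
    return [r for r in cleaned if r]

def code_responses(responses, codebook):
    """Coding: replace each category with its symbol (numeric code)."""
    return [codebook[r] for r in responses]

def tabulate(codes):
    """Tabulation: count each code and return sorted (code, frequency) rows."""
    return sorted(Counter(codes).items())

raw = ["Agree", " agree", "Neutral", "", "Disagree", "agree "]
codes = code_responses(edit_responses(raw), CODEBOOK)
table = tabulate(codes)
for code, freq in table:
    print(code, freq)
```

The resulting frequency table is the kind of input on which coefficients and other measures are then computed.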
Hypothesis Testing: The hypothesis may be tested through the use of one or more tests such as
chi-square test, t-test, F-test depending upon the nature and objectives of the research inquiry.
Hypothesis testing will result in either accepting the hypothesis or in rejecting it.
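To make the idea of a t-test concrete, here is a minimal Python sketch that computes Student's two-sample t statistic by hand. The exam scores are invented, the function name is hypothetical, and equal variances in the two groups are assumed. The critical value 2.306 is the standard two-tailed 5% value for 8 degrees of freedom taken from a t table.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(a, b):
    """Student's two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance combines the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

# Invented exam scores for two groups of five students each.
group_a = [78, 85, 90, 72, 88]
group_b = [70, 65, 80, 75, 68]

t, df = two_sample_t(group_a, group_b)
CRITICAL_T_DF8 = 2.306  # two-tailed, 5% level, 8 degrees of freedom (from a t table)
reject_null = abs(t) > CRITICAL_T_DF8
print(f"t = {t:.3f}, df = {df}, reject null hypothesis: {reject_null}")
```

In practice a statistical package would also report a p-value; the sketch only shows the comparison of the computed statistic with a tabulated critical value.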
Generalization and Interpretation: In this stage the researcher arrives at generalizations, i.e.
builds a theory. As a matter of fact, the real value of research lies in its ability to arrive at
certain generalizations.
Preparation of the Research Report: Finally, the researcher has to prepare the report of what
has been done by him. The report must be written with great care, keeping in view the
following:
Preliminary Body
The Main Text
The End Matter
Preliminary Body: It contains:
Title page
Researcher's declaration
The certificate of the research supervisor
Acknowledgement
Table of contents
List of tables
List of graphs and charts
Main Text:
(i) Theoretical background of the topic
(ii) Statement of the problem
(iii) Review of literature
(iv) The scope of the study
(v) The objectives of the study
(vi) Hypothesis to be tested
(vii) Definition of the concepts
(viii) Model, if any
The design of the study:
(i) Methodology
(ii) Sources of data
(iii) Sampling plan
(iv) Data collection instruments
(v) Field work
(vi) Data processing and analysis
End Matter (Closing the Report) : This will contain bibliography, references, appendices, index
and maps or charts for illustrations.
6. Highlight the different research approaches.
There are two main approaches to research, namely the quantitative approach and the qualitative
approach. The quantitative approach involves the collection of quantitative data, which are
subjected to rigorous quantitative analysis in a formal & rigid manner. This approach further
includes experimental, inferential & simulation approaches to research. Examples where the
quantitative approach is used are research to predict product demand with high precision, and
research on layout design aimed at minimizing the material handling cost & increasing equipment
utilization.
The qualitative approach uses the method of subjective assessment of opinions, behaviors &
attitudes. Research in such a situation is a function of the researcher's impressions and insights.
Usually this approach uses techniques like in-depth interviews, focus group interviews &
projective techniques. An example where the qualitative approach may be used is research on the
impulse buying behavior of consumers at grocery stores.
7. Discuss the qualities of a researcher.
To be a good researcher first requires the intention to be involved in research and, immediately
thereafter, a dedicated interest in doing the best research possible. Some of the qualities
are:
1. A good researcher manifests thirst for new information.
A good researcher shows an open mind about things. He does not just take things as they are
but explores new ground. He adopts the philosophy of thinking outside the box, leaving
the conventional for something innovative. A good researcher treads the unknown frontier.
Evidence of this thirst for new information is seen in people who do not stop
learning: those who keep an open mind to new possibilities, even when
everything appears to have been discovered or studied, or options exhausted.
Two hundred years ago, did anyone ever think that man could go to the moon, or explore the
depths of the sea? Or tap on the keys of a cell phone to communicate with another person so
far away?
2. A good researcher has a keen sense of things around him.
Keenness is a quality developed through an observant attitude. A good researcher sees
something more in a common occurrence around him, and he sees it quickly.
He can see a wiggling worm inside a flower, or the beautiful color combinations of a wild plant,
or simply notice the small fly in the burger.
3. A good researcher likes to reflect or think about the things he encounters.

Researchers who pause and reflect on the knowledge that they have gained, either formally in
school or through their experience, gain insights. Insights are creative thoughts that make one
nod his head and say, 'Aha, this is something I have been looking for!' An original idea is born.
4. A good researcher must be intelligent enough to express his ideas.
How can you express your thoughts if you cannot write? The point here is that a good researcher
must be adept in the written language.
How can people understand your point when you are the only one who can understand what
you have written?
The ability to express ideas is a quality that appears to reside in gifted individuals. But if you
recognize your weakness in this realm, why not seek help from someone who has it? After all, ideas
are more important; but of course, it is better if you present them in such a way that others
understand well what you want to say.
5. A good researcher applies a systematic approach in assessing situations.
Research requires systematic and objective thinking to arrive at something. Logical reasoning,
therefore, is applied by a good researcher.
He can analyze things; that is, he can break down a complex situation into manageable bits
that he can focus his attention on.
8. Explain the different types of research.
Types of research can be classified in many different ways. Some major ways of classifying
research include the following:
Descriptive versus Analytical Research
Applied versus Fundamental Research
Qualitative versus Quantitative Research
Conceptual versus Empirical Research
Descriptive research concentrates on finding facts to ascertain the nature of something as it
exists. The term ex post facto research is quite often used for descriptive research in social
sciences & business research. In descriptive research the researcher has no control over the
variables. S/he can only observe & report what is happening or what has happened. Examples of
such research include examining phenomena such as consumer preferences, frequency of
purchases, etc. In contrast, analytical research is concerned with determining the validity of
hypotheses based on analysis of the facts collected.
Applied research is carried out to find answers to practical problems to be solved and as an aid in
decision making in different areas including product design, process design and policy making.
Fundamental (basic) research is carried out to generalize & formulate a theory, rather than with
the intention of using the research findings for any immediate practical application. Thus, while
the principal objective of applied research is to find a solution to some pressing practical
problem, the objective of fundamental research is to find information which can add to the
already existing body of scientific knowledge.
Qualitative research studies those aspects of the research subject which are not quantifiable,
and hence not subject to measurement and quantitative analysis. In contrast, quantitative
research makes substantial use of measurements and quantitative analysis techniques.
Conceptual research involves the investigation of thoughts and ideas, and developing new ideas or
interpreting old ones based on logical reasoning. In contrast, empirical research is based on
firm, verifiable data collected either by observation of facts under natural conditions or
through experimentation. In this type of research the researcher first formulates a working
hypothesis and then gathers sufficient facts to prove or disprove the stated hypothesis.
9. What is a research problem?
A research problem is the topic we would like to address, investigate, or study, whether
descriptively or experimentally. It is the focus or reason for engaging in our research. By defining
a research problem, a researcher wishes to express the problem as well as the limits within
which it is to be studied.
A research problem can belong to one of the following two categories: the
category in which there are relationships between various variables, or the
other category, which relates to states of nature.
Research problems are questions that indicate gaps in the scope or the certainty of our
knowledge. They point either to problematic phenomena, observed events that are puzzling in
terms of our currently accepted ideas, or to problematic theories, current ideas that are
challenged by new hypotheses.
In general, a research problem refers to an unanswered question that a researcher might
encounter in the context of either a theoretical or practical situation, which s/he would like to
answer or find a solution to.
The selection of one appropriate researchable problem out of the identified problems requires
evaluation of those alternatives against certain criteria, which may be grouped into:
Internal Criteria
Internal Criteria consist of:
1. Researcher's interest:
The problem should interest the researcher and be a challenge to him. Without interest and
curiosity, he may not develop sustained perseverance. Interest in a problem depends upon the
researcher's educational background, experience, outlook and sensitivity.
2. Researcher's own resources:
In the case of research to be done by a researcher on his own, consideration of his own
financial resources is pertinent. If the research is beyond his means, he will not be able to
complete the work unless he gets some external financial support. Time is an even more important
resource than finance. Research is a time-consuming process; hence time should be properly
utilized.
3. Researcher's competence:
A mere interest in a problem will not do. The researcher must be competent to plan and carry
out a study of the problem. He must possess adequate knowledge of the subject matter,
relevant methodology and statistical procedures.
External Criteria
1. Research-ability of the problem:

The problem should be researchable, i.e., amenable to finding answers to the questions
involved in it through the scientific method.
2. Novelty of the problem:

The problem must have novelty. There is no use in wasting one's time and energy on a problem
already studied thoroughly by others.
3. Importance and urgency:

Problems requiring investigation are unlimited, but available research efforts are very much
limited. Therefore, in selecting problems for research, their relative
importance and significance should be considered.
4. Facilities: Research requires certain facilities, such as a well-equipped library, suitable
and competent guidance, data analysis facilities, etc. Hence the availability of the facilities
relevant to the problem must be considered.
5. Feasibility: A problem may be a new one and also important, but if research on it is not
feasible, it cannot be selected.
6. Usefulness and social relevance:

Above all, the study of the problem should make a significant contribution to the concerned
body of knowledge or to the solution of some significant practical problem. It should be socially
relevant.
Process in defining a research problem:
Following is the procedure or the main steps in defining the research problem :
Define the Problem in a General Way: To begin with, the problem must be expressed in a broad
general way, keeping in view either some practical problem or some scientific or intellectual
interest. The investigator can himself express the problem or he can seek the advice of the guide
or the subject specialist in achieving this activity. Often, the guide puts forth the problem
generally, and it is then up to the researcher to narrow it down and phrase the issue in
operational terms.
Understanding the Nature of the Problem: The simplest way to understand the problem is to talk
about it with the people who first raised it, to find out how the problem initially came into
being and with what objectives in view. If the researcher has stated the problem himself, he
should think about all those factors that prompted him to make a general statement with regard
to the problem.
Survey the Available Literature: You will need to review and survey all the available literature
on the research area prior to defining the problem. It will help you to examine
newer dimensions in that particular area and will lead to the development of knowledge. It means
that you should be well conversant with the relevant theories in the field, reports and records,
as well as all other related literature. Studies on related problems are helpful for suggesting
the kind of challenges that may be experienced in the study, as well as its possible analytical
weak points.
Discuss to Get Ideas: You can get various new ideas through discussions. Hence, you must discuss
the problem with your fellow workers and others who have ample experience in the same area
or in working on comparable problems. This is usually referred to as an experience survey.
People with abundant experience are in a position to enlighten you on various aspects of the
proposed study and their advice and comments are in most cases invaluable.
Redefining the Research Problem: Last but not least, you must sit down to rephrase the research
problem into a working proposition. Through rephrasing, you will put the research problem in as
specific terms as you can so that it may become operationally viable and may help in the
creation of working hypotheses.
10. Outline the features of research design.
Once the research problem has been formulated, the next step is to prepare the research design
of the project. The research design refers to the overall strategy that a researcher chooses to
integrate the different components of the study in a coherent and logical way. A research design
constitutes the blueprint for the collection, measurement, and analysis of data.
The function of a research design is to ensure that the evidence obtained enables us to answer
the initial question as unambiguously as possible. Obtaining relevant evidence entails specifying
the type of evidence needed to answer the research question, to test a theory, to evaluate a
programme or to accurately describe some phenomenon. In other words, when designing
research we need to ask: given this research question (or theory), what type of evidence is
needed to answer the question (or test the theory) in a convincing way?
The research design highlights decisions which include:

a. Nature of the study
b. Purpose of the study
c. Location where the study would be conducted
d. Nature of data required
e. From where the required data can be collected
f. What time period the study would cover
g. The type of sample design that will be used
h. The techniques of data collection that would be used
i. The methods of data analysis that would be adopted
j. The manner in which the report would be prepared
The research design may be divided into the following:
I. The Sampling design that deals with the method of selecting items to be observed for
the selected study.
II. The observational design that relates to the conditions under which the observations are
to be made
III. The statistical design that concerns the question of how many items are to be
observed and how the information & data gathered are to be analyzed
IV. The operational design that deals with the techniques by which the procedures specified
in the sampling, statistical and observational designs can be carried out.
Features of a Good Research design
Generally, a good research design minimizes bias and maximizes the reliability of the data
collected and analyzed. The design which gives the smallest experimental error is regarded as
the best design in scientific investigation. Similarly, a design which yields maximum information
and provides an opportunity for considering different aspects of a problem is considered to be
the most appropriate design. A good research design should satisfy the following four conditions,
namely objectivity, reliability, validity and generalizability of the findings.
1. Objectivity: It refers to the findings related to the method of data collection and scoring of
the responses. The research design should use measuring instruments which are fairly
objective, so that every observer or judge scoring the performance gives the
same report. In other words, the objectivity of the procedure may be judged by the degree of
agreement between the final scores assigned to different individuals by more than one
independent observer. This ensures the objectivity of the collected data, which will then be
capable of analysis and of supporting generalizations.
2. Reliability: Reliability refers to consistency throughout a series of measurements. For
example, if a respondent gives a response to a particular item, he is expected to give the same
response to that item even if he is asked repeatedly. If he changes his response to the same
item, the consistency is lost. So the researcher should frame the items in a questionnaire in
such a way that they provide consistency or reliability.
3. Validity: Any measuring device or instrument is said to be valid when it measures what it is
expected to measure. For example, an intelligence test conducted for measuring I.Q. should
measure only intelligence and nothing else, and the questionnaire should be framed
accordingly.
4. Generalizability: It means how well the data collected from the sample can be utilized for
drawing generalizations applicable to the larger group from which the sample is drawn. Thus a
research design helps an investigator to generalize his findings, provided he has taken due care
in defining the population, selecting the sample, choosing appropriate statistical analysis, etc.
while preparing the research design.
Thus a good research design is one which is methodically prepared and ensures that:
a) The measuring instrument can yield objective, reliable and valid data.
b) The population is clearly defined.
c) The most appropriate technique of sample selection is used to form an appropriate sample.
d) Appropriate statistical analysis is carried out, and
e) The findings of the study are capable of generalization.
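Two of the conditions above, objectivity and reliability, can be checked numerically. The Python sketch below is one possible way of doing so, not a prescribed method: it measures objectivity as the share of items on which two independent observers agree, and reliability as the correlation between two administrations of the same items (a test-retest check). All scores and function names are invented for illustration.

```python
from math import sqrt
from statistics import mean

def percent_agreement(rater_a, rater_b):
    """Objectivity: share of items on which two independent observers give the same score."""
    matches = sum(1 for x, y in zip(rater_a, rater_b) if x == y)
    return matches / len(rater_a)

def pearson_r(x, y):
    """Reliability (test-retest): Pearson correlation between two sets of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Invented scores given by two observers to the same five performances.
observer_1 = [3, 4, 2, 5, 4]
observer_2 = [3, 4, 3, 5, 4]
agreement = percent_agreement(observer_1, observer_2)

# Invented scores of five respondents on the same questionnaire, asked twice.
first_round = [10, 12, 15, 11, 14]
second_round = [11, 12, 14, 10, 15]
retest_r = pearson_r(first_round, second_round)

print(f"agreement = {agreement:.2f}, test-retest r = {retest_r:.2f}")
```

Values close to 1 on both measures suggest that the instrument is being scored objectively and answered consistently; what counts as acceptable depends on the field and the instrument.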
11. Explain the significance of research design.
Research design is significant because it allows for the smooth sailing of the various
research operations, thus making research as efficient as possible, producing maximum
information with minimal expenditure of effort, time and money. Just as for better, economical and
attractive construction of a home, we require a blueprint (or what is typically known as the map
of the home) well planned and prepared by an expert architect, in the same way we require a
design or a plan in advance of data collection and analysis for our research study. It means
advance planning of the techniques to be implemented for accumulating the appropriate data
and the strategies to be employed in their analysis, keeping in view the purpose of the research
and the availability of staff, time and money. Preparation of the design must be carried out
meticulously, as any error in it may upset the complete project. Research design, in fact, has a
great bearing on the reliability of the results achieved and as such constitutes
the firm base of the entire edifice of the research work.
12. What is a case study?

A case study is an in-depth study of a particular situation rather than a sweeping statistical
survey. It is a method used to narrow down a very broad field of research into one easily
researchable topic.
The case study has been especially used in social science, psychology, anthropology and ecology.
This method of study is especially useful for trying to test theoretical models by using them in
real world situations. For example, if an anthropologist were to live amongst a remote tribe,
whilst their observations might produce no quantitative data, they are still useful to science.
Some argue that because a case study covers such a narrow field, its results cannot be
extrapolated to fit an entire question and that they show only one narrow example. On the other
hand, it is argued that a case study provides more realistic responses than a purely statistical
survey.
The truth probably lies between the two and it is probably best to try and synergize the two
approaches. It is valid to conduct case studies but they should be tied in with more general
statistical processes.
For example, a statistical survey might show how much time people spend talking on mobile
phones, but it is case studies of a narrow group that will determine why this is so.
13. Discuss the criteria for evaluating case study.
John Dollard (Dollard 1935) specified seven criteria for evaluating the adequacy of a case or life
history in the context of social research. They are as follows:
(i) The subject being studied must be viewed as a specimen in a cultural set up. That is, the case
selected from its total context for the purpose of study should be considered a member of the
particular cultural group or community. The scrutiny of the life history of the individual must be
carried out with a view to identify the community values, standards and shared ways of life.
(ii) The organic motors of action should be socially relevant. This is to say that the actions of
the individual cases should be viewed as a series of reactions to social stimuli or situations.
Put simply, the social meaning of behaviour should be taken into consideration.
(iii) The crucial role of the family-group in transmitting the culture should be recognized. This
means that as the individual is a member of a family, the role of the family in shaping his/her
behaviour should never be ignored.
(iv) The specific method of conversion of organic material into social behaviour should be clearly
demonstrated. For instance, case histories that discuss in detail how a basically biological
organism, that is man, gradually transforms into a social person are particularly important.
(v) The constant transformation of character of experience from childhood to adulthood should
be emphasised. That is, the life-history should portray the inter-relationship between the
individual's various experiences during his/her life span. Such a study provides a comprehensive
understanding of an individual's life as a continuum.
(vi) The 'social situation' that contributed to the individual's gradual transformation should be
carefully and continuously specified as a factor. One of the crucial criteria for a life history
is that an individual's life should be depicted as evolving in the context of a specific social
situation and partially caused by it.
(vii) The life-history details themselves should be organized according to some conceptual
framework, which in turn would facilitate their generalizations at higher levels.
These criteria discussed by Dollard emphasise the specific link of co-ordinated, related,
continuous and configured experience in a cultural pattern that motivated the social and
personal behaviour.
14. Define hypothesis.
A research hypothesis is the statement created by researchers when they speculate upon the
outcome of a research or experiment. It is a proposed answer to a question or problem that can
be verified or rejected through testing. A hypothesis statement is typically an educated guess as
to the relationship between variables, and serves as the basis for an experiment to test whether
the relationship holds true.
A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way
to structure your hypothesis is to describe what will happen to the dependent variable if you
make changes to the independent variable. For example, a hypothesis statement may be "Students
who eat breakfast will perform better on a math exam than students who do not eat breakfast."
15. What are the characteristic features of a hypothesis?
A hypothesis must possess the following characteristics:

(i) Hypothesis should be clear and precise. If the hypothesis is not clear and precise, the
inferences drawn on its basis cannot be taken as reliable.
(ii) Hypothesis should be capable of being tested. Researcher may do some prior study in
order to make hypothesis a testable one. A hypothesis is testable if other deductions can be
made from it which, in turn, can be confirmed or disproved by observation.
(iii) Hypothesis should state the relationship between variables, if it happens to be a relational
hypothesis.
(iv) Hypothesis should be limited in scope and must be specific. A researcher must remember
that narrower hypotheses are generally more testable and should develop such hypotheses.
(v) Hypothesis should be stated, as far as possible, in the simplest terms so that it
is easily understandable by all concerned. But one must remember that the simplicity of a
hypothesis has nothing to do with its significance.
(vi) Hypothesis should be consistent with most known facts, i.e., it must be consistent with a
substantial body of established facts. In other words, it should be one which judges accept as
being the most likely.
(vii) Hypothesis should be amenable to testing within a reasonable time. One should not use
even an excellent hypothesis if it cannot be tested in reasonable time, for one cannot
spend a life-time collecting data to test it.
(viii) Hypothesis must explain the facts that gave rise to the need for explanation. This means
that by using the hypothesis plus other known and accepted generalizations, one should be able
to deduce the original problem condition. Thus hypothesis must actually explain what it claims
to explain; it should have empirical reference.
16. Distinguish between null & alternative hypothesis.
A null hypothesis is a statistical hypothesis and is the default or original hypothesis, while an
alternative hypothesis is any hypothesis other than the null. If the evidence leads to rejection of
the null hypothesis, the alternative hypothesis is accepted in its place. H0 denotes the null
hypothesis while H1 denotes the alternative hypothesis.
Research studies and testing usually formulate two hypotheses. One states that there is no effect
or relationship, while the other states the effect or relationship the researcher predicts. For
example, if you predict that A is related to B, that prediction is the alternative hypothesis. The
only other possible outcome, that they are not related, is the null hypothesis.
In short, the null hypothesis is the default statement that is assumed to hold until the data
indicate otherwise, while the alternative hypothesis covers the outcomes that contradict it. Both
the null and alternative hypotheses are necessary in statistical hypothesis testing in scientific,
medical, and other research studies.
An example of the difference between a null hypothesis and an alternative hypothesis is in the
legal system. The original hypothesis is that the defendant is innocent until he is proven guilty.
His innocence is the null hypothesis while his guilt is the alternative hypothesis.
Another example: suppose you predict that children who eat oily fish for a period of time will
show a higher IQ. Your alternative hypothesis, H1, would be:
Children who eat oily fish for six months will show a higher IQ increase than children who do
not.
Therefore, your null hypothesis, H0, would be:
Children who eat oily fish for six months do not show a higher IQ increase than children who do
not.
Alternative hypothesis is usually the one which a researcher wishes to prove, whereas the null
hypothesis is the one which he/she wishes to disprove. Thus, a null hypothesis is usually the one
which a researcher tries to reject, while an alternative hypothesis is the one that represents all
other possibilities.
17. Differentiate Type I & Type II error.
When a researcher performs a hypothesis test, two types of errors are possible: Type I and Type II.
He/she may reject H0 when it is true, or accept H0 when it is not true. The former is called a
Type I error and the latter a Type II error. In other words, a Type I error is the
rejection of a hypothesis that should have been accepted, while a Type II error is the
acceptance of a hypothesis that should have been rejected. The probability of a Type I error is
denoted by α (alpha) and is known as the α error, while the probability of a Type II error is
usually denoted by β (beta) and is known as the β error.
To understand the interrelationship between type I and type II error, and to determine which
error has more severe consequences for your situation, consider the following example.
A medical researcher wants to compare the effectiveness of two medications. The null and
alternative hypotheses are:
Null hypothesis (H0): μ1 = μ2
The two medications are equally effective.
Alternative hypothesis (H1): μ1 ≠ μ2
The two medications are not equally effective.
A type I error occurs if the researcher rejects the null hypothesis and concludes that the two
medications are different when, in fact, they are not. If the medications have the same
effectiveness, the researcher may not consider this error too severe because the patients still
benefit from the same level of effectiveness regardless of which medicine they take. However, if
a type II error occurs, the researcher fails to reject the null hypothesis when it should be
rejected. That is, the researcher concludes that the medications are the same when, in fact, they
are different. This error is potentially life-threatening if the less-effective medication is sold to
the public instead of the more effective one.
As you conduct your hypothesis tests, consider the risks of making type I and type II errors. If the
consequences of making one type of error are more severe or costly than making the other type
of error, then choose a level of significance and a power for the test that will reflect the relative
severity of those consequences.
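The trade-off between the two errors can be seen with a small calculation. The sketch below is only an illustration: it assumes a one-tailed z-test of H0: μ = μ0 against H1: μ > μ0 with a known population standard deviation, and the function name error_rates and all the numbers are hypothetical, not taken from the example above.

```python
from math import sqrt
from statistics import NormalDist

def error_rates(mu0, mu_true, sigma, n, alpha):
    """Return (alpha, beta, power) for a one-tailed z-test of H0: mu = mu0 vs H1: mu > mu0."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # H0 is rejected when the sample mean exceeds this critical value.
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type II error: the sample mean stays below the critical value
    # even though the true mean is mu_true.
    beta = NormalDist(mu_true, se).cdf(critical)
    return alpha, beta, 1 - beta

# Hypothetical numbers: lowering alpha from 5% to 1% raises beta.
for a in (0.05, 0.01):
    alpha, beta, power = error_rates(mu0=100, mu_true=105, sigma=15, n=36, alpha=a)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={power:.3f}")
```

For a fixed sample size, making the test stricter against Type I error (smaller α) makes a Type II error more likely, which is why the relative severity of the two errors should guide the choice of significance level.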
18. How is hypothesis tested?
A hypothesis test is a statistical test that is used to determine whether there is enough evidence
in a sample of data to infer that a certain condition is true for the entire population.
A hypothesis test examines two opposing hypotheses about a population: the null hypothesis
and the alternative hypothesis.
The various steps involved in hypothesis testing are stated below:
(i) Making a formal statement: This step consists in making a formal statement of the null
hypothesis (H0) and also of the alternative hypothesis (Ha). This means that hypotheses should be
clearly stated, considering the nature of the research problem. For instance, Mr. Mohan of the
Civil Engineering Department wants to test the load bearing capacity of an old bridge, which
must be more than 10 tons. In that case he can state his hypotheses as under:
Null Hypothesis H0: μ = 10 tons
Alternative Hypothesis Ha: μ > 10 tons

The formulation of hypotheses is an important step, which must be accomplished with due care
in accordance with the object and nature of the problem under consideration. It also indicates
whether we should use a one-tailed test or a two-tailed test. If Ha is of the type greater than (or
of the type lesser than), we use a one-tailed test, but when Ha is of the type whether greater or
smaller, then we use a two-tailed test.
(ii) Selecting a significance level: The hypotheses are tested on a pre-determined level of
significance, which should be specified in advance. Generally, in practice, either the 5% level or
the 1% level is adopted for the purpose. The factors that affect the level of significance are:
(a) the magnitude of the difference between sample means
(b) the size of the samples
(c) the variability of measurements within samples
(d) whether the hypothesis is directional or non-directional (a directional hypothesis is one
which predicts the direction of the difference between, say, means).
In brief, the level of significance must be adequate in the context of the purpose and nature of
enquiry.
(iii) Deciding the distribution to use: After deciding the level of significance, the next step in
hypothesis testing is to determine the appropriate sampling distribution. The choice generally
remains between normal distribution and the t-distribution. The rules for selecting the correct
distribution are similar to those that we have stated earlier in the context of estimation.
(iv) Selecting a random sample and computing an appropriate value: Another step is to select a
random sample(s) and compute an appropriate value from the sample data concerning the test
statistic utilizing the relevant distribution. In other words, draw a sample to furnish empirical
data.
(v) Calculation of the probability: One has then to calculate the probability that the sample result
would diverge as widely as it has from expectations, if the null hypothesis were in fact true.
(vi) Comparing the probability: Yet another step consists in comparing the probability thus
calculated with the specified value for α, the significance level. If the calculated probability is
equal to or smaller than the α value in case of a one-tailed test (and α/2 in case of a two-tailed test),
then reject the null hypothesis (i.e., accept the alternative hypothesis), but if the calculated
probability is greater, then accept the null hypothesis. In case we reject H0, we run a risk of (at
most the level of significance) committing an error of Type I, but if we accept H0, then we run
some risk (the size of which cannot be specified as long as H0 happens to be vague rather
than specific) of committing an error of Type II.
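The steps above can be sketched for the bridge example as a one-tailed z-test. This is only an illustration under stated assumptions: the load measurements are made-up numbers, the population standard deviation (0.6 tons) is assumed to be known so that the normal distribution applies, and the function name one_tailed_z_test is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_tailed_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mu = mu0 against Ha: mu > mu0 when sigma is assumed known."""
    n = len(sample)
    # Step (iv): compute the test statistic from the sample data.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Step (v): probability of a result at least this extreme if H0 is true.
    p_value = 1 - NormalDist().cdf(z)
    # Step (vi): compare the probability with the significance level.
    reject = p_value <= alpha
    return z, p_value, reject

# Hypothetical load-bearing measurements in tons (not real data).
loads = [10.4, 10.9, 10.2, 11.1, 10.6, 10.8, 10.3, 10.7, 10.5]
z, p, reject = one_tailed_z_test(loads, mu0=10, sigma=0.6, alpha=0.05)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

With a small sample and an unknown standard deviation, the t-distribution would normally be used instead, as step (iii) notes.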
19. Define the concept of sampling design.
Ideally, research would collect information from every single member of the population that you
are studying. However, most of the time that would take too long and so, the researcher has to
select a suitable sample: a subset of the population.
Sampling design refers to the technique or the procedure the researcher adopts for selecting
items for the sample from the population or universe.
There are a variety of ways to select your sample, and to make sure that it gives you results that
will be reliable and credible. The idea behind selecting a sample is to be able to generalize the
research findings to the whole population, which means that the sample must be:
Representative of the population. In other words, it should contain similar proportions of
subgroups as the whole population, and not exclude any particular groups, either by method of
sampling or by design, or by who chooses to respond.
Large enough to give a researcher enough information to avoid errors. It does not need to be a
specific proportion of your population, but it does need to be at least a certain size so that the
researcher feels confident that the findings are likely to be broadly correct.
If the sample is not representative, it may introduce bias into the study. If it is not large enough,
the study will be imprecise.
However, if the relationship between sample and population is right, then strong conclusions can
be drawn about the nature of the population.
20. Describe the steps involved in sampling design.
The following are some of the important steps that one needs to follow when developing a
sample design:-
1. Define the universe or population of interest. The accuracy of the results in any study
depends on how clearly the universe or population of interest is defined. The universe can
be finite or infinite, depending on the number of items it contains.
2. Define the sampling unit within the population of interest. The sampling unit can be anything that
exists within the population of interest. For example, sampling unit may be a geographical
unit, or a construction unit or it may be an individual unit.
3. Prepare the list of all the items within the population of interest. It is from this list, which is
also called the source list or sampling frame, that we draw our sample. It is important to note
that our sampling frame should be highly representative of the population of interest.
4. Determine the sample size. This is the most critical stage of the sample design process
because the sample size should be neither excessively large nor too small. It is
desired that the sample size should be optimum, representative of the
population, and able to give reliable results. Population variance, population size, parameters
of interest, and budgetary constraints are some of the factors that impact the sample size.
5. Decide about the technique of sampling to be used. There are many sampling techniques
out of which the researcher has to choose the one which gives the lowest sampling error, given
the sample size and budgetary constraints.
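Step 4 can be illustrated with a short calculation. These notes do not give a formula, so the sketch below assumes a common textbook one for estimating a mean: n0 = (z * sigma / e)^2, optionally reduced by a finite population correction. The function name sample_size_for_mean and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95, population=None):
    """Sample size needed to estimate a mean within +/- margin (assumed textbook formula)."""
    # z-value for a two-sided confidence level, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z * sigma / margin) ** 2
    if population is not None:
        # Finite population correction: a smaller universe needs fewer units.
        n0 = n0 / (1 + (n0 - 1) / population)
    return ceil(n0)

# Hypothetical inputs: population standard deviation 10, tolerated error 2.
print(sample_size_for_mean(sigma=10, margin=2))                  # infinite universe
print(sample_size_for_mean(sigma=10, margin=2, population=500))  # finite universe of 500
```

The sketch shows how population variance, population size and the desired precision enter the decision. Budgetary constraints would then cap whatever number the formula suggests.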
21. Discuss the criteria for selecting a sampling procedure.
Two major costs are involved in a sampling analysis, viz., the cost of collecting the data
and the cost of an incorrect inference resulting from the data. There are two causes of incorrect
inferences viz., systematic bias and sampling error. A systematic bias results from errors in the
sampling procedures, and it cannot be reduced or eliminated by increasing the sample size. At
best the causes responsible for these errors can be detected and corrected. Usually a systematic
bias is the result of one or more of the following factors:
a. Inappropriate sampling frame: If the sampling frame is inappropriate, i.e., a biased
representation of the universe, it will result in a systematic bias.
b. Defective measuring device: If the measuring device is constantly in error, it will result in
systematic bias. In survey work, systematic bias can result if the questionnaire or the
interviewer is biased. Similarly, if the physical measuring device is defective there will be
systematic bias in the data collected through such a measuring device.
c. Non-respondents: If we are unable to sample all the individuals initially included in the
sample, there may arise a systematic bias. The reason is that in such a situation the
likelihood of establishing contact or receiving a response from an individual is often
correlated with the measure of what is to be estimated.
d. Indeterminacy principle: Sometimes we find that individuals act differently when kept
under observation than they do when kept in non-observed situations. For instance, if
workers are aware that somebody is observing them in the course of a work study on the basis
of which the average length of time to complete a task will be determined and accordingly
the quota will be set for piece work, they generally tend to work slowly in comparison to the
speed with which they work if kept unobserved. Thus, the indeterminacy principle may also
be a cause of a systematic bias.
e. Natural bias in the reporting of data: Natural bias of respondents in the reporting of data is
often the cause of a systematic bias in many inquiries. There is usually a downward bias in
the income data collected by government taxation department, whereas we find an upward
bias in the income data collected by some social organisation. People in general understate
their incomes if asked about it for tax purposes, but they overstate the same if asked for
social status or their affluence. Generally in psychological surveys, people tend to give what
they think is the correct answer rather than revealing their true feelings.
Sampling errors are the random variations in the sample estimates around the true population
parameters. Since they occur randomly and are equally likely to be in either direction, their
nature happens to be of compensatory type and the expected value of such errors happens to
be equal to zero. Sampling error decreases with the increase in the size of the sample, and it
happens to be of a smaller magnitude in case of homogeneous population.
Sampling error can be measured for a given sample design and size. The measurement of
sampling error is usually called the precision of the sampling plan. If we increase the sample
size, the precision can be improved. But increasing the size of the sample has its own limitations
viz., a large sized sample increases the cost of collecting data and also enhances the systematic
bias. Thus the effective way to increase precision is usually to select a better sampling design
which has a smaller sampling error for a given sample size at a given cost. In practice, however,
people prefer a less precise design because it is easier to adopt the same and also because of the
fact that systematic bias can be controlled in a better way in such a design.
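The link between sample size and precision can be made concrete with the standard error of a sample mean, sigma / sqrt(n). The sketch below assumes simple random sampling from a large population. The function name standard_error and the values of sigma are hypothetical.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean under simple random sampling."""
    return sigma / sqrt(n)

# Quadrupling the sample size only halves the sampling error.
for n in (25, 100, 400):
    print(n, standard_error(sigma=12, n=n))

# A more homogeneous population (smaller sigma) gives a smaller error
# for the same sample size.
print(standard_error(sigma=6, n=100))
```

Because the error shrinks only with the square root of n, very large samples buy little extra precision for their cost, which is why a better sampling design is usually preferred.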
22. Distinguish between probability & non-probability sampling.

The two main methods used in survey research are probability sampling and nonprobability
sampling. The big difference is that in probability sampling every member of the population has a
known, non-zero chance of being selected, and results are more likely to accurately reflect the
entire population. While it would
always be nice to have a probability-based sample, other factors need to be considered
(availability, cost, time, what you want to say about results). Some additional characteristics of
the two methods are listed below.
Probability Sampling
You have a complete sampling frame. You have contact information for the entire population.
You can select a random sample from your population. Since all persons (or units) have an
equal chance of being selected for your survey, you can randomly select participants without
missing entire portions of your audience.
You can generalize your results from a random sample. With this data collection method and a
decent response rate, you can extrapolate your results to the entire population.
Can be more expensive and time-consuming than convenience or purposive sampling.
Nonprobability Sampling
Used when there isn't an exhaustive population list available. Some units are unable to be
selected, therefore you have no way of knowing the size and effect of sampling error (missed
persons, unequal representation, etc.).
Not random.
Can be effective when trying to generate ideas and get feedback, but you cannot generalize
your results to an entire population with a high level of confidence. Quota samples (males and
females, etc.) are an example.
More convenient and less costly, but doesn't hold up to expectations of probability theory.
23. How is a random sample selected?
The process of selecting a random sample involves writing the name of each element of a finite
population on a slip of paper and putting them into a box or a bag. Then they have to be
thoroughly mixed and then the required number of slips for the sample should be picked one
after the other without replacement.
While doing this, it has to be ensured that in successive drawings each of the remaining
elements of the population has an equal chance of being chosen. This method would result in
the same probability for each possible sample.
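The slips-in-a-box procedure corresponds to drawing without replacement, which can be sketched with Python's standard random module. The population names and the helper simple_random_sample are hypothetical. The seed is fixed only so that the illustration is repeatable.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct elements so that every possible sample is equally likely."""
    rng = random.Random(seed)
    # random.sample draws without replacement, like picking slips one after another.
    return rng.sample(list(population), k)

# Hypothetical finite population of 20 named units.
names = [f"unit_{i}" for i in range(1, 21)]
sample = simple_random_sample(names, 5, seed=42)
print(sample)
```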
24. Explain complex random sampling designs.
Complex random sampling designs are probability sampling done with restricted sampling
techniques. They are also called mixed sampling designs as they tend to combine probability and
non-probability sampling procedures during sample selection.
Some of the popular complex random sampling designs are as follows:

(i) Systematic sampling: Researchers sometimes select every i-th item from a list; this is
known as systematic sampling. The first unit is chosen at random, and subsequent units
are selected at the same fixed interval.
(ii) Stratified sampling: In a very diverse universe, stratified sampling is used, where the population
is divided into several groups (strata) that are internally more similar, and then items are selected
from each stratum as a sample. The strata are formed based on members' shared attributes or
characteristics (e.g. sex, race, age). A random sample from each stratum is taken in a number
proportional to the stratum's size when compared to the population. These subsets of the strata
are then pooled to form a random sample.
(iii) Cluster sampling: In cluster sampling, the population is divided into a number of small
subdivisions that are similar to one another (clusters), and then some of these clusters
are randomly selected as the sample. Cluster sampling is highly economical. The difference between
stratified sampling and cluster sampling is that in stratified sampling a random sample is drawn
from each of the strata, whereas in cluster sampling only the selected clusters are studied (e.g.
cities, schools).
Cluster sampling is considered less precise than other methods of sampling. However, it may
save costs on obtaining a sample. Cluster sampling is a two-step sampling procedure. It may be
used when completing a list of the entire population is difficult. For example, it could be difficult
to construct the entire population of the customers of a grocery store to interview. However, a
person could create a random subset of stores, which is the first step in the process. The second
step is to interview a random sample of the customers of those stores. This is a simple manual
process that can save time and money.
(iv) Area sampling: In area sampling a large area is divided into smaller parts and then samples
are selected randomly. This is a type of cluster sampling where the cluster of units is based on
geographic area.
(v) Multi-stage sampling: Multi-stage sampling is a complex type of cluster sampling. It
is used in research where the entire universe is very large, for example an entire
country; the researcher selects samples at various levels. After selecting clusters
from the whole universe, the researcher then randomly selects elements from each cluster. This
type of sampling is cost effective and easy to administer.
(vi) Probability proportional to size (PPS) sampling: Sometimes cluster sampling units lack
an equal number of elements; in such cases the researcher uses a random selection process where
the probability of selection of each cluster is proportional to the size of the cluster. The actual
numbers selected are indicative of the clusters chosen and selected. PPS avoids
under-representation of any one group.
(vii) Sequential sampling: This is a complex sampling design where the size of the sample is not
fixed in advance but is determined according to the needs of the researcher. In this type of sampling
method, the researcher does his research on a particular sample and, if not satisfied, takes another
sample unit, and so on. The researcher keeps fine-tuning the experiment and decides only after
doing the experiment whether more samples are needed or not.
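Three of the designs above (systematic, stratified and cluster sampling) can be sketched in a few lines. This is an illustration under assumptions, not a definitive implementation: the function names, the population of 60 people and the four schools are all hypothetical, and the cluster function shows the one-stage form in which every unit of a selected cluster is kept.

```python
import random
from collections import defaultdict

def systematic_sample(items, interval, rng):
    """Random start within the first interval, then every `interval`-th item."""
    start = rng.randrange(interval)
    return items[start::interval]

def stratified_sample(items, stratum_of, fraction, rng):
    """Proportional allocation: draw the same fraction at random from each stratum."""
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """One-stage cluster sampling: pick whole clusters at random, keep all their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 60 people: 40 in stratum "F", 20 in stratum "M".
people = [{"id": i, "sex": "F" if i % 3 else "M"} for i in range(60)]
# Hypothetical clusters: four schools with ten pupil ids each.
schools = {f"school_{c}": list(range(c * 10, c * 10 + 10)) for c in range(4)}

print(systematic_sample(list(range(100)), 10, rng))
print(len(stratified_sample(people, lambda p: p["sex"], 0.1, rng)))
print(len(cluster_sample(schools, 2, rng)))
```

In the stratified draw every stratum contributes units in proportion to its size, whereas in the cluster draw the unselected schools contribute nothing at all, which is the distinction made in (iii).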
UNIT 2 : Data Collection and Sources of Data
1. Explain primary & secondary data and distinguish between them.
Primary data are fresh (new) information collected for the first time by the researcher
for a particular purpose. Secondary data, on the other hand, are information
already collected by others and later used by a researcher (or investigator)
to answer the questions in hand. Hence, they are also called second-hand data.
PRIMARY DATA | SECONDARY DATA
It is collected by the researcher or his agents. | Data collected by persons other than the researcher or his agents.
Primary data is original & unique information to the project. | May not be original or unique information.
Collected by surveys, observations or experiments. | Old archives, journals, Govt published data etc.
More time is required to collect primary data. | Less time consuming.
Cost involved is high. | Economical.
Most suitable to achieve the research objective. | May or may not be suitable.
The primary data collection is done to accomplish some fixed objective, and obtained with some focus in mind. | Secondary data collected are truly the work of someone else done for some other purposes. It is not focused to meet the objective of the researcher. As a result, it needs to be properly adjusted and arranged before making its actual use.
2. Explain the different methods of collecting primary data, their merits & demerits.
There are different methods of collecting primary data. Each method has its relative merits
and demerits. The investigator has to choose a particular method to collect the information.
The choice to a large extent depends on the preliminaries to data collection. Some of the
commonly used methods are discussed below.
1. Direct Personal observation:
This is a very general method of collecting primary data. Here the investigator directly
contacts the informants, solicits their cooperation and enumerates the data. The information
is collected by direct personal interviews.
The merit of this method is its simplicity. It is difficult neither for the enumerator nor for the
informants, because both are present at the spot of data collection. This method provides
the most accurate information as the investigator collects it personally. But as the
investigator alone is involved in the process, his personal bias may influence the accuracy of
the data. So it is necessary that the investigator should be honest, unbiased and
experienced. In such cases the data collected may be fairly accurate. However, the method is
quite costly and time-consuming. So the method should be used when the scope of enquiry
is small.
Merits. The advantages of personal interview are:
1. Response is more encouraging as most people are willing to supply information when
approached personally.
2. The information obtained by this method is likely to be more accurate because the
interviewer can clarify the doubts of the informants about certain questions and thus obtain
correct information. In case the interviewer apprehends that the informant is not giving
accurate information, he may cross-examine him and thereby try to obtain the information.
3. It is also possible through personal interview to collect supplementary information about
the informant's personal characteristics and environment, and such information often proves
very useful while interpreting results.
4. Questions about which the informant is likely to be sensitive can be
carefully combined with other questions by the interviewer. He can rephrase the questions
keeping in mind the informant's reactions. He can change the subject, if necessary, or
explain the survey problem further if it appears that the informant is not inclined to supply
any information. In other words, a delicate situation can usually be handled more effectively
by a personal interview than by other survey techniques.
5. The language of communication can be adjusted to the status and educational level of the
person interviewed, thus inconvenience and misinterpretation on the part of the informant
can be avoided.
Limitations. Important limitations of the personal interview method are:
1. It may be very costly where the number of persons to be interviewed is large and they are
spread over a wide area.
2. The chances of personal prejudice and bias are greater under this method as compared to
other methods.
3. The interviewers have to be thoroughly trained and supervised, otherwise they may not
be able to obtain the desired information. Untrained or poorly trained people may spoil the
entire work.
4. More time is required for collecting information by this method as compared to others.
This is because interviews can be held only at the convenience of the informants. Thus, if
information is required to be obtained from the working members of households, interviews
will have to be held in the evening or on weekends. Since only an hour or two can be used for
interviews in the evening, the work may have to be continued for a long time, or a large staff
may have to be employed involving huge expenditure.
2. Indirect Oral Interviews :
This is an indirect method of collecting primary data. Here information is not collected
directly from the source but by interviewing persons closely related with the problem. This
method is applied to apprehend culprits in cases of theft, murder etc. Information
relating to one's personal life, or which the informant hesitates to reveal, is better collected
by this method. Here the investigator prepares a small list of questions relating to the
enquiry. The answers (information) are collected by interviewing persons well connected
with the incident. The investigator should cross-examine the informants to get correct
information.
This method is time saving and involves relatively less cost. The accuracy of the information
largely depends upon the integrity of the investigator. It is desirable that the investigator
should be experienced and capable enough to inspire and create confidence in the
informant to collect accurate data.
3. Mailed Questionnaire method:
This is a very commonly used method of collecting primary data. Here information is
collected through a questionnaire. A questionnaire is a document prepared by the
investigator containing a set of questions. These questions relate to the problem of enquiry
directly or indirectly. Here first the questionnaires are mailed to the informants with a formal
request to answer the questions and send them back. For better response the investigator
should bear the postal charges. The questionnaire should carry a polite note explaining the
aims and objective of the enquiry, definition of various terms and concepts used there.
Besides this the investigator should ensure the secrecy of the information as well as the
name of the informants, if required.
Success of this method greatly depends upon the way in which the questionnaire is drafted.
So the investigator must be very careful while framing the questions. The questions should
be:
(i) Short and clear
(ii) Few in number
(iii) Simple and intelligible
(iv) Corroboratory in nature or there should be provision for cross check
(v) Impersonal, non-aggressive type
(vi) Simple alternative, multiple-choice or open-end type
(a) In the simple alternative question type, the respondent has to choose between
alternatives such as Yes or No, right or wrong etc.
(b) In the multiple choice type, the respondent has to answer from any of the given
alternatives.
(c) In the Open-end or free answer questions the respondents are given complete freedom
in answering the questions. The questions are like -
The questionnaire method is very economical in terms of time, energy and money. The
method is widely used when the scope of enquiry is large. Data collected by this method are
not affected by the personal bias of the investigator. However the accuracy of the
information depends on the cooperation and honesty of the informants. This method can be
used only if the informants are cooperative, conscious and educated. This limits the scope of
the method.
Merits.
1. This method of collecting data can be easily adopted where the field of investigation is
very vast and the informants are spread over a wide geographical area.
2. It is also relatively cheap and expeditious provided the informants respond in time.
3. On questions of personal nature or questions requiring reaction by the family, this method
is generally superior to either personal interviews or telephone method.
Limitations.
1. This method can be adopted only where the informants are literate people so that they
can understand written questions and send the answers in writing.
2. It involves some uncertainty about the response. Co-operation on the part of informants
may be difficult to presume.
3. The information supplied by the informants may not be correct and it may be difficult to
verify the accuracy.
4. Schedule Method:
In case the informants are largely uneducated and non-responsive, data cannot be collected
by the mailed questionnaire method. In such cases, the schedule method is used to collect data.
Here the questionnaires are sent through enumerators to collect information.
Enumerators are persons appointed by the investigator for the purpose. They directly meet
the informants with the questionnaire. They explain the scope and objective of the enquiry
to the informants and solicit their cooperation. The enumerators put the questions to the
informants, record their answers in the questionnaire and compile them. The success of
this method depends on the sincerity and efficiency of the enumerators. So the enumerator
should be sweet-tempered, good-natured, trained and well-behaved.
The schedule method is widely used in extensive studies. It gives fairly correct results as the
enumerators directly collect the information. The accuracy of the information depends upon
the honesty of the enumerators. They should be unbiased. This method is relatively more
costly and time-consuming than the mailed questionnaire method.
Merits. The main advantages of the method are:
1. It can be adopted in those cases where informants are illiterate.
2. There can be very little chance for non-response as the enumerators go personally to
obtain the information.
3. The information received is more reliable as the accuracy of statements can be checked by
supplementary questions wherever necessary.
Limitations.
1. Among the various methods of collecting primary data, this method is quite
costly as enumerators are generally paid persons.
2. The success of the method depends largely upon the training imparted to the
enumerators.
3. Skilled interviewing requires experience and training, but there is a tendency for
statisticians to neglect this extremely important part of the data collecting process. Without
good interviewing most of the information collected is of doubtful value.
4. The way in which the enumerators conduct the interview would affect the data collected.
When questions are asked by a number of different interviewers, it is possible that variations
in the personalities of the interviewers will cause variation in the answers obtained. This
variation will not be obvious. Hence every effort must be made to remove as much as
possible of the variation due to different interviewers.
5. From Local Agents:
Sometimes primary data are collected from local agents or correspondents. These agents are
appointed by the sponsoring authorities. They are well conversant with the local conditions
like language, communication, food habits, traditions etc. Being on the spot and well
acquainted with the nature of the enquiry they are capable of furnishing reliable
information.
The accuracy of the data collected by this method depends on the honesty and sincerity of
the agents, because they actually collect the information on the spot. Information from a
wide area can be collected by this method at less cost and in less time. The method is generally
used by government agencies, newspapers, periodicals etc. to collect data.
Information is like the raw material or input of an enquiry. The result of the enquiry basically
depends on the type of information used. Primary data can be collected by employing any of
the above methods. The investigator should make a rational choice of the methods to be
used for collecting data, because collection of data forms the beginning of the statistical
enquiry.
3. Explain the merits & demerits of different methods of collecting primary data.
As above.
4. Explain different sources of secondary data and the precautions in using secondary data.
Secondary data can be collected from a number of sources which can broadly be classified
into two categories.
i) Published sources
ii) Unpublished sources
Published Sources:
Mostly secondary data are collected from published sources. Some important sources of
published data are the following.
1. Published reports of Central and State Governments and local bodies.
2. Statistical abstracts, census reports and other reports published by different ministries of
the Government.
3. Official publications of the foreign Governments.
4. Reports and Publications of trade associations, chambers of commerce, financial
institutions etc.
5. Journals, Magazines and periodicals.
6. Periodic Publications of Government organizations like the Central Statistical Organization
(C.S.O.) and the National Sample Survey Organization (NSSO).
7. Reports submitted by Economists, Research Scholars, Bureaus etc.
8. Published works of research institutions and Universities etc.
Unpublished Sources:
Statistical data can also be collected from various unpublished sources. Some of the
important unpublished sources from which secondary data can be collected are:
1. The research works carried out by scholars, teachers and professionals.
2. The records maintained by private firms and business enterprises. These firms may not
wish to publish the information, considering it a business secret.
3. Records and statistics maintained by various departments and offices of the Central and
State Governments, Corporations, Undertakings etc.
Secondary data are information that has already been collected. They might have been
collected for some specific purpose, so they must be used with caution. It is generally very
difficult to verify such information to find out inconsistencies, errors, omissions etc. Therefore
scrutiny of secondary data is essential, because the data might be inaccurate, unsuitable or
inadequate. Thus it is very risky to use statistics collected by other people unless they have
been thoroughly edited and found reliable, adequate and suitable for the purpose.
5. What is editing of secondary data? Why is it required?
Secondary data, whether published or otherwise, should be used with much caution because
they were originally collected by others for their own purposes, at different times and under
different situations, which may not suit the present investigation in all respects. There might
be many errors of omission, commission, compensation, and duplication in those data.
Therefore, before using such data they must be very carefully edited, or scrutinized, to
ensure that they are free from inaccuracy, inconsistency, inadequacy and unsuitability.
Before any secondary data are used, they should be strictly edited in the
light of the following tests:
1. Test of Reliability
While editing the secondary data, the editor must see that the data obtained are accurate
and reliable.
For testing the reliability of the data, the editor should make the following queries:
(i) Who collected the data?
(ii) From where were the data collected?
(iii) Is the compiler dependable in regard to honesty, integrity, experience and training?
(iv) Is the source of the data dependable in regard to accuracy, adequacy and consistency?
(v) What methods were employed in the primary collection of the data?
(vi) Are the methods of collection proper and dependable?
(vii) At what time were the data collected, and was it a normal time?
(viii) Was there any possibility of bias and prejudice creeping into the minds of the
compilers?
(ix) What degree of accuracy was fixed by the investigator, and was it achieved?
(x) Was the size of the sample adequate?
(xi) Was the sample selected at random, and was it adequate?
(xii) For what purpose were the data collected?
(xiii) What period is covered by the data, and how far is it relevant for the present study?
(xiv) What units of collection and measurement were employed? Were they clearly defined?
Are they suitable for the present purpose?
(xv) Were the editing, tabulation, and analysis of the data carefully and conscientiously done?
2. Test of Adequacy
While editing the secondary data it must be seen that they are adequate, or sufficient, for
the purpose of the enquiry. As pointed out earlier, too much data may prove to be
confusing and irrelevant. Similarly, too little data will not serve the purpose or give
the true picture of the problem under study. Therefore, the data must be adequate for the
purpose. Whether the data collected from the secondary sources are adequate or not can
be tested in the light of the following queries:
(i) What was the geographical area from which the data were collected?
(ii) Is the area of collection wider or narrower than the area covered under the present study?
For example, if the object is to measure the change in the general price level of India
through the construction of a wholesale price index number but the data collected relate
only to the cost of living of the people in a particular locality, they would not serve the purpose
on the ground of inadequacy.
(iii) What is the period covered by the data?
(iv) Is the period covered by the data commensurate with the period of the problem under
study?
(v) What was the degree of accuracy achieved with the collected data?
(vi) Is the degree of accuracy achieved with the data commensurate with the degree of
accuracy desired in the present enquiry?
For example, if a 95% degree of accuracy was achieved in the collected data, and a
99% degree of accuracy is desired in the present study, the data thus collected will not suit the
purpose of the present enquiry.
3. Test of Suitability
While editing the secondary data, it must be seen that the data collected are suitable for the
present study. If the data collected are not suitable, it will vitiate the whole purpose of the
enquiry and lead to erroneous conclusions. The suitability of the data can be tested in the
light of the following queries:
(i) What was the nature of the problem for which the data were collected? If the nature of
the problem under study does not resemble that of the problem for which the data
were originally collected, the same data will not be suitable for the investigation.
(ii) What was the object of the enquiry?
If the object of the past enquiry is completely different from that of the present enquiry, the
collected data will not suit the present purpose. Thus, if the object of the present
investigation is to study the trend in wholesale prices, but the data collected were for
studying retail prices, such data would be unsuitable for the purpose.
(iii) What was the scope of the enquiry?
If the geographical area covered by the collected data differs from the area of the present
enquiry, the data will not be suitable. For example, if the data were collected to study the
functioning of the non-banking financial institutes in New York, the said data will not be
suitable for studying the same problem with reference to the state of Washington.
(iv) Do the definitions given to the various terms and units used in the earlier investigation
remain the same as those under the present enquiry? If the definitions of the various terms
and the units used in the two enquiries are completely different, the data collected originally
will not suit the present enquiry. For instance, if the term "wage" under the present enquiry
relates to unskilled labour only but the collected data define it differently, the said
collected data will not suit the present enquiry.
(v) What was the time covered by the data in the earlier enquiry?
If the time covered by the data was radically different from the time required to be covered
by the data under the present enquiry, the data concerned will not be suitable for the
problem under study.
6. What are the different types of editing of primary data?
While editing primary data the following considerations need attention:
1. The data should be complete.
2. The data should be consistent.
3. The data should be accurate.
4. The data should be homogeneous.
1. Editing for completeness. The editor should see that each schedule and questionnaire is
complete in all respects, i.e., an answer to each and every question has been furnished. If some
questions have not been answered and those questions are of vital importance, the
informants should be contacted again either personally or through correspondence. It may
happen that in spite of best efforts a few questions remain unanswered. For such questions,
the editor should mark "No answer" in the space provided for answers, and if the questions
are of vital importance then the schedule or questionnaire should be dropped.
2. Editing for consistency. While editing the data for consistency, the editor should see that
the answers to questions are not contradictory in nature. If there are mutually contradictory
answers, he should try to obtain the correct answers either by referring back to the
questionnaire or by contacting, wherever possible, the informant in person. For example, if
amongst others, the replies to the questions (a) "Are you married?" and (b) "Mention the
number of children you have" are respectively "No" and "Three", then there is a contradiction
and it should be clarified.
3. Editing for accuracy. The reliability of conclusions depends basically on the correctness of
information. If the information supplied is wrong, conclusions can never be valid. It is,
therefore, necessary for the investigators to see that the information is accurate in all
respects. However, this is one of the most difficult tasks of the investigators. If the inaccuracy
is due to arithmetic errors, it can be easily detected and corrected. But if the cause of
inaccuracy is faulty information supplied, it may be difficult to verify it, e.g., information
relating to income, age, etc.
4. Editing for uniformity. By homogeneity we mean the condition in which all the questions
have been understood in the same sense. The investigators must check all the questions for
uniform interpretation. For example, as to the question of income, if some informants have
given monthly income, others annual income and still others weekly income or even daily
income, no comparison can be made. Similarly, if some persons have given the basic income
whereas others the total income, no comparison is possible. The investigators should check
that the information supplied by the various people is homogeneous and uniform.
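The completeness, consistency and uniformity checks described above can be sketched in code. This is only a minimal illustration, not a prescribed procedure: the field names (married, children, income), the sample record and the 365-day and 52-week conversion factors are assumptions made for the example.

```python
# Illustrative editing checks for primary data.
# Field names and conversion factors are assumptions, not part of the notes.
PERIODS_PER_YEAR = {"daily": 365, "weekly": 52, "monthly": 12, "annual": 1}


def unanswered(record, required):
    """Editing for completeness: list the required questions left blank."""
    return [q for q in required if record.get(q) in (None, "")]


def married_children_conflict(record):
    """Editing for consistency: flag 'not married' combined with children > 0.

    The function only flags the record; as the text says, the correct
    answer still has to be obtained from the questionnaire or the informant.
    """
    return record.get("married") == "No" and (record.get("children") or 0) > 0


def monthly_income(amount, period):
    """Editing for uniformity: express every reported income per month."""
    return amount * PERIODS_PER_YEAR[period] / 12


record = {"married": "No", "children": 3, "income": None}
print(unanswered(record, ["married", "children", "income"]))
print(married_children_conflict(record))
print(monthly_income(12000, "annual"))
```

Converting every income to one period before comparison is one simple way to make the replies homogeneous; whether basic or total income was reported still has to be settled with the informant.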
UNIT 2 : Questionnaire & Sampling
1. Explain questionnaire & examine its main characteristics.
A questionnaire is a list of research or survey questions asked of respondents and designed to
extract specific information. It serves four basic purposes: to (1) collect the appropriate data, (2)
make data comparable and amenable to analysis, (3) minimize bias in formulating and asking
questions, and (4) make questions engaging and varied.
The following are some principles to be followed for making the exercise successful (its main
characteristics):
1. Covering letter. The person conducting the survey must introduce himself and state the
objective of the survey. It is desirable to:
(i) Enclose a short letter. The letter should state in as few words as possible the purpose of
the survey and how the informant would tend to benefit from it.
(ii) Enclose a self-addressed stamped envelope for the respondent's convenience in returning the
questionnaire.
(iii) Assure the respondent that his answers will be kept in strictest confidence.
(iv) Promise the respondent that he will not be solicited after he fills up the questionnaire.
(v) If possible, offer special inducements (free gifts, concession coupons, etc.) to return the
questionnaire.
(vi) If the respondent is interested, promise him a copy of the results of the survey.
2. Number of questions should be small. The number of questions should be kept to the
minimum. The precise number of questions to be included would naturally depend on the object
and scope of the investigation. Fifteen to twenty-five may be regarded as a fair number. If a
lengthy questionnaire is unavoidable, it should preferably be divided into two or more parts.
It should be noted that there is an inverse relationship between the length of a questionnaire
and the rate of response to the survey. That is, the longer the questionnaire, the lower will be
the rate of response; the shorter the questionnaire, the higher will be the rate of response.
Therefore, each question must be clearly presented in as few words as possible and each
question should be deemed essential to the survey. In addition, questions must be free from
ambiguities.
3. Questions should be arranged logically. The questions must be arranged in a logical order so
that a natural and spontaneous reply to each is induced. They should not skip back and forth
from one topic to another. Thus, it is undesirable to ask a man how many children he has before
asking whether he is married or not. Similarly, it would be illogical to ask a man his income
before asking him whether he is employed or not. Thus the sequence of the questions should be
considered carefully in terms of the purpose of the study and the persons who will supply the
information. Questions relating to identification and description of the respondent should come
first, followed by major information questions. If opinions are requested, such questions should
usually be placed at the end of the list. Two questions worded differently may be included
on the same subject to provide a cross-check on important points.
4. Questions should be short and simple to understand. Unless the person being interviewed is
technically trained, technical terms should be avoided. Words such as "capital" or "income" that
have different meanings for different persons should not be used unless a clarification is
included in the question.
5. Ambiguous questions ought to be avoided. Ambiguous questions mean different things to
different people. It will not be possible to obtain comparable replies from respondents who
take a question to mean different things. For example, consider the following question:
Do you smoke? Yes / No
There are several ambiguities in this question. It is not clear whether the desired response
pertains to cigars, cigarettes, pipes or combinations thereof. Also it is not clear whether
occasional smoking or habitual smoking was the primary concern of the question. If we are
interested only in current cigarette consumption, it would be better to ask:
How many cigarettes do you currently smoke each day?
Less than 5
5 to 9
10 to 14
15 to 19
20 and above
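One reason the reworded question works is that its five answer categories do not overlap and together cover every possible whole-number count. The short sketch below illustrates this by mapping a daily count to exactly one category; the function name is hypothetical and the code is only an illustration of the category boundaries given above.

```python
def cigarette_category(per_day):
    """Map a whole-number daily count to one of the five answer categories."""
    if per_day < 0:
        raise ValueError("the count cannot be negative")
    if per_day < 5:
        return "Less than 5"
    if per_day <= 9:
        return "5 to 9"
    if per_day <= 14:
        return "10 to 14"
    if per_day <= 19:
        return "15 to 19"
    return "20 and above"


# Every count falls into exactly one category, so replies are comparable.
for count in (0, 5, 14, 19, 20):
    print(count, "->", cigarette_category(count))
```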
6. Personal questions should be avoided. As far as possible, questions of a personal and pecuniary
nature should not be asked. For example, questions about income, sales tax paid, etc., may not
be willingly answered in writing. Where such information is essential, it should be obtained by
personal interviews. Even then, such questions should be asked only at the end of the interview,
when the informants feel more at ease with the interviewer.
7. Instructions to the informants. The questionnaire should provide necessary instructions to the
informants. For example, the questionnaire should specify the time within which it should be
sent back and the address to which it should be sent. Instructions about units of measurement,
etc., should also be given. For instance, if there is a question on weight, it should be specified as
to whether weight is to be expressed in pounds or kilograms or in some other units.
8. Objective type Questions. Avoid questions of opinion and keep to questions of fact. In factual
studies, it is highly desirable that questions are so designed that objective answers may be
forthcoming. For example, instead of allowing the informant or enumerator to state the
condition of a building in his own words, it is desirable to ask whether the structure is in
good condition, needs minor repairs, needs structural repairs or is unfit for use. No doubt,
answers to such questions may not be completely objective, but they can be readily tabulated.
Similarly, while asking students how they normally travel to college, frame a question of the
type:
How do you normally travel to college?
(i) By bus (ii) By your own car (iii) By your own scooter
(iv) By taxi (v) On foot (vi) Any other
The respondent will tick mark the particular alternative applicable to him.
This type of question is known as a multiple-choice question. It suggests several answers among
which the respondent may choose. If a multiple-choice question is used, all alternatives should
be stated and a "don't know" category be left in the questionnaire. Such questions not only
facilitate tabulation but also take very little of the respondent's time to fill in.
However, this type of question is excellent only if most of the possible answers are both known and
few in number. When the possible answers are numerous, a limited list, even if accompanied by
an "any other" category, may elicit responses different from those which otherwise would be
forthcoming. Multiple-choice questions tend to bias results by the order in which alternative
answers are given. When ideas are involved, the first item in the list of alternatives has a
favourable bias. The use of multiple-choice questions is indicated only when the investigator is
confident of the existence of a limited group of important alternatives, and it should be avoided
when there are many possible responses of relatively equal significance.
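To show why multiple-choice replies are easy to tabulate, here is a minimal sketch that counts replies to the travel question above. The list of replies is invented for the illustration, and the function name is hypothetical.

```python
from collections import Counter

# The six alternatives from the travel question above.
OPTIONS = ["By bus", "By your own car", "By your own scooter",
           "By taxi", "On foot", "Any other"]


def tabulate(responses, options=OPTIONS):
    """Count how many respondents ticked each alternative."""
    counts = Counter(responses)
    unexpected = set(counts) - set(options)
    if unexpected:
        # A reply outside the printed list signals a recording error.
        raise ValueError(f"unexpected replies: {sorted(unexpected)}")
    return {option: counts.get(option, 0) for option in options}


# Invented replies, used only to demonstrate the tabulation.
replies = ["By bus", "On foot", "By bus", "By taxi", "By bus"]
for option, count in tabulate(replies).items():
    print(f"{option}: {count}")
```

Because every reply is one of a fixed set of alternatives, the table has a known number of rows; open questions offer no such guarantee.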
9. "Yes" or "No" questions. As far as possible the questions should be of such a nature that they
can be answered easily with "Yes" or "No". Such questions pose a simple alternative to the
respondent. This is an excellent technique if applied to situations where a clear-cut alternative
exists. The questions "Do you own a car?", "Are you married?" and "Did you vote in the last election?"
can easily be answered with a "yes" or "no". However, when the alternatives are not clear-cut, the
"yes" or "no" type of question should be avoided. A question such as "Do you favour the
Government policies?" usually cannot be answered with a simple reply. The Government has so
many policies and only the most radical or partisan would favour or oppose them all. A typical
citizen may endorse many, have no opinion on some and reject others. The "yes" or "no"
question in this case compels him to compress a variety of opinions into a simple alternative
which may, in reality, not exist.
Sometimes a respondent cannot give a simple "yes" or "no" answer either because he has not
yet made up his mind or because he lacks information on the topic. For example, the answer to
the question "Are you in favour of public schools?" may not always be "yes" or "no" because
the respondent has not thought it over. In such cases additional alternatives such as "do not
know", "undecided" or "no opinion" should be included.
10. Specific information questions and open-end questions. Specific information questions call
for a specific item of information. For example, "What is your age?", "How many children do you
have?", etc. These questions are simple and direct and are well adapted to securing information
of this type. Care should be taken to use this type of question only where the respondent can
answer correctly. The open question does not pose alternatives or request specific information.
It leaves the respondent free to make whatever reply he chooses. For example, the questions
"What should be done to enhance the practical utility of the B.F.Sc course?" and "Why do you use
Colgate toothpaste or Lux soap?" are open-end questions. In many ways the open question is
superior to other types: there is no danger of being unduly restrictive, suggesting answers, posing
false alternatives, or introducing some bias. It also may serve to interest the respondent in the
interview itself, especially if he is asked his opinion at the outset. However, open questions are
difficult to tabulate. Since no restriction is placed upon the variety of answers, many will often be
forthcoming. This not only increases the labour involved but frequently leads to improper
tabulation. Hence every effort must be made to minimize open questions in the questionnaire.
11. Questionnaire should look attractive. A questionnaire should be made to look as attractive
as possible. The printing and the paper used, etc., should be good and plenty of space should be
left for answers depending upon the type of questions.
12. Questions requiring calculations should be avoided. Questions should not require
calculations to be made. For example, informants should not be asked yearly income, for in
most cases they are paid monthly. Similarly, questions necessitating calculation of ratios and
percentages, etc., should not be asked as it may take much time and the informant may not send
back the questionnaire.
2. Explain main requirements of a good questionnaire.
As Above
UNIT 2 : Experiments
1. Explain the meaning & characteristics/principles of experimental designs:
An experiment is a controlled study in which the researcher attempts to understand cause-and-effect
relationships. It is a research method for testing different assumptions (hypotheses) by
trial and error under conditions constructed and controlled by the researcher. During the
experiment, one or more conditions (called independent variables) are allowed to change in an
organized manner and the effects of these changes on associated conditions (called dependent
variables) are measured, recorded, validated, and analyzed for arriving at a conclusion.
Characteristics of a Well-Designed Experiment

A well-designed experiment includes design features that allow researchers to eliminate
extraneous variables as an explanation for the observed relationship between the independent
variable(s) and the dependent variable. Some of these features are listed below.
Control. (Principle of Control) Control refers to steps taken to reduce the effects of extraneous
variables (i.e., variables other than the independent variable and the dependent variable). These
extraneous variables are called lurking variables.
Control involves making the experiment as similar as possible for experimental units in each
treatment condition. Three control strategies are control groups, placebos, and blinding.
Control group. A control group is a baseline group that receives no treatment or a neutral
treatment. To assess treatment effects, the experimenter compares results in the treatment
group to results in the control group.
Placebo. Often, participants in an experiment respond differently after they receive a treatment,
even if the treatment is neutral. A neutral treatment that has no "real" effect on the dependent
variable is called a placebo, and a participant's positive response to a placebo is called the
placebo effect.
To control for the placebo effect, researchers often administer a neutral treatment (i.e., a
placebo) to the control group. The classic example is using a sugar pill in drug research. The drug
is considered effective only if participants who receive the drug have better outcomes than
participants who receive the sugar pill.
Blinding. Of course, if participants in the control group know that they are receiving a placebo,
the placebo effect will be reduced or eliminated; and the placebo will not serve its intended
control purpose.
Blinding is the practice of not telling participants whether they are receiving a placebo. In this
way, participants in the control and treatment groups experience the placebo effect equally.
Often, knowledge of which groups receive placebos is also kept from people who administer or
evaluate the experiment. This practice is called double blinding. It prevents the experimenter
from "spilling the beans" to participants through subtle cues; and it assures that the analyst's
evaluation is not tainted by awareness of actual treatment conditions.
Randomization. (Principle of Randomization) Randomization refers to the practice of using
chance methods (random number tables, flipping a coin, etc.) to assign experimental units to
treatments. In this way, the potential effects of lurking variables are distributed at chance levels
(hopefully roughly evenly) across treatment conditions.
For example, when a researcher grows one variety of wheat in the first half of the
field and the other variety in the other half, it is quite possible that the soil fertility
differs between the two halves. Randomization in such a case would
mean that the researcher assigns the varieties of wheat to different parts of the field
chosen randomly.
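The wheat example can be sketched in code. This is a minimal illustration that assumes the field is divided into eight numbered plots and that two varieties, labelled A and B here, are to be grown equally often; the plot count, labels and function name are assumptions made for the example.

```python
import random


def assign_varieties(plots, varieties, seed=None):
    """Randomly assign varieties to plots, each variety equally often."""
    plots = list(plots)
    if len(plots) % len(varieties) != 0:
        raise ValueError("number of plots must be a multiple of the number of varieties")
    rng = random.Random(seed)  # a seed makes the layout reproducible
    labels = list(varieties) * (len(plots) // len(varieties))
    rng.shuffle(labels)  # chance, not the researcher, decides the layout
    return dict(zip(plots, labels))


layout = assign_varieties(range(1, 9), ["A", "B"], seed=1)
for plot, variety in layout.items():
    print(f"plot {plot}: variety {variety}")
```

Since chance decides which plots receive which variety, a fertile corner of the field is no more likely to favour one variety than the other.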
Replication. (Principle of Replication) Replication refers to the practice of assigning each
treatment to many experimental units. In general, the more experimental units in each
treatment condition, the lower the variability of the dependent measures. This way the
statistical accuracy of the experiment is increased.
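One common way to quantify this is the standard error of a treatment mean, which equals the standard deviation of individual units divided by the square root of the number of units. The sketch below assumes independent units with a common standard deviation; the value 10 and the unit counts are illustrative only.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean of n independent units with std. dev. sigma."""
    if n <= 0:
        raise ValueError("n must be a positive number of units")
    return sigma / math.sqrt(n)


# Illustrative figures: the same unit-to-unit spread, more replication.
for n in (4, 25, 100):
    print(n, "units ->", standard_error(10, n))
```

Quadrupling the number of units halves the standard error, which is why replication improves the precision of the comparison between treatments.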
Confounding
Confounding occurs when the experimental controls do not allow the experimenter to
reasonably eliminate plausible alternative explanations for an observed relationship between
independent and dependent variables.
Consider this example. A drug manufacturer tests a new cold medicine with 200 participants -
100 men and 100 women. The men receive the drug, and the women do not. At the end of the
test period, the men report fewer colds.
This experiment implements no controls! As a result, many variables are confounded, and it is
impossible to say whether the drug was effective. For example, gender is confounded with drug
use. Perhaps, men are less vulnerable to the particular cold virus circulating during the
experiment, and the new medicine had no effect at all. Or perhaps the men experienced a
placebo effect.
This experiment could be strengthened with a few controls. Women and men could be randomly
assigned to treatments. One treatment group could receive a placebo, with blinding. Then, if the
treatment group (i.e., the group getting the medicine) had sufficiently fewer colds than the
control group, it would be reasonable to conclude that the medicine was effective in preventing
colds.
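The strengthened design can be sketched as follows. The source says only that men and women could be randomly assigned; randomizing separately within each sex, as done here, is one possible way to do it and is an assumption of this example, as are the participant identifiers and the function name.

```python
import random


def assign_within_groups(participants, seed=None):
    """Randomly split each sex into equal drug and placebo halves.

    participants: list of (participant_id, sex) pairs.
    Returns a dict mapping participant_id to "drug" or "placebo".
    """
    rng = random.Random(seed)
    by_sex = {}
    for pid, sex in participants:
        by_sex.setdefault(sex, []).append(pid)
    assignment = {}
    for ids in by_sex.values():
        rng.shuffle(ids)  # chance decides who receives the drug
        half = len(ids) // 2
        for pid in ids[:half]:
            assignment[pid] = "drug"
        for pid in ids[half:]:
            assignment[pid] = "placebo"
    return assignment


# 100 men and 100 women, as in the example above (identifiers are invented).
people = [(i, "M") for i in range(100)] + [(i, "F") for i in range(100, 200)]
groups = assign_within_groups(people, seed=7)
print(sum(1 for i in range(100) if groups[i] == "drug"), "men receive the drug")
print(sum(1 for i in range(100, 200) if groups[i] == "drug"), "women receive the drug")
```

With both sexes represented equally in the drug and placebo groups, gender is no longer confounded with drug use.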
2. Explain informal designs.
Informal experimental designs are those designs that normally use a less sophisticated form of
analysis based on differences in magnitudes. The following are three kinds of informal
experimental design:
Before-and-after without control design.
After-only with control design.
Before-and-after with control design.
1. Before-and-after without control design: In such a design a single test group or area is
selected and the dependent variable is measured before the introduction of the treatment.
The treatment is then introduced and the dependent variable is measured again after the
treatment has been introduced. The effect of the treatment would be equal to the level of
the phenomenon after the treatment minus the level of the phenomenon before the
treatment. The design can be represented thus:
Treatment effect = (level of phenomenon after treatment) - (level of phenomenon before treatment)
The main difficulty of such a design is that with the passage of time considerable extraneous
variation may enter into the treatment effect.
2. After-only with control design: In this design two groups or areas (test area and control
area) are selected and the treatment is introduced into the test area only. The dependent
variable is then measured in both the areas at the same time. Treatment impact is assessed
by subtracting the value of the dependent variable in the control area from its value in the
test area. This can be exhibited in the following form:
The basic assumption in such a design is that the two areas are identical with respect to their
behaviour towards the phenomenon considered. If this assumption is not true, there is the
possibility of extraneous variation entering into the treatment effect. However, data can be
collected in such a design without the introduction of problems with the passage of time. In
this respect the design is superior to before-and-after without control design.
3. Before-and-after with control design: In this design two areas are selected and the
dependent variable is measured in both the areas for an identical time-period before the
treatment. The treatment is then introduced into the test area only, and the dependent
variable is measured in both for an identical time-period after the introduction of the
treatment. The treatment effect is determined by subtracting the change in the dependent
variable in the control area from the change in the dependent variable in test area. This
design can be shown in this way:
This design is superior to the above two designs for the simple reason that it avoids
extraneous variation resulting both from the passage of time and from non-comparability of
the test and control areas. But at times, due to lack of historical data, time or a comparable
control area, we should prefer to select one of the first two informal designs stated above.
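The three treatment-effect calculations described above can be sketched as follows. This is only an illustrative sketch: the measurement values and the function names are hypothetical and are not taken from any study.

```python
# Hypothetical measurements of the dependent variable (illustrative numbers only).
test_before, test_after = 40.0, 52.0
control_before, control_after = 41.0, 45.0


def before_after_without_control(before, after):
    """Effect = level after the treatment minus level before the treatment."""
    return after - before


def after_only_with_control(test_after, control_after):
    """Effect = value in the test area minus value in the control area."""
    return test_after - control_after


def before_after_with_control(test_before, test_after, control_before, control_after):
    """Effect = change in the test area minus change in the control area."""
    return (test_after - test_before) - (control_after - control_before)


effect_1 = before_after_without_control(test_before, test_after)
effect_2 = after_only_with_control(test_after, control_after)
effect_3 = before_after_with_control(
    test_before, test_after, control_before, control_after
)
print(effect_1, effect_2, effect_3)
```

With these made-up numbers the three designs give different estimates, which shows how the choice of design changes what is attributed to the treatment.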
3. Explain formal experimental design & control
1. Completely randomized design (C.R. design): Involves only two principles viz., the principle
of replication and the principle of randomization of experimental designs. It is the simplest
possible design and its procedure of analysis is also easier. The essential characteristic of the
design is that subjects are randomly assigned to experimental treatments (or vice-versa). For
instance, if we have 10 subjects and if we wish to test 5 under treatment A and 5 under
treatment B, the randomization process gives every possible group of 5 subjects selected
from a set of 10 an equal opportunity of being assigned to treatment A and treatment B.
One-way analysis of variance (or one-way ANOVA) is used to analyse such a design. Even
unequal replications can also work in this design. It provides maximum number of degrees of
freedom to the error. Such a design is generally used when experimental areas happen to be
homogeneous. Technically, when all the variations due to uncontrolled extraneous factors
are included under the heading of chance variation, we refer to the design of experiment as
C.R. design.
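A minimal sketch of the C.R. design follows, assuming 10 subjects numbered 1 to 10 and hypothetical scores. The function names `randomize` and `one_way_anova_f` are illustrative, and the F ratio is computed by hand from the between-group and within-group sums of squares.

```python
import random


def randomize(subjects, n_a, seed=None):
    """Randomly pick n_a subjects for treatment A; the rest get treatment B.

    random.sample gives every possible group of n_a subjects the same chance.
    """
    rng = random.Random(seed)
    group_a = rng.sample(subjects, n_a)
    group_b = [s for s in subjects if s not in group_a]
    return group_a, group_b


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


group_a, group_b = randomize(list(range(1, 11)), 5, seed=1)

# Hypothetical scores observed under treatments A and B.
scores_a = [12, 14, 11, 15, 13]
scores_b = [16, 18, 17, 15, 19]
f_ratio, df_between, df_within = one_way_anova_f([scores_a, scores_b])
print(group_a, group_b, f_ratio, df_between, df_within)
```

The function also accepts groups of unequal size, in line with the remark that unequal replications can work in this design.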
We can present a brief description of the two forms of such a design as given below:
a. Two-group simple randomized design: In a two-group simple randomized design, first of all
the population is defined and then from the population a sample is selected randomly.
Further, requirement of this design is that items, after being selected randomly from the
population, be randomly assigned to the experimental and control groups (Such random
assignment of items to two groups is technically described as principle of randomization).
Thus, this design yields two groups as representatives of the population. In a diagram form this
design can be shown in this way:
Two-group simple randomized experimental design (in diagram form)
Since in the simple randomized design the elements constituting the sample are randomly
drawn from the same population and randomly assigned to the experimental and control
groups, it becomes possible to draw conclusions on the basis of samples applicable for the
population. The two groups (experimental and control groups) of such a design are given
different treatments of the independent variable. This design of experiment is quite common in
research studies concerning behavioural sciences. The merit of such a design is that it is simple
and randomizes the differences among the sample items. But the limitation of it is that the
individual differences among those conducting the treatments are not eliminated, i.e., it does
not control the extraneous variable and as such the result of the experiment may not depict a
correct picture. This can be illustrated by taking an example. Suppose the researcher wants to
compare two groups of students who have been randomly selected and randomly assigned. Two
different treatments viz., the usual training and the specialized training are being given to the
two groups. The researcher hypothesises greater gains for the group receiving specialised
training. To determine this, he tests each group before and after the training, and then compares
the amount of gain for the two groups to accept or reject his hypothesis. This is an illustration of
the two-groups randomized design, wherein individual differences among students are being
randomized. But this does not control the differential effects of the extraneous independent
variables (in this case, the individual differences among those conducting the training
programme).
Random replication design (in diagram form)
b. Random replications design: The limitation of the two-group randomized design is usually
eliminated within the random replications design. In the illustration just cited above, the
teacher differences on the dependent variable were ignored, i.e., the extraneous variable
was not controlled. But in a random replications design, the effect of such differences is
minimised (or reduced) by providing a number of repetitions for each treatment. Each
repetition is technically called a replication. Random replication design serves two purposes
viz., it provides controls for the differential effects of the extraneous independent variables
and secondly, it randomizes any individual differences among those conducting the
treatments.
From the diagram it is clear that there are two populations in the replication design. The
sample is taken randomly from the population available for study and is randomly assigned
to, say, four experimental and four control groups. Similarly, a sample is taken randomly from
the population available to conduct the experiments (since there are eight groups, eight such
individuals are selected), and the eight individuals so selected are randomly assigned to
the eight groups. Generally, an equal number of items is put in each group so that the size of
the group is not likely to affect the result of the study. Variables relating to both population
characteristics are assumed to be randomly distributed among the two groups. Thus, this
random replication design is, in fact, an extension of the two-group simple randomized
design.
2. Randomized block design (R.B. design) is an improvement over the C.R. design. In the R.B.
design the principle of local control can be applied along with the other two principles of
experimental designs. In the R.B. design, subjects are first divided into groups, known as
blocks, such that within each group the subjects are relatively homogeneous in respect to
some selected variable. The variable selected for grouping the subjects is one that is
believed to be related to the measures to be obtained in respect of the dependent variable.
The number of subjects in a given block would be equal to the number of treatments and
one subject in each block would be randomly assigned to each treatment. In general, blocks
are the levels at which we hold the extraneous factor fixed, so that its contribution to the
total variability of data can be measured. The main feature of the R.B. design is that in this
each treatment appears the same number of times in each block. The R.B. design is analysed
by the two-way analysis of variance (two-way ANOVA) technique.
Let us illustrate the R.B. design with the help of an example. Suppose four different forms of
a standardised test in statistics were given to each of five students (selected one from each
of the five I.Q. blocks) and following are the scores which they obtained.
If each student separately randomized the order in which he or she took the four tests (by
using random numbers or some similar device), we refer to the design of this experiment as
an R.B. design. The purpose of this randomization is to take care of possible extraneous
factors such as fatigue or the experience gained from repeatedly taking the test.
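The two-way ANOVA partition used for an R.B. design can be sketched as below. The score table is hypothetical (five students as blocks, four test forms as treatments) and does not reproduce the scores of the example above. The function name `rb_anova` is also an assumption made for illustration.

```python
def rb_anova(scores):
    """Two-way ANOVA without replication.

    scores[i][j] is the score of block i (student) under treatment j (test form).
    """
    b, t = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (b * t)
    block_means = [sum(row) / t for row in scores]
    treat_means = [sum(scores[i][j] for i in range(b)) / b for j in range(t)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_blocks = t * sum((m - grand) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    # Residual: what is left after removing block and treatment effects.
    ss_error = sum(
        (scores[i][j] - block_means[i] - treat_means[j] + grand) ** 2
        for i in range(b)
        for j in range(t)
    )
    df_treat, df_blocks = t - 1, b - 1
    df_error = df_treat * df_blocks
    return {
        "ss_total": ss_total,
        "ss_blocks": ss_blocks,
        "ss_treat": ss_treat,
        "ss_error": ss_error,
        "df_treat": df_treat,
        "df_blocks": df_blocks,
        "df_error": df_error,
        "f_treat": (ss_treat / df_treat) / (ss_error / df_error),
        "f_blocks": (ss_blocks / df_blocks) / (ss_error / df_error),
    }


# Hypothetical scores: rows are students (blocks), columns are test forms.
scores = [
    [82, 78, 75, 80],
    [74, 70, 69, 73],
    [68, 66, 60, 65],
    [60, 59, 55, 58],
    [52, 50, 47, 51],
]
result = rb_anova(scores)
print(result["f_treat"], result["f_blocks"])
```

Because the block sum of squares is separated out, variation between students does not inflate the error term against which the test forms are compared.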
3. Latin square design (L.S. design) is an experimental design very frequently used in agricultural
research. The conditions under which agricultural investigations are carried out are different
from those in other studies, for nature plays an important role in agriculture. For instance, suppose an
experiment has to be made through which the effects of five different varieties of fertilizers on
the yield of a certain crop, say wheat, are to be judged. In such a case the varying fertility of the
soil in different blocks in which the experiment has to be performed must be taken into
consideration; otherwise the results obtained may not be very dependable because the output
happens to be the effect not only of fertilizers, but it may also be the effect of fertility of soil.
Similarly, there may be impact of varying seeds on the yield. To overcome such difficulties, the
L.S. design is used when there are two major extraneous factors such as the varying soil fertility
and varying seeds.
The Latin-square design is one wherein each fertilizer, in our example, appears five times but is
used only once in each row and in each column of the design. In other words, the treatments in
a L.S. design are so allocated among the plots that no treatment occurs more than once in any
one row or any one column. The two blocking factors may be represented through rows and
columns (one through rows and the other through columns). The following is a diagrammatic
form of such a design in respect of, say, five types of fertilizers, viz., A, B, C, D and E, and the two
blocking factors, viz., the varying soil fertility and the varying seeds:
The above diagram clearly shows that in a L.S. design the field is divided into as many blocks as
there are varieties of fertilizers and then each block is again divided into as many parts as there
are varieties of fertilizers in such a way that each of the fertilizer variety is used in each of the
block (whether column-wise or row-wise) only once. The analysis of the L.S. design is very similar
to the two-way ANOVA technique.
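One simple way to build such a layout is sketched below. It assumes a cyclic construction followed by a random shuffle of rows and columns. This is an illustrative choice, not necessarily how the layout in the diagram was produced, and the function names are hypothetical.

```python
import random


def latin_square(treatments, seed=None):
    """Build an n x n Latin square from n treatment labels.

    Start from the cyclic square, then shuffle rows and columns. Permuting
    whole rows or whole columns keeps every label once per row and column.
    """
    n = len(treatments)
    square = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]
    rng = random.Random(seed)
    rng.shuffle(square)  # permute the rows
    col_order = list(range(n))
    rng.shuffle(col_order)  # permute the columns
    return [[row[c] for c in col_order] for row in square]


def is_latin_square(square):
    """Check that every symbol occurs exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


square = latin_square(list("ABCDE"), seed=0)
for row in square:
    print(" ".join(row))
print(is_latin_square(square))
```

Each of the five fertilizers appears five times in total, once in every row (one blocking factor) and once in every column (the other blocking factor).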
The merit of this experimental design is that it enables differences in fertility gradients in the
field to be eliminated in comparison to the effects of different varieties of fertilizers on the yield
of the crop. But this design suffers from one limitation, and it is that although each row and each
column represents equally all fertilizer varieties, there may be considerable difference in the row
and column means both up and across the field. This, in other words, means that in L.S. design
we must assume that there is no interaction between treatments and blocking factors. This
defect can, however, be removed by taking the means of rows and columns equal to the field
mean by adjusting the results. Another limitation of this design is that it requires the number of
rows, columns and treatments to be equal. This reduces the utility of this design. In the case of a
(2 × 2) L.S. design, there are no degrees of freedom available for the mean square error and hence
the design cannot be used. If treatments are 10 or more, then each row and each column will be
larger in size so that rows and columns may not be homogeneous. This may make the application
of the principle of local control ineffective. Therefore, L.S. designs of orders (5 × 5) to (9 × 9) are
generally used.
4. Factorial designs: Factorial designs are used in experiments where the effects of varying more
than one factor are to be determined. They are especially important in several economic and
social phenomena where usually a large number of factors affect a particular problem. Factorial
designs can be of two types: simple factorial designs and complex factorial designs. We take
them up separately.
Simple factorial designs: In case of simple factorial designs, we consider the effects of varying
two factors on the dependent variable, but when an experiment is done with more than two
factors, we use complex factorial designs. A simple factorial design is also termed a two-factor
factorial design, whereas a complex factorial design is known as a multifactor factorial design.
A simple factorial design may either be a 2 × 2 simple factorial design, or it may be, say, a 3 × 4 or
5 × 3 or similar type of simple factorial design. We illustrate some simple factorial designs as under:
Illustration : (2 × 2 simple factorial design).
A 2 × 2 simple factorial design can graphically be depicted as follows:
In this design the extraneous variable to be controlled by homogeneity is called the control
variable and the independent variable, which is manipulated, is called the experimental variable.
Then there are two treatments of the experimental variable and two levels of the control
variable. As such there are four cells into which the sample is divided. Each of the four
combinations would provide one treatment or experimental condition. Subjects are assigned at
random to each treatment in the same manner as in a randomized group design. The means for
different cells may be obtained along with the means for different rows and columns. Means of
different cells represent the mean scores for the dependent variable and the column means in
the given design are termed the main effect for treatments without taking into account any
differential effect that is due to the level of the control variable. Similarly, the row means in the
said design are termed the main effects for levels without regard to treatment. Thus, through
this design we can study the main effects of treatments as well as the main effects of levels. An
additional merit of this design is that one can examine the interaction between treatments and
levels, through which one may say whether the treatment and levels are independent of each
other or they are not so. The following examples make clear the interaction effect between
treatments and levels. The data obtained in the case of two (2 × 2) simple factorial studies may be as
given below.
All the above figures (the Study I data and the Study II data) represent the respective means.
Graphically, these can be represented as shown below.
The graph relating to Study I indicates that there is an interaction between the treatment and
the level which, in other words, means that the treatment and the level are not independent of
each other. The graph relating to Study II shows that there is no interaction effect which means
that treatment and level in this study are relatively independent of each other.
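The idea of interaction in a 2 × 2 design can be sketched numerically. The cell means below are hypothetical and are not the Study I and Study II figures. The function names are illustrative, and equal cell sizes are assumed so that row and column means are simple averages of cell means.

```python
def interaction_contrast(means):
    """Difference between the treatment gaps at the two levels.

    means[(treatment, level)] holds the cell mean. A value of zero means the
    two lines on the graph are parallel, i.e. there is no interaction.
    """
    gap_level_1 = means[("A", "I")] - means[("B", "I")]
    gap_level_2 = means[("A", "II")] - means[("B", "II")]
    return gap_level_1 - gap_level_2


def main_effects(means):
    """Column means (treatments) and row means (levels), equal cell sizes assumed."""
    treatment = {t: (means[(t, "I")] + means[(t, "II")]) / 2 for t in ("A", "B")}
    level = {lv: (means[("A", lv)] + means[("B", lv)]) / 2 for lv in ("I", "II")}
    return treatment, level


# Hypothetical cell means for two separate 2 x 2 studies.
with_interaction = {("A", "I"): 16, ("B", "I"): 24, ("A", "II"): 20, ("B", "II"): 10}
no_interaction = {("A", "I"): 16, ("B", "I"): 20, ("A", "II"): 10, ("B", "II"): 14}

print(interaction_contrast(with_interaction))
print(interaction_contrast(no_interaction))
print(main_effects(no_interaction))
```

In the first set of means the better treatment changes from one level to the other, so the lines cross. In the second set the gap between treatments is the same at both levels.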
The 2 × 2 design need not be restricted in the manner explained above, i.e., having one
experimental variable and one control variable, but it may also be of the type having two
experimental variables or two control variables. For example, a college teacher compared the
effect of the class size as well as the introduction of the new instruction technique on the
learning of research methodology. For this purpose he conducted a study using a 2 × 2 simple
factorial design. His design in the graphic form would be as follows:
But if the teacher uses a design for comparing males and females and the senior and junior
students in the college as they relate to the knowledge of research methodology, in that case we
will have a 2 × 2 simple factorial design wherein both the variables are control variables as no
manipulation is involved in respect of both the variables.
Illustration : (4 × 3 simple factorial design).
The 4 × 3 simple factorial design will usually include four treatments of the experimental variable
and three levels of the control variable. Graphically it may take the following form:
This model of a simple factorial design includes four treatments viz., A, B, C, and D of the
experimental variable and three levels viz., I, II, and III of the control variable and has 12
different cells as shown above. This shows that a 2 × 2 simple factorial design can be generalised
to any number of treatments and levels. Accordingly we can name it an (m × n) design, with m
treatments and n levels. In such a design the means for the columns provide the researcher with an estimate of
the main effects for treatments and the means for rows provide an estimate of the main effects
for the levels. Such a design also enables the researcher to determine the interaction between
treatments and levels.
Complex factorial designs: Experiments with more than two factors at a time involve the use of
complex factorial designs. A design which considers three or more independent variables
simultaneously is called a complex factorial design. In the case of three factors with one
experimental variable having two treatments and two control variables, each of which
has two levels, the design used will be termed a 2 × 2 × 2 complex factorial design, which will
contain a total of eight cells as shown below.
The dotted line cell in the diagram corresponds to Cell 1 of the above stated 2 × 2 × 2 design and
is for Treatment A, level I of the control variable 1, and level I of the control variable 2. From this
design it is possible to determine the main effects for three variables i.e., one experimental and
two control variables. The researcher can also determine the interactions between each possible
pair of variables (such interactions are called First Order interactions) and interaction between
variables taken in triplets (such interactions are called Second Order interactions). In the case of a
2 × 2 × 2 design, the following first order interactions are possible:
Experimental variable with control variable 1 (or EV × CV1);
Experimental variable with control variable 2 (or EV × CV2);
Control variable 1 with control variable 2 (or CV1 × CV2).
There will be one second order interaction as well in the given design (it is between all the three
variables, i.e., EV × CV1 × CV2).
To determine the main effects for the experimental variable, the researcher must necessarily
compare the combined mean of data in cells 1, 2, 3 and 4 for Treatment A with the combined
mean of data in cells 5, 6, 7 and 8 for Treatment B. In this way the main effect for experimental
variable, independent of control variable 1 and variable 2, is obtained. Similarly, the main effect
for control variable 1, independent of experimental variable and control variable 2, is obtained if
we compare the combined mean of data in cells 1, 3, 5 and 7 with the combined mean of data in
cells 2, 4, 6 and 8 of our 2 × 2 × 2 factorial design. On similar lines, one can determine the main
effect for control variable 2, independent of the experimental variable and control variable 1, if
the combined mean of data in cells 1, 2, 5 and 6 is compared with the combined mean of data
in cells 3, 4, 7 and 8.
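The cell groupings described above can be sketched as follows. The eight cell means are hypothetical, the cells are assumed to hold equal numbers of subjects (so a combined mean is the average of cell means), and each comparison is expressed as a simple difference of combined means.

```python
# Hypothetical means for cells 1..8 of a 2 x 2 x 2 design (illustrative only).
cell_means = {1: 10, 2: 12, 3: 14, 4: 16, 5: 20, 6: 22, 7: 24, 8: 26}

# Cell groupings as given in the text.
GROUPS = {
    "EV": ((1, 2, 3, 4), (5, 6, 7, 8)),   # Treatment A vs Treatment B
    "CV1": ((1, 3, 5, 7), (2, 4, 6, 8)),  # two levels of control variable 1
    "CV2": ((1, 2, 5, 6), (3, 4, 7, 8)),  # two levels of control variable 2
}


def combined_mean(cells, means):
    """Average of the given cell means (equal cell sizes assumed)."""
    return sum(means[c] for c in cells) / len(cells)


def main_effect(variable, means):
    """Difference between the two combined means for one variable."""
    first, second = GROUPS[variable]
    return combined_mean(first, means) - combined_mean(second, means)


effects = {name: main_effect(name, cell_means) for name in GROUPS}
print(effects)
```

Every cell contributes to each of the three comparisons, which is why one factorial experiment can estimate all three main effects at once.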
To obtain the first order interaction, say, for EV × CV1 in the above stated design, the researcher
must necessarily ignore control variable 2, for which purpose he may develop a 2 × 2 design from
the 2 × 2 × 2 design by combining the data of the relevant cells of the latter design as shown in
the figure below.
Similarly, the researcher can determine other first order interactions. The analysis of the first
order interaction, in the manner described above, is essentially a simple factorial analysis as
only two variables are considered at a time and the remaining one is ignored. But the analysis of
the second order interaction would not ignore any of the three independent variables in the case of
a 2 × 2 × 2 design. The analysis would be termed a complex factorial analysis.
It may, however, be remembered that the complex factorial design need not necessarily be of the
2 × 2 × 2 type, but can be generalised to any number and combination of experimental and
control independent variables. Of course, the greater the number of independent variables
included in a complex factorial design, the higher the order of the interaction analysis possible.
But the overall task goes on becoming more and more complicated with the inclusion of more
and more independent variables in our design.
Factorial designs are used mainly because of two advantages:
1. They provide equivalent accuracy (as happens in the case of experiments with only one factor)
with less labour and as such are a source of economy. Using factorial designs, we can determine
the main effects of two (in simple factorial design) or more (in case of complex factorial design)
factors (or variables) in one single experiment.
2. They permit various other comparisons of interest. For example, they give information about
such effects which cannot be obtained by treating one single factor at a time. The determination
of interaction effects is possible in case of factorial designs.
4. Explain complex factorial design

Unit 2 : Observation
1. What are the characteristics of observation?
Some of the characteristics of observation method of data collection are as follows:
1. Observation is a Systematic Method:
Observation is not haphazard or unplanned. The length of the observation periods, the interval
between them, the number of observations, the area or situation of observation and various
techniques used for observation are carefully planned. Often there are systematic arrangements
for controlling the situation if special factors are to be studied, for example the study of honest
behaviour, sportsmanship, leadership qualities, etc.
2. Observation is Specific:
It is not just looking around for general aspects of human behaviour. Rather it is directed at
those specific aspects of total situation which are assumed to be significant from the stand point
of the purpose of the study. The layman may frequently overlook what is crucial while observing
an event or phenomenon, but the scientific observer should look for some definite things which
suit his purpose of study so as to economise his time, money and effort for observation.
3. Observation is Objective:
Observation should be objective and free from bias as far as possible. It should generally be
guided by a hypothesis. The observer must maintain ethical neutrality. He must consider
hypothesis as something to be tested. But at the same time he must maintain a flexible attitude,
so that he can deviate from his original plan when such deviation appears inevitable.
4. Observation is Quantitative:
Although many important phenomena cannot be quantified, it becomes almost imperative to
use some means for quantifying observations in order to increase their precision and to facilitate
their analysis. Even qualities should be converted into quantities, because qualitative data are
subjective, whereas quantitative data are objective and can be interpreted in an objective manner.
5. Observation is an Affair of Eyes:
P.V. Young remarks that observation is a systematic and deliberate study through the eye. An
observer gathers the data which he has seen with his own eyes. Collecting information through
the eyes is probably the most trustworthy technique of data collection in social research.
6. Definite Aim:
Observation must have some definite aims and objectives. These should be clearly defined before
the actual observation process begins. Without proper aims and objectives,
observation will be unsystematic and expensive.
7. The Record of Observation is Made Immediately:
During the observation period it is very difficult on the part of the observer to remember each
and every element of observation. He may forget much important information. If we rely on
memory the factor of forgetting will enter and affect the data of observation. Therefore the
observer should record all important information as soon as the observation is completed.
8. Observation is Verifiable:
Observation results can be checked and verified. Observation must be verified against the usual criteria
of reliability, validity and usability. It may be possible to check the findings of the observation by
comparing the results of different observers or by repeating the study.
2. How do you differentiate observation from experiment?
Observational study and experiments are the two major types of study involved in research. The
main difference between these two types of study is in the way the observation is done.
Here are examples for observational study and experiments that could clearly define the
differences between the two.
The Hawthorne studies are a good example of experiments. The studies were conducted at the
Hawthorne plant of the Western Electric Company. The aim was to see the impact of
illumination on productivity. First, the productivity was measured, and then the illumination
was modified. After this the productivity was measured again, which helped the researchers to
arrive at a conclusion.
The study to determine the relation between smoking and lung cancer is a typical example of an
observational study. For this the researchers collected data on both smokers and non-smokers.
After this, the researchers made observations with the help of the data and the statistics
collected from each group.
1. The main difference between an observational study and an experiment is in the way the
observation is done.
2. In an experiment, the researcher applies some treatment and does not just make
observations. In an observational study, the researcher simply observes and arrives at
a conclusion.
3. In an observational study, no experiment is conducted. In this type of study the researcher relies
more on the data collected.
4. In an experiment, the researcher observes the outcome under conditions that he or she has
deliberately varied.
5. The researcher intervenes in an experiment, whereas there is no such intervention in an
observational study.
Unit 3 : Statistical Analysis
Unit 4 : Statistical Applications
Unit 4 : Correlation & regression analysis
1. Explain the aim of correlation analysis.
Correlation is a statistical measure that indicates the extent to which two or more variables are
related to each other. A positive correlation indicates the extent to which those variables
increase or decrease in parallel; a negative correlation indicates the extent to which one variable
increases as the other decreases.
Once correlation is known it can be used to make predictions. When we know a score on one
measure we can make a more accurate prediction of another measure that is highly related to it.
The stronger the relationship between/among variables the more accurate the prediction.
2. Distinguish between positive & negative correlation.
In a positive correlation, as one variable increases, so does the other variable, and as the first
decreases, so does the second. A negative correlation is the opposite. As one variable increases,
the other variable decreases, and as the first decreases, the second increases.
The length of an iron bar increasing as the temperature increases is an example of a positive
correlation. An example of a negative correlation is that the volume of gas decreases as the
pressure increases.
3. State formula for simple correlation coefficient.
The Pearson correlation coefficient is a statistic that measures the strength and direction of
the linear relationship between two variables. In the field of statistics, it is often
referred to as Pearson's r. When examining the relationship between two variables, it is
a good idea to compute the Pearson correlation coefficient to determine just how strong that
relationship is.
The correlation coefficient takes a value between -1 and +1. A value of -1 means there is a perfect
negative linear correlation and +1 means that there is a perfect positive linear correlation.
Correlation coefficient r is calculated using the formula below:
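One standard way to write Pearson's r for n paired observations (x_i, y_i) is given here. The symbols x̄ and ȳ for the sample means are notation assumed for this sketch rather than taken from the text.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is proportional to the covariance of x and y, and the denominator rescales it so that r always lies between -1 and +1.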
4. State the properties of the correlation coefficient.
1) The correlation coefficient does not depend on the units in which the two variables are measured.
By this we mean that whether the two variables are measured in feet or in any other unit, the
coefficient is the same.
2) The sign of the correlation coefficient is always the same as the sign of the covariance of the
two variables.
3) The numerical value of the correlation coefficient lies between -1 and +1. It is a
real number.
4) A negative value of the coefficient indicates a negative relationship. As r
approaches -1, the relationship becomes strongly negative.
As r approaches +1, the relationship becomes strongly positive. If the result is exactly +1,
the relationship is perfectly positive.
5) A weak correlation is signalled when the coefficient of correlation approaches zero. When
r is near zero we can deduce that the linear relationship is weak.
6) The correlation coefficient should be interpreted with caution when the data come from
participants, because we cannot be sure whether the participants are truthful.
By this we mean that participants may later say something different from
what they said earlier. The coefficient of correlation is not affected when we
interchange the two variables.
7) The coefficient of correlation is a pure number without units. It is not
affected when we add the same number to all the values of one variable, nor when we multiply all
the values of one variable by the same positive number, because r is scale invariant.
8) We use correlation to measure association, but that does not imply
causation. When two variables are correlated, it is
possible that a third variable is influencing both of them.
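Several of these properties can be checked numerically. The sketch below assumes the usual product-moment definition of r and uses small hypothetical data sets. The function name `pearson_r` is illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_swapped = pearson_r(y, x)                        # interchange the variables
r_rescaled = pearson_r([3 * v + 7 for v in x], y)  # change scale and origin of x
r_perfect_neg = pearson_r(x, [10 - 2 * v for v in x])

print(r, r_swapped, r_rescaled, r_perfect_neg)
```

The first three values agree up to rounding, which illustrates symmetry and invariance to a positive change of scale and origin. The last value is -1 because y falls exactly on a downward-sloping line.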
5. What is rank correlation? Explain.
If ranks can be assigned to pairs of observations for two variables X and Y, then the correlation
between the ranks is called the rank correlation.
6. State the formula for rank correlation coefficient.
$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where:
ρ (rho) = Spearman rank correlation coefficient
d_i = the difference between the ranks of corresponding values X_i and Y_i
n = number of values in each data set
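A minimal sketch of the calculation follows. It assumes the usual form of the Spearman coefficient, rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), which is exact only when there are no tied ranks. The ranks used are hypothetical.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation from two lists of ranks (no ties assumed)."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rank lists must have the same length")
    n = len(rank_x)
    # d_i is the difference between the two ranks of the i-th item.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks given to five items by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_1, judge_2)
rho_reversed = spearman_rho(judge_1, [5, 4, 3, 2, 1])
print(rho, rho_reversed)
```

Complete agreement between the two rankings gives +1 and a completely reversed ranking gives -1.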
7. Explain how to resolve ties while calculating ranks.
Equal values are common when rank methods are applied to rounded data or data consisting
solely of small integers. A popular technique for resolving ties in rank correlation is the mid-rank
method: the mean of the rankings remains unaltered, but the variance is reduced and modified
according to the number and location of ties. Although other methods for breaking ties were
proposed in the literature as early as 1939, no such procedure has gained such wide acceptance
as mid-ranks. If there are two items with equal values, assign the average of the two ranks to
both the items.
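The mid-rank rule can be sketched as below. The sketch assumes that rank 1 goes to the smallest value (an assumption, since ranks may equally be assigned from the largest value down), and the function name `mid_ranks` is illustrative.

```python
def mid_ranks(values):
    """Assign ranks 1..n (1 = smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of items tied with the item at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end correspond to ranks pos+1..end+1; use their mean.
        average_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


print(mid_ranks([10, 20, 20, 30]))
```

The two tied values would have taken ranks 2 and 3, so each receives 2.5, and the ranks still add up to n(n + 1)/2, which is why the mean rank is unaltered.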
8. Explain the concept of regression.
In a cause and effect relationship, the independent variable is the cause, and the dependent
variable is the effect. Regression is a method for predicting the value of a dependent variable Y,
based on the value of an independent variable X.
Simple Linear Regression is the method for finding the "line of best fit" between the
dependent variable, y, and the independent variable, x.
Simple: only one independent variable
Linear in the Independent Variable: the independent variable only appears to the
first power.
Linear: also means linear in the parameters, since each parameter appears only to the first
power.
The Least Squares Regression Line is the line which minimizes the sum of the
squares of the errors of the data points. It is an averaging line of the data. (See the
graph below.)
Notice that the graph shows several features:
The actual data points (x,y) are the blue dots.
The Least Squares Regression Line of the dependent (y) variable based on the
independent (x) variable is shown in black.
The errors (residuals) are the vertical distances between the observed values
of y and the predictions of the "line of best fit," which are shown in red.
The goal, in general, is to minimize the errors from the actual data to the regression
line. The least squares line minimizes the sum of the square of the errors.
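A small Python sketch may help make the "line of best fit" and its residuals concrete. It writes the line as y = a + bx and uses the standard closed-form least squares results, b = S_xy / S_xx and a = mean(y) - b * mean(x). The function name fit_line and the five data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = s_xy / s_xx            # slope (regression coefficient)
    a = mean_y - b * mean_x    # intercept (constant term)
    return a, b


# Hypothetical observations
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]
a, b = fit_line(xs, ys)

# Residuals: vertical distances between observed y and the fitted line
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(round(a, 3), round(b, 3))
print([round(r, 3) for r in residuals])
```

For this data the slope is about 0.9 and the intercept about 1.3. The residuals are partly positive and partly negative and add up to (approximately) zero, which is what "an averaging line of the data" means in practice.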
9. What is the principle of least squares? Explain.
Sum of squares is a statistical technique used in regression analysis. Regression analysis is a tool
used to determine how well a function fits a set of data. The sum of squares technique helps
determine which function provides the best fit.
The goal of the linear regression procedure is to fit a line through the points. Specifically, it computes
a line so that the sum of the squared deviations of the observed points from that line is minimized. This
procedure is called least squares.
Write the line as y = a + bx, where a is the constant term (intercept) and b is the coefficient (slope).
For each point (x,y) in the dataset, y - (a + bx) measures the vertical deviation (vertical distance)
from the point to the line.
Some points are above the line and y - (a + bx) will be positive for these points. For points below
the line y - (a + bx) will be negative, so we square these deviations to make them all positive.
Now if we calculate [y - (a + bx)]^2 for each point (x,y) and add them all up we get the sum of the
squared distances of all the points from the line.
The line which minimises this sum of squared distances is the line which fits the data best and
we call it the Least Squares Line.
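The principle can be checked numerically with a short Python sketch. The data points and the candidate lines are made up for illustration. For this particular data set the least squares line works out to a = 1.3 and b = 0.9, and the other candidates are arbitrary lines chosen only for comparison.

```python
def sum_of_squared_deviations(a, b, xs, ys):
    """Add up [y - (a + b*x)]^2 over all points (x, y)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

# Candidate lines written as (a, b) for y = a + b*x
candidates = {
    "least squares line": (1.3, 0.9),
    "steeper line": (1.0, 1.0),
    "flatter line": (2.5, 0.5),
    "horizontal line at mean of y": (4.0, 0.0),
}

totals = {
    name: sum_of_squared_deviations(a, b, xs, ys)
    for name, (a, b) in candidates.items()
}
for name, total in totals.items():
    print(f"{name}: {total:.2f}")

best = min(totals, key=totals.get)
print("smallest sum of squares:", best)
```

Among these candidates the least squares line gives the smallest total (about 1.9, against 2.0, 3.5 and 10.0 for the others). Any other choice of a and b for this data would also give a larger sum of squared deviations.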
10. Explain normal equations in the context of regression analysis.
11. State the formulae for the constant term and coefficient in the regression equation.
12. State the relationship between the regression coefficient and correlation coefficient.
13. Explain the managerial uses of correlation analysis and regression analysis.
Unit 4 : Analysis of variance
1. Define analysis of variance.
2. State the assumptions in analysis of variance.
3. Explain the classification of linear models for the sample data.
4. Explain the ANOVA table.
5. Explain how inference is drawn from the ANOVA table.
6. Explain the managerial applications of analysis of variance.

Unit 4 : Partial & multiple correlation
1. Explain partial correlation.
2. Explain multiple correlation.
3. State the properties of the coefficient of multiple linear correlation.

Unit 4 : Factor Analysis & Conjoint Analysis
1. Explain the purpose of factor analysis.
2. What is the objective of conjoint analysis?
3. State the steps in the development of conjoint analysis.
4. State the applications of conjoint analysis.
5. Enumerate the advantages & disadvantages of conjoint analysis.
6. What is a product profile? Explain.
7. What are the steps in multi-factor evaluation approach in conjoint analysis?
8. What is a two-factor table? Explain.
9. Explain two-factor evaluation approach in conjoint analysis.

Unit 5 : Structure & components of research reports

