
COLLEGE OF BUSINESS MANAGEMENT AND TECHNOLOGY

LEARNING MODULES

COURSE CODE: AEC117

COURSE TITLE: Statistical Analysis with Software Application

PRE – REQUISITES: Math 11

NO. OF UNITS: 3 units

CONTACT HOURS: 3 hours

PLACEMENT: 3rd Year First Semester

COURSE DESCRIPTION: Introduction to Biostatistics provides an introduction to selected important topics in
biostatistical concepts and reasoning. This course represents an introduction to the field and provides a survey of data and
data types. Specific topics include tools for describing central tendency and variability in data; methods for performing inference
on population means and proportions via sample data; statistical hypothesis testing and its application to group
comparisons; issues of power and sample size in study designs; and random sampling and other study types. While there are
some formulae and computational elements to the course, the emphasis is on interpretation and concepts.

OUTLINE OF THE COURSE:


Each entry below lists the week, the topic, the learning objectives, the activities, the assessment tools, and the expected output.

WEEK 1

Topic: Module 1 – Overview of Descriptive Statistics
Learning objectives: At the end of this module, you will be able to:
- Discuss the science of statistics;
- Explain the fundamental elements of statistics;
- Explain the role of statistics in critical thinking in business situations.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Make a report on these:
- Read a newspaper and take note of articles or displays using statistics.
- Go to the library and browse through a journal in a field that interests you. Note the use of statistics.
- Next time you watch TV, listen to the ads and see how statistics are used to convince you to buy a product.
Expected output: Narrative report on the learning assessment tools.

Topic: Module 2 – Frequency Distribution
Learning objectives: At the end of this module study, you will be able to:
1. Organize a given set of data using the different types of frequency distributions
2. Discuss the significance of the results obtained from the ungrouped and grouped frequency distributions.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Construct a grouped frequency distribution table for the data set in module 1. Include columns for f, %, cf, c%, exact limits, and midpoints.
Expected output: Narrative report on the organization of the data and discussion of the results.

WEEK 2

Topic: Module 3 – Measures of Central Tendency
Learning objectives: At the end of this module study, you will be able to:
- Explain the characteristics of different measures of central tendency, namely:
  - mean
  - median
  - mode
  - other position measures
- Determine the mean, median, and mode of both the ungrouped and grouped data,
- Apply the correct concept of measures of central tendency.
Activities: Presentation of the module via Google Meet. Group discussion via online class using Google Meet.
Assessment tools: Perform all the activities given in module 3 regarding calculations on measures of central tendency.
Expected output: Compilation of the problems on measures of central tendency with analysis and interpretation.

WEEK 3

Topic: Module 4 – Measures of Dispersion or Variability
Learning objectives: At the end of this module study, you will be able to:
- Determine the variability of scores in terms of:
  - range
  - semi-interquartile range
  - standard deviation
- Standardize scores
- Interpret the computation or results obtained
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Computation and analysis of standard deviation using own data from their place or at home, with interpretation.
Expected output: Paper on the case scenario from their place or at home.

WEEK 4

Topic: Module 5 – Graphs and Other Diagrammatic Presentation
Learning objectives: At the end of this module study, you will be able to:
- Discuss the presentation of data through:
  - Textual method
  - Tabular method
  - Graphical method
- Apply these presentations in the exercises and in your own data set later on.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Different activities specific to the presentation of the data. Activities can be found in the module.
Expected output: Narrative on the different presentations of the data that includes a little interpretation.

Topic: Module 6 – The Normal Distribution
Learning objectives: At the end of this module study, you will be able to:
- Discuss what a normal distribution is;
- Discuss the important aspects of normal distribution; and
- Apply the normal distribution to data.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Different curves to be shown and indicated in the module. Different problems and scenarios regarding normal distributions in the module.
Expected output: Paper on the different curves and their analysis and interpretation.

WEEK 5

Topic: Module 7 – Applications of Descriptive Statistics
Learning objectives: At the end of this module study, you will be able to use:
- frequency distribution
- measures of central tendency to your data
- graphs, scattergrams, and pie charts to data
- normal distribution and study designs
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Activities giving one scenario, then using the applications learned in modules 1-6. Activities embedded in the module.
Expected output: Paper on the activities given.

WEEK 6

Examination

WEEK 7

Topic: Module 8 – Introduction to Inferential Statistics
Learning objectives: At the end of this module study, you will be able to:
- differentiate inferential statistics from descriptive statistics
- describe the terms:
  - population
  - sample
  - parameter
  - statistic
  - hypothesis testing
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Module exercises and activities on samples of descriptive statistics and inferential statistics; differentiate one from the other.
Expected output: Paper on comparison and differentiation of descriptive and inferential statistics.

Topic: Module 9 – Random Sampling Designs
Learning objectives: At the end of this module study, you will be able to:
- Explain the reasons why we study samples instead of populations
- Discuss the different types of sampling method such as:
  - simple random sampling with replacement / without replacement
  - systematic random sampling
  - cluster sampling
  - stratified random sampling
  - multistage random sampling
- Discuss sampling distributions
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Apply an appropriate sampling design to your projected research. How did you fare in your responses when compared with the ASAQs? See module.
Expected output: Paper showing the application of the sampling design.

WEEK 8

Topic: Module 10 – The Logic of Statistical Tests of Significance
Learning objectives: At the end of this module study, you will be able to:
- Discuss simple concepts of probability and estimation
- Discuss hypothesis testing: null versus directional hypothesis
- Explain levels of significance in statistics and differentiate a type I and type II error
- Apply these concepts in actual exercises
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Scenario on Type I and Type II Error, showing the application of each.
Expected output: Paper on: Case analysis on the two types of errors.

WEEK 9

Topic: Module 11 – Difference Between Means Test
Learning objectives: At the end of this module study, you will be able to:
- Determine when the t-test is the appropriate technique to use
- Discuss the various t-test formula applications
- Apply the test learned.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Compute exercises given in the module using the t-test. Formulate a hypothesis.
Expected output: Paper on computations with a basic step on testing of hypothesis.

WEEK 10

Topic: Module 12 – Analysis of Variance
Learning objectives: At the end of this module study, you will be able to:
- discuss the logic and basic assumptions in the use of ANOVA
- determine when to use the Analysis of Variance:
  - One-Way Analysis of Variance
  - Two-Way Analysis of Variance
- Calculate Analysis of Variance:
  - One-Way Analysis of Variance
  - Two-Way Analysis of Variance
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Exercises on both One-Way ANOVA and Two-Way ANOVA in the module.
Expected output: Answered exercises in the module, with interpretation.

WEEK 11

Topic: Module 13 – Applications of Inferential Statistics
Learning objectives: At the end of this module study, you will be able to:
- State the null and directional hypotheses of your study.
- Differentiate in uses and significance between descriptive and inferential statistics.
- Work on a study requiring difference of means.
- Work on a study requiring chi-square test.
- Work on a study requiring analysis of variance.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Activities and exercises embedded in the module.
Example: You want to know the impact of taking the course N-298 in terms of grades between the distance and residential mode.
1. State your null and alternate hypotheses.
2. At what level of significance will you present your alpha?
3. What type of test will you use?
4. What will be the design of your study?
5. What variables will you consider? Which variables will constitute your X (main) and Y (dependent)?
6. Justify your choices.
Activities and exercises on the applications of all the previous modules.
Application of Inferential Statistics in a study.
Expected output: Answered activities. Paper with inferential statistics application.

WEEK 12

Examination

WEEK 13

Topic: Module 14 – Introduction to Correlations
Learning objectives: After studying this module, you will be able to:
- draw graphical representation of relationships of two variables;
- explain the use of correlation and regression analyses; and
- determine the different correlation analyses appropriate for different levels of measurement of data.
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Activities: Give a certain study with collected data and construct the scattergram. Give a brief interpretation taking into consideration the form, direction, and precision of the graph.
Expected output: Graphs with interpretation.

WEEK 14

Topics:
Module 15 – Correlational Techniques for Interval Variables
Module 16 – Correlational Techniques for Ordinal Variables
Module 17 – Correlational Techniques for Ordinal Variables
Learning objectives: After studying this module, you will be able to:
- complete and interpret different correlations; and
- apply these appropriately to data at hand
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Activities: After reading the properties and assumptions in coefficient of rank-order correlation, try to distinguish which of the mentioned correlational techniques have the following characteristics. Place a check sign on the appropriate cell. The activity is composed of four columns: characteristics (first column), Pearson's r (second column), Spearman's rho (third column), and Kendall's tau (fourth column).
Study created based on their own place of work, with SOP and hypothesis on correlation, and computation and interpretation.
Other activities embedded in the module.
Expected output: Answered activities. Submitted study paper.

WEEK 15

Topic: Module 18 – Simple Linear Regression
Learning objectives: After studying this module, you will be able to:
- Discuss the uses and properties of simple linear regression analysis;
- Explain the assumptions in applying simple linear regression analysis;
- Compute for estimates of simple linear regression; and
- Test the significance of the regression equation
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Give studies that would require simple regression analysis. Compute manually and using SPSS. Other activities can be found in the module.
Expected output: Manual and SPSS output computation.

WEEKS 16-17

Topic: Modules 19–20 – Multiple Regression
Learning objectives: Given a data set, you will be able to:
- Select appropriate correlation techniques to answer specific hypotheses in a study
- Report correlation in terms of statistical significance and meaningfulness,
- Select appropriate regression models suited for the data, and
- Interpret the statistics given by regression procedures
Activities: Presentation of the module via Google Meet. Discussion via online class using Google Meet.
Assessment tools: Given data sheet for multiple regression. Compute, analyze, and interpret.
Expected output: Interpreted output from SPSS and manually treated data.

WEEK 18

Examination

GRADING SYSTEM

PRELIMINARY GRADE: (Average of quizzes + Class standing + Prelim Exam) / 3
MIDTERM GRADE: (Average of quizzes + Class standing + Midterm Exam) / 3
FINAL GRADE: (Average of quizzes + Class standing + Final Exam) / 3
Module 1 – Introduction to Statistics

INTRODUCTION
Welcome to the world of statistics! You are about to encounter numbers, tables, names, graphs,
probabilities, and trends –in other words, all about statistics.

The module will teach you what descriptive statistics is all about. Statistics is an orderly science;
hence it can be understood easily. A conceptual understanding of the statistical procedures used in
nursing as well as the computational skills to carry out these procedures is given in this module. At the
end of the module, some activities and exercises are given. Please do the activities and answer the
questions because they will enhance your mastery of the lesson. Approach this module with an open
and positive mind. You will like statistics because it is a very useful course.

OBJECTIVES

At the end of this module, you will be able to:

1. Discuss the science of statistics;
2. Explain the fundamental elements of statistics;
3. Explain the role of statistics in critical thinking in health-related situations.

1.1 THE SCIENCE OF STATISTICS

Statistics is the science of data. It is a meaningful and useful science whose broad scope of application
to nursing and other health sciences, to government, to business and other physical and
biopsychosocial sciences is limitless. What about you: what comes to mind when you think of
statistics? Does it bring to mind unemployment figures, election returns, or basketball scores?
Or is it simply a graduate course requirement you have to complete?

Statistics is logical. It has a key role in critical thinking in the classroom, in the hospital, on the job, or
in everyday life. Thus, the time you spend in studying the subject will repay you in many ways later.

Each of us has a built-in system of reference that helps us make decisions. We also have a
built-in set of prejudices that may affect our decisions. One definite advantage of statistics is that it can
help us make decisions without prejudice. Moreover, statistics can be used for making decisions when
faced with uncertainties. For example, suppose you want to estimate the proportion of
nurses enrolled in this course who will finish it on time. You would need statistics to
predict the number who will finish versus the number who will not.

The general prerequisite for statistical decision-making is the gathering of numerical facts or
information. Procedures for evaluating numerical data, together with rules of inference, are prime
topics in the study of statistics.

In this line of work, statisticians are trained in collecting, evaluating, and drawing conclusions from
numerical information. More importantly, statisticians determine what information is relevant in
a given problem and whether the conclusions drawn from the study are to be trusted.
Statistical methods by themselves have no power to work miracles; however, these methods can help
us make some decisions. Furthermore, the statistical results should be interpreted by one who
understands not only the methods but also the subject matter, especially the conceptual or theoretical
framework to which statistics have been applied.

Thus, statistics is the science of data that involves collecting, classifying, summarizing, organizing,
analyzing, and interpreting numerical information or data.

1.2 THE FUNDAMENTAL ELEMENTS OF STATISTICS

1.2.1 Population and Sample

Statistical methods are useful for studying, analyzing, and learning about populations. A population is
a set of units, such as people, objects, transactions, or events, that we are interested in studying. For
example, populations may include:

1. People
1.1 all Filipino women working in foreign countries
1.2 all registered nurses in the Philippines
1.3 everyone who is enrolled in nursing in the WCC Antipolo.

2. Objects
2.1 all theses and dissertations done in 1998
2.2 all stores selling Filipino products
2.3 all shoes manufactured in Marikina

3. Transactions
3.1 all memos of agreement signed by the WCC Antipolo administration in 1998
3.2 all sales of Jollibee foods delivered to the WCC College of Nursing from Antipolo
branch in January-February 1999
3.3 all promotions of the WCC Antipolo faculty in 1997

4. Events
4.1 all victims of fireworks accidents brought to PGH emergency room in December 1998
and January 1999
4.2 all birthday celebrations of graduating students in April 1999
4.3 all births registered at all Manila hospitals on February 14, 1999

In the above examples, you will notice that each set includes all the units in the population.

1.2.2 Variables and Sample

According to McClave and Sincich (1997), it is possible to measure a characteristic for every unit in
the population if the population you wish to study is small. For example, if you are measuring the
high school GPA of all incoming first-year students at WCC Antipolo, it is feasible to obtain these
data. When we measure a characteristic for every unit of a population, the result is a census of the
population.

Oftentimes it is not feasible to study the entire population. For instance, how would you measure the
weight and height of every 5-year-old boy in the Philippines? For such a population, conducting a
census would be prohibitively time-consuming and very costly. A reasonable alternative is to select
and study a subset or a portion of the population.

A sample is a subset of a population. It is a finite number of units selected from the population. Thus,
a sample is simply a part of the population. But not every sample is representative of a population. To
be representative, the sample must be selected randomly. A random sample is determined
completely by chance. According to Brase and Brase (1983), in simple random sampling every
member or unit of the population has an equal probability or chance of being included in the sample.

For example, instead of polling all 139,000 registered nurses in the Philippines regarding who they
voted for during the 1998 presidential election, a pollster can just randomly select a sample of 1,000
registered nurses to represent all the registered nurses in the Philippines.
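
As a rough illustration of simple random sampling, the sketch below draws 1,000 units from a population of 139,000 using Python's standard `random` module. The ID numbers standing in for the registered nurses and the fixed seed are assumptions made only for this example; they are not part of any actual poll.

```python
import random

# Hypothetical population: ID numbers 1..139,000 standing in for the
# registered nurses mentioned in the example (an assumption for illustration).
POPULATION_SIZE = 139_000
SAMPLE_SIZE = 1_000

population = range(1, POPULATION_SIZE + 1)

# A fixed seed is used only so that the sketch is reproducible.
rng = random.Random(2024)

# random.sample draws without replacement, and every unit has the same
# chance of being included, which is the idea behind simple random sampling.
sample = rng.sample(population, SAMPLE_SIZE)

print(len(sample))         # number of units drawn
print(sorted(sample)[:5])  # a peek at the five smallest IDs drawn
```

Because the draw is left entirely to chance, no nurse is favored over another, which is what makes the resulting sample a reasonable stand-in for the whole population.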

In studying a population, we focus on one or more characteristics or properties of the units in the
population. Such characteristics are called variables.

A variable is a characteristic or property of an individual population or sample unit. For example,
we may be interested in the variables age, gender, and number of years of education of the
unemployed residents of Manila. The name variable is derived from the fact that any particular
characteristic may vary among the units in the population or sample.

Let us have some examples.

Example 1

A PhD student in Nursing investigated the number of children per household in Quezon City.
A sample of 500 households in Quezon City was randomly selected to determine the number of
children per family.
a. Describe the population
b. Describe the sample
c. Describe the variable of interest

Solution

a. The population of interest is all the households in Quezon City.
b. The sample includes the 500 households randomly selected by the investigator.
c. The total number of children per household is the variable of interest.

Example 2 (adapted from McClave & Sincich, 1997)

“Cola wars” is the popular term for the intense competition between Coca Cola and
Pepsi Cola displayed in their marketing campaigns. Their campaigns have featured movie and
television stars, rock videos, athletic endorsements, and claims of consumer preference based
on taste tests. Suppose, as part of a Pepsi marketing campaign, 1,000 cola consumers are
given a blind taste test (i.e. a taste test in which the two brand names are disguised). Each
consumer is asked to state a preference between brand A and brand B.
a. Describe the population
b. Describe the sample
c. Describe the variable of interest
Solution
a. The population of interest is the collection or set of all cola consumers.
b. The sample is the 1,000 consumers selected from the population of all cola
consumers.
c. The characteristic that Pepsi wants to measure is the consumer’s cola preference.

1.2.3 Measurement

Statistics can be applied in the analysis of a variable only if the variable can be represented numerically. We
do this through the process of measurement. Measurement is the process we use to assign numbers
to variables of individual population units. For example, we can measure the teaching performance of
a faculty member by asking all his/her students to rate his/her performance on a scale from 1 to 10. Or
we can measure research assistants’ ages by simply asking them their actual age. To gather data for a
variable we can use either quantitative measurements or qualitative measurements.

Quantitative measurements use a naturally occurring numerical scale to describe the size of a
particular data.

Examples:
1. The temperature (in degrees Celsius) at which 20 pieces of heat-resistant plastic begin to
melt.
2. The current unemployment rate (measured as a percentage) for each province and city of
the Philippines.
3. The NMAT scores of a sample of 150 medical school applicants, from a test administered
nationwide.
4. The number of master’s students who successfully finished the degree over a ten-year period.

Qualitative measurements involve classification of observations into categories.

Examples:
1. The political party affiliation (Lakas-NUCD, Laban, Peoples’ Party, Masang Makabayan,
or Independent) of 100 voters from Parañaque.
2. The academic status (pass or fail) on the comprehensive exam of 20 doctoral students.
3. The size of the refrigerators (big, medium, small) rented by each of a sample of 30 transient
boarders.
4. Each taste tester’s ranking (best, worst, average) of four brands of salad dressing, for a panel of
10 testers.

After the variables of interest for every unit in the sample or population are measured, the data are
analyzed either by descriptive or inferential statistical methods.

Descriptive statistics utilizes numerical and graphical methods to look for patterns in a data set and to
summarize the information in a convenient form.

Inferential statistics utilizes sample data to make estimates, decisions, predictions, or other
generalizations about a population. In this unit, we will focus only on descriptive statistics.

Let us now pause for some activities and exercises. Compare your responses with the answers given at
the end of this module. Do not skip these exercise questions; they are important.

SAQ 1-1
Define statistics. Why is it a science?

SAQ 1-2

Differentiate between descriptive statistics and inferential statistics.

SAQ 1-3
What is the guideline we should have in interpreting results?

SAQ 1-4
Chemical and manufacturing plants sometimes discharge toxic-waste materials such as
chlorofluorocarbons (CFC) into nearby rivers and streams. These toxins can adversely affect the plants and
animals inhabiting the river and riverbank. The Philippine Army Corps of Engineers recently
conducted a study of fish in Dicayo River in Zamboanga del Norte and its three tributary creeks:
Biniray Creek, Bolarot Creek, and Matam Creek. A total of 144 fish were captured and the following
variables were measured for each:

1. River/creek where each fish was captured
2. Species (bangus, tulingan, mangsi, and tilapia)
3. Length (centimeters)
4. Weight (grams)
5. Chlorofluorocarbon (CFC) concentration (parts per million)

Classify each of the variables measured as quantitative or qualitative.

SAQ 1-5

A group of students from UP Manila is concerned about the rising student fees at universities and
colleges nationwide. So the group selected a random sample of 30 colleges and universities throughout
the country to obtain information about their respective student fees.
a. What is the population?
b. What is the sample?

ACTIVITY 1-1

Make a report on these:

1. Read a newspaper and take note of articles or displays using statistics.
2. Go to the library and browse through a journal in a field that interests you. Note the use of
statistics.
3. Next time you watch TV, listen to the ads and see how statistics are used to convince you
to buy a product.

COMMENTS ON ACTIVITY 1-1

The student’s report should reflect the various ways statistics are used to substantiate reports,
such as percentages, frequencies, and averages.

1.3 ROLE OF STATISTICS IN CRITICAL THINKING

As evidenced by the media today, there is a need to evaluate the flood of information reaching our
homes. Each day the media present us with published results on economic, health, social, and other
concerns. The growth in data collection associated with scientific phenomena, business operations,
and government activities (quality control, statistical auditing, forecasting, etc.) has been remarkable
in the 1990s. This scenario demands that each of us develop a discerning sense: an ability to
use rational thought to interpret the meaning of data. This ability can help us think critically and make
intelligent decisions, inferences, and generalizations. This is possible with the use of statistics.

Statistical thinking involves applying rational thought to assess data and the inferences made from
them critically.

Are you still with me? Let us pause and do some activities.

SAQ 1-6
Pollsters regularly conduct opinion polls to determine the popularity rating of the current president.
Suppose a poll is to be conducted tomorrow in which 2,000 individuals (18 years old and above) will be
asked whether the president is doing a good job in running the country. The 2,000 individuals will be
selected by random-digit telephone dialing and asked the question over the phone.

a) What is the relevant population?
b) What is the sample?
c) What is the variable of interest? Is it quantitative or qualitative?
d) How likely is the sample to be representative?

SAQ 1-7
What is statistical thinking?
1.4 SUMMATION NOTATION
In statistics, it is necessary to work with sums of numerical values. To express these, we make use of
standard notation. Let us consider the exam scores of Bertha Pila on 9 statistics exams.

Exam 1 – 88    Exam 4 – 55    Exam 7 – 78
Exam 2 – 6     Exam 5 – 28    Exam 8 – 64
Exam 3 – 46    Exam 6 – 9     Exam 9 – 16

In mathematical notation, letter X denotes a score in a data set. From Bertha’s scores, we have the
following data:

X1 = score on Exam 1 = 88
X2 = score on Exam 2 = 6
X3 = score on Exam 3 = 46
X4 = score on Exam 4 = 55
X5 = score on Exam 5 = 28
X6 = score on Exam 6 = 9
X7 = score on Exam 7 = 78
X8 = score on Exam 8 = 64
X9 = score on Exam 9 = 16

The numbers 1-9 written beside the Xs are called subscripts. They represent the first to the 9th
observed score in a given data set. In this case, $X_1$ represents Bertha’s score on the first exam while $X_9$
represents her score on the ninth exam. In general, $X_i$ denotes the ith value in a data set. Using this
notation, the sum of Bertha’s exam scores can be expressed symbolically as:

$$X_1 + X_2 + X_3 + X_4 + X_5 + X_6 + X_7 + X_8 + X_9$$

But instead of writing down all these Xs, we can simply express this sum as $\sum_{i=1}^{9} X_i$, where the
symbol $\sum$ (Greek capital letter “sigma”) is the summation notation used in statistics. Thus,
$\sum_{i=1}^{9} X_i$ means to get the sum of the first, second, third, and so on up to the ninth values.

In statistics, we usually compute the total sum and not a partial sum, and so $\sum_{i=1}^{9} X_i$ can be further
simplified to $\sum X$, which means “summation of all the scores” in a data set.

Applying this now to Bertha’s exam scores:

$$
\begin{aligned}
\sum_{i=1}^{9} X_i &= X_1 + X_2 + X_3 + X_4 + X_5 + X_6 + X_7 + X_8 + X_9 \\
&= 88 + 6 + 46 + 55 + 28 + 9 + 78 + 64 + 16 \\
&= 390
\end{aligned}
$$
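
The same total can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are chosen for illustration and are not part of the module.

```python
# Bertha's nine exam scores, X1 through X9, in exam order.
scores = [88, 6, 46, 55, 28, 9, 78, 64, 16]

# Total sum: the sum of X_i for i = 1 to 9.
total = sum(scores)

# A partial sum for comparison: X1 + X2 + X3 only.
first_three = sum(scores[:3])

print(total)        # 390
print(first_three)  # 140
```

Note that Python lists are indexed from 0, so the score written as X1 in the text is `scores[0]` in the code.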

Some Rules of Summation


Rule 1: $\sum XY$ is not equal to $\sum X \sum Y$

Example:
  X              Y               XY
  1              4                4
  2              5               10
  3              6               18
  $\sum X = 6$   $\sum Y = 15$   $\sum XY = 32$

Steps:
- Multiply each X value by its corresponding Y value
- Get $\sum XY$, $\sum X$, and $\sum Y$
- Check if $\sum XY$ is equal to $\sum X \sum Y$

$\sum XY \stackrel{?}{=} \sum X \sum Y$
$32 \stackrel{?}{=} (6)(15)$
$32 \neq 90$
Therefore, $\sum XY \neq \sum X \sum Y$

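Rule 2: $\sum (X + C)$ is not equal to $\sum X + C$, where C is a constant

Example: Let C = 5
  X               X + 5
  6               11
  7               12
  8               13
  $\sum X = 21$   $\sum (X + 5) = 36$

Steps:
- Add 5 to each X value
- Get $\sum X$ and $\sum (X + 5)$
- Check if $\sum (X + 5) = \sum X + 5$

$\sum (X + C) \stackrel{?}{=} \sum X + C$
$36 \stackrel{?}{=} 21 + 5$
$36 \neq 26$
Therefore, $\sum (X + C) \neq \sum X + C$

The constant is added once for each of the N scores, so in fact $\sum (X + C) = \sum X + NC$; here, $21 + (3)(5) = 36$.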

Rule 3: $(\sum X)^2$ is not equal to $\sum X^2$

Example:
  X               $X^2$
  2               4
  4               16
  6               36
  $\sum X = 12$   $\sum X^2 = 56$

Steps:
- Multiply each X value by itself
- Get $\sum X$ and $\sum X^2$
- Check if $(\sum X)^2 = \sum X^2$

$(\sum X)^2 \stackrel{?}{=} \sum X^2$
$(12)^2 \stackrel{?}{=} 56$
$(12)(12) \stackrel{?}{=} 56$
$144 \neq 56$
Therefore, $(\sum X)^2 \neq \sum X^2$
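
The three rules can also be verified numerically. The sketch below reuses the small data sets from the examples; the variable names are illustrative only. For Rule 2 it additionally checks the standard identity that the sum of (X + C) equals the sum of X plus N times C, where N is the number of scores.

```python
# Rule 1: the sum of products is not the product of the sums.
x = [1, 2, 3]
y = [4, 5, 6]
sum_xy = sum(a * b for a, b in zip(x, y))   # 4 + 10 + 18
product_of_sums = sum(x) * sum(y)           # 6 * 15
print(sum_xy, product_of_sums)              # 32 90

# Rule 2: adding a constant C to every score adds C to the sum N times.
c = 5
x2 = [6, 7, 8]
sum_shifted = sum(v + c for v in x2)        # 11 + 12 + 13
naive = sum(x2) + c                         # adds C only once
with_n = sum(x2) + len(x2) * c              # adds C once per score
print(sum_shifted, naive, with_n)           # 36 26 36

# Rule 3: the square of the sum is not the sum of the squares.
x3 = [2, 4, 6]
square_of_sum = sum(x3) ** 2                # 12 squared
sum_of_squares = sum(v ** 2 for v in x3)    # 4 + 16 + 36
print(square_of_sum, sum_of_squares)        # 144 56
```
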
SUMMARY

In this module, we saw that statistics is the study of how to collect, organize, analyze, and interpret
numerical information. We investigated some types of problems where statistics can be used. In these
situations, we saw examples of populations and samples. It is important to remember that the main role
of inferential statistics is to draw conclusions about a population based on information obtained from a
sample, whereas the main role of descriptive statistics is to present or summarize a large mass of data
in a manageable form. We also saw in this module the elements of statistics, and finally we saw the
role of statistics in critical thinking. With all this, let us cultivate a liking for this course. We shall
learn more as we study the other modules. Keep up the good work of reading your modules. Statistics
is a skill; you will soon have it.
Module 2 – Frequency Distributions

INTRODUCTION

The initial step in the descriptive process, that is, describing the data and the cases that are represented by
those data, is the organization of otherwise disorganized information and the condensation of
otherwise unmanageably large quantities of information.

The large mass of data may be organized by creating a frequency distribution table containing the
following components: frequency, percentage, cumulative frequency, and cumulative percentage. This
module discusses first the ungrouped frequency distributions and later, the grouped.

OBJECTIVES

At the end of this module study, you will:

1. Be familiar with the organization of data according to:
a. Frequency
b. Percentage
c. Cumulative frequency
d. Cumulative percentage
2. Organize a given set of data using the different types of frequency distributions
3. Discuss the significance of the results obtained from the ungrouped and grouped frequency
distributions.

2.1 UNGROUPED FREQUENCY DISTRIBUTION

Basically, frequency distributions show in tabular form the number of times each score or category appears
in a data set. Scores in their original form are called raw scores or raw data. Raw scores are usually
not arranged in any particular order, thus making it difficult for readers to see clearly the features of
the data. See, for example, Table 2.1, which lists the raw scores of 40 master’s students in the statistics
final examination for their N-298 class in UP Manila. These scores are not arranged in any particular
order, making it hard to examine clearly how well students performed as a group, or how varied the
scores are from one student to the next.

TABLE 2.1 Raw Scores on the Statistics Final Examination of Masters’ Students

81 94 90 80 87 80 85 95
83 92 87 70 96 76 87 89
86 79 75 83 84 75 81 81
81 84 70 78 96 94 88 78
80 77 93 87 77 78 79 72

Table 2.2, on the other hand, presents another version of the data in Table 2.1. Notice that the final
examination scores are now arranged in order from highest to lowest in the first column, labeled X.
Frequencies are then listed in the second column, labeled f, showing how many students received each
listed score. When data are organized this way, we can see at a glance that the scores ranged from a
low of 70 to a high of 96, or that four students had a score of 81 and another four had a score of 87.
Such a presentation is called an ungrouped frequency distribution. Ungrouped frequency distributions
begin the process of organizing the data into a meaningful form. You can incorporate in the ungrouped
frequency distribution table columns for raw score (X), frequency (f), percentage (%), cumulative
frequency (cf), and cumulative percentage (c%).

2.1.1 Frequencies

To determine the frequencies of the scores in the data set, first arrange the raw scores in ascending
or descending order (as shown in Table 2.2). Then, under the f column, indicate the number of times
each score appeared in the data set (see Table 2.1). Notice that the sum of all the frequency values
(Σf) is equal to N, the total number of observations or scores in the data set.

TABLE 2.2 Ungrouped Frequency Distribution of the Statistics Final Examination Scores of 40
Masters’ Students

X f % cf c%
96 2 5.0 40 100.0
95 1 2.5 38 95.0
94 2 5.0 37 92.5
93 1 2.5 35 87.5
92 1 2.5 34 85.0
91 0 0.0 33 82.5
90 1 2.5 33 82.5
89 1 2.5 32 80.0
88 1 2.5 31 77.5
87 4 10.0 30 75.0
86 1 2.5 26 65.0
85 1 2.5 25 62.5
84 2 5.0 24 60.0
83 2 5.0 22 55.0
82 0 0.0 20 50.0
81 4 10.0 20 50.0
80 3 7.5 16 40.0
79 2 5.0 13 32.5
78 3 7.5 11 27.5
77 2 5.0 8 20.0
76 1 2.5 6 15.0
75 2 5.0 5 12.5
74 0 0.0 3 7.5
73 0 0.0 3 7.5
72 1 2.5 3 7.5
71 0 0.0 2 5.0
70 2 5.0 2 5.0
Σf = N = 40
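
The tallying described in this section can be sketched in Python. This is a minimal illustration, not part of the original module: it assumes the scores in Table 2.1 are typed in as a list, and the names `scores`, `counts`, and `frequency_table` are chosen here for illustration only.

```python
from collections import Counter

# Raw scores from Table 2.1 (40 students)
scores = [
    81, 94, 90, 80, 87, 80, 85, 95,
    83, 92, 87, 70, 96, 76, 87, 89,
    86, 79, 75, 83, 84, 75, 81, 81,
    81, 84, 70, 78, 96, 94, 88, 78,
    80, 77, 93, 87, 77, 78, 79, 72,
]

# Counter tallies how many times each score appears
counts = Counter(scores)

# List every score from highest to lowest, including scores
# that nobody obtained (a Counter returns 0 for those)
frequency_table = [(x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)]

for x, f in frequency_table:
    print(f"{x:>3} {f:>2}")

# The frequencies must add up to N, the number of scores
total = sum(f for _, f in frequency_table)
print("Sum of f =", total)  # Sum of f = 40
```

Listing the scores with zero frequency is a choice made here so that the output has the same rows as Table 2.2.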

2.1.2 Percentages

The percentage associated with each score can be computed using this equation:

\[ \text{Percentage (\%)} = \frac{f}{N} \times 100 \]

where f = each score’s frequency of occurrence
      N = total number of scores in the distribution

Percentages have one advantage over frequencies. It is often easier to compare two or more
percentages than frequencies. This is particularly true in instances when two or more different
distributions have different sample sizes.
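
As a small worked example of the percentage formula, the sketch below computes the percentage for the score 87 in Table 2.2 (f = 4, N = 40). The second class of 80 students is hypothetical and is included only to show why percentages are easier to compare than raw frequencies when sample sizes differ.

```python
def percentage(f, n):
    """Return the percentage of n scores that a frequency f represents."""
    return 100 * f / n

# Score 87 in Table 2.2: f = 4 out of N = 40
print(percentage(4, 40))  # 10.0

# Hypothetical second class: the same frequency of 4, but out of N = 80
print(percentage(4, 80))  # 5.0
```

The frequency is the same in both classes, but the percentages show that the score was twice as common in the smaller class.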

2.1.3 Cumulative Frequencies


Cumulative frequencies show the number of cases scoring at or below each listed score. The cumulative
frequency for a given score is determined by adding the frequency listed for that score to the
frequencies listed for all lower scores.
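
The running total described above can be sketched with `itertools.accumulate`. This is an illustrative sketch using the Table 2.1 scores; the variable names are assumptions made for the example.

```python
from collections import Counter
from itertools import accumulate

# Raw scores from Table 2.1 (40 students)
scores = [
    81, 94, 90, 80, 87, 80, 85, 95,
    83, 92, 87, 70, 96, 76, 87, 89,
    86, 79, 75, 83, 84, 75, 81, 81,
    81, 84, 70, 78, 96, 94, 88, 78,
    80, 77, 93, 87, 77, 78, 79, 72,
]
counts = Counter(scores)

# Work from the lowest score upward so that each running total
# covers the scores "at or below" the current one
values = list(range(min(scores), max(scores) + 1))
freqs = [counts[x] for x in values]
cum_freqs = list(accumulate(freqs))

# Map each score to its cumulative frequency
cf = dict(zip(values, cum_freqs))

print(cf[80])  # 16
print(cf[87])  # 30
print(cf[96])  # 40
```

The cumulative frequency of the highest score always equals N, which is a quick check on the arithmetic.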

2.1.4 Cumulative Percentages


Cumulative frequencies become useful when they are converted to cumulative percentages. A
cumulative percentage shows the percentage of cases scoring at or below each score. Each of these
percentages represents the percentile rank of a particular score. The percentile rank is useful for
quickly determining the relative locations of individual scores. Thus, a score’s percentile rank tells
us how high or how low, how good or how bad a given score is by locating it relative to the other
scores that were obtained.

The cumulative percentage for any given score is computed using this equation:

\[ c\% = \frac{\mathit{cf}}{N} \times 100 \]

where cf = the cumulative frequency listed for a score
      N = total number of scores in the distribution
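
The cumulative percentage formula can be sketched directly from the raw scores. The function name `cumulative_percentage_of` is an assumption made for this example; it counts the scores at or below a given score to obtain cf, then applies the formula above.

```python
# Raw scores from Table 2.1 (40 students)
scores = [
    81, 94, 90, 80, 87, 80, 85, 95,
    83, 92, 87, 70, 96, 76, 87, 89,
    86, 79, 75, 83, 84, 75, 81, 81,
    81, 84, 70, 78, 96, 94, 88, 78,
    80, 77, 93, 87, 77, 78, 79, 72,
]

def cumulative_percentage_of(score, data):
    """Percentage of cases in data scoring at or below the given score."""
    cf = sum(1 for s in data if s <= score)  # cumulative frequency
    return 100 * cf / len(data)

print(cumulative_percentage_of(87, scores))  # 75.0
print(cumulative_percentage_of(80, scores))  # 40.0
```

Read as a percentile rank, the first result says that 75 percent of the students scored 87 or lower.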

ACTIVITY 2-1
Below are scores of 60 students in Mathematics.
19 31 36 26 34 32
44 33 37 39 45 21
24 38 40 42 39 32
43 18 24 32 49 33
33 33 40 24 46 22
29 33 37 30 43 43
26 39 57 30 40 33
25 33 48 39 34 29
29 37 39 35 41 29
23 32 48 28 45 19

a. What is the highest score?
b. What is the lowest score?
c. Construct an ungrouped frequency distribution table with the following elements: X, f, %, cf,
c%.

2.2 GROUPED FREQUENCY DISTRIBUTIONS


It is very tedious to list all individual scores in an ungrouped frequency distribution table when you
have a large number of scores. It is better to present the scores in groups or intervals, thus creating
a grouped frequency distribution table. This table also consists of columns for frequencies,
percentages, cumulative frequencies, and cumulative percentages.

To construct a grouped frequency distribution for the data set in Table 2.1, do the following steps:

1. Find the range (R).

   R = highest score - lowest score + 1

   Example: R = 96 - 70 + 1 = 27

2. Determine the class width (i) by dividing the range by the desired number of class intervals.

   i = R / number of class intervals

   Example: i = 27 / 6 = 4.5, or 5

   a. If the series contains fewer than 50 cases, 10 classes or fewer are enough.
   b. If the series contains 50 to 100 cases, 10 to 15 classes are enough.
   c. If the series contains more than 100 cases, 15 or more classes are good.

3. List the class intervals, making sure that the lowest and highest scores of the data set are
   included in the bottom and top class intervals, respectively.

   Example:
   95-99   (96, the highest score, is included in this interval)
   90-94
   85-89
   80-84
   75-79
   70-74   (70, the lowest score, is included in this interval)

   Note:
   a. All class intervals must have the same class width.
   b. For the bottom class interval, start with a score or number that is a multiple of the class
      width.

4. Determine f, %, cf, and c% for each class interval (see Table 2.3).
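
Steps 1 to 3 above can be sketched in Python as follows. This is an illustration under stated assumptions: the class width of 4.5 is rounded up to 5 with `math.ceil`, and the bottom interval starts at the largest multiple of the class width that does not exceed the lowest score. The variable names are chosen for this example.

```python
import math

highest, lowest = 96, 70   # from Table 2.1
desired_classes = 6        # desired number of class intervals

# Step 1: range, counting both end scores
r = highest - lowest + 1                 # 27

# Step 2: class width, rounded up to a whole number (assumption)
width = math.ceil(r / desired_classes)   # 4.5 becomes 5

# Step 3: start the bottom interval at a multiple of the class width
lower = (lowest // width) * width        # 70
intervals = []
while lower <= highest:
    intervals.append((lower, lower + width - 1))
    lower += width

print(r, width)
for low, high in reversed(intervals):    # highest interval first
    print(f"{low}-{high}")
```

With these inputs the loop produces six intervals, from 70-74 up to 95-99, each with a width of 5.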

TABLE 2.3 Grouped Frequency Distribution of the Statistics Final Examination Scores of 40 Nursing
Masters’ Students

Class Interval f % cf c%
95-99 3 7.5 40 100.0
90-94 5 12.5 37 92.5
85-89 8 20.0 32 80.0
80-84 11 27.5 24 60.0
75-79 10 25.0 13 32.5
70-74 3 7.5 3 7.5
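
The columns of a grouped frequency distribution can be computed from the raw scores with a short sketch like the one below. It assumes the class intervals have already been chosen and uses the Table 2.1 scores; the names `intervals` and `rows` are illustrative.

```python
# Raw scores from Table 2.1 (40 students)
scores = [
    81, 94, 90, 80, 87, 80, 85, 95,
    83, 92, 87, 70, 96, 76, 87, 89,
    86, 79, 75, 83, 84, 75, 81, 81,
    81, 84, 70, 78, 96, 94, 88, 78,
    80, 77, 93, 87, 77, 78, 79, 72,
]

# Class intervals, highest first
intervals = [(95, 99), (90, 94), (85, 89), (80, 84), (75, 79), (70, 74)]
n = len(scores)

rows = []
for low, high in intervals:
    f = sum(1 for s in scores if low <= s <= high)   # frequency in the interval
    cf = sum(1 for s in scores if s <= high)         # cases at or below the interval
    rows.append((f"{low}-{high}", f, 100 * f / n, cf, 100 * cf / n))

print("Class Interval  f     %   cf     c%")
for label, f, pct, cf, cpct in rows:
    print(f"{label:<14} {f:>2} {pct:>5.1f} {cf:>4} {cpct:>6.1f}")
```

Computing cf as a count of all scores at or below the upper end of each interval avoids having to keep a running total.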

Comparing Table 2.2 with Table 2.3 shows that the grouped frequency distribution table uses class
intervals, while the ungrouped table lists individual scores. Furthermore, grouped frequency
distributions provide a simpler, more economical description of the data than ungrouped frequency
distributions do. By combining several scores into one class interval, grouped frequency distributions
reduce the total amount of information that someone must digest.

Again, take a look at the class intervals in Table 2.3. Each class interval is bounded by numbers
called real limits or exact limits. Each class interval has a lower and an upper exact limit. For
example, the lower and upper exact limits of the class interval 85-89 are 84.5 and 89.5, respectively.
Furthermore, each class interval can be represented by one value, the midpoint. A midpoint is the
middle value in a class interval. For the class interval 80-84, the midpoint is 82.
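
The exact limits and midpoint of a class interval can be computed with two small helper functions. These helpers are hypothetical and assume that scores are recorded in whole numbers, so each exact limit lies half a unit beyond the stated limit.

```python
def exact_limits(low, high):
    """Lower and upper exact limits of a class interval of whole-number scores."""
    return low - 0.5, high + 0.5

def midpoint(low, high):
    """Middle value of a class interval."""
    return (low + high) / 2

print(exact_limits(85, 89))  # (84.5, 89.5)
print(midpoint(80, 84))      # 82.0
```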

ACTIVITY 2-2

Construct a grouped frequency distribution table for the data set in Activity 2-1. Include columns for
f, %, cf, c%, exact limits, and midpoints.

SAQ 2-1

Why is it important to have frequency distributions? In how many ways can we present a data set?

ACTIVITY 2-3

At the World Citi Colleges, College of Nursing, 25 faculty members reported the total number of hours
they spent in various committee meetings. The total hours were computed over a period of one
month.

20 22 18 16 25 15 23
21 22 22 20 23 25 22
20 18 18 22 24 25
25 24 16 25 10

1. Find the largest and the smallest number of hours.
2. Find the range.
3. Construct ungrouped and grouped frequency distribution tables.

SAQ 2-2
What’s the advantage of creating a grouped frequency distribution table over an ungrouped
one?

SUMMARY

This module showed you the importance of arranging data and presenting them in distribution tables
that show the frequency, percentage, cumulative frequency, and cumulative percentage.

One application of a frequency distribution is that it can give us an idea of how many students
performed below a given passing score. It can also give us a picture of how well or how badly a
student performed in a class relative to the scores of the other students.

In the succeeding modules, you will see more of this frequency distribution theme presented in
graphs and histograms, and in measures of position. I wish to encourage you to go on: statistics is
not really hard because it is a science of order and logic.

So, until next time, keep on doing the activities because they will build your statistical skills.
