Statistics Handouts

MANUAL IN STATISTICS
Statistics Made Simple
10th edition




Ms. Yumi Vivien C. Valenzuela, MSME
Subject Teacher


TABLE OF CONTENTS

Exercise No.   Title
1              Variables and the Summation Notation
2              Frequency Distribution Table
3              Numerical Descriptive Measures
4              Weighted Means
5              FPC, Combination and Permutation
6              Probability
7              Normal Distribution
8              Test of Hypothesis I
9              Test of Hypothesis II

Lesson No.     Title
1              Methods of Data Collection and Presentation
2              Frequency Distribution Table
3              Numerical Descriptive Measures
4              Weighted Means
5              Sampling
6              FPC, Combinations and Permutations
7              Probability
8              Estimation
9              Normal Distribution
10             Test of Hypothesis
11             Two-way ANOVA
12             Pearson Moment Correlation




Sources/References:

Concepts, sample problems and information given in this manual were taken from the following:

1. Fundamental Statistics for College Students by Pagoso, et al.
2. Graduate Research Manual Guide to Thesis and Dissertations (Aquinas Graduate School)
3. How to Design and Evaluate Research in Education by Fraenkel and Wallen
4. Introduction to Statistics by Walpole
5. Introduction to Statistical Methods by Parel, Alonzo, et al.
6. Laboratory Manual in Statistics I, UPLB
7. Manual on Training on Microcomputer-Based for the Social Sciences (Richie Fernando Hall, AdeNU, 2005)
8. Statistics for the Health Sciences by Kuzma
9. Applied Basic Statistics by Flordeliza Reyes
10. Fundamental Concepts and Methods in Statistics by George Garcia
11. Simplified Statistics for Beginners by Dr. Cesar Bermundo


I. Statistics and its Scope

STATISTICS encompasses all the methods and procedures used in the
collection, presentation, analysis and interpretation of data.

DESCRIPTIVE STATISTICS comprises those methods concerned with
collecting and describing a set of data so as to yield meaningful information.

STATISTICAL INFERENCE comprises those methods concerned with the
analysis of a subset of data leading to predictions or inferences about the
entire set of data.

- Population vs Sample
A population is the set of all entities or elements under study. A sample is a
subset of the population.

- Parameters vs Statistics
Parameters are the descriptive measures or characteristics of a population,
while statistics are the corresponding characteristics of a sample.

- Census vs Survey
A census is the process of gathering information from every element of the
population, while a survey is the process of gathering information from every
element of the sample.


II. Variables and their Levels of Measurement
A variable is an observable characteristic of a person or object which is capable
of taking several values or of being expressed in several different categories. It
can be either quantitative (discrete or continuous) or qualitative.


MEASUREMENT SCALES
a. Nominal - values are simply labels, names or categories. Numbers are
assigned for identification purposes only; no meaning can be attached to
their magnitude or size. Examples are gender, civil status,
telephone numbers, etc.
b. Ordinal - whereas nominal scales only classify, ordinal scales classify
and also order the classes. Examples are job position, military
ranks, etc.
c. Interval - quantitative but has no true zero point. Examples are IQ, room
temperature, etc.
d. Ratio - quantitative and has a true zero point. Examples are number of
children, physics test scores, etc.




SUMMATION NOTATION

For a given universe, suppose we observe a variable, say X. We may denote the
first value as X1, the second as X2 and so on. In general, Xi is the observation on
variable X made on the ith individual.

Given a set of N observations or data values represented by $X_1, X_2, \ldots, X_N$, we express
their sum as

$$\sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N$$

where $\sum$ is the summation symbol;
i is the index of the summation;
$X_i$ is the summand;
1 is the lower limit; and
N is the upper limit.

Theorem 1. If c is a constant, then


Theorem 2. If c is constant, then




Theorem 3. If a and b are constants, then

(



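As a rough illustration of summation notation with constants, the sketch below numerically checks three standard properties of sums. These properties are assumed here because they are the usual ones involving constants; they are not quoted from the theorems. The values of `x`, `c`, `a` and `b` are hypothetical.

```python
# Hypothetical observations X_1..X_N (not taken from the manual)
x = [2, 4, 6, 8]
n = len(x)
c, a, b = 3, 2, 5  # arbitrary constants chosen for illustration

total = sum(x)  # sum of X_i for i = 1..N

# Summing a constant N times gives N*c
sum_of_constant = sum(c for _ in x)

# A constant factor can be moved outside the summation
sum_scaled = sum(c * xi for xi in x)

# Combining both: sum of (a*X_i + b) equals a*sum(X_i) + N*b
sum_linear = sum(a * xi + b for xi in x)

print(sum_of_constant == n * c)
print(sum_scaled == c * total)
print(sum_linear == a * total + n * b)
```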



Exercise # 1 Variables and the Summation Notation

At the end of this exercise, the student must be able to:
1. identify different types of variables
2. classify data according to level of measurement
3. employ summation notation

I. Identify the level of measurement.

A. From all patients admitted to a hospital, the following information is collected:
1. name of patient
2. age
3. sex
4. body temperature
5. blood pressure
6. amt. of deposit
7. first time to see a doctor regarding ailment? (yes/no)
8. heartbeat per minute
9. weight
10. height
11. no. of glasses of fluid intake per day
12. no. of meals taken in a day

B. The following information is of interest for selected students of AdeNU who are cigarette
smokers.
1. age when first smoked
2. average no. of sticks consumed per day
3. main source of allowance
4. amt. of weekly allowance
5. Is your father a smoker? (yes/no)
6. occupation of father
7. brand of cigarette
8. position in the family


II. Instruction will be given by your teacher.


Data Set 1. Data on head circumference (in cm) and foot length (in cm) of 8 newborn
babies.

Baby no.                  1     2     3     4     5     6     7     8
Head circumference (x)    31.5  33    37.5  38.5  35    32    38    34
Foot length (y)           5.6   6.2   6.8   6.6   6.4   5.4   6.0   6.1


Data Set 2. Data on height (cm) and weight (lbs) of 8 stat students.

Student no.   1     2     3     4     5     6     7     8
Height (x)    168   141   165   180   165   156   150   147
Weight (y)    110   90    120   125   142   97    105   110



Lesson #1 Methods of Data Collection and Presentation

METHODS OF DATA COLLECTION
Various methods for data gathering are available. A researcher should be able to
use the most appropriate one.

1. Survey Method - questions are asked to obtain information, either through a
self-administered questionnaire or an interview (personal, telephone or internet).

Personal Interview

Advantages:
- flexibility in obtaining answers
- more in-depth answers
- can observe the respondent's behavior

Disadvantages:
- expensive
- field interviews are hard to control
- errors in interviewing
- time consuming

Mailed Questionnaires

Advantages:
- wider geographic distribution of respondents possible
- respondents can answer at their convenience
- no personal interviewer's bias
- centralized control of people doing the survey
- relatively inexpensive
- respondent may be more candid if he/she can answer anonymously

Disadvantages:
- response rate may be low
- hard to obtain in-depth information
- usable mailing list may be unavailable
- respondent may not be the addressee
- cannot observe the respondent's behavior

Phone Interview

Advantages:
- relatively inexpensive
- fast
- centralized control of people doing the survey
- respondents may be more candid

Disadvantages:
- unlisted telephone numbers
- outdated telephone directory
- interview time needs to be relatively short
- selected sample may not have telephones






2. Observation Method - makes possible the recording of behavior, but only at the
time of occurrence (e.g., observing reactions to a particular stimulus, traffic
count).

Advantages over Survey Method:
- does not rely on the respondent's willingness to provide information
- certain types of data can be collected only by observation (e.g., behavior
patterns of which the subject is not aware or which the subject is ashamed to admit)
- the potential bias caused by the interviewing process is reduced or eliminated

Disadvantages compared with Survey Method:
- things such as awareness, beliefs, feelings and preferences cannot be observed
- the observed behavior patterns can be rare or too unpredictable, thus
increasing the data collection costs and time requirements


3. Experimental Method - a method designed for collecting data under controlled
conditions. An experiment is an operation where there is actual human
interference with the conditions that can affect the variable under study. This is
an excellent method of collecting data for causation studies. If properly designed
and executed, experiments will reveal, with a good deal of accuracy, the effect of a
change in one variable on another variable.

4. Use of Existing Studies - e.g., census, health statistics, and weather bureau
reports

Two types:
- documentary sources - published or written reports, periodicals,
unpublished documents, etc.

- field sources - researchers who have done studies on the area of interest
are asked personally or directly for the information needed


5. Registration Method - e.g., car registration, student registration, and hospital
admission


METHODS OF DATA PRESENTATION

1. Textual Form - data are incorporated into a paragraph.

Advantages:
- This method is appropriate only if there are few numbers to be presented.
- Gives emphasis to significant figures and comparisons

Disadvantages:
- It is not desirable to include a big mass of quantitative data in a text or
paragraph, as the presentation becomes incomprehensible.
- Paragraphs can be tiresome to read, especially if the same words are repeated
many times


2. Tabular Presentation - systematic organization of data in rows and columns

Advantages:
- More concise than textual presentation
- Easier to understand
- Facilitates comparisons and analysis of relationships among different categories
- Presents data in greater detail than a graph


PARTS OF A STATISTICAL TABLE:

a. Heading - consists of a table number, title and head note. The title explains
what is presented, where the data refer to and when the data apply.

b. Box Head - contains the column heads which describe the data in each
column, together with the needed classifying and qualifying spanner heads.


c. Stub - the classifications or categories found at the left. It describes the
data found in the rows of the table.

d. Field - main part of the table


e. Source Note - an exact citation of the source of data presented in the table
(should always be included when figures are not original)






Illustration:


Table 4.4
Philippines Crime Volume and Rate by Type in 1991

                           1991
Type                  Volume     Crime Rate
Total                 11,326     195

Index Crimes          77,261     124
  Murder              8,707      8,707
  Homicide            8,068      8,069
  Physical Injury     21,862     21,862
  Robbery             13,817     13,817
  Theft               22,780     88,780
  Rape                2,026      2,026

Non Index Crimes      44,065     71

Source: Philippines National Police






Guidelines:
- The title should be concise and written in telegraphic style, not as a complete sentence.
- Column labels should be precise.
- Categories should not overlap.
- The unit of measure must be clearly stated.
- Show any relevant totals, subtotals, percentages, etc.
- Indicate if the data were taken from another publication by including a source
note.
- Tables should be self-explanatory, although they may be accompanied by a
paragraph that provides an interpretation or directs attention to important
figures.

Parts labeled in the illustration: HEADING, BOXHEAD, STUB, FIELD, SOURCE NOTE


3. Graphical Presentation - a graph or chart is a device for showing numerical values or
relationships in pictorial form


Advantages:
- main features and implications of a body of data can be grasped at a glance
- can attract attention and hold the reader's interest
- simplifies concepts that would otherwise have been expressed in so many words
- can readily clarify data and frequently bring out hidden facts and relationships



Common Types of Graph

a. Line Chart - graphical presentation of data especially useful for showing trends over
a period of time.

b. Pie Chart - a circular graph that is useful in showing how a total quantity is
distributed among a group of categories.


c. Bar Chart - consists of a series of rectangular bars where the length (or height) of
each bar represents the quantity or frequency for each category.

d. Pictorial Unit Chart - a pictorial chart in which each symbol represents a definite
and uniform value











Lesson #2 Frequency Distribution Table

Data Set. Given below is the distribution of statistics test scores of 50 students (the perfect score is 70 and
the passing score is 60% of it).

 5   8  10  18  19  20  20  20  20  21
21  21  23  23  23  24  25  25  25  26
27  28  29  29  30  30  30  32  35  35
35  35  36  36  37  38  39  40  40  40
45  47  48  49  50  55  58  59  60  70

Steps in the construction of a frequency distribution:
1. Determine the range R of the distribution.

R = highest observed value - lowest observed value
= 70 - 5
= 65
2. Determine the number of classes, K, desired, using the square root rule.

$K = \sqrt{N}$, where N = total number of observations
$K = \sqrt{50} \approx 7.07$
$K \approx 7$
- the number of classes is to be rounded off to the nearest WHOLE NUMBER.

3. Calculate the class size, c.

First find: $c = R/K = 65/7 \approx 9.29$

The class size is to have the same precision as the raw data and should take the
value nearest to c. Hence, c = 9.

4. Enumerate the classes or categories based on the quantities calculated in steps 1-3,
bearing in mind that:

a) the lowest class must include the lowest observed value and the highest class,
the highest observed value. (The lowest value of the data is the lower class limit of
the first class.)
b) each observation must go into one and only one class (none of the values can
fall into possible gaps between successive classes, and the classes do not
overlap).

- Successive lower class limits may be obtained by adding c to the preceding
lower class limit. The same applies to the upper limits.
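The steps above can be sketched in Python as a quick check. This is only an illustrative sketch: the variable names are made up, and the upper limit of each class is assumed to be `lower + c - 1` because the scores are whole numbers (unit of precision 1).

```python
import math

# The 50 statistics test scores from the data set
scores = [
    5, 8, 10, 18, 19, 20, 20, 20, 20, 21,
    21, 21, 23, 23, 23, 24, 25, 25, 25, 26,
    27, 28, 29, 29, 30, 30, 30, 32, 35, 35,
    35, 35, 36, 36, 37, 38, 39, 40, 40, 40,
    45, 47, 48, 49, 50, 55, 58, 59, 60, 70,
]

data_range = max(scores) - min(scores)   # Step 1: R
k = round(math.sqrt(len(scores)))        # Step 2: square root rule, rounded
c = round(data_range / k)                # Step 3: class size, whole-number precision

# Step 4: start at the lowest observed value and keep adding c
# until the highest observed value is covered
classes = []
lower = min(scores)
while lower <= max(scores):
    classes.append((lower, lower + c - 1))
    lower += c

# Tally: count the observations falling into each class
freq = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]

for (lo, hi), f in zip(classes, freq):
    print(f"{lo} - {hi}: {f}")
```

Because c is rounded down from about 9.29 to 9, seven classes would stop at 67 and miss the score of 70, so the loop produces eight classes.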




I. Tally the observations to determine the class frequency or the number of
observations falling into each class.

Classes Frequency
5 - 13 3
14 - 22 9
23 - 31 15
32 - 40 13
41 - 49 4
50 - 58 3
59 - 67 2
68 - 76 1

II. Add other informative columns.

1. True Class Boundaries (TCB) - remove discontinuity between classes and
consider the true range of values.

(Lower TCB) LTCB = LL - 0.5(unit)
(Upper TCB) UTCB = UL + 0.5(unit)


- a unit depends on the precision of the data

Example. 1st class: LTCB = 5 - 0.5(1) = 4.5
                    UTCB = 13 + 0.5(1) = 13.5

Note:
If data                   Unit of precision
is a whole number         1
has 1 decimal place       0.1
has 2 decimal places      0.01


2. Class Mark (CM) - the center of a class. It is the midpoint of the class interval,
about which observations in a class tend to cluster.

$$CM = \frac{LL + UL}{2}$$


3. Relative Frequency (RF) - proportion of observations falling in one class (in %)

$$RF = \frac{\text{class frequency}}{N} \times 100\%$$







FREQUENCY DISTRIBUTION TABLE

Classes     TCB              CM    Freq   RF (%)   CF       RCF
LL - UL     LTCB - UTCB                            <   >    <   >
5 - 13      4.5 - 13.5       9     3      6
14 - 22     13.5 - 22.5      18    9      18
23 - 31     22.5 - 31.5      27    15     30
32 - 40     31.5 - 40.5      36    13     26
41 - 49     40.5 - 49.5      45    4      8
50 - 58     49.5 - 58.5      54    3      6
59 - 67     58.5 - 67.5      63    2      4
68 - 76     67.5 - 76.5      72    1      2

Total                              50     100
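The informative columns can be computed directly from the class limits and frequencies. The sketch below is one possible way to do it. It assumes that the "<" and ">" columns stand for the less-than and greater-than cumulative frequencies, and that RCF is the cumulative frequency expressed as a percentage of N; the dictionary keys are made-up names.

```python
classes = [(5, 13), (14, 22), (23, 31), (32, 40),
           (41, 49), (50, 58), (59, 67), (68, 76)]
freq = [3, 9, 15, 13, 4, 3, 2, 1]
unit = 1            # whole-number data, so the unit of precision is 1
n = sum(freq)

rows = []
running = 0
for (ll, ul), f in zip(classes, freq):
    running += f
    less_cf = running              # observations at or below this class
    greater_cf = n - running + f   # observations at or above this class
    rows.append({
        "class": f"{ll} - {ul}",
        "ltcb": ll - 0.5 * unit,
        "utcb": ul + 0.5 * unit,
        "cm": (ll + ul) / 2,
        "freq": f,
        "rf": 100 * f / n,
        "less_cf": less_cf,
        "greater_cf": greater_cf,
        "less_rcf": 100 * less_cf / n,
        "greater_rcf": 100 * greater_cf / n,
    })

for row in rows:
    print(row)
```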



Exercise # 2 Frequency Distribution Table

Objectives:
At the end of the exercise, the student is expected to:
1. describe the different methods of data presentation;
2. organize data by constructing a frequency distribution table

A. On organizing data: Construct an FDT for the given data. Show computations for R, K and c.

Table 1.
Blood Glucose of 20 Individuals of the Honolulu Heart Center, 1969

ID no.    Blood Glucose (in mg)
1         107
2         145
3         237
4         91
5         185
6         106
7         177
8         120
9         116
10        105
11        109
12        186
13        257
14        218
15        164
16        158
17        117
18        130
19        132
20        138





Table 2.
Socio-Economic Characteristics of 30 Countries as of January 1997

Obsn. No.   Country         Life Expectancy
1           Japan           80
2           Australia       78
3           Canada          78
4           Hong Kong       78
5           Italy           78
6           Switzerland     78
7           France          77
8           US              77
9           Britain         76
10          Germany         76
11          New Zealand     76
12          Singapore       76
13          Brunei          75
14          Taiwan          75
15          Macau           73
16          Fiji            72
17          Malaysia        72
18          South Korea     72
19          Sri Lanka       72
20          China           71
21          Mexico          71
22          Saudi Arabia    70
23          Russia          69
24          Thailand        69
25          Iran            68
26          Brazil          67
27          Philippines     67
28          Turkey          67
29          Vietnam         67
30          Egypt           64




Lesson # 3 Numerical Descriptive Measures

NUMERICAL DESCRIPTIVE MEASURES



I. Measure of Location - a value within the range of the data which describes its
location or position relative to the entire set of data. The more common measures
are measures of central tendency, percentiles, deciles and quartiles.

A. Measure of Central Tendency - describes the center of the data. It is a single
value about which the observations tend to cluster. The common measures
are the mean, median and mode.

1. Mean - sum of the observations divided by the total number of observations

Characteristics:
- interval statistic
- calculated average
- value is determined by every case in the distribution
- affected by extreme values

When to Use:
- variables are in at least interval scale
- value of each score is desired
- values are considerably concentrated or close to each other

2. Median - middle value of an array

Characteristics:
- ordinal statistic
- rank or position average
- not affected by extreme values

When to Use:
- ordinal interpretation is needed
- middle score is desired
- we want to avoid the influence of extreme values

3. Mode - observation which occurs most frequently in the data set

Characteristics:
- nominal statistic
- inspection average
- not unique; a data set can have more than one mode
- most popular score
- unaffected by extreme values
- represents the majority

When to Use:
- nominal interpretation is needed
- quick approximation of central tendency is desired






B. Percentile - divides the data set into 100 equal parts, each part having one
percent of all the data values. For example, if Patrick received a rating at the 90th
percentile in the National Secondary Achievement Test, this means that 90%
of the students who took the test had scores lower than Patrick's.

C. Decile - divides a data set into ten equal parts, each part having ten percent of
all data values. The first decile is the 10th percentile, the second decile is the
20th percentile, and so on, up to the tenth decile, which is the 100th
percentile.

D. Quartile - divides a data set into four equal parts, each part having twenty-five
percent of all data values. The first quartile is the 25th percentile, the second is
the 50th percentile, the third is the 75th percentile, and the fourth quartile is
the 100th percentile.


II. Measure of Dispersion - describes the extent to which the data are dispersed.
The more commonly used measures are:

A. Range
- not a stable measure of variation because it can fluctuate greatly
with a change in just a single score, either the highest or the
lowest
- easiest to compute but the LEAST SATISFACTORY because its
value depends only upon the two extremes

B. Variance
- considers the position of each observation relative to the mean of
the set; denoted by $\sigma^2$

C. Standard Deviation ($\sigma$)
- best measure of variation
- important as a measure of heterogeneity or unevenness within a
set of observations
- used when comparing two or more sets of data having the same
units of measurement

D. Coefficient of Variation (CV)
- used to compare the variability of 2 or more sets of data even
when the observations are expressed in different units of
measurement.


III. Measure of Skewness (SK) - describes the extent of departure of the distribution
of the data from symmetry.


Figure 1. Symmetric Distribution

- the median is the score point which bisects the total area; half of the area
falls to the left and half to the right
- the mode is the score point with the highest frequency; it is the point on the
x-axis that corresponds to the tallest point of the curve
- the mean is the score point on the x-axis that corresponds to the point of balance


Figure 2. Positively Skewed

- the bump on the left indicates that the mode corresponds to a low value
- the tail extending to the right means that the mean, which is sensitive to each
score value, will be pulled in the direction of the extreme scores and will have
a high value
- the median, which is unaffected by extreme values, will have a value between
the mode and the mean


Figure 3. Negatively Skewed

- the mean will have a lower numerical value than the median because the
extremely low scores will pull the mean to the left
- the bump usually occurs at the right, indicating that the mode has a high
numerical value
- the median will still be in the middle


IV. Measure of Kurtosis - measures the degree of peakedness of a distribution of
data, denoted by k. If the distribution of the data is bell-shaped, k = 3. If
the shape of the distribution is relatively peaked, k > 3. If the shape is relatively
flat, k < 3.







FORMULAS FOR UNGROUPED DATA

Data Set 1: 115 115 120 120 120 125 125 130 300
Data Set 2: 115 115 120 120 120 125 125 125 130 130


Numerical Measures                         Computation
                                           Data 1    Data 2

1. Mean: $\mu = \frac{\sum X_i}{N}$

2. Median

3. Mode - determined by mere inspection.

4. Variance: $\sigma^2 = \frac{\sum X_i^2}{N} - \mu^2$
where $\mu$ is the mean of the ungrouped data

5. Standard Deviation: $\sigma$ = positive square root of the variance

6. Coefficient of Variation: $CV = \frac{\sigma}{\mu} \times 100\%$

7. Measure of Skewness: $SK = \frac{3(\text{Mean} - \text{Median})}{\sigma}$

8. $P_i$ =

9. $D_i$ =

10. $Q_i$ =

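The ungrouped measures can be checked with Python's standard `statistics` module. The sketch below is only an illustration: the function name `describe` is made up, and the population versions (`pvariance`, `pstdev`) are used on the assumption that the variance is divided by N.

```python
from statistics import mean, median, multimode, pstdev, pvariance

data1 = [115, 115, 120, 120, 120, 125, 125, 130, 300]
data2 = [115, 115, 120, 120, 120, 125, 125, 125, 130, 130]

def describe(x):
    """Return the ungrouped descriptive measures for a list of values."""
    mu = mean(x)
    md = median(x)
    sigma = pstdev(x)                 # positive square root of the variance
    return {
        "mean": mu,
        "median": md,
        "mode": multimode(x),         # list, since the mode need not be unique
        "variance": pvariance(x),     # equals sum(X_i^2)/N - mu^2
        "sd": sigma,
        "cv": sigma / mu * 100,       # in percent
        "sk": 3 * (mu - md) / sigma,
    }

for name, data in (("Data 1", data1), ("Data 2", data2)):
    print(name, describe(data))
```

In Data Set 1 the single extreme value 300 pulls the mean well above the median, which gives a positive skewness. Data Set 2 has two modes and equal mean and median.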




FORMULAS FOR GROUPED DATA

Data Set

TCB              CM      Freq    CM x Freq    fiXi^2     <CF
LTCB - UTCB      (Xi)    (fi)    (fiXi)
2.65 - 3.75      3.2     5       16                      5
3.75 - 4.85      4.3     4       17.2                    9
4.85 - 5.95      5.4     8       43.2                    17
5.95 - 7.05      6.5     3       19.5                    20
7.05 - 8.15      7.6     12      91.2                    32
8.15 - 9.25      8.7     8       69.6                    40

Total                    40      256.7        1783.83



Numerical Measures and Computation

1. Mean: $\mu = \frac{\sum f_i X_i}{N}$, where

$f_i$ = frequency of the ith class
$X_i$ = class mark of the ith class
N = total no. of observations
K = number of classes (the sum runs over i = 1, ..., K)


2. Median (Md)

$$Md = LTCB_{Md} + c\left[\frac{\frac{N}{2} - {<}CF_b}{F_{Md}}\right]$$

where $LTCB_{Md}$ = LTCB of the median class
c = class size
$<CF_b$ = <CF of the class preceding the median class
$F_{Md}$ = frequency of the median class
N = total number of observations

NOTE: the median class is the class which
contains the (N/2)th value of the array







3. Mode (Mo)

$$Mo = LTCB_{Mo} + c\left[\frac{F_{Mo} - F_b}{2F_{Mo} - F_a - F_b}\right]$$

where
$LTCB_{Mo}$ = LTCB of the modal class
c = class size
$F_{Mo}$ = frequency of the modal class
$F_b$ = frequency of the class preceding the modal class
$F_a$ = frequency of the class following the modal class

NOTE: the modal class is the class with
the highest frequency

4. Variance ($\sigma^2$)

$$\sigma^2 = \frac{\sum f_i X_i^2}{N} - \mu^2$$

where
$f_i$ = frequency of the ith class
$X_i$ = class mark of the ith class
N = total number of observations
$\mu$ = mean of the grouped data


5. Standard Deviation ($\sigma$) = positive
square root of the variance


6. Coefficient of Variation

$$CV = \frac{\sigma}{\mu} \times 100\%$$


7. Measure of Skewness

$$SK = \frac{3(\text{mean} - \text{median})}{\sigma}$$
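
As a hedged worked example, the sketch below applies the grouped-data formulas for the mean, median, mode, variance, standard deviation, CV and skewness to the data set given at the start of this section. The variable names are illustrative, and the class size c = 1.1 is taken as UTCB - LTCB.

```python
import math

ltcb = [2.65, 3.75, 4.85, 5.95, 7.05, 8.15]   # lower true class boundaries
cm = [3.2, 4.3, 5.4, 6.5, 7.6, 8.7]           # class marks X_i
freq = [5, 4, 8, 3, 12, 8]                    # class frequencies f_i
c = 1.1                                       # class size
n = sum(freq)

mean = sum(f * x for f, x in zip(freq, cm)) / n
variance = sum(f * x * x for f, x in zip(freq, cm)) / n - mean ** 2
sd = math.sqrt(variance)

# Less-than cumulative frequencies
cf = []
running = 0
for f in freq:
    running += f
    cf.append(running)

# Median class: first class whose <CF reaches N/2
m = next(i for i, v in enumerate(cf) if v >= n / 2)
cf_before = cf[m - 1] if m > 0 else 0
median = ltcb[m] + c * ((n / 2 - cf_before) / freq[m])

# Modal class: class with the highest frequency
k = freq.index(max(freq))
f_b = freq[k - 1] if k > 0 else 0
f_a = freq[k + 1] if k < len(freq) - 1 else 0
mode = ltcb[k] + c * ((freq[k] - f_b) / (2 * freq[k] - f_a - f_b))

cv = sd / mean * 100
sk = 3 * (mean - median) / sd

print(mean, median, mode, variance, sd, cv, sk)
```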









8. Percentiles

$$P_i = LTCB_{P_i} + c\left[\frac{\frac{iN}{100} - {<}CF_b}{F_{P_i}}\right]$$

where $LTCB_{P_i}$ = LTCB of the $P_i$ class
c = class size
$<CF_b$ = <CF of the class preceding the $P_i$ class
$F_{P_i}$ = frequency of the $P_i$ class
N = total number of observations


9. Deciles

$$D_i = LTCB_{D_i} + c\left[\frac{\frac{iN}{10} - {<}CF_b}{F_{D_i}}\right]$$

where $LTCB_{D_i}$ = LTCB of the $D_i$ class
c = class size
$<CF_b$ = <CF of the class preceding the $D_i$ class
$F_{D_i}$ = frequency of the $D_i$ class
N = total number of observations


10. Quartiles

$$Q_i = LTCB_{Q_i} + c\left[\frac{\frac{iN}{4} - {<}CF_b}{F_{Q_i}}\right]$$

where $LTCB_{Q_i}$ = LTCB of the $Q_i$ class
c = class size
$<CF_b$ = <CF of the class preceding the $Q_i$ class
$F_{Q_i}$ = frequency of the $Q_i$ class
N = total number of observations
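
The percentile, decile and quartile formulas differ only in the divisor (100, 10 or 4), so a single helper can cover all three. The sketch below is an illustration under that reading; `grouped_quantile` is a hypothetical name, and the data are the grouped data set used earlier in this section.

```python
def grouped_quantile(i, parts, ltcb, freq, c):
    """i-th quantile of grouped data when the data are split into `parts` parts.

    parts = 100 gives percentiles, 10 gives deciles, 4 gives quartiles.
    """
    n = sum(freq)
    target = i * n / parts          # position (i/parts)*N in the array
    running = 0                     # <CF of the classes before the current one
    for lower, f in zip(ltcb, freq):
        if f > 0 and running + f >= target:
            return lower + c * ((target - running) / f)
        running += f
    raise ValueError("quantile position lies beyond the last class")

ltcb = [2.65, 3.75, 4.85, 5.95, 7.05, 8.15]
freq = [5, 4, 8, 3, 12, 8]
c = 1.1

q3 = grouped_quantile(3, 4, ltcb, freq, c)       # third quartile
d1 = grouped_quantile(1, 10, ltcb, freq, c)      # first decile
p50 = grouped_quantile(50, 100, ltcb, freq, c)   # 50th percentile (the median)
print(q3, d1, p50)
```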






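FORMULAS:

1. Mean: $\mu = \frac{\sum f_i X_i}{N}$

2. Median: $Md = LTCB_{Md} + c\left[\frac{\frac{N}{2} - {<}CF_b}{F_{Md}}\right]$

3. Mode: $Mo = LTCB_{Mo} + c\left[\frac{F_{Mo} - F_b}{2F_{Mo} - F_a - F_b}\right]$

4. $P_i = LTCB_{P_i} + c\left[\frac{\frac{iN}{100} - {<}CF_b}{F_{P_i}}\right]$

5. $D_i = LTCB_{D_i} + c\left[\frac{\frac{iN}{10} - {<}CF_b}{F_{D_i}}\right]$

6. $Q_i = LTCB_{Q_i} + c\left[\frac{\frac{iN}{4} - {<}CF_b}{F_{Q_i}}\right]$

7. Variance: $\sigma^2 = \frac{\sum f_i X_i^2}{N} - \mu^2$

8. Standard Deviation: $\sigma = \sqrt{\text{variance}}$

9. $CV = \frac{\sigma}{\mu} \times 100\%$

10. $SK = \frac{3(\text{mean} - \text{median})}{\sigma}$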




More Problems:

1. Suppose a teacher assigns the following weights to the various course requirements:

Assignment 15%
Project 25%
Midterms 20%
Finals 40%

The maximum score a student may obtain for each component is 100. Sheila obtains
marks of 83 for assignment, 72 for project, 41 for midterms and 49 for the finals. Find her
mean mark for the course.








2. The blood glucose readings of 100 patients admitted at Honolulu Heart Center have a mean of
152.14 and a standard deviation of 54.72. If their serum cholesterol levels have a mean of
216.96 and a standard deviation of 38.82, can we conclude that blood glucose among these
patients is more stable than their serum cholesterol levels?









3. The following are weight losses (in pounds) of 25 individuals who enrolled in a five-week
weight-control program:


2 3 3 4 4 4 5 5 6 7 7 8 8
8 9 9 9 9 10 10 10 11 11 11 12


Compute for the 3rd quartile, 7th decile, and 89th percentile.





Exercise #3 Numerical Descriptive Measures

Objectives:
At the end of the exercise, the student is expected to identify and compute appropriate numerical
descriptive measures for ungrouped and grouped data, specifically,
- measure of central tendency
- measure of dispersion; and
- measure of skewness


A. Using your raw data set and the FDT you constructed in exercise # 2, compute for
the appropriate descriptive measures (ungrouped and grouped). Show solution for
grouped data only.

B. Construct these tables in your workbooks and summarize the values obtained.

I. Measure of Central Tendency

Mean Median Mode
ungrouped grouped ungrouped grouped ungrouped grouped


II. Measure of Dispersion

Range Variance Standard Deviation Coeff. Of Variation
ungrouped grouped Ungrouped grouped ungrouped Grouped ungrouped grouped



III. Measure of Skewness

ungrouped Grouped



C. Interpret the obtained values for your mean, median and mode (ungrouped data
only).






Lesson # 4 Weighted Means

Weighted Means
- Weighted Mean is a statistical measure obtained when data are gathered from a survey questionnaire
using the Likert scale.

- A Likert scale is a psychometric scale commonly used in questionnaires and is the most widely
used scale in survey research. When responding to a Likert questionnaire item, respondents specify
their level of agreement with a statement. [1] A Likert item is simply a statement the respondent is asked
to evaluate according to any kind of subjective or objective criteria.


- Generally, the level of agreement or disagreement is measured. Often five ordered response levels
are used, although many psychometricians advocate using seven or nine levels. A recent empirical
study [2] found that a 5- or 7-point scale may produce slightly higher mean scores relative to the
highest possible attainable score, compared to those produced from a 10-point scale, and this
difference was statistically significant.

- Strategies: 5 - Very Effective, 4 - Effective, 3 - Moderately effective/Undecided,
- Practices: 5 - Highly Observed/Always/Fully Aware, 4 - Observed/Sometimes/Aware,
- Traits/Attitudes: 5 - Very Evident, 4 - Somewhat Evident, 3 - Undecided, 2 - Somewhat inevident, 1 - Not
evident


[1] http://en.wikipedia.org/wiki/Likert_scale
[2] Dawes, John (2008). "Do Data Characteristics Change According to the number of scale points used? An experiment using 5-
point, 7-point and 10-point scales". International Journal of Market Research 50 (1): 61-77.


Table 1. Illustration of a Likert Scale Questionnaire
Research Title: Solid Waste Management of Ateneo de Naga University
Below is a list of Solid Waste Management practices. Please check the box with the appropriate
number corresponding to your chosen answer as to how these practices are observed.
Scale: 5 - Very High
4 - High
3 - Moderate
2 - Low
1 - Very Low

5 4 3 2 1
A. GENERATION OF WASTE

Ateneo de Naga University

1.Provides information through campaigns or seminars
about solid waste generation

2. Introduces strategies on how to apply the 4R's
( Reuse, Recycle, Reduce and Respond ) of Solid Waste
Management

3. Provides campaign to patronize the use of reusable
and recycled materials

4. Rejects products which are harmful to the
environment such as foam, styrofoam, CFC aerosols,
oil-based paints, pesticides, insecticides, plastics, wood
preservatives, glues and adhesives

5. Encourages the use of unused side of old papers or
recycles its own paper ( as shown by the exam papers
used, handouts, memo, letters, etc)

6. Encourages or requires the use of refillable inks for
pens, ballpens, printers, etc..

7. Allows the use of old notebooks from previous years
instead of requiring new ones

8. Encourages to reuse envelopes, boxes, packaging
materials and folders

9. Repairs or disposes defective computers in
laboratories or offices





Table 2. Tallied Data

A. GENERATION OF WASTE

Ateneo de Naga University

1. Provides information through campaigns or seminars about solid
waste generation

2. Introduces strategies on how to apply the 4R's (Reuse,
Recycle, Reduce and Respond) of Solid Waste Management

3. Provides campaign to patronize the use of reusable and recycled
materials

4. Rejects products which are harmful to the environment such as
foam, styrofoam, CFC aerosols, oil-based paints, pesticides,
insecticides, plastics, wood preservatives, glues and adhesives

5. Encourages the use of unused side of old papers or recycles its
own paper (as shown by the exam papers used, handouts, memo,
letters, etc.)

6. Encourages or requires the use of refillable inks for pens,
ballpens, printers, etc.

7. Allows the use of old notebooks from previous years instead of
requiring new ones

8. Encourages to reuse envelopes, boxes, packaging materials and
folders

9. Repairs or disposes defective computers in laboratories or
offices

Number of responses per rating:

Item    5     4     3     2     1     Weighted Mean
1       0     6     12    38    64
2       2     8     10    29    71
3       6     8     22    38    46
4       0     5     7     34    74
5       7     6     12    33    62
6       1     1     4     41    73
7       2     3     4     42    69
8       6     11    18    27    53
9       0     2     3     43    72

Cumulative Weighted Mean

Source: Valenzuela 2007, p.66
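
A minimal sketch of how the weighted means could be obtained from the tallies is given below. It assumes the usual computation (each rating multiplied by the number of respondents who chose it, summed, then divided by the total number of respondents for that item) and assumes that the cumulative weighted mean is the average of the nine item means. The names `tallies` and `weighted_mean` are made up for illustration.

```python
ratings = [5, 4, 3, 2, 1]

# Number of respondents choosing each rating (5, 4, 3, 2, 1), per item
tallies = {
    1: [0, 6, 12, 38, 64],
    2: [2, 8, 10, 29, 71],
    3: [6, 8, 22, 38, 46],
    4: [0, 5, 7, 34, 74],
    5: [7, 6, 12, 33, 62],
    6: [1, 1, 4, 41, 73],
    7: [2, 3, 4, 42, 69],
    8: [6, 11, 18, 27, 53],
    9: [0, 2, 3, 43, 72],
}

def weighted_mean(counts):
    """Sum of rating x count divided by the total number of respondents."""
    total = sum(counts)
    return sum(r * n for r, n in zip(ratings, counts)) / total

item_means = {item: weighted_mean(counts) for item, counts in tallies.items()}

# Assumption: the cumulative weighted mean is the mean of the item means
cumulative = sum(item_means.values()) / len(item_means)

for item, value in item_means.items():
    print(item, f"{value:.2f}")
print("Cumulative weighted mean:", f"{cumulative:.2f}")
```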





Table 3
Adjectival Interpretation of the Likert Scale (cumulative mean)

Rating Scale   Range          Interpretation
5              4.20 - 5.00    Very High - almost all indicators are practiced
4              3.40 - 4.19    High - 75% of the indicators were practiced
3              2.60 - 3.39    Moderate - 50% of the indicators were practiced
2              1.80 - 2.59    Low - 25% of the indicators were practiced
1              1.00 - 1.79    Very Low - almost none of the indicators were practiced



Table 4
Adjectival Interpretation of the Likert Scale (per item)

Rating Scale   Range          Interpretation
5              4.20 - 5.00    Very High - almost all respondents practice the said indicator
4              3.40 - 4.19    High - 75% of the respondents
3              2.60 - 3.39    Moderate - 50% of the respondents
2              1.80 - 2.59    Low - 25% of the respondents
1              1.00 - 1.79    Very Low - almost none of the respondents
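
The ranges in the interpretation tables can be turned into a small lookup. The sketch below is one possible reading: `interpret` is a hypothetical helper, and a mean that falls between two listed ranges (for example 4.195) is assumed to take the lower rating.

```python
def interpret(mean):
    """Map a weighted mean to its adjectival rating using the tabled ranges."""
    bands = [
        (4.20, "Very High"),
        (3.40, "High"),
        (2.60, "Moderate"),
        (1.80, "Low"),
        (1.00, "Very Low"),
    ]
    for lower, label in bands:
        if mean >= lower:
            return label
    raise ValueError("a weighted mean on a 1-5 scale cannot be below 1.00")

print(interpret(2.08))
print(interpret(1.67))
```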





Table 5.
Extent of Solid Waste Management in AdeNU (faculty and students), 2007

A. GENERATION OF WASTE

Ateneo de Naga University

1. Provides information through campaigns or seminars about
solid waste generation
   Weighted Mean: 1.67    Interpretation: Very Low

2. Introduces strategies on how to apply the 4R's (Reuse,
Recycle, Reduce and Respond) of Solid Waste Management
   Weighted Mean: 1.68    Interpretation: Very Low

3. Provides campaign to patronize the use of reusable and recycled
materials
   Weighted Mean: 2.08    Interpretation: Low

4. Rejects products which are harmful to the environment such as
foam, styrofoam, CFC aerosols, oil-based paints, pesticides,
insecticides, plastics, wood preservatives, glues and adhesives
   Weighted Mean: 1.52    Interpretation: Very Low

5. Encourages the use of unused side of old papers or recycles its
own paper (as shown by the exam papers used, handouts, memo,
letters, etc.)
   Weighted Mean: 1.86    Interpretation: Low

6. Encourages or requires the use of refillable inks for pens,
ballpens, printers, etc.
   Weighted Mean: 1.47    Interpretation: Very Low

7. Allows the use of old notebooks from previous years instead of
requiring new ones
   Weighted Mean: 1.56    Interpretation: Very Low

8. Encourages to reuse envelopes, boxes, packaging materials and
folders
   Weighted Mean: 2.04    Interpretation: Low

9. Repairs or disposes defective computers in laboratories or
offices
   Weighted Mean: 1.46    Interpretation: Very Low

Cumulative Weighted Mean: 1.7    Interpretation: Very Low




Generation of Waste
The extent to which students and faculty perform SWM practices in the area of waste generation
is given in Table 5. The weighted means of the nine (9) indicators ranged from 1.46 to 2.08,
that is, from very low to low ratings. The respondents gave a very low rating to the following
indicators: provides information through campaigns or seminars about SWM (1.67); introduces
strategies on how to apply the 4R's of Solid Waste Management (1.68); rejects products which are
harmful to the environment such as foam, styrofoam, CFC aerosols, oil-based paints, pesticides,
insecticides, plastics, wood preservatives, glues and adhesives (1.52); encourages the use of
refillable ink (1.47); allows the use of old notebooks (1.56); and repairs or disposes of defective
computers (1.46). A very low rating implies that almost none of the respondents observe the
mentioned practices.
The following indicators had a low rating: provides campaign to patronize the use of reusable
and recycled materials (2.08), encourages the use of the unused side of old papers or recycles its
own paper (1.86), and encourages the reuse of envelopes, boxes, packaging materials and folders
(2.04). Only 25% of the respondents observe these indicators.
Overall, the students and faculty gave a very low rating: the cumulative weighted mean was 1.7.
The result implies that almost none of the indicators were being observed under the generation
component of SWM.
Survey results reveal that there was a need for an intensive information campaign about SWM and
that the University had yet to implement strategies on how to apply the 4Rs. Such an outcome presents
an opportunity to promote waste-saving measures among the student and teaching population in
AdeNU in line with the future promotion of the 4Rs.
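
The weighted means in Table 5 can be checked with a short script. The sketch below is only an illustration: it assumes the five frequency columns of the raw data are the counts for ratings 5, 4, 3, 2 and 1 in that order, and the function names `weighted_mean` and `interpret` are hypothetical helpers, not part of the original manual.

```python
# Ratings on the Likert scale, highest first (assumed column order of the raw data).
RATINGS = (5, 4, 3, 2, 1)

def weighted_mean(freqs, ratings=RATINGS):
    """Sum of (rating x frequency) divided by the number of respondents."""
    return sum(r * f for r, f in zip(ratings, freqs)) / sum(freqs)

def interpret(mean):
    """Adjectival interpretation using the ranges of Tables 3 and 4."""
    if mean >= 4.20:
        return "Very High"
    if mean >= 3.40:
        return "High"
    if mean >= 2.60:
        return "Moderate"
    if mean >= 1.80:
        return "Low"
    return "Very Low"

# Indicator 1: frequencies for ratings 5, 4, 3, 2, 1.
item1 = (0, 6, 12, 38, 64)
wm1 = weighted_mean(item1)
print(round(wm1, 2), interpret(wm1))

# Cumulative weighted mean: the mean of the nine item means in Table 5.
item_means = [1.67, 1.68, 2.08, 1.52, 1.86, 1.47, 1.56, 2.04, 1.46]
cumulative = sum(item_means) / len(item_means)
print(round(cumulative, 1), interpret(cumulative))
```

The same two functions can be reused for Exercise 4 by replacing the frequency tuples with the counts from the problem set.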













Exercise # 4 - Weighted Means

A. For the raw data given, obtain the weighted mean for each item and the
cumulative/total weighted mean.

B. Interpret the cumulative/total weighted mean.

C. What are the highest and lowest weighted means obtained? Interpret the values.

D. Conclusion. Discuss the result of the test based on the objective of the
study.






Rating Scale   Range of the Likert Scale   Interpretation

5              4.20 - 5.00                 Extremely Characteristic of Me: almost all indicators are evident.
4              3.40 - 4.19                 Somewhat Characteristic of Me: 75% of the indicators are evident.
3              2.60 - 3.39                 Neither Un/Characteristic of Me: 50% of the indicators are evident.
2              1.80 - 2.59                 Somewhat Uncharacteristic of Me: 25% of the indicators are evident.
1              1.00 - 1.79                 Extremely Uncharacteristic of Me: almost none of the indicators are evident.



Problem Set
Thesis title: Portable Games and Devices towards Aggressive Behavior of the First Year BS Digital
Animation Students of Ateneo de Naga University
Objective: To determine the level of influence of playing Portable Games and Devices on the behavior,
specifically the aggressiveness, of the respondents
Table 1
Results from the Standard Questionnaire by Buss and Perry.

Indicators                                                              5    4    3    2    1   Weighted Means

1. Some of my friends think I am a hothead.                            18   12   15   12   13
2. If I have to resort to violence to protect my rights, I will.       17   21   10   15    7
3. When people are especially nice to me, I wonder what they want.     14   17   15   17    7
4. I tell my friends openly when I disagree with them.                 17   28   10   10    5
5. I have become so mad that I have broken things.                     10   17   14   15   14
6. I can't help getting into arguments when people disagree with me.   16   18   14   13    9
7. I wonder why sometimes I feel so bitter about things.                9   23   15   17    6
8. Once in a while, I can't control the urge to strike another person. 12   16   10   16   16
9. I am an even-tempered person.                                       18   21   15   13    3
10. I am suspicious of overly friendly strangers.                      11   19   17   13   10

Cumulative Weighted Mean






Lesson # 5 Sampling

SAMPLE SIZE DETERMINATION
Slovin's Formula:

$$n = \frac{N}{1 + Ne^{2}}$$

Where n = sample size
N = population size
e = margin of error (usually at 5%)

A researcher would want to make a socio-economic survey of a school with a
population of 5000 students. If he allows a margin of error of 5%, how many students
must he take into sample?

$$n = \frac{5000}{1 + 5000(0.05)^{2}} = \frac{5000}{1 + 5000(0.0025)} = \frac{5000}{1 + 12.5} = \frac{5000}{13.5} = 370.37 \approx 370$$
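
The computation above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `slovin` is a hypothetical helper, and rounding to the nearest whole number follows the worked example in this handout (some references round up instead).

```python
def slovin(population_size, margin_of_error=0.05):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)

n = slovin(5000, 0.05)
print(round(n, 2))  # 370.37
print(round(n))     # 370
```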

Important: Samples should be as large as a researcher can obtain with a reasonable
expenditure of time and energy. A recommended minimum number of subjects is 100
for a descriptive study, 50 for a correlational, and 30 in each group for experimental
and causal- comparative study.



SAMPLING METHODS

Random Sampling Methods

- every element in the population has an equal chance of being chosen
- example: The dean of a school of education in a large midwestern
university wishes to find out how her faculty feel about the sabbatical
leave requirements at the university. She places all 150 names of the
faculty in a hat, mixes them thoroughly, and then draws out the names
of 25 individuals to interview.

Nonrandom Sampling Methods

- not all elements are given an equal chance of being included in the
sample
- some elements may be deliberately ignored (that is, giving them no
chance at all) in the choice of elements for the sample
- example: The manager of the campus bookstore at a local university
wants to find out how students feel about the services the bookstore
provides. Every day for two weeks during her lunch hour, she asks every
person who enters the bookstore to fill out a short questionnaire she
has prepared and drop it in a box near the entrance before leaving. At
the end of the two-week period, she has a total of 235 completed
questionnaires.




I. RANDOM SAMPLING METHODS



A. Simple Random Sampling

Required : complete list of the elements of the population

Features : each and every member of the population has an equal
and independent chance of being chosen

When to use : population size is not very large
population is homogeneous

Procedures : i. Lottery method/Chip-in-the-box/Fish-in-the-Bowl
ii. Table of Random Numbers
iii.Calculator/computer generated random numbers




Illustration: Table of Random Numbers

011723 223456 222167
912334 379156 233989
086401 016265 411148
059397 022334 080675
666278 106590 879809
051965 004571 036900
063045 786326 098000
560132 345678 356789
727009 344870 889567
000037 121191 258700
667899 234345 076567
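
As an illustration of procedure iii (computer-generated random numbers), the sketch below draws 25 names out of 150, mirroring the dean's example given earlier. The faculty labels, the seed value and the helper name `simple_random_sample` are assumptions made only for this example.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, k)

# Hypothetical sampling frame: a complete list of the 150 faculty members.
faculty = [f"Faculty {i}" for i in range(1, 151)]
chosen = simple_random_sample(faculty, 25, seed=2007)
print(chosen[:5])
```

Sampling without replacement is used here because the same person should not be interviewed twice.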



B. Stratified Sampling

Required : complete list of the elements of the population
Features : representatives from each stratum or subgroup of the population
are randomly chosen as elements of the sample
When to use : Population size is large; population is heterogeneous but
elements can be grouped into homogeneous strata; when we want
representatives from each stratum or subgroup

Procedure: Given a population N = 365, the researcher grouped the
respondents according to gender, where there are 219 females and 146 males.
Using stratified sampling, how many respondents will be obtained from each
stratum?

N = 365; use Slovin's formula to get the sample size n

$$n = \frac{365}{1 + 365(0.05)^{2}} = \frac{365}{1 + 365(0.0025)} = \frac{365}{1 + 0.9125} = \frac{365}{1.9125} = 190.849 \approx 191$$

















Population of 365

Researcher identifies 2 subgroups or strata:

219 females (60% = 219/365)          146 males (40% = 146/365)

Using Slovin's formula we compute the required sample size n,
then we multiply it by the percentage of each stratum:

191 x 0.60 = 115 females             191 x 0.40 = 76 males
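
The allocation of the sample to each stratum can be sketched as follows. The helper name `proportional_allocation` is hypothetical; it simply multiplies the sample size by each stratum's share of the population and rounds to the nearest whole respondent, as in the illustration above.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to each stratum in proportion to its population size."""
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

allocation = proportional_allocation({"females": 219, "males": 146}, 191)
print(allocation)  # {'females': 115, 'males': 76}
```

With more than two strata, rounding each share separately may make the parts add up to one more or one less than the intended sample size, so the total should be checked.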




C. Cluster Sampling

Features : population is grouped into clusters or small units
composed of population elements; each cluster contains
as varied a mixture as possible and at the same time one
cluster is nearly as alike as the other
: Sometimes referred to as an area sample because it is
frequently applied on a geographical basis, blocks in a
community or city are occupied by heterogeneous groups
When to use : large population
: list of all members of the population is not available
Procedure : 50 barangays in Naga City
Randomly choose 3 barangays


D. Multi-stage Sampling

Features : this technique uses several stages or phases in getting
the sample from the general population

When to use : conducting nationwide surveys or any survey involving a
large universe




Illustration of Multistage Sampling:






Philippines (17 regions)





Choose randomly 5 regions



R1 R2 R3 R4 R5

Choose randomly 2 provinces for each region





P1 P2 P3 P4 P5 P6 P7 P8 P9 P10
Choose randomly 1 city for each province





C1 C2 C3 C4 C5 C6 C7 C8 C9 C10



Choose randomly 2 barangays for each city



Then choose randomly 5 households for each barangay
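
The stages in the illustration can be sketched in code. Everything about the sampling frame below is hypothetical: the labels and the number of provinces, cities, barangays and households under each unit are made-up values used only to show the nesting of the random draws.

```python
import random

def children(parent, label, count):
    """Build a hypothetical list of sub-units under a parent unit."""
    return [f"{parent}/{label}{i}" for i in range(1, count + 1)]

def multistage_sample(seed=None):
    rng = random.Random(seed)
    regions = [f"Region{i}" for i in range(1, 18)]  # 17 regions
    households = []
    for region in rng.sample(regions, 5):                                 # 5 regions
        for province in rng.sample(children(region, "Province", 4), 2):   # 2 provinces each
            for city in rng.sample(children(province, "City", 3), 1):     # 1 city each
                for barangay in rng.sample(children(city, "Barangay", 6), 2):  # 2 barangays each
                    households.extend(
                        rng.sample(children(barangay, "Household", 20), 5))    # 5 households each
    return households

sample = multistage_sample(seed=1)
print(len(sample))  # 5 x 2 x 1 x 2 x 5 = 100 households
```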





[Figure: populations of lettered elements illustrating SIMPLE RANDOM SAMPLING and STRATIFIED SAMPLING; in the stratified case the sample is drawn from strata that make up 25%, 50% and 25% of the population]




[Figure: populations of lettered elements grouped into clusters, illustrating CLUSTER SAMPLING and TWO-STAGE SAMPLING]


II. Non-random sampling

A. Convenience - the researcher chooses the sample at his or her convenience
- example. To find out how students feel about food service
in the student union at an East Coast university, the
manager stands outside the main door of the cafeteria
one Monday morning and interviews the first 50 students
who walk out of the cafeteria.
B. Purposive - researchers use their judgement to select a
sample that they believe will provide the data they need
- example. A graduate student wants to know how retired
people aged 65 and over feel about their golden years. He
has been told by one of his professors, an expert on aging and
the aged population, that the local Association of Retired
Workers is a representative cross section of retired people
aged 65 and over. He decides to interview a sample of 50
people who are members of the association to get their views.
C. Quota - the researcher sets a sample size, then chooses the
respondents without setting criteria. The researcher proceeds
to fill the prescribed quota. The researcher is left to his own
convenience or preference.

D. Snowball



REASONS FOR USING NON-RANDOM SAMPLING


a. Some might use this technique because they just want to get a feel of the
market before launching or producing a certain product.

b. Lack of logistics or inadequate knowledge in the use of random methods

c. The validity of the sample is based on the soundness of the judgement of
whoever makes the choice.
Example. One would naturally use judgement instead of randomness
in the choice of people who will work for a company.





Lesson # 6 FPC, Permutations and Combinations

Definition. FUNDAMENTAL PRINCIPLE OF COUNTING. If one event can occur in m
different ways, and if, after it has happened in one of these ways, a second event can
occur in n different ways, then both events can occur, in the order stated, in m x n
different ways.

Examples.

1. If there are eight doors providing access to a building, in how many ways can a
person enter the building by one door and leave by a different door?




2. How many positive integers of three different digits can be formed from the integers
1, 2, 3, 4 and 5?






3. How many different arrangements, each consisting of five different letters, can be
formed from the letters of the word PERSONAL if each arrangement is to begin and
end with a vowel?






4. How many different arrangements of five distinct books each can be made on a shelf
with space for five books?





5. Suppose that there are 3 math books and 3 physics books. How many different
arrangements of the six books can be made on a shelf if books on the same subject
are to be kept together?
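
One way to check answers to the examples above is to list every possible arrangement and count those that satisfy the condition. The sketch below does this by brute force with `itertools.permutations`; the book labels and the helper `together` are assumptions made for illustration, and counting by the fundamental principle should give the same totals.

```python
from itertools import permutations

# 1. Enter by one of 8 doors and leave by a different door: 8 x 7.
door_ways = sum(1 for _ in permutations(range(8), 2))

# 2. Integers of three different digits from 1, 2, 3, 4, 5: 5 x 4 x 3.
three_digit = sum(1 for _ in permutations("12345", 3))

# 3. Five-letter arrangements from PERSONAL that begin and end with a vowel.
vowels = set("AEO")
personal = sum(1 for p in permutations("PERSONAL", 5)
               if p[0] in vowels and p[-1] in vowels)

# 4. Five distinct books on a shelf with space for five books: 5!.
five_books = sum(1 for _ in permutations(range(5)))

# 5. Three math and three physics books, same subject kept together.
def together(arrangement, subject):
    """True when all books of the subject occupy adjacent positions."""
    positions = [i for i, book in enumerate(arrangement) if book[0] == subject]
    return positions[-1] - positions[0] == len(positions) - 1

books = ["M1", "M2", "M3", "P1", "P2", "P3"]
grouped = sum(1 for a in permutations(books)
              if together(a, "M") and together(a, "P"))

print(door_ways, three_digit, personal, five_books, grouped)  # 56 60 720 120 72
```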



Definition. PERMUTATION (nPr). Let S be a set containing n elements and suppose r is
a positive integer such that $r \le n$. Then a permutation of r elements of S is an
arrangement, in a definite order and without repetition, of r elements of S.

Theorem 1. The number of permutations of n elements taken r at a time is given by
either of the following formulas:
a. nPr = n(n-1)(n-2)...(n-r+1)
b. nPr = n! / (n-r)!

Special case: nPn = n!

Examples:
1. A bus has six vacant seats. If three additional passengers enter the bus, in how
many different ways can they be seated?




2. In how many ways can 3 boys and 3 girls be seated in a row containing eight seats if
a. a person may sit in any seat
b. boys and girls must sit in alternate seats?





Theorem 2. If we are given n elements, of which exactly m1 are alike of one kind, exactly m2
are alike of a second kind, ..., and exactly mk are alike of a kth kind, and if n = m1 +
m2 + ... + mk, then the number of distinguishable permutations that can be made of the
n elements taking them all at one time is

$$\frac{n!}{m_1!\, m_2! \cdots m_k!}$$

Examples:
1. Determine the number of different nine-digit numerals that can be formed from the
digits 6,6,6,5,5,5,4,4 and 3.





2. How many permutations can be formed from the word TENNESSEE?
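
Theorems 1 and 2 can be evaluated directly with Python's standard library (`math.perm` requires Python 3.8 or later). The helper name `distinguishable_permutations` is hypothetical; it applies the formula of Theorem 2 by dividing n! by the factorial of each repeated element's count.

```python
from collections import Counter
from math import factorial, perm

def distinguishable_permutations(items):
    """n! divided by m1! m2! ... mk!, where each m is the count of a repeated element."""
    total = factorial(len(items))
    for count in Counter(items).values():
        total //= factorial(count)
    return total

print(perm(6, 3))                                 # 3 passengers in 6 vacant seats: 120
print(distinguishable_permutations("666555443"))  # nine-digit numerals: 5040
print(distinguishable_permutations("TENNESSEE"))  # 3780
```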


Definition. COMBINATION (nCr). Let S be a set containing n elements, and suppose r is
a positive integer such that $r \le n$. Then a combination of r elements of S is a subset
of S containing r distinct elements.

Theorem 3. The number of combinations of n elements taken r at a time is given by
nCr = nPr / r!
= n! / [(n-r)! r!]

Theorem 4. nCr = nC(n-r)


Examples:

1. A football conference consists of 10 teams. If each team plays every other team, how
many conference games are played?







2. A student has twelve posters to pin up on the walls of her room, but there is space
for only 7. In how many ways can she choose the posters to be pinned up?






3. How many committees of five can be formed from 7 sophomores and 5 freshmen if
each committee is to consist of 3 sophomores and 2 freshmen?







4. From 6 history books and 8 economics books, in how many ways can a person select
3 history books and 5 economics books and arrange them on a shelf?
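
The examples above can be checked with `math.comb` (available in Python 3.8 and later). This is a sketch of one way to set up each count; the variable names are chosen only for illustration.

```python
from math import comb, factorial

games = comb(10, 2)                     # every pair of the 10 teams plays once
posters = comb(12, 7)                   # choose 7 of 12 posters
committees = comb(7, 3) * comb(5, 2)    # 3 sophomores and 2 freshmen
# Select 3 history and 5 economics books, then arrange the 8 chosen books.
shelf = comb(6, 3) * comb(8, 5) * factorial(8)

print(games, posters, committees, shelf)  # 45 792 350 45158400

# Theorem 4: nCr = nC(n-r)
print(comb(12, 7) == comb(12, 5))  # True
```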




Exercise #5 FPC, Combinations and Permutations

Objectives:
At the end of the exercise, the student is expected to be able to:

1. Count the number of ways an event may possibly occur by:
a. listing all possible outcomes in the sample space corresponding to the event; and
b. using the method of counting.

2. Solve problems requiring the applications of the concept of permutation and combination.

I. Show complete solution for each.

1. How many different outcomes are possible in a roll of 4 dice? In tossing 3 coins? In
rolling 2 dice and tossing 3 coins simultaneously?
2. How many distinct permutations can be made from the word COOL? List them
down.
3. A package of 12 Game Boy sets contains 4 defective sets. If 5 sets are to be picked out
randomly and sent to a customer for inspection, in how many ways can the
customer find at least three defective sets?
4. How many different telephone numbers can be formed from a seven-digit number if
the first digit cannot be zero?
5. A shelf contains 3 books in red binding, 4 books in blue and 2 in green. In how
many different orders can they be arranged if all the books of the same color must
be kept together?
6. How many different numbers less than 200 can be formed from the digits 1, 2, 3, 4
and 5 (a) if repetitions are not allowed? (b) if repetitions are allowed?
7. How many numbers greater than 300 can be formed with the digits 1, 2, 3, 4, 5 if
repetitions are not allowed?
8. How many committees of 5 can be selected from 12 republicans and 8 democrats (a)
if it must contain 2 republicans and 3 democrats? (b) if it must contain at least 3
republicans?
9. There are 8 baseball teams in a league. How many games will be played if each team
plays each of the other teams 40 times?
10. In how many ways can one make a selection of 5 black balls, 3 red balls, and 2
white balls from a box containing 8 black balls, 7 red balls and 5 white balls?
11. The tennis squad of one college consists of 8 players while that of another consists of 10
players. In how many ways can a doubles match between the 2 institutions be
arranged?
12. In how many ways can one make a selection of 4 novels, 3 biographies and 6 detective
stories from a shelf containing 10 novels, 8 biographies and 10 detective stories?


Lesson #7 Probability

PROBABILITY

- SAMPLE SPACE is the set of all possible outcomes of a given experiment.
- A subset of the sample space of an experiment is called an EVENT associated with
the experiment.

Definition. PROBABILITY OF AN EVENT. If S is the sample space of an experiment and
E is an event associated with the experiment, the probability of E, denoted by P(E), is
defined by

$$P(E) = \frac{n(E)}{n(S)}$$

where n(E) and n(S) are the numbers of elements in E and S respectively.

Furthermore, if P(E) = 0 then the event will never happen or it is an impossible event.
If P(E) = 1, the event is certain to happen or it is a sure event.

Examples:
1. Determine the probability of each of the following events:
a. Obtaining a 4 on a throw of a single die
b. Obtaining a head on a toss of a coin

2. If 2 dice are thrown, what is the probability of obtaining
a sum of 8? a sum of 3?




1 2 3 4 5 6
1 (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
2 (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
3 (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
4 (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
5 (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
6 (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

3. Determine the probability of each of the following events
a. Drawing a heart from a deck of 52 playing cards
b. Drawing 4 spades in succession from a deck of 52 playing cards if after each
card is drawn it is not replaced in a deck




4. If French, Spanish, Russian and English books are placed at random on a shelf
with space for 4 books, what is the probability that the Russian and English books
will be next to each other?
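
The definition P(E) = n(E) / n(S) can be applied literally by listing the sample space and counting the favorable outcomes. The sketch below does this for Examples 2 and 4; the helper name `probability` is hypothetical, and `fractions.Fraction` is used so that the results stay as exact fractions.

```python
from fractions import Fraction
from itertools import permutations, product

def probability(event, sample_space):
    """P(E) = n(E) / n(S) for equally likely outcomes."""
    outcomes = list(sample_space)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

# Example 2: the 36 outcomes of throwing two dice.
dice = list(product(range(1, 7), repeat=2))
p_sum_8 = probability(lambda o: sum(o) == 8, dice)
p_sum_3 = probability(lambda o: sum(o) == 3, dice)

# Example 4: the 24 arrangements of four books on a shelf.
shelf = list(permutations(["French", "Spanish", "Russian", "English"]))
p_adjacent = probability(
    lambda s: abs(s.index("Russian") - s.index("English")) == 1, shelf)

print(p_sum_8, p_sum_3, p_adjacent)  # 5/36 1/18 1/2
```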





CONJUNCTION AND DISJUNCTION PROBABILITIES

Definition. CONJUNCTION PROBABILITY. This type of probability is associated with
events happening together, one event and another event occurring at the same time.
Events, however, may be independent or dependent

Case 1. P(A and B) = P(A) x P(B)
When the occurrence of one event does not influence the probability of the
occurrence of the other event, these events are said to be independent.

Example. At birth the probability that a US female will survive to age 65 is
approximately 7/10, that is, P(F65) = 7/10. The probability that a male will
survive to age 65 is approximately 3/5, i.e., P(M65) = 3/5. What is the
probability that both the male and the female will be alive at age 65?





What is the probability that only the male will be alive at age 65?





What is the probability that at least one of the two will be alive at age 65?





Case 2. P(A and B) = P(A) x P(B|A), where P(B|A) is the probability of B given that A has occurred
When the occurrence of one event is conditioned by the other event, these
events are said to be dependent (conditional).

Example. Suppose a box contains 30 fuses 5 of which are defective. What is
the probability of drawing at random two defective fuses in succession if
the first fuse that has been drawn is not returned before making the
second draw?
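
Cases 1 and 2 can be worked out with exact fractions. The sketch below assumes, as in Case 1, that the survival of the male and of the female are independent; the variable names are chosen only for illustration.

```python
from fractions import Fraction

# Case 1 (independent events): survival to age 65.
p_female = Fraction(7, 10)
p_male = Fraction(3, 5)

both_alive = p_female * p_male                    # P(F65 and M65)
only_male = p_male * (1 - p_female)               # male alive, female not
at_least_one = 1 - (1 - p_female) * (1 - p_male)  # complement of "neither alive"

# Case 2 (dependent events): two defective fuses drawn without replacement.
first_defective = Fraction(5, 30)
second_defective_given_first = Fraction(4, 29)    # one defective fuse already removed
two_defective = first_defective * second_defective_given_first

print(both_alive, only_male, at_least_one, two_defective)  # 21/50 9/50 22/25 2/87
```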





Definition. DISJUNCTION PROBABILITY. This type of probability is associated with
several events that happen either separately or simultaneously. Disjunction probability
is concerned with an either-or relationship.

Case 3. P(A or B) = P(A) + P(B)
When the events do not have common sample points, they are said to be
mutually exclusive.

Example. What is the probability that in a single toss of two dice, the sum
will be 4 or 7?







Case 4. P(A or B) = P(A) + P(B) - P(A and B)
There are also cases of joint events which are not mutually exclusive
because there are some elements common to both events.

Example. What is the probability of getting a sum of 8 or a sum greater
than 7 in a throw of two dice?







Example. Take a math class with 52 students, 27 of whom are males and
the rest are females. A total of 21 of the males and 15 of the females got a
grade above 90. What is the probability that if a student is chosen at
random, this student has either grade of above 90 or is a male?
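
The two disjunction rules can be checked by enumeration. The sketch below is an illustration only; the helper `p` and the variable names are hypothetical, and the counts for the math class are taken from the example above.

```python
from fractions import Fraction
from itertools import product

dice = list(product(range(1, 7), repeat=2))

def p(event):
    """Probability of an event over the 36 equally likely outcomes of two dice."""
    return Fraction(sum(1 for o in dice if event(o)), len(dice))

# Case 3 (mutually exclusive): a sum of 4 or a sum of 7.
p_4_or_7 = p(lambda o: sum(o) == 4) + p(lambda o: sum(o) == 7)

# Case 4 (not mutually exclusive): a sum of 8 or a sum greater than 7.
is_8 = lambda o: sum(o) == 8
over_7 = lambda o: sum(o) > 7
p_8_or_over_7 = p(is_8) + p(over_7) - p(lambda o: is_8(o) and over_7(o))

# Case 4 with counts: math class of 52 students.
total, males = 52, 27
above_90 = 21 + 15          # 21 males and 15 females got a grade above 90
male_and_above_90 = 21
p_above_90_or_male = Fraction(above_90 + males - male_and_above_90, total)

print(p_4_or_7, p_8_or_over_7, p_above_90_or_male)  # 1/4 5/12 21/26
```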






PROBABILITIES INVOLVING QUALITATIVE DATA IN CONTINGENCY TABLE

When the data are presented in the form of frequencies and are classified
according to qualitative rather than quantitative categories, they are called
qualitative data in contingency tables.

Illustration:

                  Vegetarian Status
Gender     Vegetarian   Non-Vegetarian   Total

Male       20           23               43
Female     22           25               47
Total      42           48               90


1. To find the probability of a single event from qualitative data, simply
divide the subtotal of the desired event by the grand total.

P(A) = subtotal/ grand total

Example. The probability that a person is vegetarian







2. To find the conjunction probabilities of two independent events from
qualitative data, divide the observed frequency where the two events
intersect by the grand total.

P(A and B) = observed frequency of the two events' intersection / grand total

Example. The probability that a person is female and a vegetarian



3. To find the probabilities of two dependent events from qualitative data,
divide the observed frequency where the two events intersect by the
subtotal of the event which is used as a condition.

P(A | B) = observed frequency of the two events' intersection / subtotal of the conditional event

Example. The probability of getting a male at random provided that he is a
non-vegetarian













4. To find the disjunction probabilities of the two events

P(A or B) = (subtotal of 1st event / grand total) + (subtotal of 2nd event / grand total)
            - (observed frequency of intersection / grand total)

Example. The probability of getting a female or a person who is a non-vegetarian
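
The four rules can be applied to the vegetarian-status table with a small script. The dictionary layout and the helper names `row_total` and `column_total` are assumptions made for this sketch; the frequencies are those of the illustration table.

```python
from fractions import Fraction

# Observed frequencies: (gender, vegetarian status) -> count.
table = {
    ("Male", "Vegetarian"): 20, ("Male", "Non-vegetarian"): 23,
    ("Female", "Vegetarian"): 22, ("Female", "Non-vegetarian"): 25,
}
grand_total = sum(table.values())

def row_total(gender):
    return sum(n for (g, _), n in table.items() if g == gender)

def column_total(status):
    return sum(n for (_, s), n in table.items() if s == status)

# 1. Single event: a person is vegetarian.
p_vegetarian = Fraction(column_total("Vegetarian"), grand_total)
# 2. Conjunction: female and vegetarian.
p_female_and_veg = Fraction(table[("Female", "Vegetarian")], grand_total)
# 3. Conditional: male, provided that he is a non-vegetarian.
p_male_given_nonveg = Fraction(table[("Male", "Non-vegetarian")],
                               column_total("Non-vegetarian"))
# 4. Disjunction: female or non-vegetarian.
p_female_or_nonveg = Fraction(
    row_total("Female") + column_total("Non-vegetarian")
    - table[("Female", "Non-vegetarian")], grand_total)

print(p_vegetarian, p_female_and_veg, p_male_given_nonveg, p_female_or_nonveg)
# 7/15 11/45 23/48 7/9
```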



Exercise # 6 - Probability

Objectives:
At the end of the exercise, the student is expected to be able to apply the different operations on probability

II. Show complete solution for each.

1. On a throw of two dice, what is the probability of obtaining a sum that is at most 5?



2. If a single card is drawn from a deck of 52 playing cards, what is the probability of
each of the following events: (a) obtaining a red card; (b) obtaining a heart; and (c)
obtaining an ace or a spade?



3. A committee of 5 is to be selected from 10 seniors and 15 juniors. What is the
probability that the committee is to consist of at most 3 juniors?



4. A number of two different digits is to be formed from the digits 1, 2, 3, 4 and 5.
Determine the probability of each of the following events:
a. the number is odd
b. the number is greater than 25



5. A couple is planning to have three children. Find the probabilities that the couple
will have
a. two girls and one boy
b. at least two boys
c. no boys
d. at most two girls
e. two boys followed by a girl












6. Classification of Patients in a Hospital

          Pregnant   Elderly   Children   Total
Male      0          27        35         62
Female    28         49        11         88
Total     28         76        46         150

What is the probability that a patient chosen at random from among the 150 will be:

a. pregnant
b. female or elderly
c. female and elderly
d. male or a child
e. male provided that he is elderly
f. child given male




Lesson # 8 Estimation

ESTIMATION

- refers to any process by which sample information is used to predict or estimate the
numerical value of some population measure.

- The formula, function or procedure used in estimating a population parameter is
called an estimator. The value obtained with the use of the estimator is the
estimate.

- Two types of estimators: point estimator and interval estimator. A point estimator
yields a single numerical value as the estimate. An interval estimator gives a range or band
of values within which the value of the parameter is estimated to lie.

- INTERVAL ESTIMATION OF THE POPULATION MEAN
An interval estimate of $\mu$ (or any parameter) incorporates a measure of the
confidence in the reliability of the range or interval of values within which the
parameter is estimated to lie. Thus, an interval estimate is also called a confidence
estimate, and its limits, confidence limits.




$$P(\bar{X} - k \le \mu \le \bar{X} + k) = 1 - \alpha$$

Where

$\bar{X}$ = point estimate of the mean

$$s.e. = \frac{\sigma}{\sqrt{n}}$$

$$k = Z_{\alpha/2}\,(s.e.)$$

$\alpha$ = level of significance

$1 - \alpha$ = level of confidence


Example.

1. The mean IQ of a random sample of 400 high school students is 110. The standard
deviation of the population of IQ scores is 16. If the population is normally distributed,
find:
a. a .95 confidence interval estimate of $\mu$ ($Z_{\alpha/2} = 1.96$)




b. a .90 confidence interval estimate of $\mu$ ($Z_{\alpha/2} = 1.64$)




2. Find the .90 confidence interval estimate of the mean weight of all the pupils in a
certain school if a random sample of 25 pupils has a mean weight of 70 lbs with a
standard deviation of 15 lbs. Assume the population weights to be normally distributed.
($t_{\alpha/2} = 1.711$)
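
The confidence limits for Examples 1 and 2 can be computed from the critical values given in the handout. The helper `confidence_interval` is a hypothetical name; it takes the critical value (Z or t) as an input rather than looking it up, so the tabled values 1.96, 1.64 and 1.711 are used as stated.

```python
from math import sqrt

def confidence_interval(mean, sd, n, critical_value):
    """Return (lower, upper) limits: mean -/+ critical value x standard error."""
    k = critical_value * sd / sqrt(n)
    return mean - k, mean + k

ci_95 = confidence_interval(110, 16, 400, 1.96)    # Example 1a
ci_90 = confidence_interval(110, 16, 400, 1.64)    # Example 1b
ci_weight = confidence_interval(70, 15, 25, 1.711) # Example 2, t with 24 df

for lower, upper in (ci_95, ci_90, ci_weight):
    print(f"{lower:.3f} to {upper:.3f}")
```

For Example 1a the standard error is 16 / 20 = 0.8, so k = 1.96 x 0.8 = 1.568 and the limits are 110 - 1.568 and 110 + 1.568.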


3. The contents of 7 similar containers of sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0,
10.2 and 9.6 liters. Find a 95% confidence interval for the mean content of all such
containers, assuming an approximate normal distribution for the containers' contents. (

)






4. The mean and standard deviation for the quality grade-point averages of a
random sample of 36 college seniors are calculated to be 2.6 and 0.3, respectively. Find
the 99% confidence interval for the mean of the entire senior class. Interpret the
obtained confidence interval. (

)








5. The manager of a home delivery service for pizza pies wants an estimate of the
average time it takes to deliver an order within the town proper of the City of Naga. A
sample of 25 deliveries had a Mean time of 15 minutes and a standard deviation of 4
minutes. Construct a 95% confidence interval for the average time for all deliveries.
Interpret the interval obtained. ($Z_{\alpha/2} = 1.96$)







6. A random sample of 12 students in a certain dormitory showed an average weekly
expenditure of P400 for snack foods, with a standard deviation of P50.25. Construct a
90% confidence interval for the average amount spent each week on snack foods by
female students living in this dormitory, assuming the expenditure to be approximately
normally distributed. Interpret your confidence interval.
( t = 1.796)


Lesson # 9 Normal Distribution


PROPERTIES OF A NORMAL CURVE

The normal distribution is represented by a normal curve. A normal curve is a
bell-shaped figure that has the following six properties:
1. It is symmetrical about the mean $\bar{X}$.
2. The mean is equal to the median, which is also equal to the mode.
3. The tails or ends are asymptotic relative to the horizontal line.
4. The total area under the normal curve is equal to 1 or 100%.
5. The normal curve area may be subdivided into at least three standard scores
each to the left and to the right of the vertical axis.
6. Along the horizontal line, the distance from one integral standard score to the
next integral standard score is measured by the standard deviation.


AREA UNDER THE NORMAL CURVE

In making use of the properties of the normal curve to solve certain types of
statistical problems, one must first learn how to find areas under the normal curve.

The first step in finding areas under the normal curve is to convert the normal
curve of any given variable into a standardized normal curve by using the formula:

$$Z = \frac{X - \bar{X}}{S}$$

where Z = standard score
$\bar{X}$ = mean
S = standard deviation
X = given value of a particular variable

WORDED PROBLEMS:
1. Given a normal distribution with mean 300 and standard deviation s=50, find the
probability that x assumes a value greater than 362.






2. Given a normal distribution with mean = 50 and s = 10, find the probability that x
assumes a value between 45 and 62.







3. Given a normal distribution with mean = 40 and s = 6, find the value of X that has
a. 38% of the area below it
b. 5% of the area above it










4. An electrical firm manufactures light bulbs that have a length of life that is normally
distributed with mean equal to 800 hours and a standard deviation of 40 hours.
Find the probability that a bulb burns between 778 and 834 hours.








5. The average weekly income of 2,000 construction workers is P1,500 with a standard
deviation of P200. Assuming that the weekly incomes are normally distributed, find
the number of workers who earn:
a. from P1,300 to P1,600 per week
b. less than P1,250 per week?
c. Greater than P1,800 per week?
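
Areas under the normal curve can also be obtained with `statistics.NormalDist` from the Python standard library (Python 3.8 or later) instead of a printed table. The sketch below covers worded problems 1 to 3; the helper `z_score` and the variable names are hypothetical, and answers read from a rounded Z table may differ slightly in the last decimal place.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standard score Z = (X - mean) / S."""
    return (x - mean) / sd

# Problem 1: mean 300, standard deviation 50, P(X > 362).
z1 = z_score(362, 300, 50)               # 1.24
p1 = 1 - NormalDist().cdf(z1)

# Problem 2: mean 50, standard deviation 10, P(45 < X < 62).
d2 = NormalDist(50, 10)
p2 = d2.cdf(62) - d2.cdf(45)

# Problem 3: mean 40, standard deviation 6.
d3 = NormalDist(40, 6)
x_38_below = d3.inv_cdf(0.38)            # 38% of the area lies below this value
x_5_above = d3.inv_cdf(0.95)             # 5% above means 95% below

print(round(z1, 2), round(p1, 4), round(p2, 4),
      round(x_38_below, 2), round(x_5_above, 2))
```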






Exercise # 7 Normal Distribution



Objectives: At the end of the exercise the student should be able to:
1.Find probabilities using the standard normal probability curve;
2. Apply the concepts of finding areas under the normal probability curve in solving
problems


I. Find the probability.
a. P( z < -1.25) f. P( z > 0.75) k. P(1.34 < z < 1.58)
b. P( z < 1.62) g. P( z > 0.59) l. P(-1.46 < z < 2.01)
c. P( z < 0.95) h. P( z > 3.05) m. P(-0.56 < z < 1.01)
d. P( z < -2.03) i. P( z > 2.77) n. P(-0.95 < z < 0.05)
e. P( z < -1.23) j. P( z > 0.51) o. P(-1.43 < z < 1.85)


II. Find the unknown constant a given the area under the normal curve.
a. P(z < a) = 0.25
b. P(z > a) = 0.99


III. Solve the following problems.

a. Given a normal distribution with $\mu = 40$ and $\sigma = 6$, find
i. the area below 32
ii. the area above 27
iii. the area between 42 and 51
iv. the x value that has 45% of the area below it
v. the x value that has 13% of the area above it


b. A soft drink machine is regulated so that it discharges an average of 200
milliliters per cup ($\mu = 200$). If the amount of drink is normally
distributed with $\sigma = 15$ milliliters,
i. What fraction of the cups will contain more than 224 milliliters?
ii. What is the probability that a cup contains between 191 and 209
milliliters?
iii. How many cups are likely to overflow if 230-milliliter cups are used for
the next 1000 drinks?
iv. Below what value do we get the smallest 25% of the drinks?





Lesson # 10- Test of Hypothesis
COMMON TERMS IN INFERENTIAL STATISTICS

- A HYPOTHESIS is a statement, which aims to explain facts about the real
world. A test of hypothesis is a two-way decision problem. It is a procedure to
substantiate or invalidate a claim which is stated as null hypothesis

Definition. A NULL HYPOTHESIS (Ho) is the hypothesis that we hope to accept
or reject; must always express the idea of nonsignificance of difference
An ALTERNATIVE HYPOTHESIS (Ha). The rejection of Ho is the
acceptance of this hypothesis.


- TYPE I and TYPE II ERROR

Decision Ho is TRUE Ha is TRUE

Reject Ho Type I error Correct decision
Accept Ho Correct decision Type II error

Type I error (α error) occurs when we reject the null hypothesis when in fact the
null hypothesis is true.

Type II error (β error) occurs when we accept the null hypothesis when in fact the
null hypothesis is false.


- ONE-TAILED AND TWO-TAILED TEST

Definition. When the rejection region is located at only one extreme of the range
of values for the test statistic, the test is ONE-TAILED. If Ha is a statement of
non-equality represented by the sign ≠, then the hypothesis is
non-directional, thus we have a TWO-TAILED test.





Steps in Test of Hypothesis:

i. State the hypotheses, Ho and Ha.
ii. Determine the appropriate test statistic to use
iii. Choose the level of significance and formulate the decision rule
iv. Compute the value of the test statistic from the sample data
v. Make a decision (reject or accept) in accordance with the decision rule formulated
vi. Draw a conclusion in relation to the objective of the original problem



I. Mean of a Single Population

Case 1. Z Test

a. Hypotheses: Ho: $\mu = \mu_0$ against

A. Ha: $\mu \neq \mu_0$ or
B. Ha: $\mu < \mu_0$ or
C. Ha: $\mu > \mu_0$

b. Test Statistic: Z Test
c. Computation:

$Z_c = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

d. Decision Rule: At a level of significance $\alpha$,

A. For Ha: $\mu \neq \mu_0$, reject Ho if $|Z_c| > Z_{\alpha/2}$, otherwise accept Ho.
B. For Ha: $\mu < \mu_0$, reject Ho if $Z_c < -Z_{\alpha}$, otherwise accept Ho.
C. For Ha: $\mu > \mu_0$, reject Ho if $Z_c > Z_{\alpha}$, otherwise accept Ho.




Example 1. The weight of crabs is normally distributed with mean 28.5
ounces and standard deviation of 3 ounces. A new breeder claims that he
can breed crabs yielding a mean weight of more than 28.5 ounces. A random
sample of 16 crabs from the new breeder had a mean weight of 29.2
ounces. At α = 5%, do the data support the breeder's claim?

i. Ho: $\mu = 28.5$
Ha: $\mu > 28.5$

ii. Test Statistic: Z Test

iii. Decision Rule: Reject Ho if $Z_c > Z_{\alpha}$, otherwise accept Ho.

iv. Computation:

$Z_c = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \dfrac{29.2 - 28.5}{3 / \sqrt{16}} = 0.933$

$Z_{\alpha} = Z_{0.05} = 1.645$

v. Decision: Since $Z_c < Z_{\alpha}$ (0.933 < 1.645), accept Ho.

vi. Conclusion: At the 5% level of significance, there is not enough evidence to
support the new breeder's claim; the mean weight of the sample is not
significantly greater than 28.5 ounces.
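
The computation in Example 1 can be checked with a short script. This is only a sketch: the helper name `z_test_upper` is made up for illustration, and the critical value is taken from Python's `statistics.NormalDist` instead of a printed Z table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper(sample_mean, mu0, sigma, n, alpha):
    """Right-tailed one-sample Z test (Ha: mu > mu0).

    Returns the computed Z, the critical Z, and the reject/accept decision.
    """
    z_c = (sample_mean - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    return z_c, z_crit, z_c > z_crit


# Figures from Example 1: mean 29.2, mu0 = 28.5, sigma = 3, n = 16, alpha = 5%
z_c, z_crit, reject = z_test_upper(29.2, 28.5, 3, 16, 0.05)
print(f"Zc = {z_c:.3f}, critical value = {z_crit:.3f}")
print("Reject Ho" if reject else "Accept Ho")
```

For a left-tailed or two-tailed test, only the comparison changes, following decision rules A and B of Case 1.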



Example 2. For the past five years, the mean height of AdeNU students is 60
inches. A simple random sample of 100 is taken from the present students. It
was found that the mean height is 65 inches with a standard deviation of 4
inches. Is there reason to believe that the mean height of present AdeNU
students is different from that of the past five years at the 5% level of significance?




Case 2. T Test

a. Hypotheses: Ho: $\mu = \mu_0$ against

A. Ha: $\mu \neq \mu_0$ or
B. Ha: $\mu < \mu_0$ or
C. Ha: $\mu > \mu_0$

b. Test Statistic: T Test
c. Computation:

$T_c = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$

d. Decision Rule: At a level of significance $\alpha$,

A. For Ha: $\mu \neq \mu_0$, reject Ho if $|T_c| > T_{[\alpha/2,\, n-1]}$, otherwise accept Ho.
B. For Ha: $\mu < \mu_0$, reject Ho if $T_c < -T_{[\alpha,\, n-1]}$, otherwise accept Ho.
C. For Ha: $\mu > \mu_0$, reject Ho if $T_c > T_{[\alpha,\, n-1]}$, otherwise accept Ho.


Example 3. A soft drink vending machine is set to dispense 6 ounces per
cup. The machine is tested eight times, yielding a mean cup fill of 5.8
ounces with a standard deviation of 0.16 oz. Is there evidence at the 5% level of
significance that the machine is underfilling cups? Assume normality.

i. Ho: $\mu = 6$
Ha: $\mu < 6$

ii. Test Statistic: T Test

iii. Decision Rule: Reject Ho if $T_c < -T_{[\alpha,\, n-1]}$, otherwise accept Ho.

iv. Computation:

$T_c = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}} = \dfrac{5.8 - 6}{0.16 / \sqrt{8}} = -3.536$

$-T_{[\alpha,\, n-1]} = -T_{[0.05,\, 7]} = -1.895$



v. Decision: Since -3.536 < -1.895, reject Ho.

vi. Conclusion: At the 5% level of significance, there is evidence to say that the
machine is underfilling the cups.
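
The arithmetic of Example 3 can be verified as follows. This is a minimal sketch: the function name `t_statistic` is hypothetical, and because the Python standard library has no t distribution, the critical value 1.895 is simply copied from the t table as quoted in the example.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: (x-bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))


# Figures from Example 3: mean 5.8 oz, mu0 = 6 oz, s = 0.16 oz, n = 8
t_c = t_statistic(5.8, 6.0, 0.16, 8)

# T[0.05, 7] read from the t table, as quoted in Example 3 (not computed here)
t_table = 1.895

# Left-tailed test (Ha: mu < 6): reject Ho if Tc < -T[alpha, n-1]
reject = t_c < -t_table
print(f"Tc = {t_c:.3f}")
print("Reject Ho" if reject else "Accept Ho")
```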


Example 4. The monthly output of a plywood manufacturer was measured in
nine randomly selected months. The results obtained (in tons) are 100, 120,
100, 102, 130, 140, 150, 140 and 145. Test the hypothesis that the mean
monthly output is 140 tons against the alternative that it is not 140 tons at
the 10% level of significance. Assume that the monthly output is a normal random
variable.



Exercise # 8 Test of Hypothesis ( Z and T Test)

A. For each problem, formulate an appropriate null (Ho) and an appropriate alternative (Ha)
hypothesis.


1. ADNU female students spend an average of 6 hours per day studying. Isabel
suspects that male ADNU students spend less time studying compared to their
female counterpart. She decided to conduct a study regarding the study habits of
male ADNU students. She intends to find out if the average time per day that a male
ADNU student spends doing his schoolwork is less than that of a female ADNU
student.

2. A fitness buff read about a new diet program. He wants to adopt it but
unfortunately, following the new diet program requires buying nutritious, low calorie
yet expensive foods. He thus randomly selected some of his friends who already
adopted the new diet and asked them about its effectiveness. He intends to adopt the
new diet only if the percentage of people who claim that the new diet program works is
greater than 60%.

3. During a flu epidemic, 20% of the population in Los Banos suffer from flu. A
physician theorizes that regular takers of vitamin C are less susceptible to the flu. To
test her theory, she sampled 500 regular takers of vitamin C to determine how many
of them had flu.


B. Carry out a complete test of hypothesis for the following problems.

1. A certain brand of powdered milk is advertised as having a net weight of 250 grams. If
the net weights of a random sample of 10 cans are 253, 248, 252, 245, 247, 249, 251,
250, 247 and 248 grams, can it be concluded that the average
net weight of the cans is less than the advertised amount? Use α = 0.01 and assume
that the net weight of this brand of powdered milk is normally distributed.

2. In a time and motion study, it was found that the average time required by workers
to complete a certain manual operation was 26.6 minutes. A group of 20 workers was
randomly chosen to receive special training for two weeks. After the training it was
found that their average time was 24 minutes with a standard deviation of 3
minutes. Can it be concluded that the special training speeds up the operation? Use
α = 0.05.


3. The manager of an appliance store, after noting that the average daily sales was only
12 units, decided to adopt a new marketing strategy. Daily sales under this strategy
were recorded for 90 days, after which period the average was found to be 15 units
with a standard deviation of 4 units. Does this indicate that the new marketing
strategy increased the daily sales? Employ α = 0.01.




4. The daily wages in a particular industry are normally distributed with a mean of
P66.00. In a random sample of 144 workers of a very large company in this industry,
the average daily wage was found to be P62.00 with a standard deviation of P12.50.
Can this company be accused of paying inferior wages at the 0.01 level of
significance?


5. An electrical company claims that the lives of the light bulbs it manufactures are
normally distributed with a mean of 1,000 hours and a standard deviation of 150
hours. If a random sample of 100 bulbs produced by this company has a mean life
of 980 hours, do the data support the claim of the electrical company at α = 0.01?




II. Two Population Means T Test

A. Dependent (Paired) or Independent
i. Ho: The population mean of A is equal to the population mean of B.
Ha: The population means are not equal.
ii. Decision rule: Reject Ho if p-value < level of significance
or |t-computed| > critical t-value, otherwise accept Ho.


III. ANOVA
Sample Problems:
a. A researcher wishes to know if there are differences in the average preparation time of
four methods of preparing a solvent.
b. An agriculturist may compare the average yields of three corn varieties used in Los
Banos.
c. A consumer wishes to know if the different brands of gasoline in the market are equally
good with respect to average mileage.
d. A medical researcher is interested in comparing the effectiveness of 3 different
treatments to lower the cholesterol of patients with high values.
e. An ecologist wants to compare the amount of a certain pollutant in five rivers.


i. Ho: There is no difference between groups.
Ha: There is a difference between groups.
ii. Decision rule: Reject Ho if p-value < level of significance
or F-value > critical value, otherwise accept Ho.


IV. Chi-Square Test of Independence

This test is usually applied on enumeration data or data in contingency tables. It
tests the association or independence of one variable from another variable.


i. Ho: The two variables are independent.
Ha: The two variables are dependent.
ii. Decision rule: Reject Ho if p-value < level of significance
or χ² value > critical value, otherwise accept Ho.











SAMPLE PROBLEMS


Two Population Means - T test

A. Dependent or Paired

1. In a study of the effectiveness of physical exercise in weight reduction, a
simple random sample of 8 persons engaged in a prescribed program of
physical exercise for one month showed the following results:



Weight Before    209   178   169   212   180   192   158   180
Weight After     196   171   170   207   177   190   159   180

At 1% level of significance, do the data provide evidence that the prescribed
program of exercise is effective?

a. Ho: The mean weights before and after are equal; therefore the program is not
effective.

Ha: The mean weights before and after are not equal; therefore the program is
effective.

b. Decision rule: Reject Ho if |T-computed| > critical value, otherwise accept Ho, at
the 1% level of significance.


c. Test Statistic: T-test on Two Populations (paired)

d. Computation: T-computed = 2.07
Critical value = 3.499


e. Decision: Accept Ho.

f. Conclusion: At the 1% level of significance, there is not sufficient evidence to say that
the program is effective.
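
The T-computed value in this problem can be reproduced from the data with the usual paired (dependent) t formula, the mean of the differences divided by its standard error. The sketch below assumes differences are taken as before minus after; the critical value 3.499 is copied from the solution above and is not computed.

```python
from math import sqrt
from statistics import mean, stdev

before = [209, 178, 169, 212, 180, 192, 158, 180]
after = [196, 171, 170, 207, 177, 190, 159, 180]

# Differences (before - after); a positive value means weight was lost.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
d_bar = mean(diffs)   # mean difference
s_d = stdev(diffs)    # sample standard deviation of the differences (n - 1 divisor)
t_c = d_bar / (s_d / sqrt(n))

t_table = 3.499       # critical value quoted in the solution (t table, 7 df)
reject = abs(t_c) > t_table
print(f"mean difference = {d_bar}, T-computed = {t_c:.2f}")
print("Reject Ho" if reject else "Accept Ho")
```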



B. Independent


2. Some statistics students complain that pocket calculators give other students
an advantage during statistics examinations. To check this contention, a simple
random sample of 45 students were randomly assigned to two groups, 23 to
use calculators and 22 to perform calculations by hand. The students then
took a statistics examination that required a modest amount of arithmetic.
The results are shown below:


With Calculator


85 86 89 84 82 83 90 91 86 90 87 87 92 85 86 89 88
88 89 90 85 89 90


Without Calculator


86 88 90 92 86 85 88 89 85 91 86 85 92 84 83 88 90
91 86 90 86 87




Do the data provide sufficient evidence to indicate that the students taking
this particular examination obtain higher scores when using a calculator? Test at
α = 10%.

a. Ho: The mean scores are equal.
Ha: The mean score with a calculator is higher than the mean score without one.


b. Decision rule: Reject Ho if T-computed > critical value, otherwise accept Ho.


c. Test Statistic: T-test on Two Populations (independent)


d. Computation: T-computed = 0.25
Critical value = 1.303


e. Decision: Accept Ho.


f. Conclusion: At the 10% level of significance there is not enough evidence to say
that the use of calculators will assure students of higher scores.
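
A sketch of the independent-samples computation for this problem is given below. It assumes the pooled-variance (equal variances) form of the t statistic, which the handout does not state explicitly; the critical value 1.303 is copied from the solution above. Note that the sign of the result depends on the order of subtraction; its magnitude agrees with the 0.25 reported above.

```python
from math import sqrt
from statistics import mean, variance

with_calc = [85, 86, 89, 84, 82, 83, 90, 91, 86, 90, 87, 87,
             92, 85, 86, 89, 88, 88, 89, 90, 85, 89, 90]
without_calc = [86, 88, 90, 92, 86, 85, 88, 89, 85, 91, 86,
                85, 92, 84, 83, 88, 90, 91, 86, 90, 86, 87]

n1, n2 = len(with_calc), len(without_calc)

# Pooled variance (assumption: the two populations have equal variances)
sp2 = ((n1 - 1) * variance(with_calc)
       + (n2 - 1) * variance(without_calc)) / (n1 + n2 - 2)

# t statistic for (mean with calculator) - (mean without calculator)
t_c = (mean(with_calc) - mean(without_calc)) / sqrt(sp2 * (1 / n1 + 1 / n2))

t_table = 1.303          # critical value quoted in the solution
reject = t_c > t_table   # right-tailed: are scores higher with a calculator?
print(f"T-computed = {t_c:.2f}")
print("Reject Ho" if reject else "Accept Ho")
```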






ANOVA

3. A study was conducted to compare three teaching methods. Three groups
of 6 students were chosen and each group was subjected to one of the three types of
teaching method. The grades of the students taken at the end of the semester
are given as:

             Group I     Group II    Group III
             Method A    Method B    Method C
Student 1    84          70          90
Student 2    90          75          95
Student 3    92          90          100
Student 4    96          80          98
Student 5    84          75          88
Student 6    88          75          90



a. Ho: The mean grades under the three teaching methods are equal.
Ha: The mean grades under the three teaching methods are not all equal.

b. Decision rule: Reject Ho if F-computed > critical value, otherwise accept Ho.

c. Test Statistic: F-test ANOVA

d. Computation: F-computed = 13.121
Critical value = 3.68


e. Decision: Reject Ho.

f. Conclusion: There is evidence to say that the three methods are not equal.
We can also conclude that Method C (Group III) is more effective since its students got higher
grades compared to the other two methods.
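
The F-computed value above can be reproduced with a short one-way ANOVA calculation: the between-groups mean square divided by the within-groups mean square. This is a sketch only; the variable names are illustrative, and the critical value 3.68 is copied from the solution.

```python
from statistics import mean

groups = {
    "Method A": [84, 90, 92, 96, 84, 88],
    "Method B": [70, 75, 90, 80, 75, 75],
    "Method C": [90, 95, 100, 98, 88, 90],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)
k = len(groups)            # number of groups
n_total = len(all_values)  # total number of observations

# Sum of squares between groups and within groups
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_c = ms_between / ms_within

f_table = 3.68             # critical value quoted in the solution (2 and 15 df)
reject = f_c > f_table
print(f"F-computed = {f_c:.3f}")
print("Reject Ho" if reject else "Accept Ho")
```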


Chi-Square Test of Independence


4. It is believed that people with high blood pressure need to watch their weight.
A random sample of 300 subjects was classified according to their weight and
blood pressure. At the 5% level of significance, is there sufficient evidence to
conclude that a person's weight is related to his blood pressure?


                     Blood Pressure
Weight          High    Normal    Low

Overweight      40      34        18
Normal          36      77        27
Underweight     16      33        19

a. Ho: Weight is independent of blood pressure; that is, the two variables weight
and blood pressure are independent.

Ha: Weight is dependent on blood pressure; that is, the two variables weight
and blood pressure are dependent.

b. Decision rule: Reject Ho if χ²-computed > critical value, otherwise accept Ho.

c. Test Statistic: Chi-square Test

d. Computation: χ²-computed = 12.75
Critical value = 9.49

e. Decision: Reject Ho.

f. Conclusion: At the 5% level of significance, there is evidence to say that weight
is related to blood pressure. Among the overweight persons in the sample, the largest
share (40 of 92, about 43%) had high blood pressure. Persons of
normal weight were most likely to have normal blood pressure. Those
who were underweight were also most likely to have normal blood pressure.
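
The χ²-computed value can be checked by comparing each observed count with its expected count, (row total × column total) / grand total. The sketch below assumes the table is read with weight categories as rows and blood pressure categories as columns; the critical value 9.49 is copied from the solution.

```python
# Observed counts: rows = weight, columns = blood pressure (High, Normal, Low)
observed = [
    [40, 34, 18],  # Overweight
    [36, 77, 27],  # Normal
    [16, 33, 19],  # Underweight
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

chi2_table = 9.49  # critical value quoted in the solution (5% level, 4 df)
reject = chi2 > chi2_table
print(f"chi-square = {chi2:.2f}, df = {df}")
print("Reject Ho" if reject else "Accept Ho")
```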




Exercise # 9 Test of Hypothesis (T-test, ANOVA and Chi-Square Test)

Objectives:
At the end of the exercise, the student is expected to be able to apply the appropriate statistical procedure
in performing test of hypothesis of various problems


Carry out a complete test of hypothesis for the following problems.

1. As part of a study to determine the effects of a certain oral contraceptive on
weight gain, 12 healthy females were weighed at the beginning of a course of
oral contraceptive usage. They were reweighed after three months. Do the
results suggest evidence of weight gain? Use α = 0.05.

Subject           1    2    3    4    5    6    7    8    9    10   11   12
Initial Weight    120  141  130  162  150  148  135  140  129  120  140  130
3-Month Weight    123  143  140  162  145  150  140  143  130  118  141  132
Source: Basic Statistics for Health Sciences by Kuzma

a. Ho:


Ha:


b. Test Statistic:

c. Decision Rule:


d. Computation: computed value = 1.75
Critical value = 2.201

e. Decision:


f. Conclusion:













2. An investment analyst claims to have mastered the art of forecasting the price
changes of gold. The following table gives the actual gold price changes and the
changes forecasted by the investment analyst (in %) on a simple random
sample of 8 months. Use α = 5%.

Month 1 2 3 4 5 6 7 8
Actual Price Changes 7.3 -2.1 8.5 -1.5 9.2 6.7 -4.8 -0.8
Forecasted Changes 14.9 -19.7 7.0 -5.3 1.0 -0.8 -8.3 6.7


a. Ho:


Ha:


b. Test Statistic:

c. Decision Rule:


d. Computation: Computed value = 1.15
Critical value = 2.365

e. Decision:



f. Conclusion:



3. Four groups of 4 patients each were subjected to four different types of
treatment for the same ailment. The following data are on the number of days
that elapsed before they were completely cured. What conclusions may be drawn
about the four types of treatment?


             Treatment A   Treatment B   Treatment C   Treatment D
Patient 1    10            11            3             6
Patient 2    9             11            4             10
Patient 3    6             18            5             8
Patient 4    7             6             7             11


a. Ho:


Ha:


b. Test Statistic:


c. Decision Rule:




d. Computation: Computed value = 3.474
Critical value = 3.49


e. Decision:



f. Conclusion:





4. Test if there is a significant association between academic performance and IQ.

Table. Academic Performance and IQ of 100 Students

                                 IQ
Academic
Performance     High    Average    Low    Total

Passed          31      45         4      80
Failed          1       4          15     20

Total           32      49         19     100



a. Ho:

Ha:


b.Test Statistic:


c.Decision Rule:



d.Computation: Computed value = 51.25
Critical value = 5.99


e.Decision:



f.Conclusion:



Lesson # 11 - TWO-FACTOR ANOVA

Example 1. A research study was conducted to examine the impact of eating a high
protein breakfast on adolescents' performance during a physical education
fitness test. Half of the subjects received a high protein breakfast and half were given
a low protein breakfast. All of the adolescents, both male and female, were given a
fitness test with high scores representing better performance. Test scores are
recorded below.

           Males                           Females
High Protein    Low Protein     High Protein    Low Protein
10              5               5               3
7               4               4               4
9               7               6               5
6               4               3               1
8               5               2               2


Statistical test results:

Treatment                                    F-value    F-critical (5%)    F-critical (1%)

between (protein level)                      *8.89      4.49               8.53
within (gender)                              *20.00     4.49               8.53
among (interaction between
protein level and gender)                    2.22       4.49               8.53

Ho: There is no difference in performance between the two protein levels.
There is no difference in performance between the two genders.
There is no interaction between protein level and gender.

Interpretation:

At the 5% level of significance it can be concluded that there is a significant difference
in performance for both protein level and gender. There was no significant
interaction effect. Based on these data, it appears that a high protein breakfast results in
better fitness test scores. Additionally, males seem to have significantly higher
fitness test scores than females.
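
The three F-values in the table can be reproduced from the raw scores. The sketch below assumes a balanced design (the same number of scores in every cell, as in this example) and uses the standard sums of squares for a two-factor ANOVA with replication; the names `cells` and `level_mean` are illustrative only.

```python
from statistics import mean

# Scores by cell: (gender, protein level)
cells = {
    ("Male", "High"): [10, 7, 9, 6, 8],
    ("Male", "Low"): [5, 4, 7, 4, 5],
    ("Female", "High"): [5, 4, 6, 3, 2],
    ("Female", "Low"): [3, 4, 5, 1, 2],
}
genders = ["Male", "Female"]
proteins = ["High", "Low"]
n_cell = 5  # scores per cell (balanced design assumed)

all_scores = [x for scores in cells.values() for x in scores]
grand = mean(all_scores)


def level_mean(position, level):
    """Mean of all scores whose cell key has `level` at `position`."""
    return mean([x for key, scores in cells.items()
                 if key[position] == level for x in scores])


# Sums of squares for the two main effects, the interaction, and error
ss_gender = sum(n_cell * len(proteins) * (level_mean(0, g) - grand) ** 2
                for g in genders)
ss_protein = sum(n_cell * len(genders) * (level_mean(1, p) - grand) ** 2
                 for p in proteins)
ss_cells = sum(n_cell * (mean(s) - grand) ** 2 for s in cells.values())
ss_inter = ss_cells - ss_gender - ss_protein
ss_error = sum((x - mean(s)) ** 2 for s in cells.values() for x in s)

df_error = len(all_scores) - len(cells)
ms_error = ss_error / df_error

# Each effect has (levels - 1) degrees of freedom; here all are 1
f_protein = (ss_protein / (len(proteins) - 1)) / ms_error
f_gender = (ss_gender / (len(genders) - 1)) / ms_error
f_inter = (ss_inter / ((len(genders) - 1) * (len(proteins) - 1))) / ms_error

print(f"F protein = {f_protein:.2f}, F gender = {f_gender:.2f}, "
      f"F interaction = {f_inter:.2f}")
```

Each F-value is then compared with the F-critical value in the table at the chosen level of significance.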



Seatwork:

1. Different typing skills are required for secretaries depending on whether one is
working in a law office, an accounting firm, or for a mathematical research group at a
major university. In order to evaluate candidates for these positions, an employment
agency administers three distinct standardized typing samples. A time penalty has been
incorporated into the scoring of each sample based on the number of typing errors. The
mean and standard deviation for each test, together with the score achieved by a recent
applicant, are given in the table below. For what type of position does this applicant seem
to be best suited?


Sample         Applicant's Score    Mean       Standard Deviation

Law            141 sec              180 sec    30 sec
Accounting     7 min                10 min     2 min
Scientific     33 min               26 min     5 min



2. Researchers have sought to examine the effect of various types of music on agitation
levels in patients who are in the early and middle stages of Alzheimer's disease. Patients
were selected to participate in the study based on their stage of Alzheimer's disease.
Three forms of music were tested: easy listening, Mozart, and piano interludes. While
listening to music, agitation levels were recorded for the patients with a high score
indicating a higher level of agitation. Scores are recorded below.

          Early Stage Alzheimer's                       Middle Stage Alzheimer's
Piano Interlude   Mozart   Easy Listening     Piano Interlude   Mozart   Easy Listening

21                9        29                 22                14       15
24                12       26                 20                18       18
22                10       30                 25                11       20
18                5        24                 18                9        13
20                9        26                 20                13       19





3. A study examining differences in life satisfaction among young adult, middle adult
and older adult men and women was conducted. Each individual who participated in
the study completed a life satisfaction questionnaire. A high score on the test indicates
a higher level of life satisfaction. Test scores are recorded below.

                    Males                              Females
          Young     Middle    Older          Young     Middle    Older
          Adult     Adult     Adult          Adult     Adult     Adult

          4         7         10             7         8         10
          2         5         7              4         10        9
          3         7         9              3         7         12
          4         5         8              6         7         11
          2         6         11             5         8         13
Mean =    3         6         9              5         8         11





Lesson # 12 Pearson Moment Correlation
The Pearson Moment correlation is one of the measures of correlation; it quantifies the
strength as well as the direction of the relationship between two variables. The correlation
coefficient (r) has the following interpretation:


Scale (+/-)      Decision

1.00             Perfect Relationship
0.80 - 0.99      Very Strong Relationship
0.60 - 0.79      Strong Relationship
0.40 - 0.59      Moderate Relationship
0.20 - 0.39      Weak Relationship
0.01 - 0.19      Very Weak Relationship
0.00             No Relationship
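
As an illustration of how r is obtained and read against the scale above, the sketch below computes the correlation coefficient from deviations about the means and then looks up the interpretation. The function names and the small data set (hours of study and quiz scores) are hypothetical and are not taken from this manual; r is rounded to two decimals before the lookup because the scale is stated to two decimals.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def interpret(r):
    """Map r (either sign) to the verbal scale in the table above."""
    a = round(abs(r), 2)
    if a == 0.00:
        return "No Relationship"
    if a < 0.20:
        return "Very Weak Relationship"
    if a < 0.40:
        return "Weak Relationship"
    if a < 0.60:
        return "Moderate Relationship"
    if a < 0.80:
        return "Strong Relationship"
    if a < 1.00:
        return "Very Strong Relationship"
    return "Perfect Relationship"


# Hypothetical data: hours of study and quiz scores of five students
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)
print(f"r = {r:.2f}: {interpret(r)}")
```

The sign of r gives the direction of the relationship (positive or negative), while its absolute value is what is read against the scale.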



Table. Results of the AdNU Entrance Examination for 20 Examinees

No.    SAI    RPM    Math    English
1      52     25     47      21
2      84     40     48      11
3      113    90     58      29
4      92     90     47      14
5      98     80     54      17
6      91     80     56      19
7      52     15     52      18
8      116    40     68      38
9      101    60     69      22
10     83     15     48      16
11     65     10     52      16
12     96     95     54      19
13     94     80     54      15
14     89     65     56      20
15     91     45     54      21
16     92     80     64      17
17     101    95     58      33
18     97     95     56      17
19     89     80     56      11
20     96     95     58      27
