(ANOVA)
A. CHAPTER OBJECTIVES
B. INTRODUCTION
C. UNDERSTANDING THE FUNDAMENTALS OF ANOVA
D. UNDERSTANDING THE ANOVA TABLE
E. ANALYZING THE DATA IN MINITAB
F. ANOVA APPLICATION EXAMPLE
G. ANOVA PRACTICAL APPLICATION EXAMPLE
H. ANOVA TEAM EXERCISE
CHAPTER OBJECTIVES
To understand the fundamentals of ANOVA, including its components, general definitions, statistical assumptions and basic concepts.
To provide an understanding of the ANOVA table.
To be able to conduct an experiment using ANOVA.
INTRODUCTION
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: At least one $\mu_k$ is different
UNDERSTANDING THE
FUNDAMENTALS OF ANOVA
CALCULATING THE SUM OF SQUARES
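The following formula is used to calculate the sum of squares.

$$SS_T = SS_B + SS_W$$

$$\sum_{j=1}^{g}\sum_{i=1}^{n}\left(X_{ij}-\bar{\bar{X}}\right)^2 = n\sum_{j=1}^{g}\left(\bar{X}_j-\bar{\bar{X}}\right)^2 + \sum_{j=1}^{g}\sum_{i=1}^{n}\left(X_{ij}-\bar{X}_j\right)^2$$

Where:
$X_{ij}$ = individual values
$\bar{X}_j$ = mean of subgroup $j$
$\bar{\bar{X}}$ = grand mean of all values
$\sum_{j=1}^{g}$ = summation over all subgroups ($j = 1$ to $g$)
$\sum_{i=1}^{n}$ = summation over all individuals within the subgroups ($i = 1$ to $n$)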
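The decomposition SST = SSB + SSW can be checked numerically. The following is a minimal sketch in Python, not Minitab output; the three subgroups and their values are made up purely for illustration, and the variable names are assumptions.

```python
# Illustrative data only (not from the golf ball study):
# g = 3 subgroups with n = 4 individual values each.
groups = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [20.0, 19.0, 21.0, 22.0],
]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


all_values = [x for group in groups for x in group]
grand_mean = mean(all_values)
group_means = [mean(group) for group in groups]

# Total: every individual value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between: each subgroup mean against the grand mean, weighted by subgroup size.
ss_between = sum(
    len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
)

# Within: every individual value against its own subgroup mean.
ss_within = sum(
    (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
)

print(ss_total, ss_between, ss_within)
```

For this made-up data the between sum of squares (168) is much larger than the within sum of squares (15), which is the pattern ANOVA looks for when the subgroup means really differ.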
Population variances of the output (response) are equal across all levels
of the given factor (Test for Equal Variances). We can test this
assumption in Minitab using the Stat>ANOVA>Test for Equal Variances
procedure.
3. Compare these 2 estimates using the F (variance ratio) test. If they are approximately equal in value, do not reject the null hypothesis.
4. For there to be real differences among the factor means, the between-group mean square must be significantly larger than the within-group mean square.
To determine whether to reject or not reject the null hypothesis, we must
calculate the test statistic (F ratio) using the ANOVA table as shown below.
Source indicates the different variation sources in the ANOVA Table. Factor
represents the variation introduced between the factor levels (groups). The Error is
the variation within each of the factor levels. Also, Total is the total variation.
ANOVA TABLE
Degrees of Freedom (DF): the number of degrees of freedom related to each Sum
of Squares (SS). Note: K = number of levels (groups), n = number of samples in each
level.
Sum of Squares (SS): the sum of squares measures the variability associated with
each source. SS (Factor) is due to the change in the factor level; the larger the
difference between the means of the factor levels, the larger the factor sum of squares
will be. SS (Error) is due to the variation within each factor level. SS (Total) is
the sum of the Factor and Error sums of squares.
Mean Square (MS): the estimate of the variance for the factor and error sources,
computed by MS = SS/DF.
F: the ratio of the mean square for the Factor to the mean square for the Error.
P-Value: this value is compared with the alpha (α) level (e.g., .05) and the
following decision rule is applied: if p < α, reject the null hypothesis; if
p ≥ α, do not reject the null hypothesis.
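The arithmetic behind the table columns can be sketched in a few lines of Python. This is an illustration, not Minitab's implementation: the function names `anova_table` and `decide` are hypothetical, the error degrees of freedom are taken as the total number of observations minus the number of levels, and the SS values are the ones reported for the golf ball example in this chapter (4 levels, 24 observations).

```python
def anova_table(ss_factor, ss_error, k, n_total):
    """Build the one-way ANOVA table rows from the two sums of squares.

    k is the number of factor levels; n_total is the total number of observations.
    """
    df_factor = k - 1
    df_error = n_total - k
    ms_factor = ss_factor / df_factor  # MS = SS / DF
    ms_error = ss_error / df_error
    return {
        "Factor": {
            "DF": df_factor,
            "SS": ss_factor,
            "MS": ms_factor,
            "F": ms_factor / ms_error,  # ratio of the two mean squares
        },
        "Error": {"DF": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"DF": n_total - 1, "SS": ss_factor + ss_error},
    }


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


table = anova_table(ss_factor=4626, ss_error=2242, k=4, n_total=24)
for source, row in table.items():
    print(source, row)
```

The sketch stops at the F ratio; turning F into a p-value requires the F distribution, which Minitab handles in its session output.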
ANALYZING THE DATA
IN MINITAB
The following statistical, graphical and diagnostic techniques will be used to
analyze our data.
Statistical
Graphical
Diagnostic
The golf balls were randomly assigned to Tiger Woods, who was using the USGA-approved
test driver. The golf balls were also tested in random order to
reduce or eliminate bias (e.g., weather, different day).
Note: Dimple pattern is the factor (input variable) and distance traveled is the
response (output).
Note: Responses have been stacked with the distances in column C5 and golf
ball pattern in column C6.
PERFORM TEST FOR EQUAL
VARIANCES
ANOVA APPLICATION EXAMPLE - CONT.
Results: Minitab produces a variance test plot as well as the session window results.
The variance plot below displays a 95% confidence interval for the response standard
deviation at each level, as well as the p-values for Bartlett's and Levene's tests.
Note: The best practice is to rely on Levene's test, since it is valid
whether you are dealing with normal or non-normal data. Bartlett's test can be very misleading
when there are even slight departures from normality.
ANOVA APPLICATION EXAMPLE -
CONT.
As indicated by the previous plot and the session window below, a p-value of .581
(greater than .05) indicates there is no evidence that the variances differ.
Test for Equal Variances: Distance versus Golf Ball

Golf Ball  N    Lower    StDev    Upper
        1  4  4.05035   8.2209  49.2999
        2  6  7.02936  12.6596  42.0684
        3  6  4.11665   7.4140  24.6368
        4  8  6.99307  11.7321  30.1086
Stat>ANOVA>One-Way
In the dialog window:
Enter Distance in the response field
Enter Golf Ball in the factor field
Click OK
ANOVA TABLE - CONT.
Results: As indicated by the Minitab session window below, the p-value (displayed as .000) is less
than .05, so we reject the hypothesis that all group means are equal. At least 1 dimple pattern
mean is different.
Also, when the F-value (F-test) is close to 1.00, the group means are similar. In this case, the F-value
= 13.75 is much greater than 1.00.
Lastly, as indicated by the 95% confidence intervals, dimple patterns A and D are different from
B and C (levels 1 and 4 are different from levels 2 and 3).
One-way ANOVA: Distance versus Golf Ball

Source     DF    SS    MS      F      P
Golf Ball   3  4626  1542  13.75  0.000
Error      20  2242   112
Total      23  6868
Note: Pooled StDev = 10.59 is the square root of the Mean Square (Error) = 112
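The pooled standard deviation can be cross-checked against the group standard deviations reported in the Test for Equal Variances output. The sketch below is an illustrative Python calculation, not part of the Minitab procedure; it assumes the usual pooling formula, in which each group variance is weighted by its degrees of freedom (n - 1).

```python
import math

# (N, StDev) for each golf ball level, taken from the Test for Equal Variances output.
groups = {1: (4, 8.2209), 2: (6, 12.6596), 3: (6, 7.4140), 4: (8, 11.7321)}

n_total = sum(n for n, _ in groups.values())  # 24 observations
k = len(groups)                               # 4 levels

# Within-group (Error) sum of squares rebuilt from the group variances.
ss_error = sum((n - 1) * s ** 2 for n, s in groups.values())

ms_error = ss_error / (n_total - k)  # Mean Square (Error)
pooled_stdev = math.sqrt(ms_error)   # Pooled StDev

print(round(ss_error), round(ms_error), round(pooled_stdev, 2))
# prints: 2242 112 10.59
```

The rebuilt values agree with the Error row of the ANOVA table (SS = 2242, MS = 112) and with the pooled StDev of 10.59.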
ANOVA APPLICATION EXAMPLE -
GRAPHICAL ANALYSIS
(MAIN EFFECTS PLOT)
20-18
GRAPHICAL ANALYSIS -
MAIN EFFECTS PLOT CONT.
To generate a Main Effects Plot, proceed as follows.
GRAPHICAL ANALYSIS -
MAIN EFFECTS PLOT CONT.
Results: As indicated by the Main Effects Plot illustrated below, the distance
traveled using dimple patterns 2 and 3 is much greater than the distance
traveled using dimple patterns 1 and 4. For further investigation, you can
eliminate dimple patterns 1 and 4 and focus on dimple patterns 2 and 3 to
determine if there is a significant difference between them.
ANOVA APPLICATION EXAMPLE -
GRAPHICAL ANALYSIS (INTERVAL PLOT)
The interval plot is another way to analyze your data graphically. It plots
the group means with standard error bars about each mean.
Stat>ANOVA>Interval Plot
In the dialog window Interval Plots
Under One Y Select With Groups
Click OK
In the dialog window Interval Plot One Y, With Groups
Enter Distance in the Graph variables field
Enter Golf Ball in the Categorical variables for grouping
Click OK
Note: Minitab places the standard error bars $s/\sqrt{n}$ away from the
mean. The default is 1.0 standard error. However, you can specify a multiplier
for the standard error (e.g., 2.0).
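The bar positions follow directly from the standard error formula. Below is a minimal sketch in Python; the function name `error_bar` is hypothetical, and the inputs are the values this chapter reports for dimple pattern 1 (mean 272.250, StDev 8.2209, N = 4).

```python
import math


def error_bar(mean, stdev, n, multiplier=1.0):
    """Return (lower, upper) bar ends at mean -/+ multiplier * s / sqrt(n)."""
    standard_error = stdev / math.sqrt(n)
    return mean - multiplier * standard_error, mean + multiplier * standard_error


# Dimple pattern 1: default bars of 1.0 standard error.
lower, upper = error_bar(mean=272.250, stdev=8.2209, n=4)
print(lower, upper)

# Same group with a multiplier of 2.0.
lower2, upper2 = error_bar(mean=272.250, stdev=8.2209, n=4, multiplier=2.0)
print(lower2, upper2)
```

With N = 4 the standard error is half the standard deviation (about 4.11), so the default bars run from roughly 268.14 to 276.36.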
GRAPHICAL ANALYSIS -
INTERVAL PLOT CONT.
Result: As indicated by the Interval Plot below, the mean of each group is plotted
with lines extending 1 standard error above and below the mean. The variability
between the groups of golf balls appears to be large (e.g., the distance between
groups 3 and 4) relative to the variability within each group of golf balls. In
addition, as indicated by the previous Main Effects Plot, we should focus our
attention on dimple patterns 2 and 3, since they provide the greatest distance
traveled.
ANOVA APPLICATION EXAMPLE -
DIAGNOSTIC ANALYSIS - RESIDUALS AND FITS
ANOVA assumes the errors (residuals) are normally distributed with a mean of 0 and a
constant sigma.
We can test this by reviewing the residuals. Each residual is an observation minus its
sample (group) mean.
Stat>ANOVA>One-way
In the dialog window
Enter Distance in the Response field
Enter Golf Ball in the Factor field
Click on Store Residuals
Click on Store Fits
Click on Graphs
In the Graphs dialog window
Click on Normal Plot of Residuals
Residuals Versus Fits
Residuals Versus Order
Enter Golf Ball in the Residuals Versus the Variables field
Click OK
Click OK
DIAGNOSTIC ANALYSIS -
RESIDUALS AND FITS
Results: In addition to creating 4 plots, Minitab adds 2 columns to your
worksheet: RESI1 and FITS1. As indicated on the worksheet below,
FITS1 is simply the mean of each dimple pattern group (e.g., 272.250 is the
mean of dimple pattern 1) and the residual (RESI1) is each distance minus
FITS1 (e.g., 268 - 272.250 = -4.2500).
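The two stored columns can be reproduced by hand. The sketch below is a Python illustration of that arithmetic, not Minitab code. The data are hypothetical: only the value 268 and the group mean 272.250 come from the text, and every other distance is made up so that dimple pattern 1 averages 272.250.

```python
from collections import defaultdict

# Stacked worksheet columns: golf ball pattern and distance.
# HYPOTHETICAL values, except 268 and the pattern 1 mean of 272.250.
patterns = [1, 1, 1, 1, 2, 2]
distances = [268.0, 280.0, 265.0, 276.0, 300.0, 310.0]

# Collect the distances for each pattern, then take each group mean.
by_pattern = defaultdict(list)
for pattern, distance in zip(patterns, distances):
    by_pattern[pattern].append(distance)
group_means = {p: sum(v) / len(v) for p, v in by_pattern.items()}

# FITS1: the group mean repeated on every row of that group.
fits = [group_means[p] for p in patterns]

# RESI1: each distance minus its fit.
residuals = [d - f for d, f in zip(distances, fits)]

for row in zip(patterns, distances, fits, residuals):
    print(row)
```

Within each group the residuals sum to zero, which is why the diagnostic plots look for random scatter around 0 rather than for a particular level.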
RESIDUALS FROM DISTANCE
VERSUS GOLF BALL
The plot illustrated below tests the assumption of equal variances across groups.
As indicated, randomness should be observed.
RESIDUALS VERSUS FITTED VALUES
The plot below investigates whether the mathematical model fits equally for low to
high values of the fits. As indicated, randomness should be observed.
RESIDUALS VERSUS THE
ORDER OF THE DATA
Lastly, the plot below investigates how the residuals behave across the
experiment. Once again, randomness should be observed; a nonrandom pattern
should be a warning.
Note: This is the most important plot, since it would signal something outside the
experiment might be operating.
ANOVA APPLICATION EXAMPLES -
EPSILON SQUARED
(PRACTICAL SIGNIFICANCE)
Analysis may indicate a factor (Golf Balls) is statistically significant.
However, this analysis may not have much practical significance.
To determine practical significance, we can calculate Epsilon Squared.
Referring to the ANOVA table, the SS (Golf Ball) = 4626 and the SS
(Total) = 6868. Therefore,

$$E^2 = \frac{4626}{6868} \approx 67\%$$

This indicates that 67% of the variation in distance traveled is attributed to
dimple pattern.
ANOVA PRACTICAL
APPLICATION EXAMPLE
As part of the Analyze phase, the team wanted to know if a warping condition on BA-43033 was cavity related. They
performed a One-way ANOVA using cavity as the input and defects observed as the output to see if there was a
significant difference from cavity to cavity.
The One-way ANOVA indicated that Cavities 4 and 8 had significantly more defects than the other
cavities. As a result of this study, the Tool Room was able to correct these two cavities, which were out of specification,
improving the overall performance of the production output.
During this exercise, you will analyze the shot distances of the 60
projectiles fired from the 3 catapults (20 per catapult). The purpose of this
experiment is to investigate the effect of the 3 catapults on the shot
distance and answer the question, "Does the mean shot distance differ
for the different catapults?"
ANOVA TEAM EXERCISE - CONT.
To analyze this data:
Working as a team (2-3 individuals per team), conduct the tests, analyze the
results and be prepared to present/discuss your results with the class. Allow 1
hours for this exercise.