Homework 2

Aaron Vincent
3 March 2014
Statistics 516
Power-Transformations via Nonlinear Regression

Part 1:
Answers:
The parameter estimates are the same in both summaries (lm and nls), so I know the nls model
is specified correctly.
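The comparison of the two summaries can also be checked numerically instead of by eye. The following is a minimal sketch, assuming the built-in ToothGrowth data; the object names fit_lm, fit_nls, and same_estimates are illustrative and are not part of the original submission.

```r
# Fit the same mean structure two ways on the built-in ToothGrowth data.
data(ToothGrowth)
fit_lm <- lm(len ~ supp + dose + supp:dose, data = ToothGrowth)
fit_nls <- nls(
  len ~ ifelse(supp == "VC",
               theta1 + theta2 + (theta3 + theta4) * dose,
               theta1 + theta3 * dose),
  start = c(theta1 = 0, theta2 = 0, theta3 = 0, theta4 = 0),
  data = ToothGrowth
)

# The coefficient names differ (suppVC vs. theta2, and so on), so drop them
# before comparing. The order of the parameters is the same in both fits.
same_estimates <- isTRUE(all.equal(unname(coef(fit_lm)),
                                   unname(coef(fit_nls)),
                                   tolerance = 1e-6))
print(same_estimates)
```

A tolerance is used because nls reaches the least squares solution iteratively, so the two sets of estimates agree only up to numerical precision.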
R Input:
data(ToothGrowth)
ToothGrowth
model1=lm(len~supp+dose+supp:dose, data=ToothGrowth)
summary(model1)
model2 = nls(len ~ ifelse(supp == "VC", theta1+theta2+(theta3+theta4)*dose, theta1+theta3*dose),
start = c(theta1 = 0, theta2 = 0, theta3 = 0, theta4 = 0), data = ToothGrowth)
summary(model2)
R Output:
> model1=lm(len~supp+dose+supp:dose, data=ToothGrowth)
> summary(model1)
Call:
lm(formula = len ~ supp + dose + supp:dose, data = ToothGrowth)
Residuals:
Min 1Q Median 3Q Max
-8.2264 -2.8463 0.0504 2.2893 7.9386
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   11.550      1.581   7.304 1.09e-09 ***
suppVC        -8.255      2.236  -3.691 0.000507 ***
dose           7.811      1.195   6.534 2.03e-08 ***
suppVC:dose    3.904      1.691   2.309 0.024631 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.083 on 56 degrees of freedom
Multiple R-squared: 0.7296, Adjusted R-squared: 0.7151
F-statistic: 50.36 on 3 and 56 DF, p-value: 6.521e-16
> model2 = nls(len ~ ifelse(supp == "VC",theta1+theta2+(theta3+theta4)*dose,theta1+theta3*dose),start = c(theta1 = 0, theta2 = 0, theta3 = 0, theta4 = 0),data = ToothGrowth)
> summary(model2)
Formula: len ~ ifelse(supp == "VC", theta1 + theta2 + (theta3 + theta4) *
    dose, theta1 + theta3 * dose)
Parameters:
       Estimate Std. Error t value Pr(>|t|)
theta1   11.550      1.581   7.304 1.09e-09 ***
theta2   -8.255      2.236  -3.691 0.000507 ***
theta3    7.811      1.195   6.534 2.03e-08 ***
theta4    3.904      1.691   2.309 0.024631 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Part 2:
Answers:
The parameter estimates are again the same in both summaries (lm and nls), so I know the nls
model with log(dose) is specified correctly.
R Input:
model3=lm(len~supp+log(dose)+supp:log(dose),data=ToothGrowth)
summary(model3)
model4=nls(len~ ifelse(supp == "VC", theta1+theta2+(theta3+theta4)*log(dose), theta1+theta3*log(dose)),
start = c(theta1 = 0, theta2 = 0, theta3 = 0, theta4 = 0), data = ToothGrowth)
summary(model4)
R Output:
> model3=lm(len~supp+log(dose)+supp:log(dose),data=ToothGrowth)
> summary(model3)
Call:
lm(formula = len ~ supp + log(dose) + supp:log(dose), data = ToothGrowth)
Residuals:
Min 1Q Median 3Q Max
-7.5433 -2.4921 -0.5033 2.7117 7.8567
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       20.6633     0.6791  30.425  < 2e-16 ***
suppVC            -3.7000     0.9605  -3.852 0.000303 ***
log(dose)          9.2549     1.2000   7.712  2.3e-10 ***
suppVC:log(dose)   3.8448     1.6971   2.266 0.027366 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Multiple R-squared: 0.7755, Adjusted R-squared: 0.7635
F-statistic: 64.5 on 3 and 56 DF, p-value: < 2.2e-16
> model4=nls(len~ ifelse(supp == "VC",theta1+theta2+(theta3+theta4)*log(dose),theta1+theta3*log(dose)),start = c(theta1 = 0, theta2 = 0, theta3 = 0, theta4 = 0),data = ToothGrowth)
> summary(model4)
Formula: len ~ ifelse(supp == "VC", theta1 + theta2 + (theta3 + theta4) *
    log(dose), theta1 + theta3 * log(dose))
Parameters:
       Estimate Std. Error t value Pr(>|t|)
theta1  20.6633     0.6791  30.425  < 2e-16 ***
theta2  -3.7000     0.9605  -3.852 0.000303 ***
theta3   9.2549     1.2000   7.712  2.3e-10 ***
theta4   3.8448     1.6971   2.266 0.027366 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of iterations to convergence: 1
Achieved convergence tolerance: 1.52e-09
Part 3:
R Input:
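# power transformation of x; lambda = 0 gives the log transformation
g <- function(x, lambda) {
  if (lambda == 0) {
    x <- log(x)
  } else {
    x <- (x^lambda - 1)/lambda
  }
  return(x)
}
model5=lm(len~supp+g(dose,0)+supp:g(dose,0),data=ToothGrowth)
summary(model5)
# lambda is now estimated; starting values are the estimates from the log(dose) model
model6=nls(len~ ifelse(supp == "VC",theta1+theta2+(theta3+theta4)*g(dose,lambda),theta1+theta3*g(dose,lambda)),
start = c(theta1 = 20.6633, theta2 = -3.7, theta3 = 9.2549, theta4 = 3.8448, lambda = 0), data = ToothGrowth)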
summary(model6)
R Output:
> model6=nls(len~ ifelse(supp == "VC",theta1+theta2+(theta3+theta4)*g(dose,lambda),theta1+theta3*g(dose,lambda)),start = c(theta1 = 20.6633, theta2 = -3.7, theta3 = 9.2549, theta4 = 3.8448, lambda = 0),data = ToothGrowth)
> summary(model6)
Formula: len ~ ifelse(supp == "VC", theta1 + theta2 + (theta3 + theta4) *
    g(dose, lambda), theta1 + theta3 * g(dose, lambda))
Parameters:
       Estimate Std. Error t value Pr(>|t|)
theta1  21.2710     0.8831  24.085  < 2e-16 ***
theta2  -3.4675     0.9889  -3.506 0.000913 ***
theta3   9.2759     1.2129   7.648 3.28e-10 ***
theta4   3.5492     1.6708   2.124 0.038158 *
lambda  -0.4064     0.3821  -1.063 0.292258
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations to convergence: 5
Achieved convergence tolerance: 6.193e-06
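The function g treats lambda = 0 as a special case because (x^lambda - 1)/lambda is undefined there, and the log is its limit as lambda goes to 0. The sketch below checks this numerically at the three doses in ToothGrowth. The value 1e-6 and the name max_gap are only illustrative choices.

```r
# Same transformation as in the R input of this part.
g <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

dose <- c(0.5, 1, 2)  # the three dose levels in ToothGrowth

# For a lambda very close to 0 the power form should nearly match log(dose).
max_gap <- max(abs(g(dose, 1e-6) - g(dose, 0)))
print(max_gap)
```

This is why the model with log(dose) is a reasonable source of starting values for the model in which lambda is estimated. The summary in this part reports a p-value of 0.292 for lambda, so the fitted power is not clearly distinguishable from the log transformation.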
Part 4:
Answers:
R Input:
ToothGrowth$yhat=predict(model6)
newdata <- expand.grid(dose = seq(0, 2, length = 60), supp = c("VC", "OJ"))
newdata$yhat <- predict(model6, newdata)
install.packages("ggplot2")
library(ggplot2)
myplot <- ggplot(ToothGrowth, aes(x = dose, y = len, color = supp))
myplot <- myplot + geom_point()
myplot <- myplot + geom_line(aes(y = yhat), data = newdata)
myplot <- myplot + xlab("Dose") + ylab("Length")
plot(myplot)
R Output:
> ToothGrowth$yhat=predict(model6)
> newdata <- expand.grid(dose = seq(0, 2, length = 60), supp = c("VC", "OJ"))
> newdata$yhat <- predict(model6, newdata)
> library(ggplot2)
> myplot <- ggplot(ToothGrowth, aes(x = dose, y = len, color = supp))
> myplot <- myplot + geom_point()
> myplot <- myplot + geom_line(aes(y = yhat), data = newdata)
> myplot <- myplot + xlab("Dose") + ylab("Length")
> plot(myplot)
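One detail of the prediction grid may be worth checking. With a negative lambda the transformation is not finite at dose = 0, so the first grid point of seq(0, 2, length = 60) has no finite predicted value. The sketch below illustrates this with the estimate of lambda from Part 3. The grid starting at 0.5 is a hypothetical alternative, not what was run for the plot.

```r
# Same transformation as in Part 3.
g <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

lambda_hat <- -0.4064        # estimate of lambda reported by summary(model6)
at_zero <- g(0, lambda_hat)  # 0 raised to a negative power is Inf in R

# Hypothetical alternative grid restricted to the observed dose range (0.5 to 2).
dose_grid <- seq(0.5, 2, length.out = 60)
all_finite <- all(is.finite(g(dose_grid, lambda_hat)))

print(at_zero)
print(all_finite)
```

Restricting the grid to the observed doses also avoids extrapolating the fitted curves below the smallest dose in the data.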
Part 5:
Answers:
According to our ANOVA, with an F value of 4.4462 and a p-value of 0.03955, we reject the
null hypothesis that theta4 = 0 at the 5% level. This supports the observation that expected
tooth length does not increase at the same rate with dose for the two supplement types.
Additionally, the 95% confidence interval for theta4 (0.174 to 6.983) does not contain 0,
which leads to the same conclusion.
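The F value can be reproduced by hand from the residual sums of squares and degrees of freedom reported in the ANOVA table of this part. This is a minimal sketch using the rounded values from that table, so the result agrees with the printed F value only up to rounding. The variable names are illustrative.

```r
# Values copied from the anova(model7, model7.5) table.
rss_full <- 759.06     # full model, with theta4
df_full <- 55
rss_reduced <- 820.43  # reduced model, without theta4
df_reduced <- 56

# Extra sum of squares F statistic for dropping theta4.
f_stat <- ((rss_reduced - rss_full) / (df_reduced - df_full)) / (rss_full / df_full)

# Upper tail probability of the F distribution with 1 and 55 degrees of freedom.
p_value <- pf(f_stat, df_reduced - df_full, df_full, lower.tail = FALSE)

print(c(F = f_stat, p = p_value))
```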
R Input:
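theta=list(theta1 = 21.2710, theta2 = -3.4675, lambda=-0.4064, theta3 = 9.2759, theta4=3.5492)
# full model: separate dose slopes for the two supplement types
model7=nls(len~ ifelse(supp == "VC",theta1+theta2+(theta3+theta4)*g(dose,lambda),theta1+theta3*g(dose,lambda)),
start = theta, data = ToothGrowth)
# reduced model: theta4 dropped; theta[1:4] holds theta1, theta2, lambda, and theta3
model7.5=nls(len~ ifelse(supp == "VC",theta1+theta2+(theta3)*g(dose,lambda),theta1+theta3*g(dose,lambda)),
data = ToothGrowth, start = theta[1:4])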
anova(model7, model7.5)
confint(model7)
R Output:
> anova(model7, model7.5)
Analysis of Variance Table
Model 1: len ~ ifelse(supp == "VC", theta1 + theta2 + (theta3 + theta4) * g(dose, lambda), theta1 + theta3 * g(dose,
lambda))
Model 2: len ~ ifelse(supp == "VC", theta1 + theta2 + (theta3) * g(dose, lambda), theta1 + theta3 * g(dose, lambda))
  Res.Df Res.Sum Sq Df  Sum Sq F value  Pr(>F)
1     55     759.06
2     56     820.43 -1 -61.362  4.4462 0.03955 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> confint(model7)
Waiting for profiling to be done...
            2.5%      97.5%
theta1 19.500701 23.0886329
theta2 -5.438901 -1.4824484
lambda -1.253519  0.3556167
theta3  6.902859 11.6648535
theta4  0.174306  6.9825694
Aerial Snow Geese Counting
Part 1:
Answers:
The residual vs. predicted plot is roughly trumpet shaped, with the spread of the residuals
widening at the higher counts. This is due in particular to 5 observations (28, 29, 73, 74, and 75),
which may be transcription errors. The plot therefore points to an issue with heteroscedasticity.
This QQ plot is consistent with the assumption that the errors are normally distributed because it
is a roughly straight diagonal line. Even though the residual vs. predicted plot shows apparent
issues with heteroscedasticity, the sample size is large enough that we do not need to worry about
the normality assumption.
R Input:
install.packages("alr3")
library(alr3) # contains the data, no data statement needed
library(reshape2) # contains the function melt used below
library(ggplot2) # for plotting
snowgeese.long <- melt(snowgeese, measure.vars = c("obs1","obs2"),
variable.name = "observer", value.name = "count")
model1 <- lm(count ~ observer + photo + observer:photo, data = snowgeese.long)
snowgeese.long$yhat <- predict(model1) # predicted values
snowgeese.long$r <- rstudent(model1) # studentized residuals
p <- ggplot(snowgeese.long, aes(x = yhat, y = r, color = observer)) + geom_point()
p <- p + geom_segment(aes(x = yhat, xend = yhat, y = 0, yend = r))
p <- p + geom_hline(yintercept = 0)
p <- p + xlab("Predicted Value") + ylab("Studentized Residual")
p <- p + ggtitle("Snowgeese Data 1")
print(p)
qqnorm(snowgeese.long$r)
snowgeese.long
R Output:
> print(p)
> qqnorm(snowgeese.long$r)
> snowgeese.long
photo observer count yhat r
1 56 obs1 50 42.681013 0.174803055
2 38 obs1 25 27.378559 -0.056930376
3 25 obs1 30 16.326786 0.328216727
4 48 obs1 35 35.879922 -0.021030888
5 38 obs1 25 27.378559 -0.056930376
6 22 obs1 20 13.776377 0.149409096
7 22 obs1 12 13.776377 -0.042639935
8 42 obs1 34 30.779104 0.077046657
9 34 obs1 20 23.978013 -0.095278005
10 14 obs1 10 6.975286 0.072733155
11 30 obs1 25 20.577468 0.105999981
12 9 obs1 10 2.724604 0.175181401
13 18 obs1 15 10.375832 0.111098935
14 25 obs1 20 16.326786 0.088121347
15 62 obs1 40 47.781831 -0.185759660
16 26 obs1 30 17.176922 0.307726301
17 88 obs1 75 69.885377 0.121939896
18 56 obs1 35 42.681013 -0.183452861
19 11 obs1 9 4.424877 0.110096935
20 66 obs1 55 51.182377 0.091088387
21 42 obs1 30 30.779104 -0.018636245
22 30 obs1 25 20.577468 0.105999981
23 90 obs1 40 71.585649 -0.755503506
24 119 obs1 75 96.239604 -0.507778637
25 165 obs1 100 135.345876 -0.853620652
26 152 obs1 150 124.294104 0.617852006
27 205 obs1 120 169.351331 -1.211275561
28 409 obs1 250 342.779148 -2.776416796
29 342 obs1 500 285.820012 7.213791417
30 200 obs1 200 165.100649 0.851371432
31 73 obs1 50 57.133331 -0.170150602
32 123 obs1 75 99.640149 -0.589611927
33 150 obs1 150 122.593831 0.658672099
34 70 obs1 50 54.582922 -0.109322644
35 90 obs1 60 71.585649 -0.276317714
36 110 obs1 75 88.588377 -0.324346604
37 95 obs1 150 75.836331 1.801533386
38 57 obs1 40 43.531150 -0.084316241
39 43 obs1 25 31.629241 -0.158572669
40 55 obs1 100 41.830877 1.405218186
41 325 obs1 200 271.367694 -1.903795729
42 114 obs1 60 91.988922 -0.765923189
43 83 obs1 40 65.634695 -0.612496417
44 91 obs1 35 72.435786 -0.896656646
45 56 obs1 20 42.681013 -0.542541718
46 56 obs2 40 58.090235 -0.432455553
47 38 obs2 30 38.085626 -0.193566965
48 25 obs2 40 23.637853 0.392870404
49 48 obs2 45 49.199298 -0.100372457
50 38 obs2 30 38.085626 -0.193566965
51 22 obs2 20 20.303751 -0.007291135
52 22 obs2 20 20.303751 -0.007291135
53 42 obs2 35 42.531095 -0.180178518
54 34 obs2 30 33.640158 -0.087185212
55 14 obs2 12 11.412814 0.014119225
56 30 obs2 30 29.194689 0.019300598
57 9 obs2 10 5.855978 0.099770114
58 18 obs2 18 15.858283 0.051453348
59 25 obs2 30 23.637853 0.152643492
60 62 obs2 50 64.758439 -0.352483700
61 26 obs2 20 24.749220 -0.113916338
62 88 obs2 120 93.653985 0.629534039
63 56 obs2 60 58.090235 0.045604232
64 11 obs2 10 8.078712 0.046231644
65 66 obs2 80 69.203907 0.257682504
66 42 obs2 35 42.531095 -0.180178518
67 30 obs2 30 29.194689 0.019300598
68 90 obs2 120 95.876720 0.576203999
69 119 obs2 200 128.106368 1.746700026
70 165 obs2 200 179.229258 0.500222116
71 152 obs2 150 164.781485 -0.354746391
72 205 obs2 200 223.683945 -0.577473666
73 409 obs2 300 450.402850 -4.875025903
74 342 obs2 500 375.941249 3.522840852
75 200 obs2 300 218.127109 2.036786662
76 73 obs2 40 76.983478 -0.886076945
77 123 obs2 80 132.551837 -1.266734442
78 150 obs2 120 162.558750 -1.026549401
79 70 obs2 60 73.649376 -0.325777331
80 90 obs2 100 95.876720 0.098301677
81 110 obs2 120 118.104063 0.045227467
82 95 obs2 150 101.433556 1.167085433
83 57 obs2 40 59.201603 -0.459041643
84 43 obs2 35 43.642462 -0.206750700
85 55 obs2 110 56.978868 1.278346335
86 325 obs2 400 357.048007 1.130512688
87 114 obs2 120 122.549532 -0.060836189
88 83 obs2 40 88.097149 -1.155643080
89 91 obs2 60 96.988087 -0.885833822
90 56 obs2 40 58.090235 -0.432455553
Part 2:
Answers:
Using weights in the model makes the residual vs. predicted plot more uniform and eliminates
the trumpet shape found in the previous plot. However, it also produces more questionable
observations than the unweighted model did (29, 37, 40, 69, 73, 74, and 85). So the weights
improve some aspects, such as the uniformity of the residual spread, at the cost of more
questionable observations.
Adding weights to the model slightly improves the overall trend of the QQ plot. However, this
improvement comes at the cost of an extremely abnormal curvature at the end of the QQ plot,
which makes the improvement questionable. Again, there is no reason to worry about the
assumption of normality in this case because the sample size is very large.
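Weights of 1/photo correspond to assuming that the error variance is proportional to photo. One way to see what the weights do is that weighted least squares gives the same estimates as ordinary least squares after every term is multiplied by the square root of the weight. The sketch below checks this on simulated data, not on the snow geese counts, and with a single predictor. All names and the simulated values are hypothetical.

```r
set.seed(516)                       # simulated data for illustration only
photo <- round(runif(60, 10, 400))  # hypothetical photo counts
count <- 5 + 0.9 * photo + rnorm(60, sd = sqrt(photo))  # variance grows with photo

# Weighted least squares with weights 1/photo.
wls <- lm(count ~ photo, weights = 1 / photo)

# Ordinary least squares after scaling the response, the intercept column,
# and the predictor by sqrt(weight) = 1/sqrt(photo).
s <- 1 / sqrt(photo)
ols_scaled <- lm(I(count * s) ~ 0 + s + I(photo * s))

same_fit <- all.equal(unname(coef(wls)), unname(coef(ols_scaled)))
print(same_fit)
```

On the scaled version of the model the errors have roughly constant variance, which is why the trumpet shape is reduced when the weights match how the variance grows.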
R Input:
mywt=1/snowgeese.long$photo
model2 <- lm(count ~ observer + photo + observer:photo, data = snowgeese.long, weights=mywt)
snowgeese.long$yhat <- predict(model2) # predicted values from the weighted model
snowgeese.long$r <- rstudent(model2) # studentized residuals from the weighted model
myplot <- ggplot(snowgeese.long, aes(x = yhat, y = r, color = observer)) + geom_point()
myplot <- myplot + geom_segment(aes(x = yhat, xend = yhat, y = 0, yend = r))
myplot <- myplot + geom_hline(yintercept = 0)
myplot <- myplot + xlab("Predicted Value") + ylab("Studentized Residual")
myplot <- myplot + ggtitle("Snowgeese Data 2")
print(myplot)
qqnorm(snowgeese.long$r)
snowgeese.long
R Output:
> print(myplot)
> snowgeese.long
photo observer count yhat r logcount logphoto sqrtcount sqrtphoto
1 56 obs1 50 44.328357 0.24347357 3.912023 4.025352 7.071068 7.483315
2 38 obs1 25 29.916062 -0.25705864 3.218876 3.637586 5.000000 6.164414
3 25 obs1 30 19.507182 0.68445091 3.401197 3.218876 5.477226 5.000000
4 48 obs1 35 37.922893 -0.13560024 3.555348 3.871201 5.916080 6.928203
5 38 obs1 25 29.916062 -0.25705864 3.218876 3.637586 5.000000 6.164414
6 22 obs1 20 17.105133 0.20166696 2.995732 3.091042 4.472136 4.690416
7 22 obs1 12 17.105133 -0.35582170 2.484907 3.091042 3.464102 4.690416
8 42 obs1 34 33.118794 0.04375468 3.526361 3.737670 5.830952 6.480741
9 34 obs1 20 26.713330 -0.37196003 2.995732 3.526361 4.472136 5.830952
10 14 obs1 10 10.699668 -0.06249970 2.302585 2.639057 3.162278 3.741657
11 30 obs1 25 23.510597 0.08801467 3.218876 3.401197 5.000000 5.477226
12 9 obs1 10 6.696253 0.38307207 2.302585 2.197225 3.162278 3.000000
13 18 obs1 15 13.902400 0.08524104 2.708050 2.890372 3.872983 4.242641
14 25 obs1 20 19.507182 0.03205871 2.995732 3.218876 4.472136 5.000000
15 62 obs1 40 49.132456 -0.37271639 3.688879 4.127134 6.324555 7.874008
16 26 obs1 30 20.307865 0.61890353 3.401197 3.258097 5.477226 5.099020
17 88 obs1 75 69.950216 0.17321312 4.317488 4.477337 8.660254 9.380832
18 56 obs1 35 44.328357 -0.40068815 3.555348 4.025352 5.916080 7.483315
19 11 obs1 9 8.297619 0.07211670 2.197225 2.397895 3.000000 3.316625
20 66 obs1 55 52.335188 0.10533875 4.007333 4.189655 7.416198 8.124038
21 42 obs1 30 33.118794 -0.15487817 3.401197 3.737670 5.477226 6.480741
22 30 obs1 25 23.510597 0.08801467 3.218876 3.401197 5.000000 5.477226
23 90 obs1 40 71.551582 -1.07752758 3.688879 4.499810 6.324555 9.486833
24 119 obs1 75 94.771391 -0.58708772 4.317488 4.779123 8.660254 10.908712
25 165 obs1 100 131.602812 -0.80567882 4.605170 5.105945 10.000000 12.845233
26 152 obs1 150 121.193932 0.76275226 5.010635 5.023881 12.247449 12.328828
27 205 obs1 120 163.630135 -1.00897893 4.787492 5.323010 10.954451 14.317821
28 409 obs1 250 326.969482 -1.33294876 5.521461 6.013715 15.811388 20.223748
29 342 obs1 500 273.323716 4.68016944 6.214608 5.834811 22.360680 18.493242
30 200 obs1 200 159.626720 0.94347711 5.298317 5.298317 14.142136 14.142136
31 73 obs1 50 57.939969 -0.29868232 3.912023 4.290459 7.071068 8.544004
32 123 obs1 75 97.974123 -0.67191049 4.317488 4.812184 8.660254 11.090537
33 150 obs1 150 119.592566 0.81052683 5.010635 5.010635 12.247449 12.247449
34 70 obs1 50 55.537920 -0.21264505 3.912023 4.248495 7.071068 8.366600
35 90 obs1 60 71.551582 -0.39218884 4.094345 4.499810 7.745967 9.486833
36 110 obs1 75 87.565243 -0.38702667 4.317488 4.700480 8.660254 10.488088
37 95 obs1 150 75.554997 2.55199933 5.010635 4.553877 12.247449 9.746794
38 57 obs1 40 45.129040 -0.21821622 3.688879 4.043051 6.324555 7.549834
39 43 obs1 25 33.919477 -0.43807450 3.218876 3.761200 5.000000 6.557439
40 55 obs1 100 43.527674 2.53632635 4.605170 4.007333 10.000000 7.416198
41 325 obs1 200 259.712104 -1.13117219 5.298317 5.783825 14.142136 18.027756
42 114 obs1 60 90.767975 -0.93551330 4.094345 4.736198 7.745967 10.677078
43 83 obs1 40 65.946800 -0.92027363 3.688879 4.418841 6.324555 9.110434
44 91 obs1 35 72.352265 -1.27212750 3.555348 4.510860 5.916080 9.539392
45 56 obs1 20 44.328357 -1.05076800 2.995732 4.025352 4.472136 7.483315
46 56 obs2 40 58.145916 -0.78149300 3.688879 4.025352 6.324555 7.483315
47 38 obs2 30 38.171394 -0.42757154 3.401197 3.637586 5.477226 6.164414
48 25 obs2 40 23.745350 1.06441141 3.688879 3.218876 6.324555 5.000000
49 48 obs2 45 49.268350 -0.19804363 3.806662 3.871201 6.708204 6.928203
50 38 obs2 30 38.171394 -0.42757154 3.401197 3.637586 5.477226 6.164414
51 22 obs2 20 20.416263 -0.02899159 2.995732 3.091042 4.472136 4.690416
52 22 obs2 20 20.416263 -0.02899159 2.995732 3.091042 4.472136 4.690416
53 42 obs2 35 42.610176 -0.37818301 3.555348 3.737670 5.916080 6.480741
54 34 obs2 30 33.732611 -0.20669358 3.401197 3.526361 5.477226 5.830952
55 14 obs2 12 11.538698 0.04120652 2.484907 2.639057 3.464102 3.741657
56 30 obs2 30 29.293828 0.04172899 3.401197 3.401197 5.477226 5.477226
57 9 obs2 10 5.990219 0.46512717 2.302585 2.197225 3.162278 3.000000
58 18 obs2 18 15.977480 0.15708763 2.890372 2.890372 4.242641 4.242641
59 25 obs2 30 23.745350 0.40727054 3.401197 3.218876 5.477226 5.000000
60 62 obs2 50 64.804090 -0.60499401 3.912023 4.127134 7.071068 7.874008
61 26 obs2 20 24.855046 -0.30950313 2.995732 3.258097 4.472136 5.099020
62 88 obs2 120 93.656177 0.90783187 4.787492 4.477337 10.954451 9.380832
63 56 obs2 60 58.145916 0.07956778 4.094345 4.025352 7.745967 7.483315
64 11 obs2 10 8.209611 0.18385840 2.302585 2.397895 3.162278 3.316625
65 66 obs2 80 69.242872 0.42564927 4.382027 4.189655 8.944272 8.124038
66 42 obs2 35 42.610176 -0.37818301 3.555348 3.737670 5.916080 6.480741
67 30 obs2 30 29.293828 0.04172899 3.401197 3.401197 5.477226 5.477226
68 90 obs2 120 95.875568 0.82155332 4.787492 4.499810 10.954451 9.486833
69 119 obs2 200 128.056742 2.19134446 5.298317 4.779123 14.142136 10.908712
70 165 obs2 200 179.102743 0.53161141 5.298317 5.105945 14.142136 12.845233
71 152 obs2 150 164.676699 -0.38764148 5.010635 5.023881 12.247449 12.328828
72 205 obs2 200 223.490570 -0.54094117 5.298317 5.323010 14.142136 14.317821
73 409 obs2 300 449.868485 -2.67460368 5.703782 6.013715 17.320508 20.223748
74 342 obs2 500 375.518876 2.36603778 6.214608 5.834811 22.360680 18.493242
75 200 obs2 300 217.942091 1.94983153 5.703782 5.298317 17.320508 14.142136
76 73 obs2 40 77.010742 -1.40764978 3.688879 4.290459 6.324555 8.544004
77 123 obs2 80 132.495525 -1.55281101 4.382027 4.812184 8.944272 11.090537
78 150 obs2 120 162.457308 -1.13589888 4.787492 5.010635 10.954451 12.247449
79 70 obs2 60 73.681655 -0.52606276 4.094345 4.248495 7.745967 8.366600
80 90 obs2 100 95.875568 0.13991856 4.605170 4.499810 10.000000 9.486833
81 110 obs2 120 118.069481 0.05941152 4.787492 4.700480 10.954451 10.488088
82 95 obs2 150 101.424046 1.62973335 5.010635 4.553877 12.247449 9.746794
83 57 obs2 40 59.255611 -0.82225594 3.688879 4.043051 6.324555 7.549834
84 43 obs2 35 43.719872 -0.42824961 3.555348 3.761200 5.916080 6.557439
85 55 obs2 110 57.036220 2.36798405 4.700480 4.007333 10.488088 7.416198
86 325 obs2 400 356.654049 0.81822745 5.991465 5.783825 20.000000 18.027756
87 114 obs2 120 122.508264 -0.07587778 4.787492 4.736198 10.954451 10.677078
88 83 obs2 40 88.107698 -1.72737592 3.688879 4.418841 6.324555 9.110434
89 91 obs2 60 96.985264 -1.25939393 4.094345 4.510860 7.745967 9.539392
90 56 obs2 40 58.145916 -0.78149300 3.688879 4.025352 6.324555 7.483315
Part 3a (log of model):

Answers:
Logging the data reduces heteroscedasticity on both tails of the model, as shown in this
plot. However, it increases heteroscedasticity in the center of the data, producing, as in the
previous plot, 7 questionable observations (29, 37, 40, 44, 45, 85, and 88).
Logging the data creates a nearly perfect QQ plot. The only abnormalities are on the tails, and
since these two portions are not far off the line I would call this a normal distribution,
unlike the original data.
R Input:
snowgeese.long$logcount=log(snowgeese.long$count)
snowgeese.long$logphoto=log(snowgeese.long$photo)
model3 <- lm(logcount ~ observer + logphoto + observer:logphoto, data = snowgeese.long)
snowgeese.long$yhat <- predict(model3) # predicted values on the log scale
snowgeese.long$r <- rstudent(model3) # studentized residuals from the log model
plot3 <- ggplot(snowgeese.long, aes(x = yhat, y = r, color = observer)) + geom_point()
plot3 <- plot3 + geom_segment(aes(x = yhat, xend = yhat, y = 0, yend = r))
plot3 <- plot3 + geom_hline(yintercept = 0)
plot3 <- plot3 + xlab("Predicted Value") + ylab("Studentized Residual")
plot3 <- plot3 + ggtitle("Snowgeese Data 3a")
print(plot3)
qqnorm(snowgeese.long$r)
snowgeese.long
R Output:
> print(plot3)
> snowgeese.long
photo observer count yhat r logcount logphoto sqrtcount sqrtphoto
1 56 obs1 50 3.733964 0.56070145 3.912023 4.025352 7.071068 7.483315
2 38 obs1 25 3.363716 -0.45729971 3.218876 3.637586 5.000000 6.164414
3 25 obs1 30 2.963921 1.40696803 3.401197 3.218876 5.477226 5.000000
4 48 obs1 35 3.586777 -0.09886791 3.555348 3.871201 5.916080 6.928203
5 38 obs1 25 3.363716 -0.45729971 3.218876 3.637586 5.000000 6.164414
6 22 obs1 20 2.841862 0.49195151 2.995732 3.091042 4.472136 4.690416
7 22 obs1 12 2.841862 -1.14844204 2.484907 3.091042 3.464102 4.690416
8 42 obs1 34 3.459278 0.21132362 3.526361 3.737670 5.830952 6.480741
9 34 obs1 20 3.257515 -0.83031059 2.995732 3.526361 4.472136 5.830952
10 14 obs1 10 2.410296 -0.35023095 2.302585 2.639057 3.162278 3.741657
11 30 obs1 25 3.138006 0.25618816 3.218876 3.401197 5.000000 5.477226
12 9 obs1 10 1.988422 1.05273001 2.302585 2.197225 3.162278 3.000000
13 18 obs1 15 2.650257 0.18583693 2.708050 2.890372 3.872983 4.242641
14 25 obs1 20 2.963921 0.10119000 2.995732 3.218876 4.472136 5.000000
15 62 obs1 40 3.831149 -0.44765420 3.688879 4.127134 6.324555 7.874008
16 26 obs1 30 3.001370 1.28266246 3.401197 3.258097 5.477226 5.099020
17 88 obs1 75 4.165531 0.47916094 4.317488 4.477337 8.660254 9.380832
18 56 obs1 35 3.733964 -0.56246439 3.555348 4.025352 5.916080 7.483315
19 11 obs1 9 2.180028 0.05657140 2.197225 2.397895 3.000000 3.316625
20 66 obs1 55 3.890845 0.36642060 4.007333 4.189655 7.416198 8.124038
21 42 obs1 30 3.459278 -0.18295558 3.401197 3.737670 5.477226 6.480741
22 30 obs1 25 3.138006 0.25618816 3.218876 3.401197 5.000000 5.477226
23 90 obs1 40 4.186989 -1.59217483 3.688879 4.499810 6.324555 9.486833
24 119 obs1 75 4.453685 -0.43135811 4.317488 4.779123 8.660254 10.908712
25 165 obs1 100 4.765743 -0.51298987 4.605170 5.105945 10.000000 12.845233
26 152 obs1 150 4.687385 1.03512066 5.010635 5.023881 12.247449 12.328828
27 205 obs1 120 4.973001 -0.59736536 4.787492 5.323010 10.954451 14.317821
28 409 obs1 250 5.632504 -0.36964291 5.521461 6.013715 15.811388 20.223748
29 342 obs1 500 5.461682 2.57244775 6.214608 5.834811 22.360680 18.493242
30 200 obs1 200 4.949424 1.12846331 5.298317 5.298317 14.142136 14.142136
31 73 obs1 50 3.987096 -0.23612781 3.912023 4.290459 7.071068 8.544004
32 123 obs1 75 4.485252 -0.53200077 4.317488 4.812184 8.660254 11.090537
33 150 obs1 150 4.674738 1.07576282 5.010635 5.010635 12.247449 12.247449
34 70 obs1 50 3.947028 -0.11005031 3.912023 4.248495 7.071068 8.366600
35 90 obs1 60 4.186989 -0.29195840 4.094345 4.499810 7.745967 9.486833
36 110 obs1 75 4.378594 -0.19307874 4.317488 4.700480 8.660254 10.488088
37 95 obs1 150 4.238614 2.52280261 5.010635 4.553877 12.247449 9.746794
38 57 obs1 40 3.750864 -0.19486441 3.688879 4.043051 6.324555 7.549834
39 43 obs1 25 3.481746 -0.83102510 3.218876 3.761200 5.000000 6.557439
40 55 obs1 100 3.716760 2.93021464 4.605170 4.007333 10.000000 7.416198
41 325 obs1 200 5.412999 -0.37666293 5.298317 5.783825 14.142136 18.027756
42 114 obs1 60 4.412699 -1.01239272 4.094345 4.736198 7.745967 10.677078
43 83 obs1 40 4.109678 -1.33820583 3.688879 4.418841 6.324555 9.110434
44 91 obs1 35 4.197540 -2.07358019 3.555348 4.510860 5.916080 9.539392
45 56 obs1 20 3.733964 -2.39756195 2.995732 4.025352 4.472136 7.483315
46 56 obs2 40 3.991229 -0.95542401 3.688879 4.025352 6.324555 7.483315
47 38 obs2 30 3.586782 -0.58640489 3.401197 3.637586 5.477226 6.164414
48 25 obs2 40 3.150060 1.74425125 3.688879 3.218876 6.324555 5.000000
49 48 obs2 45 3.830447 -0.07481765 3.806662 3.871201 6.708204 6.928203
50 38 obs2 30 3.586782 -0.58640489 3.401197 3.637586 5.477226 6.164414
51 22 obs2 20 3.016727 -0.06703062 2.995732 3.091042 4.472136 4.690416
52 22 obs2 20 3.016727 -0.06703062 2.995732 3.091042 4.472136 4.690416
53 42 obs2 35 3.691171 -0.42822171 3.555348 3.737670 5.916080 6.480741
54 34 obs2 30 3.470772 -0.21984639 3.401197 3.526361 5.477226 5.830952
55 14 obs2 12 2.545298 -0.19627132 2.484907 2.639057 3.464102 3.741657
56 30 obs2 30 3.340224 0.19312479 3.401197 3.401197 5.477226 5.477226
57 9 obs2 10 2.084458 0.72846783 2.302585 2.197225 3.162278 3.000000
58 18 obs2 18 2.807424 0.26678068 2.890372 2.890372 4.242641 4.242641
59 25 obs2 30 3.150060 0.80182065 3.401197 3.218876 5.477226 5.000000
60 62 obs2 50 4.097390 -0.58374228 3.912023 4.127134 7.071068 7.874008
61 26 obs2 20 3.190968 -0.62175673 2.995732 3.258097 4.472136 5.099020
62 88 obs2 120 4.462658 1.02926194 4.787492 4.477337 10.954451 9.380832
63 56 obs2 60 3.991229 0.32430923 4.094345 4.025352 7.745967 7.483315
64 11 obs2 10 2.293761 0.02902647 2.302585 2.397895 3.162278 3.316625
65 66 obs2 80 4.162600 0.69161282 4.382027 4.189655 8.944272 8.124038
66 42 obs2 35 3.691171 -0.42822171 3.555348 3.737670 5.916080 6.480741
67 30 obs2 30 3.340224 0.19312479 3.401197 3.401197 5.477226 5.477226
68 90 obs2 120 4.486098 0.95440746 4.787492 4.499810 10.954451 9.486833
69 119 obs2 200 4.777427 1.67491895 5.298317 4.779123 14.142136 10.908712
70 165 obs2 200 5.118309 0.57531226 5.298317 5.105945 14.142136 12.845233
71 152 obs2 150 5.032713 -0.07025994 5.010635 5.023881 12.247449 12.328828
72 205 obs2 200 5.344711 -0.14910093 5.298317 5.323010 14.142136 14.317821
73 409 obs2 300 6.065130 -1.21223971 5.703782 6.013715 17.320508 20.223748
74 342 obs2 500 5.878529 1.11404702 6.214608 5.834811 22.360680 18.493242
75 200 obs2 300 5.318956 1.24670980 5.703782 5.298317 17.320508 14.142136
76 73 obs2 40 4.267741 -1.85664343 3.688879 4.290459 6.324555 8.544004
77 123 obs2 80 4.811910 -1.37602628 4.382027 4.812184 8.944272 11.090537
78 150 obs2 120 5.018898 -0.73847933 4.787492 5.010635 10.954451 12.247449
79 70 obs2 60 4.223972 -0.40790406 4.094345 4.248495 7.745967 8.366600
80 90 obs2 100 4.486098 0.37536652 4.605170 4.499810 10.000000 9.486833
81 110 obs2 120 4.695401 0.29106232 4.787492 4.700480 10.954451 10.488088
82 95 obs2 150 4.542491 1.49482015 5.010635 4.553877 12.247449 9.746794
83 57 obs2 40 4.009690 -1.01440400 3.688879 4.043051 6.324555 7.549834
84 43 obs2 35 3.715714 -0.50568559 3.555348 3.761200 5.916080 6.557439
85 55 obs2 110 3.972435 2.36241832 4.700480 4.007333 10.488088 7.416198
86 325 obs2 400 5.825350 0.54608932 5.991465 5.783825 20.000000 18.027756
87 114 obs2 120 4.732656 0.17337192 4.787492 4.736198 10.954451 10.677078
88 83 obs2 40 4.401645 -2.31270363 3.688879 4.418841 6.324555 9.110434
89 91 obs2 60 4.497623 -1.28265054 4.094345 4.510860 7.745967 9.539392
90 56 obs2 40 3.991229 -0.95542401 3.688879 4.025352 6.324555 7.483315
Part 3b (sqrt of model):

Answers:
Taking the square root of count and photo produces a very uniform residual vs. predicted plot.
It does not have the trumpet shape seen in Part 1 and has only 6 questionable
observations (29, 37, 40, 73, 74, and 85), which are spread throughout the data.
In my opinion this is the best QQ plot I have produced. It shows a distribution that is very
close to normal, unlike the original data.
R Input:
snowgeese.long$sqrtcount=sqrt(snowgeese.long$count)
snowgeese.long$sqrtphoto=sqrt(snowgeese.long$photo)
model4 <- lm(sqrtcount ~ observer + sqrtphoto + observer:sqrtphoto, data = snowgeese.long)
snowgeese.long$yhat <- predict(model4) # predicted values on the square root scale
snowgeese.long$r <- rstudent(model4) # studentized residuals from the square root model
plot4 <- ggplot(snowgeese.long, aes(x = yhat, y = r, color = observer)) + geom_point()
plot4 <- plot4 + geom_segment(aes(x = yhat, xend = yhat, y = 0, yend = r))
plot4 <- plot4 + geom_hline(yintercept = 0)
plot4 <- plot4 + xlab("Predicted Value") + ylab("Studentized Residual")
plot4 <- plot4 + ggtitle("Snowgeese Data 3b")
print(plot4)
qqnorm(snowgeese.long$r)
snowgeese.long
R Output:
> print(plot4)
> snowgeese.long
photo observer count yhat r logcount logphoto sqrtcount sqrtphoto
1 56 obs1 50 6.503183 0.367075621 3.912023 4.025352 7.071068 7.483315
2 38 obs1 25 5.323539 -0.209756966 3.218876 3.637586 5.000000 6.164414
3 25 obs1 30 4.282071 0.781562158 3.401197 3.218876 5.477226 5.000000
4 48 obs1 35 6.006683 -0.058587750 3.555348 3.871201 5.916080 6.928203
5 38 obs1 25 5.323539 -0.209756966 3.218876 3.637586 5.000000 6.164414
6 22 obs1 20 4.005174 0.304980192 2.995732 3.091042 4.472136 4.690416
7 22 obs1 12 4.005174 -0.353449475 2.484907 3.091042 3.464102 4.690416
8 42 obs1 34 5.606466 0.145361598 3.526361 3.737670 5.830952 6.480741
9 34 obs1 20 5.025286 -0.359266907 2.995732 3.526361 4.472136 5.830952
10 14 obs1 10 3.156591 0.003735565 2.302585 2.639057 3.162278 3.741657
11 30 obs1 25 4.708908 0.189256138 3.218876 3.401197 5.000000 5.477226
12 9 obs1 10 2.493242 0.442679555 2.302585 2.197225 3.162278 3.000000
13 18 obs1 15 3.604678 0.175667018 2.708050 2.890372 3.872983 4.242641
14 25 obs1 20 4.282071 0.123858617 2.995732 3.218876 4.472136 5.000000
15 62 obs1 40 6.852624 -0.341117819 3.688879 4.127134 6.324555 7.874008
16 26 obs1 30 4.370635 0.722886869 3.401197 3.258097 5.477226 5.099020
17 88 obs1 75 8.200348 0.297052446 4.317488 4.477337 8.660254 9.380832
18 56 obs1 35 6.503183 -0.379518416 3.555348 4.025352 5.916080 7.483315
19 11 obs1 9 2.776436 0.147377575 2.197225 2.397895 3.000000 3.316625
20 66 obs1 55 7.076255 0.219456702 4.007333 4.189655 7.416198 8.124038
21 42 obs1 30 5.606466 -0.083680424 3.401197 3.737670 5.477226 6.480741
22 30 obs1 25 4.708908 0.189256138 3.218876 3.401197 5.000000 5.477226
23 90 obs1 40 8.295158 -1.284607121 3.688879 4.499810 6.324555 9.486833
24 119 obs1 75 9.566906 -0.588559941 4.317488 4.779123 8.660254 10.908712
25 165 obs1 100 11.298958 -0.853260987 4.605170 5.105945 10.000000 12.845233
26 152 obs1 150 10.837078 0.924225426 5.010635 5.023881 12.247449 12.328828
27 205 obs1 120 12.616062 -1.107122187 4.787492 5.323010 10.954451 14.317821
28 409 obs1 250 17.898407 -1.527043045 5.521461 6.013715 15.811388 20.223748
29 342 obs1 500 16.350617 4.711484992 6.214608 5.834811 22.360680 18.493242
30 200 obs1 200 12.458926 1.119969498 5.298317 5.298317 14.142136 14.142136
31 73 obs1 50 7.451878 -0.245814886 3.912023 4.290459 7.071068 8.544004
32 123 obs1 75 9.729533 -0.695140371 4.317488 4.812184 8.660254 11.090537
33 150 obs1 150 10.764291 0.971993012 5.010635 5.010635 12.247449 12.247449
34 70 obs1 50 7.293206 -0.143363192 3.912023 4.248495 7.071068 8.366600
35 90 obs1 60 8.295158 -0.354846702 4.094345 4.499810 7.745967 9.486833
36 110 obs1 75 9.190695 -0.343424800 4.317488 4.700480 8.660254 10.488088
37 95 obs1 150 8.527671 2.488598943 5.010635 4.553877 12.247449 9.746794
38 57 obs1 40 6.562679 -0.153803670 3.688879 4.043051 6.324555 7.549834
39 43 obs1 25 5.675066 -0.437459035 3.218876 3.761200 5.000000 6.557439
40 55 obs1 100 6.443153 2.372405406 4.605170 4.007333 10.000000 7.416198
41 325 obs1 200 15.934280 -1.252757820 5.298317 5.783825 14.142136 18.027756
42 114 obs1 60 9.359730 -1.051435387 4.094345 4.736198 7.745967 10.677078
43 83 obs1 40 7.958501 -1.061510382 3.688879 4.418841 6.324555 9.110434
44 91 obs1 35 8.342167 -1.589624858 3.555348 4.510860 5.916080 9.539392
45 56 obs1 20 6.503183 -1.325294034 2.995732 4.025352 4.472136 7.483315
46 56 obs2 40 7.459380 -0.735286833 3.688879 4.025352 6.324555 7.483315
47 38 obs2 30 6.036863 -0.363011496 3.401197 3.637586 5.477226 6.164414
48 25 obs2 40 4.780969 1.011847901 3.688879 3.218876 6.324555 5.000000
49 48 obs2 45 6.860658 -0.098586529 3.806662 3.871201 6.708204 6.928203
50 38 obs2 30 6.036863 -0.363011496 3.401197 3.637586 5.477226 6.164414
51 22 obs2 20 4.447063 0.016366515 2.995732 3.091042 4.472136 4.690416
52 22 obs2 20 4.447063 0.016366515 2.995732 3.091042 4.472136 4.690416
53 42 obs2 35 6.378041 -0.299255002 3.555348 3.737670 5.916080 6.480741
54 34 obs2 30 5.677203 -0.129798355 3.401197 3.526361 5.477226 5.830952
55 14 obs2 12 3.423768 0.026497079 2.484907 2.639057 3.464102 3.741657
56 30 obs2 30 5.295687 0.118013769 3.401197 3.401197 5.477226 5.477226
57 9 obs2 10 2.623843 0.356120423 2.302585 2.197225 3.162278 3.000000
58 18 obs2 18 3.964110 0.182364603 2.890372 2.890372 4.242641 4.242641
59 25 obs2 30 4.780969 0.454234444 3.401197 3.218876 5.477226 5.000000
60 62 obs2 50 7.880768 -0.523528065 3.912023 4.127134 7.071068 7.874008
61 26 obs2 20 4.887768 -0.270800127 2.995732 3.258097 4.472136 5.099020
62 88 obs2 120 9.505972 0.939932778 4.787492 4.477337 10.954451 9.380832
63 56 obs2 60 7.459380 0.185137282 4.094345 4.025352 7.745967 7.483315
64 11 obs2 10 2.965343 0.129819286 2.302585 2.397895 3.162278 3.316625
65 66 obs2 80 8.150441 0.513119412 4.382027 4.189655 8.944272 8.124038
66 42 obs2 35 6.378041 -0.299255002 3.555348 3.737670 5.916080 6.480741
67 30 obs2 30 5.295687 0.118013769 3.401197 3.401197 5.477226 5.477226
68 90 obs2 120 9.620301 0.865176260 4.787492 4.499810 10.954451 9.486833
69 119 obs2 200 11.153887 1.980045388 5.298317 4.779123 14.142136 10.908712
70 165 obs2 200 13.242546 0.589610898 5.298317 5.105945 14.142136 12.845233
71 152 obs2 150 12.685571 -0.285810168 5.010635 5.023881 12.247449 12.328828
72 205 obs2 200 14.830826 -0.456154532 5.298317 5.323010 14.142136 14.317821
73 409 obs2 300 21.200740 -2.939910796 5.703782 6.013715 17.320508 20.223748
74 342 obs2 500 19.334280 2.170367401 6.214608 5.834811 22.360680 18.493242
75 200 obs2 300 14.641338 1.803178971 5.703782 5.298317 17.320508 14.142136
76 73 obs2 40 8.603400 -1.489552424 3.688879 4.290459 6.324555 8.544004
77 123 obs2 80 11.349996 -1.582342395 4.382027 4.812184 8.944272 11.090537
78 150 obs2 120 12.597799 -1.078339581 4.787492 5.010635 10.954451 12.247449
79 70 obs2 60 8.412059 -0.430298512 4.094345 4.248495 7.745967 8.366600
80 90 obs2 100 9.620301 0.245238764 4.605170 4.499810 10.000000 9.486833
81 110 obs2 120 10.700218 0.164511064 4.787492 4.700480 10.954451 10.488088
82 95 obs2 150 9.900686 1.536688788 5.010635 4.553877 12.247449 9.746794
83 57 obs2 40 7.531126 -0.782012474 3.688879 4.043051 6.324555 7.549834
84 43 obs2 35 6.460765 -0.352830467 3.555348 3.761200 5.916080 6.557439
85 55 obs2 110 7.386991 2.052189773 4.700480 4.007333 10.488088 7.416198
86 325 obs2 400 18.832224 0.812005043 5.991465 5.783825 20.000000 18.027756
87 114 obs2 120 10.904055 0.032623762 4.787492 4.736198 10.954451 10.677078
88 83 obs2 40 9.214330 -1.904426596 3.688879 4.418841 6.324555 9.110434
89 91 obs2 60 9.676989 -1.258411136 4.094345 4.510860 7.745967 9.539392
90 56 obs2 40 7.459380 -0.735286833 3.688879 4.025352 6.324555 7.483315