
STAT 30100: Elementary Statistical Methods I

Department of Mathematical Sciences


School of Science, IUPUI

Project # 1
Fall 2015, Total 30 points

Name:_______________KEY________________________________

Due Date: 9/20/2015

Instructions:
Provide complete and clear explanations to all questions. Graphs must be labeled for clear communication. Remember the importance of context throughout in descriptions and graphs! Your project must be typed, easy to read and have the answer to each question identified by question number. Copy and paste the specified StatCrunch output, including graphs and number summaries, into the Word document. The indicated StatCrunch output MUST appear with the discussion, not as separate pages. Your project must be printed and handed in at the beginning of class on the due date specified on the class schedule.

The dataset (DEMOGRAPHIC.XLS) for this project contains data on various demographic characteristics for each of the individual states in the US. The data is taken from the Statistical Abstract of the United States for 2007, which gives the available data for each of the variables. (This is why you will see different years for different variables. Luckily, demographic data does not typically change dramatically from year to year.) More information on this dataset can be found in the file called Description of the variable for the Demographic dataset. To adequately label your graphs and describe the data behavior, you must use the information from that document, not just the variable name from the dataset.

We will begin to investigate this data set by examining several of the variables. All graphs and calculations must be completed using StatCrunch.

1. (a) Make a dotplot of the Infant mortality rate.

[Important Notes: Make sure you clearly provide the title and label the axis in the context]

(2 pt for correct graph, 1 pt for x-axis label, 1 pt for title = Total 4 pts)

(b) Describe your graph obtained in part (a) using guidelines in class.

[Important Notes: In your description, START by giving the context. You are describing the distribution of
a specific variable (including relevant units)
for a specified set of cases or individuals
for either a sample or population (justify)

Then, continuing in context, identify the distribution's
General shape: mound, bell-shaped, triangular, rectangular; generally ignores outliers
Symmetry or skew
Primary peaks or modes (give location by variable value) (don't describe every up and down movement on the dotplots or stemplot; we are looking for a distinctive tendency of the data to go up, back down, then go up again)
Outliers, Gaps, Cluster (give location by variable value)
Center: what value splits the data set in half? (you include both the position in terms of the ordered observations and the variable value)
Spread: describe spread as the minimum to maximum, ignoring the outliers!]

Variable: The dotplot shows the distribution across states of Infant Mortality Rates in 2007, measured by the number of deaths (of babies before they attain one year of age) per 1000 live births, excluding fetal deaths. [2 pt]
Cases: The cases represented are the states. [1 pt]

Sample or Population: The dataset represents a population since all 50 states in the United States are included. [1 pt]

Shape/Peak: The distribution of infant mortality rates is mound-shaped (unimodal) and fairly symmetric with a peak around 7 deaths per 1000 live births. [2 pt]

Center: The distribution of death rates has a center (median) between the 25th and 26th ordered observations at about 6.75 (or 7) deaths per 1000 live births. [1 pt]

Spread: The death rates spread from around 4 deaths per 1000 live births to 11 deaths per 1000 live births. [1 pt]

(Total 8 pts)
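
The median position quoted in the Center description can be checked with a short Python sketch. It assumes only that there are n = 50 ordered observations (one per state); the function name `median_positions` is illustrative and is not part of StatCrunch.

```python
def median_positions(n):
    """Return the 1-based position(s) of the ordered observation(s) that define the median."""
    if n % 2 == 1:
        # Odd n: a single middle observation.
        return ((n + 1) // 2,)
    # Even n: the median is the average of the two middle observations.
    return (n // 2, n // 2 + 1)


# 50 states give an even count, so the median lies between two observations.
print(median_positions(50))  # (25, 26)
```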

2. (a) Find the numerical summaries (summary statistics) for the Infant mortality rate. Interpret the value of mean and standard deviation in the context.

Table: Summary statistics of the State Infant Mortality Rates (deaths per 1,000 live births) in 2007

Column                          Mean   Std. Dev.   Median   Min   Max    Q1    Q3
State Infant Mortality Rates    6.88   1.46        6.75     4     10.7   5.7   7.8

(2 pts)

Mean: The average of all State Infant Mortality rates is 6.88 deaths (before attaining 1 year) per 1,000 live births.

(1 pt)

Standard Deviation: Roughly speaking, on average, each State infant mortality rate is 1.46 deaths (before attaining 1 year, per 1,000 live births) away from the mean of 6.88, the average of all State infant mortality rates.

(1 pt)

(b) Create a horizontal boxplot of the Infant mortality rate. Do you see any outlier(s)?

(2 pt for correct graph, ½ pt for x-axis label, ½ pt for title = Total 3 pts)

There is no outlier detected by the boxplot.

(1 pt)
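
The "no outlier" conclusion can be cross-checked from the summary statistics alone. The sketch below assumes the boxplot flags outliers with the common 1.5 × IQR fence rule (an assumption about the plot settings, not something stated in the project) and uses the five-number summary reported in the table for 2(a).

```python
# Five-number summary as reported in the summary table (deaths per 1,000 live births).
minimum, q1, median, q3, maximum = 4.0, 5.7, 6.75, 7.8, 10.7

iqr = q3 - q1                 # interquartile range
lower_fence = q1 - 1.5 * iqr  # values below this would be flagged as outliers
upper_fence = q3 + 1.5 * iqr  # values above this would be flagged as outliers

# If even the minimum and maximum lie inside the fences, no observation is flagged.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"IQR = {iqr:.2f}, fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print("Outliers detected:", has_outliers)
```

Under that assumption the fences are roughly 2.55 and 10.95, and both the minimum (4) and the maximum (10.7) fall inside them.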

(c) What is the Infant mortality rate in Indiana State? Which quarter does Indiana fall in the boxplot?

[Important Notes: Make sure for summary statistics you include mean, median, standard deviation, and five-number summary, and for the boxplot, you clearly name the title and label the horizontal axis in the context]

The infant mortality rate in Indiana is 7.7 deaths before attaining 1 year per 1,000 live births.
(1 pt)
In the boxplot, IN falls in the (upper edge of the) 3rd quarter of all states' infant mortality rates.

(1 pt)
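
As a small illustration of how Indiana's quarter follows from the reported quartiles, here is a hedged Python sketch. The helper `quarter` and its convention for values that fall exactly on a quartile are illustrative assumptions; the quartile values come from the summary table in 2(a).

```python
# Quartiles from the summary table and Indiana's reported rate
# (all in deaths per 1,000 live births).
q1, median, q3 = 5.7, 6.75, 7.8
indiana = 7.7


def quarter(value, q1, median, q3):
    """Return which quarter (1 to 4) of the boxplot a value falls in.

    Values exactly on a quartile are assigned to the lower quarter;
    this tie-breaking rule is an arbitrary choice for the sketch.
    """
    if value <= q1:
        return 1
    if value <= median:
        return 2
    if value <= q3:
        return 3
    return 4


print(quarter(indiana, q1, median, q3))  # 3
```

Because 7.7 lies between the median (6.75) and Q3 (7.8), and only 0.1 below Q3, Indiana sits near the upper edge of the third quarter.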

[Figure: horizontal boxplot; horizontal axis: State Infant Mortality Rate (deaths/1000 live births)]

3. (a) Draw a scatterplot for variables infant mortality rate and percentage of persons in state age 25 & over who have not completed high school. Using only the plot and no other numerical information in this step, describe the trend and strength between these two variables.

[Figure: scatterplot titled "Infant Mortality Rate VS % of 25+ State Population w/o High school"; x-axis: Percent of State 25+ population without High School (ticks from 6 to 22); y-axis ticks from 4 to 11]

(2 pts. if graph looks ok; 1 pt. for title, ½ pt. for x label, ½ pt. for y label = 4 pts. Total)


Describe the trend.
The trend of association is positive: as the percent of a state's 25+ population without high school education increases, so does the infant mortality rate.

(1 pt.)

Describe the strength.
The strength between infant mortality rate and percent of a state's 25+ population without high school education is moderately strong and fairly constant (with the exception of the 4 states that are clearly separated).

(1 pt.)

(b)

Find the correlation coefficient between the variables, infant mortality rate and percentage of persons in state age 25 & over who have not completed high school, and interpret the result in context.

Pearson correlation of InfMortRate and %noHS = 0.514

(pt.)

The correlation coefficient of 0.514 indicates a moderate, positive linear association between infant mortality rate and percent of a state's 25+ population without high school education.

(1pt.)

(c)

Find the regression model for predicting infant mortality rate from percentage of persons in state age 25 & over who have not completed high school, and interpret the regression coefficients (y-intercept and slope) in context.

The regression equation is

(pt.)
InfMortRate 03 = 4.31 + 0.192 %noHS

SLOPE:
The slope value of 0.192 indicates that for each 1% increase in the percent of a state's 25+ population without high school education, the infant mortality rate increases by about 0.2 deaths per 1000 live births.

(1 pt.)

Y-INTERCEPT:
The y-intercept value of 4.31 indicates that when the percent of a state's 25+ population without high school education is 0 (which is a very desirable goal), the infant mortality rate would still be about 4.31 deaths before attaining one year per 1000 live births.
(1 pt.)

(d)

Report the coefficient of determination and interpret its value in context.

(pt.)

R-Sq = 26.4%

26.4% of the variation in infant mortality rates is explained by the linear association between infant mortality rate and percent of a state's 25+ population without high school. (pt.)
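
For a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation. The minimal Python check below uses only the correlation reported in part (b); the variable names are illustrative.

```python
r = 0.514           # Pearson correlation reported in part (b)
r_squared = r ** 2  # coefficient of determination for a one-predictor linear model

print(f"R-Sq = {r_squared:.1%}")  # R-Sq = 26.4%
```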

(e) Predict the infant mortality rate when percentage of persons in states age 25 & over who have not completed High School is 7.5.

InfMortRate = 4.31 + 0.192 (7.5) = 5.75


(pt.)

Also, predict the infant mortality rate when percentage of persons in states age 25 & over who have not completed High School is 19.5.

InfMortRate = 4.31 + 0.192 (19.5) = 8.054


(pt.)
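
The two predictions can be reproduced with a short Python sketch that plugs the requested percentages into the fitted line. It uses the rounded coefficients exactly as printed in the regression equation, so results may differ slightly in the last digit from output computed with unrounded coefficients. The function name is illustrative.

```python
INTERCEPT = 4.31  # reported y-intercept (deaths per 1,000 live births)
SLOPE = 0.192     # reported slope (deaths per 1,000 live births per percentage point)


def predict_inf_mort_rate(pct_no_hs):
    """Predicted infant mortality rate for a given percent of 25+ population without high school."""
    return INTERCEPT + SLOPE * pct_no_hs


for pct in (7.5, 19.5):
    print(f"%noHS = {pct}: predicted rate = {predict_inf_mort_rate(pct):.3f}")
```

The difference between the two predictions, SLOPE × (19.5 − 7.5) ≈ 2.3 deaths per 1,000 live births, is the slope interpretation applied over a 12-point change in the predictor.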

[Important Notes: For the scatterplot, use infant mortality rate on the Y-axis and the other variable on the X-axis, and name the title and label each axis in the context]
