
Basic Business Statistics:

Using Microsoft Excel

1. Histograms in Excel

1 Select Tools/Data Analysis

1. Histograms in Excel (continued)


2 Choose Histogram

3 Input data and bin ranges


Select Chart Output

Exercise 1:
Given below are the heights, in centimetres, of 50 students:

164 160 160 171 160 155 158 170 164 162
160 162 172 171 162 160 162 166 172 158
163 165 164 161 168 157 168 166 160 162
163 167 158 159 160 163 167 168 170 168
164 160 168 165 165 159 158 167 159 160

Exercise 1 (continued)
1. Place the data in an ordered array.
2. Set up a stem-and-leaf display for these data.
3. Construct the frequency distribution and the percentage distribution for these data.
4. Construct a grouped frequency distribution table with a class interval width of 5 cm.
5. From this frequency distribution table, construct the bar graphs and the pie charts.
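As a way to check hand or Excel answers to tasks 1 to 4, the tabulations can be sketched in Python. This is only an illustration, not part of the exercise: the variable names are hypothetical, and starting the first class at 155 cm is an assumption (the exercise only fixes the class width).

```python
from collections import Counter

# Heights (cm) of the 50 students, copied from Exercise 1.
heights = [
    164, 160, 160, 171, 160, 155, 158, 170, 164, 162,
    160, 162, 172, 171, 162, 160, 162, 166, 172, 158,
    163, 165, 164, 161, 168, 157, 168, 166, 160, 162,
    163, 167, 158, 159, 160, 163, 167, 168, 170, 168,
    164, 160, 168, 165, 165, 159, 158, 167, 159, 160,
]

# 1. Ordered array: the observations sorted from smallest to largest.
ordered = sorted(heights)

# 2. Stem-and-leaf display: stem = first two digits, leaf = last digit.
stems = {}
for h in ordered:
    stems.setdefault(h // 10, []).append(h % 10)

# 3. Frequency and percentage distribution of the individual values.
freq = Counter(ordered)
pct = {value: 100 * count / len(heights) for value, count in freq.items()}

# 4. Grouped frequency distribution with a class width of 5 cm.
#    ASSUMPTION: the first class starts at 155 (the smallest observation).
WIDTH, START = 5, 155
grouped = Counter(START + WIDTH * ((h - START) // WIDTH) for h in heights)

for stem, leaves in stems.items():
    print(stem, "|", "".join(map(str, leaves)))
for lower in sorted(grouped):
    print(f"{lower}-{lower + WIDTH - 1}: {grouped[lower]}")
```

The grouped counts are the numbers that the bar graph and pie chart in task 5 would display.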

2. Descriptive statistics
Use menu choice: Tools / Data Analysis / Descriptive Statistics
Enter details in dialog box
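For readers without Excel at hand, several of the summary statistics that the Descriptive Statistics tool reports can be sketched with Python's standard `statistics` module. The data here are the Exercise 1 heights, used purely as a stand-in (an assumption, since the slide does not say which data to enter), and the dictionary keys simply mirror Excel's labels.

```python
import statistics

# Stand-in data: the 50 student heights (cm) from Exercise 1.
heights = [
    164, 160, 160, 171, 160, 155, 158, 170, 164, 162,
    160, 162, 172, 171, 162, 160, 162, 166, 172, 158,
    163, 165, 164, 161, 168, 157, 168, 166, 160, 162,
    163, 167, 158, 159, 160, 163, 167, 168, 170, 168,
    164, 160, 168, 165, 165, 159, 158, 167, 159, 160,
]

summary = {
    "Mean": statistics.mean(heights),
    "Median": statistics.median(heights),
    "Mode": statistics.mode(heights),
    "Standard Deviation": statistics.stdev(heights),  # sample (n - 1) version
    "Sample Variance": statistics.variance(heights),
    "Range": max(heights) - min(heights),
    "Minimum": min(heights),
    "Maximum": max(heights),
    "Sum": sum(heights),
    "Count": len(heights),
}

for name, value in summary.items():
    print(f"{name:20s}{value:12.4f}")
```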


2. Descriptive statistics (continued)

Enter details in dialog box
Check box for summary statistics
Click OK

2. Descriptive statistics (continued)


Microsoft Excel descriptive statistics output, using the house price data:
House Prices: $2,000,000 500,000 300,000 00,000 00,000

3. Simple Linear Regression

Sample Data for House Price Model:

House Price in $1000s (y)   Square Feet (x)
245                         1400
312                         1600
279                         1700
308                         1875
199                         1100
219                         1550
405                         2350
324                         2450
319                         1425
255                         1700
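The least-squares line that Excel's Regression tool fits to these ten houses can also be computed by hand. The sketch below is a minimal illustration using the usual textbook formulas for slope and intercept; the variable names are hypothetical, and it assumes the i-th square-feet value pairs with the i-th price.

```python
# Sample data for the house price model (prices in $1000s).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))

b1 = s_xy / s_xx          # slope
b0 = y_bar - b1 * x_bar   # intercept

print(f"house price = {b0:.5f} + {b1:.5f} (square feet)")
```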

3. Simple linear regression


Tools / Data Analysis / Regression

3. Simple linear regression: Excel Output

Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10

The regression equation is:

house price = 98.24833 + 0.10977 (square feet)

ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580

3. Regression Excel Output (continued): Graphical Presentation

House price model: scatter plot and regression line

[Figure: scatter plot of House Price ($1000s) against Square Feet, with the fitted regression line]

Slope = 0.10977
Intercept = 98.248

house price = 98.24833 + 0.10977 (square feet)

3. Simple Linear Regression (Interpretation of b0, b1)

house price = 98.24833 + 0.10977 (square feet)

Here, no houses had 0 square feet, so b0 = 98.24833 just indicates that, for houses within the range of sizes observed, $98,248.33 is the portion of the house price not explained by square feet.

Here, b1 = 0.10977 tells us that the average value of a house increases by 0.10977($1000) = $109.77 for each additional square foot of size.

3. Regression Excel Output: Estimate

Estimated Regression Equation:

house price = 98.25 + 0.1098 (sq.ft.)

Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (sq.ft.)
            = 98.25 + 0.1098(2000)
            = 317.85

The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850
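The prediction step is plain arithmetic on the fitted equation. As a small sketch, a hypothetical helper `predict_price` (not something Excel provides) wraps the rounded coefficients used above.

```python
# Rounded coefficients of the estimated regression equation.
B0 = 98.25    # intercept, in $1000s
B1 = 0.1098   # slope, in $1000s per square foot


def predict_price(square_feet: float) -> float:
    """Return the predicted house price in $1000s."""
    return B0 + B1 * square_feet


price = predict_price(2000)
print(f"{price:.2f} ($1000s) = ${price * 1000:,.0f}")
```

The sample houses range from 1100 to 2450 square feet, so 2000 square feet lies inside the observed range.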

3. Regression Excel Output: Coefficient of Determination, R²

Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10

$$R^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082$$

58.08% of the variation in house prices is explained by variation in square feet
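The ratio can be checked directly from the sums of squares in the ANOVA table. The sketch below copies those three values from the Excel output. The last line uses the fact that, in simple regression, Multiple R is the square root of R Square.

```python
import math

# Sums of squares copied from the ANOVA table of the Excel output.
ssr = 18934.9348   # Regression SS
sse = 13665.5652   # Residual SS
sst = 32600.5000   # Total SS

r_squared = ssr / sst
multiple_r = math.sqrt(r_squared)

print(f"R Square   = {r_squared:.5f}")
print(f"Multiple R = {multiple_r:.5f}")
print(f"{100 * r_squared:.2f}% of the variation in house prices is explained")
```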



3. Simple Linear Regression: t Test for the Slope, b1

t test for a population slope
Is there a linear relationship between x and y?

Hypotheses
H0: $\beta_1 = 0$ (no linear relationship)
H1: $\beta_1 \neq 0$ (linear relationship does exist)

Test statistic

$$t = \frac{b_1 - \beta_1}{s_{b_1}}, \qquad \text{d.f.} = n - 2$$

where:
$b_1$ = Sample regression slope coefficient
$\beta_1$ = Hypothesized slope
$s_{b_1}$ = Estimator of the standard error of the slope

3. t Test for the Slope: the standard error (Excel Output)

Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10

s = 41.33032 (Standard Error in the Regression Statistics)
sb1 = 0.03297 (Standard Error of the Square Feet coefficient)

3. t Test for the Slope (continued)

H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$

From Excel output:

              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet   0.10977 (b1)   0.03297 (sb1)    3.32938 (t)   0.01039

Test Statistic: t = 3.329

d.f. = 10 - 2 = 8
$\alpha/2 = .025$ in each tail, so the critical values are $\pm t_{\alpha/2} = \pm 2.3060$

Reject H0 if t < -2.3060 or t > 2.3060; otherwise do not reject H0.
Here t = 3.329 > 2.3060, so the test statistic falls in the rejection region.

Decision: Reject H0
Conclusion: There is sufficient evidence that square footage affects house price
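The whole test can be retraced from the raw sample. The sketch below is an illustration under two stated assumptions: it uses the standard formula $s_{b_1} = s / \sqrt{\sum (x - \bar{x})^2}$, which the slides do not spell out, and it takes the critical value 2.3060 from the slide rather than computing it.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
house_price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # $1000s

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(house_price) / n
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, house_price))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, s, from the residual sum of squares.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, house_price))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope (ASSUMED textbook formula, see lead-in).
s_b1 = s / math.sqrt(s_xx)

# Test statistic under H0: beta1 = 0.
t_stat = (b1 - 0) / s_b1

T_CRITICAL = 2.3060  # two-tail value for alpha = .05 and 8 d.f., from the slide
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"s = {s:.5f}, s_b1 = {s_b1:.5f}, t = {t_stat:.5f}, reject H0: {reject_h0}")
```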

4. Multiple Regression: Example

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand

Dependent variable: Pie sales (units per week)
Independent variables: Price (in $), Advertising ($100s)

Data are collected for 15 weeks

Multiple Linear Regression Equation

Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA
             df   SS          MS          F         Significance F
Regression   2    29460.027   14730.013   6.53861   0.01201
Residual     12   27033.306   2252.776
Total        14   56493.333

              Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept     306.52619      114.25389        2.68285    0.01993   57.58835    555.46404
Price         -24.97509      10.83213         -2.30565   0.03979   -48.57626   -1.37392
Advertising   74.13096       25.96732         2.85478    0.01449   17.55303    130.70888
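The three coefficients can be reproduced from the 15 weekly observations (listed with the correlation matrix later in this section). The sketch below solves the two-variable normal equations on mean-centred data with Cramer's rule. It is a hand-rolled illustration with hypothetical helper names, not the routine Excel uses internally.

```python
# Weekly data for the pie sales example (weeks 1 to 15).
pie_sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490,
             340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00,
         7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0,
               3.5, 3.2, 4.0, 3.5, 2.7]  # in $100s


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


s11 = cross(price, price)
s22 = cross(advertising, advertising)
s12 = cross(price, advertising)
s1y = cross(price, pie_sales)
s2y = cross(advertising, pie_sales)

# Normal equations:  s11*b1 + s12*b2 = s1y  and  s12*b1 + s22*b2 = s2y
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(pie_sales) - b1 * mean(price) - b2 * mean(advertising)

print(f"Sales = {b0:.3f} + ({b1:.3f})(Price) + {b2:.3f}(Advertising)")
```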

4. Multiple Regression (continued)

Multiple Linear Regression Equation

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

where
Sales is in number of pies per week
Price is in $
Advertising is in $100s

b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising

b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price

4. Multiple Regression (continued): Using the Model to Make Predictions

Predict sales for a week in which the selling price is $5.50 and advertising is $350:

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
      = 306.526 - 24.975(5.50) + 74.131(3.5)
      = 428.62

Predicted sales is 428.62 pies

Note that Advertising is in $100s, so $350 means that x2 = 3.5
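The unit conversion is the easy part to get wrong, so the sketch below makes it explicit. `predict_sales` is a hypothetical helper that takes advertising in dollars and converts it to $100s before applying the rounded coefficients.

```python
def predict_sales(price: float, advertising_dollars: float) -> float:
    """Predicted pies per week from price ($) and advertising ($)."""
    advertising = advertising_dollars / 100  # the model expects $100s
    return 306.526 - 24.975 * price + 74.131 * advertising


sales = predict_sales(price=5.50, advertising_dollars=350)
print(f"Predicted sales: {sales:.2f} pies")
```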

4. Multiple Regression (continued): Multiple Coefficient of Determination

Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

$$R^2 = \frac{SSR}{SST} = \frac{29460.0}{56493.3} = 0.52148$$

52.1% of the variation in pie sales is explained by the variation in price and advertising
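Both R Square and Adjusted R Square in the output can be checked from the ANOVA sums of squares. The adjusted figure is not derived on the slides. The sketch assumes the usual formula $1 - (1 - R^2)\frac{n-1}{n-k-1}$, with n observations and k independent variables.

```python
# Values copied from the ANOVA table of the pie sales output.
ssr = 29460.027   # Regression SS
sst = 56493.333   # Total SS
n, k = 15, 2      # observations, independent variables

r_squared = ssr / sst

# ASSUMED standard adjustment for the number of independent variables.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R Square          = {r_squared:.5f}")
print(f"Adjusted R Square = {adjusted_r_squared:.5f}")
```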


4. Multiple Regression (continued): Correlation matrix

Week   Pie Sales   Price ($)   Advertising ($100s)
1      350         5.50        3.3
2      460         7.50        3.3
3      350         8.00        3.0
4      430         8.00        4.5
5      350         6.80        3.0
6      380         7.50        4.0
7      430         4.50        3.0
8      470         6.40        3.7
9      450         7.00        3.5
10     490         5.00        4.0
11     340         7.20        3.5
12     300         7.90        3.2
13     440         5.90        4.0
14     450         5.00        3.5
15     300         7.00        2.7

Multiple regression model:

Sales = b0 + b1 (Price) + b2 (Advertising)

Correlation matrix:

              Pie Sales   Price     Advertising
Pie Sales     1
Price         -0.44327    1
Advertising   0.55632     0.03044   1

4. Multiple Regression (continued): Correlation matrix

Price vs. Sales: r = -0.44327
There is a negative association between price and sales

Advertising vs. Sales: r = 0.55632
There is a positive association between advertising and sales
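Each entry of the correlation matrix is a Pearson correlation coefficient between two columns of the weekly data. A minimal sketch, with a hypothetical `pearson_r` helper written out in full rather than taken from a library:

```python
import math

pie_sales = [350, 460, 350, 430, 350, 380, 430, 470, 450, 490,
             340, 300, 440, 450, 300]
price = [5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40, 7.00, 5.00,
         7.20, 7.90, 5.90, 5.00, 7.00]
advertising = [3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7, 3.5, 4.0,
               3.5, 3.2, 4.0, 3.5, 2.7]  # in $100s


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


r_price_sales = pearson_r(price, pie_sales)
r_adv_sales = pearson_r(advertising, pie_sales)
r_price_adv = pearson_r(price, advertising)

print(f"Price vs. Sales:       r = {r_price_sales:.5f}")
print(f"Advertising vs. Sales: r = {r_adv_sales:.5f}")
print(f"Price vs. Advertising: r = {r_price_adv:.5f}")
```

The near-zero correlation between Price and Advertising indicates that the two independent variables carry largely separate information in this sample.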

4. Multiple Regression: t-tests of individual variable slopes, b1 and b2

t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, with p-value .0145

4. Multiple Regression: t-tests of individual variable slopes, b1 and b2

H0: $\beta_i = 0$
H1: $\beta_i \neq 0$

$\alpha = .05$
d.f. = 15 - 2 - 1 = 12
$t_{\alpha/2} = 2.1788$

From Excel output:

              Coefficients   Standard Error   t Stat     P-value
Price         -24.97509      10.83213         -2.30565   0.03979
Advertising   74.13096       25.96732         2.85478    0.01449

Rejection regions ($\alpha/2 = .025$ in each tail): reject H0 if t < -2.1788 or t > 2.1788; otherwise do not reject H0.

The test statistic for each variable falls in the rejection region (p-values < .05)

Decision: Reject H0 for each variable

Conclusion: There is evidence that both Price and Advertising affect pie sales at $\alpha = .05$
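Each t Stat in the output is the coefficient divided by its standard error, and the decision rule compares its absolute value with the critical value. The sketch below copies the coefficients, standard errors and critical value from the slide. Nothing is estimated here.

```python
T_CRITICAL = 2.1788  # two-tail value for alpha = .05 and 12 d.f., from the slide

# (coefficient, standard error) pairs copied from the Excel output.
coefficients = {
    "Price": (-24.97509, 10.83213),
    "Advertising": (74.13096, 25.96732),
}

# t = b_i / s_bi under H0: beta_i = 0.
t_stats = {name: b / se for name, (b, se) in coefficients.items()}
decisions = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "reject H0" if decisions[name] else "do not reject H0"
    print(f"{name:12s} t = {t:8.5f}  ->  {verdict}")
```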

4. Multiple Regression: Standard Deviation of the Regression Model

Regression Statistics
Multiple R          0.72213
R Square            0.52148
Adjusted R Square   0.44172
Standard Error      47.46341
Observations        15

The standard deviation of the regression model is 47.46
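The Standard Error reported by Excel is the square root of the residual mean square. A short sketch, assuming the residual sum of squares from the ANOVA table for this model (Residual SS = 27033.306 with 12 degrees of freedom):

```python
import math

sse = 27033.306   # Residual SS from the ANOVA table
n, k = 15, 2      # observations, independent variables

mse = sse / (n - k - 1)   # residual mean square
s = math.sqrt(mse)        # standard deviation of the regression model

print(f"MSE = {mse:.3f}, s = {s:.5f}")
```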


4. Multiple Regression (continued): F-Test for Overall Significance of the Model

$$F = \frac{MSR}{MSE} = \frac{14730.0}{2252.8} = 6.5386$$

With 2 and 12 degrees of freedom

ANOVA
             df   SS          MS          F         Significance F
Regression   2    29460.027   14730.013   6.53861   0.01201
Residual     12   27033.306   2252.776
Total        14   56493.333

P-value for the F-Test: Significance F = 0.01201

4. Multiple Regression (continued): F-Test for Overall Significance of the Model

H0: $\beta_1 = \beta_2 = 0$
H1: $\beta_1$ and $\beta_2$ not both zero

$\alpha = .05$
df1 = 2, df2 = 12

Critical value: $F_{.05} = 3.885$ (do not reject H0 if F ≤ 3.885; reject H0 if F > 3.885)

Test Statistic:

$$F = \frac{MSR}{MSE} = 6.5386$$

Decision: Reject H0 at $\alpha = 0.05$, since F = 6.5386 > 3.885

Conclusion: The regression model does explain a significant portion of the variation in pie sales
(There is evidence that at least one independent variable affects y)
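The F statistic and the decision can be retraced from the ANOVA sums of squares. In the sketch below the critical value 3.885 is taken from the slide rather than computed, and the variable names are hypothetical.

```python
# Sums of squares copied from the ANOVA table of the pie sales output.
ssr = 29460.027   # Regression SS
sse = 27033.306   # Residual SS
n, k = 15, 2      # observations, independent variables

msr = ssr / k               # mean square regression, df1 = k
mse = sse / (n - k - 1)     # mean square error,      df2 = n - k - 1
f_stat = msr / mse

F_CRITICAL = 3.885  # F(.05; 2, 12), from the slide
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.4f} with {k} and {n - k - 1} d.f.; reject H0: {reject_h0}")
```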

5. ANOVA

Tools / Data Analysis / Anova: Single Factor
