
ASA University Bangladesh
Md. Tareq Ferdous Khan
Lecturer, Faculty of Business

Regression Analysis

Regression analysis is a technique for studying the dependence of one variable (called the dependent variable) on one or more other variables (called explanatory variables), with a view to estimating and/or predicting the population mean or average value of the former in terms of the known or fixed values of the latter.

Uses of Regression Analysis

1. Estimate the relationship that exists, on average, between the dependent variable and the explanatory variables.
2. Determine the effect of each explanatory variable on the dependent variable, controlling for the effects of all other explanatory variables.
3. Predict the value of the dependent variable for given values of the explanatory variables.

Multiple Regression Model

The regression equation that describes how the dependent variable $y$ is related to the independent or explanatory variables $x_1, x_2, \ldots, x_p$ and an error term $\varepsilon$ is called the multiple regression model. The multiple regression model with $p$ independent variables can be written as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$$

where,

$y$ = the value of the dependent variable
$x_i$ = the value of the $i$th independent variable, $i = 1, 2, \ldots, p$
$\beta_0$ = intercept
$\beta_1, \beta_2, \ldots, \beta_p$ = the regression coefficients
$\varepsilon$ = the random error component

Multiple Regression Equation

The equation that describes how the mean value of $y$ is related to the independent or explanatory variables $x_1, x_2, \ldots, x_p$ is called the multiple regression equation. The multiple regression equation with $p$ independent variables can be written as:

$$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p$$

where,

$E(y)$ = the mean or expected value of the dependent variable
$x_i$ = the value of the $i$th independent variable, $i = 1, 2, \ldots, p$
$\beta_0$ = intercept
$\beta_1, \beta_2, \ldots, \beta_p$ = the regression coefficients
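The difference between the regression model (which includes the error term) and the regression equation (which gives only the mean) can be illustrated with a short Python sketch. The coefficient values, the error standard deviation and the function names below are assumptions chosen for illustration; they do not come from any data set in these notes.

```python
import random


def mean_response(betas, xs):
    """E(y) = beta_0 + beta_1*x_1 + ... + beta_p*x_p (the regression equation)."""
    intercept, slopes = betas[0], betas[1:]
    return intercept + sum(b * x for b, x in zip(slopes, xs))


def simulate_response(betas, xs, sigma, rng):
    """y = E(y) + epsilon, with epsilon drawn from N(0, sigma^2) (the regression model)."""
    return mean_response(betas, xs) + rng.gauss(0.0, sigma)


# Hypothetical parameters: beta_0 = 2.0, beta_1 = 0.5, beta_2 = 1.5.
betas = [2.0, 0.5, 1.5]
xs = [4.0, 2.0]

expected_y = mean_response(betas, xs)          # fixed for given x values
rng = random.Random(1)                         # seeded so the run is repeatable
observed_y = simulate_response(betas, xs, sigma=0.5, rng=rng)

print("E(y):", expected_y)
print("one simulated y:", observed_y)
```

For fixed values of the independent variables the mean response never changes, while each simulated observation differs from it by a random error.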

Estimated Multiple Regression Equation

Generally, the parameters $\beta_0, \beta_1, \ldots, \beta_p$ of the regression model are unknown. The ordinary least squares (OLS) method is used to estimate the parameters from sample observations. The estimated values of the parameters give the following estimated multiple regression equation:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p$$

where $b_0, b_1, \ldots, b_p$ are the estimates of $\beta_0, \beta_1, \ldots, \beta_p$, and $\hat{y}$ is the estimated value of the dependent variable.

In general, the equation obtained by fitting the multiple regression model to sample observations with the ordinary least squares method is known as the estimated multiple regression equation.
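As a rough illustration of how least squares estimates are obtained, the sketch below solves the normal equations for the special case of two independent variables, using only the Python standard library. The function name is an assumption made for this example, and the data are the five observations from the second worked example later in these notes.

```python
def fit_two_predictor_ols(y, x1, x2):
    """Least squares estimates (b0, b1, b2) for y-hat = b0 + b1*x1 + b2*x2."""
    n = len(y)
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n

    # Sums of squares and cross-products of deviations from the means.
    s11 = sum((a - x1_bar) ** 2 for a in x1)
    s22 = sum((b - x2_bar) ** 2 for b in x2)
    s12 = sum((a - x1_bar) * (b - x2_bar) for a, b in zip(x1, x2))
    s1y = sum((a - x1_bar) * (c - y_bar) for a, c in zip(x1, y))
    s2y = sum((b - x2_bar) * (c - y_bar) for b, c in zip(x2, y))

    # Solve the two normal equations for the slopes (Cramer's rule),
    # then recover the intercept from the means.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


# Five observations on y, x1 and x2 taken from the later worked example.
y = [96, 90, 95, 92, 94]
x1 = [5.0, 2.0, 4.0, 2.5, 3.5]
x2 = [1.5, 2.0, 1.5, 2.5, 3.3]

b0, b1, b2 = fit_two_predictor_ols(y, x1, x2)
print(f"y-hat = {b0:.2f} + {b1:.2f} x1 + {b2:.2f} x2")
```

With more than two independent variables the same normal equations are usually solved in matrix form by statistical software rather than by hand.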

Assumptions about the Error Term in the Multiple Regression Model

In order to estimate the unknown parameters by OLS, the error term $\varepsilon$ should satisfy the following assumptions:

1. The error $\varepsilon$ is a random variable with mean or expected value of zero. That is, $E(\varepsilon) = 0$.
2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variables.
3. The values of $\varepsilon$ are independent.
4. The error $\varepsilon$ is a normally distributed random variable.

Relationship among SST, SSR and SSE

In multiple regression analysis, the total sum of squares (SST) can be partitioned into two components: the sum of squares due to regression (SSR) and the sum of squares due to error (SSE). The relationship among them is given below:

$$\text{SST} = \text{SSR} + \text{SSE}$$

where,

SST = total sum of squares = $\sum (y_i - \bar{y})^2$
SSR = sum of squares due to regression = $\sum (\hat{y}_i - \bar{y})^2$
SSE = sum of squares due to error = $\sum (y_i - \hat{y}_i)^2$

Multiple Coefficient of Determination

The multiple coefficient of determination is defined as the ratio of the sum of squares due to regression (SSR) to the total sum of squares (SST). It is used to evaluate the goodness of fit of the estimated regression equation. It is denoted by $R^2$ and can be written mathematically as:

$$R^2 = \frac{\text{SSR}}{\text{SST}}$$

The multiple coefficient of determination $R^2$ can be interpreted as the proportion of the variability in the dependent variable that can be explained by the estimated multiple regression equation.

Adjusted Multiple Coefficient of Determination

The adjusted multiple coefficient of determination can be calculated as:

$$R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k}$$

where,

$R_a^2$ = adjusted multiple coefficient of determination
$R^2$ = multiple coefficient of determination
$n$ = number of observations
$k$ = number of regression coefficients including the intercept
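The two formulas above can be checked numerically with a minimal Python sketch. The function names are assumptions made for this illustration; the sums of squares and sample size are those reported in the analysis of variance table of the Butler Trucking example that follows in these notes.

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination: SSR divided by SST."""
    return ssr / sst


def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared; k counts all regression coefficients, intercept included."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


# Butler Trucking example: SSR = 21.601, SST = 23.900,
# n = 10 drivers, k = 3 coefficients (intercept, miles, deliveries).
r2 = r_squared(21.601, 23.900)
r2_adj = adjusted_r_squared(r2, n=10, k=3)

print(f"R-Sq      = {r2:.1%}")
print(f"R-Sq(adj) = {r2_adj:.1%}")
```

Because $(n - 1)/(n - k)$ is greater than 1 whenever there is at least one independent variable, the adjusted value is always smaller than $R^2$ unless the fit is perfect.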

Example 1

Butler Trucking Company is an independent trucking company in southern California whose business involves deliveries throughout its local area. To develop better work schedules, the manager wants to estimate the total daily travel time for the drivers. Initially, the manager believed that the total travel time would be closely related to the number of miles traveled and the number of deliveries. A simple random sample of 10 drivers provided information about total daily travel time (in hours), miles traveled and number of deliveries. The multiple regression output is given below:

Predictor   Coef       SE Coef    T       P
Constant    -0.8687    0.9515     -0.91   0.392
miles       0.061135   0.009888   6.18    0.000
deliveri    0.9234     0.2211     4.18    0.004

S = 0.5731   R-Sq = 90.4%   R-Sq(adj) = 87.6%

Analysis of Variance

Source           DF   SS       MS       F       P
Regression       2    21.601   10.800   32.88   0.000
Residual Error   7    2.299    0.328
Total            9    23.900
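The columns of this output are linked by simple arithmetic, which the following Python sketch reproduces. It only re-derives figures already printed in the output; the variable names are assumptions made for this illustration.

```python
# Values copied from the regression output above.
coef = {"Constant": -0.8687, "miles": 0.061135, "deliveri": 0.9234}
se_coef = {"Constant": 0.9515, "miles": 0.009888, "deliveri": 0.2211}

# T column: each coefficient divided by its standard error.
t_ratio = {name: coef[name] / se_coef[name] for name in coef}

# MS column: each sum of squares divided by its degrees of freedom.
msr = 21.601 / 2   # mean square due to regression
mse = 2.299 / 7    # mean square due to error

# F is the ratio of the two mean squares; S is the square root of MSE.
f_stat = msr / mse
s = mse ** 0.5

for name, value in t_ratio.items():
    print(f"T for {name}: {value:.2f}")
print(f"MSR = {msr:.3f}, MSE = {mse:.3f}")
print(f"F = {f_stat:.2f}, S = {s:.4f}")
```

The F value computed this way can differ from the printed 32.88 in the second decimal place, because the sums of squares in the output are themselves rounded.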

a) Develop the regression model of travel time (hours) on miles traveled and number of deliveries.
b) Write down the estimated regression equation and interpret the regression coefficients.
c) Comment on the goodness of fit of the estimated regression equation.
d) Test whether miles traveled has a significant effect on travel time at the 5% level of significance.
e) Test whether the number of deliveries has a significant effect on travel time at the 5% level of significance.
f) Test whether the overall relationship between travel time and the set of independent variables (miles traveled and number of deliveries) is significant at the 5% level of significance.
g) Estimate the travel time (hours) when miles traveled and number of deliveries are 70 and 4 respectively.

Solution

a) The regression model of travel time (hours) on miles traveled and number of deliveries can be written as:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where,

$y$ = the value of travel time (hours)
$\beta_0$ = intercept
$\beta_1, \beta_2$ = the regression coefficients
$x_1$ = the value of miles traveled
$x_2$ = the value of number of deliveries
$\varepsilon$ = random error component

Alternative: The regression model of travel time (hours) on miles traveled and number of deliveries can be written as:

$$\text{Travel time} = \beta_0 + \beta_1(\text{Miles}) + \beta_2(\text{Deliveries}) + \varepsilon$$

where $\beta_0$, $\beta_1$, $\beta_2$ and $\varepsilon$ are the same as above.

b) The estimated regression equation from the given output is

$$\hat{y} = -0.8687 + 0.0611 x_1 + 0.9234 x_2$$

(or $\widehat{\text{Travel time}} = -0.8687 + 0.0611(\text{Miles}) + 0.9234(\text{Deliveries})$)

Interpretation

The estimated coefficient 0.0611 indicates that for an increase of one mile in the distance traveled, the expected travel time increases by 0.0611 hours when the number of deliveries is held constant. The estimated coefficient 0.9234 indicates that for an increase of one delivery, the expected travel time increases by 0.9234 hours when the number of miles traveled is held constant.

c) From the output, we have $R^2 = 90.4\%$, which indicates that 90.4% of the variability in travel time is explained by the estimated multiple regression equation with miles traveled and number of deliveries as the independent variables.

Or, from the adjusted $R^2 = 87.6\%$, we may conclude that, after adjusting for the number of independent variables, 87.6% of the variability in travel time is explained by the estimated multiple regression equation with miles traveled and number of deliveries as the independent variables.

d) Critical Value Approach: In order to know whether there is a significant effect of miles traveled on travel time, we need to test the following hypothesis:

$H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$

Under the null hypothesis, the value of the test statistic is given as:

$$t = \frac{b_1}{s_{b_1}} = \frac{0.061135}{0.009888} = 6.18$$

The critical value of $t$ at the 5% level of significance with $10 - 3 = 7$ degrees of freedom is $t_{0.025,\,7} = 2.365$.

As $|t| = 6.18 > 2.365$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant effect of miles traveled on travel time (hours).

p Value Approach: In order to know whether there is a significant effect of miles traveled on travel time, we need to test the following hypothesis:

$H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$

From the output, we have $p$ value $= 0.000$.

As $p$ value $= 0.000 < 0.05$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant effect of miles traveled on travel time (hours).

e) Critical Value Approach: In order to know whether there is a significant effect of number of deliveries on travel time, we need to test the following hypothesis:

$H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$

Under the null hypothesis, the value of the test statistic is given as:

$$t = \frac{b_2}{s_{b_2}} = \frac{0.9234}{0.2211} = 4.18$$

The critical value of $t$ at the 5% level of significance with $10 - 3 = 7$ degrees of freedom is $t_{0.025,\,7} = 2.365$.

As $|t| = 4.18 > 2.365$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant effect of number of deliveries on travel time (hours).

p Value Approach: In order to know whether there is a significant effect of number of deliveries on travel time, we need to test the following hypothesis:

$H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$

From the output, we have $p$ value $= 0.004$.

As $p$ value $= 0.004 < 0.05$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant effect of number of deliveries on travel time (hours).

f) Critical Value Approach: To test whether the overall relationship between travel time and the set of independent variables (miles traveled and number of deliveries) is significant, we need to test the following hypothesis:

$H_0: \beta_1 = \beta_2 = 0$ against $H_1$: at least one of $\beta_1$ and $\beta_2$ is not equal to zero

Under the null hypothesis, the value of the test statistic, taken from the output, is given as:

$$F = \frac{\text{MSR}}{\text{MSE}} = 32.88$$

The critical value of F at the 5% level of significance with 2 and 7 degrees of freedom is $F_{0.05;\,2,\,7} = 4.74$.

As $F = 32.88 > 4.74$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant overall relationship between travel time and the set of independent variables (miles traveled and number of deliveries).

p Value Approach: To test whether the overall relationship between travel time and the set of independent variables (miles traveled and number of deliveries) is significant, we need to test the following hypothesis:

$H_0: \beta_1 = \beta_2 = 0$ against $H_1$: at least one of $\beta_1$ and $\beta_2$ is not equal to zero

From the output, we have $p$ value $= 0.000$.

As $p$ value $= 0.000 < 0.05$, we may reject the null hypothesis.

From the hypothesis test, we may conclude that there is a significant overall relationship between travel time and the set of independent variables (miles traveled and number of deliveries).

g) The estimated travel time when miles traveled and number of deliveries are 70 and 4 respectively will be

$$\hat{y} = -0.8687 + 0.0611(70) + 0.9234(4) = -0.8687 + 4.2770 + 3.6936 = 7.10$$

Hence we can say that the estimated travel time will be 7.10 hours if the driver travels 70 miles and makes 4 deliveries.

Example 2

Consider the following data for a dependent variable $y$ and two independent variables $x_1$ and $x_2$.

$y$    $x_1$   $x_2$
96     5.0     1.5
90     2.0     2.0
95     4.0     1.5
92     2.5     2.5
94     3.5     3.3

The estimated regression equation based on the 5 observations is

$$\hat{y} = 85.54 + 2.06 x_1 + 0.39 x_2$$

a) Compute the total sum of squares (SST), the sum of squares due to regression (SSR), and the sum of squares due to error (SSE).
b) Find the value of $R^2$ and adjusted $R^2$. Comment on the goodness of fit.
c) Compute F and perform the appropriate F test at the 5% level of significance.

Solution

a) We know that,

Total sum of squares (SST) = $\sum (y_i - \bar{y})^2$
Sum of squares due to regression (SSR) = $\sum (\hat{y}_i - \bar{y})^2$
Sum of squares due to error (SSE) = $\sum (y_i - \hat{y}_i)^2$
and SST = SSR + SSE

Calculation Table:

$y$     $x_1$   $x_2$   $\hat{y}$   $(y - \bar{y})^2$   $(\hat{y} - \bar{y})^2$
96      5.0     1.5     96.43       6.76                9.15
90      2.0     2.0     90.44       11.56               8.76
95      4.0     1.5     94.37       2.56                0.93
92      2.5     2.5     91.67       1.96                3.01
94      3.5     3.3     94.04       0.36                0.41
Total                               23.20               22.26

$\bar{y} = 93.40$

Hence,

Total sum of squares (SST) = 23.20
Sum of squares due to regression (SSR) = 22.26
Sum of squares due to error (SSE) = SST $-$ SSR = 23.20 $-$ 22.26 = 0.94

b) The multiple coefficient of determination is

$$R^2 = \frac{\text{SSR}}{\text{SST}} = \frac{22.26}{23.20} = 0.96$$

Comment: The $R^2$ value indicates that 96% of the variability in $y$ can be explained by the estimated regression equation with independent variables $x_1$ and $x_2$.

The adjusted $R^2$ will be

$$R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k} = 1 - (1 - 0.96)\,\frac{5 - 1}{5 - 3} = 0.92$$

Comment: The adjusted $R^2$ value indicates that, after adjusting for the number of independent variables, 92% of the variability in $y$ can be explained by the estimated regression equation with independent variables $x_1$ and $x_2$.

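c) We need to test the following hypothesis:

$H_0: \beta_1 = \beta_2 = 0$ against $H_1$: at least one of $\beta_1$ and $\beta_2$ is not equal to zero

Test Statistic: Under the null hypothesis, the test statistic will be

$$F = \frac{\text{MSR}}{\text{MSE}} = \frac{\text{SSR}/2}{\text{SSE}/2} = \frac{22.26/2}{0.94/2} = 23.68$$

Critical Value: The critical value of F at the 5% level of significance with degrees of freedom 2 and 2 will be $F_{0.05;\,2,\,2} = 19.00$.

Decision rule: As $F = 23.68 > 19.00$, we may reject the null hypothesis.

Comment: From the hypothesis test, we may conclude that at least one of the parameters $\beta_1$ and $\beta_2$ is not equal to zero, so the overall relationship is significant.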
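The overall F tests in both worked examples can be cross-checked with a short Python sketch. It relies on a known special case: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form $(1 + 2F/d)^{-d/2}$, where $d$ is the error degrees of freedom. The function names are assumptions made for this illustration, and the critical values are the tabulated ones quoted in the solutions.

```python
def f_statistic(ssr, sse, df_regression, df_error):
    """F = MSR / MSE, with each mean square equal to SS divided by its df."""
    return (ssr / df_regression) / (sse / df_error)


def f_p_value_two_numerator_df(f_stat, df_error):
    """Upper-tail p value of F; valid only when the numerator df equals 2."""
    return (1 + 2 * f_stat / df_error) ** (-df_error / 2)


# Tabulated 5% critical values used in the solutions.
F_CRITICAL_2_AND_2 = 19.00
F_CRITICAL_2_AND_7 = 4.74

# Five-observation example: SSR = 22.26, SSE = 0.94, df = 2 and 2.
f_small = f_statistic(22.26, 0.94, df_regression=2, df_error=2)
p_small = f_p_value_two_numerator_df(f_small, df_error=2)
reject_small = f_small > F_CRITICAL_2_AND_2

# Butler Trucking example: F = 32.88 from the output, df = 2 and 7.
f_butler = 32.88
p_butler = f_p_value_two_numerator_df(f_butler, df_error=7)
reject_butler = f_butler > F_CRITICAL_2_AND_7

print(f"Five observations: F = {f_small:.2f}, p = {p_small:.4f}, reject H0: {reject_small}")
print(f"Butler Trucking:   F = {f_butler:.2f}, p = {p_butler:.4f}, reject H0: {reject_butler}")
```

The critical value approach and the p value approach lead to the same decision in each case, as they must for any given level of significance.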
