Management Science
October, 2010
1 Introduction
2 Newton’s Method
In many problems in mathematics and the sciences we need to solve an equation of the form
f(x) = 0
for some function f. We know that any quadratic ax² + bx + c = 0, where a, b, c are
constants and a ≠ 0, can be solved by using the quadratic formula. In this way we can find
the solutions (roots) of the quadratic equation exactly. Unfortunately, for more complicated
equations we do not have a formula for providing the solutions of the equation exactly. Often,
in these cases, a good approximation to a solution of the equation will suffice. Therefore it is
important to be able to find good approximations to the solutions of the equations. We begin
this topic with a numerical technique called Newton’s Method, which is used for approximating
solutions of equations.
The idea behind Newton's Method is as follows: we have an equation f(x) = 0. Assume
we have made a guess (call it xn) for a solution to this equation. How do we improve this
guess? To improve it we let T be the tangent line to the curve y = f(x) at the point
(xn, f(xn)), and we let xn+1 be the point where T intersects the x-axis. Then xn+1 will be
our improvement on xn.
[Figure: the tangent line T to y = f(x) at (xn, f(xn)) crosses the x-axis at xn+1, which lies closer to the true solution than xn.]
The line T with slope m = f′(xn) passing through the known point (xn, f(xn)) has equation
y − f(xn) = f′(xn)(x − xn)
Note that f′(xn) denotes the derivative of the function f(x) evaluated at xn.
Now T intersects the x-axis when y = 0 and x = xn+1, and so
0 − f(xn) = f′(xn)(xn+1 − xn)
Now
−f(xn)/f′(xn) = xn+1 − xn
Finally
xn+1 = xn − f(xn)/f′(xn)    (∗∗)
This is how we will obtain the improved solution xn+1 from the initial guess xn .
• Use the formula
xn+1 = xn − f(xn)/f′(xn)    (∗∗)
repeatedly to generate successive improved approximations from an initial guess x1.
Example Use Newton's Method to approximate √3 by approximating the solution of the
equation
x² − 3 = 0
Here f(x) = x² − 3 and f′(x) = 2x. Guess x1 = 2. Now
x2 = x1 − f(x1)/f′(x1) = 2 − 1/4 = 1.75
also
x3 = x2 − f(x2)/f′(x2) = 1.75 − ((1.75)² − 3)/(2(1.75)) = 1.7321428
and
x4 = x3 − f(x3)/f′(x3) = 1.7321428 − ((1.7321428)² − 3)/(2(1.7321428)) = 1.7320508
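The iteration above is short enough to check in code. A minimal sketch (the function name `newton` is ours, not from the notes), applied to f(x) = x² − 3 with the guess x1 = 2:

```python
def newton(f, fprime, x, steps):
    """Apply Newton's Method: repeatedly replace x by x - f(x)/f'(x)."""
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# f(x) = x^2 - 3 with f'(x) = 2x, starting from the guess x1 = 2
approx = newton(lambda x: x**2 - 3, lambda x: 2 * x, 2.0, 3)
print(approx)  # about 1.7320508, matching x4 above
```

Three steps of the iteration already agree with √3 to seven decimal places, which is the rapid convergence Newton's Method is known for.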
Example Use Newton's Method to approximate a solution of the equation
x³ − x − 1 = 0
In order to guess a solution of the equation notice that f (1) = −1 < 0 and f (2) = 5 > 0
so we know that there is a point c between 1 and 2 such that f (c) = 0. Therefore there is a
solution between 1 and 2.
Guess x1 = 1. With f(x) = x³ − x − 1 and f′(x) = 3x² − 1,
x2 = x1 − f(x1)/f′(x1) = 1 − (−1)/2 = 1.5
also
x3 = x2 − f(x2)/f′(x2) = 1.5 − 0.875/5.75 = 1.3478261
and
x4 = x3 − f(x3)/f′(x3) = 1.3478261 − 0.1006823/4.4499055 = 1.3252004
Exercise Use Newton's Method to approximate √11 by approximating the solution of the
equation x² − 11 = 0. Give the first five approximations to the root of the equation using the
initial approximation x1 = 7/2.
Exercise Use Newton's Method to approximate √46 by approximating the solution of the
equation x² − 46 = 0. Give the first three approximations to the root of the equation using
the initial approximation x1 = 7.
3 Curve Fitting
3.1 Introduction
In management science it is often the case that a process or an experiment produces a set of
data points (x1, y1), (x2, y2), …, (xn, yn).
One goal in numerical methods is to determine a formula y = f(x) that relates these
variables. Usually, the class of allowable formulas is chosen and then coefficients must be
determined. There are many possibilities for the type of function that can be used. Often there
is an underlying mathematical model, based on the physical situation, that will determine
form of the function. We will consider firstly the class of linear functions of the form
y = f (x) = Ax + B
Our method is based on analyzing the errors associated with any measurement or approxi-
mation. In taking a measurement yk there may often be a measurement error involved so that
the true value f (xk ) satisfies
f (xk ) = yk + ek
where ek is the measurement error. Hence
ek = f (xk ) − yk
for all 1 ≤ k ≤ n. To determine the best fitted straight line (or curve) that goes near (not
always through) the points we need to average the errors associated with every measurement.
There are several norms that can be used to measure how far the curve y = f(x) lies from
the data. We can choose one of the following: the maximum error E1(f), the average error
E2(f) and the root-mean square error E3(f).
E1(f) = max_{1≤k≤n} |f(xk) − yk|

E2(f) = (1/n) Σ_{k=1}^{n} |f(xk) − yk|

E3(f) = [ (1/n) Σ_{k=1}^{n} (f(xk) − yk)² ]^{1/2}
The following example shows how to apply these norms when a function and a set of points
are given. For the data points
(−1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0), (6, −1)
we can compare the maximum error, average error and root-mean square error for the
linear approximation y = f(x) = 8.6 − 1.6x using the following table.
xk     yk     f(xk) = 8.6 − 1.6xk   |f(xk) − yk|   (f(xk) − yk)²
−1    10.0    10.2                   0.2            0.04
 0     9.0     8.6                   0.4            0.16
 1     7.0     7.0                   0.0            0.00
 2     5.0     5.4                   0.4            0.16
 3     4.0     3.8                   0.2            0.04
 4     3.0     2.2                   0.8            0.64
 5     0.0     0.6                   0.6            0.36
 6    −1.0    −1.0                   0.0            0.00
Totals:                              2.6            1.40
Now
E1(f) = max_{1≤k≤n} |f(xk) − yk| = 0.8
E2(f) = (1/8)(2.6) = 0.325
E3(f) = [(1/8)(1.40)]^{1/2} = 0.41833
We can see that the maximum error is the largest, and if one point is badly in error, then
its value determines E1 (f ). The average error E2 (f ) simply averages the absolute value of the
error at the various points. It is often used because it is easy to compute. The root-mean
square error E3 (f ) is often used when the statistical nature of the errors is considered.
The “best-fitting” line is found by minimizing one of these errors. Hence there are three
best fitted lines that we can find. The root-mean square error E3 (f ) is the traditional choice
because it is much easier to minimize.
Definition Let (x1, y1), (x2, y2), …, (xn, yn) be a set of n points. The least squares line
y = f(x) = Ax + B is the line for which the root-mean square error E3(f) is a minimum.
Given n data points (x1, y1), (x2, y2), …, (xn, yn), we must find parameters A and B for
the line y = f(x) = Ax + B which minimize the root-mean square error E3(f). So, we have
E3(f) = [ (1/n) Σ_{k=1}^{n} (Axk + B − yk)² ]^{1/2}
Hence
n·E3²(f) = Σ_{k=1}^{n} (Axk + B − yk)²
Since n is fixed, minimizing E3(f) is equivalent to minimizing the simpler quantity
E(A, B) = Σ_{k=1}^{n} (Axk + B − yk)²
At the point that minimizes the value of E(A, B), the partial derivatives of E with respect
to A and then with respect to B are both zero. Take note that in this work it is xk and yk
that are constant and A and B are variables. Hence,
∂E/∂A = 0   and   ∂E/∂B = 0
∂E/∂A = Σ_{k=1}^{n} 2(Axk + B − yk)·(xk) = 2 Σ_{k=1}^{n} (Axk² + Bxk − xkyk)

∂E/∂B = Σ_{k=1}^{n} 2(Axk + B − yk)·(1) = 2 Σ_{k=1}^{n} (Axk + B − yk)
Setting each of the partial derivatives equal to zero and using the distributive properties
of the summation yields
0 = Σ_{k=1}^{n} (Axk² + Bxk − xkyk) = A Σ_{k=1}^{n} xk² + B Σ_{k=1}^{n} xk − Σ_{k=1}^{n} xkyk

0 = Σ_{k=1}^{n} (Axk + B − yk) = A Σ_{k=1}^{n} xk + nB − Σ_{k=1}^{n} yk
These equations can now be re-arranged to a more familiar form often referred to as the
normal equations.
( Σ_{k=1}^{n} xk² ) A + ( Σ_{k=1}^{n} xk ) B = Σ_{k=1}^{n} xkyk

( Σ_{k=1}^{n} xk ) A + nB = Σ_{k=1}^{n} yk
The solution of the resulting linear system can be found using matrices or algebra. The
simultaneous solution of this linear system gives A and B, which again represent the coefficient
of x and the constant term in the linear equation y = f(x) = Ax + B. This best fitted line can
be used to make predictions for y at a chosen (usually future) value of x. This technique is
called forecasting. The root-mean square error for the same data set can be interpreted as a
measure of how accurate a prediction is, since it measures how far the line y = f(x) lies from
the data. The smaller the value of the root-mean square error, the more confidence can be
placed in any prediction that is made. This type of analysis ensures that a person can make
more informed decisions based on the numerical data collected, which is fundamental to
management science.
To find the best fitted line for the following set of data points
(−1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0), (6, −1)
we proceed as follows to get the required totals for the normal equations.
xk     yk     xk²   xkyk
−1    10.0     1    −10
 0     9.0     0      0
 1     7.0     1      7
 2     5.0     4     10
 3     4.0     9     12
 4     3.0    16     12
 5     0.0    25      0
 6    −1.0    36     −6
Totals: 20    37    92    25
92A + 20B = 25
20A + 8B = 37
Solving yields A = −1.6071429 and B = 8.6428571. Hence the best fitted line, using the
least squares method, is
y = −1.6071429x + 8.6428571
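Solving the normal equations is mechanical enough to script. The following sketch (the helper name `least_squares_line` is ours, not from the notes) reproduces the coefficients just found:

```python
def least_squares_line(points):
    """Solve the normal equations for the least squares line y = Ax + B."""
    n = len(points)
    sx  = sum(x for x, y in points)      # sum of x_k
    sy  = sum(y for x, y in points)      # sum of y_k
    sxx = sum(x * x for x, y in points)  # sum of x_k^2
    sxy = sum(x * y for x, y in points)  # sum of x_k y_k
    # (sxx)A + (sx)B = sxy  and  (sx)A + nB = sy
    A = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    B = (sy - A * sx) / n
    return A, B

data = [(-1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0), (6, -1)]
A, B = least_squares_line(data)
print(A, B)  # approximately -1.6071429 and 8.6428571, as found above
```

The closed-form expression for A used here is exactly what results from eliminating B between the two normal equations.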
We now can calculate the root-mean square error E3(f). This measure of how far the
straight line y = f(x) lies from the data will also allow us to consider how much faith we can
place in any predictions made using our best fitted line. Now
E3(f) = [ (1/n) Σ_{k=1}^{n} (f(xk) − yk)² ]^{1/2}
xk     yk     f(xk)         f(xk) − yk     (f(xk) − yk)²
−1    10.0    10.25          0.25           0.0625
 0     9.0     8.6428571    −0.3571429      0.127551051
 1     7.0     7.0357142     0.0357142      0.001275504
 2     5.0     5.4285713     0.4285713      0.183673359
 3     4.0     3.8214284    −0.1785716      0.031887816
 4     3.0     2.2142855    −0.7857145      0.617347275
 5     0.0     0.6071426     0.6071426      0.368622136
 6    −1.0    −1.0000003     0.0000003      0.0
Total:                                      1.392857141
Now
E3(f) = [ (1/n) Σ_{k=1}^{n} (f(xk) − yk)² ]^{1/2}
      = [ (1/8)(1.392857141) ]^{1/2}
      = 0.417261479
The root-mean square error for this example has been found to be 0.417261479. This value
should be quoted when presenting the equation of the best fitted line to indicate how far the
line lies from the data especially when making predictions using the found equation. This
value is also useful when making comparisons between different data sets all generated from
the same source. If the best fitted line from each data set is determined, the corresponding
smallest root-mean square error would indicate the best equation to use for forecasting.
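As a check on the working above, the root-mean square error of the fitted line can be computed directly (a sketch; the helper name `rms_error` is ours):

```python
def rms_error(f, points):
    """Root-mean square error E3(f) of the fit f over the data points."""
    n = len(points)
    return (sum((f(x) - y) ** 2 for x, y in points) / n) ** 0.5

data = [(-1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0), (6, -1)]
line = lambda x: -1.6071429 * x + 8.6428571  # coefficients from the example
print(round(rms_error(line, data), 6))  # about 0.417261
```

The same function works for any candidate fit f, which is what makes the root-mean square error convenient for comparing different curves over the same data.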
Exercise Find the best fitted line for the following set of data points.
(−6, −5.3), (−2, −3.5), (0, −1.7), (2, 0.2), (6, 4.0)
Also determine the root-mean square error and comment on its value.
Exercise Plot a scattergraph for the following data points.
(−8, 6.8), (−2, 5), (0, 2.2), (4, 0.5), (6, −1.3)
Find the best fitted line for this set of data points.
Also determine the root-mean square error and comment on its value.
We now consider fitting the class of power functions of the form
y = f(x) = Ax^m
where m is a known constant. In these cases there is only one parameter A to be found and
again using the least-squares technique we seek to minimize the function
E(A) = Σ_{k=1}^{n} (Axk^m − yk)²
At the point that minimizes the value of E(A), the ordinary derivative of E with respect
to A is zero. Hence,
dE/dA = Σ_{k=1}^{n} 2(Axk^m − yk)·(xk^m) = 2 Σ_{k=1}^{n} (Axk^{2m} − xk^m yk)
Setting this derivative equal to zero yields
0 = Σ_{k=1}^{n} (Axk^{2m} − xk^m yk) = A Σ_{k=1}^{n} xk^{2m} − Σ_{k=1}^{n} xk^m yk
Rearranging this equation, the coefficient A for the power fit y = Ax^m is
A = ( Σ_{k=1}^{n} xk^m yk ) / ( Σ_{k=1}^{n} xk^{2m} )
Example The relationship between distance (d) in meters and time (t) in seconds for a falling
body is given by
d = (1/2)gt²
Suppose an experiment produces the data points
(0.2, 0.1960), (0.4, 0.7850), (0.6, 1.7665), (0.8, 3.1405), (1.0, 4.9075)
Using the formula for A derived above we can fit the curve y = f(x) = Ax^m where m = 2
and also estimate the value of g, the acceleration due to gravity, from our work. We proceed
as follows to get the required totals to evaluate A.
tk     dk       dk·tk²    tk⁴
0.2    0.1960   0.00784   0.0016
0.4    0.7850   0.12560   0.0256
0.6    1.7665   0.63594   0.1296
0.8    3.1405   2.00992   0.4096
1.0    4.9075   4.90750   1.0000
Totals:          7.68680   1.5664
A = ( Σ_{k=1}^{n} tk² dk ) / ( Σ_{k=1}^{n} tk⁴ )
Hence
A = 7.68680/1.5664 = 4.9073
and the fitted curve is
d = 4.9073t²
For this example we can also evaluate the root-mean square error E3(f) and comment on our
findings. We can also estimate the constant g, which in this example represents the acceleration
due to gravity. The purpose of the experiment is to estimate g from the experimental results;
given that we have determined the constant A using the least-squares method, we can now
equate as follows:
(1/2)g = 4.9073
∴ g = 9.8146 ms⁻²
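The one-parameter power fit is a single quotient of sums, so the example can be verified in a few lines (a sketch; the helper name is ours):

```python
def power_fit_coefficient(points, m):
    """A = (sum of x^m * y) / (sum of x^(2m)) for the power fit y = A x^m."""
    num = sum(x ** m * y for x, y in points)
    den = sum(x ** (2 * m) for x, y in points)
    return num / den

data = [(0.2, 0.1960), (0.4, 0.7850), (0.6, 1.7665), (0.8, 3.1405), (1.0, 4.9075)]
A = power_fit_coefficient(data, 2)
print(round(A, 4), round(2 * A, 4))  # A about 4.9073, so g about 9.8146
```

Since d = (1/2)gt², the estimate of g is simply twice the fitted coefficient A.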
Exercise Find the power fit y = Ax² for the following data and evaluate the root-mean square
error and comment on its value.
xk yk
2.0 5.1
2.3 7.5
2.6 10.6
2.9 14.4
3.2 19.0
The method of least squares curve fitting can be extended to many non-linear cases. For
example, consider the n data points (x1, y1), (x2, y2), …, (xn, yn) and the class of exponential
functions of the form
y = f (x) = CeAx
In this case the parameters A and C are to be found and again using the least-squares
technique we seek to minimize the function
E(A, C) = Σ_{k=1}^{n} (Ce^{Axk} − yk)²
At the point that minimizes the value of E(A, C), the partial derivatives of E with respect
to A and then with respect to C are both zero. Take note again that in this work it is xk and
yk that are constant and A and C are variables. Hence,
∂E/∂A = 0   and   ∂E/∂C = 0
Holding C fixed and differentiating with respect to A yields
∂E/∂A = 2 Σ_{k=1}^{n} (Ce^{Axk} − yk)·(Cxk e^{Axk}) = 2 Σ_{k=1}^{n} (C²xk e^{2Axk} − Cxkyk e^{Axk})
Similarly, holding A fixed and differentiating with respect to C yields
∂E/∂C = 2 Σ_{k=1}^{n} (Ce^{Axk} − yk)·(e^{Axk}) = 2 Σ_{k=1}^{n} (Ce^{2Axk} − yk e^{Axk})
Setting each of the partial derivatives equal to zero and using the distributive properties
of the summation yields
C Σ_{k=1}^{n} xk e^{2Axk} − Σ_{k=1}^{n} xkyk e^{Axk} = 0

C Σ_{k=1}^{n} e^{2Axk} − Σ_{k=1}^{n} yk e^{Axk} = 0
(here each equation has been divided by 2, and the first also by C, since C ≠ 0).
This system of equations is non-linear in the unknowns A and C and can be solved using a
variety of methods. We will not illustrate this method of curve fitting by example. Instead we
will use a common alternative called the data linearization method, which is easier to adapt
to the many different families of curves that occur in applications.
This approach to the exponential fit requires the natural logarithm of both sides of the
equation to be taken, which reduces the problem to a linear relationship between transformed
variables. So, for the class of exponential functions of the form
So, for the class of exponential functions of the form
y = f (x) = CeAx
loge y = loge(Ce^{Ax})
       = loge C + Ax loge e
       = Ax + loge C
Equating with the general form of a line y = Ax + B we have the following change of
variables (and constant):
yk = loge yk , xk = xk , B = loge C
The method for finding the least-squares line described earlier is now applied to the transformed
data points (xk, loge yk). The coefficients A and B are found by solving the “transformed”
linear system
( Σ_{k=1}^{n} xk² ) A + ( Σ_{k=1}^{n} xk ) B = Σ_{k=1}^{n} xk loge yk

( Σ_{k=1}^{n} xk ) A + nB = Σ_{k=1}^{n} loge yk
loge C = B ⇒ C = e^B
This technique involves using algebra to transpose the equation of a curve to a line and
using the resulting change of variables (and constant) along with the normal equations for the
line to determine the unknowns A, B and C.
Example To find the best fitted exponential curve for the following set of data points
(0, 1.5), (1, 2.5), (2, 3.5), (3, 5.0), (4, 7.5)
we use the change of variables (and constant) found above along with the normal equations
for the line to determine the unknowns A, B and C. We proceed as follows to get the required
totals for the normal equations:
Σxk = 10 ,  Σxk² = 30 ,  Σ loge yk = 6.19886 ,  Σxk loge yk = 16.30974
Hence the normal equations are
30A + 10B = 16.30974
10A + 5B = 6.19886
Solving yields A = 0.391202 and B = 0.457367, and so
C = e^{0.457367} = 1.5799
Hence the best exponential curve, using the data linearization technique, is
y = 1.5799e^{0.391202x}
Note that this curve was obtained by minimizing the error in the transformed variables, that
is, the least-squares error of the line fitted to the points (xk, loge yk). It is not, in general, the
curve that minimizes the original least-squares error
E(A, C) = Σ_{k=1}^{n} (Ce^{Axk} − yk)²
If we use the equations that result after minimising this quantity we would obtain the
following curve
y = 1.6109e^{0.38357x}
There is a slight difference in coefficients but the function values differ by no more than
2% over the interval [0, 4].
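The data linearization steps for the exponential fit (take logs, fit the least-squares line, transform back) can be sketched as follows; the helper name `exponential_fit` is ours:

```python
from math import log, exp

def exponential_fit(points):
    """Fit y = C e^(Ax) by least squares on the transformed data (x, ln y)."""
    n = len(points)
    sx  = sum(x for x, y in points)
    sY  = sum(log(y) for x, y in points)      # transformed y values
    sxx = sum(x * x for x, y in points)
    sxY = sum(x * log(y) for x, y in points)
    A = (n * sxY - sx * sY) / (n * sxx - sx * sx)
    B = (sY - A * sx) / n
    return A, exp(B)                          # recover C = e^B

data = [(0, 1.5), (1, 2.5), (2, 3.5), (3, 5.0), (4, 7.5)]
A, C = exponential_fit(data)
print(round(A, 6), round(C, 4))  # about 0.391202 and 1.5799
```

The final `exp(B)` is the change of constant C = e^B described above.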
The technique of data linearisation has been used by scientists to fit curves such as
y = f(x) = Ce^{Ax} ,   y = f(x) = A/x + B ,   y = f(x) = L/(1 + Ce^{Ax})
Once the curve has been chosen, a suitable transformation of the variables must be found
so that a linear relationship is obtained. The correct transformations for various curves are
listed below:
f(x)                    Linearised form y = Ax + B       Change of variables
y = A/x + B             y = A(1/x) + B                   x = 1/x ,  y = y
y = 1/(Ax + B)          1/y = Ax + B                     x = x ,    y = 1/y
y = x/(A + Bx)          1/y = A(1/x) + B                 x = 1/x ,  y = 1/y
y = L/(1 + Ce^{Ax})     loge(L/y − 1) = Ax + loge C      x = x ,    y = loge(L/y − 1)
Notice that any curve involving the constant C (such as the exponential and logistic curves)
also requires the change of constant
C = e^B
The final transformation may be of interest in problems relating to population growth. The
curve is known as a logistic curve. To see the required change of variables and constant we
firstly rearrange the formula as follows
L/y − 1 = Ce^{Ax}
loge(L/y − 1) = loge C + Ax loge e
             = Ax + loge C
Equating with the general form of a line y = Ax + B we have the following change of
variables (and constant):
yk = loge(L/yk − 1) ,   xk = xk ,   B = loge C
The method for finding the least-squares line described earlier is now applied to the transformed
data points. The coefficients A and B are found by solving the “transformed” linear system
( Σ_{k=1}^{n} xk² ) A + ( Σ_{k=1}^{n} xk ) B = Σ_{k=1}^{n} xk loge(L/yk − 1)

( Σ_{k=1}^{n} xk ) A + nB = Σ_{k=1}^{n} loge(L/yk − 1)
loge C = B ⇒ C = e^B
Example Consider the following data which represents the population in thousands who own
a personal computer in the year shown.
(1990, 76.1), (1992, 106.5), (1994, 132.6), (1996, 180.7), (1998, 226.5)
When the population P (in thousands) who own a personal computer is bounded by the
limiting value L = 800 thousand it follows a logistic curve and has the form
P(t) = L / (1 + Ce^{At})
We use the change of variables (and constant) found above along with the normal equations
for the line to determine the unknowns A, B and C. Measuring t in years since 1990 (so that
tk = 0, 2, 4, 6, 8), we proceed as follows to get the required totals for the normal equations.
The required totals are
Σtk = 20 ,  Σtk² = 120 ,  Σ loge(800/Pk − 1) = 7.9 ,  Σtk loge(800/Pk − 1) = 25.034
so the normal equations are
120A + 20B = 25.034
20A + 5B = 7.9
Solving yields A ≈ −0.164 and B ≈ 2.237. Hence
C = e^{2.237} = 9.365
Hence the best logistic curve, using the data linearization technique, is
P(t) = 800 / (1 + 9.365e^{−0.1643t})
We can now use this curve to make a prediction, say, for the number of people who will own
a personal computer in 2004. We just evaluate this function at the value t = 14:
P(14) = 800 / (1 + 9.365e^{−0.1643(14)}) = 412.64
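The logistic fit and the forecast can be checked the same way. In this sketch (helper names are ours) t is measured in years since 1990, so the data become (0, 76.1), …, (8, 226.5); small differences from the values quoted above come from rounding the totals there:

```python
from math import log, exp

def logistic_fit(points, L):
    """Fit P = L/(1 + C e^(At)) via the linearization ln(L/P - 1) = At + ln C."""
    n = len(points)
    trans = [(t, log(L / P - 1)) for t, P in points]  # transformed data
    st  = sum(t for t, Y in trans)
    sY  = sum(Y for t, Y in trans)
    stt = sum(t * t for t, Y in trans)
    stY = sum(t * Y for t, Y in trans)
    A = (n * stY - st * sY) / (n * stt - st * st)
    return A, exp((sY - A * st) / n)                  # A and C = e^B

data = [(0, 76.1), (2, 106.5), (4, 132.6), (6, 180.7), (8, 226.5)]
A, C = logistic_fit(data, 800)
P14 = 800 / (1 + C * exp(A * 14))  # forecast for 2004, i.e. t = 14
print(round(A, 4), round(C, 2), round(P14, 1))  # close to the values above
```

Keeping full precision in the totals gives a forecast of roughly 413 thousand, in line with the 412.64 obtained with the rounded coefficients.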
3.5.4 Further Examples
Example An Institute of Technology has recorded the number of first year students who register
on a full-time course at the beginning of each academic year. The data recorded for the past
eight years is as follows:
Year    xk   yk
2002    2     750
2003    3     950
2004    4    1025
2005    5    1100
2006    6    1175
2007    7    1350
2008    8    1500
2009    9    1870
[Scattergraph of the data points (xk, yk), xk = 2, …, 9.]
The management of the college wishes to predict, as accurately as possible, the number of
first year students who will register on full-time courses in the year 2010, based on all the data
recorded to date. An accurate prediction will allow for forward planning well in advance of
the new academic year.
i Determine the equation of the best fitted curve which is of the form
y = f (x) = CeAx
where C and A are constants, using the ‘data linearization’ method where the required
change of variables and constant are as follows:
xk = xk ,  yk = loge yk ,  C = e^B
ii Evaluate this best fitted curve at x = 10 (which is equivalent to the year 2010) to predict
the number of full-time students who will register in the college in the academic year
2010.
Exercise Plot a scattergraph for the following data points.
xk   yk
0    200
1    400
2    650
3    850
4    950
The logistic curve for the above data was found using the data linearisation technique to
be
y = f(x) = 1000 / (1 + 4.3018e^{−1.0802x})
By choosing appropriate values for x, plot the graph of this curve onto the scattergraph.
" n
#1
2
1X 2
E(f ) = f (xk ) − yk
n
k=1
Comment on how this error can be used to make comparisons between two or more fits.
Example Consider the following data points.
(1, 0.6), (2, 1.9), (3, 4.3), (4, 7.6), (5, 12.6)
(i) Use the data linearization method to find the best fitted exponential curve
y = f(x) = Ce^{Ax}
(ii) Use the data linearization method to find the best fitted power curve
y = f(x) = Cx^A
Then evaluate the root-mean square error
E(f) = [ (1/n) Σ_{k=1}^{n} (f(xk) − yk)² ]^{1/2}
for the curve fit in part (i) and part (ii) above. Which curve fit is the best?
xk    yk     ln xk    ln yk     xk²   xk ln yk   (ln xk)²   ln xk ln yk
1     0.6    0.0000   −0.5108    1    −0.5108    0.0000     0.0000
2     1.9    0.6931    0.6419    4     1.2838    0.4804     0.4449
3     4.3    1.0986    1.4586    9     4.3758    1.2069     1.6024
4     7.6    1.3863    2.0281   16     8.1124    1.9218     2.8115
5    12.6    1.6094    2.5337   25    12.6685    2.5902     4.0777
Totals: 15   27.0     4.7874    6.1515  55  25.9297  6.1993  8.9365
Note: The totals required for the normal equations are given in the last row.
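The comparison asked for in the example above can be carried out by running both linearizations and computing the root-mean square error of each fit (a sketch; helper names are ours):

```python
from math import log, exp

def linear_fit(pairs):
    """Least squares line Y = A X + B through the given (X, Y) pairs."""
    n = len(pairs)
    sx = sum(x for x, y in pairs); sy = sum(y for x, y in pairs)
    sxx = sum(x * x for x, y in pairs); sxy = sum(x * y for x, y in pairs)
    A = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return A, (sy - A * sx) / n

def rms_error(f, data):
    return (sum((f(x) - y) ** 2 for x, y in data) / len(data)) ** 0.5

data = [(1, 0.6), (2, 1.9), (3, 4.3), (4, 7.6), (5, 12.6)]

# (i) y = C e^(Ax): regress ln y on x
A1, B1 = linear_fit([(x, log(y)) for x, y in data])
exp_fit = lambda x: exp(B1) * exp(A1 * x)

# (ii) y = C x^A: regress ln y on ln x
A2, B2 = linear_fit([(log(x), log(y)) for x, y in data])
pow_fit = lambda x: exp(B2) * x ** A2

print(rms_error(pow_fit, data) < rms_error(exp_fit, data))  # True: power fit wins
```

For this data the power fit has a noticeably smaller root-mean square error, so it is the better choice of the two.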
The method of least squares can also be used to fit a parabola
y = f(x) = Ax² + Bx + C
to n data points. In this case we seek the values of A, B and C that minimize
E(A, B, C) = Σ_{k=1}^{n} (Axk² + Bxk + C − yk)²
( Σ_{k=1}^{n} xk⁴ ) A + ( Σ_{k=1}^{n} xk³ ) B + ( Σ_{k=1}^{n} xk² ) C = Σ_{k=1}^{n} ykxk²

( Σ_{k=1}^{n} xk³ ) A + ( Σ_{k=1}^{n} xk² ) B + ( Σ_{k=1}^{n} xk ) C = Σ_{k=1}^{n} ykxk

( Σ_{k=1}^{n} xk² ) A + ( Σ_{k=1}^{n} xk ) B + nC = Σ_{k=1}^{n} yk
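The three normal equations form a 3×3 linear system in A, B and C. The sketch below builds the system from the data and solves it by Cramer's rule; the data points and helper names are ours, chosen to lie exactly on y = x² + 1 so the fit should recover A = 1, B = 0, C = 1:

```python
def least_squares_parabola(points):
    """Solve the normal equations for y = Ax^2 + Bx + C by Cramer's rule."""
    n = len(points)
    s = lambda p, q: sum(x ** p * y ** q for x, y in points)  # power sums
    # Coefficient matrix and right-hand side of the normal equations.
    M = [[s(4, 0), s(3, 0), s(2, 0)],
         [s(3, 0), s(2, 0), s(1, 0)],
         [s(2, 0), s(1, 0), n]]
    r = [s(2, 1), s(1, 1), s(0, 1)]
    det = lambda m: (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                   - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                   + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(M)
    coeffs = []
    for j in range(3):  # replace column j by r for Cramer's rule
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = r[i]
        coeffs.append(det(Mj) / d)
    return tuple(coeffs)  # (A, B, C)

# Data lying exactly on y = x^2 + 1, so the fit must recover A=1, B=0, C=1.
pts = [(0, 1), (1, 2), (2, 5), (3, 10)]
A, B, C = least_squares_parabola(pts)
print(A, B, C)  # 1.0 0.0 1.0
```

For larger systems Gaussian elimination would be preferable, but for the fixed 3×3 case Cramer's rule keeps the sketch short.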
Exercise Find the least-squares parabola for the four points (−3, 3), (0, 1), (2, 1) and (4, 3).
————————o————————–