
Matrix Differentiation

(and some other stuff)

Randal J. Barnes
Department of Civil Engineering, University of Minnesota
Minneapolis, Minnesota, USA

Introduction

Throughout this presentation I have chosen to use a symbolic matrix notation. This choice was not made lightly. I am a strong advocate of index notation, when appropriate. For example, index notation greatly simplifies the presentation and manipulation of differential geometry. As a rule-of-thumb, if your work is going to primarily involve differentiation with respect to the spatial coordinates, then index notation is almost surely the appropriate choice. In the present case, however, I will be manipulating large systems of equations in which the matrix calculus is relatively simple while the matrix algebra and matrix arithmetic are messy and more involved. Thus, I have chosen to use symbolic notation.

Notation and Nomenclature

Definition 1 Let $a_{ij} \in \mathbb{R}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Then the ordered rectangular array

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{1}$$

is said to be a real matrix of dimension $m \times n$.

When writing a matrix I will occasionally write down its typical element as well as its dimension. Thus,

$$\mathbf{A} = [a_{ij}], \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n, \tag{2}$$

denotes a matrix with $m$ rows and $n$ columns, whose typical element is $a_{ij}$. Note, the first subscript locates the row in which the typical element lies while the second subscript locates the column. For example, $a_{jk}$ denotes the element lying in the $j$th row and $k$th column of the matrix $\mathbf{A}$.

Definition 2 A vector is a matrix with only one column. Thus, all vectors are inherently column vectors.

Convention 1 Multi-column matrices are denoted by boldface uppercase letters: for example, $\mathbf{A}$, $\mathbf{B}$, $\mathbf{X}$. Vectors (single-column matrices) are denoted by boldfaced lowercase letters: for example, $\mathbf{a}$, $\mathbf{b}$, $\mathbf{x}$. I will attempt to use letters from the beginning of the alphabet to designate known matrices, and letters from the end of the alphabet for unknown or variable matrices.

Convention 2 When it is useful to explicitly attach the matrix dimensions to the symbolic notation, I will use an underscript. For example, $\underset{m \times n}{\mathbf{A}}$ indicates a known, multi-column matrix with $m$ rows and $n$ columns.

A superscript $T$ denotes the matrix transpose operation; for example, $\mathbf{A}^T$ denotes the transpose of $\mathbf{A}$. Similarly, if $\mathbf{A}$ has an inverse it will be denoted by $\mathbf{A}^{-1}$. The determinant of $\mathbf{A}$ will be denoted by either $|\mathbf{A}|$ or $\det(\mathbf{A})$. Similarly, the rank of a matrix $\mathbf{A}$ is denoted by $\operatorname{rank}(\mathbf{A})$. An identity matrix will be denoted by $\mathbf{I}$, and $\mathbf{0}$ will denote a null matrix.

Matrix Multiplication
Definition 3 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{B}$ be $n \times p$, and let the product $\mathbf{A}\mathbf{B}$ be

$$\mathbf{C} = \mathbf{A}\mathbf{B} \tag{3}$$

then $\mathbf{C}$ is an $m \times p$ matrix, with element $(i, j)$ given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{4}$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, p$.
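For readers who want a quick numerical illustration of equation (4), the following NumPy sketch (not part of the original notes) compares an explicit element-by-element loop against the built-in matrix product. The dimensions and array values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 3, 4, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))

# Element-by-element product, straight from equation (4):
# c_ij = sum_k a_ik * b_kj
C = np.zeros((m, p))
for i in range(m):
    for j in range(p):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

assert np.allclose(C, A @ B)  # agrees with NumPy's matrix product
```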

Proposition 1 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{x}$ be $n \times 1$, then the typical element of the product

$$\mathbf{z} = \mathbf{A}\mathbf{x} \tag{5}$$

is given by

$$z_i = \sum_{k=1}^{n} a_{ik} x_k \tag{6}$$

for all $i = 1, 2, \ldots, m$. Similarly, let $\mathbf{y}$ be $m \times 1$, then the typical element of the product

$$\mathbf{z}^T = \mathbf{y}^T \mathbf{A} \tag{7}$$

is given by

$$z_i = \sum_{k=1}^{m} a_{ki} y_k \tag{8}$$

for all $i = 1, 2, \ldots, n$. Finally, the scalar resulting from the product

$$\alpha = \mathbf{y}^T \mathbf{A}\mathbf{x} \tag{9}$$

is given by

$$\alpha = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk} y_j x_k \tag{10}$$

Proof: These are merely direct applications of Definition 3. q.e.d.
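As a short numerical check of equation (10), the double sum can be written as an einsum and compared with the direct product $\mathbf{y}^T \mathbf{A}\mathbf{x}$; the values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(m)

# alpha = sum_j sum_k a_jk * y_j * x_k, as in equation (10)
alpha_sum = np.einsum("jk,j,k->", A, y, x)
alpha_direct = y @ A @ x

assert np.isclose(alpha_sum, alpha_direct)
```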


Proposition 2 Let $\mathbf{A}$ be $m \times n$, and $\mathbf{B}$ be $n \times p$, and let the product $\mathbf{A}\mathbf{B}$ be

$$\mathbf{C} = \mathbf{A}\mathbf{B} \tag{11}$$

then

$$\mathbf{C}^T = \mathbf{B}^T \mathbf{A}^T \tag{12}$$

Proof: The typical element of $\mathbf{C}$ is given by

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \tag{13}$$

By definition, the typical element of $\mathbf{C}^T$, say $d_{ij}$, is given by

$$d_{ij} = c_{ji} = \sum_{k=1}^{n} a_{jk} b_{ki} \tag{14}$$

Hence,

$$\mathbf{C}^T = \mathbf{B}^T \mathbf{A}^T \tag{15}$$

q.e.d.

Proposition 3 Let $\mathbf{A}$ and $\mathbf{B}$ be $n \times n$ and invertible matrices. Let the product $\mathbf{A}\mathbf{B}$ be given by

$$\mathbf{C} = \mathbf{A}\mathbf{B} \tag{16}$$

then

$$\mathbf{C}^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1} \tag{17}$$

Proof:

$$\mathbf{C}\mathbf{B}^{-1}\mathbf{A}^{-1} = \mathbf{A}\mathbf{B}\mathbf{B}^{-1}\mathbf{A}^{-1} = \mathbf{I} \tag{18}$$

q.e.d.
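The reversal rules of Propositions 2 and 3 are easy to verify numerically; the sketch below (an illustration, not part of the original notes) uses diagonally loaded random matrices, which are almost surely invertible.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)  # almost surely invertible
B = rng.standard_normal((n, n)) + n * np.eye(n)
C = A @ B

assert np.allclose(C.T, B.T @ A.T)                       # Proposition 2
assert np.allclose(np.linalg.inv(C),
                   np.linalg.inv(B) @ np.linalg.inv(A))  # Proposition 3
```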

Partitioned Matrices

Frequently, I will find it convenient to deal with partitioned matrices¹. Such a representation, and the manipulation of this representation, are two of the relative advantages of the symbolic matrix notation.

¹Much of the material in this section is extracted directly from Dhrymes (1978, Section 2.7).

Definition 4 Let $\mathbf{A}$ be $m \times n$ and write

$$\mathbf{A} = \begin{bmatrix} \mathbf{B} & \mathbf{C} \\ \mathbf{D} & \mathbf{E} \end{bmatrix} \tag{19}$$

where $\mathbf{B}$ is $m_1 \times n_1$, $\mathbf{E}$ is $m_2 \times n_2$, $\mathbf{C}$ is $m_1 \times n_2$, $\mathbf{D}$ is $m_2 \times n_1$, $m_1 + m_2 = m$, and $n_1 + n_2 = n$. The above is said to be a partition of the matrix $\mathbf{A}$.


Proposition 4 Let $\mathbf{A}$ be a square, nonsingular matrix of order $m$. Partition $\mathbf{A}$ as

$$\mathbf{A} = \begin{bmatrix} \mathbf{A}_{11} & \mathbf{A}_{12} \\ \mathbf{A}_{21} & \mathbf{A}_{22} \end{bmatrix} \tag{20}$$

so that $\mathbf{A}_{11}$ is a nonsingular matrix of order $m_1$, $\mathbf{A}_{22}$ is a nonsingular matrix of order $m_2$, and $m_1 + m_2 = m$. Then

$$\mathbf{A}^{-1} = \begin{bmatrix}
(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1} &
-(\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1}\mathbf{A}_{12}\mathbf{A}_{22}^{-1} \\
-(\mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12})^{-1}\mathbf{A}_{21}\mathbf{A}_{11}^{-1} &
(\mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12})^{-1}
\end{bmatrix} \tag{21}$$

Proof: Direct multiplication of the proposed $\mathbf{A}^{-1}$ and $\mathbf{A}$ yields

$$\mathbf{A}^{-1}\mathbf{A} = \mathbf{I} \tag{22}$$

q.e.d.
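Equation (21) can also be checked numerically. The sketch below builds a random, almost surely nonsingular matrix, assembles the proposed block inverse with np.block, and compares it with np.linalg.inv; the sizes $m_1 = 2$, $m_2 = 3$ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
m1, m2 = 2, 3
m = m1 + m2
A = rng.standard_normal((m, m)) + m * np.eye(m)  # almost surely nonsingular blocks

A11, A12 = A[:m1, :m1], A[:m1, m1:]
A21, A22 = A[m1:, :m1], A[m1:, m1:]
inv = np.linalg.inv

S1 = inv(A11 - A12 @ inv(A22) @ A21)  # inverse of the Schur complement of A22
S2 = inv(A22 - A21 @ inv(A11) @ A12)  # inverse of the Schur complement of A11

A_inv_blocks = np.block([
    [S1,                  -S1 @ A12 @ inv(A22)],
    [-S2 @ A21 @ inv(A11), S2                 ],
])

assert np.allclose(A_inv_blocks, inv(A))  # matches equation (21)
```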

Matrix Differentiation

In the following discussion I will differentiate matrix quantities with respect to the elements of the referenced matrices. Although no new concept is required to carry out such operations, the element-by-element calculations involve cumbersome manipulations and, thus, it is useful to derive the necessary results and have them readily available².

²Much of the material in this section is extracted directly from Dhrymes (1978, Section 4.3). The interested reader is directed to this worthy reference to find additional results.

Convention 3 Let

$$\mathbf{y} = \boldsymbol{\psi}(\mathbf{x}) \tag{23}$$

where $\mathbf{y}$ is an $m$-element vector, and $\mathbf{x}$ is an $n$-element vector. The symbol

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\
\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n}
\end{bmatrix} \tag{24}$$

will denote the $m \times n$ matrix of first-order partial derivatives of the transformation from $\mathbf{x}$ to $\mathbf{y}$. Such a matrix is called the Jacobian matrix of the transformation $\boldsymbol{\psi}()$.

Notice that if $\mathbf{x}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is an $m \times 1$ matrix; that is, a single column (a vector). On the other hand, if $\mathbf{y}$ is actually a scalar in Convention 3 then the resulting Jacobian matrix is a $1 \times n$ matrix; that is, a single row (the transpose of a vector).
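To make Convention 3 concrete, here is a minimal finite-difference sketch that approximates the Jacobian of a made-up transformation $\psi : \mathbb{R}^3 \to \mathbb{R}^2$; row $i$ holds the partials of $y_i$ across the elements of $\mathbf{x}$, matching the layout of equation (24). The function psi and the helper jacobian_fd are illustrative names, not part of the notes.

```python
import numpy as np

def psi(x):
    # An arbitrary smooth transformation from R^3 to R^2, for illustration only.
    return np.array([x[0] * x[1], np.sin(x[2]) + x[0] ** 2])

def jacobian_fd(f, x, h=1e-6):
    """Central-difference approximation of the m-by-n Jacobian of equation (24)."""
    m, n = f(x).size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x0 = np.array([1.0, 2.0, 0.5])
J = jacobian_fd(psi, x0)
# Analytic Jacobian of psi, row i = partials of y_i:
J_exact = np.array([[x0[1],      x0[0], 0.0],
                    [2 * x0[0],  0.0,   np.cos(x0[2])]])
assert np.allclose(J, J_exact, atol=1e-6)
```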

Proposition 5 Let

$$\mathbf{y} = \mathbf{A}\mathbf{x} \tag{25}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A} \tag{26}$$

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$y_i = \sum_{k=1}^{n} a_{ik} x_k \tag{27}$$

it follows that

$$\frac{\partial y_i}{\partial x_j} = a_{ij} \tag{28}$$

for all $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$. Hence

$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \mathbf{A} \tag{29}$$

q.e.d.

Proposition 6 Let

$$\mathbf{y} = \mathbf{A}\mathbf{x} \tag{30}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, as in Proposition 5. Suppose that $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ is independent of $\mathbf{z}$. Then

$$\frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{31}$$

Proof: Since the $i$th element of $\mathbf{y}$ is given by

$$y_i = \sum_{k=1}^{n} a_{ik} x_k \tag{32}$$

for all $i = 1, 2, \ldots, m$, it follows that

$$\frac{\partial y_i}{\partial z_j} = \sum_{k=1}^{n} a_{ik} \frac{\partial x_k}{\partial z_j} \tag{33}$$

but the right hand side of the above is simply element $(i, j)$ of $\mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}}$. Hence

$$\frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \frac{\partial \mathbf{y}}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{34}$$

q.e.d.

Proposition 7 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{A}\mathbf{x} \tag{35}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and $\mathbf{A}$ is independent of $\mathbf{x}$ and $\mathbf{y}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{y}^T \mathbf{A} \tag{36}$$


and

$$\frac{\partial \alpha}{\partial \mathbf{y}} = \mathbf{x}^T \mathbf{A}^T \tag{37}$$

Proof: Define

$$\mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{38}$$

and note that

$$\alpha = \mathbf{w}^T \mathbf{x} \tag{39}$$

Hence, by Proposition 5 we have that

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{40}$$

which is the first result. Since $\alpha$ is a scalar, we can write

$$\alpha = \alpha^T = \mathbf{x}^T \mathbf{A}^T \mathbf{y} \tag{41}$$

and applying Proposition 5 as before we obtain

$$\frac{\partial \alpha}{\partial \mathbf{y}} = \mathbf{x}^T \mathbf{A}^T \tag{42}$$

q.e.d.

Proposition 8 For the special case in which the scalar $\alpha$ is given by the quadratic form

$$\alpha = \mathbf{x}^T \mathbf{A}\mathbf{x} \tag{43}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{x}^T \left( \mathbf{A} + \mathbf{A}^T \right) \tag{44}$$

Proof: By definition

$$\alpha = \sum_{j=1}^{n} \sum_{i=1}^{n} a_{ij} x_i x_j \tag{45}$$

Differentiating with respect to the $k$th element of $\mathbf{x}$ we have

$$\frac{\partial \alpha}{\partial x_k} = \sum_{j=1}^{n} a_{kj} x_j + \sum_{i=1}^{n} a_{ik} x_i \tag{46}$$

for all $k = 1, 2, \ldots, n$, and consequently,

$$\frac{\partial \alpha}{\partial \mathbf{x}} = \mathbf{x}^T \mathbf{A}^T + \mathbf{x}^T \mathbf{A} = \mathbf{x}^T \left( \mathbf{A}^T + \mathbf{A} \right) \tag{47}$$

q.e.d.
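A quick finite-difference check of Proposition 8, and of the symmetric special case that follows, is straightforward: the gradient of $\mathbf{x}^T \mathbf{A}\mathbf{x}$ should match $\mathbf{x}^T(\mathbf{A} + \mathbf{A}^T)$ element for element. The values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))   # general, non-symmetric
x = rng.standard_normal(n)
h = 1e-6

alpha = lambda v: v @ A @ v
# Central differences for d(alpha)/dx_k:
grad_fd = np.array([(alpha(x + h * e) - alpha(x - h * e)) / (2 * h)
                    for e in np.eye(n)])
assert np.allclose(grad_fd, x @ (A + A.T), atol=1e-5)    # Proposition 8

A_sym = A + A.T                                          # symmetric case
alpha_s = lambda v: v @ A_sym @ v
grad_fd_s = np.array([(alpha_s(x + h * e) - alpha_s(x - h * e)) / (2 * h)
                      for e in np.eye(n)])
assert np.allclose(grad_fd_s, 2 * x @ A_sym, atol=1e-5)  # Proposition 9
```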


Proposition 9 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$\alpha = \mathbf{x}^T \mathbf{A}\mathbf{x} \tag{48}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{A}$ does not depend on $\mathbf{x}$, then

$$\frac{\partial \alpha}{\partial \mathbf{x}} = 2\mathbf{x}^T \mathbf{A} \tag{49}$$

Proof: This is an obvious application of Proposition 8. q.e.d.

Proposition 10 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{x} \tag{50}$$

where $\mathbf{y}$ is $n \times 1$, $\mathbf{x}$ is $n \times 1$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{51}$$

Proof: We have

$$\alpha = \sum_{j=1}^{n} x_j y_j \tag{52}$$

Differentiating with respect to the $k$th element of $\mathbf{z}$ we have

$$\frac{\partial \alpha}{\partial z_k} = \sum_{j=1}^{n} \left( x_j \frac{\partial y_j}{\partial z_k} + y_j \frac{\partial x_j}{\partial z_k} \right) \tag{53}$$

for all $k = 1, 2, \ldots, n$, and consequently,

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{54}$$

q.e.d.

Proposition 11 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{x}^T \mathbf{x} \tag{55}$$

where $\mathbf{x}$ is $n \times 1$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = 2\mathbf{x}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{56}$$

Proof: This is an obvious application of Proposition 10. q.e.d.

Proposition 12 Let the scalar $\alpha$ be defined by

$$\alpha = \mathbf{y}^T \mathbf{A}\mathbf{x} \tag{57}$$

where $\mathbf{y}$ is $m \times 1$, $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $m \times n$, and both $\mathbf{y}$ and $\mathbf{x}$ are functions of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{58}$$


Proof: Define

$$\mathbf{w}^T = \mathbf{y}^T \mathbf{A} \tag{59}$$

and note that

$$\alpha = \mathbf{w}^T \mathbf{x} \tag{60}$$

Applying Proposition 10 we have

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \frac{\partial \mathbf{w}}{\partial \mathbf{z}} + \mathbf{w}^T \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{61}$$

Substituting back in for $\mathbf{w}$ we arrive at

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \frac{\partial \alpha}{\partial \mathbf{y}} \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \frac{\partial \alpha}{\partial \mathbf{x}} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} = \mathbf{x}^T \mathbf{A}^T \frac{\partial \mathbf{y}}{\partial \mathbf{z}} + \mathbf{y}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{62}$$

q.e.d.

Proposition 13 Let the scalar $\alpha$ be defined by the quadratic form

$$\alpha = \mathbf{x}^T \mathbf{A}\mathbf{x} \tag{63}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$. Then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = \mathbf{x}^T \left( \mathbf{A} + \mathbf{A}^T \right) \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{64}$$

Proof: This is an obvious application of Proposition 12. q.e.d.

Proposition 14 For the special case where $\mathbf{A}$ is a symmetric matrix and

$$\alpha = \mathbf{x}^T \mathbf{A}\mathbf{x} \tag{65}$$

where $\mathbf{x}$ is $n \times 1$, $\mathbf{A}$ is $n \times n$, and $\mathbf{x}$ is a function of the vector $\mathbf{z}$, while $\mathbf{A}$ does not depend on $\mathbf{z}$, then

$$\frac{\partial \alpha}{\partial \mathbf{z}} = 2\mathbf{x}^T \mathbf{A} \frac{\partial \mathbf{x}}{\partial \mathbf{z}} \tag{66}$$

Proof: This is an obvious application of Proposition 13. q.e.d.

Definition 5 Let $\mathbf{A}$ be an $m \times n$ matrix whose elements are functions of the scalar parameter $\alpha$. Then the derivative of the matrix $\mathbf{A}$ with respect to the scalar parameter $\alpha$ is the $m \times n$ matrix of element-by-element derivatives:

$$\frac{\partial \mathbf{A}}{\partial \alpha} = \begin{bmatrix}
\frac{\partial a_{11}}{\partial \alpha} & \frac{\partial a_{12}}{\partial \alpha} & \cdots & \frac{\partial a_{1n}}{\partial \alpha} \\
\frac{\partial a_{21}}{\partial \alpha} & \frac{\partial a_{22}}{\partial \alpha} & \cdots & \frac{\partial a_{2n}}{\partial \alpha} \\
\vdots & \vdots & & \vdots \\
\frac{\partial a_{m1}}{\partial \alpha} & \frac{\partial a_{m2}}{\partial \alpha} & \cdots & \frac{\partial a_{mn}}{\partial \alpha}
\end{bmatrix} \tag{67}$$

Proposition 15 Let $\mathbf{A}$ be a nonsingular, $m \times m$ matrix whose elements are functions of the scalar parameter $\alpha$. Then

$$\frac{\partial \mathbf{A}^{-1}}{\partial \alpha} = -\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} \mathbf{A}^{-1} \tag{68}$$

Proof: Start with the definition of the inverse

$$\mathbf{A}^{-1} \mathbf{A} = \mathbf{I} \tag{69}$$

and differentiate, yielding

$$\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} + \frac{\partial \mathbf{A}^{-1}}{\partial \alpha} \mathbf{A} = \mathbf{0} \tag{70}$$

rearranging the terms yields

$$\frac{\partial \mathbf{A}^{-1}}{\partial \alpha} = -\mathbf{A}^{-1} \frac{\partial \mathbf{A}}{\partial \alpha} \mathbf{A}^{-1} \tag{71}$$

q.e.d.
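As a sanity check on Proposition 15, the sketch below differentiates the inverse of a simple parameter-dependent matrix $\mathbf{A}(\alpha)$ by central differences and compares against $-\mathbf{A}^{-1} (\partial \mathbf{A}/\partial \alpha) \mathbf{A}^{-1}$. The particular $\mathbf{A}(\alpha)$, and the names A_of and dA_of, are arbitrary choices for illustration.

```python
import numpy as np

def A_of(a):
    # An arbitrary nonsingular matrix depending smoothly on the scalar a.
    return np.array([[2.0 + a,    a ** 2],
                     [np.sin(a),  3.0   ]])

def dA_of(a):
    # Element-by-element derivative, as in Definition 5 (equation 67).
    return np.array([[1.0,        2 * a],
                     [np.cos(a),  0.0  ]])

a, h = 0.7, 1e-6
inv = np.linalg.inv

# Finite-difference derivative of the inverse:
dAinv_fd = (inv(A_of(a + h)) - inv(A_of(a - h))) / (2 * h)
# Proposition 15, equation (68):
dAinv = -inv(A_of(a)) @ dA_of(a) @ inv(A_of(a))

assert np.allclose(dAinv_fd, dAinv, atol=1e-5)
```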

References
Dhrymes, Phoebus J., 1978, Mathematics for Econometrics, Springer-Verlag, New York, 136 pp.

Golub, Gene H., and Charles F. Van Loan, 1983, Matrix Computations, Johns Hopkins University Press, Baltimore, Maryland, 476 pp.

Graybill, Franklin A., 1983, Matrices with Applications in Statistics, 2nd Edition, Wadsworth International Group, Belmont, California, 461 pp.
