CHAPTER 2
Basic Concepts of
Matrix Algebra
1. INTRODUCTION

In many areas of economics—consumer demand theory, interindustry analysis, general equilibrium models, and others—matrix algebra serves as a convenient and powerful notation. In mathematical statistics it is virtually a necessity. This chapter provides basic concepts and results on which we rely in our subsequent development of the theory of econometrics.

In Section 2 vectors, matrices, and operations on them are defined. In Section 3 the determinant of a square matrix is introduced. In Section 4 the concept of linear dependence of a set of vectors and the related concept of the rank of a matrix are defined and applied to investigate the solution of a system of simultaneous homogeneous linear equations. In Section 5 the inverse of a square matrix is introduced and applied to investigate the solution of a square system of simultaneous linear equations. This material constitutes the bare rudiments of matrix algebra and has wide applicability throughout economics. The remaining sections go somewhat deeper along lines that are particularly relevant for statistical theory. Section 6 is concerned with characteristic roots, diagonalization, orthogonal and idempotent matrices, and linear and quadratic forms. Section 7 develops the properties of definite matrices—positive, nonnegative, and others. Finally, in Section 8 several basic concepts and results of the differential calculus are presented in matrix notation.

Our treatment hardly constitutes a rigorous introduction to matrix algebra. Nevertheless, in many cases proofs are included to enhance understanding and to illustrate the power of this tool. For a more rigorous development see Hadley (1961).
2. VECTORS AND MATRICES
Column Vectors
We begin by defining an m x 1 column Yector, written x, to be an
‘ordered m-tuple of real numbers, arranged in a cotuma;
ey x=
‘The numbers 2, (im 1,...,) are eallea the elements (or components
‘or coordinates) of the column vector. Two column vectors ate equal if
‘and only if they are equal cloment by element:
ey) x
ifand oniy it, y,
Lev
We also have occasion to write x y tomean 2, > y, (/= 1... «1. The fundamental opecations
fon column vectors are addition, ana multiplication by a scalar, To add
‘two column vectors, add their corresponding elements; thus
3) a=
‘Note that the sum of two column vectors i defined only if they have the
same number of elements, To multiply a eotuma vector by a scalar (Le,
‘real cumber) multiply each element of the cotuma vector by the scalar?
thus
+ yifand only fe = 2, +y, Gm A...5m).
OH — yeexifand onlyity mon G=1....m).
‘Combining and extending these two operations defines «linear combina
tion of a set of vectors:
25) yee fost egx™ ifand ony if
mm eel te beget! (het. ym),
hese af isthe Mth etement of the jth vector x, For example,
4 5) B+ UHH) 7
eH 3-2) 41 6)=(3-9 +16, of.
1 0 3) + 10) 38
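For readers who wish to check such computations by machine, the linear combination above can be reproduced with NumPy; the tool and the code are ours, not part of the original development, and the two vectors are simply the ones used for illustration in this chapter.

```python
import numpy as np

# The illustrative vectors (4, -2, 1) and (5, 6, 0) written as arrays.
x1 = np.array([4.0, -2.0, 1.0])
x2 = np.array([5.0, 6.0, 0.0])

# A linear combination as in (2.5): y = 3*x1 + 1*x2, element by element.
y = 3 * x1 + 1 * x2
```

Addition and scalar multiplication act element by element, exactly as in (2.3) and (2.4).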
Matrices

An m × n matrix A is a rectangular array of mn real numbers arranged in m rows and n columns:

(2.7)    A = ( a_11  a_12  ...  a_1n )
             ( a_21  a_22  ...  a_2n )
             ( ...                   )
             ( a_m1  a_m2  ...  a_mn )

The numbers a_ij (i = 1, ..., m; j = 1, ..., n) are called the elements of the matrix A. An m × n matrix is said to be of order m × n. We sometimes write simply A = (a_ij), meaning that A is the matrix whose typical element is a_ij. Two matrices are equal if and only if they are equal element by element:

(2.8)    A = B if and only if a_ij = b_ij (i = 1, ..., m; j = 1, ..., n).

As with column vectors, the fundamental operations are addition and multiplication by a scalar. To add two matrices, add their corresponding elements; thus

(2.9)    C = A + B if and only if c_ij = a_ij + b_ij (i = 1, ..., m; j = 1, ..., n).

Note that the sum of two matrices is defined only if they have the same order. To multiply a matrix by a scalar, multiply each element of the matrix by the scalar; thus

(2.10)   B = cA if and only if b_ij = ca_ij (i = 1, ..., m; j = 1, ..., n).

Combining and extending these two operations defines a linear combination of a set of matrices:

(2.11)   B = c_1 A^(1) + ... + c_p A^(p) if and only if
         b_ij = c_1 a_ij^(1) + ... + c_p a_ij^(p)   (i = 1, ..., m; j = 1, ..., n).
From these definitions and the properties of real numbers it follows that

(2.12)   A + B = B + A,
(2.13)   (A + B) + C = A + (B + C) = A + B + C,
(2.14)   c(A + B) = cA + cB,
(2.15)   (c + d)A = cA + dA,
(2.16)   c(dA) = (cd)A = (dc)A = d(cA).

Comparing the definitions of the operations on matrices with those on column vectors, it will be seen that the definitions are consistent. That is, an m × 1 column vector may be treated just like an m × 1 matrix. Similarly, a 1 × n matrix is often called a 1 × n row vector.
The third fundamental operation is matrix multiplication. The product of an m × n matrix A times an n × p matrix B is an m × p matrix C whose ijth element is the sum of products of the elements in the ith row of A by the elements in the jth column of B; thus

(2.18)   C = AB if and only if c_ij = Σ_{k=1}^n a_ik b_kj   (i = 1, ..., m; j = 1, ..., p).

For example, if A is 2 × 3 and B is 3 × 2,

(2.19)   AB = ( a_11 b_11 + a_12 b_21 + a_13 b_31    a_11 b_12 + a_12 b_22 + a_13 b_32 )
              ( a_21 b_11 + a_22 b_21 + a_23 b_31    a_21 b_12 + a_22 b_22 + a_23 b_32 )
For a numerical example,

(2.20)   C = AB,

in which each element of C is computed from a particular A and B in the element-by-element manner of (2.19).
In visual terms, the element in the ith row and jth column of C = AB is obtained by "multiplying" the ith row of A into the jth column of B. Alternatively, if we think of the columns of A and of C as m × 1 vectors, we may interpret the jth column of C, C_j, as a linear combination of the columns of A, A_1, ..., A_n, with the elements of the jth column of B as coefficients:

(2.21)   C_j = b_1j A_1 + ... + b_nj A_n.

Still another interpretation views the columns of B as n × 1 vectors and considers the jth column of C as the matrix A times the jth column of B:

(2.22)   C_j = AB_j.

Note that for the product C = AB to be defined, B must have as many rows as A has columns, and that then C has as many rows as A and as many columns as B. If the product AB is defined, A and B are said to be conformable. When a product appears in this book it should be understood that the result holds only if the product is defined.
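The three views of the product—element by element as in (2.18), a column of C as a combination of the columns of A as in (2.21), and a column of C as A times a column of B as in (2.22)—can be compared directly in NumPy (a sketch of ours; the matrices are chosen only for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 x 3
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])        # 3 x 2, so C = AB is 2 x 2

C = A @ B                         # (2.18): c_ij = sum_k a_ik * b_kj

# (2.22): the first column of C is A times the first column of B.
col_via_product = A @ B[:, 0]
# (2.21): the same column as a linear combination of the columns of A,
# with the elements of the first column of B as coefficients.
col_via_combination = B[0, 0] * A[:, 0] + B[1, 0] * A[:, 1] + B[2, 0] * A[:, 2]
```

All three computations yield the same first column of C.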
From these definitions and the properties of real numbers it follows that

(2.23)   (AB)C = A(BC) = ABC,
(2.24)   A(B + C) = AB + AC,
(2.25)   A(cB) = c(AB).

Matrix multiplication is not commutative, however; in general BA is not equal to AB, because when AB is defined, BA may not be—see (2.20)—and even if it is—as in (2.19)—it will not necessarily equal AB. Thus it is important to distinguish between pre- and postmultiplication by a matrix and to preserve the order of products of matrices.
The transpose of a matrix, denoted by a prime, is the matrix obtained by interchanging the rows with the columns; thus the transpose of the m × n matrix A of (2.7) is the n × m matrix

(2.26)   A' = ( a_11  a_21  ...  a_m1 )
              ( a_12  a_22  ...  a_m2 )
              ( ...                   )
              ( a_1n  a_2n  ...  a_mn )

That is,

(2.27)   B = A' if and only if b_ij = a_ji   (i = 1, ..., n; j = 1, ..., m).
The following properties of transposition follow directly from the definitions of matrix operations and the properties of real numbers:

(2.28)   (A')' = A,
(2.29)   (A + B)' = A' + B',
(2.30)   (AB)' = B'A'.

Thus, the transpose of a transpose is the original matrix; the transpose of a sum is the sum of the transposes; and the transpose of a product is the product of the transposes in the reverse order.
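These three rules of transposition lend themselves to a quick numerical check; the following NumPy sketch (ours, with arbitrary random matrices) verifies the reversal rule (2.30) in particular.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))

double_transpose = A.T.T          # (2.28): (A')' = A
sum_transpose = (A + A).T         # (2.29): (A + B)' = A' + B'
lhs = (A @ B).T                   # (2.30): (AB)' ...
rhs = B.T @ A.T                   # ... equals B'A', the reverse order
```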
A matrix is called square if it has as many rows as it has columns, i.e., if m = n. The diagonal of a square matrix consists of the elements a_ii lying along the line running from northwest to southeast; i.e., the diagonal of the n × n matrix A consists of a_11, a_22, ..., a_nn. The trace of a square matrix is the sum of its diagonal elements:

(2.31)   tr A = a_11 + a_22 + ... + a_nn.

A square matrix is symmetric if and only if it is equal to its transpose:

(2.32)   A is symmetric if and only if a_ij = a_ji (i, j = 1, ..., n), i.e., if and only if A = A'.

A square matrix whose diagonal elements are all 1 and whose off-diagonal elements are all 0 is called the identity matrix and is written

(2.33)   I = ( 1  0  ...  0 )
             ( 0  1  ...  0 )
             ( ...          )
             ( 0  0  ...  1 )

Clearly, where A is any matrix,

(2.34)   IA = AI = A,

so that the identity matrix plays the role that 1 plays in ordinary algebra. If the order of the identity matrix is of interest, we write it as I_n.

A matrix, not necessarily square, whose elements are all zero is called the zero matrix and is written 0. Clearly, where A is any matrix, 0A = 0, A0 = 0, and A + 0 = A, so that the zero matrix plays the role that 0 plays in ordinary algebra. If the order of the zero matrix is of interest, we write it as 0_(m×n); 0_(m×1) may be called the zero (column) vector and 0_(1×n) the zero (row) vector.
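That I and 0 play the roles of 1 and 0 can likewise be confirmed numerically (an illustration of ours, with an arbitrary matrix):

```python
import numpy as np

A = np.array([[3.0, 2.0, 1.0],
              [0.0, 4.0, 5.0]])   # an arbitrary 2 x 3 matrix

left = np.eye(2) @ A              # IA = A (premultiplication by I_2)
right = A @ np.eye(3)             # AI = A (postmultiplication by I_3)
shifted = A + np.zeros((2, 3))    # A + 0 = A
```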
Partitioned Matrices

It is often convenient to partition a matrix into submatrices. Thus an m × n matrix A may be partitioned as

(2.38)   A = ( A_1 | A_2 ),

where A_1 is m × n_1, A_2 is m × n_2, and n_1 + n_2 = n. Doing this we see how the transpose of a partitioned matrix may be written in terms of the transposes of its submatrices; thus

(2.39)   A' = ( A_1' )
              ( A_2' )

For example, the matrix A of (2.20) may be partitioned by columns in just this way; call the result

(2.40)   A = ( A_1 | A_2 ),

so that, by (2.39), its transpose is

(2.41)   A' = ( A_1' )
              ( A_2' )

Note how the submatrices are treated as if they were scalars. This treatment extends to matrix multiplication as well. If the m × n matrix A is partitioned as A = (A_1 | A_2), where A_1 is m × n_1, A_2 is m × n_2, and n_1 + n_2 = n, and if the n × p matrix B is partitioned correspondingly by rows, B_1 being n_1 × p and B_2 being n_2 × p, then the product C = AB may be expressed as

(2.42)   C = AB = ( A_1  A_2 ) ( B_1 )
                               ( B_2 )   = A_1 B_1 + A_2 B_2.
For example, let the A of (2.40) and the B of (2.20) be partitioned conformably. Then the product C = AB may be computed as

(2.43)   C = AB = ( A_1  A_2 ) ( B_1 )
                               ( B_2 )   = A_1 B_1 + A_2 B_2,

and carrying out the two block products and adding reproduces exactly the result of (2.20).
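The rule C = A_1 B_1 + A_2 B_2 is easy to confirm for any conformable partitioning; in the NumPy sketch below (ours; the matrices are random and purely illustrative) A is split by columns and B by rows.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# Partition A = (A1 | A2) by columns and B = (B1 ; B2) by rows.
A1, A2 = A[:, :2], A[:, 2:]
B1, B2 = B[:2, :], B[2:, :]

C_full = A @ B
C_blocks = A1 @ B1 + A2 @ B2      # the partitioned computation
```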
This device extends to finer partitionings. Thus an m × n matrix A may be partitioned as

(2.44)   A = ( A_11 | A_12 )
             ( A_21 | A_22 )

where A_11 is m_1 × n_1, A_12 is m_1 × n_2, A_21 is m_2 × n_1, A_22 is m_2 × n_2, m_1 + m_2 = m, and n_1 + n_2 = n. Its transpose may then be expressed as

(2.45)   A' = ( A_11' | A_21' )
              ( A_12' | A_22' )

If further the n × p matrix B is partitioned as

(2.46)   B = ( B_11 | B_12 )
             ( B_21 | B_22 )

where B_11 is n_1 × p_1, B_12 is n_1 × p_2, B_21 is n_2 × p_1, B_22 is n_2 × p_2, and p_1 + p_2 = p, then the product C = AB may be computed in partitioned form as

(2.47)   C = ( C_11 | C_12 ) = ( A_11 | A_12 ) ( B_11 | B_12 )
             ( C_21 | C_22 )   ( A_21 | A_22 ) ( B_21 | B_22 )

           = ( A_11 B_11 + A_12 B_21 | A_11 B_12 + A_12 B_22 )
             ( A_21 B_11 + A_22 B_21 | A_21 B_12 + A_22 B_22 )

where C_11 is m_1 × p_1, C_12 is m_1 × p_2, C_21 is m_2 × p_1, and C_22 is m_2 × p_2.
For example, let the A and B of (2.20) be partitioned as in (2.44) and (2.46). Then

(2.48)   C_11 = A_11 B_11 + A_12 B_21,    C_12 = A_11 B_12 + A_12 B_22,

and similarly for C_21 and C_22, so that assembling the blocks

(2.49)   C = ( C_11 | C_12 )
             ( C_21 | C_22 )

reproduces the same product as before. Still finer partitionings can be utilized. In viewing an m × n matrix as a set of column vectors we are effectively partitioning it into n m × 1 submatrices. In the rest of this book the dashed partitioning lines are usually omitted.
Finally, a definition:

(2.50)   Let A be an m × n matrix and B be a p × q matrix. Then the direct (or Kronecker) product of A and B, written A ⊗ B, is defined as the mp × nq matrix

         A ⊗ B = ( a_11 B  ...  a_1n B )
                 ( ...                 )
                 ( a_m1 B  ...  a_mn B )
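The direct product defined above is available in NumPy as np.kron; the small check below (ours, with illustrative matrices) confirms the block structure, in which each element a_ij is replaced by the block a_ij B.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

K = np.kron(A, B)   # the mp x nq direct (Kronecker) product
```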
3. DETERMINANTS

Associated with any square matrix A is a scalar function of its elements called the determinant of A and written |A|. If A is n × n, then |A| is said to be a determinant of order n. We define the determinant of an n × n matrix in recursive fashion. First, for n = 1,

(3.1)    If A = (a), then |A| = a,

and for n = 2,

(3.2)    If A = ( a_11  a_12 ), then |A| = a_11 a_22 − a_12 a_21.
                ( a_21  a_22 )

For n > 2, the determinant of an n × n matrix may be defined in terms of determinants of (n − 1) × (n − 1) submatrices as follows. The minor m_ij of an element a_ij in an n × n matrix A is the determinant of the (n − 1) × (n − 1) submatrix obtained by deletion of the ith row and jth column of A. The cofactor c_ij of the element a_ij is its "signed" minor, c_ij = (−1)^(i+j) m_ij. Then the determinant of an n × n matrix may be defined as the sum of the products of the elements in any row of A by their cofactors:

(3.3)    If A is n × n, then |A| = Σ_{j=1}^n a_ij c_ij   for any i.
For an example we take n = 3, and arrange the cofactors in a matrix:

(3.4)    A = ( a_11  a_12  a_13 )
             ( a_21  a_22  a_23 )
             ( a_31  a_32  a_33 )

         C = (  (a_22 a_33 − a_23 a_32)   −(a_21 a_33 − a_23 a_31)    (a_21 a_32 − a_22 a_31) )
             ( −(a_12 a_33 − a_13 a_32)    (a_11 a_33 − a_13 a_31)   −(a_11 a_32 − a_12 a_31) )
             (  (a_12 a_23 − a_13 a_22)   −(a_11 a_23 − a_13 a_21)    (a_11 a_22 − a_12 a_21) )

so that

         |A| =  a_11(a_22 a_33 − a_23 a_32) − a_12(a_21 a_33 − a_23 a_31) + a_13(a_21 a_32 − a_22 a_31)
             = −a_21(a_12 a_33 − a_13 a_32) + a_22(a_11 a_33 − a_13 a_31) − a_23(a_11 a_32 − a_12 a_31)
             =  a_31(a_12 a_23 − a_13 a_22) − a_32(a_11 a_23 − a_13 a_21) + a_33(a_11 a_22 − a_12 a_21),

the three expressions corresponding to expansion along the first, second, and third rows, respectively.
For a numerical example, take a 3 × 3 matrix whose first row is (1, 2, 3) and whose first-row cofactors are 11, −7, and 2; then expansion along the first row gives

(3.5)    |A| = 1(11) + 2(−7) + 3(2) = 3,

and expansion along the second or third row gives the same value.
It follows from the definitions of the cofactors that the sum of products of the elements in any row of A by the cofactors of the elements of another row ("alien cofactors") is always zero; thus

(3.6)    If A is n × n, then Σ_{j=1}^n a_ij c_kj = 0   (i ≠ k; i, k = 1, ..., n).
It follows from the definition of the determinant that

(3.7)    |A'| = |A|,
(3.8)    |AB| = |A||B|,
(3.9)    If A is a diagonal matrix (i.e., a_ij = 0 for i ≠ j), then |A| = a_11 a_22 ... a_nn,

and

(3.10)   If A is an n × n matrix, then |−A| = (−1)^n |A|.

Finally, a definition:

(3.11)   If A is an n × n matrix, then A is singular if and only if |A| = 0, and A is nonsingular if and only if |A| ≠ 0.
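The recursive definition (3.3) can be checked against a library determinant routine; the sketch below (ours, with an illustrative 3 × 3 matrix) expands along the first row using cofactors built from minors.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])

def minor(M, i, j):
    """Determinant of M with row i and column j deleted."""
    return np.linalg.det(np.delete(np.delete(M, i, axis=0), j, axis=1))

# (3.3) for the first row: |A| = sum_j a_1j * (-1)**(1+j) * m_1j
# (the 0-based sign (-1)**j below matches the 1-based (-1)**(1+j)).
det_by_cofactors = sum((-1) ** j * A[0, j] * minor(A, 0, j) for j in range(3))
det_direct = np.linalg.det(A)
```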
4. LINEAR DEPENDENCE, RANK, AND HOMOGENEOUS
EQUATION SYSTEMS

Linear Dependence

A set of vectors is said to be linearly dependent if there is a nontrivial linear combination of the vectors which is equal to the zero vector; thus

(4.1)    The set of n m × 1 vectors {x^(1), ..., x^(n)} is linearly dependent if and only if there exists a set of scalars {c_1, ..., c_n}, not all of which are zero, such that c_1 x^(1) + ... + c_n x^(n) = 0.

If the set is not linearly dependent, it is linearly independent. Thus a set is linearly independent if the only linear combination of the vectors which is equal to zero is the trivial one 0x^(1) + ... + 0x^(n) = 0. Any set containing a zero vector is linearly dependent; if x^(1) = 0, for example, then 1x^(1) + 0x^(2) + ... + 0x^(n) = 0 + 0 + ... + 0 = 0 shows a nontrivial linear combination equal to zero. Obviously,

(4.2)    Any subset of a linearly independent set of vectors is linearly independent.

Without proof we note that

(4.3)    If a set contains more than m m × 1 vectors, it is linearly dependent.
Some of these points may be illustrated with the following vectors:

(4.4)    x^(1) = (  4 )   x^(2) = ( 5 )   x^(3) = ( 17 )
                 ( -2 )           ( 6 )           (  0 )
                 (  1 )           ( 0 )           (  3 )

         x^(4) = ( 1 )    x^(5) = ( 0 )   x^(6) = ( 0 )
                 ( 0 )            ( 1 )           ( 0 )
                 ( 0 )            ( 0 )           ( 1 )

The set {x^(1), x^(2), x^(3)} is linearly dependent since 3x^(1) + 1x^(2) − 1x^(3) = 0—see (2.6). The set {x^(4), x^(5), x^(6)} is linearly independent since

         c_1 ( 1 )       ( 0 )       ( 0 )   ( c_1 )
             ( 0 ) + c_2 ( 1 ) + c_3 ( 0 ) = ( c_2 )
             ( 0 )       ( 0 )       ( 1 )   ( c_3 )

is zero only if c_1 = c_2 = c_3 = 0; similarly, the sets {x^(4), x^(5)}, {x^(4), x^(6)}, and {x^(5), x^(6)} are each linearly independent. The set {x^(1), x^(2)} is also linearly independent. This is so because

         c_1 (  4 )       ( 5 )   (  4c_1 + 5c_2 )
             ( -2 ) + c_2 ( 6 ) = ( -2c_1 + 6c_2 )
             (  1 )       ( 0 )   (   c_1 + 0c_2 )

for the third element to be zero c_1 must be zero, and then for the first element to be zero c_2 must be zero. (In general a set containing two vectors will be linearly dependent if and only if one is a multiple of the other.) It will also be seen that any set of four or more of the 3 × 1 vectors of (4.4) is linearly dependent. Of course, it is not always so easy to tell whether or not a set of vectors is linearly dependent.
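The rank-based test of dependence developed in the next subsection can already be anticipated numerically; in the NumPy sketch below (ours), the vectors of (4.4) are stacked as columns, and a set is dependent exactly when the rank of the resulting matrix falls below the number of vectors.

```python
import numpy as np

x1 = np.array([4.0, -2.0, 1.0])
x2 = np.array([5.0, 6.0, 0.0])
x3 = 3 * x1 + x2                  # by construction, 3*x1 + 1*x2 - 1*x3 = 0

X_dependent = np.column_stack([x1, x2, x3])
X_independent = np.column_stack([x1, x2])

rank_dep = np.linalg.matrix_rank(X_dependent)    # 2 < 3 vectors: dependent
rank_ind = np.linalg.matrix_rank(X_independent)  # 2 = 2 vectors: independent
```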
Rank of a Matrix

The columns of an m × n matrix A may be viewed as a set of n m × 1 column vectors A_1, A_2, ..., A_n, where A_j is the jth column of A. Taking this view, we define the rank of A, written r(A), as the maximum number of linearly independent vectors in the set A_1, ..., A_n; i.e., the number of vectors in the largest linearly independent set of vectors which can be constructed from the columns of A. For example,

(4.5)    r (  4  5  17 )
           ( -2  6   0 )  = 2,
           (  1  0   3 )

since the first two columns form a linearly independent set whereas the set containing all three columns is linearly dependent—see (4.4). Since any set of more than m m × 1 vectors is linearly dependent, it follows that where A is m × n, r(A) ≤ m; and since there are only n columns, r(A) ≤ n; thus

(4.6)    If A is an m × n matrix, r(A) ≤ min {m, n}.

Without proof, we note the following very useful result:

(4.7)    Consider all square submatrices of A whose determinants are nonzero. The rank of A is the order of the largest of these determinants.

Loosely speaking, the rank of A is the order of the largest-in-order nonzero subdeterminant of A. For example, let A be the matrix of (4.5); then, expanding along the third row,

(4.8)    |A| = |  4  5  17 |
               | -2  6   0 |  = 1[5(0) − 17(6)] + 3[4(6) − 5(−2)] = −102 + 102 = 0,
               |  1  0   3 |

so r(A) < 3; and since the subdeterminant formed from the first two rows and columns is 4(6) − 5(−2) = 34 ≠ 0, r(A) = 2.

Incidentally, (4.7) provides a systematic method of testing for linear dependence. Given a set of n m × 1 vectors {x^(1), x^(2), ..., x^(n)} with n ≤ m, form the m × n matrix X = (x^(1) x^(2) ... x^(n)). [If n > m, the set is linearly dependent by (4.3).] If n = m, the set is linearly independent if and only if r(X) = n, i.e., if the n × n determinant |X| ≠ 0; if n < m, the set is linearly independent if and only if X has a nonzero n × n subdeterminant.
The following results are straightforward. Since any set of column vectors that includes the zero vector is linearly dependent,

(4.9)    r(0) = 0;

the rank of the zero matrix is the smallest possible rank. Since the nonzero columns of a diagonal matrix form a linearly independent set,

(4.10)   If A is diagonal, then r(A) equals the number of nonzero diagonal elements of A;

in particular,

(4.11)   r(I_n) = n.

Since the determinant of any square submatrix is unchanged by transposition—|B'| = |B| by (3.7)—it follows from (4.7) that

(4.12)   r(A') = r(A).

Next we show that the rank of a product cannot exceed the rank of either factor:

(4.13)   r(AB) ≤ min {r(A), r(B)}.

By (2.21) each column of C = AB is a linear combination of the columns of A, so that no set of columns of C can contain more linearly independent vectors than the columns of A do; thus r(C) ≤ r(A). Transposing C = AB gives C' = B'A', so by the same argument r(C') ≤ r(B'). But r(C') = r(C) and r(B') = r(B) by (4.12); thus r(C) ≤ r(B). Finally, by (4.7) an n × n matrix has rank n if and only if its determinant is nonzero; recalling the definition of singularity,

(4.14)   If A is an n × n matrix, then r(A) = n if and only if A is nonsingular, and r(A) < n if and only if A is singular.
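The rank results (4.12) and (4.13) may be illustrated numerically; the sketch below (ours) builds a 3 × 5 matrix of rank 1 from an outer product and checks that the rank of a product cannot exceed the rank of either factor.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))                                # full column rank
B = rng.standard_normal((3, 1)) @ rng.standard_normal((1, 5))  # an outer product: rank 1

r_A = np.linalg.matrix_rank(A)
r_B = np.linalg.matrix_rank(B)
r_AB = np.linalg.matrix_rank(A @ B)   # (4.13): at most min{r(A), r(B)}
r_At = np.linalg.matrix_rank(A.T)     # (4.12): equals r(A)
```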
A useful companion result gives the inverse of a partitioned matrix. Let the square matrix A be partitioned as

         A = ( E  F )
             ( G  H )

where E and H are square and where E and D = H − GE^(-1)F are nonsingular. Then

         A^(-1) = ( E^(-1) + E^(-1)FD^(-1)GE^(-1)    −E^(-1)FD^(-1) )
                  (        −D^(-1)GE^(-1)                 D^(-1)    )

The proof is direct. Multiplying the first "super-row" of A^(-1) into the first "super-column" of A,

         (E^(-1) + E^(-1)FD^(-1)GE^(-1))E − E^(-1)FD^(-1)G
           = I + E^(-1)FD^(-1)G − E^(-1)FD^(-1)G = I;

multiplying the first "super-row" of A^(-1) into the second "super-column" of A,

         (E^(-1) + E^(-1)FD^(-1)GE^(-1))F − E^(-1)FD^(-1)H
           = E^(-1)F + E^(-1)FD^(-1)(GE^(-1)F − H)
           = E^(-1)F[I + D^(-1)(GE^(-1)F − H)] = E^(-1)F[I + D^(-1)(−D)]
           = E^(-1)F(I − I) = 0;

multiplying the second "super-row" of A^(-1) into the first "super-column" of A, −D^(-1)GE^(-1)E + D^(-1)G = −D^(-1)G + D^(-1)G = 0; and multiplying the second "super-row" of A^(-1) into the second "super-column" of A, −D^(-1)GE^(-1)F + D^(-1)H = D^(-1)(H − GE^(-1)F) = D^(-1)D = I. Thus A^(-1)A = I, so that A^(-1) is indeed the inverse of A.
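The partitioned-inverse formula just proved is easily verified numerically; in the NumPy sketch below (ours), the blocks are random matrices kept well conditioned so that E and D = H − GE^(-1)F are safely invertible.

```python
import numpy as np

rng = np.random.default_rng(3)
E = rng.standard_normal((2, 2)) + 4 * np.eye(2)
F = rng.standard_normal((2, 2))
G = rng.standard_normal((2, 2))
H = rng.standard_normal((2, 2)) + 4 * np.eye(2)

A = np.block([[E, F], [G, H]])

Ei = np.linalg.inv(E)
D = H - G @ Ei @ F                # the block that must be nonsingular
Di = np.linalg.inv(D)

A_inv = np.block([[Ei + Ei @ F @ Di @ G @ Ei, -Ei @ F @ Di],
                  [-Di @ G @ Ei,              Di]])
```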
6. CHARACTERISTIC ROOTS, DIAGONALIZATION OF A
SYMMETRIC MATRIX, ORTHOGONAL AND IDEMPOTENT
MATRICES, AND FORMS

Characteristic Roots of a Matrix

Associated with any n × n matrix A are n scalar functions of its elements called the characteristic roots (or eigenvalues) of A and written λ_1, ..., λ_n. The characteristic roots λ_i (i = 1, ..., n) are defined by the property that Ax^i = λ_i x^i for some nonzero vector x^i, which is then called a characteristic vector (or eigenvector) of A. Thus λ is a characteristic root of A if and only if Ax = λx for some x ≠ 0, i.e., if and only if (A − λI)x = 0 for some x ≠ 0. Now for any choice of λ, (A − λI)x = 0 is a system of n homogeneous linear equations in n unknowns; it has a nontrivial solution if and only if (A − λI) is singular, i.e., if and only if |A − λI| = 0.

(6.3)    If P is nonsingular, then B = PAP^(-1) has the same characteristic roots, trace, and rank as A.

Further, since the trace of a product is unaffected by cyclic permutation of its factors, tr(B) = tr(PAP^(-1)) = tr(P^(-1)PA) = tr(A). Finally, since P and P^(-1) are nonsingular, r(B) = r(PAP^(-1)) = r(A) by (5.2).
(6.4)    If D is a diagonal matrix, its characteristic roots are its diagonal elements.

If D is diagonal, then so is

         D − λI = ( d_11 − λ     0      ...      0     )
                  (    0      d_22 − λ  ...      0     )
                  (   ...                              )
                  (    0         0      ...  d_nn − λ  )

By (3.9) the determinant is (d_11 − λ)(d_22 − λ) ... (d_nn − λ), which shows that the roots of |D − λI| = 0 are just λ_1 = d_11, ..., λ_n = d_nn.

(6.5)    If λ is a characteristic root of A and p is a positive integer, then λ^p is a characteristic root of A^p, and if in addition A is nonsingular, then λ^(-1) is a characteristic root of A^(-1).

If Ax = λx, then A²x = A(Ax) = A(λx) = λ(Ax) = λ²x, and similarly A^p x = λ^p x; if Ax = λx and A is nonsingular, then A^(-1)Ax = λA^(-1)x, from which λ^(-1)x = A^(-1)x, and similarly λ^(-p)x = A^(-p)x.
(6.6)    If (A − λ_1 I)x_1 = 0 and (A − λ_2 I)x_2 = 0, where A is symmetric, x_1 ≠ 0, x_2 ≠ 0, and λ_1 ≠ λ_2, then x_1'x_2 = 0.

This follows from the fact that x_2'(A − λ_1 I)x_1 = x_2'Ax_1 − λ_1 x_2'x_1 = 0, so that x_2'Ax_1 = λ_1 x_2'x_1; similarly x_1'Ax_2 = λ_2 x_1'x_2. Since x_2'x_1 = x_1'x_2 and x_2'Ax_1 = x_1'A'x_2 = x_1'Ax_2, we find that λ_1 x_1'x_2 = λ_2 x_1'x_2; since λ_1 ≠ λ_2, this requires x_1'x_2 = 0. Extending, we have

(6.7)    If (B − λ_1 A)x_1 = 0 and (B − λ_2 A)x_2 = 0, where A and B are symmetric, x_1 ≠ 0, x_2 ≠ 0, and λ_1 ≠ λ_2, then x_1'Ax_2 = 0.

This is so because x_2'Bx_1 = λ_1 x_2'Ax_1 and x_1'Bx_2 = λ_2 x_1'Ax_2; by the symmetry of A and B these give λ_1 x_1'Ax_2 = λ_2 x_1'Ax_2, and since λ_1 ≠ λ_2 this requires x_1'Ax_2 = 0.
Orthogonal Matrix

A square matrix C is said to be orthogonal if and only if its transpose is its inverse; thus

(6.8)    C is orthogonal if and only if C' = C^(-1), i.e., if and only if C'C = CC' = I.

Note that if C is orthogonal and if C_i and C_j are columns of C, then C_i'C_j equals 1 if i = j and equals 0 if i ≠ j. Moreover, if C is orthogonal, then C' is orthogonal. In addition,

(6.9)    If C is orthogonal, then |C| = ±1.

For 1 = |I| = |C'C| = |C'||C|, and always |C'| = |C|. Examples of orthogonal matrices are the identity matrix (I' = I^(-1) = I) and

(6.10)   C = ( 1/√3   2/√6    0    )
             ( 1/√3  −1/√6   1/√2  )
             ( 1/√3  −1/√6  −1/√2  )

since

         C'C = ( 1/√3   1/√3   1/√3  ) ( 1/√3   2/√6    0    )   ( 1  0  0 )
               ( 2/√6  −1/√6  −1/√6  ) ( 1/√3  −1/√6   1/√2  ) = ( 0  1  0 )
               (  0     1/√2  −1/√2  ) ( 1/√3  −1/√6  −1/√2  )   ( 0  0  1 )
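The orthogonality of the 3 × 3 matrix displayed in (6.10) can be confirmed numerically, along with the determinant property (6.9); the NumPy sketch is ours.

```python
import numpy as np

s3, s6, s2 = np.sqrt(3.0), np.sqrt(6.0), np.sqrt(2.0)
C = np.array([[1/s3,  2/s6,  0.0],
              [1/s3, -1/s6,  1/s2],
              [1/s3, -1/s6, -1/s2]])

CtC = C.T @ C                 # (6.8): should be the identity
detC = np.linalg.det(C)       # (6.9): should be +1 or -1
```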
Diagonalization of a Symmetric Matrix

Without proof we state the very useful diagonalization theorem:

(6.11)   If A is an n × n symmetric matrix, there exists an n × n orthogonal matrix C such that C'AC is diagonal.

For a proof see Hadley (1961, pp. 242-249). Such a matrix C will be called an orthogonal matrix which diagonalizes A. To illustrate the diagonalization process, let A be the symmetric matrix of (6.2) and let C be the orthogonal matrix whose columns are characteristic vectors of A, normalized to unit length. Then C is an orthogonal matrix that diagonalizes A:

(6.12)   C'C = I and C'AC = Λ,

where Λ is a diagonal matrix carrying the characteristic roots of A on its diagonal.
It follows directly that

(6.13)   Let A be a symmetric matrix and let C be an orthogonal matrix that diagonalizes A. Then the characteristic roots of A are the diagonal elements of Λ = C'AC, and the rank of A is the number of nonzero diagonal elements of Λ.

Since Λ is a diagonal matrix, its characteristic roots are its diagonal elements by (6.4), and its rank is the number of nonzero elements on its diagonal by (4.10). Since C is orthogonal, (C')^(-1) = C, so that Λ = C'AC = C'A(C')^(-1) has the same characteristic roots and rank as does A, by (6.3). This result is illustrated in (6.12). It should be noted that the choice of an orthogonal matrix to diagonalize a symmetric matrix is not unique; however, the resulting diagonal matrices will all have the same diagonal elements, perhaps in a different order. We can always find an orthogonal matrix to diagonalize A into a matrix which has the diagonal elements in any specified order. Note also that since |Λ| = |C'AC| = |C|²|A| = |A| by (6.9), and since |Λ| is the product of its diagonal elements by (3.9),

(6.14)   The determinant of a symmetric matrix is the product of its characteristic roots.
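The diagonalization theorem and (6.14) can be illustrated with a small symmetric matrix; in the NumPy sketch below (ours), eigh returns the characteristic roots together with an orthogonal matrix of characteristic vectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric

roots, C = np.linalg.eigh(A)        # roots are 1 and 3; C is orthogonal
Lam = C.T @ A @ C                   # (6.11): C'AC is diagonal
det_from_roots = np.prod(roots)     # (6.14): equals |A|
```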
Idempotent Matrix

A symmetric matrix that reproduces itself on multiplication by itself is said to be idempotent; thus

(6.15)   A is idempotent if and only if A' = A and AA = A.

Examples of idempotent matrices are the identity matrix (I' = I and II = I) and

(6.16)   A = ( 1/2  1/2 )
             ( 1/2  1/2 )

since A' = A and

         ( 1/2  1/2 ) ( 1/2  1/2 )   ( 1/2  1/2 )
         ( 1/2  1/2 ) ( 1/2  1/2 ) = ( 1/2  1/2 )

It follows that

(6.17)   If A is idempotent, the characteristic roots of A are all either zero or one.

Let λ be a characteristic root of the idempotent matrix A; then Ax = λx for some x ≠ 0. Premultiplying by A gives A(Ax) = A(λx) = λ(Ax) = λ²x. But on the other hand A(Ax) = (AA)x = Ax = λx. Thus λ²x = λx for some nonzero x, which requires that either λ = 1 or λ = 0.

An idempotent matrix may be diagonalized into a very simple form. Let C be an orthogonal matrix that diagonalizes the idempotent matrix A; then Λ = C'AC displays the characteristic roots of A on its diagonal and has the same rank and trace as does A. Since the characteristic roots of A are all either 1 or 0, and since the rank of a diagonal matrix is equal to the number of nonzero elements on its diagonal, we have

(6.18)   Let A be an n × n idempotent matrix of rank r. Then there exists an orthogonal matrix C such that

         C'AC = Λ = ( I_r  0 )
                    (  0   0 )

Finally, since Λ has the same trace as A, and since tr(Λ) is obviously r,

(6.19)   If A is idempotent, then r(A) = tr(A).

For example, r(I_n) = n = tr(I_n), and the trace of the matrix A of (6.16) is 1/2 + 1/2 = 1 and so is its rank (its determinant is zero, but |1/2| ≠ 0).
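The idempotent matrix of (6.16) makes a convenient numerical check of (6.17) and (6.19); the NumPy sketch is ours.

```python
import numpy as np

A = np.array([[0.5, 0.5],
              [0.5, 0.5]])          # the idempotent matrix of (6.16)

AA = A @ A                          # reproduces A
roots = np.linalg.eigvalsh(A)       # (6.17): all roots are 0 or 1
rank = np.linalg.matrix_rank(A)
trace = np.trace(A)                 # (6.19): r(A) = tr(A)
```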
Linear and Quadratic Forms

If B is an m × n matrix, then the m × 1 vector y = Bx, which is defined for all n × 1 vectors x, is said to be a linear form in (the elements of) x. Each element of y is indeed a homogeneous linear function of the elements of x: y_i = Σ_{j=1}^n b_ij x_j (i = 1, ..., m). If B is an orthogonal matrix, then y = Bx is said to be an orthogonal linear form in (the elements of) x.

If A is an n × n symmetric matrix, the scalar Q = x'Ax, which is defined for all n × 1 vectors x, is said to be a quadratic form in (the elements of) x. The scalar Q is indeed a homogeneous quadratic function of the elements of x:

(6.20)   Q = Σ_{i=1}^n a_ii x_i² + Σ_{i=1}^n Σ_{j≠i} a_ij x_i x_j.

If r(A) = r, then Q = x'Ax is said to be a quadratic form of rank r. If A is idempotent, then Q = x'Ax is said to be an idempotent quadratic form.

The theory of the preceding subsection is directly applicable to show how an idempotent quadratic form can be written as a sum of squares:

(6.21)   Let Q = x'Ax be an idempotent quadratic form of rank r. Then Q = Σ_{i=1}^r y_i² = y_1'y_1, where

         y = ( y_1 ) = C'x,
             ( y_2 )

y_1 being r × 1, and C is orthogonal.
This may be shown as follows. Let C be the orthogonal matrix that diagonalizes A into the Λ of (6.18), and let y = C'x be partitioned as above, y_1 being r × 1 and y_2 being (n − r) × 1. Then x = Cy and

         Q = x'Ax = y'C'ACy = ( y_1'  y_2' ) ( I_r  0 ) ( y_1 ) = y_1'y_1.
                                             (  0   0 ) ( y_2 )

7. DEFINITE MATRICES

Positive Definite Matrix

Let A be an n × n symmetric matrix. Then A is said to be positive definite if and only if the quadratic form x'Ax is positive for every nonzero n × 1 vector x; thus

(7.1)    A symmetric matrix A is positive definite if and only if x'Ax > 0 for all x ≠ 0.
Many useful properties of positive definite matrices are readily established. First,

(7.2)    A symmetric matrix A is positive definite if and only if all the characteristic roots of A are positive.
Let C be an orthogonal matrix that diagonalizes A; i.e., let

         Λ = C'AC = ( λ_1   0   ...   0  )
                    (  0   λ_2  ...   0  )
                    (  ...               )
                    (  0    0   ...  λ_n )

where the λ's are the characteristic roots of A. Let y = C'x, so that x = (C')^(-1)y = Cy; then x'Ax = (Cy)'A(Cy) = y'C'ACy = y'Λy = Σ_{i=1}^n λ_i y_i². If all λ_i > 0, then clearly x'Ax = y'Λy ≥ 0 for all y, with equality holding only when y = 0, i.e., only when x = Cy = C0 = 0; thus A is positive definite. Conversely, let A be positive definite and suppose that a characteristic root of A, say λ_1, is not positive. Let y* be the n × 1 vector with first element 1 and remaining elements 0, and let x* = Cy*; then by (4.28) x* ≠ 0. And then x*'Ax* = y*'C'ACy* = y*'Λy* = Σ_{i=1}^n λ_i y_i*² = λ_1 ≤ 0 would contradict the assumption that A is positive definite. Next,

(7.3)    If A is an n × n positive definite matrix, then |A| > 0, r(A) = n, and A is nonsingular.

Let λ_1, ..., λ_n be the characteristic roots of A. Then |A| = λ_1 λ_2 ... λ_n by (6.14), and λ_1, ..., λ_n > 0 by (7.2), whence |A| > 0; then A is nonsingular by (3.11), and r(A) = n by (4.14).
Another useful result is

(7.4)    If A is an n × n positive definite matrix and P is an n × m matrix with r(P) = m, then P'AP is positive definite.

For the m × m matrix P'AP is obviously symmetric. Let y be any nonzero m × 1 vector. Then y'(P'AP)y = x'Ax, where x = Py. Since A is positive definite and x is nonzero by (4.28), x'Ax > 0. Hence y'(P'AP)y is positive for all y ≠ 0. Several specializations of this theorem are important. If in (7.4) we take P to be an n × n nonsingular matrix, we find

(7.5)    If A is positive definite and P is nonsingular, then P'AP is positive definite.

If in (7.4) we let P = A^(-1), then P'AP = (A^(-1))'AA^(-1) = (A^(-1))' = A^(-1), since A is symmetric; thus

(7.6)    If A is positive definite, then A^(-1) is positive definite.
The identity matrix is positive definite, since x'Ix = Σ_{i=1}^n x_i², and a sum of squares is positive unless all the terms are zero. If, then, in (7.4) we take A = I, so that P'AP = P'IP = P'P, we find

(7.7)    If P is an n × m matrix with r(P) = m, then P'P is positive definite.

If the matrix A* is obtained from the matrix A by interchanging the ith and jth columns of A and interchanging the ith and jth rows of A, then if A is positive definite, A* will also be. For A* will be symmetric, and x*'A*x* = x'Ax, where x* is any vector and x is obtained by interchanging the ith and jth elements of x*.
A principal submatrix of a square matrix is one obtained by deleting only corresponding rows and columns. Then

(7.8)    If A is positive definite, every principal submatrix of A is positive definite.

Without loss of generality, let B be the principal submatrix obtained by deleting the last n − m rows and columns of A. Then B = P'AP, where

         P = ( I_m )
             (  0  )

is an n × m matrix whose rank is clearly m, so it may serve as the P of (7.4). A principal minor of a square matrix is the determinant of a principal submatrix. Then application of (7.3) and (7.8) shows that

(7.9)    If A is positive definite, then every principal minor of A is positive;

in particular,

(7.10)   If A is positive definite, then a_ii > 0 and a_ii a_jj − a_ij² > 0, for all i and j.
Still another useful result is

(7.11)   If A is positive definite, there exists a nonsingular matrix P such that PAP' = I and P'P = A^(-1).

Let C be the orthogonal matrix such that

         C'AC = Λ = ( λ_1  ...   0  )
                    ( ...           )
                    (  0   ...  λ_n )

and let

         D = ( 1/√λ_1  ...     0    )
             (  ...                 )
             (    0    ...  1/√λ_n  )

Then P = D'C' is the required matrix, for P, being the product of nonsingular matrices, is nonsingular, and PAP' = D'C'ACD = D'ΛD = I. Further, if PAP' = I, then P'(PAP')P = P'IP = P'P, from which P'PAP'P = P'P, whence P'PA = I and P'P = A^(-1).
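The construction in the proof of (7.11) translates directly into code; the NumPy sketch below (ours) builds P = D'C' for a small positive definite matrix and checks both conclusions.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric, positive definite

roots, C = np.linalg.eigh(A)        # C orthogonal, C'AC = diag(roots)
D = np.diag(1.0 / np.sqrt(roots))   # the diagonal matrix of the proof
P = D.T @ C.T                       # the required nonsingular P

PAPt = P @ A @ P.T                  # should be I
PtP = P.T @ P                       # should be A^{-1}
```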
Nonnegative Definite Matrix

Let A be an n × n symmetric matrix. Then A is said to be nonnegative definite (or positive semidefinite) if and only if the quadratic form x'Ax is nonnegative for every n × 1 vector x; thus

(7.12)   A symmetric matrix A is nonnegative definite if and only if x'Ax ≥ 0 for all x.

Thus positive definiteness is a special case of nonnegative definiteness. The following implications of nonnegative definiteness may be derived by arguments parallel to those used in deriving the corresponding implications of positive definiteness.

(7.13)   A symmetric matrix A is nonnegative definite if and only if all the characteristic roots of A are nonnegative.

(7.14)   If A is nonnegative definite and P is any matrix, then P'AP is nonnegative definite.

(7.15)   If P is any matrix, then P'P is nonnegative definite.

(7.16)   If A is nonnegative definite, then any principal submatrix of A is nonnegative definite.

(7.17)   If A is nonnegative definite, then every principal minor of A is nonnegative.

(7.18)   If A is nonnegative definite, then a_ii ≥ 0 and a_ii a_jj − a_ij² ≥ 0 for all i and j.
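Result (7.15) and the boundary case described next in (7.19) may be seen concretely; the NumPy sketch below (ours) forms P'P from a single random row vector, producing a matrix that is nonnegative definite but singular.

```python
import numpy as np

rng = np.random.default_rng(4)
p = rng.standard_normal((1, 3))

G = p.T @ p                         # (7.15): P'P is nonnegative definite
roots = np.linalg.eigvalsh(G)       # (7.13): all roots are >= 0
rank = np.linalg.matrix_rank(G)     # 1 < 3, so G is singular
```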
It is readily seen that

(7.19)   If A is nonnegative definite but not positive definite, then its smallest characteristic root is zero and A is singular.

All the characteristic roots of A are nonnegative, and if the smallest were positive then A would be positive definite by (7.2); then, since A has at least one zero characteristic root and since |A| equals the product of the characteristic roots, |A| = 0. Moreover, it follows that

(7.20)   If A is positive definite and B is nonnegative definite but not positive definite, then the smallest root of the determinantal equation |B − λA| = 0 is zero.

Let Q = PBP', where P is a nonsingular matrix such that PAP' = I; then Q is nonnegative definite by (7.14) but not positive definite, since it is singular—|Q| = |P|²|B| = 0—so that its smallest characteristic root is zero by (7.19). Since |Q − λI| = |PBP' − λPAP'| = |P(B − λA)P'| = |P|²|B − λA|, however, and since |P|² ≠ 0, the roots of |B − λA| = 0 are the characteristic roots of Q.
The following theorem gives meaning to the idea of one matrix being "more" positive definite than another:

(7.21)   If A = B + C, where B is positive definite and C is nonnegative definite, then (a) A is positive definite, (b) |B| ≤ |A|, and (c) B^(-1) − A^(-1) is nonnegative definite.

First, since for any x ≠ 0, x'Bx > 0 and x'Cx ≥ 0, we have x'Ax = x'(B + C)x = x'Bx + x'Cx > 0, establishing part (a). Next, let P be a nonsingular matrix such that PAP' = I and P'P = A^(-1), and let Q = PBP'; then I − Q = PAP' − PBP' = P(A − B)P' = PCP'. Note that Q is positive definite by (7.5), I − Q is nonnegative definite by (7.14), and |B| = |Q||P|^(-2) = |Q||A|. Let λ be a characteristic root of Q; then 0 < λ ≤ 1, since Q is positive definite and I − Q is nonnegative definite. Then |Q|, being the product of such roots, is at most 1, so that |B| = |Q||A| ≤ |A|, establishing part (b). Now if μ is a characteristic root of Q^(-1) − I, then 1 + μ is a characteristic root of Q^(-1)—because |Q^(-1) − (1 + μ)I| = |(Q^(-1) − I) − μI|—which in turn implies by (6.5) that (1 + μ)^(-1) is a characteristic root of Q. The bounds we have established for the characteristic roots of Q show 0 < (1 + μ)^(-1) ≤ 1, from which μ ≥ 0. Thus Q^(-1) − I is nonnegative definite by (7.13), and B^(-1) − A^(-1) = P'Q^(-1)P − P'P = P'(Q^(-1) − I)P is nonnegative definite by (7.14), establishing part (c).
For completeness we may define the concept of nonpositive definiteness: the symmetric n × n matrix A is said to be nonpositive definite if and only if −A is nonnegative definite; thus

(7.24)   A symmetric matrix A is nonpositive definite if and only if x'Ax ≤ 0 for all x.

It should be recognized that a symmetric matrix need not be definite in any of these senses. For example,
if

         A = ( 0  1 )
             ( 1  0 )

then x'Ax = 2x_1 x_2, which equals 2 if x = (1, 1)' and −2 if x = (1, −1)'.
Finally, the concepts of definiteness are applied to quadratic forms as well as to their matrices. Thus the quadratic form Q = x'Ax is said to be positive definite, nonnegative definite, negative definite, or nonpositive definite according as the matrix A is positive definite, nonnegative definite, negative definite, or nonpositive definite.
8. DIFFERENTIAL CALCULUS IN MATRIX NOTATION

Differentiation

If x is an m × 1 vector and y is a scalar function of the elements of x—y = f(x_1, ..., x_m)—then ∂y/∂x is defined to be the m × 1 vector of partial derivatives ∂y/∂x_i; thus

(8.1)    ∂y/∂x = ( ∂y/∂x_1 )
                 (   ...   )
                 ( ∂y/∂x_m )
More generally, if X is an m × n matrix and y is a scalar function of the elements of X—y = f(x_11, ..., x_1n, ..., x_m1, ..., x_mn)—then ∂y/∂X is defined to be the m × n matrix of partial derivatives ∂y/∂x_ij; thus

(8.2)    ∂y/∂X = ( ∂y/∂x_11  ...  ∂y/∂x_1n )
                 (   ...                   )
                 ( ∂y/∂x_m1  ...  ∂y/∂x_mn )
If x is an n × 1 vector and y is an m × 1 vector each of whose elements is a scalar function of the elements of x—y_i = f_i(x_1, ..., x_n) (i = 1, ..., m)—then ∂y/∂x is defined to be the m × n matrix of partial derivatives ∂y_i/∂x_j; thus

(8.3)    ∂y/∂x = ( ∂y_1/∂x_1  ...  ∂y_1/∂x_n )
                 (   ...                     )
                 ( ∂y_m/∂x_1  ...  ∂y_m/∂x_n )
Further, if x is an m × 1 vector and y = f(x_1, ..., x_m) is a scalar
function of the elements of x, ∂²y/∂x² is defined to be the m × m matrix
of second partial derivatives ∂²y/(∂x_i ∂x_j); thus

(8.4)  ∂²y/∂x² = [∂²y/(∂x_i ∂x_j)]  (i, j = 1, ..., m).

It will be recognized that ∂²y/∂x² is symmetric under general conditions.
Consider the Taylor series expansion of the scalar y = f(x_1, ..., x_m)
about the point x⁰ = (x_1⁰, ..., x_m⁰)'. The first two terms of the
expansion have a convenient matrix representation:

(8.5)  y ≈ y⁰ + Σ_i (∂y/∂x_i)(x_i − x_i⁰)
          + ½ Σ_i Σ_j (∂²y/(∂x_i ∂x_j))(x_i − x_i⁰)(x_j − x_j⁰)
        = y⁰ + j'(x − x⁰) + ½(x − x⁰)'K(x − x⁰),

where y⁰ = f(x_1⁰, ..., x_m⁰), and where j = ∂y/∂x and K = ∂²y/∂x² are
evaluated at x⁰. When y is an n × 1 vector, consider the set of Taylor
series expansions of the y_j = f_j(x_1, ..., x_m) about the point
x⁰ = (x_1⁰, ..., x_m⁰)'. Since y_j − y_j⁰ = Σ_i (∂y_j/∂x_i)(x_i − x_i⁰)
+ ..., the set of first terms of the expansions has a convenient matrix
representation:

(8.6)  y ≈ y⁰ + J'(x − x⁰),

where y⁰ = (f_1(x⁰), ..., f_n(x⁰))', and where J = ∂y/∂x is evaluated
at x⁰.
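The expansion (8.5) can be checked numerically. In the sketch below (NumPy
assumed), the function f and its derivative formulas are invented for
illustration; for a small step dx, the second-order approximation should
track f far more closely than the first-order one:

```python
import numpy as np

# An illustrative scalar function of x = (x_1, x_2)' with its exact
# gradient j = dy/dx and Hessian K = d2y/dx2 (derived by hand).
def f(x):
    return np.exp(x[0]) + x[0] * x[1] ** 2

def grad(x):
    return np.array([np.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]])

def hess(x):
    return np.array([[np.exp(x[0]), 2.0 * x[1]],
                     [2.0 * x[1],  2.0 * x[0]]])

x0 = np.array([0.3, -0.7])
dx = np.array([1e-3, 2e-3])

first_order = f(x0) + grad(x0) @ dx                      # linear terms only
second_order = first_order + 0.5 * dx @ hess(x0) @ dx    # (8.5)
print(abs(f(x0 + dx) - second_order) < abs(f(x0 + dx) - first_order))
```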
For some functions convenient formulas for differentiation with respect
to a vector are available. A homogeneous linear function y = Σ_i a_i x_i
has ∂y/∂x_i = a_i; hence if we define x = (x_1, ..., x_m)' and
a = (a_1, ..., a_m)', then y = x'a, and by (8.1) we may write

(8.7)  ∂(x'a)/∂x = a.
A set of homogeneous linear functions y_j = Σ_i a_ji x_i (j = 1, ..., n)
has ∂y_j/∂x_i = a_ji; hence if we define y = (y_1, ..., y_n)' and
A = [a_ji], then y = Ax, and by (8.3) we may write

(8.8)  ∂(Ax)/∂x = A'.
A quadratic form y = Σ_i Σ_j a_ij x_i x_j with a_ij = a_ji has
∂y/∂x_i = 2 Σ_j a_ij x_j; hence if we define A = [a_ij], then y = x'Ax,
and by (8.1) we may write

(8.9)  ∂(x'Ax)/∂x = 2Ax.

The analogies of these matrix expressions to the conventional scalar
formulas should be noted.
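Formulas (8.7) through (8.9) can be verified against central-difference
approximations. A minimal NumPy sketch; the helper `num_grad` and the
random test data are assumptions of this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
a = rng.normal(size=m)
A = rng.normal(size=(3, m))      # coefficient matrix for y = Ax
S = rng.normal(size=(m, m))
S = S + S.T                      # symmetric matrix for the quadratic form
x = rng.normal(size=m)

def num_grad(f, x, h=1e-6):
    """Central-difference approximation to the gradient of a scalar function."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(np.allclose(num_grad(lambda z: z @ a, x), a))              # (8.7)
print(np.allclose(num_grad(lambda z: z @ S @ z, x), 2 * S @ x))  # (8.9)
# (8.8): column j of d(Ax)/dx is the gradient of y_j, giving A'
J = np.column_stack([num_grad(lambda z, j=j: (A @ z)[j], x) for j in range(3)])
print(np.allclose(J, A.T))
```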
It will also be useful to define the derivative of a matrix with respect
to a scalar. If θ is a scalar and Y is an m × n matrix each element of
which is a scalar function of θ, say y_ij = f_ij(θ) (i = 1, ..., m;
j = 1, ..., n), then ∂Y/∂θ is defined to be the m × n matrix of
derivatives ∂y_ij/∂θ; thus

(8.10)  ∂Y/∂θ = [∂y_ij/∂θ]  (i = 1, ..., m; j = 1, ..., n).
For some matrix functions convenient formulas for differentiation with
respect to a scalar are available. If A is an m × n matrix and B an
n × p matrix, each with elements that are functions of a scalar θ, and if
C = AB, then c_ij = Σ_k a_ik b_kj implies

∂c_ij/∂θ = Σ_k (∂a_ik/∂θ) b_kj + Σ_k a_ik (∂b_kj/∂θ),

so by (8.10) we have

(8.11)  ∂(AB)/∂θ = (∂A/∂θ)B + A(∂B/∂θ).
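Formula (8.11) can likewise be checked against a central difference; the
matrix functions A(t) and B(t) below are invented for illustration (NumPy
assumed):

```python
import numpy as np

# Illustrative matrix functions of a scalar t, with hand-derived derivatives.
def A(t):  return np.array([[t, t ** 2], [1.0, np.sin(t)]])
def dA(t): return np.array([[1.0, 2 * t], [0.0, np.cos(t)]])
def B(t):  return np.array([[np.exp(t), 0.0], [t, 3.0]])
def dB(t): return np.array([[np.exp(t), 0.0], [1.0, 0.0]])

t, h = 0.5, 1e-6
numeric = (A(t + h) @ B(t + h) - A(t - h) @ B(t - h)) / (2 * h)
exact = dA(t) @ B(t) + A(t) @ dB(t)      # the product rule (8.11)
print(np.allclose(numeric, exact))
```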
Suppose that A is an n × n nonsingular matrix whose elements are
functions of a scalar θ; then the elements of A⁻¹ will also be functions
of θ. Differentiating AA⁻¹ = I gives A(∂A⁻¹/∂θ) + (∂A/∂θ)A⁻¹ =
∂I/∂θ = 0; premultiplying through by A⁻¹ and rearranging gives

(8.12)  ∂A⁻¹/∂θ = −A⁻¹(∂A/∂θ)A⁻¹.
In particular, consider the derivative of A⁻¹ with respect to an element
a_ij of A. With θ = a_ij, ∂A/∂θ = E_ij = e_i e_j', where E_ij is the
n × n matrix having 1 as its (i, j)th element and 0's elsewhere and e_i
is the n × 1 vector having 1 as its ith element and 0's elsewhere,
(8.12) specializes to

(8.13)  ∂A⁻¹/∂a_ij = −A⁻¹ e_i e_j' A⁻¹ = −(A⁻¹ e_i)(e_j' A⁻¹),

the negative of the product of the ith column of A⁻¹ and the jth row of
A⁻¹. In particular, (8.13) means

(8.14)  ∂a^kl/∂a_ij = −a^ki a^jl,

where a^kl is the (k, l)th element of A⁻¹.
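Formula (8.13) admits a direct numerical check. A NumPy sketch; the
matrix and the indices i, j are arbitrary choices for this illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.normal(size=(n, n)) + n * np.eye(n)   # nonsingular, well conditioned
Ainv = np.linalg.inv(A)

i, j, h = 1, 2, 1e-6
E = np.zeros((n, n))
E[i, j] = 1.0                                 # E_ij = e_i e_j'

numeric = (np.linalg.inv(A + h * E) - np.linalg.inv(A - h * E)) / (2 * h)
exact = -np.outer(Ainv[:, i], Ainv[j, :])     # -(ith column)(jth row) of A^-1
print(np.allclose(numeric, exact))
```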
Finally, it is useful to record some results on differentiation of scalar
functions of the elements of a matrix. Suppose that each element of an
m × n matrix Y is a function of a scalar θ, say y_ij = f_ij(θ), and that
B is an n × m constant matrix; i.e., the elements of B do not depend on
θ. Then since

tr(YB) = Σ_i Σ_j y_ij b_ji

implies

∂ tr(YB)/∂θ = Σ_i Σ_j (∂y_ij/∂θ) b_ji,

by (8.10) we may write

(8.15)  ∂ tr(YB)/∂θ = tr[(∂Y/∂θ)B].
Suppose that each element of an n × n matrix A is a function of a scalar
θ, say a_ij = f_ij(θ). Then since the expansion |A| = Σ_j a_ij c_ij (for
any i) implies ∂|A|/∂a_ij = c_ij, where c_ij is the cofactor of a_ij, we
have

(8.16)  ∂|A|/∂θ = Σ_i Σ_j (∂|A|/∂a_ij)(∂a_ij/∂θ)
                = Σ_i Σ_j c_ij (∂a_ij/∂θ).
In particular, suppose that A is symmetric and consider the partial
derivative of |A| with respect to an element a_ij of A. With θ = a_ij,
∂a_mn/∂θ = 1 if (m, n) = (i, j) or (m, n) = (j, i), and ∂a_mn/∂θ = 0
otherwise, so (8.16) specializes to

(8.17)  ∂|A|/∂a_ij = c_ii if i = j,  = 2c_ij if i ≠ j  (A symmetric),
where we have used the fact that c_ij = c_ji for a symmetric matrix.
Further, since

∂ log |A|/∂a_ij = (1/|A|)(∂|A|/∂a_ij),

and since c_ij/|A| = a^ji = a^ij for symmetric A, we have

(8.18)  ∂ log |A|/∂a_ij = a^ii if i = j,  = 2a^ij if i ≠ j  (A symmetric).
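Formula (8.18) can be verified numerically for a symmetric matrix,
provided the symmetry is respected when perturbing: moving an
off-diagonal element a_ij moves a_ji with it, which is exactly why the
factor 2 appears. A NumPy sketch with an arbitrary positive definite A
(so that |A| > 0 and the logarithm is defined):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)      # symmetric positive definite, |A| > 0
Ainv = np.linalg.inv(A)

def logdet(X):
    return np.linalg.slogdet(X)[1]

h, i, j = 1e-6, 0, 2

# Off-diagonal: a_ij and a_ji move together, slope is 2*a^ij
E = np.zeros((n, n)); E[i, j] = E[j, i] = 1.0
off = (logdet(A + h * E) - logdet(A - h * E)) / (2 * h)
print(np.isclose(off, 2 * Ainv[i, j]))

# Diagonal: slope is a^ii
D = np.zeros((n, n)); D[i, i] = 1.0
diag = (logdet(A + h * D) - logdet(A - h * D)) / (2 * h)
print(np.isclose(diag, Ainv[i, i]))
```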
Unconstrained Extrema

The familiar rules for locating the extrema (minima or maxima) of a
function of several variables are:

(8.19)  Let y = f(x_1, ..., x_m). Then:

(8.19a)  First-order condition: For y to have an extremum at the point
(x_1⁰, ..., x_m⁰) it is necessary that ∂y/∂x_1 = ... = ∂y/∂x_m = 0; and

(8.19b)  Second-order condition: Such a point will be a minimum if

Σ_i Σ_j (∂²y/(∂x_i ∂x_j)) dx_i dx_j > 0

for every set of dx's not all of which are zero; alternatively, such a
point will be a maximum if

Σ_i Σ_j (∂²y/(∂x_i ∂x_j)) dx_i dx_j < 0

for every set of dx's not all of which are zero; where all the
derivatives are evaluated at (x_1⁰, ..., x_m⁰).
A convenient matrix formulation is available if we use (8.1), (8.4), and
the concepts of definite matrices:

(8.20)  Let x = (x_1, ..., x_m)' be an m × 1 vector, let
y = f(x_1, ..., x_m) be a scalar function of the elements of x, and let
x⁰ = (x_1⁰, ..., x_m⁰)'. Then:

(8.20a)  First-order condition: For y to have an extremum at x⁰ it is
necessary that ∂y/∂x = 0; and

(8.20b)  Second-order condition: Such a point will be a minimum if
∂²y/∂x² is positive definite; alternatively, such a point will be a
maximum if ∂²y/∂x² is negative definite;

where all the derivatives are evaluated at x⁰.
To illustrate the application of the rules, suppose that
y = a + x'b + ½x'Cx is to be minimized, where x is m × 1, and where the
1 × 1 scalar a, the m × 1 vector b, and the m × m positive definite
matrix C are all constant. Then ∂y/∂x = b + Cx and ∂²y/∂x² = C' = C.
Setting ∂y/∂x = 0 gives b + Cx⁰ = 0, from which x⁰ = −C⁻¹b locates the
extremum. Since ∂²y/∂x² is positive definite, x⁰ indeed locates the
minimum.
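The solution x⁰ = −C⁻¹b can be confirmed numerically: the gradient
vanishes there, and no nearby point gives a smaller value of y. A NumPy
sketch with arbitrary constants (C made positive definite by
construction):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 3
a = 2.0
b = rng.normal(size=m)
M = rng.normal(size=(m, m))
C = M @ M.T + m * np.eye(m)          # positive definite by construction

def y(x):
    return a + x @ b + 0.5 * x @ C @ x

x0 = -np.linalg.solve(C, b)          # the minimizer -C^{-1} b

print(np.allclose(b + C @ x0, 0.0))  # first-order condition holds
print(all(y(x0 + 0.1 * rng.normal(size=m)) >= y(x0) for _ in range(100)))
```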
Constrained Extrema
The familiar rules for locating the constrained extrema of a function of
several variables are:

(8.21)  Let y = f(x_1, ..., x_m) be subject to the n < m constraints
g(x) = 0, where g is an n × 1 vector of functions of the elements of x.
Let λ be an n × 1 vector of Lagrange multipliers and define
z = f(x) − λ'g(x). Then:

(8.21a)  First-order condition: For y to have a constrained extremum at
x⁰ it is necessary that ∂z/∂x = 0 and ∂z/∂λ = 0; and

(8.21b)  Second-order condition: Such a point will be a (constrained)
minimum if (dx)'(∂²y/∂x²)(dx) > 0 for every dx ≠ 0 satisfying
(dx)'(∂g/∂x) = 0; alternatively, such a point will be a (constrained)
maximum if (dx)'(∂²y/∂x²)(dx) < 0 for every dx ≠ 0 satisfying
(dx)'(∂g/∂x) = 0;

where all the derivatives are evaluated at x⁰.
To illustrate the application of these rules, suppose that y = ½x'Ax is
to be minimized subject to Bx − b = 0, where x is m × 1, and where the
m × m positive definite matrix A, the n × m matrix B of rank n, and the
n × 1 vector b are all constant. Let λ be the n × 1 Lagrange vector;
then g = Bx − b and z = ½x'Ax − λ'(Bx − b). Then

∂z/∂x = Ax − B'λ  and  ∂z/∂λ = −(Bx − b).

Setting the first derivative vector equal to zero gives Ax⁰ = B'λ⁰ and
Bx⁰ = b. Thus x⁰ = A⁻¹B'λ⁰; premultiplying by B and imposing Bx⁰ = b
gives Bx⁰ = BA⁻¹B'λ⁰ = b, from which λ⁰ = (BA⁻¹B')⁻¹b. Inserting this
into x⁰ = A⁻¹B'λ⁰ then locates the (constrained) extremum at
x⁰ = A⁻¹B'(BA⁻¹B')⁻¹b. Since ∂²y/∂x² = A is positive definite,
(dx)'(∂²y/∂x²)(dx) > 0 for all dx ≠ 0, including those dx's that satisfy
(dx)'B' = 0, so that x⁰ indeed locates the (constrained) minimum.
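The constrained solution can also be confirmed numerically: x⁰ satisfies
Bx = b, and every other feasible point x⁰ + v with Bv = 0 gives a value
of ½x'Ax at least as large. A NumPy sketch with arbitrary A, B, and b;
feasible directions are drawn from a null-space basis of B obtained via
the singular value decomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 2
M = rng.normal(size=(m, m))
A = M @ M.T + m * np.eye(m)               # positive definite by construction
B = rng.normal(size=(n, m))               # rank n (almost surely)
b = rng.normal(size=n)

Ainv = np.linalg.inv(A)
lam = np.linalg.solve(B @ Ainv @ B.T, b)  # lambda0 = (B A^-1 B')^-1 b
x0 = Ainv @ B.T @ lam                     # x0 = A^-1 B' lambda0

print(np.allclose(B @ x0, b))             # the constraint holds at x0

q = lambda x: 0.5 * x @ A @ x
Ns = np.linalg.svd(B)[2][n:].T            # columns span the null space of B
print(all(q(x0 + Ns @ rng.normal(size=m - n)) >= q(x0) for _ in range(100)))
```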
CHAPTER 3
Basic Concepts of
Statistical Inference
1. INTRODUCTION
From one point of view econometrics is the specialized branch of
mathematical statistics that treats statistical inference concerning the
structure of an economic system. The techniques of econometrics are then
adaptations of the standard techniques of statistical inference to fit
special characteristics of economic phenomena. In this chapter we review
the basic concepts of statistical inference to which we repeatedly refer
in the development of the theory of econometrics.
We review some rudiments of descriptive statistics, concerning empirical
frequency distributions, in Section 2. In Section 3 we discuss random
variables and probability distributions, which constitute the basic model
of mathematical statistics. In Section 4 we consider random sampling and
the distribution of sample statistics. Section 5 is devoted to the normal
distribution and distributions associated with it. In Section 6 we treat
the theory of sequences of random variables and apply this theory to the
asymptotic distributions of sample statistics. Then in Section 7 we turn
to the theory of statistical inference proper, developing criteria and
techniques of drawing inferences from a sample about its parent
population. Finally, in Section 8 some rudimentary theory of stochastic
processes is applied to the case where sampling is nonrandom.
For a more rigorous development of the material reviewed in this chapter,
several excellent texts are available. Mood (1950) and Parzen (1960) are
instructive introductions. Kendall and Stuart (1958, 1961) and Wilks
(1962) are more advanced. Anderson (1958) emphasizes normal distribution
theory but also provides a convenient general reference.