Published by: mattran on Sep 13, 2010
Copyright:Attribution Non-commercial


Sections

  • Quiz 1.1
  • Quiz 1.2
  • Quiz 1.5
  • Quiz 1.7
  • Quiz 1.8
  • Quiz 1.9
  • Quiz 1.10
  • Quiz 1.11
  • Quiz 2.1
  • Quiz 2.2
  • Quiz 2.3
  • Quiz 2.4
  • Quiz 2.7
  • Quiz 2.8
  • Quiz 2.9
  • Quiz 2.10
  • Quiz 3.1
  • Quiz 3.2
  • Quiz 3.3
  • Quiz 3.4
  • Quiz 3.5
  • Quiz 3.6
  • Quiz 3.7
  • Quiz 3.8
  • Quiz 3.9
  • Quiz 4.1
  • Quiz 4.2
  • Quiz 4.3
  • Quiz 4.4
  • Quiz 4.5
  • Quiz 4.6
  • Quiz 4.7
  • Quiz 4.8
  • Quiz 4.9
  • Quiz 4.10
  • Quiz 4.11
  • Quiz 4.12
  • Quiz 5.1
  • Quiz 5.2
  • Quiz 5.3
  • Quiz 5.4
  • Quiz 5.5
  • Quiz 5.6
  • Quiz 5.7
  • Quiz 5.8
  • Quiz 6.1
  • Quiz 6.2
  • Quiz 6.3
  • Quiz 6.4
  • Quiz 6.5
  • Quiz 6.6
  • Quiz 6.7
  • Quiz 6.8
  • Quiz 6.9
  • Quiz 7.1
  • Quiz 7.2
  • Quiz 7.3
  • Quiz 7.4
  • Quiz 7.5
  • Quiz 8.1
  • Quiz 8.2
  • Quiz 8.3
  • Quiz 8.4
  • Quiz 9.1
  • Quiz 9.2
  • Quiz 9.3
  • Quiz 9.4
  • Quiz 9.5
  • Quiz 10.1
  • Quiz 10.2
  • Quiz 10.3
  • Quiz 10.4
  • Quiz 10.5
  • Quiz 10.6
  • Quiz 10.7
  • Quiz 10.8
  • Quiz 10.9
  • Quiz 10.10
  • Quiz 10.11
  • Quiz 10.12
  • Quiz 10.13
  • Quiz 11.1
  • Quiz 11.2
  • Quiz 11.3
  • Quiz 11.4
  • Quiz 11.5
  • Quiz 11.6
  • Quiz 11.7
  • Quiz 11.8
  • Quiz 11.9
  • Quiz 11.10
  • Quiz 12.1
  • Quiz 12.2
  • Quiz 12.3
  • Quiz 12.4
  • Quiz 12.5
  • Quiz 12.6
  • Quiz 12.7
  • Quiz 12.8
  • Quiz 12.9
  • Quiz 12.10

Probability and Stochastic Processes

A Friendly Introduction for Electrical and Computer Engineers
Second Edition
Quiz Solutions
Roy D. Yates and David J. Goodman
May 22, 2004
• The MATLAB section quizzes at the end of each chapter use programs available for
download as the archive matcode.zip. This archive contains general purpose
programs for solving probability problems as well as specific .m files associated
with examples or quizzes in the text. Also available is a manual, probmatlab.pdf,
describing the general purpose .m files in matcode.zip.
• We have made a substantial effort to check the solution to every quiz. Nevertheless,
there is a nonzero probability (in fact, a probability close to unity) that errors will be
found. If you find errors or have suggestions or comments, please send email to
ryates@winlab.rutgers.edu.
When errors are found, corrected solutions will be posted at the website.
Quiz Solutions – Chapter 1
Quiz 1.1
In the Venn diagrams for parts (1)-(6) below, the shaded area represents the indicated
set.
[Figure: six Venn diagrams of the sets M, O and T, one for each part, with the indicated set shaded.]
(1) $R = T^c$   (2) $M \cup O$   (3) $M \cap O$
(4) $R \cup M$   (5) $R \cap M$   (6) $T^c - M$
Quiz 1.2
(1) $A_1 = \{vvv, vvd, vdv, vdd\}$
(2) $B_1 = \{dvv, dvd, ddv, ddd\}$
(3) $A_2 = \{vvv, vvd, dvv, dvd\}$
(4) $B_2 = \{vdv, vdd, ddv, ddd\}$
(5) $A_3 = \{vvv, ddd\}$
(6) $B_3 = \{vdv, dvd\}$
(7) $A_4 = \{vvv, vvd, vdv, dvv, vdd, dvd, ddv\}$
(8) $B_4 = \{ddd, ddv, dvd, vdd\}$
Recall that $A_i$ and $B_i$ are collectively exhaustive if $A_i \cup B_i = S$. Also, $A_i$ and $B_i$ are
mutually exclusive if $A_i \cap B_i = \phi$. Since we have written down each pair $A_i$ and $B_i$ above,
we can simply check for these properties.
The pair $A_1$ and $B_1$ are mutually exclusive and collectively exhaustive. The pair $A_2$ and
$B_2$ are mutually exclusive and collectively exhaustive. The pair $A_3$ and $B_3$ are mutually
exclusive but not collectively exhaustive. The pair $A_4$ and $B_4$ are not mutually exclusive
since $dvd$ belongs to $A_4$ and $B_4$. However, $A_4$ and $B_4$ are collectively exhaustive.
Quiz 1.3
There are exactly 50 equally likely outcomes: $s_{51}$ through $s_{100}$. Each of these outcomes
has probability 0.02.
(1) $P[\{s_{79}\}] = 0.02$
(2) $P[\{s_{100}\}] = 0.02$
(3) $P[A] = P[\{s_{90}, \ldots, s_{100}\}] = 11 \times 0.02 = 0.22$
(4) $P[F] = P[\{s_{51}, \ldots, s_{59}\}] = 9 \times 0.02 = 0.18$
(5) $P[T \ge 80] = P[\{s_{80}, \ldots, s_{100}\}] = 21 \times 0.02 = 0.42$
(6) $P[T < 90] = P[\{s_{51}, s_{52}, \ldots, s_{89}\}] = 39 \times 0.02 = 0.78$
(7) $P[\text{a C grade or better}] = P[\{s_{70}, \ldots, s_{100}\}] = 31 \times 0.02 = 0.62$
(8) $P[\text{student passes}] = P[\{s_{60}, \ldots, s_{100}\}] = 41 \times 0.02 = 0.82$
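The counts in Quiz 1.3 can be checked by enumerating the outcomes. The following Python sketch is an illustration, not part of the original solutions; it assumes outcome $s_i$ corresponds to a score of $i$, and the helper name `prob` and the grade thresholds are read off the event sets listed above.

```python
from fractions import Fraction

# Assumption: outcome s_i means a test score of i, for i = 51, ..., 100.
scores = range(51, 101)
p_outcome = Fraction(1, len(scores))  # 50 equally likely outcomes, 0.02 each

def prob(event):
    """Probability of the set of scores for which event(score) is True."""
    return sum(p_outcome for s in scores if event(s))

p_A = prob(lambda s: s >= 90)          # grade of A
p_F = prob(lambda s: s <= 59)          # failing grade
p_T_ge_80 = prob(lambda s: s >= 80)
p_T_lt_90 = prob(lambda s: s < 90)
p_C_or_better = prob(lambda s: s >= 70)
p_pass = prob(lambda s: s >= 60)

print(float(p_A), float(p_F), float(p_T_ge_80),
      float(p_T_lt_90), float(p_C_or_better), float(p_pass))
```

Exact fractions avoid floating-point round-off when the per-outcome probability is summed many times.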
Quiz 1.4
We can describe this experiment by the event space consisting of the four possible
events $VB$, $VL$, $DB$, and $DL$. We represent these events in the table:

         V       D
  L    0.35      ?
  B      ?       ?

In a roundabout way, the problem statement tells us how to fill in the table. In particular,
\[ P[V] = 0.7 = P[VL] + P[VB] \tag{1} \]
\[ P[L] = 0.6 = P[VL] + P[DL] \tag{2} \]
Since $P[VL] = 0.35$, we can conclude that $P[VB] = 0.35$ and that $P[DL] = 0.6 - 0.35 = 0.25$.
This allows us to fill in two more table entries:

         V       D
  L    0.35    0.25
  B    0.35      ?

The remaining table entry is filled in by observing that the probabilities must sum to 1.
This implies $P[DB] = 0.05$ and the complete table is

         V       D
  L    0.35    0.25
  B    0.35    0.05

Finding the various probabilities is now straightforward:
(1) $P[DL] = 0.25$
(2) $P[D \cup L] = P[VL] + P[DL] + P[DB] = 0.35 + 0.25 + 0.05 = 0.65$
(3) $P[VB] = 0.35$
(4) $P[V \cup L] = P[V] + P[L] - P[VL] = 0.7 + 0.6 - 0.35 = 0.95$
(5) $P[V \cup D] = P[S] = 1$
(6) $P[LB] = P[LL^c] = 0$
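As a hedged illustration of how the table in Quiz 1.4 is filled in, the Python sketch below starts from the three given values and derives the rest. The variable names are hypothetical and chosen only for this example.

```python
from fractions import Fraction as F

# Values given in the problem statement.
p_V, p_L, p_VL = F(7, 10), F(6, 10), F(35, 100)

# Fill in the remaining table entries.
p_VB = p_V - p_VL                      # P[V] = P[VL] + P[VB]
p_DL = p_L - p_VL                      # P[L] = P[VL] + P[DL]
p_DB = 1 - (p_VL + p_VB + p_DL)        # the four entries sum to 1

# Derived probabilities.
p_D_or_L = p_VL + p_DL + p_DB          # everything except VB
p_V_or_L = p_V + p_L - p_VL            # inclusion-exclusion

print(float(p_VB), float(p_DL), float(p_DB), float(p_D_or_L), float(p_V_or_L))
```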
Quiz 1.5
(1) The probability of exactly two voice calls is
\[ P[N_V = 2] = P[\{vvd, vdv, dvv\}] = 0.3 \tag{1} \]
(2) The probability of at least one voice call is
\[ P[N_V \ge 1] = P[\{vdd, dvd, ddv, vvd, vdv, dvv, vvv\}] \tag{2} \]
\[ = 6(0.1) + 0.2 = 0.8 \tag{3} \]
An easier way to get the same answer is to observe that
\[ P[N_V \ge 1] = 1 - P[N_V < 1] = 1 - P[N_V = 0] = 1 - P[\{ddd\}] = 0.8 \tag{4} \]
(3) The conditional probability of two voice calls followed by a data call given that there
were two voice calls is
\[ P[\{vvd\} \mid N_V = 2] = \frac{P[\{vvd\}, N_V = 2]}{P[N_V = 2]} = \frac{P[\{vvd\}]}{P[N_V = 2]} = \frac{0.1}{0.3} = \frac{1}{3} \tag{5} \]
(4) The conditional probability of two data calls followed by a voice call given there
were two voice calls is
\[ P[\{ddv\} \mid N_V = 2] = \frac{P[\{ddv\}, N_V = 2]}{P[N_V = 2]} = 0 \tag{6} \]
The joint event of the outcome $ddv$ and exactly two voice calls has probability zero
since there is only one voice call in the outcome $ddv$.
(5) The conditional probability of exactly two voice calls given at least one voice call is
\[ P[N_V = 2 \mid N_V \ge 1] = \frac{P[N_V = 2, N_V \ge 1]}{P[N_V \ge 1]} = \frac{P[N_V = 2]}{P[N_V \ge 1]} = \frac{0.3}{0.8} = \frac{3}{8} \tag{7} \]
(6) The conditional probability of at least one voice call given there were exactly two
voice calls is
\[ P[N_V \ge 1 \mid N_V = 2] = \frac{P[N_V \ge 1, N_V = 2]}{P[N_V = 2]} = \frac{P[N_V = 2]}{P[N_V = 2]} = 1 \tag{8} \]
Given that there were two voice calls, there must have been at least one voice call.
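The conditional probabilities in Quiz 1.5 all follow the same pattern, $P[A \mid B] = P[AB]/P[B]$, which can be sketched in Python as below. The outcome probabilities are an assumption inferred from the solution (six outcomes at 0.1 and $P[\{ddd\}] = 0.2$, which leaves 0.2 for $vvv$); the helper names `P` and `P_given` are hypothetical.

```python
from fractions import Fraction as F

# Assumed outcome probabilities, inferred from the solution text:
# vvv and ddd have probability 0.2, the other six outcomes 0.1 each.
pmf = {"vvv": F(2, 10), "ddd": F(2, 10)}
for outcome in ("vvd", "vdv", "dvv", "vdd", "dvd", "ddv"):
    pmf[outcome] = F(1, 10)

def P(event):
    """Probability of the outcomes satisfying the predicate `event`."""
    return sum(p for s, p in pmf.items() if event(s))

def P_given(event, cond):
    """Conditional probability P[event | cond] = P[event, cond] / P[cond]."""
    return P(lambda s: event(s) and cond(s)) / P(cond)

def n_voice(s):
    return s.count("v")

two_voice = lambda s: n_voice(s) == 2
some_voice = lambda s: n_voice(s) >= 1

p1 = P(two_voice)
p2 = P(some_voice)
p3 = P_given(lambda s: s == "vvd", two_voice)
p4 = P_given(lambda s: s == "ddv", two_voice)
p5 = P_given(two_voice, some_voice)
p6 = P_given(some_voice, two_voice)
print(p1, p2, p3, p4, p5, p6)
```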
Quiz 1.6
In this experiment, there are four outcomes with probabilities
\[ P[\{vv\}] = (0.8)^2 = 0.64 \qquad P[\{vd\}] = (0.8)(0.2) = 0.16 \]
\[ P[\{dv\}] = (0.2)(0.8) = 0.16 \qquad P[\{dd\}] = (0.2)^2 = 0.04 \]
When checking the independence of any two events A and B, it's wise to avoid intuition
and simply check whether $P[AB] = P[A]P[B]$. Using the probabilities of the outcomes,
we now can test for the independence of events.
(1) First, we calculate the probability of the joint event:
\[ P[N_V = 2, N_V \ge 1] = P[N_V = 2] = P[\{vv\}] = 0.64 \tag{1} \]
Next, we observe that
\[ P[N_V \ge 1] = P[\{vd, dv, vv\}] = 0.96 \tag{2} \]
Finally, we make the comparison
\[ P[N_V = 2]\,P[N_V \ge 1] = (0.64)(0.96) \ne P[N_V = 2, N_V \ge 1] \tag{3} \]
which shows the two events are dependent.
(2) The probability of the joint event is
\[ P[N_V \ge 1, C_1 = v] = P[\{vd, vv\}] = 0.80 \tag{4} \]
From part (1), $P[N_V \ge 1] = 0.96$. Further, $P[C_1 = v] = 0.8$ so that
\[ P[N_V \ge 1]\,P[C_1 = v] = (0.96)(0.8) = 0.768 \ne P[N_V \ge 1, C_1 = v] \tag{5} \]
Hence, the events are dependent.
(3) The problem statement that the calls were independent implies that the events the
second call is a voice call, $\{C_2 = v\}$, and the first call is a data call, $\{C_1 = d\}$, are
independent events. Just to be sure, we can do the calculations to check:
\[ P[C_1 = d, C_2 = v] = P[\{dv\}] = 0.16 \tag{6} \]
Since $P[C_1 = d]\,P[C_2 = v] = (0.2)(0.8) = 0.16$, we confirm that the events are
independent. Note that this shouldn't be surprising since we used the information that
the calls were independent in the problem statement to determine the probabilities of
the outcomes.
(4) The probability of the joint event is
\[ P[C_2 = v, N_V \text{ is even}] = P[\{vv\}] = 0.64 \tag{7} \]
Also, each event has probability
\[ P[C_2 = v] = P[\{dv, vv\}] = 0.8, \qquad P[N_V \text{ is even}] = P[\{dd, vv\}] = 0.68 \tag{8} \]
Thus, $P[C_2 = v]\,P[N_V \text{ is even}] = (0.8)(0.68) = 0.544$. Since $P[C_2 = v, N_V \text{ is even}] \ne 0.544$,
the events are dependent.
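The independence checks in Quiz 1.6 all compare $P[AB]$ with $P[A]P[B]$. The Python sketch below is an illustration under the stated model (two independent calls, each a voice call with probability 0.8); the helper `independent` and the event names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Two independent calls; each is voice (v) with probability 0.8, data (d) otherwise.
p_call = {"v": F(8, 10), "d": F(2, 10)}
pmf = {a + b: p_call[a] * p_call[b] for a, b in product("vd", repeat=2)}

def P(event):
    return sum(p for s, p in pmf.items() if event(s))

def independent(A, B):
    """True exactly when P[AB] = P[A] P[B]."""
    return P(lambda s: A(s) and B(s)) == P(A) * P(B)

nv_is_2 = lambda s: s.count("v") == 2
nv_ge_1 = lambda s: s.count("v") >= 1
nv_even = lambda s: s.count("v") % 2 == 0
c1_is_v = lambda s: s[0] == "v"
c1_is_d = lambda s: s[0] == "d"
c2_is_v = lambda s: s[1] == "v"

results = [
    independent(nv_is_2, nv_ge_1),   # part (1)
    independent(nv_ge_1, c1_is_v),   # part (2)
    independent(c1_is_d, c2_is_v),   # part (3)
    independent(c2_is_v, nv_even),   # part (4)
]
print(results)
```

Using exact fractions makes the equality test meaningful; with floats, a tolerance would be needed.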
Quiz 1.7
Let $F_i$ denote the event that the user is found on page $i$. The tree for the experiment
is

  start --0.8--> F_1
        --0.2--> F_1^c --0.8--> F_2
                       --0.2--> F_2^c --0.8--> F_3
                                      --0.2--> F_3^c

The user is found unless all three paging attempts fail. Thus the probability the user is
found is
\[ P[F] = 1 - P[F_1^c F_2^c F_3^c] = 1 - (0.2)^3 = 0.992 \tag{1} \]
Quiz 1.8
(1) We can view choosing each bit in the code word as a subexperiment. Each subexperiment
has two possible outcomes: 0 and 1. Thus by the fundamental principle of
counting, there are $2 \times 2 \times 2 \times 2 = 2^4 = 16$ possible code words.
(2) An experiment that can yield all possible code words with two zeroes is to choose
which 2 bits (out of 4 bits) will be zero. The other two bits then must be ones. There
are $\binom{4}{2} = 6$ ways to do this. Hence, there are six code words with exactly two zeroes.
For this problem, it is also possible to simply enumerate the six code words:
1100, 1010, 1001, 0101, 0110, 0011.
(3) When the first bit must be a zero, then the first subexperiment of choosing the first
bit has only one outcome. For each of the next three bits, we have two choices. In
this case, there are $1 \times 2 \times 2 \times 2 = 8$ ways of choosing a code word.
(4) For the constant ratio code, we can specify a code word by choosing M of the bits to
be ones. The other $N - M$ bits will be zeroes. The number of ways of choosing such
a code word is $\binom{N}{M}$. For $N = 8$ and $M = 3$, there are $\binom{8}{3} = 56$ code words.
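The counting arguments in Quiz 1.8 can be confirmed by brute-force enumeration. This Python sketch is only an illustration; the variable names are hypothetical.

```python
from itertools import product
from math import comb

# All 4-bit code words, built as one subexperiment per bit.
words = ["".join(bits) for bits in product("01", repeat=4)]

n_words = len(words)                                    # part (1)
two_zeros = [w for w in words if w.count("0") == 2]     # part (2)
first_zero = [w for w in words if w[0] == "0"]          # part (3)
constant_ratio = comb(8, 3)                             # part (4), N = 8, M = 3

print(n_words, sorted(two_zeros), len(first_zero), constant_ratio)
```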
Quiz 1.9
(1) In this problem, k bits received in error is the same as k failures in 100 trials. The
failure probability is $\epsilon = 1 - p$ and the success probability is $1 - \epsilon = p$. That is, the
probability of k bits in error and $100 - k$ correctly received bits is
\[ P[S_{k,100-k}] = \binom{100}{k} \epsilon^k (1 - \epsilon)^{100-k} \tag{1} \]
For $\epsilon = 0.01$,
\[ P[S_{0,100}] = (1 - \epsilon)^{100} = (0.99)^{100} = 0.3660 \tag{2} \]
\[ P[S_{1,99}] = 100(0.01)(0.99)^{99} = 0.3697 \tag{3} \]
\[ P[S_{2,98}] = 4950(0.01)^2(0.99)^{98} = 0.1849 \tag{4} \]
\[ P[S_{3,97}] = 161{,}700(0.01)^3(0.99)^{97} = 0.0610 \tag{5} \]
(2) The probability a packet is decoded correctly is just
\[ P[C] = P[S_{0,100}] + P[S_{1,99}] + P[S_{2,98}] + P[S_{3,97}] = 0.9816 \tag{6} \]
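The binomial terms in Quiz 1.9 are easy to evaluate numerically. The following Python sketch, with the hypothetical helper `p_errors`, computes the four terms and their sum so the rounded values can be checked directly.

```python
from math import comb

def p_errors(k, n=100, eps=0.01):
    """Probability of exactly k bit errors in n independent bits, error rate eps."""
    return comb(n, k) * eps**k * (1 - eps)**(n - k)

probs = [p_errors(k) for k in range(4)]   # k = 0, 1, 2, 3 errors
p_correct = sum(probs)                    # packet decodes with at most 3 errors

for k, p in enumerate(probs):
    print(f"P[{k} errors] = {p:.4f}")
print(f"P[C] = {p_correct:.4f}")
```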
Quiz 1.10
Since the chip works only if all n transistors work, the transistors in the chip are like
devices in series. The probability that a chip works is $P[C] = p^n$.
The module works if either 8 chips work or 9 chips work. Let $C_k$ denote the event that
exactly k chips work. Since transistor failures are independent of each other, chip failures
are also independent. Thus each $P[C_k]$ has the binomial probability
\[ P[C_8] = \binom{9}{8} (P[C])^8 (1 - P[C])^{9-8} = 9p^{8n}(1 - p^n), \tag{1} \]
\[ P[C_9] = (P[C])^9 = p^{9n}. \tag{2} \]
The probability a memory module works is
\[ P[M] = P[C_8] + P[C_9] = p^{8n}(9 - 8p^n) \tag{3} \]
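To illustrate the result of Quiz 1.10, the Python sketch below compares the binomial sum with the closed form $p^{8n}(9 - 8p^n)$. The function names and the sample values of `p` and `n` are hypothetical and chosen only for the demonstration.

```python
from math import comb

def p_module_binomial(p, n, chips=9, needed=8):
    """P[module works]: at least `needed` of `chips` chips work, each with prob p**n."""
    pc = p**n  # a chip works only if all n transistors work
    return sum(comb(chips, k) * pc**k * (1 - pc)**(chips - k)
               for k in range(needed, chips + 1))

def p_module_closed(p, n):
    """Closed form from equation (3)."""
    return p**(8 * n) * (9 - 8 * p**n)

# Hypothetical example values: transistor reliability 0.9999, 1000 transistors per chip.
p, n = 0.9999, 1000
print(p_module_binomial(p, n), p_module_closed(p, n))
```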
Quiz 1.11
    R=rand(1,100);
    X=(R<= 0.4) ...
      + (2*(R>0.4).*(R<=0.9)) ...
      + (3*(R>0.9));
    Y=hist(X,1:3)

For a MATLAB simulation, we first generate a vector R of 100 random numbers.
Second, we generate vector X as a function of R to represent the 3 possible outcomes
of a flip. That is, X(i)=1 if flip i was heads, X(i)=2 if flip i was tails, and
X(i)=3 if flip i landed on the edge.
To see how this works, we note there are three cases:
• If R(i) <= 0.4, then X(i)=1.
• If 0.4 < R(i) and R(i)<=0.9, then X(i)=2.
• If 0.9 < R(i), then X(i)=3.
These three cases will have probabilities 0.4, 0.5 and 0.1. Lastly, we use the hist function
to count the number of occurrences of each possible value of X(i).
Quiz Solutions – Chapter 2
Quiz 2.1
The sample space, probabilities and corresponding grades for the experiment are
Outcome P[·] G
BB 0.36 3.0
BC 0.24 2.5
CB 0.24 2.5
CC 0.16 2
Quiz 2.2
(1) To find c, we recall that the PMF must sum to 1. That is,
\[ \sum_{n=1}^{3} P_N(n) = c\left(1 + \frac{1}{2} + \frac{1}{3}\right) = 1 \tag{1} \]
This implies $c = 6/11$. Now that we have found c, the remaining parts are straightforward.
(2) $P[N = 1] = P_N(1) = c = 6/11$
(3) $P[N \ge 2] = P_N(2) + P_N(3) = c/2 + c/3 = 5/11$
(4) $P[N > 3] = \sum_{n=4}^{\infty} P_N(n) = 0$
Quiz 2.3
Decoding each transmitted bit is an independent trial where we call a bit error a "success."
Each bit is in error, that is, the trial is a success, with probability p. Now we can
interpret each experiment in the generic context of independent trials.
(1) The random variable X is the number of trials up to and including the first success.
Similar to Example 2.11, X has the geometric PMF
\[ P_X(x) = \begin{cases} p(1-p)^{x-1} & x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(2) If $p = 0.1$, then the probability exactly 10 bits are sent is
\[ P[X = 10] = P_X(10) = (0.1)(0.9)^9 = 0.0387 \tag{2} \]
The probability that at least 10 bits are sent is $P[X \ge 10] = \sum_{x=10}^{\infty} P_X(x)$. This
sum is not too hard to calculate. However, it's even easier to observe that $X \ge 10$ exactly
when the first 9 bits are transmitted correctly. That is,
\[ P[X \ge 10] = P[\text{first 9 bits are correct}] = (1-p)^9 \tag{3} \]
For $p = 0.1$, $P[X \ge 10] = 0.9^9 = 0.3874$.
(3) The random variable Y is the number of successes in 100 independent trials. Just as
in Example 2.13, Y has the binomial PMF
\[ P_Y(y) = \binom{100}{y} p^y (1-p)^{100-y} \tag{4} \]
If $p = 0.01$, the probability of exactly 2 errors is
\[ P[Y = 2] = P_Y(2) = \binom{100}{2}(0.01)^2(0.99)^{98} = 0.1849 \tag{5} \]
(4) The probability of no more than 2 errors is
\[ P[Y \le 2] = P_Y(0) + P_Y(1) + P_Y(2) \tag{6} \]
\[ = (0.99)^{100} + 100(0.01)(0.99)^{99} + \binom{100}{2}(0.01)^2(0.99)^{98} \tag{7} \]
\[ = 0.9206 \tag{8} \]
(5) Random variable Z is the number of trials up to and including the third success. Thus
Z has the Pascal PMF (see Example 2.15)
\[ P_Z(z) = \binom{z-1}{2} p^3 (1-p)^{z-3} \tag{9} \]
Note that $P_Z(z) > 0$ for $z = 3, 4, 5, \ldots$.
(6) If $p = 0.25$, the probability that the third error occurs on bit 12 is
\[ P_Z(12) = \binom{11}{2}(0.25)^3(0.75)^9 = 0.0645 \tag{10} \]
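The three PMFs used in Quiz 2.3 can be evaluated with a few lines of Python. This is a sketch for checking the arithmetic; the function names are hypothetical, and the tail probability $P[X \ge 10]$ is computed directly from the geometric PMF as one minus the first nine terms.

```python
from math import comb

def geometric_pmf(x, p):
    """P[first success occurs on trial x]."""
    return p * (1 - p)**(x - 1) if x >= 1 else 0.0

def binomial_pmf(y, n, p):
    """P[y successes in n independent trials]."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def pascal_pmf(z, k, p):
    """P[k-th success occurs on trial z]."""
    return comb(z - 1, k - 1) * p**k * (1 - p)**(z - k) if z >= k else 0.0

p_x_eq_10 = geometric_pmf(10, 0.1)
p_x_ge_10 = 1 - sum(geometric_pmf(x, 0.1) for x in range(1, 10))
p_y_eq_2 = binomial_pmf(2, 100, 0.01)
p_y_le_2 = sum(binomial_pmf(y, 100, 0.01) for y in range(3))
p_z_eq_12 = pascal_pmf(12, 3, 0.25)

print(p_x_eq_10, p_x_ge_10, p_y_eq_2, p_y_le_2, p_z_eq_12)
```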
Quiz 2.4
Each of these probabilities can be read off the CDF $F_Y(y)$. However, we must keep in
mind that when $F_Y(y)$ has a discontinuity at $y_0$, $F_Y(y)$ takes the upper value $F_Y(y_0^+)$.
(1) $P[Y < 1] = F_Y(1^-) = 0$
(2) $P[Y \le 1] = F_Y(1) = 0.6$
(3) $P[Y > 2] = 1 - P[Y \le 2] = 1 - F_Y(2) = 1 - 0.8 = 0.2$
(4) $P[Y \ge 2] = 1 - P[Y < 2] = 1 - F_Y(2^-) = 1 - 0.6 = 0.4$
(5) $P[Y = 1] = P[Y \le 1] - P[Y < 1] = F_Y(1^+) - F_Y(1^-) = 0.6$
(6) $P[Y = 3] = P[Y \le 3] - P[Y < 3] = F_Y(3^+) - F_Y(3^-) = 0.8 - 0.8 = 0$
Quiz 2.5
(1) With probability 0.7, a call is a voice call and $C = 25$. Otherwise, with probability
0.3, we have a data call and $C = 40$. This corresponds to the PMF
\[ P_C(c) = \begin{cases} 0.7 & c = 25 \\ 0.3 & c = 40 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(2) The expected value of C is
\[ E[C] = 25(0.7) + 40(0.3) = 29.5 \text{ cents} \tag{2} \]
Quiz 2.6
(1) As a function of N, the cost T is
\[ T = 25N + 40(3 - N) = 120 - 15N \tag{1} \]
(2) To find the PMF of T, we can draw the following tree:

  N=0 (0.1) --> T=120
  N=1 (0.3) --> T=105
  N=2 (0.3) --> T=90
  N=3 (0.3) --> T=75

From the tree, we can write down the PMF of T:
\[ P_T(t) = \begin{cases} 0.3 & t = 75, 90, 105 \\ 0.1 & t = 120 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
From the PMF $P_T(t)$, the expected value of T is
\[ E[T] = 75P_T(75) + 90P_T(90) + 105P_T(105) + 120P_T(120) \tag{3} \]
\[ = (75 + 90 + 105)(0.3) + 120(0.1) = 93 \tag{4} \]
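The derived PMF and expected value in Quiz 2.6 can be checked mechanically. The Python sketch below maps the PMF of N through the cost function $T = 25N + 40(3 - N)$; the names `pmf_N`, `cost` and `pmf_T` are hypothetical.

```python
from fractions import Fraction as F

# PMF of the number of voice calls N, as read from the tree.
pmf_N = {0: F(1, 10), 1: F(3, 10), 2: F(3, 10), 3: F(3, 10)}

def cost(n):
    """Cost in cents of three calls when n are voice (25 cents) and 3 - n are data (40 cents)."""
    return 25 * n + 40 * (3 - n)

# PMF of the derived random variable T = cost(N).
pmf_T = {}
for n, p in pmf_N.items():
    pmf_T[cost(n)] = pmf_T.get(cost(n), 0) + p

E_T = sum(t * p for t, p in pmf_T.items())
print(pmf_T, E_T)
```

As a cross-check, $E[T] = 120 - 15E[N]$ must give the same value because T is a linear function of N.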
Quiz 2.7
(1) Using Definition 2.14, the expected number of applications is
\[ E[A] = \sum_{a=1}^{4} a P_A(a) = 1(0.4) + 2(0.3) + 3(0.2) + 4(0.1) = 2 \tag{1} \]
(2) The number of memory chips is $M = g(A)$ where
\[ g(A) = \begin{cases} 4 & A = 1, 2 \\ 6 & A = 3 \\ 8 & A = 4 \end{cases} \tag{2} \]
(3) By Theorem 2.10, the expected number of memory chips is
\[ E[M] = \sum_{a=1}^{4} g(a) P_A(a) = 4(0.4) + 4(0.3) + 6(0.2) + 8(0.1) = 4.8 \tag{3} \]
Since $E[A] = 2$, $g(E[A]) = g(2) = 4$. However, $E[M] = 4.8 \ne g(E[A])$. The two
quantities are different because $g(A)$ is not of the form $\alpha A + \beta$.
Quiz 2.8
The PMF $P_N(n)$ allows us to calculate each of the desired quantities.
(1) The expected value of N is
\[ E[N] = \sum_{n=0}^{2} n P_N(n) = 0(0.1) + 1(0.4) + 2(0.5) = 1.4 \tag{1} \]
(2) The second moment of N is
\[ E[N^2] = \sum_{n=0}^{2} n^2 P_N(n) = 0^2(0.1) + 1^2(0.4) + 2^2(0.5) = 2.4 \tag{2} \]
(3) The variance of N is
\[ \mathrm{Var}[N] = E[N^2] - (E[N])^2 = 2.4 - (1.4)^2 = 0.44 \tag{3} \]
(4) The standard deviation is $\sigma_N = \sqrt{\mathrm{Var}[N]} = \sqrt{0.44} = 0.663$.
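The moment calculations of Quiz 2.8 follow one pattern, $E[g(N)] = \sum_n g(n)P_N(n)$, sketched below in Python. The helper name `expect` is hypothetical.

```python
from fractions import Fraction as F
from math import sqrt

# PMF of N as used in the solution.
pmf_N = {0: F(1, 10), 1: F(4, 10), 2: F(5, 10)}

def expect(g, pmf):
    """Expected value of g(X) for a discrete random variable X with the given PMF."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda n: n, pmf_N)
second_moment = expect(lambda n: n * n, pmf_N)
variance = second_moment - mean**2
std_dev = sqrt(variance)

print(float(mean), float(second_moment), float(variance), round(std_dev, 3))
```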
Quiz 2.9
(1) From the problem statement, we learn that the conditional PMF of N given the event
I is
\[ P_{N|I}(n) = \begin{cases} 0.02 & n = 1, 2, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(2) Also from the problem statement, the conditional PMF of N given the event T is
\[ P_{N|T}(n) = \begin{cases} 0.2 & n = 1, 2, 3, 4, 5 \\ 0 & \text{otherwise} \end{cases} \tag{2} \]
(3) The problem statement tells us that $P[T] = 1 - P[I] = 3/4$. From Theorem 1.10
(the law of total probability), we find the PMF of N is
\[ P_N(n) = P_{N|T}(n) P[T] + P_{N|I}(n) P[I] \tag{3} \]
\[ = \begin{cases} 0.2(0.75) + 0.02(0.25) & n = 1, 2, 3, 4, 5 \\ 0(0.75) + 0.02(0.25) & n = 6, 7, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
\[ = \begin{cases} 0.155 & n = 1, 2, 3, 4, 5 \\ 0.005 & n = 6, 7, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{5} \]
(4) First we find
\[ P[N \le 10] = \sum_{n=1}^{10} P_N(n) = (0.155)(5) + (0.005)(5) = 0.80 \tag{6} \]
By Theorem 2.17, the conditional PMF of N given $N \le 10$ is
\[ P_{N|N \le 10}(n) = \begin{cases} \dfrac{P_N(n)}{P[N \le 10]} & n \le 10 \\ 0 & \text{otherwise} \end{cases} \tag{7} \]
\[ = \begin{cases} 0.155/0.8 & n = 1, 2, 3, 4, 5 \\ 0.005/0.8 & n = 6, 7, 8, 9, 10 \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
\[ = \begin{cases} 0.19375 & n = 1, 2, 3, 4, 5 \\ 0.00625 & n = 6, 7, 8, 9, 10 \\ 0 & \text{otherwise} \end{cases} \tag{9} \]
(5) Once we have the conditional PMF, calculating conditional expectations is easy.
\[ E[N \mid N \le 10] = \sum_n n P_{N|N \le 10}(n) \tag{10} \]
\[ = \sum_{n=1}^{5} n(0.19375) + \sum_{n=6}^{10} n(0.00625) \tag{11} \]
\[ = 3.15625 \tag{12} \]
Figure 1: Two examples of the output of samplemean(k): (a) samplemean(100), (b) samplemean(1000).
(6) To find the conditional variance, we first find the conditional second moment
\[ E[N^2 \mid N \le 10] = \sum_n n^2 P_{N|N \le 10}(n) \tag{13} \]
\[ = \sum_{n=1}^{5} n^2(0.19375) + \sum_{n=6}^{10} n^2(0.00625) \tag{14} \]
\[ = 55(0.19375) + 330(0.00625) = 12.71875 \tag{15} \]
The conditional variance is
\[ \mathrm{Var}[N \mid N \le 10] = E[N^2 \mid N \le 10] - (E[N \mid N \le 10])^2 \tag{16} \]
\[ = 12.71875 - (3.15625)^2 = 2.75684 \tag{17} \]
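The chain of steps in Quiz 2.9 (total probability, conditioning on $N \le 10$, then conditional moments) can be sketched in Python as follows. The function and variable names are hypothetical; exact fractions are used so the results can be compared with the decimals above.

```python
from fractions import Fraction as F

P_T, P_I = F(3, 4), F(1, 4)

def pmf_N(n):
    """Law of total probability: P_N(n) = P_{N|T}(n) P[T] + P_{N|I}(n) P[I]."""
    given_T = F(1, 5) if 1 <= n <= 5 else F(0)     # 0.2 on 1..5
    given_I = F(1, 50) if 1 <= n <= 50 else F(0)   # 0.02 on 1..50
    return given_T * P_T + given_I * P_I

# Conditioning event B = {N <= 10}.
p_B = sum(pmf_N(n) for n in range(1, 11))
cond_pmf = {n: pmf_N(n) / p_B for n in range(1, 11)}

cond_mean = sum(n * p for n, p in cond_pmf.items())
cond_second = sum(n * n * p for n, p in cond_pmf.items())
cond_var = cond_second - cond_mean**2

print(float(p_B), float(cond_mean), float(cond_second), float(cond_var))
```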
Quiz 2.10
The function samplemean(k) generates and plots five $m_n$ sequences for $n = 1, 2, \ldots, k$.
The ith column M(:,i) of M holds a sequence $m_1, m_2, \ldots, m_k$.

    function M=samplemean(k);
    K=(1:k)';
    M=zeros(k,5);
    for i=1:5,
        X=duniformrv(0,10,k);
        M(:,i)=cumsum(X)./K;
    end;
    plot(K,M);

Examples of the function calls (a) samplemean(100) and (b) samplemean(1000)
are shown in Figure 1. Each call to samplemean(k) produces a random output.
What is observed in these figures is that for small n, $m_n$ is fairly random but as n gets
large, $m_n$ gets close to $E[X] = 5$. Although each sequence $m_1, m_2, \ldots$ that we generate is
random, the sequences always converge to $E[X]$. This random convergence is analyzed
in Chapter 7.
Quiz Solutions – Chapter 3
Quiz 3.1
The CDF of Y is
\[ F_Y(y) = \begin{cases} 0 & y < 0 \\ y/4 & 0 \le y \le 4 \\ 1 & y > 4 \end{cases} \tag{1} \]
[Figure: plot of the CDF $F_Y(y)$ versus $y$.]
From the CDF $F_Y(y)$, we can calculate the probabilities:
(1) $P[Y \le -1] = F_Y(-1) = 0$
(2) $P[Y \le 1] = F_Y(1) = 1/4$
(3) $P[2 < Y \le 3] = F_Y(3) - F_Y(2) = 3/4 - 2/4 = 1/4$
(4) $P[Y > 1.5] = 1 - P[Y \le 1.5] = 1 - F_Y(1.5) = 1 - (1.5)/4 = 5/8$
Quiz 3.2
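(1) First we will find the constant c and then we will sketch the PDF. To find c, we use
the fact that $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$. We will evaluate this integral using integration by
parts:
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^{\infty} cxe^{-x/2}\,dx \tag{1} \]
\[ = \underbrace{-2cxe^{-x/2}\Big|_0^{\infty}}_{=0} + \int_0^{\infty} 2ce^{-x/2}\,dx \tag{2} \]
\[ = -4ce^{-x/2}\Big|_0^{\infty} = 4c \tag{3} \]
Thus $c = 1/4$ and X has the Erlang ($n = 2$, $\lambda = 1/2$) PDF
\[ f_X(x) = \begin{cases} (x/4)e^{-x/2} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4} \]
[Figure: plot of the PDF $f_X(x)$ versus $x$.]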
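(2) To find the CDF $F_X(x)$, we first note X is a nonnegative random variable so that
$F_X(x) = 0$ for all $x < 0$. For $x \ge 0$,
\[ F_X(x) = \int_0^x f_X(y)\,dy = \int_0^x \frac{y}{4}e^{-y/2}\,dy \tag{5} \]
\[ = -\frac{y}{2}e^{-y/2}\Big|_0^x - \int_0^x -\frac{1}{2}e^{-y/2}\,dy \tag{6} \]
\[ = 1 - \frac{x}{2}e^{-x/2} - e^{-x/2} \tag{7} \]
The complete expression for the CDF is
\[ F_X(x) = \begin{cases} 1 - \left(\frac{x}{2} + 1\right)e^{-x/2} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{8} \]
[Figure: plot of the CDF $F_X(x)$ versus $x$.]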
(3) From the CDF $F_X(x)$,
\[ P[0 \le X \le 4] = F_X(4) - F_X(0) = 1 - 3e^{-2}. \tag{9} \]
(4) Similarly, since $F_X(-2) = 0$ and $F_X(2) = 1 - (2/2 + 1)e^{-1}$,
\[ P[-2 \le X \le 2] = F_X(2) - F_X(-2) = 1 - 2e^{-1}. \tag{10} \]
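The CDF derived in Quiz 3.2 can be cross-checked against a direct numerical integration of the PDF. The Python sketch below is illustrative; `integrate` is a simple midpoint rule written for this example, not a library routine.

```python
from math import exp

def pdf_X(x):
    """Erlang (n = 2, lambda = 1/2) PDF."""
    return (x / 4) * exp(-x / 2) if x >= 0 else 0.0

def cdf_X(x):
    """Closed-form CDF from equation (8)."""
    return 1 - (x / 2 + 1) * exp(-x / 2) if x >= 0 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

p_0_to_4 = cdf_X(4) - cdf_X(0)
p_m2_to_2 = cdf_X(2) - cdf_X(-2)

print(p_0_to_4, integrate(pdf_X, 0, 4))
print(p_m2_to_2, integrate(pdf_X, -2, 2))
```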
Quiz 3.3
The PDF of Y is
\[ f_Y(y) = \begin{cases} 3y^2/2 & -1 \le y \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1} \]
[Figure: plot of the PDF $f_Y(y)$ versus $y$.]
(1) The expected value of Y is
\[ E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_{-1}^{1} (3/2)y^3\,dy = (3/8)y^4\Big|_{-1}^{1} = 0. \tag{2} \]
Note that the above calculation wasn't really necessary because $E[Y] = 0$ whenever
the PDF $f_Y(y)$ is an even function (i.e., $f_Y(y) = f_Y(-y)$).
(2) The second moment of Y is
\[ E[Y^2] = \int_{-\infty}^{\infty} y^2 f_Y(y)\,dy = \int_{-1}^{1} (3/2)y^4\,dy = (3/10)y^5\Big|_{-1}^{1} = 3/5. \tag{3} \]
(3) The variance of Y is
\[ \mathrm{Var}[Y] = E[Y^2] - (E[Y])^2 = 3/5. \tag{4} \]
(4) The standard deviation of Y is $\sigma_Y = \sqrt{\mathrm{Var}[Y]} = \sqrt{3/5}$.
Quiz 3.4
(1) When X is an exponential ($\lambda$) random variable, $E[X] = 1/\lambda$ and $\mathrm{Var}[X] = 1/\lambda^2$.
Since $E[X] = 3$ and $\mathrm{Var}[X] = 9$, we must have $\lambda = 1/3$. The PDF of X is
\[ f_X(x) = \begin{cases} (1/3)e^{-x/3} & x \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{1} \]
(2) We know X is a uniform $(a, b)$ random variable. To find a and b, we apply
Theorem 3.6 to write
\[ E[X] = \frac{a + b}{2} = 3 \qquad \mathrm{Var}[X] = \frac{(b - a)^2}{12} = 9. \tag{2} \]
This implies
\[ a + b = 6, \qquad b - a = \pm 6\sqrt{3}. \tag{3} \]
The only valid solution with $a < b$ is
\[ a = 3 - 3\sqrt{3}, \qquad b = 3 + 3\sqrt{3}. \tag{4} \]
The complete expression for the PDF of X is
\[ f_X(x) = \begin{cases} 1/(6\sqrt{3}) & 3 - 3\sqrt{3} \le x < 3 + 3\sqrt{3}, \\ 0 & \text{otherwise.} \end{cases} \tag{5} \]
Quiz 3.5
Each of the requested probabilities can be calculated using the $\Phi(z)$ function and Table 3.1
or $Q(z)$ and Table 3.2. We start with the sketches.
(1) The PDFs of X and Y are shown below. The fact that Y has twice the standard
deviation of X is reflected in the greater spread of $f_Y(y)$. However, it is important
to remember that as the standard deviation increases, the peak value of the Gaussian
PDF goes down.
[Figure: plots of the Gaussian PDFs $f_X(x)$ and $f_Y(y)$ on a common axis; $f_X(x)$ is the taller, narrower curve.]
(2) Since X is Gaussian (0, 1),
\[ P[-1 < X \le 1] = F_X(1) - F_X(-1) \tag{1} \]
\[ = \Phi(1) - \Phi(-1) = 2\Phi(1) - 1 = 0.6826. \tag{2} \]
(3) Since Y is Gaussian (0, 2),
\[ P[-1 < Y \le 1] = F_Y(1) - F_Y(-1) \tag{3} \]
\[ = \Phi\left(\frac{1}{\sigma_Y}\right) - \Phi\left(\frac{-1}{\sigma_Y}\right) = 2\Phi\left(\frac{1}{2}\right) - 1 = 0.383. \tag{4} \]
(4) Again, since X is Gaussian (0, 1), $P[X > 3.5] = Q(3.5) = 2.33 \times 10^{-4}$.
(5) Since Y is Gaussian (0, 2), $P[Y > 3.5] = Q(3.5/2) = Q(1.75) = 1 - \Phi(1.75) = 0.0401$.
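Instead of table lookups, the Gaussian probabilities of Quiz 3.5 can be evaluated with the error function. The Python sketch below defines `Phi` and `Q` (names chosen here to mirror the text) using the standard identities $\Phi(z) = \frac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$ and $Q(z) = \frac{1}{2}\mathrm{erfc}(z/\sqrt{2})$; small differences from table values are expected because tables are rounded.

```python
from math import erf, erfc, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def Q(z):
    """Standard normal complementary CDF, Q(z) = 1 - Phi(z)."""
    return 0.5 * erfc(z / sqrt(2))

sigma_X, sigma_Y = 1.0, 2.0

p_X_within_1 = Phi(1 / sigma_X) - Phi(-1 / sigma_X)
p_Y_within_1 = Phi(1 / sigma_Y) - Phi(-1 / sigma_Y)
p_X_above = Q(3.5 / sigma_X)
p_Y_above = Q(3.5 / sigma_Y)

print(p_X_within_1, p_Y_within_1, p_X_above, p_Y_above)
```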
Quiz 3.6
The CDF of X is
\[ F_X(x) = \begin{cases} 0 & x < -1, \\ (x + 1)/4 & -1 \le x < 1, \\ 1 & x \ge 1. \end{cases} \tag{1} \]
[Figure: plot of the CDF $F_X(x)$ versus $x$.]
The following probabilities can be read directly from the CDF:
(1) $P[X \le 1] = F_X(1) = 1$.
(2) $P[X < 1] = F_X(1^-) = 1/2$.
(3) $P[X = 1] = F_X(1^+) - F_X(1^-) = 1 - 1/2 = 1/2$.
(4) We find the PDF $f_X(x)$ by taking the derivative of $F_X(x)$. The resulting PDF is
\[ f_X(x) = \begin{cases} 1/4 & -1 \le x < 1, \\ (1/2)\delta(x - 1) & x = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]
[Figure: plot of the PDF $f_X(x)$ versus $x$, with an impulse of area 0.5 at $x = 1$.]
Quiz 3.7
(1) Since X is always nonnegative, $F_X(x) = 0$ for $x < 0$. Also, $F_X(x) = 1$ for $x \ge 2$
since it's always true that $X \le 2$. Lastly, for $0 \le x \le 2$,
\[ F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy = \int_0^x (1 - y/2)\,dy = x - x^2/4. \tag{1} \]
The complete CDF of X is
\[ F_X(x) = \begin{cases} 0 & x < 0, \\ x - x^2/4 & 0 \le x \le 2, \\ 1 & x > 2. \end{cases} \tag{2} \]
[Figure: plot of the CDF $F_X(x)$ versus $x$.]
(2) The probability that $Y = 1$ is
\[ P[Y = 1] = P[X \ge 1] = 1 - F_X(1) = 1 - 3/4 = 1/4. \tag{3} \]
(3) Since X is nonnegative, Y is also nonnegative. Thus $F_Y(y) = 0$ for $y < 0$. Also,
because $Y \le 1$, $F_Y(y) = 1$ for all $y \ge 1$. Finally, for $0 < y < 1$,
\[ F_Y(y) = P[Y \le y] = P[X \le y] = F_X(y). \tag{4} \]
Using the CDF $F_X(x)$, the complete expression for the CDF of Y is
\[ F_Y(y) = \begin{cases} 0 & y < 0, \\ y - y^2/4 & 0 \le y < 1, \\ 1 & y \ge 1. \end{cases} \tag{5} \]
[Figure: plot of the CDF $F_Y(y)$ versus $y$.]
As expected, we see that the jump in $F_Y(y)$ at $y = 1$ is exactly equal to $P[Y = 1]$.
(4) By taking the derivative of $F_Y(y)$, we obtain the PDF $f_Y(y)$. Note that when $y < 0$
or $y > 1$, the PDF is zero.
\[ f_Y(y) = \begin{cases} 1 - y/2 + (1/4)\delta(y - 1) & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]
[Figure: plot of the PDF $f_Y(y)$ versus $y$, with an impulse of area 0.25 at $y = 1$.]
Quiz 3.8
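(1) $P[Y \le 6] = \int_{-\infty}^{6} f_Y(y)\,dy = \int_0^6 (1/10)\,dy = 0.6$.
(2) From Definition 3.15, the conditional PDF of Y given $Y \le 6$ is
\[ f_{Y|Y \le 6}(y) = \begin{cases} \dfrac{f_Y(y)}{P[Y \le 6]} & y \le 6, \\ 0 & \text{otherwise,} \end{cases} = \begin{cases} 1/6 & 0 \le y \le 6, \\ 0 & \text{otherwise.} \end{cases} \tag{1} \]
(3) The probability $Y > 8$ is
\[ P[Y > 8] = \int_8^{10} \frac{1}{10}\,dy = 0.2. \tag{2} \]
(4) From Definition 3.15, the conditional PDF of Y given $Y > 8$ is
\[ f_{Y|Y > 8}(y) = \begin{cases} \dfrac{f_Y(y)}{P[Y > 8]} & y > 8, \\ 0 & \text{otherwise,} \end{cases} = \begin{cases} 1/2 & 8 < y \le 10, \\ 0 & \text{otherwise.} \end{cases} \tag{3} \]
(5) From the conditional PDF $f_{Y|Y \le 6}(y)$, we can calculate the conditional expectation
\[ E[Y \mid Y \le 6] = \int_{-\infty}^{\infty} y f_{Y|Y \le 6}(y)\,dy = \int_0^6 \frac{y}{6}\,dy = 3. \tag{4} \]
(6) From the conditional PDF $f_{Y|Y > 8}(y)$, we can calculate the conditional expectation
\[ E[Y \mid Y > 8] = \int_{-\infty}^{\infty} y f_{Y|Y > 8}(y)\,dy = \int_8^{10} \frac{y}{2}\,dy = 9. \tag{5} \]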
Quiz 3.9
A natural way to produce random variables with PDF $f_{T|T>2}(t)$ is to generate samples
of T with PDF $f_T(t)$ and then to discard those samples which fail to satisfy the condition
$T > 2$. Here is a MATLAB function that uses this method:
function t=t2rv(m)
i=0;lambda=1/3;
t=zeros(m,1);
while (i<m),
x=exponentialrv(lambda,1);
if (x>2)
t(i+1)=x;
i=i+1;
end
end
A second method exploits the fact that if T is an exponential ($\lambda$) random variable, then
$T' = T + 2$ has PDF $f_{T'}(t) = f_{T|T>2}(t)$. In this case the command
t=2.0+exponentialrv(1/3,m)
generates the vector t.
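For readers without the matcode.zip functions, the two methods of Quiz 3.9 can be sketched in Python using the standard library's `random.expovariate`. This is an analogue of the MATLAB code, not a translation of the authors' toolbox; the function names `t2rv_reject` and `t2rv_shift`, the seed, and the sample size are choices made for this example.

```python
import random

def t2rv_reject(m, lam=1/3, rng=random):
    """Rejection method: draw exponential(lam) samples and keep those exceeding 2."""
    samples = []
    while len(samples) < m:
        x = rng.expovariate(lam)
        if x > 2:
            samples.append(x)
    return samples

def t2rv_shift(m, lam=1/3, rng=random):
    """Shift method: by the memoryless property, 2 + exponential(lam) has the same PDF."""
    return [2.0 + rng.expovariate(lam) for _ in range(m)]

rng = random.Random(0)          # fixed seed so the run is repeatable
a = t2rv_reject(20000, rng=rng)
b = t2rv_shift(20000, rng=rng)
mean_a = sum(a) / len(a)
mean_b = sum(b) / len(b)
print(mean_a, mean_b)           # both should be near 2 + 1/lam = 5
```

The shift method does not waste samples, whereas the rejection method discards a fraction $1 - e^{-2/3}$ of the draws on average.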
Quiz Solutions – Chapter 4
Quiz 4.1
Each value of the joint CDF can be found by considering the corresponding probability.
(1) $F_{X,Y}(-\infty, 2) = P[X \le -\infty, Y \le 2] \le P[X \le -\infty] = 0$ since X cannot take on
the value $-\infty$.
(2) $F_{X,Y}(\infty, \infty) = P[X \le \infty, Y \le \infty] = 1$. This result is given in Theorem 4.1.
(3) $F_{X,Y}(\infty, y) = P[X \le \infty, Y \le y] = P[Y \le y] = F_Y(y)$.
(4) $F_{X,Y}(\infty, -\infty) = P[X \le \infty, Y \le -\infty] = 0$ since Y cannot take on the value $-\infty$.
Quiz 4.2
From the joint PMF of Q and G given in the table, we can calculate the requested
probabilities by summing the PMF over those values of Q and G that correspond to the
event.
(1) The probability that $Q = 0$ is
\[ P[Q = 0] = P_{Q,G}(0, 0) + P_{Q,G}(0, 1) + P_{Q,G}(0, 2) + P_{Q,G}(0, 3) \tag{1} \]
\[ = 0.06 + 0.18 + 0.24 + 0.12 = 0.6 \tag{2} \]
(2) The probability that $Q = G$ is
\[ P[Q = G] = P_{Q,G}(0, 0) + P_{Q,G}(1, 1) = 0.18 \tag{3} \]
(3) The probability that $G > 1$ is
\[ P[G > 1] = \sum_{g=2}^{3} \sum_{q=0}^{1} P_{Q,G}(q, g) \tag{4} \]
\[ = 0.24 + 0.16 + 0.12 + 0.08 = 0.6 \tag{5} \]
(4) The probability that $G > Q$ is
\[ P[G > Q] = \sum_{q=0}^{1} \sum_{g=q+1}^{3} P_{Q,G}(q, g) \tag{6} \]
\[ = 0.18 + 0.24 + 0.12 + 0.16 + 0.08 = 0.78 \tag{7} \]
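The sums in Quiz 4.2 can be reproduced by storing the joint PMF as a dictionary and summing over the outcomes in each event. The table itself is not reprinted in these solutions, so the entries in this Python sketch are an assumption reconstructed from the terms shown above; in particular, $P_{Q,G}(1, 0) = 0.04$ is inferred from the requirement that the PMF sums to 1.

```python
from fractions import Fraction as F

# Joint PMF P_{Q,G}(q, g) in hundredths. Reconstructed from the sums in the
# solution; the (1, 0) entry is inferred so that the total is 1 (assumption).
hundredths = {
    (0, 0): 6, (0, 1): 18, (0, 2): 24, (0, 3): 12,
    (1, 0): 4, (1, 1): 12, (1, 2): 16, (1, 3): 8,
}
pmf = {qg: F(v, 100) for qg, v in hundredths.items()}

def P(event):
    """Sum the joint PMF over the (q, g) pairs that satisfy the event."""
    return sum(p for (q, g), p in pmf.items() if event(q, g))

p_Q0 = P(lambda q, g: q == 0)
p_Q_eq_G = P(lambda q, g: q == g)
p_G_gt_1 = P(lambda q, g: g > 1)
p_G_gt_Q = P(lambda q, g: g > q)
print(float(p_Q0), float(p_Q_eq_G), float(p_G_gt_1), float(p_G_gt_Q))
```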
Quiz 4.3
By Theorem 4.3, the marginal PMF of H is
\[ P_H(h) = \sum_{b=0,2,4} P_{H,B}(h, b) \tag{1} \]
For each value of h, this corresponds to calculating the row sum across the table of the joint
PMF. Similarly, the marginal PMF of B is
\[ P_B(b) = \sum_{h=-1}^{1} P_{H,B}(h, b) \tag{2} \]
For each value of b, this corresponds to the column sum down the table of the joint PMF.
The easiest way to calculate these marginal PMFs is to simply sum each row and column:

  P_{H,B}(h,b)   b = 0   b = 2   b = 4   P_H(h)
  h = -1           0      0.4     0.2     0.6
  h = 0           0.1      0      0.1     0.2
  h = 1           0.1     0.1      0      0.2
  P_B(b)          0.2     0.5     0.3
                                                (3)
Quiz 4.4
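To find the constant c, we apply $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx\,dy = 1$. Specifically,
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx\,dy = \int_0^2 \int_0^1 cxy\,dx\,dy \tag{1} \]
\[ = c\int_0^2 y\left(x^2/2\Big|_0^1\right)dy \tag{2} \]
\[ = (c/2)\int_0^2 y\,dy = (c/4)y^2\Big|_0^2 = c \tag{3} \]
Thus $c = 1$. To calculate $P[A]$, we write
\[ P[A] = \iint_A f_{X,Y}(x, y)\,dx\,dy \tag{4} \]
To integrate over A, we convert to polar coordinates using the substitutions $x = r\cos\theta$,
$y = r\sin\theta$ and $dx\,dy = r\,dr\,d\theta$, yielding
[Figure: the region A, a quarter disc of radius 1 in the first quadrant of the X-Y plane.]
\[ P[A] = \int_0^{\pi/2}\int_0^1 r^2 \sin\theta\cos\theta\; r\,dr\,d\theta \tag{5} \]
\[ = \left(\int_0^1 r^3\,dr\right)\left(\int_0^{\pi/2} \sin\theta\cos\theta\,d\theta\right) \tag{6} \]
\[ = \left(r^4/4\Big|_0^1\right)\left(\frac{\sin^2\theta}{2}\Big|_0^{\pi/2}\right) = 1/8 \tag{7} \]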
Quiz 4.5
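By Theorem 4.8, the marginal PDF of X is
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy \tag{1} \]
For $x < 0$ or $x > 1$, $f_X(x) = 0$. For $0 \le x \le 1$,
\[ f_X(x) = \frac{6}{5}\int_0^1 (x + y^2)\,dy = \frac{6}{5}\left(xy + y^3/3\right)\Big|_{y=0}^{y=1} = \frac{6}{5}(x + 1/3) = \frac{6x + 2}{5} \tag{2} \]
The complete expression for the PDF of X is
\[ f_X(x) = \begin{cases} (6x + 2)/5 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]
By the same method we obtain the marginal PDF for Y. For $0 \le y \le 1$,
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx \tag{4} \]
\[ = \frac{6}{5}\int_0^1 (x + y^2)\,dx = \frac{6}{5}\left(x^2/2 + xy^2\right)\Big|_{x=0}^{x=1} = \frac{6}{5}(1/2 + y^2) = \frac{3 + 6y^2}{5} \tag{5} \]
Since $f_Y(y) = 0$ for $y < 0$ or $y > 1$, the complete expression for the PDF of Y is
\[ f_Y(y) = \begin{cases} (3 + 6y^2)/5 & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6} \]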
Quiz 4.6
(A) The time required for the transfer is $T = L/B$. For each pair of values of L and B,
we can calculate the time T needed for the transfer. We can write these down on the
table for the joint PMF of L and B as follows:

  P_{L,B}(l,b)     b = 14,400      b = 21,600      b = 28,800
  l = 518,400      0.20 (T=36)     0.10 (T=24)     0.05 (T=18)
  l = 2,592,000    0.05 (T=180)    0.10 (T=120)    0.20 (T=90)
  l = 7,776,000    0.00 (T=540)    0.10 (T=360)    0.20 (T=270)

From the table, writing down the PMF of T is straightforward.
\[ P_T(t) = \begin{cases} 0.05 & t = 18 \\ 0.1 & t = 24 \\ 0.2 & t = 36, 90 \\ 0.1 & t = 120 \\ 0.05 & t = 180 \\ 0.2 & t = 270 \\ 0.1 & t = 360 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]
(B) First, we observe that since $0 \le X \le 1$ and $0 \le Y \le 1$, $W = XY$ satisfies
$0 \le W \le 1$. Thus $F_W(w) = 0$ for $w < 0$ and $F_W(w) = 1$ for $w \ge 1$. For $0 < w < 1$, we calculate the
CDF $F_W(w) = P[W \le w]$. As shown below, integrating over the region $W \le w$
is fairly complex. The calculus is simpler if we integrate over the region $XY > w$.
Specifically,
[Figure: the unit square in the X-Y plane, with the region $XY > w$ lying above the curve $XY = w$.]
\[ F_W(w) = 1 - P[XY > w] \tag{2} \]
\[ = 1 - \int_w^1 \int_{w/x}^1 dy\,dx \tag{3} \]
\[ = 1 - \int_w^1 (1 - w/x)\,dx \tag{4} \]
\[ = 1 - \left(x - w\ln x\Big|_{x=w}^{x=1}\right) \tag{5} \]
\[ = 1 - (1 - w + w\ln w) = w - w\ln w \tag{6} \]
The complete expression for the CDF is
\[ F_W(w) = \begin{cases} 0 & w < 0 \\ w - w\ln w & 0 \le w \le 1 \\ 1 & w > 1 \end{cases} \tag{7} \]
By taking the derivative of the CDF, we find the PDF is
\[ f_W(w) = \frac{dF_W(w)}{dw} = \begin{cases} 0 & w < 0 \\ -\ln w & 0 \le w \le 1 \\ 0 & w > 1 \end{cases} \tag{8} \]
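The CDF $F_W(w) = w - w\ln w$ from Quiz 4.6(B) can be checked numerically. The Python sketch below assumes, as the unit integrand in the solution indicates, that X and Y are jointly uniform on the unit square; then $P[XY \le w] = \int_0^1 \min(1, w/x)\,dx$, which is approximated with a midpoint rule. The function names are hypothetical.

```python
from math import log

def cdf_W(w):
    """Closed-form CDF of W = XY from equation (7)."""
    if w <= 0:
        return 0.0
    if w >= 1:
        return 1.0
    return w - w * log(w)

def cdf_W_numeric(w, steps=200_000):
    """Midpoint-rule estimate of P[XY <= w] = integral over x in (0,1) of min(1, w/x).

    Assumes X and Y are independent uniform (0, 1) random variables.
    """
    h = 1.0 / steps
    return h * sum(min(1.0, w / ((i + 0.5) * h)) for i in range(steps))

for w in (0.1, 0.3, 0.5, 0.9):
    print(w, cdf_W(w), cdf_W_numeric(w))
```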
Quiz 4.7
(A) It is helpful to first make a table that includes the marginal PMFs.

  P_{L,T}(l,t)   t = 40   t = 60   P_L(l)
  l = 1           0.15     0.1      0.25
  l = 2           0.3      0.2      0.5
  l = 3           0.15     0.1      0.25
  P_T(t)          0.6      0.4

(1) The expected value of L is
\[ E[L] = 1(0.25) + 2(0.5) + 3(0.25) = 2. \tag{1} \]
Since the second moment of L is
\[ E[L^2] = 1^2(0.25) + 2^2(0.5) + 3^2(0.25) = 4.5, \tag{2} \]
the variance of L is
\[ \mathrm{Var}[L] = E[L^2] - (E[L])^2 = 0.5. \tag{3} \]
(2) The expected value of T is
\[ E[T] = 40(0.6) + 60(0.4) = 48. \tag{4} \]
The second moment of T is
\[ E[T^2] = 40^2(0.6) + 60^2(0.4) = 2400. \tag{5} \]
Thus
\[ \mathrm{Var}[T] = E[T^2] - (E[T])^2 = 2400 - 48^2 = 96. \tag{6} \]
(3) The correlation is
\[ E[LT] = \sum_{t=40,60} \sum_{l=1}^{3} lt\,P_{L,T}(l, t) \tag{7} \]
\[ = 1(40)(0.15) + 2(40)(0.3) + 3(40)(0.15) \tag{8} \]
\[ + 1(60)(0.1) + 2(60)(0.2) + 3(60)(0.1) \tag{9} \]
\[ = 96 \tag{10} \]
(4) From Theorem 4.16(a), the covariance of L and T is
\[ \mathrm{Cov}[L, T] = E[LT] - E[L]E[T] = 96 - 2(48) = 0 \tag{11} \]
(5) Since $\mathrm{Cov}[L, T] = 0$, the correlation coefficient is $\rho_{L,T} = 0$.
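The discrete moments in Quiz 4.7(A) all take the form $E[g(L,T)] = \sum_l \sum_t g(l,t)P_{L,T}(l,t)$. A minimal Python sketch of that pattern, with the hypothetical helper `E`, is:

```python
from fractions import Fraction as F

# Joint PMF P_{L,T}(l, t) from the table.
pmf = {
    (1, 40): F(15, 100), (1, 60): F(10, 100),
    (2, 40): F(30, 100), (2, 60): F(20, 100),
    (3, 40): F(15, 100), (3, 60): F(10, 100),
}

def E(g):
    """Expected value of g(L, T) under the joint PMF."""
    return sum(g(l, t) * p for (l, t), p in pmf.items())

E_L, E_T = E(lambda l, t: l), E(lambda l, t: t)
var_L = E(lambda l, t: l * l) - E_L**2
var_T = E(lambda l, t: t * t) - E_T**2
E_LT = E(lambda l, t: l * t)
cov_LT = E_LT - E_L * E_T

print(E_L, var_L, E_T, var_T, E_LT, cov_LT)
```

Here the covariance is zero because the joint PMF factors as $P_L(l)P_T(t)$, so L and T are in fact independent.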
(B) As in the discrete case, the calculations become easier if we first calculate the marginal
PDFs $f_X(x)$ and $f_Y(y)$. For $0 \le x \le 1$,
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \int_0^2 xy\,dy = \frac{1}{2}xy^2\Big|_{y=0}^{y=2} = 2x \tag{12} \]
Similarly, for $0 \le y \le 2$,
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx = \int_0^1 xy\,dx = \frac{1}{2}x^2 y\Big|_{x=0}^{x=1} = \frac{y}{2} \tag{13} \]
The complete expressions for the marginal PDFs are
\[ f_X(x) = \begin{cases} 2x & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} y/2 & 0 \le y \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{14} \]
From the marginal PDFs, it is straightforward to calculate the various expectations.
(1) The first and second moments of X are
\[ E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 2x^2\,dx = \frac{2}{3} \tag{15} \]
\[ E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^1 2x^3\,dx = \frac{1}{2} \tag{16} \]
The variance of X is $\mathrm{Var}[X] = E[X^2] - (E[X])^2 = 1/18$.
(2) The first and second moments of Y are
\[ E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_0^2 \frac{1}{2}y^2\,dy = \frac{4}{3} \tag{18} \]
\[ E[Y^2] = \int_{-\infty}^{\infty} y^2 f_Y(y)\,dy = \int_0^2 \frac{1}{2}y^3\,dy = 2 \tag{19} \]
The variance of Y is $\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2 = 2 - 16/9 = 2/9$.
(3) The correlation of X and Y is
\[ E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy f_{X,Y}(x, y)\,dx\,dy \tag{20} \]
\[ = \int_0^1 \int_0^2 x^2 y^2\,dy\,dx = \frac{x^3}{3}\Big|_0^1 \cdot \frac{y^3}{3}\Big|_0^2 = \frac{8}{9} \tag{21} \]
(4) The covariance of X and Y is
\[ \mathrm{Cov}[X, Y] = E[XY] - E[X]E[Y] = \frac{8}{9} - \frac{2}{3}\cdot\frac{4}{3} = 0. \tag{22} \]
(5) Since $\mathrm{Cov}[X, Y] = 0$, the correlation coefficient is $\rho_{X,Y} = 0$.
Quiz 4.8
(A) Since the event V > 80 occurs only for the pairs (L, T) = (2, 60), (L, T) = (3, 40)
and (L, T) = (3, 60),
P [A] = P [V > 80] = P
L,T
(2, 60) + P
L,T
(3, 40) + P
L,T
(3, 60) = 0.45 (1)
By Definition 4.9,
P
L,T| A
(l, t ) =
¸
P
L,T
(l,t )
P[A]
lt > 80
0 otherwise
(2)
26
We can represent this conditional PMF in the following table:
P
L,T| A
(l, t ) t = 40 t = 60
l = 1 0 0
l = 2 0 4/9
l = 3 1/3 2/9
The conditional expectation of V can be found from the conditional PMF.
E [V| A] =
¸
l
¸
t
lt P
L,T| A
(l, t ) (3)
= (2 · 60)
4
9
+(3 · 40)
1
3
+(3 · 60)
2
9
= 133
1
3
(4)
For the conditional variance $\text{Var}[V|A]$, we first find the conditional second moment
$$E[V^2|A] = \sum_l \sum_t (lt)^2 P_{L,T|A}(l,t) \tag{5}$$
$$= (2 \cdot 60)^2\frac{4}{9} + (3 \cdot 40)^2\frac{1}{3} + (3 \cdot 60)^2\frac{2}{9} = 18{,}400 \tag{6}$$
It follows that
$$\text{Var}[V|A] = E[V^2|A] - (E[V|A])^2 = 622\tfrac{2}{9} \tag{7}$$
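The conditional moments above can be reproduced with exact rational arithmetic. This is a small sketch that takes the three nonzero entries of the conditional PMF table as given; the names `cond_pmf` and `cond_moment` are illustrative.

```python
from fractions import Fraction

# Nonzero entries of the conditional PMF P_{L,T|A}(l, t) from the table above.
cond_pmf = {
    (2, 60): Fraction(4, 9),
    (3, 40): Fraction(1, 3),
    (3, 60): Fraction(2, 9),
}

def cond_moment(k):
    """k-th conditional moment of V = L*T given the event A."""
    return sum((l * t) ** k * p for (l, t), p in cond_pmf.items())

mean_v = cond_moment(1)             # E[V|A]
second_v = cond_moment(2)           # E[V^2|A]
var_v = second_v - mean_v ** 2      # Var[V|A]

print(mean_v, second_v, var_v)
```

Working with `Fraction` avoids rounding, so the mixed numbers $133\tfrac{1}{3}$ and $622\tfrac{2}{9}$ appear as the exact fractions 400/3 and 5600/9.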
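(B) For continuous random variables X and Y, we first calculate the probability of the conditioning event.
$$P[B] = \iint_B f_{X,Y}(x,y)\,dx\,dy = \int_{40}^{60}\int_{80/y}^{3} \frac{xy}{4000}\,dx\,dy \tag{8}$$
$$= \int_{40}^{60} \frac{y}{4000}\left(\left.\frac{x^2}{2}\right|_{80/y}^{3}\right)dy \tag{9}$$
$$= \int_{40}^{60} \frac{y}{4000}\left(\frac{9}{2} - \frac{3200}{y^2}\right)dy \tag{10}$$
$$= \frac{9}{8} - \frac{4}{5}\ln\frac{3}{2} \approx 0.801 \tag{11}$$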
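The conditional PDF of X and Y is
$$f_{X,Y|B}(x,y) = \begin{cases} f_{X,Y}(x,y)/P[B] & (x,y) \in B \\ 0 & \text{otherwise} \end{cases} \tag{12}$$
$$= \begin{cases} Kxy & 40 \le y \le 60,\ 80/y \le x \le 3 \\ 0 & \text{otherwise} \end{cases} \tag{13}$$
where $K = (4000P[B])^{-1}$. The conditional expectation of W given event B is
$$E[W|B] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f_{X,Y|B}(x,y)\,dx\,dy \tag{14}$$
$$= \int_{40}^{60}\int_{80/y}^{3} Kx^2y^2\,dx\,dy \tag{15}$$
$$= (K/3)\int_{40}^{60} y^2 \left. x^3\right|_{x=80/y}^{x=3}\,dy \tag{16}$$
$$= (K/3)\int_{40}^{60} \left(27y^2 - 80^3/y\right)dy \tag{17}$$
$$= (K/3)\left.\left(9y^3 - 80^3\ln y\right)\right|_{40}^{60} \approx 120.78 \tag{18}$$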
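The conditional second moment of W given B is
$$E[W^2|B] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (xy)^2 f_{X,Y|B}(x,y)\,dx\,dy \tag{19}$$
$$= \int_{40}^{60}\int_{80/y}^{3} Kx^3y^3\,dx\,dy \tag{20}$$
$$= (K/4)\int_{40}^{60} y^3 \left. x^4\right|_{x=80/y}^{x=3}\,dy \tag{21}$$
$$= (K/4)\int_{40}^{60} \left(81y^3 - 80^4/y\right)dy \tag{22}$$
$$= (K/4)\left.\left((81/4)y^4 - 80^4\ln y\right)\right|_{40}^{60} \approx 15{,}143.75 \tag{23}$$
It follows that the conditional variance of W given B is
$$\text{Var}[W|B] = E[W^2|B] - (E[W|B])^2 \approx 555.85 \tag{24}$$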
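Because the closed-form evaluation involves several large constants, a direct numerical integration is a useful sanity check. The sketch below assumes the joint PDF is $xy/4000$ on the region $B = \{40 \le y \le 60,\ 80/y \le x \le 3\}$, as in the integrals above; the function name `integrate_over_b` and the grid size are illustrative.

```python
import math

def integrate_over_b(g, n=400):
    """Midpoint-rule integral of g(x, y) * x*y/4000 over
    B = {40 <= y <= 60, 80/y <= x <= 3}."""
    hy = 20.0 / n
    total = 0.0
    for j in range(n):
        y = 40.0 + (j + 0.5) * hy
        x_lo = 80.0 / y
        hx = (3.0 - x_lo) / n
        for i in range(n):
            x = x_lo + (i + 0.5) * hx
            total += g(x, y) * (x * y / 4000.0) * hx * hy
    return total

p_b = integrate_over_b(lambda x, y: 1.0)                     # P[B]
e_w = integrate_over_b(lambda x, y: x * y) / p_b             # E[W|B]
e_w2 = integrate_over_b(lambda x, y: (x * y) ** 2) / p_b     # E[W^2|B]
var_w = e_w2 - e_w ** 2                                      # Var[W|B]
closed_form_p_b = 9.0 / 8.0 - 0.8 * math.log(1.5)

print(p_b, closed_form_p_b, e_w, e_w2, var_w)
```

Since $W = XY$ lies between 80 and 180 on $B$, any reported second moment must fall between $80^2$ and $180^2$, which gives one more quick plausibility test on the printed values.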
Quiz 4.9
(A) (1) The joint PMF of A and B can be found from the marginal and conditional PMFs via $P_{A,B}(a,b) = P_{B|A}(b|a)P_A(a)$. Incorporating the information from the given conditional PMFs can be confusing, however. Consequently, we can note that A has range $S_A = \{0, 2\}$ and B has range $S_B = \{0, 1\}$. A table of the joint PMF will include all four possible combinations of A and B. The general form of the table is
P_{A,B}(a,b)   b = 0                  b = 1
a = 0          P_{B|A}(0|0) P_A(0)    P_{B|A}(1|0) P_A(0)
a = 2          P_{B|A}(0|2) P_A(2)    P_{B|A}(1|2) P_A(2)
Substituting values from $P_{B|A}(b|a)$ and $P_A(a)$, we have
P_{A,B}(a,b)   b = 0         b = 1
a = 0          (0.8)(0.4)    (0.2)(0.4)
a = 2          (0.5)(0.6)    (0.5)(0.6)
or
P_{A,B}(a,b)   b = 0   b = 1
a = 0          0.32    0.08
a = 2          0.3     0.3
(2) Given the conditional PMF $P_{B|A}(b|2)$, it is easy to calculate the conditional expectation
$$E[B|A=2] = \sum_{b=0}^{1} b P_{B|A}(b|2) = (0)(0.5) + (1)(0.5) = 0.5 \tag{1}$$
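(3) From the joint PMF $P_{A,B}(a,b)$, we can calculate the conditional PMF
$$P_{A|B}(a|0) = \frac{P_{A,B}(a,0)}{P_B(0)} = \begin{cases} 0.32/0.62 & a = 0 \\ 0.3/0.62 & a = 2 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
$$= \begin{cases} 16/31 & a = 0 \\ 15/31 & a = 2 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$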
(4) We can calculate the conditional variance $\text{Var}[A|B=0]$ using the conditional PMF $P_{A|B}(a|0)$. First we calculate the conditional expected value
$$E[A|B=0] = \sum_a a P_{A|B}(a|0) = 0(16/31) + 2(15/31) = 30/31 \tag{4}$$
The conditional second moment is
$$E[A^2|B=0] = \sum_a a^2 P_{A|B}(a|0) = 0^2(16/31) + 2^2(15/31) = 60/31 \tag{5}$$
The conditional variance is then
$$\text{Var}[A|B=0] = E[A^2|B=0] - (E[A|B=0])^2 = \frac{960}{961} \tag{6}$$
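The chain from marginal and conditional PMFs to the conditional variance can be traced in a few lines of exact arithmetic. This sketch uses the values $P_A(0) = 0.4$, $P_A(2) = 0.6$ and the conditional PMFs substituted in the tables above; the variable names are illustrative.

```python
from fractions import Fraction

# Marginal PMF of A and conditional PMF of B given A, as used in the tables above.
p_a = {0: Fraction(2, 5), 2: Fraction(3, 5)}
p_b_given_a = {
    0: {0: Fraction(4, 5), 1: Fraction(1, 5)},
    2: {0: Fraction(1, 2), 1: Fraction(1, 2)},
}

# Joint PMF P_{A,B}(a, b) = P_{B|A}(b|a) P_A(a).
joint = {(a, b): p_b_given_a[a][b] * p_a[a] for a in p_a for b in (0, 1)}

# Condition on B = 0.
p_b0 = sum(p for (a, b), p in joint.items() if b == 0)
p_a_given_b0 = {a: joint[(a, 0)] / p_b0 for a in p_a}

mean_a = sum(a * p for a, p in p_a_given_b0.items())
second_a = sum(a * a * p for a, p in p_a_given_b0.items())
var_a = second_a - mean_a ** 2

print(p_a_given_b0, mean_a, second_a, var_a)
```

Each intermediate quantity in the script corresponds to one numbered equation, so a mismatch points directly at the step that went wrong.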
(B) (1) The joint PDF of X and Y is
$$f_{X,Y}(x,y) = f_{Y|X}(y|x) f_X(x) = \begin{cases} 6y & 0 \le y \le x,\ 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{7}$$
(2) From the given conditional PDF $f_{Y|X}(y|x)$,
$$f_{Y|X}(y|1/2) = \begin{cases} 8y & 0 \le y \le 1/2 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$
(3) The conditional PDF of X given $Y = 1/2$ is $f_{X|Y}(x|1/2) = f_{X,Y}(x,1/2)/f_Y(1/2)$. To find $f_Y(1/2)$, we integrate the joint PDF.
$$f_Y(1/2) = \int_{-\infty}^{\infty} f_{X,Y}(x,1/2)\,dx = \int_{1/2}^{1} 6(1/2)\,dx = 3/2 \tag{9}$$
Thus, for $1/2 \le x \le 1$,
$$f_{X|Y}(x|1/2) = \frac{f_{X,Y}(x,1/2)}{f_Y(1/2)} = \frac{6(1/2)}{3/2} = 2 \tag{10}$$
(4) From the previous part, we see that given $Y = 1/2$, the conditional PDF of X is uniform (1/2, 1). Thus, by the definition of the uniform (a, b) PDF,
$$\text{Var}[X|Y=1/2] = \frac{(1 - 1/2)^2}{12} = \frac{1}{48} \tag{11}$$
Quiz 4.10
(A) (1) For random variables X and Y from Example 4.1, we observe that $P_Y(1) = 0.09$ and $P_X(0) = 0.01$. However,
$$P_{X,Y}(0,1) = 0 \ne P_X(0)P_Y(1) \tag{1}$$
Since we have found a pair x, y such that $P_{X,Y}(x,y) \ne P_X(x)P_Y(y)$, we can conclude that X and Y are dependent. Note that whenever $P_{X,Y}(x,y) = 0$, independence requires that either $P_X(x) = 0$ or $P_Y(y) = 0$.
(2) For random variables Q and G from Quiz 4.2, it is not obvious whether they are independent. Unlike X and Y in part (1), there are no obvious pairs q, g that fail the independence requirement. In this case, we calculate the marginal PMFs from the table of the joint PMF $P_{Q,G}(q,g)$ in Quiz 4.2.
P_{Q,G}(q,g)   g = 0   g = 1   g = 2   g = 3   P_Q(q)
q = 0          0.06    0.18    0.24    0.12    0.60
q = 1          0.04    0.12    0.16    0.08    0.40
P_G(g)         0.10    0.30    0.40    0.20
Careful study of the table will verify that $P_{Q,G}(q,g) = P_Q(q)P_G(g)$ for every pair q, g. Hence Q and G are independent.
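The "careful study of the table" can be delegated to a short script that checks the product rule entry by entry. This is a sketch using the joint PMF values quoted above, stored in hundredths so the comparison is exact; the names `table` and `independent` are illustrative.

```python
from fractions import Fraction

# Joint PMF P_{Q,G}(q, g) from the table above, in units of 1/100.
table = {
    0: [6, 18, 24, 12],
    1: [4, 12, 16, 8],
}
joint = {(q, g): Fraction(v, 100) for q, row in table.items() for g, v in enumerate(row)}

# Marginal PMFs obtained by summing rows and columns.
p_q = {q: sum(joint[(q, g)] for g in range(4)) for q in (0, 1)}
p_g = {g: sum(joint[(q, g)] for q in (0, 1)) for g in range(4)}

# Independence holds only if the product rule is satisfied for every pair.
independent = all(joint[(q, g)] == p_q[q] * p_g[g] for (q, g) in joint)

print(p_q, p_g, independent)
```

A single failing pair would be enough to make `independent` false, which mirrors the argument used for X and Y in part (1).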
(B) (1) Since $X_1$ and $X_2$ are independent,
$$f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1) f_{X_2}(x_2) \tag{2}$$
$$= \begin{cases} (1 - x_1/2)(1 - x_2/2) & 0 \le x_1 \le 2,\ 0 \le x_2 \le 2 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
(2) Let $F_X(x)$ denote the CDF of both $X_1$ and $X_2$. The CDF of $Z = \max(X_1, X_2)$ is found by observing that $Z \le z$ iff $X_1 \le z$ and $X_2 \le z$. That is,
$$P[Z \le z] = P[X_1 \le z, X_2 \le z] \tag{4}$$
$$= P[X_1 \le z]\,P[X_2 \le z] = [F_X(z)]^2 \tag{5}$$
To complete the problem, we need to find the CDF of each $X_i$. From the PDF $f_X(x)$, the CDF is
$$F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy = \begin{cases} 0 & x < 0 \\ x - x^2/4 & 0 \le x \le 2 \\ 1 & x > 2 \end{cases} \tag{6}$$
Thus for $0 \le z \le 2$,
$$F_Z(z) = (z - z^2/4)^2 \tag{7}$$
The complete expression for the CDF of Z is
$$F_Z(z) = \begin{cases} 0 & z < 0 \\ (z - z^2/4)^2 & 0 \le z \le 2 \\ 1 & z > 2 \end{cases} \tag{8}$$
Quiz 4.11
This problem just requires identifying the various terms in Definition 4.17 and Theorem 4.29. Specifically, from the problem statement, we know that $\rho = 1/2$,
$$\mu_1 = \mu_X = 0, \qquad \mu_2 = \mu_Y = 0, \tag{1}$$
and that
$$\sigma_1 = \sigma_X = 1, \qquad \sigma_2 = \sigma_Y = 1. \tag{2}$$
(1) Applying these facts to Definition 4.17, we have
$$f_{X,Y}(x,y) = \frac{1}{\sqrt{3\pi^2}}\, e^{-2(x^2 - xy + y^2)/3}. \tag{3}$$
(2) By Theorem 4.30, the conditional expected value and standard deviation of X given $Y = y$ are
$$E[X|Y=y] = y/2, \qquad \tilde{\sigma}_X = \sigma_1\sqrt{1 - \rho^2} = \sqrt{3/4}. \tag{4}$$
When $Y = y = 2$, we see that $E[X|Y=2] = 1$ and $\text{Var}[X|Y=2] = 3/4$. The conditional PDF of X given $Y = 2$ is simply the Gaussian PDF
$$f_{X|Y}(x|2) = \frac{1}{\sqrt{3\pi/2}}\, e^{-2(x-1)^2/3}. \tag{5}$$
Quiz 4.12
One straightforward method is to follow the approach of Example 4.28. Instead, we use an alternate approach. First we observe that X has the discrete uniform (1, 4) PMF. Also, given $X = x$, Y has a discrete uniform (1, x) PMF. That is,
$$P_X(x) = \begin{cases} 1/4 & x = 1, 2, 3, 4, \\ 0 & \text{otherwise,} \end{cases} \qquad P_{Y|X}(y|x) = \begin{cases} 1/x & y = 1, \ldots, x \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
Given $X = x$, and an independent uniform (0, 1) random variable U, we can generate a sample value of Y with a discrete uniform (1, x) PMF via $Y = \lceil xU \rceil$. This observation prompts the following program:
function xy=dtrianglerv(m)
% Generate m sample pairs (X,Y); each column of xy is one pair.
sx=[1;2;3;4];
px=0.25*ones(4,1);
x=finiterv(sx,px,m);     % X is discrete uniform (1,4); finiterv is from matcode.zip
y=ceil(x.*rand(m,1));    % given X=x, Y=ceil(xU) is discrete uniform (1,x)
xy=[x';y'];
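For readers without the matcode.zip library, the same two-stage sampling idea can be sketched with the Python standard library. This is an illustrative translation, not the authors' program; the function name `dtriangle_rv` is hypothetical, and `random.randint` stands in for the `finiterv` call.

```python
import math
import random

def dtriangle_rv(m, rng=random):
    """Return m pairs (x, y): X is uniform on {1,...,4} and,
    given X = x, Y is uniform on {1,...,x}."""
    pairs = []
    for _ in range(m):
        x = rng.randint(1, 4)          # discrete uniform (1, 4)
        u = 1.0 - rng.random()         # uniform on (0, 1], avoids ceil(0) = 0
        y = math.ceil(x * u)           # Y = ceil(xU) lands in {1,...,x}
        pairs.append((x, y))
    return pairs

samples = dtriangle_rv(10000, random.Random(1))
print(samples[:5])
```

Drawing U from the half-open interval (0, 1] is a small design choice that guarantees the ceiling never returns 0, a case that has probability zero in theory but is worth excluding in code.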
Quiz Solutions – Chapter 5
Quiz 5.1
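We find $P[C]$ by integrating the joint PDF over the region of interest. Specifically,
$$P[C] = \int_0^{1/2} dy_2 \int_0^{y_2} dy_1 \int_0^{1/2} dy_4 \int_0^{y_4} 4\,dy_3 \tag{1}$$
$$= 4\left(\int_0^{1/2} y_2\,dy_2\right)\left(\int_0^{1/2} y_4\,dy_4\right) = 4 \cdot \frac{1}{8} \cdot \frac{1}{8} = 1/16. \tag{2}$$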
Quiz 5.2
By definition of A, $Y_1 = X_1$, $Y_2 = X_2 - X_1$ and $Y_3 = X_3 - X_2$. Since $0 < X_1 < X_2 < X_3$, each $Y_i$ must be a strictly positive integer. Thus, for $y_1, y_2, y_3 \in \{1, 2, \ldots\}$,
$$P_Y(y) = P[Y_1 = y_1, Y_2 = y_2, Y_3 = y_3] \tag{1}$$
$$= P[X_1 = y_1, X_2 - X_1 = y_2, X_3 - X_2 = y_3] \tag{2}$$
$$= P[X_1 = y_1, X_2 = y_2 + y_1, X_3 = y_3 + y_2 + y_1] \tag{3}$$
$$= (1-p)^3 p^{y_1 + y_2 + y_3} \tag{4}$$
By defining the vector $a = [1\ 1\ 1]'$, the complete expression for the joint PMF of Y is
$$P_Y(y) = \begin{cases} (1-p)^3 p^{a'y} & y_1, y_2, y_3 \in \{1, 2, \ldots\} \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
Quiz 5.3
First we note that each marginal PDF is nonzero only if the corresponding subset of the $x_i$ obeys the ordering constraints $0 \le x_1 \le x_2 \le x_3 \le 1$. Within these constraints, we have
$$f_{X_1,X_2}(x_1,x_2) = \int_{-\infty}^{\infty} f_X(x)\,dx_3 = \int_{x_2}^{1} 6\,dx_3 = 6(1 - x_2), \tag{1}$$
$$f_{X_2,X_3}(x_2,x_3) = \int_{-\infty}^{\infty} f_X(x)\,dx_1 = \int_0^{x_2} 6\,dx_1 = 6x_2, \tag{2}$$
$$f_{X_1,X_3}(x_1,x_3) = \int_{-\infty}^{\infty} f_X(x)\,dx_2 = \int_{x_1}^{x_3} 6\,dx_2 = 6(x_3 - x_1). \tag{3}$$
In particular, we must keep in mind that $f_{X_1,X_2}(x_1,x_2) = 0$ unless $0 \le x_1 \le x_2 \le 1$, $f_{X_2,X_3}(x_2,x_3) = 0$ unless $0 \le x_2 \le x_3 \le 1$, and that $f_{X_1,X_3}(x_1,x_3) = 0$ unless $0 \le x_1 \le x_3 \le 1$. The complete expressions are
$$f_{X_1,X_2}(x_1,x_2) = \begin{cases} 6(1 - x_2) & 0 \le x_1 \le x_2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
$$f_{X_2,X_3}(x_2,x_3) = \begin{cases} 6x_2 & 0 \le x_2 \le x_3 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$
$$f_{X_1,X_3}(x_1,x_3) = \begin{cases} 6(x_3 - x_1) & 0 \le x_1 \le x_3 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
Now we can find the marginal PDFs. When $0 \le x_i \le 1$ for each $x_i$,
$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_2 = \int_{x_1}^{1} 6(1 - x_2)\,dx_2 = 3(1 - x_1)^2 \tag{7}$$
$$f_{X_2}(x_2) = \int_{-\infty}^{\infty} f_{X_2,X_3}(x_2,x_3)\,dx_3 = \int_{x_2}^{1} 6x_2\,dx_3 = 6x_2(1 - x_2) \tag{8}$$
$$f_{X_3}(x_3) = \int_{-\infty}^{\infty} f_{X_2,X_3}(x_2,x_3)\,dx_2 = \int_0^{x_3} 6x_2\,dx_2 = 3x_3^2 \tag{9}$$
The complete expressions are
$$f_{X_1}(x_1) = \begin{cases} 3(1 - x_1)^2 & 0 \le x_1 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
$$f_{X_2}(x_2) = \begin{cases} 6x_2(1 - x_2) & 0 \le x_2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{11}$$
$$f_{X_3}(x_3) = \begin{cases} 3x_3^2 & 0 \le x_3 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{12}$$
Quiz 5.4
In the PDF $f_Y(y)$, the components have dependencies as a result of the ordering constraints $Y_1 \le Y_2$ and $Y_3 \le Y_4$. We can separate these constraints by creating the vectors
$$V = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}, \qquad W = \begin{bmatrix} Y_3 \\ Y_4 \end{bmatrix}. \tag{1}$$
The joint PDF of V and W is
$$f_{V,W}(v,w) = \begin{cases} 4 & 0 \le v_1 \le v_2 \le 1,\ 0 \le w_1 \le w_2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
We must verify that V and W are independent. For $0 \le v_1 \le v_2 \le 1$,
$$f_V(v) = \iint f_{V,W}(v,w)\,dw_1\,dw_2 \tag{3}$$
$$= \int_0^1 \left(\int_{w_1}^{1} 4\,dw_2\right)dw_1 \tag{4}$$
$$= \int_0^1 4(1 - w_1)\,dw_1 = 2 \tag{5}$$
Similarly, for $0 \le w_1 \le w_2 \le 1$,
$$f_W(w) = \iint f_{V,W}(v,w)\,dv_1\,dv_2 \tag{6}$$
$$= \int_0^1 \left(\int_{v_1}^{1} 4\,dv_2\right)dv_1 = 2 \tag{7}$$
It follows that V and W have PDFs
$$f_V(v) = \begin{cases} 2 & 0 \le v_1 \le v_2 \le 1 \\ 0 & \text{otherwise} \end{cases}, \qquad f_W(w) = \begin{cases} 2 & 0 \le w_1 \le w_2 \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$
It is easy to verify that $f_{V,W}(v,w) = f_V(v) f_W(w)$, confirming that V and W are independent vectors.
Quiz 5.5
(A) Referring to Theorem 1.19, each test is a subexperiment with three possible outcomes: L, A and R. In five trials, the vector $X = [X_1\ X_2\ X_3]'$ indicating the number of outcomes of each subexperiment has the multinomial PMF
$$P_X(x) = \begin{cases} \dbinom{5}{x_1, x_2, x_3} (0.3)^{x_1}(0.6)^{x_2}(0.1)^{x_3} & x_1 + x_2 + x_3 = 5;\ x_1, x_2, x_3 \in \{0, 1, \ldots, 5\} \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
We can find the marginal PMF for each $X_i$ from the joint PMF $P_X(x)$; however it is simpler to just start from first principles and observe that $X_1$ is the number of occurrences of L in five independent tests. If we view each test as a trial with success probability $P[L] = 0.3$, we see that $X_1$ is a binomial $(n, p) = (5, 0.3)$ random variable. Similarly, $X_2$ is a binomial (5, 0.6) random variable and $X_3$ is a binomial (5, 0.1) random variable. That is, for $p_1 = 0.3$, $p_2 = 0.6$ and $p_3 = 0.1$,
$$P_{X_i}(x) = \begin{cases} \dbinom{5}{x} p_i^x (1 - p_i)^{5-x} & x = 0, 1, \ldots, 5 \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
From the marginal PMFs, we see that $X_1$, $X_2$ and $X_3$ are not independent. Hence, we must use Theorem 5.6 to find the PMF of W. In particular, since $X_1 + X_2 + X_3 = 5$ and since each $X_i$ is non-negative, $P_W(0) = P_W(1) = 0$. Furthermore,
$$P_W(2) = P_X(1,2,2) + P_X(2,1,2) + P_X(2,2,1) \tag{3}$$
$$= \frac{5!\left[0.3(0.6)^2(0.1)^2 + 0.3^2(0.6)(0.1)^2 + 0.3^2(0.6)^2(0.1)\right]}{2!\,2!\,1!} \tag{4}$$
$$= 0.1458 \tag{5}$$
In addition, for $w = 3$, $w = 4$, and $w = 5$, the event $W = w$ occurs if and only if one of the mutually exclusive events $X_1 = w$, $X_2 = w$, or $X_3 = w$ occurs. Thus,
$$P_W(3) = P_{X_1}(3) + P_{X_2}(3) + P_{X_3}(3) = 0.486 \tag{6}$$
$$P_W(4) = P_{X_1}(4) + P_{X_2}(4) + P_{X_3}(4) = 0.288 \tag{7}$$
$$P_W(5) = P_{X_1}(5) + P_{X_2}(5) + P_{X_3}(5) = 0.0802 \tag{8}$$
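The PMF of W can also be obtained by brute-force enumeration of the multinomial outcomes. The sketch below assumes $W = \max(X_1, X_2, X_3)$, which is what the case analysis above implies but is not restated in this solution; the helper `multinomial_pmf` is an illustrative name.

```python
from collections import defaultdict
from math import factorial

p = (0.3, 0.6, 0.1)   # P[L], P[A], P[R]
n = 5                 # number of independent tests

def multinomial_pmf(x):
    """P_X(x) for a count vector x = (x1, x2, x3) with x1 + x2 + x3 = n."""
    coef = factorial(n) // (factorial(x[0]) * factorial(x[1]) * factorial(x[2]))
    prob = float(coef)
    for xi, pi in zip(x, p):
        prob *= pi ** xi
    return prob

# Accumulate P[W = w] under the assumption W = max(X1, X2, X3).
pmf_w = defaultdict(float)
for x1 in range(n + 1):
    for x2 in range(n + 1 - x1):
        x = (x1, x2, n - x1 - x2)
        pmf_w[max(x)] += multinomial_pmf(x)

for w in sorted(pmf_w):
    print(w, round(pmf_w[w], 4))
```

Enumeration sidesteps the question of which events are mutually exclusive, so it serves as an independent check on the case-by-case argument.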
(B) Since each $Y_i = 2X_i + 4$, we can apply Theorem 5.10 to write
$$f_Y(y) = \frac{1}{2^3} f_X\left(\frac{y_1 - 4}{2}, \frac{y_2 - 4}{2}, \frac{y_3 - 4}{2}\right) \tag{9}$$
$$= \begin{cases} (1/8)e^{-(y_3 - 4)/2} & 4 \le y_1 \le y_2 \le y_3 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$
Note that for other matrices A, the constraints on y resulting from the constraints $0 \le X_1 \le X_2 \le X_3$ can be much more complicated.
Quiz 5.6
We start by finding the components $E[X_i] = \int_{-\infty}^{\infty} x f_{X_i}(x)\,dx$ of $\mu_X$. To do so, we use the marginal PDFs $f_{X_i}(x)$ found in Quiz 5.3:
$$E[X_1] = \int_0^1 3x(1 - x)^2\,dx = 1/4, \tag{1}$$
$$E[X_2] = \int_0^1 6x^2(1 - x)\,dx = 1/2, \tag{2}$$
$$E[X_3] = \int_0^1 3x^3\,dx = 3/4. \tag{3}$$
To find the correlation matrix $R_X$, we need to find $E[X_iX_j]$ for all i and j. We start with the second moments:
$$E[X_1^2] = \int_0^1 3x^2(1 - x)^2\,dx = 1/10. \tag{4}$$
$$E[X_2^2] = \int_0^1 6x^3(1 - x)\,dx = 3/10. \tag{5}$$
$$E[X_3^2] = \int_0^1 3x^4\,dx = 3/5. \tag{6}$$
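Using the pairwise joint PDFs from Quiz 5.3, the cross terms are
$$E[X_1X_2] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1x_2 f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2 \tag{7}$$
$$= \int_0^1 \left(\int_{x_1}^{1} 6x_1x_2(1 - x_2)\,dx_2\right)dx_1 \tag{8}$$
$$= \int_0^1 \left[x_1 - 3x_1^3 + 2x_1^4\right]dx_1 = 3/20. \tag{9}$$
$$E[X_2X_3] = \int_0^1\int_{x_2}^{1} 6x_2^2x_3\,dx_3\,dx_2 \tag{10}$$
$$= \int_0^1 \left[3x_2^2 - 3x_2^4\right]dx_2 = 2/5 \tag{11}$$
$$E[X_1X_3] = \int_0^1\int_{x_1}^{1} 6x_1x_3(x_3 - x_1)\,dx_3\,dx_1 \tag{12}$$
$$= \int_0^1 \left(\left.\left(2x_1x_3^3 - 3x_1^2x_3^2\right)\right|_{x_3 = x_1}^{x_3 = 1}\right)dx_1 \tag{13}$$
$$= \int_0^1 \left[2x_1 - 3x_1^2 + x_1^4\right]dx_1 = 1/5. \tag{14}$$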
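Summarizing the results, X has correlation matrix
$$R_X = \begin{bmatrix} 1/10 & 3/20 & 1/5 \\ 3/20 & 3/10 & 2/5 \\ 1/5 & 2/5 & 3/5 \end{bmatrix}. \tag{15}$$
Vector X has covariance matrix
$$C_X = R_X - E[X]E[X]' \tag{16}$$
$$= \begin{bmatrix} 1/10 & 3/20 & 1/5 \\ 3/20 & 3/10 & 2/5 \\ 1/5 & 2/5 & 3/5 \end{bmatrix} - \begin{bmatrix} 1/4 \\ 1/2 \\ 3/4 \end{bmatrix} \begin{bmatrix} 1/4 & 1/2 & 3/4 \end{bmatrix} \tag{17}$$
$$= \begin{bmatrix} 1/10 & 3/20 & 1/5 \\ 3/20 & 3/10 & 2/5 \\ 1/5 & 2/5 & 3/5 \end{bmatrix} - \begin{bmatrix} 1/16 & 1/8 & 3/16 \\ 1/8 & 1/4 & 3/8 \\ 3/16 & 3/8 & 9/16 \end{bmatrix} = \frac{1}{80}\begin{bmatrix} 3 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 3 \end{bmatrix}. \tag{18}$$
This problem shows that even for fairly simple joint PDFs, computing the covariance matrix by calculus can be a time-consuming task.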
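The final subtraction $C_X = R_X - E[X]E[X]'$ is a natural place for arithmetic slips, so it is worth checking with exact fractions. This sketch takes the expected values and correlation matrix computed above as inputs; the variable names are illustrative.

```python
from fractions import Fraction as F

# Expected value vector and correlation matrix from the calculations above.
mu = [F(1, 4), F(1, 2), F(3, 4)]
R = [
    [F(1, 10), F(3, 20), F(1, 5)],
    [F(3, 20), F(3, 10), F(2, 5)],
    [F(1, 5), F(2, 5), F(3, 5)],
]

# Covariance matrix: C[i][j] = R[i][j] - mu[i] * mu[j].
C = [[R[i][j] - mu[i] * mu[j] for j in range(3)] for i in range(3)]

# Express C as (1/80) times an integer matrix.
scaled = [[int(80 * c) for c in row] for row in C]
print(scaled)
```

The scaled matrix should match the integer matrix in the last equation, and its symmetry is a quick visual confirmation that the cross terms were entered consistently.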
Quiz 5.7
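We observe that $X = AZ + b$ where
$$A = \begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix}, \qquad b = \begin{bmatrix} 2 \\ 0 \end{bmatrix}. \tag{1}$$
It follows from Theorem 5.18 that $\mu_X = b$ and that
$$C_X = AA' = \begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ 1 & 2 \end{bmatrix}. \tag{2}$$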
Quiz 5.8
First, we observe that $Y = AT$ where $A = [1/31\ \ 1/31\ \cdots\ 1/31]$ is a $1 \times 31$ row vector. Since T is a Gaussian random vector, Theorem 5.16 tells us that Y is a 1 dimensional Gaussian vector, i.e., just a Gaussian random variable. The expected value of Y is $\mu_Y = \mu_T = 80$. The covariance matrix of Y is $1 \times 1$ and is just equal to $\text{Var}[Y]$. Thus, by Theorem 5.16, $\text{Var}[Y] = AC_TA'$.
function p=julytemps(T);
[D1 D2]=ndgrid((1:31),(1:31));
CT=36./(1+abs(D1-D2));     % 31 x 31 covariance matrix of T
A=ones(31,1)/31.0;         % column of averaging weights 1/31
CY=(A')*CT*A;              % Var[Y]
p=phi((T-80)/sqrt(CY));    % P[Y < T] for each entry of T
In julytemps.m, the first two lines generate the 31 × 31 covariance matrix CT, or $C_T$. Next we calculate $\text{Var}[Y]$. The final step is to use the $\Phi(\cdot)$ function to calculate $P[Y < T]$.
Here is the output of julytemps.m:
>> julytemps([70 75 80 85 90 95])
ans =
0.0000 0.0221 0.5000 0.9779 1.0000 1.0000
Note that $P[Y \le 70]$ is not actually zero and that $P[Y \le 90]$ is not actually 1.0000. It's just that MATLAB's short format output, invoked with the command format short, rounds off those probabilities. Here is the long format output:
>> format long
>> julytemps([70 75 80 85 90 95])
ans =
Columns 1 through 4
0.00002844263128 0.02207383067604 0.50000000000000 0.97792616932396
Columns 5 through 6
0.99997155736872 0.99999999922010
The ndgrid function is a useful way to calculate many covariance matrices. However, in this problem, $C_T$ has a special structure; the i, jth element is
$$C_T(i,j) = c_{|i-j|} = \frac{36}{1 + |i - j|}. \tag{1}$$
If we write out the elements of the covariance matrix, we see that
$$C_T = \begin{bmatrix} c_0 & c_1 & \cdots & c_{30} \\ c_1 & c_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & c_1 \\ c_{30} & \cdots & c_1 & c_0 \end{bmatrix}. \tag{2}$$
This covariance matrix is known as a symmetric Toeplitz matrix. We will see in Chapters 9 and 11 that Toeplitz covariance matrices are quite common. In fact, MATLAB has a toeplitz function for generating them. The function julytemps2 uses the toeplitz function to generate the covariance matrix $C_T$.
function p=julytemps2(T);
c=36./(1+abs(0:30));       % first row of the Toeplitz covariance matrix
CT=toeplitz(c);
A=ones(31,1)/31.0;
CY=(A')*CT*A;
p=phi((T-80)/sqrt(CY));
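The same calculation can be sketched without MATLAB by exploiting the Toeplitz structure directly: only the 31 values $c_0, \ldots, c_{30}$ are needed. The following is an illustrative standard-library translation, not the authors' code; the names `phi` and `july_temps` are hypothetical, and `math.erf` stands in for the `phi` function from matcode.zip.

```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def july_temps(t, days=31, mean=80.0):
    """P[Y <= t] where Y is the average of `days` Gaussian temperatures
    with common mean `mean` and covariance C_T(i, j) = 36 / (1 + |i - j|)."""
    c = [36.0 / (1 + k) for k in range(days)]   # c[k] = c_{|i-j|}
    # Var[Y] = (1/days^2) * sum of all entries of the Toeplitz matrix.
    var_y = sum(c[abs(i - j)] for i in range(days) for j in range(days)) / days ** 2
    return phi((t - mean) / math.sqrt(var_y))

for t in (70, 75, 80, 85, 90, 95):
    print(t, july_temps(t))
```

Because the weights are all equal, $AC_TA'$ reduces to the average of the matrix entries, which is why no explicit matrix product appears in the sketch.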
Quiz Solutions – Chapter 6
Quiz 6.1
Let $K_1, \ldots, K_n$ denote a sequence of iid random variables each with PMF
$$P_K(k) = \begin{cases} 1/4 & k = 1, \ldots, 4 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
We can write $W_n$ in the form of $W_n = K_1 + \cdots + K_n$. First, we note that the first two moments of $K_i$ are
$$E[K_i] = (1 + 2 + 3 + 4)/4 = 2.5 \tag{2}$$
$$E[K_i^2] = (1^2 + 2^2 + 3^2 + 4^2)/4 = 7.5 \tag{3}$$
Thus the variance of $K_i$ is
$$\text{Var}[K_i] = E[K_i^2] - (E[K_i])^2 = 7.5 - (2.5)^2 = 1.25 \tag{4}$$
Since $E[K_i] = 2.5$, the expected value of $W_n$ is
$$E[W_n] = E[K_1] + \cdots + E[K_n] = nE[K_i] = 2.5n \tag{5}$$
Since the rolls are independent, the random variables $K_1, \ldots, K_n$ are independent. Hence, by Theorem 6.3, the variance of the sum equals the sum of the variances. That is,
$$\text{Var}[W_n] = \text{Var}[K_1] + \cdots + \text{Var}[K_n] = 1.25n \tag{6}$$
Quiz 6.2
Random variables X and Y have PDFs
$$f_X(x) = \begin{cases} 3e^{-3x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} 2e^{-2y} & y \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
Since X and Y are nonnegative, $W = X + Y$ is nonnegative. By Theorem 6.5, the PDF of $W = X + Y$ is
$$f_W(w) = \int_{-\infty}^{\infty} f_X(w - y) f_Y(y)\,dy = 6\int_0^w e^{-3(w-y)}e^{-2y}\,dy \tag{2}$$
Fortunately, this integral is easy to evaluate. For $w > 0$,
$$f_W(w) = 6e^{-3w}\left. e^{y}\right|_0^w = 6\left(e^{-2w} - e^{-3w}\right) \tag{3}$$
Since $f_W(w) = 0$ for $w < 0$, a complete expression for the PDF of W is
$$f_W(w) = \begin{cases} 6e^{-2w}\left(1 - e^{-w}\right) & w \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$
Quiz 6.3
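The MGF of K is
$$\phi_K(s) = E[e^{sK}] = \sum_{k=0}^{4} (0.2)e^{sk} = 0.2\left(1 + e^s + e^{2s} + e^{3s} + e^{4s}\right) \tag{1}$$
We find the moments by taking derivatives. The first derivative of $\phi_K(s)$ is
$$\frac{d\phi_K(s)}{ds} = 0.2\left(e^s + 2e^{2s} + 3e^{3s} + 4e^{4s}\right) \tag{2}$$
Evaluating the derivative at $s = 0$ yields
$$E[K] = \left.\frac{d\phi_K(s)}{ds}\right|_{s=0} = 0.2(1 + 2 + 3 + 4) = 2 \tag{3}$$
To find higher-order moments, we continue to take derivatives:
$$E[K^2] = \left.\frac{d^2\phi_K(s)}{ds^2}\right|_{s=0} = \left. 0.2\left(e^s + 4e^{2s} + 9e^{3s} + 16e^{4s}\right)\right|_{s=0} = 6 \tag{4}$$
$$E[K^3] = \left.\frac{d^3\phi_K(s)}{ds^3}\right|_{s=0} = \left. 0.2\left(e^s + 8e^{2s} + 27e^{3s} + 64e^{4s}\right)\right|_{s=0} = 20 \tag{5}$$
$$E[K^4] = \left.\frac{d^4\phi_K(s)}{ds^4}\right|_{s=0} = \left. 0.2\left(e^s + 16e^{2s} + 81e^{3s} + 256e^{4s}\right)\right|_{s=0} = 70.8 \tag{6}$$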
Quiz 6.4
(A) Each $K_i$ has MGF
$$\phi_K(s) = E[e^{sK_i}] = \frac{e^s + e^{2s} + \cdots + e^{ns}}{n} = \frac{e^s(1 - e^{ns})}{n(1 - e^s)} \tag{1}$$
Since the sequence of $K_i$ is independent, Theorem 6.8 says the MGF of J is
$$\phi_J(s) = (\phi_K(s))^m = \frac{e^{ms}(1 - e^{ns})^m}{n^m(1 - e^s)^m} \tag{2}$$
(B) Since the set of $\alpha^j X_j$ are independent Gaussian random variables, Theorem 6.10 says that W is a Gaussian random variable. Thus to find the PDF of W, we need only find the expected value and variance. Since the expectation of the sum equals the sum of the expectations:
$$E[W] = \alpha E[X_1] + \alpha^2 E[X_2] + \cdots + \alpha^n E[X_n] = 0 \tag{3}$$
Since the $\alpha^j X_j$ are independent, the variance of the sum equals the sum of the variances:
$$\text{Var}[W] = \alpha^2\,\text{Var}[X_1] + \alpha^4\,\text{Var}[X_2] + \cdots + \alpha^{2n}\,\text{Var}[X_n] \tag{4}$$
$$= \alpha^2 + 2(\alpha^2)^2 + 3(\alpha^2)^3 + \cdots + n(\alpha^2)^n \tag{5}$$
Defining $q = \alpha^2$, we can use Math Fact B.6 to write
$$\text{Var}[W] = \frac{\alpha^2 - \alpha^{2n+2}\left[1 + n(1 - \alpha^2)\right]}{(1 - \alpha^2)^2} \tag{6}$$
With $E[W] = 0$ and $\sigma_W^2 = \text{Var}[W]$, we can write the PDF of W as
$$f_W(w) = \frac{1}{\sqrt{2\pi\sigma_W^2}}\, e^{-w^2/2\sigma_W^2} \tag{7}$$
Quiz 6.5
(1) From Table 6.1, each $X_i$ has MGF $\phi_X(s)$ and random variable N has MGF $\phi_N(s)$ where
$$\phi_X(s) = \frac{1}{1 - s}, \qquad \phi_N(s) = \frac{\frac{1}{5}e^s}{1 - \frac{4}{5}e^s}. \tag{1}$$
From Theorem 6.12, R has MGF
$$\phi_R(s) = \phi_N(\ln\phi_X(s)) = \frac{\frac{1}{5}\phi_X(s)}{1 - \frac{4}{5}\phi_X(s)} \tag{2}$$
Substituting the expression for $\phi_X(s)$ yields
$$\phi_R(s) = \frac{\frac{1}{5}}{\frac{1}{5} - s}. \tag{3}$$
(2) From Table 6.1, we see that R has the MGF of an exponential (1/5) random variable. The corresponding PDF is
$$f_R(r) = \begin{cases} (1/5)e^{-r/5} & r \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
This quiz is an example of the general result that a geometric sum of exponential random variables is an exponential random variable.
Quiz 6.6
(1) The expected access time is
$$E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{12} \frac{x}{12}\,dx = 6 \text{ msec} \tag{1}$$
(2) The second moment of the access time is
$$E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^{12} \frac{x^2}{12}\,dx = 48 \tag{2}$$
The variance of the access time is $\text{Var}[X] = E[X^2] - (E[X])^2 = 48 - 36 = 12$.
(3) Using $X_i$ to denote the access time of block i, we can write
$$A = X_1 + X_2 + \cdots + X_{12} \tag{3}$$
Since the expectation of the sum equals the sum of the expectations,
$$E[A] = E[X_1] + \cdots + E[X_{12}] = 12E[X] = 72 \text{ msec} \tag{4}$$
(4) Since the $X_i$ are independent,
$$\text{Var}[A] = \text{Var}[X_1] + \cdots + \text{Var}[X_{12}] = 12\,\text{Var}[X] = 144 \tag{5}$$
Hence, the standard deviation of A is $\sigma_A = 12$.
(5) To use the central limit theorem, we write
$$P[A > 75] = 1 - P[A \le 75] \tag{6}$$
$$= 1 - P\left[\frac{A - E[A]}{\sigma_A} \le \frac{75 - E[A]}{\sigma_A}\right] \tag{7}$$
$$\approx 1 - \Phi\left(\frac{75 - 72}{12}\right) \tag{8}$$
$$= 1 - 0.5987 = 0.4013 \tag{9}$$
Note that we used Table 3.1 to look up $\Phi(0.25)$.
(6) Once again, we use the central limit theorem and Table 3.1 to estimate
$$P[A < 48] = P\left[\frac{A - E[A]}{\sigma_A} < \frac{48 - E[A]}{\sigma_A}\right] \tag{10}$$
$$\approx \Phi\left(\frac{48 - 72}{12}\right) \tag{11}$$
$$= 1 - \Phi(2) = 1 - 0.9773 = 0.0227 \tag{12}$$
Quiz 6.7
Random variable $K_n$ has a binomial distribution for n trials and success probability $P[V] = 3/4$.
(1) The expected number of voice calls out of 48 calls is $E[K_{48}] = 48P[V] = 36$.
(2) The variance of $K_{48}$ is
$$\text{Var}[K_{48}] = 48P[V](1 - P[V]) = 48(3/4)(1/4) = 9 \tag{1}$$
Thus $K_{48}$ has standard deviation $\sigma_{K_{48}} = 3$.
(3) Using the ordinary central limit theorem and Table 3.1 yields
$$P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 - 36}{3}\right) - \Phi\left(\frac{30 - 36}{3}\right) = \Phi(2) - \Phi(-2) \tag{2}$$
Recalling that $\Phi(-x) = 1 - \Phi(x)$, we have
$$P[30 \le K_{48} \le 42] \approx 2\Phi(2) - 1 = 0.9545 \tag{3}$$
(4) Since $K_{48}$ is a discrete random variable, we can use the De Moivre-Laplace approximation to estimate
$$P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 + 0.5 - 36}{3}\right) - \Phi\left(\frac{30 - 0.5 - 36}{3}\right) \tag{4}$$
$$= 2\Phi(2.16666) - 1 = 0.9697 \tag{5}$$
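To see how much the continuity correction matters, both approximations can be compared against the exact binomial sum. This is a minimal sketch under the stated model ($n = 48$, $p = 3/4$); the helper `phi` is an illustrative stand-in for a table lookup, and `math.comb` requires Python 3.8 or later.

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 48, 0.75
mu = n * p                              # E[K_48] = 36
sigma = math.sqrt(n * p * (1 - p))      # standard deviation = 3

# Ordinary central limit theorem approximation.
clt = phi((42 - mu) / sigma) - phi((30 - mu) / sigma)

# De Moivre-Laplace approximation (half-unit continuity correction).
dml = phi((42 + 0.5 - mu) / sigma) - phi((30 - 0.5 - mu) / sigma)

# Exact binomial probability P[30 <= K_48 <= 42].
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(30, 43))

print(clt, dml, exact)
```

Printing all three values side by side makes it easy to judge which approximation is closer for this particular n and p.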
Quiz 6.8
The train interarrival times $X_1$, $X_2$, $X_3$ are iid exponential ($\lambda$) random variables. The arrival time of the third train is
$$W = X_1 + X_2 + X_3. \tag{1}$$
In Theorem 6.11, we found that the sum of three iid exponential ($\lambda$) random variables is an Erlang ($n = 3$, $\lambda$) random variable. From Appendix A, we find that W has expected value and variance
$$E[W] = 3/\lambda = 6, \qquad \text{Var}[W] = 3/\lambda^2 = 12 \tag{2}$$
(1) By the Central Limit Theorem,
$$P[W > 20] = P\left[\frac{W - 6}{\sqrt{12}} > \frac{20 - 6}{\sqrt{12}}\right] \approx Q(7/\sqrt{3}) = 2.66 \times 10^{-5} \tag{3}$$
(2) To use the Chernoff bound, we note that the MGF of W is
$$\phi_W(s) = \left(\frac{\lambda}{\lambda - s}\right)^3 = \frac{1}{(1 - 2s)^3} \tag{4}$$
The Chernoff bound states that
$$P[W > 20] \le \min_{s \ge 0} e^{-20s}\phi_W(s) = \min_{s \ge 0} \frac{e^{-20s}}{(1 - 2s)^3} \tag{5}$$
To minimize $h(s) = e^{-20s}/(1 - 2s)^3$, we set the derivative of $h(s)$ to zero:
$$\frac{dh(s)}{ds} = \frac{-20(1 - 2s)^3e^{-20s} + 6e^{-20s}(1 - 2s)^2}{(1 - 2s)^6} = 0 \tag{6}$$
This implies $20(1 - 2s) = 6$ or $s = 7/20$. Substituting $s = 7/20$ into the Chernoff bound yields
$$P[W > 20] \le \left.\frac{e^{-20s}}{(1 - 2s)^3}\right|_{s = 7/20} = (10/3)^3e^{-7} = 0.0338 \tag{7}$$
(3) Theorem 3.11 says that for any $w > 0$, the CDF of the Erlang ($n = 3$, $\lambda$) random variable W satisfies
$$F_W(w) = 1 - \sum_{k=0}^{2} \frac{(\lambda w)^k e^{-\lambda w}}{k!} \tag{8}$$
Equivalently, for $\lambda = 1/2$ and $w = 20$,
$$P[W > 20] = 1 - F_W(20) \tag{9}$$
$$= e^{-10}\left(1 + \frac{10}{1!} + \frac{10^2}{2!}\right) = 61e^{-10} = 0.0028 \tag{10}$$
Although the Chernoff bound is relatively weak in that it overestimates the probability by roughly a factor of 12, it is a valid bound. By contrast, the Central Limit Theorem approximation grossly underestimates the true probability.
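The three estimates of $P[W > 20]$ can be placed side by side with a short script. This sketch assumes $\lambda = 1/2$ and $n = 3$ as above; the closed-form minimizer $s^* = \lambda - n/w$ follows from setting the derivative of $\ln h(s)$ to zero, and the variable names are illustrative.

```python
import math

lam, n, w = 0.5, 3, 20.0
mean, var = n / lam, n / lam ** 2       # E[W] = 6, Var[W] = 12

# Central limit theorem approximation: Q(z) = 0.5 * erfc(z / sqrt(2)).
z = (w - mean) / math.sqrt(var)
clt = 0.5 * math.erfc(z / math.sqrt(2.0))

# Chernoff bound: minimize e^{-sw} * (lam / (lam - s))^n over 0 <= s < lam.
s_star = lam - n / w                    # 7/20 for these parameters
chernoff = math.exp(-s_star * w) * (lam / (lam - s_star)) ** n

# Exact Erlang tail probability from the CDF formula.
exact = sum((lam * w) ** k * math.exp(-lam * w) / math.factorial(k) for k in range(n))

print(clt, chernoff, exact, chernoff / exact)
```

The last printed ratio quantifies how loose the Chernoff bound is, while comparing `clt` with `exact` shows that the Gaussian approximation is off by orders of magnitude this far into the tail.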
Quiz 6.9
One solution to this problem is to follow the approach of Example 6.19:
%unifbinom100.m
sx=0:100;sy=0:100;
px=binomialpmf(100,0.5,sx); py=duniformpmf(0,100,sy);
[SX,SY]=ndgrid(sx,sy); [PX,PY]=ndgrid(px,py);
SW=SX+SY; PW=PX.*PY;
sw=unique(SW); pw=finitepmf(SW,PW,sw);
pmfplot(sw,pw,'\itw','\itP_W(w)');
A graph of the PMF $P_W(w)$ appears in Figure 2. With some thought, it should be apparent that the finitepmf function is implementing the convolution of the two PMFs.
[Plot: $P_W(w)$ versus $w$ for $0 \le w \le 200$; vertical axis from 0 to 0.01.]
Figure 2: From Quiz 6.9, the PMF $P_W(w)$ of the independent sum of a binomial (100, 0.5) random variable and a discrete uniform (0, 100) random variable.
Quiz Solutions – Chapter 7
Quiz 7.1
An exponential random variable with expected value 1 also has variance 1. By Theorem 7.1, $M_n(X)$ has variance $\text{Var}[M_n(X)] = 1/n$. Hence, we need $n = 100$ samples.
Quiz 7.2
The arrival time of the third elevator is $W = X_1 + X_2 + X_3$. Since each $X_i$ is uniform (0, 30),
$$E[X_i] = 15, \qquad \text{Var}[X_i] = \frac{(30 - 0)^2}{12} = 75. \tag{1}$$
Thus $E[W] = 3E[X_i] = 45$, and $\text{Var}[W] = 3\,\text{Var}[X_i] = 225$.
(1) By the Markov inequality,
$$P[W > 75] \le \frac{E[W]}{75} = \frac{45}{75} = \frac{3}{5} \tag{2}$$
(2) By the Chebyshev inequality,
$$P[W > 75] = P[W - E[W] > 30] \tag{3}$$
$$\le P[|W - E[W]| > 30] \le \frac{\text{Var}[W]}{30^2} = \frac{225}{900} = \frac{1}{4} \tag{4}$$
Quiz 7.3
Define the random variable $W = (X - \mu_X)^2$. Observe that $V_{100}(X) = M_{100}(W)$. By Theorem 7.6, the mean square error is
$$E\left[(M_{100}(W) - \mu_W)^2\right] = \frac{\text{Var}[W]}{100} \tag{1}$$
Observe that $\mu_X = 0$ so that $W = X^2$. Thus,
$$\mu_W = E[X^2] = \int_{-1}^{1} x^2 f_X(x)\,dx = 1/3 \tag{2}$$
$$E[W^2] = E[X^4] = \int_{-1}^{1} x^4 f_X(x)\,dx = 1/5 \tag{3}$$
Therefore $\text{Var}[W] = E[W^2] - \mu_W^2 = 1/5 - (1/3)^2 = 4/45$ and the mean square error is $4/4500 = 0.000889$.
Quiz 7.4
Assuming the number n of samples is large, we can use a Gaussian approximation for $M_n(X)$. Since $E[X] = p$ and $\text{Var}[X] = p(1 - p)$, so that $\sigma_X = \sqrt{p(1 - p)}$, we apply Theorem 7.13, which says that the interval estimate
$$M_n(X) - c \le p \le M_n(X) + c \tag{1}$$
has confidence coefficient $1 - \alpha$ where
$$\alpha = 2 - 2\Phi\left(\frac{c\sqrt{n}}{\sqrt{p(1 - p)}}\right). \tag{2}$$
We must ensure for every value of p that $1 - \alpha \ge 0.9$ or $\alpha \le 0.1$. Equivalently, we must have
$$\Phi\left(\frac{c\sqrt{n}}{\sqrt{p(1 - p)}}\right) \ge 0.95 \tag{3}$$
for every value of p. Since $\Phi(x)$ is an increasing function of x, we must satisfy $c\sqrt{n} \ge 1.65\sqrt{p(1 - p)}$. Since $\sqrt{p(1 - p)} \le 1/2$ for all p, we require that
$$c \ge \frac{1.65}{2\sqrt{n}} = \frac{0.825}{\sqrt{n}}. \tag{4}$$
The 0.9 confidence interval estimate of p is
$$M_n(X) - \frac{0.825}{\sqrt{n}} \le p \le M_n(X) + \frac{0.825}{\sqrt{n}}. \tag{5}$$
For the 0.99 confidence interval, we have $\alpha \le 0.01$, implying $\Phi\left(c\sqrt{n}/\sqrt{p(1 - p)}\right) \ge 0.995$. This implies $c\sqrt{n} \ge 2.58\sqrt{p(1 - p)}$. Since $\sqrt{p(1 - p)} \le 1/2$ for all p, we require that $c \ge (0.5)(2.58)/\sqrt{n}$. In this case, the 0.99 confidence interval estimate is
$$M_n(X) - \frac{1.29}{\sqrt{n}} \le p \le M_n(X) + \frac{1.29}{\sqrt{n}}. \tag{6}$$
Note that if $M_{100}(X) = 0.4$, then the 0.99 confidence interval estimate is
$$0.271 \le p \le 0.529. \tag{7}$$
The interval is wide because the 0.99 confidence is high.
Quiz 7.5
Following the approach of bernoullitraces.m, we generate m = 1000 sample paths, each sample path having n = 100 Bernoulli trials. At time k, OK(k) counts the fraction of sample paths that have sample mean within one standard error of p. The program bernoullisample.m graphs the fraction of traces within one standard error as a function of the time, i.e., the number of trials in each trace.
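function OK=bernoullisample(n,m,p);
x=reshape(bernoullirv(p,m*n),n,m);   % n trials (rows) for each of m traces (columns)
nn=(1:n)'*ones(1,m);
MN=cumsum(x)./nn;                    % running sample mean of each trace
stderr=sqrt(p*(1-p))./sqrt((1:n)');  % standard error after k trials
stderrmat=stderr*ones(1,m);
OK=sum(abs(MN-p)<stderrmat,2)/m;     % fraction of traces within one standard error
plot(1:n,OK,'-s');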
The following graph was generated by bernoullisample(100,5000,0.5):
[Plot: fraction of traces within one standard error versus the number of trials, for 0 to 100 trials; vertical axis from 0.4 to 1.]
As we would expect, as m gets large, the fraction of traces within one standard error approaches $2\Phi(1) - 1 \approx 0.68$. The unusual sawtooth pattern, though perhaps unexpected, is examined in Problem 7.5.2.
Quiz Solutions – Chapter 8
Quiz 8.1
From the problem statement, each $X_i$ has PDF and CDF
$$f_{X_i}(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad F_{X_i}(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-x} & x \ge 0 \end{cases} \tag{1}$$
Hence, the CDF of the maximum of $X_1, \ldots, X_{15}$ obeys
$$F_X(x) = P[X \le x] = P[X_1 \le x, X_2 \le x, \cdots, X_{15} \le x] = \left[P[X_i \le x]\right]^{15}. \tag{2}$$
This implies that for $x \ge 0$,
$$F_X(x) = \left[F_{X_i}(x)\right]^{15} = \left[1 - e^{-x}\right]^{15} \tag{3}$$
To design a significance test, we must choose a rejection region for X. A reasonable choice is to reject the hypothesis if X is too small. That is, let $R = \{X \le r\}$. For a significance level of $\alpha = 0.01$, we obtain
$$\alpha = P[X \le r] = (1 - e^{-r})^{15} = 0.01 \tag{4}$$
It is straightforward to show that
$$r = -\ln\left[1 - (0.01)^{1/15}\right] = 1.33 \tag{5}$$
Hence, if we observe $X < 1.33$, then we reject the hypothesis.
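The threshold can be confirmed by solving the significance-level equation numerically and substituting back. This is a short sketch under the model above (the maximum of 15 iid exponential random variables with expected value 1); the variable names are illustrative.

```python
import math

alpha = 0.01   # significance level
n = 15         # number of iid exponential (1) samples

# Invert (1 - e^{-r})^n = alpha for the rejection threshold r.
r = -math.log(1.0 - alpha ** (1.0 / n))

# Substitute back: P[max <= r] should reproduce alpha.
check = (1.0 - math.exp(-r)) ** n

print(r, check)
```

The same two lines adapt to other significance levels or sample counts by changing `alpha` and `n`.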
Quiz 8.2
From the problem statement, the conditional PMFs of K are
$$P_{K|H_0}(k) = \begin{cases} \dfrac{(10^4)^k e^{-10^4}}{k!} & k = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
$$P_{K|H_1}(k) = \begin{cases} \dfrac{(10^6)^k e^{-10^6}}{k!} & k = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
Since the two hypotheses are equally likely, the MAP and ML tests are the same. From Theorem 8.6, the ML hypothesis rule is
$$k \in A_0 \text{ if } P_{K|H_0}(k) \ge P_{K|H_1}(k); \qquad k \in A_1 \text{ otherwise.} \tag{3}$$
This rule simplifies to
$$k \in A_0 \text{ if } k \le k^* = \frac{10^6 - 10^4}{\ln 100} = 214{,}975.7; \qquad k \in A_1 \text{ otherwise.} \tag{4}$$
Thus if we observe at least 214,976 photons, then we accept hypothesis $H_1$.
Quiz 8.3
For the QPSK system, a symbol error occurs when $s_i$ is transmitted but $(X_1, X_2) \in A_j$ for some $j \ne i$. For a QPSK system, it is easier to calculate the probability of a correct decision. Given $H_0$, the conditional probability of a correct decision is
$$P[C|H_0] = P[X_1 > 0, X_2 > 0|H_0] = P\left[\sqrt{E/2} + N_1 > 0, \sqrt{E/2} + N_2 > 0\right] \tag{1}$$
Because of the symmetry of the signals, $P[C|H_0] = P[C|H_i]$ for all i. This implies the probability of a correct decision is $P[C] = P[C|H_0]$. Since $N_1$ and $N_2$ are iid Gaussian $(0, \sigma)$ random variables, we have
$$P[C] = P[C|H_0] = P\left[\sqrt{E/2} + N_1 > 0\right] P\left[\sqrt{E/2} + N_2 > 0\right] \tag{2}$$
$$= \left(P\left[N_1 > -\sqrt{E/2}\right]\right)^2 \tag{3}$$
$$= \left[1 - \Phi\left(\frac{-\sqrt{E/2}}{\sigma}\right)\right]^2 \tag{4}$$
Since $\Phi(-x) = 1 - \Phi(x)$, we have $P[C] = \Phi^2\left(\sqrt{E/2\sigma^2}\right)$. Equivalently, the probability of error is
$$P_{ERR} = 1 - P[C] = 1 - \Phi^2\left(\sqrt{\frac{E}{2\sigma^2}}\right) \tag{5}$$
Quiz 8.4
To generate the ROC, the existing program sqdistor already calculates the miss probability $P_{MISS} = P_{01}$ and the false alarm probability $P_{FA} = P_{10}$. The modified program, sqdistroc.m, is essentially the same as sqdistor except the output is a matrix FM whose columns are the false alarm and miss probabilities. Next, the program sqdistrocplot.m calls sqdistroc three times to generate a plot that compares the receiver performance for the three requested values of d. Here is the modified code:
function FM=sqdistroc(v,d,m,T)
%square law distortion recvr
%P(error) for m bits tested
%transmit v volts or -v volts,
%add N volts, N is Gauss(0,1)
%add d(v+N)^2 distortion
%receive 1 if x>T, otherwise 0
%FM = [P(FA) P(MISS)]
x=(v+randn(m,1));
[XX,TT]=ndgrid(x,T(:));
P01=sum((XX+d*(XX.^2)< TT),1)/m;
x= -v+randn(m,1);
[XX,TT]=ndgrid(x,T(:));
P10=sum((XX+d*(XX.^2)>TT),1)/m;
FM=[P10(:) P01(:)];

function FM=sqdistrocplot(v,m,T);
FM1=sqdistroc(v,0.1,m,T);
FM2=sqdistroc(v,0.2,m,T);
FM5=sqdistroc(v,0.3,m,T);
FM=[FM1 FM2 FM5];
loglog(FM1(:,1),FM1(:,2),'-k', ...
FM2(:,1),FM2(:,2),'--k', ...
FM5(:,1),FM5(:,2),':k');
legend('\it d=0.1','\it d=0.2',...
'\it d=0.3',3)
ylabel('P_{MISS}');
xlabel('P_{FA}');
To see the effect of d, the commands
T=-3:0.1:3; sqdistrocplot(3,100000,T);
generated the plot shown in Figure 3.
[Plot: $P_{MISS}$ versus $P_{FA}$ on log-log axes from $10^{-5}$ to $10^{0}$, with curves for d = 0.1, d = 0.2 and d = 0.3, generated by T=-3:0.1:3; sqdistrocplot(3,100000,T);]
Figure 3: The receiver operating curve for the communications system of Quiz 8.4 with squared distortion.
Quiz Solutions – Chapter 9
Quiz 9.1
(1) First, we calculate the marginal PDF for $0 \le y \le 1$:
$$f_Y(y) = \int_0^y 2(y + x)\,dx = \left. 2xy + x^2\right|_{x=0}^{x=y} = 3y^2 \tag{1}$$
This implies the conditional PDF of X given Y is
$$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \begin{cases} \dfrac{2}{3y} + \dfrac{2x}{3y^2} & 0 \le x \le y \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
(2) The minimum mean square error estimate of X given $Y = y$ is
$$\hat{x}_M(y) = E[X|Y = y] = \int_0^y \left(\frac{2x}{3y} + \frac{2x^2}{3y^2}\right)dx = 5y/9 \tag{3}$$
Thus the MMSE estimator of X given Y is $\hat{X}_M(Y) = 5Y/9$.
(3) To obtain the conditional PDF $f_{Y|X}(y|x)$, we need the marginal PDF $f_X(x)$. For $0 \le x \le 1$,
$$f_X(x) = \int_x^1 2(y + x)\,dy = \left. y^2 + 2xy\right|_{y=x}^{y=1} = 1 + 2x - 3x^2 \tag{4}$$
For $0 \le x \le 1$, the conditional PDF of Y given X is
$$f_{Y|X}(y|x) = \begin{cases} \dfrac{2(y + x)}{1 + 2x - 3x^2} & x \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6}$$
(4) The MMSE estimate of Y given $X = x$ is
$$\hat{y}_M(x) = E[Y|X = x] = \int_x^1 \frac{2y^2 + 2xy}{1 + 2x - 3x^2}\,dy \tag{7}$$
$$= \left.\frac{2y^3/3 + xy^2}{1 + 2x - 3x^2}\right|_{y=x}^{y=1} \tag{8}$$
$$= \frac{2 + 3x - 5x^3}{3 + 6x - 9x^2} \tag{9}$$
Quiz 9.2
(1) Since the expectation of the sum equals the sum of the expectations,
$$E[R] = E[T] + E[X] = 0 \tag{1}$$
(2) Since T and X are independent, the variance of the sum $R = T + X$ is
$$\text{Var}[R] = \text{Var}[T] + \text{Var}[X] = 9 + 3 = 12 \tag{2}$$
(3) Since T and R have expected values $E[R] = E[T] = 0$,
$$\text{Cov}[T, R] = E[TR] = E[T(T + X)] = E[T^2] + E[TX] \tag{3}$$
Since T and X are independent and have zero expected value, $E[TX] = E[T]E[X] = 0$ and $E[T^2] = \text{Var}[T]$. Thus $\text{Cov}[T, R] = \text{Var}[T] = 9$.
(4) From Definition 4.8, the correlation coefficient of T and R is
$$\rho_{T,R} = \frac{\text{Cov}[T, R]}{\sqrt{\text{Var}[R]\,\text{Var}[T]}} = \frac{\sigma_T}{\sigma_R} = \sqrt{3}/2 \tag{4}$$
(5) From Theorem 9.4, the optimum linear estimate of T given R is
$$\hat{T}_L(R) = \rho_{T,R}\frac{\sigma_T}{\sigma_R}(R - E[R]) + E[T] \tag{5}$$
Since $E[R] = E[T] = 0$ and $\rho_{T,R} = \sigma_T/\sigma_R$,
$$\hat{T}_L(R) = \frac{\sigma_T^2}{\sigma_R^2}R = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_X^2}R = \frac{3}{4}R \tag{6}$$
Hence $a^* = 3/4$ and $b^* = 0$.
(6) By Theorem 9.4, the mean square error of the linear estimate is
$$e_L^* = \text{Var}[T](1 - \rho_{T,R}^2) = 9(1 - 3/4) = 9/4 \tag{7}$$
Quiz 9.3
When $R = r$, the conditional PDF of $X = Y - 40 - 40\log_{10} r$ is Gaussian with expected value $-40 - 40\log_{10} r$ and variance 64. The conditional PDF of X given R is
$$f_{X|R}(x|r) = \frac{1}{\sqrt{128\pi}}\, e^{-(x + 40 + 40\log_{10} r)^2/128} \tag{1}$$
From the conditional PDF $f_{X|R}(x|r)$, we can use Definition 9.2 to write the ML estimate of R given $X = x$ as
$$\hat{r}_{ML}(x) = \arg\max_{r \ge 0} f_{X|R}(x|r) \tag{2}$$
We observe that $f_{X|R}(x|r)$ is maximized when the exponent $(x + 40 + 40\log_{10} r)^2$ is minimized. This minimum occurs when the exponent is zero, yielding
$$\log_{10} r = -1 - x/40 \tag{3}$$
or
$$\hat{r}_{ML}(x) = (0.1)10^{-x/40} \text{ m} \tag{4}$$
If the result doesn't look correct, note that a typical figure for the signal strength might be $x = -120$ dB. This corresponds to a distance estimate of $\hat{r}_{ML}(-120) = 100$ m.
For the MAP estimate, we observe that the joint PDF of X and R is
$$f_{X,R}(x,r) = f_{X|R}(x|r) f_R(r) = \frac{1}{10^6\sqrt{32\pi}}\, r\, e^{-(x+40+40\log_{10} r)^2/128} \tag{5}$$
From Theorem 9.6, the MAP estimate of R given X = x is the value of r that maximizes $f_{X,R}(x,r)$. That is,
$$\hat{r}_{MAP}(x) = \arg\max_{0 \le r \le 1000} f_{X,R}(x,r) \tag{6}$$
Note that we have included the constraint r ≤ 1000 in the maximization to highlight the fact that under our probability model, R ≤ 1000 m. Setting the derivative of $f_{X,R}(x,r)$ with respect to r to zero yields
$$e^{-(x+40+40\log_{10} r)^2/128}\left[1 - \frac{80\log_{10} e}{128}(x+40+40\log_{10} r)\right] = 0 \tag{7}$$
Solving for r yields
$$r = 10^{\left(\frac{1}{25\log_{10} e} - 1\right)}\, 10^{-x/40} = (0.1236)10^{-x/40} \tag{8}$$
This is the MAP estimate of R given X = x as long as r ≤ 1000 m. When x < −156.3 dB, the above estimate will exceed 1000 m, which is not possible in our probability model. Hence, the complete description of the MAP estimate is
$$\hat{r}_{MAP}(x) = \begin{cases} 1000 & x < -156.3 \\ (0.1236)10^{-x/40} & x \ge -156.3 \end{cases} \tag{9}$$
For example, if x = −120 dB, then $\hat{r}_{MAP}(-120) = 123.6$ m. When the measured signal strength is not too low, the MAP estimate is 23.6% larger than the ML estimate. This reflects the fact that large values of R are a priori more probable than small values. However, for very low signal strengths, the MAP estimate takes into account that the distance can never exceed 1000 m.
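The two estimators are easy to compare numerically. This Python sketch simply codes Equations (4), (8) and (9); the names r_ml, r_map and MAP_SCALE are illustrative choices, not names from the text.

```python
import math


def r_ml(x_db):
    """ML distance estimate in metres for a measured signal strength x (dB)."""
    return 0.1 * 10.0 ** (-x_db / 40.0)


# Constant multiplying 10^(-x/40) in Equation (8); roughly 0.1236.
MAP_SCALE = 10.0 ** (1.0 / (25.0 * math.log10(math.e)) - 1.0)


def r_map(x_db, r_max=1000.0):
    """MAP distance estimate in metres, clipped at the model's maximum range."""
    return min(r_max, MAP_SCALE * 10.0 ** (-x_db / 40.0))
```

Calling r_ml(-120) and r_map(-120) reproduces the 100 m and 123.6 m figures quoted above, and any x below about −156.3 dB makes r_map return the 1000 m cap.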
Quiz 9.4
(1) From Theorem 9.4, the LMSE estimate of $X_2$ given $Y_2$ is $\hat{X}_2(Y_2) = a^* Y_2 + b^*$ where
$$a^* = \frac{\operatorname{Cov}[X_2,Y_2]}{\operatorname{Var}[Y_2]}, \qquad b^* = \mu_{X_2} - a^*\mu_{Y_2}. \tag{1}$$
Because E[X] = E[Y] = 0,
$$\operatorname{Cov}[X_2,Y_2] = E[X_2Y_2] = E[X_2(X_2+W_2)] = E[X_2^2] = 1 \tag{2}$$
$$\operatorname{Var}[Y_2] = \operatorname{Var}[X_2] + \operatorname{Var}[W_2] = E[X_2^2] + E[W_2^2] = 1.1 \tag{3}$$
It follows that $a^* = 1/1.1$. Because $\mu_{X_2} = \mu_{Y_2} = 0$, it follows that $b^* = 0$. Finally, to compute the expected square error, we calculate the correlation coefficient
$$\rho_{X_2,Y_2} = \frac{\operatorname{Cov}[X_2,Y_2]}{\sigma_{X_2}\sigma_{Y_2}} = \frac{1}{\sqrt{1.1}} \tag{4}$$
The expected square error is
$$e_L^* = \operatorname{Var}[X_2](1-\rho_{X_2,Y_2}^2) = 1 - \frac{1}{1.1} = \frac{1}{11} = 0.0909 \tag{5}$$
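(2) Since Y = X + W and E[X] = E[W] = 0, it follows that E[Y] = 0. Thus we can apply Theorem 9.7. Note that X and W have correlation matrices
$$\mathbf{R}_X = \begin{bmatrix} 1 & -0.9 \\ -0.9 & 1 \end{bmatrix}, \qquad \mathbf{R}_W = \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix}. \tag{6}$$
In terms of Theorem 9.7, n = 2 and we wish to estimate $X_2$ given the observation vector $\mathbf{Y} = [Y_1\; Y_2]'$. To apply Theorem 9.7, we need to find $\mathbf{R}_Y$ and $\mathbf{R}_{YX_2}$.
$$\mathbf{R}_Y = E[\mathbf{Y}\mathbf{Y}'] = E[(\mathbf{X}+\mathbf{W})(\mathbf{X}'+\mathbf{W}')] \tag{7}$$
$$= E[\mathbf{X}\mathbf{X}' + \mathbf{X}\mathbf{W}' + \mathbf{W}\mathbf{X}' + \mathbf{W}\mathbf{W}']. \tag{8}$$
Because X and W are independent, E[XW'] = E[X]E[W'] = 0. Similarly, E[WX'] = 0. This implies
$$\mathbf{R}_Y = E[\mathbf{X}\mathbf{X}'] + E[\mathbf{W}\mathbf{W}'] = \mathbf{R}_X + \mathbf{R}_W = \begin{bmatrix} 1.1 & -0.9 \\ -0.9 & 1.1 \end{bmatrix}. \tag{9}$$
In addition, we need to find
$$\mathbf{R}_{YX_2} = E[\mathbf{Y}X_2] = \begin{bmatrix} E[Y_1X_2] \\ E[Y_2X_2] \end{bmatrix} = \begin{bmatrix} E[(X_1+W_1)X_2] \\ E[(X_2+W_2)X_2] \end{bmatrix}. \tag{10}$$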
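Since X and W are independent vectors, $E[W_1X_2] = E[W_1]E[X_2] = 0$ and $E[W_2X_2] = 0$. Thus
$$\mathbf{R}_{YX_2} = \begin{bmatrix} E[X_1X_2] \\ E[X_2^2] \end{bmatrix} = \begin{bmatrix} -0.9 \\ 1 \end{bmatrix}. \tag{11}$$
By Theorem 9.7,
$$\hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX_2} = \begin{bmatrix} -0.225 \\ 0.725 \end{bmatrix} \tag{12}$$
Therefore, the optimum linear estimator of $X_2$ given $Y_1$ and $Y_2$ is
$$\hat{X}_L = \hat{\mathbf{a}}'\mathbf{Y} = -0.225Y_1 + 0.725Y_2. \tag{13}$$
The mean square error is
$$\operatorname{Var}[X_2] - \hat{\mathbf{a}}'\mathbf{R}_{YX_2} = \operatorname{Var}[X_2] - a_1 r_{Y_1,X_2} - a_2 r_{Y_2,X_2} = 0.0725. \tag{14}$$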
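The 2 × 2 system in Equation (12) is small enough to solve exactly by Cramer's rule. The Python sketch below takes the matrices of Equations (9) and (11) as given; solve_2x2 is a hypothetical helper written for this check.

```python
from fractions import Fraction as F


def solve_2x2(m, v):
    """Solve the 2x2 linear system m a = v by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


R_Y = [[F(11, 10), F(-9, 10)], [F(-9, 10), F(11, 10)]]   # Equation (9)
R_YX2 = [F(-9, 10), F(1)]                                # Equation (11)

a_hat = solve_2x2(R_Y, R_YX2)
# MSE = Var[X_2] - a_hat' R_YX2, with Var[X_2] = 1
mse = F(1) - sum(a * r for a, r in zip(a_hat, R_YX2))
```

Here a_hat comes out as [−9/40, 29/40], that is −0.225 and 0.725, and mse is 29/400 = 0.0725. Using the second observation reduces the error from the 0.0909 of part (1).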
Quiz 9.5
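Since X and W have zero expected value, Y also has zero expected value. Thus, by Theorem 9.7, $\hat{X}_L(\mathbf{Y}) = \hat{\mathbf{a}}'\mathbf{Y}$ where $\hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX}$. Since X and W are independent, E[WX] = 0 and E[XW'] = 0'. This implies
$$\mathbf{R}_{YX} = E[\mathbf{Y}X] = E[(\mathbf{1}X + \mathbf{W})X] = \mathbf{1}E[X^2] = \mathbf{1}. \tag{1}$$
By the same reasoning, the correlation matrix of Y is
$$\mathbf{R}_Y = E[\mathbf{Y}\mathbf{Y}'] = E[(\mathbf{1}X + \mathbf{W})(\mathbf{1}'X + \mathbf{W}')] \tag{2}$$
$$= \mathbf{1}\mathbf{1}'E[X^2] + \mathbf{1}E[X\mathbf{W}'] + E[\mathbf{W}X]\mathbf{1}' + E[\mathbf{W}\mathbf{W}'] \tag{3}$$
$$= \mathbf{1}\mathbf{1}' + \mathbf{R}_W \tag{4}$$
Note that $\mathbf{1}\mathbf{1}'$ is a 20 × 20 matrix with every entry equal to 1. Thus,
$$\hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX} = \left(\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right)^{-1}\mathbf{1} \tag{5}$$
and the optimal linear estimator is
$$\hat{X}_L(\mathbf{Y}) = \mathbf{1}'\left(\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right)^{-1}\mathbf{Y} \tag{6}$$
The mean square error is
$$e_L^* = \operatorname{Var}[X] - \hat{\mathbf{a}}'\mathbf{R}_{YX} = 1 - \mathbf{1}'\left(\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right)^{-1}\mathbf{1} \tag{7}$$
Now we note that $\mathbf{R}_W$ has i, jth entry $R_W(i,j) = c^{|i-j|-1}$. The question we must address is what value c minimizes $e_L^*$. This problem is atypical in that one does not usually get to choose the correlation structure of the noise. However, we will see that the answer is somewhat instructive.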
We note that the answer is not obviously apparent from Equation (7). In particular, we observe that $\operatorname{Var}[W_i] = R_W(i,i) = 1/c$. Thus, when c is small, the noises $W_i$ have high variance and we would expect our estimator to be poor. On the other hand, if c is large, $W_i$ and $W_j$ are highly correlated and the separate measurements of X are very dependent. This would suggest that large values of c will also result in poor MSE. If this argument is not clear, consider the extreme case in which every $W_i$ and $W_j$ have correlation coefficient $\rho_{ij} = 1$. In this case, our 20 measurements will be all the same and one measurement is as good as 20 measurements.
To find the optimal value of c, we write a MATLAB function mquiz9(c) to calculate the MSE for a given c and a second function mquiz9minc(c) that plots the MSE for a range of values of c and returns the minimizing value.
function [mse,af]=mquiz9(c);
% MSE and LMSE coefficients for noise parameter c
v1=ones(20,1);
RW=toeplitz(c.^((0:19)-1));   % R_W(i,j)=c^(|i-j|-1)
RY=(v1*(v1')) +RW;
af=(inv(RY))*v1;
mse=1-((v1')*af);

function cmin=mquiz9minc(c);
% plot the MSE over the vector c; return the minimizing c
msec=zeros(size(c));
for k=1:length(c),
    [msec(k),af]=mquiz9(c(k));
end
plot(c,msec);
xlabel('c');ylabel('e_L^*');
[msemin,optk]=min(msec);
cmin=c(optk);
Note in mquiz9 that v1 corresponds to the vector 1 of all ones. The following commands find the minimizing c and also produce a graph of the MSE:
>> c=0.01:0.01:0.99;
>> mquiz9minc(c)
ans =
    0.4500
[Figure: the mean square error $e_L^*$ plotted against c for 0 < c < 1.]
As we see in the graph, both small values and large values of c result in large MSE.
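For readers without MATLAB, the same search can be sketched in plain Python. This is not the text's mquiz9 code: it is a stand-in that builds $\mathbf{1}\mathbf{1}' + \mathbf{R}_W$ directly and solves the linear system with a small hand-written Gaussian elimination. The names solve, mse and best_c are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mse(c, n=20):
    """Equation (7): 1 - 1'(11' + R_W)^{-1} 1 with R_W(i,j) = c^(|i-j|-1)."""
    r_y = [[1.0 + c ** (abs(i - j) - 1) for j in range(n)] for i in range(n)]
    a_hat = solve(r_y, [1.0] * n)
    return 1.0 - sum(a_hat)


def best_c(grid):
    """Grid value of c with the smallest MSE."""
    return min(grid, key=mse)


grid = [k / 100 for k in range(1, 100)]   # same grid as c=0.01:0.01:0.99
```

On this grid best_c(grid) lands on 0.45, in agreement with the MATLAB output above. With a single measurement (n = 1) the formula collapses to 1/(1 + c), which gives a quick check on the implementation.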
Quiz Solutions – Chapter 10
Quiz 10.1
There are many correct answers to this question. A correct answer specifies enough
random variables to specify the sample path exactly. One choice for an alternate set of
random variables that would specify m(t, s) is
• m(0, s), the number of ongoing calls at the start of the experiment
• N, the number of new calls that arrive during the experiment
• $X_1, \ldots, X_N$, the interarrival times of the N new arrivals
• H, the number of calls that hang up during the experiment
• $D_1, \ldots, D_H$, the call completion times of the H calls that hang up
Quiz 10.2
(1) We obtain a continuous time, continuous valued process when we record the temperature as a continuous waveform over time.
(2) If at every moment in time, we round the temperature to the nearest degree, then we obtain a continuous time, discrete valued process.
(3) If we sample the process in part (1) every T seconds, then we obtain a discrete time, continuous valued process.
(4) Rounding the samples in part (3) to the nearest integer degree yields a discrete time, discrete valued process.
Quiz 10.3
(1) Each resistor has resistance R in ohms with uniform PDF
$$f_R(r) = \begin{cases} 0.01 & 950 \le r \le 1050 \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
The probability that a test produces a 1% resistor is
$$p = P[990 \le R \le 1010] = \int_{990}^{1010} (0.01)\,dr = 0.2 \tag{2}$$
(2) In t seconds, exactly t resistors are tested. Each resistor is a 1% resistor with probability p, independent of any other resistor. Consequently, the number of 1% resistors found has the binomial PMF
$$P_{N(t)}(n) = \begin{cases} \dbinom{t}{n} p^n (1-p)^{t-n} & n = 0, 1, \ldots, t \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
(3) First we will find the PMF of $T_1$. This problem is easy if we view each resistor test as an independent trial. A success occurs on a trial with probability p if we find a 1% resistor. The first 1% resistor is found at time $T_1 = t$ if we observe failures on trials 1, ..., t − 1 followed by a success on trial t. Hence, just as in Example 2.11, $T_1$ has the geometric PMF
$$P_{T_1}(t) = \begin{cases} (1-p)^{t-1}p & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
Since p = 0.2, the probability the first 1% resistor is found in exactly five seconds is $P_{T_1}(5) = (0.8)^4(0.2) = 0.08192$.
(4) From Theorem 2.5, a geometric random variable with success probability p has expected value 1/p. In this problem, $E[T_1] = 1/p = 5$.
(5) Note that once we find the first 1% resistor, the number of additional trials needed to find the second 1% resistor once again has a geometric PMF with expected value 1/p since each independent trial is a success with probability p. That is, $T_2 = T_1 + T'$ where $T'$ is independent and identically distributed to $T_1$. Thus
$$E[T_2|T_1 = 10] = E[T_1|T_1 = 10] + E[T'|T_1 = 10] \tag{5}$$
$$= 10 + E[T'] = 10 + 5 = 15 \tag{6}$$
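The numbers in this quiz can be checked with exact rational arithmetic. The Python sketch below assumes the uniform PDF of Equation (1) and one resistor tested per second; binomial_pmf and geometric_pmf are illustrative names for Equations (3) and (4).

```python
from fractions import Fraction
from math import comb

# p = P[990 <= R <= 1010] for R uniform on [950, 1050]
p = Fraction(1010 - 990, 1050 - 950)


def binomial_pmf(n, t, p):
    """P[N(t) = n]: n one-percent resistors among t independent tests."""
    if not 0 <= n <= t:
        return Fraction(0)
    return comb(t, n) * p ** n * (1 - p) ** (t - n)


def geometric_pmf(t, p):
    """P[T_1 = t]: first success occurs on trial t."""
    return (1 - p) ** (t - 1) * p if t >= 1 else Fraction(0)


expected_t1 = 1 / p                        # E[T_1]
expected_t2_given_t1 = 10 + expected_t1    # E[T_2 | T_1 = 10]
```

Here geometric_pmf(5, p) evaluates to 256/3125 = 0.08192, expected_t1 is 5 and expected_t2_given_t1 is 15, matching parts (3) through (5).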
Quiz 10.4
Since each $X_i$ is a N(0, 1) random variable, each $X_i$ has PDF
$$f_{X(i)}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \tag{1}$$
By Theorem 10.1, the joint PDF of $\mathbf{X} = [X_1\; \cdots\; X_n]'$ is
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X(1),\ldots,X(n)}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i) = \frac{1}{(2\pi)^{n/2}}\, e^{-(x_1^2+\cdots+x_n^2)/2} \tag{2}$$
Quiz 10.5
The first and second hours are nonoverlapping intervals. Since one hour equals 3600 sec and the Poisson process has a rate of 10 packets/sec, the expected number of packets in each hour is $E[M_i] = \alpha = 36{,}000$. This implies $M_1$ and $M_2$ are independent Poisson random variables each with PMF
$$P_{M_i}(m) = \begin{cases} \dfrac{\alpha^m e^{-\alpha}}{m!} & m = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
Since $M_1$ and $M_2$ are independent, the joint PMF of $M_1$ and $M_2$ is
$$P_{M_1,M_2}(m_1, m_2) = P_{M_1}(m_1) P_{M_2}(m_2) = \begin{cases} \dfrac{\alpha^{m_1+m_2} e^{-2\alpha}}{m_1!\, m_2!} & m_1 = 0, 1, \ldots;\; m_2 = 0, 1, \ldots, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Quiz 10.6
To answer whether $N'(t)$ is a Poisson process, we look at the interarrival times. Let $X_1, X_2, \ldots$ denote the interarrival times of the N(t) process. Since we count only even-numbered arrivals for $N'(t)$, the time until the first arrival of $N'(t)$ is $Y_1 = X_1 + X_2$. Since $X_1$ and $X_2$ are independent exponential (λ) random variables, $Y_1$ is an Erlang (n = 2, λ) random variable; see Theorem 6.11. Since $Y_i$, the ith interarrival time of the $N'(t)$ process, has the same PDF as $Y_1$, we can conclude that the interarrival times of $N'(t)$ are not exponential random variables. Thus $N'(t)$ is not a Poisson process.
Quiz 10.7
First, we note that for t > s,
$$X(t) - X(s) = \frac{W(t) - W(s)}{\sqrt{\alpha}} \tag{1}$$
Since W(t) − W(s) is a Gaussian random variable, Theorem 3.13 states that X(t) − X(s) is Gaussian with expected value
$$E[X(t) - X(s)] = \frac{E[W(t) - W(s)]}{\sqrt{\alpha}} = 0 \tag{2}$$
and variance
$$E\left[(X(t) - X(s))^2\right] = \frac{E\left[(W(t) - W(s))^2\right]}{\alpha} = \frac{\alpha(t-s)}{\alpha} = t - s \tag{3}$$
Consider $s' \le s < t$. Since $s \ge s'$, W(t) − W(s) is independent of $W(s')$. This implies $[W(t) - W(s)]/\sqrt{\alpha}$ is independent of $W(s')/\sqrt{\alpha}$ for all $s \ge s'$. That is, X(t) − X(s) is independent of $X(s')$ for all $s \ge s'$. Thus X(t) is a Brownian motion process with variance Var[X(t)] = t.
Quiz 10.8
First we find the expected value
$$\mu_Y(t) = \mu_X(t) + \mu_N(t) = \mu_X(t). \tag{1}$$
To find the autocorrelation, we observe that since X(t) and N(t) are independent and since N(t) has zero expected value, $E[X(t)N(t')] = E[X(t)]E[N(t')] = 0$. Since $R_Y(t,\tau) = E[Y(t)Y(t+\tau)]$, we have
$$R_Y(t,\tau) = E[(X(t) + N(t))(X(t+\tau) + N(t+\tau))] \tag{2}$$
$$= E[X(t)X(t+\tau)] + E[X(t)N(t+\tau)] + E[X(t+\tau)N(t)] + E[N(t)N(t+\tau)] \tag{3}$$
$$= R_X(t,\tau) + R_N(t,\tau). \tag{4}$$
Quiz 10.9
From Definition 10.14, $X_1, X_2, \ldots$ is a stationary random sequence if for all sets of time instants $n_1, \ldots, n_m$ and time offset k,
$$f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) \tag{1}$$
Since the random sequence is iid,
$$f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_X(x_1) f_X(x_2) \cdots f_X(x_m) \tag{2}$$
Similarly, for time instants $n_1+k, \ldots, n_m+k$,
$$f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) = f_X(x_1) f_X(x_2) \cdots f_X(x_m) \tag{3}$$
We can conclude that the iid random sequence is stationary.
Quiz 10.10
We must check whether each function R(τ) meets the conditions of Theorem 10.12:
$$R(0) \ge 0 \qquad R(\tau) = R(-\tau) \qquad |R(\tau)| \le R(0) \tag{1}$$
(1) $R_1(\tau) = e^{-|\tau|}$ meets all three conditions and thus is valid.
(2) $R_2(\tau) = e^{-\tau^2}$ also is valid.
(3) $R_3(\tau) = e^{-\tau}\cos\tau$ is not valid because
$$R_3(-2\pi) = e^{2\pi}\cos 2\pi = e^{2\pi} > 1 = R_3(0) \tag{2}$$
(4) $R_4(\tau) = e^{-\tau^2}\sin\tau$ also cannot be an autocorrelation function because
$$R_4(\pi/2) = e^{-\pi^2/4}\sin(\pi/2) = e^{-\pi^2/4} > 0 = R_4(0) \tag{3}$$
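The three conditions are simple to test on a grid of lags. The Python sketch below is only a screening tool: a function that fails on the grid is certainly not an autocorrelation, while passing on a finite grid proves nothing by itself. The helper violates_conditions and the grid spacing are assumptions made for illustration.

```python
import math


def violates_conditions(r, taus, tol=1e-12):
    """Return the first lag where a necessary condition fails, else None.

    Checks R(0) >= 0, R(tau) = R(-tau), and |R(tau)| <= R(0) on the grid.
    """
    r0 = r(0.0)
    if r0 < 0.0:
        return 0.0
    for tau in taus:
        if abs(r(tau) - r(-tau)) > tol or abs(r(tau)) > r0 + tol:
            return tau
    return None


taus = [k * 0.01 for k in range(1, 1001)]          # lags 0.01, ..., 10

r1 = lambda t: math.exp(-abs(t))
r2 = lambda t: math.exp(-t * t)
r3 = lambda t: math.exp(-t) * math.cos(t)
r4 = lambda t: math.exp(-t * t) * math.sin(t)

results = [violates_conditions(r, taus) for r in (r1, r2, r3, r4)]
```

For $R_1$ and $R_2$ the check finds no violation, while $R_3$ and $R_4$ already fail the symmetry test at the first lag on the grid.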
Quiz 10.11
(1) The autocorrelation of Y(t) is
$$R_Y(t,\tau) = E[Y(t)Y(t+\tau)] \tag{1}$$
$$= E[X(-t)X(-t-\tau)] \tag{2}$$
$$= R_X(-t-(-t-\tau)) = R_X(\tau) \tag{3}$$
Since $E[Y(t)] = E[X(-t)] = \mu_X$, we can conclude that Y(t) is a wide sense stationary process. In fact, we see that by viewing a process backwards in time, we see the same second order statistics.
(2) Since X(t) and Y(t) are both wide sense stationary processes, we can check whether they are jointly wide sense stationary by seeing if $R_{XY}(t,\tau)$ is just a function of τ. In this case,
$$R_{XY}(t,\tau) = E[X(t)Y(t+\tau)] \tag{4}$$
$$= E[X(t)X(-t-\tau)] \tag{5}$$
$$= R_X(t-(-t-\tau)) = R_X(2t+\tau) \tag{6}$$
Since $R_{XY}(t,\tau)$ depends on both t and τ, we conclude that X(t) and Y(t) are not jointly wide sense stationary. To see why this is, suppose $R_X(\tau) = e^{-|\tau|}$ so that samples of X(t) far apart in time have almost no correlation. In this case, as t gets larger, Y(t) = X(−t) and X(t) become less and less correlated.
Quiz 10.12
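From the problem statement,
$$E[X(t)] = E[X(t+1)] = 0 \tag{1}$$
$$E[X(t)X(t+1)] = 1/2 \tag{2}$$
$$\operatorname{Var}[X(t)] = \operatorname{Var}[X(t+1)] = 1 \tag{3}$$
The Gaussian random vector $\mathbf{X} = [X(t)\; X(t+1)]'$ has covariance matrix and corresponding inverse
$$\mathbf{C}_X = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix}, \qquad \mathbf{C}_X^{-1} = \frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix} \tag{4}$$
Since
$$\mathbf{x}'\mathbf{C}_X^{-1}\mathbf{x} = \begin{bmatrix} x_0 & x_1 \end{bmatrix} \frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{4}{3}\left(x_0^2 - x_0x_1 + x_1^2\right) \tag{5}$$
the joint PDF of X(t) and X(t+1) is the Gaussian vector PDF
$$f_{X(t),X(t+1)}(x_0, x_1) = \frac{1}{(2\pi)^{n/2}[\det(\mathbf{C}_X)]^{1/2}} \exp\left(-\frac{1}{2}\mathbf{x}'\mathbf{C}_X^{-1}\mathbf{x}\right) \tag{6}$$
$$= \frac{1}{\sqrt{3\pi^2}}\, e^{-\frac{2}{3}\left(x_0^2 - x_0x_1 + x_1^2\right)} \tag{7}$$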
Figure 4: Sample path of 100 minutes of the blocking switch of Quiz 10.13.
Quiz 10.13
The simple structure of the switch simulation of Example 10.28 admits a deceptively simple solution in terms of the vector of arrivals A and the vector of departures D. With the introduction of call blocking, we cannot generate these vectors all at once. In particular, when an arrival occurs at time t, we need to know that M(t), the number of ongoing calls, satisfies M(t) < c = 120. Otherwise, when M(t) = c, we must block the call. Call blocking can be implemented by setting the service time of the call to zero so that the call departs as soon as it arrives.
The blocking switch is an example of a discrete event system. The system evolves via a sequence of discrete events, namely arrivals and departures, at discrete time instances. A simulation of the system moves from one time instant to the next by maintaining a chronological schedule of future events (arrivals and departures) to be executed. The program simply executes the event at the head of the schedule. The logic of such a simulation is
1. Start at time t = 0 with an empty system. Schedule the first arrival to occur at $S_1$, an exponential (λ) random variable.
2. Examine the head-of-schedule event.
• When the head-of-schedule event is the kth arrival at time t, check the state M(t).
– If M(t) < c, admit the arrival, increase the system state n by 1, and schedule a departure to occur at time $t + S_k$, where the service time $S_k$ is an exponential (µ) random variable.
– If M(t) = c, block the arrival and do not schedule a departure event.
• If the head-of-schedule event is a departure, reduce the system state n by 1.
3. Delete the head-of-schedule event and go to step 2.
After the head-of-schedule event is completed and any new events (departures in this system) are scheduled, we know the system state cannot change until the next scheduled event. Thus we know that M(t) will stay the same until then. In our simulation, we use the vector t as the set of time instances at which we inspect the system state. Thus for all times t(i) between the current head-of-schedule event and the next, we set m(i) to the current switch state.
The complete program is shown in Figure 5. In most programming languages, it is common to implement the event schedule as a linked list where each item in the list has a data structure indicating an event timestamp and the type of the event. In MATLAB, a simple (but not elegant) way to do this is to maintain two vectors: time is a list of timestamps of scheduled events and event is the list of event types. In this case, event(i)=1 if the ith scheduled event is an arrival, or event(i)=-1 if the ith scheduled event is a departure.
When the program is passed a vector t, the output [m a b] is such that m(i) is the number of ongoing calls at time t(i) while a and b are the number of admits and blocks. The following instructions
t=0:0.1:5000;
[m,a,b]=simblockswitch(10,0.1,120,t);
plot(t,m);
generated a simulation lasting 5,000 minutes. A sample path of the first 100 minutes of that simulation is shown in Figure 4. The 5,000 minute full simulation produced a=49658 admitted calls and b=239 blocked calls. We can estimate the probability a call is blocked as
$$\hat{P}_b = \frac{b}{a+b} = 0.0048. \tag{1}$$
In Chapter 12, we will learn that the exact blocking probability is given by Equation (12.93), a result known as the “Erlang-B formula.” From the Erlang-B formula, we can calculate that the exact blocking probability is $P_b = 0.0057$. One reason our simulation underestimates the blocking probability is that in a 5,000 minute simulation, roughly the first 100 minutes are needed to load up the switch since the switch is idle when the simulation starts at time t = 0. However, this says that roughly the first two percent of the simulation time was unusual. Thus this would account for only part of the disparity. The rest of the gap between 0.0048 and 0.0057 arises because a simulation that includes only 239 blocks is not all that likely to give a very accurate result for the blocking probability.
Note that in Chapter 12, we will learn that the blocking switch is an example of an M/M/c/c queue, a kind of Markov chain. Chapter 12 develops techniques for analyzing and simulating systems described by Markov chains that are much simpler than the discrete event simulation technique shown here. Nevertheless, for very complicated systems, discrete event simulation is a widely used and often very efficient simulation method.
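The exact figure quoted above can be reproduced without the text's Equation (12.93) in front of us. The Python sketch below uses the standard Erlang-B recursion B(0) = 1, B(k) = ρB(k−1)/(k + ρB(k−1)), where ρ = λ/µ is the offered load. The function name erlang_b is a choice made here, and the simulation counts are the ones reported above.

```python
def erlang_b(load, servers):
    """Blocking probability of an M/M/c/c system with offered load lambda/mu."""
    b = 1.0                                   # with zero servers every call blocks
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)         # standard Erlang-B recursion
    return b


lam, mu, c = 10.0, 0.1, 120                   # arguments passed to simblockswitch
exact = erlang_b(lam / mu, c)                 # offered load of 100 Erlangs
estimate = 239 / (49658 + 239)                # b / (a + b) from the simulation
```

The recursion avoids the large factorials of the closed-form expression. With these inputs, exact comes out near 0.0057 and estimate near 0.0048, the two numbers compared in the text.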
function [M,admits,blocks]=simblockswitch(lam,mu,c,t);
blocks=0; %total # blocks
admits=0; %total # admits
M=zeros(size(t));
n=0; % # in system
time=[ exponentialrv(lam,1) ];
event=[ 1 ]; %first event is an arrival
timenow=0;
tmax=max(t);
while (timenow<tmax)
    M((timenow<=t)&(t<time(1)))=n;
    timenow=time(1);
    eventnow=event(1);
    event(1)=[ ]; time(1)= [ ]; % clear current event
    if (eventnow==1) % arrival
        arrival=timenow+exponentialrv(lam,1); % next arrival
        b4arrival=time<arrival;
        event=[event(b4arrival) 1 event(~b4arrival)];
        time=[time(b4arrival) arrival time(~b4arrival)];
        if n<c %call admitted
            admits=admits+1;
            n=n+1;
            depart=timenow+exponentialrv(mu,1);
            b4depart=time<depart;
            event=[event(b4depart) -1 event(~b4depart)];
            time=[time(b4depart) depart time(~b4depart)];
        else
            blocks=blocks+1; %one more block, immed departure
            disp(sprintf('Time %10.3f Admits %10d Blocks %10d',...
                timenow,admits,blocks));
        end
    elseif (eventnow==-1) %departure
        n=n-1;
    end
end
Figure 5: Discrete event simulation of the blocking switch of Quiz 10.13.
Quiz Solutions – Chapter 11
Quiz 11.1
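By Theorem 11.2,
$$\mu_Y = \mu_X \int_{-\infty}^{\infty} h(t)\,dt = 2\int_0^{\infty} e^{-t}\,dt = 2 \tag{1}$$
Since $R_X(\tau) = \delta(\tau)$, the autocorrelation function of the output is
$$R_Y(\tau) = \int_{-\infty}^{\infty} h(u) \int_{-\infty}^{\infty} h(v)\delta(\tau+u-v)\,dv\,du = \int_{-\infty}^{\infty} h(u)h(\tau+u)\,du \tag{2}$$
For τ > 0, we have
$$R_Y(\tau) = \int_0^{\infty} e^{-u}e^{-\tau-u}\,du = e^{-\tau}\int_0^{\infty} e^{-2u}\,du = \frac{1}{2}e^{-\tau} \tag{3}$$
For τ < 0, we can deduce that $R_Y(\tau) = \frac{1}{2}e^{-|\tau|}$ by symmetry. Just to be safe though, we can double check. For τ < 0,
$$R_Y(\tau) = \int_{-\tau}^{\infty} h(u)h(\tau+u)\,du = \int_{-\tau}^{\infty} e^{-u}e^{-\tau-u}\,du = \frac{1}{2}e^{\tau} \tag{4}$$
Hence,
$$R_Y(\tau) = \frac{1}{2}e^{-|\tau|} \tag{5}$$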
Quiz 11.2
The expected value of the output is
$$\mu_Y = \mu_X \sum_{n=-\infty}^{\infty} h_n = 0.5(1 + (-1)) = 0 \tag{1}$$
The autocorrelation of the output is
$$R_Y[n] = \sum_{i=0}^{1}\sum_{j=0}^{1} h_i h_j R_X[n+i-j] \tag{2}$$
$$= 2R_X[n] - R_X[n-1] - R_X[n+1] = \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
Since $\mu_Y = 0$, the variance of $Y_n$ is $\operatorname{Var}[Y_n] = E[Y_n^2] = R_Y[0] = 1$.
Figure 6: The autocorrelation $R_X(\tau)$ and power spectral density $S_X(f)$ for process X(t) in Quiz 11.5, for (a) W = 10 and (b) W = 1000.
Quiz 11.3
By Theorem 11.8, $\mathbf{Y} = [Y_{33}\; Y_{34}\; Y_{35}]'$ is a Gaussian random vector since $X_n$ is a Gaussian random process. Moreover, by Theorem 11.5, each $Y_n$ has expected value $E[Y_n] = \mu_X \sum_{n=-\infty}^{\infty} h_n = 0$. Thus E[Y] = 0. To find the PDF of the Gaussian vector Y, we need to find the covariance matrix $\mathbf{C}_Y$, which equals the correlation matrix $\mathbf{R}_Y$ since Y has zero expected value. One way to find $\mathbf{R}_Y$ is to observe that $\mathbf{R}_Y$ has the Toeplitz structure of Theorem 11.6 and to use Theorem 11.5 to find the autocorrelation function
$$R_Y[n] = \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} h_i h_j R_X[n+i-j]. \tag{1}$$
Despite the fact that $R_X[k]$ is an impulse, using Equation (1) is surprisingly tedious because we still need to sum over all i and j such that n + i − j = 0.
In this problem, it is simpler to observe that Y = HX where
$$\mathbf{X} = \begin{bmatrix} X_{30} & X_{31} & X_{32} & X_{33} & X_{34} & X_{35} \end{bmatrix}' \tag{2}$$
and
$$\mathbf{H} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}. \tag{3}$$
In this case, following Theorem 11.7, or by directly applying Theorem 5.13 with $\boldsymbol{\mu}_X = \mathbf{0}$ and A = H, we obtain $\mathbf{R}_Y = \mathbf{H}\mathbf{R}_X\mathbf{H}'$. Since $R_X[n] = \delta_n$, $\mathbf{R}_X = \mathbf{I}$, the identity matrix.
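Thus
$$\mathbf{C}_Y = \mathbf{R}_Y = \mathbf{H}\mathbf{H}' = \frac{1}{16}\begin{bmatrix} 4 & 3 & 2 \\ 3 & 4 & 3 \\ 2 & 3 & 4 \end{bmatrix}. \tag{4}$$
It follows (very quickly if you use MATLAB for 3 × 3 matrix inversion) that
$$\mathbf{C}_Y^{-1} = 16\begin{bmatrix} 7/12 & -1/2 & 1/12 \\ -1/2 & 1 & -1/2 \\ 1/12 & -1/2 & 7/12 \end{bmatrix}. \tag{5}$$
Thus, the PDF of Y is
$$f_{\mathbf{Y}}(\mathbf{y}) = \frac{1}{(2\pi)^{3/2}[\det(\mathbf{C}_Y)]^{1/2}} \exp\left(-\frac{1}{2}\mathbf{y}'\mathbf{C}_Y^{-1}\mathbf{y}\right). \tag{6}$$
A disagreeable amount of algebra will show $\det(\mathbf{C}_Y) = 3/1024$ and that the PDF can be “simplified” to
$$f_{\mathbf{Y}}(\mathbf{y}) = \frac{16}{\sqrt{6\pi^3}} \exp\left[-8\left(\frac{7}{12}y_{33}^2 + y_{34}^2 + \frac{7}{12}y_{35}^2 - y_{33}y_{34} + \frac{1}{6}y_{33}y_{35} - y_{34}y_{35}\right)\right]. \tag{7}$$
Equation (7) shows that one of the nicest features of the multivariate Gaussian distribution is that $\mathbf{y}'\mathbf{C}_Y^{-1}\mathbf{y}$ is a very concise representation of the cross-terms in the exponent of $f_{\mathbf{Y}}(\mathbf{y})$.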
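The "disagreeable amount of algebra" can be delegated to exact rational arithmetic. This Python sketch forms HH' from the matrix H of Equation (3), then computes the determinant and the inverse by cofactors. The helpers gram, det3 and inverse3 are hypothetical names, written only for 3 × 3 matrices.

```python
from fractions import Fraction as F

# Filter matrix H of Equation (3): each row averages four consecutive inputs.
H = [[F(1, 4) * v for v in row] for row in
     ([1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 1, 0], [0, 0, 1, 1, 1, 1])]


def gram(h):
    """Return h h' for a matrix h stored as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in h] for r1 in h]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inverse3(m):
    """Inverse of a 3x3 matrix as the transposed cofactor matrix over det."""
    d = det3(m)

    def cofactor(i, j):
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != j]
        minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                 - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
        return (-1) ** (i + j) * minor

    return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]


C_Y = gram(H)
C_Y_inv = inverse3(C_Y)
```

The results are C_Y with entries 4/16, 3/16 and 2/16 as in Equation (4), det3(C_Y) equal to 3/1024, and C_Y_inv equal to 16 times the matrix in Equation (5).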
Quiz 11.4
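This quiz is solved using Theorem 11.9 for the case of k = 1 and M = 2. In this case, $\mathbf{X}_n = [X_{n-1}\; X_n]'$ and
$$\mathbf{R}_{\mathbf{X}_n} = \begin{bmatrix} R_X[0] & R_X[1] \\ R_X[1] & R_X[0] \end{bmatrix} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix} \tag{1}$$
and
$$\mathbf{R}_{\mathbf{X}_nX_{n+1}} = E\left[\begin{bmatrix} X_{n-1} \\ X_n \end{bmatrix} X_{n+1}\right] = \begin{bmatrix} R_X[2] \\ R_X[1] \end{bmatrix} = \begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix}. \tag{2}$$
The MMSE linear first order filter for predicting $X_{n+1}$ at time n is the filter h such that
$$\overleftarrow{\mathbf{h}} = \mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_nX_{n+1}} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix}^{-1}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} = \frac{1}{400}\begin{bmatrix} 81 \\ 261 \end{bmatrix}. \tag{3}$$
It follows that the filter is $\mathbf{h} = [261/400\;\; 81/400]'$ and the MMSE linear predictor is
$$\hat{X}_{n+1} = \frac{81}{400}X_{n-1} + \frac{261}{400}X_n. \tag{4}$$
To find the mean square error, one approach is to follow the method of Example 11.13 and to directly calculate
$$e_L^* = E\left[(X_{n+1} - \hat{X}_{n+1})^2\right]. \tag{5}$$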
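This method is workable for this simple problem but becomes increasingly tedious for higher order filters. Instead, we can derive the mean square error for an arbitrary prediction filter h. Since $\hat{X}_{n+1} = \overleftarrow{\mathbf{h}}'\mathbf{X}_n$,
$$e_L^* = E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)^2\right] \tag{6}$$
$$= E\left[(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n)(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n)'\right] \tag{7}$$
$$= E\left[(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n)(X_{n+1} - \mathbf{X}_n'\overleftarrow{\mathbf{h}})\right] \tag{8}$$
After a bit of algebra, we obtain
$$e_L^* = R_X[0] - 2\overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_nX_{n+1}} + \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_n}\overleftarrow{\mathbf{h}} \tag{9}$$
With the substitution $\overleftarrow{\mathbf{h}} = \mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_nX_{n+1}}$, we obtain
$$e_L^* = R_X[0] - \mathbf{R}_{\mathbf{X}_nX_{n+1}}'\mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_nX_{n+1}} \tag{11}$$
$$= R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_nX_{n+1}} \tag{12}$$
Note that this is essentially the same result as Theorem 9.7 with $\mathbf{Y} = \mathbf{X}_n$, $X = X_{n+1}$ and $\hat{\mathbf{a}}' = \overleftarrow{\mathbf{h}}'$. It is noteworthy that the result is derived in a much simpler way in the proof of Theorem 9.7 by using the orthogonality property of the LMSE estimator.
In any case, the mean square error is
$$e_L^* = R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_nX_{n+1}} = 1.1 - \frac{1}{400}\begin{bmatrix} 81 & 261 \end{bmatrix}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} = \frac{13949}{40000} \approx 0.3487. \tag{13}$$
Recalling that the blind estimate would yield a mean square error of Var[X] = 1.1, we see that observing $X_{n-1}$ and $X_n$ improves the accuracy of our prediction of $X_{n+1}$.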
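The predictor coefficients and the mean square error follow from a 2 × 2 linear system, so they can be checked exactly. The Python sketch below assumes the autocorrelation values used in Equations (1) and (2), namely $R_X[0] = 1.1$, $R_X[1] = 0.9$ and $R_X[2] = 0.81$; the function name predictor is hypothetical.

```python
from fractions import Fraction as F


def predictor(r0, r1, r2):
    """Two-tap linear predictor of X_{n+1} from (X_{n-1}, X_n).

    Solves [[r0, r1], [r1, r0]] [h_prev, h_curr]' = [r2, r1]' by Cramer's rule
    and returns (h_prev, h_curr, mse), with mse = r0 - h' R_{X_n X_{n+1}}.
    """
    det = r0 * r0 - r1 * r1
    h_prev = (r0 * r2 - r1 * r1) / det     # coefficient on X_{n-1}
    h_curr = (r0 * r1 - r1 * r2) / det     # coefficient on X_n
    mse = r0 - (h_prev * r2 + h_curr * r1)
    return h_prev, h_curr, mse


h_prev, h_curr, mse = predictor(F(11, 10), F(9, 10), F(81, 100))
```

The coefficients come out as 81/400 and 261/400, and the exact error is 13949/40000, which rounds to the 0.3487 quoted in Equation (13).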
Quiz 11.5
(1) By Theorem 11.13(b), the average power of X(t) is
$$E[X^2(t)] = \int_{-\infty}^{\infty} S_X(f)\,df = \int_{-W}^{W} \frac{5}{W}\,df = 10 \text{ Watts} \tag{1}$$
(2) The autocorrelation function is the inverse Fourier transform of $S_X(f)$. Consulting Table 11.1, we note that
$$S_X(f) = 10\,\frac{1}{2W}\operatorname{rect}\left(\frac{f}{2W}\right) \tag{2}$$
It follows that the inverse transform of $S_X(f)$ is
$$R_X(\tau) = 10\operatorname{sinc}(2W\tau) = 10\,\frac{\sin(2\pi W\tau)}{2\pi W\tau} \tag{3}$$
(3) For W = 10 Hz and W = 1 kHz, graphs of $S_X(f)$ and $R_X(\tau)$ appear in Figure 6.
Quiz 11.6
In a sampled system, the discrete time impulse δ[n] has a flat discrete Fourier transform. That is, if $R_X[n] = 10\delta[n]$, then
$$S_X(\phi) = \sum_{n=-\infty}^{\infty} 10\delta[n]e^{-j2\pi\phi n} = 10 \tag{1}$$
Thus, $R_X[n] = 10\delta[n]$. (This quiz is really lame!)
Quiz 11.7
Since $Y(t) = X(t - t_0)$,
$$R_{XY}(t,\tau) = E[X(t)Y(t+\tau)] = E[X(t)X(t+\tau-t_0)] = R_X(\tau - t_0) \tag{1}$$
We see that $R_{XY}(t,\tau) = R_{XY}(\tau) = R_X(\tau - t_0)$. From Table 11.1, we recall the property that $g(\tau - \tau_0)$ has Fourier transform $G(f)e^{-j2\pi f\tau_0}$. Thus the Fourier transform of $R_{XY}(\tau) = R_X(\tau - t_0) = g(\tau - t_0)$ is
$$S_{XY}(f) = S_X(f)e^{-j2\pi ft_0}. \tag{2}$$
Quiz 11.8
We solve this quiz using Theorem 11.17. First we need some preliminary facts. Let $a_0 = 5{,}000$ so that
$$R_X(\tau) = \frac{1}{a_0}\,a_0e^{-a_0|\tau|}. \tag{1}$$
Consulting with the Fourier transforms in Table 11.1, we see that
$$S_X(f) = \frac{1}{a_0}\,\frac{2a_0^2}{a_0^2 + (2\pi f)^2} = \frac{2a_0}{a_0^2 + (2\pi f)^2} \tag{2}$$
The RC filter has impulse response $h(t) = a_1e^{-a_1t}u(t)$, where u(t) is the unit step function and $a_1 = 1/RC$ where $RC = 10^{-4}$ is the filter time constant. From Table 11.1,
$$H(f) = \frac{a_1}{a_1 + j2\pi f} \tag{3}$$
(1) By Theorem 11.17,
$$S_{XY}(f) = H(f)S_X(f) = \frac{2a_0a_1}{[a_1 + j2\pi f]\left[a_0^2 + (2\pi f)^2\right]}. \tag{4}$$
(2) Again by Theorem 11.17,
$$S_Y(f) = H^*(f)S_{XY}(f) = |H(f)|^2S_X(f). \tag{5}$$
Note that
$$|H(f)|^2 = H(f)H^*(f) = \frac{a_1}{(a_1 + j2\pi f)}\,\frac{a_1}{(a_1 - j2\pi f)} = \frac{a_1^2}{a_1^2 + (2\pi f)^2} \tag{6}$$
Thus,
$$S_Y(f) = |H(f)|^2S_X(f) = \frac{2a_0a_1^2}{\left[a_1^2 + (2\pi f)^2\right]\left[a_0^2 + (2\pi f)^2\right]} \tag{7}$$
(3) To find the average power at the filter output, we can either use basic calculus and calculate $\int_{-\infty}^{\infty} S_Y(f)\,df$ directly or we can find $R_Y(\tau)$ as an inverse transform of $S_Y(f)$. Using partial fractions and the Fourier transform table, the latter method is actually less algebra. In particular, some algebra will show that
$$S_Y(f) = \frac{K_0}{a_0^2 + (2\pi f)^2} + \frac{K_1}{a_1^2 + (2\pi f)^2} \tag{8}$$
where
$$K_0 = \frac{2a_0a_1^2}{a_1^2 - a_0^2}, \qquad K_1 = \frac{-2a_0a_1^2}{a_1^2 - a_0^2}. \tag{9}$$
Thus,
$$S_Y(f) = \frac{K_0}{2a_0^2}\,\frac{2a_0^2}{a_0^2 + (2\pi f)^2} + \frac{K_1}{2a_1^2}\,\frac{2a_1^2}{a_1^2 + (2\pi f)^2}. \tag{10}$$
Consulting with Table 11.1, we see that
$$R_Y(\tau) = \frac{K_0}{2a_0^2}\,a_0e^{-a_0|\tau|} + \frac{K_1}{2a_1^2}\,a_1e^{-a_1|\tau|} \tag{11}$$
Substituting the values of $K_0$ and $K_1$, we obtain
$$R_Y(\tau) = \frac{a_1^2e^{-a_0|\tau|} - a_0a_1e^{-a_1|\tau|}}{a_1^2 - a_0^2}. \tag{12}$$
The average power of the Y(t) process is
$$R_Y(0) = \frac{a_1}{a_1 + a_0} = \frac{2}{3}. \tag{13}$$
Note that the input signal has average power $R_X(0) = 1$. Since the RC filter has a 3dB bandwidth of 10,000 rad/sec and the signal X(t) has most of its signal energy below 5,000 rad/sec, the output signal has almost as much power as the input.
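The text mentions two routes to the output power: integrating $S_Y(f)$ directly, or evaluating $R_Y(0)$. The Python sketch below does both numerically as a cross-check, using the values $a_0 = 5{,}000$ and $a_1 = 10^4$ from the solution. The integration limit and step count are arbitrary choices made for this illustration.

```python
import math

A0 = 5000.0      # a_0 in the solution
A1 = 1.0e4       # a_1 = 1/RC with RC = 1e-4


def s_y(f):
    """Output power spectral density, Equation (7)."""
    w2 = (2.0 * math.pi * f) ** 2
    return 2.0 * A0 * A1 ** 2 / ((A1 ** 2 + w2) * (A0 ** 2 + w2))


def r_y(tau):
    """Output autocorrelation, Equation (12)."""
    t = abs(tau)
    return (A1 ** 2 * math.exp(-A0 * t)
            - A0 * A1 * math.exp(-A1 * t)) / (A1 ** 2 - A0 ** 2)


def average_power(f_max=1.0e6, steps=200000):
    """Midpoint-rule integral of S_Y(f) over [-f_max, f_max], using symmetry."""
    df = f_max / steps
    return 2.0 * df * sum(s_y((k + 0.5) * df) for k in range(steps))
```

Both r_y(0) and average_power() come out at 2/3 to within the integration error, which supports Equation (13). Because $S_Y(f)$ decays like $1/f^4$, truncating the integral at 1 MHz loses a negligible amount.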
Quiz 11.9
This quiz implements an example of Equations (11.146) and (11.147) for a system in which we filter Y(t) = X(t) + N(t) to produce an optimal linear estimate of X(t). The solution to this quiz is just to find the filter $\hat{H}(f)$ using Equation (11.146) and to calculate the mean square error $e_L^*$ using Equation (11.147).
Comment: Since the text omitted the derivations of Equations (11.146) and (11.147), we note that Example 10.24 showed that
$$R_Y(\tau) = R_X(\tau) + R_N(\tau), \qquad R_{YX}(\tau) = R_X(\tau). \tag{1}$$
Taking Fourier transforms, it follows that
$$S_Y(f) = S_X(f) + S_N(f), \qquad S_{YX}(f) = S_X(f). \tag{2}$$
Now we can go on to the quiz, at peace with the derivations.
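(1) Since $\mu_N = 0$, $R_N(0) = \operatorname{Var}[N] = 1$. This implies
$$R_N(0) = \int_{-\infty}^{\infty} S_N(f)\,df = \int_{-B}^{B} N_0\,df = 2N_0B \tag{3}$$
Thus $N_0 = 1/(2B)$. Because the noise process N(t) has constant power $R_N(0) = 1$, decreasing the single-sided bandwidth B increases the power spectral density of the noise over frequencies |f| < B.
(2) Since $R_X(\tau) = \operatorname{sinc}(2W\tau)$, where W = 5,000 Hz, we see from Table 11.1 that
$$S_X(f) = \frac{1}{10^4}\operatorname{rect}\left(\frac{f}{10^4}\right). \tag{4}$$
The noise power spectral density can be written as
$$S_N(f) = N_0\operatorname{rect}\left(\frac{f}{2B}\right) = \frac{1}{2B}\operatorname{rect}\left(\frac{f}{2B}\right), \tag{5}$$
From Equation (11.146), the optimal filter is
$$\hat{H}(f) = \frac{S_X(f)}{S_X(f) + S_N(f)} = \frac{\frac{1}{10^4}\operatorname{rect}\left(\frac{f}{10^4}\right)}{\frac{1}{10^4}\operatorname{rect}\left(\frac{f}{10^4}\right) + \frac{1}{2B}\operatorname{rect}\left(\frac{f}{2B}\right)}. \tag{6}$$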
(3) We produce the output $\hat{X}(t)$ by passing the noisy signal Y(t) through the filter $\hat{H}(f)$. From Equation (11.147), the mean square error of the estimate is
$$e_L^* = \int_{-\infty}^{\infty} \frac{S_X(f)S_N(f)}{S_X(f) + S_N(f)}\,df \tag{7}$$
$$= \int_{-\infty}^{\infty} \frac{\frac{1}{10^4}\operatorname{rect}\left(\frac{f}{10^4}\right)\frac{1}{2B}\operatorname{rect}\left(\frac{f}{2B}\right)}{\frac{1}{10^4}\operatorname{rect}\left(\frac{f}{10^4}\right) + \frac{1}{2B}\operatorname{rect}\left(\frac{f}{2B}\right)}\,df. \tag{8}$$
To evaluate the MSE $e_L^*$, we need to know whether B ≤ W. Since the problem asks us to find the largest possible B, let's suppose B ≤ W. We can go back and consider the case B > W later. When B ≤ W, the MSE is
$$e_L^* = \int_{-B}^{B} \frac{\frac{1}{10^4}\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{1 + \frac{5{,}000}{B}} \tag{9}$$
To obtain MSE $e_L^* \le 0.05$ requires B ≤ 5,000/19 = 263.16 Hz.
Although this completes the solution to the quiz, what is happening may not be obvious. The noise power is always Var[N] = 1 Watt, for all values of B. As B is decreased, the PSD $S_N(f)$ becomes increasingly tall, but only over a bandwidth B that is decreasing. Thus as B decreases, the filter $\hat{H}(f)$ makes an increasingly deep and narrow notch at frequencies |f| ≤ B. Two examples of the filter $\hat{H}(f)$ are shown in Figure 7. As B shrinks, the filter suppresses less of the signal of X(t). The result is that the MSE goes down.
Finally, we note that we can choose B very large and also achieve MSE $e_L^* = 0.05$. In particular, when B > W = 5000, $S_N(f) = 1/2B$ over frequencies |f| < W. In this case, the Wiener filter $\hat{H}(f)$ is an ideal (flat) lowpass filter
$$\hat{H}(f) = \begin{cases} \dfrac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} & |f| < 5{,}000, \\ 0 & \text{otherwise.} \end{cases} \tag{10}$$
Thus increasing B spreads the constant 1 watt of power of N(t) over more bandwidth. The Wiener filter removes the noise that is outside the band of the desired signal. The mean square error is
$$e_L^* = \int_{-5000}^{5000} \frac{\frac{1}{10^4}\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{\frac{B}{5000} + 1} \tag{11}$$
In this case, $B \ge 9.5 \times 10^4$ guarantees $e_L^* \le 0.05$.
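Both regimes of the MSE can be folded into one short function, which makes the two thresholds easy to confirm. The Python sketch below assumes the flat spectra of Equations (4) and (5); wiener_mse is a hypothetical name.

```python
W = 5000.0   # single-sided signal bandwidth in Hz


def wiener_mse(b, w=W):
    """MSE of the Wiener filter for noise bandwidth b, Equations (9) and (11).

    The integrand S_X S_N / (S_X + S_N) is constant where both spectra are
    nonzero, which is the band |f| < min(b, w), and zero elsewhere.
    """
    s_x = 1.0 / (2.0 * w)        # signal PSD level, 1e-4 for w = 5000
    s_n = 1.0 / (2.0 * b)        # noise PSD level N_0 = 1/(2B)
    overlap = 2.0 * min(b, w)    # width of the band where both are nonzero
    return overlap * s_x * s_n / (s_x + s_n)
```

Evaluating wiener_mse(5000 / 19) and wiener_mse(9.5e4) both give 0.05, and any B between the two thresholds yields a larger error, with the worst case of 0.5 at B = W.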
Quiz 11.10
It is fairly straightforward to find $S_X(\phi)$ and $S_Y(\phi)$. The only thing to keep in mind is to use fftc to transform the autocorrelation $R_X[k]$ into the power spectral density $S_X(\phi)$. The following MATLAB program generates and plots the functions shown in Figure 8.
Figure 7: Wiener filter $\hat{H}(f)$ for Quiz 11.9, for B = 500 and B = 2500.
%mquiz11.m
N=32;
rx=[2 4 2]; SX=fftc(rx,N); %autocorrelation and PSD
stem(0:N-1,abs(SX));
xlabel('n');ylabel('S_X(n/N)');
h2=0.5*[1 1]; H2=fft(h2,N); %impulse/filter response: M=2
SY2=SX.* ((abs(H2)).^2);
figure; stem(0:N-1,abs(SY2)); %PSD of Y for M=2
xlabel('n');ylabel('S_{Y_2}(n/N)');
h10=0.1*ones(1,10); H10=fft(h10,N); %impulse/filter response: M=10
SY10=SX.*((abs(H10)).^2);
figure; stem(0:N-1,abs(SY10)); %PSD of Y for M=10
xlabel('n');ylabel('S_{Y_{10}}(n/N)');
Relative to M = 2, when M = 10, the filter H(φ) filters out almost all of the high
frequency components of X(t). In the context of Example 11.26, the low pass moving
average filter for M = 10 removes the high frequency components and results in a filter
output that varies very slowly.
As an aside, note that the vectors SX, SY2 and SY10 in mquiz11 should all be real-valued
vectors. However, the finite numerical precision of MATLAB results in tiny imaginary
parts. Although these imaginary parts have no computational significance, they tend
to confuse the stem function. Hence, we generate stem plots of the magnitude of each
power spectral density.
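The same comparison between M = 2 and M = 10 can be sketched without the matcode library. The Python fragment below is an illustration only: it assumes the autocorrelation values 2, 4, 2 sit at lags k = −1, 0, 1, so that $S_X(\phi) = 4 + 4\cos(2\pi\phi)$, and it uses a hand-written DFT in place of fft and fftc. The names `dft` and `psd_in_out` are hypothetical.

```python
import cmath
import math


def dft(x, N):
    """N-point DFT of the sequence x, zero-padded to length N."""
    return [sum(x[k] * cmath.exp(-2j * math.pi * n * k / N) for k in range(len(x)))
            for n in range(N)]


def psd_in_out(M, N=32):
    """Return (S_X, S_Y) sampled at phi = n/N for an M-point moving-average filter."""
    # Assumed: R_X[k] = 2, 4, 2 at k = -1, 0, 1, so S_X(phi) = 4 + 4 cos(2 pi phi).
    SX = [4.0 + 4.0 * math.cos(2 * math.pi * n / N) for n in range(N)]
    H = dft([1.0 / M] * M, N)                      # filter frequency response
    SY = [sx * abs(h) ** 2 for sx, h in zip(SX, H)]
    return SX, SY


SX, SY2 = psd_in_out(2)
_, SY10 = psd_in_out(10)
print("total output power, M=2 :", sum(SY2) / 32)
print("total output power, M=10:", sum(SY10) / 32)
```

Because this version builds $S_X$ from a real closed form, no spurious imaginary parts appear; the M = 10 filter passes noticeably less total power than the M = 2 filter.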
[Figure 8: For Quiz 11.10, stem plots against n of $S_X(n/N)$, $S_Y(n/N)$ for M = 2, and $S_Y(n/N)$ for M = 10, using an N = 32 point DFT.]
Quiz Solutions – Chapter 12
Quiz 12.1
The system has two states depending on whether the previous packet was received in
error. From the problem statement, we are given the conditional probabilities
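\[
P[X_{n+1} = 0 \mid X_n = 0] = 0.99, \qquad P[X_{n+1} = 1 \mid X_n = 1] = 0.9. \tag{1}
\]
Since each $X_n$ must be either 0 or 1, we can conclude that
\[
P[X_{n+1} = 1 \mid X_n = 0] = 0.01, \qquad P[X_{n+1} = 0 \mid X_n = 1] = 0.1. \tag{2}
\]
These conditional probabilities correspond to the transition matrix and Markov chain:
[Markov chain diagram: states 0 and 1; transition 0 → 1 with probability 0.01 and 1 → 0 with probability 0.1; self-transitions with probability 0.99 at state 0 and 0.9 at state 1.]
\[
\mathbf{P} = \begin{bmatrix} 0.99 & 0.01 \\ 0.10 & 0.90 \end{bmatrix}. \tag{3}
\]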
Quiz 12.2
From the problem statement, the Markov chain and the transition matrix are
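[Markov chain diagram: states 0, 1 and 2; transitions 0 → 1 with probability 0.6, 1 → 0 with probability 0.2, 1 → 2 with probability 0.2 and 2 → 1 with probability 0.6; self-transitions with probability 0.4 at state 0, 0.6 at state 1 and 0.4 at state 2.]
\[
\mathbf{P} = \begin{bmatrix} 0.4 & 0.6 & 0 \\ 0.2 & 0.6 & 0.2 \\ 0 & 0.6 & 0.4 \end{bmatrix}. \tag{1}
\]
The eigenvalues of P are
\[
\lambda_1 = 0, \qquad \lambda_2 = 0.4, \qquad \lambda_3 = 1. \tag{2}
\]
We can diagonalize P into
\[
\mathbf{P} = \mathbf{S}^{-1}\mathbf{D}\mathbf{S} =
\begin{bmatrix} -0.6 & 0.5 & 1 \\ 0.4 & 0 & 1 \\ -0.6 & -0.5 & 1 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
\begin{bmatrix} -0.5 & 1 & -0.5 \\ 1 & 0 & -1 \\ 0.2 & 0.6 & 0.2 \end{bmatrix} \tag{3}
\]
where $\mathbf{s}_i$, the ith row of S, is the left eigenvector of P satisfying $\mathbf{s}_i \mathbf{P} = \lambda_i \mathbf{s}_i$. Algebra will
verify that the n-step transition matrix is
\[
\mathbf{P}^n = \mathbf{S}^{-1}\mathbf{D}^n\mathbf{S} =
\begin{bmatrix} 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \end{bmatrix}
+ (0.4)^n \begin{bmatrix} 0.5 & 0 & -0.5 \\ 0 & 0 & 0 \\ -0.5 & 0 & 0.5 \end{bmatrix}. \tag{4}
\]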
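The n-step formula can be checked against direct matrix multiplication. This is a small Python sketch using exact fractions; the helper names `matmul`, `matpow` and `closed_form` are illustrative and not from the text.

```python
from fractions import Fraction as F

# Transition matrix of Quiz 12.2 in exact arithmetic.
P = [[F(2, 5), F(3, 5), F(0)],
     [F(1, 5), F(3, 5), F(1, 5)],
     [F(0),    F(3, 5), F(2, 5)]]


def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matpow(A, n):
    """A**n for n >= 1 by repeated multiplication."""
    R = A
    for _ in range(n - 1):
        R = matmul(R, A)
    return R


def closed_form(n):
    """Limiting matrix plus (0.4)**n times the transient matrix, for n >= 1."""
    lim = [F(1, 5), F(3, 5), F(1, 5)]
    T = [[F(1, 2), F(0), F(-1, 2)],
         [F(0), F(0), F(0)],
         [F(-1, 2), F(0), F(1, 2)]]
    g = F(2, 5) ** n
    return [[lim[j] + g * T[i][j] for j in range(3)] for i in range(3)]


for n in (1, 2, 5):
    print(n, matpow(P, n) == closed_form(n))
```

Exact fractions avoid any doubt about rounding when the two sides are compared.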
Quiz 12.3
The Markov chain describing the factory status and the corresponding state transition
matrix are
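[Markov chain diagram: states 0, 1 and 2; self-transition at state 0 with probability 0.9; transitions 0 → 1 with probability 0.1, 1 → 2 with probability 1 and 2 → 0 with probability 1.]
\[
\mathbf{P} = \begin{bmatrix} 0.9 & 0.1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}. \tag{1}
\]
With $\boldsymbol{\pi} = [\pi_0\ \ \pi_1\ \ \pi_2]'$, the system of equations $\boldsymbol{\pi}' = \boldsymbol{\pi}'\mathbf{P}$ yields $\pi_1 = 0.1\pi_0$ and
$\pi_2 = \pi_1$. This implies
\[
\pi_0 + \pi_1 + \pi_2 = \pi_0(1 + 0.1 + 0.1) = 1. \tag{2}
\]
It follows that the limiting state probabilities are
\[
\pi_0 = 5/6, \qquad \pi_1 = 1/12, \qquad \pi_2 = 1/12. \tag{3}
\]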
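As a quick check, the stationary vector can be rebuilt from the two ratios and tested against the balance equations. The Python sketch below uses exact fractions; `stationary_factory` is a hypothetical helper name.

```python
from fractions import Fraction as F

P = [[F(9, 10), F(1, 10), F(0)],
     [F(0),     F(0),     F(1)],
     [F(1),     F(0),     F(0)]]


def stationary_factory():
    """Normalize the ratios pi_1 = 0.1 pi_0 and pi_2 = pi_1."""
    ratios = [F(1), F(1, 10), F(1, 10)]
    total = sum(ratios)
    return [r / total for r in ratios]


pi = stationary_factory()
# Left-multiply by P: (pi P)_j = sum_i pi_i P_ij.
pi_P = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
print(pi, pi_P == pi)
```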
Quiz 12.4
The communicating classes are
\[
C_1 = \{0, 1\}, \qquad C_2 = \{2, 3\}, \qquad C_3 = \{4, 5, 6\}. \tag{1}
\]
The states in $C_1$ and $C_3$ are aperiodic. The states in $C_2$ have period 2. Once the system
enters a state in $C_1$, the class $C_1$ is never left. Thus the states in $C_1$ are recurrent. That
is, $C_1$ is a recurrent class. Similarly, the states in $C_3$ are recurrent. On the other hand, the
states in $C_2$ are transient. Once the system exits $C_2$, the states in $C_2$ are never reentered.
Quiz 12.5
At any time t , the state n can take on the values 0, 1, 2, . . .. The state transition proba-
bilities are
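\[
P_{n-1,n} = P[K > n \mid K > n-1] = \frac{P[K > n]}{P[K > n-1]}, \tag{1}
\]
\[
P_{n-1,0} = P[K = n \mid K > n-1] = \frac{P[K = n]}{P[K > n-1]}. \tag{2}
\]
The Markov chain resembles
[Markov chain diagram: states 0, 1, 2, 3, 4, ...; from state 0 the chain moves to state k − 1 with probability P[K = k], including a self-transition with probability P[K = 1]; from each state n ≥ 1 it moves to state n − 1 with probability 1.]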
The stationary probabilities satisfy
\[
\pi_0 = \pi_0 P[K = 1] + \pi_1, \tag{4}
\]
\[
\pi_1 = \pi_0 P[K = 2] + \pi_2, \tag{5}
\]
\[
\vdots
\]
\[
\pi_{k-1} = \pi_0 P[K = k] + \pi_k, \qquad k = 1, 2, \ldots \tag{6}
\]
From Equation (4), we obtain
\[
\pi_1 = \pi_0(1 - P[K = 1]) = \pi_0 P[K > 1]. \tag{7}
\]
Similarly, Equation (5) implies
\[
\pi_2 = \pi_1 - \pi_0 P[K = 2] = \pi_0(P[K > 1] - P[K = 2]) = \pi_0 P[K > 2]. \tag{8}
\]
This suggests that $\pi_k = \pi_0 P[K > k]$. We verify this pattern by showing that $\pi_k = \pi_0 P[K > k]$ satisfies Equation (6):
\[
\pi_0 P[K > k-1] = \pi_0 P[K = k] + \pi_0 P[K > k]. \tag{9}
\]
When we apply $\sum_{k=0}^{\infty} \pi_k = 1$, we obtain $\pi_0 \sum_{k=0}^{\infty} P[K > k] = 1$. From Problem 2.5.11,
we recall that $\sum_{k=0}^{\infty} P[K > k] = E[K]$. This implies
\[
\pi_n = \frac{P[K > n]}{E[K]}. \tag{10}
\]
This Markov chain models repeated random countdowns. The system state is the time until
the counter expires. When the counter expires, the system is in state 0, and we randomly
reset the counter to a new value K = k and then we count down k units of time. Since we
spend one unit of time in each state, including state 0, we have k −1 units of time left after
the state 0 counter reset. If we have a random variable W such that the PMF of W satisfies
$P_W(n) = \pi_n$, then W has a discrete PMF representing the remaining time of the counter at
a time in the distant future.
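To make the result concrete, the stationary probabilities can be computed for a specific countdown PMF. The sketch below is in Python and assumes, purely for illustration, that K is uniform on {1, 2, 3, 4}; the function name `countdown_stationary` is hypothetical.

```python
from fractions import Fraction as F


def countdown_stationary(pmf):
    """Stationary probabilities pi_n = P[K > n] / E[K].

    pmf[k] holds P[K = k] for k = 1..len(pmf)-1, and pmf[0] must be 0.
    """
    kmax = len(pmf) - 1
    EK = sum(k * pmf[k] for k in range(1, kmax + 1))
    tail = [sum(pmf[n + 1:]) for n in range(kmax)]   # P[K > n], n = 0..kmax-1
    return [t / EK for t in tail]


# Assumed example PMF: K uniform on {1, 2, 3, 4}.
pmf = [F(0), F(1, 4), F(1, 4), F(1, 4), F(1, 4)]
pi = countdown_stationary(pmf)
print(pi)
```

With this PMF, E[K] = 5/2, and each $\pi_{k-1}$ equals $\pi_0 P[K = k] + \pi_k$, as the stationary equations require.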
Quiz 12.6
(1) By inspection, the number of transitions needed to return to state 0 is always a multiple
of 2. Thus the period of state 0 is d = 2.
(2) To find the stationary probabilities, we solve the system of equations $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ and
$\sum_{i=0}^{3} \pi_i = 1$:
\[
\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 \tag{1}
\]
\[
\pi_1 = (1/4)\pi_0 + (1/4)\pi_2 \tag{2}
\]
\[
\pi_2 = (1/4)\pi_1 + (3/4)\pi_3 \tag{3}
\]
\[
1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 \tag{4}
\]
Solving the second and third equations for $\pi_2$ and $\pi_3$ yields
\[
\pi_2 = 4\pi_1 - \pi_0, \qquad \pi_3 = (4/3)\pi_2 - (1/3)\pi_1 = 5\pi_1 - (4/3)\pi_0. \tag{5}
\]
Substituting $\pi_3$ back into the first equation yields
\[
\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 = (3/4)\pi_1 + (5/4)\pi_1 - (1/3)\pi_0. \tag{6}
\]
This implies $\pi_1 = (2/3)\pi_0$. It follows from the first and second equations that
$\pi_2 = (5/3)\pi_0$ and $\pi_3 = 2\pi_0$. Lastly, we choose $\pi_0$ so the state probabilities sum to 1:
\[
1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 = \pi_0\left(1 + \frac{2}{3} + \frac{5}{3} + 2\right) = \frac{16}{3}\pi_0. \tag{7}
\]
It follows that the state probabilities are
\[
\pi_0 = \frac{3}{16}, \qquad \pi_1 = \frac{2}{16}, \qquad \pi_2 = \frac{5}{16}, \qquad \pi_3 = \frac{6}{16}. \tag{8}
\]
(3) Since the system starts in state 0 at time 0, we can use Theorem 12.14 to find the
limiting probability that the system is in state 0 at time nd:
\[
\lim_{n\to\infty} P_{00}(nd) = d\pi_0 = \frac{3}{8}. \tag{9}
\]
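Both the stationary vector and the periodic limit can be checked numerically. The Python sketch below rebuilds a transition matrix from the balance equations; the entries $P_{03} = 3/4$ and $P_{23} = 3/4$ are not stated in the solution and are inferred here from the requirement that each row sums to 1, so treat the matrix as an assumption.

```python
from fractions import Fraction as F

q, t = F(1, 4), F(3, 4)
# Row i lists P_ij. Rows 1 and 3 follow from the balance equations;
# rows 0 and 2 are completed so that each row sums to 1 (assumed).
P = [[0, q, 0, t],
     [t, 0, q, 0],
     [0, q, 0, t],
     [q, 0, t, 0]]


def matmul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


pi = [F(3, 16), F(2, 16), F(5, 16), F(6, 16)]
pi_P = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

P2 = matmul(P, P)          # two-step transition matrix
P4 = matmul(P2, P2)        # four-step transition matrix
print(pi_P == pi, P2[0][0], P4[0][0])
```

For this matrix the two-step return probability to state 0 already equals 3/8, and odd-step return probabilities are zero, consistent with period d = 2.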
Quiz 12.7
The Markov chain has the same structure as that in Example 12.22. The only difference
is the modified transition rates:
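[Markov chain diagram: states 0, 1, 2, 3, 4, ...; transition 0 → 1 with probability 1; from each state n ≥ 1, transition n → n + 1 with probability $(n/(n+1))^\alpha$ and transition n → 0 with probability $1 - (n/(n+1))^\alpha$.]
The event $T_{00} > n$ occurs if the system reaches state n before returning to state 0, which
occurs with probability
\[
P[T_{00} > n] = 1 \times \left(\frac{1}{2}\right)^{\alpha} \times \left(\frac{2}{3}\right)^{\alpha} \times \cdots \times \left(\frac{n-1}{n}\right)^{\alpha} = \left(\frac{1}{n}\right)^{\alpha}. \tag{1}
\]
Thus the CDF of $T_{00}$ satisfies $F_{T_{00}}(n) = 1 - P[T_{00} > n] = 1 - 1/n^{\alpha}$. To determine whether
state 0 is recurrent, we observe that for all α > 0
\[
P[V_{00}] = \lim_{n\to\infty} F_{T_{00}}(n) = \lim_{n\to\infty} \left(1 - \frac{1}{n^{\alpha}}\right) = 1. \tag{2}
\]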
Thus state 0 is recurrent for all α > 0. Since the chain has only one communicating class,
all states are recurrent. (We also note that if α = 0, then all states are transient.)
To determine whether the chain is null recurrent or positive recurrent, we need to calculate
$E[T_{00}]$. In Example 12.24, we did this by deriving the PMF $P_{T_{00}}(n)$. In this problem,
it will be simpler to use the result of Problem 2.5.11 which says that $\sum_{k=0}^{\infty} P[K > k] = E[K]$
for any non-negative integer-valued random variable K. Applying this result, the
expected time to return to state 0 is
\[
E[T_{00}] = \sum_{n=0}^{\infty} P[T_{00} > n] = 1 + \sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}. \tag{3}
\]
For 0 < α ≤ 1, $1/n^{\alpha} \ge 1/n$ and it follows that
\[
E[T_{00}] \ge 1 + \sum_{n=1}^{\infty} \frac{1}{n} = \infty. \tag{4}
\]
We conclude that the Markov chain is null recurrent for 0 < α ≤ 1. On the other hand, for
α > 1,
\[
E[T_{00}] = 2 + \sum_{n=2}^{\infty} \frac{1}{n^{\alpha}}. \tag{5}
\]
Note that for all n ≥ 2
\[
\frac{1}{n^{\alpha}} \le \int_{n-1}^{n} \frac{dx}{x^{\alpha}}. \tag{6}
\]
This implies
\[
E[T_{00}] \le 2 + \sum_{n=2}^{\infty} \int_{n-1}^{n} \frac{dx}{x^{\alpha}} \tag{7}
\]
\[
= 2 + \int_{1}^{\infty} \frac{dx}{x^{\alpha}} \tag{8}
\]
\[
= 2 + \left.\frac{x^{-\alpha+1}}{-\alpha+1}\right|_{1}^{\infty} = 2 + \frac{1}{\alpha - 1} < \infty. \tag{9}
\]
Thus for all α > 1, the Markov chain is positive recurrent.
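The contrast between the two regimes shows up in partial sums of the series for $E[T_{00}]$. The Python sketch below is illustrative only; the function name `expected_return_partial` is an assumption, and a finite partial sum can suggest, but not prove, divergence.

```python
def expected_return_partial(alpha, terms):
    """Partial sum 1 + sum_{n=1}^{terms} n**(-alpha) of the series for E[T_00]."""
    return 1.0 + sum(n ** (-alpha) for n in range(1, terms + 1))


for alpha in (0.5, 1.0, 2.0, 3.0):
    s = expected_return_partial(alpha, 10000)
    if alpha > 1:
        bound = 2 + 1 / (alpha - 1)
        print(f"alpha = {alpha}: partial sum {s:.4f}, bound {bound:.4f}")
    else:
        print(f"alpha = {alpha}: partial sum {s:.4f} (keeps growing with more terms)")
```

For α > 1 every partial sum stays below $2 + 1/(\alpha - 1)$, while for α ≤ 1 the partial sums grow without bound as more terms are added.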
Quiz 12.8
The number of customers in the "friendly" store is given by the Markov chain
[Markov chain diagram: states 0, 1, ..., i, i + 1, ...; transitions i → i + 1 with probability p, transitions i + 1 → i with probability (1 − p)q, and self-transitions labeled (1 − p)(1 − q).]
In the above chain, we note that (1 − p)q is the probability that no new customer arrives,
an existing customer gets one unit of service and then departs the store.
By applying Theorem 12.13 with state space partitioned between $S = \{0, 1, \ldots, i\}$ and
$S' = \{i+1, i+2, \ldots\}$, we see that for any state i ≥ 0,
\[
\pi_i p = \pi_{i+1}(1 - p)q. \tag{1}
\]
This implies
\[
\pi_{i+1} = \frac{p}{(1 - p)q}\,\pi_i. \tag{2}
\]
Since Equation (2) holds for i = 0, 1, ..., we have that $\pi_i = \pi_0 \alpha^i$ where
\[
\alpha = \frac{p}{(1 - p)q}. \tag{3}
\]
Requiring the state probabilities to sum to 1, we have that for α < 1,
\[
\sum_{i=0}^{\infty} \pi_i = \pi_0 \sum_{i=0}^{\infty} \alpha^i = \frac{\pi_0}{1 - \alpha} = 1. \tag{4}
\]
Thus for α < 1, the limiting state probabilities are
\[
\pi_i = (1 - \alpha)\alpha^i, \qquad i = 0, 1, 2, \ldots \tag{5}
\]
In addition, for α ≥ 1 or, equivalently, p ≥ q/(1 + q), the limiting state probabilities do
not exist.
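A short numerical check of the geometric solution follows. This Python sketch is an illustration under the chain's stated transition probabilities; the function name `store_stationary` and the sample values p = 0.2, q = 0.5 are assumptions chosen so that α = 0.5.

```python
def store_stationary(p, q, n_states):
    """First n_states limiting probabilities, or None when alpha >= 1."""
    alpha = p / ((1 - p) * q)
    if alpha >= 1:
        return None                      # no limiting distribution exists
    return [(1 - alpha) * alpha ** i for i in range(n_states)]


pi = store_stationary(0.2, 0.5, 60)
print(pi[:4], sum(pi))
# Balance across the cut between states i and i+1: pi_i p = pi_{i+1} (1-p) q.
print(all(abs(pi[i] * 0.2 - pi[i + 1] * 0.8 * 0.5) < 1e-12 for i in range(59)))
```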
Quiz 12.9
The continuous time Markov chain describing the processor is
[Markov chain diagram: states 0, 1, 2, 3, 4; transitions n → n + 1 at rate 2 for n = 0, ..., 3; transitions n → n − 1 at rate 3 for n = 2, 3, 4; transition 1 → 0 at rate 3.1; transitions n → 0 at rate 0.1 for n = 2, 3, 4.]
Note that $q_{10} = 3.1$ since the task completes at rate 3 per msec and the processor reboots
at rate 0.1 per msec and the rate to state 0 is the sum of those two rates. From the Markov
chain, we obtain the following useful equations for the stationary distribution.
\[
5.1p_1 = 2p_0 + 3p_2, \qquad 5.1p_2 = 2p_1 + 3p_3,
\]
\[
5.1p_3 = 2p_2 + 3p_4, \qquad 3.1p_4 = 2p_3.
\]
We can solve these equations by working backward and solving for $p_4$ in terms of $p_3$, $p_3$
in terms of $p_2$ and so on, yielding
\[
p_4 = \frac{20}{31}p_3, \qquad p_3 = \frac{620}{981}p_2, \qquad p_2 = \frac{19{,}620}{31{,}431}p_1, \qquad p_1 = \frac{628{,}620}{1{,}014{,}381}p_0. \tag{1}
\]
Applying $p_0 + p_1 + p_2 + p_3 + p_4 = 1$ yields $p_0 = 1{,}014{,}381/2{,}443{,}401$ and the
stationary probabilities are
\[
p_0 = 0.4151, \quad p_1 = 0.2573, \quad p_2 = 0.1606, \quad p_3 = 0.1015, \quad p_4 = 0.0655. \tag{2}
\]
Quiz 12.10
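The M/M/c/∞ queue has Markov chain
[Markov chain diagram: states 0, 1, ..., c, c + 1, ...; transitions n → n + 1 at rate λ; transitions n → n − 1 at rate nµ for n ≤ c and at rate cµ for n > c.]
From the Markov chain, with ρ = λ/µ, the stationary probabilities must satisfy
\[
p_n = \begin{cases} (\rho/n)\,p_{n-1} & n = 1, 2, \ldots, c, \\ (\rho/c)\,p_{n-1} & n = c+1, c+2, \ldots \end{cases} \tag{1}
\]
It is straightforward to show that this implies
\[
p_n = \begin{cases} p_0\,\rho^n/n! & n = 1, 2, \ldots, c, \\ p_0\,(\rho/c)^{n-c}\rho^c/c! & n = c+1, c+2, \ldots \end{cases} \tag{2}
\]
The requirement that $\sum_{n=0}^{\infty} p_n = 1$ yields
\[
p_0 = \left(\sum_{n=0}^{c} \frac{\rho^n}{n!} + \frac{\rho^c}{c!}\,\frac{\rho/c}{1 - \rho/c}\right)^{-1}. \tag{3}
\]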
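The normalization can be sanity-checked numerically. This Python sketch assumes ρ = λ/µ and ρ < c; the function names `mmc_p0` and `mmc_pn` are illustrative.

```python
import math


def mmc_p0(rho, c):
    """Empty-system probability p_0 of the M/M/c queue; requires rho < c."""
    if rho >= c:
        raise ValueError("need rho/c < 1 for a stationary distribution")
    head = sum(rho ** n / math.factorial(n) for n in range(c + 1))
    tail = (rho ** c / math.factorial(c)) * (rho / c) / (1 - rho / c)
    return 1.0 / (head + tail)


def mmc_pn(rho, c, n):
    """Stationary probability of n customers in the system."""
    p0 = mmc_p0(rho, c)
    if n <= c:
        return p0 * rho ** n / math.factorial(n)
    return p0 * (rho / c) ** (n - c) * rho ** c / math.factorial(c)


print(mmc_p0(0.5, 1))                                  # single server: 1 - rho
print(sum(mmc_pn(2.0, 3, n) for n in range(400)))      # close to 1
```

With c = 1 the formula reduces to the familiar single-server result $p_0 = 1 - \rho$.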

Quiz Solutions – Chapter 1
Quiz 1.1
In the Venn diagrams for parts (1)-(6) below, the shaded area represents the indicated set.

[Venn diagrams: six panels, each showing the sets M, T and O, with the shaded region for (1) $R = T^c$, (2) $M \cup O$, (3) $M \cap O$, (4) $R \cup M$, (5) $R \cap M$, (6) $T^c - M$.]

Quiz 1.2
(1) $A_1$ = {vvv, vvd, vdv, vdd}
(2) $B_1$ = {dvv, dvd, ddv, ddd}
(3) $A_2$ = {vvv, vvd, dvv, dvd}
(4) $B_2$ = {vdv, vdd, ddv, ddd}
(5) $A_3$ = {vvv, ddd}
(6) $B_3$ = {vdv, dvd}
(7) $A_4$ = {vvv, vvd, vdv, dvv, vdd, dvd, ddv}
(8) $B_4$ = {ddd, ddv, dvd, vdd}

Recall that $A_i$ and $B_i$ are collectively exhaustive if $A_i \cup B_i = S$. Also, $A_i$ and $B_i$ are mutually exclusive if $A_i \cap B_i = \phi$. Since we have written down each pair $A_i$ and $B_i$ above, we can simply check for these properties. The pair $A_1$ and $B_1$ are mutually exclusive and collectively exhaustive. The pair $A_2$ and $B_2$ are mutually exclusive and collectively exhaustive. The pair $A_3$ and $B_3$ are mutually exclusive but not collectively exhaustive. The pair $A_4$ and $B_4$ are not mutually exclusive since dvd belongs to $A_4$ and $B_4$. However, $A_4$ and $B_4$ are collectively exhaustive.

Quiz 1.3
There are exactly 50 equally likely outcomes: $s_{51}$ through $s_{100}$. Each of these outcomes has probability 0.02.
(1) $P[\{s_{79}\}] = 0.02$
(2) $P[\{s_{100}\}] = 0.02$
(3) $P[A] = P[\{s_{90}, \ldots, s_{100}\}] = 11 \times 0.02 = 0.22$
(4) $P[F] = P[\{s_{51}, \ldots, s_{59}\}] = 9 \times 0.02 = 0.18$
(5) $P[T \ge 80] = P[\{s_{80}, \ldots, s_{100}\}] = 21 \times 0.02 = 0.42$
(6) $P[T < 90] = P[\{s_{51}, s_{52}, \ldots, s_{89}\}] = 39 \times 0.02 = 0.78$
(7) $P[\text{a C grade or better}] = P[\{s_{70}, \ldots, s_{100}\}] = 31 \times 0.02 = 0.62$
(8) $P[\text{student passes}] = P[\{s_{60}, \ldots, s_{100}\}] = 41 \times 0.02 = 0.82$

Quiz 1.4
We can describe this experiment by the event space consisting of the four possible events VB, VL, DB, and DL. We represent these events in the table:

        V       D
L     0.35      ?
B       ?       ?

In a roundabout way, the problem statement tells us how to fill in the table. In particular,
\[
P[V] = 0.7 = P[VL] + P[VB] \tag{1}
\]
\[
P[L] = 0.6 = P[VL] + P[DL] \tag{2}
\]

Since P[VL] = 0.35, we can conclude that P[VB] = 0.35 and that P[DL] = 0.6 − 0.35 = 0.25. This allows us to fill in two more table entries:

        V       D
L     0.35    0.25
B     0.35      ?

The remaining table entry is filled in by observing that the probabilities must sum to 1. This implies P[DB] = 0.05 and the complete table is

        V       D
L     0.35    0.25
B     0.35    0.05

Finding the various probabilities is now straightforward:

(1) P[DL] = 0.25
(2) $P[D \cup L] = P[VL] + P[DL] + P[DB] = 0.35 + 0.25 + 0.05 = 0.65$
(3) P[VB] = 0.35
(4) $P[V \cup L] = P[V] + P[L] - P[VL] = 0.7 + 0.6 - 0.35 = 0.95$
(5) $P[V \cup D] = P[S] = 1$
(6) $P[LB] = P[LL^c] = 0$

Quiz 1.5
(1) The probability of exactly two voice calls is
\[
P[N_V = 2] = P[\{vvd, vdv, dvv\}] = 0.3.
\]
(2) The probability of at least one voice call is
\[
P[N_V \ge 1] = P[\{vdd, dvd, ddv, vvd, vdv, dvv, vvv\}] = 6(0.1) + 0.2 = 0.8.
\]
An easier way to get the same answer is to observe that
\[
P[N_V \ge 1] = 1 - P[N_V < 1] = 1 - P[N_V = 0] = 1 - P[\{ddd\}] = 0.8.
\]

(3) The conditional probability of two voice calls followed by a data call given that there were two voice calls is
\[
P[\{vvd\} \mid N_V = 2] = \frac{P[\{vvd\}, N_V = 2]}{P[N_V = 2]} = \frac{P[\{vvd\}]}{P[N_V = 2]} = \frac{0.1}{0.3} = \frac{1}{3}.
\]
(4) The conditional probability of two data calls followed by a voice call given there were two voice calls is
\[
P[\{ddv\} \mid N_V = 2] = \frac{P[\{ddv\}, N_V = 2]}{P[N_V = 2]} = 0.
\]
The joint event of the outcome ddv and exactly two voice calls has probability zero since there is only one voice call in the outcome ddv.
(5) The conditional probability of exactly two voice calls given at least one voice call is
\[
P[N_V = 2 \mid N_V \ge 1] = \frac{P[N_V = 2, N_V \ge 1]}{P[N_V \ge 1]} = \frac{P[N_V = 2]}{P[N_V \ge 1]} = \frac{0.3}{0.8} = \frac{3}{8}.
\]
(6) The conditional probability of at least one voice call given there were exactly two voice calls is
\[
P[N_V \ge 1 \mid N_V = 2] = \frac{P[N_V \ge 1, N_V = 2]}{P[N_V = 2]} = \frac{P[N_V = 2]}{P[N_V = 2]} = 1.
\]
Given that there were two voice calls, there must have been at least one voice call.
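These conditional probabilities can be reproduced by enumerating the eight outcomes. The Python sketch below assumes outcome probabilities inferred from the sums used in the solution (0.2 for vvv and ddd, 0.1 for each of the other six); the helper names `P`, `P_cond` and `nv` are illustrative.

```python
from fractions import Fraction as F

# Outcome probabilities inferred from the sums used in the solution (an assumption).
prob = {"vvv": F(2, 10), "ddd": F(2, 10)}
for s in ("vvd", "vdv", "dvv", "vdd", "dvd", "ddv"):
    prob[s] = F(1, 10)


def P(event):
    """Probability of the set of outcomes s for which event(s) is true."""
    return sum(p for s, p in prob.items() if event(s))


def P_cond(event, given):
    """Conditional probability P[event | given]."""
    return P(lambda s: event(s) and given(s)) / P(given)


def nv(s):
    """Number of voice calls in outcome s."""
    return s.count("v")


print(P(lambda s: nv(s) == 2))
print(P_cond(lambda s: s == "vvd", lambda s: nv(s) == 2))
print(P_cond(lambda s: nv(s) == 2, lambda s: nv(s) >= 1))
```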

Quiz 1.6
In this experiment, there are four outcomes with probabilities
\[
P[\{vv\}] = (0.8)^2 = 0.64, \quad P[\{vd\}] = (0.8)(0.2) = 0.16, \quad P[\{dv\}] = (0.2)(0.8) = 0.16, \quad P[\{dd\}] = (0.2)^2 = 0.04.
\]
When checking the independence of any two events A and B, it's wise to avoid intuition and simply check whether P[AB] = P[A]P[B]. Using the probabilities of the outcomes, we now can test for the independence of events.
(1) First, we calculate the probability of the joint event:
\[
P[N_V = 2, N_V \ge 1] = P[N_V = 2] = P[\{vv\}] = 0.64.
\]
Next, we observe that
\[
P[N_V \ge 1] = P[\{vd, dv, vv\}] = 0.96.
\]
Finally, we make the comparison
\[
P[N_V = 2]\,P[N_V \ge 1] = (0.64)(0.96) \ne P[N_V = 2, N_V \ge 1],
\]
which shows the two events are dependent.
(2) The probability of the joint event is
\[
P[N_V \ge 1, C_1 = v] = P[\{vd, vv\}] = 0.80.
\]
From part (1), $P[N_V \ge 1] = 0.96$. Further, $P[C_1 = v] = 0.8$ so that
\[
P[N_V \ge 1]\,P[C_1 = v] = (0.96)(0.8) = 0.768 \ne P[N_V \ge 1, C_1 = v].
\]
Hence, the events are dependent.
(3) The problem statement that the calls were independent implies that the events the second call is a voice call, $\{C_2 = v\}$, and the first call is a data call, $\{C_1 = d\}$, are independent events. Just to be sure, we can do the calculations to check:
\[
P[C_1 = d, C_2 = v] = P[\{dv\}] = 0.16.
\]
Since $P[C_1 = d]\,P[C_2 = v] = (0.2)(0.8) = 0.16$, we confirm that the events are independent. Note that this shouldn't be surprising since we used the information that the calls were independent in the problem statement to determine the probabilities of the outcomes.
(4) The probability of the joint event is
\[
P[C_2 = v, N_V \text{ is even}] = P[\{vv\}] = 0.64.
\]
Also, each event has probability
\[
P[C_2 = v] = P[\{dv, vv\}] = 0.8, \qquad P[N_V \text{ is even}] = P[\{dd, vv\}] = 0.68.
\]
Thus, $P[C_2 = v]\,P[N_V \text{ is even}] = (0.8)(0.68) = 0.544$. Since $P[C_2 = v, N_V \text{ is even}] \ne 0.544$, the events are dependent.

Quiz 1.7
Let $F_i$ denote the event that the user is found on page i. The tree for the experiment is

[Tree diagram: each paging attempt finds the user with probability 0.8 (branches $F_1$, $F_2$, $F_3$) and fails with probability 0.2 (branches $F_1^c$, $F_2^c$, $F_3^c$); each failure leads to the next attempt.]

The user is found unless all three paging attempts fail. Thus the probability the user is found is
\[
P[F] = 1 - P[F_1^c F_2^c F_3^c] = 1 - (0.2)^3 = 0.992. \tag{1}
\]

Quiz 1.8
(1) We can view choosing each bit in the code word as a subexperiment. Each subexperiment has two possible outcomes: 0 and 1. Thus by the fundamental principle of counting, there are $2 \times 2 \times 2 \times 2 = 2^4 = 16$ possible code words.
(2) An experiment that can yield all possible code words with two zeroes is to choose which 2 bits (out of 4 bits) will be zero. The other two bits then must be ones. There are $\binom{4}{2} = 6$ ways to do this. Hence, there are six code words with exactly two zeroes. For this problem, it is also possible to simply enumerate the six code words: 1100, 1010, 1001, 0101, 0110, 0011.
(3) When the first bit must be a zero, then the first subexperiment of choosing the first bit has only one outcome. For each of the next three bits, we have two choices. In this case, there are $1 \times 2 \times 2 \times 2 = 8$ ways of choosing a code word.
(4) For the constant ratio code, we can specify a code word by choosing M of the bits to be ones. The other N − M bits will be zeroes. The number of ways of choosing such a code word is $\binom{N}{M}$. For N = 8 and M = 3, there are $\binom{8}{3} = 56$ code words.

Quiz 1.9
(1) In this problem, k bits received in error is the same as k failures in 100 trials. The failure probability is $\epsilon = 1 - p$ and the success probability is $1 - \epsilon = p$. That is, the probability of k bits in error and 100 − k correctly received bits is
\[
P[S_{k,100-k}] = \binom{100}{k}\epsilon^k(1 - \epsilon)^{100-k}. \tag{1}
\]
For $\epsilon = 0.01$,
\[
P[S_{0,100}] = (1 - \epsilon)^{100} = (0.99)^{100} = 0.3660 \tag{2}
\]
\[
P[S_{1,99}] = 100(0.01)(0.99)^{99} = 0.3700 \tag{3}
\]
\[
P[S_{2,98}] = 4950(0.01)^2(0.99)^{98} = 0.1849 \tag{4}
\]
\[
P[S_{3,97}] = 161{,}700(0.01)^3(0.99)^{97} = 0.0610 \tag{5}
\]
(2) The probability a packet is decoded correctly is just
\[
P[C] = P[S_{0,100}] + P[S_{1,99}] + P[S_{2,98}] + P[S_{3,97}] = 0.9819. \tag{6}
\]

Quiz 1.10
Since the chip works only if all n transistors work, the transistors in the chip are like devices in series. The probability that a chip works is $P[C] = p^n$. The module works if either 8 chips work or 9 chips work. Let $C_k$ denote the event that exactly k chips work. Since transistor failures are independent of each other, chip failures are also independent. Thus each $P[C_k]$ has the binomial probability
\[
P[C_8] = \binom{9}{8}(P[C])^8(1 - P[C])^{9-8} = 9p^{8n}(1 - p^n), \tag{1}
\]
\[
P[C_9] = (P[C])^9 = p^{9n}. \tag{2}
\]
The probability a memory module works is
\[
P[M] = P[C_8] + P[C_9] = p^{8n}(9 - 8p^n). \tag{3}
\]

Quiz 1.11
R=rand(1,100);
X=(R<= 0.4) ...
  + (2*(R>0.4).*(R<=0.9)) ...
  + (3*(R>0.9));
Y=hist(X,1:3)

For a MATLAB simulation, we first generate a vector R of 100 random numbers. Second, we generate vector X as a function of R to represent the 3 possible outcomes of a flip. That is, X(i)=1 if flip i was heads, X(i)=2 if flip i was tails, and X(i)=3 if flip i landed on the edge. To see how this works, we note there are three cases:
• If R(i) <= 0.4, then X(i)=1.
• If 0.4 < R(i) and R(i)<=0.9, then X(i)=2.
• If 0.9 < R(i), then X(i)=3.
These three cases will have probabilities 0.4, 0.5 and 0.1. Lastly, we use the hist function to count how many occurrences of each possible value of X(i).

Quiz Solutions – Chapter 2

Quiz 2.1
The sample space, probabilities and corresponding grades for the experiment are

Outcome   P[·]    G
BB        0.36    3.0
BC        0.24    2.5
CB        0.24    2.5
CC        0.16    2.0

Quiz 2.2
(1) To find c, we recall that the PMF must sum to 1. That is,
\[
\sum_{n=1}^{3} P_N(n) = c\left(1 + \frac{1}{2} + \frac{1}{3}\right) = 1. \tag{1}
\]
This implies c = 6/11. Now that we have found c, the remaining parts are straightforward.
(2) $P[N = 1] = P_N(1) = c = 6/11$
(3) $P[N \ge 2] = P_N(2) + P_N(3) = c/2 + c/3 = 5/11$
(4) $P[N > 3] = \sum_{n=4}^{\infty} P_N(n) = 0$

Quiz 2.3
Decoding each transmitted bit is an independent trial where we call a bit error a "success." Each bit is in error, that is, the trial is a success, with probability p. Now we can interpret each experiment in the generic context of independent trials.
(1) The random variable X is the number of trials up to and including the first success. Similar to Example 2.11, X has the geometric PMF
\[
P_X(x) = \begin{cases} p(1 - p)^{x-1} & x = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
(2) If p = 0.1, then the probability exactly 10 bits are sent is
\[
P[X = 10] = P_X(10) = (0.1)(0.9)^9 = 0.0387. \tag{2}
\]

The probability that at least 10 bits are sent is $P[X \ge 10] = \sum_{x=10}^{\infty} P_X(x)$. This sum is not too hard to calculate. However, it's even easier to observe that X ≥ 10 if the first 9 bits are transmitted correctly. That is,
\[
P[X \ge 10] = P[\text{first 9 bits are correct}] = (1 - p)^9. \tag{3}
\]
For p = 0.1, $P[X \ge 10] = 0.9^9 = 0.3874$.
(3) The random variable Y is the number of successes in 100 independent trials. Just as in Example 2.13, Y has the binomial PMF
\[
P_Y(y) = \binom{100}{y}p^y(1 - p)^{100-y}. \tag{4}
\]
If p = 0.01, the probability of exactly 2 errors is
\[
P[Y = 2] = P_Y(2) = \binom{100}{2}(0.01)^2(0.99)^{98} = 0.1849. \tag{5}
\]
(4) The probability of no more than 2 errors is
\[
P[Y \le 2] = P_Y(0) + P_Y(1) + P_Y(2) \tag{6}
\]
\[
= (0.99)^{100} + 100(0.01)(0.99)^{99} + \binom{100}{2}(0.01)^2(0.99)^{98} \tag{7}
\]
\[
= 0.9207. \tag{8}
\]
(5) Random variable Z is the number of trials up to and including the third success. Thus Z has the Pascal PMF (see Example 2.15)
\[
P_Z(z) = \binom{z-1}{2}p^3(1 - p)^{z-3}. \tag{9}
\]
Note that $P_Z(z) > 0$ for z = 3, 4, 5, ....
(6) If p = 0.25, the probability that the third error occurs on bit 12 is
\[
P_Z(12) = \binom{11}{2}(0.25)^3(0.75)^9 = 0.0645. \tag{10}
\]

Quiz 2.4
Each of these probabilities can be read off the CDF $F_Y(y)$. However, we must keep in mind that when $F_Y(y)$ has a discontinuity at $y_0$, $F_Y(y)$ takes the upper value $F_Y(y_0^+)$.
(1) $P[Y < 1] = F_Y(1^-) = 0$

(2) $P[Y \le 1] = F_Y(1) = 0.6$
(3) $P[Y > 2] = 1 - P[Y \le 2] = 1 - F_Y(2) = 1 - 0.8 = 0.2$
(4) $P[Y \ge 2] = 1 - P[Y < 2] = 1 - F_Y(2^-) = 1 - 0.6 = 0.4$
(5) $P[Y = 1] = P[Y \le 1] - P[Y < 1] = F_Y(1^+) - F_Y(1^-) = 0.6$
(6) $P[Y = 3] = P[Y \le 3] - P[Y < 3] = F_Y(3^+) - F_Y(3^-) = 0.8 - 0.8 = 0$

Quiz 2.5
(1) With probability 0.7, a call is a voice call and C = 25. Otherwise, with probability 0.3, we have a data call and C = 40. This corresponds to the PMF
\[
P_C(c) = \begin{cases} 0.7 & c = 25 \\ 0.3 & c = 40 \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
(2) The expected value of C is
\[
E[C] = 25(0.7) + 40(0.3) = 29.5 \text{ cents}. \tag{2}
\]

Quiz 2.6
(1) As a function of N, the cost T is
\[
T = 25N + 40(3 - N) = 120 - 15N. \tag{1}
\]
(2) To find the PMF of T, we can draw the following tree:

[Tree diagram: N = 0 with probability 0.1 gives T = 120; N = 1 with probability 0.3 gives T = 105; N = 2 with probability 0.3 gives T = 90; N = 3 with probability 0.3 gives T = 75.]

From the tree, we can write down the PMF of T:
\[
P_T(t) = \begin{cases} 0.3 & t = 75, 90, 105 \\ 0.1 & t = 120 \\ 0 & \text{otherwise} \end{cases} \tag{2}
\]
From the PMF $P_T(t)$, the expected value of T is
\[
E[T] = 75P_T(75) + 90P_T(90) + 105P_T(105) + 120P_T(120) \tag{3}
\]
\[
= (75 + 90 + 105)(0.3) + 120(0.1) = 93. \tag{4}
\]

3) + 6(0. However.4 (2) (3) The variance of N is Var[N ] = E N 2 − (E [N ])2 = 2. Quiz 2. 2 g(A) = 6 A = 3 ⎩ 8 A=4 (3) By Theorem 2.4)2 = 0.5) = 1.8 = g(E[A]).1) = 4.1) + 12 (0.1) + 1(0.4) + 22 (0. g(E[A]) = g(2) = 4.4) + 2(0.10. (1) The expected value of N is 2 E [N ] = n=0 n PN (n) = 0(0.8 (3) Since E[A] = 2. the expected number of memory chips is 4 (2) E [M] = a=1 g(A)PA (a) = 4(0.1) = 2 (1) (2) The number of memory chips is M = g(A) where ⎧ ⎨ 4 A = 1.14.Quiz 2.2) + 4(0.3) + 3(0. (3) 11 .8 The PMF PN (n) allows to calculate each of the desired quantities.2) + 8(0. the expected number of applications is 4 E [A] = a=1 a PA (a) = 1(0.4) + 4(0. E[M] = 4.5) = 2.4) + 2(0.663.7 (1) Using Definition 2.4 − (1.4 (1) (2) The second moment of N is 2 E N 2 = n=0 n 2 PN (n) = 02 (0.44 (4) The standard deviation is σ N = √ Var[N ] = √ 0.44 = 0. The two quantities are different because g(A) is not of the form α A + β.
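The four quantities in Quiz 2.8 follow mechanically from the PMF, which makes them easy to check. The Python sketch below takes the PMF values 0.1, 0.4 and 0.5 at n = 0, 1, 2 from the sums in the solution; the function name `moments` is illustrative.

```python
import math


def moments(pmf):
    """Return (mean, second moment, variance, standard deviation) of a PMF dict."""
    mean = sum(n * p for n, p in pmf.items())
    second = sum(n * n * p for n, p in pmf.items())
    var = second - mean ** 2
    return mean, second, var, math.sqrt(var)


mean, second, var, std = moments({0: 0.1, 1: 0.4, 2: 0.5})
print(mean, second, round(var, 4), round(std, 3))
```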

Quiz 2.9
(1) From the problem statement, we learn that the conditional PMF of N given the event I is
\[
P_{N|I}(n) = \begin{cases} 0.02 & n = 1, 2, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{1}
\]
(2) Also from the problem statement, the conditional PMF of N given the event T is
\[
P_{N|T}(n) = \begin{cases} 0.2 & n = 1, 2, 3, 4, 5 \\ 0 & \text{otherwise} \end{cases} \tag{2}
\]
(3) The problem statement tells us that P[T] = 1 − P[I] = 3/4. From Theorem 1.10 (the law of total probability), we find the PMF of N is
\[
P_N(n) = P_{N|T}(n)P[T] + P_{N|I}(n)P[I] \tag{3}
\]
\[
= \begin{cases} 0.2(0.75) + 0.02(0.25) & n = 1, 2, 3, 4, 5 \\ 0(0.75) + 0.02(0.25) & n = 6, 7, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{4}
\]
\[
= \begin{cases} 0.155 & n = 1, 2, 3, 4, 5 \\ 0.005 & n = 6, 7, \ldots, 50 \\ 0 & \text{otherwise} \end{cases} \tag{5}
\]
(4) First we find
\[
P[N \le 10] = \sum_{n=1}^{10} P_N(n) = (0.155)(5) + (0.005)(5) = 0.80. \tag{6}
\]
By Theorem 2.17, the conditional PMF of N given N ≤ 10 is
\[
P_{N|N\le 10}(n) = \begin{cases} \dfrac{P_N(n)}{P[N \le 10]} & n \le 10 \\ 0 & \text{otherwise} \end{cases} \tag{7}
\]
\[
= \begin{cases} 0.155/0.8 & n = 1, 2, 3, 4, 5 \\ 0.005/0.8 & n = 6, 7, 8, 9, 10 \\ 0 & \text{otherwise} \end{cases} \tag{8}
\]
\[
= \begin{cases} 0.19375 & n = 1, 2, 3, 4, 5 \\ 0.00625 & n = 6, 7, 8, 9, 10 \\ 0 & \text{otherwise} \end{cases} \tag{9}
\]
(5) Once we have the conditional PMF, calculating conditional expectations is easy.
\[
E[N \mid N \le 10] = \sum_{n} n P_{N|N\le 10}(n) \tag{10}
\]
\[
= \sum_{n=1}^{5} n(0.19375) + \sum_{n=6}^{10} n(0.00625) \tag{11}
\]
\[
= 3.15625. \tag{12}
\]

(6) To find the conditional variance, we first find the conditional second moment
\[
E[N^2 \mid N \le 10] = \sum_{n} n^2 P_{N|N\le 10}(n) \tag{13}
\]
\[
= \sum_{n=1}^{5} n^2(0.19375) + \sum_{n=6}^{10} n^2(0.00625) \tag{14}
\]
\[
= 55(0.19375) + 330(0.00625) = 12.71875. \tag{15}
\]
The conditional variance is
\[
\mathrm{Var}[N \mid N \le 10] = E[N^2 \mid N \le 10] - (E[N \mid N \le 10])^2 \tag{16}
\]
\[
= 12.71875 - (3.15625)^2 = 2.75684. \tag{17}
\]

Quiz 2.10
The function samplemean(k) generates and plots five $m_n$ sequences for n = 1, 2, ..., k. The ith column M(:,i) of M holds a sequence $m_1, m_2, \ldots, m_k$.

function M=samplemean(k);
K=(1:k)';
M=zeros(k,5);
for i=1:5,
    X=duniformrv(0,10,k);
    M(:,i)=cumsum(X)./K;
end;
plot(K,M);

Examples of the function calls (a) samplemean(100) and (b) samplemean(1000) are shown in Figure 1. Each time samplemean(k) is called, it produces a random output. What is observed in these figures is that for small n, $m_n$ is fairly random but as n gets large, $m_n$ gets close to E[X] = 5. Although each sequence $m_1, m_2, \ldots$ that we generate is random, the sequences always converge to E[X]. This random convergence is analyzed in Chapter 7.

[Figure 1: Two examples of the output of samplemean(k): (a) samplemean(100) and (b) samplemean(1000), each plotting five sample mean sequences with values between 0 and 10.]

Quiz Solutions – Chapter 3

Quiz 3.1
The CDF of Y is
\[
F_Y(y) = \begin{cases} 0 & y < 0 \\ y/4 & 0 \le y \le 4 \\ 1 & y > 4 \end{cases} \tag{1}
\]
[Plot of the CDF $F_Y(y)$ versus y.]
From the CDF $F_Y(y)$, we can calculate the probabilities:
(1) $P[Y \le -1] = F_Y(-1) = 0$
(2) $P[Y \le 1] = F_Y(1) = 1/4$
(3) $P[2 < Y \le 3] = F_Y(3) - F_Y(2) = 3/4 - 2/4 = 1/4$
(4) $P[Y > 1.5] = 1 - P[Y \le 1.5] = 1 - F_Y(1.5) = 1 - (1.5)/4 = 5/8$

Quiz 3.2
(1) First we will find the constant c and then we will sketch the PDF. To find c, we use the fact that $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$. We will evaluate this integral using integration by parts:
\[
\int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^{\infty} cxe^{-x/2}\,dx \tag{1}
\]
\[
= \underbrace{\left.-2cxe^{-x/2}\right|_0^{\infty}}_{=0} + \int_0^{\infty} 2ce^{-x/2}\,dx \tag{2}
\]
\[
= \left.-4ce^{-x/2}\right|_0^{\infty} = 4c. \tag{3}
\]
Thus c = 1/4 and X has the Erlang (n = 2, λ = 1/2) PDF
\[
f_X(x) = \begin{cases} (x/4)e^{-x/2} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}
\]
[Plot of the PDF $f_X(x)$ versus x.]

(2) To find the CDF $F_X(x)$, we first note X is a nonnegative random variable so that $F_X(x) = 0$ for all x < 0. For x ≥ 0,
\[
F_X(x) = \int_0^x f_X(y)\,dy = \int_0^x \frac{y}{4}e^{-y/2}\,dy \tag{5}
\]
\[
= \left.-\frac{y}{2}e^{-y/2}\right|_0^x + \int_0^x \frac{1}{2}e^{-y/2}\,dy \tag{6}
\]
\[
= 1 - \frac{x}{2}e^{-x/2} - e^{-x/2}. \tag{7}
\]
The complete expression for the CDF is
\[
F_X(x) = \begin{cases} 1 - \left(\dfrac{x}{2} + 1\right)e^{-x/2} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{8}
\]
[Plot of the CDF $F_X(x)$ versus x.]
(3) From the CDF $F_X(x)$,
\[
P[0 \le X \le 4] = F_X(4) - F_X(0) = 1 - 3e^{-2}. \tag{9}
\]
(4) Similarly,
\[
P[-2 \le X \le 2] = F_X(2) - F_X(-2) = 1 - 2e^{-1}. \tag{10}
\]

Quiz 3.3
The PDF of Y is
\[
f_Y(y) = \begin{cases} 3y^2/2 & -1 \le y \le 1, \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]
[Plot of the PDF $f_Y(y)$ versus y.]
(1) The expected value of Y is
\[
E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_{-1}^{1} (3/2)y^3\,dy = \left.(3/8)y^4\right|_{-1}^{1} = 0. \tag{2}
\]
Note that the above calculation wasn't really necessary because E[Y] = 0 whenever the PDF $f_Y(y)$ is an even function (i.e., $f_Y(y) = f_Y(-y)$).
(2) The second moment of Y is
\[
E[Y^2] = \int_{-\infty}^{\infty} y^2 f_Y(y)\,dy = \int_{-1}^{1} (3/2)y^4\,dy = \left.(3/10)y^5\right|_{-1}^{1} = 3/5. \tag{3}
\]

(3) The variance of Y is
\[
\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2 = 3/5. \tag{4}
\]
(4) The standard deviation of Y is $\sigma_Y = \sqrt{\mathrm{Var}[Y]} = \sqrt{3/5}$.

Quiz 3.4
(1) When X is an exponential (λ) random variable, E[X] = 1/λ and Var[X] = 1/λ². Since E[X] = 3 and Var[X] = 9, we must have λ = 1/3. The PDF of X is
\[
f_X(x) = \begin{cases} (1/3)e^{-x/3} & x \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]
(2) We know X is a uniform (a, b) random variable. To find a and b, we apply Theorem 3.6 to write
\[
E[X] = \frac{a + b}{2} = 3, \qquad \mathrm{Var}[X] = \frac{(b - a)^2}{12} = 9. \tag{2}
\]
This implies
\[
a + b = 6, \qquad b - a = \pm 6\sqrt{3}. \tag{3}
\]
The only valid solution with a < b is
\[
a = 3 - 3\sqrt{3}, \qquad b = 3 + 3\sqrt{3}. \tag{4}
\]
The complete expression for the PDF of X is
\[
f_X(x) = \begin{cases} 1/(6\sqrt{3}) & 3 - 3\sqrt{3} \le x < 3 + 3\sqrt{3}, \\ 0 & \text{otherwise.} \end{cases} \tag{5}
\]

Quiz 3.5
Each of the requested probabilities can be calculated using the Φ(z) function and Table 3.1 or Q(z) and Table 3.2. We start with the sketches.
(1) The PDFs of X and Y are shown below. The fact that Y has twice the standard deviation of X is reflected in the greater spread of $f_Y(y)$. However, it is important to remember that as the standard deviation increases, the peak value of the Gaussian PDF goes down.
[Plot of the PDFs $f_X(x)$ and $f_Y(y)$ on a common axis from −5 to 5.]

(2) Since X is Gaussian (0, 1),
\[
P[-1 < X \le 1] = F_X(1) - F_X(-1) = \Phi(1) - \Phi(-1) = 2\Phi(1) - 1 = 0.6826. \tag{1}
\]
(3) Since Y is Gaussian (0, 2),
\[
P[-1 < Y \le 1] = F_Y(1) - F_Y(-1) = \Phi\!\left(\frac{1}{\sigma_Y}\right) - \Phi\!\left(\frac{-1}{\sigma_Y}\right) = 2\Phi\!\left(\frac{1}{2}\right) - 1 = 0.383. \tag{2}
\]
(4) Again, since X is Gaussian (0, 1), $P[X > 3.5] = Q(3.5) = 2.33 \times 10^{-4}$.
(5) Since Y is Gaussian (0, 2), $P[Y > 3.5] = Q(3.5/2) = Q(1.75) = 1 - \Phi(1.75) = 0.0401$.

Quiz 3.6
The CDF of X is
\[
F_X(x) = \begin{cases} 0 & x < -1, \\ (x + 1)/4 & -1 \le x < 1, \\ 1 & x \ge 1. \end{cases} \tag{1}
\]
[Plot of the CDF $F_X(x)$ versus x.]
The following probabilities can be read directly from the CDF:
(1) $P[X \le 1] = F_X(1) = 1$.
(2) $P[X < 1] = F_X(1^-) = 1/2$.
(3) $P[X = 1] = F_X(1^+) - F_X(1^-) = 1 - 1/2 = 1/2$.
(4) We find the PDF $f_X(x)$ by taking the derivative of $F_X(x)$. The resulting PDF is
\[
f_X(x) = \begin{cases} 1/4 & -1 \le x < 1, \\ (1/2)\delta(x - 1) & x = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{2}
\]
[Plot of the PDF $f_X(x)$ versus x.]

Quiz 3.7

(1) Since X is always nonnegative, $F_X(x) = 0$ for x < 0. Also, $F_X(x) = 1$ for x ≥ 2 since it's always true that x ≤ 2. Lastly, for 0 ≤ x ≤ 2,
\[
F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy = \int_0^x (1 - y/2)\,dy = x - x^2/4. \tag{1}
\]
The complete CDF of X is
\[
F_X(x) = \begin{cases} 0 & x < 0, \\ x - x^2/4 & 0 \le x \le 2, \\ 1 & x > 2. \end{cases} \tag{2}
\]
[Plot of the CDF $F_X(x)$ versus x.]
(2) The probability that Y = 1 is
\[
P[Y = 1] = P[X \ge 1] = 1 - F_X(1) = 1 - 3/4 = 1/4. \tag{3}
\]
(3) Since X is nonnegative, Y is also nonnegative. Thus $F_Y(y) = 0$ for y < 0. Also, because Y ≤ 1, $F_Y(y) = 1$ for all y ≥ 1. Finally, for 0 < y < 1,
\[
F_Y(y) = P[Y \le y] = P[X \le y] = F_X(y). \tag{4}
\]
Using the CDF $F_X(x)$, the complete expression for the CDF of Y is
\[
F_Y(y) = \begin{cases} 0 & y < 0, \\ y - y^2/4 & 0 \le y < 1, \\ 1 & y \ge 1. \end{cases} \tag{5}
\]
[Plot of the CDF $F_Y(y)$ versus y.]
As expected, we see that the jump in $F_Y(y)$ at y = 1 is exactly equal to P[Y = 1].
(4) By taking the derivative of $F_Y(y)$, we obtain the PDF $f_Y(y)$. Note that when y < 0 or y > 1, the PDF is zero.
\[
f_Y(y) = \begin{cases} 1 - y/2 + (1/4)\delta(y - 1) & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{6}
\]
[Plot of the PDF $f_Y(y)$ versus y.]

Quiz 3.8
(1) $P[Y \le 6] = \int_{-\infty}^{6} f_Y(y)\,dy = \int_0^6 (1/10)\,dy = 0.6$.

(2) From Definition 3.15, the conditional PDF of Y given Y ≤ 6 is
\[
f_{Y|Y\le 6}(y) = \begin{cases} \dfrac{f_Y(y)}{P[Y \le 6]} & y \le 6, \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/6 & 0 \le y \le 6, \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]
(3) The probability Y > 8 is
\[
P[Y > 8] = \int_8^{10} \frac{1}{10}\,dy = 0.2. \tag{2}
\]
(4) From Definition 3.15, the conditional PDF of Y given Y > 8 is
\[
f_{Y|Y>8}(y) = \begin{cases} \dfrac{f_Y(y)}{P[Y > 8]} & y > 8, \\ 0 & \text{otherwise} \end{cases} = \begin{cases} 1/2 & 8 < y \le 10, \\ 0 & \text{otherwise.} \end{cases} \tag{3}
\]
(5) From the conditional PDF $f_{Y|Y\le 6}(y)$, we can calculate the conditional expectation
\[
E[Y \mid Y \le 6] = \int_{-\infty}^{\infty} y f_{Y|Y\le 6}(y)\,dy = \int_0^6 \frac{y}{6}\,dy = 3. \tag{4}
\]
(6) From the conditional PDF $f_{Y|Y>8}(y)$, we can calculate the conditional expectation
\[
E[Y \mid Y > 8] = \int_{-\infty}^{\infty} y f_{Y|Y>8}(y)\,dy = \int_8^{10} \frac{y}{2}\,dy = 9. \tag{5}
\]

Quiz 3.9
A natural way to produce random variables with PDF $f_{T|T>2}(t)$ is to generate samples of T with PDF $f_T(t)$ and then to discard those samples which fail to satisfy the condition T > 2. Here is a MATLAB function that uses this method:

function t=t2rv(m)
i=0;lambda=1/3;
t=zeros(m,1);
while (i<m),
    x=exponentialrv(lambda,1);
    if (x>2)
        t(i+1)=x;
        i=i+1;
    end
end

A second method exploits the fact that if T is an exponential (λ) random variable, then $T' = T + 2$ has PDF $f_{T'}(t) = f_{T|T>2}(t)$. In this case the command
t=2.0+exponentialrv(1/3,m)
generates the vector t.
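Both sampling methods from Quiz 3.9 can be sketched with Python's standard library, whose `random.expovariate` takes the rate λ as its argument. The function names `t2_rejection` and `t2_shift` and the `seed` parameters are illustrative assumptions, not part of the matcode library.

```python
import random


def t2_rejection(m, lam=1/3, seed=None):
    """Draw m samples of T given T > 2 by discarding exponential samples <= 2."""
    rng = random.Random(seed)
    out = []
    while len(out) < m:
        x = rng.expovariate(lam)
        if x > 2:
            out.append(x)
    return out


def t2_shift(m, lam=1/3, seed=None):
    """Draw m samples of T' = T + 2, which has the same PDF as T given T > 2."""
    rng = random.Random(seed)
    return [2.0 + rng.expovariate(lam) for _ in range(m)]


a = t2_rejection(10000)
b = t2_shift(10000)
print(sum(a) / len(a), sum(b) / len(b))   # both near 2 + 1/lam = 5
```

The shift method uses every draw, whereas the rejection method discards a fraction $1 - e^{-2/3}$ of its samples, about 49 percent, so the shift method is the cheaper of the two.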

16 + 0.08 = 0. 2) + PQ.G (q.G (q.6 (2) The probability that Q = G is P [Q = G] = PQ. 0) + PQ. −∞) = P[X ≤ ∞.08 = 0. (4) FX.24 + 0.2 From the joint PMF of Q and G given in the table. Y ≤ ∞] = 1.G (0. Y ≤ 2] ≤ P[X ≤ −∞] = 0 since X cannot take on the value −∞. 0) + PQ. Y ≤ y] = P[Y ≤ y] = FY (y). (2) FX. This result is given in Theorem 4.18 (3) The probability that G > 1 is 3 1 (1) (2) (3) P [G > 1] = g=2 q=0 PQ.18 + 0.Y (∞. y) = P[X ≤ ∞.G (0. (1) The probability that Q = 0 is P [Q = 0] = PQ. ∞) = P[X ≤ ∞.1.Y (∞.18 + 0. 2) = P[X ≤ −∞. Quiz 4.6 (4) The probability that G > Q is 1 3 P [G > Q] = q=0 g=q+1 PQ. (3) FX. g) (6) (7) = 0.12 + 0. g) (4) (5) = 0.1 Each value of the joint CDF can be found by considering the corresponding probability. Y ≤ −∞] = 0 since Y cannot take on the value −∞.24 + 0.24 + 0.Quiz Solutions – Chapter 4 Quiz 4.G (1.06 + 0.G (0.Y (∞.Y (−∞.16 + 0.12 + 0.78 21 . we can calculate the requested probabilities by summing the PMF over those values of Q and G that correspond to the event.G (0. 1) + PQ.G (0. 3) = 0. (1) FX. 1) = 0.12 = 0.

Quiz 4.3
By Theorem 4.3, the marginal PMF of H is
\[ P_H(h) = \sum_{b=0,2,4} P_{H,B}(h,b). \]
For each value of h, this corresponds to calculating the row sum across the table of the joint PMF. Similarly, the marginal PMF of B is
\[ P_B(b) = \sum_{h=-1}^{1} P_{H,B}(h,b). \]
For each value of b, this corresponds to the column sum down the table of the joint PMF. The easiest way to calculate these marginal PMFs is to simply sum each row and column:

    P_{H,B}(h,b)   b = 0   b = 2   b = 4   P_H(h)
    h = −1         0       0.4     0.2     0.6
    h = 0          0.1     0       0.1     0.2
    h = 1          0.1     0.1     0       0.2
    P_B(b)         0.2     0.5     0.3

Quiz 4.4
To find the constant c, we apply $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\,dy = 1$. Specifically,
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx\,dy = \int_0^2\int_0^1 cxy\,dx\,dy = c\int_0^2 y\left(\frac{x^2}{2}\Big|_0^1\right)dy = \frac{c}{2}\int_0^2 y\,dy = \frac{c}{4}\,y^2\Big|_0^2 = c. \]
Thus c = 1. To calculate P[A], we write
\[ P[A] = \iint_A f_{X,Y}(x,y)\,dx\,dy. \]
To integrate over A, we convert to polar coordinates using the substitutions x = r cos θ, y = r sin θ and dx dy = r dr dθ, yielding
\[ P[A] = \int_0^{\pi/2}\int_0^1 r^2\sin\theta\cos\theta\; r\,dr\,d\theta = \left(\int_0^1 r^3\,dr\right)\left(\int_0^{\pi/2}\sin\theta\cos\theta\,d\theta\right) = \left(\frac{r^4}{4}\Big|_0^1\right)\left(\frac{\sin^2\theta}{2}\Big|_0^{\pi/2}\right) = \frac{1}{8}. \]

[Figure: sketch of the region A in the X, Y plane.]

Quiz 4.5
By Theorem 4.8, the marginal PDF of X is
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy. \]
For x < 0 or x > 1, $f_X(x) = 0$. For 0 ≤ x ≤ 1,
\[ f_X(x) = \frac{6}{5}\int_0^1 (x+y^2)\,dy = \frac{6}{5}\left(xy + y^3/3\right)\Big|_{y=0}^{y=1} = \frac{6}{5}(x + 1/3) = \frac{6x+2}{5}. \]
The complete expression for the PDF of X is
\[ f_X(x) = \begin{cases} (6x+2)/5 & 0\le x\le 1,\\ 0 & \text{otherwise.}\end{cases} \]
By the same method we obtain the marginal PDF for Y. For 0 ≤ y ≤ 1,
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx = \frac{6}{5}\int_0^1 (x+y^2)\,dx = \frac{6}{5}\left(x^2/2 + xy^2\right)\Big|_{x=0}^{x=1} = \frac{6}{5}(1/2 + y^2) = \frac{3+6y^2}{5}. \]
Since $f_Y(y) = 0$ for y < 0 or y > 1, the complete expression for the PDF of Y is
\[ f_Y(y) = \begin{cases} (3+6y^2)/5 & 0\le y\le 1,\\ 0 & \text{otherwise.}\end{cases} \]

Quiz 4.6
(A) The time required for the transfer is T = L/B. For each pair of values of L and B, we can calculate the time T needed for the transfer. We can write these down on the table for the joint PMF of L and B as follows:

    P_{L,B}(l,b)     b = 14,400      b = 21,600      b = 28,800
    l = 518,400      0.20 (T=36)     0.10 (T=24)     0.05 (T=18)
    l = 2,592,000    0.05 (T=180)    0.10 (T=120)    0.20 (T=90)
    l = 7,776,000    0.00 (T=540)    0.10 (T=360)    0.20 (T=270)

From the table, writing down the PMF of T is straightforward.
\[ P_T(t) = \begin{cases} 0.05 & t = 18,\\ 0.1 & t = 24,\\ 0.2 & t = 36, 90,\\ 0.1 & t = 120,\\ 0.05 & t = 180,\\ 0.2 & t = 270,\\ 0.1 & t = 360,\\ 0 & \text{otherwise.}\end{cases} \]
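The table lookup in Quiz 4.6(A) can also be done mechanically. The sketch below is only an illustration: the variable names are arbitrary, and the division l ./ b relies on implicit expansion of a column by a row, which is assumed to be available in the MATLAB or Octave release in use.

```matlab
% Sketch: rebuild the PMF of T = L/B from the joint PMF table.
l = [518400; 2592000; 7776000];      % values of L (rows of the table)
b = [14400, 21600, 28800];           % values of B (columns of the table)
P = [0.20 0.10 0.05;
     0.05 0.10 0.20;
     0.00 0.10 0.20];                % joint PMF P_{L,B}(l,b)

T = l ./ b;                          % T(i,j) = l(i)/b(j), 3-by-3
[t, ~, idx] = unique(T(:));          % distinct transfer times, sorted
pT = accumarray(idx, P(:));          % total probability of each time

keep = pT > 0;                       % drop the zero-probability time 540
disp([t(keep) pT(keep)]);
```

Each displayed row pairs a transfer time with its probability, matching the PMF of T listed above.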

(B) First, we observe that since 0 ≤ X ≤ 1 and 0 ≤ Y ≤ 1, W = XY satisfies 0 ≤ W ≤ 1. Thus $F_W(0) = 0$ and $F_W(1) = 1$. For 0 < w < 1, we calculate the CDF $F_W(w) = P[W\le w]$. As shown in the figure, integrating over the region W ≤ w is fairly complex. The calculus is simpler if we integrate over the region XY > w. Specifically,
\[ F_W(w) = 1 - P[XY>w] = 1 - \int_w^1\int_{w/x}^1 dy\,dx = 1 - \int_w^1 (1 - w/x)\,dx = 1 - \left(x - w\ln x\right)\Big|_{x=w}^{x=1} = 1 - (1 - w + w\ln w) = w - w\ln w. \]

[Figure: the unit square in the X, Y plane, showing the curve XY = w and the region XY > w.]

The complete expression for the CDF is
\[ F_W(w) = \begin{cases} 0 & w<0,\\ w - w\ln w & 0\le w\le 1,\\ 1 & w>1.\end{cases} \]
By taking the derivative of the CDF, we find the PDF is
\[ f_W(w) = \frac{dF_W(w)}{dw} = \begin{cases} 0 & w<0,\\ -\ln w & 0\le w\le 1,\\ 0 & w>1.\end{cases} \]

Quiz 4.7
(A) It is helpful to first make a table that includes the marginal PMFs.

    P_{L,T}(l,t)   t = 40   t = 60   P_L(l)
    l = 1          0.15     0.1      0.25
    l = 2          0.3      0.2      0.5
    l = 3          0.15     0.1      0.25
    P_T(t)         0.6      0.4

(1) The expected value of L is
\[ E[L] = 1(0.25) + 2(0.5) + 3(0.25) = 2. \]
Since the second moment of L is
\[ E[L^2] = 1^2(0.25) + 2^2(0.5) + 3^2(0.25) = 4.5, \]
the variance of L is
\[ \mathrm{Var}[L] = E[L^2] - (E[L])^2 = 0.5. \]

(2) The expected value of T is
\[ E[T] = 40(0.6) + 60(0.4) = 48. \]
The second moment of T is
\[ E[T^2] = 40^2(0.6) + 60^2(0.4) = 2400. \]
Thus $\mathrm{Var}[T] = E[T^2] - (E[T])^2 = 2400 - 48^2 = 96$.

(3) The correlation is
\[ E[LT] = \sum_{t=40,60}\sum_{l=1}^{3} l\,t\,P_{L,T}(l,t) = 1(40)(0.15) + 2(40)(0.3) + 3(40)(0.15) + 1(60)(0.1) + 2(60)(0.2) + 3(60)(0.1) = 96. \]

(4) From Theorem 4.16(a), the covariance of L and T is
\[ \mathrm{Cov}[L,T] = E[LT] - E[L]E[T] = 96 - 2(48) = 0. \]

(5) Since Cov[L, T] = 0, the correlation coefficient is $\rho_{L,T} = 0$.

(B) As in the discrete case, the calculations become easier if we first calculate the marginal PDFs $f_X(x)$ and $f_Y(y)$. For 0 ≤ x ≤ 1,
\[ f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy = \int_0^2 xy\,dy = \frac{1}{2}xy^2\Big|_{y=0}^{y=2} = 2x. \]
Similarly, for 0 ≤ y ≤ 2,
\[ f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx = \int_0^1 xy\,dx = \frac{1}{2}x^2y\Big|_{x=0}^{x=1} = \frac{y}{2}. \]
The complete expressions for the marginal PDFs are
\[ f_X(x) = \begin{cases} 2x & 0\le x\le 1,\\ 0 & \text{otherwise,}\end{cases} \qquad f_Y(y) = \begin{cases} y/2 & 0\le y\le 2,\\ 0 & \text{otherwise.}\end{cases} \]
From the marginal PDFs, it is straightforward to calculate the various expectations.

(1) The first and second moments of X are
\[ E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^1 2x^2\,dx = \frac{2}{3}, \qquad E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^1 2x^3\,dx = \frac{1}{2}. \]
The variance of X is $\mathrm{Var}[X] = E[X^2] - (E[X])^2 = 1/18$.

(2) The first and second moments of Y are
\[ E[Y] = \int_{-\infty}^{\infty} y f_Y(y)\,dy = \int_0^2 \frac{1}{2}y^2\,dy = \frac{4}{3}, \qquad E[Y^2] = \int_{-\infty}^{\infty} y^2 f_Y(y)\,dy = \int_0^2 \frac{1}{2}y^3\,dy = 2. \]
The variance of Y is $\mathrm{Var}[Y] = E[Y^2] - (E[Y])^2 = 2 - 16/9 = 2/9$.

(3) The correlation of X and Y is
\[ E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f_{X,Y}(x,y)\,dx\,dy = \int_0^1\int_0^2 x^2y^2\,dy\,dx = \left(\frac{x^3}{3}\Big|_0^1\right)\left(\frac{y^3}{3}\Big|_0^2\right) = \frac{8}{9}. \]

(4) The covariance of X and Y is
\[ \mathrm{Cov}[X,Y] = E[XY] - E[X]E[Y] = \frac{8}{9} - \frac{2}{3}\cdot\frac{4}{3} = 0. \]

(5) Since Cov[X, Y] = 0, the correlation coefficient is $\rho_{X,Y} = 0$.

Quiz 4.8
(A) Since the event V > 80 occurs only for the pairs (L, T) = (2, 60), (L, T) = (3, 40) and (L, T) = (3, 60),
\[ P[A] = P[V>80] = P_{L,T}(2,60) + P_{L,T}(3,40) + P_{L,T}(3,60) = 0.45. \]
By Definition 4.9,
\[ P_{L,T|A}(l,t) = \begin{cases} P_{L,T}(l,t)/P[A] & lt > 80,\\ 0 & \text{otherwise.}\end{cases} \]

We can represent this conditional PMF in the following table:

    P_{L,T|A}(l,t)   t = 40   t = 60
    l = 1            0        0
    l = 2            0        4/9
    l = 3            1/3      2/9

The conditional expectation of V can be found from the conditional PMF.
\[ E[V|A] = \sum_l\sum_t l\,t\,P_{L,T|A}(l,t) = (2\cdot 60)\frac{4}{9} + (3\cdot 40)\frac{1}{3} + (3\cdot 60)\frac{2}{9} = 133\tfrac{1}{3}. \]
For the conditional variance Var[V|A], we first find the conditional second moment
\[ E[V^2|A] = \sum_l\sum_t (lt)^2 P_{L,T|A}(l,t) = (2\cdot 60)^2\frac{4}{9} + (3\cdot 40)^2\frac{1}{3} + (3\cdot 60)^2\frac{2}{9} = 18{,}400. \]
It follows that
\[ \mathrm{Var}[V|A] = E[V^2|A] - (E[V|A])^2 = 622\tfrac{2}{9}. \]

(B) For continuous random variables X and Y, we first calculate the probability of the conditioning event.
\[ P[B] = \iint_B f_{X,Y}(x,y)\,dx\,dy = \int_{40}^{60}\int_{80/y}^{3} \frac{xy}{4000}\,dx\,dy = \int_{40}^{60} \frac{y}{4000}\left(\frac{x^2}{2}\Big|_{80/y}^{3}\right)dy = \int_{40}^{60} \frac{y}{4000}\left(\frac{9}{2} - \frac{3200}{y^2}\right)dy = \frac{9}{8} - \frac{4}{5}\ln\frac{3}{2} \approx 0.801. \]
The conditional PDF of X and Y is
\[ f_{X,Y|B}(x,y) = \begin{cases} f_{X,Y}(x,y)/P[B] & (x,y)\in B,\\ 0 & \text{otherwise,}\end{cases} = \begin{cases} Kxy & 40\le y\le 60,\ 80/y\le x\le 3,\\ 0 & \text{otherwise,}\end{cases} \]

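where $K = (4000P[B])^{-1}$. The conditional expectation of W given event B is
\begin{align*}
E[W|B] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f_{X,Y|B}(x,y)\,dx\,dy \tag{14}\\
&= \int_{40}^{60}\int_{80/y}^{3} K x^2y^2\,dx\,dy \tag{15}\\
&= (K/3)\int_{40}^{60} y^2\, x^3\Big|_{x=80/y}^{x=3}\,dy \tag{16}\\
&= (K/3)\int_{40}^{60} \left(27y^2 - 80^3/y\right)dy \tag{17}\\
&= (K/3)\left(9y^3 - 80^3\ln y\right)\Big|_{40}^{60} \approx 120.78 \tag{18}
\end{align*}
The conditional second moment of W given B is
\begin{align*}
E[W^2|B] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (xy)^2 f_{X,Y|B}(x,y)\,dx\,dy \tag{19}\\
&= \int_{40}^{60}\int_{80/y}^{3} K x^3y^3\,dx\,dy \tag{20}\\
&= (K/4)\int_{40}^{60} y^3\, x^4\Big|_{x=80/y}^{x=3}\,dy \tag{21}\\
&= (K/4)\int_{40}^{60} \left(81y^3 - 80^4/y\right)dy \tag{22}\\
&= (K/4)\left((81/4)y^4 - 80^4\ln y\right)\Big|_{40}^{60} \approx 15{,}143.75 \tag{23}
\end{align*}
It follows that the conditional variance of W given B is
\[ \mathrm{Var}[W|B] = E[W^2|B] - (E[W|B])^2 \approx 555.85 \tag{24} \]

Quiz 4.9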

(A) (1) The joint PMF of A and B can be found from the marginal and conditional PMFs via $P_{A,B}(a,b) = P_{B|A}(b|a)P_A(a)$. Incorporating the information from the given conditional PMFs can be confusing, however. Consequently, we can note that A has range $S_A = \{0, 2\}$ and B has range $S_B = \{0, 1\}$. A table of the joint PMF will include all four possible combinations of A and B. The general form of the table is

    P_{A,B}(a,b)   b = 0                  b = 1
    a = 0          P_{B|A}(0|0)P_A(0)     P_{B|A}(1|0)P_A(0)
    a = 2          P_{B|A}(0|2)P_A(2)     P_{B|A}(1|2)P_A(2)

Substituting values from $P_{B|A}(b|a)$ and $P_A(a)$, we have

    P_{A,B}(a,b)   b = 0         b = 1
    a = 0          (0.8)(0.4)    (0.2)(0.4)
    a = 2          (0.5)(0.6)    (0.5)(0.6)

or

    P_{A,B}(a,b)   b = 0   b = 1
    a = 0          0.32    0.08
    a = 2          0.3     0.3

(2) Given the conditional PMF $P_{B|A}(b|2)$, it is easy to calculate the conditional expectation
\[ E[B|A=2] = \sum_{b=0}^{1} b\,P_{B|A}(b|2) = (0)(0.5) + (1)(0.5) = 0.5 \tag{1} \]

(3) From the joint PMF $P_{A,B}(a,b)$, we can calculate the conditional PMF
\[ P_{A|B}(a|0) = \frac{P_{A,B}(a,0)}{P_B(0)} = \begin{cases} 0.32/0.62 & a=0,\\ 0.3/0.62 & a=2,\\ 0 & \text{otherwise,}\end{cases} = \begin{cases} 16/31 & a=0,\\ 15/31 & a=2,\\ 0 & \text{otherwise.}\end{cases} \]

(4) We can calculate the conditional variance Var[A|B = 0] using the conditional PMF $P_{A|B}(a|0)$. First we calculate the conditional expected value
\[ E[A|B=0] = \sum_a a\,P_{A|B}(a|0) = 0(16/31) + 2(15/31) = 30/31 \tag{4} \]
The conditional second moment is
\[ E[A^2|B=0] = \sum_a a^2 P_{A|B}(a|0) = 0^2(16/31) + 2^2(15/31) = 60/31 \tag{5} \]
The conditional variance is then
\[ \mathrm{Var}[A|B=0] = E[A^2|B=0] - (E[A|B=0])^2 = \frac{960}{961} \tag{6} \]

(B) (1) The joint PDF of X and Y is
\[ f_{X,Y}(x,y) = f_{Y|X}(y|x)f_X(x) = \begin{cases} 6y & 0\le y\le x,\ 0\le x\le 1,\\ 0 & \text{otherwise.}\end{cases} \tag{7} \]

(2) From the given conditional PDF $f_{Y|X}(y|x)$,
\[ f_{Y|X}(y|1/2) = \begin{cases} 8y & 0\le y\le 1/2,\\ 0 & \text{otherwise.}\end{cases} \tag{8} \]

(3) The conditional PDF of X given Y = 1/2 is $f_{X|Y}(x|1/2) = f_{X,Y}(x,1/2)/f_Y(1/2)$. To find $f_Y(1/2)$, we integrate the joint PDF.
\[ f_Y(1/2) = \int_{-\infty}^{\infty} f_{X,Y}(x,1/2)\,dx = \int_{1/2}^{1} 6(1/2)\,dx = 3/2 \tag{9} \]
Thus, for 1/2 ≤ x ≤ 1,
\[ f_{X|Y}(x|1/2) = \frac{f_{X,Y}(x,1/2)}{f_Y(1/2)} = \frac{6(1/2)}{3/2} = 2 \tag{10} \]

(4) From the previous part, we see that given Y = 1/2, the conditional PDF of X is uniform (1/2, 1). Thus, by the definition of the uniform (a, b) PDF,
\[ \mathrm{Var}[X|Y=1/2] = \frac{(1-1/2)^2}{12} = \frac{1}{48} \tag{11} \]

Quiz 4.10
(A) (1) For random variables X and Y from Example 4.1, we observe that $P_Y(1) = 0.09$ and $P_X(0) = 0.01$. However,
\[ P_{X,Y}(0,1) = 0 \ne P_X(0)P_Y(1) \tag{1} \]

Since we have found a pair x, y such that $P_{X,Y}(x,y) \ne P_X(x)P_Y(y)$, we can conclude that X and Y are dependent. Note that whenever $P_{X,Y}(x,y) = 0$, independence requires that either $P_X(x) = 0$ or $P_Y(y) = 0$.

(2) For random variables Q and G from Quiz 4.2, it is not obvious whether they are independent. Unlike X and Y in part (1), there are no obvious pairs q, g that fail the independence requirement. In this case, we calculate the marginal PMFs from the table of the joint PMF $P_{Q,G}(q,g)$ in Quiz 4.2.

    P_{Q,G}(q,g)   g = 0   g = 1   g = 2   g = 3   P_Q(q)
    q = 0          0.06    0.18    0.24    0.12    0.60
    q = 1          0.04    0.12    0.16    0.08    0.40
    P_G(g)         0.10    0.30    0.40    0.20

Careful study of the table will verify that $P_{Q,G}(q,g) = P_Q(q)P_G(g)$ for every pair q, g. Hence Q and G are independent.

(B) (1) Since $X_1$ and $X_2$ are independent,
\[ f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1)f_{X_2}(x_2) = \begin{cases} (1-x_1/2)(1-x_2/2) & 0\le x_1\le 2,\ 0\le x_2\le 2,\\ 0 & \text{otherwise.}\end{cases} \]

(2) Let $F_X(x)$ denote the CDF of both $X_1$ and $X_2$. The CDF of $Z = \max(X_1, X_2)$ is found by observing that Z ≤ z iff $X_1 \le z$ and $X_2 \le z$. That is,
\[ P[Z\le z] = P[X_1\le z, X_2\le z] = P[X_1\le z]\,P[X_2\le z] = [F_X(z)]^2. \]
To complete the problem, we need to find the CDF of each $X_i$. From the PDF $f_X(x)$, the CDF is
\[ F_X(x) = \int_{-\infty}^{x} f_X(y)\,dy = \begin{cases} 0 & x<0,\\ x - x^2/4 & 0\le x\le 2,\\ 1 & x>2.\end{cases} \tag{6} \]
Thus for 0 ≤ z ≤ 2,
\[ F_Z(z) = (z - z^2/4)^2 \tag{7} \]
The complete expression for the CDF of Z is
\[ F_Z(z) = \begin{cases} 0 & z<0,\\ (z - z^2/4)^2 & 0\le z\le 2,\\ 1 & z>2.\end{cases} \tag{8} \]

Quiz 4.11
This problem just requires identifying the various terms in Definition 4.17 and Theorem 4.29. Specifically, from the problem statement, we know that ρ = 1/2,
\[ \mu_1 = \mu_X = 0, \qquad \mu_2 = \mu_Y = 0, \tag{1} \]
and that
\[ \sigma_1 = \sigma_X = 1, \qquad \sigma_2 = \sigma_Y = 1. \tag{2} \]

(1) Applying these facts to Definition 4.17, we have
\[ f_{X,Y}(x,y) = \frac{1}{\sqrt{3\pi^2}}\, e^{-2(x^2 - xy + y^2)/3}. \tag{3} \]

(2) By Theorem 4.30, the conditional expected value and variance of X given Y = y are
\[ E[X|Y=y] = y/2, \qquad \tilde{\sigma}_X^2 = \sigma_1^2(1-\rho^2) = 3/4. \tag{4} \]
When Y = y = 2, we see that E[X|Y = 2] = 1 and Var[X|Y = 2] = 3/4. The conditional PDF of X given Y = 2 is simply the Gaussian PDF
\[ f_{X|Y}(x|2) = \frac{1}{\sqrt{3\pi/2}}\, e^{-2(x-1)^2/3}. \tag{5} \]


Quiz 4.12
One straightforward method is to follow the approach of Example 4.28. Instead, we use an alternate approach. First we observe that X has the discrete uniform (1, 4) PMF. Also, given X = x, Y has a discrete uniform (1, x) PMF. That is,
\[ P_X(x) = \begin{cases} 1/4 & x = 1,2,3,4,\\ 0 & \text{otherwise,}\end{cases} \qquad P_{Y|X}(y|x) = \begin{cases} 1/x & y = 1,\ldots,x,\\ 0 & \text{otherwise.}\end{cases} \tag{1} \]
Given X = x, and an independent uniform (0, 1) random variable U, we can generate a sample value of Y with a discrete uniform (1, x) PMF via $Y = \lceil xU\rceil$. This observation prompts the following program:

    function xy=dtrianglerv(m)
    sx=[1;2;3;4];
    px=0.25*ones(4,1);
    x=finiterv(sx,px,m);
    y=ceil(x.*rand(m,1));
    xy=[x';y'];
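The same construction can be tried without the matcode.zip helper finiterv. The following is a minimal sketch under the assumption that X is drawn as ceil(4*rand), which is one simple way to obtain a discrete uniform (1, 4) sample; the variable f41 is illustrative.

```matlab
% Sketch: sample (X,Y) with X discrete uniform (1,4) and, given X = x,
% Y discrete uniform (1,x), using only rand and ceil.
m = 100000;
x = ceil(4 * rand(m, 1));        % X uniform on {1,2,3,4}
y = ceil(x .* rand(m, 1));       % Y = ceil(xU), uniform on {1,...,x}

% Relative frequency of (X,Y) = (4,1); the PMF value is (1/4)(1/4) = 1/16.
f41 = mean(x == 4 & y == 1);
fprintf('estimate %.4f, exact %.4f\n', f41, 1/16);
```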

Quiz Solutions – Chapter 5

Quiz 5.1
We find P[C] by integrating the joint PDF over the region of interest. Specifically,
\[ P[C] = \int_0^{1/2} dy_2\int_0^{y_2} dy_1\int_0^{1/2} dy_4\int_0^{y_4} 4\,dy_3 = 4\left(\int_0^{1/2} y_2\,dy_2\right)\left(\int_0^{1/2} y_4\,dy_4\right) = 4\cdot\frac{1}{8}\cdot\frac{1}{8} = \frac{1}{16}. \]

Quiz 5.2
By definition of A, $Y_1 = X_1$, $Y_2 = X_2 - X_1$ and $Y_3 = X_3 - X_2$. Since $0 < X_1 < X_2 < X_3$, each $Y_i$ must be a strictly positive integer. Thus, for $y_1, y_2, y_3 \in \{1, 2, \ldots\}$,
\begin{align*}
P_Y(y) &= P[Y_1 = y_1, Y_2 = y_2, Y_3 = y_3]\\
&= P[X_1 = y_1, X_2 - X_1 = y_2, X_3 - X_2 = y_3]\\
&= P[X_1 = y_1, X_2 = y_2 + y_1, X_3 = y_3 + y_2 + y_1]\\
&= (1-p)^3 p^{y_1+y_2+y_3}.
\end{align*}
By defining the vector $a = [1\ 1\ 1]'$, the complete expression for the joint PMF of Y is
\[ P_Y(y) = \begin{cases} (1-p)^3 p^{a'y} & y_1, y_2, y_3 \in \{1, 2, \ldots\},\\ 0 & \text{otherwise.}\end{cases} \]

Quiz 5.3
First we note that each marginal PDF is nonzero only if any subset of the $x_i$ obeys the ordering constraints $0 \le x_1 \le x_2 \le x_3 \le 1$. Within these constraints, we have
\begin{align*}
f_{X_1,X_2}(x_1,x_2) &= \int_{-\infty}^{\infty} f_X(x)\,dx_3 = \int_{x_2}^{1} 6\,dx_3 = 6(1-x_2), \tag{1}\\
f_{X_2,X_3}(x_2,x_3) &= \int_{-\infty}^{\infty} f_X(x)\,dx_1 = \int_0^{x_2} 6\,dx_1 = 6x_2, \tag{2}\\
f_{X_1,X_3}(x_1,x_3) &= \int_{-\infty}^{\infty} f_X(x)\,dx_2 = \int_{x_1}^{x_3} 6\,dx_2 = 6(x_3-x_1). \tag{3}
\end{align*}
In particular, we must keep in mind that $f_{X_1,X_2}(x_1,x_2) = 0$ unless $0\le x_1\le x_2\le 1$, $f_{X_2,X_3}(x_2,x_3) = 0$ unless $0\le x_2\le x_3\le 1$, and that $f_{X_1,X_3}(x_1,x_3) = 0$ unless $0\le x_1\le x_3\le 1$. The complete expressions are
\begin{align*}
f_{X_1,X_2}(x_1,x_2) &= \begin{cases} 6(1-x_2) & 0\le x_1\le x_2\le 1,\\ 0 & \text{otherwise,}\end{cases} \tag{4}\\
f_{X_2,X_3}(x_2,x_3) &= \begin{cases} 6x_2 & 0\le x_2\le x_3\le 1,\\ 0 & \text{otherwise,}\end{cases} \tag{5}\\
f_{X_1,X_3}(x_1,x_3) &= \begin{cases} 6(x_3-x_1) & 0\le x_1\le x_3\le 1,\\ 0 & \text{otherwise.}\end{cases} \tag{6}
\end{align*}
Now we can find the marginal PDFs. When $0\le x_i\le 1$ for each $x_i$,
\begin{align*}
f_{X_1}(x_1) &= \int_{-\infty}^{\infty} f_{X_1,X_2}(x_1,x_2)\,dx_2 = \int_{x_1}^{1} 6(1-x_2)\,dx_2 = 3(1-x_1)^2, \tag{7}\\
f_{X_2}(x_2) &= \int_{-\infty}^{\infty} f_{X_2,X_3}(x_2,x_3)\,dx_3 = \int_{x_2}^{1} 6x_2\,dx_3 = 6x_2(1-x_2), \tag{8}\\
f_{X_3}(x_3) &= \int_{-\infty}^{\infty} f_{X_2,X_3}(x_2,x_3)\,dx_2 = \int_0^{x_3} 6x_2\,dx_2 = 3x_3^2. \tag{9}
\end{align*}
The complete expressions are
\begin{align*}
f_{X_1}(x_1) &= \begin{cases} 3(1-x_1)^2 & 0\le x_1\le 1,\\ 0 & \text{otherwise,}\end{cases} \tag{10}\\
f_{X_2}(x_2) &= \begin{cases} 6x_2(1-x_2) & 0\le x_2\le 1,\\ 0 & \text{otherwise,}\end{cases} \tag{11}\\
f_{X_3}(x_3) &= \begin{cases} 3x_3^2 & 0\le x_3\le 1,\\ 0 & \text{otherwise.}\end{cases} \tag{12}
\end{align*}

Quiz 5.4
In the PDF $f_Y(y)$, the components have dependencies as a result of the ordering constraints $Y_1 \le Y_2$ and $Y_3 \le Y_4$. We can separate these constraints by creating the vectors
\[ V = \begin{bmatrix} Y_1\\ Y_2\end{bmatrix}, \qquad W = \begin{bmatrix} Y_3\\ Y_4\end{bmatrix}. \tag{1} \]
The joint PDF of V and W is
\[ f_{V,W}(v,w) = \begin{cases} 4 & 0\le v_1\le v_2\le 1,\ 0\le w_1\le w_2\le 1,\\ 0 & \text{otherwise.}\end{cases} \tag{2} \]

We must verify that V and W are independent. For $0\le v_1\le v_2\le 1$,
\[ f_V(v) = \iint f_{V,W}(v,w)\,dw_1\,dw_2 = \int_0^1\left(\int_{w_1}^{1} 4\,dw_2\right)dw_1 = \int_0^1 4(1-w_1)\,dw_1 = 2. \]
Similarly, for $0\le w_1\le w_2\le 1$,
\[ f_W(w) = \iint f_{V,W}(v,w)\,dv_1\,dv_2 = \int_0^1\left(\int_{v_1}^{1} 4\,dv_2\right)dv_1 = 2. \]
It follows that V and W have PDFs
\[ f_V(v) = \begin{cases} 2 & 0\le v_1\le v_2\le 1,\\ 0 & \text{otherwise,}\end{cases} \qquad f_W(w) = \begin{cases} 2 & 0\le w_1\le w_2\le 1,\\ 0 & \text{otherwise.}\end{cases} \tag{8} \]
It is easy to verify that $f_{V,W}(v,w) = f_V(v)f_W(w)$, confirming that V and W are independent vectors.

Quiz 5.5
(A) Referring to Theorem 1.19, each test is a subexperiment with three possible outcomes: L, A and R. In five trials, the vector $X = [X_1\ X_2\ X_3]'$ indicating the number of outcomes of each subexperiment has the multinomial PMF
\[ P_X(x) = \begin{cases} \dbinom{5}{x_1, x_2, x_3}(0.3)^{x_1}(0.6)^{x_2}(0.1)^{x_3} & x_1+x_2+x_3 = 5;\ x_1, x_2, x_3 \in \{0, 1, \ldots, 5\},\\ 0 & \text{otherwise.}\end{cases} \tag{1} \]
We can find the marginal PMF for each $X_i$ from the joint PMF $P_X(x)$; however, it is simpler to just start from first principles and observe that $X_1$ is the number of occurrences of L in five independent tests. If we view each test as a trial with success probability P[L] = 0.3, we see that $X_1$ is a binomial (n, p) = (5, 0.3) random variable. Similarly, $X_2$ is a binomial (5, 0.6) random variable and $X_3$ is a binomial (5, 0.1) random variable. That is, for $p_1 = 0.3$, $p_2 = 0.6$ and $p_3 = 0.1$,
\[ P_{X_i}(x) = \begin{cases} \dbinom{5}{x} p_i^x(1-p_i)^{5-x} & x = 0, 1, \ldots, 5,\\ 0 & \text{otherwise.}\end{cases} \tag{2} \]

From the marginal PMFs, we see that $X_1$, $X_2$ and $X_3$ are not independent. Hence, we must use Theorem 5.6 to find the PMF of W. In particular, since $X_1 + X_2 + X_3 = 5$ and since each $X_i$ is non-negative, $P_W(0) = P_W(1) = 0$. Furthermore,
\[ P_W(2) = P_X(1,2,2) + P_X(2,1,2) + P_X(2,2,1) = \frac{5!\left[0.3(0.6)^2(0.1)^2 + 0.3^2(0.6)(0.1)^2 + 0.3^2(0.6)^2(0.1)\right]}{2!\,2!\,1!} = 0.1458. \]
In addition, for w = 3, w = 4, and w = 5, the event W = w occurs if and only if one of the mutually exclusive events $X_1 = w$, $X_2 = w$, or $X_3 = w$ occurs. Thus,
\begin{align*}
P_W(3) &= P_{X_1}(3) + P_{X_2}(3) + P_{X_3}(3) = 0.486,\\
P_W(4) &= P_{X_1}(4) + P_{X_2}(4) + P_{X_3}(4) = 0.288,\\
P_W(5) &= P_{X_1}(5) + P_{X_2}(5) + P_{X_3}(5) = 0.0802.
\end{align*}

(B) Since each $Y_i = 2X_i + 4$, we can apply Theorem 5.10 to write
\[ f_Y(y) = \frac{1}{2^3}\, f_X\!\left(\frac{y_1-4}{2}, \frac{y_2-4}{2}, \frac{y_3-4}{2}\right) = \begin{cases} (1/8)e^{-(y_3-4)/2} & 4\le y_1\le y_2\le y_3,\\ 0 & \text{otherwise.}\end{cases} \]
Note that for other matrices A, the constraints on y resulting from the constraints $0\le X_1\le X_2\le X_3$ can be much more complicated.

Quiz 5.6
We start by finding the components $E[X_i] = \int_{-\infty}^{\infty} x f_{X_i}(x)\,dx$ of $\mu_X$. To do so, we use the marginal PDFs $f_{X_i}(x)$ found in Quiz 5.3:
\[ E[X_1] = \int_0^1 3x(1-x)^2\,dx = 1/4, \qquad E[X_2] = \int_0^1 6x^2(1-x)\,dx = 1/2, \qquad E[X_3] = \int_0^1 3x^3\,dx = 3/4. \]
To find the correlation matrix $R_X$, we need to find $E[X_iX_j]$ for all i and j. We start with the second moments:
\[ E[X_1^2] = \int_0^1 3x^2(1-x)^2\,dx = 1/10, \qquad E[X_2^2] = \int_0^1 6x^3(1-x)\,dx = 3/10, \qquad E[X_3^2] = \int_0^1 3x^4\,dx = 3/5. \]
Using marginal PDFs from Quiz 5.3, the cross terms are
\begin{align*}
E[X_1X_2] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1x_2 f_{X_1,X_2}(x_1,x_2)\,dx_1\,dx_2 = \int_0^1\left(\int_{x_1}^{1} 6x_1x_2(1-x_2)\,dx_2\right)dx_1\\
&= \int_0^1 \left[x_1 - 3x_1^3 + 2x_1^4\right]dx_1 = 3/20,\\
E[X_2X_3] &= \int_0^1\int_{x_2}^{1} 6x_2^2x_3\,dx_3\,dx_2 = \int_0^1 \left[3x_2^2 - 3x_2^4\right]dx_2 = 2/5,\\
E[X_1X_3] &= \int_0^1\int_{x_1}^{1} 6x_1x_3(x_3-x_1)\,dx_3\,dx_1 = \int_0^1 \left(2x_1x_3^3 - 3x_1^2x_3^2\right)\Big|_{x_3=x_1}^{x_3=1}dx_1\\
&= \int_0^1 \left[2x_1 - 3x_1^2 + x_1^4\right]dx_1 = 1/5.
\end{align*}
Summarizing the results, X has correlation matrix
\[ R_X = \begin{bmatrix} 1/10 & 3/20 & 1/5\\ 3/20 & 3/10 & 2/5\\ 1/5 & 2/5 & 3/5\end{bmatrix}. \]
Vector X has covariance matrix
\begin{align*}
C_X &= R_X - E[X]E[X]' = \begin{bmatrix} 1/10 & 3/20 & 1/5\\ 3/20 & 3/10 & 2/5\\ 1/5 & 2/5 & 3/5\end{bmatrix} - \begin{bmatrix} 1/4\\ 1/2\\ 3/4\end{bmatrix}\begin{bmatrix} 1/4 & 1/2 & 3/4\end{bmatrix}\\
&= \begin{bmatrix} 1/10 & 3/20 & 1/5\\ 3/20 & 3/10 & 2/5\\ 1/5 & 2/5 & 3/5\end{bmatrix} - \begin{bmatrix} 1/16 & 1/8 & 3/16\\ 1/8 & 1/4 & 3/8\\ 3/16 & 3/8 & 9/16\end{bmatrix} = \frac{1}{80}\begin{bmatrix} 3 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 3\end{bmatrix}.
\end{align*}

This problem shows that even for fairly simple joint PDFs, computing the covariance matrix by calculus can be a time consuming task.

Quiz 5.7
We observe that X = AZ + b where
\[ A = \begin{bmatrix} 2 & 1\\ 1 & -1\end{bmatrix}, \qquad b = \begin{bmatrix} 2\\ 0\end{bmatrix}. \tag{1} \]
It follows from Theorem 5.18 that $\mu_X = b$ and that
\[ C_X = AA' = \begin{bmatrix} 2 & 1\\ 1 & -1\end{bmatrix}\begin{bmatrix} 2 & 1\\ 1 & -1\end{bmatrix} = \begin{bmatrix} 5 & 1\\ 1 & 2\end{bmatrix}. \tag{2} \]

Quiz 5.8
First, we observe that Y = AT where $A = [1/31\ \ 1/31\ \cdots\ 1/31]$. Since T is a Gaussian random vector, Theorem 5.16 tells us that Y is a 1 dimensional Gaussian vector, i.e., just a Gaussian random variable. The expected value of Y is $\mu_Y = \mu_T = 80$. The covariance matrix of Y is 1 × 1 and is just equal to Var[Y]. Thus, by Theorem 5.16, $\mathrm{Var}[Y] = AC_TA'$.

    function p=julytemps(T);
    [D1 D2]=ndgrid((1:31),(1:31));
    CT=36./(1+abs(D1-D2));
    A=ones(31,1)/31.0;
    CY=(A')*CT*A;
    p=phi((T-80)/sqrt(CY));

In julytemps.m, the first two lines generate the 31 × 31 covariance matrix CT, or $C_T$. Next we calculate Var[Y]. The final step is to use the Φ(·) function to calculate P[Y < T]. Here is the output of julytemps.m:

    >> julytemps([70 75 80 85 90 95])
    ans =
        0.0000    0.0221    0.5000    0.9779    1.0000    1.0000

Note that P[Y ≤ 70] is not actually zero and that P[Y ≤ 90] is not actually 1.0000. It's just that MATLAB's short format output, invoked with the command format short, rounds off those probabilities. Here is the long format output:

    >> format long
    >> julytemps([70 75 80 85 90 95])
    ans =
      Columns 1 through 4
       0.00002844263128   0.02207383067604   0.50000000000000   0.97792616932396
      Columns 5 through 6
       0.99997155736872   0.99999999922010

The ndgrid function is a useful way to calculate many covariance matrices. However, in this problem, $C_T$ has a special structure; the i, jth element is
\[ C_T(i,j) = c_{|i-j|} = \frac{36}{1 + |i-j|}. \tag{1} \]
If we write out the elements of the covariance matrix, we see that
\[ C_T = \begin{bmatrix} c_0 & c_1 & \cdots & c_{30}\\ c_1 & c_0 & \ddots & \vdots\\ \vdots & \ddots & \ddots & c_1\\ c_{30} & \cdots & c_1 & c_0\end{bmatrix}. \tag{2} \]
This covariance matrix is known as a symmetric Toeplitz matrix. We will see in Chapters 9 and 11 that Toeplitz covariance matrices are quite common. In fact, MATLAB has a toeplitz function for generating them. The function julytemps2 uses toeplitz to generate the covariance matrix $C_T$.

    function p=julytemps2(T);
    c=36./(1+abs(0:30));
    CT=toeplitz(c);
    A=ones(31,1)/31.0;
    CY=(A')*CT*A;
    p=phi((T-80)/sqrt(CY));
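For readers without the matcode.zip function phi, the calculation can be sketched with built-in functions only. The anonymous function Phi below is an assumption of this sketch: it writes the standard normal CDF as 0.5*erfc(-z/sqrt(2)).

```matlab
% Sketch: the julytemps2 calculation using only built-in functions.
% Assumption: Phi(z) = 0.5*erfc(-z/sqrt(2)) replaces phi from matcode.zip.
Phi = @(z) 0.5 * erfc(-z / sqrt(2));

c  = 36 ./ (1 + (0:30));          % c_k = 36/(1+k), first row of C_T
CT = toeplitz(c);                 % 31-by-31 symmetric Toeplitz covariance
a  = ones(31, 1) / 31;            % averaging weights
CY = a' * CT * a;                 % Var[Y]

T = [70 75 80 85 90 95];
p = Phi((T - 80) / sqrt(CY));     % P[Y <= T] for each threshold
disp(p);
```

The displayed values should agree with the short-format output quoted in the solution to the digits shown there.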

Quiz Solutions – Chapter 6

Quiz 6.1
Let $K_1, \ldots, K_n$ denote a sequence of iid random variables each with PMF
\[ P_K(k) = \begin{cases} 1/4 & k = 1, \ldots, 4,\\ 0 & \text{otherwise.}\end{cases} \tag{1} \]
We can write $W_n$ in the form of $W_n = K_1 + \cdots + K_n$. First, we note that the first two moments of $K_i$ are
\begin{align*}
E[K_i] &= (1 + 2 + 3 + 4)/4 = 2.5 \tag{2}\\
E[K_i^2] &= (1^2 + 2^2 + 3^2 + 4^2)/4 = 7.5 \tag{3}
\end{align*}
Thus the variance of $K_i$ is
\[ \mathrm{Var}[K_i] = E[K_i^2] - (E[K_i])^2 = 7.5 - (2.5)^2 = 1.25 \tag{4} \]
Since $E[K_i] = 2.5$, the expected value of $W_n$ is
\[ E[W_n] = E[K_1] + \cdots + E[K_n] = nE[K_i] = 2.5n \tag{5} \]
Since the rolls are independent, the random variables $K_1, \ldots, K_n$ are independent. Hence, by Theorem 6.3, the variance of the sum equals the sum of the variances. That is,
\[ \mathrm{Var}[W_n] = \mathrm{Var}[K_1] + \cdots + \mathrm{Var}[K_n] = 1.25n \tag{6} \]

Quiz 6.2
Random variables X and Y have PDFs
\[ f_X(x) = \begin{cases} 3e^{-3x} & x\ge 0,\\ 0 & \text{otherwise,}\end{cases} \qquad f_Y(y) = \begin{cases} 2e^{-2y} & y\ge 0,\\ 0 & \text{otherwise.}\end{cases} \tag{1} \]
Since X and Y are nonnegative, W = X + Y is nonnegative. By Theorem 6.5, the PDF of W = X + Y is
\[ f_W(w) = \int_{-\infty}^{\infty} f_X(w-y)f_Y(y)\,dy = 6\int_0^w e^{-3(w-y)}e^{-2y}\,dy \tag{2} \]
Fortunately, this integral is easy to evaluate. For w > 0,
\[ f_W(w) = 6e^{-3w}\,e^{y}\Big|_0^w = 6\left(e^{-2w} - e^{-3w}\right) \tag{3} \]
Since $f_W(w) = 0$ for w < 0, a complete expression for the PDF of W is
\[ f_W(w) = \begin{cases} 6e^{-2w}\left(1 - e^{-w}\right) & w\ge 0,\\ 0 & \text{otherwise.}\end{cases} \tag{4} \]
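The convolution integral of Quiz 6.2 can be checked numerically at a single point. This is a minimal sketch, assuming the built-in integral function is available; the test point w = 1 and the names fWnum and fWexact are arbitrary choices.

```matlab
% Sketch: compare the convolution integral with the closed form at w = 1.
fX = @(x) 3 * exp(-3 * x) .* (x >= 0);   % exponential (3) PDF
fY = @(y) 2 * exp(-2 * y) .* (y >= 0);   % exponential (2) PDF

w = 1;                                   % arbitrary test point
fWnum   = integral(@(y) fX(w - y) .* fY(y), 0, w);
fWexact = 6 * (exp(-2 * w) - exp(-3 * w));

fprintf('numerical %.6f, closed form %.6f\n', fWnum, fWexact);
```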

Quiz 6.3
The MGF of K is
\[ \phi_K(s) = E[e^{sK}] = \sum_{k=0}^{4} (0.2)e^{sk} = 0.2\left(1 + e^s + e^{2s} + e^{3s} + e^{4s}\right) \tag{1} \]
We find the moments by taking derivatives. The first derivative of $\phi_K(s)$ is
\[ \frac{d\phi_K(s)}{ds} = 0.2\left(e^s + 2e^{2s} + 3e^{3s} + 4e^{4s}\right) \tag{2} \]
Evaluating the derivative at s = 0 yields
\[ E[K] = \frac{d\phi_K(s)}{ds}\bigg|_{s=0} = 0.2(1 + 2 + 3 + 4) = 2 \tag{3} \]
To find higher-order moments, we continue to take derivatives:
\begin{align*}
E[K^2] &= \frac{d^2\phi_K(s)}{ds^2}\bigg|_{s=0} = 0.2\left(e^s + 4e^{2s} + 9e^{3s} + 16e^{4s}\right)\Big|_{s=0} = 6\\
E[K^3] &= \frac{d^3\phi_K(s)}{ds^3}\bigg|_{s=0} = 0.2\left(e^s + 8e^{2s} + 27e^{3s} + 64e^{4s}\right)\Big|_{s=0} = 20\\
E[K^4] &= \frac{d^4\phi_K(s)}{ds^4}\bigg|_{s=0} = 0.2\left(e^s + 16e^{2s} + 81e^{3s} + 256e^{4s}\right)\Big|_{s=0} = 70.8
\end{align*}

Quiz 6.4
(A) Each $K_i$ has MGF
\[ \phi_K(s) = E[e^{sK_i}] = \frac{e^s + e^{2s} + \cdots + e^{ns}}{n} = \frac{e^s(1 - e^{ns})}{n(1 - e^s)} \tag{1} \]
Since the sequence of $K_i$ is independent, Theorem 6.8 says the MGF of J is
\[ \phi_J(s) = (\phi_K(s))^m = \frac{e^{ms}(1 - e^{ns})^m}{n^m(1 - e^s)^m} \tag{2} \]

(B) Since the set of $\alpha^jX_j$ are independent Gaussian random variables, Theorem 6.10 says that W is a Gaussian random variable. Thus to find the PDF of W, we need only find the expected value and variance. Since the expectation of the sum equals the sum of the expectations:
\[ E[W] = \alpha E[X_1] + \alpha^2E[X_2] + \cdots + \alpha^nE[X_n] = 0 \tag{3} \]

Since the $\alpha^jX_j$ are independent, the variance of the sum equals the sum of the variances:
\begin{align*}
\mathrm{Var}[W] &= \alpha^2\mathrm{Var}[X_1] + \alpha^4\mathrm{Var}[X_2] + \cdots + \alpha^{2n}\mathrm{Var}[X_n] \tag{4}\\
&= \alpha^2 + 2(\alpha^2)^2 + 3(\alpha^2)^3 + \cdots + n(\alpha^2)^n \tag{5}
\end{align*}
Defining $q = \alpha^2$, we can use Math Fact B.6 to write
\[ \mathrm{Var}[W] = \frac{\alpha^2 - \alpha^{2n+2}\left[1 + n(1-\alpha^2)\right]}{(1-\alpha^2)^2} \tag{6} \]
With E[W] = 0 and $\sigma_W^2 = \mathrm{Var}[W]$, we can write the PDF of W as
\[ f_W(w) = \frac{1}{\sqrt{2\pi\sigma_W^2}}\, e^{-w^2/2\sigma_W^2} \tag{7} \]

Quiz 6.5
(1) From Table 6.1, each $X_i$ has MGF $\phi_X(s)$ and random variable N has MGF $\phi_N(s)$ where
\[ \phi_X(s) = \frac{1}{1-s}, \qquad \phi_N(s) = \frac{\frac{1}{5}e^s}{1 - \frac{4}{5}e^s}. \tag{1} \]
From Theorem 6.12, R has MGF
\[ \phi_R(s) = \phi_N(\ln\phi_X(s)) = \frac{\frac{1}{5}\phi_X(s)}{1 - \frac{4}{5}\phi_X(s)} \tag{2} \]
Substituting the expression for $\phi_X(s)$ yields
\[ \phi_R(s) = \frac{\frac{1}{5}}{\frac{1}{5} - s}. \tag{3} \]

(2) From Table 6.1, we see that R has the MGF of an exponential (1/5) random variable. The corresponding PDF is
\[ f_R(r) = \begin{cases} (1/5)e^{-r/5} & r\ge 0,\\ 0 & \text{otherwise.}\end{cases} \tag{4} \]
This quiz is an example of the general result that a geometric sum of exponential random variables is an exponential random variable.

Quiz 6.6
(1) The expected access time is
\[ E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{12} \frac{x}{12}\,dx = 6 \text{ msec} \tag{1} \]

(2) The second moment of the access time is
\[ E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx = \int_0^{12} \frac{x^2}{12}\,dx = 48 \tag{2} \]
The variance of the access time is $\mathrm{Var}[X] = E[X^2] - (E[X])^2 = 48 - 36 = 12$.

(3) Using $X_i$ to denote the access time of block i, we can write
\[ A = X_1 + X_2 + \cdots + X_{12} \tag{3} \]
Since the expectation of the sum equals the sum of the expectations,
\[ E[A] = E[X_1] + \cdots + E[X_{12}] = 12E[X] = 72 \text{ msec} \tag{4} \]

(4) Since the $X_i$ are independent,
\[ \mathrm{Var}[A] = \mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_{12}] = 12\,\mathrm{Var}[X] = 144 \tag{5} \]
Hence, the standard deviation of A is $\sigma_A = 12$.

(5) To use the central limit theorem, we write
\begin{align*}
P[A > 75] &= 1 - P[A \le 75]\\
&= 1 - P\left[\frac{A - E[A]}{\sigma_A} \le \frac{75 - E[A]}{\sigma_A}\right]\\
&\approx 1 - \Phi\left(\frac{75 - 72}{12}\right) = 1 - 0.5987 = 0.4013
\end{align*}
Note that we used Table 3.1 to look up Φ(0.25).

(6) Once again, we use the central limit theorem and Table 3.1 to estimate
\begin{align*}
P[A < 48] &= P\left[\frac{A - E[A]}{\sigma_A} < \frac{48 - E[A]}{\sigma_A}\right]\\
&\approx \Phi\left(\frac{48 - 72}{12}\right) = 1 - \Phi(2) = 1 - 0.9773 = 0.0227
\end{align*}
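The table lookups in Quiz 6.6 can be reproduced numerically. The sketch below assumes the standard normal CDF is written as 0.5*erfc(-z/sqrt(2)); the variable names are illustrative.

```matlab
% Sketch: central limit theorem estimates for the total access time A.
% Assumption: Phi(z) = 0.5*erfc(-z/sqrt(2)) is the standard normal CDF.
Phi = @(z) 0.5 * erfc(-z / sqrt(2));

EX = 6; VarX = 12; n = 12;        % moments of one access time, block count
EA = n * EX;                      % E[A] = 72
sigmaA = sqrt(n * VarX);          % sigma_A = 12

pOver75  = 1 - Phi((75 - EA) / sigmaA);   % estimate of P[A > 75]
pUnder48 = Phi((48 - EA) / sigmaA);       % estimate of P[A < 48]
fprintf('P[A>75] ~ %.4f, P[A<48] ~ %.4f\n', pOver75, pUnder48);
```

Small differences in the last digit relative to the values above come from the rounding of the entries in Table 3.1.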

Quiz 6.7
Random variable $K_n$ has a binomial distribution for n trials and success probability P[V] = 3/4.

(1) The expected number of voice calls out of 48 calls is $E[K_{48}] = 48P[V] = 36$.

(2) The variance of $K_{48}$ is
\[ \mathrm{Var}[K_{48}] = 48P[V](1 - P[V]) = 48(3/4)(1/4) = 9 \tag{1} \]
Thus $K_{48}$ has standard deviation $\sigma_{K_{48}} = 3$.

(3) Using the ordinary central limit theorem and Table 3.1 yields
\[ P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 - 36}{3}\right) - \Phi\left(\frac{30 - 36}{3}\right) = \Phi(2) - \Phi(-2) \tag{2} \]
Recalling that Φ(−x) = 1 − Φ(x), we have
\[ P[30 \le K_{48} \le 42] \approx 2\Phi(2) - 1 = 0.9545 \tag{3} \]

(4) Since $K_{48}$ is a discrete random variable, we can use the De Moivre-Laplace approximation to estimate
\[ P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 + 0.5 - 36}{3}\right) - \Phi\left(\frac{30 - 0.5 - 36}{3}\right) = 2\Phi(2.16666) - 1 = 0.9697 \]

Quiz 6.8
The train interarrival times $X_1$, $X_2$, $X_3$ are iid exponential (λ) random variables. The arrival time of the third train is
\[ W = X_1 + X_2 + X_3. \tag{1} \]
In Theorem 6.11, we found that the sum of three iid exponential (λ) random variables is an Erlang (n = 3, λ) random variable. From Appendix A, we find that W has expected value and variance
\[ E[W] = 3/\lambda = 6, \qquad \mathrm{Var}[W] = 3/\lambda^2 = 12. \tag{2} \]

(1) By the Central Limit Theorem,
\[ P[W > 20] = P\left[\frac{W - 6}{\sqrt{12}} > \frac{20 - 6}{\sqrt{12}}\right] \approx Q(7/\sqrt{3}) = 2.66\times 10^{-5} \tag{3} \]

(2) To use the Chernoff bound, we note that the MGF of W is
\[ \phi_W(s) = \left(\frac{\lambda}{\lambda - s}\right)^3 = \frac{1}{(1-2s)^3} \tag{4} \]
The Chernoff bound states that
\[ P[W > 20] \le \min_{s\ge 0} e^{-20s}\phi_W(s) = \min_{s\ge 0} \frac{e^{-20s}}{(1-2s)^3} \tag{5} \]
To minimize $h(s) = e^{-20s}/(1-2s)^3$, we set the derivative of h(s) to zero:
\[ \frac{dh(s)}{ds} = \frac{-20(1-2s)^3e^{-20s} + 6e^{-20s}(1-2s)^2}{(1-2s)^6} = 0 \tag{6} \]
This implies 20(1 − 2s) = 6 or s = 7/20. Applying s = 7/20 into the Chernoff bound yields
\[ P[W > 20] \le \frac{e^{-20s}}{(1-2s)^3}\bigg|_{s=7/20} = (10/3)^3e^{-7} = 0.0338 \tag{7} \]

(3) Theorem 3.11 says that for any w > 0, the CDF of the Erlang (λ, 3) random variable W satisfies
\[ F_W(w) = 1 - \sum_{k=0}^{2} \frac{(\lambda w)^ke^{-\lambda w}}{k!} \tag{8} \]
Equivalently, for λ = 1/2 and w = 20,
\[ P[W > 20] = 1 - F_W(20) = e^{-10}\left(1 + \frac{10}{1!} + \frac{10^2}{2!}\right) = 61e^{-10} = 0.0028 \]
Although the Chernoff bound is relatively weak in that it overestimates the probability by roughly a factor of 12, it is a valid bound. By contrast, the Central Limit Theorem approximation grossly underestimates the true probability.

Quiz 6.9
One solution to this problem is to follow the approach of Example 6.19:

    %unifbinom100.m
    sx=0:100;sy=0:100;
    px=binomialpmf(100,0.5,sx);
    py=duniformpmf(0,100,sy);
    [SX,SY]=ndgrid(sx,sy);
    [PX,PY]=ndgrid(px,py);
    SW=SX+SY; PW=PX.*PY;
    sw=unique(SW);
    pw=finitepmf(SW,PW,sw);
    pmfplot(sw,pw,'\itw','\itP_W(w)');

A graph of the PMF $P_W(w)$ appears in Figure 2. With some thought, it should be apparent that the finitepmf function is implementing the convolution of the two PMFs.

[Figure 2: From Quiz 6.9, the PMF $P_W(w)$ of the independent sum of a binomial (100, 0.5) random variable and a discrete uniform (0, 100) random variable, plotted for w from 0 to 200.]
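The remark that finitepmf implements a convolution suggests a direct calculation with the built-in conv function. The sketch below is not the authors' program; it assumes the binomial PMF is evaluated with gammaln in place of binomialpmf from matcode.zip.

```matlab
% Sketch: PMF of W = X + Y by direct convolution of the two PMFs.
% Assumption: binomial (100,0.5) PMF computed through gammaln.
k  = 0:100;
px = exp(gammaln(101) - gammaln(k + 1) - gammaln(101 - k) + 100 * log(0.5));
py = ones(1, 101) / 101;          % discrete uniform (0,100) PMF
pw = conv(px, py);                % PMF of W on the values 0,1,...,200
sw = 0:200;

fprintf('sum = %.6f, E[W] = %.4f\n', sum(pw), sum(sw .* pw));
```

Because both supports start at 0 and have unit spacing, element j of pw is the probability that W = j − 1.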

Quiz Solutions – Chapter 7

Quiz 7.1
An exponential random variable with expected value 1 also has variance 1. By Theorem 7.1, $M_n(X)$ has variance $\mathrm{Var}[M_n(X)] = 1/n$. Hence, we need n = 100 samples.

Quiz 7.2
The arrival time of the third elevator is $W = X_1 + X_2 + X_3$. Since each $X_i$ is uniform (0, 30),
\[ E[X_i] = 15, \qquad \mathrm{Var}[X_i] = \frac{(30-0)^2}{12} = 75. \tag{1} \]
Thus $E[W] = 3E[X_i] = 45$, and $\mathrm{Var}[W] = 3\,\mathrm{Var}[X_i] = 225$.

(1) By the Markov inequality,
\[ P[W > 75] \le \frac{E[W]}{75} = \frac{45}{75} = \frac{3}{5} \tag{2} \]

(2) By the Chebyshev inequality,
\[ P[W > 75] = P[W - E[W] > 30] \le P[|W - E[W]| > 30] \le \frac{\mathrm{Var}[W]}{30^2} = \frac{225}{900} = \frac{1}{4} \]

Quiz 7.3
Define the random variable $W = (X - \mu_X)^2$. Observe that $V_{100}(X) = M_{100}(W)$. By Theorem 7.6, the mean square error is
\[ E\left[(M_{100}(W) - \mu_W)^2\right] = \frac{\mathrm{Var}[W]}{100} \tag{1} \]
Observe that $\mu_X = 0$ so that $W = X^2$. Thus,
\begin{align*}
\mu_W &= E[X^2] = \int_{-1}^{1} x^2f_X(x)\,dx = 1/3 \tag{2}\\
E[W^2] &= E[X^4] = \int_{-1}^{1} x^4f_X(x)\,dx = 1/5 \tag{3}
\end{align*}
Therefore $\mathrm{Var}[W] = E[W^2] - \mu_W^2 = 1/5 - (1/3)^2 = 4/45$ and the mean square error is 4/4500 = 0.000889.

Quiz 7. n n Note that if M100 (X ) = 0. the 0.95 (3) p(1 − p) √ for every value of p. we require that 1.65 0. we have α ≤ 0. √ This implies c n√ 2.645 Mn (X ) − √ ≤ p ≤ Mn (X ) + √ . we require that ≥ c ≥ (0. we must satisfy c n ≥ 1.99 confidence interval. Equivalently.3355 ≤ p ≤ 0.e. Since p(1 − p) ≤ 1/4 for all p. at time k.5 Following the approach of bernoullitraces. we can use a Gaussian approximation for Mn (X ).4 Assuming the number n of samples is large.645 0.99 confidence interval estimate is 0. then the 0.m. 48 (7) (6) . 4 n n The 0.58)/ n. OK(k) counts the fraction of sample paths that have sample mean within one standard error of p. implying (c n/( p(1− p))) ≥ 0. Since (x) is an increasing function of x.25)(2. The program bernoullisample.m generates graphs the number of traces within one standard error as a function of the time.9 confidence interval estimate of p is 0.9 or α ≤ 0.4.99 confidence interval estimate is 0. each sample path having n = 100 Bernoulli traces. In this case. the number of trials in each trace.Quiz 7.13 which says that the interval estimate Mn (X ) − c ≤ p ≤ Mn (X ) + c (1) has confidence coefficient 1 − α where α =2−2 √ c n . p(1 − p) (2) We must ensure for every value of p that 1 − α ≥ 0.41 c≥ √ = √ .1.995.41 0.65 p(1 − p). we generate m = 1000 sample paths. we apply Theorem 7.01. we must have √ c n ≥ 0.4645. The interval is wide because the 0.41 Mn (X ) − √ ≤ p ≤ Mn (X ) + √ . i. n n (5) (4) √ For the 0.58 p(1 − p). Since p(1 − p) ≤ 1/4 for all p. SinceE[X ] = p and Var[X ] = p(1 − p).99 confidence is high.

    function OK=bernoullisample(n,m,p);
    x=reshape(bernoullirv(p,m*n),n,m);
    nn=(1:n)'*ones(1,m);
    MN=cumsum(x)./nn;
    stderr=sqrt(p*(1-p))./sqrt((1:n)');
    stderrmat=stderr*ones(1,m);
    OK=sum(abs(MN-p)<stderrmat,2)/m;
    plot(1:n,OK,'-s');

The following graph was generated by bernoullisample(100,5000,0.5):

[Figure: OK(k), the fraction of sample paths within one standard error of p, plotted against the number of trials k from 0 to 100; the vertical axis runs from 0.4 to 1.]

As we would expect, as m gets large, the fraction of traces within one standard error approaches 2Φ(1) − 1 ≈ 0.68. The unusual sawtooth pattern, though perhaps unexpected, is examined in Problem 7.5.2.
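A version of the Quiz 7.5 experiment that does not need bernoullirv from matcode.zip might look as follows. It is a sketch only: Bernoulli trials are assumed to be generated as rand < p, repmat replaces the outer products, and the plotting step is omitted.

```matlab
% Sketch: fraction of sample paths within one standard error of p.
% Assumption: Bernoulli (p) trials are generated as rand < p.
n = 100; m = 5000; p = 0.5;
x  = double(rand(n, m) < p);                 % column j is sample path j
MN = cumsum(x) ./ repmat((1:n)', 1, m);      % running sample means M_k(X)
se = sqrt(p * (1 - p)) ./ sqrt((1:n)');      % standard error at time k
OK = mean(abs(MN - p) < repmat(se, 1, m), 2);% fraction within one std error

fprintf('OK(1) = %.3f, OK(%d) = %.3f\n', OK(1), n, OK(n));
```

With p = 0.5 the first sample mean is either 0 or 1, so it is never strictly within one standard error (0.5) of p, and OK(1) is 0.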

Quiz Solutions – Chapter 8

Quiz 8.1
From the problem statement, each $X_i$ has PDF and CDF
\[ f_{X_i}(x) = \begin{cases} e^{-x} & x\ge 0,\\ 0 & \text{otherwise,}\end{cases} \qquad F_{X_i}(x) = \begin{cases} 0 & x<0,\\ 1 - e^{-x} & x\ge 0.\end{cases} \tag{1} \]
Hence, the CDF of the maximum of $X_1, \ldots, X_{15}$ obeys
\[ F_X(x) = P[X\le x] = P[X_1\le x, X_2\le x, \cdots, X_{15}\le x] = \left[P[X_i\le x]\right]^{15}. \tag{2} \]
This implies that for x ≥ 0,
\[ F_X(x) = \left[F_{X_i}(x)\right]^{15} = \left[1 - e^{-x}\right]^{15}. \tag{3} \]
To design a significance test, we must choose a rejection region for X. A reasonable choice is to reject the hypothesis if X is too small. That is, let R = {X ≤ r}. For a significance level of α = 0.01, we obtain
\[ \alpha = P[X\le r] = (1 - e^{-r})^{15} = 0.01. \tag{4} \]
It is straightforward to show that
\[ r = -\ln\left[1 - (0.01)^{1/15}\right] = 1.33. \tag{5} \]
Hence, if we observe X < 1.33, then we reject the hypothesis.

Quiz 8.2
From the problem statement, the conditional PMFs of K are
\begin{align*}
P_{K|H_0}(k) &= \begin{cases} \dfrac{(10^4)^ke^{-10^4}}{k!} & k = 0, 1, \ldots,\\ 0 & \text{otherwise,}\end{cases} \tag{1}\\
P_{K|H_1}(k) &= \begin{cases} \dfrac{(10^6)^ke^{-10^6}}{k!} & k = 0, 1, \ldots,\\ 0 & \text{otherwise.}\end{cases} \tag{2}
\end{align*}
Since the two hypotheses are equally likely, the MAP and ML tests are the same. From Theorem 8.6, the ML hypothesis rule is
\[ k\in A_0 \text{ if } P_{K|H_0}(k) \ge P_{K|H_1}(k); \qquad k\in A_1 \text{ otherwise.} \tag{3} \]
This rule simplifies to
\[ k\in A_0 \text{ if } k\le k^* = \frac{10^6 - 10^4}{\ln 100} = 214{,}975.7; \qquad k\in A_1 \text{ otherwise.} \tag{4} \]
Thus if we observe at least 214,976 photons, then we accept hypothesis $H_1$.
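Both thresholds in this chapter's first two quizzes are closed-form expressions, so they are easy to evaluate directly. The following sketch uses illustrative variable names (r, kStar) and is meant only as a numerical check.

```matlab
% Sketch: numerical values of the two decision thresholds.
alpha = 0.01;
r = -log(1 - alpha^(1/15));            % Quiz 8.1 rejection threshold
kStar = (1e6 - 1e4) / log(100);        % Quiz 8.2 ML threshold k*

% Check that r gives the requested significance level.
sigLevel = (1 - exp(-r))^15;

fprintf('r = %.4f, k* = %.1f, level = %.4f\n', r, kStar, sigLevel);
```

Rounding kStar up gives the smallest photon count for which hypothesis H1 is accepted.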

Quiz 8.3

For the QPSK system, a symbol error occurs when $s_i$ is transmitted but $(X_1, X_2) \in A_j$ for some $j \ne i$. For a QPSK system, it is easier to calculate the probability of a correct decision. Given $H_0$, the conditional probability of a correct decision is

\[ P[C|H_0] = P[X_1 > 0, X_2 > 0 | H_0] = P\left[\sqrt{E/2} + N_1 > 0, \sqrt{E/2} + N_2 > 0\right] \tag{1} \]

Because of the symmetry of the signals, $P[C|H_0] = P[C|H_i]$ for all $i$. This implies the probability of a correct decision is $P[C] = P[C|H_0]$. Since $N_1$ and $N_2$ are iid Gaussian $(0, \sigma)$ random variables, we have

\[ P[C] = P[C|H_0] = P\left[\sqrt{E/2} + N_1 > 0\right] P\left[\sqrt{E/2} + N_2 > 0\right] \tag{2} \]
\[ = \left(P\left[N_1 > -\sqrt{E/2}\right]\right)^2 \tag{3} \]
\[ = \left[1 - \Phi\left(\frac{-\sqrt{E/2}}{\sigma}\right)\right]^2 \tag{4} \]

Since $\Phi(-x) = 1 - \Phi(x)$, we have $P[C] = \Phi^2\left(\sqrt{E/2\sigma^2}\right)$. Equivalently, the probability of error is

\[ P_{\mathrm{ERR}} = 1 - P[C] = 1 - \Phi^2\left(\sqrt{\frac{E}{2\sigma^2}}\right) \tag{5} \]

Quiz 8.4

To generate the ROC, the existing program sqdistor already calculates the miss probability PMISS = P01 and the false alarm probability PFA = P10. The modified program, sqdistroc.m, is essentially the same as sqdistor except the output is a matrix FM whose columns are the false alarm and miss probabilities. Next, the program sqdistrocplot.m calls sqdistroc three times to generate a plot that compares the receiver performance for the three requested values of d. Here is the modified code:

    function FM=sqdistroc(v,d,m,T)
    %square law distortion recvr
    %P(error) for m bits tested
    %transmit v volts or -v volts,
    %add N volts, N is Gauss(0,1)
    %add d(v+N)^2 distortion
    %receive 1 if x>T, otherwise 0
    %FM = [P(FA) P(MISS)]
    x=(v+randn(m,1));
    [XX,TT]=ndgrid(x,T(:));
    P01=sum((XX+d*(XX.^2)< TT),1)/m;
    x= -v+randn(m,1);
    [XX,TT]=ndgrid(x,T(:));
    P10=sum((XX+d*(XX.^2)>TT),1)/m;
    FM=[P10(:) P01(:)];

    function FM=sqdistrocplot(v,m,T);
    FM1=sqdistroc(v,0.1,m,T);
    FM2=sqdistroc(v,0.2,m,T);
    FM5=sqdistroc(v,0.3,m,T);
    FM=[FM1 FM2 FM5];
    loglog(FM1(:,1),FM1(:,2),'-k',FM2(:,1),FM2(:,2),'--k', ...
        FM5(:,1),FM5(:,2),':k');
    legend('\it d=0.1','\it d=0.2','\it d=0.3',3)
    ylabel('P_{MISS}');
    xlabel('P_{FA}');

To see the effect of d, the commands

    T=-3:0.1:3;
    sqdistrocplot(3,100000,T);

generated the plot shown in Figure 3.

[Figure 3 plot: log-log curves of P_MISS versus P_FA for d = 0.1, d = 0.2 and d = 0.3.]

Figure 3: The receiver operating curve for the communications system of Quiz 8.4 with squared distortion.
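The Monte Carlo logic of sqdistroc can also be sketched in Python without any toolbox. This is a rough translation under stated assumptions, not the authors' code: the name sqdist_roc, the seed argument and the smaller sample size are introduced here for illustration, and no plotting is attempted.

```python
import random

def sqdist_roc(v, d, m, thresholds, seed=0):
    """Estimate (P_FA, P_MISS) pairs for the square-law distortion receiver.
    Transmit +v or -v volts, add Gaussian (0,1) noise, distort x -> x + d*x^2,
    and decide 1 when the distorted value exceeds the threshold."""
    rng = random.Random(seed)
    sent_plus = [v + rng.gauss(0.0, 1.0) for _ in range(m)]
    sent_minus = [-v + rng.gauss(0.0, 1.0) for _ in range(m)]
    y_plus = [x + d * x * x for x in sent_plus]
    y_minus = [x + d * x * x for x in sent_minus]
    roc = []
    for t in thresholds:
        p_miss = sum(y < t for y in y_plus) / m    # +v sent, decided 0
        p_fa = sum(y > t for y in y_minus) / m     # -v sent, decided 1
        roc.append((p_fa, p_miss))
    return roc

thresholds = [-3 + 0.1 * k for k in range(61)]
roc = sqdist_roc(3, 0.1, 5000, thresholds)
```

As the threshold increases, the false alarm estimate can only fall and the miss estimate can only rise, which is the trade-off the ROC displays.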

Quiz Solutions – Chapter 9

Quiz 9.1

(1) First, we calculate the marginal PDF for $0 \le y \le 1$:

\[ f_Y(y) = \int_0^y 2(y + x)\,dx = \left. 2xy + x^2 \right|_{x=0}^{x=y} = 3y^2 \tag{1} \]

This implies the conditional PDF of $X$ given $Y$ is

\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)} = \begin{cases} \dfrac{2}{3y} + \dfrac{2x}{3y^2} & 0 \le x \le y \\ 0 & \text{otherwise} \end{cases} \tag{2} \]

(2) The minimum mean square error estimate of $X$ given $Y = y$ is

\[ \hat{x}_M(y) = E[X|Y = y] = \int_0^y \left(\frac{2x}{3y} + \frac{2x^2}{3y^2}\right) dx = 5y/9 \tag{3} \]

Thus the MMSE estimator of $X$ given $Y$ is $\hat{X}_M(Y) = 5Y/9$.

(3) To obtain the conditional PDF $f_{Y|X}(y|x)$, we need the marginal PDF $f_X(x)$. For $0 \le x \le 1$,

\[ f_X(x) = \int_x^1 2(y + x)\,dy = \left. y^2 + 2xy \right|_{y=x}^{y=1} = 1 + 2x - 3x^2 \tag{4} \]

For $0 \le x \le 1$, the conditional PDF of $Y$ given $X$ is

\[ f_{Y|X}(y|x) = \begin{cases} \dfrac{2(y+x)}{1+2x-3x^2} & x \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{5} \]

(4) The MMSE estimate of $Y$ given $X = x$ is

\[ \hat{y}_M(x) = E[Y|X = x] = \int_x^1 \frac{2y^2 + 2xy}{1 + 2x - 3x^2}\,dy \tag{6} \]
\[ = \left. \frac{2y^3/3 + xy^2}{1 + 2x - 3x^2} \right|_{y=x}^{y=1} \tag{7} \]
\[ = \frac{2 + 3x - 5x^3}{3 + 6x - 9x^2} \tag{8} \]
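The two conditional means can be checked by brute-force numerical integration. The sketch below assumes the joint PDF is f(x, y) = 2(x + y) on 0 ≤ x ≤ y ≤ 1, which is inferred from the integrands and limits above rather than quoted from the problem statement; the helper names are hypothetical.

```python
def integrate(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def joint_pdf(x, y):
    # Assumed joint PDF, inferred from the solution's integrals.
    return 2.0 * (x + y) if 0.0 <= x <= y <= 1.0 else 0.0

def mmse_x_given_y(y):
    f_y = integrate(lambda x: joint_pdf(x, y), 0.0, y)
    return integrate(lambda x: x * joint_pdf(x, y), 0.0, y) / f_y

def mmse_y_given_x(x):
    f_x = integrate(lambda y: joint_pdf(x, y), x, 1.0)
    return integrate(lambda y: y * joint_pdf(x, y), x, 1.0) / f_x

x_hat = mmse_x_given_y(0.6)   # closed form: 5(0.6)/9
y_hat = mmse_y_given_x(0.3)   # closed form: (2 + 3x - 5x^3)/(3 + 6x - 9x^2)
```

Agreement between the numerical values and the closed forms is a useful guard against sign or limit errors in the hand integration.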

Quiz 9.2

(1) Since the expectation of the sum equals the sum of the expectations,

\[ E[R] = E[T] + E[X] = 0 \tag{1} \]

(2) Since $T$ and $X$ are independent, the variance of the sum $R = T + X$ is

\[ \mathrm{Var}[R] = \mathrm{Var}[T] + \mathrm{Var}[X] = 9 + 3 = 12 \tag{2} \]

(3) Since $T$ and $R$ have expected values $E[R] = E[T] = 0$,

\[ \mathrm{Cov}[T, R] = E[TR] = E[T(T + X)] = E[T^2] + E[TX] \tag{3} \]

Since $T$ and $X$ are independent and have zero expected value, $E[TX] = E[T]E[X] = 0$ and $E[T^2] = \mathrm{Var}[T]$. Thus $\mathrm{Cov}[T, R] = \mathrm{Var}[T] = 9$.

(4) From Definition 4.8, the correlation coefficient of $T$ and $R$ is

\[ \rho_{T,R} = \frac{\mathrm{Cov}[T, R]}{\sqrt{\mathrm{Var}[R]\,\mathrm{Var}[T]}} = \frac{\sigma_T}{\sigma_R} = \sqrt{3}/2 \tag{4} \]

(5) From Theorem 9.4, the optimum linear estimate of $T$ given $R$ is

\[ \hat{T}_L(R) = \rho_{T,R}\frac{\sigma_T}{\sigma_R}(R - E[R]) + E[T] \tag{5} \]

Since $E[R] = E[T] = 0$ and $\rho_{T,R} = \sigma_T/\sigma_R$,

\[ \hat{T}_L(R) = \frac{\sigma_T^2}{\sigma_R^2} R = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_X^2} R = \frac{3}{4} R \tag{6} \]

Hence $a^* = 3/4$ and $b^* = 0$.

(6) By Theorem 9.4, the mean square error of the linear estimate is

\[ e_L^* = \mathrm{Var}[T](1 - \rho_{T,R}^2) = 9(1 - 3/4) = 9/4 \tag{7} \]

Quiz 9.3

When $R = r$, the conditional PDF of $X = Y - 40 - 40\log_{10} r$ is Gaussian with expected value $-40 - 40\log_{10} r$ and variance 64. The conditional PDF of $X$ given $R$ is

\[ f_{X|R}(x|r) = \frac{1}{\sqrt{128\pi}} e^{-(x + 40 + 40\log_{10} r)^2/128} \tag{1} \]

r ).R (x. the above estimate will exceed 1000 m.6 m.3 dB. That is.R (x.3 (0. yielding log10 r = −1 − x/40 or rML (x) = (0. r ) = f X |R (x|r ) f R (r ) = 106 32π 1 √ r e−(x+40+40 log10 r ) 2 /128 (5) From Theorem 9. Hence.1236)10 (9) For example. if x = −120dB.6% larger than the ML estimate. the MAP estimate is 23. When x ≤ −156. R ≤ 1000 m. When the measured signal ˆ strength is not too low.R (x.3 −x/40 x ≥ −156. which is not possible in our probability model. the MAP estimate of R given X = x is the value of r that maximizes f X. 55 . ˆ For the MAP estimate. the complete description of the MAP estimate is rMAP (x) = ˆ 1000 x < −156.6. This minimum occurs when the exponent is zero.1)10−x/40 m ˆ (3) (4) If the result doesn’t look correct. the MAP estimate takes into account that the distance can never exceed 1000 m. However. r ) ˆ 0≤r ≤1000 Note that we have included the constraint r ≤ 1000 in the maximization to highlight the fact that under our probability model. This reflects the fact that large values of R are a priori more probable than small values. This corresponds to a distance estimate of rML (−120) = 100 m.R (x. for very low signal strengths. note that a typical figure for the signal strength might be x = −120 dB.1236)10−x/40 (8) This is the MAP estimate of R given X = x as long as r ≤ 1000 m. we observe that the joint PDF of X and R is f X. (6) rMAP (x) = arg max f X. then rMAP (−120) = 123. r ) with respect to r to zero yields e−(x+40+40 log10 r ) Solving for r yields r = 10 1 25 log10 e −1 2 /128 1− 80 log10 e (x + 40 + 40 log10 r ) = 0 128 (7) 10−x/40 = (0. we can use Definition 9.2 to write the ML estimate of R given X = x as rML (x) = arg max f X |R (x|r ) ˆ r ≥0 (2) We observe that f X |R (x|r ) is maximized when the exponent (x + 40 + 40 log10 r )2 is minimized. Setting the derivative of f X.From the conditional PDF f X |R (x|r ).

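Quiz 9.4

(1) From Theorem 9.4, the LMSE estimate of $X_2$ given $Y_2$ is $\hat{X}_2(Y_2) = a^* Y_2 + b^*$ where

\[ a^* = \frac{\mathrm{Cov}[X_2, Y_2]}{\mathrm{Var}[Y_2]}, \qquad b^* = \mu_{X_2} - a^*\mu_{Y_2}. \tag{1} \]

Because $E[\mathbf{X}] = E[\mathbf{Y}] = \mathbf{0}$,

\[ \mathrm{Cov}[X_2, Y_2] = E[X_2 Y_2] = E[X_2(X_2 + W_2)] = E[X_2^2] = 1 \tag{2} \]
\[ \mathrm{Var}[Y_2] = \mathrm{Var}[X_2] + \mathrm{Var}[W_2] = E[X_2^2] + E[W_2^2] = 1.1 \tag{3} \]

It follows that $a^* = 1/1.1$. Because $\mu_{X_2} = \mu_{Y_2} = 0$, it follows that $b^* = 0$. Finally, to compute the expected square error, we calculate the correlation coefficient

\[ \rho_{X_2,Y_2} = \frac{\mathrm{Cov}[X_2, Y_2]}{\sigma_{X_2}\sigma_{Y_2}} = \frac{1}{\sqrt{1.1}} \tag{4} \]

The expected square error is

\[ e_L^* = \mathrm{Var}[X_2](1 - \rho_{X_2,Y_2}^2) = 1 - \frac{1}{1.1} = \frac{1}{11} = 0.0909 \tag{5} \]

(2) Since $\mathbf{Y} = \mathbf{X} + \mathbf{W}$ and $E[\mathbf{X}] = E[\mathbf{W}] = \mathbf{0}$, it follows that $E[\mathbf{Y}] = \mathbf{0}$. Thus we can apply Theorem 9.7. Note that $\mathbf{X}$ and $\mathbf{W}$ have correlation matrices

\[ \mathbf{R}_X = \begin{bmatrix} 1 & -0.9 \\ -0.9 & 1 \end{bmatrix}, \qquad \mathbf{R}_W = \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix}. \tag{6} \]

In terms of Theorem 9.7, $n = 2$ and we wish to estimate $X_2$ given the observation vector $\mathbf{Y} = [Y_1 \; Y_2]'$. To apply Theorem 9.7, we need to find $\mathbf{R}_Y$ and $\mathbf{R}_{YX_2}$.

\[ \mathbf{R}_Y = E[\mathbf{Y}\mathbf{Y}'] = E[(\mathbf{X} + \mathbf{W})(\mathbf{X}' + \mathbf{W}')] \tag{7} \]
\[ = E[\mathbf{X}\mathbf{X}' + \mathbf{X}\mathbf{W}' + \mathbf{W}\mathbf{X}' + \mathbf{W}\mathbf{W}']. \tag{8} \]

Because $\mathbf{X}$ and $\mathbf{W}$ are independent, $E[\mathbf{X}\mathbf{W}'] = E[\mathbf{X}]E[\mathbf{W}'] = \mathbf{0}$. Similarly, $E[\mathbf{W}\mathbf{X}'] = \mathbf{0}$. This implies

\[ \mathbf{R}_Y = E[\mathbf{X}\mathbf{X}'] + E[\mathbf{W}\mathbf{W}'] = \mathbf{R}_X + \mathbf{R}_W = \begin{bmatrix} 1.1 & -0.9 \\ -0.9 & 1.1 \end{bmatrix} \tag{9} \]

In addition, we need to find

\[ \mathbf{R}_{YX_2} = E[\mathbf{Y}X_2] = \begin{bmatrix} E[Y_1 X_2] \\ E[Y_2 X_2] \end{bmatrix} = \begin{bmatrix} E[(X_1 + W_1)X_2] \\ E[(X_2 + W_2)X_2] \end{bmatrix} \tag{10} \]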

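Since $\mathbf{X}$ and $\mathbf{W}$ are independent vectors, $E[W_1 X_2] = E[W_1]E[X_2] = 0$ and $E[W_2 X_2] = 0$. Thus

\[ \mathbf{R}_{YX_2} = \begin{bmatrix} E[X_1 X_2] \\ E[X_2^2] \end{bmatrix} = \begin{bmatrix} -0.9 \\ 1 \end{bmatrix}. \tag{11} \]

By Theorem 9.7,

\[ \hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX_2} = \begin{bmatrix} -0.225 \\ 0.725 \end{bmatrix} \tag{12} \]

Therefore, the optimum linear estimator of $X_2$ given $Y_1$ and $Y_2$ is

\[ \hat{X}_L = \hat{\mathbf{a}}'\mathbf{Y} = -0.225 Y_1 + 0.725 Y_2. \tag{13} \]

The mean square error is

\[ \mathrm{Var}[X_2] - \hat{\mathbf{a}}'\mathbf{R}_{YX_2} = \mathrm{Var}[X_2] - \hat{a}_1 r_{Y_1,X_2} - \hat{a}_2 r_{Y_2,X_2} = 0.0725. \tag{14} \]

Quiz 9.5

Since $X$ and $\mathbf{W}$ have zero expected value, $\mathbf{Y}$ also has zero expected value. Thus, by Theorem 9.7, $\hat{X}_L(\mathbf{Y}) = \hat{\mathbf{a}}'\mathbf{Y}$ where $\hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX}$. Since $X$ and $\mathbf{W}$ are independent, $E[\mathbf{W}X] = \mathbf{0}$ and $E[X\mathbf{W}'] = \mathbf{0}'$. This implies

\[ \mathbf{R}_{YX} = E[\mathbf{Y}X] = E[(\mathbf{1}X + \mathbf{W})X] = \mathbf{1}E[X^2] = \mathbf{1}. \tag{1} \]

By the same reasoning, the correlation matrix of $\mathbf{Y}$ is

\[ \mathbf{R}_Y = E[\mathbf{Y}\mathbf{Y}'] = E[(\mathbf{1}X + \mathbf{W})(\mathbf{1}'X + \mathbf{W}')] \tag{2} \]
\[ = \mathbf{1}\mathbf{1}'E[X^2] + \mathbf{1}E[X\mathbf{W}'] + E[\mathbf{W}X]\mathbf{1}' + E[\mathbf{W}\mathbf{W}'] \tag{3} \]
\[ = \mathbf{1}\mathbf{1}' + \mathbf{R}_W \tag{4} \]

Note that $\mathbf{1}\mathbf{1}'$ is a $20 \times 20$ matrix with every entry equal to 1. Thus,

\[ \hat{\mathbf{a}} = \mathbf{R}_Y^{-1}\mathbf{R}_{YX} = \left[\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right]^{-1}\mathbf{1} \tag{5} \]

and the optimal linear estimator is

\[ \hat{X}_L(\mathbf{Y}) = \mathbf{1}'\left[\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right]^{-1}\mathbf{Y} \tag{6} \]

The mean square error is

\[ e_L^* = \mathrm{Var}[X] - \hat{\mathbf{a}}'\mathbf{R}_{YX} = 1 - \mathbf{1}'\left[\mathbf{1}\mathbf{1}' + \mathbf{R}_W\right]^{-1}\mathbf{1} \tag{7} \]

Now we note that $\mathbf{R}_W$ has $i,j$th entry $R_W(i,j) = c^{|i-j|-1}$. The question we must address is what value $c$ minimizes $e_L^*$. This problem is atypical in that one does not usually get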

to choose the correlation structure of the noise. However, we will see that the answer is somewhat instructive.

We note that the answer is not obviously apparent from Equation (7). In particular, we observe that $\mathrm{Var}[W_i] = R_W(i,i) = 1/c$. Thus, when $c$ is small, the noises $W_i$ have high variance and we would expect our estimator to be poor. On the other hand, if $c$ is large, $W_i$ and $W_j$ are highly correlated and the separate measurements of $X$ are very dependent. This would suggest that large values of $c$ will also result in poor MSE. If this argument is not clear, consider the extreme case in which every $W_i$ and $W_j$ have correlation coefficient $\rho_{ij} = 1$. In this case, our 20 measurements will be all the same and one measurement is as good as 20 measurements.

To find the optimal value of $c$, we write a MATLAB function mquiz9(c) to calculate the MSE for a given c and a second function that finds the minimizing c and plots the MSE for a range of values of c.

    function [mse,af]=mquiz9(c);
    v1=ones(20,1);
    RW=toeplitz(c.^((0:19)-1));
    RY=(v1*(v1')) +RW;
    af=(inv(RY))*v1;
    mse=1-((v1')*af);

    function cmin=mquiz9minc(c);
    msec=zeros(size(c));
    for k=1:length(c),
        [msec(k),af]=mquiz9(c(k));
    end
    plot(c,msec);
    xlabel('c');ylabel('e_L^*');
    [msemin,optk]=min(msec);
    cmin=c(optk);

Note in mquiz9 that v1 corresponds to the vector $\mathbf{1}$ of all ones. The following commands find the minimizing c and also produce the following graph:

    >> c=0.01:0.01:0.99;
    >> mquiz9minc(c)
    ans =
        0.4500

[Graph: the mean square error e*_L versus c for 0 < c < 1.]

As we see in the graph, both small values and large values of $c$ result in large MSE.
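The same search can be sketched in Python without a linear algebra library by solving the linear system directly instead of forming an inverse. This is an illustrative sketch: the names solve and mse_for_c are assumptions, and the Gaussian elimination routine is a plain textbook version rather than anything taken from the book.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def mse_for_c(c, n=20):
    """MSE 1 - 1'(11' + R_W)^{-1} 1 with R_W(i, j) = c**(|i - j| - 1)."""
    ry = [[1.0 + c ** (abs(i - j) - 1) for j in range(n)] for i in range(n)]
    a_hat = solve(ry, [1.0] * n)
    return 1.0 - sum(a_hat)

grid = [k / 100 for k in range(1, 100)]
mses = [mse_for_c(c) for c in grid]
c_min = grid[mses.index(min(mses))]
```

On this grid the minimizer agrees with the value 0.4500 reported by mquiz9minc, and the minimum MSE is roughly 0.213.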

Quiz Solutions – Chapter 10

Quiz 10.1

There are many correct answers to this question. A correct answer specifies enough random variables to specify the sample path exactly. One choice for an alternate set of random variables that would specify $m(t, s)$ is

• $m(0, s)$, the number of ongoing calls at the start of the experiment
• $N$, the number of new calls that arrive during the experiment
• $X_1, \ldots, X_N$, the interarrival times of the $N$ new arrivals
• $H$, the number of calls that hang up during the experiment
• $D_1, \ldots, D_H$, the call completion times of the $H$ calls that hang up

Quiz 10.2

(1) We obtain a continuous time, continuous valued process when we record the temperature as a continuous waveform over time.

(2) If at every moment in time, we round the temperature to the nearest degree, then we obtain a continuous time, discrete valued process.

(3) If we sample the process in part (1) every $T$ seconds, then we obtain a discrete time, continuous valued process.

(4) Rounding the samples in part (3) to the nearest integer degree yields a discrete time, discrete valued process.

Quiz 10.3

(1) Each resistor has resistance $R$ in ohms with uniform PDF

\[ f_R(r) = \begin{cases} 0.01 & 950 \le r \le 1050 \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

The probability that a test produces a 1% resistor is

\[ p = P[990 \le R \le 1010] = \int_{990}^{1010} (0.01)\,dr = 0.2 \tag{2} \]

(2) In $t$ seconds, exactly $t$ resistors are tested. Each resistor is a 1% resistor with probability $p$, independent of any other resistor. Consequently, the number of 1% resistors found has the binomial PMF

\[ P_{N(t)}(n) = \begin{cases} \dbinom{t}{n} p^n (1-p)^{t-n} & n = 0, 1, \ldots, t \\ 0 & \text{otherwise} \end{cases} \tag{3} \]

(3) First we will find the PMF of $T_1$. This problem is easy if we view each resistor test as an independent trial. A success occurs on a trial with probability $p$ if we find a 1% resistor. The first 1% resistor is found at time $T_1 = t$ if we observe failures on trials $1, \ldots, t-1$ followed by a success on trial $t$. Hence, just as in Example 2.11, $T_1$ has the geometric PMF

\[ P_{T_1}(t) = \begin{cases} (1-p)^{t-1} p & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{4} \]

Since $p = 0.2$, the probability the first 1% resistor is found in exactly five seconds is $P_{T_1}(5) = (0.8)^4(0.2) = 0.08192$.

(4) From Theorem 2.5, a geometric random variable with success probability $p$ has expected value $1/p$. In this problem, $E[T_1] = 1/p = 5$.

(5) Note that once we find the first 1% resistor, the number of additional trials needed to find the second 1% resistor once again has a geometric PMF with expected value $1/p$ since each independent trial is a success with probability $p$. That is, $T_2 = T_1 + T'$ where $T'$ is independent and identically distributed to $T_1$. Thus

\[ E[T_2 | T_1 = 10] = E[T_1 | T_1 = 10] + E[T' | T_1 = 10] \tag{5} \]
\[ = 10 + E[T'] = 10 + 5 = 15 \tag{6} \]

Quiz 10.4

Since each $X_i$ is a $N(0, 1)$ random variable, each $X_i$ has PDF

\[ f_{X(i)}(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \tag{1} \]

By Theorem 10.1, the joint PDF of $\mathbf{X} = [X_1 \; \cdots \; X_n]'$ is

\[ f_{\mathbf{X}}(\mathbf{x}) = f_{X(1),\ldots,X(n)}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i) = \frac{1}{(2\pi)^{n/2}} e^{-(x_1^2 + \cdots + x_n^2)/2} \tag{2} \]
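The numbers in the resistor quiz (Quiz 10.3) follow from the binomial and geometric PMFs and can be reproduced in a few lines of Python. The function and constant names below are assumptions chosen for illustration.

```python
from math import comb

P_ONE_PERCENT = 0.2   # probability a tested resistor is a 1% resistor

def binomial_pmf(n, t, p=P_ONE_PERCENT):
    """P[N(t) = n]: n one-percent resistors among t tested."""
    if not 0 <= n <= t:
        return 0.0
    return comb(t, n) * p ** n * (1 - p) ** (t - n)

def geometric_pmf(t, p=P_ONE_PERCENT):
    """P[T1 = t]: first 1% resistor found on test t."""
    return (1 - p) ** (t - 1) * p if t >= 1 else 0.0

p_first_at_5 = geometric_pmf(5)              # (0.8)^4 (0.2)
expected_t1 = 1 / P_ONE_PERCENT              # 5 tests on average
expected_t2_given_t1_10 = 10 + expected_t1   # memoryless restart after T1 = 10
```

The last line uses the fact that the additional waiting time after the first success is again geometric and independent of T1.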

Quiz 10.5

The first and second hours are nonoverlapping intervals. Since one hour equals 3600 sec and the Poisson process has a rate of 10 packets/sec, the expected number of packets in each hour is $E[M_i] = \alpha = 36{,}000$. This implies $M_1$ and $M_2$ are independent Poisson random variables each with PMF

\[ P_{M_i}(m) = \begin{cases} \dfrac{\alpha^m e^{-\alpha}}{m!} & m = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{1} \]

Since $M_1$ and $M_2$ are independent, the joint PMF of $M_1$ and $M_2$ is

\[ P_{M_1,M_2}(m_1, m_2) = P_{M_1}(m_1) P_{M_2}(m_2) = \begin{cases} \dfrac{\alpha^{m_1+m_2} e^{-2\alpha}}{m_1!\, m_2!} & m_1 = 0, 1, \ldots;\; m_2 = 0, 1, \ldots \\ 0 & \text{otherwise.} \end{cases} \tag{2} \]

Quiz 10.6

To answer whether $N'(t)$ is a Poisson process, we look at the interarrival times. Let $X_1, X_2, \ldots$ denote the interarrival times of the $N(t)$ process. Since we count only even-numbered arrivals for $N'(t)$, the time until the first arrival of $N'(t)$ is $Y_1 = X_1 + X_2$. Since $X_1$ and $X_2$ are independent exponential $(\lambda)$ random variables, $Y_1$ is an Erlang $(n = 2, \lambda)$ random variable; see Theorem 6.11. Since $Y_i$, the $i$th interarrival time of the $N'(t)$ process, has the same PDF as $Y_1$, we can conclude that the interarrival times of $N'(t)$ are not exponential random variables. Thus $N'(t)$ is not a Poisson process.

Quiz 10.7

First, we note that for $t > s$,

\[ X(t) - X(s) = \frac{W(t) - W(s)}{\sqrt{\alpha}} \tag{1} \]

Since $W(t) - W(s)$ is a Gaussian random variable, Theorem 3.13 states that $X(t) - X(s)$ is Gaussian with expected value

\[ E[X(t) - X(s)] = \frac{E[W(t) - W(s)]}{\sqrt{\alpha}} = 0 \tag{2} \]

and variance

\[ E\left[(X(t) - X(s))^2\right] = \frac{E\left[(W(t) - W(s))^2\right]}{\alpha} = \frac{\alpha(t - s)}{\alpha} = t - s \tag{3} \]

Consider $s' \le s < t$. Since $s \ge s'$, $W(t) - W(s)$ is independent of $W(s')$. This implies $[W(t) - W(s)]/\sqrt{\alpha}$ is independent of $W(s')/\sqrt{\alpha}$ for all $s \ge s'$. That is, $X(t) - X(s)$ is independent of $X(s')$ for all $s \ge s'$. Thus $X(t)$ is a Brownian motion process with variance $\mathrm{Var}[X(t)] = t$.

Quiz 10.8

First we find the expected value

\[ \mu_Y(t) = \mu_X(t) + \mu_N(t) = \mu_X(t). \tag{1} \]

To find the autocorrelation, we observe that since $X(t)$ and $N(t)$ are independent and since $N(t)$ has zero expected value, $E[X(t)N(t')] = E[X(t)]E[N(t')] = 0$. Since $R_Y(t, \tau) = E[Y(t)Y(t + \tau)]$, we have

\[ R_Y(t, \tau) = E[(X(t) + N(t))(X(t + \tau) + N(t + \tau))] \tag{2} \]
\[ = E[X(t)X(t + \tau)] + E[X(t)N(t + \tau)] + E[X(t + \tau)N(t)] + E[N(t)N(t + \tau)] \tag{3} \]
\[ = R_X(t, \tau) + R_N(t, \tau). \tag{4} \]

Quiz 10.9

From Definition 10.14, $X_1, X_2, \ldots$ is a stationary random sequence if for all sets of time instants $n_1, \ldots, n_m$ and time offset $k$,

\[ f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) \tag{1} \]

Since the random sequence is iid,

\[ f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_X(x_1) f_X(x_2) \cdots f_X(x_m) \tag{2} \]

Similarly, for time instants $n_1 + k, \ldots, n_m + k$,

\[ f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) = f_X(x_1) f_X(x_2) \cdots f_X(x_m) \tag{3} \]

We can conclude that the iid random sequence is stationary.

Quiz 10.10

We must check whether each function $R(\tau)$ meets the conditions of Theorem 10.12:

\[ R(0) \ge 0, \qquad R(\tau) = R(-\tau), \qquad |R(\tau)| \le R(0) \tag{1} \]

(1) $R_1(\tau) = e^{-|\tau|}$ meets all three conditions and thus is valid.

(2) $R_2(\tau) = e^{-\tau^2}$ also is valid.

(3) $R_3(\tau) = e^{-\tau}\cos\tau$ is not valid because

\[ R_3(-2\pi) = e^{2\pi}\cos 2\pi = e^{2\pi} > 1 = R_3(0) \tag{2} \]

(4) $R_4(\tau) = e^{-\tau^2}\sin\tau$ also cannot be an autocorrelation function because

\[ R_4(\pi/2) = e^{-\pi^2/4}\sin(\pi/2) = e^{-\pi^2/4} > 0 = R_4(0) \tag{3} \]
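The three conditions of Quiz 10.10 are easy to test numerically on a grid of lags. The sketch below does this in Python for the four candidate functions, taking $R_2$ and $R_4$ with the exponent $-\tau^2$, which is an assumption about the garbled source. The helper name, the grid and the tolerance are likewise assumptions, and a grid check can only expose a violation; passing it does not by itself prove that a function is a valid autocorrelation.

```python
import math

def passes_necessary_conditions(r, taus, tol=1e-12):
    """Check R(0) >= 0, R(tau) = R(-tau) and |R(tau)| <= R(0) on a grid."""
    if r(0.0) < 0:
        return False
    for tau in taus:
        if abs(r(tau) - r(-tau)) > tol:      # symmetry violated
            return False
        if abs(r(tau)) > r(0.0) + tol:       # exceeds the value at zero lag
            return False
    return True

taus = [k * 0.01 for k in range(1, 1001)]    # lags 0.01, 0.02, ..., 10
candidates = {
    "R1": lambda t: math.exp(-abs(t)),
    "R2": lambda t: math.exp(-t * t),
    "R3": lambda t: math.exp(-t) * math.cos(t),
    "R4": lambda t: math.exp(-t * t) * math.sin(t),
}
results = {name: passes_necessary_conditions(r, taus)
           for name, r in candidates.items()}
```

Here R3 and R4 already fail the symmetry test, in addition to the magnitude violations pointed out above.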

Quiz 10.11

(1) The autocorrelation of $Y(t)$ is

\[ R_Y(t, \tau) = E[Y(t)Y(t + \tau)] \tag{1} \]
\[ = E[X(-t)X(-t - \tau)] \tag{2} \]
\[ = R_X(-t - (-t - \tau)) = R_X(\tau) \tag{3} \]

Since $E[Y(t)] = E[X(-t)] = \mu_X$, we can conclude that $Y(t)$ is a wide sense stationary process. In fact, we see that by viewing a process backwards in time, we see the same second order statistics.

(2) Since $X(t)$ and $Y(t)$ are both wide sense stationary processes, we can check whether they are jointly wide sense stationary by seeing if $R_{XY}(t, \tau)$ is just a function of $\tau$. In this case,

\[ R_{XY}(t, \tau) = E[X(t)Y(t + \tau)] \tag{4} \]
\[ = E[X(t)X(-t - \tau)] \tag{5} \]
\[ = R_X(t - (-t - \tau)) = R_X(2t + \tau) \tag{6} \]

Since $R_{XY}(t, \tau)$ depends on both $t$ and $\tau$, we conclude that $X(t)$ and $Y(t)$ are not jointly wide sense stationary. To see why this is, suppose $R_X(\tau) = e^{-|\tau|}$ so that samples of $X(t)$ far apart in time have almost no correlation. In this case, as $t$ gets larger, $Y(t) = X(-t)$ and $X(t)$ become less and less correlated.

Quiz 10.12

From the problem statement,

\[ E[X(t)] = E[X(t + 1)] = 0 \tag{1} \]
\[ E[X(t)X(t + 1)] = 1/2 \tag{2} \]
\[ \mathrm{Var}[X(t)] = \mathrm{Var}[X(t + 1)] = 1 \tag{3} \]

The Gaussian random vector $\mathbf{X} = [X(t) \; X(t + 1)]'$ has covariance matrix and corresponding inverse

\[ \mathbf{C}_X = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix}, \qquad \mathbf{C}_X^{-1} = \frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix} \tag{4} \]

Since

\[ \mathbf{x}'\mathbf{C}_X^{-1}\mathbf{x} = \begin{bmatrix} x_0 & x_1 \end{bmatrix} \frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix}\begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{4}{3}\left(x_0^2 - x_0 x_1 + x_1^2\right) \tag{5} \]

the joint PDF of $X(t)$ and $X(t + 1)$ is the Gaussian vector PDF

\[ f_{X(t),X(t+1)}(x_0, x_1) = \frac{1}{(2\pi)^{n/2}[\det(\mathbf{C}_X)]^{1/2}} \exp\left(-\frac{1}{2}\mathbf{x}'\mathbf{C}_X^{-1}\mathbf{x}\right) \tag{6} \]
\[ = \frac{1}{\sqrt{3\pi^2}} e^{-\frac{2}{3}\left(x_0^2 - x_0 x_1 + x_1^2\right)} \tag{7} \]

[Figure 4 plot: M(t) versus t for 0 ≤ t ≤ 100 minutes.]

Figure 4: Sample path of 100 minutes of the blocking switch of Quiz 10.13.

Quiz 10.13

The simple structure of the switch simulation of Example 10.28 admits a deceptively simple solution in terms of the vector of arrivals A and the vector of departures D. With the introduction of call blocking, we cannot generate these vectors all at once. In particular, when an arrival occurs at time $t$, we need to know that $M(t)$, the number of ongoing calls, satisfies $M(t) < c = 120$. Otherwise, when $M(t) = c$, we must block the call. Call blocking can be implemented by setting the service time of the call to zero so that the call departs as soon as it arrives.

The blocking switch is an example of a discrete event system. The system evolves via a sequence of discrete events, namely arrivals and departures, at discrete time instances. A simulation of the system moves from one time instant to the next by maintaining a chronological schedule of future events (arrivals and departures) to be executed. The program simply executes the event at the head of the schedule. The logic of such a simulation is

1. Start at time $t = 0$ with an empty system. Schedule the first arrival to occur at $S_1$, an exponential $(\lambda)$ random variable.

2. Examine the head-of-schedule event.

   • When the head-of-schedule event is the $k$th arrival at time $t$, check the state $M(t)$.

     – If $M(t) < c$, admit the arrival, increase the system state $n$ by 1, and schedule a departure to occur at time $t + S_n$, where the call duration $S_n$ is an exponential $(\mu)$ random variable.

     – If $M(t) = c$, block the arrival; do not schedule a departure event.

   • If the head-of-schedule event is a departure, reduce the system state $n$ by 1.

3. Delete the head-of-schedule event and go to step 2.

After the head-of-schedule event is completed and any new events (departures in this system) are scheduled, we know the system state cannot change until the next scheduled event.

Thus we know that $M(t)$ will stay the same until then. In our simulation, we use the vector t as the set of time instances at which we inspect the system state. Thus for all times t(i) between the current head-of-schedule event and the next, we set m(i) to the current switch state.

The complete program is shown in Figure 5. In most programming languages, it is common to implement the event schedule as a linked list where each item in the list has a data structure indicating an event timestamp and the type of the event. In MATLAB, a simple (but not elegant) way to do this is to maintain two vectors: time is a list of timestamps of scheduled events and event is the list of event types. In this case, event(i)=1 if the ith scheduled event is an arrival, or event(i)=-1 if the ith scheduled event is a departure.

When the program is passed a vector t, the output [m a b] is such that m(i) is the number of ongoing calls at time t(i) while a and b are the number of admits and blocks. The following instructions

    t=0:0.1:5000;
    [m,a,b]=simblockswitch(10,0.1,120,t);
    plot(t,m);

generated a simulation lasting 5,000 minutes. A sample path of the first 100 minutes of that simulation is shown in Figure 4. The 5,000 minute full simulation produced a=49658 admitted calls and b=239 blocked calls. We can estimate the probability a call is blocked as

\[ \hat{P}_b = \frac{b}{a + b} = 0.0048. \tag{1} \]

In Chapter 12, we will learn that the exact blocking probability is given by Equation (12.93), a result known as the "Erlang-B formula." From the Erlang-B formula, we can calculate that the exact blocking probability is $P_b = 0.0057$. One reason our simulation underestimates the blocking probability is that in a 5,000 minute simulation, roughly the first 100 minutes are needed to load up the switch since the switch is idle when the simulation starts at time $t = 0$. However, this says that only roughly the first two percent of the simulation time was unusual. Thus this would account for only part of the disparity. The rest of the gap between 0.0048 and 0.0057 arises because a simulation that includes only 239 blocks is not all that likely to give a very accurate result for the blocking probability.

Note that in Chapter 12, we will learn that the blocking switch is an example of an M/M/c/c queue, a kind of Markov chain. Chapter 12 develops techniques for analyzing and simulating systems described by Markov chains that are much simpler than the discrete event simulation technique shown here. Nevertheless, for very complicated systems, discrete event simulation is a widely used and often very efficient simulation method.
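The exact blocking probability of 0.0057 can be reproduced with the standard Erlang-B recursion. The sketch below is not taken from the book: it assumes the usual recursion B(0) = 1, B(k) = ρB(k-1)/(k + ρB(k-1)), which gives the Erlang-B blocking probability but may differ in form from Equation (12.93). It also assumes the second argument 0.1 of simblockswitch is a departure rate per minute, so the offered load is ρ = 10/0.1 = 100.

```python
def erlang_b(offered_load, servers):
    """Erlang-B blocking probability via the numerically stable recursion
    B(0) = 1, B(k) = rho*B(k-1) / (k + rho*B(k-1))."""
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b

lam, mu, c = 10.0, 0.1, 120       # arrival rate, departure rate, circuits
p_block = erlang_b(lam / mu, c)   # offered load of 100, 120 circuits
```

The recursion avoids the large factorials and powers that appear when the formula is evaluated directly for c = 120.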

    function [M,admits,blocks]=simblockswitch(lam,mu,c,t);
    blocks=0; %total # blocks
    admits=0; %total # admits
    M=zeros(size(t));
    n=0; % # in system
    time=[ exponentialrv(lam,1) ];
    event=[ 1 ]; %first event is an arrival
    timenow=0;
    tmax=max(t);
    while (timenow<tmax)
        M((timenow<=t)&(t<time(1)))=n;
        timenow=time(1);
        eventnow=event(1);
        event(1)=[ ]; time(1)= [ ]; % clear current event
        if (eventnow==1) % arrival
            arrival=timenow+exponentialrv(lam,1); % next arrival
            b4arrival=time<arrival;
            event=[event(b4arrival) 1 event(~b4arrival)];
            time=[time(b4arrival) arrival time(~b4arrival)];
            if n<c %call admitted
                admits=admits+1;
                n=n+1;
                depart=timenow+exponentialrv(mu,1);
                b4depart=time<depart;
                event=[event(b4depart) -1 event(~b4depart)];
                time=[time(b4depart) depart time(~b4depart)];
            else
                blocks=blocks+1; %one more block, immed departure
                disp(sprintf('Time %10.3d Admits %10d Blocks %10d',...
                    timenow,admits,blocks));
            end
        elseif (eventnow==-1) %departure
            n=n-1;
        end
    end

Figure 5: Discrete event simulation of the blocking switch of Quiz 10.13.
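In a language with a priority queue, the event schedule does not need sorted vectors. The following Python sketch follows the same arrival, admission, blocking and departure logic using the standard heapq module. It is an illustration under stated assumptions: the name sim_block_switch, the seed argument and the returned peak occupancy are introduced here, and unlike simblockswitch it does not record M(t) on a time grid.

```python
import heapq
import random

ARRIVAL, DEPARTURE = 1, -1

def sim_block_switch(lam, mu, c, t_end, seed=0):
    """Simulate the blocking switch until time t_end.
    Returns (admits, blocks, peak number of simultaneous calls)."""
    rng = random.Random(seed)
    schedule = [(rng.expovariate(lam), ARRIVAL)]   # heap of (time, event type)
    n = admits = blocks = peak = 0
    while schedule:
        now, kind = heapq.heappop(schedule)        # head-of-schedule event
        if now > t_end:
            break
        if kind == ARRIVAL:
            # Always schedule the next arrival, admitted or not.
            heapq.heappush(schedule, (now + rng.expovariate(lam), ARRIVAL))
            if n < c:                              # admit and schedule departure
                n += 1
                admits += 1
                peak = max(peak, n)
                heapq.heappush(schedule, (now + rng.expovariate(mu), DEPARTURE))
            else:                                  # all circuits busy
                blocks += 1
        else:
            n -= 1
    return admits, blocks, peak

admits, blocks, peak = sim_block_switch(10.0, 0.1, 120, 500.0)
```

Each heap operation costs time logarithmic in the number of pending events, whereas inserting into the sorted MATLAB vectors rebuilds the whole schedule on every insertion.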

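Quiz Solutions – Chapter 11

Quiz 11.1

By Theorem 11.2,

\[ \mu_Y = \mu_X \int_{-\infty}^{\infty} h(t)\,dt = 2\int_0^{\infty} e^{-t}\,dt = 2 \tag{1} \]

Since $R_X(\tau) = \delta(\tau)$, the autocorrelation function of the output is

\[ R_Y(\tau) = \int_{-\infty}^{\infty} h(u) \int_{-\infty}^{\infty} h(v)\delta(\tau + u - v)\,dv\,du = \int_{-\infty}^{\infty} h(u)h(\tau + u)\,du \tag{2} \]

For $\tau > 0$, we have

\[ R_Y(\tau) = \int_0^{\infty} e^{-u}e^{-\tau-u}\,du = e^{-\tau}\int_0^{\infty} e^{-2u}\,du = \frac{1}{2}e^{-\tau} \tag{3} \]

For $\tau < 0$, we can deduce that $R_Y(\tau) = \frac{1}{2}e^{-|\tau|}$ by symmetry. Just to be safe though, we can double check. For $\tau < 0$,

\[ R_Y(\tau) = \int_{-\tau}^{\infty} h(u)h(\tau + u)\,du = \int_{-\tau}^{\infty} e^{-u}e^{-\tau-u}\,du = \frac{1}{2}e^{\tau} \tag{4} \]

Hence,

\[ R_Y(\tau) = \frac{1}{2}e^{-|\tau|} \tag{5} \]

Quiz 11.2

The expected value of the output is

\[ \mu_Y = \mu_X \sum_{n=-\infty}^{\infty} h_n = 0.5(1 + (-1)) = 0 \tag{1} \]

The autocorrelation of the output is

\[ R_Y[n] = \sum_{i=0}^{1}\sum_{j=0}^{1} h_i h_j R_X[n + i - j] \tag{2} \]
\[ = 2R_X[n] - R_X[n - 1] - R_X[n + 1] = \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} \tag{3} \]

Since $\mu_Y = 0$, the variance of $Y_n$ is $\mathrm{Var}[Y_n] = E[Y_n^2] = R_Y[0] = 1$.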
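The closed form found in Quiz 11.1, $R_Y(\tau) = \frac{1}{2}e^{-|\tau|}$, can be confirmed by numerically integrating $h(u)h(u + \tau)$ for $h(t) = e^{-t}u(t)$. The Python sketch below uses a midpoint rule; the step size, the truncation point and the function names are assumptions chosen for illustration.

```python
import math

def h(t):
    """Impulse response h(t) = exp(-t) u(t)."""
    return math.exp(-t) if t >= 0 else 0.0

def output_autocorrelation(tau, du=1e-3, u_max=40.0):
    """Midpoint-rule approximation of the integral of h(u) h(u + tau),
    valid when R_X(tau) is a unit impulse."""
    steps = int(u_max / du)
    return du * sum(h((i + 0.5) * du) * h((i + 0.5) * du + tau)
                    for i in range(steps))

checks = {tau: output_autocorrelation(tau) for tau in (-1.5, 0.0, 0.7)}
```

Because h vanishes for negative arguments, the same integral handles both signs of τ without changing the limits by hand.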

[Figure 6 plots: panel (a) W = 10 and panel (b) W = 1000, each showing S_X(f) versus f and R_X(τ) versus τ.]

Figure 6: The autocorrelation $R_X(\tau)$ and power spectral density $S_X(f)$ for process $X(t)$ in Quiz 11.5.

Quiz 11.3

By Theorem 11.5, $\mathbf{Y} = [Y_{33} \; Y_{34} \; Y_{35}]'$ is a Gaussian random vector since $X_n$ is a Gaussian random process. Moreover, by Theorem 11.5, each $Y_n$ has expected value $E[Y_n] = \mu_X \sum_{n=-\infty}^{\infty} h_n = 0$. Thus $E[\mathbf{Y}] = \mathbf{0}$. To find the PDF of the Gaussian vector $\mathbf{Y}$, we need to find the covariance matrix $\mathbf{C}_Y$, which equals the correlation matrix $\mathbf{R}_Y$ since $\mathbf{Y}$ has zero expected value. One way to find $\mathbf{R}_Y$ is to observe that $\mathbf{R}_Y$ has the Toeplitz structure of Theorem 11.6 and to use Theorem 11.5 to find the autocorrelation function

\[ R_Y[n] = \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} h_i h_j R_X[n + i - j]. \tag{1} \]

Despite the fact that $R_X[k]$ is an impulse, using Equation (1) is surprisingly tedious because we still need to sum over all $i$ and $j$ such that $n + i - j = 0$. In this problem, it is simpler to observe that $\mathbf{Y} = \mathbf{H}\mathbf{X}$ where $\mathbf{X} = [X_{30} \; X_{31} \; X_{32} \; X_{33} \; X_{34} \; X_{35}]'$ and

\[ \mathbf{H} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}. \tag{2} \]

In this case, following Theorem 11.7, or by directly applying Theorem 5.13 with $\boldsymbol{\mu}_X = \mathbf{0}$ and $\mathbf{A} = \mathbf{H}$, we obtain $\mathbf{R}_Y = \mathbf{H}\mathbf{R}_X\mathbf{H}'$. Since $R_X[n] = \delta_n$, $\mathbf{R}_X = \mathbf{I}$, the identity matrix.

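Thus

\[ \mathbf{C}_Y = \mathbf{R}_Y = \mathbf{H}\mathbf{H}' = \frac{1}{16}\begin{bmatrix} 4 & 3 & 2 \\ 3 & 4 & 3 \\ 2 & 3 & 4 \end{bmatrix}. \tag{4} \]

It follows (very quickly if you use MATLAB for $3 \times 3$ matrix inversion) that

\[ \mathbf{C}_Y^{-1} = 16\begin{bmatrix} 7/12 & -1/2 & 1/12 \\ -1/2 & 1 & -1/2 \\ 1/12 & -1/2 & 7/12 \end{bmatrix}. \tag{5} \]

Thus, the PDF of $\mathbf{Y}$ is

\[ f_{\mathbf{Y}}(\mathbf{y}) = \frac{1}{(2\pi)^{3/2}[\det(\mathbf{C}_Y)]^{1/2}} \exp\left(-\frac{1}{2}\mathbf{y}'\mathbf{C}_Y^{-1}\mathbf{y}\right). \tag{6} \]

A disagreeable amount of algebra will show $\det(\mathbf{C}_Y) = 3/1024$ and that the PDF can be "simplified" to

\[ f_{\mathbf{Y}}(\mathbf{y}) = \frac{16}{\sqrt{6\pi^3}} \exp\left[-8\left(\frac{7}{12}y_{33}^2 + y_{34}^2 + \frac{7}{12}y_{35}^2 - y_{33}y_{34} + \frac{1}{6}y_{33}y_{35} - y_{34}y_{35}\right)\right]. \tag{7} \]

Equation (7) shows that one of the nicest features of the multivariate Gaussian distribution is that $\mathbf{y}'\mathbf{C}_Y^{-1}\mathbf{y}$ is a very concise representation of the cross-terms in the exponent of $f_{\mathbf{Y}}(\mathbf{y})$.

Quiz 11.4

This quiz is solved using Theorem 11.9 for the case of $k = 1$ and $M = 2$. In this case, $\mathbf{X}_n = [X_{n-1} \; X_n]'$ and

\[ \mathbf{R}_{X_n} = \begin{bmatrix} R_X[0] & R_X[1] \\ R_X[1] & R_X[0] \end{bmatrix} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix} \tag{1} \]

and

\[ \mathbf{R}_{X_n X_{n+1}} = E\left[\begin{bmatrix} X_{n-1} \\ X_n \end{bmatrix} X_{n+1}\right] = \begin{bmatrix} R_X[2] \\ R_X[1] \end{bmatrix} = \begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix}. \tag{2} \]

The MMSE linear first order filter for predicting $X_{n+1}$ at time $n$ is the filter $\mathbf{h}$ such that

\[ \overleftarrow{\mathbf{h}} = \mathbf{R}_{X_n}^{-1}\mathbf{R}_{X_n X_{n+1}} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix}^{-1}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} = \frac{1}{400}\begin{bmatrix} 81 \\ 261 \end{bmatrix}. \tag{3} \]

It follows that the filter is $\mathbf{h} = [261/400 \;\; 81/400]'$ and the MMSE linear predictor is

\[ \hat{X}_{n+1} = \frac{81}{400}X_{n-1} + \frac{261}{400}X_n. \tag{4} \]

To find the mean square error, one approach is to follow the method of Example 11.13 and to directly calculate

\[ e_L^* = E\left[(X_{n+1} - \hat{X}_{n+1})^2\right]. \tag{5} \]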

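This method is workable for this simple problem but becomes increasingly tedious for higher order filters. Instead, we can derive the mean square error for an arbitrary prediction filter $\mathbf{h}$. Since $\hat{X}_{n+1} = \overleftarrow{\mathbf{h}}'\mathbf{X}_n$,

\[ e_L^* = E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)^2\right] \tag{6} \]
\[ = E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)'\right] \tag{7} \]
\[ = E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)\left(X_{n+1} - \mathbf{X}_n'\overleftarrow{\mathbf{h}}\right)\right] \tag{8} \]

After a bit of algebra, we obtain

\[ e_L^* = R_X[0] - 2\overleftarrow{\mathbf{h}}'\mathbf{R}_{X_n X_{n+1}} + \overleftarrow{\mathbf{h}}'\mathbf{R}_{X_n}\overleftarrow{\mathbf{h}} \tag{9} \]

With the substitution $\overleftarrow{\mathbf{h}} = \mathbf{R}_{X_n}^{-1}\mathbf{R}_{X_n X_{n+1}}$, we obtain

\[ e_L^* = R_X[0] - \mathbf{R}_{X_n X_{n+1}}'\mathbf{R}_{X_n}^{-1}\mathbf{R}_{X_n X_{n+1}} \tag{10} \]
\[ = R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{X_n X_{n+1}} \tag{11} \]

Note that this is essentially the same result as Theorem 9.7 with $\mathbf{Y} = \mathbf{X}_n$, $X = X_{n+1}$ and $\hat{\mathbf{a}}' = \overleftarrow{\mathbf{h}}'$. It is noteworthy that the result is derived in a much simpler way in the proof of Theorem 9.7 by using the orthogonality property of the LMSE estimator.

In any case, the mean square error is

\[ e_L^* = R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{X_n X_{n+1}} = 1.1 - \frac{1}{400}\begin{bmatrix} 81 & 261 \end{bmatrix}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} \approx 0.3487. \tag{12} \]

Recalling that the blind estimate would yield a mean square error of $\mathrm{Var}[X] = 1.1$, we see that observing $X_{n-1}$ and $X_n$ improves the accuracy of our prediction of $X_{n+1}$.

Quiz 11.5

(1) By Theorem 11.13(b), the average power of $X(t)$ is

\[ E[X^2(t)] = \int_{-\infty}^{\infty} S_X(f)\,df = \int_{-W}^{W} \frac{5}{W}\,df = 10 \text{ Watts} \tag{1} \]

(2) The autocorrelation function is the inverse Fourier transform of $S_X(f)$. Consulting Table 11.1, we note that

\[ S_X(f) = 10\,\frac{1}{2W}\,\mathrm{rect}\left(\frac{f}{2W}\right) \tag{2} \]

It follows that the inverse transform of $S_X(f)$ is

\[ R_X(\tau) = 10\,\mathrm{sinc}(2W\tau) = 10\,\frac{\sin(2\pi W\tau)}{2\pi W\tau} \tag{3} \]

(3) For $W = 10$ Hz and $W = 1$ kHz, graphs of $S_X(f)$ and $R_X(\tau)$ appear in Figure 6.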

Quiz 11.6

In a sampled system, the discrete time impulse $\delta[n]$ has a flat discrete Fourier transform. That is, if $R_X[n] = 10\delta[n]$, then

\[ S_X(\phi) = \sum_{n=-\infty}^{\infty} 10\delta[n]e^{-j2\pi\phi n} = 10 \tag{1} \]

Thus, $R_X[n] = 10\delta[n]$. (This quiz is really lame!)

Quiz 11.7

Since $Y(t) = X(t - t_0)$,

\[ R_{XY}(t, \tau) = E[X(t)Y(t + \tau)] = E[X(t)X(t + \tau - t_0)] = R_X(\tau - t_0) \tag{1} \]

We see that $R_{XY}(t, \tau) = R_{XY}(\tau) = R_X(\tau - t_0)$. From Table 11.1, we recall the property that $g(\tau - \tau_0)$ has Fourier transform $G(f)e^{-j2\pi f\tau_0}$. Thus the Fourier transform of $R_{XY}(\tau) = R_X(\tau - t_0) = g(\tau - t_0)$ is

\[ S_{XY}(f) = S_X(f)e^{-j2\pi f t_0}. \tag{2} \]

Quiz 11.8

We solve this quiz using Theorem 11.17. First we need some preliminary facts. Let $a_0 = 5{,}000$ so that

\[ R_X(\tau) = \frac{1}{a_0}\,a_0 e^{-a_0|\tau|}. \tag{1} \]

Consulting with the Fourier transforms in Table 11.1, we see that

\[ S_X(f) = \frac{1}{a_0}\,\frac{2a_0^2}{a_0^2 + (2\pi f)^2} = \frac{2a_0}{a_0^2 + (2\pi f)^2} \tag{2} \]

The RC filter has impulse response $h(t) = a_1 e^{-a_1 t}u(t)$, where $u(t)$ is the unit step function and $a_1 = 1/RC$ where $RC = 10^{-4}$ is the filter time constant. From Table 11.1,

\[ H(f) = \frac{a_1}{a_1 + j2\pi f} \tag{3} \]

(1) By Theorem 11.17,

\[ S_{XY}(f) = H(f)S_X(f) = \frac{2a_0 a_1}{[a_1 + j2\pi f]\left[a_0^2 + (2\pi f)^2\right]}. \tag{4} \]

(2) Again by Theorem 11.17,

\[ S_Y(f) = H^*(f)S_{XY}(f) = |H(f)|^2 S_X(f). \tag{5} \]

Note that

\[ |H(f)|^2 = H(f)H^*(f) = \frac{a_1}{(a_1 + j2\pi f)}\,\frac{a_1}{(a_1 - j2\pi f)} = \frac{a_1^2}{a_1^2 + (2\pi f)^2} \tag{6} \]

Thus,

\[ S_Y(f) = |H(f)|^2 S_X(f) = \frac{2a_0 a_1^2}{\left[a_1^2 + (2\pi f)^2\right]\left[a_0^2 + (2\pi f)^2\right]} \tag{7} \]

(3) To find the average power at the filter output, we can either use basic calculus and calculate $\int_{-\infty}^{\infty} S_Y(f)\,df$ directly or we can find $R_Y(\tau)$ as an inverse transform of $S_Y(f)$. Using partial fractions and the Fourier transform table, the latter method is actually less algebra. In particular, some algebra will show that

\[ S_Y(f) = \frac{K_0}{a_0^2 + (2\pi f)^2} + \frac{K_1}{a_1^2 + (2\pi f)^2} \tag{8} \]

where

\[ K_0 = \frac{2a_0 a_1^2}{a_1^2 - a_0^2}, \qquad K_1 = \frac{-2a_0 a_1^2}{a_1^2 - a_0^2}. \tag{9} \]

Thus,

\[ S_Y(f) = \frac{K_0}{2a_0^2}\,\frac{2a_0^2}{a_0^2 + (2\pi f)^2} + \frac{K_1}{2a_1^2}\,\frac{2a_1^2}{a_1^2 + (2\pi f)^2} \tag{10} \]

Consulting with Table 11.1, we see that

\[ R_Y(\tau) = \frac{K_0}{2a_0^2}\,a_0 e^{-a_0|\tau|} + \frac{K_1}{2a_1^2}\,a_1 e^{-a_1|\tau|} \tag{11} \]

Substituting the values of $K_0$ and $K_1$, we obtain

\[ R_Y(\tau) = \frac{a_1^2 e^{-a_0|\tau|} - a_0 a_1 e^{-a_1|\tau|}}{a_1^2 - a_0^2}. \tag{12} \]

The average power of the $Y(t)$ process is

\[ R_Y(0) = \frac{a_1}{a_1 + a_0} = \frac{2}{3}. \tag{13} \]

Note that the input signal has average power $R_X(0) = 1$. Since the RC filter has a 3dB bandwidth of 10,000 rad/sec and the signal $X(t)$ has most of its signal energy below 5,000 rad/sec, the output signal has almost as much power as the input.
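The "basic calculus" route can also be carried out numerically as a cross-check on $R_Y(0) = 2/3$. The Python sketch below integrates $S_Y(f)$ with a midpoint rule over a finite band; the band limit, the step size and the names are assumptions, chosen so that the neglected tails (which decay like $1/f^4$) are negligible.

```python
import math

A0 = 5000.0    # a0, from R_X(tau) = exp(-a0 |tau|)
A1 = 1.0e4     # a1 = 1/RC with RC = 1e-4

def s_y(f):
    """Output power spectral density S_Y(f) = |H(f)|^2 S_X(f)."""
    w2 = (2.0 * math.pi * f) ** 2
    return 2.0 * A0 * A1 ** 2 / ((A1 ** 2 + w2) * (A0 ** 2 + w2))

def average_power(f_max=2.0e5, df=5.0):
    """Midpoint-rule integral of S_Y(f) over [-f_max, f_max]."""
    steps = int(2 * f_max / df)
    return df * sum(s_y(-f_max + (i + 0.5) * df) for i in range(steps))

power_numeric = average_power()
power_closed_form = A1 / (A1 + A0)   # R_Y(0) from Equation (13)
```

The two values should agree to several decimal places, which confirms the partial fraction constants $K_0$ and $K_1$.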

Quiz 11.9

This quiz implements an example of Equations (11.146) and (11.147) for a system in which we filter $Y(t) = X(t) + N(t)$ to produce an optimal linear estimate of $X(t)$. The solution to this quiz is just to find the filter $\hat{H}(f)$ using Equation (11.146) and to calculate the mean square error $e_L^*$ using Equation (11.147).

Comment: Since the text omitted the derivations of Equations (11.146) and (11.147), we note that Example 10.24 showed that

\[ R_Y(\tau) = R_X(\tau) + R_N(\tau), \qquad R_{YX}(\tau) = R_X(\tau). \tag{1} \]

Taking Fourier transforms, it follows that

\[ S_Y(f) = S_X(f) + S_N(f), \qquad S_{YX}(f) = S_X(f). \tag{2} \]

Now we can go on to the quiz, at peace with the derivations.

(1) Since $\mu_N = 0$, $R_N(0) = \mathrm{Var}[N] = 1$. This implies

\[ R_N(0) = \int_{-\infty}^{\infty} S_N(f)\,df = \int_{-B}^{B} N_0\,df = 2N_0 B \tag{3} \]

Thus $N_0 = 1/(2B)$. Because the noise process $N(t)$ has constant power $R_N(0) = 1$, decreasing the single-sided bandwidth $B$ increases the power spectral density of the noise over frequencies $|f| < B$.

(2) Since $R_X(\tau) = \mathrm{sinc}(2W\tau)$, where $W = 5{,}000$ Hz, we see from Table 11.1 that

\[ S_X(f) = \frac{1}{10^4}\,\mathrm{rect}\left(\frac{f}{10^4}\right). \tag{4} \]

The noise power spectral density can be written as

\[ S_N(f) = N_0\,\mathrm{rect}\left(\frac{f}{2B}\right) = \frac{1}{2B}\,\mathrm{rect}\left(\frac{f}{2B}\right). \tag{5} \]

From Equation (11.146), the optimal filter is

\[ \hat{H}(f) = \frac{S_X(f)}{S_X(f) + S_N(f)} = \frac{\frac{1}{10^4}\mathrm{rect}\left(\frac{f}{10^4}\right)}{\frac{1}{10^4}\mathrm{rect}\left(\frac{f}{10^4}\right) + \frac{1}{2B}\mathrm{rect}\left(\frac{f}{2B}\right)}. \tag{6} \]

(3) We produce the output $\hat{X}(t)$ by passing the noisy signal $Y(t)$ through the filter $\hat{H}(f)$. From Equation (11.147), the mean square error of the estimate is
$$e_L^* = \int_{-\infty}^{\infty} \frac{S_X(f)\,S_N(f)}{S_X(f) + S_N(f)}\,df \tag{7}$$
$$\phantom{e_L^*} = \int_{-\infty}^{\infty} \frac{\frac{1}{10^4}\,\mathrm{rect}\!\left(\frac{f}{10^4}\right)\frac{1}{2B}\,\mathrm{rect}\!\left(\frac{f}{2B}\right)}{\frac{1}{10^4}\,\mathrm{rect}\!\left(\frac{f}{10^4}\right) + \frac{1}{2B}\,\mathrm{rect}\!\left(\frac{f}{2B}\right)}\,df. \tag{8}$$
To evaluate the MSE $e_L^*$, we need to know whether $B \le W$. Since the problem asks us to find the largest possible $B$, let's suppose $B \le W$. We can go back and consider the case $B > W$ later. When $B \le W$, the MSE is
$$e_L^* = \int_{-B}^{B} \frac{\frac{1}{10^4}\cdot\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{1 + \frac{5{,}000}{B}}. \tag{9}$$
To obtain MSE $e_L^* \le 0.05$ requires $B \le 5{,}000/19 = 263.16$ Hz.

Although this completes the solution to the quiz, what is happening may not be obvious. The noise power is always $\mathrm{Var}[N] = 1$ Watt, for all values of $B$. As $B$ is decreased, the PSD $S_N(f)$ becomes increasingly tall, but only over a bandwidth $B$ that is decreasing. Thus as $B$ decreases, the filter $\hat{H}(f)$ makes an increasingly deep and narrow notch at frequencies $|f| \le B$. Two examples of the filter $\hat{H}(f)$ are shown in Figure 7. As $B$ shrinks, the filter suppresses less of the signal of $X(t)$. The result is that the MSE goes down.

Finally, we note that we can choose $B$ very large and also achieve MSE $e_L^* = 0.05$. In particular, when $B > W = 5000$, $S_N(f) = 1/2B$ over frequencies $|f| < W$. In this case, the Wiener filter $\hat{H}(f)$ is an ideal (flat) lowpass filter
$$\hat{H}(f) = \begin{cases} \dfrac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} & |f| < 5{,}000, \\[2ex] 0 & \text{otherwise.} \end{cases} \tag{10}$$
Thus increasing $B$ spreads the constant 1 watt of power of $N(t)$ over more bandwidth. The Wiener filter removes the noise that is outside the band of the desired signal. The mean square error is
$$e_L^* = \int_{-5000}^{5000} \frac{\frac{1}{10^4}\cdot\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{\frac{B}{5000} + 1}. \tag{11}$$
In this case, $B \ge 9.5 \times 10^4$ guarantees $e_L^* \le 0.05$.

Quiz 11.10
It is fairly straightforward to find $S_X(\phi)$ and $S_Y(\phi)$. The only thing to keep in mind is to use fftc to transform the autocorrelation $R_X[k]$ into the power spectral density $S_X(\phi)$. The following MATLAB program generates and plots the functions shown in Figure 8:

    %mquiz11.m
    N=32;
    rx=[2 4 2];
    SX=fftc(rx,N); %autocorrelation and PSD
    stem(0:N-1,abs(SX));
    xlabel('n');ylabel('S_X(n/N)');
    h2=0.5*[1 1]; H2=fft(h2,N); %impulse/filter response: M=2
    SY2=SX.*((abs(H2)).^2);
    figure; stem(0:N-1,abs(SY2)); %PSD of Y for M=2
    xlabel('n');ylabel('S_{Y_2}(n/N)');
    h10=0.1*ones(1,10); H10=fft(h10,N); %impulse/filter response: M=10
    SY10=SX.*((abs(H10)).^2);
    figure; stem(0:N-1,abs(SY10)); %PSD of Y for M=10
    xlabel('n');ylabel('S_{Y_{10}}(n/N)');

Relative to $M = 2$, when $M = 10$, the filter $H(\phi)$ filters out almost all of the high frequency components of $X(t)$. In the context of Example 11.26, the low pass moving average filter for $M = 10$ removes the high frequency components and results in a filter output that varies very slowly.

As an aside, note that the vectors SX, SY2 and SY10 in mquiz11 should all be real-valued vectors. However, the finite numerical precision of MATLAB results in tiny imaginary parts. Although these imaginary parts have no computational significance, they tend to confuse the stem function. Hence, we generate stem plots of the magnitude of each power spectral density.

[Figure 7: Wiener filter for Quiz 11.9. Two panels showing $\hat{H}(f)$ versus $f$ for $-5000 \le f \le 5000$: $B = 500$ and $B = 2500$.]
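Returning to the Wiener filter of Quiz 11.9 (Figure 7), the closed-form MSE expressions in Equations (9) and (11) of that quiz can be checked by integrating Equation (7) numerically. The sketch below is only an illustration: the grid size, the use of trapz, and the helper names rect, SX, SN and integrand are choices made here, and the bandwidths tested are the two values shown in Figure 7 plus the two thresholds derived in the solution.

```matlab
% Numerical check of the Quiz 11.9 mean square error (illustrative sketch).
W = 5000;                                  % Hz, from R_X(tau) = sinc(2*W*tau)
rect = @(x) double(abs(x) < 0.5);          % unit-width rectangle pulse
SX = @(f) (1/(2*W)) * rect(f/(2*W));       % signal PSD, Equation (4)
SN = @(f,B) (1/(2*B)) * rect(f/(2*B));     % noise PSD, Equation (5)

% Integrand of Equation (7); max(...) avoids 0/0 where both PSDs vanish.
integrand = @(f,B) SX(f).*SN(f,B) ./ max(SX(f) + SN(f,B), realmin);

Bvals = [500, 2500, 5000/19, 9.5e4];       % Figure 7 values and the two thresholds
f = linspace(-W, W, 200001);               % S_X(f) is zero outside |f| < W
mse_numeric = zeros(size(Bvals));
mse_formula = zeros(size(Bvals));
for k = 1:numel(Bvals)
    B = Bvals(k);
    mse_numeric(k) = trapz(f, integrand(f, B));
    if B <= W
        mse_formula(k) = 1/(1 + W/B);      % Equation (9)
    else
        mse_formula(k) = 1/(1 + B/W);      % Equation (11)
    end
    fprintf('B = %9.2f Hz: numeric MSE = %.4f, formula = %.4f\n', ...
            B, mse_numeric(k), mse_formula(k));
end
```

The two threshold bandwidths should both give an MSE of 0.05, which is the sense in which either a very narrow or a very wide noise bandwidth meets the quiz requirement.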

[Figure 8: For Quiz 11.10, graphs of $S_X(\phi)$, $S_Y(n/N)$ for $M = 2$, and $S_Y(n/N)$ for $M = 10$ using an $N = 32$ point DFT. Three stem plots versus $n$, labeled $S_X(n/N)$, $S_{Y_2}(n/N)$ and $S_{Y_{10}}(n/N)$.]
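The spectra in Figure 8 can be cross-checked without fftc, whose source is not reproduced here. The sketch below assumes that rx = [2 4 2] holds the autocorrelation samples at lags -1, 0 and 1, so that the PSD has the closed form 4 + 4cos(2πφ); that interpretation is an assumption. It then uses the fact that the average of the N PSD samples equals the lag-zero autocorrelation.

```matlab
% Cross-check of the Quiz 11.10 spectra without fftc (illustrative sketch).
N = 32;
phi = (0:N-1)/N;                      % normalized frequencies n/N

% Assumption: rx = [2 4 2] holds R_X[-1], R_X[0], R_X[1], so that
% S_X(phi) = 4 + 4*cos(2*pi*phi), which is real and nonnegative.
SX = 4 + 4*cos(2*pi*phi);

H2  = fft(0.5*[1 1], N);              % moving-average filter, M = 2
H10 = fft(0.1*ones(1,10), N);         % moving-average filter, M = 10
SY2  = SX .* abs(H2).^2;
SY10 = SX .* abs(H10).^2;

% The average of the N PSD samples equals the average power R[0]
% (valid here because each autocorrelation spans fewer than N lags).
powers = [mean(SX), mean(SY2), mean(SY10)];
fprintf('R_X[0] = %.2f, R_Y[0] (M=2) = %.2f, R_Y[0] (M=10) = %.2f\n', powers);
```

For these inputs the three averages come out to 4, 3 and 0.76. The drop in output power for M = 10 reflects how much of the high frequency content the longer moving average removes.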

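Quiz Solutions – Chapter 12

Quiz 12.1
The system has two states depending on whether the previous packet was received in error. From the problem statement, we are given the conditional probabilities
$$P[X_{n+1} = 0 \mid X_n = 0] = 0.99, \qquad P[X_{n+1} = 1 \mid X_n = 1] = 0.9. \tag{1}$$
Since each $X_n$ must be either 0 or 1, we can conclude that
$$P[X_{n+1} = 1 \mid X_n = 0] = 0.01, \qquad P[X_{n+1} = 0 \mid X_n = 1] = 0.1. \tag{2}$$
These conditional probabilities correspond to the transition matrix and Markov chain:

[Markov chain diagram: states 0 and 1; 0 → 0 with probability 0.99, 0 → 1 with probability 0.01, 1 → 0 with probability 0.1, 1 → 1 with probability 0.9.]

$$P = \begin{bmatrix} 0.99 & 0.01 \\ 0.10 & 0.90 \end{bmatrix} \tag{3}$$

Quiz 12.2
From the problem statement, the Markov chain and the transition matrix are

[Markov chain diagram: states 0, 1 and 2, with the transition probabilities given by the entries of P.]

$$P = \begin{bmatrix} 0.4 & 0.6 & 0 \\ 0.2 & 0.6 & 0.2 \\ 0 & 0.6 & 0.4 \end{bmatrix} \tag{1}$$
The eigenvalues of $P$ are
$$\lambda_1 = 0, \qquad \lambda_2 = 0.4, \qquad \lambda_3 = 1. \tag{2}$$
We can diagonalize $P$ into
$$P = S^{-1} D S = \begin{bmatrix} -0.6 & 0.5 & 1 \\ 0.4 & 0 & 1 \\ -0.6 & -0.5 & 1 \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} \begin{bmatrix} -0.5 & 1 & -0.5 \\ 1 & 0 & -1 \\ 0.2 & 0.6 & 0.2 \end{bmatrix} \tag{3}$$
where $s_i$, the $i$th row of $S$, is the left eigenvector of $P$ satisfying $s_i P = \lambda_i s_i$. Algebra will verify that the $n$-step transition matrix is
$$P^n = S^{-1} D^n S = \begin{bmatrix} 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \end{bmatrix} + (0.4)^n \begin{bmatrix} 0.5 & 0 & -0.5 \\ 0 & 0 & 0 \\ -0.5 & 0 & 0.5 \end{bmatrix}. \tag{4}$$

Quiz 12.3
The Markov chain describing the factory status and the corresponding state transition matrix are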

[Markov chain diagram: states 0, 1 and 2, with the transition probabilities given by the entries of P.]

$$P = \begin{bmatrix} 0.9 & 0.1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \tag{1}$$
With $\boldsymbol{\pi} = [\pi_0\ \ \pi_1\ \ \pi_2]$, the system of equations $\boldsymbol{\pi} = \boldsymbol{\pi} P$ yields $\pi_1 = 0.1\pi_0$ and $\pi_2 = \pi_1$. This implies
$$\pi_0 + \pi_1 + \pi_2 = \pi_0(1 + 0.1 + 0.1) = 1. \tag{2}$$
It follows that the limiting state probabilities are
$$\pi_0 = 5/6, \qquad \pi_1 = 1/12, \qquad \pi_2 = 1/12. \tag{3}$$

Quiz 12.4
The communicating classes are
$$C_1 = \{0, 1\}, \qquad C_2 = \{2, 3\}, \qquad C_3 = \{4, 5, 6\}. \tag{1}$$
The states in $C_1$ and $C_3$ are aperiodic. The states in $C_2$ have period 2. Once the system enters a state in $C_1$, the class $C_1$ is never left. Thus the states in $C_1$ are recurrent. That is, $C_1$ is a recurrent class. Similarly, the states in $C_3$ are recurrent. On the other hand, the states in $C_2$ are transient. Once the system exits $C_2$, the states in $C_2$ are never reentered.

Quiz 12.5
At any time $t$, the state $n$ can take on the values $0, 1, 2, \ldots$. The state transition probabilities are
$$P_{n-1,n} = P[K > n \mid K > n-1] \tag{1}$$
$$\phantom{P_{n-1,n}} = \frac{P[K > n]}{P[K > n-1]} \tag{2}$$
$$P_{n-1,0} = P[K = n \mid K > n-1] = \frac{P[K = n]}{P[K > n-1]} \tag{3}$$
The Markov chain resembles

[Markov chain diagram: states 0, 1, 2, 3, 4, ...; arrows leaving state 0 labeled P[K=1], P[K=2], P[K=3], P[K=4], P[K=5], ...; each arrow from state n to state n − 1 labeled 1.]

The stationary probabilities satisfy
$$\pi_0 = \pi_0 P[K = 1] + \pi_1, \tag{4}$$
$$\pi_1 = \pi_0 P[K = 2] + \pi_2, \tag{5}$$
$$\vdots$$
$$\pi_{k-1} = \pi_0 P[K = k] + \pi_k, \qquad k = 1, 2, \ldots \tag{6}$$
From Equation (4), we obtain
$$\pi_1 = \pi_0\left(1 - P[K = 1]\right) = \pi_0 P[K > 1]. \tag{7}$$
Similarly, Equation (5) implies
$$\pi_2 = \pi_1 - \pi_0 P[K = 2] = \pi_0\left(P[K > 1] - P[K = 2]\right) = \pi_0 P[K > 2]. \tag{8}$$
This suggests that $\pi_k = \pi_0 P[K > k]$. We verify this pattern by showing that $\pi_k = \pi_0 P[K > k]$ satisfies Equation (6):
$$\pi_0 P[K > k-1] = \pi_0 P[K = k] + \pi_0 P[K > k]. \tag{9}$$
When we apply $\sum_{k=0}^{\infty} \pi_k = 1$, we obtain $\pi_0 \sum_{k=0}^{\infty} P[K > k] = 1$. From Problem 2.5.11, we recall that $\sum_{k=0}^{\infty} P[K > k] = E[K]$. This implies
$$\pi_n = \frac{P[K > n]}{E[K]}. \tag{10}$$
This Markov chain models repeated random countdowns. The system state is the time until the counter expires. When the counter expires, the system is in state 0, and we randomly reset the counter to a new value $K = k$ and then we count down $k$ units of time. Since we spend one unit of time in each state, including state 0, we have $k - 1$ units of time left after the state 0 counter reset. If we have a random variable $W$ such that the PMF of $W$ satisfies $P_W(n) = \pi_n$, then $W$ has a discrete PMF representing the remaining time of the counter at a time in the distant future.

Quiz 12.6
(1) By inspection, the number of transitions needed to return to state 0 is always a multiple of 2. Thus the period of state 0 is $d = 2$.

(2) To find the stationary probabilities, we solve the system of equations $\boldsymbol{\pi} = \boldsymbol{\pi} P$ and $\sum_{i=0}^{3} \pi_i = 1$:
$$\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 \tag{1}$$
$$\pi_1 = (1/4)\pi_0 + (1/4)\pi_2 \tag{2}$$
$$\pi_2 = (1/4)\pi_1 + (3/4)\pi_3 \tag{3}$$
$$1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 \tag{4}$$
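The four equations of Quiz 12.6 form a small linear system, so they can also be solved numerically as a cross-check on the hand algebra. The sketch below simply moves every unknown to the left-hand side; the names A, b and piv are choices made here.

```matlab
% Quiz 12.6, Equations (1)-(4), written as A*x = b with x = [pi0; pi1; pi2; pi3].
A = [   1, -3/4,    0, -1/4;   % Equation (1): pi0 - (3/4)pi1 - (1/4)pi3 = 0
     -1/4,    1, -1/4,    0;   % Equation (2): pi1 - (1/4)pi0 - (1/4)pi2 = 0
        0, -1/4,    1, -3/4;   % Equation (3): pi2 - (1/4)pi1 - (3/4)pi3 = 0
        1,    1,    1,    1];  % Equation (4): probabilities sum to 1
b = [0; 0; 0; 1];

piv = A \ b;                   % stationary probabilities as a column vector
fprintf('16 * pi = [%g %g %g %g]\n', 16*piv);
```

Scaling by 16 makes the result easy to read as a vector of small integers.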

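Solving the second and third equations for $\pi_2$ and $\pi_3$ yields
$$\pi_2 = 4\pi_1 - \pi_0, \qquad \pi_3 = (4/3)\pi_2 - (1/3)\pi_1 = 5\pi_1 - (4/3)\pi_0. \tag{5}$$
Substituting $\pi_3$ back into the first equation yields
$$\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 = (3/4)\pi_1 + (5/4)\pi_1 - (1/3)\pi_0. \tag{6}$$
This implies $\pi_1 = (2/3)\pi_0$. It follows from the first and second equations that $\pi_2 = (5/3)\pi_0$ and $\pi_3 = 2\pi_0$. Lastly, we choose $\pi_0$ so the state probabilities sum to 1:
$$1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 = \pi_0\left(1 + \frac{2}{3} + \frac{5}{3} + 2\right) = \frac{16}{3}\pi_0. \tag{7}$$
It follows that the state probabilities are
$$\pi_0 = \frac{3}{16}, \qquad \pi_1 = \frac{2}{16}, \qquad \pi_2 = \frac{5}{16}, \qquad \pi_3 = \frac{6}{16}. \tag{8}$$

(3) Since the system starts in state 0 at time 0, we can use Theorem 12.14 to find the limiting probability that the system is in state 0 at time $nd$:
$$\lim_{n\to\infty} P_{00}(nd) = d\pi_0 = \frac{3}{8}. \tag{9}$$

Quiz 12.7
The Markov chain has the same structure as that in Example 12.22. The only difference is the modified transition probabilities:

[Markov chain diagram: states 0, 1, 2, 3, 4, ...; 0 → 1 with probability 1; n → n + 1 with probability $(n/(n+1))^\alpha$ for n ≥ 1, that is $(1/2)^\alpha$, $(2/3)^\alpha$, $(3/4)^\alpha$, $(4/5)^\alpha$, ...; n → 0 with probability $1 - (n/(n+1))^\alpha$.]

The event $T_{00} > n$ occurs if the system reaches state $n$ before returning to state 0, which occurs with probability
$$P[T_{00} > n] = 1 \times \left(\frac{1}{2}\right)^{\alpha} \times \left(\frac{2}{3}\right)^{\alpha} \times \cdots \times \left(\frac{n-1}{n}\right)^{\alpha} = \left(\frac{1}{n}\right)^{\alpha}. \tag{1}$$
Thus the CDF of $T_{00}$ satisfies $F_{T_{00}}(n) = 1 - P[T_{00} > n] = 1 - 1/n^{\alpha}$. To determine whether state 0 is recurrent, we observe that for all $\alpha > 0$
$$P[V_{00}] = \lim_{n\to\infty} F_{T_{00}}(n) = \lim_{n\to\infty} 1 - \frac{1}{n^{\alpha}} = 1. \tag{2}$$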
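The telescoping product in Equation (1) of Quiz 12.7 is easy to confirm numerically. The sketch below multiplies the forward transition probabilities directly and compares the running product with $n^{-\alpha}$; the three values of α and the horizon n = 1000 are arbitrary illustrative choices.

```matlab
% Check of P[T00 > n] = n^(-alpha) for the Quiz 12.7 chain (illustrative sketch).
n = 1000;                         % horizon, chosen only for illustration
alphas = [0.5, 1, 2];             % hypothetical values of alpha
max_err = zeros(size(alphas));
F_end = zeros(size(alphas));
for k = 1:numel(alphas)
    a = alphas(k);
    steps = ((1:n-1) ./ (2:n)).^a;   % probability of moving m -> m+1, m = 1..n-1
    tail = cumprod([1, steps]);      % tail(m) = P[T00 > m] for m = 1..n
    max_err(k) = max(abs(tail - (1:n).^(-a)));
    F_end(k) = 1 - tail(end);        % CDF value F_{T00}(n)
    fprintf('alpha = %.1f: max error = %.2e, F(%d) = %.4f\n', ...
            a, max_err(k), n, F_end(k));
end
```

The CDF values approach 1 for every α tried, although slowly when α is small.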

Thus state 0 is recurrent for all $\alpha > 0$. Since the chain has only one communicating class, all states are recurrent. (We also note that if $\alpha = 0$, then all states are transient.)

To determine whether the chain is null recurrent or positive recurrent, we need to calculate $E[T_{00}]$. In Example 12.24, we did this by deriving the PMF $P_{T_{00}}(n)$. In this problem, it will be simpler to use the result of Problem 2.5.11, which says that $\sum_{k=0}^{\infty} P[K > k] = E[K]$ for any non-negative integer-valued random variable $K$. Applying this result, the expected time to return to state 0 is
$$E[T_{00}] = \sum_{n=0}^{\infty} P[T_{00} > n] = 1 + \sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}. \tag{3}$$
For $0 < \alpha \le 1$, $1/n^{\alpha} \ge 1/n$ and it follows that
$$E[T_{00}] \ge 1 + \sum_{n=1}^{\infty} \frac{1}{n} = \infty. \tag{4}$$
We conclude that the Markov chain is null recurrent for $0 < \alpha \le 1$. On the other hand, for $\alpha > 1$,
$$E[T_{00}] = 2 + \sum_{n=2}^{\infty} \frac{1}{n^{\alpha}}. \tag{5}$$
Note that for all $n \ge 2$
$$\frac{1}{n^{\alpha}} \le \int_{n-1}^{n} \frac{dx}{x^{\alpha}}. \tag{6}$$
This implies
$$E[T_{00}] \le 2 + \sum_{n=2}^{\infty} \int_{n-1}^{n} \frac{dx}{x^{\alpha}} \tag{7}$$
$$\phantom{E[T_{00}]} = 2 + \int_{1}^{\infty} \frac{dx}{x^{\alpha}} \tag{8}$$
$$\phantom{E[T_{00}]} = 2 + \left.\frac{x^{-\alpha+1}}{-\alpha+1}\right|_{1}^{\infty} = 2 + \frac{1}{\alpha - 1} < \infty. \tag{9}$$
Thus for all $\alpha > 1$, the Markov chain is positive recurrent.

Quiz 12.8
The number of customers in the "friendly" store is given by the Markov chain

[Markov chain diagram: states 0, 1, ..., i, i + 1, ...; transitions i → i + 1 labeled p; transitions i + 1 → i labeled (1 − p)q; self-transitions labeled (1 − p)(1 − q).]
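The chain sketched above for Quiz 12.8 can be explored numerically by truncating it to a finite number of states. In the sketch below, the values p = 0.2 and q = 0.5 and the truncation level M = 40 are hypothetical, and the self-transition at each boundary state is set to whatever probability remains so that every row sums to 1. The ratio of successive stationary probabilities comes out constant.

```matlab
% Truncated version of the Quiz 12.8 chain (illustrative sketch).
p = 0.2;  q = 0.5;            % hypothetical arrival and service probabilities
M = 40;                       % truncation level (assumption): states 0..M
up = p;                       % i -> i+1
down = (1-p)*q;               % i+1 -> i

P = zeros(M+1);
for i = 1:M+1                 % row i corresponds to state i-1
    if i <= M, P(i,i+1) = up;   end
    if i >= 2, P(i,i-1) = down; end
    P(i,i) = 1 - sum(P(i,:)); % remaining probability stays in the same state
end

% Solve pi = pi*P together with sum(pi) = 1 in the least-squares sense.
A = [P' - eye(M+1); ones(1, M+1)];
b = [zeros(M+1, 1); 1];
piv = A \ b;

ratios = piv(2:6) ./ piv(1:5);          % pi_{i+1}/pi_i for i = 0..4
alpha = p / ((1-p)*q);
fprintf('ratios: %s, p/((1-p)q) = %.4f\n', mat2str(ratios', 4), alpha);
```

With these values the common ratio is p/((1 − p)q) = 0.5, so the stationary probabilities decay geometrically and the truncation has a negligible effect.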

In the above chain, we note that $(1-p)q$ is the probability that no new customer arrives while an existing customer receives one unit of service and then departs the store.

By applying Theorem 12.13 with state space partitioned between $S = \{0, 1, \ldots, i\}$ and $S' = \{i+1, i+2, \ldots\}$, we see that for any state $i \ge 0$,
$$\pi_i\, p = \pi_{i+1}(1-p)q. \tag{1}$$
This implies
$$\pi_{i+1} = \frac{p}{(1-p)q}\,\pi_i. \tag{2}$$
Since Equation (2) holds for $i = 0, 1, \ldots$, we have that $\pi_i = \pi_0 \alpha^i$ where
$$\alpha = \frac{p}{(1-p)q}. \tag{3}$$
Requiring the state probabilities to sum to 1, we have that for $\alpha < 1$,
$$\sum_{i=0}^{\infty} \pi_i = \pi_0 \sum_{i=0}^{\infty} \alpha^i = \frac{\pi_0}{1-\alpha} = 1. \tag{4}$$
Thus for $\alpha < 1$, the limiting state probabilities are
$$\pi_i = (1-\alpha)\alpha^i, \qquad i = 0, 1, 2, \ldots \tag{5}$$
In addition, for $\alpha \ge 1$ or, equivalently, $p \ge q/(1+q)$, the limiting state probabilities do not exist.

Quiz 12.9
The continuous time Markov chain describing the processor is

[Markov chain diagram: states 0, 1, 2, 3, 4; transitions n → n + 1 at rate 2 for n = 0, ..., 3; transitions n → n − 1 at the task completion rate 3; reboot transitions from each state n ≠ 0 to state 0.]

Note that $q_{10} = 3.1$ since the task completes at rate 3 per msec and the processor reboots at rate 0.1 per msec, and the rate to state 0 is the sum of those two rates. From the Markov chain, we obtain the following useful equations for the stationary distribution.
$$5.1\,p_1 = 2p_0 + 3p_2 \qquad\qquad 5.1\,p_2 = 2p_1 + 3p_3$$
$$5.1\,p_3 = 2p_2 + 3p_4 \qquad\qquad 3.1\,p_4 = 2p_3$$
We can solve these equations by working backward and solving for $p_4$ in terms of $p_3$, $p_3$ in terms of $p_2$ and so on, yielding
$$p_4 = \frac{20}{31}\,p_3, \qquad p_3 = \frac{620}{981}\,p_2, \qquad p_2 = \frac{19620}{31431}\,p_1, \qquad p_1 = \frac{628{,}620}{1{,}014{,}381}\,p_0. \tag{1}$$
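The back-substitution for Quiz 12.9 can be cross-checked by building the rate matrix and solving the balance equations numerically. The sketch below assumes the rates stated in the note on $q_{10}$: task arrivals at rate 2, completions at rate 3, and reboots to state 0 at rate 0.1 per msec from every nonzero state. The variable names are choices made here.

```matlab
% Numerical solution of the Quiz 12.9 balance equations (illustrative sketch).
lam = 2;      % task arrival rate per msec (states 0..3)
mu  = 3;      % task completion rate per msec (states 1..4)
r   = 0.1;    % reboot rate per msec, as assumed from the note on q10

Q = zeros(5);                     % row n+1 holds the rates out of state n
for n = 0:4
    if n < 4
        Q(n+1, n+2) = lam;        % n -> n+1
    end
    if n > 0
        Q(n+1, n) = Q(n+1, n) + mu;   % n -> n-1
        Q(n+1, 1) = Q(n+1, 1) + r;    % n -> 0 (adds to mu when n = 1)
    end
    Q(n+1, n+1) = -sum(Q(n+1, :));    % diagonal makes each row sum to zero
end

% Stationary distribution: p*Q = 0 with sum(p) = 1.
A = [Q'; ones(1, 5)];
b = [zeros(5, 1); 1];
pvec = A \ b;

step_ratios = pvec(2:5) ./ pvec(1:4);   % p1/p0, p2/p1, p3/p2, p4/p3
fprintf('p = %s\n', mat2str(pvec', 4));
```

The entries of step_ratios should match the fractions in Equation (1), read from right to left, and the normalized vector gives the stationary probabilities directly.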

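Applying $p_0 + p_1 + p_2 + p_3 + p_4 = 1$ yields $p_0 = 1{,}014{,}381/2{,}443{,}401$ and the stationary probabilities are
$$p_0 = 0.4151, \qquad p_1 = 0.2573, \qquad p_2 = 0.1606, \qquad p_3 = 0.1015, \qquad p_4 = 0.0655. \tag{2}$$

Quiz 12.10
The M/M/c/∞ queue has Markov chain

[Markov chain diagram: states 0, 1, ..., c, c + 1, ...; each transition n → n + 1 occurs at rate λ; transitions n → n − 1 occur at rate nμ for n ≤ c and at rate cμ for n > c.]

From the Markov chain, the stationary probabilities must satisfy
$$p_n = \begin{cases} (\rho/n)\,p_{n-1} & n = 1, 2, \ldots, c, \\ (\rho/c)\,p_{n-1} & n = c+1, c+2, \ldots \end{cases} \tag{1}$$
where $\rho = \lambda/\mu$. It is straightforward to show that this implies
$$p_n = \begin{cases} p_0\,\rho^n/n! & n = 1, 2, \ldots, c, \\ p_0\,(\rho/c)^{n-c}\,\rho^c/c! & n = c+1, c+2, \ldots \end{cases} \tag{2}$$
The requirement that $\sum_{n=0}^{\infty} p_n = 1$ yields
$$p_0 = \left(\sum_{n=0}^{c} \frac{\rho^n}{n!} + \frac{\rho^c}{c!}\,\frac{\rho/c}{1 - \rho/c}\right)^{-1}. \tag{3}$$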
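Equation (3) of Quiz 12.10 can be checked against a brute-force normalization of the recursion in Equation (1). The sketch below uses hypothetical values λ = 3, μ = 1 and c = 4 (so that ρ/c < 1), and truncates the infinite sum at 400 states, where the geometric tail is negligible.

```matlab
% Check of the M/M/c/inf normalization constant (illustrative sketch).
lambda = 3;  mu = 1;  c = 4;      % hypothetical rates and server count
rho = lambda/mu;                  % requires rho/c < 1 for a stationary solution

% Closed form, Equation (3).
n = 0:c;
p0 = 1 / (sum(rho.^n ./ factorial(n)) + ...
          (rho^c / factorial(c)) * (rho/c) / (1 - rho/c));

% Brute force: unnormalized weights from recursion (1), truncated at Nmax.
Nmax = 400;
w = zeros(1, Nmax+1);
w(1) = 1;                         % weight of state 0
for k = 1:Nmax
    w(k+1) = w(k) * rho / min(k, c);   % rho/k up to k = c, then rho/c
end
p0_check = 1 / sum(w);

fprintf('p0 (closed form) = %.6f, p0 (truncated sum) = %.6f\n', p0, p0_check);
```

The same loop can be reused to tabulate the individual probabilities as p0 times each weight if the full distribution is of interest.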
