

**A Friendly Introduction for Electrical and Computer Engineers**

Second Edition

Quiz Solutions

Roy D. Yates and David J. Goodman

May 22, 2004

• The MATLAB section quizzes at the end of each chapter use programs available for download as the archive matcode.zip. This archive contains general purpose programs for solving probability problems as well as specific .m files associated with examples or quizzes in the text. Also available is a manual probmatlab.pdf describing the general purpose .m files in matcode.zip.

• We have made a substantial effort to check the solution to every quiz. Nevertheless, there is a nonzero probability (in fact, a probability close to unity) that errors will be found. If you find errors or have suggestions or comments, please send email to ryates@winlab.rutgers.edu. When errors are found, corrected solutions will be posted at the website.

Quiz Solutions – Chapter 1

Quiz 1.1

In the Venn diagrams for parts (1)-(6) below, the shaded area represents the indicated set. [Venn diagrams omitted; each shows the events M, O, and T.]

(1) R = T^c   (2) M ∪ O   (3) M ∩ O

(4) R ∪ M   (5) R ∩ M   (6) T^c − M

Quiz 1.2

(1) A_1 = {vvv, vvd, vdv, vdd}
(2) B_1 = {dvv, dvd, ddv, ddd}
(3) A_2 = {vvv, vvd, dvv, dvd}
(4) B_2 = {vdv, vdd, ddv, ddd}
(5) A_3 = {vvv, ddd}
(6) B_3 = {vdv, dvd}
(7) A_4 = {vvv, vvd, vdv, dvv, vdd, dvd, ddv}
(8) B_4 = {ddd, ddv, dvd, vdd}

Recall that A_i and B_i are collectively exhaustive if A_i ∪ B_i = S. Also, A_i and B_i are mutually exclusive if A_i ∩ B_i = φ. Since we have written down each pair A_i and B_i above, we can simply check for these properties.

The pair A_1 and B_1 are mutually exclusive and collectively exhaustive. The pair A_2 and B_2 are mutually exclusive and collectively exhaustive. The pair A_3 and B_3 are mutually exclusive but not collectively exhaustive. The pair A_4 and B_4 are not mutually exclusive since dvd belongs to A_4 and B_4. However, A_4 and B_4 are collectively exhaustive.

Quiz 1.3

There are exactly 50 equally likely outcomes: s_51 through s_100. Each of these outcomes has probability 0.02.

(1) P[{s_79}] = 0.02
(2) P[{s_100}] = 0.02
(3) P[A] = P[{s_90, . . . , s_100}] = 11 × 0.02 = 0.22
(4) P[F] = P[{s_51, . . . , s_59}] = 9 × 0.02 = 0.18
(5) P[T ≥ 80] = P[{s_80, . . . , s_100}] = 21 × 0.02 = 0.42
(6) P[T < 90] = P[{s_51, s_52, . . . , s_89}] = 39 × 0.02 = 0.78
(7) P[a C grade or better] = P[{s_70, . . . , s_100}] = 31 × 0.02 = 0.62
(8) P[student passes] = P[{s_60, . . . , s_100}] = 41 × 0.02 = 0.82

Quiz 1.4

We can describe this experiment by the event space consisting of the four possible events V B, V L, DB, and DL. We represent these events in the table:

         V      D
L       0.35    ?
B        ?      ?

In a roundabout way, the problem statement tells us how to fill in the table. In particular,

    P[V] = 0.7 = P[V L] + P[V B]   (1)
    P[L] = 0.6 = P[V L] + P[DL]    (2)

Since P[V L] = 0.35, we can conclude that P[V B] = 0.35 and that P[DL] = 0.6 − 0.35 = 0.25. This allows us to fill in two more table entries:

         V      D
L       0.35   0.25
B       0.35    ?

The remaining table entry is filled in by observing that the probabilities must sum to 1. This implies P[DB] = 0.05 and the complete table is

         V      D
L       0.35   0.25
B       0.35   0.05

Finding the various probabilities is now straightforward:

(1) P[DL] = 0.25
(2) P[D ∪ L] = P[V L] + P[DL] + P[DB] = 0.35 + 0.25 + 0.05 = 0.65
(3) P[V B] = 0.35
(4) P[V ∪ L] = P[V] + P[L] − P[V L] = 0.7 + 0.6 − 0.35 = 0.95
(5) P[V ∪ D] = P[S] = 1
(6) P[LB] = P[LL^c] = 0

Quiz 1.5

(1) The probability of exactly two voice calls is

    P[N_V = 2] = P[{vvd, vdv, dvv}] = 0.3   (1)

(2) The probability of at least one voice call is

    P[N_V ≥ 1] = P[{vdd, dvd, ddv, vvd, vdv, dvv, vvv}] = 6(0.1) + 0.2 = 0.8   (2)

An easier way to get the same answer is to observe that

    P[N_V ≥ 1] = 1 − P[N_V < 1] = 1 − P[N_V = 0] = 1 − P[{ddd}] = 0.8   (3)

(3) The conditional probability of two voice calls followed by a data call given that there were two voice calls is

    P[{vvd}|N_V = 2] = P[{vvd}, N_V = 2]/P[N_V = 2] = P[{vvd}]/P[N_V = 2] = 0.1/0.3 = 1/3   (4)

(4) The conditional probability of two data calls followed by a voice call given there were two voice calls is

    P[{ddv}|N_V = 2] = P[{ddv}, N_V = 2]/P[N_V = 2] = 0   (5)

The joint event of the outcome ddv and exactly two voice calls has probability zero since there is only one voice call in the outcome ddv.

(5) The conditional probability of exactly two voice calls given at least one voice call is

    P[N_V = 2|N_V ≥ 1] = P[N_V = 2, N_V ≥ 1]/P[N_V ≥ 1] = P[N_V = 2]/P[N_V ≥ 1] = 0.3/0.8 = 3/8   (6)

(6) The conditional probability of at least one voice call given there were exactly two voice calls is

    P[N_V ≥ 1|N_V = 2] = P[N_V ≥ 1, N_V = 2]/P[N_V = 2] = P[N_V = 2]/P[N_V = 2] = 1   (7)

Given that there were two voice calls, there must have been at least one voice call.

Quiz 1.6

In this experiment, there are four outcomes with probabilities

    P[{vv}] = (0.8)^2 = 0.64    P[{vd}] = (0.8)(0.2) = 0.16
    P[{dv}] = (0.2)(0.8) = 0.16    P[{dd}] = (0.2)^2 = 0.04

When checking the independence of any two events A and B, it's wise to avoid intuition and simply check whether P[AB] = P[A]P[B]. Using the probabilities of the outcomes, we now can test for the independence of events.

(1) First, we calculate the probability of the joint event:

    P[N_V = 2, N_V ≥ 1] = P[N_V = 2] = P[{vv}] = 0.64   (1)

Next, we observe that

    P[N_V ≥ 1] = P[{vd, dv, vv}] = 0.96   (2)

Finally, we make the comparison

    P[N_V = 2] P[N_V ≥ 1] = (0.64)(0.96) ≠ P[N_V = 2, N_V ≥ 1]   (3)

which shows the two events are dependent.

(2) The probability of the joint event is

    P[N_V ≥ 1, C_1 = v] = P[{vd, vv}] = 0.80   (4)

From part (1), P[N_V ≥ 1] = 0.96. Further, P[C_1 = v] = 0.8 so that

    P[N_V ≥ 1] P[C_1 = v] = (0.96)(0.8) = 0.768 ≠ P[N_V ≥ 1, C_1 = v]   (5)

Hence, the events are dependent.

(3) The problem statement that the calls were independent implies that the events the second call is a voice call, {C_2 = v}, and the first call is a data call, {C_1 = d}, are independent events. Just to be sure, we can do the calculations to check:

    P[C_1 = d, C_2 = v] = P[{dv}] = 0.16   (6)

Since P[C_1 = d]P[C_2 = v] = (0.2)(0.8) = 0.16, we confirm that the events are independent. Note that this shouldn't be surprising since we used the information that the calls were independent in the problem statement to determine the probabilities of the outcomes.

(4) The probability of the joint event is

    P[C_2 = v, N_V is even] = P[{vv}] = 0.64   (7)

Also, each event has probability

    P[C_2 = v] = P[{dv, vv}] = 0.8,   P[N_V is even] = P[{dd, vv}] = 0.68   (8)

Thus, P[C_2 = v]P[N_V is even] = (0.8)(0.68) = 0.544. Since P[C_2 = v, N_V is even] ≠ 0.544, the events are dependent.

Quiz 1.7

Let F_i denote the event that the user is found on page i. The tree for the experiment branches three times: at each stage, the user is found (F_i, probability 0.8) or not found (F_i^c, probability 0.2), and paging continues only after a failure. [Tree diagram omitted.]

The user is found unless all three paging attempts fail. Thus the probability the user is found is

    P[F] = 1 − P[F_1^c F_2^c F_3^c] = 1 − (0.2)^3 = 0.992   (1)
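This result is easy to verify by simulation. The following fragment is a minimal sketch using only built-in MATLAB functions (the trial count 10^6 is arbitrary):

trials=1e6;
found=rand(trials,3)<=0.8;   % column j is the success of paging attempt j
mean(any(found,2))           % estimate of P[F]; compare 1-0.2^3 = 0.992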

Quiz 1.8

(1) We can view choosing each bit in the code word as a subexperiment. Each subexperiment has two possible outcomes: 0 and 1. Thus by the fundamental principle of counting, there are 2 × 2 × 2 × 2 = 2^4 = 16 possible code words.

(2) An experiment that can yield all possible code words with two zeroes is to choose which 2 bits (out of 4 bits) will be zero. The other two bits then must be ones. There are (4 choose 2) = 6 ways to do this. Hence, there are six code words with exactly two zeroes. For this problem, it is also possible to simply enumerate the six code words: 1100, 1010, 1001, 0101, 0110, 0011.

(3) When the first bit must be a zero, then the first subexperiment of choosing the first bit has only one outcome. For each of the next three bits, we have two choices. In this case, there are 1 × 2 × 2 × 2 = 8 ways of choosing a code word.

(4) For the constant ratio code, we can specify a code word by choosing M of the bits to be ones. The other N − M bits will be zeroes. The number of ways of choosing such a code word is (N choose M). For N = 8 and M = 3, there are (8 choose 3) = 56 code words.

Quiz 1.9

(1) In this problem, k bits received in error is the same as k failures in 100 trials. The failure probability is ε = 1 − p and the success probability is 1 − ε = p. That is, the probability of k bits in error and 100 − k correctly received bits is

    P[S_{k,100−k}] = (100 choose k) ε^k (1 − ε)^{100−k}   (1)

For ε = 0.01,

    P[S_{0,100}] = (1 − ε)^100 = (0.99)^100 = 0.3660   (2)
    P[S_{1,99}] = 100(0.01)(0.99)^99 = 0.3700   (3)
    P[S_{2,98}] = 4950(0.01)^2(0.99)^98 = 0.1849   (4)
    P[S_{3,97}] = 161,700(0.01)^3(0.99)^97 = 0.0610   (5)

(2) The probability a packet is decoded correctly is just

    P[C] = P[S_{0,100}] + P[S_{1,99}] + P[S_{2,98}] + P[S_{3,97}] = 0.9819   (6)
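The four terms are easy to evaluate numerically. A short sketch using the built-in nchoosek function:

e=0.01; p=zeros(1,4);
for k=0:3,
    p(k+1)=nchoosek(100,k)*e^k*(1-e)^(100-k);
end;
p        % the four probabilities above, up to rounding
sum(p)   % P[C], approximately 0.982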

Quiz 1.10

Since the chip works only if all n transistors work, the transistors in the chip are like devices in series. The probability that a chip works is P[C] = p^n.

The module works if either 8 chips work or 9 chips work. Let C_k denote the event that exactly k chips work. Since transistor failures are independent of each other, chip failures are also independent. Thus each P[C_k] has the binomial probability

    P[C_8] = (9 choose 8)(P[C])^8 (1 − P[C])^{9−8} = 9p^{8n}(1 − p^n),   (1)
    P[C_9] = (P[C])^9 = p^{9n}.   (2)

The probability a memory module works is

    P[M] = P[C_8] + P[C_9] = p^{8n}(9 − 8p^n)   (3)

Quiz 1.11

R=rand(1,100);
X=(R<=0.4) ...
    + (2*(R>0.4).*(R<=0.9)) ...
    + (3*(R>0.9));
Y=hist(X,1:3)

For a MATLAB simulation, we first generate a vector R of 100 random numbers. Second, we generate vector X as a function of R to represent the 3 possible outcomes of a flip. That is, X(i)=1 if flip i was heads, X(i)=2 if flip i was tails, and X(i)=3 if flip i landed on the edge.

To see how this works, we note there are three cases:

• If R(i) <= 0.4, then X(i)=1.
• If 0.4 < R(i) and R(i) <= 0.9, then X(i)=2.
• If 0.9 < R(i), then X(i)=3.

These three cases have probabilities 0.4, 0.5 and 0.1. Lastly, we use the hist function to count the occurrences of each possible value of X(i).
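With a larger number of flips, the relative frequencies returned by hist approach the probabilities 0.4, 0.5 and 0.1. A sketch of the same simulation with an arbitrary sample size of 10^6:

R=rand(1,1e6);
X=(R<=0.4)+2*((R>0.4)&(R<=0.9))+3*(R>0.9);
hist(X,1:3)/length(X)   % approximately [0.4 0.5 0.1]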

Quiz Solutions – Chapter 2

Quiz 2.1

The sample space, probabilities and corresponding grades for the experiment are

Outcome   P[·]   G
BB        0.36   3.0
BC        0.24   2.5
CB        0.24   2.5
CC        0.16   2.0

Quiz 2.2

(1) To find c, we recall that the PMF must sum to 1. That is,

    Σ_{n=1}^{3} P_N(n) = c(1 + 1/2 + 1/3) = 1   (1)

This implies c = 6/11. Now that we have found c, the remaining parts are straightforward.

(2) P[N = 1] = P_N(1) = c = 6/11
(3) P[N ≥ 2] = P_N(2) + P_N(3) = c/2 + c/3 = 5/11
(4) P[N > 3] = Σ_{n=4}^{∞} P_N(n) = 0

Quiz 2.3

Decoding each transmitted bit is an independent trial where we call a bit error a "success." Each bit is in error, that is, the trial is a success, with probability p. Now we can interpret each experiment in the generic context of independent trials.

(1) The random variable X is the number of trials up to and including the first success. Similar to Example 2.11, X has the geometric PMF

    P_X(x) = p(1 − p)^{x−1} for x = 1, 2, . . . , and 0 otherwise.   (1)

(2) If p = 0.1, then the probability exactly 10 bits are sent is

    P[X = 10] = P_X(10) = (0.1)(0.9)^9 = 0.0387   (2)

The probability that at least 10 bits are sent is P[X ≥ 10] = Σ_{x=10}^{∞} P_X(x). This sum is not too hard to calculate. However, it's even easier to observe that X ≥ 10 if the first 10 bits are transmitted correctly. That is,

    P[X ≥ 10] = P[first 10 bits are correct] = (1 − p)^10   (3)

For p = 0.1, P[X ≥ 10] = 0.9^10 = 0.3487.

(3) The random variable Y is the number of successes in 100 independent trials. Just as in Example 2.13, Y has the binomial PMF

    P_Y(y) = (100 choose y) p^y (1 − p)^{100−y}   (4)

If p = 0.01, the probability of exactly 2 errors is

    P[Y = 2] = P_Y(2) = (100 choose 2)(0.01)^2(0.99)^98 = 0.1849   (5)

(4) The probability of no more than 2 errors is

    P[Y ≤ 2] = P_Y(0) + P_Y(1) + P_Y(2)   (6)
             = (0.99)^100 + 100(0.01)(0.99)^99 + (100 choose 2)(0.01)^2(0.99)^98   (7)
             = 0.9207   (8)

(5) Random variable Z is the number of trials up to and including the third success. Thus Z has the Pascal PMF (see Example 2.15)

    P_Z(z) = (z − 1 choose 2) p^3 (1 − p)^{z−3}   (9)

Note that P_Z(z) > 0 for z = 3, 4, 5, . . ..

(6) If p = 0.25, the probability that the third error occurs on bit 12 is

    P_Z(12) = (11 choose 2)(0.25)^3(0.75)^9 = 0.0645   (10)
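Each of these numerical answers can be reproduced with a few lines of MATLAB. A sketch using the built-in nchoosek function:

p=0.1;  PX10=p*(1-p)^9                     % 0.0387
PXge10=(1-p)^10                            % 0.3487
p=0.01; PY2=nchoosek(100,2)*p^2*(1-p)^98   % 0.1849
PYle2=0;
for y=0:2, PYle2=PYle2+nchoosek(100,y)*p^y*(1-p)^(100-y); end
PYle2                                      % 0.9207
p=0.25; PZ12=nchoosek(11,2)*p^3*(1-p)^9    % 0.0645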

Quiz 2.4

Each of these probabilities can be read off the CDF F_Y(y). However, we must keep in mind that when F_Y(y) has a discontinuity at y_0, F_Y(y) takes the upper value F_Y(y_0^+).

(1) P[Y < 1] = F_Y(1^−) = 0
(2) P[Y ≤ 1] = F_Y(1) = 0.6
(3) P[Y > 2] = 1 − P[Y ≤ 2] = 1 − F_Y(2) = 1 − 0.8 = 0.2
(4) P[Y ≥ 2] = 1 − P[Y < 2] = 1 − F_Y(2^−) = 1 − 0.6 = 0.4
(5) P[Y = 1] = P[Y ≤ 1] − P[Y < 1] = F_Y(1^+) − F_Y(1^−) = 0.6
(6) P[Y = 3] = P[Y ≤ 3] − P[Y < 3] = F_Y(3^+) − F_Y(3^−) = 0.8 − 0.8 = 0

Quiz 2.5

(1) With probability 0.7, a call is a voice call and C = 25. Otherwise, with probability 0.3, we have a data call and C = 40. This corresponds to the PMF

    P_C(c) = 0.7 for c = 25; 0.3 for c = 40; and 0 otherwise.   (1)

(2) The expected value of C is

    E[C] = 25(0.7) + 40(0.3) = 29.5 cents   (2)

Quiz 2.6

(1) As a function of N, the cost T is

    T = 25N + 40(3 − N) = 120 − 15N   (1)

(2) To find the PMF of T, we can draw a tree with branches N=0 (probability 0.1) and N=1, N=2, N=3 (probability 0.3 each), leading to the leaves T=120, T=105, T=90 and T=75 respectively. [Tree diagram omitted.] From the tree, we can write down the PMF of T:

    P_T(t) = 0.3 for t = 75, 90, 105; 0.1 for t = 120; and 0 otherwise.   (2)

From the PMF P_T(t), the expected value of T is

    E[T] = 75P_T(75) + 90P_T(90) + 105P_T(105) + 120P_T(120)   (3)
         = (75 + 90 + 105)(0.3) + 120(0.1) = 93   (4)
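The expected value is easily double-checked from the PMF (a minimal sketch):

t=[75 90 105 120]; pt=[0.3 0.3 0.3 0.1];
sum(t.*pt)   % 93; equivalently 120-15*1.8, since E[N]=1.8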

Quiz 2.7

(1) Using Definition 2.14, the expected number of applications is

    E[A] = Σ_{a=1}^{4} a P_A(a) = 1(0.4) + 2(0.3) + 3(0.2) + 4(0.1) = 2   (1)

(2) The number of memory chips is M = g(A) where

    g(A) = 4 for A = 1, 2; 6 for A = 3; and 8 for A = 4.   (2)

(3) By Theorem 2.10, the expected number of memory chips is

    E[M] = Σ_{a=1}^{4} g(a) P_A(a) = 4(0.4) + 4(0.3) + 6(0.2) + 8(0.1) = 4.8   (3)

Since E[A] = 2, g(E[A]) = g(2) = 4. However, E[M] = 4.8 ≠ g(E[A]). The two quantities are different because g(A) is not of the form αA + β.

Quiz 2.8

The PMF P_N(n) allows us to calculate each of the desired quantities.

(1) The expected value of N is

    E[N] = Σ_{n=0}^{2} n P_N(n) = 0(0.1) + 1(0.4) + 2(0.5) = 1.4   (1)

(2) The second moment of N is

    E[N^2] = Σ_{n=0}^{2} n^2 P_N(n) = 0^2(0.1) + 1^2(0.4) + 2^2(0.5) = 2.4   (2)

(3) The variance of N is

    Var[N] = E[N^2] − (E[N])^2 = 2.4 − (1.4)^2 = 0.44   (3)

(4) The standard deviation is σ_N = √Var[N] = √0.44 = 0.663.

Quiz 2.9

(1) From the problem statement, we learn that the conditional PMF of N given the event I is

    P_{N|I}(n) = 0.02 for n = 1, 2, . . . , 50, and 0 otherwise.   (1)

(2) Also from the problem statement, the conditional PMF of N given the event T is

    P_{N|T}(n) = 0.2 for n = 1, 2, 3, 4, 5, and 0 otherwise.   (2)

(3) The problem statement tells us that P[T] = 1 − P[I] = 3/4. From Theorem 1.10 (the law of total probability), we find the PMF of N is

    P_N(n) = P_{N|T}(n) P[T] + P_{N|I}(n) P[I]   (3)
           = 0.2(0.75) + 0.02(0.25) = 0.155 for n = 1, 2, 3, 4, 5;
             0(0.75) + 0.02(0.25) = 0.005 for n = 6, 7, . . . , 50;
             and 0 otherwise.   (4)

(4) First we find

    P[N ≤ 10] = Σ_{n=1}^{10} P_N(n) = (0.155)(5) + (0.005)(5) = 0.80   (5)

By Theorem 2.17, the conditional PMF of N given N ≤ 10 is

    P_{N|N≤10}(n) = P_N(n)/P[N ≤ 10] for n ≤ 10, and 0 otherwise   (6)
                  = 0.155/0.8 = 0.19375 for n = 1, 2, 3, 4, 5;
                    0.005/0.8 = 0.00625 for n = 6, 7, 8, 9, 10;
                    and 0 otherwise.   (7)

(5) Once we have the conditional PMF, calculating conditional expectations is easy.

    E[N|N ≤ 10] = Σ_n n P_{N|N≤10}(n) = Σ_{n=1}^{5} n(0.19375) + Σ_{n=6}^{10} n(0.00625) = 3.15625   (8)

Figure 1: Two examples of the output of samplemean(k): (a) samplemean(100), (b) samplemean(1000). [Plots omitted.]

(6) To find the conditional variance, we first find the conditional second moment

    E[N^2|N ≤ 10] = Σ_n n^2 P_{N|N≤10}(n) = Σ_{n=1}^{5} n^2(0.19375) + Σ_{n=6}^{10} n^2(0.00625)   (9)
                  = 55(0.19375) + 330(0.00625) = 12.71875   (10)

The conditional variance is

    Var[N|N ≤ 10] = E[N^2|N ≤ 10] − (E[N|N ≤ 10])^2 = 12.71875 − (3.15625)^2 = 2.75684   (11)
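The conditional moments can be checked directly from the conditional PMF (a minimal sketch):

n=1:10; pn=[0.155*ones(1,5) 0.005*ones(1,5)]/0.80;
EN=sum(n.*pn)           % 3.15625
sum(n.^2.*pn)-EN^2      % 2.75684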

Quiz 2.10

The function samplemean(k) generates and plots five m_n sequences for n = 1, 2, . . . , k. The ith column M(:,i) of M holds a sequence m_1, m_2, . . . , m_k.

function M=samplemean(k);
K=(1:k)';
M=zeros(k,5);
for i=1:5,
    X=duniformrv(0,10,k);
    M(:,i)=cumsum(X)./K;
end;
plot(K,M);

Examples of the function calls (a) samplemean(100) and (b) samplemean(1000) are shown in Figure 1. Each call to samplemean(k) produces a random output. What is observed in these figures is that for small n, m_n is fairly random but as n gets large, m_n gets close to E[X] = 5. Although each sequence m_1, m_2, . . . that we generate is random, the sequences always converge to E[X]. This random convergence is analyzed in Chapter 7.

Quiz Solutions – Chapter 3

Quiz 3.1

The CDF of Y is

    F_Y(y) = 0 for y < 0; y/4 for 0 ≤ y ≤ 4; and 1 for y > 4.   (1)

[Sketch of F_Y(y) omitted.]

From the CDF F_Y(y), we can calculate the probabilities:

(1) P[Y ≤ −1] = F_Y(−1) = 0
(2) P[Y ≤ 1] = F_Y(1) = 1/4
(3) P[2 < Y ≤ 3] = F_Y(3) − F_Y(2) = 3/4 − 2/4 = 1/4
(4) P[Y > 1.5] = 1 − P[Y ≤ 1.5] = 1 − F_Y(1.5) = 1 − (1.5)/4 = 5/8

Quiz 3.2

(1) First we will find the constant c and then we will sketch the PDF. To find c, we use the fact that ∫_{−∞}^{∞} f_X(x) dx = 1. We will evaluate this integral using integration by parts:

    ∫_{−∞}^{∞} f_X(x) dx = ∫_0^∞ cx e^{−x/2} dx   (1)
                         = −2cx e^{−x/2} |_0^∞ + ∫_0^∞ 2c e^{−x/2} dx   (2)

The first term is zero, so

    ∫_{−∞}^{∞} f_X(x) dx = −4c e^{−x/2} |_0^∞ = 4c   (3)

Thus c = 1/4 and X has the Erlang (n = 2, λ = 1/2) PDF

    f_X(x) = (x/4)e^{−x/2} for x ≥ 0, and 0 otherwise.   (4)

[Sketch of f_X(x) omitted.]

(2) To find the CDF F_X(x), we first note X is a nonnegative random variable so that F_X(x) = 0 for all x < 0. For x ≥ 0,

    F_X(x) = ∫_0^x f_X(y) dy = ∫_0^x (y/4)e^{−y/2} dy   (5)
           = −(y/2)e^{−y/2} |_0^x − ∫_0^x −(1/2)e^{−y/2} dy   (6)
           = 1 − (x/2)e^{−x/2} − e^{−x/2}   (7)

The complete expression for the CDF is

    F_X(x) = 1 − (x/2 + 1)e^{−x/2} for x ≥ 0, and 0 otherwise.   (8)

[Sketch of F_X(x) omitted.]

(3) From the CDF F_X(x),

    P[0 ≤ X ≤ 4] = F_X(4) − F_X(0) = 1 − 3e^{−2}.   (9)

(4) Similarly,

    P[−2 ≤ X ≤ 2] = F_X(2) − F_X(−2) = 1 − 2e^{−1}.   (10)
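Numerical integration provides a quick check of c and of these probabilities. A sketch using the built-in integral function:

f=@(x) (x/4).*exp(-x/2);
integral(f,0,Inf)   % 1, confirming c = 1/4
integral(f,0,4)     % 1-3*exp(-2) = 0.5940
integral(f,0,2)     % 1-2*exp(-1) = 0.2642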

Quiz 3.3

The PDF of Y is

    f_Y(y) = 3y^2/2 for −1 ≤ y ≤ 1, and 0 otherwise.   (1)

(1) The expected value of Y is

    E[Y] = ∫_{−∞}^{∞} y f_Y(y) dy = ∫_{−1}^{1} (3/2)y^3 dy = (3/8)y^4 |_{−1}^{1} = 0.   (2)

Note that the above calculation wasn't really necessary because E[Y] = 0 whenever the PDF f_Y(y) is an even function (i.e., f_Y(y) = f_Y(−y)).

(2) The second moment of Y is

    E[Y^2] = ∫_{−∞}^{∞} y^2 f_Y(y) dy = ∫_{−1}^{1} (3/2)y^4 dy = (3/10)y^5 |_{−1}^{1} = 3/5.   (3)

(3) The variance of Y is

    Var[Y] = E[Y^2] − (E[Y])^2 = 3/5.   (4)

(4) The standard deviation of Y is σ_Y = √Var[Y] = √(3/5).

Quiz 3.4

(1) When X is an exponential (λ) random variable, E[X] = 1/λ and Var[X] = 1/λ^2. Since E[X] = 3 and Var[X] = 9, we must have λ = 1/3. The PDF of X is

    f_X(x) = (1/3)e^{−x/3} for x ≥ 0, and 0 otherwise.   (1)

(2) We know X is a uniform (a, b) random variable. To find a and b, we apply Theorem 3.6 to write

    E[X] = (a + b)/2 = 3,   Var[X] = (b − a)^2/12 = 9.   (2)

This implies

    a + b = 6,   b − a = ±6√3.   (3)

The only valid solution with a < b is

    a = 3 − 3√3,   b = 3 + 3√3.   (4)

The complete expression for the PDF of X is

    f_X(x) = 1/(6√3) for 3 − 3√3 ≤ x < 3 + 3√3, and 0 otherwise.   (5)

Quiz 3.5

Each of the requested probabilities can be calculated using the Φ(z) function and Table 3.1 or Q(z) and Table 3.2. We start with the sketches.

(1) The fact that Y has twice the standard deviation of X is reflected in the greater spread of f_Y(y). However, it is important to remember that as the standard deviation increases, the peak value of the Gaussian PDF goes down. [Sketches of f_X(x) and f_Y(y) omitted.]

(2) Since X is Gaussian (0, 1),

    P[−1 < X ≤ 1] = F_X(1) − F_X(−1) = Φ(1) − Φ(−1) = 2Φ(1) − 1 = 0.6826.   (1)

(3) Since Y is Gaussian (0, 2),

    P[−1 < Y ≤ 1] = F_Y(1) − F_Y(−1) = Φ(1/σ_Y) − Φ(−1/σ_Y) = 2Φ(1/2) − 1 = 0.383.   (2)

(4) Again, since X is Gaussian (0, 1), P[X > 3.5] = Q(3.5) = 2.33 × 10^{−4}.

(5) Since Y is Gaussian (0, 2), P[Y > 3.5] = Q(3.5/2) = Q(1.75) = 1 − Φ(1.75) = 0.0401.
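Without the tables, the same values follow from the built-in error function, since Φ(z) = (1 + erf(z/√2))/2. A minimal sketch:

Phi=@(z) 0.5*(1+erf(z/sqrt(2)));
2*Phi(1)-1     % 0.6827, P[-1 < X <= 1]
2*Phi(0.5)-1   % 0.3829, P[-1 < Y <= 1]
1-Phi(3.5)     % 2.33e-04, Q(3.5)
1-Phi(1.75)    % 0.0401, Q(1.75)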

Quiz 3.6

The CDF of X is

    F_X(x) = 0 for x < −1; (x + 1)/4 for −1 ≤ x < 1; and 1 for x ≥ 1.   (1)

[Sketch of F_X(x) omitted.]

The following probabilities can be read directly from the CDF:

(1) P[X ≤ 1] = F_X(1) = 1.
(2) P[X < 1] = F_X(1^−) = 1/2.
(3) P[X = 1] = F_X(1^+) − F_X(1^−) = 1 − 1/2 = 1/2.
(4) We find the PDF f_X(x) by taking the derivative of F_X(x). The resulting PDF is

    f_X(x) = 1/4 for −1 ≤ x < 1, plus an impulse (1/2)δ(x − 1) at x = 1, and 0 otherwise.   (2)

[Sketch of f_X(x) omitted.]

Quiz 3.7

(1) Since X is always nonnegative, F_X(x) = 0 for x < 0. Also, F_X(x) = 1 for x ≥ 2 since it is always true that X ≤ 2. Lastly, for 0 ≤ x ≤ 2,

    F_X(x) = ∫_{−∞}^{x} f_X(y) dy = ∫_0^x (1 − y/2) dy = x − x^2/4.   (1)

The complete CDF of X is

    F_X(x) = 0 for x < 0; x − x^2/4 for 0 ≤ x ≤ 2; and 1 for x > 2.   (2)

[Sketch of F_X(x) omitted.]

(2) The probability that Y = 1 is

    P[Y = 1] = P[X ≥ 1] = 1 − F_X(1) = 1 − 3/4 = 1/4.   (3)

(3) Since X is nonnegative, Y is also nonnegative. Thus F_Y(y) = 0 for y < 0. Also, because Y ≤ 1, F_Y(y) = 1 for all y ≥ 1. Finally, for 0 < y < 1,

    F_Y(y) = P[Y ≤ y] = P[X ≤ y] = F_X(y).   (4)

Using the CDF F_X(x), the complete expression for the CDF of Y is

    F_Y(y) = 0 for y < 0; y − y^2/4 for 0 ≤ y < 1; and 1 for y ≥ 1.   (5)

[Sketch of F_Y(y) omitted.]

As expected, we see that the jump in F_Y(y) at y = 1 is exactly equal to P[Y = 1].

(4) By taking the derivative of F_Y(y), we obtain the PDF f_Y(y). Note that when y < 0 or y > 1, the PDF is zero.

    f_Y(y) = 1 − y/2 + (1/4)δ(y − 1) for 0 ≤ y ≤ 1, and 0 otherwise.   (6)

[Sketch of f_Y(y) omitted.]

Quiz 3.8

(1) P[Y ≤ 6] = ∫_{−∞}^{6} f_Y(y) dy = ∫_0^6 (1/10) dy = 0.6.

(2) From Definition 3.15, the conditional PDF of Y given Y ≤ 6 is

    f_{Y|Y≤6}(y) = f_Y(y)/P[Y ≤ 6] for y ≤ 6, and 0 otherwise
                 = 1/6 for 0 ≤ y ≤ 6, and 0 otherwise.   (1)

(3) The probability Y > 8 is

    P[Y > 8] = ∫_8^{10} (1/10) dy = 0.2.   (2)

(4) From Definition 3.15, the conditional PDF of Y given Y > 8 is

    f_{Y|Y>8}(y) = f_Y(y)/P[Y > 8] for y > 8, and 0 otherwise
                 = 1/2 for 8 < y ≤ 10, and 0 otherwise.   (3)

(5) From the conditional PDF f_{Y|Y≤6}(y), we can calculate the conditional expectation

    E[Y|Y ≤ 6] = ∫_{−∞}^{∞} y f_{Y|Y≤6}(y) dy = ∫_0^6 (y/6) dy = 3.   (4)

(6) From the conditional PDF f_{Y|Y>8}(y), we can calculate the conditional expectation

    E[Y|Y > 8] = ∫_{−∞}^{∞} y f_{Y|Y>8}(y) dy = ∫_8^{10} (y/2) dy = 9.   (5)

Quiz 3.9

A natural way to produce random variables with PDF f_{T|T>2}(t) is to generate samples of T with PDF f_T(t) and then to discard those samples which fail to satisfy the condition T > 2. Here is a MATLAB function that uses this method:

function t=t2rv(m)
i=0;lambda=1/3;
t=zeros(m,1);
while (i<m),
    x=exponentialrv(lambda,1);
    if (x>2)
        t(i+1)=x;
        i=i+1;
    end
end

A second method exploits the fact that if T is an exponential (λ) random variable, then T' = T + 2 has PDF f_{T'}(t) = f_{T|T>2}(t). In this case the command

t=2.0+exponentialrv(1/3,m)

generates the vector t.
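Since exponentialrv comes from matcode.zip, the comparison below is a sketch that substitutes the standard inverse-transform formula −ln(U)/λ for an exponential (λ) sample (the sample size 10^5 is arbitrary):

m=1e5; lambda=1/3;
t1=-log(rand(m,1))/lambda; t1=t1(t1>2);   % first method: discard samples with T<=2
t2=2-log(rand(m,1))/lambda;               % second method: shift by 2
[mean(t1) mean(t2)]   % both near 2+1/lambda = 5, by the memoryless property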

Quiz Solutions – Chapter 4

Quiz 4.1

Each value of the joint CDF can be found by considering the corresponding probability.

(1) F_{X,Y}(−∞, 2) = P[X ≤ −∞, Y ≤ 2] ≤ P[X ≤ −∞] = 0 since X cannot take on the value −∞.

(2) F_{X,Y}(∞, ∞) = P[X ≤ ∞, Y ≤ ∞] = 1. This result is given in Theorem 4.1.

(3) F_{X,Y}(∞, y) = P[X ≤ ∞, Y ≤ y] = P[Y ≤ y] = F_Y(y).

(4) F_{X,Y}(∞, −∞) = P[X ≤ ∞, Y ≤ −∞] = 0 since Y cannot take on the value −∞.

Quiz 4.2

From the joint PMF of Q and G given in the table, we can calculate the requested probabilities by summing the PMF over those values of Q and G that correspond to the event.

(1) The probability that Q = 0 is

    P[Q = 0] = P_{Q,G}(0,0) + P_{Q,G}(0,1) + P_{Q,G}(0,2) + P_{Q,G}(0,3)
             = 0.06 + 0.18 + 0.24 + 0.12 = 0.6   (1)

(2) The probability that Q = G is

    P[Q = G] = P_{Q,G}(0,0) + P_{Q,G}(1,1) = 0.18   (2)

(3) The probability that G > 1 is

    P[G > 1] = Σ_{g=2}^{3} Σ_{q=0}^{1} P_{Q,G}(q,g) = 0.24 + 0.16 + 0.12 + 0.08 = 0.6   (3)

(4) The probability that G > Q is

    P[G > Q] = Σ_{q=0}^{1} Σ_{g=q+1}^{3} P_{Q,G}(q,g) = 0.18 + 0.24 + 0.12 + 0.16 + 0.08 = 0.78   (4)

Quiz 4.3

By Theorem 4.3, the marginal PMF of H is

    P_H(h) = Σ_{b=0,2,4} P_{H,B}(h,b)   (1)

For each value of h, this corresponds to calculating the row sum across the table of the joint PMF. Similarly, the marginal PMF of B is

    P_B(b) = Σ_{h=−1}^{1} P_{H,B}(h,b)   (2)

For each value of b, this corresponds to the column sum down the table of the joint PMF. The easiest way to calculate these marginal PMFs is to simply sum each row and column:

P_{H,B}(h,b)   b = 0   b = 2   b = 4   P_H(h)
h = −1          0      0.4     0.2     0.6
h = 0           0.1    0       0.1     0.2
h = 1           0.1    0.1     0       0.2
P_B(b)          0.2    0.5     0.3

Quiz 4.4

To find the constant c, we apply ∫_{−∞}^{∞}∫_{−∞}^{∞} f_{X,Y}(x,y) dx dy = 1. Specifically,

    ∫_{−∞}^{∞}∫_{−∞}^{∞} f_{X,Y}(x,y) dx dy = ∫_0^2 ∫_0^1 cxy dx dy   (1)
        = c ∫_0^2 y [x^2/2]_0^1 dy   (2)
        = (c/2) ∫_0^2 y dy = (c/4)y^2 |_0^2 = c   (3)

Thus c = 1. To calculate P[A], we write

    P[A] = ∫∫_A f_{X,Y}(x,y) dx dy   (4)

To integrate over A, we convert to polar coordinates using the substitutions x = r cos θ, y = r sin θ and dx dy = r dr dθ. [Sketch of the quarter-disk region A omitted.] This yields

    P[A] = ∫_0^{π/2} ∫_0^1 r^2 sin θ cos θ r dr dθ   (5)
         = (∫_0^1 r^3 dr)(∫_0^{π/2} sin θ cos θ dθ)   (6)
         = [r^4/4]_0^1 [sin^2 θ / 2]_0^{π/2} = 1/8   (7)
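Since f_{X,Y}(x,y) = xy factors as (2x)(y/2), X and Y are independent with CDFs F_X(x) = x^2 and F_Y(y) = y^2/4, so P[A] is easy to estimate by Monte Carlo using inverse transforms (a sketch; the sample size is arbitrary):

n=1e6;
X=sqrt(rand(n,1));     % F_X(x)=x^2 on [0,1]
Y=2*sqrt(rand(n,1));   % F_Y(y)=y^2/4 on [0,2]
mean(X.^2+Y.^2<=1)     % approximately 1/8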

Quiz 4.5

By Theorem 4.8, the marginal PDF of X is

    f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x,y) dy   (1)

For x < 0 or x > 1, f_X(x) = 0. For 0 ≤ x ≤ 1,

    f_X(x) = (6/5) ∫_0^1 (x + y^2) dy = (6/5)[xy + y^3/3]_{y=0}^{y=1} = (6/5)(x + 1/3) = (6x + 2)/5   (2)

The complete expression for the PDF of X is

    f_X(x) = (6x + 2)/5 for 0 ≤ x ≤ 1, and 0 otherwise.   (3)

By the same method we obtain the marginal PDF for Y. For 0 ≤ y ≤ 1,

    f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x,y) dx   (4)
           = (6/5) ∫_0^1 (x + y^2) dx = (6/5)[x^2/2 + xy^2]_{x=0}^{x=1} = (6/5)(1/2 + y^2) = (3 + 6y^2)/5   (5)

Since f_Y(y) = 0 for y < 0 or y > 1, the complete expression for the PDF of Y is

    f_Y(y) = (3 + 6y^2)/5 for 0 ≤ y ≤ 1, and 0 otherwise.   (6)

Quiz 4.6

(A) The time required for the transfer is T = L/B. For each pair of values of L and B, we can calculate the time T needed for the transfer. We can write these down on the table for the joint PMF of L and B as follows:

P_{L,B}(l,b)     b = 14,400     b = 21,600     b = 28,800
l = 518,400      0.20 (T=36)    0.10 (T=24)    0.05 (T=18)
l = 2,592,000    0.05 (T=180)   0.10 (T=120)   0.20 (T=90)
l = 7,776,000    0.00 (T=540)   0.10 (T=360)   0.20 (T=270)

From the table, writing down the PMF of T is straightforward:

    P_T(t) = 0.05 for t = 18; 0.1 for t = 24; 0.2 for t = 36, 90; 0.1 for t = 120;
             0.05 for t = 180; 0.2 for t = 270; 0.1 for t = 360; and 0 otherwise.   (1)

(B) First, we observe that since 0 ≤ X ≤ 1 and 0 ≤ Y ≤ 1, W = XY satisfies 0 ≤ W ≤ 1. Thus F_W(0) = 0 and F_W(1) = 1. For 0 < w < 1, we calculate the CDF F_W(w) = P[W ≤ w]. Integrating over the region W ≤ w is fairly complex; the calculus is simpler if we integrate over the region XY > w. [Sketch of the region XY > w omitted.] Specifically,

    F_W(w) = 1 − P[XY > w]   (2)
           = 1 − ∫_w^1 ∫_{w/x}^1 dy dx   (3)
           = 1 − ∫_w^1 (1 − w/x) dx   (4)
           = 1 − [x − w ln x]_{x=w}^{x=1}   (5)
           = 1 − (1 − w + w ln w) = w − w ln w   (6)

The complete expression for the CDF is

    F_W(w) = 0 for w < 0; w − w ln w for 0 ≤ w ≤ 1; and 1 for w > 1.   (7)

By taking the derivative of the CDF, we find the PDF is

    f_W(w) = dF_W(w)/dw = −ln w for 0 < w ≤ 1, and 0 otherwise.   (8)

Quiz 4.7

(A) It is helpful to first make a table that includes the marginal PMFs:

P_{L,T}(l,t)   t = 40   t = 60   P_L(l)
l = 1          0.15     0.1      0.25
l = 2          0.3      0.2      0.5
l = 3          0.15     0.1      0.25
P_T(t)         0.6      0.4

(1) The expected value of L is

    E[L] = 1(0.25) + 2(0.5) + 3(0.25) = 2.   (1)

Since the second moment of L is

    E[L^2] = 1^2(0.25) + 2^2(0.5) + 3^2(0.25) = 4.5,   (2)

the variance of L is

    Var[L] = E[L^2] − (E[L])^2 = 0.5.   (3)

(2) The expected value of T is

    E[T] = 40(0.6) + 60(0.4) = 48.   (4)

The second moment of T is

    E[T^2] = 40^2(0.6) + 60^2(0.4) = 2400.   (5)

Thus

    Var[T] = E[T^2] − (E[T])^2 = 2400 − 48^2 = 96.   (6)

(3) The correlation is

    E[LT] = Σ_{t=40,60} Σ_{l=1}^{3} l t P_{L,T}(l,t)   (7)
          = 1(40)(0.15) + 2(40)(0.3) + 3(40)(0.15) + 1(60)(0.1) + 2(60)(0.2) + 3(60)(0.1) = 96   (8)

(4) From Theorem 4.16(a), the covariance of L and T is

    Cov[L, T] = E[LT] − E[L]E[T] = 96 − 2(48) = 0   (9)

(5) Since Cov[L, T] = 0, the correlation coefficient is ρ_{L,T} = 0.

(B) As in the discrete case, the calculations become easier if we first calculate the marginal PDFs f_X(x) and f_Y(y). For 0 ≤ x ≤ 1,

    f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x,y) dy = ∫_0^2 xy dy = (1/2)xy^2 |_{y=0}^{y=2} = 2x   (10)

Similarly, for 0 ≤ y ≤ 2,

    f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x,y) dx = ∫_0^1 xy dx = (1/2)x^2 y |_{x=0}^{x=1} = y/2   (11)

The complete expressions for the marginal PDFs are

    f_X(x) = 2x for 0 ≤ x ≤ 1, and 0 otherwise;
    f_Y(y) = y/2 for 0 ≤ y ≤ 2, and 0 otherwise.   (12)

From the marginal PDFs, it is straightforward to calculate the various expectations.

(1) The first and second moments of X are

    E[X] = ∫_0^1 2x^2 dx = 2/3,   E[X^2] = ∫_0^1 2x^3 dx = 1/2.   (13)

The variance of X is Var[X] = E[X^2] − (E[X])^2 = 1/18.

(2) The first and second moments of Y are

    E[Y] = ∫_0^2 (1/2)y^2 dy = 4/3,   E[Y^2] = ∫_0^2 (1/2)y^3 dy = 2.   (14)

The variance of Y is Var[Y] = E[Y^2] − (E[Y])^2 = 2 − 16/9 = 2/9.

(3) The correlation of X and Y is

    E[XY] = ∫_{−∞}^{∞}∫_{−∞}^{∞} xy f_{X,Y}(x,y) dx dy = ∫_0^1 ∫_0^2 x^2 y^2 dy dx = [x^3/3]_0^1 [y^3/3]_0^2 = 8/9   (15)

(4) The covariance of X and Y is

    Cov[X, Y] = E[XY] − E[X]E[Y] = 8/9 − (2/3)(4/3) = 0.   (16)

(5) Since Cov[X, Y] = 0, the correlation coefficient is ρ_{X,Y} = 0.

Quiz 4.8

(A) Since the event V > 80 occurs only for the pairs (L,T) = (2,60), (L,T) = (3,40) and (L,T) = (3,60),

    P[A] = P[V > 80] = P_{L,T}(2,60) + P_{L,T}(3,40) + P_{L,T}(3,60) = 0.45   (1)

By Definition 4.9,

    P_{L,T|A}(l,t) = P_{L,T}(l,t)/P[A] for lt > 80, and 0 otherwise.   (2)

We can represent this conditional PMF in the following table:

P_{L,T|A}(l,t)   t = 40   t = 60
l = 1            0        0
l = 2            0        4/9
l = 3            1/3      2/9

The conditional expectation of V can be found from the conditional PMF:

    E[V|A] = Σ_l Σ_t lt P_{L,T|A}(l,t) = (2·60)(4/9) + (3·40)(1/3) + (3·60)(2/9) = 133 1/3   (3)

For the conditional variance Var[V|A], we first find the conditional second moment

    E[V^2|A] = Σ_l Σ_t (lt)^2 P_{L,T|A}(l,t) = (2·60)^2(4/9) + (3·40)^2(1/3) + (3·60)^2(2/9) = 18,400   (4)

It follows that

    Var[V|A] = E[V^2|A] − (E[V|A])^2 = 622 2/9   (5)

(B) For continuous random variables X and Y, we first calculate the probability of the conditioning event:

    P[B] = ∫∫_B f_{X,Y}(x,y) dx dy = ∫_{40}^{60} ∫_{80/y}^{3} (xy/4000) dx dy   (6)
         = ∫_{40}^{60} (y/4000) [x^2/2]_{80/y}^{3} dy   (7)
         = ∫_{40}^{60} (y/4000)(9/2 − 3200/y^2) dy   (8)
         = 9/8 − (4/5) ln(3/2) ≈ 0.801   (9)

The conditional PDF of X and Y is

    f_{X,Y|B}(x,y) = f_{X,Y}(x,y)/P[B] for (x,y) ∈ B, and 0 otherwise
                   = Kxy for 40 ≤ y ≤ 60, 80/y ≤ x ≤ 3, and 0 otherwise,   (10)

where K = (4000 P[B])^{−1}. The conditional expectation of W given event B is

    E[W|B] = ∫∫ xy f_{X,Y|B}(x,y) dx dy = ∫_{40}^{60} ∫_{80/y}^{3} K x^2 y^2 dx dy   (11)
           = (K/3) ∫_{40}^{60} y^2 [x^3]_{x=80/y}^{x=3} dy   (12)
           = (K/3) ∫_{40}^{60} (27y^2 − 80^3/y) dy   (13)
           = (K/3) [9y^3 − 80^3 ln y]_{40}^{60} ≈ 120.78   (14)

The conditional second moment of W given B is

    E[W^2|B] = ∫∫ (xy)^2 f_{X,Y|B}(x,y) dx dy = ∫_{40}^{60} ∫_{80/y}^{3} K x^3 y^3 dx dy   (15)
             = (K/4) ∫_{40}^{60} y^3 [x^4]_{x=80/y}^{x=3} dy   (16)
             = (K/4) ∫_{40}^{60} (81y^3 − 80^4/y) dy   (17)
             = (K/4) [(81/4)y^4 − 80^4 ln y]_{40}^{60} ≈ 16,116.10   (18)

It follows that the conditional variance of W given B is

    Var[W|B] = E[W^2|B] − (E[W|B])^2 ≈ 1528.30   (19)

Quiz 4.9

(A) (1) The joint PMF of A and B can be found from the marginal and conditional PMFs via P_{A,B}(a,b) = P_{B|A}(b|a) P_A(a). Incorporating the information from the given conditional PMFs can be confusing, however. Consequently, we can note that A has range S_A = {0, 2} and B has range S_B = {0, 1}. A table of the joint PMF will include all four possible combinations of A and B. The general form of the table is

P_{A,B}(a,b)   b = 0                 b = 1
a = 0          P_{B|A}(0|0)P_A(0)    P_{B|A}(1|0)P_A(0)
a = 2          P_{B|A}(0|2)P_A(2)    P_{B|A}(1|2)P_A(2)

Substituting values from P_{B|A}(b|a) and P_A(a), we have

P_{A,B}(a,b)   b = 0        b = 1
a = 0          (0.8)(0.4)   (0.2)(0.4)
a = 2          (0.5)(0.6)   (0.5)(0.6)

or

P_{A,B}(a,b)   b = 0   b = 1
a = 0          0.32    0.08
a = 2          0.3     0.3

(2) Given the conditional PMF P_{B|A}(b|2), it is easy to calculate the conditional expectation

    E[B|A = 2] = Σ_{b=0}^{1} b P_{B|A}(b|2) = (0)(0.5) + (1)(0.5) = 0.5   (1)

(3) From the joint PMF P_{A,B}(a,b), we can calculate the conditional PMF

    P_{A|B}(a|0) = P_{A,B}(a,0)/P_B(0) = 0.32/0.62 = 16/31 for a = 0;
                   0.3/0.62 = 15/31 for a = 2; and 0 otherwise.   (2)

(4) We can calculate the conditional variance Var[A|B = 0] using the conditional PMF P_{A|B}(a|0). First we calculate the conditional expected value

    E[A|B = 0] = Σ_a a P_{A|B}(a|0) = 0(16/31) + 2(15/31) = 30/31   (3)

The conditional second moment is

    E[A^2|B = 0] = Σ_a a^2 P_{A|B}(a|0) = 0^2(16/31) + 2^2(15/31) = 60/31   (4)

The conditional variance is then

    Var[A|B = 0] = E[A^2|B = 0] − (E[A|B = 0])^2 = 960/961   (5)

(B) (1) The joint PDF of X and Y is

    f_{X,Y}(x,y) = f_{Y|X}(y|x) f_X(x) = 6y for 0 ≤ y ≤ x, 0 ≤ x ≤ 1, and 0 otherwise.   (6)

(2) From the given conditional PDF f_{Y|X}(y|x),

    f_{Y|X}(y|1/2) = 8y for 0 ≤ y ≤ 1/2, and 0 otherwise.   (7)

(3) The conditional PDF of X given Y = 1/2 is f_{X|Y}(x|1/2) = f_{X,Y}(x, 1/2)/f_Y(1/2). To find f_Y(1/2), we integrate the joint PDF:

    f_Y(1/2) = ∫_{−∞}^{∞} f_{X,Y}(x, 1/2) dx = ∫_{1/2}^{1} 6(1/2) dx = 3/2   (8)

Thus, for 1/2 ≤ x ≤ 1,

    f_{X|Y}(x|1/2) = f_{X,Y}(x, 1/2)/f_Y(1/2) = 6(1/2)/(3/2) = 2   (9)

(4) From the previous part, we see that given Y = 1/2, the conditional PDF of X is uniform (1/2, 1). Thus, by the definition of the uniform (a, b) PDF,

    Var[X|Y = 1/2] = (1 − 1/2)^2/12 = 1/48   (10)

Quiz 4.10

(A) (1) For random variables X and Y from Example 4.1, we observe that P_Y(1) = 0.09 and P_X(0) = 0.01. However,

    P_{X,Y}(0,1) = 0 ≠ P_X(0) P_Y(1)   (1)

Since we have found a pair x, y such that P_{X,Y}(x,y) ≠ P_X(x)P_Y(y), we can conclude that X and Y are dependent. Note that whenever P_{X,Y}(x,y) = 0, independence requires that either P_X(x) = 0 or P_Y(y) = 0.

(2) For random variables Q and G from Quiz 4.2, it is not obvious whether they are independent. Unlike X and Y in part (1), there are no obvious pairs q, g that fail the independence requirement. In this case, we calculate the marginal PMFs from the table of the joint PMF P_{Q,G}(q,g) in Quiz 4.2:

P_{Q,G}(q,g)   g = 0   g = 1   g = 2   g = 3   P_Q(q)
q = 0          0.06    0.18    0.24    0.12    0.60
q = 1          0.04    0.12    0.16    0.08    0.40
P_G(g)         0.10    0.30    0.40    0.20

Careful study of the table will verify that P_{Q,G}(q,g) = P_Q(q)P_G(g) for every pair q, g. Hence Q and G are independent.

(B) (1) Since X_1 and X_2 are independent,

    f_{X_1,X_2}(x_1,x_2) = f_{X_1}(x_1) f_{X_2}(x_2)
                         = (1 − x_1/2)(1 − x_2/2) for 0 ≤ x_1 ≤ 2, 0 ≤ x_2 ≤ 2, and 0 otherwise.   (2)

(2) Let F_X(x) denote the CDF of both X_1 and X_2. The CDF of Z = max(X_1, X_2) is found by observing that Z ≤ z iff X_1 ≤ z and X_2 ≤ z. That is,

    P[Z ≤ z] = P[X_1 ≤ z, X_2 ≤ z] = P[X_1 ≤ z] P[X_2 ≤ z] = [F_X(z)]^2   (3)

To complete the problem, we need to find the CDF of each X_i. From the PDF f_X(x), the CDF is

    F_X(x) = ∫_{−∞}^{x} f_X(y) dy = 0 for x < 0; x − x^2/4 for 0 ≤ x ≤ 2; and 1 for x > 2.   (4)

Thus for 0 ≤ z ≤ 2,

    F_Z(z) = (z − z^2/4)^2   (5)

The complete expression for the CDF of Z is

    F_Z(z) = 0 for z < 0; (z − z^2/4)^2 for 0 ≤ z ≤ 2; and 1 for z > 2.   (6)

Quiz 4.11

This problem just requires identifying the various terms in Definition 4.17 and Theorem 4.29. Specifically, from the problem statement, we know that ρ = 1/2,

    µ_1 = µ_X = 0,   µ_2 = µ_Y = 0,   (1)

and that

    σ_1 = σ_X = 1,   σ_2 = σ_Y = 1.   (2)

(1) Applying these facts to Definition 4.17, we have

    f_{X,Y}(x,y) = (1/√(3π^2)) e^{−2(x^2 − xy + y^2)/3}.   (3)

(2) By Theorem 4.30, the conditional expected value and standard deviation of X given Y = y are

    E[X|Y = y] = y/2,   σ̃_X = √(σ_1^2(1 − ρ^2)) = √(3/4).   (4)

When Y = y = 2, we see that E[X|Y = 2] = 1 and Var[X|Y = 2] = 3/4. The conditional PDF of X given Y = 2 is simply the Gaussian PDF

    f_{X|Y}(x|2) = (1/√(3π/2)) e^{−2(x−1)^2/3}.   (5)

Quiz 4.12

One straightforward method is to follow the approach of Example 4.28. Instead, we use an alternate approach. First we observe that X has the discrete uniform (1, 4) PMF. Also, given X = x, Y has a discrete uniform (1, x) PMF. That is,

    P_X(x) = 1/4 for x = 1, 2, 3, 4, and 0 otherwise;
    P_{Y|X}(y|x) = 1/x for y = 1, . . . , x, and 0 otherwise.   (1)

Given X = x, and an independent uniform (0, 1) random variable U, we can generate a sample value of Y with a discrete uniform (1, x) PMF via Y = ⌈xU⌉. This observation prompts the following program:

function xy=dtrianglerv(m)
sx=[1;2;3;4];
px=0.25*ones(4,1);
x=finiterv(sx,px,m);
y=ceil(x.*rand(m,1));
xy=[x';y'];

Quiz Solutions – Chapter 5

Quiz 5.1

We find P[C] by integrating the joint PDF over the region of interest. Specifically,

    P[C] = ∫_0^{1/2} dy_2 ∫_0^{y_2} dy_1 ∫_0^{1/2} dy_4 ∫_0^{y_4} 4 dy_3   (1)
         = 4 (∫_0^{1/2} y_2 dy_2)(∫_0^{1/2} y_4 dy_4) = 4(1/8)(1/8) = 1/16.   (2)
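Because f_Y(y) = 4 factors into a density 2 on {0 ≤ y_1 ≤ y_2 ≤ 1} times a density 2 on {0 ≤ y_3 ≤ y_4 ≤ 1}, each pair (Y_1, Y_2) and (Y_3, Y_4) is distributed as two sorted uniform (0, 1) samples. A Monte Carlo sketch of the calculation above (assuming, as the limits of integration indicate, that C = {Y_2 ≤ 1/2, Y_4 ≤ 1/2}):

n=1e6;
V=sort(rand(n,2),2); W=sort(rand(n,2),2);
mean(V(:,2)<=0.5 & W(:,2)<=0.5)   % approximately 1/16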

Quiz 5.2

By definition of A, Y_1 = X_1, Y_2 = X_2 − X_1 and Y_3 = X_3 − X_2. Since 0 < X_1 < X_2 < X_3, each Y_i must be a strictly positive integer. Thus, for y_1, y_2, y_3 ∈ {1, 2, . . .},

    P_Y(y) = P[Y_1 = y_1, Y_2 = y_2, Y_3 = y_3]   (1)
           = P[X_1 = y_1, X_2 − X_1 = y_2, X_3 − X_2 = y_3]   (2)
           = P[X_1 = y_1, X_2 = y_2 + y_1, X_3 = y_3 + y_2 + y_1]   (3)
           = (1 − p)^3 p^{y_1 + y_2 + y_3}   (4)

By defining the vector a = [1 1 1]', the complete expression for the joint PMF of Y is

    P_Y(y) = (1 − p)^3 p^{a'y} for y_1, y_2, y_3 ∈ {1, 2, . . .}, and 0 otherwise.   (5)

Quiz 5.3

First we note that each marginal PDF is nonzero only if any subset of the x_i obeys the ordering constraints 0 ≤ x_1 ≤ x_2 ≤ x_3 ≤ 1. Within these constraints, we have

    f_{X_1,X_2}(x_1,x_2) = ∫_{−∞}^{∞} f_X(x) dx_3 = ∫_{x_2}^{1} 6 dx_3 = 6(1 − x_2),   (1)
    f_{X_2,X_3}(x_2,x_3) = ∫_{−∞}^{∞} f_X(x) dx_1 = ∫_0^{x_2} 6 dx_1 = 6x_2,   (2)
    f_{X_1,X_3}(x_1,x_3) = ∫_{−∞}^{∞} f_X(x) dx_2 = ∫_{x_1}^{x_3} 6 dx_2 = 6(x_3 − x_1).   (3)

In particular, we must keep in mind that f_{X_1,X_2}(x_1,x_2) = 0 unless 0 ≤ x_1 ≤ x_2 ≤ 1, f_{X_2,X_3}(x_2,x_3) = 0 unless 0 ≤ x_2 ≤ x_3 ≤ 1, and that f_{X_1,X_3}(x_1,x_3) = 0 unless 0 ≤ x_1 ≤ x_3 ≤ 1. The complete expressions are

    f_{X_1,X_2}(x_1,x_2) = 6(1 − x_2) for 0 ≤ x_1 ≤ x_2 ≤ 1, and 0 otherwise.   (4)
    f_{X_2,X_3}(x_2,x_3) = 6x_2 for 0 ≤ x_2 ≤ x_3 ≤ 1, and 0 otherwise.   (5)
    f_{X_1,X_3}(x_1,x_3) = 6(x_3 − x_1) for 0 ≤ x_1 ≤ x_3 ≤ 1, and 0 otherwise.   (6)

Now we can find the marginal PDFs. When 0 ≤ x_i ≤ 1 for each x_i,

    f_{X_1}(x_1) = ∫_{x_1}^{1} 6(1 − x_2) dx_2 = 3(1 − x_1)^2   (7)
    f_{X_2}(x_2) = ∫_{x_2}^{1} 6x_2 dx_3 = 6x_2(1 − x_2)   (8)
    f_{X_3}(x_3) = ∫_0^{x_3} 6x_2 dx_2 = 3x_3^2   (9)

The complete expressions are

    f_{X_1}(x_1) = 3(1 − x_1)^2 for 0 ≤ x_1 ≤ 1, and 0 otherwise.   (10)
    f_{X_2}(x_2) = 6x_2(1 − x_2) for 0 ≤ x_2 ≤ 1, and 0 otherwise.   (11)
    f_{X_3}(x_3) = 3x_3^2 for 0 ≤ x_3 ≤ 1, and 0 otherwise.   (12)

Quiz 5.4

In the PDF f_Y(y), the components have dependencies as a result of the ordering constraints Y_1 ≤ Y_2 and Y_3 ≤ Y_4. We can separate these constraints by creating the vectors

    V = [Y_1 Y_2]',   W = [Y_3 Y_4]'.   (1)

The joint PDF of V and W is

    f_{V,W}(v,w) = 4 for 0 ≤ v_1 ≤ v_2 ≤ 1, 0 ≤ w_1 ≤ w_2 ≤ 1, and 0 otherwise.   (2)

We must verify that V and W are independent. For 0 ≤ v_1 ≤ v_2 ≤ 1,

    f_V(v) = ∫∫ f_{V,W}(v,w) dw_1 dw_2 = ∫_0^1 (∫_{w_1}^{1} 4 dw_2) dw_1 = ∫_0^1 4(1 − w_1) dw_1 = 2   (3)

Similarly, for 0 ≤ w_1 ≤ w_2 ≤ 1,

    f_W(w) = ∫∫ f_{V,W}(v,w) dv_1 dv_2 = ∫_0^1 (∫_{v_1}^{1} 4 dv_2) dv_1 = 2   (4)

It follows that V and W have PDFs

    f_V(v) = 2 for 0 ≤ v_1 ≤ v_2 ≤ 1, and 0 otherwise;
    f_W(w) = 2 for 0 ≤ w_1 ≤ w_2 ≤ 1, and 0 otherwise.   (5)

It is easy to verify that f_{V,W}(v,w) = f_V(v) f_W(w), confirming that V and W are independent vectors.

Quiz 5.5

(A) Referring to Theorem 1.19, each test is a subexperiment with three possible outcomes: L, A and R. In five trials, the vector X = [X_1 X_2 X_3]' indicating the number of outcomes of each subexperiment has the multinomial PMF

    P_X(x) = (5 choose x_1, x_2, x_3)(0.3)^{x_1}(0.6)^{x_2}(0.1)^{x_3}
             for x_1 + x_2 + x_3 = 5 with x_1, x_2, x_3 ∈ {0, 1, . . . , 5}, and 0 otherwise.   (1)

We can find the marginal PMF for each X_i from the joint PMF P_X(x); however it is simpler to just start from first principles and observe that X_1 is the number of occurrences of L in five independent tests. If we view each test as a trial with success probability P[L] = 0.3, we see that X_1 is a binomial (n, p) = (5, 0.3) random variable. Similarly, X_2 is a binomial (5, 0.6) random variable and X_3 is a binomial (5, 0.1) random variable. That is, for p_1 = 0.3, p_2 = 0.6 and p_3 = 0.1,

    P_{X_i}(x) = (5 choose x) p_i^x (1 − p_i)^{5−x} for x = 0, 1, . . . , 5, and 0 otherwise.   (2)

From the marginal PMFs, we see that X_1, X_2 and X_3 are not independent. Hence, we must use Theorem 5.6 to find the PMF of W. In particular, since X_1 + X_2 + X_3 = 5 and since each X_i is non-negative, P_W(0) = P_W(1) = 0. Furthermore,

    P_W(2) = P_X(1,2,2) + P_X(2,1,2) + P_X(2,2,1)   (3)
           = (5!/(2!2!1!))[0.3(0.6)^2(0.1)^2 + 0.3^2(0.6)(0.1)^2 + 0.3^2(0.6)^2(0.1)]   (4)
           = 0.1458   (5)

In addition, for w = 3, w = 4, and w = 5, the event W = w occurs if and only if one of the mutually exclusive events X_1 = w, X_2 = w, or X_3 = w occurs. Thus,

    P_W(3) = P_{X_1}(3) + P_{X_2}(3) + P_{X_3}(3) = 0.486   (6)
    P_W(4) = P_{X_1}(4) + P_{X_2}(4) + P_{X_3}(4) = 0.288   (7)
    P_W(5) = P_{X_1}(5) + P_{X_2}(5) + P_{X_3}(5) = 0.0802   (8)

(B) Since each Y_i = 2X_i + 4, we can apply Theorem 5.10 to write

    f_Y(y) = (1/2^3) f_X((y_1 − 4)/2, (y_2 − 4)/2, (y_3 − 4)/2)   (9)
           = (1/8)e^{−(y_3 − 4)/2} for 4 ≤ y_1 ≤ y_2 ≤ y_3, and 0 otherwise.   (10)

Note that for other matrices A, the constraints on y resulting from the constraints 0 ≤ X_1 ≤ X_2 ≤ X_3 can be much more complicated.

Quiz 5.6

We start by finding the components E[X_i] = ∫_{−∞}^{∞} x f_{X_i}(x) dx of µ_X. To do so, we use the marginal PDFs f_{X_i}(x) found in Quiz 5.3:

    E[X_1] = ∫_0^1 3x(1 − x)^2 dx = 1/4,   (1)
    E[X_2] = ∫_0^1 6x^2(1 − x) dx = 1/2,   (2)
    E[X_3] = ∫_0^1 3x^3 dx = 3/4.   (3)

To find the correlation matrix R_X, we need to find E[X_i X_j] for all i and j. We start with the second moments:

    E[X_1^2] = ∫_0^1 3x^2(1 − x)^2 dx = 1/10.   (4)
    E[X_2^2] = ∫_0^1 6x^3(1 − x) dx = 3/10.   (5)
    E[X_3^2] = ∫_0^1 3x^4 dx = 3/5.   (6)

Using the pairwise marginal PDFs from Quiz 5.3, the cross terms are

    E[X_1 X_2] = ∫∫ x_1 x_2 f_{X_1,X_2}(x_1,x_2) dx_1 dx_2 = ∫_0^1 (∫_{x_1}^{1} 6x_1 x_2 (1 − x_2) dx_2) dx_1
               = ∫_0^1 [x_1 − 3x_1^3 + 2x_1^4] dx_1 = 3/20.   (7)

    E[X_2 X_3] = ∫_0^1 ∫_{x_2}^{1} 6x_2^2 x_3 dx_3 dx_2 = ∫_0^1 [3x_2^2 − 3x_2^4] dx_2 = 2/5.   (8)

    E[X_1 X_3] = ∫_0^1 ∫_{x_1}^{1} 6x_1 x_3 (x_3 − x_1) dx_3 dx_1
               = ∫_0^1 (2x_1 x_3^3 − 3x_1^2 x_3^2) |_{x_3=x_1}^{x_3=1} dx_1
               = ∫_0^1 [2x_1 − 3x_1^2 + x_1^4] dx_1 = 1/5.   (9)

Summarizing the results, X has correlation matrix

    R_X = [1/10 3/20 1/5; 3/20 3/10 2/5; 1/5 2/5 3/5].   (10)

Vector X has covariance matrix

    C_X = R_X − E[X] E[X]'
        = [1/10 3/20 1/5; 3/20 3/10 2/5; 1/5 2/5 3/5] − [1/16 1/8 3/16; 1/8 1/4 3/8; 3/16 3/8 9/16]
        = (1/80) [3 2 1; 2 4 2; 1 2 3].   (11)

This problem shows that even for fairly simple joint PDFs, computing the covariance matrix by calculus can be a time-consuming task.
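As a check that avoids the calculus, note that f_X(x) = 6 on 0 ≤ x_1 ≤ x_2 ≤ x_3 ≤ 1 is the joint PDF of three sorted uniform (0, 1) samples (3! = 6 orderings), so the covariance matrix can be estimated by simulation. A minimal sketch:

n=1e6;
X=sort(rand(n,3),2);   % each row is an ordered triple
cov(X)                 % approximately [3 2 1; 2 4 2; 1 2 3]/80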

Quiz 5.7

We observe that X = AZ + b where

    A = [2 1; 1 −1],   b = [2; 0].   (1)

It follows from Theorem 5.18 that µ_X = b and that

    C_X = AA' = [2 1; 1 −1][2 1; 1 −1]' = [5 1; 1 2].   (2)
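A simulation confirms the result (a sketch; randn generates standard Gaussian samples):

A=[2 1; 1 -1]; b=[2; 0];
Z=randn(2,1e6); X=A*Z+b*ones(1,1e6);
mean(X,2)   % approximately b = [2; 0]
cov(X')     % approximately A*A' = [5 1; 1 2]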

Quiz 5.8

First, we observe that Y = AT where A = [1/31 1/31 · · · 1/31]. Since T is a Gaussian random vector, Theorem 5.16 tells us that Y is a 1-dimensional Gaussian vector, i.e., just a Gaussian random variable. The expected value of Y is µ_Y = µ_T = 80. The covariance matrix of Y is 1 × 1 and is just equal to Var[Y]. Thus, by Theorem 5.16, Var[Y] = A C_T A'.

function p=julytemps(T);
[D1 D2]=ndgrid((1:31),(1:31));
CT=36./(1+abs(D1-D2));
A=ones(31,1)/31.0;
CY=(A')*CT*A;
p=phi((T-80)/sqrt(CY));

In julytemps.m, the first two lines generate the 31 × 31 covariance matrix CT, or C_T. Next we calculate Var[Y]. The final step is to use the Φ(·) function to calculate P[Y < T].

Here is the output of julytemps.m:

>> julytemps([70 75 80 85 90 95])
ans =
    0.0000  0.0221  0.5000  0.9779  1.0000  1.0000

Note that P[T ≤ 70] is not actually zero and that P[T ≤ 90] is not actually 1.0000. It's just that MATLAB's short format output, invoked with the command format short, rounds off those probabilities. Here is the long format output:

>> format long
>> julytemps([70 75 80 85 90 95])
ans =
  Columns 1 through 4
    0.00002844263128  0.02207383067604  0.50000000000000  0.97792616932396
  Columns 5 through 6
    0.99997155736872  0.99999999922010

The ndgrid function is a useful way to calculate many covariance matrices. However, in this problem, C_T has a special structure; the i, jth element is

    C_T(i,j) = c_{|i−j|} = 36/(1 + |i − j|).   (1)

If we write out the elements of the covariance matrix, we see that the first row is [c_0 c_1 · · · c_30], the second row is [c_1 c_0 · · · c_29], and so on, down to the last row [c_30 · · · c_1 c_0]. This covariance matrix is known as a symmetric Toeplitz matrix. We will see in Chapters 9 and 11 that Toeplitz covariance matrices are quite common. In fact, MATLAB has a toeplitz function for generating them. The function julytemps2 uses toeplitz to generate the covariance matrix C_T:

function p=julytemps2(T);
c=36./(1+abs(0:30));
CT=toeplitz(c);
A=ones(31,1)/31.0;
CY=(A')*CT*A;
p=phi((T-80)/sqrt(CY));

Quiz Solutions – Chapter 6

Quiz 6.1

Let K_1, . . . , K_n denote a sequence of iid random variables each with PMF

    P_K(k) = 1/4 for k = 1, . . . , 4, and 0 otherwise.   (1)

We can write W_n in the form W_n = K_1 + · · · + K_n. First, we note that the first two moments of K_i are

    E[K_i] = (1 + 2 + 3 + 4)/4 = 2.5   (2)
    E[K_i^2] = (1^2 + 2^2 + 3^2 + 4^2)/4 = 7.5   (3)

Thus the variance of K_i is

    Var[K_i] = E[K_i^2] − (E[K_i])^2 = 7.5 − (2.5)^2 = 1.25   (4)

Since E[K_i] = 2.5, the expected value of W_n is

    E[W_n] = E[K_1] + · · · + E[K_n] = nE[K_i] = 2.5n   (5)

Since the rolls are independent, the random variables K_1, . . . , K_n are independent. Hence, by Theorem 6.3, the variance of the sum equals the sum of the variances. That is,

    Var[W_n] = Var[K_1] + · · · + Var[K_n] = 1.25n   (6)

Quiz 6.2

Random variables X and Y have PDFs

    f_X(x) = 3e^{−3x} for x ≥ 0, and 0 otherwise;   f_Y(y) = 2e^{−2y} for y ≥ 0, and 0 otherwise.   (1)

Since X and Y are nonnegative, W = X + Y is nonnegative. By Theorem 6.5, the PDF of W = X + Y is

    f_W(w) = ∫_{−∞}^{∞} f_X(w − y) f_Y(y) dy = 6 ∫_0^w e^{−3(w−y)} e^{−2y} dy   (2)

Fortunately, this integral is easy to evaluate. For w > 0,

    f_W(w) = 6e^{−3w} [e^y]_0^w = 6(e^{−2w} − e^{−3w})   (3)

Since f_W(w) = 0 for w < 0, a complete expression for the PDF of W is

    f_W(w) = 6e^{−2w}(1 − e^{−w}) for w ≥ 0, and 0 otherwise.   (4)

Quiz 6.3

The MGF of K is

    φ_K(s) = E[e^{sK}] = Σ_{k=0}^{4} (0.2)e^{sk} = 0.2(1 + e^s + e^{2s} + e^{3s} + e^{4s})   (1)

We find the moments by taking derivatives. The first derivative of φ_K(s) is

    dφ_K(s)/ds = 0.2(e^s + 2e^{2s} + 3e^{3s} + 4e^{4s})   (2)

Evaluating the derivative at s = 0 yields

    E[K] = dφ_K(s)/ds |_{s=0} = 0.2(1 + 2 + 3 + 4) = 2   (3)

To find higher-order moments, we continue to take derivatives:

    E[K^2] = d^2φ_K(s)/ds^2 |_{s=0} = 0.2(e^s + 4e^{2s} + 9e^{3s} + 16e^{4s}) |_{s=0} = 6   (4)
    E[K^3] = d^3φ_K(s)/ds^3 |_{s=0} = 0.2(e^s + 8e^{2s} + 27e^{3s} + 64e^{4s}) |_{s=0} = 20   (5)
    E[K^4] = d^4φ_K(s)/ds^4 |_{s=0} = 0.2(e^s + 16e^{2s} + 81e^{3s} + 256e^{4s}) |_{s=0} = 70.8   (6)

Quiz 6.4

(A) Each K_i has MGF

    φ_K(s) = E[e^{sK_i}] = (e^s + e^{2s} + · · · + e^{ns})/n = e^s(1 − e^{ns})/(n(1 − e^s))   (1)

Since the sequence of K_i is independent, Theorem 6.8 says the MGF of J is

    φ_J(s) = (φ_K(s))^m = e^{ms}(1 − e^{ns})^m / (n^m (1 − e^s)^m)   (2)

(B) Since the set of α^j X_j are independent Gaussian random variables, Theorem 6.10 says that W is a Gaussian random variable. Thus to find the PDF of W, we need only find the expected value and variance. Since the expectation of the sum equals the sum of the expectations:

    E[W] = αE[X_1] + α^2 E[X_2] + · · · + α^n E[X_n] = 0   (3)

Since the α^j X_j are independent, the variance of the sum equals the sum of the variances:

    Var[W] = α^2 Var[X_1] + α^4 Var[X_2] + · · · + α^{2n} Var[X_n]   (4)
           = α^2 + 2(α^2)^2 + 3(α^2)^3 + · · · + n(α^2)^n   (5)

Defining q = α^2, we can use Math Fact B.6 to write

    Var[W] = [α^2 − α^{2n+2}(1 + n(1 − α^2))] / (1 − α^2)^2   (6)

With E[W] = 0 and σ_W^2 = Var[W], we can write the PDF of W as

    f_W(w) = (1/√(2πσ_W^2)) e^{−w^2/(2σ_W^2)}   (7)

Quiz 6.5

(1) From Table 6.1, each X_i has MGF φ_X(s) and random variable N has MGF φ_N(s) where

    φ_X(s) = 1/(1 − s),   φ_N(s) = (1/5)e^s / (1 − (4/5)e^s).   (1)

From Theorem 6.12, R has MGF

    φ_R(s) = φ_N(ln φ_X(s)) = (1/5)φ_X(s) / (1 − (4/5)φ_X(s))   (2)

Substituting the expression for φ_X(s) yields

    φ_R(s) = (1/5) / (1/5 − s).   (3)

(2) From Table 6.1, we see that R has the MGF of an exponential (1/5) random variable. The corresponding PDF is

    f_R(r) = (1/5)e^{−r/5} for r ≥ 0, and 0 otherwise.   (4)

This quiz is an example of the general result that a geometric sum of exponential random variables is an exponential random variable.

Quiz 6.6

(1) The expected access time is

    E[X] = ∫_{−∞}^{∞} x f_X(x) dx = ∫_0^{12} (x/12) dx = 6 msec   (1)

(2) The second moment of the access time is

    E[X^2] = ∫_{−∞}^{∞} x^2 f_X(x) dx = ∫_0^{12} (x^2/12) dx = 48   (2)

The variance of the access time is Var[X] = E[X^2] − (E[X])^2 = 48 − 36 = 12.

(3) Using X_i to denote the access time of block i, we can write

    A = X_1 + X_2 + · · · + X_{12}   (3)

Since the expectation of the sum equals the sum of the expectations,

    E[A] = E[X_1] + · · · + E[X_{12}] = 12E[X] = 72 msec   (4)

(4) Since the X_i are independent,

    Var[A] = Var[X_1] + · · · + Var[X_{12}] = 12 Var[X] = 144   (5)

Hence, the standard deviation of A is σ_A = 12.

(5) To use the central limit theorem, we write

    P[A > 75] = 1 − P[A ≤ 75] = 1 − P[(A − E[A])/σ_A ≤ (75 − E[A])/σ_A]   (6)
              ≈ 1 − Φ((75 − 72)/12)   (7)
              = 1 − 0.5987 = 0.4013   (8)

Note that we used Table 3.1 to look up Φ(0.25).

(6) Once again, we use the central limit theorem and Table 3.1 to estimate

    P[A < 48] = P[(A − E[A])/σ_A < (48 − E[A])/σ_A] ≈ Φ((48 − 72)/12)   (9)
              = 1 − Φ(2) = 1 − 0.9773 = 0.0227   (10)

Quiz 6.7

Random variable $K_n$ has a binomial distribution for $n$ trials and success probability $P[V] = 3/4$.

(1) The expected number of voice calls out of 48 calls is $E[K_{48}] = 48P[V] = 36$.

(2) The variance of $K_{48}$ is

$$\text{Var}[K_{48}] = 48P[V](1 - P[V]) = 48(3/4)(1/4) = 9 \qquad (1)$$

Thus $K_{48}$ has standard deviation $\sigma_{K_{48}} = 3$.

(3) Using the ordinary central limit theorem and Table 3.1 yields

$$P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 - 36}{3}\right) - \Phi\left(\frac{30 - 36}{3}\right) = \Phi(2) - \Phi(-2) \qquad (2)$$

Recalling that $\Phi(-x) = 1 - \Phi(x)$, we have

$$P[30 \le K_{48} \le 42] \approx 2\Phi(2) - 1 = 0.9545 \qquad (3)$$

(4) Since $K_{48}$ is a discrete random variable, we can use the De Moivre-Laplace approximation to estimate

$$P[30 \le K_{48} \le 42] \approx \Phi\left(\frac{42 + 0.5 - 36}{3}\right) - \Phi\left(\frac{30 - 0.5 - 36}{3}\right) \qquad (4)$$
$$= 2\Phi(2.16666) - 1 = 0.9687 \qquad (5)$$
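For comparison, the exact binomial probability can be computed next to both Gaussian approximations. The following sketch is not part of the printed solution and assumes only base MATLAB (erf stands in for a $\Phi$ table):

n=48; p=3/4; k=30:42;
pexact=sum(arrayfun(@(j) nchoosek(n,j)*p^j*(1-p)^(n-j),k)); %exact binomial
Phi=@(x) 0.5*(1+erf(x/sqrt(2)));                %standard normal CDF
mu=n*p; sig=sqrt(n*p*(1-p));
pclt=Phi((42-mu)/sig)-Phi((30-mu)/sig);         %ordinary CLT
pdml=Phi((42.5-mu)/sig)-Phi((29.5-mu)/sig);     %De Moivre-Laplace
[pexact pclt pdml]

The De Moivre-Laplace value should track the exact probability noticeably better than the ordinary CLT value.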

Quiz 6.8

The train interarrival times $X_1, X_2, X_3$ are iid exponential $(\lambda)$ random variables. The arrival time of the third train is

$$W = X_1 + X_2 + X_3. \qquad (1)$$

In Theorem 6.11, we found that the sum of three iid exponential $(\lambda)$ random variables is an Erlang $(n = 3, \lambda)$ random variable. From Appendix A, we find that $W$ has expected value and variance

$$E[W] = 3/\lambda = 6 \qquad \text{Var}[W] = 3/\lambda^2 = 12 \qquad (2)$$

(1) By the Central Limit Theorem,

$$P[W > 20] = P\left[\frac{W - 6}{\sqrt{12}} > \frac{20 - 6}{\sqrt{12}}\right] \approx Q(7/\sqrt{3}) = 2.66 \times 10^{-5} \qquad (3)$$

(2) To use the Chernoff bound, we note that the MGF of $W$ is

$$\phi_W(s) = \left(\frac{\lambda}{\lambda - s}\right)^3 = \frac{1}{(1 - 2s)^3} \qquad (4)$$

The Chernoff bound states that

$$P[W > 20] \le \min_{s \ge 0} e^{-20s}\phi_W(s) = \min_{s \ge 0} \frac{e^{-20s}}{(1 - 2s)^3} \qquad (5)$$

To minimize $h(s) = e^{-20s}/(1 - 2s)^3$, we set the derivative of $h(s)$ to zero:

$$\frac{dh(s)}{ds} = \frac{-20(1 - 2s)^3 e^{-20s} + 6e^{-20s}(1 - 2s)^2}{(1 - 2s)^6} = 0 \qquad (6)$$

This implies $20(1 - 2s) = 6$ or $s = 7/20$. Applying $s = 7/20$ to the Chernoff bound yields

$$P[W > 20] \le \left.\frac{e^{-20s}}{(1 - 2s)^3}\right|_{s = 7/20} = (10/3)^3 e^{-7} = 0.0338 \qquad (7)$$

(3) Theorem 3.11 says that for any $w > 0$, the CDF of the Erlang $(\lambda, 3)$ random variable $W$ satisfies

$$F_W(w) = 1 - \sum_{k=0}^{2} \frac{(\lambda w)^k e^{-\lambda w}}{k!} \qquad (8)$$

Equivalently, for $\lambda = 1/2$ and $w = 20$,

$$P[W > 20] = 1 - F_W(20) \qquad (9)$$
$$= e^{-10}\left(1 + \frac{10}{1!} + \frac{10^2}{2!}\right) = 61e^{-10} = 0.0028 \qquad (10)$$

Although the Chernoff bound is relatively weak in that it overestimates the probability by roughly a factor of 12, it is a valid bound. By contrast, the Central Limit Theorem approximation grossly underestimates the true probability.
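All three answers can be reproduced numerically; this is an illustrative check, not part of the printed solution:

lam=1/2; w=20;
Phi=@(x) 0.5*(1+erf(x/sqrt(2)));                 %standard normal CDF
pclt=1-Phi((w-6)/sqrt(12))                       %CLT: approx 2.66e-5
s=7/20; chernoff=exp(-w*s)/(1-2*s)^3             %Chernoff: (10/3)^3 e^-7
pexact=exp(-lam*w)*sum((lam*w).^(0:2)./factorial(0:2))  %exact: 61e-10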

Quiz 6.9

One solution to this problem is to follow the approach of Example 6.19:

%unifbinom100.m
sx=0:100;sy=0:100;
px=binomialpmf(100,0.5,sx); py=duniformpmf(0,100,sy);
[SX,SY]=ndgrid(sx,sy); [PX,PY]=ndgrid(px,py);
SW=SX+SY; PW=PX.*PY;
sw=unique(SW); pw=finitepmf(SW,PW,sw);
pmfplot(sw,pw,'\itw','\itP_W(w)');

A graph of the PMF $P_W(w)$ appears in Figure 2. With some thought, it should be apparent that the finitepmf function is implementing the convolution of the two PMFs.

Figure 2: From Quiz 6.9, the PMF $P_W(w)$ of the independent sum of a binomial (100, 0.5) random variable and a discrete uniform (0, 100) random variable.

Quiz Solutions – Chapter 7

Quiz 7.1

An exponential random variable with expected value 1 also has variance 1. By Theorem 7.1, $M_n(X)$ has variance $\text{Var}[M_n(X)] = 1/n$. Hence, we need $n = 100$ samples.

Quiz 7.2

The arrival time of the third elevator is $W = X_1 + X_2 + X_3$. Since each $X_i$ is uniform $(0, 30)$,

$$E[X_i] = 15, \qquad \text{Var}[X_i] = \frac{(30 - 0)^2}{12} = 75. \qquad (1)$$

Thus $E[W] = 3E[X_i] = 45$, and $\text{Var}[W] = 3\,\text{Var}[X_i] = 225$.

(1) By the Markov inequality,

$$P[W > 75] \le \frac{E[W]}{75} = \frac{45}{75} = \frac{3}{5} \qquad (2)$$

(2) By the Chebyshev inequality,

$$P[W > 75] = P[W - E[W] > 30] \qquad (3)$$
$$\le P[|W - E[W]| > 30] \le \frac{\text{Var}[W]}{30^2} = \frac{225}{900} = \frac{1}{4} \qquad (4)$$

Quiz 7.3

Define the random variable $W = (X - \mu_X)^2$. Observe that $V_{100}(X) = M_{100}(W)$. By Theorem 7.6, the mean square error is

$$E\left[(M_{100}(W) - \mu_W)^2\right] = \frac{\text{Var}[W]}{100} \qquad (1)$$

Observe that $\mu_X = 0$ so that $W = X^2$. Thus,

$$\mu_W = E\left[X^2\right] = \int_{-1}^{1} x^2 f_X(x)\,dx = 1/3 \qquad (2)$$
$$E\left[W^2\right] = E\left[X^4\right] = \int_{-1}^{1} x^4 f_X(x)\,dx = 1/5 \qquad (3)$$

Therefore $\text{Var}[W] = E[W^2] - \mu_W^2 = 1/5 - (1/3)^2 = 4/45$ and the mean square error is $4/4500 = 0.000889$.

Quiz 7.4

Assuming the number $n$ of samples is large, we can use a Gaussian approximation for $M_n(X)$. Since $E[X] = p$ and $\text{Var}[X] = p(1 - p)$, we apply Theorem 7.13, which says that the interval estimate

$$M_n(X) - c \le p \le M_n(X) + c \qquad (1)$$

has confidence coefficient $1 - \alpha$ where

$$\alpha = 2 - 2\Phi\left(\frac{c\sqrt{n}}{p(1 - p)}\right). \qquad (2)$$

We must ensure for every value of $p$ that $1 - \alpha \ge 0.9$ or $\alpha \le 0.1$. Equivalently, we must have

$$\Phi\left(\frac{c\sqrt{n}}{p(1 - p)}\right) \ge 0.95 \qquad (3)$$

for every value of $p$. Since $\Phi(x)$ is an increasing function of $x$, we must satisfy $c\sqrt{n} \ge 1.65\,p(1 - p)$. Since $p(1 - p) \le 1/4$ for all $p$, we require that

$$c \ge \frac{1.65}{4\sqrt{n}} = \frac{0.41}{\sqrt{n}}. \qquad (4)$$

The 0.9 confidence interval estimate of $p$ is

$$M_n(X) - \frac{0.41}{\sqrt{n}} \le p \le M_n(X) + \frac{0.41}{\sqrt{n}}. \qquad (5)$$

For the 0.99 confidence interval, we have $\alpha \le 0.01$, implying $\Phi(c\sqrt{n}/(p(1 - p))) \ge 0.995$. This implies $c\sqrt{n} \ge 2.58\,p(1 - p)$. Since $p(1 - p) \le 1/4$ for all $p$, we require that $c \ge (0.25)(2.58)/\sqrt{n}$. In this case, the 0.99 confidence interval estimate is

$$M_n(X) - \frac{0.645}{\sqrt{n}} \le p \le M_n(X) + \frac{0.645}{\sqrt{n}}. \qquad (6)$$

Note that if $M_{100}(X) = 0.4$, then the 0.99 confidence interval estimate is

$$0.3355 \le p \le 0.4645. \qquad (7)$$

The interval is wide because the 0.99 confidence is high.
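As a quick illustration (a sketch, not part of the printed solution), the two interval estimates for $n = 100$ and $M_{100}(X) = 0.4$ are:

n=100; Mn=0.4;
c90=0.41/sqrt(n); c99=0.645/sqrt(n);
int90=[Mn-c90 Mn+c90]   %0.9 confidence interval
int99=[Mn-c99 Mn+c99]   %0.99 confidence interval: [0.3355 0.4645]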

Quiz 7.5

Following the approach of bernoullitraces.m, we generate m = 1000 sample paths, each sample path having n = 100 Bernoulli traces. At time k, OK(k) counts the fraction of sample paths that have sample mean within one standard error of p. The program bernoullisample.m graphs the number of traces within one standard error as a function of the time, i.e., the number of trials in each trace.

function OK=bernoullisample(n,m,p);
x=reshape(bernoullirv(p,m*n),n,m);
nn=(1:n)'*ones(1,m);
MN=cumsum(x)./nn;
stderr=sqrt(p*(1-p))./sqrt((1:n)');
stderrmat=stderr*ones(1,m);
OK=sum(abs(MN-p)<stderrmat,2)/m;
plot(1:n,OK,'-s');

The following graph was generated by bernoullisample(100,5000,0.5):

[Graph: the fraction OK(k) of traces within one standard error versus the number of trials k = 1, ..., 100; the values fluctuate around 0.68.]

As we would expect, as m gets large, the fraction of traces within one standard error approaches $2\Phi(1) - 1 \approx 0.68$. The unusual sawtooth pattern, though perhaps unexpected, is examined in Problem 7.5.2.

Quiz Solutions – Chapter 8

Quiz 8.1

From the problem statement, each $X_i$ has PDF and CDF

$$f_{X_i}(x) = \begin{cases} e^{-x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad F_{X_i}(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-x} & x \ge 0 \end{cases} \qquad (1)$$

Hence, the CDF of the maximum of $X_1, \ldots, X_{15}$ obeys

$$F_X(x) = P[X \le x] = P[X_1 \le x, X_2 \le x, \cdots, X_{15} \le x] = \left[P[X_i \le x]\right]^{15}. \qquad (2)$$

This implies that for $x \ge 0$,

$$F_X(x) = \left[F_{X_i}(x)\right]^{15} = \left[1 - e^{-x}\right]^{15} \qquad (3)$$

To design a significance test, we must choose a rejection region for $X$. A reasonable choice is to reject the hypothesis if $X$ is too small. That is, let $R = \{X \le r\}$. For a significance level of $\alpha = 0.01$, we obtain

$$\alpha = P[X \le r] = (1 - e^{-r})^{15} = 0.01 \qquad (4)$$

It is straightforward to show that

$$r = -\ln\left[1 - (0.01)^{1/15}\right] = 1.33 \qquad (5)$$

Hence, if we observe $X < 1.33$, then we reject the hypothesis.
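The threshold is easy to verify numerically (an illustrative sketch, not part of the printed solution):

r=-log(1-0.01^(1/15))    %rejection threshold, approx 1.33
alpha=(1-exp(-r))^15     %recovers the significance level 0.01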

Quiz 8.2

From the problem statement, the conditional PMFs of $K$ are

$$P_{K|H_0}(k) = \begin{cases} \dfrac{(10^4)^k e^{-10^4}}{k!} & k = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

$$P_{K|H_1}(k) = \begin{cases} \dfrac{(10^6)^k e^{-10^6}}{k!} & k = 0, 1, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

Since the two hypotheses are equally likely, the MAP and ML tests are the same. From Theorem 8.6, the ML hypothesis rule is

$$k \in A_0 \ \text{if} \ P_{K|H_0}(k) \ge P_{K|H_1}(k); \qquad k \in A_1 \ \text{otherwise.} \qquad (3)$$

This rule simplifies to

$$k \in A_0 \ \text{if} \ k \le k^* = \frac{10^6 - 10^4}{\ln 100} = 214{,}975.7; \qquad k \in A_1 \ \text{otherwise.} \qquad (4)$$

Thus if we observe at least 214,976 photons, then we accept hypothesis $H_1$.
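A one-line numerical check of the threshold (illustrative only, not part of the printed solution):

kstar=(1e6-1e4)/log(100)   %approx 214,975.7; accept H1 when k >= 214,976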

Quiz 8.3

For the QPSK system, a symbol error occurs when $s_i$ is transmitted but $(X_1, X_2) \in A_j$ for some $j \ne i$. For a QPSK system, it is easier to calculate the probability of a correct decision. Given $H_0$, the conditional probability of a correct decision is

$$P[C|H_0] = P[X_1 > 0, X_2 > 0 | H_0] = P\left[\sqrt{E/2} + N_1 > 0,\ \sqrt{E/2} + N_2 > 0\right] \qquad (1)$$

Because of the symmetry of the signals, $P[C|H_0] = P[C|H_i]$ for all $i$. This implies the probability of a correct decision is $P[C] = P[C|H_0]$. Since $N_1$ and $N_2$ are iid Gaussian $(0, \sigma)$ random variables, we have

$$P[C] = P[C|H_0] = P\left[\sqrt{E/2} + N_1 > 0\right] P\left[\sqrt{E/2} + N_2 > 0\right] \qquad (2)$$
$$= \left(P\left[N_1 > -\sqrt{E/2}\right]\right)^2 \qquad (3)$$
$$= \left(1 - \Phi\left(\frac{-\sqrt{E/2}}{\sigma}\right)\right)^2 \qquad (4)$$

Since $\Phi(-x) = 1 - \Phi(x)$, we have $P[C] = \Phi^2\left(\sqrt{E/2\sigma^2}\right)$. Equivalently, the probability of error is

$$P_{\text{ERR}} = 1 - P[C] = 1 - \Phi^2\left(\sqrt{\frac{E}{2\sigma^2}}\right) \qquad (5)$$

Quiz 8.4

To generate the ROC, the existing program sqdistor already calculates this miss probability $P_{\text{MISS}} = P_{01}$ and the false alarm probability $P_{\text{FA}} = P_{10}$. The modified program, sqdistroc.m, is essentially the same as sqdistor except the output is a matrix FM whose columns are the false alarm and miss probabilities. Next, the program sqdistrocplot.m calls sqdistroc three times to generate a plot that compares the receiver performance for the three requested values of d. Here is the modified code:

function FM=sqdistroc(v,d,m,T)
%square law distortion recvr
%P(error) for m bits tested
%transmit v volts or -v volts,
%add N volts, N is Gauss(0,1)
%add d(v+N)^2 distortion
%receive 1 if x>T, otherwise 0
%FM = [P(FA) P(MISS)]
x=(v+randn(m,1));
[XX,TT]=ndgrid(x,T(:));
P01=sum((XX+d*(XX.^2)< TT),1)/m;
x= -v+randn(m,1);
[XX,TT]=ndgrid(x,T(:));
P10=sum((XX+d*(XX.^2)>TT),1)/m;
FM=[P10(:) P01(:)];

function FM=sqdistrocplot(v,m,T);
FM1=sqdistroc(v,0.1,m,T);
FM2=sqdistroc(v,0.2,m,T);
FM5=sqdistroc(v,0.3,m,T);
FM=[FM1 FM2 FM5];
loglog(FM1(:,1),FM1(:,2),'-k', ...
FM2(:,1),FM2(:,2),'--k', ...
FM5(:,1),FM5(:,2),':k');
legend('\it d=0.1','\it d=0.2',...
'\it d=0.3',3)
ylabel('P_{MISS}');
xlabel('P_{FA}');

To see the effect of d, the commands

T=-3:0.1:3; sqdistrocplot(3,100000,T);

generated the plot shown in Figure 3.

Figure 3: The receiver operating curve for the communications system of Quiz 8.4 with squared distortion; $P_{\text{MISS}}$ is plotted against $P_{\text{FA}}$ on log-log axes for d = 0.1, 0.2, 0.3.

Quiz Solutions – Chapter 9

Quiz 9.1

(1) First, we calculate the marginal PDF for $0 \le y \le 1$:

$$f_Y(y) = \int_0^y 2(y + x)\,dx = \left.\left(2xy + x^2\right)\right|_{x=0}^{x=y} = 3y^2 \qquad (1)$$

This implies the conditional PDF of $X$ given $Y$ is

$$f_{X|Y}(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} = \begin{cases} \dfrac{2}{3y} + \dfrac{2x}{3y^2} & 0 \le x \le y \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

(2) The minimum mean square error estimate of $X$ given $Y = y$ is

$$\hat{x}_M(y) = E[X|Y = y] = \int_0^y \left(\frac{2x}{3y} + \frac{2x^2}{3y^2}\right) dx = 5y/9 \qquad (3)$$

Thus the MMSE estimator of $X$ given $Y$ is $\hat{X}_M(Y) = 5Y/9$.

(3) To obtain the conditional PDF $f_{Y|X}(y|x)$, we need the marginal PDF $f_X(x)$. For $0 \le x \le 1$,

$$f_X(x) = \int_x^1 2(y + x)\,dy = \left.\left(y^2 + 2xy\right)\right|_{y=x}^{y=1} = 1 + 2x - 3x^2 \qquad (4)$$

For $0 \le x \le 1$, the conditional PDF of $Y$ given $X$ is

$$f_{Y|X}(y|x) = \begin{cases} \dfrac{2(y + x)}{1 + 2x - 3x^2} & x \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

(4) The MMSE estimate of $Y$ given $X = x$ is

$$\hat{y}_M(x) = E[Y|X = x] = \int_x^1 \frac{2y^2 + 2xy}{1 + 2x - 3x^2}\,dy \qquad (7)$$
$$= \left.\frac{2y^3/3 + xy^2}{1 + 2x - 3x^2}\right|_{y=x}^{y=1} \qquad (8)$$
$$= \frac{2 + 3x - 5x^3}{3 + 6x - 9x^2} \qquad (9)$$

Quiz 9.2

(1) Since the expectation of the sum equals the sum of the expectations,

$$E[R] = E[T] + E[X] = 0 \qquad (1)$$

(2) Since $T$ and $X$ are independent, the variance of the sum $R = T + X$ is

$$\text{Var}[R] = \text{Var}[T] + \text{Var}[X] = 9 + 3 = 12 \qquad (2)$$

(3) Since $T$ and $R$ have expected values $E[R] = E[T] = 0$,

$$\text{Cov}[T, R] = E[TR] = E[T(T + X)] = E\left[T^2\right] + E[TX] \qquad (3)$$

Since $T$ and $X$ are independent and have zero expected value, $E[TX] = E[T]E[X] = 0$ and $E[T^2] = \text{Var}[T]$. Thus $\text{Cov}[T, R] = \text{Var}[T] = 9$.

(4) From Definition 4.8, the correlation coefficient of $T$ and $R$ is

$$\rho_{T,R} = \frac{\text{Cov}[T, R]}{\sqrt{\text{Var}[R]\,\text{Var}[T]}} = \frac{\sigma_T}{\sigma_R} = \sqrt{3}/2 \qquad (4)$$

(5) From Theorem 9.4, the optimum linear estimate of $T$ given $R$ is

$$\hat{T}_L(R) = \rho_{T,R}\frac{\sigma_T}{\sigma_R}(R - E[R]) + E[T] \qquad (5)$$

Since $E[R] = E[T] = 0$ and $\rho_{T,R} = \sigma_T/\sigma_R$,

$$\hat{T}_L(R) = \frac{\sigma_T^2}{\sigma_R^2}R = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_X^2}R = \frac{3}{4}R \qquad (6)$$

Hence $a^* = 3/4$ and $b^* = 0$.

(6) By Theorem 9.4, the mean square error of the linear estimate is

$$e_L^* = \text{Var}[T](1 - \rho_{T,R}^2) = 9(1 - 3/4) = 9/4 \qquad (7)$$

Quiz 9.3

When $R = r$, the conditional PDF of $X$ is Gaussian with expected value $-40 - 40\log_{10} r$ and variance 64. The conditional PDF of $X$ given $R$ is

$$f_{X|R}(x|r) = \frac{1}{\sqrt{128\pi}}\, e^{-(x + 40 + 40\log_{10} r)^2/128} \qquad (1)$$

From the conditional PDF $f_{X|R}(x|r)$, we can use Definition 9.2 to write the ML estimate of $R$ given $X = x$ as

$$\hat{r}_{\text{ML}}(x) = \arg\max_{r \ge 0} f_{X|R}(x|r) \qquad (2)$$

We observe that $f_{X|R}(x|r)$ is maximized when the exponent $(x + 40 + 40\log_{10} r)^2$ is minimized. This minimum occurs when the exponent is zero, yielding

$$\log_{10} r = -1 - x/40 \qquad (3)$$

or

$$\hat{r}_{\text{ML}}(x) = (0.1)10^{-x/40} \ \text{m} \qquad (4)$$

If the result doesn't look correct, note that a typical figure for the signal strength might be $x = -120$ dB. This corresponds to a distance estimate of $\hat{r}_{\text{ML}}(-120) = 100$ m.

For the MAP estimate, we observe that the joint PDF of $X$ and $R$ is

$$f_{X,R}(x, r) = f_{X|R}(x|r)\, f_R(r) = \frac{1}{10^6\sqrt{32\pi}}\, r\, e^{-(x + 40 + 40\log_{10} r)^2/128} \qquad (5)$$

From Theorem 9.6, the MAP estimate of $R$ given $X = x$ is the value of $r$ that maximizes $f_{X,R}(x, r)$. That is,

$$\hat{r}_{\text{MAP}}(x) = \arg\max_{0 \le r \le 1000} f_{X,R}(x, r) \qquad (6)$$

Note that we have included the constraint $r \le 1000$ in the maximization to highlight the fact that under our probability model, $R \le 1000$ m. Setting the derivative of $f_{X,R}(x, r)$ with respect to $r$ to zero yields

$$e^{-(x + 40 + 40\log_{10} r)^2/128}\left[1 - \frac{80\log_{10} e}{128}\left(x + 40 + 40\log_{10} r\right)\right] = 0 \qquad (7)$$

Solving for $r$ yields

$$r = 10^{\frac{1}{25\log_{10} e} - 1}\, 10^{-x/40} = (0.1236)10^{-x/40} \qquad (8)$$

This is the MAP estimate of $R$ given $X = x$ as long as $r \le 1000$ m. When $x \le -156.3$ dB, the above estimate will exceed 1000 m, which is not possible in our probability model. Hence, the complete description of the MAP estimate is

$$\hat{r}_{\text{MAP}}(x) = \begin{cases} 1000 & x < -156.3 \\ (0.1236)10^{-x/40} & x \ge -156.3 \end{cases} \qquad (9)$$

For example, if $x = -120$ dB, then $\hat{r}_{\text{MAP}}(-120) = 123.6$ m. When the measured signal strength is not too low, the MAP estimate is 23.6% larger than the ML estimate. This reflects the fact that large values of $R$ are a priori more probable than small values. However, for very low signal strengths, the MAP estimate takes into account that the distance can never exceed 1000 m.
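Both estimates are easy to evaluate numerically; the following sketch (not part of the printed solution) reproduces the $x = -120$ dB example:

x=-120;
rml=0.1*10^(-x/40)                                    %ML estimate: 100 m
rmap=min(1000,10^(1/(25*log10(exp(1)))-1)*10^(-x/40)) %MAP estimate: 123.6 m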

Quiz 9.4

(1) From Theorem 9.4, the LMSE estimate of $X_2$ given $Y_2$ is $\hat{X}_2(Y_2) = a^* Y_2 + b^*$ where

$$a^* = \frac{\text{Cov}[X_2, Y_2]}{\text{Var}[Y_2]}, \qquad b^* = \mu_{X_2} - a^*\mu_{Y_2}. \qquad (1)$$

Because $E[\mathbf{X}] = E[\mathbf{Y}] = 0$,

$$\text{Cov}[X_2, Y_2] = E[X_2 Y_2] = E[X_2(X_2 + W_2)] = E\left[X_2^2\right] = 1 \qquad (2)$$
$$\text{Var}[Y_2] = \text{Var}[X_2] + \text{Var}[W_2] = E\left[X_2^2\right] + E\left[W_2^2\right] = 1.1 \qquad (3)$$

It follows that $a^* = 1/1.1$. Because $\mu_{X_2} = \mu_{Y_2} = 0$, it follows that $b^* = 0$. Finally, to compute the expected square error, we calculate the correlation coefficient

$$\rho_{X_2,Y_2} = \frac{\text{Cov}[X_2, Y_2]}{\sigma_{X_2}\sigma_{Y_2}} = \frac{1}{\sqrt{1.1}} \qquad (4)$$

The expected square error is

$$e_L^* = \text{Var}[X_2](1 - \rho_{X_2,Y_2}^2) = 1 - \frac{1}{1.1} = \frac{1}{11} = 0.0909 \qquad (5)$$

(2) Since $\mathbf{Y} = \mathbf{X} + \mathbf{W}$ and $E[\mathbf{X}] = E[\mathbf{W}] = 0$, it follows that $E[\mathbf{Y}] = 0$. Thus we can apply Theorem 9.7. Note that $\mathbf{X}$ and $\mathbf{W}$ have correlation matrices

$$\mathbf{R}_{\mathbf{X}} = \begin{bmatrix} 1 & -0.9 \\ -0.9 & 1 \end{bmatrix}, \qquad \mathbf{R}_{\mathbf{W}} = \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix}. \qquad (6)$$

In terms of Theorem 9.7, $n = 2$ and we wish to estimate $X_2$ given the observation vector $\mathbf{Y} = [Y_1 \ Y_2]'$. To apply Theorem 9.7, we need to find $\mathbf{R}_{\mathbf{Y}}$ and $\mathbf{R}_{\mathbf{Y}X_2}$:

$$\mathbf{R}_{\mathbf{Y}} = E\left[\mathbf{Y}\mathbf{Y}'\right] = E\left[(\mathbf{X} + \mathbf{W})(\mathbf{X}' + \mathbf{W}')\right] \qquad (7)$$
$$= E\left[\mathbf{X}\mathbf{X}' + \mathbf{X}\mathbf{W}' + \mathbf{W}\mathbf{X}' + \mathbf{W}\mathbf{W}'\right]. \qquad (8)$$

Because $\mathbf{X}$ and $\mathbf{W}$ are independent, $E[\mathbf{X}\mathbf{W}'] = E[\mathbf{X}]E[\mathbf{W}'] = 0$. Similarly, $E[\mathbf{W}\mathbf{X}'] = 0$. This implies

$$\mathbf{R}_{\mathbf{Y}} = E\left[\mathbf{X}\mathbf{X}'\right] + E\left[\mathbf{W}\mathbf{W}'\right] = \mathbf{R}_{\mathbf{X}} + \mathbf{R}_{\mathbf{W}} = \begin{bmatrix} 1.1 & -0.9 \\ -0.9 & 1.1 \end{bmatrix}. \qquad (9)$$

In addition, we need to find

$$\mathbf{R}_{\mathbf{Y}X_2} = E[\mathbf{Y}X_2] = \begin{bmatrix} E[Y_1 X_2] \\ E[Y_2 X_2] \end{bmatrix} = \begin{bmatrix} E[(X_1 + W_1)X_2] \\ E[(X_2 + W_2)X_2] \end{bmatrix}. \qquad (10)$$

Since $\mathbf{X}$ and $\mathbf{W}$ are independent vectors, $E[W_1 X_2] = E[W_1]E[X_2] = 0$ and $E[W_2 X_2] = 0$. Thus

$$\mathbf{R}_{\mathbf{Y}X_2} = \begin{bmatrix} E[X_1 X_2] \\ E\left[X_2^2\right] \end{bmatrix} = \begin{bmatrix} -0.9 \\ 1 \end{bmatrix}. \qquad (11)$$

By Theorem 9.7,

$$\hat{\mathbf{a}} = \mathbf{R}_{\mathbf{Y}}^{-1}\mathbf{R}_{\mathbf{Y}X_2} = \begin{bmatrix} -0.225 \\ 0.725 \end{bmatrix} \qquad (12)$$

Therefore, the optimum linear estimator of $X_2$ given $Y_1$ and $Y_2$ is

$$\hat{X}_L = \hat{\mathbf{a}}'\mathbf{Y} = -0.225Y_1 + 0.725Y_2. \qquad (13)$$

The mean square error is

$$\text{Var}[X_2] - \hat{\mathbf{a}}'\mathbf{R}_{\mathbf{Y}X_2} = \text{Var}[X] - a_1 r_{Y_1,X_2} - a_2 r_{Y_2,X_2} = 0.0725. \qquad (14)$$
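The vector estimator in part (2) is a two-line MATLAB computation (an illustrative sketch, not part of the printed solution):

RY=[1.1 -0.9; -0.9 1.1]; RYX2=[-0.9; 1];
a=RY\RYX2           %optimal coefficients: [-0.225; 0.725]
mse=1-a'*RYX2       %mean square error: 0.0725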

Quiz 9.5

Since $X$ and $\mathbf{W}$ have zero expected value, $\mathbf{Y}$ also has zero expected value. Thus, by Theorem 9.7, $\hat{X}_L(\mathbf{Y}) = \hat{\mathbf{a}}'\mathbf{Y}$ where $\hat{\mathbf{a}} = \mathbf{R}_{\mathbf{Y}}^{-1}\mathbf{R}_{\mathbf{Y}X}$. Since $X$ and $\mathbf{W}$ are independent, $E[\mathbf{W}X] = \mathbf{0}$ and $E[X\mathbf{W}'] = \mathbf{0}'$. This implies

$$\mathbf{R}_{\mathbf{Y}X} = E[\mathbf{Y}X] = E[(\mathbf{1}X + \mathbf{W})X] = \mathbf{1}E\left[X^2\right] = \mathbf{1}. \qquad (1)$$

By the same reasoning, the correlation matrix of $\mathbf{Y}$ is

$$\mathbf{R}_{\mathbf{Y}} = E\left[\mathbf{Y}\mathbf{Y}'\right] = E\left[(\mathbf{1}X + \mathbf{W})(\mathbf{1}'X + \mathbf{W}')\right] \qquad (2)$$
$$= \mathbf{1}\mathbf{1}'E\left[X^2\right] + \mathbf{1}E\left[X\mathbf{W}'\right] + E[\mathbf{W}X]\mathbf{1}' + E\left[\mathbf{W}\mathbf{W}'\right] \qquad (3)$$
$$= \mathbf{1}\mathbf{1}' + \mathbf{R}_{\mathbf{W}} \qquad (4)$$

Note that $\mathbf{1}\mathbf{1}'$ is a $20 \times 20$ matrix with every entry equal to 1. Thus,

$$\hat{\mathbf{a}} = \mathbf{R}_{\mathbf{Y}}^{-1}\mathbf{R}_{\mathbf{Y}X} = \left(\mathbf{1}\mathbf{1}' + \mathbf{R}_{\mathbf{W}}\right)^{-1}\mathbf{1} \qquad (5)$$

and the optimal linear estimator is

$$\hat{X}_L(\mathbf{Y}) = \mathbf{1}'\left(\mathbf{1}\mathbf{1}' + \mathbf{R}_{\mathbf{W}}\right)^{-1}\mathbf{Y} \qquad (6)$$

The mean square error is

$$e_L^* = \text{Var}[X] - \hat{\mathbf{a}}'\mathbf{R}_{\mathbf{Y}X} = 1 - \mathbf{1}'\left(\mathbf{1}\mathbf{1}' + \mathbf{R}_{\mathbf{W}}\right)^{-1}\mathbf{1} \qquad (7)$$

Now we note that $\mathbf{R}_{\mathbf{W}}$ has $i, j$th entry $R_{\mathbf{W}}(i, j) = c^{|i - j| - 1}$. The question we must address is what value $c$ minimizes $e_L^*$. This problem is atypical in that one does not usually get to choose the correlation structure of the noise. However, we will see that the answer is somewhat instructive.

We note that the answer is not obviously apparent from Equation (7). In particular, we observe that $\text{Var}[W_i] = R_{\mathbf{W}}(i, i) = 1/c$. Thus, when $c$ is small, the noises $W_i$ have high variance and we would expect our estimator to be poor. On the other hand, if $c$ is large, $W_i$ and $W_j$ are highly correlated and the separate measurements of $X$ are very dependent. This would suggest that large values of $c$ will also result in poor MSE. If this argument is not clear, consider the extreme case in which every $W_i$ and $W_j$ have correlation coefficient $\rho_{ij} = 1$. In this case, our 20 measurements will be all the same and one measurement is as good as 20 measurements.

To find the optimal value of $c$, we write a MATLAB function mquiz9(c) to calculate the MSE for a given $c$, and a second function that finds and plots the MSE for a range of values of $c$.

function [mse,af]=mquiz9(c);
v1=ones(20,1);
RW=toeplitz(c.^((0:19)-1));
RY=(v1*(v1')) +RW;
af=(inv(RY))*v1;
mse=1-((v1')*af);

function cmin=mquiz9minc(c);
msec=zeros(size(c));
for k=1:length(c),
[msec(k),af]=mquiz9(c(k));
end
plot(c,msec);
xlabel('c');ylabel('e_L^*');
[msemin,optk]=min(msec);
cmin=c(optk);

Note in mquiz9 that v1 corresponds to the vector $\mathbf{1}$ of all ones. The following commands find the minimum $c$ and also produce the following graph:

>> c=0.01:0.01:0.99;
>> mquiz9minc(c)
ans =
0.4500

[Graph: the MSE $e_L^*$ versus c for 0 < c < 1, with a minimum near c = 0.45.]

As we see in the graph, both small values and large values of $c$ result in large MSE.

Quiz Solutions – Chapter 10

Quiz 10.1

There are many correct answers to this question. A correct answer specifies enough random variables to specify the sample path exactly. One choice for an alternate set of random variables that would specify $m(t, s)$ is:

• $m(0, s)$, the number of ongoing calls at the start of the experiment
• $N$, the number of new calls that arrive during the experiment
• $X_1, \ldots, X_N$, the interarrival times of the $N$ new arrivals
• $H$, the number of calls that hang up during the experiment
• $D_1, \ldots, D_H$, the call completion times of the $H$ calls that hang up

Quiz 10.2

(1) We obtain a continuous time, continuous valued process when we record the temperature as a continuous waveform over time.

(2) If at every moment in time, we round the temperature to the nearest degree, then we obtain a continuous time, discrete valued process.

(3) If we sample the process in part (a) every T seconds, then we obtain a discrete time, continuous valued process.

(4) Rounding the samples in part (c) to the nearest integer degree yields a discrete time, discrete valued process.

Quiz 10.3

(1) Each resistor has resistance $R$ in ohms with uniform PDF

$$f_R(r) = \begin{cases} 0.01 & 950 \le r \le 1050 \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

The probability that a test produces a 1% resistor is

$$p = P[990 \le R \le 1010] = \int_{990}^{1010} (0.01)\,dr = 0.2 \qquad (2)$$

(2) In $t$ seconds, exactly $t$ resistors are tested. Each resistor is a 1% resistor with probability $p$, independent of any other resistor. Consequently, the number of 1% resistors found has the binomial PMF

$$P_{N(t)}(n) = \begin{cases} \dbinom{t}{n} p^n (1 - p)^{t - n} & n = 0, 1, \ldots, t \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

(3) First we will find the PMF of $T_1$. This problem is easy if we view each resistor test as an independent trial. A success occurs on a trial with probability $p$ if we find a 1% resistor. The first 1% resistor is found at time $T_1 = t$ if we observe failures on trials $1, \ldots, t - 1$ followed by a success on trial $t$. Hence, just as in Example 2.11, $T_1$ has the geometric PMF

$$P_{T_1}(t) = \begin{cases} (1 - p)^{t - 1} p & t = 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

Since $p = 0.2$, the probability the first 1% resistor is found in exactly five seconds is $P_{T_1}(5) = (0.8)^4(0.2) = 0.08192$.

(4) From Theorem 2.5, a geometric random variable with success probability $p$ has expected value $1/p$. In this problem, $E[T_1] = 1/p = 5$.

(5) Note that once we find the first 1% resistor, the number of additional trials needed to find the second 1% resistor once again has a geometric PMF with expected value $1/p$ since each independent trial is a success with probability $p$. That is, $T_2 = T_1 + T'$ where $T'$ is independent and identically distributed to $T_1$. Thus

$$E[T_2 | T_1 = 10] = E[T_1 | T_1 = 10] + E\left[T' \,\middle|\, T_1 = 10\right] \qquad (5)$$
$$= 10 + E\left[T'\right] = 10 + 5 = 15 \qquad (6)$$

Quiz 10.4

Since each $X_i$ is a $N(0, 1)$ random variable, each $X_i$ has PDF

$$f_{X(i)}(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \qquad (1)$$

By Theorem 10.1, the joint PDF of $\mathbf{X} = \left[X_1 \cdots X_n\right]'$ is

$$f_{\mathbf{X}}(\mathbf{x}) = f_{X(1),\ldots,X(n)}(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i) = \frac{1}{(2\pi)^{n/2}}\, e^{-(x_1^2 + \cdots + x_n^2)/2} \qquad (2)$$

Quiz 10.5

The first and second hours are nonoverlapping intervals. Since one hour equals 3600 sec and the Poisson process has a rate of 10 packets/sec, the expected number of packets in each hour is $E[M_i] = \alpha = 36{,}000$. This implies $M_1$ and $M_2$ are independent Poisson random variables each with PMF

$$P_{M_i}(m) = \begin{cases} \dfrac{\alpha^m e^{-\alpha}}{m!} & m = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

Since $M_1$ and $M_2$ are independent, the joint PMF of $M_1$ and $M_2$ is

$$P_{M_1,M_2}(m_1, m_2) = P_{M_1}(m_1)\,P_{M_2}(m_2) = \begin{cases} \dfrac{\alpha^{m_1 + m_2} e^{-2\alpha}}{m_1!\, m_2!} & m_1 = 0, 1, \ldots;\ m_2 = 0, 1, \ldots, \\ 0 & \text{otherwise.} \end{cases} \qquad (2)$$

Quiz 10.6

To answer whether $N'(t)$ is a Poisson process, we look at the interarrival times. Let $X_1, X_2, \ldots$ denote the interarrival times of the $N(t)$ process. Since we count only even-numbered arrivals for $N'(t)$, the time until the first arrival of $N'(t)$ is $Y_1 = X_1 + X_2$. Since $X_1$ and $X_2$ are independent exponential $(\lambda)$ random variables, $Y_1$ is an Erlang $(n = 2, \lambda)$ random variable; see Theorem 6.11. Since $Y_i$, the $i$th interarrival time of the $N'(t)$ process, has the same PDF as $Y_1$, we can conclude that the interarrival times of $N'(t)$ are not exponential random variables. Thus $N'(t)$ is not a Poisson process.

Quiz 10.7

First, we note that for $t > s$,

$$X(t) - X(s) = \frac{W(t) - W(s)}{\sqrt{\alpha}} \qquad (1)$$

Since $W(t) - W(s)$ is a Gaussian random variable, Theorem 3.13 states that $X(t) - X(s)$ is Gaussian with expected value

$$E[X(t) - X(s)] = \frac{E[W(t) - W(s)]}{\sqrt{\alpha}} = 0 \qquad (2)$$

and variance

$$E\left[\left(\frac{W(t) - W(s)}{\sqrt{\alpha}}\right)^2\right] = \frac{E\left[(W(t) - W(s))^2\right]}{\alpha} = \frac{\alpha(t - s)}{\alpha} = t - s \qquad (3)$$

Consider $s' \le s < t$. Since $s \ge s'$, $W(t) - W(s)$ is independent of $W(s')$. This implies $[W(t) - W(s)]/\sqrt{\alpha}$ is independent of $W(s')/\sqrt{\alpha}$ for all $s \ge s'$. That is, $X(t) - X(s)$ is independent of $X(s')$ for all $s \ge s'$. Thus $X(t)$ is a Brownian motion process with variance $\text{Var}[X(t)] = t$.

Quiz 10.8

First we find the expected value

$$\mu_Y(t) = \mu_X(t) + \mu_N(t) = \mu_X(t). \qquad (1)$$

To find the autocorrelation, we observe that since $X(t)$ and $N(t)$ are independent and since $N(t)$ has zero expected value, $E[X(t)N(t')] = E[X(t)]E[N(t')] = 0$. Since $R_Y(t, \tau) = E[Y(t)Y(t + \tau)]$, we have

$$R_Y(t, \tau) = E[(X(t) + N(t))(X(t + \tau) + N(t + \tau))] \qquad (2)$$
$$= E[X(t)X(t + \tau)] + E[X(t)N(t + \tau)] + E[X(t + \tau)N(t)] + E[N(t)N(t + \tau)] \qquad (3)$$
$$= R_X(t, \tau) + R_N(t, \tau). \qquad (4)$$

Quiz 10.9

From Definition 10.14, $X_1, X_2, \ldots$ is a stationary random sequence if for all sets of time instants $n_1, \ldots, n_m$ and time offset $k$,

$$f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) \qquad (1)$$

Since the random sequence is iid,

$$f_{X_{n_1},\ldots,X_{n_m}}(x_1, \ldots, x_m) = f_X(x_1)\,f_X(x_2)\cdots f_X(x_m) \qquad (2)$$

Similarly, for time instants $n_1 + k, \ldots, n_m + k$,

$$f_{X_{n_1+k},\ldots,X_{n_m+k}}(x_1, \ldots, x_m) = f_X(x_1)\,f_X(x_2)\cdots f_X(x_m) \qquad (3)$$

We can conclude that the iid random sequence is stationary.

Quiz 10.10

We must check whether each function $R(\tau)$ meets the conditions of Theorem 10.12:

$$R(\tau) \ge 0 \qquad R(\tau) = R(-\tau) \qquad |R(\tau)| \le R(0) \qquad (1)$$

(1) $R_1(\tau) = e^{-|\tau|}$ meets all three conditions and thus is valid.

(2) $R_2(\tau) = e^{-\tau^2}$ also is valid.

(3) $R_3(\tau) = e^{-\tau}\cos\tau$ is not valid because

$$R_3(-2\pi) = e^{2\pi}\cos 2\pi = e^{2\pi} > 1 = R_3(0) \qquad (2)$$

(4) $R_4(\tau) = e^{-\tau^2}\sin\tau$ also cannot be an autocorrelation function because

$$R_4(\pi/2) = e^{-\pi/2}\sin(\pi/2) = e^{-\pi/2} > 0 = R_4(0) \qquad (3)$$

Quiz 10.11

(1) The autocorrelation of $Y(t)$ is

$$R_Y(t, \tau) = E[Y(t)Y(t + \tau)] \qquad (1)$$
$$= E[X(-t)X(-t - \tau)] \qquad (2)$$
$$= R_X(-t - (-t - \tau)) = R_X(\tau) \qquad (3)$$

Since $E[Y(t)] = E[X(-t)] = \mu_X$, we can conclude that $Y(t)$ is a wide sense stationary process. In fact, we see that by viewing a process backwards in time, we see the same second order statistics.

(2) Since $X(t)$ and $Y(t)$ are both wide sense stationary processes, we can check whether they are jointly wide sense stationary by seeing if $R_{XY}(t, \tau)$ is just a function of $\tau$. In this case,

$$R_{XY}(t, \tau) = E[X(t)Y(t + \tau)] \qquad (4)$$
$$= E[X(t)X(-t - \tau)] \qquad (5)$$
$$= R_X(t - (-t - \tau)) = R_X(2t + \tau) \qquad (6)$$

Since $R_{XY}(t, \tau)$ depends on both $t$ and $\tau$, we conclude that $X(t)$ and $Y(t)$ are not jointly wide sense stationary. To see why this is, suppose $R_X(\tau) = e^{-|\tau|}$ so that samples of $X(t)$ far apart in time have almost no correlation. In this case, as $t$ gets larger, $Y(t) = X(-t)$ and $X(t)$ become less and less correlated.

Quiz 10.12

From the problem statement,

$$E[X(t)] = E[X(t + 1)] = 0 \qquad (1)$$
$$E[X(t)X(t + 1)] = 1/2 \qquad (2)$$
$$\text{Var}[X(t)] = \text{Var}[X(t + 1)] = 1 \qquad (3)$$

The Gaussian random vector $\mathbf{X} = \left[X(t) \ X(t + 1)\right]'$ has covariance matrix and corresponding inverse

$$\mathbf{C}_{\mathbf{X}} = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 \end{bmatrix} \qquad \mathbf{C}_{\mathbf{X}}^{-1} = \frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix} \qquad (4)$$

Since

$$\mathbf{x}'\mathbf{C}_{\mathbf{X}}^{-1}\mathbf{x} = \begin{bmatrix} x_0 & x_1 \end{bmatrix}\frac{4}{3}\begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix}\begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{4}{3}\left(x_0^2 - x_0 x_1 + x_1^2\right) \qquad (5)$$

the joint PDF of $X(t)$ and $X(t + 1)$ is the Gaussian vector PDF

$$f_{X(t),X(t+1)}(x_0, x_1) = \frac{1}{(2\pi)^{n/2}\left[\det(\mathbf{C}_{\mathbf{X}})\right]^{1/2}}\exp\left(-\frac{1}{2}\mathbf{x}'\mathbf{C}_{\mathbf{X}}^{-1}\mathbf{x}\right) \qquad (6)$$
$$= \frac{1}{\sqrt{3\pi^2}}\, e^{-\frac{2}{3}\left(x_0^2 - x_0 x_1 + x_1^2\right)} \qquad (7)$$

Figure 4: Sample path of 100 minutes of the blocking switch of Quiz 10.13; M(t), the number of ongoing calls, is plotted against t in minutes.

Quiz 10.13

The simple structure of the switch simulation of Example 10.28 admits a deceptively simple solution in terms of the vector of arrivals A and the vector of departures D. With the introduction of call blocking, we cannot generate these vectors all at once. In particular, when an arrival occurs at time t, we need to know that M(t), the number of ongoing calls, satisfies M(t) < c = 120. Otherwise, when M(t) = c, we must block the call. Call blocking can be implemented by setting the service time of the call to zero so that the call departs as soon as it arrives.

The blocking switch is an example of a discrete event system. The system evolves via a sequence of discrete events, namely arrivals and departures, at discrete time instances. A simulation of the system moves from one time instant to the next by maintaining a chronological schedule of future events (arrivals and departures) to be executed. The program simply executes the event at the head of the schedule. The logic of such a simulation is:

1. Start at time t = 0 with an empty system. Schedule the first arrival to occur at S_1, an exponential (λ) random variable.

2. Examine the head-of-schedule event.

   • When the head-of-schedule event is the kth arrival at time t, check the state M(t).
      – If M(t) < c, admit the arrival, increase the system state n by 1, and schedule a departure to occur at time t + S_n, where S_n is an exponential (µ) random variable.
      – If M(t) = c, block the arrival and do not schedule a departure event.

   • If the head-of-schedule event is a departure, reduce the system state n by 1.

3. Delete the head-of-schedule event and go to step 2.

After the head-of-schedule event is completed and any new events (departures in this system) are scheduled, we know the system state cannot change until the next scheduled event. Thus we know that M(t) will stay the same until then. In our simulation, we use the vector t as the set of time instances at which we inspect the system state. Thus for all times t(i) between the current head-of-schedule event and the next, we set m(i) to the current switch state.

The complete program is shown in Figure 5. In most programming languages, it is common to implement the event schedule as a linked list where each item in the list has a data structure indicating an event timestamp and the type of the event. In MATLAB, a simple (but not elegant) way to do this is to maintain two vectors: time is a list of timestamps of scheduled events and event is the list of event types. In this case, event(i)=1 if the ith scheduled event is an arrival, or event(i)=-1 if the ith scheduled event is a departure.

When the program is passed a vector t, the output [m a b] is such that m(i) is the number of ongoing calls at time t(i) while a and b are the number of admits and blocks. The following instructions

t=0:0.1:5000;
[m,a,b]=simblockswitch(10,0.1,120,t);
plot(t,m);

generated a simulation lasting 5,000 minutes. A sample path of the first 100 minutes of that simulation is shown in Figure 4. The 5,000 minute full simulation produced a=49658 admitted calls and b=239 blocked calls. We can estimate the probability a call is blocked as

$$\hat{P}_b = \frac{b}{a + b} = 0.0048. \qquad (1)$$

In Chapter 12, we will learn that the exact blocking probability is given by Equation (12.93), a result known as the "Erlang-B formula." From the Erlang-B formula, we can calculate that the exact blocking probability is $P_b = 0.0057$. One reason our simulation underestimates the blocking probability is that in a 5,000 minute simulation, roughly the first 100 minutes are needed to load up the switch since the switch is idle when the simulation starts at time t = 0. However, this says that roughly the first two percent of the simulation time was unusual. Thus this would account for only part of the disparity. The rest of the gap between 0.0048 and 0.0057 is that a simulation that includes only 239 blocks is not all that likely to give a very accurate result for the blocking probability.

Note that in Chapter 12, we will learn that the blocking switch is an example of an M/M/c/c queue, a kind of Markov chain. Chapter 12 develops techniques for analyzing and simulating systems described by Markov chains that are much simpler than the discrete event simulation technique shown here. Nevertheless, for very complicated systems, discrete event simulation is a widely used and often very efficient simulation method.

function [M,admits,blocks]=simblockswitch(lam,mu,c,t);
blocks=0; %total # blocks
admits=0; %total # admits
M=zeros(size(t));
n=0; % # in system
time=[ exponentialrv(lam,1) ];
event=[ 1 ]; %first event is an arrival
timenow=0;
tmax=max(t);
while (timenow<tmax)
M((timenow<=t)&(t<time(1)))=n;
timenow=time(1);
eventnow=event(1);
event(1)=[ ]; time(1)= [ ]; % clear current event
if (eventnow==1) % arrival
arrival=timenow+exponentialrv(lam,1); % next arrival
b4arrival=time<arrival;
event=[event(b4arrival) 1 event(~b4arrival)];
time=[time(b4arrival) arrival time(~b4arrival)];
if n<c %call admitted
admits=admits+1;
n=n+1;
depart=timenow+exponentialrv(mu,1);
b4depart=time<depart;
event=[event(b4depart) -1 event(~b4depart)];
time=[time(b4depart) depart time(~b4depart)];
else
blocks=blocks+1; %one more block, immed departure
disp(sprintf('Time %10.3d Admits %10d Blocks %10d',...
timenow,admits,blocks));
end
elseif (eventnow==-1) %departure
n=n-1;
end
end

Figure 5: Discrete event simulation of the blocking switch of Quiz 10.13.

Quiz Solutions – Chapter 11

Quiz 11.1

By Theorem 11.2,

$$\mu_Y = \mu_X\int_{-\infty}^{\infty} h(t)\,dt = 2\int_0^{\infty} e^{-t}\,dt = 2 \qquad (1)$$

Since $R_X(\tau) = \delta(\tau)$, the autocorrelation function of the output is

$$R_Y(\tau) = \int_{-\infty}^{\infty} h(u)\int_{-\infty}^{\infty} h(v)\,\delta(\tau + u - v)\,dv\,du = \int_{-\infty}^{\infty} h(u)h(\tau + u)\,du \qquad (2)$$

For $\tau > 0$, we have

$$R_Y(\tau) = \int_0^{\infty} e^{-u}e^{-\tau - u}\,du = e^{-\tau}\int_0^{\infty} e^{-2u}\,du = \frac{1}{2}e^{-\tau} \qquad (3)$$

For $\tau < 0$, we can deduce that $R_Y(\tau) = \frac{1}{2}e^{-|\tau|}$ by symmetry. Just to be safe though, we can double check. For $\tau < 0$,

$$R_Y(\tau) = \int_{-\tau}^{\infty} h(u)h(\tau + u)\,du = \int_{-\tau}^{\infty} e^{-u}e^{-\tau - u}\,du = \frac{1}{2}e^{\tau} \qquad (4)$$

Hence,

$$R_Y(\tau) = \frac{1}{2}e^{-|\tau|} \qquad (5)$$

Quiz 11.2

The expected value of the output is

$$\mu_Y = \mu_X\sum_{n=-\infty}^{\infty} h_n = 0.5\,(1 + (-1)) = 0 \qquad (1)$$

The autocorrelation of the output is

$$R_Y[n] = \sum_{i=0}^{1}\sum_{j=0}^{1} h_i h_j R_X[n + i - j] \qquad (2)$$
$$= 2R_X[n] - R_X[n - 1] - R_X[n + 1] = \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

Since $\mu_Y = 0$, the variance of $Y_n$ is $\text{Var}[Y_n] = E[Y_n^2] = R_Y[0] = 1$.

Figure 6: The autocorrelation $R_X(\tau)$ and power spectral density $S_X(f)$ for the process $X(t)$ in Quiz 11.5; (a) W = 10, (b) W = 1000.

Quiz 11.3

By Theorem 11.8, $\mathbf{Y} = \left[Y_{33} \ Y_{34} \ Y_{35}\right]'$ is a Gaussian random vector since $X_n$ is a Gaussian random process. Moreover, by Theorem 11.5, each $Y_n$ has expected value $E[Y_n] = \mu_X\sum_{n=-\infty}^{\infty} h_n = 0$. Thus $E[\mathbf{Y}] = 0$. To find the PDF of the Gaussian vector $\mathbf{Y}$, we need to find the covariance matrix $\mathbf{C}_{\mathbf{Y}}$, which equals the correlation matrix $\mathbf{R}_{\mathbf{Y}}$ since $\mathbf{Y}$ has zero expected value. One way to find $\mathbf{R}_{\mathbf{Y}}$ is to observe that $\mathbf{R}_{\mathbf{Y}}$ has the Toeplitz structure of Theorem 11.6 and to use Theorem 11.5 to find the autocorrelation function

$$R_Y[n] = \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty} h_i h_j R_X[n + i - j]. \qquad (1)$$

Despite the fact that $R_X[k]$ is an impulse, using Equation (1) is surprisingly tedious because we still need to sum over all $i$ and $j$ such that $n + i - j = 0$.

In this problem, it is simpler to observe that $\mathbf{Y} = \mathbf{H}\mathbf{X}$ where

$$\mathbf{X} = \left[X_{30} \ X_{31} \ X_{32} \ X_{33} \ X_{34} \ X_{35}\right]' \qquad (2)$$

and

$$\mathbf{H} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 \end{bmatrix}. \qquad (3)$$

In this case, following Theorem 11.7, or by directly applying Theorem 5.13 with $\mu_{\mathbf{X}} = 0$ and $\mathbf{A} = \mathbf{H}$, we obtain $\mathbf{R}_{\mathbf{Y}} = \mathbf{H}\mathbf{R}_{\mathbf{X}}\mathbf{H}'$. Since $R_X[n] = \delta_n$, $\mathbf{R}_{\mathbf{X}} = \mathbf{I}$, the identity matrix. Thus

$$\mathbf{C}_{\mathbf{Y}} = \mathbf{R}_{\mathbf{Y}} = \mathbf{H}\mathbf{H}' = \frac{1}{16}\begin{bmatrix} 4 & 3 & 2 \\ 3 & 4 & 3 \\ 2 & 3 & 4 \end{bmatrix}. \qquad (4)$$

It follows (very quickly if you use MATLAB for $3 \times 3$ matrix inversion) that

$$\mathbf{C}_{\mathbf{Y}}^{-1} = 16\begin{bmatrix} 7/12 & -1/2 & 1/12 \\ -1/2 & 1 & -1/2 \\ 1/12 & -1/2 & 7/12 \end{bmatrix}. \qquad (5)$$

Thus, the PDF of $\mathbf{Y}$ is

$$f_{\mathbf{Y}}(\mathbf{y}) = \frac{1}{(2\pi)^{3/2}\left[\det(\mathbf{C}_{\mathbf{Y}})\right]^{1/2}}\exp\left(-\frac{1}{2}\mathbf{y}'\mathbf{C}_{\mathbf{Y}}^{-1}\mathbf{y}\right). \qquad (6)$$

A disagreeable amount of algebra will show $\det(\mathbf{C}_{\mathbf{Y}}) = 3/1024$ and that the PDF can be "simplified" to

$$f_{\mathbf{Y}}(\mathbf{y}) = \frac{16}{\sqrt{6\pi^3}}\exp\left[-8\left(\frac{7}{12}y_{33}^2 + y_{34}^2 + \frac{7}{12}y_{35}^2 - y_{33}y_{34} + \frac{1}{6}y_{33}y_{35} - y_{34}y_{35}\right)\right]. \qquad (7)$$

Equation (7) shows that one of the nicest features of the multivariate Gaussian distribution is that $\mathbf{y}'\mathbf{C}_{\mathbf{Y}}^{-1}\mathbf{y}$ is a very concise representation of the cross-terms in the exponent of $f_{\mathbf{Y}}(\mathbf{y})$.
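The matrix computations above are easily verified (an illustrative sketch, not part of the printed solution):

H=(1/4)*[1 1 1 1 0 0; 0 1 1 1 1 0; 0 0 1 1 1 1];
CY=H*H'             %equals (1/16)[4 3 2; 3 4 3; 2 3 4]
detCY=det(CY)       %equals 3/1024
CYinv=inv(CY)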

Quiz 11.4

This quiz is solved using Theorem 11.9 for the case of $k = 1$ and $M = 2$. In this case,

$$\mathbf{X}_n = \left[X_{n-1} \ X_n\right]' \qquad \text{and} \qquad \mathbf{R}_{\mathbf{X}_n} = \begin{bmatrix} R_X[0] & R_X[1] \\ R_X[1] & R_X[0] \end{bmatrix} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix} \qquad (1)$$

and

$$\mathbf{R}_{\mathbf{X}_n X_{n+1}} = E\left[\begin{bmatrix} X_{n-1} \\ X_n \end{bmatrix} X_{n+1}\right] = \begin{bmatrix} R_X[2] \\ R_X[1] \end{bmatrix} = \begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix}. \qquad (2)$$

The MMSE linear first order filter for predicting $X_{n+1}$ at time $n$ is the filter $\mathbf{h}$ such that

$$\overleftarrow{\mathbf{h}} = \mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_n X_{n+1}} = \begin{bmatrix} 1.1 & 0.9 \\ 0.9 & 1.1 \end{bmatrix}^{-1}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} = \frac{1}{400}\begin{bmatrix} 81 \\ 261 \end{bmatrix}. \qquad (3)$$

It follows that the filter is $\mathbf{h} = \left[261/400 \ \ 81/400\right]'$ and the MMSE linear predictor is

$$\hat{X}_{n+1} = \frac{81}{400}X_{n-1} + \frac{261}{400}X_n. \qquad (4)$$

To find the mean square error, one approach is to follow the method of Example 11.13 and to directly calculate

$$e_L^* = E\left[\left(X_{n+1} - \hat{X}_{n+1}\right)^2\right]. \qquad (5)$$

This method is workable for this simple problem but becomes increasingly tedious for higher order filters. Instead, we can derive the mean square error for an arbitrary prediction filter $\mathbf{h}$. Since $\hat{X}_{n+1} = \overleftarrow{\mathbf{h}}'\mathbf{X}_n$,

$$e_L^* = E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)^2\right] \qquad (6)$$
$$= E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)\right] \qquad (7)$$
$$= E\left[\left(X_{n+1} - \overleftarrow{\mathbf{h}}'\mathbf{X}_n\right)\left(X_{n+1} - \mathbf{X}_n'\overleftarrow{\mathbf{h}}\right)\right] \qquad (8)$$

After a bit of algebra, we obtain

$$e_L^* = R_X[0] - 2\overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_n X_{n+1}} + \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_n}\overleftarrow{\mathbf{h}} \qquad (9)$$

With the substitution $\overleftarrow{\mathbf{h}} = \mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_n X_{n+1}}$, we obtain

$$e_L^* = R_X[0] - \mathbf{R}_{\mathbf{X}_n X_{n+1}}'\mathbf{R}_{\mathbf{X}_n}^{-1}\mathbf{R}_{\mathbf{X}_n X_{n+1}} \qquad (11)$$
$$= R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_n X_{n+1}} \qquad (12)$$

Note that this is essentially the same result as Theorem 9.7 with $\mathbf{Y} = \mathbf{X}_n$, $X = X_{n+1}$ and $\hat{\mathbf{a}}' = \overleftarrow{\mathbf{h}}'$. It is noteworthy that the result is derived in a much simpler way in the proof of Theorem 9.7 by using the orthogonality property of the LMSE estimator.

In any case, the mean square error is

$$e_L^* = R_X[0] - \overleftarrow{\mathbf{h}}'\mathbf{R}_{\mathbf{X}_n X_{n+1}} = 1.1 - \frac{1}{400}\begin{bmatrix} 81 & 261 \end{bmatrix}\begin{bmatrix} 0.81 \\ 0.9 \end{bmatrix} = \frac{506}{1451} = 0.3487. \qquad (13)$$

Recalling that the blind estimate would yield a mean square error of $\text{Var}[X] = 1.1$, we see that observing $X_{n-1}$ and $X_n$ improves the accuracy of our prediction of $X_{n+1}$.
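The predictor and its mean square error can be checked in MATLAB (an illustrative sketch, not part of the printed solution):

RXn=[1.1 0.9; 0.9 1.1]; RXnX=[0.81; 0.9];
h=RXn\RXnX          %reversed filter: [81/400; 261/400]
mse=1.1-h'*RXnX     %mean square error: 0.3487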

Quiz 11.5

(1) By Theorem 11.13(b), the average power of $X(t)$ is

$$E\left[X^2(t)\right] = \int_{-\infty}^{\infty} S_X(f)\,df = \int_{-W}^{W}\frac{5}{W}\,df = 10 \text{ Watts} \qquad (1)$$

(2) The autocorrelation function is the inverse Fourier transform of $S_X(f)$. Consulting Table 11.1, we note that

$$S_X(f) = 10\,\frac{1}{2W}\,\text{rect}\left(\frac{f}{2W}\right) \qquad (2)$$

It follows that the inverse transform of $S_X(f)$ is

$$R_X(\tau) = 10\,\text{sinc}(2W\tau) = 10\,\frac{\sin(2\pi W\tau)}{2\pi W\tau} \qquad (3)$$

(3) For W = 10 Hz and W = 1 kHz, graphs of $S_X(f)$ and $R_X(\tau)$ appear in Figure 6.

Quiz 11.6

In a sampled system, the discrete time impulse $\delta[n]$ has a flat discrete Fourier transform. That is, if $R_X[n] = 10\delta[n]$, then

$$S_X(\phi) = \sum_{n=-\infty}^{\infty} 10\delta[n]e^{-j2\pi\phi n} = 10 \qquad (1)$$

Thus, $R_X[n] = 10\delta[n]$. (This quiz is really lame!)

Quiz 11.7

Since $Y(t) = X(t - t_0)$,

$$R_{XY}(t, \tau) = E[X(t)Y(t + \tau)] = E[X(t)X(t + \tau - t_0)] = R_X(\tau - t_0) \qquad (1)$$

We see that $R_{XY}(t, \tau) = R_{XY}(\tau) = R_X(\tau - t_0)$. From Table 11.1, we recall the property that $g(\tau - \tau_0)$ has Fourier transform $G(f)e^{-j2\pi f\tau_0}$. Thus the Fourier transform of $R_{XY}(\tau) = R_X(\tau - t_0) = g(\tau - t_0)$ is

$$S_{XY}(f) = S_X(f)e^{-j2\pi f t_0}. \qquad (2)$$

Quiz 11.8

We solve this quiz using Theorem 11.17. First we need some preliminary facts. Let $a_0 = 5{,}000$ so that

$$R_X(\tau) = \frac{1}{a_0}\,a_0 e^{-a_0|\tau|}. \qquad (1)$$

Consulting the Fourier transforms in Table 11.1, we see that

$$S_X(f) = \frac{1}{a_0}\cdot\frac{2a_0^2}{a_0^2 + (2\pi f)^2} = \frac{2a_0}{a_0^2 + (2\pi f)^2} \qquad (2)$$

The RC filter has impulse response $h(t) = a_1 e^{-a_1 t}u(t)$, where $u(t)$ is the unit step function and $a_1 = 1/RC$ where $RC = 10^{-4}$ is the filter time constant. From Table 11.1,

$$H(f) = \frac{a_1}{a_1 + j2\pi f} \qquad (3)$$

(1) By Theorem 11.17,

$$S_{XY}(f) = H(f)S_X(f) = \frac{2a_0 a_1}{\left[a_1 + j2\pi f\right]\left[a_0^2 + (2\pi f)^2\right]}. \qquad (4)$$

(2) Again by Theorem 11.17,

$$S_Y(f) = H^*(f)S_{XY}(f) = |H(f)|^2 S_X(f). \qquad (5)$$

Note that

$$|H(f)|^2 = H(f)H^*(f) = \frac{a_1}{a_1 + j2\pi f}\cdot\frac{a_1}{a_1 - j2\pi f} = \frac{a_1^2}{a_1^2 + (2\pi f)^2} \qquad (6)$$

Thus,

$$S_Y(f) = |H(f)|^2 S_X(f) = \frac{2a_0 a_1^2}{\left[a_1^2 + (2\pi f)^2\right]\left[a_0^2 + (2\pi f)^2\right]} \qquad (7)$$

(3) To find the average power at the filter output, we can either use basic calculus and calculate $\int_{-\infty}^{\infty} S_Y(f)\,df$ directly or we can find $R_Y(\tau)$ as an inverse transform of $S_Y(f)$. Using partial fractions and the Fourier transform table, the latter method is actually less algebra. In particular, some algebra will show that

$$S_Y(f) = \frac{K_0}{a_0^2 + (2\pi f)^2} + \frac{K_1}{a_1^2 + (2\pi f)^2} \qquad (8)$$

where

$$K_0 = \frac{2a_0 a_1^2}{a_1^2 - a_0^2}, \qquad K_1 = \frac{-2a_0 a_1^2}{a_1^2 - a_0^2}. \qquad (9)$$

Thus,

$$S_Y(f) = \frac{K_0}{2a_0^2}\cdot\frac{2a_0^2}{a_0^2 + (2\pi f)^2} + \frac{K_1}{2a_1^2}\cdot\frac{2a_1^2}{a_1^2 + (2\pi f)^2}. \qquad (10)$$

Consulting with Table 11.1, we see that

$$R_Y(\tau) = \frac{K_0}{2a_0^2}\,a_0 e^{-a_0|\tau|} + \frac{K_1}{2a_1^2}\,a_1 e^{-a_1|\tau|} \qquad (11)$$

Substituting the values of $K_0$ and $K_1$, we obtain

$$R_Y(\tau) = \frac{a_1^2 e^{-a_0|\tau|} - a_0 a_1 e^{-a_1|\tau|}}{a_1^2 - a_0^2}. \qquad (12)$$

The average power of the $Y(t)$ process is

$$R_Y(0) = \frac{a_1}{a_1 + a_0} = \frac{2}{3}. \qquad (13)$$

Note that the input signal has average power $R_X(0) = 1$. Since the RC filter has a 3dB bandwidth of 10,000 rad/sec and the signal $X(t)$ has most of its signal energy below 5,000 rad/sec, the output signal has almost as much power as the input.

Quiz 11.9

This quiz implements an example of Equations (11.146) and (11.147) for a system in which we filter $Y(t) = X(t) + N(t)$ to produce an optimal linear estimate of $X(t)$. The solution to this quiz is just to find the filter $\hat{H}(f)$ using Equation (11.146) and to calculate the mean square error $e_L^*$ using Equation (11.147).

Comment: Since the text omitted the derivations of Equations (11.146) and (11.147), we note that Example 10.24 showed that

$$R_Y(\tau) = R_X(\tau) + R_N(\tau), \qquad R_{YX}(\tau) = R_X(\tau). \qquad (1)$$

Taking Fourier transforms, it follows that

$$S_Y(f) = S_X(f) + S_N(f), \qquad S_{YX}(f) = S_X(f). \qquad (2)$$

Now we can go on to the quiz, at peace with the derivations.

(1) Since $\mu_N = 0$, $R_N(0) = \text{Var}[N] = 1$. This implies

$$R_N(0) = \int_{-\infty}^{\infty} S_N(f)\,df = \int_{-B}^{B} N_0\,df = 2N_0 B \qquad (3)$$

Thus $N_0 = 1/(2B)$. Because the noise process $N(t)$ has constant power $R_N(0) = 1$, decreasing the single-sided bandwidth $B$ increases the power spectral density of the noise over frequencies $|f| < B$.

(2) Since $R_X(\tau) = \text{sinc}(2W\tau)$, where $W = 5{,}000$ Hz, we see from Table 11.1 that

$$S_X(f) = \frac{1}{10^4}\,\text{rect}\left(\frac{f}{10^4}\right). \qquad (4)$$

The noise power spectral density can be written as

$$S_N(f) = N_0\,\text{rect}\left(\frac{f}{2B}\right) = \frac{1}{2B}\,\text{rect}\left(\frac{f}{2B}\right). \qquad (5)$$

From Equation (11.146), the optimal filter is

$$\hat{H}(f) = \frac{S_X(f)}{S_X(f) + S_N(f)} = \frac{\frac{1}{10^4}\,\text{rect}\left(\frac{f}{10^4}\right)}{\frac{1}{10^4}\,\text{rect}\left(\frac{f}{10^4}\right) + \frac{1}{2B}\,\text{rect}\left(\frac{f}{2B}\right)}. \qquad (6)$$

(3) We produce the output $\hat{X}(t)$ by passing the noisy signal $Y(t)$ through the filter $\hat{H}(f)$. From Equation (11.147), the mean square error of the estimate is

$$e_L^* = \int_{-\infty}^{\infty}\frac{S_X(f)S_N(f)}{S_X(f) + S_N(f)}\,df \qquad (7)$$
$$= \int_{-\infty}^{\infty}\frac{\frac{1}{10^4}\,\text{rect}\left(\frac{f}{10^4}\right)\frac{1}{2B}\,\text{rect}\left(\frac{f}{2B}\right)}{\frac{1}{10^4}\,\text{rect}\left(\frac{f}{10^4}\right) + \frac{1}{2B}\,\text{rect}\left(\frac{f}{2B}\right)}\,df. \qquad (8)$$

To evaluate the MSE $e_L^*$, we need to know whether $B \le W$. Since the problem asks us to find the largest possible $B$, let's suppose $B \le W$. We can go back and consider the case $B > W$ later. When $B \le W$, the MSE is

$$e_L^* = \int_{-B}^{B}\frac{\frac{1}{10^4}\cdot\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{1 + \frac{5{,}000}{B}} \qquad (9)$$

To obtain MSE $e_L^* \le 0.05$ requires $B \le 5{,}000/19 = 263.16$ Hz.

Although this completes the solution to the quiz, what is happening may not be obvious. The noise power is always $\text{Var}[N] = 1$ Watt, for all values of $B$. As $B$ is decreased, the PSD $S_N(f)$ becomes increasingly tall, but only over a bandwidth $B$ that is decreasing. Thus as $B$ decreases, the filter $\hat{H}(f)$ makes an increasingly deep and narrow notch at frequencies $|f| \le B$. Two examples of the filter $\hat{H}(f)$ are shown in Figure 7. As $B$ shrinks, the filter suppresses less of the signal of $X(t)$. The result is that the MSE goes down.

Finally, we note that we can choose $B$ very large and also achieve MSE $e_L^* = 0.05$. In particular, when $B > W = 5000$, $S_N(f) = 1/2B$ over frequencies $|f| < W$. In this case, the Wiener filter $\hat{H}(f)$ is an ideal (flat) lowpass filter

$$\hat{H}(f) = \begin{cases} \dfrac{\frac{1}{10^4}}{\frac{1}{10^4} + \frac{1}{2B}} & |f| < 5{,}000, \\ 0 & \text{otherwise.} \end{cases} \qquad (10)$$

Thus increasing $B$ spreads the constant 1 watt of power of $N(t)$ over more bandwidth. The Wiener filter removes the noise that is outside the band of the desired signal. The mean square error is

$$e_L^* = \int_{-5000}^{5000}\frac{\frac{1}{10^4}\cdot\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}}\,df = \frac{\frac{1}{2B}}{\frac{1}{10^4} + \frac{1}{2B}} = \frac{1}{\frac{B}{5000} + 1} \qquad (11)$$

In this case, $B \ge 9.5 \times 10^4$ guarantees $e_L^* \le 0.05$.

Quiz 11.10

It is fairly straightforward to find $S_X(\phi)$ and $S_Y(\phi)$. The only thing to keep in mind is to use fftc to transform the autocorrelation $R_X[n]$ into the power spectral density $S_X(\phi)$. The following MATLAB program generates and plots the functions shown in Figure 8.

Figure 7: The Wiener filter $\hat{H}(f)$ for Quiz 11.9; (left) B = 500, (right) B = 2500.

%mquiz11.m
N=32;
rx=[2 4 2]; SX=fftc(rx,N); %autocorrelation and PSD
stem(0:N-1,abs(SX));
xlabel('n');ylabel('S_X(n/N)');
h2=0.5*[1 1]; H2=fft(h2,N); %impulse/filter response: M=2
SY2=SX.*((abs(H2)).^2);
figure; stem(0:N-1,abs(SY2)); %PSD of Y for M=2
xlabel('n');ylabel('S_{Y_2}(n/N)');
h10=0.1*ones(1,10); H10=fft(h10,N); %impulse/filter response: M=10
SY10=SX.*((abs(H10)).^2);
figure; stem(0:N-1,abs(SY10));
xlabel('n');ylabel('S_{Y_{10}}(n/N)');

Relative to M = 2, when M = 10 the filter $H(\phi)$ filters out almost all of the high frequency components of $X(t)$. In the context of Example 11.26, the low pass moving average filter for M = 10 removes the high frequency components and results in a filter output that varies very slowly.

As an aside, note that the vectors SX, SY2 and SY10 in mquiz11 should all be real-valued vectors. However, the finite numerical precision of MATLAB results in tiny imaginary parts. Although these imaginary parts have no computational significance, they tend to confuse the stem function. Hence, we generate stem plots of the magnitude of each power spectral density.

Figure 8: For Quiz 11.10, graphs of $S_X(\phi)$, $S_{Y_2}(n/N)$ for M = 2, and $S_{Y_{10}}(n/N)$ for M = 10, using an N = 32 point DFT.

Quiz Solutions – Chapter 12

Quiz 12.1

The system has two states depending on whether the previous packet was received in error. From the problem statement, we are given the conditional probabilities

$$P\left[X_{n+1} = 0 | X_n = 0\right] = 0.99 \qquad P\left[X_{n+1} = 1 | X_n = 1\right] = 0.9 \qquad (1)$$

Since each $X_n$ must be either 0 or 1, we can conclude that

$$P\left[X_{n+1} = 1 | X_n = 0\right] = 0.01 \qquad P\left[X_{n+1} = 0 | X_n = 1\right] = 0.1 \qquad (2)$$

These conditional probabilities correspond to the transition matrix and Markov chain:

[Markov chain diagram: states 0 and 1 with self-transition probabilities 0.99 and 0.9, and cross transitions 0.01 (from 0 to 1) and 0.1 (from 1 to 0).]

$$\mathbf{P} = \begin{bmatrix} 0.99 & 0.01 \\ 0.10 & 0.90 \end{bmatrix} \qquad (3)$$

Quiz 12.2

From the problem statement, the Markov chain and the transition matrix are

[Markov chain diagram: states 0, 1, 2 with the transition probabilities of P below.]

$$\mathbf{P} = \begin{bmatrix} 0.4 & 0.6 & 0 \\ 0.2 & 0.6 & 0.2 \\ 0 & 0.6 & 0.4 \end{bmatrix} \qquad (1)$$

The eigenvalues of $\mathbf{P}$ are

$$\lambda_1 = 0 \qquad \lambda_2 = 0.4 \qquad \lambda_3 = 1 \qquad (2)$$

We can diagonalize $\mathbf{P}$ into

$$\mathbf{P} = \mathbf{S}^{-1}\mathbf{D}\mathbf{S} = \begin{bmatrix} -0.6 & 0.5 & 1 \\ 0.4 & 0 & 1 \\ -0.6 & -0.5 & 1 \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}\begin{bmatrix} -0.5 & 1 & -0.5 \\ 1 & 0 & -1 \\ 0.2 & 0.6 & 0.2 \end{bmatrix} \qquad (3)$$

where $\mathbf{s}_i$, the $i$th row of $\mathbf{S}$, is the left eigenvector of $\mathbf{P}$ satisfying $\mathbf{s}_i\mathbf{P} = \lambda_i\mathbf{s}_i$. Algebra will verify that the $n$-step transition matrix is

$$\mathbf{P}^n = \mathbf{S}^{-1}\mathbf{D}^n\mathbf{S} = \begin{bmatrix} 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \\ 0.2 & 0.6 & 0.2 \end{bmatrix} + (0.4)^n\begin{bmatrix} 0.5 & 0 & -0.5 \\ 0 & 0 & 0 \\ -0.5 & 0 & 0.5 \end{bmatrix} \qquad (4)$$
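A quick numerical check of the n-step formula (an illustrative sketch, not part of the printed solution):

P=[0.4 0.6 0; 0.2 0.6 0.2; 0 0.6 0.4];
n=5;
Pn=P^n;
Pformula=repmat([0.2 0.6 0.2],3,1)+0.4^n*[0.5 0 -0.5; 0 0 0; -0.5 0 0.5];
max(max(abs(Pn-Pformula)))   %numerically zero for any n >= 1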

Quiz 12.3

The Markov chain describing the factory status and the corresponding state transition matrix are

[Markov chain diagram: state 0 has a self-loop with probability 0.9 and moves to state 1 with probability 0.1; state 1 moves to state 2 with probability 1; state 2 returns to state 0 with probability 1.]

$$\mathbf{P} = \begin{bmatrix} 0.9 & 0.1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \qquad (1)$$

With $\boldsymbol{\pi} = \left[\pi_0 \ \pi_1 \ \pi_2\right]'$, the system of equations $\boldsymbol{\pi}' = \boldsymbol{\pi}'\mathbf{P}$ yields $\pi_1 = 0.1\pi_0$ and $\pi_2 = \pi_1$. This implies

$$\pi_0 + \pi_1 + \pi_2 = \pi_0(1 + 0.1 + 0.1) = 1 \qquad (2)$$

It follows that the limiting state probabilities are

$$\pi_0 = 5/6, \qquad \pi_1 = 1/12, \qquad \pi_2 = 1/12. \qquad (3)$$
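Equivalently (an illustrative sketch, not part of the printed solution), the limiting probabilities solve the stationarity equations together with the normalization constraint:

P=[0.9 0.1 0; 0 0 1; 1 0 0];
A=[P'-eye(3); ones(1,3)];     %pi'P = pi' plus sum(pi)=1
pivec=A\[zeros(3,1); 1]       %returns [5/6; 1/12; 1/12]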

Quiz 12.4

The communicating classes are

$$C_1 = \{0, 1\} \qquad C_2 = \{2, 3\} \qquad C_3 = \{4, 5, 6\} \qquad (1)$$

The states in $C_1$ and $C_3$ are aperiodic. The states in $C_2$ have period 2. Once the system enters a state in $C_1$, the class $C_1$ is never left. Thus the states in $C_1$ are recurrent. That is, $C_1$ is a recurrent class. Similarly, the states in $C_3$ are recurrent. On the other hand, the states in $C_2$ are transient. Once the system exits $C_2$, the states in $C_2$ are never reentered.

Quiz 12.5

At any time $t$, the state $n$ can take on the values $0, 1, 2, \ldots$. The state transition probabilities are

$$P_{n-1,n} = P[K > n \mid K > n - 1] = \frac{P[K > n]}{P[K > n - 1]} \qquad (1)$$
$$P_{n-1,0} = P[K = n \mid K > n - 1] = \frac{P[K = n]}{P[K > n - 1]} \qquad (2)$$

The Markov chain resembles

[Markov chain diagram: states 0, 1, 2, 3, 4, ...; from state n−1 the chain either advances to state n or returns to state 0, with the return arcs labeled P[K = 1], P[K = 2], P[K = 3], ...]

The stationary probabilities satisfy

$$\pi_0 = \pi_0 P[K = 1] + \pi_1, \qquad (4)$$
$$\pi_1 = \pi_0 P[K = 2] + \pi_2, \qquad (5)$$
$$\vdots$$
$$\pi_{k-1} = \pi_0 P[K = k] + \pi_k, \qquad k = 1, 2, \ldots \qquad (6)$$

From Equation (4), we obtain

$$\pi_1 = \pi_0(1 - P[K = 1]) = \pi_0 P[K > 1] \qquad (7)$$

Similarly, Equation (5) implies

$$\pi_2 = \pi_1 - \pi_0 P[K = 2] = \pi_0(P[K > 1] - P[K = 2]) = \pi_0 P[K > 2] \qquad (8)$$

This suggests that $\pi_k = \pi_0 P[K > k]$. We verify this pattern by showing that $\pi_k = \pi_0 P[K > k]$ satisfies Equation (6):

$$\pi_0 P[K > k - 1] = \pi_0 P[K = k] + \pi_0 P[K > k]. \qquad (9)$$

When we apply $\sum_{k=0}^{\infty}\pi_k = 1$, we obtain $\pi_0\sum_{k=0}^{\infty} P[K > k] = 1$. From Problem 2.5.11, we recall that $\sum_{k=0}^{\infty} P[K > k] = E[K]$. This implies

$$\pi_n = \frac{P[K > n]}{E[K]} \qquad (10)$$

This Markov chain models repeated random countdowns. The system state is the time until the counter expires. When the counter expires, the system is in state 0, and we randomly reset the counter to a new value $K = k$ and then we count down $k$ units of time. Since we spend one unit of time in each state, including state 0, we have $k - 1$ units of time left after the state 0 counter reset. If we have a random variable $W$ such that the PMF of $W$ satisfies $P_W(n) = \pi_n$, then $W$ has a discrete PMF representing the remaining time of the counter at a time in the distant future.

Quiz 12.6

(1) By inspection, the number of transitions needed to return to state 0 is always a multiple of 2. Thus the period of state 0 is $d = 2$.

(2) To find the stationary probabilities, we solve the system of equations $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ and $\sum_{i=0}^{3}\pi_i = 1$:

$$\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 \qquad (1)$$
$$\pi_1 = (1/4)\pi_0 + (1/4)\pi_2 \qquad (2)$$
$$\pi_2 = (1/4)\pi_1 + (3/4)\pi_3 \qquad (3)$$
$$1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 \qquad (4)$$

Solving the second and third equations for $\pi_2$ and $\pi_3$ yields

$$\pi_2 = 4\pi_1 - \pi_0 \qquad \pi_3 = (4/3)\pi_2 - (1/3)\pi_1 = 5\pi_1 - (4/3)\pi_0 \qquad (5)$$

Substituting $\pi_3$ back into the first equation yields

$$\pi_0 = (3/4)\pi_1 + (1/4)\pi_3 = (3/4)\pi_1 + (5/4)\pi_1 - (1/3)\pi_0 \qquad (6)$$

This implies $\pi_1 = (2/3)\pi_0$. It follows from the first and second equations that $\pi_2 = (5/3)\pi_0$ and $\pi_3 = 2\pi_0$. Lastly, we choose $\pi_0$ so the state probabilities sum to 1:

$$1 = \pi_0 + \pi_1 + \pi_2 + \pi_3 = \pi_0\left(1 + \frac{2}{3} + \frac{5}{3} + 2\right) = \frac{16}{3}\pi_0 \qquad (7)$$

It follows that the state probabilities are

$$\pi_0 = \frac{3}{16} \qquad \pi_1 = \frac{2}{16} \qquad \pi_2 = \frac{5}{16} \qquad \pi_3 = \frac{6}{16} \qquad (8)$$

(3) Since the system starts in state 0 at time 0, we can use Theorem 12.14 to find the limiting probability that the system is in state 0 at time $nd$:

$$\lim_{n \to \infty} P_{00}(nd) = d\pi_0 = \frac{3}{8} \qquad (9)$$

Quiz 12.7

The Markov chain has the same structure as that in Example 12.22. The only difference is the modified transition rates:

[Markov chain diagram: from state 0 the chain advances to state 1 with probability 1; from state n−1 it advances to state n with probability ((n−1)/n)^α and returns to state 0 with probability 1 − ((n−1)/n)^α.]

The event $T_{00} > n$ occurs if the system reaches state $n$ before returning to state 0, which occurs with probability

$$P[T_{00} > n] = 1 \times \left(\frac{1}{2}\right)^{\alpha} \times \left(\frac{2}{3}\right)^{\alpha} \times \cdots \times \left(\frac{n-1}{n}\right)^{\alpha} = \frac{1}{n^{\alpha}}. \qquad (1)$$

Thus the CDF of $T_{00}$ satisfies $F_{T_{00}}(n) = 1 - P[T_{00} > n] = 1 - 1/n^{\alpha}$. To determine whether state 0 is recurrent, we observe that for all $\alpha > 0$

$$P[V_{00}] = \lim_{n \to \infty} F_{T_{00}}(n) = \lim_{n \to \infty}\left(1 - \frac{1}{n^{\alpha}}\right) = 1. \qquad (2)$$

Thus state 0 is recurrent for all $\alpha > 0$. Since the chain has only one communicating class, all states are recurrent. (We also note that if $\alpha = 0$, then all states are transient.)

To determine whether the chain is null recurrent or positive recurrent, we need to calculate $E[T_{00}]$. In Example 12.24, we did this by deriving the PMF $P_{T_{00}}(n)$. In this problem, it will be simpler to use the result of Problem 2.5.11, which says that $\sum_{k=0}^{\infty} P[K > k] = E[K]$ for any non-negative integer-valued random variable $K$. Applying this result, the expected time to return to state 0 is

$$E[T_{00}] = \sum_{n=0}^{\infty} P[T_{00} > n] = 1 + \sum_{n=1}^{\infty}\frac{1}{n^{\alpha}}. \qquad (3)$$

For $0 < \alpha \le 1$, $1/n^{\alpha} \ge 1/n$ and it follows that

$$E[T_{00}] \ge 1 + \sum_{n=1}^{\infty}\frac{1}{n} = \infty. \qquad (4)$$

We conclude that the Markov chain is null recurrent for $0 < \alpha \le 1$. On the other hand, for $\alpha > 1$,

$$E[T_{00}] = 2 + \sum_{n=2}^{\infty}\frac{1}{n^{\alpha}}. \qquad (5)$$

Note that for all $n \ge 2$

$$\frac{1}{n^{\alpha}} \le \int_{n-1}^{n}\frac{dx}{x^{\alpha}} \qquad (6)$$

This implies

$$E[T_{00}] \le 2 + \sum_{n=2}^{\infty}\int_{n-1}^{n}\frac{dx}{x^{\alpha}} \qquad (7)$$
$$= 2 + \int_{1}^{\infty}\frac{dx}{x^{\alpha}} \qquad (8)$$
$$= 2 + \left.\frac{x^{-\alpha+1}}{-\alpha + 1}\right|_{1}^{\infty} = 2 + \frac{1}{\alpha - 1} < \infty \qquad (9)$$

Thus for all $\alpha > 1$, the Markov chain is positive recurrent.

Quiz 12.8

The number of customers in the ”friendly” store is given by the Markov chain

1 i i+1

p p p

( )( ) 1-p 1-q ( )( ) 1-p 1-q ( )( ) 1-p 1-q ( )( ) 1-p 1-q

( ) 1-p q ( ) 1-p q ( ) 1-p q ( ) 1-p q

0

××× ×××

81

In the above chain, we note that (1 − p)q is the probability that no new customer arrives,

an existing customer gets one unit of service and then departs the store.

By applying Theorem 12.13 with the state space partitioned between S = {0, 1, . . . , i} and S′ = {i + 1, i + 2, . . .}, we see that for any state i ≥ 0,

π_i p = π_{i+1} (1 − p)q.   (1)

This implies

π_{i+1} = [p/((1 − p)q)] π_i.   (2)

Since Equation (2) holds for i = 0, 1, . . ., we have that π_i = π_0 α^i where

α = p/((1 − p)q).   (3)

Requiring the state probabilities to sum to 1, we have that for α < 1,

Σ_{i=0}^∞ π_i = π_0 Σ_{i=0}^∞ α^i = π_0/(1 − α) = 1.   (4)

Thus for α < 1, the limiting state probabilities are

π_i = (1 − α)α^i,   i = 0, 1, 2, . . .   (5)

In addition, for α ≥ 1 or, equivalently, p ≥ q/(1 + q), the limiting state probabilities do not exist.
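
As a numerical check of (5), the MATLAB lines below verify that the geometric probabilities sum to 1 for a case with α < 1; the values p = 0.2 and q = 0.5 are our own illustrative choices, giving α = 1/2:

%Check of equation (5) for sample p and q with alpha < 1
p=0.2; q=0.5;
alpha=p/((1-p)*q);          % alpha = 0.5
i=0:50;                     % truncate the infinite sum
sum((1-alpha)*alpha.^i)     % approximately 1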

Quiz 12.9

The continuous-time Markov chain describing the processor is

[Continuous-time Markov chain diagram: states 0 through 4; new tasks move the chain from state i to i + 1 at rate 2 (i = 0, 1, 2, 3); task completions move it from state i to i − 1 at rate 3 (i = 2, 3, 4); reboots move states 2, 3 and 4 directly to state 0 at rate 0.1; the total rate from state 1 to state 0 is 3.1.]

Note that q_10 = 3.1 since the task completes at rate 3 per msec, the processor reboots at rate 0.1 per msec, and the rate to state 0 is the sum of those two rates. From the Markov chain, we obtain the following useful equations for the stationary distribution.

5.1 p_1 = 2p_0 + 3p_2
5.1 p_2 = 2p_1 + 3p_3
5.1 p_3 = 2p_2 + 3p_4
3.1 p_4 = 2p_3

We can solve these equations by working backward, solving for p_4 in terms of p_3, p_3 in terms of p_2, and so on, yielding

p_4 = (20/31) p_3   p_3 = (620/981) p_2   p_2 = (19620/31431) p_1   p_1 = (628,620/1,014,381) p_0   (1)


Applying p_0 + p_1 + p_2 + p_3 + p_4 = 1 yields p_0 = 1,014,381/2,443,401 and the stationary probabilities are

p_0 = 0.4151   p_1 = 0.2573   p_2 = 0.1606   p_3 = 0.1015   p_4 = 0.0655   (2)
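
A minimal MATLAB sketch for double-checking (2): we assemble the generator matrix Q implied by the rates above (our own construction, not from the text) and solve pQ = 0 together with the normalization constraint:

%Solve p*Q=0 with sum(p)=1 for the five-state chain of Quiz 12.9
Q=[-2    2    0    0    0;
   3.1 -5.1   2    0    0;
   0.1   3  -5.1   2    0;
   0.1   0    3  -5.1   2;
   0.1   0    0    3  -3.1];
A=[Q'; ones(1,5)];           % stationary + normalization equations
b=[zeros(5,1); 1];
p=(A\b)'                     % p = [0.4151 0.2573 0.1606 0.1015 0.0655]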

Quiz 12.10

The M/M/c/∞ queue has the Markov chain

[Markov chain diagram: states 0, 1, . . . , c, c + 1, . . . ; arrivals move the chain from state n to n + 1 at rate λ; departures move it from state n to n − 1 at rate nµ for n ≤ c and at rate cµ for n > c.]

From the Markov chain, the stationary probabilities must satisfy

p_n = (ρ/n) p_{n−1},   n = 1, 2, . . . , c,
p_n = (ρ/c) p_{n−1},   n = c + 1, c + 2, . . . ,   (1)

where ρ = λ/µ. It is straightforward to show that this implies

p_n = p_0 ρ^n/n!,   n = 1, 2, . . . , c,
p_n = p_0 (ρ/c)^{n−c} ρ^c/c!,   n = c + 1, c + 2, . . .   (2)

The requirement that Σ_{n=0}^∞ p_n = 1 yields (for ρ < c)

p_0 = [ Σ_{n=0}^{c} ρ^n/n! + (ρ^c/c!) · (ρ/c)/(1 − ρ/c) ]^{−1}   (3)
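
The following MATLAB sketch evaluates (2) and (3); the function name mmcprobs and the truncation point nmax are our own hypothetical choices, and ρ = λ/µ < c is assumed:

function p=mmcprobs(lam,mu,c,nmax)
%Stationary probabilities p=[p_0 ... p_nmax] of the M/M/c queue
rho=lam/mu;
p0=1/(sum(rho.^(0:c)./factorial(0:c)) ...
    +(rho^c/factorial(c))*(rho/c)/(1-rho/c));
n=0:nmax;
p=p0*(rho.^n)./factorial(n);    % correct for n <= c
k=(n>c);                        % overwrite the n > c terms
p(k)=p0*((rho/c).^(n(k)-c))*rho^c/factorial(c);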

83

**Quiz Solutions – Chapter 1
**

Quiz 1.1 In the Venn diagrams for parts (a)-(g) below, the shaded area represents the indicated set.

M T O M T O M T O

(1) R = T c

(2) M ∪ O

(3) M ∩ O

M T

O

M T

O

M T

O

(4) R ∪ M Quiz 1.2 (1) A1 = {vvv, vvd, vdv, vdd} (2) B1 = {dvv, dvd, ddv, ddd} (3) A2 = {vvv, vvd, dvv, dvd} (4) B2 = {vdv, vdd, ddv, ddd} (5) A3 = {vvv, ddd} (6) B3 = {vdv, dvd}

(4) R ∩ M

(6) T c − M

(7) A4 = {vvv, vvd, vdv, dvv, vdd, dvd, ddv} (8) B4 = {ddd, ddv, dvd, vdd} Recall that Ai and Bi are collectively exhaustive if Ai ∪ Bi = S. Also, Ai and Bi are mutually exclusive if Ai ∩ Bi = φ. Since we have written down each pair Ai and Bi above, we can simply check for these properties. The pair A1 and B1 are mutually exclusive and collectively exhaustive. The pair A2 and B2 are mutually exclusive and collectively exhaustive. The pair A3 and B3 are mutually exclusive but not collectively exhaustive. The pair A4 and B4 are not mutually exclusive since dvd belongs to A4 and B4 . However, A4 and B4 are collectively exhaustive. 2

Quiz 1.3 There are exactly 50 equally likely outcomes: s51 through s100 . Each of these outcomes has probability 0.02. (1) P[{s79 }] = 0.02 (2) P[{s100 }] = 0.02 (3) P[A] = P[{s90 , . . . , s100 }] = 11 × 0.02 = 0.22 (4) P[F] = P[{s51 , . . . , s59 }] = 9 × 0.02 = 0.18 (5) P[T ≥ 80] = P[{s80 , . . . , s100 }] = 21 × 0.02 = 0.42 (6) P[T < 90] = P[{s51 , s52 , . . . , s89 }] = 39 × 0.02 = 0.78 (7) P[a C grade or better] = P[{s70 , . . . , s100 }] = 31 × 0.02 = 0.62 (8) P[student passes] = P[{s60 , . . . , s100 }] = 41 × 0.02 = 0.82 Quiz 1.4 We can describe this experiment by the event space consisting of the four possible events V B, V L, D B, and DL. We represent these events in the table: V D L 0.35 ? B ? ? In a roundabout way, the problem statement tells us how to ﬁll in the table. In particular, P [V ] = 0.7 = P [V L] + P [V B] P [L] = 0.6 = P [V L] + P [DL] (1) (2)

Since P[V L] = 0.35, we can conclude that P[V B] = 0.35 and that P[DL] = 0.6 − 0.35 = 0.25. This allows us to ﬁll in two more table entries: V D L 0.35 0.25 B 0.35 ? The remaining table entry is ﬁlled in by observing that the probabilities must sum to 1. This implies P[D B] = 0.05 and the complete table is V D L 0.35 0.25 B 0.35 0.05 Finding the various probabilities is now straightforward: 3

(1) P[DL] = 0.25 (2) P[D ∪ L] = P[V L] + P[DL] + P[D B] = 0.35 + 0.25 + 0.05 = 0.65. (3) P[V B] = 0.35 (4) P[V ∪ L] = P[V ] + P[L] − P[V L] = 0.7 + 0.6 − 0.35 = 0.95 (5) P[V ∪ D] = P[S] = 1 (6) P[L B] = P[L L c ] = 0 Quiz 1.5 (1) The probability of exactly two voice calls is P [N V = 2] = P [{vvd, vdv, dvv}] = 0.3 (2) The probability of at least one voice call is P [N V ≥ 1] = P [{vdd, dvd, ddv, vvd, vdv, dvv, vvv}] = 6(0.1) + 0.2 = 0.8 An easier way to get the same answer is to observe that P [N V ≥ 1] = 1 − P [N V < 1] = 1 − P [N V = 0] = 1 − P [{ddd}] = 0.8 (4) (2) (3) (1)

(3) The conditional probability of two voice calls followed by a data call given that there were two voice calls is 1 P [{vvd} , N V = 2] P [{vvd}] 0.1 = (5) = = P [{vvd} |N V = 2] = P [N V = 2] P [N V = 2] 0.3 3 (4) The conditional probability of two data calls followed by a voice call given there were two voice calls is P [{ddv} , N V = 2] P [{ddv} |N V = 2] = =0 (6) P [N V = 2] The joint event of the outcome ddv and exactly two voice calls has probability zero since there is only one voice call in the outcome ddv. (5) The conditional probability of exactly two voice calls given at least one voice call is P [N V = 2, N V ≥ 1] P [N V = 2] 0.3 3 = = = (7) P [N V = 2|Nv ≥ 1] = P [N V ≥ 1] P [N V ≥ 1] 0.8 8 (6) The conditional probability of at least one voice call given there were exactly two voice calls is P [N V ≥ 1, N V = 2] P [N V = 2] P [N V ≥ 1|N V = 2] = = =1 (8) P [N V = 2] P [N V = 2] Given that there were two voice calls, there must have been at least one voice call. 4

it’s wise to avoid intuition and simply check whether P[AB] = P[A]P[B]. Note that this shouldn’t be surprising since we used the information that the calls were independent in the problem statement to determine the probabilities of the outcomes. (1) First. there are four outcomes with probabilities P[{vv}] = (0. {C1 = d} are independent events. dv.544. 5 (7) (5) (4) (3) (2) (1) .8)2 = 0. N V ≥ 1] = P [N V = 2] = P [{vv}] = 0. Using the probabilities of the outcomes. P[C1 = v] = 0. we observe that P [N V ≥ 1] = P [{vd. we calculate the probability of the joint event: P [N V = 2. Further. the events are dependent.96 Finally. N V is even] = 0. P[N V ≥ 1] = 0.Quiz 1. vv}] = 0.96. (4) The probability of the joint event is P [C2 = v.2)(0.544.96) = P [N V = 2.8)(0.68) = 0.68 (8) Thus.96)(0.64 Next.16 P[{dd}] = (0.64 P[{dv}] = (0. and the ﬁrst call is a data call. {C2 = v}.80 From part (a). vv}] = 0.8. C2 = v] = P [{dv}] = 0. P[C2 = v]P[N V is even] = (0.8) = 0.8) = 0.8 so that P [N V ≥ 1] P [C1 = v] = (0.8)(0. the events are dependent. vv}] = 0. P [N V is even] = P [{dd.64)(0.2) = 0.16 P[{vd}] = (0.16 (6) Since P[C1 = d]P[C2 = v] = (0. C1 = v] = P [{vd. we make the comparison P [N V = 2] P [N V ≥ 1] = (0.2)2 = 0. Just to be sure. (2) The probability of the joint event is P [N V ≥ 1. each event has probability P [C2 = v] = P [{dv.2)(0.8) = 0. Since P[C2 = v. we conﬁrm that the events are independent. C1 = v] Hence. we now can test for the independence of events.6 In this experiment. we can do the calculations to check: P [C1 = d.04 When checking the independence of any two events A and B.64 Also. N V is even] = P [{vv}] = 0. N V ≥ 1] which shows the two events are dependent.16.768 = P [N V ≥ 1. vv}] = 0. (3) The problem statement that the calls were independent implies that the events the second call is a voice call.

For each of the next three bits. There are 4 = 6 ways to do this. The failure probability is = 1 − p and the success probability is 1 − = p. it is also possible to simply enumerate the six code words: 1100. 0011. Thus the probability the user is found is c c c P [F] = 1 − P F1 F2 F3 = 1 − (0. 2 For this problem.8 ¨ F1 0.7 Let Fi denote the event that that the user is found on page i. Each subexperiment has two possible outcomes: 0 and 1. 1001.2)3 = 0. there are 8 = 56 code words. (2) An experiment that can yield all possible code words with two zeroes is to choose which 2 bits (out of 4 bits) will be zero.Quiz 1. 0101. Hence. 3 Quiz 1.992 (1) Quiz 1. The number of ways of choosing such N a code word is M . (4) For the constant ratio code.2 0.9 (1) In this problem. Thus by the fundamental principle of counting.8 ¨ F3 ¨¨ ¨¨ ¨¨ ¨ ¨¨ ¨¨ c c c ¨¨ F3 F1 ¨ F2 ¨ 0.100−k = 100 k 6 k (1 − )100−k (1) .2 The user is found unless all three paging attempts fail. The other N − M bits will be zeroes. then the ﬁrst subexperiment of choosing the ﬁrst bit has only one outcome. we can specify a code word by choosing M of the bits to be ones. the probability of k bits in error and 100 − k correctly received bits is P Sk. there are 1 × 2 × 2 × 2 = 8 ways of choosing a code word.8 ¨ F2 0. k bits received in error is the same as k failures in 100 trials. 1010. The other two bits then must be ones. That is. For N = 8 and M = 3.8 (1) We can view choosing each bit in the code word as a subexperiment. we have two choices. 0110. there are six code words with exactly two zeroes. there are 2 × 2 × 2 × 2 = 24 = 16 possible code words. (3) When the ﬁrst bit must be a zero. The tree for the experiment is 0.2 0. In this case.

100). + (2*(R>0. we ﬁrst generate a vector R of 100 random numbers.5 and 0. That is. 700(0.9)). + (3*(R>0.4). P S0.99) 8 = 0.97 = 0.97 = 161. then X(i)=1. we use the hist function to count how many occurences of each possible value of X(i). X=(R<= 0.3700 9 97 P S2.10 Since the chip works only if all n transistors work.99 + P S2.4 < R(i) and R(i)<=0. X(i)=2 if ﬂip i was tails. The probability that a chip works is P[C] = pn . 7 . Since transistor failures are independent of each other.99) (2) The probability a packet is decoded correctly is just P [C] = P S0. we note there are three cases: • If R(i) <= 0.11 R=rand(1.98 + P S3. chip failures are also independent. Lastly.9 < R(i).9819 = 0.100 = (1 − )100 = (0. The module works if either 8 chips work or 9 chips work.For = 0.1849 P S3.4.0610 (6) Quiz 1. P [C8 ] = The probability a memory module works is P [M] = P [C8 ] + P [C9 ] = p 8n (9 − 8 p n ) Quiz 1.1:3) (1) (2) (3) For a M ATLAB simulation. 0. 8 P [C9 ] = (P [C])9 = p 9n . Let Ck denote the event that exactly k chips work. Thus each P[Ck ] has the binomial probability 9 (P [C])8 (1 − P [C])9−8 = 9 p 8n (1 − p n ).1..98 = 4950(0. • If 0. Second..9)) .9.100 + P S1. we generate vector X as a function of R to represent the 3 possible outcomes of a ﬂip. X(i)=1 if ﬂip i was heads.01)(0. To see how this works.01..*(R<=0.99 = 100(0. and X(i)=3) is ﬂip i landed on the edge. the transistors in the chip are like devices in series.01) (0.4.01) (0.3660 P S1. Y=hist(X. then X(i)=3.99) 2 3 99 (2) (3) (4) (5) = 0. then X(i)=2..4) . • If 0.99)100 = 0. These three cases will have probabilities 0.

16 2 PN (n) = c 1 + n=1 1 1 + 2 3 =1 (1) This implies c = 6/11. (2) P[N = 1] = PN (1) = c = 6/11 (3) P[N ≥ 2] = PN (2) + PN (3) = c/2 + c/3 = 5/11 (4) P[N > 3] = ∞ n=4 PN (n) = 0 Quiz 2. . the trial is a success. with probability p. . 0 otherwise (1) (2) If p = 0. then the probability exactly 10 bits are sent is P [X = 10] = PX (10) = (0. Similar to Example 2. .24 2.5 0.5 0.1. we recall that the PMF must sum to 1. Now we can interpret each experiment in the generic context of independent trials.3 Decoding each transmitted bit is an independent trial where we call a bit error a “success.11. That is.9)9 = 0.1)(0.Quiz Solutions – Chapter 2 Quiz 2. Now that we have found c. the remaining parts are straightforward.0 0. that is. probabilities and corresponding grades for the experiment are Outcome P[·] BB BC CB CC Quiz 2.0387 8 (2) . (1) The random variable X is the number of trials up to and including the ﬁrst success.2 (1) To ﬁnd c. 3 G 0.” Each bit is in error.1 The sample space.24 2.36 3. X has the geometric PMF PX (x) = p(1 − p)x−1 x = 1. 2.

01)2 (0. Y has the binomial PMF PY (y) = 100 y p (1 − p)100−y y (4) (3) If p = 0.910 = 0.25. This x=10 sum is not too hard to calculate. (3) The random variable Y is the number of successes in 100 independent trials. its even easier to observe that X ≥ 10 if the ﬁrst 10 bits are transmitted correctly.01)2 (0. P[X ≥ 10] = 0. 5..15) PZ (z) = z−1 3 p (1 − p)z−3 2 (9) Note that PZ (z) > 0 for z = 3. (6) If p = 0.99)98 = 0. .1849 2 (5) (4) The probability of no more than 2 errors is P [Y ≤ 2] = PY (0) + PY (1) + PY (2) = (0.4 Each of these probabilities can be read off the CDF FY (y). However. . However.75)9 = 0. we must keep in + mind that when FY (y) has a discontinuity at y0 . .1.The probability that at least 10 bits are sent is P[X ≥ 10] = ∞ PX (x).01)(0.13. 4.9207 100 (0.0645 2 (10) Quiz 2.99)99 + = 0.25)3 (0. FY (y) takes the upper value FY (y0 ). P [X ≥ 10] = P [ﬁrst 10 bits are correct] = (1 − p)10 For p = 0. Just as in Example 2.01. Thus Z has the Pascal PMF (see Example 2.3487. the probability of exactly 2 errors is P [Y = 2] = PY (2) = 100 (0. the probability that the third error occurs on bit 12 is PZ (12) = 11 (0. (1) P[Y < 1] = FY (1− ) = 0 9 .99)98 2 (6) (7) (8) (5) Random variable Z is the number of trials up to and including the third success. That is.99)100 + 100(0.

1) = 62 (2) (3) (4) 10 . a call is a voice call and C = 25.8 = 0.(2) P[Y ≤ 1] = FY (1) = 0.7) + 40(0.7 c = 25 PC (c) = 0.3) = 29.8 = 0 Quiz 2. 105 PT (t) = 0.6 (6) P[Y = 3] = P[Y ≤ 3] − P[Y < 3] = FY (3+ ) − FY (3− ) = 0. with probability 0. the cost T is T = 25N + 40(3 − N ) = 120 − 15N (2) To ﬁnd the PMF of T .3) + 120(0. we can draw the following tree: N =0 •T =120 0.6 (1) As a function of N .3 N =2 •T =90 r rr 0. 90.3 t = 75.3 N =3 •T =75 From the tree.5 (1) With probability 0.1¨¨ ¨ ¨ ¨ 0.8 − 0.2 (4) P[Y ≥ 2] = 1 − P[Y < 2] = 1 − FY (2− ) = 1 − 0.4 (5) P[Y = 1] = P[Y ≤ 1] − P[Y < 1] = FY (1+ ) − FY (1− ) = 0. the expected value of T is E [T ] = 75PT (75) + 90PT (90) + 105PT (105) + 120PT (120) = (75 + 90 + 105)(0. Otherwise. we have a data call and C = 40.1 t = 120 ⎩ 0 otherwise From the PMF PT (t).7.6 = 0. we can write down the PMF of T : ⎧ ⎨ 0. This corresponds to the PMF ⎧ ⎨ 0.3.3 c = 40 (1) ⎩ 0 otherwise (2) The expected value of C is E [C] = 25(0.3$$N =1 •T =105 $ (2) (1) $ $$ ¨¨$ rr rr0.5 cents Quiz 2.6 (3) P[Y > 2] = 1 − P[Y ≤ 2] = 1 − FY (2) = 1 − 0.

8 The PMF PN (n) allows to calculate each of the desired quantities.4 (1) (2) The second moment of N is 2 E N 2 = n=0 n 2 PN (n) = 02 (0.663.4)2 = 0.1) = 2 (1) (2) The number of memory chips is M = g(A) where ⎧ ⎨ 4 A = 1. (1) The expected value of N is 2 E [N ] = n=0 n PN (n) = 0(0. (3) 11 .Quiz 2.3) + 6(0.44 = 0.3) + 3(0.4) + 4(0.4) + 2(0.4) + 22 (0. 2 g(A) = 6 A = 3 ⎩ 8 A=4 (3) By Theorem 2.2) + 8(0.8 (3) Since E[A] = 2.10. g(E[A]) = g(2) = 4.1) + 12 (0. E[M] = 4. The two quantities are different because g(A) is not of the form α A + β.4 − (1.2) + 4(0.5) = 2. Quiz 2.14.44 (4) The standard deviation is σ N = √ Var[N ] = √ 0.4 (2) (3) The variance of N is Var[N ] = E N 2 − (E [N ])2 = 2.1) = 4.4) + 2(0.1) + 1(0. the expected number of applications is 4 E [A] = a=1 a PA (a) = 1(0. However.8 = g(E[A]). the expected number of memory chips is 4 (2) E [M] = a=1 g(A)PA (a) = 4(0.5) = 1.7 (1) Using Deﬁnition 2.

3.005/0. 3. 9. From Theorem 1.19375 n = 1. 7.9 (1) From the problem statement. 5 0 otherwise (2) (3) The problem statement tells us that P[T ] = 1 − P[I ] = 3/4. 50 ⎩ 0 otherwise (4) First we ﬁnd 10 (3) (4) (5) P [N ≤ 10] = n=1 PN (n) = (0.80 (6) By Theorem 2. calculating conditional expectations is easy. 2. 8. . 50 = 0(0. . 9. 4. 3.8 n = 6. 4.19375) + n=6 n(0. 2. 5 n = 6.155)(5) + (0. 8.2(0.75) + 0. . 4. 7.8 n = 1. 3. 2. 10 ⎩ 0 otherwise ⎧ ⎨ 0.155/0. . the conditional PMF of N given N ≤ 10 is PN |N ≤10 (n) = PN (n) P[N ≤10] ⎧ ⎨ 0.005)(5) = 0.10 (the law of total probability). . 10 ⎩ 0 otherwise (5) Once we have the conditional PMF. 2. 2. . 5 = 0. we learn that the conditional PMF of N given the event I is 0. 4. . 7. .02(0. 4. .17. . E [N |N ≤ 10] = n 5 0 n ≤ 10 otherwise (7) (8) (9) n PN |N ≤10 (n) 10 (10) = n=1 n(0. .25) ⎩ 0 otherwise ⎧ ⎨ 0.75) + 0.02(0. 7. 5 = 0.15625 12 .155 n = 1. 50 PN |I (n) = (1) 0 otherwise (2) Also from the problem statement. 3.00625 n = 6.02 n = 1. we ﬁnd the PMF of N is PN (n) = PN |T (n) P [T ] + PN |I (n) P [I ] ⎧ ⎨ 0. the conditional PMF of N given the event T is PN |T (n) = 0.2 n = 1.25) n = 1.005 n = 6. 2.Quiz 2.00625) (11) (12) = 3. 5 = 0. .

m 2 . m k . m n is fairly random but as n gets 13 . .75684 (16) (17) Quiz 2. M=zeros(k.10 8 6 4 2 0 0 50 100 10 8 6 4 2 0 0 500 1000 (a) samplemean(100) (b) samplemean(1000) Figure 1: Two examples of the output of samplemean(k) (6) To ﬁnd the conditional variance. M(:. . .10 The function samplemean(k) generates and plots ﬁve m n sequences for n = 1.00625) = 12.5). K=(1:k)’. plot(K.k). end. . . The ith column M(:. . Each time samplemean(k) is called produces a random output.i) of M holds a sequence m 1 . function M=samplemean(k). for i=1:5.15625)2 = 2. .71875 The conditional variance is Var[N |N ≤ 10] = E N 2 |N ≤ 10 − (E [N |N ≤ 10])2 = 12. Examples of the function calls (a) samplemean(100) and (b) samplemean(1000) are shown in Figure 1.71875 − (3. X=duniformrv(0.19375) + 330(0. k./K. 2. . we ﬁrst ﬁnd the conditional second moment E N 2 |N ≤ 10 = n 5 n 2 PN |N ≤10 (n) 10 (13) n 2 (0.M).19375) + 2 (14) (15) = 55(0. What is observed in these ﬁgures is that for small n.i)=cumsum(X).10.00625) n=6 = n=1 n (0.

m n gets close to E[X ] = 5. the sequences always converges to E[X ]. that we generate is random. m 2 . This random convergence is analyzed in Chapter 7. 14 . Although each sequence m 1 .large. . . .

5) = 1 − (1.1 The CDF of Y is 1 FY(y) 0.5)/4 = 5/8 Quiz 3.5 0 0 2 y 4 ⎧ y<0 ⎨ 0 y/4 0 ≤ y ≤ 4 FY (y) = ⎩ 1 y>4 (1) From the CDF FY (y).2 0. λ = 1/2) PDF 0.5] = 1 − P[Y ≤ 1.5] = 1 − FY (1. We will evaluate this integral using integration by parts: ∞ −∞ f X (x) d x = 0 ∞ cxe−x/2 d x ∞ 0 (1) ∞ 0 = −2cxe−x/2 =0 + 2ce−x/2 d x (2) = −4ce−x/2 ∞ 0 = 4c (3) Thus c = 1/4 and X has the Erlang (n = 2. we can calculate the probabilities: (1) P[Y ≤ −1] = FY (−1) = 0 (2) P[Y ≤ 1] = FY (1) = 1/4 (3) P[2 < Y ≤ 3] = FY (3) − FY (2) = 3/4 − 2/4 = 1/4 (4) P[Y > 1. we use ∞ the fact that −∞ f X (x) d x = 1.Quiz Solutions – Chapter 3 Quiz 3. To ﬁnd c.1 0 0 5 x 10 15 f X (x) = (x/4)e−x/2 x ≥ 0 0 otherwise fX(x) (4) 15 .2 (1) First we will ﬁnd the constant c and then we will sketch the PDF.

we ﬁrst note X is a nonnegative random variable so that FX (x) = 0 for all x < 0.(2) To ﬁnd the CDF FX (x). FX (x) = 0 x f X (y) dy = 0 x y −y/2 e dy 4 (5) (6) (7) x x 1 y − e−y/2 dy = − e−y/2 − 2 2 0 0 x −x/2 =1− e − e−x/2 2 The complete expression for the CDF is 1 FX(x) 0.3 The PDF of Y is 3 fY(y) 2 1 0 −2 0 y 2 f Y (y) = 3y 2 /2 −1 ≤ y ≤ 1. For x ≥ 0. P [−2 ≤ X ≤ 2] = FX (2) − FX (−2) = 1 − 3e−1 . f Y (y) = f Y (−y)). (2) Note that the above calculation wasn’t really necessary because E[Y ] = 0 whenever the PDF f Y (y) is an even function (i. 0 otherwise. (3) 16 .e. (1) (1) The expected value of Y is E [Y ] = ∞ −∞ y f Y (y) dy = 1 −1 (3/2)y 3 dy = (3/8)y 4 1 −1 = 0. P [0 ≤ X ≤ 4] = FX (4) − FX (0) = 1 − 3e−2 . (4) Similarly. (9) (10) Quiz 3.5 0 0 5 x 10 15 FX (x) = 1− 0 x 2 + 1 e−x/2 x ≥ 0 otherwise (8) (3) From the CDF FX (x). (2) The second moment of Y is E Y2 = ∞ −∞ y 2 f Y (y) dy = 1 −1 (3/2)y 4 dy = (3/10)y 5 1 −1 = 3/5..

The fact that Y has twice the standard deviation of X is reﬂected in the greater spread of f Y (y).2.1 (1) The PDFs of X and Y are shown below. f X (x) = 0 otherwise. (4) The standard deviation of Y is σY = Quiz 3. √ b = 3 + 3 3. a+b =3 2 Var[X ] = (b − a)2 = 9. The only valid solution with a < b is √ a = 3 − 3 3. (1) √ Var[Y ] = √ 3/5. the peak value of the Gaussian PDF goes down. The PDF of X is f X (x) = (1/3)e−x/3 x ≥ 0. However. 0 otherwise.5 Each of the requested probabilities can be calculated using or Q(z) and Table 3.4 0. Since E[X ] = 3 and Var[X ] = 9. Quiz 3. We start with the sketches. (3) (4) The complete expression for the PDF of X is √ √ √ 1/(6 3) 3 − 3 3 ≤ x < 3 + 3 3. b) random variable. E[X ] = 1/λ and Var[X ] = 1/λ2 .6 to write E [X ] = This implies a + b = 6. it is important to remember that as the standard deviation increases. (4) (2) We know X is a uniform (a.2 fX(x) 0 −5 x ← fX(x) ← f (y) Y 0 y 5 17 .4 (1) When X is an exponential (λ) random variable. (5) (z) function and Table 3. To ﬁnd a and b.(3) The variance of Y is Var[Y ] = E Y 2 − (E [Y ])2 = 3/5. 12 (2) √ b − a = ±6 3. we apply Theorem 3. we must have λ = 1/3. fY(y) 0.

⎨ 0 FX (x) = (x + 1)/4 −1 ≤ x < 1. 2). P [−1 < Y ≤ 1] = FY (1) − FY (−1) 1 −1 = − σY σY (3) =2 1 − 1 = 0. (2) Quiz 3. P[Y > 3.(2) Since X is Gaussian (0.5 0 −2 0 x 2 ⎧ −1 ≤ x < 1. (5) Since Y is Gaussian (0. ⎩ 0 otherwise.5] = Q( 3. The resulting PDF is 0.5 0 −2 (1. 2). (4) We ﬁnd the PDF f Y (y) by taking the derivative of FY (y). ⎩ 1 x ≥ 1. (3) P[X = 1] = FX (1+ ) − FX (1− ) = 1 − 1/2 = 1/2.5 ) = Q(1.5) = 2.383.5] = Q(3.7 18 .0401. 1). (1) The following probabilities can be read directly from the CDF: (1) P[X ≤ 1] = FX (1) = 1. ⎨ 1/4 f X (x) = (1/2)δ(x − 1) x = 1.75) = 1 − 2 0.6826.5 fX(x) 0. P [−1 < X ≤ 1] = FX (1) − FX (−1) = (1) − (−1) = 2 (1) − 1 = 0. 1). since X is Gaussian (0.6 The CDF of X is 1 FX(x) 0.33 × 10−4 .75) = 0 x 2 ⎧ x < −1. (3) Since Y is Gaussian (0. P[X > 3. 2 (4) (1) (2) (4) Again. Quiz 3. (2) P[X < 1] = FX (1− ) = 1/2.

we obtain the PDF f Y (y). FX (x) = 0 for x < 0. FY (y) = P [Y ≤ y] = P [X ≤ y] = FX (y) . Also. Using the CDF FX (x). FX (x) = x−x ⎩ 1 x > 2. because Y ≤ 1.(1) Since X is always nonnegative. 19 . Thus FY (y) = 0 for y < 0. the PDF is zero.5 0 −1 0 1 y 2 3 1 y 2 3 ⎧ y < 0. 1. (3) (3) Since X is nonnegative. (5) 0.5 0 −1 Y (4) 0 As expected. FX (x) = x −∞ f X (y) dy = 0 x (1 − y/2) dy = x − x 2 /4.5 f (y) 1 0. (2) (2) The probability that Y = 1 is P [Y = 1] = P [X ≥ 1] = 1 − FX (1) = 1 − 3/4 = 1/4. Lastly.5 0 −1 X 0 1 x 2 3 ⎧ x < 0. ⎨ 0 2 /4 0 ≤ y < 1. Finally. ⎨ 0 2 /4 0 ≤ x ≤ 2. FY (y) = y−y ⎩ 1 y ≥ 1.25 f Y (y) = 1 − y/2 + (1/4)δ(y − 1) 0 ≤ y ≤ 1 0 otherwise Y (6) Quiz 3. Also. we see that the jump in FY (y) at y = 1 is exactly equal to P[Y = 1]. for 0 ≤ x ≤ 2. (1) The complete CDF of X is 1 F (x) 0. (4) By taking the derivative of FY (y). the complete expression for the CDF of Y is 1 F (y) 0.8 (1) P[Y ≤ 6] = 6 −∞ f Y (y) dy = 6 0 (1/10) dy = 0. FY (y) = 1 for all y ≥ 1. FX (x) = 1 for x ≥ 2 since its always true that x ≤ 2. Note that when y < 0 or y > 1.6 . Y is also nonnegative. for 0 < y < 1.

0+exponentialrv(1/3.9 A natural way to produce random variables with PDF f T |T >2 (t) is to generate samples of T with PDF f T (t) and then to discard those samples which fail to satisfy the condition T > 2. we can calculate the conditional expectation E [Y |Y ≤ 6] = ∞ −∞ y f Y |Y ≤6 (y) dy = 6 0 y dy = 3.1). we can calculate the conditional expectation E [Y |Y > 8] = ∞ −∞ y f Y |Y >8 (y) dy = 10 8 y dy = 9. 1/6 0 ≤ y ≤ 6. 10 (2) (4) From Deﬁnition 3. Here is a M ATLAB function that uses this method: function t=t2rv(m) i=0. the conditional PDF of Y given Y ≤ 6 is f Y |Y ≤6 (y) = (3) The probability Y > 8 is P [Y > 8] = 8 10 f Y (y) P[Y ≤6] 0 y ≤ 6. 0 otherwise. = otherwise.1).2 . 20 .(2) From Deﬁnition 3. (1) 1 dy = 0. the conditional PDF of Y given Y > 8 is f Y |Y >8 (y) = f Y (y) P[Y >8] 0 y > 8.15. if (x>2) t(i+1)=x. 0 otherwise. t=zeros(m.m) generates the vector t. i=i+1.lambda=1/3. = otherwise. then T = T + 2 has PDF f T (t) = f T |T >2 (t). end end A second method exploits the fact that if T is an exponential (λ) random variable. In this case the command t=2. while (i<m).15. 6 (4) (6) From the conditional PDF f Y |Y >8 (y). 2 (5) Quiz 3. x=exponentialrv(lambda. 1/2 8 < y ≤ 10. (3) (5) From the conditional PDF f Y |Y ≤6 (y).

G (q. g) (4) (5) = 0. we can calculate the requested probabilities by summing the PMF over those values of Q and G that correspond to the event.6 (4) The probability that G > Q is 1 3 P [G > Q] = q=0 g=q+1 PQ. Y ≤ 2] ≤ P[X ≤ −∞] = 0 since X cannot take on the value −∞. This result is given in Theorem 4.Quiz Solutions – Chapter 4 Quiz 4. 2) = P[X ≤ −∞. Quiz 4.6 (2) The probability that Q = G is P [Q = G] = PQ. 2) + PQ.Y (∞.24 + 0. Y ≤ ∞] = 1. 0) + PQ.24 + 0.12 + 0.G (0. g) (6) (7) = 0. 1) + PQ. (1) The probability that Q = 0 is P [Q = 0] = PQ.08 = 0. Y ≤ −∞] = 0 since Y cannot take on the value −∞. (1) FX.G (q.G (0. (3) FX.1 Each value of the joint CDF can be found by considering the corresponding probability.2 From the joint PMF of Q and G given in the table.Y (∞.Y (−∞.16 + 0.12 + 0.18 (3) The probability that G > 1 is 3 1 (1) (2) (3) P [G > 1] = g=2 q=0 PQ.78 21 .06 + 0. Y ≤ y] = P[Y ≤ y] = FY (y). 1) = 0.18 + 0.G (1.12 = 0.G (0. ∞) = P[X ≤ ∞.1. (2) FX. 3) = 0. −∞) = P[X ≤ ∞.G (0. (4) FX.24 + 0.Y (∞.16 + 0.18 + 0.08 = 0. 0) + PQ. y) = P[X ≤ ∞.G (0.

b) (2) For each value of b.4 0. we apply ∞ ∞ −∞ −∞ ∞ ∞ −∞ −∞ (3) f X. b) (1) For each value of h. Speciﬁcally. y) d x d y = =c cx y d x dy y 0 2 0 (1) dy 2 0 x 2 /2 1 0 (2) =c (3) = (c/2) Thus c = 1.6 0.3 Quiz 4. 2 0 0 2 1 f X. b) b = 0 b = 2 b = 4 PH (h) h = −1 0 0.4 PH.2 PB (b) 0. the marginal PMF of H is PH (h) = b=0.4 To ﬁnd the constant c. To calculate P[A].Y (x.Y (x.Quiz 4. y) d x d y (4) To integrate over A.3 By Theorem 4.3.1 0 0. y = r sin θ and d x d y = r dr dθ . this corresponds to the column sum down the table of the joint PMF. yielding 2 1 Y P [A] = 0 π/2 0 1 0 1 r 2 sin θ cos θ r dr dθ π/2 0 2 π/2 (5) (6) A 1 X = = r 3 dr ⎛ 1 0 sin θ cos θ dθ ⎞ ⎠ = 1/8 r 4 /4 ⎝ sin θ 2 (7) 0 22 . we write P [A] = A y dy = (c/4)y 2 f X. the marginal PMF of B is 1 PB (b) = h=−1 PH.B (h. Similarly.1 0. y) d x d y = 1.B (h. we convert to polar coordinates using the substitutions x = r cos θ .2 h=0 h=1 0. The easiest way to calculate these marginal PMFs is to simply sum each row and column: PH.B (h.2 0. this corresponds to calculating the row sum across the table of the joint PMF.5 0.2 0.2.1 0.Y (x.1 0 0.

10 (T =120) 0. b) l = 518. f Y (y) = = ∞ −∞ 6 1 f X.5 By Theorem 4.1 t = 24 ⎪ ⎪ ⎪ ⎪ 0. we can calculate the time T needed for the transfer.Y (x. 592. For 0 ≤ x ≤ 1.05 (T =180) 0. 400 0. writing down the PMF of T is straightforward. 800 0. For each pair of values of L and B.05 t = 18 ⎪ ⎪ ⎪ 0.20 (T =270) (3 + 6y 2 )/5 0 ≤ y ≤ 1 0 otherwise (6) From the table. the complete expression for the PDF of Y is f Y (y) = Quiz 4. y) dy (x + y 2 ) d x = 6 2 x /2 + x y 2 5 x=1 x=0 (4) 6 3 + 6y 2 = (1/2 + y 2 ) = 5 5 (5) 5 0 Since f Y (y) = 0 for y < 0 or y > 1. 000 l = 7. For 0 ≤ y ≤ 1.05 t = 180 ⎪ ⎪ ⎪ 0.20 (T =90) 0. 000 b = 14.1 t = 360 ⎪ ⎪ ⎩ 0 otherwise 23 (1) .8.20 (T =36) 0. 600 0.B (l.1 t = 120 PT (t) = ⎪ 0. f X (x) = 0. We can write these down on the table for the joint PMF of L and B as follows: PL .2 t = 36. f X (x) = 6 5 1 0 (x + y 2 ) dy = 6 x y + y 3 /3 5 y=1 y=0 6x + 2 6 = (x + 1/3) = 5 5 (2) The complete expression for the PDf of X is f X (x) = (6x + 2)/5 0 ≤ x ≤ 1 0 otherwise (3) By the same method we obtain the marginal PDF for Y .05 (T =18) 0. 400 l = 2.10 (T =360) b = 28. 90 ⎪ ⎪ ⎨ 0.00 (T =540) b = 21. the marginal PDF of X is f X (x) = ∞ −∞ f X.Quiz 4.6 (A) The time required for the transfer is T = L/B.Y (x. ⎧ ⎪ 0. y) dy (1) For x < 0 or x > 1. 776.10 (T =24) 0.2 t = 270 ⎪ ⎪ ⎪ ⎪ 0.

25 (7) (8) (1) (2) (3) .2 0. 24 t = 40 0. we observe that since 0 ≤ X ≤ 1 and 0 ≤ Y ≤ 1.(B) First. we ﬁnd the PDF is ⎧ 0 w<0 d FW (w) ⎨ f W (w) = = − ln w 0 ≤ w ≤ 1 ⎩ dw 0 w>1 Quiz 4. For 0 < w < 1.25) = 4.5) + 32 (0.5) + 3(0. Since the second moment of L is E L 2 = 12 (0.25) = 2. PL .5. Speciﬁcally. we calculate the CDF FW (w) = P[W ≤ w].15 0.T (l. The calculus is simpler if we integrate over the region X Y > w.1 0.4 PL (l) 0. As shown below.15 0.7 (A) It is helpful to ﬁrst make a table that includes the marginal PMFs.5. Y 1 w w 1 XY > w FW (w) = 1 − P [X Y > w] =1− =1− 1 1 w w/x 1 w (2) (3) (4) (5) (6) dy dx XY = w X (1 − w/x) d x = 1 − x − w ln x|x=1 x=w = 1 − (1 − w + w ln w) = w − w ln w The complete expression for the CDF is ⎧ w<0 ⎨ 0 FW (w) = w − w ln w 0 ≤ w ≤ 1 ⎩ 1 w>1 By taking the derivative of the CDF. t) l=1 l=2 l=3 PT (t) (1) The expected value of L is E [L] = 1(0.3 0.6 t = 60 0. Thus f W (0) = 0 and f W (1) = 1.25) + 2(0. integrating over the region W ≤ w is fairly complex.25) + 22 (0.1 0. the variance of L is Var [L] = E L 2 − (E [L])2 = 0.25 0. W = X Y satisﬁes 0 ≤ W ≤ 1.5 0.

(2) The expected value of T is E [T ] = 40(0.T = 0. For 0 ≤ x ≤ 1. the calculations become easier if we ﬁrst calculate the marginal PDFs f X (x) and f Y (y).Y (x. it is straightforward to calculate the various expectations. T ] = 0. f Y (y) = ∞ −∞ f X. Thus Var[T ] = E T 2 − (E [T ])2 = 2400 − 482 = 96.6) + 60(0. for 0 ≤ y ≤ 2.Y (x. 25 . (11) (B) As in the discrete case. the covariance of L and T is Cov [L . The second moment of T is E T 2 = 402 (0. y) dy = 0 2 1 x y dy = x y 2 2 y=2 = 2x y=0 (12) Similarly.60 l=1 lt PL T (lt) (7) (8) (9) (10) = 1(40)(0.4) = 2400. y) d x = 0 2 xy dx = 1 2 x y 2 x=1 = x=0 y 2 (13) The complete expressions for the marginal PDFs are f X (x) = 2x 0 ≤ x ≤ 1 0 otherwise f Y (y) = y/2 0 ≤ y ≤ 2 0 otherwise (14) From the marginal PDFs.2) + 3(60)(0. f X (x) = ∞ −∞ f X.1) + 2(60)(0.1) = 96 (4) From Theorem 4.4) = 48.15) + 1(60)(0.16(a). T ] = E [L T ] − E [L] E [T ] = 96 − 2(48) = 0 (5) Since Cov[L . the correlation coefﬁcient is ρ L . (3) The correlation is 3 (4) (5) (6) E [L T ] = t=40.15) + 2(40)(0.3) + 3(40)(0.6) + 602 (0.

dy 1 0 (20) 2 x3 x y d x. T ) = (3. PL . Y ] = 0. y) d x. Quiz 4.8 (A) Since the event V > 80 occurs only for the pairs (L .Y (x. 60) + PL . (L .9. (3) The correlation of X and Y is E [X Y ] = = ∞ ∞ −∞ −∞ 1 2 2 2 0 0 x y f X. T ) = (2. t) = 26 PL . (2) The ﬁrst and second moments of Y are E [Y ] = E Y2 4 1 2 y dy = 3 −∞ 0 2 ∞ 2 1 = y 2 f Y (y) dy = y 3 dy = 2 −∞ 0 2 y f Y (y) dy = ∞ 2 (18) (19) The variance of Y is Var[Y ] = E[Y 2 ] − (E[Y ])2 = 2 − 16/9 = 2/9.T (3.T (l. 40) and (L .(1) The ﬁrst and second moments of X are E [X ] = E X2 = ∞ −∞ ∞ −∞ x f X (x) d x = 0 1 2x 2 d x = 1 2 3 1 2 (15) (16) (17) x 2 f X (x) d x = 0 2x 3 d x = The variance of X is Var[X ] = E[X 2 ] − (E[X ])2 = 1/18.45 By Deﬁnition 4. Y ] = E [X Y ] − E [X ] E [Y ] = 2 8 − 9 3 4 3 = 0.T |A (l.Y = 0. 40) + PL . 60).t) P[A] (1) 0 lt > 80 otherwise (2) . T ) = (3. dy = 3 y3 3 = 0 8 9 (21) (4) The covariance of X and Y is Cov [X. 60). P [A] = P [V > 80] = PL . (22) (5) Since Cov[X. the correlation coefﬁcient is ρ X.T (2. 60) = 0.T (3.

y) ∈ B 0 otherwise K x y 40 ≤ y ≤ 60.T |A (l. we ﬁrst calculate the probability of the conditioning event. we ﬁrst ﬁnd the conditional second moment E V 2 |A = l t (lt)2 PL .801 8 5 2 dy The conditional PDF of X and Y is f X.Y |B (x. t) (3) (4) 1 2 1 4 = (2 · 60) + (3 · 40) + (3 · 60) = 133 9 3 9 3 For the conditional variance Var[V |A]. y) /P [B] (x.We can represent this conditional PMF in the following table: PL .Y (x. y) d x d y = = = 60 40 60 40 60 3 80/y xy dx dy 4000 x2 2 3 (8) dy (9) (10) (11) y 4000 80/y 9 3200 y − 2 y 40 4000 2 9 4 3 = − ln ≈ 0.Y (x. E [V |A] = l t lt PL . t) t = 40 t = 60 l=1 0 0 l=2 0 4/9 1/3 2/9 l=3 The conditional expectation of V can be found from the conditional PMF. 400 9 3 9 It follows that Var [V |A] = E V 2 |A − (E [V |A])2 = 622 2 9 (7) (B) For continuous random variables X and Y . y) = = f X.T |A (l. 80/y ≤ x ≤ 3 0 otherwise 27 (12) (13) . t) (5) (6) 4 1 2 = (2 · 60)2 + (3 · 40)2 + (3 · 60)2 = 18. P [B] = B f X.T |A (l.

**where K = (4000P[B])−1 . The conditional expectation of W given event B is E [W |B] = =
**

∞ ∞ −∞ −∞ 60 3 40

x y f X,Y |B (x, y) d x d y K x 2 y2 d x d y y2 x 3

x=3 x=80/y

(14) (15)

= (K /3) = (K /3)

80/y 60 40 60 40

dy

(16) (17) (18)

27y 2 − 803 /y dy

60 40

**= (K /3) 9y 3 − 803 ln y The conditional second moment of K given B is E W 2 |B = =
**

∞ ∞

≈ 120.78

−∞ −∞ 60 3 40

(x y)2 f X,Y |B (x, y) d x d y K x 3 y3 d x d y y3 x 4

x=3 x=80/y

(19) (20)

= (K /4)

80/y 60 40 60 40

dy

(21) (22) ≈ 16, 116.10 (23)

= (K /4)

81y 3 − 804 /y dy

60 40

= (K /4) (81/4)y 4 − 804 ln y It follows that the conditional variance of W given B is

Var [W |B] = E W 2 |B − (E [W |B])2 ≈ 1528.30 Quiz 4.9

(24)

(A) (1) The joint PMF of A and B can be found from the marginal and conditional PMFs via PA,B (a, b) = PB|A (b|a)PA (a). Incorporating the information from the given conditional PMFs can be confusing, however. Consequently, we can note that A has range S A = {0, 2} and B has range S B = {0, 1}. A table of the joint PMF will include all four possible combinations of A and B. The general form of the table is PA,B (a, b) b=0 b=1 a=0 PB|A (0|0)PA (0) PB|A (1|0)PA (0) PB|A (0|2)PA (2) PB|A (1|2)PA (2) a=2 28

Substituting values from PB|A (b|a) and PA (a), we have b=0 b=1 PA,B (a, b) a=0 (0.8)(0.4) (0.2)(0.4) (0.5)(0.6) (0.5)(0.6) a=2 or PA,B (a, b) b = 0 b = 1 a=0 0.32 0.08 0.3 0.3 a=2

**(2) Given the conditional PMF PB|A (b|2), it is easy to calculate the conditional expectation
**

1

E [B|A = 2] =

b=0

b PB|A (b|2) = (0)(0.5) + (1)(0.5) = 0.5

(1)

(3) From the joint PMF PA,B (a, b), we can calculate the the conditional PMF ⎧ 0.32/0.62 a = 0 PA,B (a, 0) ⎨ PA|B (a|0) = = 0.3/0.62 a = 2 (2) ⎩ PB (0) 0 otherwise ⎧ ⎨ 16/31 a = 0 = 15/31 a = 2 (3) ⎩ 0 otherwise (4) We can calculate the conditional variance Var[A|B = 0] using the conditional PMF PA|B (a|0). First we calculate the conditional expected value E [A|B = 0] =

a

a PA|B (a|0) = 0(16/31) + 2(15/31) = 30/31

(4)

**The conditional second moment is E A2 |B = 0 =
**

a

a 2 PA|B (a|0) = 02 (16/31) + 22 (15/31) = 60/31 (5)

The conditional variance is then Var[A|B = 0] = E A2 |B = 0 − (E [A|B = 0])2 = (B) (1) The joint PDF of X and Y is f X,Y (x, y) = f Y |X (y|x) f X (x) = (2) From the given conditional PDF f Y |X (y|x), f Y |X (y|1/2) = 29 8y 0 ≤ y ≤ 1/2 0 otherwise (8) 6y 0 ≤ y ≤ x, 0 ≤ x ≤ 1 0 otherwise (7) 960 961 (6)

**(3) The conditional PDF of Y given X = 1/2 is f X |Y (x|1/2) = f X,Y (x, 1/2)/ f Y (1/2). To ﬁnd f Y (1/2), we integrate the joint PDF. f Y (1/2) = Thus, for 1/2 ≤ x ≤ 1, f X |Y (x|1/2) = f X,Y (x, 1/2) 6(1/2) =2 = f Y (1/2) 3/2 (10)
**

∞ −∞

f X,1/2 ( ) d x =

1 1/2

6(1/2) d x = 3/2

(9)

(4) From the pervious part, we see that given Y = 1/2, the conditional PDF of X is uniform (1/2, 1). Thus, by the deﬁnition of the uniform (a, b) PDF, Var [X |Y = 1/2] = Quiz 4.10 (A) (1) For random variables X and Y from Example 4.1, we observe that PY (1) = 0.09 and PX (0) = 0.01. However, PX,Y (0, 1) = 0 = PX (0) PY (1) (1) (1 − 1/2)2 1 = 12 48 (11)

Since we have found a pair x, y such that PX,Y (x, y) = PX (x)PY (y), we can conclude that X and Y are dependent. Note that whenever PX,Y (x, y) = 0, independence requires that either PX (x) = 0 or PY (y) = 0. (2) For random variables Q and G from Quiz 4.2, it is not obvious whether they are independent. Unlike X and Y in part (a), there are no obvious pairs q, g that fail the independence requirement. In this case, we calculate the marginal PMFs from the table of the joint PMF PQ,G (q, g) in Quiz 4.2. PQ,G (q, g) g = 0 g = 1 g = 2 g = 3 PQ (q) q=0 0.06 0.18 0.24 0.12 0.60 0.04 0.12 0.16 0.08 0.40 q=1 PG (g) 0.10 0.30 0.40 0.20 Careful study of the table will verify that PQ,G (q, g) = PQ (q)PG (g) for every pair q, g. Hence Q and G are independent. (B) (1) Since X 1 and X 2 are independent, f X 1 ,X 2 (x1 , x2 ) = f X 1 (x1 ) f X 2 (x2 ) = (1 − x1 /2)(1 − x2 /2) 0 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 2 0 otherwise 30 (2) (3)

(2) Let FX (x) denote the CDF of both X 1 and X 2 . The CDF of Z = max(X 1 , X 2 ) is found by observing that Z ≤ z iff X 1 ≤ z and X 2 ≤ z. That is, P [Z ≤ z] = P [X 1 ≤ z, X 2 ≤ z] = P [X 1 ≤ z] P [X 2 ≤ z] = [FX (z)]2 (4) (5)

To complete the problem, we need to ﬁnd the CDF of each X i . From the PDF f X (x), the CDF is ⎧ x <0 ⎨ 0 x 2 /4 0 ≤ x ≤ 2 FX (x) = f X (y) dy = (6) x−x ⎩ −∞ 1 x >2 Thus for 0 ≤ z ≤ 2, FZ (z) = (z − z 2 /4)2 (7)

The complete expression for the CDF of Z is ⎧ z<0 ⎨ 0 2 /4)2 0 ≤ z ≤ 2 FZ (z) = (z − z ⎩ 1 z>1

(8)

Quiz 4.11 This problem just requires identifying the various terms in Deﬁnition 4.17 and Theorem 4.29. Speciﬁcally, from the problem statement, we know that ρ = 1/2, µ1 = µ X = 0, and that σ1 = σ X = 1, σ2 = σY = 1. (2) (1) Applying these facts to Deﬁnition 4.17, we have 1 2 2 e−2(x −x y+y )/3 . f X,Y (x, y) = √ 3π 2 (3) µ2 = µY = 0, (1)

(2) By Theorem 4.30, the conditional expected value and standard deviation of X given Y = y are 2 E [X |Y = y] = y/2 σ X = σ1 (1 − ρ 2 ) = 3/4. ˜ (4) When Y = y = 2, we see that E[X |Y = 2] = 1 and Var[X |Y = 2] = 3/4. The conditional PDF of X given Y = 2 is simply the Gaussian PDF 1 2 e−2(x−1) /3 . f X |Y (x|2) = √ 3π/2 (5)

31

3. 3. That is. First we observe that X has the discrete uniform (1. 4) PMF. . x=finiterv(sx. . and an independent uniform (0. 2. . x) PMF via Y = xU . . px=0. Instead. 1) random variable U . Also. This observation prompts the following program: function xy=dtrianglerv(m) sx=[1.28. Y has a discrete uniform (1.25*ones(4. x 0 otherwise (1) Given X = x.12 One straightforward method is to follow the approach of Example 4. y=ceil(x. we can generate a sample value of Y with a discrete uniform (1. 0 otherwise.m).4].px.Quiz 4. PX (x) = 1/4 x = 1.1). we use an alternate approach.1)). xy=[x’. given X = x.y’]. 4. PY |X (y|x) = 1/x y = 1.*rand(m. 32 .2. x) PMF.

(1) (2) (3) x2 x2 0 x3 x1 In particular.1 We ﬁnd P[C] by integrating the joint PDF over the region of interest.Quiz Solutions – Chapter 5 Quiz 5. we must keep in mind that f X 1 . x3 ) = ∞ −∞ ∞ −∞ ∞ −∞ f X (x) d x3 = f X (x) d x1 = f X (x) d x2 = 1 6 d x3 = 6(1 − x2 ). Y1 = X 1 .X 3 (x2 .X 2 (x1 . x2 ) = 0 unless 0 ≤ x 1 ≤ x2 ≤ 1. PY (y) = P [Y1 = y1 . (1) (2) =4 0 y2 dy2 0 y4 dy4 Quiz 5. each Yi must be a strictly positive integer.X 3 (x1 . X 2 = y2 + y1 . Speciﬁcally. y2 . 6 d x1 = 6x2 . X 3 − X 2 = y3 ] = P [X 1 = y1 .X 2 (x1 . 2.X 3 (x2 .2 By deﬁnition of A. 6 d x2 = 6(x3 − x1 ). P [C] = 0 1/2 y2 1/2 y4 dy2 0 1/2 dy1 0 dy4 0 1/2 4dy3 = 1/4. Y2 = y2 . x2 ) = f X 2 . Y3 = y3 ] = P [X 1 = y1 . y3 ∈ {1.X 3 (x1 . y3 ∈ {1. X 3 = y3 + y2 + y1 ] = (1 − p)3 p y1 +y2 +y3 (1) (2) (3) (4) By deﬁning the vector a = 1 1 1 . x3 ) = 0 unless 0 ≤ x2 ≤ x3 ≤ 1. and that f X 1 . Since 0 < X 1 < X 2 < X 3 . f X 2 . Y2 = X 2 − X 1 and Y3 = X 3 − X 2 . . y2 .3 First we note that each marginal PDF is nonzero only if any subset of the xi obeys the ordering contraints 0 ≤ x 1 ≤ x2 ≤ x3 ≤ 1. we have f X 1 . Thus. 2.}. . . x3 ) = 0 unless 0 ≤ x 1 ≤ 33 . . Within these constraints. . . the complete expression for the joint PMF of Y is PY (y) = (1 − p) p a y y1 .} 0 otherwise (5) Quiz 5. X 2 − X 1 = y2 . for y1 . x3 ) = f X 1 .

W (v.X 3 (x2 .X 2 (x1 . x3 ) = f X 1 . x3 ) = 6(1 − x2 ) 0 ≤ x1 ≤ x2 ≤ 1 0 otherwise 6x2 0 ≤ x2 ≤ x3 ≤ 1 0 otherwise 6(x3 − x1 ) 0 ≤ x1 ≤ x3 ≤ 1 0 otherwise (4) (5) (6) Now we can ﬁnd the marginal PDFs.X 2 (x1 . x3 ) d x2 = 1 x1 1 6(1 − x2 ) d x2 = 3(1 − x1 )2 6x2 d x3 = 6x2 (1 − x2 ) 2 6x2 d x2 = 3x3 (7) (8) (9) x2 x3 0 The complete expressions are f X 1 (x1 ) = f X 2 (x2 ) = f X 3 (x3 ) = 3(1 − x1 )2 0 ≤ x1 ≤ 1 0 otherwise 6x2 (1 − x2 ) 0 ≤ x2 ≤ 1 0 otherwise 2 3x3 0 ≤ x3 ≤ 1 0 otherwise (10) (11) (12) Quiz 5. f X 1 (x1 ) = f X 2 (x2 ) = f X 3 (x3 ) = ∞ −∞ ∞ −∞ ∞ −∞ f X 1 .X 3 (x2 . We can separate these constraints by creating the vectors V= The joint PDF of V and W is f V. Y2 W= Y3 . the components have dependencies as a result of the ordering constraints Y1 ≤ Y2 and Y3 ≤ Y4 . Y4 (1) 34 . The complete expressions are f X 1 .4 In the PDF f Y (y). w) = 4 0 ≤ v1 ≤ v2 ≤ 1. x2 ) d x2 = f X 2 . x2 ) = f X 2 .x3 ≤ 1. When 0 ≤ xi ≤ 1 for each xi . 0 ≤ w1 ≤ w2 ≤ 1 0 otherwise (2) Y1 . x3 ) d x3 = f X 2 .X 3 (x2 .X 3 (x1 .

1. PX i (x) = pix (1 − pi )5−x x = 0. Quiz 5. however it is simpler to just start from ﬁrst principles and observe that X 1 is the number of occurrences of L in ﬁve independent tests.W (v. w) dw1 dw2 1 w1 1 0 (3) (4) (5) 4 dw2 dw1 = Similarly. f V (v) = = 0 1 f V. conﬁrming that V and W are independent vectors.3.1) random variable. f W (w) = = 4(1 − w1 ) dw1 = 2 f V. That is. If we view each test as a trial with success probability P[L] = 0. 0.x2 . for p1 = 0. .W (v. 5 0 otherwise 35 5 x (2) . 5} ⎩ 0 otherwise We can ﬁnd the marginal PMF for each X i from the joint PMF PX (x). .W (v. . In ﬁve trials.19. . PX (x) = (1) x1 . p) = (5.1.1)x3 x1 + x2 + x3 = 5. for 0 ≤ w1 ≤ w2 ≤ 1.3.5 (A) Referring to Theorem 1. each test is a subexperiment with three possible outcomes: L.x3 (0.6 and p3 = 0. x3 ∈ {0. the vector X = X 1 X 2 X 3 indicating the number of outcomes of each subexperiment has the multinomial PMF ⎧ 5 ⎨ x1 . 0. p2 = 0. x2 . .6)x2 (0. Similarly.3) random variable. w) dv1 dv2 1 0 1 v1 (6) (7) 4 dv2 dv1 = 2 It follows that V and W have PDFs f V (v) = 2 0 ≤ v1 ≤ v2 ≤ 1 . 0. . A and R. 1.6) random variable and X 3 is a binomial (5. For 0 ≤ v1 ≤ v2 ≤ 1. w) = f V (v) f W (w). we see that X 1 is a binomial (n. .3)x1 (0. X 2 is a binomial (5. .We must verify that V and W are independent. 0 otherwise f W (w) = 2 0 ≤ w 1 ≤ w2 ≤ 1 0 otherwise (8) It is easy to verify that f V.

6 We start by ﬁnding the components E[X i ] = the marginal PDFs f X i (x) found in Quiz 5. 2) + PX (2. Thus. f 3 X 2 2 2 2 (1/8)e−(y3 −4)/2 4 ≤ y1 ≤ y2 ≤ y3 = 0 otherwise (9) (10) (6) (7) (8) Note that for other matrices A. we must use Theorem 5. we see that X 1 . 2) + PX (2. Hence. 6x 2 (1 − x) d x = 1/2. PW (3) = PX 1 (3) + PX 2 (3) + PX 3 (3) = 0. . We start with 36 . and w = 5.6)2 (0. PW (2) = PX (1. (1) (2) (3) E [X 2 ] = 0 1 E [X 3 ] = 0 1 To ﬁnd the correlation matrix R X . Quiz 5. X 2 and X 3 are not independent.1458 = (3) (4) (5) In addition. X 2 = w.1)2 + 0.32 (0.From the marginal PMFs.10 to write f Y (y) = y1 − 4 y2 − 4 y3 − 4 1 . In particular. 3x 3 d x = 3/4. PW (0) = PW (1) = 0. w = 4.1)] 2!2!1! = 0.288 PW (5) = PX 1 (5) + PX 2 (5) + PX 3 (5) = 0.6)2 (0.486 PW (4) = PX 1 (4) + PX 2 (4) + PX 3 (4) = 0. since X 1 + X 2 + X 3 = 5 and since each X i is non-negative.1)2 + 0.6 to ﬁnd the PMF of W . we can apply Theorem 5. we need to ﬁnd E[X i X j ] for all i and j. To do so. or X 3 = w occurs. the constraints on y resulting from the constraints 0 ≤ X 1 ≤ X 2 ≤ X 3 can be much more complicated. we use 3x(1 − x)2 d x = 1/4.0802 (B) Since each Yi = 2X i + 4.6)(0. 1) 5![0. the event W = w occurs if and only if one of the mutually exclusive events X 1 = w.3(0. for w = 3. Furthermore. 2. 1.32 (0.3: E [X 1 ] = 0 1 ∞ −∞ x f X i (x) d x of µ X . 2.

1/5 2/5 3/5 Vector X has covariance matrix C X = R X − E [X] E [X] ⎡ ⎤ ⎡ ⎤ 1/10 3/20 1/5 1/4 ⎣3/20 3/10 2/5⎦ − ⎣1/2⎦ = 1/5 2/5 3/5 3/4 ⎡ ⎤ ⎡ 1/10 3/20 1/5 1/16 ⎣3/20 3/10 2/5⎦ − ⎣ 1/8 = 1/5 2/5 3/5 3/16 37 (15) (16) 1/4 1/2 3/4 ⎤ ⎡ ⎤ 3 2 1 1/8 3/16 1 ⎣ 2 4 2⎦ .3. d x1 d x2 d x1 (7) (8) (9) (10) (11) (12) (13) (14) 6x1 x2 (1 − x2 ) d x2 E [X 2 X 3 ] = 0 1 = E [X 1 X 3 ] = 0 2 4 [3x2 − 3x2 ] d x2 = 2/5 1 x1 1 6x1 x3 (x3 − x1 ) d x3 d x1 . x2 ) . 1 x2 1 0 2 6x2 x3 d x3 d x2 x1 x2 f X 1 . 1/4 3/8 ⎦ = 80 1 2 3 3/8 9/16 (17) (18) . 3x 4 d x = 3/5. the cross terms are E [X 1 X 2 ] = = = 0 ∞ ∞ −∞ −∞ 1 1 0 1 x1 3 4 [x1 − 3x1 + 2x1 ] d x1 = 3/20. (4) (5) (6) 2 E X2 = 2 E X3 = 1 0 1 0 Using marginal PDFs from Quiz 5.X 2 (x1 . x3 =1 x3 =x1 = 0 1 3 2 2 (2x1 x3 − 3x1 x3 ) d x1 = 0 1 2 4 [2x1 − 3x1 + x1 ] d x1 = 1/5. X has correlation matrix ⎡ ⎤ 1/10 3/20 1/5 R X = ⎣3/20 3/10 2/5⎦ . Summarizing the results.the second moments: E 2 X1 = 0 1 3x 2 (1 − x)2 d x = 1/10. 6x 3 (1 − x) d x = 3/10.

16 tells us that Y is a 1 dimensional Gaussian vector. CT=36. Since T is a Gaussian random vector.18 that µ X = b and that C X = AA = 2 1 1 −1 2 1 5 1 = . i.16. p=phi((T-80)/sqrt(CY)).1)/31.50000000000000 0.99997155736872 0.(1:31)). The covariance matrix of Y is 1 × 1 and is just equal to Var[Y ].0000 Note that P[T ≤ 70] is not actually zero and that P[T ≤ 90] is not actually 1. Var[Y ] = ACT A .This problem shows that even for fairly simple joint PDFs.0000. Theorem 5. Quiz 5. In julytemps.m. computing the covariance matrix by calculus can be a time consuming task.0. the ﬁrst two lines generate the 31 × 31 covariance matrix CT. The ﬁnal step is to use the (·) function to calculate P[Y < T ].97792616932396 38 .5000 0. just a Gaussian random variable.9779 1.0000 1.02207383067604 Columns 5 through 6 0.8 First. Here is the long format output: >> format long >> julytemps([70 75 80 85 90 95]) ans = Columns 1 through 4 0. Next we calculate Var[Y ]. A=ones(31.0221 0. rounds off those probabilities. by Theorem 5. [D1 D2]=ndgrid((1:31).m: >> julytemps([70 75 80 85 90 95]) ans = 0. function p=julytemps(T). invoked with the command format short../(1+abs(D1-D2)).e. The expected value of Y is µY = µT = 80.0000 0.00002844263128 0. Thus. Here is the output of julytemps. Its just that the M ATLAB’s short format output. we observe that Y = AT where A = 1/31 1/31 · · · 1/31 .7 We observe that X = AZ + b where A= 2 1 .99999999922010 0. 1 −1 b= 2 . CY=(A’)*CT*A. or CT . 1 −1 1 2 (2) Quiz 5. 0 (1) It follows from Theorem 5.

⎥ ⎢ . c=36. 1 + |i − j| (1) If we write out the elements of the covariance matrix. The function julytemps2 use the toeplitz to generate the correlation matrix CT . In fact.1)/31. However. ⎢ c1 c0 CT = ⎢ . ⎥ . c30 · · · c1 c0 (2) This covariance matrix is known as a symmetric Toeplitz matrix. . c1 ⎦ . CY=(A’)*CT*A. we see that ⎡ ⎤ c0 c1 · · · c30 ./(1+abs(0:30)). j) = c|i− j| = 36 . p=phi((T-80)/sqrt(CY)). A=ones(31. in this problem. function p=julytemps2(T). . CT=toeplitz(c). jth element is CT (i.The ndgrid function is a useful to way calculate many covariance matrices. C X has a special structure. 39 . ⎥. We will see in Chapters 9 and 11 that Toeplitz covariance matrices are quite common.. . M ATLAB has a toeplitz function for generating them. ⎣ ...0. the i. . ..

Var[Wn ] = Var[K 1 ] + · · · + Var[K n ] = 1. For w > 0. by Theorem 6.25 Since E[K i ] = 2. otherwise.1 Let K 1 .5. the random variables K 1 . .2 Random variables X and Y have PDFs f X (x) = 3e−3x x ≥ 0 0 otherwise f Y (y) = 2e−2y y ≥ 0 0 otherwise (1) (6) (4) (2) (3) Since X and Y are nonnegative. .5. K n denote a sequence of iid random variables each with PMF PK (k) = 1/4 k = 1. First. a conmplete expression for the PDF of W is f W (w) = 6e−2w 1 − e−w 0 w ≥ 0. Hence.5 − (2.Quiz Solutions – Chapter 6 Quiz 6. (4) 40 . f W (w) = e−3w e y w 0 = 6 e−2w − e−3w (3) Since f W (w) = 0 for w < 0.3. the PDF of W = X + Y is f W (w) = ∞ −∞ f X (w − y) f Y (y) dy = 6 0 w e−3(w−y) e−2y dy (2) Fortunately. . 4 0 otherwise (1) We can write Wn in the form of Wn = K 1 + · · · + K n .5 Thus the variance of K i is Var[K i ] = E K i2 − (E [K i ])2 = 7. .5)2 = 1.5 E K i2 = (12 + 22 + 32 + 42 )/4 = 7. . the expected value of Wn is E [Wn ] = E [K 1 ] + · · · + E [K n ] = n E [K i ] = 2.25n Quiz 6. . the variance of the sum equals the sum of the variances. By Theorem 6. . this integral is easy to evaluate. . That is. .5n (5) Since the rolls are independent. . we note that the ﬁrst two moments of K i are E [K i ] = (1 + 2 + 3 + 4)/4 = 2. W = X + Y is nonnegative. . K n are independent. .

2(es + 2e2s + 3e3s + 4e4s ) ds Evaluating the derivative at s = 0 yields E [K ] = d φ K (s) ds = 0.8 says the MGF of J is φ J (s) = (φ K (s))m = (2) (B) Since the set of α j X j are independent Gaussian random variables. we continue to take derivatives: E K2 = E K3 E K4 d 2 φ K (s) ds 2 d 3 φ K (s) = ds 3 d 4 φ K (s) = ds 4 = 0.2)esk = 0.2(es + 4e2s + 9e3s + 16e4s ) s=0 s=0 =6 = 20 = 70. Thus to ﬁnd the PDF of W . Theorem 6.2 1 + es + e2s + e3s + e4s (1) We ﬁnd the moments by taking derivatives. The ﬁrst derivative of φ K (s) is d φ K (s) = 0.2(es + 16e2s + 81e3s + 256e4s ) s=0 s=0 Quiz 6.2(es + 8e2s + 27e3s + 64e4s ) s=0 s=0 = 0.2(1 + 2 + 3 + 4) = 2 s=0 (2) (3) To ﬁnd higher-order moments.3 The MGF of K is 4 φ K (s) = E es K == k=0 (0.4 (A) Each K i has MGF φ K (s) = E es K i = es (1 − ens ) es + e2s + · · · + ens = n n(1 − es ) ems (1 − ens )m n m (1 − es )m (1) Since the sequence of K i is independent.8 (4) (5) (6) (7) = 0. we need only ﬁnd the expected value and variance.10 says that W is a Gaussian random variable.Quiz 6. Since the expectation of the sum equals the sum of the expectations: E [W ] = α E [X 1 ] + α 2 E [X 2 ] + · · · + α n E [X n ] = 0 41 (3) . Theorem 6.

1 − 4 es 5 (1) From Theorem 6.5 (1) From Table 6.12. the variance of the sum equals the sum of the variances: Var[W ] = α 2 Var[X 1 ] + α 4 Var[X 2 ] + · · · + α 2n Var[X n ] = α 2 + 2(α 2 )2 + 3(α 2 )3 + · · · + n(α 2 )n Deﬁning q = α 2 . we see that R has the MGF of an exponential (1/5) random variable. 42 . R has MGF φ R (s) = φ N (ln φ X (s)) = Substituting the expression for φ X (s) yields φ R (s) = 1 5 1 5 1 5 φ X (s) 1 − 4 φ X (s) 5 (2) −s .6 to write Var[W ] = α 2 − α 2n+2 [1 + n(1 − α 2 )] (1 − α 2 )2 (6) (4) (5) 2 With E[W ] = 0 and σW = Var[W ]. The corresponding PDF is f R (r ) = (1/5)e−r/5 r ≥ 0 0 otherwise (4) This quiz is an example of the general result that a geometric sum of exponential random variables is an exponential random variable. (3) (2) From Table 6. 1−s φ N (s) = 1 s 5e . we can use Math Fact B.1. we can write the PDF of W as f W (w) = 1 2 2π σW e−w 2 /2σ 2 W (7) Quiz 6. each X i has MGF φ X (s) and random variable N has MGF φ N (s) where φ X (s) = 1 .1.Since the α j X j are independent.

the standard deviation of A is σ A = 12 (5) To use the central limit theorem.4013 Note that we used Table 3. we write P [A > 75] = 1 − P [A ≤ 75] 75 − E [A] A − E [A] ≤ =1− P σA σA 75 − 72 ≈1− 12 = 1 − 0. we use the central limit theorem and Table 3.6 (1) The expected access time is E [X ] = ∞ −∞ x f X (x) d x = 0 12 x d x = 6 msec 12 (1) (2) The second moment of the access time is E X2 = ∞ −∞ x 2 f X (x) d x = 0 12 x2 d x = 48 12 (2) The variance of the access time is Var[X ] = E[X 2 ] − (E[X ])2 = 48 − 36 = 12.25). (3) Using X i to denote the access time of block i.5987 = 0.1 to look up (0. Var[A] = Var[X 1 ] + · · · + Var[X 12 ] = 12 Var[X ] = 144 Hence.Quiz 6. (6) (7) (8) (9) (5) (4) (3) (6) Once again.1 to estimate P [A < 48] = P 48 − E [A] A − E [A] < σA σA 48 − 72 ≈ 12 = 1 − (2) = 1 − 0. E [A] = E [X 1 ] + · · · + E [X 12 ] = 12E [X ] = 72 msec (4) Since the X i are independent.0227 (10) (11) (12) 43 .9773 = 0. we can write A = X 1 + X 2 + · · · + X 12 Since the expectation of the sum equals the sum of the expectations.

X 3 are iid exponential (λ) random variables. (1) The expected number of voice calls out of 48 calls is E[K 48 ] = 48P[V ] = 36. we ﬁnd that W has expected value and variance E [W ] = 3/λ = 6 Var[W ] = 3/λ2 = 12 (2) (1) By the Central Limit Theorem. (1) In Theorem 6. X 2 . (2) The variance of K 48 is Var[K 48 ] = 48P [V ] (1 − P [V ]) = 48(3/4)(1/4) = 9 Thus K 48 has standard deviation σ K 48 = 3. P [W > 20] = P √ W −6 20 − 6 > √ ≈ Q(7/ 3) = 2.16666) − 1 = 0.9545 (4) Since K 48 is a discrete random variable.8 The train interarrival times X 1 .1 yields P [30 ≤ K 48 ≤ 42] ≈ Recalling that (−x) = 1 − 42 − 36 − 3 (x).5 − 36 − 3 3 = 2 (2. λ) random variable. The arrival time of the third train is W = X 1 + X 2 + X 3. we can use the De Moivre-Laplace approximation to estimate P [30 ≤ K 48 ≤ 42] ≈ 42 + 0.5 − 36 30 − 0. From Appendix A. (3) Using the ordinary central limit theorem and Table 3.66 × 10−5 √ 12 12 (3) 44 .7 Random variable K n has a binomial distribution for n trials and success probability P[V ] = 3/4.11.Quiz 6.9687 (4) (5) Quiz 6. we found that the sum of three iid exponential (λ) random variables is an Erlang (n = 3. we have (3) 30 − 36 3 = (2) − (−2) (2) (1) P [30 ≤ K 48 ≤ 42] ≈ 2 (2) − 1 = 0.

we note that the MGF of W is φW (s) = The Chernoff bound states that P [W > 20] ≤ min e−20s φ X (s) = min s≥0 s≥0 λ λ−s 3 = 1 (1 − 2s)3 e−20s (1 − 2s)3 (4) (5) To minimize h(s) = e−20s /(1 − 2s)3 . px=binomialpmf(100. the Central Limit Theorem approximation grossly underestimates the true probability. it is a valid bound. pmfplot(sw. we set the derivative of h(s) to zero: −20(1 − 2s)3 e−20s + 6e−20s (1 − 2s)2 d h(s) = =0 ds (1 − 2s)6 (6) This implies 20(1 − 2s) = 6 or s = 7/20.PW.19: %unifbinom100.0. P [W > 20] = 1 − FW (20) = e−10 1 + 10 102 + 1! 2! = 61e−10 = 0.m sx=0:100.SY]=ndgrid(sx.9 One solution to this problem is to follow the approach of Example 6.5. PW=PX.sw).0338 s=7/20 (7) (3) Theorem 3. for λ = 1/2 and w = 20.sy).py).sy). Applying s = 7/20 into the Chernoff bound yields P [W > 20] ≤ e−20s (1 − 2s)3 = (10/3)3 e−7 = 0.sx).PY]=ndgrid(px. sw=unique(SW). [SX.pw. A graph of the PMF PW (w) appears in Figure 2 With some thought.11 says that for any w > 0. py=duniformpmf(0. it should be apparent that the finitepmf function is implementing the convolution of the two PMFs. the CDF of the Erlang (λ.’\itP_W(w)’). SW=SX+SY.0028 (9) (10) Although the Chernoff bound is relatively weak in that it overestimates the probability by roughly a factor of 12. 3) random variable W satisﬁes 2 (λw)k e−λw FW (w) = 1 − (8) k! k=0 Equivalently.’\itw’.*PY. 45 . By contrast. [PX.(2) To use the Chernoff bound. Quiz 6.100. pw=finitepmf(SW.sy=0:100.

100) random variable.008 PW(w) 0.01 0. the PMF PW (w) of the independent sum of a binomial (100.002 0 0 20 40 60 80 100 w 120 140 160 180 200 Figure 2: From Quiz 6.9. 0.006 0.5) random variable and a discrete uniform (0.004 0.0. 46 .

30). By Theorem 7.3 Deﬁne the random variable W = (X − µ X )2 . Quiz 7. P [W > 75] = P [W − E [W ] > 30] ≤ P [|W − E [W ]| > 30] ≤ 225 Var [W ] 1 = = 2 900 4 30 (3) (4) E [W ] 45 3 = = 75 75 5 (2) Quiz 7. Observe that V100 (X ) = M100 (W ).1 An exponential random variable with expected value 1 also has variance 1. we need n = 100 samples. the mean square error is E (M100 (W ) − µW )2 = Observe that µ X = 0 so that W = X 2 . and Var[W ] = 3 Var[X i ] = 225. 12 Thus E[W ] = 3E[X i ] = 45.000889.1. By Theorem 7.6. Since each X i is uniform (0. (30 − 0)2 Var [X i ] = = 75.2 The arrival time of the third elevator is W = X 1 + X 2 + X 3 . µW = E X 2 Var[W ] 100 (1) = 1 −1 1 −1 x 2 f X (x) d x = 1/3 x 4 f X (x) d x = 1/5 (2) (3) E W2 = E X4 = Therefore Var[W ] = E[W 2 ] − µ2 = 1/5 − (1/3)2 = 4/45 and the mean square error is W 4/4500 = 0. P [W > 75] ≤ (2) By the Chebyshev inequality. (1) E [X i ] = 15. Thus. (1) By the Markov inequality. 47 . Mn (X ) has variance Var[Mn (X )] = 1/n.Quiz Solutions – Chapter 7 Quiz 7. Hence.

5 Following the approach of bernoullitraces. we must satisfy c n ≥ 1. The program bernoullisample. we have α ≤ 0. The interval is wide because the 0.3355 ≤ p ≤ 0. SinceE[X ] = p and Var[X ] = p(1 − p).41 0.m generates graphs the number of traces within one standard error as a function of the time. 48 (7) (6) . the 0.99 conﬁdence interval. i.m.25)(2.4645. √ This implies c n√ 2. we must have √ c n ≥ 0. Equivalently. n n Note that if M100 (X ) = 0. we require that ≥ c ≥ (0.99 conﬁdence is high. Since p(1 − p) ≤ 1/4 for all p. we require that 1. at time k.4. 4 n n The 0.65 0. implying (c n/( p(1− p))) ≥ 0. then the 0. Quiz 7.58 p(1 − p).13 which says that the interval estimate Mn (X ) − c ≤ p ≤ Mn (X ) + c (1) has conﬁdence coefﬁcient 1 − α where α =2−2 √ c n . each sample path having n = 100 Bernoulli traces.4 Assuming the number n of samples is large.645 0.9 conﬁdence interval estimate of p is 0. the number of trials in each trace.95 (3) p(1 − p) √ for every value of p.Quiz 7. we generate m = 1000 sample paths. OK(k) counts the fraction of sample paths that have sample mean within one standard error of p. we apply Theorem 7.41 Mn (X ) − √ ≤ p ≤ Mn (X ) + √ .9 or α ≤ 0. p(1 − p) (2) We must ensure for every value of p that 1 − α ≥ 0.1.01.995. Since (x) is an increasing function of x. In this case.65 p(1 − p). we can use a Gaussian approximation for Mn (X ). n n (5) (4) √ For the 0.99 conﬁdence interval estimate is 0.58)/ n.41 c≥ √ = √ .99 conﬁdence interval estimate is 0. Since p(1 − p) ≤ 1/4 for all p.645 Mn (X ) − √ ≤ p ≤ Mn (X ) + √ .e.

0. the fraction of traces within one standard error approaches 2 (1) − 1 ≈ 0.7 0. MN=cumsum(x).5000.m). stderrmat=stderr*ones(1.6 0.’-s’). x=reshape(bernoullirv(p. plot(1:n.68./nn.function OK=bernoullisample(n. stderr=sqrt(p*(1-p)).m.n.8 0. is examined in Problem 7.5 0.OK. The unusual sawtooth pattern.m).2)/m. The following graph was generated by bernoullisample(100.2.m). as m gets large.p).9 0. 49 ./sqrt((1:n)’). though perhaps unexpected. nn=(1:n)’*ones(1. OK=sum(abs(MN-p)<stderrmat.5): 1 0.m*n).4 0 10 20 30 40 50 60 70 80 90 100 As we would expect.5.

. . . if we observe X < 1. .01)1/15 = 1. . . (4) Thus if we observe at least 214. That is.6. X 15 ≤ x] = [P [X i ≤ x]]15 . This implies that for x ≥ 0. the ML hypothesis rule is k ∈ A0 if PK |H0 (k) ≥ PK |H1 (k) . let R = {X ≤ r }.2 From the problem statement. A reasonable choice is to reject the hypothesis if X is too small. From Theorem 8. 976 photons. otherwise k = 0. we must choose a rejection region for X .Quiz Solutions – Chapter 8 Quiz 8.01 It is straightforward to show that r = − ln 1 − (0. . X 2 ≤ x. For a signiﬁcance level of α = 0. otherwise (1) (2) 0 Since the two hypotheses are equally likely.33. each X i has PDF and CDF f X i (x) = e−x x ≥ 0 0 otherwise FX i (x) = 0 x <0 1 − e−x x ≥ 0 (1) Hence. . (3) k ∈ A1 otherwise. This rule simpliﬁes to 106 − 104 k ∈ A0 if k ≤ k = = 214.33 Hence. X 15 obeys FX (x) = P [X ≤ x] = P [X 1 ≤ x. 1. . ln 100 ∗ k ∈ A1 otherwise. the MAP and ML tests are the same. Quiz 8.1 From the problem statement. the CDF of the maximum of X 1 . then we accept hypothesis H1 . 975. 1. then we reject the hypothesis.01. . 50 . we obtain α = P [X ≤ r ] = (1 − e−r )15 = 0.7. · · · . FX (x) = FX i (x) 15 (2) = 1 − e−x 15 (3) To design a signiﬁcance test. the conditional PMFs of K are PK |H0 (k) = PK |H1 (k) = 104k e−10 k! 4 (4) (5) 0 106k e−10 k! 6 k = 0.

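The two thresholds above are quick to reproduce numerically. A short check (our addition, not part of the original solutions):

r = -log(1-0.01^(1/15))    % significance threshold, 1.33
alpha = (1-exp(-r))^15     % recovers the significance level 0.01
kstar = (1e6-1e4)/log(100) % photon threshold, 214,975.1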
Quiz 8.3
For the QPSK system, a symbol error occurs when s_i is transmitted but (X_1, X_2) ∈ A_j for some j ≠ i. Because of the symmetry of the signals, P[C|H_0] = P[C|H_i] for all i. This implies the probability of a correct decision is P[C] = P[C|H_0]. Given H_0, the conditional probability of a correct decision is

    P[C|H_0] = P[X_1 > 0, X_2 > 0 | H_0] = P[√(E/2) + N_1 > 0, √(E/2) + N_2 > 0].

Since N_1 and N_2 are iid Gaussian (0, σ) random variables,

    P[C] = P[C|H_0] = P[√(E/2) + N_1 > 0] P[√(E/2) + N_2 > 0]
         = (P[N_1 > -√(E/2)])^2 = (1 - Φ(-√(E/2)/σ))^2.

Since Φ(-x) = 1 - Φ(x), we have P[C] = Φ^2(√(E/(2σ^2))). Equivalently, the probability of error is

    P_ERR = 1 - P[C] = 1 - Φ^2(√(E/(2σ^2))).

Quiz 8.4
To generate the ROC, the existing program sqdistor already calculates the miss probability P_MISS = P_01 and the false alarm probability P_FA = P_10. The modified program, sqdistroc.m, is essentially the same as sqdistor except the output is a matrix FM whose columns are the false alarm and miss probabilities:

function FM=sqdistroc(v,d,m,T)
%square law distortion recvr
%P(error) for m bits tested
%transmit v volts or -v volts,
%add N volts, N is Gauss(0,1)
%add d(v+N)^2 distortion
%receive 1 if x>T, otherwise 0
%FM = [P(FA) P(MISS)]
x=(v+randn(m,1));
[XX,TT]=ndgrid(x,T(:));
P01=sum((XX+d*(XX.^2)< TT),1)/m;
x= -v+randn(m,1);
[XX,TT]=ndgrid(x,T(:));
P10=sum((XX+d*(XX.^2)>TT),1)/m;
FM=[P10(:) P01(:)];

Next, the program sqdistrocplot.m calls sqdistroc three times to generate a plot that compares the receiver performance for the three requested values of d. Here is the modified code:

function FM=sqdistrocplot(v,m,T);
FM1=sqdistroc(v,0.1,m,T);
FM2=sqdistroc(v,0.2,m,T);
FM5=sqdistroc(v,0.3,m,T);
FM=[FM1 FM2 FM5];
loglog(FM1(:,1),FM1(:,2),'-k', ...
   FM2(:,1),FM2(:,2),'--k', ...
   FM5(:,1),FM5(:,2),':k');
legend('\it d=0.1','\it d=0.2','\it d=0.3',3)
ylabel('P_{MISS}');
xlabel('P_{FA}');
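Before moving on, the Quiz 8.3 formula is easy to evaluate numerically. The sketch below is our addition, not part of the original solutions, and the ratio E/σ^2 = 10 is an assumed example value rather than a quantity from the problem statement; Φ(x) is computed with base MATLAB's erfc:

EoverS2 = 10;                        % assumed example value of E/sigma^2
PC = (0.5*erfc(-sqrt(EoverS2/2)/sqrt(2)))^2  % P[C] = Phi(sqrt(E/(2 sigma^2)))^2
PERR = 1 - PC                        % symbol error probability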

To see the effect of d, the commands

T=-3:0.1:3; sqdistrocplot(3,100000,T);

generated the plot shown in Figure 3.

[Figure 3: The receiver operating curve for the communications system of Quiz 8.4 with squared distortion; log-log plot of P_MISS versus P_FA for d = 0.1, 0.2, 0.3.]

Quiz Solutions – Chapter 9

Quiz 9.1
(1) First, we calculate the marginal PDF for 0 ≤ y ≤ 1:

    f_Y(y) = ∫_0^y 2(y+x) dx = [2xy + x^2]_{x=0}^{x=y} = 3y^2.

This implies the conditional PDF of X given Y is

    f_{X|Y}(x|y) = f_{X,Y}(x,y)/f_Y(y) = 2/(3y) + 2x/(3y^2) for 0 ≤ x ≤ y (0 otherwise).

(2) The minimum mean square error estimate of X given Y = y is

    x̂_M(y) = E[X|Y = y] = ∫_0^y x (2/(3y) + 2x/(3y^2)) dx = 5y/9.

Thus the MMSE estimator of X given Y is X̂_M(Y) = 5Y/9.

(3) To obtain the conditional PDF f_{Y|X}(y|x), we need the marginal PDF f_X(x). For 0 ≤ x ≤ 1,

    f_X(x) = ∫_x^1 2(y+x) dy = [y^2 + 2xy]_{y=x}^{y=1} = 1 + 2x - 3x^2.

For 0 ≤ x ≤ 1, the conditional PDF of Y given X is

    f_{Y|X}(y|x) = 2(y+x)/(1 + 2x - 3x^2) for x ≤ y ≤ 1 (0 otherwise).

(4) The MMSE estimate of Y given X = x is

    ŷ_M(x) = E[Y|X = x] = ∫_x^1 (2y^2 + 2xy)/(1 + 2x - 3x^2) dy
           = [2y^3/3 + xy^2]_{y=x}^{y=1} / (1 + 2x - 3x^2)
           = (2 + 3x - 5x^3)/(3 + 6x - 9x^2).

Quiz 9.2
(1) Since the expectation of the sum equals the sum of the expectations,

    E[R] = E[T] + E[X] = 0.

(2) Since T and X are independent, the variance of the sum R = T + X is

    Var[R] = Var[T] + Var[X] = 9 + 3 = 12.

(3) Since T and R have expected values E[R] = E[T] = 0,

    Cov[T, R] = E[TR] = E[T(T+X)] = E[T^2] + E[TX].

Since T and X are independent and have zero expected value, E[TX] = E[T]E[X] = 0 and E[T^2] = Var[T]. Thus Cov[T, R] = Var[T] = 9.

(4) From Definition 4.8, the correlation coefficient of T and R is

    ρ_{T,R} = Cov[T, R]/(√Var[R] √Var[T]) = σ_T/σ_R = √3/2.

(5) By Theorem 9.4, the optimum linear estimate of T given R is

    T̂_L(R) = ρ_{T,R}(σ_T/σ_R)(R - E[R]) + E[T].

Since E[R] = E[T] = 0 and ρ_{T,R} σ_T/σ_R = σ_T^2/σ_R^2 = 9/12,

    T̂_L(R) = (3/4)R.

Hence a* = 3/4 and b* = 0.

(6) By Theorem 9.4, the mean square error of the linear estimate is

    e*_L = Var[T](1 - ρ_{T,R}^2) = 9(1 - 3/4) = 9/4.

Quiz 9.3
When R = r, the conditional PDF of X = Y - 40 - 40 log_10 r is Gaussian with expected value -40 - 40 log_10 r and variance 64. The conditional PDF of X given R is

    f_{X|R}(x|r) = (1/√(128π)) e^{-(x + 40 + 40 log_10 r)^2/128}.

From the conditional PDF f_{X|R}(x|r), we can use Definition 9.2 to write the ML estimate of R given X = x as

    r̂_ML(x) = arg max_{r ≥ 0} f_{X|R}(x|r).

We observe that f_{X|R}(x|r) is maximized when the exponent (x + 40 + 40 log_10 r)^2 is minimized. This minimum occurs when the exponent is zero, yielding log_10 r = -1 - x/40, or

    r̂_ML(x) = (0.1)10^{-x/40} m.

If the result doesn't look correct, note that a typical figure for the signal strength might be x = -120 dB. This corresponds to a distance estimate of r̂_ML(-120) = 100 m.

The MAP estimate of R given X = x is the value of r that maximizes f_{X,R}(x, r). That is,

    r̂_MAP(x) = arg max_{0 ≤ r ≤ 1000} f_{X,R}(x, r).

Note that we have included the constraint r ≤ 1000 in the maximization to highlight the fact that under our probability model, R ≤ 1000 m. The joint PDF of X and R is

    f_{X,R}(x, r) = f_{X|R}(x|r) f_R(r) = (1/(10^6 √(32π))) r e^{-(x + 40 + 40 log_10 r)^2/128}.

Setting the derivative of f_{X,R}(x, r) with respect to r to zero yields

    e^{-(x + 40 + 40 log_10 r)^2/128} [1 - (80 log_10 e / 128)(x + 40 + 40 log_10 r)] = 0.

Solving for r yields

    r = 10^{1/(25 log_10 e) - 1} 10^{-x/40} = (0.1236)10^{-x/40}.
This is the MAP estimate of R given X = x as long as r ≤ 1000 m. When x ≤ -156.3 dB, the above estimate will exceed 1000 m, which is not possible in our probability model. Hence, the complete description of the MAP estimate is

    r̂_MAP(x) = 1000 for x < -156.3;    (0.1236)10^{-x/40} for x ≥ -156.3.

For example, if x = -120 dB, then r̂_MAP(-120) = 123.6 m. When the measured signal strength is not too low, the MAP estimate is 23.6% larger than the ML estimate. This reflects the fact that large values of R are a priori more probable than small values. However, for very low signal strengths, the MAP estimate takes into account that the distance can never exceed 1000 m.
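The two estimates are easy to compare numerically. A one-line check (our addition, not part of the original solution), using the example measurement x = -120 dB:

x = -120;                            % measured signal strength in dB
rml = 0.1*10^(-x/40)                 % ML estimate: 100 m
rmap = min(1000, 0.1236*10^(-x/40))  % MAP estimate: 123.6 m, capped at 1000 m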

This implies RY = E XX + E WW = RX + RW = In addition. Y2 ] . Similarly. Y2 ] = E [X 2 Y2 ] = E [X 2 (X 2 + W2 )] = E X 2 = 1 2 2 Var[Y2 ] = Var[X 2 ] + Var[W2 ] = E X 2 + E W2 = 1. Finally. E [Y2 X 2 ] E [(X 2 + W2 )X 2 ] 56 (10) 1. it follows that b∗ = 0. Var[Y2 ] b ∗ = µ X 2 − a ∗ µ Y2 . it follows that E[Y] = 0.4.Quiz 9. To apply Theorem 9. n = 2 and we wish to estimate X 2 given the observation vector Y = Y1 Y2 .4 ˆ (1) From Theorem 9. we calculate the correlation coefﬁcient ρ X 2 .Y2 = The expected square error is 2 e∗ = Var[X 2 ](1 − ρ X 2 .7. (7) (8) Because X and W are independent. (1) Because E[X] = E[Y] = 0. we need to ﬁnd RYX 2 = E [YX 2 ] = E [Y1 X 2 ] E [(X 1 + W1 )X 2 ] = . Y2 ] 1 =√ σ X 2 σY2 1. Thus we can apply Theorem 9.7.9 1 RW = 0.1 −0.9 1. E[WX ] = 0.1 (2) (3) It follows that a ∗ = 1/1.1 (4) 1 1 = = 0. we need to ﬁnd RY and RYX 2 . E[XW ] = E[X]E[W ] = 0. −0.Y2 ) = 1 − L Cov [X 2 . 0 0.9 . to compute the expected square error.7.1 (9) .1.9 .0909 1.1 11 (5) (2) Since Y = X + W and E[X] = E[W] = 0.1 0 . RY = E YY = E (X + W)(X + W ) = E XX + XW + WX + WW . Because µ X 2 = µY2 = 0.1 (6) In terms of Theorem 9. 2 Cov [X 2 . −0. Note that X and W have correlation matrices RX = 1 −0. the LMSE estimate of X 2 given Y2 is X 2 (Y2 ) = a ∗ Y2 + b∗ where a∗ = Cov [X 2 .

Since X and W are independent vectors, E[W_1 X_2] = E[W_1]E[X_2] = 0 and E[W_2 X_2] = 0. Thus

    R_{YX_2} = [E[X_1 X_2]; E[X_2^2]] = [-0.9; 1].

By Theorem 9.7,

    â = R_Y^{-1} R_{YX_2} = [-0.225; 0.725].

Therefore, the optimum linear estimator of X_2 given Y_1 and Y_2 is

    X̂_L = â'Y = -0.225 Y_1 + 0.725 Y_2.

The mean square error is

    Var[X_2] - â'R_{YX_2} = Var[X_2] - a_1 r_{Y_1,X_2} - a_2 r_{Y_2,X_2} = 0.0725.

Quiz 9.5
Since X and W have zero expected value, Y also has zero expected value. Thus, by Theorem 9.7, X̂_L(Y) = â'Y where â = R_Y^{-1} R_{YX}. Since X and W are independent, E[WX] = 0 and E[XW'] = 0'. The correlation matrix of Y is

    R_Y = E[YY'] = E[(1X + W)(1'X + W')] = 11'E[X^2] + 1E[XW'] + E[WX]1' + E[WW'] = 11' + R_W.

Note that 11' is a 20 × 20 matrix with every entry equal to 1. In addition,

    R_{YX} = E[YX] = E[(1X + W)X] = 1E[X^2] = 1.

Thus

    â = (11' + R_W)^{-1} 1,

and the optimal linear estimator is

    X̂_L(Y) = 1'(11' + R_W)^{-1} Y.

The mean square error is

    e*_L = Var[X] - â'R_{YX} = 1 - 1'(11' + R_W)^{-1} 1.    (7)

Now we note that R_W has i, jth entry R_W(i, j) = c^{|i-j|-1}. The question we must address is what value c minimizes e*_L. This problem is atypical in that one does not usually get

to choose the correlation structure of the noise. However, we will see that the answer is somewhat instructive. We note that the answer is not obviously apparent from Equation (7). In particular, we observe that Var[W_i] = R_W(i, i) = 1/c. Thus, when c is small, the noises W_i have high variance and we would expect our estimator to be poor. On the other hand, if c is large, then W_i and W_j are highly correlated and the separate measurements of X are very dependent. This would suggest that large values of c will also result in poor MSE. If this argument is not clear, consider the extreme case in which every W_i and W_j have correlation coefficient ρ_ij = 1. In this case, our 20 measurements will be all the same and one measurement is as good as 20 measurements.

To find the optimal value of c, we write a MATLAB function mquiz9(c) to calculate the MSE for a given c and a second function mquiz9minc(c) that finds and plots the MSE for a range of values of c. Note in mquiz9 that v1 corresponds to the vector 1 of all ones.

function [mse,af]=mquiz9(c);
v1=ones(20,1);
RW=toeplitz(c.^((0:19)-1));
RY=(v1*(v1')) +RW;
af=(inv(RY))*v1;
mse=1-((v1')*af);

function cmin=mquiz9minc(c);
msec=zeros(size(c));
for k=1:length(c),
   [msec(k),af]=mquiz9(c(k));
end
plot(c,msec);
xlabel('c');ylabel('e_L^*');
[msemin,optk]=min(msec);
cmin=c(optk);

The following commands find the minimum c and also produce the following graph:

>> c=0.01:0.01:0.99;
>> mquiz9minc(c)
ans =
    0.4500

[Figure: plot of e_L^* versus c for 0 < c < 1; the MSE is large near c = 0 and c = 1, with a minimum near c = 0.45.]

As we see in the graph, both small values and large values of c result in large MSE.

Quiz Solutions – Chapter 10

Quiz 10.1
There are many correct answers to this question. A correct answer specifies enough random variables to specify the sample path exactly. One choice for an alternate set of random variables that would specify m(t, s) is:

• m(0, s), the number of ongoing calls at the start of the experiment
• N, the number of new calls that arrive during the experiment
• X_1, ..., X_N, the interarrival times of the N new arrivals
• H, the number of calls that hang up during the experiment
• D_1, ..., D_H, the call completion times of the H calls that hang up

Quiz 10.2
(1) We obtain a continuous time, continuous valued process when we record the temperature as a continuous waveform over time.
(2) If at every moment in time, we round the temperature to the nearest degree, then we obtain a continuous time, discrete valued process.
(3) If we sample the process in part (a) every T seconds, then we obtain a discrete time, continuous valued process.
(4) Rounding the samples in part (c) to the nearest integer degree yields a discrete time, discrete valued process.

Quiz 10.3
(1) Each resistor has resistance R in ohms with uniform PDF

    f_R(r) = 0.01 for 950 ≤ r ≤ 1050 (0 otherwise).

The probability that a test produces a 1% resistor is

    p = P[990 ≤ R ≤ 1010] = ∫_{990}^{1010} 0.01 dr = 0.2.
In this problem, we view each resistor test as an independent trial. A success occurs on a trial with probability p if we find a 1% resistor.

(2) In t seconds, exactly t resistors are tested. Each resistor is a 1% resistor with probability p, independent of any other resistor. Consequently, the number of 1% resistors found has the binomial PMF

    P_{N(t)}(n) = C(t, n) p^n (1-p)^{t-n} for n = 0, 1, ..., t (0 otherwise).

(3) First we will find the PMF of T_1. The first 1% resistor is found at time T_1 = t if we observe failures on trials 1, ..., t-1 followed by a success on trial t. Hence, just as in Example 2.11, T_1 has the geometric PMF

    P_{T_1}(t) = (1-p)^{t-1} p for t = 1, 2, ... (0 otherwise).

Since p = 0.2, the probability the first 1% resistor is found in exactly five seconds is

    P_{T_1}(5) = (0.8)^4 (0.2) = 0.08192.

(4) From Theorem 2.5, a geometric random variable with success probability p has expected value 1/p. Hence E[T_1] = 1/p = 5.

(5) Note that once we find the first 1% resistor, the number of additional trials needed to find the second 1% resistor once again has a geometric PMF with expected value 1/p, since each independent trial is a success with probability p. That is, T_2 = T_1 + T' where T' is independent and identically distributed to T_1. Thus

    E[T_2 | T_1 = 10] = E[T_1 | T_1 = 10] + E[T' | T_1 = 10] = 10 + E[T'] = 10 + 5 = 15.

Quiz 10.4
Since each X_i is a Gaussian (0, 1) random variable, each X_i has PDF

    f_{X_i}(x) = (1/√(2π)) e^{-x^2/2}.

By Theorem 10.1, the joint PDF of X = [X_1 ··· X_n]' is

    f_X(x) = f_{X(1),...,X(n)}(x_1, ..., x_n) = ∏_{i=1}^{n} f_X(x_i) = (1/(2π)^{n/2}) e^{-(x_1^2 + ··· + x_n^2)/2}.

Quiz 10.5
The first and second hours are nonoverlapping intervals. Since one hour equals 3600 sec and the Poisson process has a rate of 10 packets/sec, the expected number of packets in each hour is E[M_i] = α = 36,000. This implies M_1 and M_2 are independent Poisson random variables each with PMF

    P_{M_i}(m) = α^m e^{-α}/m! for m = 0, 1, 2, ... (0 otherwise).

Since M_1 and M_2 are independent, the joint PMF of M_1 and M_2 is

    P_{M_1,M_2}(m_1, m_2) = P_{M_1}(m_1) P_{M_2}(m_2) = α^{m_1+m_2} e^{-2α}/(m_1! m_2!) for m_1, m_2 = 0, 1, ... (0 otherwise).

Quiz 10.6
To answer whether N(t) is a Poisson process, we look at its interarrival times. Let X_1, X_2, ... denote the interarrival times of the original Poisson process. Since N(t) counts only the even-numbered arrivals, the time until the first arrival of N(t) is Y_1 = X_1 + X_2. Since X_1 and X_2 are independent exponential (λ) random variables, Y_1 is an Erlang (n = 2, λ) random variable; see Theorem 6.11. Since Y_i, the ith interarrival time of the N(t) process, has the same PDF as Y_1, we can conclude that the interarrival times of N(t) are not exponential random variables. Thus N(t) is not a Poisson process.

Quiz 10.7
First, we note that for t > s,

    X(t) - X(s) = (W(t) - W(s))/√α.

Since W(t) - W(s) is a Gaussian random variable, Theorem 3.13 states that X(t) - X(s) is Gaussian with expected value

    E[X(t) - X(s)] = E[W(t) - W(s)]/√α = 0

and variance

    E[(X(t) - X(s))^2] = E[(W(t) - W(s))^2]/α = α(t-s)/α = t - s.

Consider s' ≤ s < t. Since s ≥ s', W(t) - W(s) is independent of W(s'). This implies [W(t) - W(s)]/√α is independent of W(s')/√α for all s' ≤ s. That is, X(t) - X(s) is independent of X(s') for all s' ≤ s. Thus X(t) is a Brownian motion process with variance Var[X(t)] = t.
Quiz 10.8
First we find the expected value

    µ_Y(t) = µ_X(t) + µ_N(t) = µ_X(t).

To find the autocorrelation, we observe that since X(t) and N(t) are independent and since N(t) has zero expected value, E[X(t)N(t')] = E[X(t)]E[N(t')] = 0. Thus

    R_Y(t, τ) = E[Y(t)Y(t+τ)]
             = E[(X(t) + N(t))(X(t+τ) + N(t+τ))]
             = E[X(t)X(t+τ)] + E[X(t)N(t+τ)] + E[X(t+τ)N(t)] + E[N(t)N(t+τ)]
             = R_X(t, τ) + R_N(t, τ).

Quiz 10.9
From Definition 10.14, X_1, X_2, ... is a stationary random sequence if for all sets of time instants n_1, ..., n_m and time offset k,

    f_{X_{n_1},...,X_{n_m}}(x_1, ..., x_m) = f_{X_{n_1+k},...,X_{n_m+k}}(x_1, ..., x_m).

Since the random sequence is iid,

    f_{X_{n_1},...,X_{n_m}}(x_1, ..., x_m) = f_X(x_1) f_X(x_2) ··· f_X(x_m).

Similarly, for time instants n_1 + k, ..., n_m + k,

    f_{X_{n_1+k},...,X_{n_m+k}}(x_1, ..., x_m) = f_X(x_1) f_X(x_2) ··· f_X(x_m).

We can conclude that the iid random sequence is stationary.

Quiz 10.10
We must check whether each function R(τ) meets the conditions of Theorem 10.12:

    R(0) ≥ 0,    R(τ) = R(-τ),    |R(τ)| ≤ R(0).

(1) R_1(τ) = e^{-|τ|} meets all three conditions and thus is valid.
(2) R_2(τ) = e^{-τ^2} also is valid.
(3) R_3(τ) = e^{-τ} cos τ is not valid because

    R_3(-2π) = e^{2π} cos 2π = e^{2π} > 1 = R_3(0).

(4) R_4(τ) = e^{-τ} sin τ also cannot be an autocorrelation function because

    R_4(π/2) = e^{-π/2} sin(π/2) = e^{-π/2} > 0 = R_4(0).
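The three conditions are also easy to test numerically on a grid. The sketch below is our addition, not part of the original solutions; the grid and tolerance are arbitrary choices, and the test only checks the symmetry and |R(τ)| ≤ R(0) conditions at the sampled points:

tau=linspace(-10,10,2001);           % symmetric grid that includes tau = 0
R={exp(-abs(tau)), exp(-tau.^2), exp(-tau).*cos(tau), exp(-tau).*sin(tau)};
for k=1:4
   Rk=R{k}; R0=Rk(tau==0);
   ok=all(abs(Rk-fliplr(Rk))<1e-12) && all(abs(Rk)<=R0+1e-12);
   fprintf('R%d valid on this grid: %d\n',k,ok);
end

As expected, the test passes only for R_1 and R_2.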

Quiz 10.11
(1) The autocorrelation of Y(t) is

    R_Y(t, τ) = E[Y(t)Y(t+τ)] = E[X(-t)X(-t-τ)] = R_X(-t - (-t-τ)) = R_X(τ).

Since E[Y(t)] = E[X(-t)] = µ_X, we can conclude that Y(t) is a wide sense stationary process. In fact, we see that by viewing a process backwards in time, we see the same second order statistics.

(2) Since X(t) and Y(t) are both wide sense stationary processes, we can check whether they are jointly wide sense stationary by seeing if R_XY(t, τ) is just a function of τ. In this case,

    R_XY(t, τ) = E[X(t)Y(t+τ)] = E[X(t)X(-t-τ)] = R_X(t - (-t-τ)) = R_X(2t + τ).

Since R_XY(t, τ) depends on both t and τ, we conclude that X(t) and Y(t) are not jointly wide sense stationary. To see why this is, suppose R_X(τ) = e^{-|τ|} so that samples of X(t) far apart in time have almost no correlation. In this case, as t gets larger, Y(t) = X(-t) and X(t) become less and less correlated.

Quiz 10.12
From the problem statement,

    E[X(t)] = E[X(t+1)] = 0,    E[X(t)X(t+1)] = 1/2,    Var[X(t)] = Var[X(t+1)] = 1.

The Gaussian random vector X = [X(t) X(t+1)]' has covariance matrix and corresponding inverse

    C_X = [1 1/2; 1/2 1],    C_X^{-1} = (4/3)[1 -1/2; -1/2 1].

Since

    x'C_X^{-1}x = (4/3)(x_0^2 - x_0 x_1 + x_1^2),

the joint PDF of X(t) and X(t+1) is the Gaussian vector PDF

    f_{X(t),X(t+1)}(x_0, x_1) = (1/((2π)^{n/2}[det(C_X)]^{1/2})) exp(-(1/2) x'C_X^{-1}x)
                              = (1/√(3π^2)) e^{-(2/3)(x_0^2 - x_0 x_1 + x_1^2)}.

Quiz 10.13
The simple structure of the switch simulation of Example 10.28 admits a deceptively simple solution in terms of the vector of arrivals A and the vector of departures D. With the introduction of call blocking, we cannot generate these vectors all at once. In particular, when an arrival occurs at time t, we need to know that M(t), the number of ongoing calls, satisfies M(t) < c = 120. Otherwise, we must block the call. Call blocking can be implemented by setting the service time of the call to zero so that the call departs as soon as it arrives.

The blocking switch is an example of a discrete event system. The system evolves via a sequence of discrete events, namely arrivals and departures, at discrete time instances. A simulation of the system moves from one time instant to the next by maintaining a chronological schedule of future events (arrivals and departures) to be executed. The program simply executes the event at the head of the schedule. The logic of such a simulation is:

1. Start at time t = 0 with an empty system. Schedule the first arrival to occur at S_1, an exponential (λ) random variable.

2. Examine the head-of-schedule event.
   • When the head-of-schedule event is the kth arrival at time t, check the state M(t).
     – If M(t) < c, admit the arrival, increase the system state n by 1, and schedule a departure to occur at time t + S_k, where S_k is an exponential (µ) random variable.
     – If M(t) = c, block the arrival and do not schedule a departure event.
   • If the head-of-schedule event is a departure, reduce the system state n by 1.

3. Delete the head-of-schedule event and go to step 2.

After the head-of-schedule event is completed and any new events (departures in this system) are scheduled, we know the system state cannot change until the next scheduled event.

[Figure 4: Sample path of 100 minutes of the blocking switch of Quiz 10.13; plot of M(t) versus t, with M(t) climbing from 0 toward roughly 100 ongoing calls.]
Thus we know that M(t) will stay the same until then. In our simulation, we use the vector t as the set of time instances at which we inspect the system state. Thus for all times t(i) between the current head-of-schedule event and the next, we set m(i) to the current switch state. The complete program is shown in Figure 5. When the program is passed a vector t, the output [m a b] is such that m(i) is the number of ongoing calls at time t(i), while a and b are the number of admits and blocks. The following instructions

t=0:0.1:5000;
[m,a,b]=simblockswitch(10,0.1,120,t);
plot(t,m);

generated a simulation lasting 5,000 minutes. A sample path of the first 100 minutes of that simulation is shown in Figure 4. The 5,000 minute full simulation produced a=49658 admitted calls and b=239 blocked calls. We can estimate the probability a call is blocked as

    P̂_b = b/(a+b) = 0.0048.

In Chapter 12, we will learn that the blocking switch is an example of an M/M/c/c queue, a kind of Markov chain, and that the exact blocking probability is given by Equation (12.93), a result known as the "Erlang-B formula." From the Erlang-B formula, we can calculate that the exact blocking probability is P_b = 0.0057. One reason our simulation underestimates the blocking probability is that in a 5,000 minute simulation, roughly the first 100 minutes are needed to load up the switch since the switch is idle when the simulation starts at time t = 0. However, this says that roughly the first two percent of the simulation time was unusual; thus this would account for only part of the disparity. The rest of the gap between 0.0048 and 0.0057 is that a simulation that includes only 239 blocks is not all that likely to give a very accurate result for the blocking probability.

In MATLAB, a simple (but not elegant) way to maintain the event schedule is to keep two vectors: time is a list of timestamps of scheduled events and event is the list of event types. In this case, event(i)=1 if the ith scheduled event is an arrival, or event(i)=-1 if the ith scheduled event is a departure. In most programming languages, it is common to implement the event schedule as a linked list where each item in the list has a data structure indicating an event timestamp and the type of the event. Chapter 12 develops techniques for analyzing and simulating systems described by Markov chains that are much simpler than the discrete event simulation technique shown here. Nevertheless, for very complicated systems, discrete event simulation is a widely-used and often very efficient simulation method.
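The Erlang-B value quoted above is easy to reproduce. The sketch below is our addition, not part of the original solutions; rather than evaluating Equation (12.93) directly, it uses the standard Erlang-B recursion B(k, ρ) = ρB(k-1, ρ)/(k + ρB(k-1, ρ)), which avoids the overflow of computing ρ^c/c! for large c. Here the offered load is ρ = λ/µ = 10/0.1 = 100 and c = 120 circuits:

rho=10/0.1;            % offered load: arrival rate / departure rate
c=120;                 % number of circuits
B=1;                   % B(0,rho) = 1
for k=1:c
   B=rho*B/(k+rho*B);  % standard Erlang-B recursion
end
B                      % approximately 0.0057, matching the text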

function [M,admits,blocks]=simblockswitch(lam,mu,c,t);
blocks=0;  %total # blocks
admits=0;  %total # admits
M=zeros(size(t));
n=0;       % # in system
time=[ exponentialrv(lam,1) ];
event=[ 1 ];  %first event is an arrival
timenow=0;
tmax=max(t);
while (timenow<tmax)
   M((timenow<=t)&(t<time(1)))=n;
   timenow=time(1);
   eventnow=event(1);
   event(1)=[ ]; time(1)= [ ];  % clear current event
   if (eventnow==1)  % arrival
      arrival=timenow+exponentialrv(lam,1);  % next arrival
      b4arrival=time<arrival;
      event=[event(b4arrival) 1 event(~b4arrival)];
      time=[time(b4arrival) arrival time(~b4arrival)];
      if n<c  %call admitted
         admits=admits+1; n=n+1;
         depart=timenow+exponentialrv(mu,1);
         b4depart=time<depart;
         event=[event(b4depart) -1 event(~b4depart)];
         time=[time(b4depart) depart time(~b4depart)];
      else
         blocks=blocks+1;  %one more block, immed departure
         disp(sprintf('Time %10.3d Admits %10d Blocks %10d',...
            timenow,admits,blocks));
      end
   elseif (eventnow==-1)  %departure
      n=n-1;
   end
end

Figure 5: Discrete event simulation of the blocking switch of Quiz 10.13.

Quiz Solutions – Chapter 11

Quiz 11.1
By Theorem 11.2,

    µ_Y = µ_X ∫_{-∞}^{∞} h(t) dt = 2 ∫_0^∞ e^{-t} dt = 2.

Since R_X(τ) = δ(τ), the autocorrelation function of the output is

    R_Y(τ) = ∫_{-∞}^{∞} h(u) ∫_{-∞}^{∞} h(v) δ(τ + u - v) dv du = ∫_{-∞}^{∞} h(u) h(τ+u) du.

For τ > 0, we have

    R_Y(τ) = ∫_0^∞ e^{-u} e^{-(τ+u)} du = e^{-τ} ∫_0^∞ e^{-2u} du = (1/2) e^{-τ}.

For τ < 0, we can deduce that R_Y(τ) = (1/2)e^{-|τ|} by symmetry. Just to be safe though, we can double check: for τ < 0,

    R_Y(τ) = ∫_{-τ}^∞ h(u)h(τ+u) du = ∫_{-τ}^∞ e^{-u} e^{-(τ+u)} du = (1/2) e^{τ}.

Hence,

    R_Y(τ) = (1/2) e^{-|τ|}.

Quiz 11.2
The expected value of the output is

    µ_Y = µ_X Σ_{n=-∞}^{∞} h_n = 0.5(1 + (-1)) = 0.

The autocorrelation of the output is

    R_Y[n] = Σ_{i=0}^{1} Σ_{j=0}^{1} h_i h_j R_X[n+i-j] = 2R_X[n] - R_X[n-1] - R_X[n+1]
           = 1 for n = 0 (0 otherwise).

Since µ_Y = 0, the variance of Y_n is Var[Y_n] = E[Y_n^2] = R_Y[0] = 1.

Quiz 11.3
Y = [Y_33 Y_34 Y_35]' is a Gaussian random vector since X_n is a Gaussian random process. Moreover, by Theorem 11.2, each Y_n has expected value

    E[Y_n] = µ_X Σ_n h_n = 0,

so E[Y] = 0. To find the PDF of the Gaussian vector Y, we need to find the covariance matrix C_Y, which equals the correlation matrix R_Y since Y has zero expected value.

One way to find R_Y is to observe that R_Y has the Toeplitz structure of Theorem 11.6 and to use Theorem 11.5 to find the autocorrelation function

    R_Y[n] = Σ_i Σ_j h_i h_j R_X[n+i-j].    (1)

Despite the fact that R_X[k] is an impulse, using Equation (1) is surprisingly tedious because we still need to sum over all i and j such that n + i - j = 0.

In this problem, it is simpler to observe that Y = HX where X = [X_30 X_31 X_32 X_33 X_34 X_35]' and

    H = (1/4)[1 1 1 1 0 0; 0 1 1 1 1 0; 0 0 1 1 1 1].

Since R_X[n] = δ_n, X has correlation matrix R_X = I, the identity matrix. Directly applying Theorem 5.13 with µ_X = 0 and A = H, we obtain

    R_Y = HR_XH' = HH'.

[Figure 6: The autocorrelation R_X(τ) and power spectral density S_X(f) for the process X(t) in Quiz 11.5: (a) W = 10, (b) W = 1000.]
Thus

    C_Y = R_Y = HH' = (1/16)[4 3 2; 3 4 3; 2 3 4].

It follows (very quickly if you use MATLAB for 3 × 3 matrix inversion) that

    C_Y^{-1} = 16 [7/12 -1/2 1/12; -1/2 1 -1/2; 1/12 -1/2 7/12].

Thus, the PDF of Y is

    f_Y(y) = (1/((2π)^{3/2}[det(C_Y)]^{1/2})) exp(-(1/2) y'C_Y^{-1}y).

A disagreeable amount of algebra will show det(C_Y) = 3/1024 and that the PDF can be "simplified" to

    f_Y(y) = (16/√(6π^3)) exp(-8[(7/12)y_33^2 + y_34^2 + (7/12)y_35^2 - y_33 y_34 + (1/6)y_33 y_35 - y_34 y_35]).

This shows that one of the nicest features of the multivariate Gaussian distribution is that y'C_Y^{-1}y is a very concise representation of the cross-terms in the exponent of f_Y(y).

Quiz 11.4
This quiz is solved using Theorem 11.9 for the case of k = 1 and M = 2. In this case, X_n = [X_{n-1} X_n]' and

    R_{X_n} = [R_X[0] R_X[1]; R_X[1] R_X[0]] = [1.1 0.9; 0.9 1.1],
    R_{X_n X_{n+1}} = E[[X_{n-1}; X_n] X_{n+1}] = [R_X[2]; R_X[1]] = [0.81; 0.9].

The MMSE linear first order filter for predicting X_{n+1} at time n is the filter h such that

    ←h = R_{X_n}^{-1} R_{X_n X_{n+1}} = (1/400)[81; 261].

It follows that the filter is h = [261/400 81/400]' and the MMSE linear predictor is

    X̂_{n+1} = (81/400) X_{n-1} + (261/400) X_n.

To find the mean square error, one approach is to follow the method of Example 11.13 and to directly calculate

    e*_L = E[(X_{n+1} - X̂_{n+1})^2].
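As a quick numeric check of the filter (our addition, not part of the original solution; it anticipates the error expression e*_L = R_X[0] - ←h'R_{X_n X_{n+1}} derived below):

RXn=[1.1 0.9; 0.9 1.1];
RXnX=[0.81; 0.9];
h=RXn\RXnX            % [81/400; 261/400] = [0.2025; 0.6525]
e=1.1-h'*RXnX         % mean square error, 0.3487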

This method is workable for this simple problem but becomes increasingly tedious for higher order filters. Instead, we can derive the mean square error for an arbitrary prediction filter ←h. Since X̂_{n+1} = ←h'X_n,

    e*_L = E[(X_{n+1} - ←h'X_n)^2] = E[(X_{n+1} - ←h'X_n)(X_{n+1} - X_n'←h)].

After a bit of algebra, we obtain

    e*_L = R_X[0] - 2←h'R_{X_n X_{n+1}} + ←h'R_{X_n}←h.

With the substitution ←h = R_{X_n}^{-1} R_{X_n X_{n+1}}, we obtain

    e*_L = R_X[0] - R_{X_n X_{n+1}}'R_{X_n}^{-1}R_{X_n X_{n+1}} = R_X[0] - ←h'R_{X_n X_{n+1}}.

Note that this is essentially the same result as Theorem 9.7 with Y = X_n, X = X_{n+1} and â = ←h. It is noteworthy that the result is derived in a much simpler way in the proof of Theorem 9.7 by using the orthogonality property of the LMSE estimator. In any case, the mean square error is

    e*_L = R_X[0] - ←h'R_{X_n X_{n+1}} = 1.1 - 300.51/400 = 0.3487.

Recalling that the blind estimate would yield a mean square error of Var[X] = 1.1, we see that observing X_{n-1} and X_n improves the accuracy of our prediction of X_{n+1}.

Quiz 11.5
(1) By Theorem 11.13(b), the average power of X(t) is

    E[X^2(t)] = ∫_{-∞}^{∞} S_X(f) df = ∫_{-W}^{W} (5/W) df = 10 Watts.

(2) The autocorrelation function is the inverse Fourier transform of S_X(f). Consulting Table 11.1, we note that

    S_X(f) = 10 (1/(2W)) rect(f/(2W)).

It follows that the inverse transform of S_X(f) is

    R_X(τ) = 10 sinc(2Wτ) = 10 sin(2πWτ)/(2πWτ).

(3) For W = 10 Hz and W = 1 kHz, graphs of S_X(f) and R_X(τ) appear in Figure 6.

Quiz 11.6
In a sampled system, the discrete time impulse δ[n] has a flat discrete Fourier transform. That is, if R_X[n] = 10δ[n], then

    S_X(φ) = Σ_{n=-∞}^{∞} 10δ[n] e^{-j2πφn} = 10.

Thus, if S_X(φ) = 10, then R_X[n] = 10δ[n]. (This quiz is really lame!)

Quiz 11.7
Since Y(t) = X(t - t_0),

    R_XY(t, τ) = E[X(t)Y(t+τ)] = E[X(t)X(t+τ-t_0)] = R_X(τ - t_0).

We see that R_XY(t, τ) = R_XY(τ) = R_X(τ - t_0). From Table 11.1, we recall the property that g(τ - τ_0) has Fourier transform G(f)e^{-j2πfτ_0}. Thus the Fourier transform of R_XY(τ) = R_X(τ - t_0) is

    S_XY(f) = S_X(f) e^{-j2πf t_0}.

Quiz 11.8
We solve this quiz using Theorem 11.17. First we need some preliminary facts. Let a_0 = 5,000 so that

    R_X(τ) = e^{-a_0|τ|} = (1/a_0) a_0 e^{-a_0|τ|}.

Consulting the Fourier transforms in Table 11.1, we see that

    S_X(f) = (1/a_0)(2a_0^2/(a_0^2 + (2πf)^2)) = 2a_0/(a_0^2 + (2πf)^2).

The RC filter has impulse response h(t) = a_1 e^{-a_1 t}u(t), where u(t) is the unit step function and a_1 = 1/RC, where RC = 10^{-4} is the filter time constant. From Table 11.1,

    H(f) = a_1/(a_1 + j2πf).

(1) By Theorem 11.17,

    S_XY(f) = H(f)S_X(f) = (a_1/(a_1 + j2πf))(2a_0/(a_0^2 + (2πf)^2)).

(2) Again by Theorem 11.17, S_Y(f) = H*(f)S_XY(f) = |H(f)|^2 S_X(f). Note that

    |H(f)|^2 = H(f)H*(f) = (a_1/(a_1 + j2πf))(a_1/(a_1 - j2πf)) = a_1^2/(a_1^2 + (2πf)^2).

Thus

    S_Y(f) = |H(f)|^2 S_X(f) = (a_1^2/(a_1^2 + (2πf)^2))(2a_0/(a_0^2 + (2πf)^2)).

(3) To find the average power at the filter output, we can either use basic calculus and calculate ∫_{-∞}^{∞} S_Y(f) df directly, or we can find R_Y(τ) as an inverse transform of S_Y(f). Using partial fractions and the Fourier transform table, the latter method is actually less algebra. In particular, some algebra will show that

    S_Y(f) = K_0/(a_0^2 + (2πf)^2) + K_1/(a_1^2 + (2πf)^2),

where

    K_0 = 2a_0 a_1^2/(a_1^2 - a_0^2),    K_1 = -2a_0 a_1^2/(a_1^2 - a_0^2).

Consulting with Table 11.1, we see that

    R_Y(τ) = (K_0/(2a_0)) e^{-a_0|τ|} + (K_1/(2a_1)) e^{-a_1|τ|}.

Substituting the values of K_0 and K_1, we obtain

    R_Y(τ) = (a_1^2 e^{-a_0|τ|} - a_0 a_1 e^{-a_1|τ|})/(a_1^2 - a_0^2).

The average power of the Y(t) process is

    R_Y(0) = a_1/(a_1 + a_0) = 2/3.

Note that the input signal has average power R_X(0) = 1. Since the RC filter has a 3 dB bandwidth of 10,000 rad/sec and the signal X(t) has most of its signal energy below 5,000 rad/sec, the output signal has almost as much power as the input.
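The output power is easy to confirm by integrating S_Y(f) numerically. The following check is our addition, not part of the original solution; the frequency grid is an arbitrary choice, wide enough that the neglected tails are negligible:

a0=5000; a1=1e4;
f=linspace(-2e5,2e5,400001);
SY=(a1^2./(a1^2+(2*pi*f).^2)).*(2*a0./(a0^2+(2*pi*f).^2));
trapz(f,SY)    % approximately 2/3, matching R_Y(0)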
Quiz 11.9
This quiz implements an example of Equations (11.146) and (11.147) for a system in which we filter Y(t) = X(t) + N(t) to produce an optimal linear estimate of X(t). The solution to this quiz is just to find the filter Ĥ(f) using Equation (11.146) and to calculate the mean square error e*_L using Equation (11.147).

Comment: Since the text omitted the derivations of Equations (11.146) and (11.147), we note that Example 10.24 showed that R_Y(τ) = R_X(τ) + R_N(τ) and that R_YX(τ) = R_X(τ). Taking Fourier transforms, it follows that S_Y(f) = S_X(f) + S_N(f) and S_YX(f) = S_X(f). Now, at peace with the derivations, we can go on to the quiz.

(1) Since µ_N = 0, R_N(0) = Var[N] = 1. This implies

    R_N(0) = ∫_{-∞}^{∞} S_N(f) df = ∫_{-B}^{B} N_0 df = 2N_0 B.

Thus N_0 = 1/(2B). Because the noise process N(t) has constant power R_N(0) = 1, decreasing the single-sided bandwidth B increases the power spectral density of the noise over frequencies |f| < B.

(2) Since R_X(τ) = sinc(2Wτ), where W = 5,000 Hz, we see from Table 11.1 that

    S_X(f) = (1/10^4) rect(f/10^4).

The noise power spectral density can be written as

    S_N(f) = N_0 rect(f/(2B)) = (1/(2B)) rect(f/(2B)).

From Equation (11.146), the optimal filter is

    Ĥ(f) = S_X(f)/(S_X(f) + S_N(f))
         = [(1/10^4) rect(f/10^4)] / [(1/10^4) rect(f/10^4) + (1/(2B)) rect(f/(2B))].

(3) We produce the output X̂(t) by passing the noisy signal Y(t) through the filter Ĥ(f). From Equation (11.147), the mean square error of the estimate is

    e*_L = ∫_{-∞}^{∞} S_X(f)S_N(f)/(S_X(f) + S_N(f)) df.

To evaluate the MSE e*_L, we need to know whether B ≤ W. Since the problem asks us to find the largest possible B, let's suppose B ≤ W; we can go back and consider the case B > W later. When B ≤ W, the MSE is

    e*_L = ∫_{-B}^{B} [(1/10^4)(1/(2B))] / [(1/10^4) + (1/(2B))] df
         = (1/10^4)/[(1/10^4) + (1/(2B))] = 1/(1 + 5,000/B).

In this case, to obtain MSE e*_L ≤ 0.05 requires B ≤ 5,000/19 = 263.16 Hz.

Although this completes the solution to the quiz, what is happening may not be obvious. The noise power is always Var[N] = 1 Watt, but it is concentrated in a bandwidth B that is decreasing. Thus as B shrinks, the PSD S_N(f) becomes increasingly tall, and the filter Ĥ(f) makes an increasingly deep and narrow notch at frequencies |f| ≤ B. Two examples of the filter Ĥ(f) are shown in Figure 7. As B is decreased, the filter suppresses less of the signal of X(t). The result is that the MSE goes down.

Finally, we note that we can choose B very large and also achieve MSE e*_L = 0.05. In particular, when B > W = 5,000, S_N(f) = 1/(2B) over all frequencies |f| < W. In this case, the Wiener filter Ĥ(f) is an ideal (flat) lowpass filter

    Ĥ(f) = [1/10^4] / [1/10^4 + 1/(2B)] for |f| < 5,000 (0 otherwise).

The mean square error is

    e*_L = ∫_{-5000}^{5000} [(1/10^4)(1/(2B))] / [(1/10^4) + (1/(2B))] df
         = (1/(2B))/[(1/10^4) + (1/(2B))] = 1/(B/5,000 + 1).

In this case, B ≥ 9.5 × 10^4 guarantees e*_L ≤ 0.05. The Wiener filter removes the noise that is outside the band of the desired signal; thus increasing B spreads the constant 1 watt of power of N(t) over more bandwidth.

Quiz 11.10
It is fairly straightforward to find S_X(φ) and S_Y(φ). The only thing to keep in mind is to use fftc to transform the autocorrelation R_X[k] into the power spectral density S_X(φ). The following MATLAB program generates and plots the functions shown in Figure 8.

[Figure 7: Wiener filter Ĥ(f) for Quiz 11.9, plotted for B = 500 and B = 2500.]

%mquiz11.m
N=32;
rx=[2 4 2]; SX=fftc(rx,N);           %autocorrelation and PSD
stem(0:N-1,abs(SX));
xlabel('n'); ylabel('S_X(n/N)');
h2=0.5*[1 1]; H2=fft(h2,N);          %impulse/filter response: M=2
SY2=SX.*((abs(H2)).^2);              %PSD of Y for M=2
figure; stem(0:N-1,abs(SY2));
xlabel('n'); ylabel('S_{Y_2}(n/N)');
h10=0.1*ones(1,10); H10=fft(h10,N);  %impulse/filter response: M=10
SY10=SX.*((abs(H10)).^2);            %PSD of Y for M=10
figure; stem(0:N-1,abs(SY10));
xlabel('n'); ylabel('S_{Y_{10}}(n/N)');

Relative to M = 2, when M = 10 the filter H(φ) filters out almost all of the high frequency components of X(t). In the context of Example 11.26, the low pass moving average filter for M = 10 removes the high frequency components and results in a filter output that varies very slowly.

As an aside, note that the vectors SX, SY2 and SY10 in mquiz11 should all be real-valued vectors. However, the finite numerical precision of MATLAB results in tiny imaginary parts. Although these imaginary parts have no computational significance, they tend to confuse the stem function. Hence, we generate stem plots of the magnitude of each power spectral density.

[Figure 8: For Quiz 11.10, stem plots of S_X(n/N), S_{Y_2}(n/N) for M = 2, and S_{Y_10}(n/N) for M = 10, using an N = 32 point DFT.]

Quiz Solutions – Chapter 12

Quiz 12.1
The system has two states depending on whether the previous packet was received in error. From the problem statement, we are given the conditional probabilities

    P[X_{n+1} = 0 | X_n = 0] = 0.99,    P[X_{n+1} = 1 | X_n = 1] = 0.9.

Since each X_n must be either 0 or 1, we can conclude that

    P[X_{n+1} = 1 | X_n = 0] = 0.01,    P[X_{n+1} = 0 | X_n = 1] = 0.1.

These conditional probabilities correspond to the two-state Markov chain with transition matrix

    P = [0.99 0.01; 0.1 0.9].

Quiz 12.2
From the problem statement, we can write down the Markov chain and its transition matrix P. The eigenvalues of P are

    λ_1 = 0,    λ_2 = 0.4,    λ_3 = 1.

We can diagonalize P as

    P = S^{-1}DS,    D = diag(λ_1, λ_2, λ_3),

where s_i, the ith row of S, is the left eigenvector of P satisfying s_iP = λ_i s_i. Algebra will verify that the n-step transition matrix is

    P^n = S^{-1}D^nS = [0.6 0.2 0.2; 0.6 0.2 0.2; 0.6 0.2 0.2] + (0.4)^n B,

where the first term is the limiting matrix whose identical rows are the stationary probabilities, and B is the matrix formed from the λ_2 = 0.4 eigenvectors. (The λ_1 = 0 eigenvalue contributes nothing for n ≥ 1.)

Quiz 12.3
The Markov chain describing the factory status and the corresponding state transition matrix are

    P = [0.9 0.1 0; 0 0 1; 1 0 0].

With π = [π_0 π_1 π_2], the system of equations π = πP yields π_1 = 0.1π_0 and π_2 = π_1. This implies

    π_0 + π_1 + π_2 = π_0(1 + 0.1 + 0.1) = 1.

It follows that the limiting state probabilities are

    π_0 = 5/6,    π_1 = 1/12,    π_2 = 1/12.

Quiz 12.4
The communicating classes are

    C_1 = {0, 1},    C_2 = {2, 3},    C_3 = {4, 5, 6}.

Once the system enters a state in C_1, the class C_1 is never left; thus the states in C_1 are recurrent. Once the system exits C_2, the states in C_2 are never reentered; thus the states in C_2 are transient. Similarly, the states in C_3 are recurrent. The states in C_1 and C_3 are aperiodic, while the states in C_2 have period 2.

Quiz 12.5
At any time t, the state n can take on the values 0, 1, 2, .... The state transition probabilities are

    P_{n-1,0} = P[K = n | K > n-1] = P_K(n)/P[K > n-1],
    P_{n-1,n} = P[K > n | K > n-1] = P[K > n]/P[K > n-1].

The Markov chain resembles

    [chain diagram: states 0, 1, 2, 3, 4, ...; from state n-1 the chain either advances to state n or returns to state 0, with the probabilities P[K = n | K > n-1] labeled P[K=1], P[K=2], P[K=3], P[K=4], P[K=5], ... on the return transitions]
The system state is the time until the counter expires. When the counter expires, the system is in state 0, and we randomly reset the counter to a new value K = k and then count down k units of time; that is, we have k - 1 units of time left after the state 0 counter reset. Since we spend one unit of time in each state, including state 0, the stationary probabilities satisfy

    π_0 = π_0 P[K = 1] + π_1,
    π_{k-1} = π_0 P[K = k] + π_k,    k = 1, 2, ....    (6)

From Equation (6) with k = 1, we obtain

    π_1 = π_0(1 - P[K = 1]) = π_0 P[K > 1].

Similarly, Equation (6) with k = 2 implies

    π_2 = π_1 - π_0 P[K = 2] = π_0(P[K > 1] - P[K = 2]) = π_0 P[K > 2].

This suggests that π_k = π_0 P[K > k]. We verify this pattern by showing that π_k = π_0 P[K > k] satisfies Equation (6):

    π_0 P[K > k-1] = π_0 P[K = k] + π_0 P[K > k].

When we apply Σ_{k=0}^{∞} π_k = 1, we recall from Problem 2.5.11 that Σ_{k=0}^{∞} P[K > k] = E[K], and we obtain π_0 E[K] = 1. This implies

    π_n = P[K > n]/E[K].    (10)

This Markov chain models repeated random countdowns. If we have a random variable W such that the PMF of W satisfies P_W(n) = π_n, then W has a discrete PMF representing the remaining time of the counter at a time in the distant future.

Quiz 12.6
(1) By inspection, the number of transitions needed to return to state 0 is always a multiple of 2. Thus the period of state 0 is d = 2.

(2) To find the stationary probabilities, we solve the system of equations π = πP and Σ_{i=0}^{3} π_i = 1:

    π_0 = (3/4)π_1 + (1/4)π_3,
    π_1 = (1/4)π_0 + (1/4)π_2,
    π_2 = (1/4)π_1 + (3/4)π_3,
    1 = π_0 + π_1 + π_2 + π_3.

Solving the second and third equations for π_2 and π_3 yields

    π_2 = 4π_1 - π_0,    π_3 = (4/3)π_2 - (1/3)π_1 = 5π_1 - (4/3)π_0.

Substituting π_3 back into the first equation yields

    π_0 = (3/4)π_1 + (1/4)π_3 = (3/4)π_1 + (5/4)π_1 - (1/3)π_0.

This implies π_1 = (2/3)π_0. It follows from the first and second equations that π_2 = (5/3)π_0 and π_3 = 2π_0. Lastly, we choose π_0 so the state probabilities sum to 1:

    1 = π_0 + π_1 + π_2 + π_3 = π_0(1 + 2/3 + 5/3 + 2) = (16/3)π_0.

It follows that the state probabilities are

    π_0 = 3/16,    π_1 = 2/16,    π_2 = 5/16,    π_3 = 6/16.

(3) Since the system starts in state 0 at time 0, we can use Theorem 12.14 to find the limiting probability that the system is in state 0 at time nd:

    lim_{n→∞} P_{00}(nd) = dπ_0 = 3/8.

Quiz 12.7
The Markov chain has the same structure as that in Example 12.22. The only difference is the modified transition rates:

    [chain diagram: states 0, 1, 2, 3, 4, ...; the forward transition probabilities are 1, (1/2)^α, (2/3)^α, (3/4)^α, (4/5)^α, ..., with the complementary probabilities returning to state 0]

The event T_00 > n occurs if the system reaches state n before returning to state 0, which occurs with probability

    P[T_00 > n] = 1 × (1/2)^α × (2/3)^α × ··· × ((n-1)/n)^α = (1/n)^α.

Thus the CDF of T_00 satisfies

    F_{T_00}(n) = 1 - P[T_00 > n] = 1 - 1/n^α.

To determine whether state 0 is recurrent, we observe that for all α > 0,

    P[V_00] = lim_{n→∞} F_{T_00}(n) = lim_{n→∞} (1 - 1/n^α) = 1.

Thus state 0 is recurrent for all α > 0. Since the chain has only one communicating class, all states are recurrent. (We also note that if α = 0, then all states are transient.)

To determine whether the chain is null recurrent or positive recurrent, we need to calculate E[T_00]. In Example 12.24, we did this by deriving the PMF P_{T_00}(n). In this problem, it will be simpler to use the result of Problem 2.5.11, which says that Σ_{k=0}^{∞} P[K > k] = E[K] for any non-negative integer-valued random variable K. Applying this result, the expected time to return to state 0 is

    E[T_00] = Σ_{n=0}^{∞} P[T_00 > n] = 1 + Σ_{n=1}^{∞} 1/n^α.

For 0 < α ≤ 1, 1/n^α ≥ 1/n and it follows that

    E[T_00] ≥ 1 + Σ_{n=1}^{∞} 1/n = ∞.

We conclude that the Markov chain is null recurrent for 0 < α ≤ 1. On the other hand, for α > 1,

    E[T_00] = 2 + Σ_{n=2}^{∞} 1/n^α.

Note that for all n ≥ 2,

    1/n^α ≤ ∫_{n-1}^{n} dx/x^α.

This implies

    E[T_00] ≤ 2 + ∫_{1}^{∞} dx/x^α = 2 + [x^{-α+1}/(-α+1)]_{1}^{∞} = 2 + 1/(α-1) < ∞.

Thus for all α > 1, the Markov chain is positive recurrent.

Quiz 12.8
The number of customers in the "friendly" store is given by the Markov chain

    [chain diagram: states 0, 1, ..., i, i+1, ...; from each state, a transition up occurs with probability p and a transition down with probability (1-p)q, with the remaining probability (1-p)(1-q) staying in place]

In the above chain, we note that (1-p)q is the probability that no new customer arrives and an existing customer gets one unit of service and then departs the store. By applying Theorem 12.13 with the state space partitioned between S = {0, 1, ..., i} and S' = {i+1, i+2, ...}, we see that for any state i ≥ 0,

    π_i p = π_{i+1}(1-p)q.

This implies

    π_{i+1} = (p/((1-p)q)) π_i.

Since this equation holds for i = 0, 1, 2, ..., we have that π_i = π_0 α^i where

    α = p/((1-p)q).

Requiring the state probabilities to sum to 1, we have that for α < 1,

    Σ_{i=0}^{∞} π_i = π_0 Σ_{i=0}^{∞} α^i = π_0/(1-α) = 1.

Thus for α < 1, the limiting state probabilities are

    π_i = (1-α)α^i,    i = 0, 1, 2, ....

In addition, for α ≥ 1 or, equivalently, p ≥ q/(1+q), the limiting state probabilities do not exist.

Quiz 12.9
The continuous time Markov chain describing the processor is

    [chain diagram: states 0, 1, 2, 3, 4; new tasks arrive at rate 2 (transitions n → n+1); tasks complete at rate 3 (transitions n → n-1); from states 2, 3, 4 the processor also reboots to state 0 at rate 0.1]

Note that q_10 = 3.1 since the task completes at rate 3 per msec and the processor reboots at rate 0.1 per msec, and the rate to state 0 is the sum of those two rates. From the Markov chain, we obtain the following useful equations for the stationary distribution:

    3.1 p_4 = 2 p_3,
    5.1 p_3 = 2 p_2 + 3 p_4,
    5.1 p_2 = 2 p_1 + 3 p_3,
    5.1 p_1 = 2 p_0 + 3 p_2.

We can solve these equations by working backward, solving for p_4 in terms of p_3, p_3 in terms of p_2, and so on, yielding

    p_4 = (20/31) p_3,    p_3 = (620/981) p_2,    p_2 = (19620/31431) p_1,    p_1 = (628,620/1,014,381) p_0.
(1) It is straightforward to show that this implies pn = The requirement that ∞ n=0 p0 ρ n /n! n = 1.4151 p1 = 0. . . .1015 p4 = 0. . 381/2. 443. c + 2. . . . . 2.2573 p2 = 0. 014. 2. . 401 and the stationary probabilities are p0 = 0. the stationary probabilities must satisfy pn = (ρ/n) pn−1 n = 1. . pn = 1 yields c (2) p0 = n=0 ρ c ρ/c ρ /n! + c! 1 − ρ/c n −1 (3) 83 . c + 2.Applying p0 + p1 + p2 + p3 + p4 = 1 yields p0 = 1. c (ρ/c) pn−1 n = c + 1.1606 p3 = 0. . .0655 Quiz 12. . . c n−c c p0 (ρ/c) ρ /c! n = c + 1.10 The M/M/c/∞ queue has Markov chain λ λ λ λ λ (2) 0 µ 1 2µ cµ c cµ c+1 cµ From the Markov chain.
