
MARKOV CHAINS

Davood Pour Yousefian Barfeh


davoodpb@gmail.com

Markov Chains
1. What a Markov chain means
2. Relation between Markov chains and Automata
3. Types of Markov chains
4. Applications of Markov chains
5. My idea

Markov Chains - what a Markov chain means


history

Markov chains were introduced by Andrei Andreevich Markov (1856-1922). He was a talented undergraduate who received a gold medal for his undergraduate thesis at St. Petersburg University.
First paper, in 1906: he proved that for a Markov chain with positive transition probabilities and numerical states, the average of the outcomes converges to the expected value of the limiting distribution (the fixed vector).
Second paper: he proved the central limit theorem for such chains.
In a paper written in 1913, he chose a sequence of 20,000 letters from Pushkin's Eugene Onegin to see whether this sequence could be approximately considered a simple chain. He found about 43.2 percent vowels and 56.8 percent consonants, which agreed with the actual counts.

Markov Chains - what a Markov chain means

characteristics

A Markov chain describes a system whose states change over time:
- the changes are governed by a probability distribution
- the next state depends only on the current state
- the paths that led to the current state are not relevant

Markov Chains - what a Markov chain means

specification

A sequence of random variables {X0, X1, ..., Xn}: Xi describes the state at time i
A finite set of states S = {1, 2, ..., n} for some finite n
- the value of each Xi is taken from S
An initial distribution π0, with π0(i) = P{X0 = i}: the probability that the Markov chain starts in state i
The probability transition rules: a probability matrix

Markov Chains - what a Markov chain means

The probability matrix P = (Pij) specifies the transition rules:
If the size of S is N, P is an N x N stochastic matrix
- each entry is non-negative
- the sum of each row is 1

Pij is a conditional probability: the probability that the chain jumps to state j at time n+1, given that it is in state i at time n:
Pij = P(Xn+1 = j | Xn = i)
Current state: row. Next state: column.

We assume time-homogeneity: this probability does not depend on the time n, but only on the states i and j
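
As a rough illustration, here is a minimal Python sketch (assuming NumPy is available; the matrix values are those of the frog chain used later) that encodes a transition matrix, checks it is stochastic, and simulates one time-homogeneous step:

```python
import numpy as np

# States S = {0, 1, 2} and an illustrative initial distribution pi0.
S = [0, 1, 2]
pi0 = np.array([0.5, 0.25, 0.25])

# Stochastic transition matrix: entry P[i, j] = P(X_{n+1}=j | X_n=i).
P = np.array([[0.0, 1.0, 0.0],
              [1/3, 0.0, 2/3],
              [1/3, 1/3, 1/3]])

# Sanity checks: non-negative entries, each row sums to 1.
assert (P >= 0).all()
assert np.allclose(P.sum(axis=1), 1.0)

# One step of the chain: from state i, pick the next state
# according to row i of P (the same rule at every time n).
rng = np.random.default_rng(0)
i = rng.choice(S, p=pi0)      # X_0 ~ pi0
j = rng.choice(S, p=P[i])     # X_1 ~ P[i, :]
print(f"X0 = {i}, X1 = {j}")
```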

Markov Chains - what a Markov chain means

The same probability over time:
- every time the chain is in state s, the probability of jumping to another state is the same
- where we go next, given that we are in state s at time x, is the same as where we go next given that we are in state s at time y ≠ x

Markov Chains - what a Markov chain means

A frog hopping among lily pads:
- States: S = {1, 2, 3}, the pads
- Initial distribution: π0 = (1/2, 1/4, 1/4)
- Probability transition matrix:

P =
|  0    1    0  |
| 1/3   0   2/3 |
| 1/3  1/3  1/3 |

P defines the probabilities of jumping from one state to another state

Markov Chains - what a Markov chain means

The frog chooses its initial position X0 according to the initial distribution π0, using a uniform random number U0 in [0, 1]:

X0 = 1  if 0 <= U0 <= 1/2
X0 = 2  if 1/2 < U0 <= 3/4
X0 = 3  if 3/4 < U0 <= 1

For instance, if U0 = 0.85 then X0 = 3 (the frog starts on the third lily pad)
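
A small Python sketch of this sampling rule (the function name is illustrative):

```python
import random

# Sample X0 from pi0 = (1/2, 1/4, 1/4) by partitioning [0, 1]:
# a uniform draw U0 falls in exactly one sub-interval.
def sample_initial_state(u0: float) -> int:
    if u0 <= 0.5:       # 0 <= U0 <= 1/2  -> pad 1
        return 1
    elif u0 <= 0.75:    # 1/2 < U0 <= 3/4 -> pad 2
        return 2
    else:               # 3/4 < U0 <= 1   -> pad 3
        return 3

print(sample_initial_state(0.85))            # -> 3, as in the slide
print(sample_initial_state(random.random())) # a random start
```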

Markov Chains - what a Markov chain means

Total of 6 balls in two urns (e.g. 4 in the first, and 2 in the second). We pick one of the 6 balls at random and move it to the other urn.
Xn: the number of balls in the first urn, after the n-th move.

Markov Chains - what a Markov chain means

Xn: the number of balls in the first urn, after the n-th move
p(X0 = 4) = 1

p(X1 = j) = 4/6  if j = 3
            2/6  if j = 5
            0    otherwise

p(Xn+1 = k | Xn = j) = j/6      if k = j-1
                       (6-j)/6  if k = j+1
                       0        otherwise

Markov Chains - what a Markov chain means

Probability transition matrix (states 0..6 = number of balls in the first urn):

P =
        0    1    2    3    4    5    6
   0    0    1    0    0    0    0    0
   1   1/6   0   5/6   0    0    0    0
   2    0   2/6   0   4/6   0    0    0
   3    0    0   3/6   0   3/6   0    0
   4    0    0    0   4/6   0   2/6   0
   5    0    0    0    0   5/6   0   1/6
   6    0    0    0    0    0    1    0
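
A short Python sketch (assuming NumPy) that builds this matrix directly from the rule p(j, j-1) = j/6, p(j, j+1) = (6-j)/6 and checks that each row sums to 1:

```python
import numpy as np

# Build the 7x7 urn matrix for states j = 0..6 from the transition rule.
N = 6
P = np.zeros((N + 1, N + 1))
for j in range(N + 1):
    if j > 0:
        P[j, j - 1] = j / N        # a ball leaves the first urn
    if j < N:
        P[j, j + 1] = (N - j) / N  # a ball enters the first urn

assert np.allclose(P.sum(axis=1), 1.0)  # each row is a distribution
print(P)
```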

Markov Chains - what a Markov chain means

What is the probability p(n)ij that, given the chain is in state i, it will be in state j, n steps after?
If the start is on lily pad 3, what is the probability of being on lily pad 1 after 2 steps?

p(2)31 = p31 p11 + p32 p21 + p33 p31
       = 1/3 · 0 + 1/3 · 1/3 + 1/3 · 1/3
       = 1/9 + 1/9 = 2/9

P =
|  0    1    0  |
| 1/3   0   2/3 |
| 1/3  1/3  1/3 |

Markov Chains - what a Markov chain means

If the start is on lily pad 3, the probability of being on lily pad 1 after 2 steps is p(2)31 = 2/9.

P1 =
        pad1  pad2  pad3
pad1      0     1     0
pad2    1/3     0   2/3
pad3    1/3   1/3   1/3

P2 =
        pad1  pad2  pad3
pad1    3/9     0   6/9
pad2    2/9   5/9   2/9
pad3    2/9   4/9   3/9

Theorem: let P be the transition matrix of a Markov chain. The ij-th entry of the matrix Pn gives the probability that the Markov chain, starting in state si, will be in state sj after n steps.
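
A minimal NumPy sketch of the theorem, recomputing p(2)31 for the frog chain (0-based indexing, so entry [2, 0]):

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],
              [1/3, 0.0, 2/3],
              [1/3, 1/3, 1/3]])

# n-step transition probabilities are the entries of P^n.
P2 = np.linalg.matrix_power(P, 2)
print(P2[2, 0])   # p(2)_{3,1} = 2/9 ~ 0.2222
```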

Markov Chains - what a Markov chain means

P1 =
        pad1  pad2  pad3
pad1      0     1     0
pad2    1/3     0   2/3
pad3    1/3   1/3   1/3

P2 =
        pad1  pad2  pad3
pad1    3/9     0   6/9
pad2    2/9   5/9   2/9
pad3    2/9   4/9   3/9

P3 =
        pad1   pad2   pad3
pad1   18/81  45/81  18/81
pad2   21/81  24/81  36/81
pad3   21/81  27/81  33/81

P4 =
        pad1   pad2   pad3
pad1   21/81  24/81  36/81
pad2   20/81  33/81  28/81
pad3   20/81  32/81  29/81

Markov Chains - what a Markov chain means

P7 =
        pad1   pad2   pad3
pad1   0.250  0.380  0.370
pad2   0.250  0.373  0.377
pad3   0.250  0.374  0.376

The long-run probabilities of the three lily pads are approximately
0.25, 0.375 and 0.375,
NO MATTER WHERE THE FROG STARTS AT THE FIRST STEP.

Long-range predictions are independent of the starting state.
The rows of Pn become (approximately) identical, because the chain forgets its initial state.

Markov Chains - what a Markov chain means

Consider π0, a vector of N components: for each state i, π0(i) is the probability that the Markov chain is initially at i:
π0(i) = P{X0 = i}, i = 1 ... N
πn(j) is the probability that the Markov chain is at state j after n steps:
πn = (πn(1), ..., πn(N))

Theorem: let P be the transition matrix of a Markov chain, and let π0 be the probability vector which represents the starting distribution. Then the probability that the chain is in state si after n steps is the i-th entry of πn = π0 Pn.

Markov Chains - what a Markov chain means

πn(j) is the probability that the Markov chain is at state j after n steps:
πn = (πn(1), ..., πn(N))
So we have:
π1 = π0 P
π2 = π1 P = π0 P2
π3 = π2 P = π0 P3

In general:

πn = π0 Pn

Markov Chains - what a Markov chain means

Let us suppose π0 = (1/3, 1/3, 1/3).
We want to compute the state distribution after 3 of the frog's jumps:
π3 = π0 P3

                        | 18/81  45/81  18/81 |
π3 = (1/3, 1/3, 1/3) ·  | 21/81  24/81  36/81 |
                        | 21/81  27/81  33/81 |

π3 = (20/81, 32/81, 29/81) = (0.247, 0.395, 0.358)
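
The same computation as a NumPy sketch (matrix_power does the repeated multiplication):

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],
              [1/3, 0.0, 2/3],
              [1/3, 1/3, 1/3]])
pi0 = np.array([1/3, 1/3, 1/3])

# Distribution after 3 steps: pi3 = pi0 @ P^3.
pi3 = pi0 @ np.linalg.matrix_power(P, 3)
print(pi3)                           # ~ [0.247 0.395 0.358]
print(np.array([20, 32, 29]) / 81)   # the same values as fractions
```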

Markov Chains - what a Markov chain means

Given a graph and a starting point:
- select one of its neighbors at random and move to this neighbor
- then select a neighbor of the new point at random and move to it, and so on...
The random sequence of points selected in this way is a random walk on the graph.
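
A minimal Python sketch of a random walk on a graph (the graph and names are illustrative):

```python
import random

# An undirected graph given as an adjacency list.
graph = {
    'a': ['b', 'c'],
    'b': ['a', 'c', 'd'],
    'c': ['a', 'b'],
    'd': ['b'],
}

def random_walk(start, steps, rng=random.Random(0)):
    path = [start]
    for _ in range(steps):
        # Move to a uniformly chosen neighbor of the current point.
        path.append(rng.choice(graph[path[-1]]))
    return path

print(random_walk('a', 10))
```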

Markov Chains - what a Markov chain means

Markov chains describe random walks on P2P overlays (social graph analysis):
- states correspond to peers
- transitions correspond to overlay links

The size of the matrix is huge!
- matrix algebra may not be exploitable

Markov Chains - what a Markov chain means

A simplified clock with 6 numbers: 0, 1, 2, 3, 4, 5.
From each state we can move clockwise, move counter-clockwise, or stay in place, each with the same probability.
The transition matrix is:

p(i, j) = 1/3  if j = i-1 mod 6
          1/3  if j = i
          1/3  if j = i+1 mod 6
          0    otherwise

Markov Chains - what a Markov chain means

Suppose we start at X0 = 2. We have:

π0 = (0, 0, 1, 0, 0, 0)
π1 = (0, 1/3, 1/3, 1/3, 0, 0)
π2 = (1/9, 2/9, 3/9, 2/9, 1/9, 0)
π3 = (3/27, 6/27, 7/27, 6/27, 3/27, 2/27)

The probability mass is spreading out away from its initial concentration on the starting state 2.

Markov Chains - what a Markov chain means

Guess: what is the state of the random walk at time 10,000?
X10,000 is (approximately) uniformly distributed over the 6 states:
πn ≈ (1/6, 1/6, 1/6, 1/6, 1/6, 1/6)
After 10,000 steps, the random walk has forgotten that it started in state 2.
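
A short NumPy sketch that builds the clock-walk matrix and evolves the distribution, confirming the convergence to the uniform distribution:

```python
import numpy as np

# Transition matrix of the 6-state clock walk: move to i-1, stay,
# or move to i+1 (mod 6), each with probability 1/3.
P = np.zeros((6, 6))
for i in range(6):
    for j in ((i - 1) % 6, i, (i + 1) % 6):
        P[i, j] = 1/3

pi = np.zeros(6)
pi[2] = 1.0                 # start at state 2
for _ in range(10_000):
    pi = pi @ P             # pi_{n+1} = pi_n P

print(pi)  # ~ [1/6 1/6 1/6 1/6 1/6 1/6]
```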

Markov Chains - relation between Markov chains and automata

Nondeterministic automata vs. probabilistic automata:
- Internal nondeterminism (nondeterministic automaton): from q0, the same input a may lead to either q1 or q2.
- In a probabilistic automaton, from q0 the input leads to q1, q2 or q3 with probabilities α, β, γ, where α + β + γ = 1.
- External nondeterminism: from q0, different inputs (a, b) lead to different states (q1, q2).

Markov Chains - relation between Markov chains and automata

Probabilistic Automata:
- a transition relates a state and an action to a probability distribution over the set of states
- a probability distribution is a function that assigns a probability in [0, 1] to each element
- the sum of the probabilities of all elements is 1
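
A minimal sketch of this definition in Python (the states, actions and probabilities are illustrative):

```python
# A probabilistic automaton as a map from (state, action) to a
# probability distribution over states.
pa = {
    ('q0', 'a'): {'q1': 0.5, 'q2': 0.3, 'q3': 0.2},
    ('q1', 'b'): {'q0': 1.0},
}

# Each distribution must assign probabilities in [0, 1] summing to 1.
assert all(
    abs(sum(dist.values()) - 1.0) < 1e-9
    and all(0.0 <= p <= 1.0 for p in dist.values())
    for dist in pa.values()
)
```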


Markov Chains - types of Markov chains

Probabilistic models without nondeterminism:
- Discrete Time Markov Chains (DTMC): unlabeled PA in which each state has exactly one outgoing probabilistic transition.
- Continuous Time Markov Chains (CTMC): same as DTMCs, except that each state s has a rate λs > 0. The rate λs determines the sojourn time (the amount of time the process can spend in s). The probability to stay in s for at most t time units is 1 - e^(-λs·t).
- Semi-Markov Chains: generalize CTMCs by allowing the sojourn time to be determined by arbitrary probability distributions.
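
A hedged sketch of one CTMC step in Python (the rate and the jump probabilities are illustrative assumptions): the sojourn time is exponential with rate λs, so P(stay at most t) = 1 - e^(-λs·t).

```python
import random

def ctmc_step(rate, jump_probs, rng=random.Random(0)):
    # Sojourn time in the current state: Exponential(rate).
    sojourn = rng.expovariate(rate)
    # Then jump according to the discrete distribution over successors.
    states, probs = zip(*jump_probs.items())
    nxt = rng.choices(states, weights=probs)[0]
    return sojourn, nxt

print(ctmc_step(2.0, {'s1': 0.7, 's2': 0.3}))
```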

Markov Chains - types of Markov chains

Probabilistic models with external nondeterminism:
- Markov Decision Processes: PA without internal actions, in which each state has at most one outgoing transition per action label a.
- Probabilistic I/O Automata: combine external nondeterminism with exponential probability distributions.
- Semi-Markov Decision Processes: basically semi-Markov chains with external nondeterminism.

Markov Chains - types of Markov chains

Probabilistic models with full nondeterminism:
- Interactive Markov Chains: combine exponential distributions with full nondeterminism.
- SPADES: full nondeterminism and arbitrary probability distributions are combined in the process algebra SPADES; its underlying semantic model is stochastic automata. Each transition is decorated with a set of clocks k and can only be taken when all clocks in k have expired. In that case, all clocks in k are assigned new values according to their probability distributions.

Markov Chains - applications of Markov chains

Sampling and simulation: for many systems, the states are governed by some probability model; e.g. in statistical physics, the microscopic states of a system follow a Gibbs model given the macroscopic constraints. The fair samples generated by a Markov chain show which states are typical of the underlying system. In computer vision this is called "synthesis": the visual appearance of the simulated images, textures, and shapes.
Molecular dynamics - protein structures: typical configurations of protein folding given some known properties are interesting, but the set of typical configurations of a protein is often huge.

Markov Chains - applications of Markov chains

Scientific computing: one often needs to compute integrals in very high-dimensional spaces; e.g. estimate the value of π by generating uniform samples in a unit square (see the sketch below).
Approximate counting in polymer studies: similar to what Markov did with vowels and consonants.
PageRank in Google: a Markov chain transition matrix over the web graph is used to prioritize the pages found in a search.
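
A minimal Monte Carlo sketch of the π estimate (pure Python; the sample size is arbitrary): the fraction of uniform points in the unit square that fall inside the quarter circle x² + y² <= 1 approaches π/4.

```python
import random

rng = random.Random(0)
n = 1_000_000
inside = sum(1 for _ in range(n)
             if rng.random()**2 + rng.random()**2 <= 1.0)
print(4 * inside / n)  # ~ 3.14
```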

Markov Chains - my idea

MCMC (Markov Chain Monte Carlo) to get the best eigenfaces:
- a general-purpose technique for generating fair samples from a probability distribution in a high-dimensional space, using random numbers drawn from a uniform distribution over a certain range (a minimal sketch follows below)
- the eigenfaces (eigenvectors) are derived from the covariance matrix of the probability distribution over the high-dimensional vector space of face images

number of eigenfaces = number of images!
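
A minimal Metropolis-style MCMC sketch in Python (the target density, a standard normal up to a constant, is an illustrative assumption, not the eigenface application itself):

```python
import random
import math

def target(x):
    # Unnormalized target density: standard normal up to a constant.
    return math.exp(-x * x / 2)

def metropolis(n_samples, step=1.0, rng=random.Random(0)):
    x, samples = 0.0, []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)   # symmetric proposal
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < target(proposal) / target(x):
            x = proposal
        samples.append(x)
    return samples

s = metropolis(10_000)
print(sum(s) / len(s))  # sample mean ~ 0 for the standard normal
```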

Markov Chains - references

Rebecca Atherton, A Look at Markov Chains and Their Use in Google, 2005
Laura Ricci, Markov Chain, Basic Concepts, 2012
Song-Chun Zhu, Markov Chain Monte Carlo for Computer Vision, 2005
Erzsebet Csuhaj-Varju, P Automata: Concepts, Results and New Aspects
Takis Konstantopoulos, Markov Chains and Random Walk, 2009
Marielle Stoelinga, An Introduction to Probabilistic Automata
Radford Neal, Probabilistic Inference Using Markov Chain Monte Carlo Methods, 1995
David Sirl, Markov Chains: An Introduction/Review, 2005

Markov Chains - references

http://www.math.harvard.edu/~kmatveev/Lecture%20Notes.html
https://www3.nd.edu/~tutorial/tutorial/markov.html
http://learntofish.wordpress.com/2012/01/15/introduction-to-markov-chains/
http://www.math.harvard.edu/~kmatveev/Topics.html
http://www.math.harvard.edu/~kmatveev/markov.html

Thank you very much

Davood
Email: davoodpb@gmail.com
