Compression
CCU, Taiwan
Wen-Nung Lie
Applications of image compression
Televideo conferencing,
Remote sensing (satellite imagery),
Document and medical imaging,
Facsimile transmission (FAX),
Control of remotely piloted vehicles in military, space, and hazardous waste management applications
Fundamentals
Data redundancy
Coding redundancy
Interpixel redundancy
Autocorrelation coefficients
$$\gamma(n) = \frac{A(n)}{A(0)}, \qquad A(n) = \frac{1}{N-n} \sum_{y=0}^{N-1-n} f(x, y)\, f(x, y+n)$$
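The autocorrelation coefficients can be evaluated directly on one image line. The following is a minimal sketch in plain Python; the function names and the sample scan line are hypothetical and only illustrate the formula.

```python
def autocorrelation(row, n):
    """A(n) = 1/(N-n) * sum_{y=0}^{N-1-n} f(y) * f(y+n) for one image line."""
    N = len(row)
    if not 0 <= n < N:
        raise ValueError("shift n must satisfy 0 <= n < N")
    return sum(row[y] * row[y + n] for y in range(N - n)) / (N - n)


def normalized_autocorrelation(row, n):
    """gamma(n) = A(n) / A(0)."""
    return autocorrelation(row, n) / autocorrelation(row, 0)


# Hypothetical scan line: two flat regions, so neighbouring pixels are highly correlated.
row = [10, 10, 10, 10, 200, 200, 200, 200]
gammas = [normalized_autocorrelation(row, n) for n in range(4)]
print(gammas)
```

Values of gamma(n) that stay close to 1 for small n indicate strong interpixel redundancy along the line.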
Spatial redundancy
Geometrical redundancy
Interframe redundancy
$$C_R = 2.63, \qquad R_D = 1 - \frac{1}{2.63} = 0.62$$
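The arithmetic behind these two figures can be checked in a few lines. This sketch assumes the usual definitions, which are not spelled out on the slide: C_R = n1 / n2, where n1 and n2 are the sizes of the original and the compressed representation, and R_D = 1 - 1/C_R.

```python
def compression_ratio(n1, n2):
    """C_R = n1 / n2 (assumed definition: original size over compressed size)."""
    return n1 / n2


def relative_redundancy(c_r):
    """R_D = 1 - 1 / C_R."""
    return 1 - 1 / c_r


r_d = relative_redundancy(2.63)
print(round(r_d, 2))
```

A relative redundancy of 0.62 means that about 62 % of the original data is redundant.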
Psychovisual redundancy
Fidelity criteria
Root-mean-square error
$$e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat f(x, y) - f(x, y) \right]^2 \right]^{1/2}$$
Mean-square signal-to-noise ratio
$$SNR_{ms} = \frac{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \hat f(x, y)^2}{\sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[ \hat f(x, y) - f(x, y) \right]^2}$$
Much worse, worse, slightly worse, the same, slightly better, better,
much better
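The two objective fidelity criteria can be sketched as follows, where f is the original image and f_hat the decompressed one. The tiny 2x2 images are made-up values for illustration only.

```python
import math


def rms_error(f, f_hat):
    """Root-mean-square error between original f and approximation f_hat."""
    M, N = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(M) for y in range(N))
    return math.sqrt(total / (M * N))


def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the approximation f_hat."""
    M, N = len(f), len(f[0])
    signal = sum(f_hat[x][y] ** 2 for x in range(M) for y in range(N))
    noise = sum((f_hat[x][y] - f[x][y]) ** 2 for x in range(M) for y in range(N))
    return signal / noise


f = [[100, 102], [98, 101]]        # hypothetical original
f_hat = [[101, 102], [97, 103]]    # hypothetical decompressed image
print(rms_error(f, f_hat), snr_ms(f, f_hat))
```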
Source encoder
Channel encoder
Mapper
Quantizer
Symbol encoder
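h1, h2, h4 are even parity bits
$$h_1 = b_3 \oplus b_2 \oplus b_0, \qquad h_2 = b_3 \oplus b_1 \oplus b_0, \qquad h_4 = b_2 \oplus b_1 \oplus b_0$$
$$h_3 = b_3, \qquad h_5 = b_2, \qquad h_6 = b_1, \qquad h_7 = b_0$$
Parity check word
$$c_1 = h_1 \oplus h_3 \oplus h_5 \oplus h_7, \qquad c_2 = h_2 \oplus h_3 \oplus h_6 \oplus h_7, \qquad c_4 = h_4 \oplus h_5 \oplus h_6 \oplus h_7$$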
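A small sketch of this (7,4) Hamming code may help: the encoder adds the three even parity bits, and the decoder uses the parity word c4 c2 c1 as the position of a single flipped bit. The function names are hypothetical, and the sketch assumes the standard bit layout h1..h7 with data bits at positions 3, 5, 6 and 7.

```python
def hamming74_encode(b3, b2, b1, b0):
    """Return the code word [h1, ..., h7] for the 4 data bits b3 b2 b1 b0."""
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]


def hamming74_decode(word):
    """Correct at most one bit error and return (b3, b2, b1, b0)."""
    h = list(word)
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    position = 4 * c4 + 2 * c2 + c1   # 0 means no error detected
    if position:
        h[position - 1] ^= 1          # flip the erroneous bit back
    return (h[2], h[4], h[5], h[6])


word = hamming74_encode(1, 0, 1, 1)
corrupted = list(word)
corrupted[4] ^= 1                     # simulate a channel error in h5
print(word, hamming74_decode(corrupted))
```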
Information theory
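Information channel
Source alphabet $A = \{a_1, a_2, \ldots, a_J\}$, with $\sum_{j=1}^{J} P(a_j) = 1$
Symbol probability $\mathbf{z} = [P(a_1), P(a_2), \ldots, P(a_J)]^T$
Ensemble $(A, \mathbf{z})$ describes the information source completely
Average information per source output
$$H(\mathbf{z}) = -\sum_{j=1}^{J} P(a_j) \log P(a_j)$$
$$Q = \begin{bmatrix} P(b_1 \mid a_1) & P(b_1 \mid a_2) & \cdots & P(b_1 \mid a_J) \\ P(b_2 \mid a_1) & \cdot & \cdots & \cdot \\ \vdots & \vdots & & \vdots \\ P(b_K \mid a_1) & P(b_K \mid a_2) & \cdots & P(b_K \mid a_J) \end{bmatrix}, \qquad \mathbf{v} = Q\mathbf{z}$$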
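As an illustration of the source entropy and of the relation v = Qz, here is a minimal sketch in plain Python. The three-symbol source and the channel matrix are made-up values; each column of Q sums to 1 because it holds the probabilities P(b_k | a_j) for one input symbol.

```python
import math


def entropy(z):
    """H(z) = -sum_j P(a_j) log2 P(a_j), in bits per symbol."""
    return -sum(p * math.log2(p) for p in z if p > 0)


def channel_output(Q, z):
    """v = Qz: probabilities of the received symbols b_1 .. b_K."""
    return [sum(q_kj * p_j for q_kj, p_j in zip(row, z)) for row in Q]


z = [0.5, 0.25, 0.25]                      # hypothetical source probabilities
Q = [[0.9, 0.1, 0.0],                      # hypothetical channel matrix, Q[k][j] = P(b_k | a_j)
     [0.1, 0.8, 0.2],
     [0.0, 0.1, 0.8]]
v = channel_output(Q, z)
print(entropy(z), v)
```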
$$H(\mathbf{z} \mid \mathbf{v}) = \sum_{k=1}^{K} H(\mathbf{z} \mid b_k)\, P(b_k)$$
Mutual information
$$I(\mathbf{z}, \mathbf{v}) = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j, b_k) \log \frac{P(a_j, b_k)}{P(a_j) P(b_k)} = \sum_{j=1}^{J} \sum_{k=1}^{K} P(a_j)\, q_{kj} \log \frac{q_{kj}}{\sum_{i=1}^{J} P(a_i)\, q_{ki}}$$
where $Q = \{q_{kj}\}$
Channel capacity
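[Figure: Plot_1]
error probability : $p_e$
channel matrix : $Q = \begin{bmatrix} \bar p_e & p_e \\ p_e & \bar p_e \end{bmatrix}$, with $\bar p_e = 1 - p_e$
probability of received symbols : $\mathbf{v} = Q\mathbf{z} = \begin{bmatrix} \bar p_e\, p_{bs} + p_e\, \bar p_{bs} \\ p_e\, p_{bs} + \bar p_e\, \bar p_{bs} \end{bmatrix}$, with $\bar p_{bs} = 1 - p_{bs}$
[Figure: Plot_2]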
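For the binary symmetric channel, the mutual information can be computed from z and Q alone. The sketch below uses a hypothetical error probability of 0.1 and relies on the standard result that the capacity of this channel is reached for equiprobable inputs and equals 1 - H_bs(p_e), where H_bs is the binary entropy function.

```python
import math


def mutual_information(z, Q):
    """I(z, v) = sum_j sum_k P(a_j) q_kj log2(q_kj / sum_i P(a_i) q_ki)."""
    J, K = len(z), len(Q)
    v = [sum(Q[k][i] * z[i] for i in range(J)) for k in range(K)]
    total = 0.0
    for j in range(J):
        for k in range(K):
            joint = z[j] * Q[k][j]
            if joint > 0:
                total += joint * math.log2(Q[k][j] / v[k])
    return total


def binary_entropy(p):
    """H_bs(p) = -p log2 p - (1 - p) log2 (1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


pe = 0.1                                   # hypothetical error probability
Q = [[1 - pe, pe], [pe, 1 - pe]]
capacity = mutual_information([0.5, 0.5], Q)
print(capacity, 1 - binary_entropy(pe))
```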
Single symbol : $A = \{a_1, a_2, \ldots, a_J\}$
Block symbol : (n-tuple of symbols) $A' = \{\alpha_1, \alpha_2, \ldots, \alpha_{J^n}\}$
$(A, \mathbf{z}) \rightarrow H(\mathbf{z})$
$(A', \mathbf{z}') \rightarrow H(\mathbf{z}')$
$$H(\mathbf{z}') = -\sum_{i=1}^{J^n} P(\alpha_i) \log P(\alpha_i)$$
Entropy of zero-memory source with block symbols
$$H(\mathbf{z}') = n H(\mathbf{z})$$
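N-extended source
$$L'_{avg} = \sum_{i=1}^{J^n} P(\alpha_i)\, l(\alpha_i) = \sum_{i=1}^{J^n} P(\alpha_i) \left\lceil \log \frac{1}{P(\alpha_i)} \right\rceil$$
$$H(\mathbf{z}') \le L'_{avg} < H(\mathbf{z}') + 1$$
$$H(\mathbf{z}) \le \frac{L'_{avg}}{n} < H(\mathbf{z}) + \frac{1}{n}$$
$$\lim_{n \to \infty} \left[ \frac{L'_{avg}}{n} \right] = H(\mathbf{z})$$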
Coding efficiency
Coding efficiency : $\eta = n \dfrac{H(\mathbf{z})}{L'_{avg}}$
Example : $L'_{avg} = 1$ bit/symbol, $\eta = 0.918 / 1 = 0.918$
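The efficiency figure can be reproduced numerically. The sketch below assumes a zero-memory binary source with probabilities 2/3 and 1/3, which is an assumption chosen because it yields an entropy of 0.918 bits/symbol; the code lengths used for the second extension (1, 2, 3, 3) are likewise an assumed prefix code, not taken from the slide.

```python
import math
from itertools import product


def entropy(probs):
    """Entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


z = [2 / 3, 1 / 3]                    # assumed source probabilities
H = entropy(z)

# First-order coding: one bit per symbol (n = 1).
eta_1 = 1 * H / 1.0

# Second extension (n = 2): block probabilities and an assumed prefix code.
ext_probs = [p * q for p, q in product(z, repeat=2)]   # 4/9, 2/9, 2/9, 1/9
ext_lengths = [1, 2, 3, 3]
L2_avg = sum(p * l for p, l in zip(ext_probs, ext_lengths))
eta_2 = 2 * H / L2_avg

print(round(eta_1, 3), round(eta_2, 3))
```

Coding pairs of symbols instead of single symbols raises the efficiency, in line with the limit of L'avg / n toward H(z).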
$$Q_D = \{ q_{kj} \mid d(Q) \le D \}$$
$$R(D) = \min_{Q \in Q_D} \left[ I(\mathbf{z}, \mathbf{v}) \right]$$
Example of Rate-distortion function
Example image
Huffman coding
Huffman code is :
Coding efficiency : $\eta = 0.973$
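A Huffman code can be built by repeatedly merging the two least probable entries. The sketch below computes only the code lengths, which is enough to obtain the average length and the efficiency. The six probabilities are an assumed example source (they give an efficiency close to the value quoted above); the slide's own table is not reproduced here.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return the Huffman code length of each symbol."""
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1           # every merge adds one bit to these symbols
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


probs = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]   # assumed source probabilities
lengths = huffman_code_lengths(probs)
L_avg = sum(p * l for p, l in zip(probs, lengths))
H = -sum(p * math.log2(p) for p in probs)
eta = H / L_avg
print(lengths, L_avg, eta)
```

Different tie-breaking choices lead to different code lengths for individual symbols, but the average length of a Huffman code is the same for all of them.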
Variations of VLC
B-code
Only the most probable symbols (here the top 12) are encoded with a Huffman code
A prefix code (whose probability is the sum of the probabilities of the other symbols, here 10) followed by a fixed-length code is used to represent all the other symbols
Each code word is made up of continuation bits (C) and information bits (natural binary numbers)
C can be 1 or 0, but must alternate from one code word to the next
E.g., a11 a2 a7 → 001010101000010 or 101110001100110
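The continuation-bit scheme can be sketched as follows. The sketch assumes two information bits per continuation bit (often called a B2-code) and assigns information bits in natural binary order: a1..a4 use one group, a5..a20 use two groups, and so on. These choices are assumptions, but they reproduce the example string for a11 a2 a7.

```python
def b2_info_groups(k):
    """Return the 2-bit information groups for symbol a_k (k is 1-based)."""
    offset, n_groups = k - 1, 1
    while offset >= 4 ** n_groups:      # skip all shorter code words
        offset -= 4 ** n_groups
        n_groups += 1
    bits = format(offset, "0{}b".format(2 * n_groups))
    return [bits[i:i + 2] for i in range(0, len(bits), 2)]


def b2_encode(symbols, first_c=0):
    """Encode a list of symbol indices; C alternates between code words."""
    out, c = [], first_c
    for k in symbols:
        for group in b2_info_groups(k):
            out.append(str(c) + group)  # continuation bit + 2 information bits
        c ^= 1                          # toggle C for the next code word
    return "".join(out)


print(b2_encode([11, 2, 7], first_c=0))
print(b2_encode([11, 2, 7], first_c=1))
```

Because C changes value whenever a new code word starts, the decoder can find code word boundaries without any separator symbols.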
Arithmetic coding
LZW (Lempel-Ziv-Welch) coding
Bit-plane coding
Bit-plane decomposition
$$g_i = a_i \oplus a_{i+1}, \qquad 0 \le i \le m-2$$
$$g_{m-1} = a_{m-1}$$
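The Gray-code mapping amounts to XOR-ing a value with itself shifted right by one bit. A short sketch, assuming m = 8 bits per pixel for the illustration:

```python
def to_gray(a):
    """g_i = a_i XOR a_{i+1} for 0 <= i <= m-2, and g_{m-1} = a_{m-1}."""
    return a ^ (a >> 1)


def bit_planes(value, m=8):
    """Return [bit 0, bit 1, ..., bit m-1] of a pixel value."""
    return [(value >> i) & 1 for i in range(m)]


# Gray levels 127 and 128 differ in all 8 binary bit planes,
# but in only one Gray-coded bit plane.
binary_changes = bin(127 ^ 128).count("1")
gray_changes = bin(to_gray(127) ^ to_gray(128)).count("1")
print(binary_changes, gray_changes)
print(bit_planes(to_gray(127)), bit_planes(to_gray(128)))
```

Small gray-level changes therefore disturb fewer bit planes after Gray coding, which leaves larger uniform areas for the bit-plane coder.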
Binary-coded
gray-coded
Code the solid lines as 0s and all other lines with a 1 followed by the original bit pattern
Comparison of various methods
Methods of prediction
$$\hat f_n = \mathrm{round}\left[ \sum_{i=1}^{m} \alpha_i f_{n-i} \right]$$
$\alpha_i$ : prediction coefficients
m : order of prediction
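A lossless predictive coder built on this predictor can be sketched as follows. The sketch assumes that the first m samples are transmitted unchanged, since they cannot be predicted, and uses Python's built-in round (which rounds halves to even); any rounding rule works as long as encoder and decoder share it. The sample values are hypothetical.

```python
def predict(previous, alphas):
    """f_hat_n = round(sum_i alpha_i * f_{n-i}); previous[0] is f_{n-1}."""
    return round(sum(a * f for a, f in zip(alphas, previous)))


def encode(samples, alphas):
    """Return the prediction errors e_n = f_n - f_hat_n."""
    m = len(alphas)
    errors = list(samples[:m])                 # first m samples sent as-is
    for n in range(m, len(samples)):
        previous = samples[n - m:n][::-1]
        errors.append(samples[n] - predict(previous, alphas))
    return errors


def decode(errors, alphas):
    """Rebuild the samples from the prediction errors."""
    m = len(alphas)
    samples = list(errors[:m])
    for n in range(m, len(errors)):
        previous = samples[n - m:n][::-1]
        samples.append(errors[n] + predict(previous, alphas))
    return samples


row = [14, 15, 14, 15, 13, 15, 15, 14, 20, 26]   # hypothetical pixel row
errors = encode(row, [1.0])                      # previous-pixel predictor
print(errors)
print(decode(errors, [1.0]) == row)
```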
Entropy reduction by difference mapping
Because prediction removes interpixel redundancies, the first-order entropy of the difference mapping is lower than that of the original image (3.96 bits/pixel vs. 6.81 bits/pixel)
The probability density function of the prediction errors is highly peaked at zero and characterized by a relatively small variance (modeled by the zero-mean uncorrelated Laplacian pdf)
$$p_e(e) = \frac{1}{\sqrt{2}\, \sigma_e}\, e^{-\frac{\sqrt{2}\, |e|}{\sigma_e}}$$
$$\dot e_n = \begin{cases} +\zeta & \text{for } e_n > 0 \\ -\zeta & \text{otherwise} \end{cases}$$
the resulting bit rate is 1 bit/pixel
DM : $\hat f_n = \alpha \dot f_{n-1}$
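Delta modulation can be sketched in a few lines: the predictor is the previous reconstructed sample scaled by alpha, and the prediction error is quantized to plus or minus zeta, so each sample costs one bit. The values alpha = 1 and zeta = 6.5, the input sequence, and the convention that the first sample is sent unquantized are all assumptions made for this illustration.

```python
def delta_modulate(samples, alpha=1.0, zeta=6.5):
    """Return (bits, reconstruction) for a 1-bit delta modulator."""
    recon = [float(samples[0])]        # first sample assumed sent as-is
    bits = []
    for f in samples[1:]:
        f_hat = alpha * recon[-1]      # prediction from previous reconstruction
        e = f - f_hat                  # prediction error
        e_q = zeta if e > 0 else -zeta # two-level quantizer
        bits.append(1 if e_q > 0 else 0)
        recon.append(e_q + f_hat)      # decoder computes the same value
    return bits, recon


samples = [14, 15, 14, 15, 13, 15, 15, 14, 20, 26, 27, 28]
bits, recon = delta_modulate(samples)
print(bits)
print(recon)
```

In flat regions the reconstruction oscillates around the input (granular noise), while a step size that is too small cannot follow rapid changes (slope overload).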
Optimal predictors
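DPCM
$$\hat f_n = \sum_{i=1}^{m} \alpha_i f_{n-i} \qquad \text{with} \qquad \dot f_n = \dot e_n + \hat f_n \approx e_n + \hat f_n = f_n$$
select m prediction coefficients to minimize
$$E\{e_n^2\} = E\left\{ \left[ f_n - \sum_{i=1}^{m} \alpha_i f_{n-i} \right]^2 \right\}$$
The solution : $\boldsymbol{\alpha} = R^{-1} \mathbf{r}$
$$R = \begin{bmatrix} E\{f_{n-1} f_{n-1}\} & E\{f_{n-1} f_{n-2}\} & \cdots & E\{f_{n-1} f_{n-m}\} \\ E\{f_{n-2} f_{n-1}\} & E\{f_{n-2} f_{n-2}\} & \cdots & \cdot \\ \vdots & \vdots & & \vdots \\ E\{f_{n-m} f_{n-1}\} & \cdot & \cdots & E\{f_{n-m} f_{n-m}\} \end{bmatrix}, \qquad \mathbf{r} = \begin{bmatrix} E\{f_n f_{n-1}\} \\ E\{f_n f_{n-2}\} \\ \vdots \\ E\{f_n f_{n-m}\} \end{bmatrix}, \qquad \boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_m \end{bmatrix}$$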
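To see the normal equations at work, the sketch below solves them for a second-order predictor (m = 2). It assumes a first-order Markov correlation model, E{f_n f_{n-k}} = sigma^2 * rho^|k|, with made-up values for sigma^2 and rho; under that assumption the second coefficient comes out as zero, so one previous sample already carries all the usable information.

```python
def solve_2x2(R, r):
    """Solve R * alpha = r for a 2x2 system using Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    a1 = (r[0] * R[1][1] - R[0][1] * r[1]) / det
    a2 = (R[0][0] * r[1] - R[1][0] * r[0]) / det
    return [a1, a2]


sigma2, rho = 1.0, 0.95                      # assumed variance and correlation
R = [[sigma2, sigma2 * rho],
     [sigma2 * rho, sigma2]]                 # autocorrelation matrix for m = 2
r = [sigma2 * rho, sigma2 * rho ** 2]        # correlations with f_n
alpha = solve_2x2(R, r)

# Variance of the prediction error: sigma^2 - alpha^T r.
error_variance = sigma2 - sum(a * x for a, x in zip(alpha, r))
print(alpha, error_variance)
```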
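$$\sigma_e^2 = \sigma^2 - \boldsymbol{\alpha}^T \mathbf{r} = \sigma^2 - \sum_{i=1}^{m} E\{f_n f_{n-i}\}\, \alpha_i$$
$$\hat f(x, y) = \alpha_1 f(x, y-1) + \alpha_2 f(x-1, y-1) + \alpha_3 f(x-1, y) + \alpha_4 f(x-1, y+1)$$
$$\alpha_1 = \rho_h, \qquad \alpha_2 = -\rho_v \rho_h, \qquad \alpha_3 = \rho_v, \qquad \alpha_4 = 0$$
We generally have
$$\sum_{i=1}^{m} \alpha_i \le 1.0$$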
Examples of predictors
Four examples :
$$\hat f(x, y) = 0.97 f(x, y-1)$$
$$\hat f(x, y) = 0.5 f(x, y-1) + 0.5 f(x-1, y)$$
$$\hat f(x, y) = 0.75 f(x, y-1) + 0.75 f(x-1, y) - 0.5 f(x-1, y-1)$$
$$\hat f(x, y) = \begin{cases} 0.97 f(x, y-1) & \text{if } \Delta h \le \Delta v \\ 0.97 f(x-1, y) & \text{otherwise} \end{cases}$$
Optimal quantization
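Non-uniform quantizer
$$\int_{s_{i-1}}^{s_i} (s - t_i)\, p(s)\, ds = 0, \qquad i = 1, 2, \ldots, \tfrac{L}{2}$$
$$s_i = \begin{cases} 0 & i = 0 \\ \dfrac{t_i + t_{i+1}}{2} & i = 1, 2, \ldots, \tfrac{L}{2} - 1 \\ \infty & i = \tfrac{L}{2} \end{cases}$$
$$s_{-i} = -s_i, \qquad t_{-i} = -t_i$$
$t_i$ is the centroid of each decision interval
Decision levels are halfway between the reconstruction levels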
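The two optimality conditions (reconstruction levels at the centroids, decision levels halfway between reconstruction levels) suggest an iterative design procedure. The sketch below applies it to a list of samples rather than to a pdf, so the centroid becomes a plain average; this empirical variant, the function name and the sample values are assumptions made for illustration.

```python
def lloyd_max(samples, levels, iterations=50):
    """Return (decision levels s, reconstruction levels t) for the samples."""
    lo, hi = min(samples), max(samples)
    # Start from uniformly spaced reconstruction levels.
    t = [lo + (i + 0.5) * (hi - lo) / levels for i in range(levels)]
    for _ in range(iterations):
        # Decision levels halfway between neighbouring reconstruction levels.
        s = [(t[i] + t[i + 1]) / 2 for i in range(levels - 1)]
        # Reconstruction level = centroid (mean) of each decision interval.
        buckets = [[] for _ in range(levels)]
        for x in samples:
            k = sum(1 for boundary in s if x > boundary)
            buckets[k].append(x)
        t = [sum(b) / len(b) if b else t[i] for i, b in enumerate(buckets)]
    s = [(t[i] + t[i + 1]) / 2 for i in range(levels - 1)]
    return s, t


samples = [0, 1, 2, 10, 11, 12]      # hypothetical data with two clusters
s, t = lloyd_max(samples, levels=2)
print(s, t)
```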
Transform coding
Subimage decomposition
Transformation
Quantization
Coding
Transform selection
Requirements
Types
$$T(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, g(x, y, u, v)$$
$$f(x, y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} T(u, v)\, h(x, y, u, v)$$
WHT, $N = 2^m$
$$g(x, y, u, v) = h(x, y, u, v) = \frac{1}{N} (-1)^{\sum_{i=0}^{m-1} \left[ b_i(x) p_i(u) + b_i(y) p_i(v) \right]}$$
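DCT
$$g(x, y, u, v) = h(x, y, u, v) = \alpha(u)\, \alpha(v) \cos\left[ \frac{(2x+1) u \pi}{2N} \right] \cos\left[ \frac{(2y+1) v \pi}{2N} \right]$$
$$\alpha(u) = \begin{cases} \sqrt{\dfrac{1}{N}} & \text{for } u = 0 \\ \sqrt{\dfrac{2}{N}} & \text{for } u = 1, 2, \ldots, N-1 \end{cases}$$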
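The DCT kernel translates directly into code. The following is a deliberately simple O(N^4) sketch of the forward and inverse 2-D DCT in plain Python, meant to mirror the formula rather than to be fast; the function names are hypothetical.

```python
import math


def alpha(u, N):
    """Normalization factor of the DCT kernel."""
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)


def kernel(x, y, u, v, N):
    """g(x, y, u, v) = h(x, y, u, v) for the DCT."""
    return (alpha(u, N) * alpha(v, N)
            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))


def dct2(f):
    """Forward transform: T(u, v) = sum_x sum_y f(x, y) g(x, y, u, v)."""
    N = len(f)
    return [[sum(f[x][y] * kernel(x, y, u, v, N)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(T):
    """Inverse transform: f(x, y) = sum_u sum_v T(u, v) h(x, y, u, v)."""
    N = len(T)
    return [[sum(T[u][v] * kernel(x, y, u, v, N)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


flat = [[10] * 4 for _ in range(4)]      # constant block: only the DC term survives
T = dct2(flat)
print(round(T[0][0], 6))
```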
DCT
Basis images
A linear combination of $n^2$ basis images : $H_{uv}$
$$F = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} T(u, v)\, H_{uv}$$
$$H_{uv} = \begin{bmatrix} h(0, 0, u, v) & h(0, 1, u, v) & \cdots & h(0, n-1, u, v) \\ h(1, 0, u, v) & \cdot & \cdots & \cdot \\ \vdots & \vdots & & \vdots \\ h(n-1, 0, u, v) & \cdot & \cdots & h(n-1, n-1, u, v) \end{bmatrix}$$
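$$\hat F = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \gamma(u, v)\, T(u, v)\, H_{uv}$$
$$e_{ms} = E\left\{ \| F - \hat F \|^2 \right\} = \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} \sigma^2_{T(u,v)} \left[ 1 - \gamma(u, v) \right]$$
KLT or DCT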
[Figure panels: original; 2x2 subimage; 4x4 subimage; 8x8 subimage]
Zonal coding
Threshold coding
[Figure panels: threshold coding; zonal coding]
Zonal mask
Thresholded and retained coefficients
Bit allocation
Normalization (quantization) matrix
Wavelet coding
An analyzing wavelet
Number of decomposition levels
Analysis filter :
Synthesis filter :
An example of Daubechies orthonormal filters
DWT-based : rms error 2.29, 2.96
DCT-based : rms error 3.42, 6.33
Rms error : 3.72, 4.73
Wavelet selection
Daubechies wavelets
Bi-orthogonal wavelets
Other issues
Quantizer design
The initial decompositions are responsible for the majority of the data
compression (3 levels is enough)
Continuous-tone image compression -- JPEG
Baseline system
JPEG (cont)
DCT-transformed
Quantized by using the quantization or normalization matrix
The DC coefficient is difference coded relative to the DC coefficient of the previous block
The non-zero AC coefficients are coded by using a VLC that defines the coefficient's value and the number of preceding zeros
The user is free to construct custom tables and/or arrays, which may
be adapted to the characteristics of the image being compressed
Add a special EOB (end of block) code after the last non-zero AC coefficient
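The first baseline steps (level shift, DCT, quantization) can be sketched on the 8x8 example block of these slides. The normalization matrix Z used here is the widely published JPEG luminance example table, which is an assumption since the slide's own matrix did not survive; Python's round is used as the rounding rule. With the pixel values as listed, the DC coefficient evaluates to about -414 (the slide's table shows -415) and quantizes to -26.

```python
import math

BLOCK = [
    [52, 55, 61, 66, 70, 61, 64, 73],
    [63, 59, 66, 90, 109, 85, 69, 72],
    [62, 59, 68, 113, 144, 104, 66, 73],
    [63, 58, 71, 122, 154, 106, 70, 69],
    [67, 61, 68, 104, 126, 88, 68, 70],
    [79, 65, 60, 70, 77, 68, 58, 75],
    [85, 71, 64, 59, 55, 61, 65, 83],
    [87, 79, 69, 68, 65, 76, 78, 94],
]

# Assumed normalization matrix (commonly used JPEG luminance example table).
Z = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def dct2(f):
    """Plain 2-D DCT of an N x N block (slow but direct)."""
    N = len(f)

    def a(u):
        return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

    return [[a(u) * a(v) * sum(
                f[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


shifted = [[p - 128 for p in row] for row in BLOCK]      # level shift by 2^7
T = dct2(shifted)                                        # DCT coefficients
quantized = [[round(T[u][v] / Z[u][v]) for v in range(8)] for u in range(8)]

# The DC term would then be difference coded against the previous block's DC.
previous_dc = 0                                          # hypothetical first block
dc_difference = quantized[0][0] - previous_dc
print(quantized[0][0], dc_difference)
```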
Original 8x8 block:
 52  55  61  66  70  61  64  73
 63  59  66  90 109  85  69  72
 62  59  68 113 144 104  66  73
 63  58  71 122 154 106  70  69
 67  61  68 104 126  88  68  70
 79  65  60  70  77  68  58  75
 85  71  64  59  55  61  65  83
 87  79  69  68  65  76  78  94
Level-shifted block (pixel value minus 128):
-76 -73 -67 -62 -58 -67 -64 -55
-65 -69 -62 -38 -19 -43 -59 -56
-66 -69 -60 -15  16 -24 -62 -55
-65 -70 -57  -6  26 -22 -58 -59
-61 -67 -60 -24  -2 -40 -60 -58
-49 -63 -68 -58 -51 -65 -70 -53
-43 -57 -64 -69 -73 -67 -63 -45
-41 -49 -59 -60 -63 -52 -50 -34
DCT transformed (surviving coefficients, in reading order):
-415 -29 -62 25 55 -20 -1 -21 -62 11 -7 -6 -46 77 -25 -30 10 -5 -50 13 35 -15 -9 11 -8 -13 -2 -1 -4 -10 -3 -1 -1 -4 -1 -1 -3 -2 -1 -1 -2 -1 -1 -1
Quantized (surviving coefficients, in reading order):
-26 -3 -6 -2 -4 -3 -1 -1 -4 -1
Reconstructed block:
 58  64  67  64  59  62  70  78
 56  55  67  89  98  88  74  69
 60  50  70 119 141 116  80  64
 69  51  71 128 149 115  77  68
 74  53  64 105 115  84  65  72
 76  57  56  74  75  57  57  74
 83  69  59  60  61  61  67  78
 93  81  67  62  69  80  84  84
Residual (surviving entries, in reading order):
-6 -9 -6 11 -1 -6 -5 -1 11 -3 -5 -2 -6 -3 -12 -14 -6 -4 -5 -9 -7 -7 -1 11 -2 -4 11 -1 -6 -2 -6 -2 -4 -4 -6 10
JPEG-2000
Procedures
JPEG-2000 (cont)
Filter tap   H0     H1
0            6/8    1
±1           2/8    -1/2
±2           -1/8
JPEG-2000 (cont)
JPEG-2000 (cont)
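$$q_b(u, v) = \mathrm{sign}[a_b(u, v)] \cdot \left\lfloor \frac{|a_b(u, v)|}{\Delta_b} \right\rfloor, \qquad \Delta_b = 2^{R_b - \varepsilon_b} \left( 1 + \frac{\mu_b}{2^{11}} \right)$$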
JPEG-2000 (cont)
The coding outputs from each code block are grouped to form
layers
The resulting layers are finally partitioned into packets
[Figure: bit-streams of code-blocks 1 to 7 distributed over quality layers 1 to 5, with some contributions empty; axes: R-D / Quality and Resolution (N decomposition levels)]
Standards
Motion-compensated transform coding -- encoder
Motion-compensated transform coding -- decoder
[Block diagram: Compressed bit stream, BUF, VLD, IT, MC Predictor, Motion Vectors, Reconstructed Frame]
IT : inverse DCT transform
VLD : variable length decoding
Assumptions :
Methods :
Find the best match $(mv_x, mv_y)$ between the current image block and candidates in the previous frame by minimizing the following cost
$$\sum_{(x, y) \in \text{block}} \left| x_n(x, y) - x_{n-1}(x + mv_x,\, y + mv_y) \right|$$
One motion vector (MV) for each block between successive frames
[Figure: frame k-1, frame k, search window, block]
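Block matching with an exhaustive search can be sketched as follows. The cost is assumed to be the sum of absolute differences (SAD); frames are stored as lists of rows and indexed as frame[y][x]. The function names and the synthetic test frames are hypothetical.

```python
def sad(cur, prev, bx, by, size, mvx, mvy):
    """Sum of |x_n(x, y) - x_{n-1}(x + mvx, y + mvy)| over one block."""
    return sum(abs(cur[y][x] - prev[y + mvy][x + mvx])
               for y in range(by, by + size)
               for x in range(bx, bx + size))


def full_search(cur, prev, bx, by, size, search_range):
    """Return (cost, mvx, mvy) of the best match within +/- search_range."""
    height, width = len(prev), len(prev[0])
    best = None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the previous frame.
            if not (0 <= bx + mvx and bx + mvx + size <= width
                    and 0 <= by + mvy and by + mvy + size <= height):
                continue
            cost = sad(cur, prev, bx, by, size, mvx, mvy)
            if best is None or cost < best[0]:
                best = (cost, mvx, mvy)
    return best


# Synthetic frames: the current frame is the previous one displaced by (2, 1).
prev = [[x * x + 10 * y * y for x in range(12)] for y in range(12)]
cur = [[prev[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
print(full_search(cur, prev, bx=4, by=4, size=4, search_range=3))
```

One motion vector is obtained per block, and its cost of zero here reflects the noise-free synthetic displacement.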
BM motion estimation techniques
Existing Algorithms :
0 3 12 12
11 6 4 6
difference 32 26 38 49
10 8 10 6 34
21 22 22 25 90
31 26 31 37 125
25 21 18 22 86
gradient
23 27 20 23
33 38 29 31
1 16 19 29
77 72 79 85
40 45 50 49
Template
The full search examines 15×15 = 225 positions; the three-step search examines 9+8+8 = 25 positions (assuming the MV range is −7 to +7)
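The position counts can be verified with a small sketch of the three-step search. It takes an arbitrary cost function cost(mvx, mvy), starts at the zero vector, and halves the step size (4, 2, 1 for a range of ±7) after each round. The cost function below is a made-up convex surface used only to exercise the code; on real images the three-step search may stop in a local minimum that the full search would avoid.

```python
def three_step_search(cost, max_range=7):
    """Return ((mvx, mvy), number of evaluated positions)."""
    cx, cy = 0, 0
    best = cost(cx, cy)
    evaluated = 1
    step = (max_range + 1) // 2            # 4 for a range of +/-7
    while step >= 1:
        candidate = (best, cx, cy)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue               # centre was already evaluated
                c = cost(cx + dx, cy + dy)
                evaluated += 1
                if c < candidate[0]:
                    candidate = (c, cx + dx, cy + dy)
        best, cx, cy = candidate
        step //= 2
    return (cx, cy), evaluated


def toy_cost(mvx, mvy):
    """Hypothetical cost surface with its minimum at (5, -3)."""
    return abs(mvx - 5) + abs(mvy + 3)


mv, evaluated = three_step_search(toy_cost)
full_search_positions = (2 * 7 + 1) ** 2
print(mv, evaluated, full_search_positions)
```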
Search range
Predictive search
MVpredict=(MV1+MV2+MV3)/3
Search area
MV1 MV2
MV3
Question : what happens if the motion vector (mvx , mv y ) takes real values instead of integer values ?
Answer : find the integer motion vector first, then interpolate from the surrounding pixels for a refined search. The most popular case is half-pixel accuracy (i.e., 1 out of 8 positions for refinement). 1/3-pixel accuracy will be proposed in the H.263L standard.
Half-pixel accuracy often leads to a match with less residue (higher reconstruction quality), and hence decreases the bit rate required in the subsequent encoding.
[Figure legend: integer MV position; candidates of half-pixel MV positions]
Evaluation of BM motion estimation
Advantages :
Disadvantages :
Improvements :
I/ P/ B frame (MPEG-x)
[Figure: intra-frame, predictive frame and bi-directional frame; forward prediction and bidirectional prediction]