
SPEECH PROCESSING

Trịnh Văn Loan
Department of Computer Engineering
Faculty of Information Technology, Hanoi University of Technology (ĐHBK Hà Nội)

References

La parole et son traitement automatique. Calliope, Masson, 1989.
Traitement de la parole. René Boite and Murat Kunt, Presses Polytechniques Romandes, 1987.
Fundamentals of Speech Signal Processing. S. Saito and K. Nakata, Academic Press, 1985.
Digital Processing of Speech Signals. Lawrence R. Rabiner and Ronald W. Schafer, Prentice-Hall, 1978.
Discrete-Time Processing of Speech Signals. John R. Deller, John G. Proakis and John H. L. Hansen, 1999.
Tiếng Việt hiện đại (Ngữ âm, ngữ pháp, phong cách) [Modern Vietnamese: phonetics, grammar, style]. Nguyễn Hữu Quỳnh, Hà Nội, 1994.
Dẫn luận Ngôn ngữ học [Introduction to linguistics]. Nguyễn Thiện Giáp, Đoàn Thiện Thuật and Nguyễn Minh Thuyết, Hà Nội, 1994.

http://dce.hut.edu.vn

Contents

1. Some basic concepts
2. Speech signal processing
3. Speech coding
4. Speech synthesis
5. Speech recognition

1. Some basic concepts

Speech processing means processing the information contained in the speech signal in order to transmit or store that signal, or to synthesize or recognize speech.

Research in speech processing requires knowledge from an increasingly diverse range of fields, from phonetics and linguistics to signal processing.

Purpose

Code the speech signal efficiently in order to transmit and store speech.
Synthesize and recognize speech, moving towards human-machine communication by voice.
All applications of speech processing rely on the results of speech analysis.

Distinguishing speech from other sounds

Speech is distinguished from other sounds by acoustic characteristics that originate in the speech production mechanism.
There are two kinds of sound source:
    periodic (the vocal folds vibrate)
    noise (the vocal folds do not vibrate)

The vocal apparatus

[Figure: block diagram of the vocal apparatus. Labelled parts: nasal cavity, soft palate, epiglottis, vocal folds (cords), oesophagus, trachea, pharynx.]

The glottis

[Figure: the glottis in the positions for inhaling, breathing, phonation and whispering, with the glottis and the vocal folds marked. A. Glottis during respiration. B. Glottis during phonation. 1. Glottis, 2. Vocal cords, 3. Epiglottis, 5. Arytenoid cartilages.]

[Figure: the vocal folds during one cycle of vibration.]

Representing the speech signal

Waveform over time
[Figure: speech waveform.]

WAV file
Sampling frequency: 8 kHz, F1 = 11025 Hz, 2F1, 4F1 (16 kHz, 10 kHz)
Bits per sample: 8, 16
Mono, stereo

Spectrum of the speech signal
[Figure: spectrum of a speech signal.]

Spectrogram (sonagram)
[Figures: recordings made with different types of microphone; two different voices producing the same sound; the same speaker producing the same sound.]

[Figure: production of voiced sounds.]

Formants and antiformants

Energy and zero-crossing rate

[Figure: four panels against time in seconds for the file C:\wav\1-6-5-8-10-0.wav (samples 1 to 43029; window length 160 samples, shift 40 samples, window type 1): the signal amplitude, the short-time energy En, the short-time magnitude Mn and the zero-crossing rate ZC.]
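The three short-time measures named in the figure (energy, magnitude, zero-crossing rate) can be sketched as follows. This is a minimal illustration, not the program that produced the figure: it assumes a rectangular window (the meaning of "window type 1" is not given), uses the frame length of 160 samples and shift of 40 samples from the figure caption, and runs on a synthetic 100 Hz tone sampled at 8 kHz. All function names are hypothetical.

```python
import math

def frames(x, length=160, shift=40):
    """Yield successive frames of `length` samples, advancing by `shift` samples."""
    for start in range(0, len(x) - length + 1, shift):
        yield x[start:start + length]

def short_time_energy(frame):
    # Sum of squared samples in the frame (rectangular window assumed).
    return sum(s * s for s in frame)

def short_time_magnitude(frame):
    # Sum of absolute sample values in the frame.
    return sum(abs(s) for s in frame)

def zero_crossings(frame):
    # Number of sign changes between neighbouring samples.
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

# Synthetic test signal (assumed values): 100 Hz sine, amplitude 0.5, Fs = 8 kHz.
fs = 8000
x = [0.5 * math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]

n_frames = sum(1 for _ in frames(x))
first = next(frames(x))
energy = short_time_energy(first)
magnitude = short_time_magnitude(first)
zc = zero_crossings(first)
```

A voiced segment typically shows high energy and few zero crossings, while an unvoiced segment shows the opposite, which is why the two measures are plotted together.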

[Figure: production of unvoiced sounds.]

Some phonetic characteristics of Vietnamese

Monosyllabic.
Tonal (6 tones); a change of tone changes the meaning.
No morphological inflection.

Some phonetic characteristics of Vietnamese

Phoneme system: 14 vowels (11 monophthongs and 3 diphthongs) and 22 consonants.

[Table: the 11 monophthongs, numbered 1 to 11, each with example words.]

The 3 diphthongs and their spellings:
1. ia, yê, ya, iê (for example: kia, khuya)
2. ua, uô (for example: tua rua)
3. ưa, ươ (for example: lưa thưa)

Phoneme system: 22 consonants

[Table: the 22 consonants, numbered 1 to 22, each with example words such as sinh viên, long lanh, xa xôi.]

1. b      12. tr
2. p      13. s
3. v      14. r
4. ph     15. ch
5. m      16. nh
6. đ      17. ng, ngh
7. t      18. c, k, q
8. th     19. kh
9. d, gi  20. g, gh
10. n     21. h
11. l     22. x

Some phonetic characteristics of Vietnamese

Classification of vowels by tongue height and tongue movement
[Table: tongue height (high, mid, low) against tongue position (front, central, back).]

Classification of vowels by mouth opening and tongue movement
[Table: mouth opening (narrow, fairly narrow, fairly wide, wide) against tongue position (front row; back row, unrounded; back row, rounded).]
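Some phonetic characteristics of Vietnamese

Classification of consonants by stop or fricative, voiced or unvoiced, and nasality

[Table: manner of articulation against place of articulation (lips, tongue tip, tongue blade, tongue back, glottis).]

Manner of articulation and the consonants in each class:
Stops, aspirated: th
Stops, unaspirated, unvoiced: p, t, tr, ch, c/k/q
Stops, unaspirated, voiced: b, đ
Stops, nasal sonorants: m, n, nh, ng/ngh
Fricatives, unvoiced: ph, x, s, kh, h
Fricatives, voiced: v, d/gi, r, g/gh
Fricatives, lateral sonorant: l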

Stops: a burst sound, produced when the airflow from the lungs is completely blocked and has to break through the obstruction to escape.

Fricatives: a friction sound, produced when the outgoing airflow is only partly obstructed (it is merely hindered). The air has to squeeze through a narrow gap and, while escaping, rubs against the walls of the vocal apparatus.

Lateral consonant: the tongue tip touches the gums and blocks the air passage, forcing the air to pass through the gaps on both sides of the tongue, next to the cheeks, which produces a light friction sound (l).

When the outgoing airflow is obstructed it produces a friction sound or a burst. This aperiodic signal is called noise.

During the articulation of some consonants the vocal folds also vibrate at the same time, producing a voiced sound.

A consonant in which noise dominates is called an obstruent (noisy consonant).
A consonant in which voicing dominates is called a sonorant.

Waveforms of some Vietnamese words

[Figures: waveforms of single Vietnamese words, amplitude against time in ms, with segments labelled by their initial consonants (ph, tr, ch, nh, kh). All files are sampled at Fs = 11025 Hz.]

File          Samples   Duration (ms)
CHUR.WAV      5669      514
DDEER.WAV     5278      479
KHAR.WAV      7718      700
XOA.WAV       7690      697
NGHIR.WAV     6707      608
MEJ.WAV       4922      446
PHAIR.WAV     6934      629
TAMS.WAV      4989      452
BUF.WAV       6779      615
GIAF.WAV      8772      796
VIF.WAV       9872      895
KHOONG.WAV    6743      612
NHAAN.WAV     5713      518
TRIJ.WAV      4108      373
LAJ.WAV       5442      494
TIMF.WAV      5589      507
SOOS.WAV      8888      806
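The durations quoted with the waveform files follow directly from the sample count and the sampling frequency. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def duration_ms(num_samples, fs_hz):
    """Duration in milliseconds of num_samples samples taken at fs_hz samples per second."""
    return 1000.0 * num_samples / fs_hz

# Values taken from the figure captions (Fs = 11025 Hz).
chur = duration_ms(5669, 11025)   # CHUR.WAV
khar = duration_ms(7718, 11025)   # KHAR.WAV
```

Rounded to the nearest millisecond these give the 514 ms and 700 ms shown in the captions.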

Speech production model (Fant, 1960)

[Block diagram: excitation u(n) with pitch period T0 -> low-pass filter G(z) -> vocal tract V(z) -> radiation load R(z) -> speech x(n).]

$G(z) = \dfrac{A}{(1 + \alpha z^{-1})(1 + \beta z^{-1})}$
$V(z) = \dfrac{B}{\prod_{k=1}^{K} (1 + b_{1k} z^{-1} + b_{2k} z^{-2})}$
$R(z) = C(1 - z^{-1})$
A, B and C are gain constants.

All-pole (AR) model

$T(z) = G(z)\,V(z)\,R(z) = \dfrac{\sigma}{A(z)}$
A(z) is the transfer function of the inverse filter:
$A(z) = 1 + \sum_{i=1}^{2K+1} a_i z^{-i} = \sum_{i=0}^{P} a_i z^{-i}$, with $a_0 = 1$ and $P = 2K + 1$
$x(n) + \sum_{i=1}^{P} a_i x(n-i) = \sigma\, u(n)$
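The all-pole difference equation can be run directly as a recursive filter. The sketch below is illustrative only: the excitation is a plain impulse train, the gain argument stands for the scale factor of the excitation, and the second-order coefficients describe a hypothetical single resonance near 500 Hz at Fs = 8 kHz rather than a measured vocal tract.

```python
import math

def all_pole_filter(u, a, gain=1.0):
    """Compute x(n) = gain*u(n) - sum_{i=1..p} a[i-1]*x(n-i) for every n."""
    x = []
    for n, un in enumerate(u):
        acc = gain * un
        for i, ai in enumerate(a, start=1):
            if n - i >= 0:          # samples before n = 0 are taken as zero
                acc -= ai * x[n - i]
        x.append(acc)
    return x

def impulse_train(length, period):
    """Unit impulses every `period` samples: a crude voiced excitation."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]

# Hypothetical resonator: pole radius 0.95, resonance near 500 Hz, Fs = 8 kHz.
fs, f_res, radius = 8000, 500.0, 0.95
a = [-2 * radius * math.cos(2 * math.pi * f_res / fs), radius ** 2]

# Pitch period of 80 samples corresponds to F0 = 100 Hz at this sampling rate.
x = all_pole_filter(impulse_train(400, 80), a)

# First-order check: A(z) = 1 - 0.5 z^-1 gives the impulse response 0.5**n.
h = all_pole_filter([1, 0, 0, 0], [-0.5])
```

A synthesizer built on this model would cascade several such resonances, or use the full set of coefficients a_i obtained by analysis.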

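ARMA model

$T(z) = \dfrac{\sigma_1}{A_1(z)} + \dfrac{\sigma_2}{A_2(z)} = \sigma\,\dfrac{C(z)}{A(z)}$
$C(z) = \sum_{i=0}^{q} c_i z^{-i}$, with $c_0 = 1$
$x(n) + \sum_{i=1}^{p} a_i x(n-i) = \sigma \sum_{i=0}^{q} c_i u(n-i)$

[Figure: amplitude against frequency around a formant F_k; the bandwidth B_k is measured where the amplitude falls from 1 to 1/\sqrt{2}.]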

2. Speech signal processing

Spectral analysis

[Block diagram: x(n) -> pre-emphasis filter -> Hamming window (frame of N samples) -> FFT -> Log |.|]

Pre-emphasis filter: $H(z) = 1 - a z^{-1}$, $a = 0.95 \ldots 0.98$
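The spectral analysis chain (pre-emphasis, Hamming window, transform, log magnitude) can be sketched in a few lines. This is an illustration under stated assumptions: a naive O(N^2) DFT stands in for the FFT so that the example needs only the standard library, a = 0.97 is one value inside the quoted range, and the input is a synthetic 1000 Hz tone at Fs = 8 kHz. The function names are hypothetical.

```python
import cmath
import math

def pre_emphasis(x, a=0.97):
    """y(n) = x(n) - a*x(n-1), i.e. H(z) = 1 - a z^-1 (first sample passed through)."""
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]

def hamming(n_points):
    """Hamming window w(n) = 0.54 - 0.46 cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def log_magnitude_spectrum(frame):
    """log10 |X(k)| for k = 0..N/2, using a naive DFT in place of the FFT."""
    n_points = len(frame)
    spectrum = []
    for k in range(n_points // 2 + 1):
        xk = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                 for n in range(n_points))
        spectrum.append(math.log10(abs(xk) + 1e-12))  # small offset avoids log(0)
    return spectrum

# Synthetic frame (assumed values): 1000 Hz tone, Fs = 8 kHz, N = 256.
fs, n_points = 8000, 256
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(n_points)]
windowed = [s * w for s, w in zip(pre_emphasis(x), hamming(n_points))]
spec = log_magnitude_spectrum(windowed)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
peak_hz = peak_bin * fs / n_points
```

The peak falls in the bin whose centre frequency is 1000 Hz, as expected for this test tone.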

Homomorphic processing

$s(n) = h(n) * e(n) \;\Rightarrow\; S(\omega) = H(\omega)\,E(\omega)$
$\log[S(\omega)] = \log[H(\omega)] + \log[E(\omega)]$
$F^{-1}\{\log[S(\omega)]\} = F^{-1}\{\log[H(\omega)]\} + F^{-1}\{\log[E(\omega)]\}$
$\hat{s}(n) = F^{-1}\{\log[S(\omega)]\}$
$\hat{h}(n) = F^{-1}\{\log[H(\omega)]\}$
$\hat{e}(n) = F^{-1}\{\log[E(\omega)]\}$
$\hat{s}(n) = \hat{h}(n) + \hat{e}(n)$

Block diagram of homomorphic processing

[Block diagram: frame -> pre-emphasis filter -> Hamming window -> FFT -> Log |.| -> FFT^{-1} -> \hat{s}(n)]
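The idea that a convolution becomes a sum after the log and inverse transform can be checked on a toy signal. The sketch below computes a real cepstrum (log magnitude, then inverse DFT) with a naive DFT instead of an FFT, and omits pre-emphasis and windowing. The test signal is an assumption chosen for clarity: a unit pulse plus one echo of amplitude 0.5 delayed by 20 samples, standing in for a periodic excitation.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive DFT (or inverse DFT) of a sequence; O(N^2), for illustration only."""
    n_points = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / n_points)
               for n in range(n_points))
           for k in range(n_points)]
    return [v / n_points for v in out] if inverse else out

def real_cepstrum(s):
    """Inverse DFT of log|S(k)|; the result is real for a real input signal."""
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(s)]
    return [v.real for v in dft(log_mag, inverse=True)]

# Toy signal (assumed): pulse at n = 0 and an echo of amplitude alpha at n = period.
n_points, period, alpha = 128, 20, 0.5
s = [0.0] * n_points
s[0], s[period] = 1.0, alpha

c = real_cepstrum(s)
# Search the strongest cepstral peak away from the origin.
peak = max(range(1, n_points // 2 + 1), key=lambda n: c[n])
```

For this signal the largest peak lies at the echo delay, with height alpha/2. In pitch analysis the same peak search is applied to the high-quefrency part of the cepstrum of a voiced frame.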

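[Figure: example cepstrum c(n), with the \hat{h}(n) part marked and peaks spaced by the pitch period T0.]

Linear prediction (Linear Prediction Coding)

AR model: $x(n) + \sum_{i=1}^{p} a_i x(n-i) = \sigma\, u(n)$
Prediction: $\hat{x}(n) = -\sum_{i=1}^{p} \hat{a}_i x(n-i)$
Prediction error: $e(n) = x(n) - \hat{x}(n)$
Total squared error: $E = \sum_n e^2(n)$
Error minimization: $\dfrac{\partial E}{\partial \hat{a}_i} = 0,\quad i = 1, 2, \ldots, p$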
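Setting the derivatives of E to zero leads to a set of linear equations in the autocorrelation values R(k), namely sum_i a_i R(|i-j|) = -R(j) for j = 1..p. One common way to solve them is the Levinson-Durbin recursion, sketched below. The slides do not say which solver is used, so this choice is an assumption, and the autocorrelation values in the example are hypothetical (those of a first-order process with coefficient 0.9).

```python
def levinson_durbin(r, order):
    """Solve sum_i a_i R(|i-j|) = -R(j), j = 1..order.

    r     : autocorrelation values R(0..order)
    return: (list [a_1..a_order], residual prediction error energy)
    Sign convention: A(z) = 1 + sum_i a_i z^-i.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Correlation of the current predictor's error with lag m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection (PARCOR) coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)                # error energy shrinks at each order
    return a[1:], err

# Hypothetical autocorrelation of x(n) = 0.9 x(n-1) + u(n): R(k) proportional to 0.9**k.
r = [1.0, 0.9, 0.81]
coeffs, residual = levinson_durbin(r, order=2)
```

With this input the recursion returns a_1 close to -0.9 and a_2 close to 0, which matches the process x(n) - 0.9 x(n-1) = u(n).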

Determining the fundamental frequency

The value of F0 depends on sex and age:
    Male voice: 80..250 Hz
    Female voice: 150..500 Hz

[Block diagram: speech signal -> preprocessing -> F0 determination -> evaluation of the result]

Some methods for determining F0

Using the autocorrelation function
Using the average magnitude difference function
Using an inverse filter and the autocorrelation function
Homomorphic processing

Using the autocorrelation function

Compute the autocorrelation function R(k) of the speech signal x(n):
$R(k) = \sum_{n=0}^{N-1-k} x(n)\,x(n+k),\quad k = 0, 1, \ldots, K$
With Fs = 10 kHz: N = 300, K = 150. Find the maximum in the range (0, K).

Modified autocorrelation method

Clipping: samples with |x| < CL are removed, where CL is the clipping level.
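A minimal sketch of the autocorrelation method with the quoted values Fs = 10 kHz, N = 300 and K = 150. Two points are assumptions added for the illustration: the search starts at a minimum lag corresponding to 500 Hz, so that the large values near k = 0 are not mistaken for the pitch peak, and the input is a synthetic 200 Hz tone with one harmonic instead of real speech. No clipping is applied.

```python
import math

def autocorrelation(x, max_lag):
    """R(k) = sum_{n=0}^{N-1-k} x(n) x(n+k) for k = 0..max_lag."""
    n_points = len(x)
    return [sum(x[n] * x[n + k] for n in range(n_points - k))
            for k in range(max_lag + 1)]

def estimate_f0(x, fs, max_lag, min_lag):
    """Return (F0 in Hz, lag in samples) of the largest R(k) for min_lag <= k <= max_lag."""
    r = autocorrelation(x, max_lag)
    lag = max(range(min_lag, max_lag + 1), key=lambda k: r[k])
    return fs / lag, lag

fs, n_points, max_lag = 10000, 300, 150
min_lag = fs // 500          # assumed upper limit of 500 Hz for F0

# Synthetic voiced-like frame (assumed): 200 Hz fundamental plus its second harmonic.
x = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
     for n in range(n_points)]

f0, lag = estimate_f0(x, fs, max_lag, min_lag)
```

The peak is found at a lag of 50 samples, that is F0 = 200 Hz. With K = 150 the lowest detectable F0 at this sampling rate is about 67 Hz.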

Using the average magnitude difference function (AMDF)

$D(k) = \dfrac{1}{N} \sum_{m=0}^{N-1} |x(n+m) - x(n+m-k)|,\quad k = 0, 1, \ldots, K$
For a signal with period P: $D(iP) = 0,\quad i = 0, 1, \ldots$

Relation to the autocorrelation function r(k):
$\dfrac{1}{N} \sum_{n=0}^{N-1} |u(n)| = \beta \left[ \dfrac{1}{N} \sum_{n=0}^{N-1} u^2(n) \right]^{1/2}$
$D(k) = \beta \left\{ \dfrac{1}{N} \sum_{m=0}^{N-1} [x(n+m) - x(n+m-k)]^2 \right\}^{1/2} = \beta\,[2r(0) - 2r(k)]^{1/2},\quad k = 0, 1, \ldots, K$
with $\beta < 1$

[Figure: example. A frame x(n) for n from 700 to 1150, its autocorrelation r(k) and its AMDF D(k) for k from 0 to 300.]
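The AMDF can be computed with additions and absolute values only, and the pitch period shows up as a deep minimum rather than a peak. The sketch below is illustrative: the frame start, the minimum lag (500 Hz) and the synthetic 125 Hz test tone are assumptions. The tone is chosen so that only one multiple of its period falls inside the search range, since D(k) is also close to zero at every multiple iP and a practical detector has to deal with that ambiguity.

```python
import math

def amdf(x, start, n_points, max_lag):
    """D(k) = (1/N) sum_{m=0}^{N-1} |x(start+m) - x(start+m-k)| for k = 0..max_lag.

    `start` must be at least max_lag so that x(start+m-k) always exists.
    """
    return [sum(abs(x[start + m] - x[start + m - k]) for m in range(n_points)) / n_points
            for k in range(max_lag + 1)]

fs, n_points, max_lag = 10000, 300, 150
min_lag = fs // 500          # assumed upper limit of 500 Hz for F0

# Synthetic voiced-like signal (assumed): 125 Hz fundamental plus its second harmonic.
x = [math.sin(2 * math.pi * 125 * n / fs) + 0.5 * math.sin(2 * math.pi * 250 * n / fs)
     for n in range(500)]

d = amdf(x, start=max_lag, n_points=n_points, max_lag=max_lag)
lag = min(range(min_lag, max_lag + 1), key=lambda k: d[k])
f0 = fs / lag
```

The minimum is at a lag of 80 samples, that is F0 = 125 Hz.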

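Using an inverse filter (Simplified Inverse Filter Tracking)

[Block diagram: speech sampled at 10 kHz -> low-pass filter (900 Hz) -> LPC analysis (p = 4) -> inverse filter A(z) -> autocorrelation function -> peak search -> interpolation -> voiced/unvoiced decision -> F0]

Homomorphic processing

[Block diagram with the labels: low-pass filter (4700 Hz), 1 - z^{-1}, window W(n), evaluation of the result.]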

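Determining the formants

Parameters to determine:
    Formant frequency Fk
    Bandwidth Bk
Methods:
    Homomorphic processing
    LPC

Homomorphic processing

[Block diagram: speech signal -> pre-emphasis filter -> window -> FFT -> Log10 |.| -> FFT^{-1} -> cepstral window Wc(n) -> FFT]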

LPC method

[Block diagram: s(n) -> pre-emphasis filter -> window -> compute the coefficients a_i -> either compute 1/|A(e^{j\omega})| by FFT and find its maxima, or compute the roots of A(z) -> decision -> Fk, Bk]

3. Speech coding

Sequence of coding and decoding operations

[Block diagram: Filter 1 -> A/D -> Coder -> channel (noise, attenuation, errors) -> Decoder -> D/A -> Filter 2]

Some statistical properties of the speech signal

Probability density:
$p_x(\xi) = \lim_{N \to \infty,\ \Delta \to 0} \dfrac{N_\xi}{(2N+1)\,\Delta}$
$N_\xi$: number of samples x(n), $n \in [-N, \ldots, N]$, whose amplitude lies in the interval $[\xi - \Delta/2,\ \xi + \Delta/2]$
x is assumed ergodic and stationary.

Mean and variance

Mean of a stationary signal:
$\mu_x = \int \xi\, p_x(\xi)\, d\xi = \lim_{N \to \infty} \dfrac{1}{2N+1} \sum_{n=-N}^{N} x(n)$
For the speech signal, $\mu_x = 0$.
Variance:
$\sigma_x^2 = \int \xi^2\, p_x(\xi)\, d\xi = \lim_{N \to \infty} \dfrac{1}{2N+1} \sum_{n=-N}^{N} x^2(n)$

Instantaneous (memoryless) quantization

(L+1) signal levels x(0), x(1), ..., x(L)
L quantization levels
The quantization law y = Q(x) maps each input amplitude to one of the L quantization levels.
Each quantization level is represented by a b-bit word, $L = 2^b$.
Quantization error (quantization noise): e = Q(x) - x
Quantization step: the difference between two adjacent signal levels, $\Delta(i) = x(i) - x(i-1)$
Bit rate: $I = b\,F_s$ (bit/s), where $F_s$ is the sampling frequency

Bit rate

Signal quantized with 8 bits (256 levels), Fs = 8 kHz: bit rate = 64 kbit/s
Signal quantized with 16 bits (65536 levels), Fs = 16 kHz: bit rate = 256 kbit/s, so 1 hour of speech takes about 100 Mbyte
Speech therefore has to be coded (MPEG, GSM, G723, ...) before it is transmitted over a network or stored.
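The two bit rates and the storage figure follow from I = b Fs. A short sketch of the arithmetic, with hypothetical helper names and 1 Mbyte taken as 10^6 bytes (an assumption):

```python
def bit_rate_kbit_s(bits_per_sample, fs_khz):
    """I = b * Fs, in kbit/s when Fs is given in kHz (single channel)."""
    return bits_per_sample * fs_khz

def storage_mbyte(rate_kbit_s, seconds):
    """Storage for `seconds` of signal at `rate_kbit_s`, with 1 Mbyte = 10**6 bytes."""
    return rate_kbit_s * 1000 * seconds / 8 / 1e6

telephone = bit_rate_kbit_s(8, 8)         # 8 bits at 8 kHz
wideband = bit_rate_kbit_s(16, 16)        # 16 bits at 16 kHz
one_hour = storage_mbyte(wideband, 3600)  # one hour at the wideband rate
```

One hour at 256 kbit/s comes to 115.2 Mbyte, which is the order of magnitude quoted above.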

Sampling frequency (kHz)   Bits per sample   Bit rate (kbit/s)   Storage per minute (kbyte)   Application
48                         16                768                 11520                        Professional recording
44.1                       16                705.6               10584                        CD Audio
32                         16                512                 7680                         FM radio
22                         12                264                 3960                         AM radio
8                          8                 64                  960                          Telephone

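Uniform quantization

In general the quantization step is a function of the signal amplitude x (non-uniform quantization); the simplest case is uniform quantization.
The quantization level is chosen midway between two signal levels: $y(i) = \tfrac{1}{2}[x(i-1) + x(i)]$
A uniform, symmetric quantization law is characterized by:
    the saturation levels $\pm x_s$
    the number of quantization levels, L or (L+1) = $2^b$
    the quantization step $\Delta = 2x_s / L$

[Figure: uniform quantization characteristic with L = 9.]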

[Figures: a signal quantized uniformly with L = 16 levels, and the resulting quantization error.]

Properties of uniform quantization

Probability density of the quantization error:
$p_e(\varepsilon) = \sum_{i=-l}^{l} p_x(i\Delta + \varepsilon),\quad l = (L-1)/2$
The error is uniformly distributed between $-\Delta/2$ and $+\Delta/2$:
$p_e(\varepsilon) = 1/\Delta$ for $|\varepsilon| \le \Delta/2$, and $p_e(\varepsilon) = 0$ for $|\varepsilon| > \Delta/2$
Mean of the quantization noise: 0
Variance: $\sigma_e^2 = \int_{-\Delta/2}^{\Delta/2} \dfrac{\varepsilon^2}{\Delta}\, d\varepsilon = \dfrac{\Delta^2}{12}$

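Signal-to-noise ratio

$SN = \dfrac{\sigma_x^2}{\sigma_e^2}$
$SN\,(\mathrm{dB}) = 10 \lg \dfrac{\sigma_x^2}{\sigma_e^2} = 6.02\,b + 4.77 - 20 \lg \dfrac{x_s}{\sigma_x}$
If $x_s = 4\sigma_x$: $SN\,(\mathrm{dB}) = 6b - 7.3$
For $b \ge 6$, the ratio gains 6 dB for each additional quantization bit.
Adequate quality requires $b \ge 11$.

In general:
$SN = \dfrac{\text{signal energy}}{\text{noise energy}} = \dfrac{W_s}{W_n}$, $SN_{\mathrm{dB}} = 10 \log_{10} SN$
or
$SN_{\mathrm{dB}} = 20 \log_{10} \dfrac{\text{signal amplitude}}{\text{noise amplitude}}$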
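The 6.02 b + 4.77 rule follows from the step size 2 x_s / 2^b and a noise variance of step^2 / 12. The sketch below checks both numerically. It is an illustration under assumptions: a mid-rise quantizer with L = 2^b levels is used (one of several possible layouts), and the test input is a deterministic ramp that sweeps the full range evenly, so its quantization error is spread uniformly over each step.

```python
import math

def uniform_quantize(x, xs, bits):
    """Mid-rise uniform quantizer with L = 2**bits levels on [-xs, xs], with saturation."""
    levels = 2 ** bits
    step = 2.0 * xs / levels
    index = math.floor(x / step)
    index = max(-(levels // 2), min(levels // 2 - 1, index))  # clamp to the outer levels
    return (index + 0.5) * step

def snr_db(bits, xs_over_sigma):
    """SN(dB) = 20*b*log10(2) + 10*log10(3) - 20*log10(xs/sigma_x)."""
    return (20 * bits * math.log10(2) + 10 * math.log10(3)
            - 20 * math.log10(xs_over_sigma))

xs, bits, n = 1.0, 4, 100000
step = 2.0 * xs / 2 ** bits

# Dense ramp across the full range, offset by half a sample spacing from the cell edges.
ramp = [-xs + (k + 0.5) * 2.0 * xs / n for k in range(n)]
noise_power = sum((uniform_quantize(x, xs, bits) - x) ** 2 for x in ramp) / n

snr_8bit = snr_db(8, 4.0)                     # x_s = 4 sigma_x
gain_per_bit = snr_db(9, 4.0) - snr_db(8, 4.0)
```

The measured noise power agrees with step^2 / 12, the 8-bit figure is close to 6b - 7.3, and one extra bit adds about 6.02 dB.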

Signal-to-noise ratio

Energy                    SN (dB)
Signal = Noise            0
Signal = 2 x Noise        3
Signal = 10 x Noise       10
Signal = 100 x Noise      20
Signal = 1000 x Noise     30
Signal = 10^N x Noise     N x 10

Logarithmic quantization

After taking the logarithm of the signal amplitude, the result is coded linearly.

[Block diagram. Coder: x(n) -> log |.| -> y(n) -> Q[.] -> coding -> c(n), with sign[x(n)] coded alongside. Decoder: c(n) -> decoding -> y'(n) -> exp[.] -> x'(n), with the sign restored.]

Logarithmic quantization

Two solutions used for telephony:

μ-law (used in North America):
$y = \dfrac{\log(1 + \mu |x|)}{\log(1 + \mu)}$, $\mu = 255$

A-law (used in Europe):
$y = \dfrac{1 + \log(A |x|)}{1 + \log A}$ for $1/A \le |x| \le 1$, $A = 87.56$

8-bit logarithmic quantization is roughly equivalent to 12-bit uniform quantization.
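The μ-law curve and its inverse can be written directly from the formula. This sketch covers only the continuous compression characteristic for an input normalized to |x| <= 1, with the sign handled separately as in the block diagram. It is not the segmented 8-bit encoding used in telephone codecs, and the function names are hypothetical.

```python
import math

MU = 255.0

def mu_compress(x):
    """y = sign(x) * log(1 + mu*|x|) / log(1 + mu), for |x| <= 1."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_expand(y):
    """Inverse of mu_compress: x = sign(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(((1.0 + MU) ** abs(y) - 1.0) / MU, y)

small = mu_compress(0.01)             # a low-level sample is pushed well up the scale
full = mu_compress(1.0)               # full scale maps to full scale
back = mu_expand(mu_compress(-0.1))   # round trip
```

A sample at 1 % of full scale is mapped to about 23 % of the output range, so small amplitudes receive many more quantization levels than with a uniform law.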

Adaptive quantization

The quantization step depends on the signal amplitude.

Forward adaptation
[Block diagram. Coder: the gain G(n) is estimated from the input x(n); y(n) = x(n) G(n) -> Q[.] -> coding -> c(n); G(n) is sent along with c(n). Decoder: c(n) -> decoding -> y'(n), x'(n) = y'(n) / G'(n).]

Backward adaptation
[Block diagram. Coder: y(n) = x(n) G(n) -> Q[.] -> coding -> c(n); the gain G(n) is adapted from the coded output c(n). Decoder: c(n) -> decoding -> y'(n), x'(n) = y'(n) / G'(n), with G'(n) adapted from c(n) in the same way.]

Some audio and speech coding standards

G.721: ADPCM, 32 kbps, 4 bits, 8 kHz
G.722: ~ADPCM, 48 to 64 kbps
G.723: ~ADPCM, 24 kbps, 3 bits, 8 kHz
G.728: 16 kbps
GSM: mobile telephony, 13 kbps
Linear Predictive Encoding (Xerox), 5 kbps
Code Excited Linear Prediction (CELP)
Digital Video Interactive: ~ADPCM, 4 to 8 bits
VoIP: G723.1 (6.4 kbit/s), G728, G729 (8 kbit/s)

4. Speech synthesis

Speech is generated from a phonetic representation of the utterance.
Speech synthesis techniques:
    Direct synthesis
    Model-based synthesis:
        formant synthesizer
        LPC synthesizer
    Synthesizer simulating the vocal apparatus

Classification

Quality of the synthesizer:
    Naturalness
    Intelligibility
    Tone
    Intonation
Vocabulary size:
    Limited
    Unlimited
Text-to-speech synthesizer (Text-to-Speech)

Direct synthesis

Record natural speech in recording units.
Concatenate the recorded units into words and sentences.
Recording units:
    phoneme
    syllable (diphone)
    word
    word group
    sentence

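Formant synthesis

[Block diagram: a pulse generator controlled by F0 and noise generators, with amplitude controls A1 to A4, drive formant filters with frequencies F1, F2, F3 and bandwidths B1, B2, B3; separate branches model the oral cavity and the nasal channel.]

LPC synthesis (synthesis by analysis)

[Block diagram: a pulse generator controlled by F0 and a noise generator drive a digital filter of order p with coefficients a1, a2, ..., ap.]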

Simulating the vocal apparatus

[Block diagram: sound source -> vocal tract, both driven by control parameters.]

Sound source models

Simulating the sound source (periodic source)
Simulating the vocal folds: one-mass model, two-mass model, multi-mass model, two-beam model, ...

[Figures: two-mass model; multi-mass model; two-beam model.]

Simulating the vocal tract

[Figures: reflection model of the vocal tract; discretization of the tract into sections.]

Assumptions:
Rigid walls.
One-directional wave propagation along the axis of the tube: only frequencies below 5000 Hz are considered, and the cross-sectional area does not change too abruptly.
Losses are neglected: viscosity, heat conduction, ...

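Acoustic-electrical analogy

Acoustics (uniform tube)                         Electricity (equivalent transmission line)
p: pressure                                      v: voltage
u: volume velocity                               i: current
$\rho_0 / A$: acoustic inductance                L: inductance
$A / (\rho_0 c^2)$: acoustic capacitance         C: capacitance

Boundary condition of the equivalent line: v(l, t) = 0

Uniform, lossless tube

Webster equations:
$-\dfrac{\partial p}{\partial x} = \dfrac{\rho_0}{A}\,\dfrac{\partial u}{\partial t}$
$-\dfrac{\partial u}{\partial x} = \dfrac{A}{\rho_0 c^2}\,\dfrac{\partial p}{\partial t}$

Solution as a forward and a backward travelling wave:
$u(x, t) = u^+(t - x/c) - u^-(t + x/c)$
$p(x, t) = \dfrac{\rho_0 c}{A}\,[\,u^+(t - x/c) + u^-(t + x/c)\,]$

u: volume velocity, p: pressure, $\rho_0$: density of air, c: speed of sound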

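In the frequency domain

Boundary condition at the glottis: $u(0, t) = u_G(t) = U_G(\omega)\, e^{j\omega t}$
Boundary condition at the lips: $p(l, t) = 0$
The forward and reflected waves have the form
$u^+(t - x/c) = K^+ e^{j\omega (t - x/c)}$, $u^-(t + x/c) = K^- e^{j\omega (t + x/c)}$
so that
$p(x, t) = jZ_0\, \dfrac{\sin[\omega (l - x)/c]}{\cos(\omega l / c)}\, U_G(\omega)\, e^{j\omega t}$
$u(x, t) = \dfrac{\cos[\omega (l - x)/c]}{\cos(\omega l / c)}\, U_G(\omega)\, e^{j\omega t}$
with $Z_0 = \rho_0 c / A$

Frequency response

At the lips, x = l: $u(l, t) = U(l, \omega)\, e^{j\omega t}$, with $U(l, \omega) = \dfrac{U_G(\omega)}{\cos(\omega l / c)}$
Frequency response: $H(\omega) = \dfrac{U(l, \omega)}{U_G(\omega)} = \dfrac{1}{\cos(\omega l / c)}$
$H(\omega) \to \infty$ for $f = \dfrac{(2n+1)\,c}{4l}$
With l = 17.5 cm and c = 350 m/s: f = 500, 1500, 2500, ... Hz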
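The resonance frequencies of the uniform lossless tube follow from the zeros of cos(ωl/c). A short sketch of that calculation with the quoted values l = 17.5 cm and c = 350 m/s; the function name is hypothetical.

```python
def tube_resonances(length_m, c=350.0, count=3):
    """Resonances f_n = (2n + 1) c / (4 l) of a uniform tube closed at one end, in Hz."""
    return [(2 * n + 1) * c / (4.0 * length_m) for n in range(count)]

formants = tube_resonances(0.175)
```

For a 17.5 cm tube this gives values at 500, 1500 and 2500 Hz, spaced 1000 Hz apart. A shorter tract shifts all resonances upwards in the same proportion.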

Lossless reflection model (Kelly-Lochbaum)

[Figure: two adjacent tube sections, section k with length l_k and cross-sectional area A_k and section k+1 with length l_{k+1} and area A_{k+1}, with forward waves u^+ and backward waves u^- at both ends of each section.]

Continuity of pressure and volume velocity at the junction:
$p_k(l_k, t) = p_{k+1}(0, t)$
$u_k(l_k, t) = u_{k+1}(0, t)$

$u^+_{k+1}(t) = \dfrac{2A_{k+1}}{A_{k+1} + A_k}\, u^+_k(t - \tau) + \dfrac{A_{k+1} - A_k}{A_{k+1} + A_k}\, u^-_{k+1}(t)$
$u^-_k(t + \tau) = -\dfrac{A_{k+1} - A_k}{A_{k+1} + A_k}\, u^+_k(t - \tau) + \dfrac{2A_k}{A_{k+1} + A_k}\, u^-_{k+1}(t)$

The elementary tubes all have the same length: $\tau_k = \tau_{k+1} = \tau = l / c$

Reflection coefficient: $r_k = \dfrac{A_{k+1} - A_k}{A_{k+1} + A_k}$
$u^+_{k+1}(t) = (1 + r_k)\, u^+_k(t - \tau) + r_k\, u^-_{k+1}(t)$
$u^-_k(t + \tau) = -r_k\, u^+_k(t - \tau) + (1 - r_k)\, u^-_{k+1}(t)$

[Signal-flow diagram of the junction between tube k and tube k+1: delays τ in each tube and the gains (1 + r_k), (1 - r_k), r_k and -r_k at the junction.]

Effect of losses

Losses due to the movement of air in the vocal tract:
    due to the viscosity of the air
    due to heat conduction
    due to vibration of the walls
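The junction relations of the Kelly-Lochbaum model reduce to one reflection coefficient per pair of adjacent sections. The sketch below computes those coefficients and applies the scattering equations at a single junction, leaving out the delays τ. The area values are hypothetical and chosen only for illustration.

```python
def reflection_coefficients(areas):
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) for each pair of adjacent sections."""
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]

def junction(u_plus_k, u_minus_next, r):
    """Scattering at one junction (propagation delays omitted).

    Returns (u_plus_next, u_minus_k):
      u_plus_next = (1 + r) * u_plus_k + r * u_minus_next
      u_minus_k   = -r * u_plus_k + (1 - r) * u_minus_next
    """
    u_plus_next = (1 + r) * u_plus_k + r * u_minus_next
    u_minus_k = -r * u_plus_k + (1 - r) * u_minus_next
    return u_plus_next, u_minus_k

# Hypothetical cross-sectional areas in cm^2, from glottis to lips.
areas = [1.0, 3.0, 3.0, 0.5]
r = reflection_coefficients(areas)

# One junction with r = 0.5: the net volume velocity is the same on both sides.
u_plus_next, u_minus_k = junction(1.0, 0.2, 0.5)
net_left = 1.0 - u_minus_k
net_right = u_plus_next - 0.2
```

Equal areas give r = 0 and the waves pass through unchanged. A widening gives a positive coefficient and a narrowing a negative one.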

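Overall effect of the losses

[Figure: formant bandwidth against frequency, with the contribution of wall vibration and the combined contribution of heat conduction and viscosity.]

Radiation at the lips

Loss due to radiation at the lips
Infinite-baffle model
Radiation impedance:
$Z_r(\omega) = \dfrac{P(l, \omega)}{U(l, \omega)} = \dfrac{j\omega L_r R_r}{R_r + j\omega L_r}$
$R_r = \dfrac{128}{9\pi^2}$, $L_r = \dfrac{8a}{3\pi c}$
a: radius of the opening at the lips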

5. Speech recognition

Two phases: training (learning), then recognition.
Classification by:
    Vocabulary size
    Isolated or continuous words
    Single speaker or multiple speakers
    Recognition of words or of sentences

Classification by complexity

Isolated-word recognition, small vocabulary (< 100 words), single speaker
Larger vocabulary (several thousand words), single speaker
As above, but for a multi-speaker system
Recognition of connected words, small vocabulary (a few dozen words)
Recognition of short sentences, limited vocabulary, single speaker
As above, but for a multi-speaker system
Recognition of continuous speech, one or several speakers

Some issues for speech recognition systems

Silence detection
Improving the quality of the speech signal (noise reduction)
Speech is uttered with varying durations and rhythms
Recognition models:
    Hidden Markov Model (HMM)
    Neural networks

Speaker recognition (Speaker Recognition)

Speaker verification
Speaker identification
