
Speech Enhancement

Edited by:
Ashkan masomi

Azad University of Bushehr

EE493Q: Digital Speech Processing


Iterative Wiener Filtering
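Speech production model (block diagram): the excitation $u[k]$ is either a pulse train with a given pitch period (voiced) or the output of a white noise generator (unvoiced). It is scaled by the gain $g_s$ and passed through an all-pole filter to give the speech signal $s[k]$:

$$u[k] \;\rightarrow\; g_s \;\rightarrow\; \frac{1}{1 - \sum_{m=1}^{M} \alpha_m e^{-j\omega m}} \;\rightarrow\; s[k]$$

frequency domain:
$$S(\omega) = \frac{g_s}{1 - \sum_{m=1}^{M} \alpha_m e^{-j\omega m}} \, U(\omega)$$

time domain:
$$s[k] = \sum_{m=1}^{M} \alpha_m \, s[k-m] + g_s \, u[k]$$

$\boldsymbol{\alpha} = [\, \alpha_1 \;\cdots\; \alpha_M \,]^T$ = linear prediction parameters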
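As an illustration of the time-domain recursion above, here is a minimal Python sketch that synthesizes a signal from given linear prediction parameters. The function names `synthesize_ar` and `pulse_train`, the coefficient values, and the pitch period are assumptions chosen for this example; they do not come from the slides.

```python
import numpy as np

def synthesize_ar(alpha, g_s, u):
    """Run s[k] = sum_m alpha[m-1] * s[k-m] + g_s * u[k], zero initial state."""
    alpha = np.asarray(alpha, dtype=float)
    M = len(alpha)
    s = np.zeros(len(u))
    for k in range(len(u)):
        acc = g_s * u[k]
        for m in range(1, M + 1):
            if k - m >= 0:          # samples before k = 0 are taken as zero
                acc += alpha[m - 1] * s[k - m]
        s[k] = acc
    return s

def pulse_train(n_samples, pitch_period):
    """Voiced excitation: a unit pulse every `pitch_period` samples."""
    u = np.zeros(n_samples)
    u[::pitch_period] = 1.0
    return u

# Hypothetical example values (a stable 2nd-order all-pole model)
alpha = [1.3, -0.8]
voiced = synthesize_ar(alpha, g_s=0.5, u=pulse_train(400, pitch_period=80))
rng = np.random.default_rng(0)
unvoiced = synthesize_ar(alpha, g_s=0.5, u=rng.standard_normal(400))
```

Feeding a single unit pulse gives the impulse response of the model, whose DFT should match $g_s / (1 - \sum_m \alpha_m e^{-j\omega m})$ on the DFT frequency grid, provided the response has decayed within the chosen length.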
Overview
• Single-microphone noise reduction
  - Problem description
  - Spectral subtraction methods (= spectral filtering)
  - Iterative Wiener filtering based on speech modeling
• Multi-microphone noise reduction
  - Problem description
  - Multi-channel Wiener filtering (= spectral + spatial filtering)
Multi-Microphone Noise
Reduction Problem
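Figure (scenario): a speech source $s[k]$ and noise source(s) $n[k]$ are picked up by the microphones; the goal is (some) speech estimate $\hat{s}[k]$.

microphone signals:
$$m_i[k] = s_i[k] + n_i[k], \quad i = 1..4 \quad (\text{`4' = arbitrary})$$
with $s_i[k]$ the speech part and $n_i[k]$ the noise part.

Note: will use $m$ instead of $y$ now!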



Multi-Microphone Noise
Reduction Problem
Will use time-domain linear filtering:

$$m_i[k] = s_i[k] + n_i[k]$$
$$\mathbf{m}_i^T[k] = [\, m_i[k] \;\; m_i[k-1] \;\; \ldots \;\; m_i[k-L] \,]$$
$$\hat{s}[k] = \sum_{i=1}^{4} \mathbf{m}_i^T[k] \cdot \mathbf{w}_i$$

with $\mathbf{w}_i$ = filter coefficients.
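The filter-and-sum operation above can be sketched in Python as follows. This is only an illustration: the names `delay_vector` and `multichannel_filter`, the array layout, and the zero-padding before $k = 0$ are assumptions of this sketch.

```python
import numpy as np

def delay_vector(m_i, k, L):
    """Return [m_i[k], m_i[k-1], ..., m_i[k-L]], using zeros before k = 0."""
    return np.array([m_i[k - l] if k - l >= 0 else 0.0 for l in range(L + 1)])

def multichannel_filter(mics, w, L):
    """Compute s_hat[k] = sum_i m_i^T[k] . w_i for every k.

    mics: array of shape (n_mics, n_samples), one row per microphone.
    w:    array of shape (n_mics, L + 1), one filter per microphone.
    """
    n_mics, n_samples = mics.shape
    s_hat = np.zeros(n_samples)
    for k in range(n_samples):
        for i in range(n_mics):
            s_hat[k] += delay_vector(mics[i], k, L) @ w[i]
    return s_hat
```

Written this way, each channel is simply convolved with its own FIR filter and the results are added, so the output should agree with summing `np.convolve(mics[i], w[i])` over the microphones and truncating to the signal length.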



Multi-Microphone Noise
Reduction Problem
$$\hat{s}[k] = \sum_{i=1}^{4} \mathbf{m}_i^T[k] \cdot \mathbf{w}_i, \qquad \mathbf{w}_i = \;?$$

A cool design criterion for the w's would be (MSE)

$$\min_{\mathbf{w}_i} E\{\, |\hat{s}[k] - s[k]|^2 \,\}$$

...but $s[k]$ = unknown! (of course)

PS: this would also include dereverberation (see Topic-5)



Multi-Microphone Noise
Reduction Problem
$$\hat{s}[k] = \sum_{i=1}^{4} \mathbf{m}_i^T[k] \cdot \mathbf{w}_i, \qquad \mathbf{w}_i = \;?$$

Will use an MSE-criterion (mean-squared error)

$$\min_{\mathbf{w}_i} E\{\, |\hat{s}[k] - d[k]|^2 \,\}$$

...with $d[k]$ (`desired response') yet to be defined
This is `Wiener filtering' (see DSP-II)



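Wiener Filtering (review)

MMSE criterion:
$$\min_{\mathbf{w}} E\{\, |\hat{s}[k] - d[k]|^2 \,\}$$

...where
$$\hat{s}[k] = \sum_{i=1}^{4} \mathbf{m}_i^T[k] \cdot \mathbf{w}_i = \mathbf{m}^T[k] \cdot \mathbf{w}$$
$$\mathbf{w} = [\, \mathbf{w}_1^T \;\; \mathbf{w}_2^T \;\; \mathbf{w}_3^T \;\; \mathbf{w}_4^T \,]^T$$
$$\mathbf{m}[k] = [\, \mathbf{m}_1^T[k] \;\; \mathbf{m}_2^T[k] \;\; \mathbf{m}_3^T[k] \;\; \mathbf{m}_4^T[k] \,]^T$$

Solution is:
$$\mathbf{w} = [\, E\{\mathbf{m}[k] \cdot \mathbf{m}[k]^T\} \,]^{-1} \cdot E\{\mathbf{m}[k] \cdot d[k]\}$$
with $E\{\mathbf{m}[k] \cdot \mathbf{m}[k]^T\}$ the auto-correlation matrix and $E\{\mathbf{m}[k] \cdot d[k]\}$ the cross-correlation vector.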
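In practice the expectations in the Wiener solution are typically replaced by sample averages. The Python sketch below shows one way to do this. The function name `wiener_solution` and the row-per-sample data layout are assumptions of this example, and it presumes a desired response $d[k]$ is actually available.

```python
import numpy as np

def wiener_solution(Mk, d):
    """Estimate w = (E{m m^T})^-1 E{m d} from samples.

    Mk: array of shape (n_samples, n_coeffs); row k is the stacked vector m^T[k].
    d:  array of shape (n_samples,); the desired response d[k].
    """
    n = Mk.shape[0]
    R_mm = Mk.T @ Mk / n    # sample auto-correlation matrix
    r_md = Mk.T @ d / n     # sample cross-correlation vector
    # Solve R_mm w = r_md instead of forming the inverse explicitly
    return np.linalg.solve(R_mm, r_md)
```

If $d[k]$ happens to be an exact linear combination of the entries of $\mathbf{m}[k]$, this estimate recovers the combination weights, which makes for a simple sanity check.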



MWF: Multi-Channel Wiener Filtering

$$m_1[k] = s_1[k] + n_1[k]$$
$$\hat{s}[k] = \mathbf{m}^T[k] \cdot \mathbf{w}, \qquad \mathbf{w}_i = \;?$$
$$\min_{\mathbf{w}_i} E\{\, |\hat{s}[k] - d[k]|^2 \,\}$$

choice for $d[k]$:
$$d[k] = s_1[k]$$
= signal part in microphone-1 (`1' = arbitrary)
= unknown signal! (difference with `standard' Wiener filtering)
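Because $s_1[k]$ is unknown, the cross-correlation vector cannot be estimated directly. One common workaround, which this excerpt of the slides does not spell out and is therefore stated here as an assumption, relies on speech and noise being uncorrelated and on noise-only periods being available (for example, detected speech pauses). Under those assumptions $E\{\mathbf{m}[k] s_1[k]\} = E\{\mathbf{m}[k] m_1[k]\} - E\{\mathbf{n}[k] n_1[k]\}$. The function name and data layout below are hypothetical.

```python
import numpy as np

def mwf_from_noise_stats(M_all, N_only):
    """MWF sketch for d[k] = s_1[k], using noise-only data instead of s_1.

    M_all:  (n_samples, n_coeffs) stacked microphone vectors m^T[k].
    N_only: (n_noise_samples, n_coeffs) stacked vectors from noise-only
            periods (ASSUMPTION: available, and the noise is stationary).
    Column 0 is assumed to hold the current sample of microphone 1.
    """
    R_mm = M_all.T @ M_all / M_all.shape[0]
    R_nn = N_only.T @ N_only / N_only.shape[0]
    # ASSUMPTION: speech and noise uncorrelated, so
    # E{m s_1} = E{m m_1} - E{n n_1}
    r_ms1 = R_mm[:, 0] - R_nn[:, 0]
    return np.linalg.solve(R_mm, r_ms1)
```

With no noise at all (`N_only` all zeros) the result reduces to the unit vector that passes microphone 1 through unchanged, as one would expect.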



Interlude: Kalman Filter
Definition: $\hat{x}[k|l]$ = MMSE-estimate of $x[k]$ using all available data up until time $l$
• $\hat{x}[k|k]$ = `FILTERING' estimate
• $\hat{x}[k|k-n],\; n > 0$ = `PREDICTION' estimate
• $\hat{x}[k|k+n],\; n > 0$ = `SMOOTHING' estimate
Kalman filter for Speech
Enhancement
Assume AR model of speech and noise:
$$s[k] = \sum_{m=1}^{M} \alpha_m \, s[k-m] + g_s \, u[k]$$
$$n[k] = \sum_{m=1}^{N} \beta_m \, n[k-m] + g_n \, w[k]$$
$u[k]$, $w[k]$ = zero mean, unit variance, white noise

Equivalent state-space model is... (see p. 34)
$$\mathbf{x}[k+1] = A \, \mathbf{x}[k] + \mathbf{v}[k]$$
$$y[k] = \mathbf{c}^T \mathbf{x}[k]$$
Kalman filter for Speech
Enhancement
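with:
$$\mathbf{x}^T[k] = [\, s[k-M+1] \;\cdots\; s[k] \;\; n[k-N+1] \;\cdots\; n[k] \,]$$
$$\mathbf{v}[k] = G \cdot [\, u[k] \;\; w[k] \,]^T, \qquad Q = G \cdot G^T$$
$$A = \begin{bmatrix} A_s & 0 \\ 0 & A_n \end{bmatrix}; \quad \mathbf{c}^T = [\, \underbrace{0 \;\cdots\; 0 \;\; 1}_{M} \;\; \underbrace{0 \;\cdots\; 0 \;\; 1}_{N} \,]; \quad G = \begin{bmatrix} \mathbf{g}_s & 0 \\ 0 & \mathbf{g}_n \end{bmatrix}$$
$$A_s = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \alpha_M & \alpha_{M-1} & \cdots & \alpha_1 \end{bmatrix}; \quad A_n = \begin{bmatrix} 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \beta_N & \beta_{N-1} & \cdots & \beta_1 \end{bmatrix}$$
$$\mathbf{g}_s^T = [\, 0 \;\cdots\; 0 \;\; g_s \,]; \quad \mathbf{g}_n^T = [\, 0 \;\cdots\; 0 \;\; g_n \,]$$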
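To make the state-space construction concrete, here is a Python sketch that builds $A$, $\mathbf{c}$, $G$ and $Q$ from given AR parameters and runs a standard Kalman filter on a single noisy signal. The function names, the initial covariance `P = I`, and the assumption that the AR parameters are known and both gains are non-zero are choices made for this sketch, not statements from the slides.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix whose last row is [c_P, ..., c_1] for [c_1, ..., c_P]."""
    P = len(coeffs)
    A = np.zeros((P, P))
    A[:-1, 1:] = np.eye(P - 1)     # shift structure
    A[-1, :] = coeffs[::-1]        # AR coefficients in reversed order
    return A

def build_state_space(alpha, beta, g_s, g_n):
    """Return A, c, G, Q for the stacked speech + noise AR state."""
    M, N = len(alpha), len(beta)
    A = np.zeros((M + N, M + N))
    A[:M, :M] = companion(np.asarray(alpha, dtype=float))
    A[M:, M:] = companion(np.asarray(beta, dtype=float))
    c = np.zeros(M + N)
    c[M - 1] = 1.0                 # picks s[k]
    c[M + N - 1] = 1.0             # picks n[k]
    G = np.zeros((M + N, 2))
    G[M - 1, 0] = g_s
    G[M + N - 1, 1] = g_n
    return A, c, G, G @ G.T

def kalman_filter_speech(y, A, c, Q, M):
    """Kalman filter for y[k] = c^T x[k]; returns s_hat[k] from x_hat[k|k].

    There is no separate measurement noise term because n[k] is part of
    the state. ASSUMPTION: zero initial state with covariance P = I.
    """
    n = A.shape[0]
    x = np.zeros(n)
    P = np.eye(n)
    s_hat = np.zeros(len(y))
    for k, yk in enumerate(y):
        # measurement update with the new sample y[k]
        K = P @ c / (c @ P @ c)
        x = x + K * (yk - c @ x)
        P = P - np.outer(K, c @ P)
        s_hat[k] = x[M - 1]        # entry M-1 of the state is s[k]
        # time update to x_hat[k+1|k]
        x = A @ x
        P = A @ P @ A.T + Q
    return s_hat
```

Since the observation is modelled as exactly $s[k] + n[k]$, the filtered speech and noise estimates always add up to $y[k]$; the filter only decides how to split each sample between the two AR models.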
Kalman filter for Speech
Enhancement

PS: This was the single-microphone case.
How can this be extended to the multi-microphone case?

$$\mathbf{x}[k+1] = A \, \mathbf{x}[k] + \mathbf{v}[k]$$
$$y[k] = C^T \mathbf{x}[k]$$

Same $A$, $\mathbf{x}$, $\mathbf{v}$
Kalman filter for Speech
Enhancement
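Iterative algorithm (block diagram): $y[k]$ → split signal in frames → estimate parameters $\hat{g}_{s,i};\ \hat{\alpha}_{m,i}$ and $\hat{g}_{n,i};\ \hat{\beta}_{m,i}$ → Kalman Smoother or Kalman Filter → $\hat{s}_i[m]$, $\hat{n}_i[m]$ → reconstruct signal → $\hat{s}[k]$, repeated over several iterations.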

Disadvantages of the iterative approach:


• complexity
• delay



CONCLUSIONS
• Single-channel noise reduction
  - Basic system is `spectral subtraction'
  - Only spectral filtering, not easily extended to multi-channel case for additional spatial filtering
  - Hence can only exploit differences in spectra between noise and speech signal:
    noise reduction at expense of speech distortion;
    achievable noise reduction may be limited
• Multi-channel noise reduction
  - Basic system is MWF, possibly extended with speech distortion regularization and spatial preprocessing
  - Provides spectral + spatial filtering (links with beamforming!)
  - Kalman filtering based alternative approach (not easily applied in practice)
