
Music 421

Spring 2004-2005
Homework #7
Overlap-Add STFT Processing
85 points
Due in one week (5/26/2005)

1. (20 pts) Constant-Overlap-Add Condition

(a) (5 pts) Assuming the window w(n) satisfies

$$\sum_{m=-\infty}^{\infty} w(n - mR) = 1, \quad \forall n,$$

show that

$$X(\omega) = \sum_{m=-\infty}^{\infty} X_m(\omega).$$

(b) (3 pts) For windows in the Blackman-Harris family only (rectangular, Hann fam-
ily, and Blackman cases), which hop size values give constant overlap-add? You
may use a brute force matlab method to obtain your answer.
(c) (5 pts) For each of the same set of windows, justify your COLA hop size answer
using a frequency domain argument. That is, show that processing a signal using
constant-overlap-and-add windows gives perfect reconstruction in the frequency
domain, when using the hop sizes you chose.
(d) (5 pts) Why does the Kaiser window not overlap-add exactly for R > 1? What
ranges of hop sizes should be used and why? [Characterize the valid hop sizes
in terms of one or more spectral properties of the window. Hint: Consider the
Poisson Summation Formula.]
(e) (2 pts) Suppose a length M = 51 Hamming window designed by hamming(M) in
Matlab is modified to make it “periodic” yet symmetric as follows:
w(1) = w(1)/2;
w(M) = w(M)/2;
Determine the change in the worst-case side-lobe level (modified minus original),
to within a tenth of a dB. Which window has the narrower main lobe? Explain
how these answers make sense.

Solution:

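(a) (5 pts)

$$
\begin{aligned}
\sum_{m=-\infty}^{\infty} X_m(\omega)
&= \sum_{m=-\infty}^{\infty} \left( \sum_{n=-\infty}^{\infty} x_m(n)\, e^{-j\omega n} \right) \\
&= \sum_{m=-\infty}^{\infty} \left( \sum_{n=-\infty}^{\infty} x(n)\, w(n - mR)\, e^{-j\omega n} \right) \\
&= \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n} \sum_{m=-\infty}^{\infty} w(n - mR) \\
&= \sum_{n=-\infty}^{\infty} x(n)\, e^{-j\omega n} \cdot 1 \\
&= X(\omega).
\end{aligned}
$$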

(b) (10 pts)


The elementary window types which can be made to obey the constant overlap-
add constraint exactly are listed below, together with the hop-size values (integer
only) that satisfy the constraint. For full credit, answers must include all values
shown here, but no additional justification is required. Two points for rect, four
points for Hann family and four points for Blackman. If answers given are incor-
rect, then matlab code and plots may give partial credit. Partial credit should
also be given if odd window answers use M instead of M − 1, and if even window
answers are given the same as odd window answers.

Rectangular: R = M/k, where k is an integer ≥ 1
Hann Family: R = (M − 1)/(k + 1), where M is odd and k is an integer ≥ 1
Blackman: R = (M − 1)/(k + 2), where M is odd and k is an integer ≥ 1

Note: (1) For even-length windows, hop-sizes R = 1, 2 give constant overlap-add.
(2) For the Hamming window to give exact constant overlap-add using the
hop-size as shown for the Hann family, the “tail ends” need to be divided by 2.

More generally, the Blackman-Harris family is defined by

$$w_B(n) = w_R(n) \sum_{l=0}^{L-1} \alpha_l \cos(l\,\Omega_M n).$$

With L = 1, we get the rectangular window wR (n), with L = 2, we get the
generalized Hamming window, and with L = 3, we get the Blackman window
(see the text for details). The first zero-crossing of the window transform is at
2πL/M. The hop size values which satisfy the COLA constraint are given above.
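The brute-force search that part (b) allows can be sketched as follows. This is a minimal illustration in Python (standard library only) rather than the Matlab the course uses. It assumes symmetric windows with zero endpoints, and the helper names `ola_ripple` and `cola_hops` are hypothetical, not part of the course code.

```python
import math

def hann(M):
    """Symmetric Hann window with zero endpoints (M odd assumed)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def blackman(M):
    """Classic Blackman window (0.42, 0.5, 0.08) with zero endpoints."""
    return [0.42 - 0.5 * math.cos(2 * math.pi * n / (M - 1))
            + 0.08 * math.cos(4 * math.pi * n / (M - 1)) for n in range(M)]

def ola_ripple(w, R):
    """Relative peak-to-peak ripple of sum_m w(n - mR) in steady state."""
    M = len(w)
    K = 2 * math.ceil(M / R) + 2            # enough frames to reach steady state
    acc = [0.0] * ((K - 1) * R + M)
    for m in range(K):
        for n, wn in enumerate(w):
            acc[m * R + n] += wn
    steady = acc[M - 1:(K - 1) * R + 1]     # region where no overlapping frame is missing
    mean = sum(steady) / len(steady)
    return (max(steady) - min(steady)) / mean

def cola_hops(w, tol=1e-9):
    """Integer hop sizes 1..M whose overlap-add is constant to within tol."""
    return [R for R in range(1, len(w) + 1) if ola_ripple(w, R) < tol]

rect_hops = cola_hops([1.0] * 24)           # rectangular, M = 24
hann_hops = cola_hops(hann(25))             # Hann, M - 1 = 24
blackman_hops = cola_hops(blackman(25))     # Blackman, M - 1 = 24
```

For the rectangular window the search returns the divisors of M, while the Hann and Blackman results contain only hop sizes that divide M − 1 at least twice and at least three times, respectively.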

(c) (10 pts) We may consider time domain windowing as:

$$y(n) = x(n) \times \left( w(n) \ast \Psi_R(n) \right)$$

where w(n) is the windowing function, $\Psi_R(n)$ is an impulse train (with every
R-th sample one), and y(n) is the reconstructed signal. In the DTFT domain, this
becomes:

$$Y(\omega) = X(\omega) \ast \left( W(\omega) \times \Psi_{2\pi/R}(\omega) \right).$$

Now, we can see by inspection that for perfect reconstruction, we require that
$w(n) \ast \Psi_R(n) = 1$ in the time domain, or $W(\omega) \times \Psi_{2\pi/R}(\omega) = \delta(\omega)$ in the
frequency domain. For the latter to be true, we must choose a hop size R so that
$\Psi_{2\pi/R}(\omega)$ lands only on the zero crossings of W(ω) (and it also lands on the
peak of the main lobe). This may be done rigorously by recalling that the zero
crossings of the relevant window transforms occur at regular spacings outside the
main lobe. We recall main lobe widths of:

Rectangular : 2 × 2π/M
Hanning: 4 × 2π/M
Blackman: 6 × 2π/M

In all cases, the zero crossing spacing outside the main lobe is 2π/M . Hence, to
ensure that the samples in the frequency domain pulse train at 2π/R land on zero
crossings outside the main lobe, we require:

Rectangular : R≈ M
Hanning: R ≈ M/2
Blackman: R ≈ M/3.

(For full credit, answers need not repeat this final result if it was given in the
previous item.)
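The frequency-domain condition can also be checked numerically: by the Poisson summation formula, a window is COLA at hop size R exactly when its DTFT vanishes at 2πk/R for k = 1, ..., R − 1. The sketch below is Python (standard library only) and assumes a zero-endpoint Hann window; `dtft` and `alias_terms` are illustrative names.

```python
import cmath
import math

def hann(M):
    """Symmetric Hann window with zero endpoints."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def dtft(w, omega):
    """DTFT of the finite sequence w evaluated at one frequency (rad/sample)."""
    return sum(wn * cmath.exp(-1j * omega * n) for n, wn in enumerate(w))

def alias_terms(w, R):
    """|W(2*pi*k/R)| / |W(0)| for k = 1..R-1; all are zero iff w is COLA(R)."""
    w0 = abs(dtft(w, 0.0))
    return [abs(dtft(w, 2 * math.pi * k / R)) / w0 for k in range(1, R)]

good = max(alias_terms(hann(25), 12))   # R = (M - 1)/2: samples land on nulls
bad = max(alias_terms(hann(25), 5))     # R = 5: samples land on side lobes
```

Here `good` is at round-off level, while `bad` is a side-lobe value, so the overlap-add at R = 5 is not constant.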
Filter bank view
(The following discussion is not required for full credit, though it should be ac-
cepted for full credit.) We may also consider the FBS point of view. In Filter
Bank Summation, the signal is modulated to different carrier frequencies which
are fs /N apart and then filtered by the window which is a low pass filter. Since
the resultant signals are lowpass, you can decimate by R (the hop size) which
results in aliased images separated by fs /R appearing in the spectrum of each
filter output. If there is no zero-padding, then the sample points (which are fs /N
apart) fall right on the zeros of the window transform (except on the main lobe).
So the aliased copies do not contribute to the values of the spectrum. The aliasing
is squelched, except for the possible nonzero non DC values within the main lobe
which exist for all windows except for the rectangular window whose main lobe
only has one nonzero sample. For these few remaining nonzero samples, the re-
modulation which happens to each filter before the outputs are combined, suffices

to point all aliased values “in the right direction” so that they can be correctly
combined with the outputs of the neighboring filters.
The important concept is that the aliased values happen to fall on the zeros of
the window transform for most points. This results in an aliasing which is not
damaging. The aliasing becomes dangerous when modifications are made to the
spectrum before doing the resynthesis. Then the cancellation is effectively ruined.
So, in the case of spectral modification, it is best to keep the aliased copies at least
far enough apart so that the main lobes do not intersect, or else (as in the MPEG
filter bank) use M > N to reduce sidelobe leakage in the window transform.
(d) (10 pts)
Kaiser windows do not overlap-add exactly because the zero-crossings occur at
non-integer multiples of 2π/M . But if we consider the fact that the Kaiser window
maximizes the energy in the main lobe of the window, we can choose R so that the
values of the spectrum at frequencies (k 2π/R) are very close to zero (for k ≠ 0).
In other words, the first zero-crossing has to appear before ω = 2π/R. (Full credit
answers need only to proceed this far.)
Remark 1:
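To prove that there is no R > 0 giving constant overlap of the Kaiser window, we
can work with the closed-form window transform expression given in the overheads
and text:

$$W(\omega) = \frac{M}{I_0(\beta)} \, \frac{\sinh\left( \sqrt{\beta^2 - \left( \frac{M\omega}{2} \right)^2} \right)}{\sqrt{\beta^2 - \left( \frac{M\omega}{2} \right)^2}}$$

where β is the window parameter. Outside the main lobe, i.e., for ω > 2β/M, the
square-roots become imaginary, and sinh becomes sin:

$$W(\omega) = \frac{M}{I_0(\beta)} \, \frac{\sin\left( \sqrt{\left( \frac{M\omega}{2} \right)^2 - \beta^2} \right)}{\sqrt{\left( \frac{M\omega}{2} \right)^2 - \beta^2}}, \qquad (\omega > 2\beta/M)$$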

By the PSF, the window w is Cola(R) if and only if W is Nyquist(2π/R) for
some R > 0. Thus, the window is constant-overlap-add at hop-size R if and only
if there exists a series of nulls in the window transform at the harmonically related
frequencies 2π/R, 4π/R, 6π/R, . . . .
When β = 0, W (ω) reduces to the sinc function

$$W(\omega) = \frac{M}{I_0(\beta)} \, \frac{\sin(M\omega/2)}{M\omega/2} = \frac{M}{I_0(\beta)} \, \mathrm{sinc}(Mf),$$

and the nulls become harmonic (located at all nonzero integer multiples of 2π/M ).
This is expected because the Kaiser window reduces to the rectangular window
for β = 0.

For β > 0, the window-transform nulls occur at frequencies for which

$$\sqrt{\left( \frac{M\omega}{2} \right)^2 - \beta^2} = k\pi, \qquad k = 1, 2, 3, \ldots,$$

or

$$\omega^2 = \left( \frac{2}{M} \right)^2 \left[ \beta^2 + k^2\pi^2 \right], \qquad k = 1, 2, 3, \ldots.$$

If there is a harmonic subset of these nulls, then we can find ω1 ∈ R and K ∈ Z
such that

$$k^2 \omega_1^2 = \left( \frac{2}{M} \right)^2 \left[ \beta^2 + (kK)^2\pi^2 \right], \qquad k = 1, 2, 3, \ldots.$$
Note that the left-hand side is linear in k², while the right-hand side is affine in k².
Thus, the LHS and RHS can coincide at no more than one point, and not for all positive
integers k, unless β = 0. This completes the proof that there are no constant-overlap
step sizes for the continuous Kaiser-Bessel window. To extend the result
to the discrete-time case, we could try to show that there is no way to alias the
window transform on a block of width ωs so as to produce a harmonic set of nulls,
where ωs is the chosen discrete-time sampling rate in rad/sec. However, the result
is obvious in the time domain since COLA is invariant with respect to sampling.
In other words, if the continuous window is COLA at some displacement R, then
so is the sampled window, provided R is an integer multiple of the sampling
period.
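The null locations derived above are easy to tabulate. The following Python sketch (standard library only, with illustrative function names and an arbitrary M = 51) prints nothing and proves nothing by itself, but it shows that the ratios of successive nulls to the first null are integers only when β = 0.

```python
import math

def kaiser_nulls(beta, M, count):
    """Null frequencies (rad/sample) of the continuous Kaiser window transform,
    from omega_k = (2/M) * sqrt(beta^2 + (k*pi)^2), k = 1..count."""
    return [(2.0 / M) * math.sqrt(beta ** 2 + (k * math.pi) ** 2)
            for k in range(1, count + 1)]

def null_ratios(beta, M=51, count=4):
    """Each null frequency divided by the first null frequency."""
    nulls = kaiser_nulls(beta, M, count)
    return [w / nulls[0] for w in nulls]

rect_ratios = null_ratios(0.0)      # rectangular case: harmonic nulls
kaiser_ratios = null_ratios(10.0)   # beta = 10: nulls are not harmonically spaced
```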
Remark 2: From the window transform expression above, we have that the first
null of a Kaiser-Bessel window is at frequency

$$\omega_0 \approx \frac{2\beta}{M},$$

which should be ≤ 2π/R if the frame rate is to be tuned to the first null or higher
(which will give approximately constant overlap-add for the Kaiser window). That
is, we should use

$$R \le \frac{\pi}{\beta} M.$$
To additionally suppress aliasing in the individual channel signals of the STFT
(viewed now as a filter bank), the above limit on R should be cut in half. Since
β is on the order of 10 for high quality audio applications, a hop size less than
M/6 is typical when the STFT is to be modified.
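A rough numerical illustration of the hop-size guideline: the Python sketch below (standard library only) builds a Kaiser window from the power series for I0 and measures the overlap-add ripple at two hop sizes. The values M = 129 and β = 10 and the helper names are assumptions chosen for the example.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser(M, beta):
    """Symmetric Kaiser window of length M."""
    half = (M - 1) / 2.0
    return [bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ((n - half) / half) ** 2)))
            / bessel_i0(beta) for n in range(M)]

def ola_ripple(w, R):
    """Relative peak-to-peak ripple of sum_m w(n - mR) in steady state."""
    M = len(w)
    K = 2 * math.ceil(M / R) + 2
    acc = [0.0] * ((K - 1) * R + M)
    for m in range(K):
        for n, wn in enumerate(w):
            acc[m * R + n] += wn
    steady = acc[M - 1:(K - 1) * R + 1]
    return (max(steady) - min(steady)) / (sum(steady) / len(steady))

def max_hop(M, beta):
    """Largest integer hop size satisfying R <= pi * M / beta."""
    return math.floor(math.pi * M / beta)

w = kaiser(129, 10.0)
ripple_small_hop = ola_ripple(w, 16)   # well below the limit: nearly constant
ripple_large_hop = ola_ripple(w, 64)   # above the limit: strongly modulated
```

With these settings the limit is `max_hop(129, 10.0) == 40`, and halving it for modified spectra gives a hop size of about M/6.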
(e) (5 pts)
M = 51;                 % window length
N = 8192;               % FFT size (heavy zero-padding)
w = hamming(M);         % original symmetric Hamming window
W = db(fft(w,N));
wp = w;                 % modified window: end samples halved
wp(1)=w(1)/2;
wp(M)=w(M)/2;
Wp = db(fft(wp,N));
Ws = W(1:N/2);
Wps = Wp(1:N/2);

figure(1); clf;
plot([Ws,Wps]); axis tight; grid on;
hold on;
mle = 1+ceil(2*N/M); % main lobe edge, in bins + 1
plot([mle mle],[min(Ws),max(Wps)],'--');
legend('Original','Modified');
title(sprintf('Hamming window, length %d',M));
xlabel('Frequency (bins+1)');
ylabel('Magnitude (dB)');
zoom on;
cmd = 'print -deps ../eps/perhamm.eps';
disp(cmd); eval(cmd);
hold off;

% worst-case side lobe = largest value beyond the main lobe edge
[Wmax,i] = max(W(mle+1:N/2));
[Wpmax,ip] = max(Wp(mle+1:N/2));

format bank;
disp(sprintf('Modified minus original = %0.2f dB',Wpmax-Wmax));
disp(sprintf('Peak locations = %d and %d',i,ip));

% maxr is not a Matlab built-in; it is assumed here to be the course's
% helper that returns the interpolated peak location and height
[ii,Wmaxi] = maxr(W(mle+1:N/2));
[ipi,Wpmaxi] = maxr(Wp(mle+1:N/2));

disp(sprintf('Modified minus original interp = %0.2f dB',Wpmaxi-Wmaxi));
disp(sprintf('Interpolated peak locations = %0.2f and %0.2f',ii,ipi));

% OUTPUT:
% Modified minus original = -0.72 dB
% Peak locations = 790 and 830
% Modified minus original interp = -0.72 dB
% Interpolated peak locations = 789.90 and 830.48
It is probably surprising that the modified window has 0.72 dB lower side lobes
and a narrower main lobe. However, remember that the Hamming window comes
from a highly restricted class of functions (constant plus a cosine). Our modifi-
cation goes outside of this class of functions, so it is free to do better at approxi-
mating the Chebyshev window than can any member of the generalized Hamming

family.
For full credit, the answer must specify the side lobe level to within a tenth of a
dB, say that the new main lobe is narrower, and recognize that the new window
is no longer technically in the Hamming window family. The word “chebyshev”
need not appear in the answer.
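For readers without Matlab, the side-lobe comparison can be approximated with the Python sketch below (standard library only). It evaluates the DTFT directly on the same 8192-point grid and uses the same nominal main-lobe edge as the Matlab script. The function names are illustrative, and the sketch omits the interpolated peak search.

```python
import cmath
import math

def hamming(M):
    """Symmetric Hamming window, 0.54 - 0.46*cos(2*pi*n/(M-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def worst_sidelobe_db(w, N=8192):
    """Largest unnormalized magnitude (dB) beyond the nominal main-lobe edge."""
    M = len(w)
    edge = 1 + math.ceil(2 * N / M)          # same edge (in bins) as the Matlab script
    peak = 0.0
    for k in range(edge, N // 2):
        omega = 2 * math.pi * k / N
        mag = abs(sum(wn * cmath.exp(-1j * omega * n) for n, wn in enumerate(w)))
        peak = max(peak, mag)
    return 20 * math.log10(peak)

w = hamming(51)
wp = w[:]                                    # modified copy with halved end samples
wp[0] /= 2
wp[-1] /= 2
difference_db = worst_sidelobe_db(wp) - worst_sidelobe_db(w)
```

A negative `difference_db` means the modified window has the lower worst-case side lobe, which is the behavior the solution reports.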

2. (25 pts) Cyclic Convolution and the DFT

(a) (5 pts) Compute (by hand) the 4-point cyclic (“circular”) convolution of the
following length-4 discrete-time signals x and h:
$$x = [1, 0, -1, 0], \qquad h = \left[ \tfrac{1}{2}, \tfrac{1}{4}, 0, \tfrac{1}{4} \right].$$

Note that the time index runs from n = 0 to 3. Be sure to show all necessary
steps in the computation.
(b) (5 pts) Repeat the previous problem using the DFTs of x and h.
(c) (3 pts) Regarding the signal x as a sampled sinusoid, what are its amplitude,
phase, and frequency relative to the sampling rate fs ?
(d) (10 pts) Regarding the signal h as the impulse response of an FIR digital filter,
what is the corresponding
i. frequency response?
ii. amplitude response?
iii. phase response?
iv. phase delay?
v. group delay?
(e) (2 pts) What is the gain of the filter in the previous problem at frequency fs /4,
where fs denotes the sampling rate? Does this make sense?

Solution:

(a) (5 pts) This problem, as well as problem 3, requires a lot of hand computations.
To reduce the possibility of error, as well as to reduce the tedium, the trick is
to find a convenient way to organize the computations. A sufficiently compact
approach is matrix multiplication. The final answer is
$$x \ast h = \left[ \tfrac{1}{2}, 0, -\tfrac{1}{2}, 0 \right].$$
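The matrix organization mentioned above can be sketched in a few lines. This Python example (standard library only) builds the 4 × 4 circulant matrix of h and multiplies it by x, using exact fractions so the result can be compared with the hand computation. The helper names are illustrative.

```python
from fractions import Fraction as F

def circulant(h):
    """Circulant matrix C with C[n][k] = h[(n - k) mod N]."""
    N = len(h)
    return [[h[(n - k) % N] for k in range(N)] for n in range(N)]

def matvec(A, v):
    """Matrix-vector product using plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

x = [1, 0, -1, 0]
h = [F(1, 2), F(1, 4), F(0), F(1, 4)]
y = matvec(circulant(h), x)     # cyclic convolution of x and h
```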

(b) (5 pts) By the DFT convolution theorem, cyclic convolution in the time domain
corresponds to multiplication of the DFT’s in the frequency domain. So the
procedure should be
i. take the DFTs of x and h:

$$X = [\, 0, 2, 0, 2 \,] \quad \text{and} \quad H = \left[ 1, \tfrac{1}{2}, 0, \tfrac{1}{2} \right]$$

ii. multiply them together:

$$X \cdot H = [\, 0, 1, 0, 1 \,]$$

iii. take the inverse DFT.
The final answer is given in part (a).
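The three steps can be verified with a direct (non-fast) DFT. The Python sketch below uses only the standard library, and the function names `dft` and `idft` are illustrative.

```python
import cmath

def dft(x):
    """Length-N DFT computed directly from the definition."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Length-N inverse DFT computed directly from the definition."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [1, 0, -1, 0]
h = [0.5, 0.25, 0.0, 0.25]
X = dft(x)                               # step i
H = dft(h)
Y = [a * b for a, b in zip(X, H)]        # step ii
y = [v.real for v in idft(Y)]            # step iii
```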
(c) (3 pts) Regarded as a sampled sinusoid, the signal x is a cosine of amplitude 1 at
half of the Nyquist frequency. Therefore
i. amplitude = 1,
ii. phase = 0,
iii. frequency = fs /4.
(d) (10 pts) The key in this problem is to realize what is asked for. We are queried
as to the response characteristics of h[n] as a FIR digital filter. Consequently, the
appropriate transform is the DTFT. In fact, the DFT corresponds to the sampled
DTFT, which can be seen as the DTFT of the infinitely periodically extended
version of h[n], which is certainly not a FIR filter. Equivalently, one may extract
the FIR version of h[n] by windowing the periodic extension in the time domain.
This gives the DTFT as the bandlimited reconstruction of the DFT.
i. Frequency response:
By inspection, the DTFT is

$$H(\omega) = \frac{1}{2} + \frac{1}{4} e^{-j\omega} + \frac{1}{4} e^{-j3\omega}.$$

Grouping the last two terms using Euler's formula yields

$$H(\omega) = \frac{1}{2} + \frac{1}{2} e^{-j2\omega} \cos(\omega).$$
ii. Amplitude response:
The amplitude response is the magnitude, or complex modulus, of the frequency
response. There are several ways to obtain it, the easiest of which is
to take the frequency response, multiply it by its complex conjugate and take
the square root. Another method would be to take the real and imaginary
parts, square them both, add the results and take the square root. We finally
get

$$|H(\omega)| = \frac{1}{2} \sqrt{1 + \cos^2(\omega) + 2 \cos(\omega) \cos(2\omega)}.$$
iii. Phase response:
The phase response is obtained by taking the inverse tangent of the ratio of
the imaginary over the real parts of the frequency response:

$$\Theta(\omega) = \arctan\left( \frac{\mathrm{Im}[H(\omega)]}{\mathrm{Re}[H(\omega)]} \right)$$

If the real part is negative, π should be added to the result. Equivalently, take
1/j times the logarithm of the frequency response divided by the amplitude
response. This is evident from the polar form. So the answer is

$$\Theta(\omega) = -\arctan\left( \frac{\cos(\omega) \sin(2\omega)}{1 + \cos(\omega) \cos(2\omega)} \right).$$

iv. Phase delay:
The phase delay is defined as minus the phase response divided by the frequency:

$$P(\omega) \triangleq -\frac{\Theta(\omega)}{\omega}.$$

Note that a linear phase system has a constant phase delay, equal to the
implicit delay of the system. The answer is

$$P(\omega) = \frac{1}{\omega} \arctan\left( \frac{\cos(\omega) \sin(2\omega)}{1 + \cos(\omega) \cos(2\omega)} \right).$$
v. Group delay:
Defined as the negative derivative of the phase response, the group delay gives
an indication of how much a sinusoid of a given frequency is delayed when it
passes through the system.

$$D(\omega) \triangleq -\frac{d\Theta(\omega)}{d\omega}.$$

This calculation is tedious but straightforward if you recall that

$$\frac{d \arctan[f(x)]}{dx} = \left( \frac{1}{1 + f^2(x)} \right) \frac{d f(x)}{dx}.$$

The final answer is

$$D(\omega) = 1 - \frac{\sin^2(\omega) + \sin(\omega) \sin(2\omega)}{1 + \cos^2(\omega) + 2 \cos(\omega) \cos(2\omega)}.$$
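The closed-form answers of part (d) can be cross-checked numerically. The Python sketch below (standard library only) compares them with values computed directly from h. For the group delay it uses the standard identity D(ω) = Re{DTFT(n h[n]) / DTFT(h[n])} as an independent reference, which is an addition of this example and not part of the solution above. Frequencies where H(ω) = 0 (here ω = π) must be avoided.

```python
import cmath
import math

h = [0.5, 0.25, 0.0, 0.25]

def H(w):
    """Frequency response: DTFT of h at w (rad/sample)."""
    return sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))

def amplitude(w):
    """Closed-form amplitude response from part ii."""
    return 0.5 * math.sqrt(1 + math.cos(w) ** 2 + 2 * math.cos(w) * math.cos(2 * w))

def phase(w):
    """Closed-form phase response from part iii (atan2 handles the quadrant)."""
    return -math.atan2(math.cos(w) * math.sin(2 * w), 1 + math.cos(w) * math.cos(2 * w))

def group_delay(w):
    """Closed-form group delay from part v."""
    num = math.sin(w) ** 2 + math.sin(w) * math.sin(2 * w)
    den = 1 + math.cos(w) ** 2 + 2 * math.cos(w) * math.cos(2 * w)
    return 1 - num / den

def group_delay_direct(w):
    """Reference value: Re{ DTFT(n*h[n]) / DTFT(h[n]) }."""
    ramp = sum(n * hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))
    return (ramp / H(w)).real
```

For example, `group_delay(0.0)` evaluates to 1 sample, which matches the center of gravity of h.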
(e) (2 pts) Filter gain at frequency fs /4:
To get the gain from the amplitude response, we just have to evaluate the
expression for ω = 2π/4 = π/2.

$$|H(\pi/2)| = \frac{1}{2} \sqrt{1 + 0 + 0} = \frac{1}{2}.$$

This makes sense since this is the gain of the sampled cosine with frequency fs /4,
x[n] = [ 1, 0, −1, 0 ]. In fact, as we found previously in part (a),

$$x \ast h = \left[ \tfrac{1}{2}, 0, -\tfrac{1}{2}, 0 \right] = \frac{1}{2} \cdot [\, 1, 0, -1, 0 \,] = \frac{1}{2}\, x[n].$$
3. (10 pts) Acyclic Convolution and the DFT
Determine (by hand) the acyclic convolution of the two sequences of the previous
problem using

(a) (5 pts) the direct approach, and
(b) (5 pts) the DFT-based approach.

Again, be sure to write out the intermediate steps.


Solution:

(a) (5 pts) Using the direct approach:


To simulate acyclic convolution, sufficient zero-padding is used so that non-zero
samples do not “wrap around”. In the case of this exercise, we have to add 3 zero-
samples to x and h. For the calculation itself, the acyclic convolution summation
may be structured as a matrix multiplication. The resulting matrix has Toeplitz
structure and the minimum size of this matrix is 7 by 7. We obtain
$$x \ast h = \left[ \tfrac{1}{2}, \tfrac{1}{4}, -\tfrac{1}{2}, 0, 0, -\tfrac{1}{4}, 0 \right].$$

(b) (5 pts) Using the DFT-based approach:


The procedure is similar to that in part (b) of problem 2. But here, in the case of
the acyclic convolution, recall that the minimum length of the DFT must be the
sum of the lengths of the inputs minus one, otherwise we are undersampling in
the frequency domain, which leads to time aliasing. Therefore a minimum DFT
of length 4 + 4 − 1 = 7 is required. So we have to zero-pad x and h out to at least
length 7 before taking the DFT’s. Again, this calculation may be implemented
by multiplication by a [ 7 x 7 ] (or larger) matrix. Note the computational duality
to the result in part (a).
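Both approaches of this problem can be compared side by side. The Python sketch below (standard library only, illustrative function names) computes the acyclic convolution directly and then through length-7 DFTs of the zero-padded sequences.

```python
import cmath

def linear_convolve(x, h):
    """Direct acyclic convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def dft(x, sign=-1):
    """Length-N DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_convolve(x, h):
    """Acyclic convolution via DFTs of length len(x) + len(h) - 1."""
    N = len(x) + len(h) - 1
    X = dft(list(x) + [0.0] * (N - len(x)))   # zero-pad to avoid time aliasing
    Hk = dft(list(h) + [0.0] * (N - len(h)))
    y = dft([a * b for a, b in zip(X, Hk)], sign=+1)
    return [v.real / N for v in y]

x = [1, 0, -1, 0]
h = [0.5, 0.25, 0.0, 0.25]
y_direct = linear_convolve(x, h)
y_dft = dft_convolve(x, h)
```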

4. (20 pts) Bandpass Filtering for Noise Reduction and Dynamic Range Com-
pression
Write a program to denoise the corrupted birdsong in wrenpn1.wav by bandpass fil-
tering a single frequency corresponding to the chirping sinusoid. A dynamic range
compression scheme will also be implemented. The guideline on how to do this is given
below. NOTE: Parts (b)-(d) are BONUS; only parts (a), (e), and (f) are required.

(a) (5 pts) Read in frames of signal using a Hanning window of length M = 256 and
a hop-size which will give a constant overlap-add. State clearly the hop-size you
use.
(b) (bonus)(5 pts) For each frame of noisy signal, find a single largest peak of the
spectrum using findpeaks or otherwise.
(c) (bonus)(5 pts) Design a narrow bandpass filter H(ω) centered at the frequency just
obtained using the same window used in analysis with the same length, L = M .
Be sure that the filter has a unity DC gain in order to preserve our sinusoid’s
amplitude.
(d) (bonus)(5 pts) Multiply H(ω) and Xw (ω) to obtain the filtered output spectrum
where H(ω) and Xw (ω) is the Fourier transform of the filter impulse response and

the windowed signal respectively. Be sure that your zero-padding is enough to
avoid time-aliasing. State clearly what FFT length you use.
(e) (5 pts) IFFT the resulting spectrum and overlap-add the frame into the output
buffer. Make sure that your overlap-add is correctly aligned.
(f) (10 pts) Dynamic range compression: Equalize each output frame (before overlap-
add) so that they are all of equal amplitude level. This can be done, for example,
by normalizing each frame by its root-mean-square (RMS). Be sure that the final
output is below saturation. However, disable the normalization for all frames
having a small RMS value (e.g., near that of the noise during intervals of silence).
One effective technique is to define a frame gain that is fixed for the frame, but
varies smoothly from 1/rms to 0 as a function of signal-to-noise ratio.

Submit the denoised output sound with and without dynamic range compression.
Name the first one xxxxhw71.wav and the second one xxxxhw72.wav where xxxx are
the first four letters of your last name. Submit them to the TA or post on the web.
Also, submit the code as usual, and

(a) plots of the spectrograms of the denoised output with and without dynamic range
compression,
(b) a plot of the spectral slice of the output (before dynamic range compression)
superimposed on that of the input at the time halfway through.

Solution: (a) (10 pts) Full credit code must use a correct hop size and successfully
read in successive frames. Hop-size: In general, valid COLA hop-sizes for a Hann window
of length M are M/2, M/4, ..., 1. If we use a length 256 window, then the OLA is not
exactly constant but very close, because of asymmetry. If M = 255 and hanning(M)
is used, hop-sizes of (M + 1)/2, (M + 1)/4, ... will give exact OLA, whereas if M = 257
and hann(M) is used (zero end samples), a hop-size of (M − 1)/4 can be used (others
depend on whether the formula gives integer hop-sizes).
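The effect of the window length on exactness can be checked numerically. The Python sketch below (standard library only) reimplements the two Matlab Hann conventions, `hanning(M)` without zero end samples and `hann(M)` with them, and measures the overlap-add ripple at a hop size of 64. The Python function names are hypothetical stand-ins for the Matlab functions.

```python
import math

def hanning_matlab(M):
    """Like Matlab hanning(M): no zero end samples, cosine period M + 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (M + 1)) for n in range(1, M + 1)]

def hann_matlab(M):
    """Like Matlab hann(M): zero end samples, cosine period M - 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (M - 1)) for n in range(M)]

def ola_ripple(w, R):
    """Relative peak-to-peak ripple of sum_m w(n - mR) in steady state."""
    M = len(w)
    K = 2 * math.ceil(M / R) + 2
    acc = [0.0] * ((K - 1) * R + M)
    for m in range(K):
        for n, wn in enumerate(w):
            acc[m * R + n] += wn
    steady = acc[M - 1:(K - 1) * R + 1]
    return (max(steady) - min(steady)) / (sum(steady) / len(steady))

exact_255 = ola_ripple(hanning_matlab(255), 64)   # period 256: exact COLA
close_256 = ola_ripple(hanning_matlab(256), 64)   # period 257: small ripple remains
exact_257 = ola_ripple(hann_matlab(257), 64)      # period 256: exact COLA
```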

(b) (Bonus, 20 pts)


FFT length (bonus only): The FFT length needs to allow for the filtering to take place
without time aliasing. In general, the length must be at least Nx + Nh − 1, where Nx and
Nh are the lengths of the signal frame x and the filter h respectively. In this problem,
with Nx = Nh = 256, the FFT length must be at least 511, so 512 or larger in practice.
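The arithmetic behind the FFT length is small enough to spell out. The Python sketch below (standard library only, illustrative names) computes the minimum length for 256-sample frames and filters, the next power of two, and the more generous length given by the expression used in the Matlab listing of this solution.

```python
import math

def min_fft_length(nx, nh):
    """Shortest FFT length that avoids time aliasing for lengths nx and nh."""
    return nx + nh - 1

def next_pow2(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p

M = 256
minimum = min_fft_length(M, M)                          # 511
practical = next_pow2(minimum)                          # 512
script_N = 2 ** (1 + math.floor(math.log2(5 * M + 1)))  # expression from the Matlab script
```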

(c) (10 pts) For full credit, no FFTs are required, because the dynamic compression
may be done in the time domain and the FFT is done for the bonus filtering. However,
the alignment of frames must be correct for full credit.
(d) 40 points. Full credit solutions may operate in the time or frequency domain, and
must make each frame have the same mean or rms volume as every other frame, with
the exception of frames that fall below some reasonable threshold volume (such as 0.15

of the rms of the entire signal). Extra credit will be given if code successfully estimates
a frequency domain noise floor and normalizes accordingly.
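One way to picture the frame gain is sketched below in Python (standard library only). `threshold_gain` follows the hard-threshold rule described in this solution. `smooth_gain` is a hypothetical choice of smooth taper, snr² / (1 + snr²), picked only because it moves from 0 to 1 as the signal-to-noise ratio grows; it is not the course's prescribed curve.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))

def threshold_gain(frame_rms, target_rms, threshold):
    """Normalize loud frames to target_rms; leave quiet frames untouched."""
    if frame_rms > threshold:
        return target_rms / frame_rms
    return 1.0

def smooth_gain(frame_rms, target_rms, noise_rms):
    """Gain that tends to target_rms/frame_rms at high SNR and to 0 at low SNR.
    The taper snr^2 / (1 + snr^2) is an assumption made for illustration."""
    if frame_rms == 0.0:
        return 0.0
    snr = frame_rms / noise_rms
    return (target_rms / frame_rms) * snr ** 2 / (1.0 + snr ** 2)

loud = [0.5, -0.5, 0.5, -0.5]
gain = threshold_gain(rms(loud), 0.1, 0.015)
scaled = [gain * v for v in loud]        # frame after normalization
```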
The spectrograms of the results are shown in Figure 1. The spectral slices near the
time midway through of the original and the bandpassed signal are in Figure 2. The
code is below, and allows commenting to remove the bandpass filtering.
Example output files are:

(a) (Filtered only): hw7q4filt_only.wav [1]
(b) (Dynamic Range Compressed Only): hw7q4comp_only.wav [2], and
(c) (Filtered and Dynamic Range Compressed): hw7q4filt_and_comp.wav [3].

[1] http://ccrma.stanford.edu/~jos/hw421/hw7sol/hw7q4filt_only.wav
[2] http://ccrma.stanford.edu/~jos/hw421/hw7sol/hw7q4comp_only.wav
[3] http://ccrma.stanford.edu/~jos/hw421/hw7sol/hw7q4filt_and_comp.wav

% HW#7 : Custom bandpass filtering noise reduction and dynamic range compression
% This program takes a frame of the noisy input signal and finds where the cut-off
% frequencies should be for custom bandpass filter design using the window method.
% The filter is used to remove noise from the music/birdsong signal, which is then
% reconstructed using COLA. A noise squelch is also used, setting a threshold
% to avoid pumping up the noise.

%% ANALYSIS PARAMETERS
M = 256; % Exact COLA required
R = M/4; % hop size
win = hanning(M); win = win/sum(win); % choose window and norm to unity DC-gain
maxPeaks = 1; % birdsong specific number of peaks
noisefloor = 0.15; % level above which we compress dynamic range
n = 0:M-1; % dummy causal indices for filter design

% LOAD THE SIGNAL:
[x,fs] = wavread('wrenpn1.wav');
nx = length(x);
gain0 = sqrt(mean(x.^2)); % output level we require in output
nFrames = 1 + floor((nx-M)/R); % number of full frames that fit inside x
N = 2^(1+floor(log2(5*M+1))); % FFT length, at least a factor of 5 zero-padding
y = zeros(R*nFrames + (N-R), 1); % initialize output

% PROCESS IT:
for m=1:nFrames
    % ---- read in and FFT this frame ---- %
    tt = (m-1)*R+1:(m-1)*R+M; % time samples to use
    xw = win .* x(tt); % window signal
    Xw = fft(xw,N); % zero-phase window unnecessary (dropping phase)

    %--- analysis for filter design ---%
    Xwdb = 20*log10(abs(Xw(1:N/2))); % dB spectrum
    % findpeaks2 is not a Matlab built-in; it is assumed to be a course helper
    % returning peak amplitudes and peak frequencies (Hz)
    [ampsm,freqsm] = findpeaks2(Xwdb,maxPeaks,fs,win,N);
    hideal = cos(2*pi*freqsm*n'/fs)*2;
    hw = win.*hideal; % WINDOW METHOD
    Hw = fft(hw,N); % freq domain
    Yw = Xw.*Hw; % FFT filtering
    %Yw = Xw; % USE THIS LINE TO UNDO FILTERING

    % ---- dynamic range compression ---- %
    gainYw(m) = sqrt(mean(abs(Yw).^2)); % sqrt of mean of filtered spectrum
    if gainYw(m) > noisefloor % if loud enough
        Yw = Yw*gain0/gainYw(m); % then make equal amplitude level
    end

    % ---- overlap-add ---- %
    outindex=(m-1)*R+1:(m-1)*R+N;
    y(outindex)=y(outindex)+real(ifft(Yw,N));
    % real just to kill round off error
end

% display
subplot(211);specgram(x,N,fs,hann(256),128);title('original');
subplot(212);specgram(y(1:length(x)),N,fs,hann(256),128);title('bandpassed');
% write output, scaled by the peak magnitude to stay below saturation
wavwrite(y./(max(abs(y))+0.01),fs,'hw7q4out.wav');

5. (10 pts) Optimal FIR Filter Design Using Lp Norm minimization


In this problem, you will use an Lp norm minimization technique to design an optimal
bandpass filter.
(a) (2 pts) Plot an ideal brick-wall bandpass filter in the frequency domain with the
passband ωp = [π/4, π/2] (rad/sample), the passband gain Gp = 1, and the
stopband gain Gs = 0. Use N = 512 frequency samples, and plot from −π to π.
(b) (8 pts) Download lpapprox.m [4], which is a Matlab script for lowpass-filter design.
Replace the desired lowpass filter frequency response Hd (ωk ) with the desired
bandpass frequency response you obtained in part (a). Plot an overlay of the
ideal and obtained amplitude responses in dB.
Solution: Figure 3 shows the magnitude response of an ideal bandpass filter overlaid
with its approximations using L2- and L1-norm minimization techniques.
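The desired response of part (a) can be sketched as follows in Python (standard library only). The grid convention, N = 512 samples starting at −π and excluding +π, and the small tolerance used to include the band edges are assumptions of this example. The function name is illustrative.

```python
import math

def ideal_bandpass(N=512, lo=math.pi / 4, hi=math.pi / 2, eps=1e-9):
    """Brick-wall bandpass on N frequency samples from -pi (inclusive) to pi.
    Returns (omega, Hd) with Hd = 1 for lo <= |omega| <= hi and 0 elsewhere."""
    omega = [-math.pi + 2 * math.pi * k / N for k in range(N)]
    Hd = [1.0 if lo - eps <= abs(w) <= hi + eps else 0.0 for w in omega]
    return omega, Hd

omega, Hd = ideal_bandpass()
passband_count = int(sum(Hd))    # number of samples in the two passbands
```

The response is symmetric about ω = 0, as required for a real impulse response, so the two passbands each hold the same number of samples.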

[4] http://ccrma.stanford.edu/~jos/hw421/hw7/lpapprox.m
[Figure 1 image: three spectrogram panels titled "original", "bandpassed", and "bandpassed and compressed dynamic range by thresholding"; axes are Time (0 to 2) and Frequency (0 to 10000).]

Figure 1: Spectrograms of original, custom-bandpassed and that also with compressed
dynamic range

[Figure 2 image: plot titled "Spectral slice midway through of the original and the bandpassed signal"; curves "original" and "bandpassed"; horizontal axis 0 to 10000, vertical axis −250 to 0.]

Figure 2: A spectral slice near the time midway through of the original sound and the
custom-bandpassed version

[Figure 3 image: curves "ideal", "L2 norm approx.", and "L1 norm approx."; horizontal axis 0 to 3, vertical axis −100 to 10.]

Figure 3: Magnitude spectrum of a highpass filter
