
ISSCC 2005 / SESSION 3 / BACKPLANE TRANSCEIVERS / 3.

3.5

A 6.25Gb/s Binary Adaptive DFE with First Post-Cursor Tap Cancellation for Serial Backplane Communications

Robert Payne, Bhavesh Bhakta, Sridhar Ramaswamy, Song Wu, John Powers, Paul Landman, Ulvi Erdogan, Ah-Lyan Yee, Richard Gu, Lin Wu, Yiqun Xie, Bharat Parthasarathy, Keith Brouse, Wahed Mohammed, Keerthi Heragu, Vikas Gupta, Lisa Dyson, Wai Lee
Texas Instruments, Dallas, TX
As backplane data rates scale to 6.25Gb/s and beyond, transceivers must communicate in legacy systems originally designed for lower data rates. These cost-conscious system channels typically include >30 inches of FR-4 trace and several board connectors and signal vias, resulting in received signal eyes closed by ISI, crosstalk noise, and reflections [1]. Previous solutions to limited channel bandwidths use transmitter pre-emphasis (or de-emphasis) to boost the relative transmit amplitude of high-frequency data, linear filters to boost high frequencies at the receiver, or multi-level signaling solutions such as PAM-4 to fully utilize the available bandwidth. While effective for equalizing isolated channels, transmitter de-emphasis can also increase crosstalk, resulting in a decreased system SNR. Linear filters can flatten the channel response, but they amplify noise as well as the desired signal, making them unsuitable for some legacy backplanes with significant high-frequency crosstalk. Although PAM-4 reduces the required bandwidth, no signaling standards exist, and the impact of reflections on PAM-4 solutions can be worse than on binary signaling [2].
DFEs can overcome these limitations. In a DFE, prior data decisions are multiplied by tap weights equaling the magnitude of the ISI and subtracted from the received signal, eliminating ISI from subsequent bits. When correct decisions are made, the post-cursor channel response is equalized without boosting high-frequency crosstalk. A half-baud-rate adaptive DFE with direct first post-cursor tap cancellation suitable for legacy backplanes is presented in this paper. In particular, systems require: 1) 4 post-cursor taps to equalize up to 20dB of channel loss at 3.125GHz, yielding BER<10^-15; 2) a high degree of integration to deploy up to 144 TX/RX links in a single ASIC; and 3) minimal latency for maximum backplane throughput. Direct cancellation of the first post-cursor tap is performed at 6.25Gb/s and all taps are adapted within the receiver. Previous 6.25Gb/s DFE implementations either do not cancel the first post-cursor tap, require interaction with the transmitter to cancel the first tap [2], or use loop unrolling to provide first-tap cancellation [3].
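
The decision-feedback principle described above can be illustrated with a minimal behavioral sketch. It is not the circuit of this paper: the toy channel, its post-cursor values, and the function names `channel` and `dfe_equalize` are assumptions chosen for illustration, with no noise, no pre-cursor ISI, and tap weights set exactly equal to the post-cursor ISI.

```python
import random

def channel(bits, cursor, post):
    """Toy channel (assumed): main cursor plus post-cursor ISI, no noise."""
    out = []
    for n, b in enumerate(bits):
        y = cursor * b
        for k, h in enumerate(post, start=1):
            if n - k >= 0:
                y += h * bits[n - k]
        out.append(y)
    return out

def dfe_equalize(rx, taps):
    """Subtract ISI predicted from prior decisions, then slice each sample."""
    history = [0.0] * len(taps)   # most recent decision first; 0.0 = no decision yet
    decisions = []
    for sample in rx:
        isi = sum(w * d for w, d in zip(taps, history))
        decision = 1.0 if sample - isi >= 0.0 else -1.0
        decisions.append(decision)
        history = [decision] + history[:-1]
    return decisions

random.seed(0)
tx = [random.choice((-1.0, 1.0)) for _ in range(2000)]
post = [0.6, 0.3, 0.15, 0.1]      # hypothetical post-cursor ISI; sum exceeds the cursor
rx = channel(tx, cursor=1.0, post=post)

raw = [1.0 if y >= 0.0 else -1.0 for y in rx]   # slicing without equalization
eq = dfe_equalize(rx, post)                     # slicing with 4-tap decision feedback

raw_errors = sum(a != b for a, b in zip(raw, tx))
dfe_errors = sum(a != b for a, b in zip(eq, tx))
print(raw_errors, dfe_errors)
```

In this idealized setting the unequalized slicer makes errors whenever the accumulated post-cursor ISI exceeds the cursor, while the DFE removes the ISI exactly as long as its prior decisions are correct.
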
Figure 3.5.1 illustrates the DFE architecture, where 4 half-baud-rate clocks (CLK_0, 90, 180, 270) 2X oversample the serial data (RXP/N) to adapt both the CDR and DFE. In the CDR loop, a data sample (DATA_P/N) nominally at the data eye center is compared to the previous data sample and the 90° offset gradient sample (GRAD_P/N). When transitions exist, the sign (early/late) of the recovered clock phase error is used to update a phase-interpolator-based CDR. The same sequence of samples also adjusts the tap coefficients of the DFE, minimizing overhead. It can be shown that minimizing the zero-crossing data eye jitter also maximizes the eye height, yielding a minimum BER for the data samples [1]. The gradient samples are used to determine if each run length is too narrow or too wide, indicating if the eye is over- or under-equalized. The gradient and data sample history is used to update the tap weights using a sign-sign LMS algorithm. The tap weights multiply the prior decisions using current-mode DACs with 5, 4, 3 and 3 bits of resolution for taps 1, 2, 3 and 4, respectively.
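
For readers unfamiliar with sign-sign LMS, the sketch below shows the textbook form of the update, in which each tap moves by a fixed step in the direction given by the product of the error sign and the sign of the corresponding prior decision. It is an illustration under stated assumptions, not the adaptation loop of this paper: the paper derives its update direction from gradient (edge) samples, whereas this sketch uses the amplitude error at the data sample against an assumed known target level. The step size `mu`, the `target` amplitude, the toy channel, and the function names are all hypothetical, and DAC quantization is ignored.

```python
import random

def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def adapt_sign_sign_lms(rx, n_taps=4, mu=0.001, target=1.0):
    """Textbook sign-sign LMS on data-sample amplitude error (assumed variant)."""
    taps = [0.0] * n_taps
    history = [0] * n_taps        # prior decisions, most recent first; 0 = none yet
    for sample in rx:
        equalized = sample - sum(w * d for w, d in zip(taps, history))
        decision = 1 if equalized >= 0 else -1
        error = equalized - target * decision
        # Each tap moves by +/- mu; only signs of error and prior decisions are used.
        taps = [w + mu * sign(error) * d for w, d in zip(taps, history)]
        history = [decision] + history[:-1]
    return taps

random.seed(1)
bits = [random.choice((-1, 1)) for _ in range(20000)]
post = [0.4, 0.2, 0.1, 0.05]      # hypothetical post-cursor ISI of a toy channel
rx = [
    bits[n] + sum(h * bits[n - k] for k, h in enumerate(post, start=1) if n - k >= 0)
    for n in range(len(bits))
]

taps = adapt_sign_sign_lms(rx)
print([round(w, 3) for w in taps])
```

Because only signs are used, the update needs no multipliers, at the cost of a small residual dither of the converged taps on the order of the step size.
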


The prime challenge in implementing a first-tap DFE is building circuits fast enough to resolve a differential signal as small as 20mV, multiply it by the TAP1 coefficient, and subtract the result from the input signal before the next data sample. Optimal DFE convergence and maximum SNR require proper transition equalization, requiring resolution of the data samples in less than 80ps at 6.25Gb/s. To achieve this, the first tap is fed back directly from the outputs of a pair of high-speed sense amplifiers that amplify, slice, and latch the data, and a MUX that selectively subtracts the current corresponding to the previous sample (Fig. 3.5.2).
Optimization using careful parasitic control and statistical simulations yielded circuits that could resolve the <20mVpk differential sufficiently to correct the next edge within 60ps over expected PVT conditions (Fig. 3.5.3). This provides margin to equalize
the trailing edge to meet the adaptation criteria. The resulting
half-baud rate architecture with direct first-tap feedback minimizes the DFE hardware. As shown in Fig. 3.5.1, the input is
amplified by A1 prior to equalization. Figure 3.5.4 illustrates a
gain tracking circuit that maintains the tap strengths over PVT
variation and also allows programming the tap range and resolution using coefficient K1.
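
The timing figures quoted above can be cross-checked with a short arithmetic sketch. Reading the 80ps requirement as half a unit interval, that is, the spacing between a data sample and the following gradient sample taken 90° later on the half-baud-rate clock, is an interpretation rather than something the text states explicitly; the variable names are illustrative.

```python
BIT_RATE = 6.25e9                            # bits per second

ui_ps = 1e12 / BIT_RATE                      # unit interval: 160 ps
clock_ghz = BIT_RATE / 2 / 1e9               # half-baud-rate clock: 3.125 GHz
clock_period_ps = 2 * ui_ps                  # 320 ps
quarter_period_ps = clock_period_ps / 4      # 90 degrees of the clock: 80 ps

simulated_resolve_ps = 60                    # resolution time quoted in the text
margin_ps = quarter_period_ps - simulated_resolve_ps   # 20 ps

print(ui_ps, clock_ghz, quarter_period_ps, margin_ps)
```

Under this reading, the 60ps resolution time leaves roughly 20ps of the 80ps budget for equalizing the trailing edge.
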
To yield a BER<10^-15 in the field, each device must meet strict
sensitivity requirements during production test. The sensitivity
metric must account for both DC offsets and high bandwidth to
meet the 80ps resolution time. By fixing the tap coefficients and
disabling amplifier A1, the DFE generates a repeating data pattern. Comparing the sampled data to predefined signatures
indicates whether the receiver resolved the pattern correctly.
Reducing the coefficient magnitudes reduces the pattern amplitude; detecting the failure threshold quantifies the sensitivity.
Since the DFE is clocked at 3.125GHz, the measured sensitivity
is an at-speed assessment of the sense-amplifier performance.
In addition to the challenge of equalizing the first tap, the low
BER requires careful control of the signal integrity at the package/silicon interface. Low-capacitance layout techniques are
used to optimize the equalizer bandwidth and to minimize the
input return loss from interconnect and ESD parasitics. The
combined package and die return loss measured at the BGA
bumps is <-12.5dB for frequencies <3.125GHz in a 50Ω environment.
A design integrating eight 6.25Gb/s RX/TX pairs and sixteen 3.125Gb/s TX/RX pairs in a 361-pin organic flip-chip BGA package was implemented in a 1.2V 0.13µm CMOS technology. The DFE performance was tested over multiple channels including legacy backplanes. Figure 3.5.5 shows the captured DFE tap adaptation profile for a 2^23-1 PRBS pattern for a sample channel. A BER<10^-15 (0 errors, 62 hours) is measured when operating at 6.25Gb/s over a worst-case 1Gb/s legacy backplane including two connectors and multiple vias with an asynchronous crosstalk of 600mV peak amplitude transmitted on the worst-case aggressor channel (Fig. 3.5.6).
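
The relationship between the 62-hour error-free run and the quoted BER can be checked with a short calculation. The bit count follows directly from the data rate and test duration. The confidence figure is an added estimate using the standard zero-error bound, which assumes independent bit errors; it is not a number reported by the authors.

```python
import math

BIT_RATE = 6.25e9        # bits per second
HOURS = 62               # error-free test duration quoted in the text
TARGET_BER = 1e-15

bits_tested = BIT_RATE * HOURS * 3600            # 1.395e15, i.e. about 1.4E15 bits

# With zero observed errors and independent errors (assumption), the confidence
# that the true BER is below TARGET_BER is 1 - exp(-N * BER).
confidence = 1.0 - math.exp(-bits_tested * TARGET_BER)

print(f"{bits_tested:.3e} bits, confidence {confidence:.2f}")
```

The computed bit count agrees with the "0 errors in 1.4E15 bits" annotation in Fig. 3.5.6.
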
Acknowledgements:
The authors would like to thank Brad Rothbauer, Bret Dahl, Ami
Bhandal, and Boyd Barrie for their support in evaluation, modeling, and
test system design.
References:
[1] S. Wu et al., "Design of a 6.25Gb/s Backplane SerDes with Top-Down Design Methodology," DesignCon 2004, Feb. 2004.
[2] J. L. Zerbe et al., "Equalization and Clock Recovery for a 2.5-10-Gb/s 2-PAM/4-PAM Backplane Transceiver Cell," IEEE J. Solid-State Circuits, vol. 38, pp. 2121-2130, Dec. 2003.
[3] V. Stojanovic et al., "Adaptive Equalization and Data Recovery in a Dual-Mode (PAM2/4) Serial Link Transceiver," Symp. VLSI Circuits, pp. 348-351, June 2004.

2005 IEEE International Solid-State Circuits Conference

0-7803-8904-2/05/$20.00 ©2005 IEEE.

ISSCC 2005 / February 7, 2005 / Salon 7 / 3:45 PM


Figure 3.5.1: DFE architecture.

Figure 3.5.2: DFE critical timing path.

[Labels recovered from the two diagrams: RX_P/N, A1, A2, EQ_P/N, SENSE AMP, DATA_P/N, GRAD_P/N, CLK_0/180, CLK_90/270, CLK_P/N, LATCH, TAP MULT, TAP1 to TAP4, DAC, IREF, VDDA; timing annotation: TIME < 80ps.]


Figure 3.5.3: Sense amplifier schematic.

Figure 3.5.4: Gain-tracking circuit.

[Labels recovered from the two diagrams: RX_P/N, IN_P, IN_N, EQ_P/N, DATA_P, DATA_N, CLK_P/N, CLK_0/90/180/270, PRESET, KEEPER, INV, BIAS, MULT, TAP1, DAC, VREF, IREF, gm1, gm2, K1, R, A2, VDDA, VSSA; annotation: A1 = gm1 R.]

Figure 3.5.5: DFE tap convergence for a sample channel. (Axes: tap value versus time (x10s); traces: TAP1, TAP2, TAP3, TAP4.)

Figure 3.5.6: Measured BER in presence of crosstalk. (Axes: measured BER, 1E-06 to 1E-16, versus crosstalk amplitude (Vpd), 0.5 to 1.7; annotation: 0 errors in 1.4E15 bits @ 0.6V.)

Continued on Page 585


DIGEST OF TECHNICAL PAPERS


ISSCC 2005 PAPER CONTINUATIONS


Figure 3.5.7: Chip micrograph. (Labels: 8 x 6.25G TX channels; 8 x 6.25G RX channels. Single RX channel: 0.13µm 7LM, 1.2V supply, 0.24mm², 180mW.)

585

