
Design of Giga Bit Ethernet Readout Module Based on ZYNQ for HPGe


Tao Xue, Weibin Pan, Guanghua Gong, Ming Zeng, Hui Gong, Jianmin Li

Abstract–ZYNQ is the new all-programmable SoC architecture from Xilinx that combines FPGA logic with dual-core high-performance ARM Cortex-A9 processors. A new module with a Gigabit Ethernet interface, based on the ZYNQ XC7Z010, has been developed for the data acquisition of High Purity Germanium (HPGe) detectors in the CJPL (China JinPing Underground Laboratory) experiment. With the ZYNQ architecture, embedded open-source Linux with its TCP/IP stack and real-time, high-throughput logic written in VHDL can run together in a single ARM + FPGA chip with lower profile, lower power and higher performance. This is very different from the classic two-chip architecture of one processor plus one FPGA. This paper introduces the architecture of the readout module and explains how the new module can achieve higher performance than the classic one. It also presents the Gigabit Ethernet throughput, which is more than 500 Mbps, and a verification platform for an HPGe data acquisition system based on the new module.

The authors are with the Key Laboratory of Particle & Radiation Imaging, Department of Engineering Physics, Tsinghua University, Beijing 100084, China (email: gbe.tao.xue@gmail.com).

978-1-4799-3659-5/14/$31.00 ©2014 IEEE

I. INTRODUCTION

ZYNQ is the new all-programmable SoC (System on Chip) architecture from Xilinx that combines FPGA logic with dual-core high-performance ARM Cortex-A9 processors [1]. It is totally different from the traditional architecture. Traditionally, a readout subsystem is built from a microprocessor plus an FPGA, so two chips are needed. The microprocessor usually runs the operating system, often a real-time system such as VxWorks, pSOS or real-time Linux. The microprocessor is accompanied by memory: DDR2 or DDR3 SDRAM acts as the data buffer, and NOR Flash or NAND Flash stores the code and calibration data. The FPGA is usually used to interface the high-speed ADC or DAC and to handle the real-time functions. For example, the LTM9011-14 [2] ADC chip from Linear Technology can only be interfaced through the high-speed LVDS I/O of an FPGA; most microprocessors cannot interface to it directly. The FPGA therefore acts as the bridge between the ADC and the microprocessor. Beyond bridging, the FPGA usually also has to handle buffering, triggering and DC offset control. The microprocessor can interface to the FPGA through a local bus, a PCI/PCI-e bus or a slow control bus such as IIC or SPI. The embedded open-source Linux running on the microprocessor can easily be programmed in C, and the TCP/IP socket interface is ready and flexible to use.
Many microprocessors provide Gigabit Ethernet or Fast Ethernet to read out data from the FPGA to the distributed network. Recently, some DAQ systems have used a silicon TCP/IP stack as the readout subsystem, which can run in the FPGA without a microprocessor, but it is not so easy or flexible to use.
The ZYNQ is a new option for the readout subsystem. It combines the microprocessor and the FPGA in one package and, thanks to the 28 nm technology, the dual-core microprocessor in the ZYNQ can run at up to 1 GHz. The footprint of the smallest ZYNQ chip is about 13 mm × 13 mm, almost the size of a coin. Fig. 1 shows the difference between the traditional readout solution and the new one based on ZYNQ.

Fig. 1. Traditional readout subsystem based on the Spartan-6 and the MPC5125 PowerPC processor compared with the new one based on ZYNQ.

In Fig. 1, the traditional readout subsystem is based on the popular Spartan-6 FPGA from Xilinx and the MPC5125 PowerPC microprocessor from Freescale Semiconductor. This is a typical system that we have used in many of our DAQ designs. Its cost is lower than that of a traditional VME-based system, and we have used this design for many years, since about 2005.
The challenge is the bandwidth between the FPGA and the microprocessor. The local bus is 32 bits wide and runs at almost 60 MHz. This seems enough for reading data from the FPGA, but in practice the bandwidth is not as high as it appears on paper. The local bus is shared with the NAND Flash and other peripheral devices such as UART, IIC or SPI, so the actual throughput is not so high. In most cases a separate memory is connected to the FPGA as a data buffer, because the local bus is not fast enough to store slices of data in the microprocessor's memory.
In the new design with ZYNQ, the interconnect is inside the chip and provides high bandwidth between the Cortex-A9 microprocessor and the programmable logic. There are four dedicated 64-bit high-performance ports that can run at more than 150 MHz, each with a dedicated 1 KB FIFO (128 slots of 64 bits). Each port can achieve 1.2 GB/s in both reading and writing. With four of them, this is enough to buffer data between the ADC logic and the DDR3 memory. The programmable logic can use DMA to store data from the ADC in the DDR3 memory without interrupting the microprocessor, so the challenge of the traditional solution disappears.

II. DESIGN OF HARDWARE

The XC7Z010 is selected to build an ultra-small readout module for the subsystem. The programmable logic of every ZYNQ device is based on one of two FPGA architectures: Artix-7 or Kintex-7. The CLG225 package of the XC7Z010 has the smallest footprint in the ZYNQ series, and its programmable logic has approximately 28K logic cells. This is more than the Spartan-6 series XC6SLX25 (24K logic cells), so it is enough for most ADC readout needs. In fact, the LTM9011-14 can easily be interfaced with the XC6SLX16, which has only 14K logic cells, using less than 30% of its resources.
Typically, the XC7Z010 needs 1.0 V for the internal logic and the processor core, 1.8 V for the I/O buffer pre-drivers and analog power, 1.5 V for the DDR3 memory (1.35 V for low-voltage DDR3 SDRAM, which we do not use) and 3.3 V for the I/O circuits. The current of every power rail can be calculated with the Vivado development software from Xilinx, but 3 A is enough. The TPS62130RGT [3] from Texas Instruments is used; it has a 3 V to 17 V input voltage range and can supply up to 3 A of output current in a 3 mm × 3 mm QFN package. Fig. 2 shows the power topology of the readout module based on the TPS62130RGT.

Fig. 2. Power design for ZYNQ based on TPS62130RGT and TPS51206.

As shown in Fig. 2, the TPS51206 generates, from the 1.5 V rail, the 0.75 V termination voltage for the DDR3 SDRAM and the 0.75 V reference voltage for the DDR3 interface. Attention should also be paid to the power sequencing of these voltages; the details are given in the XC7Z010 datasheet on the Xilinx website.
The challenge of the electronic design for ZYNQ is the layout of the DDR3 SDRAM interface. The DDR3 interface of the XC7Z010 can be clocked at 533 MHz and, with double data rate, the data rate reaches 1066 Mbps per pin, a bit time of about 1 ns. Careful layout of the DDR3 SDRAM is therefore very important. The ZYNQ's DDR3 SDRAM interface is designed for 40 Ohm impedance, so the PCB designer should try to set the trace impedance to 40 Ohm. Every byte lane, that is DQ0 to DQ7 with DQS0_P, DQS0_N and DQM0, and DQ8 to DQ15 with DQS1_P, DQS1_N and DQM1, should be length matched to within 5 mil or 5 ps.
The 88E1512 from Marvell is selected as the Ethernet RGMII PHY. Its benefit is that it uses the same power supplies as the XC7Z010: the 88E1512 needs only 1.0 V, 1.8 V and 3.3 V, so no additional power supply has to be generated for the PHY chip. Fig. 3 and Fig. 4 show the interface connection and the power supply of the PHY chip.

Fig. 3. RGMII PHY interface.

Fig. 4. Power design for 88E1512.

Fig. 5 shows a photograph of the readout module based on the XC7Z010 in the CLG225 package.

Fig. 5. Picture of the readout module based on XC7Z010 in CLG225 package in the size of 56 mm × 42 mm.

As shown in Fig. 5, the actual size is 56 mm × 42 mm, just half the size of a business card.
The boot process is a little complicated because both the microprocessor and the FPGA have to be booted. The boot mode strapping pins select the boot device. After power-on, the BootROM in the ZYNQ runs first: it configures the required sections of the microprocessor, detects the state of the strapping pins and reads the image from the selected boot device, which can be Quad-SPI, SD Card, NAND Flash, NOR Flash or JTAG.
The image read from the boot device is called the FSBL (First Stage Boot Loader). The FSBL normally contains CPU instructions to further configure the microprocessor and to configure the programmable logic through the device configuration unit.
The FSBL then reads the SSBL (Second Stage Boot Loader); for example, u-boot is loaded and run. From there the system developer is in a familiar environment: u-boot is the most popular bootloader today, and the Linux kernel and filesystem can be downloaded from it very easily.

III. SOFTWARE DEVELOPMENT

The software design flow includes two parts. One is the program design in C, which runs in embedded Linux; the other is the logic design in VHDL/Verilog, which implements the real-time processing running in the programmable logic. Fig. 6 shows the design flow.

Fig. 6. Design flow of software development for ZYNQ from Xilinx [4].

The details of the C program and the VHDL design are similar to those used for a traditional readout system.

IV. VERIFICATION PLATFORM

The China JinPing Underground Laboratory (CJPL) is the deepest underground laboratory in the world and provides a very good environment for the direct observation of dark matter.
The CDEX experiment aims to directly detect the interaction of WIMPs with nucleons at CJPL, with high sensitivity in the low-mass region. Both CJPL and CDEX have made much progress in the last two years. CDEX uses a point-contact germanium semiconductor detector (PCGe), which has a threshold of less than 300 eV.
For the HPGe detectors, a 100 MHz, 14-bit ADC is preferable for signal capture. A verification platform has been built to test the ADC and the new readout module based on the XC7Z010. The ADS5294 from Texas Instruments is selected for the verification platform. It is an 80 MHz, 8-channel, 14-bit ADC with a 560 MHz LVDS output interface [5].
The clock is generated by the LMK04806B from TI, a low-noise clock jitter cleaner with dual cascaded PLLs and an integrated 2.5 GHz VCO. Its ultra-low RMS jitter is about 111 fs [6].
The architecture of the verification platform is shown in Fig. 7 and a photograph of the real system is shown in Fig. 8.

Fig. 7. Architecture of the hardware verification platform based on the ZYNQ readout module.

Fig. 8. Real picture of the verification platform.

V. TEST RESULTS

A test program has been developed in LabVIEW. Data are acquired by the ADS5294, buffered by the programmable logic and the DDR3 SDRAM, read out by the Cortex-A9 processor running Linux and sent through the Gigabit Ethernet to the PC. Fig. 9 shows a typical waveform obtained from the verification platform.

Fig. 9. Typical waveform acquired by the verification platform.
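
The last hop of the data path described for the test setup is a PC receiving the waveform stream over TCP. As a minimal sketch of that step, the receiver below assumes a purely hypothetical frame layout: a 4-byte little-endian sample count followed by that many 16-bit little-endian words, each carrying one 14-bit sample in its low bits. The paper does not specify the actual framing or the LabVIEW implementation, so `recv_exact`, `read_frame`, `encode_frame` and the layout itself are illustrative assumptions.

```python
import socket
import struct

# Assumption: each 14-bit ADC sample is stored in the low bits of a 16-bit word.
SAMPLE_MASK = 0x3FFF


def recv_exact(sock, nbytes):
    """Read exactly nbytes from a stream socket, or raise ConnectionError."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(sock):
    """Read one frame in the hypothetical layout: uint32 count, then uint16 samples."""
    (count,) = struct.unpack("<I", recv_exact(sock, 4))
    raw = recv_exact(sock, 2 * count)
    words = struct.unpack("<%dH" % count, raw)
    return [word & SAMPLE_MASK for word in words]


def encode_frame(samples):
    """Build a frame in the same hypothetical layout (stand-in for the module side)."""
    header = struct.pack("<I", len(samples))
    return header + struct.pack("<%dH" % len(samples), *samples)
```

TCP is a byte stream, so a single `recv` call may return only part of a frame; looping until the requested number of bytes has arrived is what keeps the frame boundaries intact.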
Some other performance tests have also been completed: the ENOB of the ADC is about 11.4 bits. A separate test measured the Gigabit Ethernet throughput of the readout module based on the XC7Z010. Fig. 10 shows the Ethernet performance of the readout module, measured with iperf, a widely used network throughput test program for Linux.
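
The ENOB value can be related to the signal-to-noise-and-distortion ratio through the standard relation ENOB = (SINAD - 1.76 dB) / 6.02 dB. The sketch below applies it to the 11.4-bit figure reported here. The function names are illustrative, and the paper does not state the measured SINAD, so the value computed is only the one implied by the formula.

```python
def sinad_db_from_enob(enob_bits):
    """SINAD in dB implied by an effective number of bits."""
    return 6.02 * enob_bits + 1.76


def enob_from_sinad_db(sinad_db):
    """Effective number of bits implied by a SINAD in dB."""
    return (sinad_db - 1.76) / 6.02


measured_enob = 11.4                       # value reported in the text
implied_sinad = sinad_db_from_enob(measured_enob)
ideal_14bit_sinad = sinad_db_from_enob(14)  # ideal quantization-limited 14-bit converter

print("implied SINAD: %.2f dB" % implied_sinad)
print("ideal 14-bit SINAD: %.2f dB" % ideal_14bit_sinad)
```

An ENOB of 11.4 bits corresponds to a SINAD of roughly 70.4 dB, compared with about 86 dB for an ideal 14-bit converter.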

Fig. 10. Ethernet throughput of readout module based on XC7Z010.

From Fig. 10, the throughput of the readout module is more than 500 Mbps.
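
To give a feel for what 500 Mbps means for waveform readout, the sketch below converts a link throughput into a maximum sustained record rate. The record length of 1000 samples per channel and the storage of each 14-bit sample in two bytes are assumptions chosen for illustration; the paper does not give the record format.

```python
def max_record_rate(throughput_bps, channels, samples_per_channel, bytes_per_sample=2):
    """Records per second that a link of the given payload throughput can sustain."""
    record_bytes = channels * samples_per_channel * bytes_per_sample
    return (throughput_bps / 8) / record_bytes


# 500 Mbps is the lower bound quoted in the text; the record shape is hypothetical.
rate = max_record_rate(500e6, channels=8, samples_per_channel=1000)
print("max sustained rate: %.2f records/s" % rate)
```

Under these assumptions one record is 16,000 bytes, so the link sustains a little under 4000 records per second.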

VI. CONCLUSION
The new readout module based on the ZYNQ XC7Z010 in the CLG225 package is very small, only 56 mm × 42 mm, yet its performance is high enough to handle the readout of an 8-channel, 80 MHz, 14-bit ADC. The Ethernet throughput is more than 500 Mbps, which is enough for readout to a PC in the distributed network.
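
The bandwidth figures quoted in this paper can be cross-checked with simple arithmetic. The sketch below computes theoretical peak rates as bus width times clock, assuming one transfer per clock cycle and ignoring arbitration and protocol overhead. It is a back-of-the-envelope comparison, not a measurement.

```python
def bus_bytes_per_s(width_bits, clock_hz):
    """Theoretical peak rate of a parallel bus, one transfer per clock."""
    return width_bits / 8 * clock_hz


local_bus = bus_bytes_per_s(32, 60e6)      # traditional local bus: 240 MB/s peak
axi_hp_port = bus_bytes_per_s(64, 150e6)   # one ZYNQ high-performance port: 1.2 GB/s
axi_hp_total = 4 * axi_hp_port             # four ports: 4.8 GB/s

adc_raw_bps = 8 * 80e6 * 14                # 8 channels x 80 MHz x 14 bits
adc_raw = adc_raw_bps / 8                  # same stream in bytes per second
ethernet_bps = 500e6                       # measured lower bound from the text

print("local bus peak : %.0f MB/s" % (local_bus / 1e6))
print("AXI HP per port: %.1f GB/s" % (axi_hp_port / 1e9))
print("ADC raw stream : %.2f GB/s" % (adc_raw / 1e9))
print("ADC / Ethernet : %.2f x" % (adc_raw_bps / ethernet_bps))
```

The raw ADC stream (about 1.12 GB/s) far exceeds the peak of the shared local bus but fits within the high-performance ports, which matches the argument made for the ZYNQ interconnect. It is also roughly 18 times the measured Ethernet throughput, so the stream cannot be sent continuously to the PC; this is consistent with the buffering and triggering role the paper assigns to the programmable logic.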
The next verification platform, with the 125 MHz LTM9011-14 ADC from Linear Technology, is being designed. This ADC chip needs a 1 GHz LVDS interface, so it will verify the highest performance of the LVDS interface of the readout module based on the XC7Z010.
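
The LVDS rates mentioned for the two ADCs follow from the sample rate and the serialization format. The sketch below assumes, for illustration only, that the ADS5294 sends each 14-bit sample over one lane per channel, and that the 125 MHz ADC sends 16-bit frames over two lanes per channel. Both modes should be checked against the device datasheets [2], [5].

```python
def lane_bit_rate(sample_rate_hz, bits_per_frame, lanes_per_channel):
    """Serial bit rate on each LVDS lane."""
    return sample_rate_hz * bits_per_frame / lanes_per_channel


def ddr_bit_clock(bit_rate):
    """Bit clock frequency when data is captured on both clock edges."""
    return bit_rate / 2


# Assumed mode for the ADS5294: 14-bit frames, one lane per channel.
ads5294_rate = lane_bit_rate(80e6, 14, 1)
ads5294_clock = ddr_bit_clock(ads5294_rate)

# Assumed mode for the 125 MHz ADC: 16-bit frames split over two lanes.
next_adc_rate = lane_bit_rate(125e6, 16, 2)

print("ADS5294 lane rate : %.0f Mbps" % (ads5294_rate / 1e6))
print("ADS5294 DDR clock : %.0f MHz" % (ads5294_clock / 1e6))
print("125 MHz ADC lanes : %.0f Mbps" % (next_adc_rate / 1e6))
```

Under these assumptions the ADS5294 bit clock comes out at 560 MHz and the 125 MHz ADC at 1 Gbps per lane, in line with the 560 MHz and 1 GHz figures quoted in the text.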

REFERENCES
[1] Zynq-7000 All Programmable SoC Overview, DS190, www.xilinx.com.
[2] Data Sheet of LTM9011-14, www.linear.com.
[3] Data Sheet of TPS62130, www.ti.com.
[4] www.wiki.xilinx.com.
[5] Data Sheet of ADS5294, www.ti.com.
[6] Data Sheet of LMK04806B, www.ti.com.
