
Teaching Field Programmable Gate Array Design of Digital Signal Processing Systems

Jakub Šťastný, Lukáš Ručkay

Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Technická 2, 166 27 Prague 6, Czech Republic
stastnj1@seznam.cz, lukas.ruckay@email.cz

Abstract - In the last several years the application field of FPGAs (Field Programmable Gate Arrays) has virtually exploded, and many companies engaged in digital system-on-chip design now focus on FPGA devices as well. Universities have to respond to contemporary trends, and therefore the fundamentals of designing digital signal processing systems on FPGAs were added to the curriculum at our department. In this paper we present some interesting examples we prepared for teaching.

INTRODUCTION

Thanks to recent progress in silicon technologies, complete systems that formerly consisted of many integrated circuits can now be built on a single chip (System on Chip, SoC). This chip can be an ASIC (Application Specific Integrated Circuit) or a programmable device such as an FPGA (Field Programmable Gate Array). The application field of FPGAs is growing steadily. The reasons are a lower price than ASICs for small production series and a simpler design process. Another significant property of FPGA devices is the possibility of run-time reconfiguration. Thanks to these properties FPGA devices can nowadays be found almost everywhere, which is why many companies engaged in digital system-on-chip design focus on them as well. Employers therefore require knowledge of digital systems and design ability from prospective employees.

The Department of Circuit Theory at the Faculty of Electrical Engineering of CTU offers an optional subject, part of which is devoted to FPGA design. The design work focuses mainly on the implementation of DSP (Digital Signal Processing) structures. Digital system design requires not only good knowledge of design techniques and tools but also an engineering feel for the problem being solved and detailed knowledge of the problems and architectures of existing solutions.
Students come to our lessons with fundamental knowledge of general digital design. While the design tools (mainly a suitable language for synthesis and simulation, VHDL or Verilog) are easy to learn, an engineering feel for the problem being solved can be obtained only through long practice and the analysis of previous work. To let the students acquire at least basic knowledge of typical DSP architectures on FPGAs, we prepared with them a few demonstration examples, which we describe below.

TEACHING

FPGA design is taught at the Department of Circuit Theory within the existing optional subject AP2 (Architectures and using of Programmable devices 2). The students learn to design digital systems in VHDL and acquire fundamental system design skills. The course is not limited to theory: the students can implement and try out everything they design on development kits with Xilinx FPGA devices. Because the main research focus of our department is digital signal processing, the teaching concentrates on the implementation of DSP structures, arithmetic operations and their optimization. Students can learn advanced techniques in closer cooperation with the FPGA laboratory (master's thesis, bachelor project, research project, ...).

Within the FPGA laboratory several functional blocks, IP (Intellectual Property) macros, were designed [1], [2], [3]. Verification is emphasized throughout the design process. Each block has to be verified, and complex structures such as [1], [2], [4] are verified with the help of the verification environment [5]. All IP macros are used as basic teaching support. The most significant blocks are outlined in the following sections.

Simple Structures On FPGA

Digital ICs are nowadays designed at a relatively high level of abstraction. This boosts productivity and has other advantages, but it is not always suitable for teaching and study. The capabilities of modern design tools often lead to bad practice: the students program in VHDL instead of designing in VHDL. The result is often a non-optimal and needlessly complicated implementation. The work [6] was carried out to help our students understand the relation between RTL (Register Transfer Level) code and its real implementation.
Within the scope of the bachelor project [6] several simple blocks (XOR gate, AND-OR-NOT structure, adder, synchronous counter and ripple counter) were designed. These blocks are of educational rather than practical character and will serve as a good basis for making the students familiar with FPGA design problems. The XOR gate is designed in two ways and the author addresses possible problems in the design. The AND-OR-NOT structure shows the implementation of multilevel combinational logic using several FPGA LUTs (Look-Up Tables). The adder is a typical block that uses a special FPGA feature, the fast dedicated carry chain. Both counters represent sequential logic, and the differences between their implementations (size, timing properties, speed, current consumption) are discussed. Each block goes through the full design flow, including RTL simulation, synthesis, P&R (Place & Route), timing analysis and back-annotated netlist simulation. The work also presents detailed FPGA schematics of the blocks and describes how each of them is mapped onto the FPGA: LUT and CLB (Configurable Logic Block) configurations, CLB interconnections and I/O settings.

DSP Structures

The 8-tap FIR (Finite Impulse Response) filter (moving average, [3]) is an example of the simplest possible DSP filter. All filter coefficients b[n] equal 1/8, so the multiplication is realized as an arithmetic shift right by 3 bits (division by 8). The shifted output is then convergently rounded. The adder tree allows us to demonstrate easily the influence of pipelining on the system parameters. The research report describing the FIR filter [3] includes all significant information, including simulation and synthesis results. The achievable clock frequencies after synthesis and after P&R are compared as well.
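The shift-and-round arithmetic described above can be illustrated with a small bit-accurate reference model, similar in spirit to the Matlab models used for verification later in the paper. The sketch below is written in Python rather than VHDL and is only an illustration under stated assumptions, not the implementation from report [3]: the names `round_convergent_shift` and `moving_average`, the zero-initialized delay line, and the plain sum used in place of the pipelined adder tree are all choices made here for brevity.

```python
from collections import deque


def round_convergent_shift(value, shift):
    """Arithmetic shift right by `shift` bits with round-half-to-even."""
    if shift == 0:
        return value
    quotient = value >> shift                 # arithmetic shift, rounds toward -infinity
    remainder = value & ((1 << shift) - 1)    # discarded bits, always non-negative
    half = 1 << (shift - 1)
    # Round up above one half; on an exact tie round up only if the quotient is odd.
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient


def moving_average(samples, taps=8):
    """Moving average with all coefficients equal to 1/taps (taps must be a power of two)."""
    shift = taps.bit_length() - 1
    if 1 << shift != taps:
        raise ValueError("taps must be a power of two")
    delay_line = deque([0] * taps, maxlen=taps)   # assumption: registers reset to zero
    output = []
    for x in samples:
        delay_line.appendleft(x)                  # oldest sample drops out on the right
        # A hardware adder tree would compute this sum in log2(taps) stages.
        output.append(round_convergent_shift(sum(delay_line), shift))
    return output
```

With a constant input of 8, the output ramps from 1 to 8 while the delay line fills. Ties show the convergent behaviour: a single sample of 12 (1.5 after division) gives 2, while a single sample of 4 (0.5 after division) gives 0.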
Figure 1. Simplified implementation scheme of the 8-tap FIR filter.

A MAC (Multiply and Accumulate) unit is the basic block for the realization of DSP structures [1]. The MAC unit multiplies two N-bit input vectors and adds the result to a (2N + W)-bit accumulator, see Fig. 2. The generic parameters N and W determine the bit width of the input data bus and the extension of the output that allows 2^W accumulations without overflow. All input and output signals are registered and overflow is signalled. Optionally, pipeline registers can be inserted into the structure to increase speed at the cost of increased latency. All the implementation possibilities and their influence on the architecture parameters are described in research report [2].

Figure 2. Simplified implementation scheme of the MAC unit. Signals shown: x[n], c[n], y[n], ov[n], we_x, we_c, we_y.

The MAC unit was verified at RTL as well as at gate level. The blocks designed to verify the MAC unit were later reused to create a generic verification framework.

An IP macro of an IIR (Infinite Impulse Response) filter was designed within the master's thesis [1]. A feasibility study and a comparison with existing IP macros were carried out before the implementation. The IIR macro is designed as a biquadratic section in direct form I. The data bus width for signal samples and the coefficient width are adjustable through the generic parameters N and W. Coefficients are loaded over a bidirectional tristate data bus and an address bus reserved for reading and writing coefficients. Besides the data output, the IIR macro provides an overflow indication from the internal MAC unit. The IIR macro is quite a complex DSP block, and therefore the whole design was divided into several smaller blocks (MAC unit, rounding unit, ...). Great emphasis was placed on block verification. The rounding unit has several modes (simple truncation, saturation, convergent rounding, or convergent rounding combined with saturation) and was therefore verified separately. Finally, the whole filter, including all working modes and coefficient updates, was verified at RTL and at back-annotated gate level. A Matlab model of the IIR filter was also used for functional verification.

Figure 3. IIR filter biquadratic section, direct form I.
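To make the roles of the MAC accumulator width and of the direct form I structure more concrete, the following behavioural Python sketch models a MAC with N-bit operands and a (2N + W)-bit accumulator, and uses it to evaluate one biquadratic section. It is not the VHDL from [1] or [2]. Several details are assumptions introduced here and marked in the comments: the overflow flag is sticky, the accumulator wraps around on overflow, coefficients are integers scaled by 2**frac, the feedback coefficients follow the sign convention y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2], and only the simple truncation mode of the rounding unit is modelled. The names `Mac`, `wrap_signed`, `biquad_df1` and `frac` are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


class Mac:
    """Behavioural MAC: N-bit signed operands, (2N + W)-bit accumulator."""

    def __init__(self, n, w):
        self.n = n
        self.acc_bits = 2 * n + w
        self.acc = 0
        self.overflow = False            # assumption: flag stays set until clear()

    def clear(self):
        self.acc = 0
        self.overflow = False

    def step(self, x, c):
        lo, hi = -(1 << (self.n - 1)), (1 << (self.n - 1)) - 1
        if not (lo <= x <= hi and lo <= c <= hi):
            raise ValueError("operand does not fit into N bits")
        exact = self.acc + x * c
        self.acc = wrap_signed(exact, self.acc_bits)   # assumption: wrap-around
        if self.acc != exact:
            self.overflow = True
        return self.acc


def biquad_df1(samples, b, a, n, w, frac):
    """Direct form I biquad; b = (b0, b1, b2), a = (a1, a2), scaled by 2**frac."""
    mac = Mac(n, w)
    x1 = x2 = y1 = y2 = 0
    out, overflow = [], False
    for x in samples:
        mac.clear()
        # Negated feedback coefficients must still fit into N bits.
        for coeff, sample in ((b[0], x), (b[1], x1), (b[2], x2),
                              (-a[0], y1), (-a[1], y2)):
            mac.step(sample, coeff)
        overflow = overflow or mac.overflow
        y = wrap_signed(mac.acc >> frac, n)   # truncation only, no saturation
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out, overflow
```

With N = 4 and W = 2, four accumulations of the largest product (-8 times -8) leave the accumulator at 256 without overflow, which agrees with the 2^W guarantee. In this worst case the 10-bit accumulator wraps only at the eighth accumulation.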

Verification Environment

Verification, although an important step of integrated circuit design, is a very unpopular task, mainly among students. We outlined the main verification ideas for our students in research report [5], which describes the principal verification ideas and procedures. In addition, basic VHDL blocks that enable and simplify verification were created. The blocks vector reader, vector writer, clock gen and test controller form the core of the whole environment. The clock gen block generates the clock signal; the vector reader block reads test vectors from an input file (the input signal and the filter coefficients for the IIR macro). Similarly, the vector writer block writes the results to an output file. The test controller block ensures the correct timing of reading and writing and handles all communication among the blocks. The environment supports both RTL and back-annotated gate-level verification. The SDF (Standard Delay Format) file, which contains information about the main timing parameters of the design (setup time, hold time, recovery time, removal time, clock skew, ...), is also described in [5].
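The file-based exchange between the simulation and a reference model can be sketched on the model side as follows. This is only an illustration in Python: the paper uses Matlab for the model, and the file format of the actual environment [5] is not given here. One signed decimal integer per line is assumed, and all names (`write_vectors`, `read_vectors`, `compare_vectors`, `check_against_model`, `example_run`, the file names) are hypothetical.

```python
import tempfile
from itertools import zip_longest
from pathlib import Path


def write_vectors(path, values):
    """Write test vectors, one signed decimal integer per line (assumed format)."""
    Path(path).write_text("".join(f"{v}\n" for v in values))


def read_vectors(path):
    """Read vectors written by write_vectors or, in the real flow, by the simulator."""
    return [int(token) for token in Path(path).read_text().split()]


def compare_vectors(expected, actual):
    """Return (index, expected, actual) for every mismatch; None marks a missing value."""
    return [(i, e, a)
            for i, (e, a) in enumerate(zip_longest(expected, actual))
            if e != a]


def check_against_model(stimulus, model, result_file):
    """Compare the simulator's result file with the output of a reference model."""
    return compare_vectors(model(stimulus), read_vectors(result_file))


def example_run():
    """Round trip in which a stand-in replaces the VHDL simulation."""
    model = lambda xs: [2 * x for x in xs]        # hypothetical reference model
    with tempfile.TemporaryDirectory() as work:
        stimulus_file = Path(work) / "x.txt"
        result_file = Path(work) / "y.txt"
        write_vectors(stimulus_file, [1, -2, 3])
        # In the real flow the vector reader and vector writer blocks of the
        # VHDL test bench would consume x.txt and produce y.txt.
        write_vectors(result_file, model(read_vectors(stimulus_file)))
        return check_against_model(read_vectors(stimulus_file), model, result_file)
```

An empty list from `check_against_model` means the simulated design and the model agree on every sample. Reporting the index of each mismatch makes it easier to locate the failing sample in the simulation waveform.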

Figure 4. Verification environment: top-level test bench scheme of all blocks in the IIR filter verification. The boundary between the VHDL and Matlab parts is outlined. The communication between both worlds (VHDL simulation and Matlab environment) goes through files stored on the hard drive.

Control And Communication Blocks

Although we do not want to teach the architecture of computers and processors, we consider it essential to make our students familiar with the basic concepts of this field. Two further example systems help us reach this goal: the PicoCTRL microcontroller and a communication interface between an FPGA kit and a PC.

PicoCTRL [7] is a simple programmable microcontroller project. In the basic version the system has 32 output signals, 6 conditional input signals and a memory for 32 16-bit instructions. The controller executes two basic operations: a write to the output port and a jump to a given address. Each instruction can be conditioned on one of eight conditional inputs (6 signals, constant one, constant zero). Thanks to the block-based concept it is very simple to add new instructions or an interrupt subsystem, or to demonstrate the basic problems of pipelining and conditional instruction execution. The controller is well documented and the whole project will be integrated into teaching in the near future.

Another block, the controller of a serial communication interface, shows the students how a more complex system for communication between two blocks can be implemented. The controller connects an FPGA kit to a PC over a serial line. Windows HyperTerminal can be used on the PC side to issue commands to the FPGA kit directly; the protocol is based on ASCII coding, which allows direct control of the FPGA application by a human. The first version of this block is described in [4]; the complete controller documentation is not yet finished in a form sufficient for teaching, and our students are currently working on it.

In SoC implementations we often need to drive analogue blocks in the system; for example, one has to generate clock or control signals with various phase relations. The recently finished VGA controller [8] illustrates a simple generator of such signals. The block allows the FPGA to display a picture from internal FPGA memory on a VGA monitor. The analogue signals of the VGA interface are generated by a simple resistive D/A converter.

RESEARCH

In addition to teaching, we conduct research on FPGA structures for DSP and involve our students in it. At present we are working on several topics; an interesting result of our previous work was the optimization of the biquadratic section on FPGA, the first results of which were published in [9].

CONCLUSION

FPGA design is taught within an existing subject as well as individually through work for the FPGA laboratory. At present the laboratory has six students working partly on research tasks and partly on the development of further blocks for teaching. We use an approach based on examples and applications not only for teaching architectures but also for making students familiar with synthesizable VHDL constructs. Simple fundamental synthesizable structures are presented to the students; VHDL reference materials can be downloaded from the laboratory web site http://amber.feld.cvut.cz/fpga. The examples mentioned above and their documentation are available at our site as well. The materials from our web server may be used for noncommercial purposes by other academic institutions. According to the feedback from our students, our concept is successful and gives future engineers a sound basis for further studies and work in this field.

ACKNOWLEDGMENT

This work has been supported by the research program Transdisciplinary Research in Biomedical Engineering No. MSM6840770012 of the Czech Technical University in Prague, and by the grant GACR 102/03/H085: Biological and Speech Signal Modelling.

REFERENCES

[1] L. Ručkay. IIR filter implementation: biquadratic section on FPGA. Master's thesis, CTU FEE Prague, Department of Circuit Theory, February 2005. (in Czech).
[2] L. Ručkay. FPGA Implementation of the MAC unit. Unpublished research report Z052, CTU FEE Prague, Department of Circuit Theory, March 2004. (in Czech).
[3] L. Ručkay. Implementation of the moving average filter on the FPGA. Unpublished research report Z051, CTU FEE Prague, Department of Circuit Theory, February 2004. (in Czech).
[4] M. Bártů. Demo of the UART XILINX FPGA macro block + Matlab communication framework. Unpublished research report Z05-6, FEE CTU Prague, 2005. (in Czech).
[5] L. Ručkay. The design of the FPGA VHDL verification environment. Unpublished research report Z053, CTU FEE Prague, Department of Circuit Theory, November 2004. (in Czech).
[6] M. Křemeček. Basis logic functions FPGA implementation. Bachelor project, CTU FEE Prague, Department of Circuit Theory, February 2006. (in Czech).
[7] P. Bílý and J. Šťastný. PicoCTRL microcontroller. Unpublished research report Z061, CTU FEE Prague, Department of Circuit Theory, January 2006. (in Czech).
[8] D. Hradecký. VGA controller on FPGA. Unpublished research report, CTU FEE Prague, Department of Circuit Theory, July 2005. (in Czech).
[9] L. Ručkay and J. Šťastný. Simulation and Implementation of Biquadratic Section with Quantization Error Feedback on the FPGA. Digital Technologies 04, December 2004.
