RAM

DESIGN AND IMPLEMENTATION OF AN ARBITER FOR EFFICIENT MULTISYSTEM INTERACTION WITH SHARED RESOURCES

ASHISH K BABU, V NATTARASU, SHASHIDHARA H R

Department of Electronics and Communication Engineering, Sri Jayachamarajendra College of Engineering, Mysore
JSS Academy of Technical Education, Bengaluru, Karnataka, INDIA

Abstract: The paper presents the design and analysis of an arbiter for efficient multisystem interaction with shared resources. The arbitration algorithm is a static fixed priority algorithm, chosen for fast system performance, low implementation area, and low cost. Data requests from clients under different priorities have been simulated. The results indicate that better performance can be achieved with this arbitration scheme. Based on the performance analysis, the fixed priority arbitration scheme can be custom tuned to meet the design requirements. The implementation of the arbiter with the fixed priority arbitration scheme for SoC applications is also explained. The arbiter was implemented on an FPGA using Xilinx ISE 14.2 with a 90 nm cell library. In addition, the power analysis of the arbiter in various arbitration states is reported. The fixed priority arbiter can be custom tuned to obtain high bandwidth utilization, low latency, and power-effective operation for on-chip bus communication.

Keywords: Arbiter, system-on-chip, arbitration algorithm, FPGA, shared resources.

I. INTRODUCTION

A typical System-on-Chip (SoC) design contains many heterogeneous cores linked together with sophisticated on-chip bus communication architectures. The on-chip bus communication architecture determines the way these heterogeneous functional units exchange and synchronize data, and it has a great impact on the system's performance. The SoC design paradigm relies on well-defined interfaces and reuse of intellectual property (IP).
Because more and more IPs are integrated into the design platform, the amount of communication between the IPs is increasing and becomes a source of performance bottlenecks.

In the last thirty years, computer memory and other shared resources have evolved rapidly, with improvements in both capacity and speed. However, the logic for controlling the shared resources has also become increasingly complex and difficult to interface with. The third generation of double data rate synchronous dynamic random access memory (DDR3 SDRAM) is the newest and fastest volatile memory currently available. DDR3 RAM is one of the many types of RAM used to temporarily hold data that a system needs quick access to. DRAM memory is designed for communication with a single system. For two systems to share the same memory, an arbiter must be used. The shared resource arbiter serves as an interface to both systems and grants each system access to the memory one at a time to avoid collisions.

In this project we developed and analyzed an arbiter for a shared resource (in our case, a basic hardware multiplier). The hardware multiplier would enable a very fast data calculation rate and significantly reduce the time the testers take to transfer data. The arbiter was developed using Xilinx ISE 14.2, which enabled us to write VHDL code that describes the design of the arbiter and to insert a basic hardware multiplier as the shared resource.
The simulation was run in ModelSim 6.3f. The ModelSim tool was very useful because its GUI enables the display of any signal waveform in the design. Since simulation has a fast turnaround time, we iteratively validated the design in ModelSim to ensure that any bugs in the code were caught early. Although the arbiter has been fully developed and is fully functional, it still needs to be tested with greater coverage. Functionality checkers should be developed for this design to test the validity of the arbiter in a runtime environment, as in practical use.

In many applications it is very useful for more than one system to interact with memory. Xilinx has released a memory controller for DDR3 which handles all communication between one system and memory. If two systems try to access DDR3 at the same time, there is a great risk that the data accessed will not be accurate. The goal of this project is to design a traffic controller, or arbiter, that decides which system's turn it is to send requests while blocking requests from the other system [8]. An arbiter prevents collisions between data accesses and keeps requests in order to ensure that memory holds accurate data.

Proceedings of International Joint Conference, 7th July 2013, Goa, India, ISBN: 978-81-927147-7-5

II. ARBITRATION SCHEMES FOR SHARED RESOURCES: A REVIEW

This section discusses arbiters, which resolve multiple requests for a single resource. In addition to being useful in their own right, arbiters form the fundamental building block for controllers that match multiple requesters with multiple resources. Whenever a resource, such as a memory,
a bus, or a multiplier, is shared by many agents, an arbiter is required to assign access to the resource to one agent at a time. Figure 1 shows a symbol for an n-input arbiter that is used to arbitrate the use of a resource, such as the input port to a crossbar switch, among a set of agents, such as the client systems (CSs) connected to that input port. Each client system that has a flit to send requests access to the input port by asserting its request line. Suppose, for example, that there are n = 8 CSs and CSs 1, 3, and 5 assert their request lines $r_1$, $r_3$, and $r_5$. The arbiter then selects one of these CSs, say CS i = 3, and grants the input port to CS 3 by asserting grant line $g_3$. CSs 1 and 5 lose the arbitration and must try again later by reasserting $r_1$ and $r_5$. Most often, these lines are simply held high until the CS is successful and receives a grant.

Figure 1: Arbiter symbols. (a) An arbiter accepts n request lines, $r_0, \ldots, r_{n-1}$, arbitrates among the asserted request lines, selecting one, $r_i$, for service, and asserts the corresponding grant line, $g_i$. (b) To allow grants to be held for an arbitrary amount of time without gaps between requests, a hold line is added to the arbiter.

2.1 Fixed Priority Schemes

2.1.1 Static fixed priority scheme

Static fixed priority is a common scheduling mechanism on most common buses. In a static fixed priority scheduling policy, each master is assigned a fixed priority value. When several masters request simultaneously, the master with the highest priority is granted. The advantages of this arbitration scheme are its simple implementation and small area cost. The static priority based architecture does not provide a means for controlling the fraction of communication bandwidth assigned to a component.
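As a minimal sketch of the static fixed priority policy described above, the following Python model grants the asserted request with the highest priority. It is a behavioral illustration rather than the authors' VHDL; the function name `fixed_priority_grant` and the convention that index 0 has the highest priority are assumptions made for this example.

```python
from typing import List


def fixed_priority_grant(requests: List[bool]) -> List[bool]:
    """Return one-hot grant lines for a static fixed priority arbiter.

    Assumption for this sketch: index 0 is the highest priority master.
    """
    grants = [False] * len(requests)
    for i, req in enumerate(requests):
        if req:
            grants[i] = True  # highest priority requester wins
            break             # all lower priority requesters lose
    return grants


# Example from the text: n = 8 clients, with clients 1, 3, and 5 requesting.
requests = [i in (1, 3, 5) for i in range(8)]
grants = fixed_priority_grant(requests)
winner = grants.index(True) if any(grants) else None
```

Under the assumed ordering, client 1 is granted, while clients 3 and 5 lose the arbitration and must keep their request lines asserted until they are served.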
If masters with high priority request frequently, the ones with low priority will starve.

If we assign priority in a linear order, we can construct an arbiter as an iterative circuit, as illustrated for the fixed-priority arbiter of Figure 2. We construct the arbiter as a linear array of bit cells. Each bit cell i, as shown in Figure 2(a), accepts one request input, $r_i$, and one carry input, $c_i$, and generates a grant output, $g_i$, and a carry output, $c_{i+1}$. The carry input $c_i$ indicates that the resource has not been granted to a higher priority request and, hence, is available for this bit cell. If the current request is true and the carry is true, the grant line is asserted and the carry output is deasserted, signalling that the resource has been granted and is no longer available.

Figure 2: A fixed priority arbiter. (a) A bit cell for an iterative arbiter. (b) A four-bit fixed priority arbiter.

We can express this logic in equation form as:

$$g_i = r_i \wedge c_i$$
$$c_{i+1} = \neg r_i \wedge c_i$$

Figure 2(b) shows a four-bit fixed priority arbiter constructed in this manner. The arbiter consists of four of the bit cells of Figure 2(a). The first and last bit cells, however, have been simplified. The first bit cell takes advantage of the fact that $c_0$ is always 1, while the last bit cell takes advantage of the fact that there is no need to generate $c_4$.

At first glance, it appears that the delay of an iterative arbiter must be linear in the number of inputs. Indeed, if they are constructed as shown in Figure 2, then that is the case, since in the worst case the carry must propagate from one end of the arbiter to the other. However, we can build these arbiters to operate in
time that grows logarithmically with the number of inputs by employing carry-lookahead techniques similar to those employed in adders [4].

2.1.2 TDM/Round-robin scheme

Time division multiplexed (TDM) scheduling divides execution time on the bus into time slots and allocates the time slots to adapters requesting use of the bus. Each time slot can span several physical transactions on the bus. A request for use of the bus might require multiple slot times to perform all required transfers. However, in this architecture, the components are provided access to the communication channel in an interleaved manner, using a two-level arbitration protocol [3].

Figure 3: Time division multiplexed/round-robin architecture

The first level of arbitration uses a timing wheel where each slot is statically reserved for a unique master. In a single rotation of the wheel, a master that has reserved more than one slot is potentially granted access to the channel multiple times. If the master interface associated with the current slot has an outstanding request, a single word transfer is granted, and the timing wheel is rotated by one slot. To alleviate the problem of wasted slots, a second level of arbitration is supported. The policy is to keep track of the last master interface to be granted access via the second level of arbitration, and to issue a grant to the next requesting master in a round-robin fashion. In the figure, the current slot is reserved for M1, but it has no data to communicate. The second level increments a round-robin pointer rr2 from its current position at M2 to the next outstanding request at M4. The advantage of this algorithm is that it is easy to implement. The disadvantage is that it can lead to errors in data transfer.

III. METHODOLOGY

A system has a shared resource, such as a bus or memory,
with which various devices may communicate upon a request being granted by an arbiter. In order to reduce the arbitration time, the arbiter comprises a state machine and a latch which run asynchronously. Consequently, once one request has been granted and acted upon, the state machine will commence arbitration for any remaining requests.

In order that a plurality of different portions of a computer system may share a common resource, such as a common bus or a common memory, it is necessary to provide an arbitration system so that only one portion uses the resource at a given time. Arbitration problems arise not only between processors or processes in computer systems, but also in a wide variety of other electronic systems. For example, arbitration needs commonly arise in communications networks, where interface access and gateway access can be important shared resources.

In a known bus arbitration scheme, at one system clock cycle the bus requests which are current are loaded into a register, and at the next system clock cycle the requests held in the register are arbitrated by a state machine to determine which request to grant. The register is necessary to overcome metastability problems; without the register, if a request were to change at the time of arbitration, an incorrect grant, or "glitch", is likely to occur.

The design presented in this paper provides an arbitration system which does not need to wait for an external clock signal before commencing an arbitration operation and which can commence a subsequent arbitration operation immediately after a preceding request has been satisfied. It also provides an arbitration system having latch means with a plurality of signal inputs, a corresponding plurality of signal outputs, and a control input. Each signal input receives a respective one of the resource request signals.
The control input receives a control signal. Each signal output provides a respective second resource request signal which is selectably a latched form or a passed form of the respective first-mentioned resource request signal, depending on the control signal. Furthermore, the system includes an asynchronous state machine having a plurality of inputs and outputs. Each input receives a respective one of the first and second resource request signals; each output is provided for a respective one of the first resource request signals and its corresponding second resource request signal. Each output provides a respective one of the grant signals to a respective one of the operating means. A control output provides the control signal to the latch means.

The state machine operates cyclically and asynchronously:
(a) to set the control signal in response to one or more of the resource request signals so that the latching means latches the first resource request signals to provide the second resource request signals;
(b) to arbitrate between the second resource request signals to select one of them, corresponding to a selected one of the operating means;
(c) to provide that one of the grant signals corresponding to the selected one of the operating means;
(d) to wait until the first resource request signal for the selected one of the operating means ceases to request the resource; and
(e) to set the control signal so that the latching means passes the first resource request signals to form the second resource request signals.

Therefore,
in accordance with the invention, an initial arbitration operation can commence immediately upon a first resource request signal arising; glitches are prevented by the action of the latching means, which prevents any second resource request signal from changing during an arbitration operation; and subsequent arbitration operations can commence immediately after a preceding request has been satisfied, as signalled by the appropriate first resource request signal.

In accordance with one aspect of the presented design, the arbitration system is operated asynchronously. Put in other words, once an arbitration operation has been completed, the system does not wait to be triggered by an external signal (such as a computer system clock) before commencing the next arbitration operation. In accordance with another aspect of the invention, there is provided a system having a common resource, a plurality of means each operable to provide a resource request signal and to communicate with the resource in response to a grant signal, and arbitration means which is responsive to the resource request signals and is operable repeatedly to determine which request to grant and provide a corresponding grant signal, characterised in that the arbitration means operates asynchronously with respect to any external signal.
In accordance with a further aspect of the design, there is provided a system having a common resource, a plurality of means each operable to provide a resource request signal and to communicate with the resource in response to a grant signal, and arbitration means which is responsive to the resource request signals and is operable repeatedly to perform an arbitration operation to determine which request to grant and provide a corresponding grant signal, characterised in that the arbitration means is operable to commence a succeeding arbitration operation in response to one or more resource request signals immediately after completion of the preceding arbitration operation [5]. Thus, the systems according to the various aspects of the design have the advantage that the arbitration period is determined solely by the propagation delays of the system and is independent of any system clock frequency or the timing of the request signals relative to the system clock.

Preferably, the arbitration means includes a state machine, which may be operable to change state cyclically: from an initial state to a decision state in response to one or more request signals; from the decision state to one of a plurality of grant states determined in the decision state from the current request signals; and from the determined grant state back to the initial state once the determined request has been met. Conventionally, the state machine has a plural-bit state variable which has a different value in each of the states, and in order to prevent glitches due to the asynchronous operation of the state machine, the correlation between the bit values of the state variable and the states is preferably arranged such that upon a change from any state to a succeeding state only one of the bit values changes.
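The cycle described above (initial state, decision state, grant state, and back) can be sketched as a small behavioral model. This is an illustration in Python under stated assumptions, not the asynchronous hardware itself: the class name `CyclicArbiter`, the method `step`, and the lowest-index-wins rule are hypothetical choices, and each call to `step` stands in for one state transition rather than a clock edge.

```python
from typing import List, Optional


class CyclicArbiter:
    """Behavioral model: INITIAL -> DECISION -> GRANT -> INITIAL."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.state = "INITIAL"
        self.latched: List[bool] = [False] * n  # latched copy of the requests
        self.granted: Optional[int] = None

    def step(self, requests: List[bool]) -> Optional[int]:
        """Advance one transition and return the granted index, if any."""
        if self.state == "INITIAL":
            if any(requests):
                # Latch the requests so they cannot change during arbitration.
                self.latched = list(requests)
                self.state = "DECISION"
        elif self.state == "DECISION":
            # Arbitrate on the latched copy only (assumed: lowest index wins).
            self.granted = self.latched.index(True)
            self.state = "GRANT"
        elif self.state == "GRANT":
            # Hold the grant until the live request is withdrawn.
            if not requests[self.granted]:
                self.granted = None
                self.state = "INITIAL"
        return self.granted


arb = CyclicArbiter(3)
trace = [
    arb.step([False, True, True]),  # requests arrive and are latched
    arb.step([True, True, True]),   # late request 0 is ignored, client 1 wins
    arb.step([True, True, True]),   # grant is held
    arb.step([True, False, True]),  # request 1 withdrawn, back to INITIAL
]
```

In this trace the request from client 0 arrives after latching, so it does not disturb the arbitration in progress and is only considered in the next cycle.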
Preferably, the bits of the state variable are allocated such that each grant signal can be determined from the value of a respective one of the bits of the state variable, thus obviating the need for further logic in order to produce the grant signals. With such a bit allocation of the state variable, it may be required that the state machine is operable to change from at least one of the states to at least one other of the states via a transient state.

In order to prevent glitches, the system preferably further comprises means to prevent a change in a request signal input to the state machine while the state machine is in the decision state. Such means may comprise a latch which is operable to pass the request signals when the state machine is in the initial state and which is operable to latch the request signals when the state machine is in the decision state. Conveniently, the state machine and the means to prevent a change are formed by a single programmable logic array. In order to determine when a grant signal has been acted upon, the state machine may be arranged to be responsive directly to the request signals and be operable to remain in a determined grant state until the respective request signal is removed.

In one embodiment, the common resource is a bus of the computer system. In another embodiment, the common resource is a memory of the computer system. By adopting the arbitration system according to the invention, it is envisaged that it will be possible to use relatively inexpensive multi-ported DRAM memories, where otherwise it would be necessary to use more expensive SRAM memories [6].
In summary, the definitions of the states and the latch enable signal are as follows and are depicted in the diagram below. Initially, by default, all the priorities and the latch enable signal are zero and the state is reset to S0; on the next cycle, if reset goes low and the clock is applied, the state is updated to S1.

State S0: The mux input select and the enable register are set to 1, and the mux request select is assigned 01, 00, or 10 for grant 111, no grant, and grant 000, respectively. The next state is S1.

State S1: When the grant is 111, the mux request select is 10 and the next state is Finish; for any grant other than 111, the next state is S2.

State S2 (memory address computation): When the count is 11, the mux request select is 10 and the next state is Finish; for any other count, the next state is S2.

State Finish (memory read access): The finish signal is assigned 1, the mux request select is 10, and the next state is Finish. Any other state value also maps to Finish.

Figure 4: State diagram for the different states and latch enable

Based on the methodology discussed, and taking a basic hardware multiplier as the shared resource, the design below represents an arbiter for six client requests. According to the state operations discussed above, grants are given to the requesting clients based on the fixed priority scheme discussed in Section II. Figure 5 below depicts the hardware design of the arbiter.

Figure 5: Hardware design of the arbiter

The static fixed priority arbitration scheme implemented in the design is a scheduling policy where each master is assigned a fixed priority value. When more than one master requests simultaneously, the master with the highest priority is granted.
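The state sequence S0, S1, S2, and Finish defined earlier in this section can be sketched as a next-state function. This is a hedged Python model, not the authors' VHDL: the names `next_state` and `run`, the integer encoding of `grant` and `count`, and the assumption that the counter starts at 0 and advances once per cycle while in S2 are all illustrative.

```python
from typing import List


def next_state(state: str, grant: int, count: int) -> str:
    """Next-state logic for the S0 / S1 / S2 / FINISH sequence."""
    if state == "S0":
        return "S1"
    if state == "S1":
        # Grant 111 skips the S2 wait and goes straight to FINISH.
        return "FINISH" if grant == 0b111 else "S2"
    if state == "S2":
        # Stay in S2 until the count reaches 11 (binary).
        return "FINISH" if count == 0b11 else "S2"
    # FINISH, and any other value, maps to FINISH.
    return "FINISH"


def run(grant: int, max_cycles: int = 10) -> List[str]:
    """Trace the states from reset (S0) until FINISH is reached."""
    state, count, trace = "S0", 0, ["S0"]
    for _ in range(max_cycles):
        if state == "FINISH":
            break
        new_state = next_state(state, grant, count)
        if state == "S2":
            count += 1  # assumed: counter advances once per cycle in S2
        state = new_state
        trace.append(state)
    return trace


fast_path = run(0b111)  # grant 111 bypasses S2
slow_path = run(0b001)  # any other grant waits in S2 for the count
```

Under these assumptions, a grant of 111 reaches Finish after two transitions, while any other grant spends four cycles in S2 before finishing.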
The synthesized arbiter block for the design using the arbitration scheme discussed is shown in Figure 6.

Figure 6: RTL schematic of the arbiter design

IV. SIMULATION RESULTS

The design is further synthesized and simulated based on the different input values for shared resources. The state operation follows the definitions of states S0, S1, S2, and Finish given in Section III. Further, priorities are assigned to the inputs coming from six different clients. The client request which is assigned the highest priority is given the grant at the earliest.
Later, the requests with the next higher priorities are processed and given grants without giving an external clock to each client for every request, thereby reducing the time delay. The simulated results for all client requests and grants are shown with priorities assigned in Figure 8.

The logic utilization summarizing the above simulation is provided in the table below.

Table 1: Logic utilization table of the design

Parameter                    Value
Selected device              Xilinx XC7V585T-1
Flip-flops                   668
Delay (ns)                   8.07
Total estimated power (mW)   20678

Figure 8: Simulation results of the final arbiter design

CONCLUSION

The arbiter with fixed priority for a single shared resource is presented in this study. The simulation results provide not only the performance analysis for various request combinations, but also the logic utilization analysis, including timing analysis and a power report under optimal conditions. The results obtained show that the framework of the arbiter can be used to explore the space of possible configurations to evaluate the performance tradeoffs.

REFERENCES

[1] H.-M. Lin, C.-C. Yen, C.-H. Shih, and J.-Y. Jou, "On compliance test of on-chip bus for SOC", in Proceedings of the 2004 Conference on Asia South Pacific Design Automation, 2004.
[2] J. Noguera and R. M. Badia, "HW/SW co-design techniques for dynamically reconfigurable architectures", IEEE Trans. VLSI Syst., Vol. 10, 2002, pp. 398-415.
[3] E. S. Shin, V. J. Mooney III, and G. F. Riley, "Round-robin Arbiter Design and Generation", Georgia Institute of Technology, Atlanta, GA, Technical Report GIT-CC-02-38, 2002.
[4] William James Dally and Brian Towles, Principles and Practices of Interconnection Networks, Morgan Kaufmann.
[5] Osman Kent, "Asynchronous arbiter state machine for arbitrating between operating devices requesting access to a shared resource", US 5179705, DuPont Pixel Systems, Ltd.
[6] F. Poletti, D. Bertozzi, L. Benini, and A. Bogliolo, "Performance Analysis of Arbitration Policies for SoC Communication Architectures", Design Automation for Embedded Systems, Numbers 2-3, 2003.
[7] Douglas L. Perry, VHDL Programming by Example, Tata McGraw-Hill Publishing Company Limited.
[8] "Design and implementation of a reconfigurable arbiter", Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing, Beijing, China, September 2007.
