Five steps are involved in the fetch and execution of the lw instruction. The time taken to complete each step is as follows:
Instruction fetch: 200 ps
Register read: 100 ps (for base value)
ALU: 200 ps (for memory address)
Memory read: 200 ps (for reading data from memory)
Register write: 100 ps (for register write)
Execution time for lw instruction = 800 ps
Execution time for a sequence of 3 lw instructions = 2400 ps
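The totals above can be checked with a short calculation. This is only a sketch: the names `LW_STEP_PS` and `nonpipelined_time_ps` are illustrative and do not come from the notes.

```python
# Step latencies for lw in picoseconds, copied from the timings above.
LW_STEP_PS = {
    "instruction_fetch": 200,
    "register_read": 100,
    "alu": 200,
    "memory_read": 200,
    "register_write": 100,
}


def nonpipelined_time_ps(num_instructions: int) -> int:
    """Total time when each lw completes all five steps before the next one starts."""
    return num_instructions * sum(LW_STEP_PS.values())


print(nonpipelined_time_ps(1))  # 800
print(nonpipelined_time_ps(3))  # 2400
```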
Each pipeline stage takes one clock cycle. The clock cycle for a pipeline stage must be long enough to accommodate the slowest operation (200 ps in our example).
Figure 6.3: Nonpipelined versus pipelined execution of 3 lw instructions
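The pipelined side of this comparison can be estimated as follows. The sketch assumes no hazards and that every stage is stretched to the slowest step. The function name `pipelined_time_ps` is illustrative.

```python
# Step latencies for lw in picoseconds, in pipeline order
# (fetch, register read, ALU, memory read, register write).
STAGE_PS = [200, 100, 200, 200, 100]


def pipelined_time_ps(num_instructions: int, stage_ps=STAGE_PS) -> int:
    """Time to finish the given number of instructions on an ideal pipeline.

    Assumes no stalls. The first instruction needs one cycle per stage,
    and each later instruction completes one cycle after the previous one.
    """
    if num_instructions <= 0:
        return 0
    clock_ps = max(stage_ps)  # the cycle must fit the slowest stage
    cycles = len(stage_ps) + num_instructions - 1
    return cycles * clock_ps


print(pipelined_time_ps(3))  # 1400
```

Under these assumptions the three lw instructions take 7 cycles of 200 ps, or 1400 ps, compared with 2400 ps without pipelining.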
The pipelined processor has a lower average CPI when compared to a multicycle implementation with the same clock rate.
The pipelined processor has a lower product of clock cycle time and CPI when compared to the single-cycle implementation.
The ideal speedup is proportional to the number of stages.
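A quick way to see how the speedup behaves is to compare the two execution times for a growing number of lw instructions. This is a hedged sketch using the lw step timings from the example. It assumes no hazards, and the function name `speedup` is illustrative.

```python
# Step latencies for lw in picoseconds, in pipeline order.
STAGE_PS = [200, 100, 200, 200, 100]


def speedup(num_instructions: int) -> float:
    """Nonpipelined time divided by ideal (stall-free) pipelined time."""
    nonpipelined = num_instructions * sum(STAGE_PS)
    pipelined = (len(STAGE_PS) + num_instructions - 1) * max(STAGE_PS)
    return nonpipelined / pipelined


for n in (3, 1000, 1_000_000):
    print(n, round(speedup(n), 3))
```

With these timings the speedup approaches 800/200 = 4 rather than 5, because the stages are not perfectly balanced. The speedup reaches the number of stages only when every stage takes the same time.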
Pipeline Hazards
Hazard:
A situation in pipelining when the next instruction cannot execute in the next clock cycle
Structural Hazard
The hardware cannot support the combination of instructions that we want to execute in the same clock cycle.
Data Hazards
A data hazard can occur when an instruction in the pipeline depends on data produced by an earlier instruction that is still in the pipeline.
Consider the following sequence of instructions:
    add $s0, $t0, $t1
    sub $t2, $s0, $t3
The sub instruction is dependent on the result in register $s0 of the first instruction.
Consider the following sequence of instructions:
    lw $s0, 20($t1)
    sub $t2, $s0, $t3
The data required by the sub instruction is available only after the fourth stage of the first instruction.
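The dependence in both sequences is a read-after-write on $s0. A minimal sketch of how such a dependence could be detected is shown below. The `Instr` record and the `raw_dependences` helper are hypothetical names introduced only for illustration, and the instructions are entered by hand rather than parsed from assembly text.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical record: opcode, destination register, source registers."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def raw_dependences(producer: Instr, consumer: Instr) -> List[str]:
    """Registers written by `producer` and then read by `consumer`."""
    # $zero is hard-wired to 0 in MIPS, so writing it never creates a dependence.
    return [r for r in consumer.srcs if r == producer.dest and r != "$zero"]


add = Instr("add", "$s0", ("$t0", "$t1"))
lw = Instr("lw", "$s0", ("$t1",))  # lw $s0, 20($t1) reads only the base register
sub = Instr("sub", "$t2", ("$s0", "$t3"))

print(raw_dependences(add, sub))  # ['$s0']
print(raw_dependences(lw, sub))   # ['$s0']
```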
S. Barua CPSC 440 sbarua@fullerton.edu http://sbarua.ecs.fullerton.edu
Data Hazard - Solutions
Two methods are used to resolve a data hazard.
Forwarding or bypassing
Retrieves the missing data element from internal buffers instead of waiting for it to come from the registers or memory location specified by the instruction (Figure 6.5)
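The effect of forwarding on the two example sequences can be sketched as a stall count. The numbers below are assumptions based on the classic five-stage MIPS pipeline, not values stated in these notes. Without forwarding, the register file is assumed to write in the first half of a cycle and read in the second half, so a dependent instruction needs a gap of two instructions. With forwarding, an ALU result is usable by the next instruction, while a loaded value needs a gap of one. The function name `stall_cycles` is illustrative.

```python
def stall_cycles(producer_op: str, distance: int = 1, forwarding: bool = True) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Assumes the textbook five-stage pipeline (IF, ID, EX, MEM, WB).
    """
    if forwarding:
        # A load delivers its data only at the end of MEM (load-use case).
        gap_needed = 1 if producer_op == "lw" else 0
    else:
        # The consumer must read registers in the cycle the producer writes back.
        gap_needed = 2
    return max(0, gap_needed - (distance - 1))


print(stall_cycles("add"))                    # 0
print(stall_cycles("lw"))                     # 1
print(stall_cycles("add", forwarding=False))  # 2
```

Under these assumptions, forwarding removes the stall for the add/sub pair entirely, but the lw/sub pair still needs one stall cycle.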
Always stall
The pipeline is stalled until the outcome of the branch is determined and the address of the next instruction to fetch is known. The penalty will be several clock cycles.
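The cost of always stalling can be estimated with a simple CPI model. This is a sketch only. The branch fraction and penalty used in the example call are made-up illustrative values, not figures from these notes, and the function name `effective_cpi` is hypothetical.

```python
def effective_cpi(branch_fraction: float, branch_penalty_cycles: int,
                  base_cpi: float = 1.0) -> float:
    """Average CPI when every branch stalls the pipeline for a fixed penalty."""
    if not 0.0 <= branch_fraction <= 1.0:
        raise ValueError("branch_fraction must be between 0 and 1")
    return base_cpi + branch_fraction * branch_penalty_cycles


# Illustrative values only: 20% branches, 3-cycle stall per branch.
print(effective_cpi(branch_fraction=0.2, branch_penalty_cycles=3))
```

The model shows why the penalty matters: the extra CPI grows linearly with both the share of branches and the number of stall cycles per branch.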
Pipelining reduces the average execution time per instruction, thereby improving the system performance.
Hazards limit the performance improvement, but appropriate hardware/software techniques can be devised to circumvent these limits.