Chapter 4, Exercise 4.9 (page 361)
In this exercise, we examine how data dependences affect execution in the basic 5-stage pipeline described in section 4.5. Problems in this exercise refer to the following sequence of instructions:
or r1,r2,r3
or r2,r1,r4
or r1,r1,r2
Also, assume the following cycle times for each of the options related to forwarding:
Without Forwarding | With Full Forwarding | With ALU-ALU Forwarding Only |
250 ps | 300 ps | 290 ps |
4.9.1 Indicate dependences and their type.
4.9.2 Assume there is no forwarding in this pipelined processor. Indicate hazards and add nop instructions to eliminate them.
4.9.3 Assume there is full forwarding. Indicate hazards and add nop instructions to eliminate them.
4.9.4 What is the total execution time of this instruction sequence without forwarding and with full forwarding? What is the speedup achieved by adding full forwarding to a pipeline that had no forwarding?
4.9.5 Add nop instructions to this code to eliminate hazards if there is ALU-ALU forwarding only (no forwarding from the MEM to the EX stage).
4.9.6 What is the total execution time of this instruction sequence with only ALU-ALU forwarding? What is the speedup over a no-forwarding pipeline?
Short Answer
4.9.1 For the given instruction sequence, the dependences and their types are as follows:
I1: or r1, r2, r3
I2: or r2, r1, r4
I3: or r1, r1, r2
Dependence | From | To |
RAW on r1 | I1 | I2, I3 |
RAW on r2 | I2 | I3 |
WAR on r2 | I1 | I2 |
WAR on r1 | I2 | I3 |
WAW on r1 | I1 | I3 |
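The dependences can be checked mechanically. The following is a minimal sketch, not part of the textbook solution: it compares every pair of instructions in the sequence from the problem statement and reports RAW, WAR, and WAW dependences. The tuple encoding and the name `find_dependences` are illustrative assumptions, and the pairwise check ignores intervening redefinitions of a register, which is sufficient for this three-instruction sequence.

```python
# Each instruction is (name, destination, source1, source2).
PROGRAM = [
    ("I1", "r1", "r2", "r3"),  # or r1,r2,r3
    ("I2", "r2", "r1", "r4"),  # or r2,r1,r4
    ("I3", "r1", "r1", "r2"),  # or r1,r1,r2
]


def find_dependences(program):
    """Return (kind, register, earlier, later) for every dependent pair."""
    found = []
    for i, (early, e_dst, *e_srcs) in enumerate(program):
        for late, l_dst, *l_srcs in program[i + 1:]:
            if e_dst in l_srcs:   # later instruction reads what earlier wrote
                found.append(("RAW", e_dst, early, late))
            if l_dst in e_srcs:   # later instruction overwrites what earlier read
                found.append(("WAR", l_dst, early, late))
            if l_dst == e_dst:    # both instructions write the same register
                found.append(("WAW", e_dst, early, late))
    return found


for kind, reg, early, late in find_dependences(PROGRAM):
    print(f"{kind} on {reg} from {early} to {late}")
```

In the basic in-order 5-stage pipeline only the RAW dependences can become hazards; the WAR and WAW dependences are listed because the exercise asks for all dependences.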
4.9.2 In a pipeline without forwarding, each RAW dependence on the immediately preceding instruction is a hazard. Two nop instructions after each producer eliminate the hazards:
Instructions | Description |
I1: or r1, r2, r3 | Delay I2 to avoid the RAW hazard on r1 from I1 |
nop | |
nop | |
I2: or r2, r1, r4 | Delay I3 to avoid the RAW hazard on r2 from I2 |
nop | |
nop | |
I3: or r1, r1, r2 | |
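The nop placement can be reproduced with a small scheduler. This is a sketch under one stated assumption: the register file is written in the first half of a cycle and read in the second half, so a consumer may be in ID during the same cycle in which its producer is in WB. Under that assumption a consumer must issue at least three slots after its producer. The function name `insert_nops` and the tuple encoding are hypothetical.

```python
# Each instruction is (text, destination, source1, source2).
PROGRAM = [
    ("or r1,r2,r3", "r1", "r2", "r3"),
    ("or r2,r1,r4", "r2", "r1", "r4"),
    ("or r1,r1,r2", "r1", "r1", "r2"),
]


def insert_nops(program, min_distance=3):
    """Pad with nops so every reader issues >= min_distance slots after its writer."""
    scheduled = []        # output sequence, including "nop" entries
    last_write_slot = {}  # register -> slot of the most recent writer
    for text, dst, *srcs in program:
        slot = len(scheduled)  # earliest slot if there were no hazard
        for src in srcs:
            if src in last_write_slot:
                slot = max(slot, last_write_slot[src] + min_distance)
        scheduled.extend(["nop"] * (slot - len(scheduled)))
        scheduled.append(text)
        last_write_slot[dst] = slot
    return scheduled


for line in insert_nops(PROGRAM):
    print(line)
```

For this sequence the sketch inserts four nops in total, two before I2 and two before I3.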
4.9.3 With full forwarding, no nop instructions are needed, because each result is forwarded to the instruction that uses it:
Instructions | Dependence |
I1: or r1, r2, r3 | |
I2: or r2, r1, r4 | No RAW hazard on r1 from I1 (forwarded) |
I3: or r1, r1, r2 | No RAW hazard on r2 from I2 (forwarded) |
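To visualize why no stall is needed, the sketch below prints a stage-by-cycle diagram for a pipeline that never stalls, which is the behavior full forwarding gives for this sequence. The helper name `pipeline_diagram` is an illustrative assumption; the stage names are those of the basic 5-stage pipeline.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]
SEQUENCE = ["or r1,r2,r3", "or r2,r1,r4", "or r1,r1,r2"]


def pipeline_diagram(instructions):
    """Return (total_cycles, rows) for a pipeline with no stalls."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for index, text in enumerate(instructions):
        cells = [""] * total_cycles
        for offset, stage in enumerate(STAGES):
            # Instruction `index` enters IF one cycle after its predecessor.
            cells[index + offset] = stage
        rows.append((text, cells))
    return total_cycles, rows


total, rows = pipeline_diagram(SEQUENCE)
for text, cells in rows:
    print(f"{text:<14}" + "".join(f"{cell:<4}" for cell in cells))
print(f"total cycles: {total}")
```

Each instruction enters EX one cycle after its predecessor leaves it, so the predecessor's ALU result is available to forward exactly when it is needed.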
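4.9.4 The total execution time is the number of cycles multiplied by the cycle time. Without stalls, a three-instruction sequence completes in 7 cycles (5 for the first instruction, then 1 for each of the other two), and each nop adds one cycle:
Without forwarding = (7 + 4) × 250 ps = 2750 ps
With full forwarding = 7 × 300 ps = 2100 ps
Speedup due to forwarding = 2750 / 2100 ≈ 1.31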
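4.9.5 With ALU-ALU forwarding only, the sequence and the forwarding it relies on are as follows:
Instructions | Description |
I1: or r1, r2, r3 | |
I2: or r2, r1, r4 | ALU-ALU forwarding of r1 from I1 |
I3: or r1, r1, r2 | ALU-ALU forwarding of r2 from I2 |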
4.9.6 The total execution time:
Without forwarding = (7 + 4) × 250 ps = 2750 ps
With ALU-ALU forwarding only = 7 × 290 ps = 2030 ps
Speedup with ALU-ALU forwarding = 2750 / 2030 ≈ 1.35
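The timing arithmetic for all three options can be written out as a short script. This is a sketch that takes the cycle times from the table in the problem statement (250 ps, 300 ps, 290 ps) and the nop counts from the answers to 4.9.2, 4.9.3, and 4.9.5 (4, 0, and 0) as inputs; it does not derive the nop counts itself. The function name `execution_time_ps` is an illustrative assumption.

```python
PIPELINE_DEPTH = 5   # stages in the basic pipeline
N_INSTRUCTIONS = 3   # instructions in the sequence, excluding nops


def execution_time_ps(n_nops, cycle_time_ps):
    """Cycles = issued slots + (depth - 1) cycles to drain the pipeline."""
    cycles = (N_INSTRUCTIONS + n_nops) + (PIPELINE_DEPTH - 1)
    return cycles * cycle_time_ps


# option name -> (nops taken from the answers, cycle time in ps from the table)
OPTIONS = {
    "no forwarding": (4, 250),
    "full forwarding": (0, 300),
    "ALU-ALU forwarding only": (0, 290),
}

baseline = execution_time_ps(*OPTIONS["no forwarding"])
for name, (nops, cycle) in OPTIONS.items():
    time_ps = execution_time_ps(nops, cycle)
    print(f"{name}: {time_ps} ps, speedup {baseline / time_ps:.2f}")
```

The comparison shows the trade-off the exercise is built around: forwarding removes stall cycles but lengthens the clock cycle, so the speedup is smaller than the reduction in cycle count alone would suggest.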