Chapter 4: Q18E (page 368)
4.18 In this exercise we compare the performance of 1-issue and 2-issue processors, taking into account program transformations that can be made to optimize for 2-issue execution. Problems in this exercise refer to the following loop (written in C):
for(i=0;i!=j;i+=2)
b[i]=a[i]–a[i+1];
When writing MIPS code, assume that variables are kept in registers as follows, and that all registers except those indicated as Free are used to keep various variables, so they cannot be used for anything else.
I | j | a | b | c | Free |
R5 | R6 | R1 | R2 | R3 | R10, R11, R12 |
4.18.1 [10] Translate this C code into MIPS instructions. Your translation should be direct, without rearranging instructions to achieve better performance.
4.18.2 [10] If the loop exits aft er executing only two iterations, draw a pipeline diagram for your MIPS code from 4.18.1 executed on a 2-issue processor shown in Figure 4.69. Assume the processor has perfect branch prediction and can fetch any two instructions (not just consecutive instructions) in the same cycle.
4.18.3 [10] Rearrange your code from 4.18.1 to achieve better performance on a 2-issue statically scheduled processor from Figure 4.69.
4.18.4 [10] Repeat 4.18.2, but this time use your MIPS code from 4.18.3. 4.18.5 [10] What is the speedup of going from a 1-issue processor to a 2-issue processor from Figure 4.69? Use your code from 4.18.1 for both 1-issue and 2-issue, and assume that 1,000,000 iterations of the loop are executed. As in 4.18.2, assume that the processor has perfect branch predictions, and that a 2-issue processor can fetch any two instructions in the same cycle.
4.18.6 [10] Repeat 4.18.5, but this time assume that in the 2-issue processor one of the instructions to be executed in a cycle can be of any kind, and the other must be a non-memory instruction
Short Answer
4.18.1
The corresponding MIPS assembly language code:
ADD R5,R0,R0
Loop1: BEQ R5,R6,Done
ADD R10,R5,R1
LW R11,0(R10)
LW R10,1(R10)
SUB R10,R11,R10
ADD R11,R5,R2
SW R10,0(R11)
ADDI R5,R5,2
BEW R0,R0,Loop1
Done:
4.18.2
ADD R5,R0,R0 | “IF”,“ID”, “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”,“ID”, -- “EX”, “ME”, “WB”. |
ADD R10,R5,R1 | “IF”, -- “ID”, “EX”, “ME”, “WB”. |
LW R11,0(R10) | “IF”, -- “ID”, -- “EX”, “ME”, “WB”. |
LW R10,1(R10) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
SUB R10,R11,R10 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
ADD R11,R5,R2 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
SW R10,0(R11) | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
ADDI R5,R5,2 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
BEW R0,R0,Again | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADD R10,R5,R1 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
LW R11,0(R10) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
LW R10,1(R10) | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
SUB R10,R11,R10 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADD R11,R5,R2 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
SW R10,0(R11) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADDI R5,R5,2 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
BEW R0,R0,Again | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
4.18.3
The way of executing the two instructions fully in parallel is loaded or stored for executing together with the other instruction.
For achieving this, around every load or store instruction it will be tried to put the non-load or non-store instructions that haven’t any kind of dependencies on the load or the store.
4.18.4
ADD R5,R0,R0 | “IF”,“ID”, “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”,“ID”, -- “EX”, “ME”, “WB”. |
ADD R10,R5,R1 | “IF”, -- “ID”, “EX”, “ME”, “WB”. |
LW R11,0(R10) | “IF”, -- “ID”, -- “EX”, “ME”, “WB”. |
LW R10,1(R10) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
SUB R10,R11,R10 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
ADD R11,R5,R2 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
SW R10,0(R11) | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
ADDI R5,R5,2 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
BEW R0,R0,Again | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADD R10,R5,R1 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
LW R11,0(R10) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
LW R10,1(R10) | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
SUB R10,R11,R10 | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADD R11,R5,R2 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
SW R10,0(R11) | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
ADDI R5,R5,2 | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
BEW R0,R0,Again | “IF”, -- -- “ID”, “EX”, “ME”, “WB”. |
BEQ R5,R6,End | “IF”, -- -- “ID”, -- “EX”, “ME”, “WB”. |
4.18.5
CPI for the first issue: 1.11
CPI for the second issue: 1.06
Speed-up: 1.05
4.18.6
CPI for the first issue: 1.11
CPI for the second issue: 1.06
Speed-up: 1.34