Chapter 4: Q18E (page 368)

4.18 In this exercise we compare the performance of 1-issue and 2-issue processors, taking into account program transformations that can be made to optimize for 2-issue execution. Problems in this exercise refer to the following loop (written in C):
for(i=0;i!=j;i+=2)
  b[i]=a[i]-a[i+1];
When writing MIPS code, assume that variables are kept in registers as follows, and that all registers except those indicated as Free are used to keep various variables, so they cannot be used for anything else.
i	j	a	b	c	Free
R5	R6	R1	R2	R3	R10, R11, R12
4.18.1 [10] Translate this C code into MIPS instructions. Your translation should be direct, without rearranging instructions to achieve better performance.
4.18.2 [10] If the loop exits after executing only two iterations, draw a pipeline diagram for your MIPS code from 4.18.1 executed on a 2-issue processor shown in Figure 4.69. Assume the processor has perfect branch prediction and can fetch any two instructions (not just consecutive instructions) in the same cycle.
4.18.3 [10] Rearrange your code from 4.18.1 to achieve better performance on a 2-issue statically scheduled processor from Figure 4.69.
4.18.4 [10] Repeat 4.18.2, but this time use your MIPS code from 4.18.3.
4.18.5 [10] What is the speedup of going from a 1-issue processor to a 2-issue processor from Figure 4.69? Use your code from 4.18.1 for both 1-issue and 2-issue, and assume that 1,000,000 iterations of the loop are executed. As in 4.18.2, assume that the processor has perfect branch prediction, and that a 2-issue processor can fetch any two instructions in the same cycle.
4.18.6 [10] Repeat 4.18.5, but this time assume that in the 2-issue processor one of the instructions to be executed in a cycle can be of any kind, and the other must be a non-memory instruction.

Short Answer

4.18.1

The corresponding MIPS assembly language code:

ADD R5,R0,R0
Loop1: BEQ R5,R6,Done
ADD R10,R5,R1
LW R11,0(R10)
LW R10,1(R10)
SUB R10,R11,R10
ADD R11,R5,R2
SW R10,0(R11)
ADDI R5,R5,2
BEQ R0,R0,Loop1
Done:
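As a cross-check on the translation, here is a minimal Python sketch of what the loop computes and how many instructions it executes. The function names `run_loop` and `dynamic_instruction_count` are hypothetical helpers, not part of the exercise; the count assumes one setup instruction, nine instructions per iteration, and one final BEQ that leaves the loop.

```python
def run_loop(a, j):
    """Reference model of: for (i = 0; i != j; i += 2) b[i] = a[i] - a[i+1]."""
    b = [0] * len(a)
    i = 0
    iterations = 0
    while i != j:                 # mirrors BEQ R5,R6,Done at the top of the loop
        b[i] = a[i] - a[i + 1]    # mirrors LW, LW, SUB, SW
        i += 2                    # mirrors ADDI R5,R5,2
        iterations += 1
    return b, iterations


def dynamic_instruction_count(iterations):
    """Instructions executed: ADD R5,R0,R0, then 9 per iteration, then the exiting BEQ."""
    return 1 + 9 * iterations + 1


if __name__ == "__main__":
    b, n = run_loop([5, 3, 9, 4], 4)
    print(b, n, dynamic_instruction_count(n))
```

For two iterations the count is 20, which matches the number of rows a two-iteration pipeline diagram needs.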

4.18.2

ADD R5,R0,R0	IF ID EX ME WB
BEQ R5,R6,Done	IF ID -- EX ME WB
ADD R10,R5,R1	IF -- ID EX ME WB
LW R11,0(R10)	IF -- ID -- EX ME WB
LW R10,1(R10)	IF -- -- ID EX ME WB
SUB R10,R11,R10	IF -- -- ID -- EX ME WB
ADD R11,R5,R2	IF -- -- ID EX ME WB
SW R10,0(R11)	IF -- -- ID -- EX ME WB
ADDI R5,R5,2	IF -- -- ID EX ME WB
BEQ R0,R0,Loop1	IF -- -- ID -- EX ME WB
BEQ R5,R6,Done	IF -- -- ID EX ME WB
ADD R10,R5,R1	IF -- -- ID -- EX ME WB
LW R11,0(R10)	IF -- -- ID EX ME WB
LW R10,1(R10)	IF -- -- ID -- EX ME WB
SUB R10,R11,R10	IF -- -- ID EX ME WB
ADD R11,R5,R2	IF -- -- ID -- EX ME WB
SW R10,0(R11)	IF -- -- ID EX ME WB
ADDI R5,R5,2	IF -- -- ID -- EX ME WB
BEQ R0,R0,Loop1	IF -- -- ID EX ME WB
BEQ R5,R6,Done	IF -- -- ID -- EX ME WB

4.18.3

The only way to execute two instructions fully in parallel is to issue a load or store together with another instruction.

To achieve this, we try to place around each load or store instruction non-memory instructions that have no dependences on that load or store.
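The pairing rule can be sketched as a small check. This is an illustration only: it assumes the Figure 4.69 processor issues one load/store together with one non-memory instruction per cycle, and the helper names `parse` and `can_pair` are hypothetical. The last example pairs a load with an ADD that writes the free register R12, which is one possible rearrangement rather than the source's own schedule.

```python
MEMORY_OPS = {"LW", "SW"}


def parse(instr):
    """Split e.g. 'LW R11,0(R10)' into (opcode, destination, set of sources)."""
    op, rest = instr.split(None, 1)
    args = [x.strip() for x in rest.replace("(", ",").replace(")", "").split(",")]
    if op == "LW":                      # LW rt, off(rs): writes rt, reads rs
        return op, args[0], {args[2]}
    if op == "SW":                      # SW rt, off(rs): reads rt and rs
        return op, None, {args[0], args[2]}
    if op == "BEQ":                     # BEQ rs, rt, label: reads rs and rt
        return op, None, {args[0], args[1]}
    if op == "ADDI":                    # ADDI rt, rs, imm: writes rt, reads rs
        return op, args[0], {args[1]}
    return op, args[0], {args[1], args[2]}   # ADD / SUB rd, rs, rt


def can_pair(first, second):
    """True if the two instructions could issue in the same cycle (assumed rule)."""
    op1, dst1, _ = parse(first)
    op2, dst2, src2 = parse(second)
    one_memory_op = (op1 in MEMORY_OPS) != (op2 in MEMORY_OPS)
    # The second instruction must not read or overwrite what the first one writes.
    independent = dst1 is None or (dst1 not in src2 and dst1 != dst2)
    return one_memory_op and independent


if __name__ == "__main__":
    print(can_pair("LW R11,0(R10)", "LW R10,1(R10)"))
    print(can_pair("LW R10,1(R10)", "SUB R10,R11,R10"))
    print(can_pair("SW R10,0(R11)", "ADDI R5,R5,2"))
    print(can_pair("LW R11,0(R10)", "ADD R12,R5,R2"))
```

Two loads never pair under this rule, and SUB cannot pair with the load that produces its operand, which is why independent non-memory instructions are moved next to each load or store.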

4.18.4

ADD R5,R0,R0	IF ID EX ME WB
BEQ R5,R6,Done	IF ID -- EX ME WB
ADD R10,R5,R1	IF -- ID EX ME WB
LW R11,0(R10)	IF -- ID -- EX ME WB
LW R10,1(R10)	IF -- -- ID EX ME WB
SUB R10,R11,R10	IF -- -- ID -- EX ME WB
ADD R11,R5,R2	IF -- -- ID EX ME WB
SW R10,0(R11)	IF -- -- ID -- EX ME WB
ADDI R5,R5,2	IF -- -- ID EX ME WB
BEQ R0,R0,Loop1	IF -- -- ID -- EX ME WB
BEQ R5,R6,Done	IF -- -- ID EX ME WB
ADD R10,R5,R1	IF -- -- ID -- EX ME WB
LW R11,0(R10)	IF -- -- ID EX ME WB
LW R10,1(R10)	IF -- -- ID -- EX ME WB
SUB R10,R11,R10	IF -- -- ID EX ME WB
ADD R11,R5,R2	IF -- -- ID -- EX ME WB
SW R10,0(R11)	IF -- -- ID EX ME WB
ADDI R5,R5,2	IF -- -- ID -- EX ME WB
BEQ R0,R0,Loop1	IF -- -- ID EX ME WB
BEQ R5,R6,Done	IF -- -- ID -- EX ME WB

4.18.5

CPI for the 1-issue processor: 1.11

CPI for the 2-issue processor: 1.06

Speed-up: 1.05

4.18.6

CPI for the 1-issue processor: 1.11

CPI for the 2-issue processor: 0.83

Speed-up: 1.34

Step by step solution

Define the concept.

4.18.1

The MIPS instructions used in the translation:

The branch-on-equal instruction (beq) is a decision-making instruction in MIPS assembly language. “beq reg1, reg2, Label” branches to the statement labeled “Label” if the value in “reg1” equals the value in “reg2”.
The I-type instruction “sw $t2, 0($t3)” stores a word to memory: “$t2” is the source register holding the value to store, “$t3” is the base address register, and “0” is the offset.
The I-type instruction “addi $t3, $t4, 1” adds an immediate value: “$t4” is the source register, “$t3” is the destination register, and “1” is the immediate, so $t3 = $t4 + 1.
The R-type instruction “add $t3, $t4, $t5” adds two registers: $t3 = $t4 + $t5.
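The semantics of these four instructions can be illustrated with a toy interpreter. This is a minimal sketch, not a MIPS simulator: the function `execute`, the dictionary-based register file, and the dictionary used as memory are all assumptions made for illustration.

```python
def execute(instr, regs, memory):
    """Apply one instruction to regs/memory; return a branch target label or None."""
    op, *args = instr.replace(",", " ").replace("(", " ").replace(")", " ").split()
    if op == "add":                       # add rd, rs, rt: rd = rs + rt
        rd, rs, rt = args
        regs[rd] = regs[rs] + regs[rt]
    elif op == "addi":                    # addi rt, rs, imm: rt = rs + imm
        rt, rs, imm = args
        regs[rt] = regs[rs] + int(imm)
    elif op == "sw":                      # sw rt, off(base): memory[base + off] = rt
        rt, offset, base = args
        memory[regs[base] + int(offset)] = regs[rt]
    elif op == "beq":                     # beq rs, rt, label: branch if rs == rt
        rs, rt, label = args
        if regs[rs] == regs[rt]:
            return label
    return None


if __name__ == "__main__":
    regs = {"$t2": 7, "$t3": 100, "$t4": 5, "$t5": 1}
    memory = {}
    execute("add $t3, $t4, $t5", regs, memory)
    execute("sw $t2, 0($t3)", regs, memory)
    print(regs["$t3"], memory)
```

Note that in `sw` the register in parentheses supplies the address and is only read, never written.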

4.18.5

CPI for the 1-issue processor is 1.11:

There is one stall cycle in every iteration because of the data hazard between the second LW and the SUB instruction that follows it.
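The single stall per iteration can be reproduced with a short sketch that scans the loop body for load-use hazards. It assumes a single-issue pipeline with full forwarding and perfect branch prediction, so the only stall is an instruction reading a register loaded by the LW immediately before it; the names `LOOP_BODY` and `load_use_stalls` are hypothetical.

```python
# Each entry: (opcode, destination register or None, set of source registers).
LOOP_BODY = [
    ("BEQ", None, {"R5", "R6"}),
    ("ADD", "R10", {"R5", "R1"}),
    ("LW", "R11", {"R10"}),
    ("LW", "R10", {"R10"}),
    ("SUB", "R10", {"R11", "R10"}),
    ("ADD", "R11", {"R5", "R2"}),
    ("SW", None, {"R10", "R11"}),
    ("ADDI", "R5", {"R5"}),
    ("BEQ", None, {"R0"}),
]


def load_use_stalls(body):
    """Count instructions that read a register loaded by the LW just before them."""
    stalls = 0
    for prev, curr in zip(body, body[1:]):
        if prev[0] == "LW" and prev[1] in curr[2]:
            stalls += 1
    return stalls


stalls = load_use_stalls(LOOP_BODY)
cycles_per_iteration = len(LOOP_BODY) + stalls
cpi = cycles_per_iteration / len(LOOP_BODY)
print(stalls, cycles_per_iteration, round(cpi, 2))
```

Only the second LW followed by SUB triggers the check, giving 10 cycles for 9 instructions.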

CPI for the 2-issue processor is 1.06:

The two LW instructions cannot execute in parallel with any other instruction, and SUB stalls because it depends on the second LW. The SW instruction executes in parallel with ADDI in even-numbered iterations.

Speed-up = 1.11 / 1.06 = 1.05

4.18.6

CPI for the 1-issue processor is 1.11:

There is one stall cycle in every iteration because of the data hazard between the second LW and the SUB instruction that follows it.

CPI for the 2-issue processor is 0.83:

SUB is stalled in every iteration because it depends on the second LW. In odd-numbered iterations, the only instructions that execute as a pair are ADDI and BEQ. In even-numbered iterations, the two LW instructions cannot execute as a pair.

Speed-up = 1.11 / 0.83 = 1.34

Determine the calculation.

4.18.1

The corresponding MIPS assembly language code:

ADD R5,R0,R0 // R5 = R0+R0, so i = 0
Loop1: BEQ R5,R6,Done // exit to “Done” when i == j
ADD R10,R5,R1 // R10 = R5+R1, the address of a[i]
LW R11,0(R10) // load a[i]: “R11” is the destination register, “R10” is the base register, and “0” is the offset
LW R10,1(R10) // load a[i+1] into R10
SUB R10,R11,R10 // R10 = R11-R10
ADD R11,R5,R2 // R11 = R5+R2, the address of b[i]
SW R10,0(R11) // store to b[i]: “R10” is the source register, “R11” is the base register, and “0” is the offset
ADDI R5,R5,2 // R5 = R5+2
BEQ R0,R0,Loop1 // always taken: go back to “Loop1”
Done:

4.18.5

CPI for the 1-issue processor $= \frac{10\ \text{cycles}}{9\ \text{instructions}} = 1.11$

CPI for the 2-issue processor $= \frac{19\ \text{cycles}}{18\ \text{instructions}} = 1.06$

Speed-up $= \frac{1.11}{1.06} = 1.047 \approx 1.05$
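The same arithmetic can be checked with exact fractions and scaled to the 1,000,000 iterations named in the exercise. This is a sketch under the cycle counts given above; the variable names are illustrative, and the one-time setup instruction and pipeline fill are ignored because they are negligible at this iteration count.

```python
from fractions import Fraction

ITERATIONS = 1_000_000
INSTR_PER_ITER = 9

cpi_1issue = Fraction(10, 9)     # 10 cycles per 9 instructions
cpi_2issue = Fraction(19, 18)    # 19 cycles per 18 instructions (two iterations)

cycles_1issue = cpi_1issue * INSTR_PER_ITER * ITERATIONS
cycles_2issue = cpi_2issue * INSTR_PER_ITER * ITERATIONS
speedup = cycles_1issue / cycles_2issue

print(cycles_1issue, cycles_2issue, round(float(speedup), 2))
```

Because both processors execute the same instructions, the speed-up is just the ratio of the two CPIs; the exact ratio is 20/19, which also rounds to 1.05.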

4.18.6

CPI for the 1-issue processor $= \frac{10\ \text{cycles}}{9\ \text{instructions}} = 1.11$

CPI for the 2-issue processor $= \frac{15\ \text{cycles}}{18\ \text{instructions}} = 0.83$

Speed-up $= \frac{1.11}{0.83} = 1.3373 \approx 1.34$
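A short sketch shows how rounding affects this result. Dividing the two-decimal CPIs gives 1.34, while the unrounded cycle counts give exactly 4/3, or about 1.33. The helper `cpi` is a hypothetical name used only for this illustration.

```python
from fractions import Fraction


def cpi(cycles, instructions):
    """Cycles per instruction as an exact fraction."""
    return Fraction(cycles, instructions)


exact_speedup = cpi(10, 9) / cpi(15, 18)     # unrounded CPIs
rounded_speedup = round(1.11 / 0.83, 2)      # CPIs rounded to two decimals first

print(exact_speedup, round(float(exact_speedup), 2), rounded_speedup)
```

Carrying the fractions through to the end and rounding once avoids the small discrepancy.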
