Warning: foreach() argument must be of type array|object, bool given in /var/www/html/web/app/themes/studypress-core-theme/template-parts/header/mobile-offcanvas.php on line 20

4.18 In this exercise we compare the performance of 1-issue and 2-issue processors, taking into account program transformations that can be made to optimize for 2-issue execution. Problems in this exercise refer to the following loop (written in C):

for(i=0;i!=j;i+=2)

b[i]=a[i]–a[i+1];

When writing MIPS code, assume that variables are kept in registers as follows, and that all registers except those indicated as Free are used to keep various variables, so they cannot be used for anything else.

I

j

a

b

c

Free

R5

R6

R1

R2

R3

R10, R11, R12

4.18.1 [10] Translate this C code into MIPS instructions. Your translation should be direct, without rearranging instructions to achieve better performance.

4.18.2 [10] If the loop exits aft er executing only two iterations, draw a pipeline diagram for your MIPS code from 4.18.1 executed on a 2-issue processor shown in Figure 4.69. Assume the processor has perfect branch prediction and can fetch any two instructions (not just consecutive instructions) in the same cycle.

4.18.3 [10] Rearrange your code from 4.18.1 to achieve better performance on a 2-issue statically scheduled processor from Figure 4.69.

4.18.4 [10] Repeat 4.18.2, but this time use your MIPS code from 4.18.3. 4.18.5 [10] What is the speedup of going from a 1-issue processor to a 2-issue processor from Figure 4.69? Use your code from 4.18.1 for both 1-issue and 2-issue, and assume that 1,000,000 iterations of the loop are executed. As in 4.18.2, assume that the processor has perfect branch predictions, and that a 2-issue processor can fetch any two instructions in the same cycle.

4.18.6 [10] Repeat 4.18.5, but this time assume that in the 2-issue processor one of the instructions to be executed in a cycle can be of any kind, and the other must be a non-memory instruction

Short Answer

Expert verified

4.18.1

The corresponding MIPS assembly language code:

ADD R5,R0,R0

Loop1: BEQ R5,R6,Done

ADD R10,R5,R1

LW R11,0(R10)

LW R10,1(R10)

SUB R10,R11,R10

ADD R11,R5,R2

SW R10,0(R11)

ADDI R5,R5,2

BEW R0,R0,Loop1

Done:

4.18.2

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.3

The way of executing the two instructions fully in parallel is loaded or stored for executing together with the other instruction.

For achieving this, around every load or store instruction it will be tried to put the non-load or non-store instructions that haven’t any kind of dependencies on the load or the store.

4.18.4

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.5

CPI for the first issue: 1.11

CPI for the second issue: 1.06

Speed-up: 1.05

4.18.6

CPI for the first issue: 1.11

CPI for the second issue: 1.06

Speed-up: 1.34

Step by step solution

01

Define the concept.

4.18.1

The explanation of the purpose of using the MIPS instruction:

  • The “branch-on-equal (beq)” is a decision-making instruction in MIPS assembly language. The purpose of using this MIPS assembly instruction “beq reg1, reg2 Label” is going to the statement “Label” if the value of “reg1” is equal to the “reg2”.
  • The “I type” instruction “sw$t2, 0($t3)” is used for storing the word where “$t2” is the source register and “$t3” is the destination register and “0” is the offset.
  • One of the “I type” MIPS instruction is “addi $t3 $t4 1” “$t4” is the source register, “$t3” is the destination register, and “1” is the immediate value. The purpose of using it for add immediately.
  • One of the MIPS instructions is “add $t3 $t4 $t5” for add where $t3 = $t4 + $t5.

4.18.2

The pipeline diagram:

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.3

The way of executing the two instructions fully in parallel is loaded or stored for executing together with the other instruction.

For achieving this, around every load or store instruction it will be tried to put the non-load or non-store instructions that haven’t any kind of dependencies on the load or the store.

4.18.4

The pipeline diagram:

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.5

CPI for the first issue is 1.11:

There is one stall of the cycle in every iteration as the “data hazard” between the “LW-2(the second)” and the following instruction is “SUB”.

CPI for the second issue is 1.06:

These two LW instructions can’t be executed in parallel with other instructions, and the stalls of the “SUB”. As it relies on the “LW-2(the second)”. The instruction of the “SW” executes in parallel with the MIPS instruction “ADDI” in the iterations that are even-numbered.

Speed-up =

4.18.6

CPI for the first issue is 1.11:

There is one stall of the cycle in every iteration as the “data hazard” between the “LW-2(the second)” and the following instruction is “SUB”.

CPI for the second issue is 1.06:

The “SUB” is stalled in all of the iterations as it relies on the “LW-2(the second)”. That instructions only can be executed in the iterations of odd-numbered because the pair consists of the “ADDI” and the “BEQ”. But for the iterations of even-numbered, the two instructions of “LW” cannot be executed as a pair.

Speed-up =

02

Determine the calculation.

4.18.1

The corresponding MIPS assembly language code:

ADD R5,R0,R0 // R5 = R0+R0

Loop1: BEQ R5,R6,Done //go to the “Done” statement

ADD R10,R5,R1 //R10 = R5+R1

LW R11,0(R10) // “R11” is the source register and “$R10” is the destination register and “0” is the offset

LW R10,1(R10)

SUB R10,R11,R10 //R10=R11-R10

ADD R11,R5,R2 //R11 = R5+R2

SW R10,0(R11) // “R10” is the source register and “$R11” is the destination register and “0” is the offset

ADDI R5,R5,2 // R5=R5+2

BEW R0,R0,Loop1

Done:

4.18.2

The pipeline diagram:

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.3

The way of executing the two instructions fully in parallel is loaded or stored for executing together with the other instruction.

For achieving this, around every load or store instruction it will be tried to put the non-load or non-store instructions that haven’t any kind of dependencies on the load or the store.

4.18.4

The pipeline diagram:

ADD R5,R0,R0

“IF”,“ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”,“ID”, -- “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- “ID”, “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- “ID”, -- “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R10,R5,R1

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

LW R11,0(R10)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

LW R10,1(R10)

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SUB R10,R11,R10

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADD R11,R5,R2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

SW R10,0(R11)

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

ADDI R5,R5,2

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

BEW R0,R0,Again

“IF”, -- -- “ID”, “EX”, “ME”, “WB”.

BEQ R5,R6,End

“IF”, -- -- “ID”, -- “EX”, “ME”, “WB”.

4.18.5

CPI for the first issue =10cycles9instruction= 1.11

CPI for the first issue is 1.11

CPI for the second issue=19cycles18instruction= 1.06

CPI for the second issue is = 1.06

Speed-up:1.111.06=1.047=1.05

4.18.6

CPI for the first issue=10cycles9instruction=1.11

CPI for the first issue is 1.11.

CPI for the second issue=15cycles18instruction=0.83

CPI for the second issue is 1.06

Speed-up ==1.110.83=1.3373=1.34

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

The basic single-cycle MIPS implementation in Figure 4.2 can only implement some instructions. New instructions can be added to an existing Instruction Set Architecture (ISA), but the decision whether or not to do that depends, among other things, on the cost and complexity the proposed addition introduces into the processor datapath and control. The first three problems in this exercise refer to the new instruction: Instruction: LWI Rt,Rd(Rs) Interpretation: Reg[Rt] = Mem[Reg[Rd]+Reg[Rs]] 4.2.1 [10] Which existing blocks (if any) can be used for this instruction? 4.2.2 [10] which new functional blocks (if any) do we need for this instruction? 4.2.3 [10] what new signals do we need (if any) from the control unit to support this instruction?

Question: When silicon chips are fabricated, defects in materials (e.g., silicon) and manufacturing errors can result in defective circuits. A very common defect is for one wire to affect the signal in another. This is called a cross-talk fault. A special class of cross-talk faults is when a signal is connected to a wire that has a constant logical value (e.g., a power supply wire). In this case we have a stuck-at-0 or a stuckat-1 fault, and the affected signal always has a logical value of 0 or 1, respectively. The following problems refer to bit 0 of the Write Register input on the register fi le in Figure 4.24. 4.6.1 [10] Let us assume that processor testing is done by filling the PC, registers, and data and instruction memories with some values (you can choose which values), letting a single instruction execute, then reading the PC, memories, and registers. These values are then examined to determine if a particular fault is present. Can you design a test (values for PC, memories, and registers) that would determine if there is a stuck-at-0 fault on this signal? 4.6.2 [10] Repeat 4.6.1 for a stuck-at-1 fault. Can you use a single test for both stuck-at-0 and stuck-at-1? If yes, explain how; if no, explain why not. 4.6.3 [60] If we know that the processor has a stuck-at-1 fault on this signal, is the processor still usable? To be usable, we must be able to convert any program that executes on a normal MIPS processor into a program that works on this processor. You can assume that there is enough free instruction memory and data memory to let you make the program longer and store additional data. Hint: the processor is usable if every instruction “broken” by this fault can be replaced with a sequence of “working” instructions that achieve the same effect. 4.6.4 [10] Repeat 4.6.1, but now the fault to test for is whether the “MemRead” control signal becomes 0 if RegDst control signal is 0, no fault otherwise. 4.6.5 [10] Repeat 4.6.4, but now the fault to test for is whether the “Jump” control signal becomes 0 if RegDst control signal is 0, no fault otherwise.

This exercise explores how exception handling affects pipeline design. The first three problems in this exercise refer to the following two instructions:

Instruction 1

Instruction 2

BNE R1,R2, Label

LW R1,0(R1)

4.17.1 Which exceptions can each of these instructions trigger? For each of these exceptions, specify the pipeline stage in which it is detected.

4.17.2 If there is a separate handler address for each exception, show how the pipeline organization must be changed to be able to handle this exception. You can assume that the addresses of these handlers are known when the processor is designed.

4.17.3 If the second instruction is fetched right after the first instruction, describe what happens in the pipeline when the first instruction causes the first exception you listed in 4.17.1. Show the pipeline execution diagram from the time the first instruction is fetched until the time the first instruction of the exception handler is completed.

4.17.4 In vectored exception handling, the table of exception handler

addresses is in data memory at a known (fixed) address. Change the pipeline to implement this exception handling mechanism. Repeat 4.17.3 using this modified pipeline and vectored exception handling.

4.17.5 We want to emulate vectored exception handling (described in 4.17.4) on a machine that has only one fixed handler address. Write the code that should be at that fixed address. Hint: this code should identify the exception, get the right address from the exception vector table, and transfer execution to that handler.

Question: In this exercise we examine in detail how an instruction is executed in a single-cycle datapath. Problems in this exercise refer to a clock cycle in which the processor fetches the following instruction word: 10101100011000100000000000010100. Assume that data memory is all zeros and that the processor’s registers have the following values at the beginning of the cycle in which the above instruction word is fetched:

r0

r1

r2

r3

r4

r5

r6

r8

r12

r31

0

–1

2

–3

–4

10

6

8

2

–16

4.7.1 [5] What are the outputs of the sign-extend and the jump “Shift left 2” unit (near the top of Figure 4.24) for this instruction word?

4.7.2 [10] What are the values of the ALU control unit’s inputs for this instruction?

4.7.3 [10] What is the new PC address after this instruction is executed? Highlight the path through which this value is determined.

4.7.4 [10] For each Mux, show the values of its data output during the execution of this instruction and these register values.

4.7.5 [10] For the ALU and the two add units, what are their data input values?

4.7.6 [10] What are the values of all inputs for the “Registers” unit?

Question: The basic single-cycle MIPS implementation in Figure 4.2 can only implement some instructions. New instructions can be added to an existing Instruction Set Architecture (ISA), but the decision whether or not to do that depends, among other things, on the cost and complexity the proposed addition introduces into the processor datapath and control. The first three problems in this exercise refer to the new instruction: Instruction: LWI Rt,Rd(Rs) Interpretation: Reg[Rt] = Mem[Reg[Rd]+Reg[Rs]] 4.2.1 [10] Which existing blocks (if any) can be used for this instruction? 4.2.2 [10] which new functional blocks (if any) do we need for this instruction? 4.2.3 [10] what new signals do we need (if any) from the control unit to support this instruction?

See all solutions

Recommended explanations on Computer Science Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free