In this exercise, we examine how data dependences affect execution in the basic 5-stage pipeline described in section 4.5. Problems in this exercise refer to the following sequence of instructions:

or r1,r2,r3

or r2,r1,r4

or r1,r1,r2

Also, assume the following cycle times for each of the options related to forwarding:

| Without Forwarding | With Full Forwarding | With ALU-ALU Forwarding Only |
| --- | --- | --- |
| 250ps | 300ps | 290ps |

4.9.1 Indicate dependences and their type.

4.9.2 Assume there is no forwarding in this pipelined processor. Indicate hazards and add nop instructions to eliminate them.

4.9.3 Assume there is full forwarding. Indicate hazards and add nop instructions to eliminate them.

4.9.4 What is the total execution time of this instruction sequence without forwarding and with full forwarding? What is the speedup achieved by adding full forwarding to a pipeline that had no forwarding?

4.9.5 Add nop instructions to this code to eliminate hazards if there is ALU-ALU forwarding only (no forwarding from the MEM to the EX stage).

4.9.6 What is the total execution time of this instruction sequence with only ALU-ALU forwarding? What is the speedup over a no-forwarding pipeline?

Short Answer

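4.9.1 For the given sequence of instructions, the dependences and their types are as follows:

| Instructions | Dependence |
| --- | --- |
| I1: or r1, r2, r3 | RAW on r1 from I1 to I2 and I3 |
| I2: or r2, r1, r4 | RAW on r2 from I2 to I3 |
| I3: or r1, r1, r2 | WAR on r2 from I1 to I2; WAR on r1 from I2 to I3; WAW on r1 from I1 to I3 |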

4.9.2 In a pipelined processor without forwarding, the code with nop instructions added is as follows:

| Instructions | Description |
| --- | --- |
| I1: or r1, r2, r3 | |
| nop | Delay I2 to avoid a RAW hazard on r1 from I1 |
| nop | |
| I2: or r2, r1, r4 | |
| nop | Delay I3 to avoid a RAW hazard on r2 from I2 |
| nop | |
| I3: or r1, r1, r2 | |

4.9.3 With full forwarding, no nop instructions are needed:

| Instructions | Dependence |
| --- | --- |
| I1: or r1, r2, r3 | |
| I2: or r2, r1, r4 | No RAW hazard on r1 (forwarded) |
| I3: or r1, r1, r2 | No RAW hazard on r2 (forwarded) |

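4.9.4 The total execution time, using the cycle times given in the problem statement:

Without forwarding = 11 × 250ps = 2750ps

With full forwarding = 7 × 300ps = 2100ps

Speedup due to full forwarding = 2750/2100 ≈ 1.31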

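4.9.5 With ALU-ALU forwarding only, the solution adds no nop instructions:

| Instruction | Description |
| --- | --- |
| I1: or r1, r2, r3 | |
| I2: or r2, r1, r4 | ALU-ALU forwarding of r1 from I1 |
| I3: or r1, r1, r2 | ALU-ALU forwarding of r2 from I2 |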

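4.9.6 The total execution time, using the cycle times given in the problem statement:

Without forwarding = 11 × 250ps = 2750ps

With ALU-ALU forwarding = 7 × 290ps = 2030ps

Speedup with ALU-ALU forwarding = 2750/2030 ≈ 1.35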

Step by step solution

01

Determine the Data dependency.

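A dependence exists when an instruction needs data produced by an earlier instruction, or when two instructions read and write the same register in an order that must be preserved. When the pipeline overlaps the execution of dependent instructions, a dependence can turn into a hazard. There are three types of hazards: structural, data, and control. Dependences are a property of the program, whereas hazards are a property of the pipeline organization, so a dependence does not always cause a hazard.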

02

Determine the dependency type.

4.9.1 Given the sequence of instructions:

I1: or r1, r2, r3

I2: or r2, r1, r4

I3: or r1, r1, r2

There is a Read After Write (RAW) dependence on r1 from I1 to I2 and I3, because I2 and I3 read the value of r1 that I1 writes. There is a RAW dependence on r2 from I2 to I3, because I3 reads r2 after I2 writes it. There is a Write After Read (WAR) dependence on r2 from I1 to I2, because I2 writes r2 after I1 reads it. There is a WAR dependence on r1 from I2 to I3, because I3 writes r1 after I2 reads it. There is a Write After Write (WAW) dependence on r1 from I1 to I3, because I3 rewrites r1 after I1.

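For the given sequence of instructions, the dependences and their types are as follows:

| Instructions | Dependence |
| --- | --- |
| I1: or r1, r2, r3 | RAW on r1 from I1 to I2 and I3 |
| I2: or r2, r1, r4 | RAW on r2 from I2 to I3 |
| I3: or r1, r1, r2 | WAR on r2 from I1 to I2; WAR on r1 from I2 to I3; WAW on r1 from I1 to I3 |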
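The pairwise comparison used above can be sketched in Python. This is a minimal illustration, not part of the textbook solution: the `Instr` type and `find_dependences` function are hypothetical names, and the third instruction is taken as given in the problem statement (or r1,r1,r2). The check compares every earlier instruction with every later one and ignores intervening writes, which is sufficient for this three-instruction sequence.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    name: str               # label such as "I1" (hypothetical naming)
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read


def find_dependences(program):
    """Return (kind, register, earlier, later) for every ordered pair."""
    deps = []
    for i, early in enumerate(program):
        for late in program[i + 1:]:
            if early.dest in late.srcs:      # later reads what earlier wrote
                deps.append(("RAW", early.dest, early.name, late.name))
            if late.dest in early.srcs:      # later writes what earlier read
                deps.append(("WAR", late.dest, early.name, late.name))
            if late.dest == early.dest:      # both write the same register
                deps.append(("WAW", late.dest, early.name, late.name))
    return deps


program = [
    Instr("I1", "r1", ("r2", "r3")),
    Instr("I2", "r2", ("r1", "r4")),
    Instr("I3", "r1", ("r1", "r2")),
]

for kind, reg, early, late in find_dependences(program):
    print(f"{kind} on {reg} from {early} to {late}")
```

Only the RAW entries matter for hazards in this pipeline; the WAR and WAW entries are listed because the exercise asks for every dependence.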

03

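Determine the nop instructions needed without forwarding.

4.9.2

WAR and WAW dependences do not cause hazards in a basic five-stage pipeline, so only the RAW dependences need attention. Without forwarding, nop instructions are inserted to eliminate the RAW hazards. In a pipelined processor without forwarding, the code with nop instructions added is as follows:

| Instructions | Description |
| --- | --- |
| I1: or r1, r2, r3 | |
| nop | Delay I2 to avoid a RAW hazard on r1 from I1 |
| nop | |
| I2: or r2, r1, r4 | |
| nop | Delay I3 to avoid a RAW hazard on r2 from I2 |
| nop | |
| I3: or r1, r1, r2 | |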
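The nop placement can be reproduced with a short Python sketch. It assumes, as the two-nop answer implies, that the register file is written in the first half of a cycle and read in the second half, so a consumer must sit at least three instruction slots after its producer. The function `insert_nops` and its `min_gap` parameter are hypothetical names introduced for illustration, and the third instruction is taken from the problem statement (or r1,r1,r2).

```python
def insert_nops(program, min_gap):
    """Insert nops so each consumer is at least `min_gap` slots after its producer.

    program: list of (text, dest, srcs) tuples.
    min_gap: assumed distance requirement; 3 models a pipeline with no
             forwarding, 1 models back-to-back ALU forwarding (no nops).
    """
    slots = []  # (text, dest) pairs already scheduled; nops have dest None
    for text, dest, srcs in program:
        needed = 0
        # Look back at the slots that are still too close to this instruction.
        for distance in range(1, min_gap):
            if distance <= len(slots) and slots[-distance][1] in srcs:
                needed = max(needed, min_gap - distance)
        slots.extend([("nop", None)] * needed)
        slots.append((text, dest))
    return [text for text, _ in slots]


program = [
    ("or r1,r2,r3", "r1", ("r2", "r3")),
    ("or r2,r1,r4", "r2", ("r1", "r4")),
    ("or r1,r1,r2", "r1", ("r1", "r2")),
]

no_forwarding = insert_nops(program, min_gap=3)
print("\n".join(no_forwarding))
```

With `min_gap=3` the sketch places two nops after I1 and two after I2, four in total, which matches the listing above.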

04

Determine the nop instructions needed with full forwarding.

4.9.3

With full forwarding, an ALU instruction can forward its result to the EX stage of the next instruction without any hazard. A load cannot forward to the EX stage of the instruction that immediately follows it, but this sequence contains no loads.

With full forwarding, no nop instructions are needed:

| Instructions | Dependence |
| --- | --- |
| I1: or r1, r2, r3 | |
| I2: or r2, r1, r4 | No RAW hazard on r1 (forwarded) |
| I3: or r1, r1, r2 | No RAW hazard on r2 (forwarded) |
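To see why forwarding removes the stalls, it can help to lay out which stage each instruction occupies in each cycle. The Python sketch below is an illustration under simple assumptions: one instruction is fetched per cycle, there are no stalls, and the five stages are IF, ID, EX, MEM, WB. The function names `stage_table` and `render` are hypothetical, and the third instruction is taken from the problem statement.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_table(instructions):
    """Map each instruction to {cycle: stage}, assuming no stalls."""
    table = {}
    for index, text in enumerate(instructions):
        # Instruction number `index` enters IF in cycle index + 1.
        table[text] = {index + 1 + s: stage for s, stage in enumerate(STAGES)}
    return table


def render(instructions):
    """Return a plain-text pipeline diagram, one row per instruction."""
    table = stage_table(instructions)
    total_cycles = len(instructions) + len(STAGES) - 1
    lines = []
    for text in instructions:
        cells = [table[text].get(c, "").ljust(4) for c in range(1, total_cycles + 1)]
        lines.append(text.ljust(14) + "".join(cells).rstrip())
    return "\n".join(lines)


code = ["or r1,r2,r3", "or r2,r1,r4", "or r1,r1,r2"]
print(render(code))
```

In cycle 4 the second instruction is in EX while the first is in MEM, so the first instruction's ALU result is already available for forwarding. The last instruction reaches WB in cycle 7, which is the cycle count used in the execution-time calculation.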

05

Determine the total execution time.

4.9.4 The total execution time is the number of cycles multiplied by the clock cycle time. Without stalls, the three-instruction sequence takes 7 cycles: 5 cycles to complete the first instruction, then one more cycle for each of the other two. Each nop adds one cycle. From the problem statement, the cycle time is 250ps without forwarding and 300ps with full forwarding.

The total execution time without forwarding, including the 4 nop instructions from 4.9.2, is calculated as follows:

(7+4)×250ps=11×250ps=2750ps

The total execution time with full forwarding is calculated as follows:

7×300ps=2100ps

The speedup due to full forwarding is 2750/2100≈1.31.

06

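Determine the nop instructions for ALU-ALU forwarding.

4.9.5 With ALU-ALU-only forwarding, the result of an ALU instruction can be forwarded to the EX stage of the immediately following instruction. The solution adds no nop instructions:

| Instruction | Description |
| --- | --- |
| I1: or r1, r2, r3 | |
| I2: or r2, r1, r4 | ALU-ALU forwarding of r1 from I1 |
| I3: or r1, r1, r2 | ALU-ALU forwarding of r2 from I2 |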

07

Determine the total execution time for ALU-ALU.

4.9.6 The total execution time is the number of cycles multiplied by the clock cycle time. Without stalls, the three-instruction sequence takes 7 cycles. From the problem statement, the cycle time is 250ps without forwarding and 290ps with ALU-ALU forwarding only.

The total execution time without forwarding is calculated as follows:

(7+4)×250ps=11×250ps=2750ps

The total execution time with ALU-ALU forwarding is calculated as follows:

7×290ps=2030ps

The speedup over the no-forwarding pipeline is 2750/2030≈1.35.
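The arithmetic for all three configurations can be checked with a small Python sketch. It takes the cycle times listed in the problem statement (250ps, 300ps, 290ps) and the nop counts stated in the solution steps (4, 0, and 0); the function name `execution_time_ps` and the `configs` dictionary are hypothetical and exist only for this illustration.

```python
def execution_time_ps(num_instructions, num_nops, cycle_time_ps, stages=5):
    """Cycles = pipeline fill + one per extra instruction + one per nop."""
    cycles = stages + (num_instructions - 1) + num_nops
    return cycles * cycle_time_ps


# (nops inserted, cycle time in ps) for each forwarding option
configs = {
    "no forwarding": (4, 250),
    "full forwarding": (0, 300),
    "ALU-ALU forwarding only": (0, 290),
}

times = {
    name: execution_time_ps(3, nops, cycle_time)
    for name, (nops, cycle_time) in configs.items()
}
baseline = times["no forwarding"]

for name, total in times.items():
    print(f"{name}: {total} ps, speedup over no forwarding = {baseline / total:.2f}")
```

Forwarding lengthens the clock cycle, so the speedup comes entirely from removing the four nop cycles.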
