Question: Consider the following instruction:

Instruction: AND Rd,Rs,Rt

Interpretation: Reg[Rd] = Reg[Rs] AND Reg[Rt]

4.1.1 [5] What are the values of control signals generated by the control in Figure 4.2 for the above instruction?

4.1.2 [5] Which resources (blocks) perform a useful function for this instruction?

4.1.3 [10] Which resources (blocks) produce outputs, but their outputs are not used for this instruction? Which resources produce no outputs for this instruction?

Short Answer

4.1.1

The control signal values for this instruction are:

| RegWrite | MemRead | ALUMux | MemWrite | ALUOp | RegMux | Branch |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 (Reg) | 0 | AND | 0 (ALU) | 0 |

4.1.2

All resources (blocks) perform a useful function for this instruction except the data memory and the branch Add unit.

4.1.3

| Category | Resources |
|---|---|
| Produce outputs that are not used | Branch Add, data memory |
| Produce no outputs | None (every block produces an output) |

Step by step solution

01

Define the concept.

4.1.1

ALUMux is the control signal for the mux at the second ALU input: 0 (Reg) selects the output of the register file, and 1 (Imm) selects the immediate from the instruction word.

RegMux is the control signal for the mux at the data input of the register file: 0 (ALU) selects the output of the ALU, and 1 (Mem) selects the output of the data memory.

A value of “X” denotes a “don’t care”: the result is the same whether the signal is 0 or 1.
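The effect of these two muxes can be illustrated with a minimal Python sketch. It is a simplified model, not a reproduction of Figure 4.2: the `Control` class, the `step` function, and the register contents are hypothetical names and values chosen for illustration, and the signal encoding is assumed to be the one described above (ALUMux: 0 = Reg, 1 = Imm; RegMux: 0 = ALU, 1 = Mem).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    reg_write: int  # 1 = write the result into Rd
    mem_read: int
    alu_mux: int    # 0 (Reg) or 1 (Imm): selects the second ALU input
    mem_write: int
    alu_op: str     # "AND" or "ADD" in this sketch
    reg_mux: int    # 0 (ALU) or 1 (Mem): selects the register write data
    branch: int


# Signal values for AND Rd,Rs,Rt under the assumed encoding.
AND_CONTROL = Control(reg_write=1, mem_read=0, alu_mux=0, mem_write=0,
                      alu_op="AND", reg_mux=0, branch=0)


def step(regs, memory, ctrl, rd, rs, rt, imm=0):
    """Run one instruction through the simplified datapath; return new regs."""
    second = imm if ctrl.alu_mux == 1 else regs[rt]          # ALUMux
    if ctrl.alu_op == "AND":
        alu_out = regs[rs] & second
    else:
        alu_out = regs[rs] + second
    mem_out = memory.get(alu_out, 0) if ctrl.mem_read else 0
    if ctrl.mem_write:
        memory[alu_out] = regs[rt]
    write_data = mem_out if ctrl.reg_mux == 1 else alu_out   # RegMux
    new_regs = list(regs)
    if ctrl.reg_write:
        new_regs[rd] = write_data
    return new_regs


regs = [0, 0b1100, 0b1010, 0]
result = step(regs, {}, AND_CONTROL, rd=3, rs=1, rt=2)
print(bin(result[3]))  # 0b1100 AND 0b1010
```

With RegWrite set to 0 in this model, register 3 would keep its old value, which is why an instruction that writes Rd needs RegWrite = 1.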

4.1.2

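The branch Add unit and the data memory do not perform a useful function for this instruction: Branch is 0, so the branch target is never selected, and AND neither reads nor writes memory. The write port of the register file is used, because the result is written to Rd.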

4.1.3

The branch Add unit and the data memory both produce outputs, but neither output is used for this instruction. Every block in the datapath produces an output, so no resource falls into the “no outputs” category.

02

Determine the calculation.

4.1.1

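The datapath in question is the one in Figure 4.2.

The instruction is “AND Rd,Rs,Rt” and its interpretation is “Reg[Rd] = Reg[Rs] AND Reg[Rt].”

The instruction writes its result to Rd, so RegWrite is 1. Both operands come from the register file, so ALUMux is 0 (Reg), and the ALU performs an AND. The value written to Rd is the ALU result, so RegMux is 0 (ALU). The instruction neither accesses memory nor branches, so MemRead, MemWrite, and Branch are all 0. The control in Figure 4.2 therefore generates the following values:

| RegWrite | MemRead | ALUMux | MemWrite | ALUOp | RegMux | Branch |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 (Reg) | 0 | AND | 0 (ALU) | 0 |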

4.1.2

All resources (blocks) perform a useful function for this instruction except the data memory and the branch Add unit. The write port of the register file is used, because the result is written to Rd.

4.1.3

The branch Add unit and the data memory produce outputs that are not used for this instruction. No resource produces no output, because every block in the datapath produces one.
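The reasoning in 4.1.2 and 4.1.3 can be sketched as a small Python check that derives the idle blocks from the control signals. This is only an illustration under simplifying assumptions: the function name `idle_blocks`, the dictionary keys, and the three block-to-signal rules are hypothetical and stand in for the full datapath of Figure 4.2.

```python
def idle_blocks(signals):
    """List blocks that do no useful work for the given control signals."""
    idle = []
    if not signals["Branch"]:
        # The branch target is computed but never selected as the next PC.
        idle.append("Branch Add")
    if not (signals["MemRead"] or signals["MemWrite"]):
        # No load or store takes place.
        idle.append("Data memory")
    if not signals["RegWrite"]:
        # Nothing is written back to the register file.
        idle.append("Register write port")
    return idle


# Control signal values assumed for AND Rd,Rs,Rt.
and_signals = {
    "RegWrite": 1, "MemRead": 0, "ALUMux": 0, "MemWrite": 0,
    "ALUOp": "AND", "RegMux": 0, "Branch": 0,
}

print(idle_blocks(and_signals))
```

For the AND signal values, the check reports the branch Add unit and the data memory, and it does not report the register write port.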

Most popular questions from this chapter

This exercise is intended to help you understand the cost/complexity/performance trade-offs of forwarding in a pipelined processor. Problems in this exercise refer to pipelined data paths from Figure 4.45. These problems assume that, of all the instructions executed in a processor, the following fraction of these instructions have a particular type of RAW data dependence. The type of RAW data dependence is identified by the stage that produces the result (EX or MEM) and the instruction that consumes the result (1st instruction that follows the one that produces the result, 2nd instruction that follows, or both). We assume that the register write is done in the first half of the clock cycle and that register reads are done in the second half of the cycle, so “EX to 3rd” and “MEM to 3rd” dependencies are not counted because they cannot result in data hazards. Also, assume that the CPI of the processor is 1 if there are no data hazards.

| EX to 1st only | MEM to 1st only | EX to 2nd only | MEM to 2nd only | EX to 1st and MEM to 2nd | Other RAW dependences |
|---|---|---|---|---|---|
| 5% | 20% | 5% | 10% | 10% | 10% |

Assume the following latencies for individual pipeline stages. For the EX stage, latencies are given separately for a processor without forwarding and for a processor with different kinds of forwarding.

| IF | ID | EX (no FW) | EX (full FW) | EX (FW from EX/MEM only) | EX (FW from MEM/WB only) | MEM | WB |
|---|---|---|---|---|---|---|---|
| 150 ps | 100 ps | 120 ps | 150 ps | 140 ps | 130 ps | 120 ps | 100 ps |

4.12.1 If we use no forwarding, what fraction of cycles are we stalling due to data hazards?

4.12.2 If we use full forwarding (forward all results that can be forwarded), what fraction of cycles are we stalling due to data hazards?

4.12.3 Let us assume that we cannot afford to have three-input Muxes that are needed for full forwarding. We have to decide if it is better to forward only from the EX/MEM pipeline register (next-cycle forwarding) or only from the MEM/WB pipeline register (two-cycle forwarding). Which of the two options results in fewer data stall cycles?

4.12.4 For the given hazard probabilities and pipeline stage latencies, what is the speedup achieved by adding full forwarding to a pipeline that had no forwarding?

4.12.5 What would be the additional speedup (relative to a processor with forwarding) if we added time-travel forwarding that eliminates all data hazards? Assume that the yet-to-be-invented time-travel circuitry adds 100 ps to the latency of the full-forwarding EX stage.

4.12.6 Repeat 4.12.3 but this time determine which of the two options results in a shorter time per instruction.
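One ingredient of 4.12.4 through 4.12.6 is the clock period of each pipeline variant. The following Python sketch computes it from the latency table, assuming the clock period equals the latency of the slowest stage and ignoring pipeline-register overhead (a modeling assumption the exercise does not state explicitly). The names `BASE`, `EX`, and `cycle_time` are illustrative only.

```python
# Stage latencies in picoseconds, taken from the latency table above.
BASE = {"IF": 150, "ID": 100, "MEM": 120, "WB": 100}
EX = {
    "no FW": 120,
    "full FW": 150,
    "FW from EX/MEM only": 140,
    "FW from MEM/WB only": 130,
}
TIME_TRAVEL_PENALTY = 100  # added to the full-forwarding EX stage (4.12.5)


def cycle_time(ex_latency):
    """Clock period in ps: the slowest of the five stages (assumed model)."""
    return max(*BASE.values(), ex_latency)


periods = {name: cycle_time(latency) for name, latency in EX.items()}
periods["time-travel FW"] = cycle_time(EX["full FW"] + TIME_TRAVEL_PENALTY)

for name, period in periods.items():
    print(f"{name}: {period} ps")
```

Under this model the 150 ps IF stage sets the clock period for every forwarding option in the table, so those options differ only in their stall cycles. Only the hypothetical time-travel EX stage (250 ps) lengthens the clock.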

This exercise explores how exception handling affects pipeline design. The first three problems in this exercise refer to the following two instructions:

| Instruction 1 | Instruction 2 |
|---|---|
| BNE R1,R2,Label | LW R1,0(R1) |

4.17.1 Which exceptions can each of these instructions trigger? For each of these exceptions, specify the pipeline stage in which it is detected.

4.17.2 If there is a separate handler address for each exception, show how the pipeline organization must be changed to be able to handle this exception. You can assume that the addresses of these handlers are known when the processor is designed.

4.17.3 If the second instruction is fetched right after the first instruction, describe what happens in the pipeline when the first instruction causes the first exception you listed in 4.17.1. Show the pipeline execution diagram from the time the first instruction is fetched until the time the first instruction of the exception handler is completed.

4.17.4 In vectored exception handling, the table of exception handler addresses is in data memory at a known (fixed) address. Change the pipeline to implement this exception handling mechanism. Repeat 4.17.3 using this modified pipeline and vectored exception handling.

4.17.5 We want to emulate vectored exception handling (described in 4.17.4) on a machine that has only one fixed handler address. Write the code that should be at that fixed address. Hint: this code should identify the exception, get the right address from the exception vector table, and transfer execution to that handler.

This exercise is intended to help you understand the relationship between delay slots, control hazards, and branch execution in a pipelined processor. In this exercise, we assume that the following MIPS code is executed on a pipelined processor with a 5-stage pipeline, full forwarding, and a predict-taken branch predictor:

lw r2,0(r1)

label1: beq r2,r0,label2 # not taken once, then taken

lw r3,0(r2)

beq r3,r0,label1 # taken

add r1,r3,r1

label2: sw r1,0(r2)

4.14.1 [10] Draw the pipeline execution diagram for this code, assuming there are no delay slots and that branches execute in the EX stage.

4.14.2 [10] Repeat 4.14.1, but assume that delay slots are used. In the given code, the instruction that follows the branch is now the delay slot instruction for that branch.

4.14.3 [20] One way to move the branch resolution one stage earlier is to not need an ALU operation in conditional branches. The branch instructions would be “bez rd,label” and “bnez rd,label”, and it would branch if the register has and does not have a zero value, respectively. Change this code to use these branch instructions instead of beq. You can assume that register R8 is available for you to use as a temporary register, and that an seq (set if equal) R-type instruction can be used.

Section 4.8 describes how the severity of control hazards can be reduced by moving branch execution into the ID stage. This approach involves a dedicated comparator in the ID stage, as shown in Figure 4.62. However, this approach potentially adds to the latency of the ID stage, and requires additional forwarding logic and hazard detection.

4.14.4 [10] Using the first branch instruction in the given code as an example, describe the hazard detection logic needed to support branch execution in the ID stage as in Figure 4.62. Which type of hazard is this new logic supposed to detect?

4.14.5 [10] For the given code, what is the speedup achieved by moving branch execution into the ID stage? Explain your answer. In your speedup calculation, assume that the additional comparison in the ID stage does not affect clock cycle time.

4.14.6 [10] Using the first branch instruction in the given code as an example, describe the forwarding support that must be added to support branch execution in the ID stage. Compare the complexity of this new forwarding unit to the complexity of the existing forwarding unit in Figure 4.62.
