
Cache coherence concerns the views of multiple processors on a given cache block. The following data shows two processors and their read/write operations on two different words of a cache block X (initially X[0] = X[1] = 0). Assume the size of integers is 32 bits.

P1: X[0]++; X[1] = 3;
P2: X[0] = 5; X[1] += 2;

5.17.1 List the possible values of the given cache block for a correct cache coherence protocol implementation. List at least one more value the block could take if the protocol does not ensure cache coherence.

5.17.2 For a snooping protocol, list a valid operation sequence on each processor/cache to finish the above read/write operations.

5.17.3 What are the best-case and worst-case numbers of cache misses needed to execute the listed read/write instructions?

Memory consistency concerns the views of multiple data items. The following data shows two processors and their read/write operations on different cache blocks (A and B initially 0).

P1: A = 1; B = 2; A += 2; B++;
P2: C = B; D = A;

5.17.4 List the possible values of C and D for an implementation that ensures both consistency assumptions on page 470.

5.17.5 List at least one more possible pair of values for C and D if such assumptions are not maintained.

5.17.6 For various combinations of write policies and write allocation policies, which combinations make the protocol implementation simpler?

Short Answer

5.17.1 The possible values of the given cache block for a correct cache coherence protocol implementation arise from the following orderings:

Order 1: X[0]++ (P1); X[1] = 3 (P1); X[0] = 5 (P2); X[1] += 2 (P2)

Order 2: X[0]++ (P1); X[0] = 5 (P2); X[1] = 3 (P1); X[1] += 2 (P2)

Order 3: X[0] = 5 (P2); X[0]++ (P1); X[1] += 2 (P2); X[1] = 3 (P1)

Order 4: X[0]++ (P1); X[0] = 5 (P2); X[1] += 2 (P2); X[1] = 3 (P1)

Order 5: X[0] = 5 (P2); X[0]++ (P1); X[1] = 3 (P1); X[1] += 2 (P2)

Order 6: X[0] = 5 (P2); X[1] += 2 (P2); X[0]++ (P1); X[1] = 3 (P1)

If coherence is not ensured, one processor's updates can be lost; for example, if only P2's updates survive, the block ends up as (5, 2), a value no coherent implementation can produce.

5.17.2 For a snooping protocol, a valid operation sequence on each processor/cache to finish the above read/write operations is as follows:

1. P2 executes X[0] = 5: invalidate X in the other caches, then read and write the X block.
2. P2 executes X[1] += 2: read and write the X block.
3. P1 executes X[0]++: read the X value (the X block becomes shared), send an invalidate message (P2's copy of the X block is invalidated), then write the X block.
4. P1 executes X[1] = 3: write the X block.

5.17.3 Best case: orderings 1 and 6 (two misses). Worst case: orderings 2 and 3 (four misses).

5.17.4 The possible values of C and D for an implementation that ensures both consistency assumptions arise from the following orderings (each processor's operations stay in program order):

Order 1: A = 1; B = 2; A += 2; B++; C = B; D = A. Result: (3, 3)

Order 2: A = 1; B = 2; A += 2; C = B; B++; D = A. Result: (2, 3)

Order 3: A = 1; B = 2; C = B; A += 2; B++; D = A. Result: (2, 3)

Order 4: A = 1; C = B; B = 2; A += 2; B++; D = A. Result: (0, 3)

Order 5: C = B; A = 1; B = 2; A += 2; B++; D = A. Result: (0, 3)

Order 6: A = 1; B = 2; A += 2; C = B; D = A; B++. Result: (2, 3)

Order 7: A = 1; B = 2; C = B; A += 2; D = A; B++. Result: (2, 3)

Order 8: A = 1; C = B; B = 2; A += 2; D = A; B++. Result: (0, 3)

Order 9: C = B; A = 1; B = 2; A += 2; D = A; B++. Result: (0, 3)

Order 10: A = 1; B = 2; C = B; D = A; A += 2; B++. Result: (2, 1)

Order 11: A = 1; C = B; B = 2; D = A; A += 2; B++. Result: (0, 1)

Order 12: C = B; A = 1; B = 2; D = A; A += 2; B++. Result: (0, 1)

Order 13: A = 1; C = B; D = A; B = 2; A += 2; B++. Result: (0, 1)

Order 14: C = B; A = 1; D = A; B = 2; A += 2; B++. Result: (0, 1)

Order 15: C = B; D = A; A = 1; B = 2; A += 2; B++. Result: (0, 0)

The possible (C, D) pairs are therefore (0, 0), (0, 1), (0, 3), (2, 1), (2, 3), and (3, 3).

5.17.5 Result: (2, 0), which can occur if P2 observes the write B = 2 without having observed the earlier write A = 1.

5.17.6 Write-back is simpler.

Step-by-step solution

01

Determine cache coherence and snooping protocols.

A multicore chip contains multiple processors, each with its own cache. When these caches hold copies of data from the same physical address, the copies can end up with different values. This problem is referred to as the cache coherence problem.

Cache coherence protocols will maintain coherence for multiple processors.

There are two types of snooping protocols, Write-invalidate and Write-Update.

The write-invalidate protocol invalidates the copies of the data in all other processors' caches before the writing processor changes its own copy. Write-update instead broadcasts the new data, and all affected caches are updated with it.
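As a concrete illustration, here is a minimal Python sketch of the write-invalidate idea. It is a simplified, hypothetical model (one cached block per processor, a write-through update of memory, no MESI states), not a full protocol implementation:

```python
# Minimal write-invalidate sketch: each cache holds a copy of block X with a
# valid bit; a write first invalidates every other cached copy.

class Bus:
    def __init__(self):
        self.caches = []

    def invalidate_others(self, writer):
        for c in self.caches:
            if c is not writer:
                c.valid = False


class Cache:
    def __init__(self, bus, memory):
        self.bus, self.memory = bus, memory
        self.valid, self.data = False, None
        bus.caches.append(self)

    def read(self):
        if not self.valid:                 # miss: fetch the block from memory
            self.data = list(self.memory)
            self.valid = True
        return self.data

    def write(self, index, value):
        self.bus.invalidate_others(self)   # the write-invalidate step
        block = self.read()
        block[index] = value
        self.memory[index] = value         # write-through, for simplicity


memory = [0, 0]                            # block X = (X[0], X[1])
bus = Bus()
p1, p2 = Cache(bus, memory), Cache(bus, memory)

p2.write(0, 5)                             # P2: X[0] = 5
p1.write(0, p1.read()[0] + 1)              # P1: X[0]++ (sees 5, writes 6)
print(memory[0])                           # 6
```

Because P2's write invalidated P1's (empty) copy and P1's read re-fetched the block, P1's increment sees the value 5 rather than a stale 0.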

02

Determine the possible values of the given cache block for a correct cache coherence protocol implementation, and an additional value possible without coherence.

5.17.1

The given cache block is as follows:

P1: X[0]++; X[1] = 3;
P2: X[0] = 5; X[1] += 2;

The correct cache coherence protocol will maintain the coherence between the multiple processors.

The possible values are as follows:

Order 1:

X[0]++ (P1); X[1] = 3 (P1); X[0] = 5 (P2); X[1] += 2 (P2)

P1 increments X[0], but P2 then overwrites it with 5, so X[0] ends up 5. P1 sets X[1] = 3, and P2 then adds 2, so X[1] ends up 5.

The values will be (5,5)

Order 2:

X[0]++ (P1); X[0] = 5 (P2); X[1] = 3 (P1); X[1] += 2 (P2)

P1 increments X[0] first, but P2 overwrites it with 5, so X[0] ends up 5. P1 sets X[1] = 3, and P2 then adds 2, so X[1] ends up 5.

The values will be (5,5)

Order 3:

X[0] = 5 (P2); X[0]++ (P1); X[1] += 2 (P2); X[1] = 3 (P1)

P2 writes X[0] = 5 first and P1 then increments it, so X[0] ends up 6. P2's X[1] += 2 executes before P1's X[1] = 3, so X[1] ends up 3.

The values will be (6,3)

Order 4:

X[0]++ (P1); X[0] = 5 (P2); X[1] += 2 (P2); X[1] = 3 (P1)

P1 increments X[0] first, but P2 overwrites it with 5, so X[0] ends up 5. P2's X[1] += 2 executes before P1's X[1] = 3, so X[1] ends up 3.

The values will be (5,3)

Order 5:

X[0] = 5 (P2); X[0]++ (P1); X[1] = 3 (P1); X[1] += 2 (P2)

P2 assigns X[0] = 5 first, and P1 then increments it to 6. P1 sets X[1] = 3, and P2 then adds 2, so X[1] ends up 5.

The values will be (6,5)

Order 6:

X[0] = 5 (P2); X[1] += 2 (P2); X[0]++ (P1); X[1] = 3 (P1)

P2 assigns X[0] = 5 and P1 then increments it to 6. P2's addition to X[1] happens first, so P1's X[1] = 3 is the final value.

The values will be (6,3)

If coherence is not ensured, one processor's updates can be lost; for example, if only P2's updates survive, the block ends up as (5, 2), a value no coherent implementation can produce.
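The six orderings above can be checked mechanically: enumerate every interleaving of the two instruction streams that preserves each processor's program order, apply it to a coherent block, and collect the final values. A Python sketch (names are illustrative; each operation is assumed to execute atomically on the shared block):

```python
# P1's and P2's operations on block X = (X[0], X[1]), in program order.
p1 = [("P1", lambda x: (x[0] + 1, x[1])),   # X[0]++
      ("P1", lambda x: (x[0], 3))]          # X[1] = 3
p2 = [("P2", lambda x: (5, x[1])),          # X[0] = 5
      ("P2", lambda x: (x[0], x[1] + 2))]   # X[1] += 2

def interleavings(a, b):
    # All merges of a and b that keep each list's internal order.
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

finals = set()
for order in interleavings(p1, p2):
    x = (0, 0)                              # initially X[0] = X[1] = 0
    for _, op in order:
        x = op(x)
    finals.add(x)

print(sorted(finals))  # [(5, 3), (5, 5), (6, 3), (6, 5)]
```

There are six such interleavings, matching orders 1–6, and the four distinct outcomes (5, 5), (5, 3), (6, 5), and (6, 3) match the values derived above.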

03

Determine the list of valid operation sequences on each processor/cache to finish the above read/write operations for a snooping protocol.

5.17.2.

For a snooping protocol, the list of valid operation sequences on each processor/cache to finish the above read/write operations is as follows:

1. P2 executes X[0] = 5: invalidate X in the other caches, then read and write the X block.
2. P2 executes X[1] += 2: read and write the X block.
3. P1 executes X[0]++: read the X value (the X block becomes shared), send an invalidate message (P2's copy of the X block is invalidated), then write the X block.
4. P1 executes X[1] = 3: write the X block.

The snooping protocol invalidates X in the other caches so that coherence is maintained while P2 writes. P2 then reads and writes the X block. Afterwards, the X block becomes shared with processor P1. P1 reads the X block and sends an invalidate message to P2, which prevents P2 from using its now-stale copy of X. Finally, P1 writes its values into the X block.

04

Determine the best case and worst case of the list.

5.17.3

From the list provided in 5.17.1, the best and worst cases can be decided.

The best cases are orders 1 and 6 because they require only two misses.

The worst cases are orders 2 and 3 because these require 4 cache misses.
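These counts can be double-checked with a short sketch. Under a write-invalidate protocol every operation here writes block X, so a processor misses whenever its copy is cold or was invalidated by the other processor's most recent write:

```python
# The six orderings of 5.17.1, reduced to which processor touches block X.
orders = {
    1: ["P1", "P1", "P2", "P2"],
    2: ["P1", "P2", "P1", "P2"],
    3: ["P2", "P1", "P2", "P1"],
    4: ["P1", "P2", "P2", "P1"],
    5: ["P2", "P1", "P1", "P2"],
    6: ["P2", "P2", "P1", "P1"],
}

def misses(seq):
    valid = {"P1": False, "P2": False}  # both caches start cold
    n = 0
    for p in seq:
        if not valid[p]:                # block absent or invalidated -> miss
            n += 1
            valid[p] = True
        other = "P2" if p == "P1" else "P1"
        valid[other] = False            # every op writes X -> invalidate other copy
    return n

counts = {k: misses(v) for k, v in orders.items()}
print(counts)  # {1: 2, 2: 4, 3: 4, 4: 3, 5: 3, 6: 2}
```

Orders 1 and 6 give two misses (best case), orders 2 and 3 give four (worst case), and orders 4 and 5 fall in between with three.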

05

Determine the possible values of C and D for an implementation that ensures both consistency assumptions.

5.17.4

Each processor's operations remain in program order (P1: A = 1; B = 2; A += 2; B++ and P2: C = B; D = A).

Order 1: A = 1; B = 2; A += 2; B++; C = B; D = A

Result: (3, 3)

Order 2: A = 1; B = 2; A += 2; C = B; B++; D = A

Result: (2, 3)

Order 3: A = 1; B = 2; C = B; A += 2; B++; D = A

Result: (2, 3)

Order 4: A = 1; C = B; B = 2; A += 2; B++; D = A

Result: (0, 3)

Order 5: C = B; A = 1; B = 2; A += 2; B++; D = A

Result: (0, 3)

Order 6: A = 1; B = 2; A += 2; C = B; D = A; B++

Result: (2, 3)

Order 7: A = 1; B = 2; C = B; A += 2; D = A; B++

Result: (2, 3)

Order 8: A = 1; C = B; B = 2; A += 2; D = A; B++

Result: (0, 3)

Order 9: C = B; A = 1; B = 2; A += 2; D = A; B++

Result: (0, 3)

Order 10: A = 1; B = 2; C = B; D = A; A += 2; B++

Result: (2, 1)

Order 11: A = 1; C = B; B = 2; D = A; A += 2; B++

Result: (0, 1)

Order 12: C = B; A = 1; B = 2; D = A; A += 2; B++

Result: (0, 1)

Order 13: A = 1; C = B; D = A; B = 2; A += 2; B++

Result: (0, 1)

Order 14: C = B; A = 1; D = A; B = 2; A += 2; B++

Result: (0, 1)

Order 15: C = B; D = A; A = 1; B = 2; A += 2; B++

Result: (0, 0)

Under the consistency assumptions, the possible (C, D) pairs are therefore (0, 0), (0, 1), (0, 3), (2, 1), (2, 3), and (3, 3).
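The fifteen results above can be reproduced mechanically. Under the consistency assumptions, P1's writes become visible in program order, so P2's read C = B can be placed after any prefix of P1's program, and D = A after an equal or longer prefix. A Python sketch (names are illustrative):

```python
# P1's program, in order: A = 1; B = 2; A += 2; B++
def run_p1(k):
    """Return (A, B) after P1 has executed its first k operations."""
    a, b = 0, 0
    steps = [("a=", 1), ("b=", 2), ("a+", 2), ("b+", 1)]
    for name, v in steps[:k]:
        if name == "a=":
            a = v
        elif name == "b=":
            b = v
        elif name == "a+":
            a += v
        else:
            b += v
    return a, b

# P2's program: C = B; D = A.  C = B executes after i of P1's ops and
# D = A after j of them, with 0 <= i <= j <= 4 (P2's program order).
pairs = set()
for i in range(5):
    for j in range(i, 5):
        c = run_p1(i)[1]
        d = run_p1(j)[0]
        pairs.add((c, d))

print(sorted(pairs))  # [(0, 0), (0, 1), (0, 3), (2, 1), (2, 3), (3, 3)]
```

The six distinct pairs match the results of orders 1–15.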

06

Determine one more possible pair of values for C and D if such assumptions are not maintained.

5.17.5

Suppose P2 observes the write B = 2 before it has observed A = 1 (the writes become visible out of program order). Then C = 2 and D = 0, so the result is (2, 0).

07

Determine which combinations make the protocol implementation simpler.

5.17.6 Write-back facilitates the use of exclusive-access blocks and lowers the frequency of invalidates. It also avoids the write broadcasts that a write-through scheme would need, which require a more complex protocol.

The allocation policy has very little effect on the protocol.

So write-back is simpler than write-through.


Most popular questions from this chapter

One of the biggest impediments to widespread use of virtual machines is the performance overhead incurred by running a virtual machine. Listed below are various performance parameters and application behavior.

Base CPI: 1.5
Privileged O/S accesses per 10,000 instructions: 120
Performance impact to trap to the guest O/S: 15 cycles
Performance impact to trap to VMM: 175 cycles
I/O accesses per 10,000 instructions: 30
I/O access time (includes time to trap to guest O/S): 1100 cycles

(5.15.1) Calculate the CPI for the system listed above assuming that there are no accesses to I/O. What is the CPI if the VMM performance impact doubles? If it is cut in half? If a virtual machine software company wishes to obtain a 10% performance degradation, what is the longest possible penalty to trap to the VMM?

(5.15.2) I/O accesses often have a large impact on overall system performance. Calculate the CPI of a machine using the performance characteristics above, assuming a non-virtualized system. Calculate the CPI again, this time using a virtualized system. How do these CPIs change if the system has the I/O access? Explain why I/O bound applications have a smaller impact from virtualization.

(5.15.3) Compare and contrast the ideas of virtual memory and virtual machines. How do the goals of each compare? What are the pros and cons of each? List a few cases where virtual memory is desired, and a few cases where virtual machines are desired.

(5.15.4) Section 5.6 discusses virtualization under the assumption that the virtualized system is running the same ISA as the underlying hardware. However, one possible use of virtualization is to emulate non-native ISAs. An example of this is QEMU, which emulates a variety of ISAs such as MIPS, SPARC, and PowerPC. What are some of the difficulties involved in this kind of virtualization? Is it possible for an emulated system to run faster than on its native ISA?

In this exercise we show the definition of a web server log and examine code optimizations to improve log processing speed. The data structure for the log is defined as follows:

struct entry {
    int srcIP;          // remote IP address
    char URL[128];      // request URL (e.g., "GET index.html")
    long long refTime;  // reference time
    int status;         // connection status
    char browser[64];   // client browser name
} log[NUM_ENTRIES];

Assume the following processing function for the log:

topK_sourceIP (int hour);

5.19.1 Which fields in a log entry will be accessed for the given log processing function? Assuming 64-byte cache blocks and no prefetching, how many cache misses per entry does the given function incur on average?

5.19.2 How can you reorganize the data structure to improve cache utilization and access locality? Show your structure definition code.

5.19.3 Give an example of another log processing function that would prefer a different data structure layout. If both functions are important, how would you rewrite the program to improve the overall performance? Supplement the discussion with code snippets and data.

For the problems below, use data from “Cache performance for SPEC CPU2000 Benchmarks”(http://www.cs.wisc.edu/multifacet/misc/spec2000cache-data/) for the pairs of benchmarks shown in the following table.

a.

Mesa/gcc

b.

mcf/swim

5.19.4 For 64KiB data caches with varying set associativities, what are the miss rates broken down by miss types (cold, capacity, and conflict misses) for each benchmark?

5.19.5 Select the set associativity to be used by a 64KiB L1 data cache shared by both benchmarks. If the L1 cache has to be directly mapped, select the set associativity for the 1 MiB cache.

5.19.6 Give an example in the miss rate table where higher set associativity increases the miss rate. Construct a cache configuration and reference stream to demonstrate this.

Media applications that play audio or video files are part of a class of workloads called “streaming” workloads; i.e., they bring in large amounts of data but do not reuse much of it. Consider a video streaming workload that accesses a 512 KiB working set sequentially with the following address stream:

0, 2, 4, 6, 8, 10, 12, 14, 16, …

5.5.1 Assume a 64 KiB direct-mapped cache with a 32-byte block. What is the miss rate for the address stream above? How is this miss rate sensitive to the size of the cache or the working set? How would you categorize the misses this workload is experiencing, based on the 3C model?

5.5.2 Re-compute the miss rate when the cache block size is 16 bytes, 64 bytes, and 128 bytes. What kind of locality is this workload exploiting?

5.5.3 “Prefetching” is a technique that leverages predictable address patterns to speculatively bring in additional cache blocks when a particular cache block is accessed. One example of prefetching is a stream buffer that prefetches sequentially adjacent cache blocks into a separate buffer when a particular cache block is brought in. If the data is found in the prefetch buffer, it is considered as a hit and moved into the cache and the next cache block is prefetched. Assume a two-entry stream buffer and assume that the cache latency is such that a cache block can be loaded before the computation on the previous cache block is completed. What is the miss rate for the address stream above?
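As a sanity check of the stream-buffer behavior described in 5.5.3, here is a simplified Python sketch. It ignores cache capacity and conflicts (harmless here, since the sequential stream never revisits a block), and the two-entry buffer and its refill policy are modeled in an illustrative way rather than cycle-accurately:

```python
# Two-entry sequential stream buffer in front of a cache, for the
# sequential byte-address stream 0, 2, 4, ...
BLOCK = 32                              # bytes per cache block (as in 5.5.1)

def run(addresses):
    cache = set()                       # blocks resident in the cache
    stream_buf = []                     # prefetched blocks, at most two
    misses = 0
    for a in addresses:
        b = a // BLOCK
        if b in cache:
            continue                    # ordinary cache hit
        if b in stream_buf:             # prefetch hit: move block into cache
            stream_buf.remove(b)
            cache.add(b)
        else:                           # true miss: fetch the block
            misses += 1
            cache.add(b)
            stream_buf.clear()
        nxt = b + 1                     # refill with sequentially next blocks
        while len(stream_buf) < 2:
            if nxt not in cache and nxt not in stream_buf:
                stream_buf.append(nxt)
            nxt += 1
    return misses

addrs = list(range(0, 512 * 1024, 2))   # 512 KiB working set, stride 2
print(run(addrs))                       # 1 -- only the very first access misses
```

Under these assumptions the buffer always holds the next block by the time it is needed, so only the very first access misses.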

Cache block size (B) can affect both miss rate and miss latency. Assuming a 1-CPI machine with an average of 1.35 references (both instruction and data) per instruction, help find the optimal block size given the following miss rates for various block sizes.

8: 4%
16: 3%
32: 2%
64: 1.5%
128: 1%

5.5.4 What is the optimal block size for a miss latency of 20×B cycles?

5.5.5 What is the optimal block size for a miss latency of 24+B cycles?

5.5.6 For constant miss latency, what is the optimal block size?

Recall that we have two write policies and write allocation policies, and their combinations can be implemented either in L1 or L2 cache. Assume the following choices for L1 and L2 caches:

L1: write through, non-write allocate
L2: write back, write allocate

5.4.1 Buffers are employed between different levels of memory hierarchy to reduce access latency. For this given configuration, list the possible buffers needed between L1 and L2 caches, as well as L2 cache and memory.

5.4.2 Describe the procedure of handling an L1 write-miss, considering the components involved and the possibility of replacing a dirty block.

5.4.3 For a multilevel exclusive cache configuration (a block can only reside in one of the L1 and L2 caches), describe the procedure of handling an L1 write-miss, considering the components involved and the possibility of replacing a dirty block.

Consider the following program and cache behaviors.

Data reads per 1000 instructions: 250
Data writes per 1000 instructions: 100
Instruction cache miss rate: 0.30%
Data cache miss rate: 2%
Block size (bytes): 64

5.4.4 For a write-through, write-allocate cache, what are the minimum read and write bandwidths (measured by byte per cycle) needed to achieve a CPI of 2?

5.4.5 For a write-back, write-allocate cache, assuming 30% of replaced data cache blocks are dirty, what are the minimal read and write bandwidths needed for a CPI of 2?

5.4.6 What are the minimal bandwidths needed to achieve the performance of CPI=1.5?

For a direct-mapped cache design with a 32-bit address, the following bits of the address are used to access the cache.

Tag: bits 31-10
Index: bits 9-5
Offset: bits 4-0

5.3.1 What is the cache block size (in words)?

5.3.2 How many entries does the cache have?

5.3.3 What is the ratio between total bits required for such a cache implementation over the data storage bits?

Starting from power on, the following byte-addressed cache references are recorded.

Addresses: 0, 4, 16, 132, 232, 160, 1024, 30, 140, 3100, 180, 2180

5.3.4 How many blocks are replaced?

5.3.5 What is the hit ratio?

5.3.6 List the final state of the cache, with each valid entry represented as a record of <index, tag, data>
