
5.2 Caches are important to providing a high-performance memory hierarchy to processors. Below is a list of 32-bit memory address references, given as word addresses.
3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253


5.2.1 [10] <§5.3> For each of these references, identify the binary address, the tag,
and the index given a direct-mapped cache with 16 one-word blocks. Also list if each reference is a hit or a miss, assuming the cache is initially empty.


5.2.2 [10] <§5.3> For each of these references, identify the binary address, the tag,
and the index given a direct-mapped cache with two-word blocks and a total size of 8 blocks. Also list if each reference is a hit or a miss, assuming the cache is initially empty.


5.2.3 [20] < §§5.3, 5.4> You are asked to optimize a cache design for the given
references. There are three direct-mapped cache designs possible, all with a total of 8 words of data: C1 has 1-word blocks, C2 has 2-word blocks, and C3 has 4-word blocks. In terms of miss rate, which cache design is the best? If the miss stall time is 25 cycles, and C1 has an access time of 2 cycles, C2 takes 3 cycles, and C3 takes 5 cycles, which is the best cache design?


There are many different design parameters that are important to a cache’s overall performance. Below are listed parameters for different direct-mapped cache designs.


Cache Data Size: 32 KiB


Cache Block Size: 2 words


Cache Access Time: 1 cycle


5.2.4 [15] < §5.3> Calculate the total number of bits required for the cache listed
above, assuming a 32-bit address. Given that total size, find the total size of the closest direct-mapped cache with 16-word blocks of equal size or greater. Explain why the second cache, despite its larger data size, might provide slower performance than the first cache.


5.2.5 [20] <§§5.3, 5.4> Generate a series of read requests that have a lower miss rate on a 2 KiB 2-way set associative cache than the cache listed above. Identify one possible solution that would make the cache listed have an equal or lower miss rate than the 2 KiB cache. Discuss the advantages and disadvantages of such a solution.


5.2.6 [15] <§5.3> The formula shown in Section 5.3 shows the typical method to
index a direct-mapped cache, specifically (Block address) modulo (Number of blocks in the cache). Assuming a 32-bit address and 1024 blocks in the cache, consider a different indexing function, specifically (Block address [31:27] XOR Block address [26:22]). Is it possible to use this to index a direct-mapped cache? If so, explain why and discuss any changes that might need to be made to the cache. If it is not possible, explain why.

Short Answer


5.2.1

| Word address | Binary address | Tag | Index | Hit/Miss |
|--------------|----------------|-----|-------|----------|
| 3            | 0000 0011      | 0   | 3     | M        |
| 180          | 1011 0100      | 11  | 4     | M        |
| 43           | 0010 1011      | 2   | 11    | M        |
| 2            | 0000 0010      | 0   | 2     | M        |
| 191          | 1011 1111      | 11  | 15    | M        |
| 88           | 0101 1000      | 5   | 8     | M        |
| 190          | 1011 1110      | 11  | 14    | M        |
| 14           | 0000 1110      | 0   | 14    | M        |
| 181          | 1011 0101      | 11  | 5     | M        |
| 44           | 0010 1100      | 2   | 12    | M        |
| 186          | 1011 1010      | 11  | 10    | M        |
| 253          | 1111 1101      | 15  | 13    | M        |

5.2.2

| Word address | Binary address | Tag | Index | Hit/Miss |
|--------------|----------------|-----|-------|----------|
| 3            | 0000 0011      | 0   | 1     | M        |
| 180          | 1011 0100      | 11  | 2     | M        |
| 43           | 0010 1011      | 2   | 5     | M        |
| 2            | 0000 0010      | 0   | 1     | H        |
| 191          | 1011 1111      | 11  | 7     | M        |
| 88           | 0101 1000      | 5   | 4     | M        |
| 190          | 1011 1110      | 11  | 7     | H        |
| 14           | 0000 1110      | 0   | 7     | M        |
| 181          | 1011 0101      | 11  | 2     | H        |
| 44           | 0010 1100      | 2   | 6     | M        |
| 186          | 1011 1010      | 11  | 5     | M        |
| 253          | 1111 1101      | 15  | 6     | M        |

5.2.3

| Word address | Binary address | Tag | C1 index | C1 hit/miss | C2 index | C2 hit/miss | C3 index | C3 hit/miss |
|--------------|----------------|-----|----------|-------------|----------|-------------|----------|-------------|
| 3            | 0000 0011      | 0   | 3        | M           | 1        | M           | 0        | M           |
| 180          | 1011 0100      | 22  | 4        | M           | 2        | M           | 1        | M           |
| 43           | 0010 1011      | 5   | 3        | M           | 1        | M           | 0        | M           |
| 2            | 0000 0010      | 0   | 2        | M           | 1        | M           | 0        | M           |
| 191          | 1011 1111      | 23  | 7        | M           | 3        | M           | 1        | M           |
| 88           | 0101 1000      | 11  | 0        | M           | 0        | M           | 0        | M           |
| 190          | 1011 1110      | 23  | 6        | M           | 3        | H           | 1        | H           |
| 14           | 0000 1110      | 1   | 6        | M           | 3        | M           | 1        | M           |
| 181          | 1011 0101      | 22  | 5        | M           | 2        | H           | 1        | M           |
| 44           | 0010 1100      | 5   | 4        | M           | 2        | M           | 1        | M           |
| 186          | 1011 1010      | 23  | 2        | M           | 1        | M           | 0        | M           |
| 253          | 1111 1101      | 31  | 5        | M           | 2        | M           | 1        | M           |

Cache 2 provides the best performance.

5.2.4

Total cache size = 41984 bytes

5.2.5

The sequence 0, 32768, 0, 32768, 0, 32768, … would miss on every access.

5.2.6

Yes, it is possible.

Step by step solution

01

Determine the cache basics:

A cache is a small, fast memory that temporarily holds values during execution. A cache structure in which each memory location maps to exactly one cache location is known as a direct-mapped cache. If requested data is not present in the cache, the access is called a cache miss; if it is present, it is called a hit.

02

Determine the mapping details for each reference.

5.2.1

The given list of 32-bit memory address references, as word addresses, is:

3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253

The binary address is the word address converted to binary. The tag and index determine where each address maps in the cache. Whether a reference is a hit or a miss depends on whether the requested data is already present.

| Word address | Binary address | Tag | Index | Hit/Miss |
|--------------|----------------|-----|-------|----------|
| 3            | 0000 0011      | 0   | 3     | M        |
| 180          | 1011 0100      | 11  | 4     | M        |
| 43           | 0010 1011      | 2   | 11    | M        |
| 2            | 0000 0010      | 0   | 2     | M        |
| 191          | 1011 1111      | 11  | 15    | M        |
| 88           | 0101 1000      | 5   | 8     | M        |
| 190          | 1011 1110      | 11  | 14    | M        |
| 14           | 0000 1110      | 0   | 14    | M        |
| 181          | 1011 0101      | 11  | 5     | M        |
| 44           | 0010 1100      | 2   | 12    | M        |
| 186          | 1011 1010      | 11  | 10    | M        |
| 253          | 1111 1101      | 15  | 13    | M        |
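As a sanity check, the mapping above can be reproduced with a short script (a sketch, not part of the textbook solution): for a direct-mapped cache with 16 one-word blocks, the index is the word address modulo 16 and the tag is the remaining high-order bits.

```python
# Sketch: tag/index decomposition and hit/miss tracking for a
# direct-mapped cache with 16 one-word blocks (4 index bits).
refs = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
NUM_BLOCKS = 16

cache = {}   # index -> tag currently resident
rows = []
for addr in refs:
    index = addr % NUM_BLOCKS    # (block address) modulo (number of blocks)
    tag = addr // NUM_BLOCKS     # remaining high-order address bits
    hit = cache.get(index) == tag
    cache[index] = tag           # fill the block on a miss (tag unchanged on a hit)
    rows.append((addr, f"{addr:08b}", tag, index, "H" if hit else "M"))

for row in rows:
    print(row)
```

Every reference misses, in agreement with the table.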

03

The direct-mapped cache with two-word blocks

5.2.2

The list with binary address, tag, index, and hit/miss for the 8-block cache with two-word blocks is as follows:

| Word address | Binary address | Tag | Index | Hit/Miss |
|--------------|----------------|-----|-------|----------|
| 3            | 0000 0011      | 0   | 1     | M        |
| 180          | 1011 0100      | 11  | 2     | M        |
| 43           | 0010 1011      | 2   | 5     | M        |
| 2            | 0000 0010      | 0   | 1     | H        |
| 191          | 1011 1111      | 11  | 7     | M        |
| 88           | 0101 1000      | 5   | 4     | M        |
| 190          | 1011 1110      | 11  | 7     | H        |
| 14           | 0000 1110      | 0   | 7     | M        |
| 181          | 1011 0101      | 11  | 2     | H        |
| 44           | 0010 1100      | 2   | 6     | M        |
| 186          | 1011 1010      | 11  | 5     | M        |
| 253          | 1111 1101      | 15  | 6     | M        |
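The same check can be applied to the two-word-block configuration (again a sketch, not from the textbook): the low bit of the word address is the block offset, the next three bits are the index, and the rest is the tag.

```python
# Sketch: direct-mapped cache with two-word blocks and 8 blocks total.
refs = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
WORDS_PER_BLOCK = 2
NUM_BLOCKS = 8

cache = {}   # index -> tag currently resident
rows = []
for addr in refs:
    block = addr // WORDS_PER_BLOCK   # strip the 1-bit word offset
    index = block % NUM_BLOCKS        # 3 index bits
    tag = block // NUM_BLOCKS         # remaining high-order bits
    hit = cache.get(index) == tag
    cache[index] = tag
    rows.append((addr, tag, index, "H" if hit else "M"))

print(rows)
```

The three hits (addresses 2, 190, and 181) match the table.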

04

Determine the best cache design in terms of miss rate and total cycles

5.2.3

Given that there are three direct-mapped cache designs possible, all with a total of 8 words of data: C1 has 1-word blocks, C2 has 2-word blocks, and C3 has 4-word blocks.

The miss stall time is 25 cycles, and C1 has an access time of 2 cycles, C2 takes 3 cycles, and C3 takes 5 cycles.

| Word address | Binary address | Tag | C1 index | C1 hit/miss | C2 index | C2 hit/miss | C3 index | C3 hit/miss |
|--------------|----------------|-----|----------|-------------|----------|-------------|----------|-------------|
| 3            | 0000 0011      | 0   | 3        | M           | 1        | M           | 0        | M           |
| 180          | 1011 0100      | 22  | 4        | M           | 2        | M           | 1        | M           |
| 43           | 0010 1011      | 5   | 3        | M           | 1        | M           | 0        | M           |
| 2            | 0000 0010      | 0   | 2        | M           | 1        | M           | 0        | M           |
| 191          | 1011 1111      | 23  | 7        | M           | 3        | M           | 1        | M           |
| 88           | 0101 1000      | 11  | 0        | M           | 0        | M           | 0        | M           |
| 190          | 1011 1110      | 23  | 6        | M           | 3        | H           | 1        | H           |
| 14           | 0000 1110      | 1   | 6        | M           | 3        | M           | 1        | M           |
| 181          | 1011 0101      | 22  | 5        | M           | 2        | H           | 1        | M           |
| 44           | 0010 1100      | 5   | 4        | M           | 2        | M           | 1        | M           |
| 186          | 1011 1010      | 23  | 2        | M           | 1        | M           | 0        | M           |
| 253          | 1111 1101      | 31  | 5        | M           | 2        | M           | 1        | M           |

Cache 1 miss rate = 12/12 = 100%

Cache 1 total cycles = 12 × 25 + 12 × 2 = 324

Cache 2 miss rate = 10/12 ≈ 83%

Cache 2 total cycles = 10 × 25 + 12 × 3 = 286

Cache 3 miss rate = 11/12 ≈ 92%

Cache 3 total cycles = 11 × 25 + 12 × 5 = 335

From the above results, Cache 2 provides the best performance based on both the miss rate and the total cycles.
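The miss counts and cycle totals can be verified by parameterizing the simulation over block size and block count (a sketch; the 25-cycle miss stall and per-design access times come from the problem statement).

```python
def misses(refs, words_per_block, num_blocks):
    """Count misses in a direct-mapped cache for a word-address trace."""
    cache = {}
    count = 0
    for addr in refs:
        block = addr // words_per_block
        index, tag = block % num_blocks, block // num_blocks
        if cache.get(index) != tag:
            count += 1
            cache[index] = tag
    return count

refs = [3, 180, 43, 2, 191, 88, 190, 14, 181, 44, 186, 253]
MISS_STALL = 25
# name -> (words per block, number of blocks, access cycles); 8 words of data each
designs = {"C1": (1, 8, 2), "C2": (2, 4, 3), "C3": (4, 2, 5)}

results = {}
for name, (wpb, blocks, access) in designs.items():
    m = misses(refs, wpb, blocks)
    results[name] = (m, m * MISS_STALL + len(refs) * access)

print(results)  # C2 has both the lowest miss count and the fewest total cycles
```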

05

The closest direct-mapped cache with 16-word blocks of equal size or greater.

5.2.4

First determine the number of cache blocks in the first cache configuration. We do this by dividing 32 KiB by 4 (the number of bytes per word) and by 2 (the number of words per block).

As a result, we have 4096 blocks and a 12-bit index field. There is also a 1-bit word offset and a 2-bit byte offset. The tag field is therefore 32 − 12 − 1 − 2 = 17 bits wide. The tag bits, along with one valid bit per block, require 18 × 4096 = 73728 bits, or 9216 bytes. The total cache size is thus 9216 + 32768 = 41984 bytes.

The total cache size can be computed as follows:

Total size = data size + (valid bit size + tag size) × blocks

Total size = 41984 bytes

Data size = blocks × block size × word size

Word size = 4 bytes

Tag size = 32 − log2(blocks) − log2(block size) − log2(word size)

Valid bit size = 1

Increasing the number of words per block from 2 to 16 reduces the tag size from 17 to 14 bits.

To determine the number of blocks, we solve the inequality:

41984 ≤ 64 × blocks + 15 × blocks

Solving this inequality gives 531 blocks, and rounding up to the next power of two gives a 1024-block cache.

The larger block size may lead to a longer hit time and a larger miss penalty than the original cache. The smaller number of blocks may cause a higher conflict miss rate than the original cache.
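The bit accounting for the original 32 KiB cache can be checked numerically (a sketch of the arithmetic above, not additional solution content):

```python
import math

DATA_BYTES = 32 * 1024            # 32 KiB of data
WORD_BYTES = 4
WORDS_PER_BLOCK = 2

blocks = DATA_BYTES // (WORD_BYTES * WORDS_PER_BLOCK)        # 4096 blocks
index_bits = int(math.log2(blocks))                          # 12
offset_bits = int(math.log2(WORDS_PER_BLOCK * WORD_BYTES))   # 1 word + 2 byte = 3
tag_bits = 32 - index_bits - offset_bits                     # 17
overhead_bytes = (tag_bits + 1) * blocks // 8                # +1 valid bit per block
total_bytes = DATA_BYTES + overhead_bytes

print(blocks, tag_bits, total_bytes)
```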

06

Identify one possible solution

5.2.5

Associative caches are intended to limit conflict misses. As a result, a series of read requests with the same index field but distinct tag fields will produce a large number of misses on a direct-mapped cache.

For the cache described above, the sequence 0, 32768, 0, 32768, 0, 32768, … would miss on every access, while a 2-way set associative cache with LRU replacement, even one with a significantly smaller overall capacity, would hit on every access after the first two.

The disadvantage of this setup is that, with its smaller capacity, it can produce a larger number of misses on other access patterns. The advantage is that a 2-way set associative cache needs only a small capacity to deal with such conflict misses.
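The contrast can be simulated with a small LRU set-associative model (a sketch; the 2-word block size assumed for the 2 KiB cache is my assumption, since the problem does not fix it).

```python
from collections import OrderedDict

def run(refs, num_sets, ways, block_bytes):
    """Count misses for an LRU set-associative cache over byte addresses."""
    sets = [OrderedDict() for _ in range(num_sets)]  # tag -> None, ordered by recency
    miss = 0
    for addr in refs:
        block = addr // block_bytes
        s, tag = block % num_sets, block // num_sets
        if tag in sets[s]:
            sets[s].move_to_end(tag)          # refresh LRU position on a hit
        else:
            miss += 1
            if len(sets[s]) == ways:
                sets[s].popitem(last=False)   # evict the least recently used tag
            sets[s][tag] = None
    return miss

refs = [0, 32768] * 10                 # byte addresses alternating 0 and 32768
print(run(refs, 4096, 1, 8))           # 32 KiB direct-mapped, 2-word blocks: misses every time
print(run(refs, 128, 2, 8))            # 2 KiB 2-way (assumed 2-word blocks): 2 compulsory misses
```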

07

Is it possible to use this to index a direct-mapped cache

5.2.6

Yes, it is possible to index the cache using this function. However, because the XOR discards information about the address bits that form the index, the cache must store additional tag bits (covering all of the bits that feed the index function) to identify the cached block unambiguously.
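A sketch of the proposed index function (a hypothetical helper, not from the text) makes the information loss concrete: distinct block addresses can XOR to the same index, so the stored tag must cover every bit that feeds the function.

```python
def xor_index(addr):
    """Proposed index: block address bits [31:27] XOR bits [26:22] (5-bit result)."""
    hi = (addr >> 27) & 0x1F
    lo = (addr >> 22) & 0x1F
    return hi ^ lo

# Two different block addresses whose two 5-bit fields are equal both
# collapse to index 0, so bits [31:22] must be kept in the tag.
a = 0b00001_00001 << 22
b = 0b00010_00010 << 22
print(xor_index(a), xor_index(b))
```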

