
For a direct-mapped cache design with a 32-bit address, the following bits of the address are used to access the cache.

Field    Address bits
Tag      31-10
Index    9-5
Offset   4-0

5.3.1 What is the cache block size (in words)?

5.3.2 How many entries does the cache have?

5.3.3 What is the ratio of the total bits required for such a cache implementation to the data storage bits?

Starting from power on, the following byte-addressed cache references are recorded.

Address: 0, 4, 16, 132, 232, 160, 1024, 30, 140, 3100, 180, 2180

5.3.4 How many blocks are replaced?

5.3.5 What is the hit ratio?

5.3.6 List the final state of the cache, with each valid entry represented as a record of <index, tag, data>

Short Answer


5.3.1

8 words

5.3.2

32 entries

5.3.3

Ratio = 279/256 ≈ 1.09

5.3.4

Number of blocks replaced = 4

5.3.5

Hit ratio is 4/12 = 1/3

5.3.6

The final state of the cache:

<00000, 3, mem[3072]> <00100, 2, mem[2176]> <00101, 0, mem[160]> <00111, 0, mem[224]>

Step by step solution

01

Determine the formulae.

The formula for the cache block size:

Cache block size = 2^(offset bits) …… (1)

The formula for the number of cache entries:

Entries = 2^(index bits) …… (2)

The formula for the hit ratio:

Hit ratio = number of hits / total number of references …… (3)

02

Determine the cache block size in words

5.3.1

Cache block size = 2^(offset bits) = 2^5 = 32 bytes

Since addresses are byte addresses and words are 4 bytes, 32 bytes = 8 words.

So, the cache block size is 8 words.

03

Determine the entries in the cache

5.3.2

Entries = 2^(index bits) = 2^5 = 32

04

Determine the ratio of the total bits required for such a cache implementation to the total bits required for data storage

5.3.3

Each entry holds 32 × 8 = 256 data bits plus a 22-bit tag (bits 31-10) and 1 valid bit, and there are 32 entries:

Ratio = total bits for cache / data storage bits = [32 × (256 + 22 + 1)] / [32 × 256] = 279/256 ≈ 1.09

05

Determine the number of blocks replaced

5.3.4

Block address = byte address / block size = byte address / 32

Line ID (index) = block address mod number of blocks in cache = block address mod 32

Tag = byte address / 1024 (bits 31-10)
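
The field extraction can be sketched in C as a quick sanity check (my sketch, not part of the textbook solution); the shift amounts and mask follow directly from the field boundaries in the problem:

#include <stdint.h>

/* offset = bits 4-0 (5 bits), index = bits 9-5 (5 bits), tag = bits 31-10 (22 bits) */
static uint32_t cache_index(uint32_t addr) {
    return (addr >> 5) & 0x1F;  /* drop the 5 offset bits, keep the 5 index bits */
}

static uint32_t cache_tag(uint32_t addr) {
    return addr >> 10;          /* drop offset and index bits */
}

For example, cache_index(3100) = 0 and cache_tag(3100) = 3, matching the trace below.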

Address:   0   4   16   132   232   160   1024   30   140   3100   180   2180
Line ID:   0   0    0     4     7     5      0    0     4      0     5      4
Tag:       0   0    0     0     0     0      1    0     0      3     0      2
Hit/Miss:  M   H    H     M     M     M      M    M     H      M     H      M
Replace:   N   N    N     N     N     N      Y    Y     N      Y     N      Y

Number of blocks replaced = 4 (the misses at addresses 1024, 30, 3100, and 2180 each evict a valid block).

06

Determine the hit ratio

5.3.5

Total number of references = 12

Number of hits = 4 (addresses 4, 16, 140, and 180)

Hit ratio = number of hits / total number of references

So, the hit ratio is 4/12 = 1/3

07

Determine the final state of the cache

Final state: <index, tag, data>

In the final state the index is shown in binary; the tag is shown in decimal (it occupies 22 bits in hardware), and mem[N] denotes the 32-byte block starting at byte address N.

<00000, 3, mem[3072]> <00100, 2, mem[2176]> <00101, 0, mem[160]> <00111, 0, mem[224]>
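
To double-check the whole trace mechanically, here is a minimal direct-mapped cache simulator in C (a sketch under the geometry above: 32 entries of 32-byte blocks); it reproduces the hit count, the replacement count, and the final contents:

#include <stdint.h>
#include <stdio.h>

#define ENTRIES 32  /* 2^5 index bits  */
#define BLOCK   32  /* 2^5 offset bits */

int main(void) {
    uint32_t refs[] = {0, 4, 16, 132, 232, 160, 1024, 30, 140, 3100, 180, 2180};
    int n = sizeof refs / sizeof refs[0];
    int valid[ENTRIES] = {0};
    uint32_t tags[ENTRIES] = {0};
    int hits = 0, replaced = 0;

    for (int i = 0; i < n; i++) {
        uint32_t idx = (refs[i] / BLOCK) % ENTRIES;  /* bits 9-5   */
        uint32_t tag = refs[i] / (BLOCK * ENTRIES);  /* bits 31-10 */
        if (valid[idx] && tags[idx] == tag) {
            hits++;                                  /* block already resident */
        } else {
            if (valid[idx]) replaced++;              /* miss that evicts a valid block */
            valid[idx] = 1;
            tags[idx] = tag;
        }
    }
    printf("hits: %d/%d, blocks replaced: %d\n", hits, n, replaced);
    for (int i = 0; i < ENTRIES; i++)
        if (valid[i])  /* block start address = tag * 1024 + index * 32 */
            printf("<index %2d, tag %u, mem[%u]>\n", i, tags[i],
                   tags[i] * (BLOCK * ENTRIES) + (uint32_t)i * BLOCK);
    return 0;
}

It prints hits: 4/12, blocks replaced: 4, and the four valid entries listed above.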


Most popular questions from this chapter

Recall that we have two write policies and write allocation policies, and their combinations can be implemented either in L1 or L2 cache. Assume the following choices for L1 and L2 caches:

L1: write-through, non-write allocate
L2: write-back, write allocate

5.4.1 Buffers are employed between different levels of memory hierarchy to reduce access latency. For this given configuration, list the possible buffers needed between L1 and L2 caches, as well as L2 cache and memory.

5.4.2 Describe the procedure of handling an L1 write-miss, considering the component involved and the possibility of replacing a dirty block.

5.4.3 For a multilevel exclusive cache configuration (a block can only reside in one of the L1 and L2 caches), describe the procedure of handling an L1 write-miss, considering the components involved and the possibility of replacing a dirty block.

Consider the following program and cache behaviors.

Data reads per 1000 instructions: 250
Data writes per 1000 instructions: 100
Instruction cache miss rate: 0.30%
Data cache miss rate: 2%
Block size: 64 bytes

5.4.4 For a write-through, write-allocate cache, what are the minimum read and write bandwidths (measured in bytes per cycle) needed to achieve a CPI of 2?

5.4.5 For a write-back, write-allocate cache, assuming 30% of replaced data cache blocks are dirty, what are the minimal read and write bandwidths needed for a CPI of 2?

5.4.6 What are the minimal bandwidths needed to achieve a CPI of 1.5?
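
For 5.4.4, the traffic accounting can be sketched in C. Everything flagged as an assumption below is mine, not the book's: rates are taken as per-1000-instructions, words are 4 bytes, one instruction fetch occurs per instruction, a CPI of 2 means 0.5 instructions retire per cycle, and under write-through/write-allocate every miss fetches a 64-byte block while every write also sends one word to the next level:

#include <stdio.h>

int main(void) {
    double reads_per_inst  = 250.0 / 1000;  /* data reads per instruction (per-1000 rate assumed) */
    double writes_per_inst = 100.0 / 1000;  /* data writes per instruction */
    double i_miss = 0.0030, d_miss = 0.02;  /* instruction / data cache miss rates */
    double block = 64, word = 4;            /* bytes (word size assumed) */
    double inst_per_cycle = 1.0 / 2;        /* CPI = 2 */

    /* Reads from memory: one fetch per instruction can miss, and data misses
       (including write misses, which allocate) each pull in a block. */
    double read_bw = inst_per_cycle *
        (i_miss * block + (reads_per_inst + writes_per_inst) * d_miss * block);
    /* Writes to memory: write-through sends every store's word onward. */
    double write_bw = inst_per_cycle * writes_per_inst * word;

    printf("read  bandwidth >= %.2f bytes/cycle\n", read_bw);   /* 0.32 */
    printf("write bandwidth >= %.2f bytes/cycle\n", write_bw);  /* 0.20 */
    return 0;
}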

In this exercise we show the definition of a web server log and examine code optimizations to improve log processing speed. The data structure for the log is defined as follows:

struct entry {
    int srcIP;          // remote IP address
    char URL[128];      // request URL (e.g., "GET index.html")
    long long refTime;  // reference time
    int status;         // connection status
    char browser[64];   // client browser name
} log[NUM_ENTRIES];

Assume the following processing function for the log:

topK_sourceIP(int hour);

5.19.1 Which fields in a log entry will be accessed for the given log processing function? Assuming 64-byte cache blocks and no prefetching, how many cache misses per entry does the given function incur on average?

5.19.2 How can you reorganize the data structure to improve cache utilization and access locality? Show your structure definition code.

5.19.3 Give an example of another log processing function that would prefer a different data structure layout. If both functions are important, how would you rewrite the program to improve the overall performance? Supplement the discussion with code snippets and data.
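
As a starting point for 5.19.2, one common reorganization is a hot/cold split (a sketch, not the book's official answer; it assumes topK_sourceIP reads only srcIP and refTime, filtering by hour and counting source IPs, and NUM_ENTRIES here is a placeholder value):

#define NUM_ENTRIES 1000000  /* placeholder; the original leaves this symbolic */

struct entry_hot {           /* the fields the scan touches: 16 bytes with usual  */
    int srcIP;               /* alignment, so 4 entries per 64-byte block instead */
    long long refTime;       /* of one ~216-byte entry spanning ~4 blocks         */
};

struct entry_cold {          /* touched only by functions that need the strings */
    char URL[128];
    int status;
    char browser[64];
};

struct entry_hot  log_hot[NUM_ENTRIES];
struct entry_cold log_cold[NUM_ENTRIES];  /* log_hot[i], log_cold[i]: same request */

A scan over log_hot now streams through dense 16-byte records, so sequential accesses touch far fewer cache blocks per entry than the original layout.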

For the problems below, use data from “Cache performance for SPEC CPU2000 Benchmarks”(http://www.cs.wisc.edu/multifacet/misc/spec2000cache-data/) for the pairs of benchmarks shown in the following table.

a. Mesa/gcc
b. mcf/swim

5.19.4 For 64KiB data caches with varying set associativities, what are the miss rates broken down by miss types (cold, capacity, and conflict misses) for each benchmark?

5.19.5 Select the set associativity to be used by a 64KiB L1 data cache shared by both benchmarks. If the L1 cache has to be directly mapped, select the set associativity for the 1 MiB cache.

5.19.6 Give an example in the miss rate table where higher set associativity increases the miss rate. Construct a cache configuration and reference stream to demonstrate this.

Mean Time Between Failures (MTBF), Mean Time To Replacement (MTTR), and Mean Time To Failure (MTTF) are useful metrics for evaluating the reliability and availability of a storage resource. Explore these concepts by answering the questions about devices with the following metrics.

MTTF: 3 years
MTTR: 1 day

(5.8.1) Calculate the MTBF for each of the devices in the table.

(5.8.2) Calculate the availability for each of the devices in the table.

(5.8.3) What happens to availability as the MTTR approaches 0? Is this a realistic situation?

(5.8.4) What happens to availability as the MTTR gets very high, i.e., a device is difficult to repair? Does this imply the device has low availability?
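
The definitions needed here are standard, though the exercise does not restate them: MTBF = MTTF + MTTR and Availability = MTTF / (MTTF + MTTR). A quick check in C for the device in the table, counting a year as 365 days:

#include <stdio.h>

int main(void) {
    double mttf = 3 * 365.0;  /* MTTF = 3 years, in days (365-day years assumed) */
    double mttr = 1.0;        /* MTTR = 1 day */
    double mtbf = mttf + mttr;
    double availability = mttf / (mttf + mttr);
    printf("MTBF = %.0f days, availability = %.5f\n", mtbf, availability);  /* 1096, ~0.99909 */
    return 0;
}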

Question: For a high-performance system such as a B-tree index for a database, the page size is determined mainly by the data size and disk performance. Assume that on average a B-tree index page is 70% full with fixed-size entries. The utility of a page is its B-tree depth, measured as the number of disk accesses saved. The following table shows that, for 16-byte entries and a 10-year-old disk with a 10 ms latency and a 10 MB/s transfer rate, the optimal page size is 16 KiB.

Page Size (KiB)   Utility (B-Tree Depth Saved)   Index Page Access Cost (ms)   Utility/Cost
2                 6.49                           10.2                          0.64
4                 7.49                           10.4                          0.72
8                 8.49                           10.8                          0.79
16                9.49                           11.6                          0.82
32                10.49                          13.2                          0.79
64                11.49                          16.4                          0.70
128               12.49                          22.8                          0.55
256               13.49                          35.6                          0.38

(5.10.1) What is the best page size if entries now become 128 bytes?

(5.10.2) Based on 5.10.1, what is the best page size if pages are half full?

(5.10.3) Based on 5.10.2, what is the best page size if using a modern disk with a 3 ms latency and 100 MB/s transfer rate? Explain why future servers are likely to have larger pages.
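
The access-cost column can be reproduced from the stated disk parameters, which makes 5.10.1-5.10.3 easy to re-run with different disks. A sketch (the table's numbers work out if 10 MB/s is read as 10 KiB per ms; the utility values are copied from the table, since the exercise's utility formula was lost in transcription):

#include <stdio.h>

int main(void) {
    double latency_ms = 10.0;  /* old disk: 10 ms latency */
    double kib_per_ms = 10.0;  /* 10 MB/s taken as 10 KiB/ms (assumed reading) */
    double utility[] = {6.49, 7.49, 8.49, 9.49, 10.49, 11.49, 12.49, 13.49};

    int page_kib = 2;
    for (int i = 0; i < 8; i++, page_kib *= 2) {
        double cost = latency_ms + page_kib / kib_per_ms;  /* seek+rotate, then transfer */
        printf("%3d KiB: cost %.1f ms, utility/cost %.2f\n",
               page_kib, cost, utility[i] / cost);
    }
    return 0;
}

Running it reproduces the cost and utility/cost columns, with the maximum ratio (0.82) at 16 KiB.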

Keeping “frequently used” (or “hot”) pages in DRAM can save disk accesses, but how do we determine the exact meaning of “frequently used” for a given system? Data engineers use the cost ratio between DRAM and disk access to quantify the reuse time threshold for hot pages. The cost of a disk access is $Disk / accesses_per_sec, while the cost to keep a page in DRAM is $DRAM_per_MiB / pages_per_MiB. The typical DRAM and disk costs and typical database page sizes at several time points are listed below:

Year   DRAM Cost ($/MiB)   Page Size (KiB)   Disk Cost ($/disk)   Disk Access Rate (access/sec)
1987   5000                1                 15,000               15
1997   15                  8                 2,000                64
2007   0.05                64                80                   83

(5.10.4) What are the reuse time thresholds for these three technology generations?

(5.10.5) What are the reuse time thresholds if we keep using the same 4K page size? What’s the trend here?

(5.10.6) What other factors can be changed to keep using the same page size (thus avoiding software rewrite)? Discuss their likeliness with current technology and cost trends.
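
For 5.10.4, the break-even setup can be sketched as follows (my reading of the cost model: a page earns its DRAM residence if it is reused at least once every T seconds, where T balances the per-page DRAM cost against the cost of one access-per-second of disk capability):

#include <stdio.h>

int main(void) {
    /* Rows from the table: year, DRAM $/MiB, page size (KiB), disk $, accesses/sec */
    struct { int year; double dram_mib, page_kib, disk, rate; } rows[] = {
        {1987, 5000, 1, 15000, 15},
        {1997, 15, 8, 2000, 64},
        {2007, 0.05, 64, 80, 83},
    };
    for (int i = 0; i < 3; i++) {
        double page_cost = rows[i].dram_mib * rows[i].page_kib / 1024;  /* $ to hold one page */
        double access_per_sec_cost = rows[i].disk / rows[i].rate;       /* $ per (access/sec) */
        printf("%d: reuse threshold ~ %.0f seconds\n",
               rows[i].year, access_per_sec_cost / page_cost);          /* ~205, ~267, ~308 */
    }
    return 0;
}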

Media applications that play audio or video files are part of a class of workloads called “streaming” workloads; i.e., they bring in large amounts of data but do not reuse much of it. Consider a video streaming workload that accesses a 512 KiB working set sequentially with the following address stream:

0, 2, 4, 6, 8, 10, 12, 14, 16, …

5.5.1 Assume a 64 KiB direct-mapped cache with a 32-byte block. What is the miss rate for the address stream above? How is this miss rate sensitive to the size of the cache or the working set? How would you categorize the misses this workload is experiencing, based on the 3C model?

5.5.2 Re-compute the miss rate when the cache block size is 16 bytes, 64 bytes, and 128 bytes. What kind of locality is this workload exploiting?

5.5.3 “Prefetching” is a technique that leverages predictable address patterns to speculatively bring in additional cache blocks when a particular cache block is accessed. One example of prefetching is a stream buffer that prefetches sequentially adjacent cache blocks into a separate buffer when a particular cache block is brought in. If the data is found in the prefetch buffer, it is considered as a hit and moved into the cache and the next cache block is prefetched. Assume a two-entry stream buffer and assume that the cache latency is such that a cache block can be loaded before the computation on the previous cache block is completed. What is the miss rate for the address stream above?
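
For 5.5.1 and 5.5.2 the access pattern is regular enough to reason about in closed form: with a 2-byte stride, a B-byte block is referenced B/2 times and misses exactly once (a compulsory miss; no block is ever revisited, so cache and working-set size do not change the answer). A sketch of the sweep:

#include <stdio.h>

int main(void) {
    int block_bytes[] = {16, 32, 64, 128};  /* block sizes from 5.5.1 and 5.5.2 */
    for (int i = 0; i < 4; i++)
        /* one compulsory miss per block, B/2 two-byte accesses per block */
        printf("block %3d B: miss rate = 1/%d = %.4f\n",
               block_bytes[i], block_bytes[i] / 2, 2.0 / block_bytes[i]);
    return 0;
}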

Cache block size (B) can affect both miss rate and miss latency. Assuming a 1-CPI machine with an average of 1.35 references (both instruction and data) per instruction, help find the optimal block size given the following miss rates for various block sizes.

8: 4%
16: 3%
32: 2%
64: 1.5%
128: 1%

5.5.4 What is the optimal block size for a miss latency of 20×B cycles?

5.5.5 What is the optimal block size for a miss latency of 24+B cycles?

5.5.6 For constant miss latency, what is the optimal block size?
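
Questions 5.5.4-5.5.6 become a mechanical sweep once the stall model is written down; a sketch, assuming effective CPI = 1 + 1.35 × (miss rate) × (miss latency):

#include <stdio.h>

int main(void) {
    double refs_per_inst = 1.35;
    int    block[]     = {8, 16, 32, 64, 128};
    double miss_rate[] = {0.04, 0.03, 0.02, 0.015, 0.01};

    for (int i = 0; i < 5; i++) {
        double cpi_a = 1 + refs_per_inst * miss_rate[i] * (20.0 * block[i]); /* 5.5.4 */
        double cpi_b = 1 + refs_per_inst * miss_rate[i] * (24.0 + block[i]); /* 5.5.5 */
        printf("B=%3d: CPI(20xB) = %5.2f, CPI(24+B) = %.3f\n", block[i], cpi_a, cpi_b);
    }
    return 0;
}

Under this model the minimum falls at B = 8 for the 20×B latency and B = 32 for 24+B; with a constant latency, the block size with the lowest miss rate (128 bytes here) wins.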
