
In this exercise we show the definition of a web server log and examine code optimizations to improve log processing speed. The data structure for the log is defined as follows:

struct entry {
    int srcIP;          // remote IP address
    char URL[128];      // request URL (e.g., “GET index.html”)
    long long refTime;  // reference time
    int status;         // connection status
    char browser[64];   // client browser name
} log[NUM_ENTRIES];

Assume the following processing function for the log:

topK_sourceIP (int hour);

5.19.1 Which fields in a log entry will be accessed for the given log processing function? Assuming 64-byte cache blocks and no prefetching, how many cache misses per entry does the given function incur on average?

5.19.2 How can you reorganize the data structure to improve cache utilization and access locality? Show your structure definition code.

5.19.3 Give an example of another log processing function that would prefer a different data structure layout. If both functions are important, how would you rewrite the program to improve the overall performance? Supplement the discussion with code snippets and data.

For the problems below, use data from “Cache performance for SPEC CPU2000 Benchmarks” (http://www.cs.wisc.edu/multifacet/misc/spec2000cache-data/) for the pairs of benchmarks shown in the following table.

a. Mesa/gcc
b. mcf/swim

5.19.4 For 64KiB data caches with varying set associativities, what are the miss rates broken down by miss types (cold, capacity, and conflict misses) for each benchmark?

5.19.5 Select the set associativity to be used by a 64 KiB L1 data cache shared by both benchmarks. If the L1 cache has to be directly mapped, select the set associativity for the 1 MiB L2 cache.

5.19.6 Give an example in the miss rate table where higher set associativity increases the miss rate. Construct a cache configuration and reference stream to demonstrate this.

Short Answer


5.19.1 srcIP and refTime fields. 2 misses per entry.

5.19.2 Group the srcIP and refTime fields into a separate array.

5.19.3 peak_hour (int status);

5.19.4 Cold (compulsory) misses and capacity misses are not affected by associativity; only the conflict misses change. The resulting miss rates vary with the benchmark and data set used.

5.19.5 The best set associativity varies with the benchmark pair and the data set used.

5.19.6 The “Cache performance for SPEC CPU2000” data has many such examples (e.g., apsi, mesa, ammp, mcf).

Cache: 4-block caches, direct-mapped vs. 2-way LRU.

Reference stream (block addresses): 0 2 4 0 2 4 (direct-mapped: 5 misses; 2-way LRU: 6 misses).

Step by step solution

01

Determine the code optimizations that improve log processing speed.

The processing functions scan every log entry, so their speed is bounded by how many cache blocks must be fetched per entry.

The key optimization is to lay out the data so the fields a function actually reads are packed together in memory; one 64-byte block then serves several entries instead of each entry spanning several blocks.

02

Determine which fields in a log entry will be accessed and how many cache misses occur.

5.19.1 The given log structure is as follows:

struct entry {
    int srcIP;          // remote IP address
    char URL[128];      // request URL (e.g., “GET index.html”)
    long long refTime;  // reference time
    int status;         // connection status
    char browser[64];   // client browser name
} log[NUM_ENTRIES];

topK_sourceIP(int hour) reads only the srcIP and refTime fields: refTime selects the entries that fall in the requested hour, and srcIP identifies the sources to be ranked.

With typical 8-byte alignment each entry occupies about 216 bytes, larger than a 64-byte cache block, and srcIP (offset 0) and refTime (offset 136) always fall in different blocks; nor is a block ever shared between the hot fields of consecutive entries. With no prefetching, the function therefore incurs 2 misses per entry on average.
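The exercise does not give the body of topK_sourceIP; the following is a minimal sketch of what it might do, with the hour_of() helper and the hash-indexed counter array introduced purely for illustration. The point is that the loop reads only srcIP and refTime:

#define COUNTERS (1 << 16)
static unsigned count[COUNTERS];  // toy per-source request counters

static int hour_of(long long refTime) {
    return (int)((refTime / 3600) % 24);  // assumes refTime counts seconds
}

void topK_sourceIP(int hour) {
    for (int i = 0; i < NUM_ENTRIES; i++) {
        if (hour_of(log[i].refTime) == hour)             // reads refTime
            count[(unsigned)log[i].srcIP % COUNTERS]++;  // reads srcIP
    }
    // ...then report the K sources with the largest counts (omitted)
}

Each iteration touches the two blocks holding that entry's srcIP and refTime and nothing else, which is where the 2 misses per entry come from.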

03

Determine how the data structure can be reorganized to improve cache utilization and access locality.

5.19.2 To improve cache utilization and access locality, group the srcIP and refTime fields into their own array, separate from the large, rarely used fields:

struct entry_hot {      // fields read by topK_sourceIP
    int srcIP;          // remote IP address
    long long refTime;  // reference time
};

struct entry_cold {     // fields the function never touches
    char URL[128];      // request URL (e.g., “GET index.html”)
    int status;         // connection status
    char browser[64];   // client browser name
};

struct entry_hot log_hot[NUM_ENTRIES];
struct entry_cold log_cold[NUM_ENTRIES];

Each entry_hot is 16 bytes (a 4-byte srcIP, 4 bytes of padding, and an 8-byte refTime), so four hot entries fit in one 64-byte block. A scan now misses roughly once per four entries (0.25 misses per entry) instead of twice per entry, and the URL and browser strings no longer pollute the cache.
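With the split layout the scan changes only its indexing; here is the same loop, under the same illustrative assumptions as above:

for (int i = 0; i < NUM_ENTRIES; i++) {
    if (hour_of(log_hot[i].refTime) == hour)             // reads refTime
        count[(unsigned)log_hot[i].srcIP % COUNTERS]++;  // reads srcIP
}
// Four consecutive log_hot entries share one 64-byte block,
// so this loop misses about once every four iterations.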

04

Determine how the code can be rewritten to improve overall performance.

5.19.3 An example of a log processing function that prefers a different layout is:

peak_hour(int status);

peak_hour returns the hour with the most entries that have the given connection status, so it reads the status and refTime fields and ignores srcIP. If both topK_sourceIP and peak_hour are important, group all three frequently read fields (srcIP, refTime, and status) into the hot array and keep only the large strings in the cold array:

struct entry_hot {      // fields read by both functions
    long long refTime;  // reference time
    int srcIP;          // remote IP address
    int status;         // connection status
};                      // 16 bytes: four entries per 64-byte block

struct entry_cold {
    char URL[128];      // request URL (e.g., “GET index.html”)
    char browser[64];   // client browser name
};

struct entry_hot log_hot[NUM_ENTRIES];
struct entry_cold log_cold[NUM_ENTRIES];

Placing the 8-byte refTime first avoids internal padding, keeping the struct at 16 bytes, so either function still misses only about once per four entries. A sketch of peak_hour follows.
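A minimal sketch of peak_hour over the combined hot array; the body is an illustration (the exercise only gives the prototype), reusing the assumed hour_of() helper:

int peak_hour(int status) {
    unsigned hits[24] = {0};  // matching entries per hour of day
    for (int i = 0; i < NUM_ENTRIES; i++) {
        if (log_hot[i].status == status)          // reads status
            hits[hour_of(log_hot[i].refTime)]++;  // reads refTime
    }
    int best = 0;
    for (int h = 1; h < 24; h++)  // pick the busiest hour
        if (hits[h] > hits[best]) best = h;
    return best;
}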

05

Determine the miss rate.

5.19.4

5.19.4 Compulsory (cold) misses are independent of the cache configuration, and capacity misses are, by definition, the extra misses of a fully associative cache of the same size; only conflict misses change with associativity.

The capacity miss rate is computed by subtracting the compulsory miss rate from the fully associative miss rate, and the conflict miss rate is then the total miss rate minus the cold and capacity miss rates.

The values are reported as miss rates per instruction, and they vary with the benchmark and data set used. The decomposition is sketched below.
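A small sketch of this 3C decomposition; the inputs would come from the referenced SPEC CPU2000 tables, and the values in main() are hypothetical placeholders:

#include <stdio.h>

// total:       miss rate of the configuration under study
// cold:        compulsory miss rate (misses of an infinite cache)
// fully_assoc: miss rate of a fully associative cache of the same size
static void decompose(double total, double cold, double fully_assoc) {
    double capacity = fully_assoc - cold;   // capacity = fully associative - cold
    double conflict = total - fully_assoc;  // conflict = total - cold - capacity
    printf("cold=%.4f  capacity=%.4f  conflict=%.4f\n", cold, capacity, conflict);
}

int main(void) {
    decompose(0.050, 0.002, 0.035);  // hypothetical miss rates, not SPEC data
    return 0;
}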

06

Determine set associativity.

5.19.5

5.19.5 The appropriate set associativity is the one that minimizes the combined miss rate of the two benchmarks in the referenced SPEC CPU2000 data; since that data depends on the benchmark pair and data set used, no single associativity can be named without it.

07

Determine the miss rate table, cache configuration, and reference stream.

5.19.6

The “Cache performance for SPEC CPU2000” data contains many such examples (apsi, mesa, ammp, and mcf all have configurations where higher associativity yields a higher miss rate); the full tables are too large to reproduce here.

A small configuration that demonstrates the effect:

Cache: 4 one-word blocks, direct-mapped vs. 2-way set associative with LRU replacement.

Reference stream (block addresses): 0, 2, 4, 0, 2, 4

Direct-mapped: blocks 0 and 4 conflict in set 0 while block 2 has set 2 to itself, so the second access to 2 hits and everything else misses: 5 misses. 2-way LRU: all three blocks are even and map to set 0, and three blocks cycling through a two-entry LRU set evict each other on every access: 6 misses. The simulator sketch below verifies both counts.
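A self-contained sketch (in C, an assumption since the exercise gives no code) that simulates both configurations on the stream above and prints the miss counts:

#include <stdio.h>

#define NUM_BLOCKS 4  // total blocks in each cache

// Direct-mapped: one block per set, set chosen by address modulo NUM_BLOCKS.
static int misses_direct(const int *refs, int n) {
    int tags[NUM_BLOCKS], valid[NUM_BLOCKS] = {0}, misses = 0;
    for (int i = 0; i < n; i++) {
        int set = refs[i] % NUM_BLOCKS;
        if (!valid[set] || tags[set] != refs[i]) {  // miss: fill the set
            misses++;
            valid[set] = 1;
            tags[set] = refs[i];
        }
    }
    return misses;
}

// 2-way set associative with LRU: NUM_BLOCKS/2 sets of two ways each.
static int misses_2way_lru(const int *refs, int n) {
    int tags[NUM_BLOCKS / 2][2];
    int valid[NUM_BLOCKS / 2][2] = {{0}};
    int lru[NUM_BLOCKS / 2] = {0};  // which way is least recently used
    int misses = 0;
    for (int i = 0; i < n; i++) {
        int set = refs[i] % (NUM_BLOCKS / 2), way = -1;
        for (int w = 0; w < 2; w++)
            if (valid[set][w] && tags[set][w] == refs[i]) way = w;  // hit
        if (way < 0) {       // miss: replace the LRU way
            misses++;
            way = lru[set];
            valid[set][way] = 1;
            tags[set][way] = refs[i];
        }
        lru[set] = 1 - way;  // the other way becomes LRU
    }
    return misses;
}

int main(void) {
    int refs[] = {0, 2, 4, 0, 2, 4};  // block-address reference stream
    int n = (int)(sizeof refs / sizeof refs[0]);
    printf("direct-mapped: %d misses\n", misses_direct(refs, n));    // prints 5
    printf("2-way LRU:     %d misses\n", misses_2way_lru(refs, n));  // prints 6
    return 0;
}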


Most popular questions from this chapter

One of the biggest impediments to widespread use of virtual machines is the performance overhead incurred by running a virtual machine. Listed below are various performance parameters and application behavior.

Base CPI: 1.5
Privileged O/S Accesses per 10,000 Instructions: 120
Performance Impact to Trap to the Guest O/S: 15 cycles
Performance Impact to Trap to VMM: 175 cycles
I/O Accesses per 10,000 Instructions: 30
I/O Access Time (Includes Time to Trap to Guest O/S): 1100 cycles

(5.15.1) Calculate the CPI for the system listed above assuming that there are no accesses to I/O. What is the CPI if the VMM performance impact doubles? If it is cut in half? If a virtual machine software company wishes to obtain a 10% performance degradation, what is the longest possible penalty to trap to the VMM?

(5.15.2) I/O accesses often have a large impact on overall system performance. Calculate the CPI of a machine using the performance characteristics above, assuming a non-virtualized system. Calculate the CPI again, this time using a virtualized system. How do these CPIs change if the system has half the I/O accesses? Explain why I/O bound applications have a smaller impact from virtualization.

(5.15.3) Compare and contrast the ideas of virtual memory and virtual machines. How do the goals of each compare? What are the pros and cons of each? List a few cases where virtual memory is desired, and a few cases where virtual machines are desired.

(5.15.4) Section 5.6 discusses virtualization under the assumption that the virtualized system is running the same ISA as the underlying hardware. However, one possible use of virtualization is to emulate non-native ISAs. An example of this is QEMU, which emulates a variety of ISAs such as MIPS, SPARC, and PowerPC. What are some of the difficulties involved in this kind of virtualization? Is it possible for an emulated system to run faster than on its native ISA?

In this exercise, we will examine space/time optimizations for page tables. The following list provides parameters of a virtual memory system.

Virtual Address (bits): 43
Physical DRAM Installed: 16 GiB
Page Size: 4 KiB
PTE Size (bytes): 4

(5.12.1) For a single-level page table, how many page table entries (PTEs) are needed? How much physical memory is needed for storing the page table?

(5.12.2) Using a multilevel page table can reduce the physical memory consumption of page tables, by keeping active PTEs in physical memory. How many levels of page tables will be needed in this case? And how many memory references are needed for address translation if missing in TLB?

(5.12.3) An inverted page table can be used to further optimize space and time. How many PTEs are needed to store the page table? Assuming a hash table implementation, what are the common case and worst case numbers of memory references needed for servicing a TLB miss?

The following table shows the contents of a 4-entry TLB.

Entry-ID  Valid  VA Page  Modified  Protection  PA Page
1         1      140      1         RW          30
2         0      40       0         RX          34
3         1      200      1         RO          32
4         1      280      0         RW          31

(5.12.4) Under what scenarios would entry 2’s valid bit be set to zero?

(5.12.5) What happens when an instruction writes to VA page 30? When would software managed TLB be faster than hardware managed TLB?

(5.12.6) What happens when an instruction writes to VA page 200?

Cache coherence concerns the views of multiple processors on a given cache block. The following data shows two processors and their read/write operations on two different words of a cache block X (initially X[0] = X[1] = 0). Assume the size of integers is 32 bits.

P1: X[0]++; X[1] = 3;
P2: X[0] = 5; X[1] += 2;

5.17.1 List the possible values of the given cache block for a correct cache coherence protocol implementation. List at least one more possible value of the block if the protocol doesn’t ensure cache coherency.

5.17.2 For a snooping protocol, list a valid operation sequence on each processor/cache to finish the above read/write operations.

5.17.3 What are the best-case and worst-case numbers of cache misses needed to execute the listed read/write instructions?

Memory consistency concerns the views of multiple data items. The following data shows two processors and their read/write operations on different cache blocks (A and B initially 0).

P1: A = 1; B = 2; A += 2; B++;
P2: C = B; D = A;

5.17.4 List the possible values of C and D for an implementation that ensures both consistency assumptions on page 470.

5.17.5 List at least one more possible pair of values for C and D if such assumptions are not maintained.

5.17.6 For various combinations of write policies and write allocation policies, which combinations make the protocol implementation simpler?

In this exercise, we will look at the different ways capacity affects overall performance. In general, cache access time is proportional to capacity. Assume that main memory accesses take 70 ns and that memory accesses are 36% of all instructions. The following table shows data for L1 caches attached to each of two processors, P1 and P2.

      L1 Size  L1 Miss Rate  L1 Hit Time
P1    2 KiB    8.0%          0.66 ns
P2    4 KiB    6.0%          0.90 ns

(5.6.1) Assuming that the L1 hit time determines the cycle times for P1 and P2, what are their respective clock rates?

(5.6.2) What is the Average Memory Access Time for P1 and P2?

(5.6.3) Assuming a base CPI of 1.0 without any memory stalls, what is the total CPI for P1 and P2? Which processor is faster?

For the next three problems, we will consider the addition of an L2 cache to P1 to presumably make up for its limited L1 cache capacity. Use the L1 cache capacities and hit times from the previous table when solving these problems. The L2 miss rate indicated is its local miss rate.

L2 Size  L2 Miss Rate  L2 Hit Time
1 MiB    95%           5.62 ns

(5.6.4) What is the AMAT for P1 with the addition of an L2 cache? Is the AMAT better or worse with the L2 cache?

(5.6.5) Assuming a base CPI of 1.0 without any memory stalls, what is the total CPI for P1 with the addition of an L2 cache?

(5.6.6) Which processor is faster, now that P1 has an L2 cache? If P1 is faster, what miss rate would P2 need in its L1 cache to match P1’s performance? If P2 is faster, what miss rate would P1 need in its L1 cache to match P2’s performance?

As described in Section 5.7, virtual memory uses a page table to track the mapping of virtual addresses to physical addresses. This exercise shows how this table must be updated as addresses are accessed. The following data constitutes a stream of virtual addresses as seen on a system. Assume 4 KiB pages, a 4-entry fully associative TLB, and true LRU replacement. If pages must be brought in from disk, increment the next largest page number.

4669, 2227, 13916, 34587, 48870, 12608, 49225

TLB

Valid  Tag  Physical Page Number
1      11   12
1      7    4
1      3    6
0      4    9

Page table

Valid  Physical Page or in Disk
1      5
0      Disk
0      Disk
1      6
1      9
1      11
0      Disk
1      4
0      Disk
0      Disk
1      3
1      12

(5.11.1) Given the address stream shown, and the initial TLB and page table states provided above, show the final state of the system. Also list for each reference if it is a hit in the TLB, a hit in the page table, or a page fault.

(5.11.2) Repeat 5.11.1, but this time use 16 KiB pages instead of 4 KiB pages. What would be some of the advantages of having a larger page size? What are some of the disadvantages?

(5.11.3) Show the final contents of the TLB if it is 2-way set associative. Also show the contents of the TLB if it is direct mapped. Discuss the importance of having a TLB to high performance. How would virtual memory accesses be handled if there were no TLB?

There are several parameters that impact the overall size of the page table. Listed below are key page parameters.

Virtual Address Size: 32 bits
Page Size: 8 KiB
Page Table Entry Size: 4 bytes

(5.11.4) Given the parameters shown above, calculate the total page table size for a system running 5 applications that utilize half of the memory available.

(5.11.5) Given the parameters shown above, calculate the total page table size for a system running 5 applications that utilize half of the memory available, given a two level page table approach with 256 entries. Assume each entry of the main page table is 6 bytes. Calculate the minimum amount of memory required.

(5.11.6) A cache designer wants to increase the size of a 4 KiB virtually indexed, physically tagged cache. Given the page size shown above, is it possible to make a 16 KiB direct-mapped cache, assuming 2 words per block? How would the designer increase the data size of the cache?
