CS 3853 Computer Architecture Notes on Appendix B Section 2

Read Appendix B.2

B.2: Cache Performance


ClassQue: Cache Performance 1

Example 1
Compare the miss ratios and access times of:
  1. 16KB instruction cache and 64KB data cache
  2. 256KB unified cache
Make reasonable assumptions to solve the problem.
Solution:
Assumptions:
  • Misses per 1000 instructions are given in Figure B.6 (on page B-15) as follows:
    16KB instruction: 3.82
    64KB data: 36.9
    256KB unified: 32.9
    These assume 36% of instructions are loads and stores, as in some SPEC benchmarks.
    Assume a 2-way set associative cache with 64-byte blocks.
  • A hit takes 1 cycle
  • Miss penalty is 50 cycles
  • A load or store takes an extra cycle in the unified cache because of the structural hazard.
  • Ignore stalls due to write-through.
miss ratio_split = (3.82 + 36.9)/(1.36 × 1000) = .02994
miss ratio_unified = 32.9/(1.36 × 1000) = .02419
The unified miss ratio is better!
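As a quick check, here is a small Python sketch of the same arithmetic; the variable names (accesses_per_instruction, misses_per_1000_split, and so on) are just labels chosen for this illustration.

  # Miss ratio = misses per instruction / memory accesses per instruction.
  # One instruction fetch plus 0.36 data accesses gives 1.36 accesses per instruction.
  accesses_per_instruction = 1.36
  misses_per_1000_split = 3.82 + 36.9     # 16KB instruction + 64KB data
  misses_per_1000_unified = 32.9          # 256KB unified

  miss_ratio_split = misses_per_1000_split / (accesses_per_instruction * 1000)
  miss_ratio_unified = misses_per_1000_unified / (accesses_per_instruction * 1000)

  print(miss_ratio_split)    # about .02994
  print(miss_ratio_unified)  # about .02419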



This does not take into account the extra stall due to the structural hazard in the unified cache.
To calculate the average memory access time:
average memory access time = hit time + miss ratio × miss penalty
access time_split = 1 + .02994 × 50 = 2.497 cycles.
access time_unified = 1 + .36 + .02419 × 50 = 2.57 cycles.
The split access time is better!
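Continuing in the same style, a self-contained sketch of the access-time arithmetic; adding .36 cycles for the unified cache is this example's simplified way of charging the one-cycle structural-hazard stall for loads and stores.

  hit_time = 1                # cycles
  miss_penalty = 50           # cycles
  load_store_fraction = 0.36  # extra cycle per load/store in the unified cache
  miss_ratio_split = 0.02994
  miss_ratio_unified = 0.02419

  access_time_split = hit_time + miss_ratio_split * miss_penalty
  access_time_unified = hit_time + load_store_fraction + miss_ratio_unified * miss_penalty

  print(access_time_split)    # about 2.497 cycles
  print(access_time_unified)  # about 2.57 cycles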

The next example explores the performance of direct mapped and set associative caches.
For a cache of a given size, higher associativity generally gives a higher hit ratio.
However, higher associativity requires additional hardware (and time) to check the tags, even on a hit.
This might require increasing the clock cycle time.
Example 2
Which is faster, a direct mapped cache with a cycle time of .4 ns, or
a 2-way set associative cache with a cycle time of .45 ns?
We need some additional assumptions to do this problem:
  1. 1.3 memory accesses per instruction
  2. CPI of 1 with no cache misses
  3. miss penalty of 21 ns
  4. miss rate of direct mapped cache: 2.3%
  5. miss rate of 2-way set associative cache: 2.1%
  6. these are unified caches, but with no structural hazard
Solution
First, we need to know the miss penalty in cycles for each:
miss penalty_direct = 21 ns/.4 ns = 52.5 cycles
miss penalty_2-way = 21 ns/.45 ns = 46.67 cycles
We round up the number of cycles for the miss penalty.
Second we calculate CPI for each:
CPI_direct = 1 + 1.3 × .023 × 53 = 2.5847
CPI_2-way = 1 + 1.3 × .021 × 47 = 2.2831
What we really want is the time per instruction:
Time per instruction_direct = 2.5847 × .4 ns = 1.0339 ns.
Time per instruction_2-way = 2.2831 × .45 ns = 1.0274 ns.
In this case the 2-way cache is better by .6%.
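The same steps can be written as a small Python sketch; the helper time_per_instruction and its parameter names are just chosen for this illustration.

  import math

  def time_per_instruction(cycle_time_ns, miss_rate, miss_penalty_ns,
                           accesses_per_instruction=1.3, base_cpi=1.0):
      # Convert the miss penalty to cycles and round up, as in the example.
      miss_penalty_cycles = math.ceil(miss_penalty_ns / cycle_time_ns)
      cpi = base_cpi + accesses_per_instruction * miss_rate * miss_penalty_cycles
      return cpi * cycle_time_ns

  print(time_per_instruction(0.40, 0.023, 21))  # direct mapped: about 1.0339 ns
  print(time_per_instruction(0.45, 0.021, 21))  # 2-way:         about 1.0274 ns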

With out-of-order execution, part of the miss penalty can be overlapped with the execution of other instructions.
Example 3
Redo the above problem if 30% of the miss penalty can be overlapped.
Solution:
We just have to reduce the miss penalty by 30% in each case.
CPI_direct = 1 + 1.3 × .023 × 53 × .7 = 2.1093
CPI_2-way = 1 + 1.3 × .021 × 47 × .7 = 1.8982
What we really want is the time per instruction:
Time per instruction_direct = 2.1093 × .4 ns = .8437 ns.
Time per instruction_2-way = 1.8982 × .45 ns = .8542 ns.
In this case the direct mapped cache is faster by 1.25%.
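A variant of the previous sketch with an overlap_fraction parameter (again, an illustrative name, not the textbook's notation) reproduces these numbers.

  import math

  def time_with_overlap(cycle_time_ns, miss_rate, miss_penalty_ns, overlap_fraction,
                        accesses_per_instruction=1.3, base_cpi=1.0):
      miss_penalty_cycles = math.ceil(miss_penalty_ns / cycle_time_ns)
      # Only the non-overlapped fraction of the miss penalty stalls the processor.
      effective_penalty = miss_penalty_cycles * (1 - overlap_fraction)
      cpi = base_cpi + accesses_per_instruction * miss_rate * effective_penalty
      return cpi * cycle_time_ns

  print(time_with_overlap(0.40, 0.023, 21, 0.30))  # direct mapped: about .8437 ns
  print(time_with_overlap(0.45, 0.021, 21, 0.30))  # 2-way:         about .8542 ns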
