CS 3853 Computer Architecture Notes on Appendix B Section 2

Read Appendix B.2

B.2: Cache Performance


Today's News: March 7, 2013
No news yet.

Example 1
Compare the miss rates of:
  1. 16KB instruction cache and 64KB data cache
  2. 256KB unified cache
Make reasonable assumptions to solve the problem.
Solution:
Assumptions:
  • Misses per 1000 instructions are given in Figure B.6 as follows:
    16KB instruction: 3.82
    64KB data: 36.9
    256KB unified: 32.9
    These figures assume that 36% of instructions are loads or stores, as with some SPEC benchmarks.
    Assume a 2-way set associative cache with 64-byte blocks.
  • A hit takes 1 cycle
  • Miss penalty is 50 cycles
  • With the unified cache, a load or store takes an extra cycle because of the structural hazard.
  • Ignore stalls due to write-through.
What fraction of memory references are instruction fetches and what fraction are data accesses?
Each instruction makes 1 instruction fetch and, on average, .36 data references, for a total of 1.36 references per instruction.
fraction instruction references = 1/1.36 = .7353
fraction data references = .36/1.36 = .2647 (= 1 - .7353)
miss rate_instruction = 3.82/1000 = .00382
miss rate_data = 36.9/(.36 × 1000) = .1025
miss rate_split = .7353 × .00382 + .2647 × .1025 = .0299
miss rate_unified = 32.9/(1.36 × 1000) = .0242
The unified miss rate is better!
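The miss rate arithmetic above can be checked with a short script. This is only a sketch of the calculation in these notes; the constant and function names are made up for illustration, and the input values are the ones quoted above from Figure B.6.

```python
# Misses per 1000 instructions, as quoted in the notes from Figure B.6.
MISSES_INSTR_16KB = 3.82
MISSES_DATA_64KB = 36.9
MISSES_UNIFIED_256KB = 32.9
DATA_REFS_PER_INSTR = 0.36  # loads and stores per instruction (assumption from the notes)


def example1_miss_rates():
    """Return (split, unified) miss rates per memory reference."""
    refs_per_instr = 1 + DATA_REFS_PER_INSTR  # 1 fetch + .36 data references
    frac_instr = 1 / refs_per_instr
    frac_data = DATA_REFS_PER_INSTR / refs_per_instr

    # Convert misses per 1000 instructions into misses per reference.
    rate_instr = MISSES_INSTR_16KB / 1000
    rate_data = MISSES_DATA_64KB / (DATA_REFS_PER_INSTR * 1000)

    # Split caches: weight each miss rate by its share of all references.
    rate_split = frac_instr * rate_instr + frac_data * rate_data
    rate_unified = MISSES_UNIFIED_256KB / (refs_per_instr * 1000)
    return rate_split, rate_unified


split_rate, unified_rate = example1_miss_rates()
print(f"split:   {split_rate:.4f}")
print(f"unified: {unified_rate:.4f}")
```

Keeping the misses-per-1000 figures as named constants makes it easy to try other cache sizes from the same figure.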
This does not take into account the extra stall due to the structural hazard in the unified cache.
access time = hit time + miss rate × miss penalty
access time_split = 1 + .0299 × 50 = 2.495 cycles.
access time_unified = 1 + .36 + .0242 × 50 = 2.57 cycles.
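With these numbers, the split access time is better.
Note that the .36 term charges the extra cycle once per instruction. If the extra cycle is instead weighted by the fraction of references that are data (.2647, from the calculation above), the unified access time becomes 1 + .2647 + .0242 × 50 = 2.475 cycles, which is slightly better than the split cache.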
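As a quick check of the access time formula, here is a minimal sketch. The function name and the extra_stall parameter are illustrative only; extra_stall stands for the average structural hazard stall added per access, and .36 is the value used in the calculation above.

```python
def access_time(hit_time, miss_rate, miss_penalty, extra_stall=0.0):
    """Average access time in cycles: hit time + stall + miss rate x miss penalty."""
    return hit_time + extra_stall + miss_rate * miss_penalty


# Values from Example 1: 1-cycle hit, 50-cycle miss penalty.
split_time = access_time(1, 0.0299, 50)
unified_time = access_time(1, 0.0242, 50, extra_stall=0.36)

print(f"split:   {split_time:.3f} cycles")
print(f"unified: {unified_time:.3f} cycles")
```

Passing the stall as a parameter makes it easy to see how sensitive the comparison is to the value chosen for the structural hazard term.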
The next example explores the performance of direct mapped and set associative caches.
For a cache of a given size, higher associativity generally gives a higher hit ratio.
Higher associativity requires additional hardware (and time) to check the tags, even on a hit.
This might require increasing the clock cycle time.
Example 2
Which is faster, a direct mapped cache with a cycle time of .4 ns, or
a 2-way set associative cache with a cycle time of .45 ns?
We need some additional assumptions to do this problem:
  1. 1.3 memory accesses per instruction
  2. CPI of 1 with no cache misses
  3. miss penalty of 21 ns
  4. miss rate of direct mapped cache: 2.3%
  5. miss rate of 2-way set associative cache: 2.1%
Solution
First, we need to know the miss penalty in cycles for each:
miss penalty_direct = 21 ns/.4 ns = 52.5 cycles
miss penalty_2-way = 21 ns/.45 ns = 46.67 cycles
We round the miss penalty up to a whole number of cycles: 53 and 47.
Second, we calculate the CPI for each:
CPI_direct = 1 + 1.3 × .023 × 53 = 2.5847
CPI_2-way = 1 + 1.3 × .021 × 47 = 2.2831
What we really want is time:
Time per instruction_direct = 2.5847 × .4 ns = 1.0339 ns.
Time per instruction_2-way = 2.2831 × .45 ns = 1.0274 ns.
In this case the 2-way cache is better by .6%.
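The steps of Example 2 can be sketched in a few lines of Python. The function and parameter names are hypothetical; the defaults are the assumptions listed for this example (1.3 memory accesses per instruction, base CPI of 1, 21 ns miss penalty), and the miss penalty is rounded up to whole cycles as in the solution.

```python
import math


def time_per_instruction_ns(cycle_ns, miss_rate, accesses_per_instr=1.3,
                            base_cpi=1.0, miss_penalty_ns=21.0):
    """Time per instruction in ns for a cache with the given cycle time and miss rate."""
    # The miss penalty must be a whole number of cycles, so round up.
    penalty_cycles = math.ceil(miss_penalty_ns / cycle_ns)
    cpi = base_cpi + accesses_per_instr * miss_rate * penalty_cycles
    return cpi * cycle_ns


t_direct = time_per_instruction_ns(0.4, 0.023)    # direct mapped
t_two_way = time_per_instruction_ns(0.45, 0.021)  # 2-way set associative

print(f"direct mapped: {t_direct:.4f} ns")
print(f"2-way:         {t_two_way:.4f} ns")
```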
With out-of-order execution, part of the miss penalty can be overlapped with the execution of other instructions.
Example 3
Redo the above problem if 30% of the miss penalty can be overlapped.
Solution:
We just have to reduce the miss penalty by 30% in each case.
CPI_direct = 1 + 1.3 × .023 × 53 × .7 = 2.1093
CPI_2-way = 1 + 1.3 × .021 × 47 × .7 = 1.8982
What we really want is time:
Time per instruction_direct = 2.1093 × .4 ns = .8437 ns.
Time per instruction_2-way = 1.8982 × .45 ns = .8542 ns.
In this case the direct mapped cache is faster by about 1.2%.
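The same calculation with overlap can be sketched as follows. The names are again hypothetical, and overlap_fraction is an illustrative parameter for the part of the miss penalty hidden by out-of-order execution. As in the solution above, the penalty is rounded up to whole cycles first and then reduced by 30%.

```python
import math


def time_with_overlap_ns(cycle_ns, miss_rate, overlap_fraction=0.3,
                         accesses_per_instr=1.3, base_cpi=1.0,
                         miss_penalty_ns=21.0):
    """Time per instruction in ns when part of the miss penalty is overlapped."""
    penalty_cycles = math.ceil(miss_penalty_ns / cycle_ns)
    # Only the part of the penalty that is not overlapped stalls the processor.
    exposed_cycles = penalty_cycles * (1 - overlap_fraction)
    cpi = base_cpi + accesses_per_instr * miss_rate * exposed_cycles
    return cpi * cycle_ns


t_direct_ooo = time_with_overlap_ns(0.4, 0.023)    # direct mapped
t_two_way_ooo = time_with_overlap_ns(0.45, 0.021)  # 2-way set associative

print(f"direct mapped: {t_direct_ooo:.4f} ns")
print(f"2-way:         {t_two_way_ooo:.4f} ns")
```

Setting overlap_fraction to 0 gives the Example 2 numbers, which shows how the winner changes as more of the miss penalty is hidden.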
