CS 3853 Computer Architecture Notes on Appendix B Section 3
Read Appendix B.3
Today's News: March 6
Assignment 2 is available, due March 27.
Code-A-Thon information is available.
B.3: Cache Optimization
Summary: 6 optimizations in 3 categories:
reducing the miss rate
larger block size
larger cache size
higher associativity
reducing the miss penalty
multilevel caches
giving priority to read misses over writes
reducing the hit time
avoiding address translation during indexing of the cache
Need to talk about virtual memory first.
Types of cache misses
compulsory: the first access to a block always misses because the block has never been in the cache; also called cold-start misses or first-reference misses.
capacity: the cache cannot hold all of the blocks the program needs, so blocks are discarded and must be fetched again later.
conflict: too many blocks map to the same set, also called collision misses.
These are misses that occur because the cache does not have full associativity.
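The three categories can be separated mechanically by comparing the real cache against a fully associative cache of the same capacity. The following is a minimal sketch, not a definitive tool: it assumes LRU replacement, a trace of block addresses (not byte addresses), and the hypothetical helper names lru_misses and classify_misses.

```python
from collections import OrderedDict

def lru_misses(blocks, num_sets, ways):
    """Return one boolean per access: True where an LRU cache misses."""
    sets = [OrderedDict() for _ in range(num_sets)]
    result = []
    for b in blocks:
        s = sets[b % num_sets]          # set selected by the low bits of the block address
        if b in s:
            s.move_to_end(b)            # mark as most recently used
            result.append(False)
        else:
            if len(s) == ways:
                s.popitem(last=False)   # evict the least recently used block
            s[b] = True
            result.append(True)
    return result

def classify_misses(blocks, num_sets, ways):
    """Count compulsory, capacity, and conflict misses for a block-address trace."""
    actual = lru_misses(blocks, num_sets, ways)
    # Same total capacity, but one set: no placement restrictions.
    full = lru_misses(blocks, 1, num_sets * ways)
    seen = set()
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}
    for b, miss, full_miss in zip(blocks, actual, full):
        if miss:
            if b not in seen:
                counts["compulsory"] += 1   # first reference to this block
            elif full_miss:
                counts["capacity"] += 1     # even full associativity would miss
            else:
                counts["conflict"] += 1     # only the set mapping caused the miss
        seen.add(b)
    return counts
```

For example, in a direct-mapped cache with 4 sets, blocks 0 and 4 share set 0, so the trace 0, 4, 0, 4 gives two compulsory misses followed by two conflict misses.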
Optimizations
Increasing Block Size
larger block size:
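increasing the block size can decrease the miss rate up to a point, because larger blocks exploit spatial locality
if the block size is too large, the miss rate can increase because the cache holds too few blocks
increasing the block size increases the miss penalty, because more data must be transferred on each miss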
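The trade-off between miss rate and miss penalty can be seen through the average memory access time (hit time + miss rate * miss penalty). The sketch below is illustrative only: the miss rates, the 80-cycle latency, and the 8 bytes per cycle transfer rate are made-up assumptions, not measurements.

```python
def miss_penalty(block_bytes, latency_cycles=80, bytes_per_cycle=8):
    """Cycles to fetch one block: fixed latency plus transfer time (assumed model)."""
    return latency_cycles + block_bytes / bytes_per_cycle

def amat(hit_time, miss_rate, penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate * penalty

# Hypothetical miss rates per block size in bytes, invented for illustration.
# They fall at first and then rise again once too few blocks fit in the cache.
ASSUMED_MISS_RATES = {16: 0.040, 32: 0.029, 64: 0.021, 128: 0.018, 256: 0.019}

def best_block_size(miss_rates, hit_time=1):
    """Return the block size with the lowest average memory access time."""
    return min(
        miss_rates,
        key=lambda b: amat(hit_time, miss_rates[b], miss_penalty(b)),
    )
```

With these assumed numbers the best block size is neither the smallest nor the largest, because the growing miss penalty eventually outweighs the falling miss rate.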
larger cache size:
reduces capacity misses
can increase the hit time (if associative)
can be expensive in cost and power
on-chip capacity is limited
Increasing Associativity
higher associativity:
reduces conflict misses
requires extra hardware and can increase hit time
8-way set associative is usually enough: in practice it reduces misses about as well as full associativity
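One way to see the extra hardware is to look at how the address is split as associativity grows while capacity stays fixed. This is a rough sketch assuming power-of-two sizes and 32-bit addresses; cache_geometry and set_index are hypothetical helper names.

```python
def cache_geometry(cache_bytes, block_bytes, ways, address_bits=32):
    """Return set count, field widths, and comparators needed per access."""
    num_sets = cache_bytes // (block_bytes * ways)
    offset_bits = (block_bytes - 1).bit_length()   # log2 for powers of two
    index_bits = (num_sets - 1).bit_length()
    tag_bits = address_bits - index_bits - offset_bits
    return {
        "sets": num_sets,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "tag_comparators": ways,   # every way in the selected set is checked
    }

def set_index(address, cache_bytes, block_bytes, ways):
    """Set that a byte address maps to."""
    num_sets = cache_bytes // (block_bytes * ways)
    return (address // block_bytes) % num_sets
```

For a 32KB cache with 64-byte blocks, going from direct mapped to 8-way shrinks the index from 9 to 6 bits, widens the tag from 17 to 20 bits, and needs 8 tag comparisons per access instead of 1. Two addresses 32KB apart map to the same set either way, but the 8-way set can hold both blocks at once.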
multilevel caches to reduce miss penalty
Widely used
Use small first-level cache (L1) to match the clock cycle
Use large second (and third) level cache to reduce miss penalty.
Example: Intel Core i7 has:
a 32KB L1 instruction cache per core
a 32KB L1 data cache per core
a 256KB L2 cache per core
a shared 8MB L3 cache
Why not just use a larger (256KB) L1 data cache?
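The benefit of a second level can be estimated with the average memory access time, where the L1 miss penalty is itself the L2 access time: hit time L1 + miss rate L1 * (hit time L2 + local miss rate L2 * memory penalty). The sketch below uses illustrative numbers only; the function names and all cycle counts and miss rates are assumptions, not Core i7 measurements.

```python
def amat_one_level(hit_l1, miss_rate_l1, memory_penalty):
    """AMAT when every L1 miss goes straight to main memory."""
    return hit_l1 + miss_rate_l1 * memory_penalty

def amat_two_level(hit_l1, miss_rate_l1, hit_l2, local_miss_rate_l2, memory_penalty):
    """AMAT when L1 misses are served by L2, and only L2 misses go to memory."""
    # The L1 miss penalty is the average time to get the block from L2 or beyond.
    miss_penalty_l1 = hit_l2 + local_miss_rate_l2 * memory_penalty
    return hit_l1 + miss_rate_l1 * miss_penalty_l1

# Assumed values: 1-cycle L1 hit, 4% L1 miss rate, 10-cycle L2 hit,
# 50% local L2 miss rate, 200-cycle main memory penalty.
without_l2 = amat_one_level(1, 0.04, 200)
with_l2 = amat_two_level(1, 0.04, 10, 0.5, 200)
```

Under these assumptions the average access time drops from 9 cycles to 5.4 cycles, while L1 stays small enough to keep its 1-cycle hit time.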
Give priority to read misses over writes to reduce miss penalty
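a write-through cache needs a large write buffer
simplest approach: on a read miss, wait for the write buffer to empty so the read gets the updated value
better approach: check the write buffer on a read miss; if the read does not need data that is in the write buffer, let the read go ahead of the buffered writes
a write-back cache can do something similar: copy the dirty block to a buffer, read memory, and then write the buffered block to memory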
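The write-buffer check can be sketched as follows. This is a simplified software model, not a hardware description: the WriteBuffer class and its method names are hypothetical, memory is a plain dictionary, and on a conflict the model forwards the buffered value, which is one possible policy (waiting for the buffer to drain is another).

```python
from collections import deque

class WriteBuffer:
    def __init__(self, memory):
        self.memory = memory      # dict: address -> value (stands in for main memory)
        self.pending = deque()    # queued (address, value) writes, oldest first

    def write(self, address, value):
        """Queue a write instead of stalling until memory is updated."""
        self.pending.append((address, value))

    def drain_one(self):
        """Retire the oldest buffered write to memory, if any."""
        if self.pending:
            address, value = self.pending.popleft()
            self.memory[address] = value

    def read_miss(self, address):
        """Serve a read miss without waiting for the buffer to empty."""
        # Search newest first so the most recent write to the address wins.
        for pending_address, value in reversed(self.pending):
            if pending_address == address:
                return value              # conflict: memory would be stale
        return self.memory[address]       # no conflict: read bypasses the writes
```

A read to an address with no pending write goes straight to memory while the writes stay queued; a read to an address that is still in the buffer gets the buffered value rather than the stale one in memory.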
Avoid address translation during indexing
We will come back to this after we discuss virtual memory