CS 3853 Computer Architecture Notes on Chapter 1

Read Chapter 1

Introduction

What do the following mean? Answer in terms of execution time for a given task.
X is n times faster than Y:  $\dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$

X is n% faster than Y:  $\dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = 1 + \dfrac{n}{100}$
We will never use the word slower in this context, only faster.
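The two definitions above can be turned into a small calculation. This is only a sketch: the function names and the sample timings (10 s and 15 s) are made up for illustration and do not come from the notes.

```python
def times_faster(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y (ratio of execution times)."""
    return time_y / time_x


def percent_faster(time_x: float, time_y: float) -> float:
    """Return n such that X is n% faster than Y, i.e. time_y / time_x = 1 + n/100."""
    return (time_y / time_x - 1.0) * 100.0


# Hypothetical timings: X finishes the task in 10 s, Y in 15 s.
print(times_faster(10.0, 15.0))    # 1.5
print(percent_faster(10.0, 15.0))  # 50.0
```

Note that both functions take execution times, not rates, which matches the requirement to answer in terms of execution time.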
Today's News: August 31, 2012
Several Scholarships are available for CS undergraduates.
Fill out a separate form for each scholarship.
Forms are available at the front of the class.
Write the name of the scholarship on the form.
Return the forms to the CS Department office.

Section 1.1: Traditional Performance Growth


Section 1.2: Classes of Computers

Types of computers:

Application Parallelism

Hardware Parallelism

Hardware Classifications


Section 1.3: Defining Computer Architecture

In CS 3843 we concentrated on the ISA.
Computer Architecture has 3 main components:
Instruction Set Architecture (ISA)
Organization (also called microarchitecture): memory systems, bus structure, internal CPU design
Hardware: detailed logic design, packaging technology
Design the organization and hardware to meet goals
Example:
The Intel 8088 (1979) and 8086 (1978) had the same ISA.
The 8088 had an 8-bit memory bus while the 8086 had a 16-bit bus.
The 8088 was very successful because it was cheap to build computers with an 8-bit bus (in 1979).

Instruction Set Architecture

This is what we used to define the architecture of the X86 and Y86 in CS 3843.

Class of ISA

usually general purpose register architectures with operands in registers or memory.
classified as register-memory or load-store.

Memory Addressing

usually byte addressing

Addressing Modes

covered extensively in CS 3843

Types and sizes of operands

often: integers of 8, 16, 32, and 64 bits and floating point of 32 and 64 bits.

Operations

instructions: e.g. add, shift, and, branch

Control flow instructions

a subset of above, including conditional branches, call and return

ISA encoding

fixed length or variable length.
In CS 3843 you should have seen this for Y86 in detail and X86 examples.

Authors' View of Computer Architecture

ISA, organization, hardware

Organization

Also called microarchitecture
High level aspects of computer's design, e.g. memory system, memory interconnect, CPU design
Example: Intel and AMD use the same ISA, but different organizations.

Hardware

refers to detailed logic and packaging technology
Example: Intel Core i7 and Intel Xeon 7560

Some aspects of computer architectural design


Section 1.4: Trends in Technology

Integrated Circuit Technology

Semiconductor Ram

Semiconductor Flash

Magnetic Disk

Network Technology

Trends in Bandwidth and Latency

Wires


Today's News: September 5, 2012
No recitations this week.
Recitations begin Monday, September 10.
Recitation attendance is required.
There will be a quiz in the recitation next week.
More info is available here.
You will need to bring a calculator.
It will need to be able to do: add, subtract, multiply, divide, powers, logs.
You may not use a calculator that can connect to a network: no phones.

Section 1.5: Trends in Power and Energy

Issues

Energy vs. Power

Microprocessor Energy and Power

Techniques for improving energy efficiency


Section 1.6: Trends in Cost

Cost of an Integrated Circuit


Section 1.7: Dependability

Example 1:
A system has 2 disks, each with a MTTF of 1,000,000 hours and a power supply with a MTTF of 200,000 hours. What is the system MTTF?
Solution:
failure rate = 2 × 1/1,000,000 + 1/200,000 = 7/1,000,000
MTTF = 1,000,000/7 = 143,000 hours.
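The calculation in Example 1 can be checked with a few lines of code. The sketch below assumes, as the solution does implicitly, that component failures are independent, that failure rates are constant, and that the system fails as soon as any one component fails, so the rates simply add. The function name `system_mttf` is chosen here for illustration.

```python
def system_mttf(component_mttfs):
    """MTTF of a system that fails when any component fails.

    Assumes independent components with constant failure rates,
    so the system failure rate is the sum of the component rates.
    """
    failure_rate = sum(1.0 / mttf for mttf in component_mttfs)
    return 1.0 / failure_rate


# Two disks (1,000,000 hours each) and one power supply (200,000 hours).
mttf = system_mttf([1_000_000, 1_000_000, 200_000])
print(round(mttf))  # 142857, which the notes round to 143,000 hours
```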
Example 2:
A disk drive has a MTTF of 1,000,000 hours. What is the probability that it will last 50 years without failure?
Solution:
Hours per year = 24*365.25 = 8766.
Probability of dying in a given year: 8766/1,000,000 = .008766
Probability of not dying in a given year: 1 - .008766 = .991234
Probability of not dying in 50 years: $(.991234)^{50} = .6439$
What is wrong with this solution?
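The arithmetic in the solution to Example 2 can be reproduced as follows. This sketch only checks the numbers as stated. It does not address the question of what is wrong with the reasoning. The variable names are chosen here for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25          # 8766 hours
MTTF_HOURS = 1_000_000

p_fail_year = HOURS_PER_YEAR / MTTF_HOURS   # .008766
p_survive_year = 1.0 - p_fail_year          # .991234
p_survive_50_years = p_survive_year ** 50

print(round(p_survive_50_years, 4))  # 0.6439
```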

Section 1.8: Measuring, Reporting, and Summarizing Performance

A computer user is interested in response time
An operator of a warehouse-scale computer is interested in throughput

Comparing two computers

Examples (Also see ClassQue Questions)
  1. Machine A is 40% faster than B and B is 40% faster than C. How much faster is A than C?
    Solution:  $\dfrac{E_B}{E_A} = 1.4$ and $\dfrac{E_C}{E_B} = 1.4$, so $\dfrac{E_C}{E_A} = 1.4 \times 1.4 = 1.96$. That is, A is 96% faster than C.
  2. Machine performance increases by 40% per year for 10 years. By what percentage does performance increase over this period?
    Solution:   $1.4^{10} = 28.93$. This is $1 + n/100$ for $n = 2793$.
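Both examples above multiply execution-time ratios rather than adding percentages. A minimal sketch of that rule follows. The helper name `combined_percent_faster` is an assumption made here, not something from the notes.

```python
def combined_percent_faster(*percents):
    """Combine chained 'n% faster' steps by multiplying the ratios 1 + n/100."""
    ratio = 1.0
    for p in percents:
        ratio *= 1.0 + p / 100.0
    return (ratio - 1.0) * 100.0


# Example 1: A is 40% faster than B, and B is 40% faster than C.
print(round(combined_percent_faster(40, 40), 2))   # 96.0

# Example 2: 40% per year, compounded over 10 years.
print(round(combined_percent_faster(*[40] * 10)))  # 2793
```

Adding the percentages instead would give 80% and 400%, which shows why the ratios have to be multiplied.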

Time


Today's News: September 7, 2012
Recitations start next week.
Be sure to bring a calculator.

Benchmarks


Types of benchmarks:
Tricks people play
SPEC: Standard Performance Evaluation Corporation
Summarizing Performance Results

Section 1.9: Quantitative Principles of Computer Design

Take advantage of parallelism

Principle of locality

Focus on the common case

Amdahl's law

A given enhancement will only improve a part of a program.
Example: on a multicore machine, some of the code will only use one core.

Speedup
$\text{Speedup} = \dfrac{\text{Performance for the entire task using the enhancement when possible}}{\text{Performance for the entire task without using the enhancement}}$

or
$\text{Speedup} = \dfrac{\text{Execution time for the entire task without using the enhancement}}{\text{Execution time for the entire task using the enhancement when possible}}$

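Amdahl's law depends on: $\text{Fraction}_{\text{enhanced}}$ and $\text{Speedup}_{\text{enhanced}}$
Note: $\text{Fraction}_{\text{enhanced}}$ is relative to the original design.
$\text{Fraction}_{\text{unenhanced}} = 1 - \text{Fraction}_{\text{enhanced}}$
$\text{Execution time}_{\text{old}} = \text{Execution time}_{\text{old-unenhanced}} + \text{Execution time}_{\text{old-enhanced}}$
$\text{Execution time}_{\text{old}} = \text{Execution time}_{\text{old}} \times \text{Fraction}_{\text{unenhanced}} + \text{Execution time}_{\text{old}} \times \text{Fraction}_{\text{enhanced}}$
$\text{Execution time}_{\text{new}} = \text{Execution time}_{\text{old}} \times \text{Fraction}_{\text{unenhanced}} + \dfrac{\text{Execution time}_{\text{old}}}{\text{Speedup}_{\text{enhanced}}} \times \text{Fraction}_{\text{enhanced}}$
$\text{Execution time}_{\text{new}} = \text{Execution time}_{\text{old}} \times \left(\text{Fraction}_{\text{unenhanced}} + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}\right)$
$\text{Speedup}_{\text{overall}} = \dfrac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}}$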


Amdahl's Law:
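$\text{Speedup}_{\text{overall}} = \dfrac{1}{\text{Fraction}_{\text{unenhanced}} + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$
or
$\text{Speedup}_{\text{overall}} = \dfrac{1}{1 - \text{Fraction}_{\text{enhanced}} + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$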

You must be able to use this formula, but you also need to understand when the formula cannot be used directly.
You can only use the formula directly when you know both the enhanced fraction and the enhanced speedup.
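Amdahl's Law as stated above translates directly into a short function. This is a sketch: the name `amdahl_speedup`, the input checks, and the sample values (half the time enhanced, enhanced part twice as fast) are assumptions made here for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup from Amdahl's Law.

    fraction_enhanced is the fraction of the OLD execution time that
    can use the enhancement; speedup_enhanced is the speedup of that part.
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    fraction_unenhanced = 1.0 - fraction_enhanced
    return 1.0 / (fraction_unenhanced + fraction_enhanced / speedup_enhanced)


# Hypothetical values: half of the old time is enhanced, and that half runs 2x faster.
print(round(amdahl_speedup(0.5, 2.0), 4))     # 1.3333

# Even an infinitely fast enhancement is limited by the unenhanced part.
print(amdahl_speedup(0.5, float("inf")))      # 2.0
```

The second call illustrates the upper bound of 1 / (1 - fraction enhanced) on the overall speedup.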

Examples
  1. A new design makes the floating point processor of the CPU 80% faster than before. What is the overall speedup for a task in which floating point operations took up 30% of the CPU time with the old design?
    Solution: The enhanced fraction is 30% = .3. The unenhanced fraction is .7. The enhanced speedup is 1.8. The overall speedup is $\dfrac{1}{.7 + .3/1.8} = 1.1538$.
  2. A new design makes the floating point processor of the CPU 80% faster than before. What is the overall speedup for a task in which floating point operations took up 10% of the CPU time with the new design?
    Solution: We cannot use the formula directly since the enhanced fraction in the formula is based on the old design.
    Method 1: first calculate $\text{Fraction}_{\text{enhanced}}$ (relative to the old design).
    Suppose the total execution time on the new system is E.
    The unenhanced time on the new system is .9E and the enhanced time is .1E.
    The unenhanced time on the old system is still .9E and the enhanced time is .1E × 1.8 = .18E.
    The fraction enhanced (relative to the old system) is $\dfrac{.18E}{.9E + .18E} = .1667$.
    The speedup is $\dfrac{1}{1 - .1667 + .1667/1.8} = 1.08$.
    Method 2: Use the definition of Speedup directly
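    Relative to the new design, the enhanced fraction is .1 and the unenhanced fraction is .9.
    $\text{Execution time}_{\text{new}} = .9 \times \text{Execution time}_{\text{new}} + .1 \times \text{Execution time}_{\text{new}}$
    $\text{Execution time}_{\text{old}} = .9 \times \text{Execution time}_{\text{new}} + .1 \times \text{Execution time}_{\text{new}} \times 1.8$
    $\text{Speedup}_{\text{overall}} = \dfrac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \dfrac{.9 \times \text{Execution time}_{\text{new}} + .1 \times \text{Execution time}_{\text{new}} \times 1.8}{\text{Execution time}_{\text{new}}} = .9 + .18 = 1.08$.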
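The two examples above can be checked numerically. The sketch below restates the Amdahl formula in a helper (`amdahl_speedup`, a name chosen here) and then follows Method 1 for the second example: it first converts the new-design fraction into a fraction of the old execution time. Times are measured in units of the new execution time, which is an arbitrary choice.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Amdahl's Law; fraction_enhanced is relative to the OLD design."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


SPEEDUP_ENHANCED = 1.8  # "80% faster"

# Example 1: floating point took 30% of the time on the old design.
print(round(amdahl_speedup(0.30, SPEEDUP_ENHANCED), 4))           # 1.1538

# Example 2: floating point takes 10% of the time on the NEW design.
new_fraction = 0.10
old_enhanced_time = new_fraction * SPEEDUP_ENHANCED               # .18 (new time = 1)
old_total_time = (1.0 - new_fraction) + old_enhanced_time         # 1.08
old_fraction = old_enhanced_time / old_total_time
print(round(old_fraction, 4))                                     # 0.1667
print(round(amdahl_speedup(old_fraction, SPEEDUP_ENHANCED), 2))   # 1.08
```

Since the new execution time is taken as 1, `old_total_time` is itself the overall speedup, which is what Method 2 computes directly.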

Processor Performance Equations

Dependencies

More on CPI

Examples
  1. In a particular task, 23% of the instructions are floating point instructions, each of which takes 5 cycles to execute. All other instructions take 1 cycle to execute. What is the average CPI for this task?
    Solution: CPI = .23 × 5 + .77 × 1 = 1.92
  2. A new design can reduce the number of cycles for a floating point operation to 4, without changing the clock speed. What is the new CPI and what is the expected speedup?
    Solution: CPI = .23 × 4 + .77 × 1 = 1.69
    Speedup = 1.92/1.69 = 1.136
  3. What is wrong with the following method of solving the above problem using the Amdahl's Law equation?
    $\text{Fraction}_{\text{enhanced}} = .23$, $\text{Fraction}_{\text{unenhanced}} = .77$, and $\text{Speedup}_{\text{enhanced}} = 5/4 = 1.25$. Therefore
    $\text{Speedup}_{\text{overall}} = \dfrac{1}{.77 + .23/1.25} = 1.0482$
    Answer: .23 is the fraction of instructions that are enhanced, not the time spent executing the enhanceable part.
    The correct enhanced fraction is (.23 × 5)/(.23 × 5 + .77) = 1.15/1.92 = .5990. This gives
    $\text{Speedup}_{\text{overall}} = \dfrac{1}{.4010 + .5990/1.25} = 1.136$
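The three CPI examples above can be reproduced with a short script. This is a sketch under the assumptions the examples already make: the instruction count and clock rate are unchanged, so the speedup equals the ratio of the CPIs. The helper name `average_cpi` and the variable names are chosen here for illustration.

```python
def average_cpi(mix):
    """Average CPI for a list of (instruction fraction, cycles) pairs."""
    return sum(fraction * cycles for fraction, cycles in mix)


old_cpi = average_cpi([(0.23, 5), (0.77, 1)])
new_cpi = average_cpi([(0.23, 4), (0.77, 1)])
speedup = old_cpi / new_cpi   # same instruction count and clock rate assumed

# Incorrect use of Amdahl's Law: .23 is a fraction of instructions, not of time.
naive = 1.0 / (0.77 + 0.23 / 1.25)

# Correct use: the enhanced fraction is the share of old execution TIME.
time_fraction = (0.23 * 5) / old_cpi
correct = 1.0 / ((1.0 - time_fraction) + time_fraction / 1.25)

print(f"{old_cpi:.2f} {new_cpi:.2f} {speedup:.3f}")          # 1.92 1.69 1.136
print(f"{naive:.4f} {time_fraction:.4f} {correct:.3f}")      # 1.0482 0.5990 1.136
```

The CPI ratio and the time-based Amdahl calculation agree, while the instruction-count fraction understates the speedup.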

Today's News: September 10, 2012
Recitations start this week.
Be sure to bring a calculator.

Section 1.10: Putting It All Together


Section 1.11: Fallacies and Pitfalls

Fallacy: a commonly held misbelief
Pitfall: an easily made mistake

Fallacies

Pitfalls

