CS 3853 Computer Architecture Notes on Chapter 3 Section 1

Read Section 3.1

Today's News: November 2, 2012
No news yet.

3.1: Instruction Level Parallelism Concepts

ILP: the potential to overlap instruction execution.
In this chapter we will look at several ways to improve performance, including out-of-order execution.
When can we safely change the execution order of instructions?

Terminology

data dependence
instruction j is data dependent on instruction i if either:
  • Instruction i produces a result that may be used by instruction j, or
  • Instruction j is data dependent on instruction k and instruction k is data dependent on instruction i
    (transitive closure)
In both cases instruction j must be executed after instruction i.
Example 1:
1) DADD R1, R2, R3
2) DSUB R4, R5, R1
3) AND  R6, R7, R4
  • instruction 2 depends on instruction 1
  • instruction 3 depends on instruction 2
  • instruction 3 depends on instruction 1 (transitively, through instruction 2)
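
The dependences in Example 1 can be derived mechanically. The following is a minimal Python sketch, not a real assembler or compiler pass: the (opcode, destination, sources) tuple layout and the function names are assumptions made for illustration only.

```python
# Each instruction is (opcode, destination, sources). This layout is an
# assumption for the sketch, not a real assembler format.
example1 = [
    ("DADD", "R1", ("R2", "R3")),
    ("DSUB", "R4", ("R5", "R1")),
    ("AND",  "R6", ("R7", "R4")),
]

def direct_data_deps(program):
    """Return pairs (j, i), 1-based, where j reads the value that i produced."""
    deps = set()
    last_writer = {}  # register -> number of the latest instruction writing it
    for j, (_, dest, sources) in enumerate(program, start=1):
        for reg in sources:
            if reg in last_writer:
                deps.add((j, last_writer[reg]))
        last_writer[dest] = j
    return deps

def transitive_closure(deps):
    """Add (j, i) whenever j depends on k and k depends on i."""
    closure = set(deps)
    changed = True
    while changed:
        changed = False
        for (j, k) in list(closure):
            for (k2, i) in list(closure):
                if k == k2 and (j, i) not in closure:
                    closure.add((j, i))
                    changed = True
    return closure

direct = direct_data_deps(example1)
print(sorted(direct))                      # [(2, 1), (3, 2)]
print(sorted(transitive_closure(direct)))  # [(2, 1), (3, 1), (3, 2)]
```

The pair (3, 1) appears only after taking the transitive closure, which matches the third bullet of Example 1.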
Example 2:
1) S.D  R1, 8(R2)
2) L.D  R4, 16(R3)
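Is instruction 2 data dependent on instruction 1?
Only if both refer to the same memory location, that is, if 8+R2 equals 16+R3. This depends on the register values at run time.
Dependencies that flow through memory locations are therefore difficult to detect.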
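
To see why the answer for Example 2 depends on run-time values, here is a small Python sketch that compares effective addresses. It assumes aligned 8-byte accesses, so two accesses touch the same location exactly when their addresses are equal. The function names and the base register values are made up for illustration.

```python
def effective_address(offset, base_value):
    """Address accessed by offset(base), given the base register's value."""
    return base_value + offset

def load_depends_on_store(store_offset, store_base, load_offset, load_base):
    """True when the load reads the location the store wrote.

    Assumption: both accesses are aligned 8-byte words, so comparing the
    two effective addresses for equality is sufficient.
    """
    return (effective_address(store_offset, store_base)
            == effective_address(load_offset, load_base))

# S.D R1, 8(R2) followed by L.D R4, 16(R3)
print(load_depends_on_store(8, 1008, 16, 1000))  # True: both access 1016
print(load_depends_on_store(8, 1000, 16, 1000))  # False: 1008 versus 1016
```

The same two instructions are dependent for one pair of base values and independent for another, which is why the instruction text alone does not settle the question.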

name dependence 1: antidependence
instruction j is antidependent on instruction i:
  • Instruction j should be executed after instruction i
  • Instruction j writes to a register or memory location that instruction i reads
  • Note that data dependence can be stated as:
    Instruction i writes to a register or memory location that instruction j reads
Example 3:
1) DADD R1, R2, R3
2) DADD R2, R4, R5
  • instruction 2 is antidependent on instruction 1
  • We must make sure that instruction 1 reads from R2 before instruction 2 changes the value of R2.

name dependence 2: output dependence
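output dependence between instruction i and instruction j:
Instructions i and j write to the same register or memory location.
The two writes must stay in program order so that the location ends up with the value written by the later instruction.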
Example 4:
1) DADD R1, R2, R3
2) DADD R5, R1, R4
3) DADD R1, R6, R7
  • there is an output dependence between instructions 1 and 3
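
Both kinds of name dependence can be found with a pairwise check on register names. The Python sketch below is illustrative only: the (opcode, destination, sources) tuples and the function name are assumptions, and the check is deliberately simple (it compares every pair of instructions and handles registers only, not memory).

```python
example3 = [
    ("DADD", "R1", ("R2", "R3")),
    ("DADD", "R2", ("R4", "R5")),
]
example4 = [
    ("DADD", "R1", ("R2", "R3")),
    ("DADD", "R5", ("R1", "R4")),
    ("DADD", "R1", ("R6", "R7")),
]

def name_dependences(program):
    """Return (kind, later, earlier) triples, 1-based instruction numbers."""
    found = []
    for j in range(len(program)):
        _, dest_j, _ = program[j]
        for i in range(j):
            _, dest_i, sources_i = program[i]
            if dest_j in sources_i:
                # j overwrites a register that i still has to read
                found.append(("anti", j + 1, i + 1))
            if dest_j == dest_i:
                # i and j write the same register
                found.append(("output", j + 1, i + 1))
    return found

print(name_dependences(example3))  # [('anti', 2, 1)]
print(name_dependences(example4))  # [('output', 3, 1), ('anti', 3, 2)]
```

For Example 4 the check also reports that instruction 3 is antidependent on instruction 2, because instruction 2 reads R1 before instruction 3 overwrites it.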

name dependence 3: comments
  • Name dependence (both antidependence and output dependence) is not a true data dependence.
  • It is caused by reusing a register or memory location.
  • If a register is reused, the name dependence can be eliminated by using another register, if enough registers exist.
  • Name dependence can be difficult to detect (especially at the machine or assembly language level) when it involves memory.
  • Why is memory harder to deal with than registers?
    • It is not because memory is larger.
    • It is because of aliasing: two different address expressions, such as 8(R2) and 16(R3), may refer to the same location.
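
The remark about eliminating a register name dependence by using another register can be sketched in Python. This is a simplified illustration, not how any particular compiler or processor does it: every destination gets a fresh register from a hypothetical pool (T1, T2, ...), and the tuple layout and function name are assumptions.

```python
example4 = [
    ("DADD", "R1", ("R2", "R3")),
    ("DADD", "R5", ("R1", "R4")),
    ("DADD", "R1", ("R6", "R7")),
]

def rename(program, free_registers):
    """Give every written register a fresh name; rewrite later reads to match."""
    free = list(free_registers)
    current = {}   # original register name -> name that now holds its value
    renamed = []
    for opcode, dest, sources in program:
        new_sources = tuple(current.get(reg, reg) for reg in sources)
        if not free:
            raise RuntimeError("not enough registers to rename")
        new_dest = free.pop(0)
        current[dest] = new_dest
        renamed.append((opcode, new_dest, new_sources))
    return renamed, current

renamed, final_map = rename(example4, ["T1", "T2", "T3"])
for instruction in renamed:
    print(instruction)
# ('DADD', 'T1', ('R2', 'R3'))
# ('DADD', 'T2', ('T1', 'R4'))
# ('DADD', 'T3', ('R6', 'R7'))
print(final_map)  # {'R1': 'T3', 'R5': 'T2'}
```

After renaming, instructions 1 and 3 no longer write the same register, so the output dependence is gone, while the true data dependence of instruction 2 on instruction 1 (now through T1) remains. The RuntimeError corresponds to the "if enough registers exist" condition.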

control dependence
an instruction is control dependent on a collection of branches if whether that instruction executes depends on the outcomes of these branches.

basic block
a straight-line code sequence with no branches in except to its entry and no branches out except at its exit.
  • For MIPS, average dynamic branch frequency is 15% to 25%
  • A typical basic block therefore contains between 3 and 6 instructions.
  • Overlapping only the instructions within a basic block is not sufficient; we must exploit ILP across multiple basic blocks.
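
The basic block definition can be turned into a small partitioning routine. The Python sketch below uses a made-up instruction format, (label or None, opcode, branch target or None), and a short hypothetical code sequence. It starts a new block at every branch target and after every branch.

```python
# Hypothetical instruction format: (label or None, opcode, branch target or None)
code = [
    (None,   "DADDU", None),
    (None,   "BEQZ",  "skip"),
    (None,   "DSUBU", None),
    (None,   "DADDU", None),
    ("skip", "OR",    None),
]

def basic_blocks(code):
    """Split code into basic blocks, returned as lists of instruction indices."""
    if not code:
        return []
    targets = {target for _, _, target in code if target is not None}
    leaders = {0}                         # the first instruction starts a block
    for index, (label, _, target) in enumerate(code):
        if label in targets:
            leaders.add(index)            # a branch can enter here
        if target is not None and index + 1 < len(code):
            leaders.add(index + 1)        # the instruction after a branch
    starts = sorted(leaders)
    ends = starts[1:] + [len(code)]
    return [list(range(start, end)) for start, end in zip(starts, ends)]

print(basic_blocks(code))  # [[0, 1], [2, 3], [4]]
```

Even in this tiny sequence the blocks hold only one or two instructions, which illustrates how little there is to overlap inside a single block.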
Example 5:
for (i=0; i<=999; i++)
   x[i] = x[i] + y[i];
Each of the iterations is independent once the index is known.
Can write this as:
   x[0] = x[0] + y[0];
   x[1] = x[1] + y[1];
   x[2] = x[2] + y[2];
        ...
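
One way to check the independence claim in Example 5 is to run the iterations in a different order and compare the results. The Python sketch below does this with made-up data. The helper name and the array contents are assumptions for illustration.

```python
import random

def run_in_order(x, y, order):
    """Apply x[i] = x[i] + y[i] for each i in the given order; return a new list."""
    result = list(x)
    for i in order:
        result[i] = result[i] + y[i]
    return result

n = 1000
x = list(range(n))
y = [2 * i for i in range(n)]

forward = run_in_order(x, y, range(n))

shuffled_order = list(range(n))
random.Random(0).shuffle(shuffled_order)   # fixed seed so the run is repeatable
shuffled = run_in_order(x, y, shuffled_order)

print(forward == shuffled)  # True
```

Each iteration reads and writes only x[i] and reads y[i], so no iteration uses a value produced by another one, and any execution order gives the same arrays.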

Dependencies and Hazards


Data Hazards



Today's News: November 5, 2012
We will start by checking your Assignment 3.

Control Dependence

Example 6:
if (p1)
    S1;
if (p2)
    S2;
S1 is control dependent on p1.
S2 is control dependent on p2, but not on p1.

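Constraints imposed by control dependence:
  • An instruction that is control dependent on a branch cannot be moved before the branch.
  • An instruction that is not control dependent on a branch cannot be moved after the branch.
These requirements are stricter than necessary. What we really want is to preserve exception behavior and data flow.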
Example 7:
1)      DADDU  R2, R3, R4
2)      BEQZ   R4, skip
3)      LW     R1, 0(R2)
4) skip:
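Instruction 3 is not data dependent on instruction 2, but it cannot be moved above the branch: if the branch is taken, the load would execute when it should not, and it could raise a memory exception that the original program never raises.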
Example 8:
1)      DADDU  R1,  R2, R3
2)      BEQZ   R12, skip
3)      DSUBU  R4,  R5, R6
4)      DADDU  R5,  R4, R9
5) skip:
6)      OR     R7,  R8, R9
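Notice that R4 is used as a temporary variable.
Suppose that R4 is not used after the skip. (We say R4 is dead after the skip.)
Then we can move instruction 3 above the branch (or use it in the delay slot), provided it cannot raise an exception: on the taken path the value it writes to R4 is never read, so data flow is unchanged.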
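
The claim in Example 8 can be checked with a tiny simulation. The Python sketch below is a toy interpreter written for this example only: it handles just the four opcodes used here plus a LABEL marker, ignores delay slots and exceptions, and uses made-up initial register values.

```python
def run(program, regs):
    """Execute a tiny register program and return the final register values."""
    regs = dict(regs)
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "LABEL"}
    pc = 0
    while pc < len(program):
        ins = program[pc]
        op = ins[0]
        if op == "DADDU":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "DSUBU":
            regs[ins[1]] = regs[ins[2]] - regs[ins[3]]
        elif op == "OR":
            regs[ins[1]] = regs[ins[2]] | regs[ins[3]]
        elif op == "BEQZ" and regs[ins[1]] == 0:
            pc = labels[ins[2]]            # branch taken
            continue
        pc += 1                            # LABEL and untaken BEQZ fall through
    return regs

original = [
    ("DADDU", "R1", "R2", "R3"),
    ("BEQZ", "R12", "skip"),
    ("DSUBU", "R4", "R5", "R6"),
    ("DADDU", "R5", "R4", "R9"),
    ("LABEL", "skip"),
    ("OR", "R7", "R8", "R9"),
]
# Same code with instruction 3 moved above the branch.
hoisted = [original[0], original[2], original[1]] + original[3:]

def without(regs, dead):
    """Drop the dead register before comparing final states."""
    return {name: value for name, value in regs.items() if name != dead}

start = {f"R{n}": n for n in range(1, 13)}   # made-up values; R12 = 12
for r12 in (0, 12):                          # branch taken, then not taken
    regs = dict(start, R12=r12)
    same = without(run(original, regs), "R4") == without(run(hoisted, regs), "R4")
    print(r12, same)
# 0 True
# 12 True
```

When the branch is taken, the two versions differ only in R4, so the reordering is invisible as long as R4 really is dead after the skip.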

Exploiting ILP

