CS 3843 Computer Organization
Notes on Chapter 3: Sections 3.6 and 3.7

Sections 3.6 and 3.7 Overview

Conditional jump instructions are typically used as follows:
     cmpl %eax, %ebx
     ja   .label
     ...
.label:

The instruction:
cmpl %eax, %ebx
is interpreted as "compare %ebx to %eax".
Note that the operand order may seem backwards for now.
The ja instruction (jump if above) transfers control if the previous comparison found that %ebx was above %eax.
When doing unsigned comparisons, we use above and below.

If we wanted to do a signed comparison we would use jg instead of ja.
For signed comparisons, we use greater and less.
The code would look like:
     cmpl %eax, %ebx
     jg   .label
     ...
.label:

Notice that there is no change in the cmpl instruction.
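
The same distinction shows up in C, where the operand types select the kind of comparison. The following is a minimal sketch: the function names is_above and is_greater are made up for illustration, and the instruction choices in the comments are what a compiler would typically emit, not a guarantee.

```c
#include <stdint.h>

/* Unsigned comparison: typically compiled with ja/jb (or seta/setb). */
int is_above(uint32_t b, uint32_t a) {
    return b > a;
}

/* Signed comparison of the same bit patterns: typically jg/jl (or setg/setl). */
int is_greater(int32_t b, int32_t a) {
    return b > a;
}
```

With the bit pattern 0xFFFFFFFF as the first argument and 1 as the second, is_above returns 1 (4294967295 is above 1), while is_greater returns 0 (as a signed value the pattern is -1).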

How this works
The cmp instructions set or clear 4 flags.
A flag has a boolean value (true or false) and each can be represented by a single bit.
If the value of the bit is 1, we say the bit is set, or is true.
If the value of the bit is 0, we say the bit is clear, or is false.
The 4 flags that are used are the carry flag, zero flag, sign flag, and overflow flag.
The instruction cmpl %eax, %ebx calculates (%ebx-%eax) and sets the flags based on the result.

The zero flag (ZF) tells us if the result was 0.
So je (jump on equal) will just use the zero flag.
The carry flag (CF) is set if the subtraction (of unsigned values) needs a borrow, that is, the true difference is negative and the 32-bit result is therefore wrong.
The jb (jump on below) will just use the CF flag, because CF is set exactly when %ebx is smaller than %eax (as unsigned numbers).
The jbe (jump if below or equal) will use both of these flags since we want to jump if either is true.
Note that the ja (jump on above) can use the same 2 flags since this is the opposite of jbe.
Since jbe will jump if (CF|ZF) is set, ja will jump if ~(CF|ZF) = ~CF&~ZF.
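
The unsigned case can be modeled in C. This is a sketch only: the type unsigned_flags and the function names are hypothetical helpers, and the model covers just CF and ZF for a 32-bit cmpl.

```c
#include <stdint.h>

/* Hypothetical helper type: the two flags used by unsigned comparisons. */
typedef struct { int cf, zf; } unsigned_flags;

/* Model of cmpl s2, s1, which computes s1 - s2 and keeps only the flags. */
unsigned_flags cmp_unsigned(uint32_t s2, uint32_t s1) {
    unsigned_flags f;
    uint32_t diff = s1 - s2;   /* wraps modulo 2^32, like the hardware */
    f.zf = (diff == 0);        /* ZF: the result was zero */
    f.cf = (s1 < s2);          /* CF: the subtraction needed a borrow */
    return f;
}

int jb_taken (unsigned_flags f) { return f.cf; }
int jbe_taken(unsigned_flags f) { return f.cf | f.zf; }
int ja_taken (unsigned_flags f) { return !f.cf & !f.zf; }   /* ~(CF|ZF) */
```

For example, ja_taken(cmp_unsigned(1, 0xFFFFFFFF)) is 1, and for equal operands jbe_taken is 1 while ja_taken is 0.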

Section 3.6: Control

IA32 uses jump instructions for transfer of control.

Section 3.6.1: Condition Codes

IA32 uses four single-bit flags called condition codes which are set by certain instructions based on the result of the instruction.
Flag  Name           Use
CF    carry flag     carry out of most significant bit: for unsigned overflow
ZF    zero flag      zero
SF    sign flag      sign bit is set
OF    overflow flag  two's complement overflow: for signed overflow;
                     true if the sign bit is not correct


OF flag: result of add or sub has wrong sign


CF flag: carry out of high bit

The following instructions set the condition codes appropriately:
inc, dec, neg, add, sub, mul, imul, xor, or, and, sal, shl, sar, shr
(Strictly speaking, not leaves the flags unchanged, and div and idiv leave them undefined.)

The following instructions do not modify the condition codes:
mov, leal, push, pop, call, ret, cltd


Today's News: March 7
Assignment 2 regrade due today.


In addition, there are test and compare instructions which take two operands:
cmp S2, S1 : cmpb, cmpw, cmpl : performs S1 - S2
test S2, S1 : testb, testw, testl : performs S1 & S2
These do not store the result of the computation in the destination; they only set the condition codes.
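
As a sketch of what test does, the following C models testl for ZF and SF only (the real instruction also clears CF and OF). The type test_flags and the function testl_model are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper type: the flags of interest after a test. */
typedef struct { int zf, sf; } test_flags;

/* Model of testl s2, s1: AND the operands and keep only the flags. */
test_flags testl_model(uint32_t s2, uint32_t s1) {
    test_flags f;
    uint32_t r = s1 & s2;      /* the real instruction discards this result */
    f.zf = (r == 0);           /* ZF: the AND was zero */
    f.sf = (int)(r >> 31);     /* SF: sign bit of the 32-bit result */
    return f;
}
```

A common idiom is to test a register against itself, for example testl %eax, %eax: since x & x is x, the flags then say whether x is zero or negative.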

Section 3.6.2: Accessing (and Understanding) the Condition Codes

You can set a byte to 0 or 1 based on the condition flags with the set instructions.
These take a single byte operand as the destination: either an 8-bit register or a single byte of memory.
The variations are:
instruction  synonym  effect              description
sete D       setz     D = ZF              equal or zero
setne D      setnz    D = ~ZF             not equal or not zero
sets D                D = SF              negative
setns D               D = ~SF             nonnegative
setg D       setnle   D = ~(SF^OF)&~ZF    signed greater
setge D      setnl    D = ~(SF^OF)        signed greater or equal
setl D       setnge   D = SF^OF           signed less
setle D      setng    D = (SF^OF)|ZF      signed less or equal
seta D       setnbe   D = ~CF&~ZF         unsigned above
setae D      setnb    D = ~CF             unsigned above or equal
setb D       setnae   D = CF              unsigned below
setbe D      setna    D = CF|ZF           unsigned below or equal

Note that the carry flag (CF) is used for unsigned comparisons while the combination of SF and OF is used for signed comparisons.

The important part of this table is the effect field which shows how the 4 condition codes are related to various tests.
The entries in red are easiest to understand. Others from the same group can be easily derived from these.

The description field is based on a previous instruction of the form:
cmp S2, S1
negative refers to the value of S1 - S2
greater, less, above, or below refer to comparing S1 to S2.
For example, greater is true if S1 is greater than S2
Note that the order might seem backwards.
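
The signed entries of the table can be checked with a small C model of cmpl. This is a sketch under stated assumptions: signed_flags and the function names are hypothetical, and the overflow test uses the standard rule for subtraction (the operands have different signs and the result's sign differs from that of S1).

```c
#include <stdint.h>

/* Hypothetical helper type: the flags used by signed comparisons. */
typedef struct { int zf, sf, of; } signed_flags;

/* Model of cmpl s2, s1, which computes s1 - s2 and keeps only the flags. */
signed_flags cmp_signed(int32_t s2, int32_t s1) {
    signed_flags f;
    uint32_t a = (uint32_t)s1, b = (uint32_t)s2;
    uint32_t diff = a - b;                        /* wraps like the hardware */
    f.zf = (diff == 0);
    f.sf = (int)(diff >> 31);                     /* sign bit of the result */
    f.of = (int)(((a ^ b) & (a ^ diff)) >> 31);   /* result has the wrong sign */
    return f;
}

int setl_value (signed_flags f) { return f.sf ^ f.of; }
int setle_value(signed_flags f) { return (f.sf ^ f.of) | f.zf; }
int setg_value (signed_flags f) { return !(f.sf ^ f.of) & !f.zf; }
int setge_value(signed_flags f) { return !(f.sf ^ f.of); }
```

Comparing the most negative int against 1 shows why SF alone is not enough: the subtraction overflows, SF is 0, OF is 1, and SF^OF still reports "less".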

cmpl Example 1

Question:
Consider the following code segment:
cmpl   $10, $20
jle     .L1

Does this jump?
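Answer: No. The comparison computes 20 - 10 = 10, and jle jumps only if 20 <= 10, which is false. (A real cmpl cannot take an immediate as its second operand; read $20 as standing for a register that holds 20.)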

You should be able to derive the effect column for each line of the above table.
The ones in red are the simplest, and should be relatively easy. The others in a group can be derived from the red one.

Unsigned comparisons
Example: derive the flags conditions for seta from those of setb:
  • setb is determined by the CF, so setbe is determined by CF|ZF
  • seta is the opposite of setbe, so seta is determined by ~(CF|ZF) = ~CF&~ZF


Signed comparisons
Example: derive the flags conditions for setle from those of setl:
  • setl is determined by SF^OF, so setle is determined by (SF^OF)|ZF


See a summary of discussion of flags here.



Section 3.6.3: Jump Instructions and Their Encoding
Section 3.6.4: Translating Conditional Branches

Jump instructions change the flow of control so that the next instruction executed is not the one that follows in memory.

In the traditional instruction cycle, also called the fetch-and-execute cycle or fetch-decode-execute cycle, the program counter (PC) register contains the address of the next instruction to execute.
The processor fetches that instruction, advances the PC past it, executes it, and continues doing this in a loop forever.

A jump instruction is one that modifies the PC during the execute phase.

The IA32 has two types of unconditional jump instructions:
direct and indirect.
jmp Label
jmp *Operand (Operand is one of the addressing modes)

Unconditional jumps are rarely used, except with conditional jumps.
Examples to follow.

IA32 Conditional Jump instructions.
There are no indirect conditional jumps.
This table is similar to the table for set.
instruction  synonym  Jump Condition    description
je Label     jz       ZF                equal or zero
jne Label    jnz      ~ZF               not equal or not zero
js Label              SF                negative
jns Label             ~SF               nonnegative
jg Label     jnle     ~(SF^OF)&~ZF      signed greater
jge Label    jnl      ~(SF^OF)          signed greater or equal
jl Label     jnge     SF^OF             signed less
jle Label    jng      (SF^OF)|ZF        signed less or equal
ja Label     jnbe     ~CF&~ZF           unsigned above
jae Label    jnb      ~CF               unsigned above or equal
jb Label     jnae     CF                unsigned below
jbe Label    jna      CF|ZF             unsigned below or equal



Today's News: March 17
Welcome back from spring break.

Jump Example
Consider the C program in jump.c
int simple_jump(int x, int y, int z) {
   if (x == 0)
      return y-z;
   return z-y;
}


After cc -O1 -S jump.c, jump.s contains:
simple_jump:
        pushl   %ebp
        movl    %esp, %ebp

        cmpl    $0, 8(%ebp)     // compare x to 0
        jne     .L2             // jump if x != 0
                                // get here if x == 0
        movl    12(%ebp), %eax  // y into %eax
        subl    16(%ebp), %eax  // y - z into %eax
        jmp     .L3             // done
.L2:                            // this is the case x != 0
        movl    16(%ebp), %eax  // z in %eax
        subl    12(%ebp), %eax  // z - y in %eax

.L3:                            // common return
        popl    %ebp
        ret
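
The control flow of this assembly can be written back in C using goto. This is only an illustrative sketch: the name simple_jump_goto and the labels L2 and L3 are chosen to mirror the listing and are not compiler output.

```c
/* Hypothetical goto-style rewrite of simple_jump that mirrors the assembly. */
int simple_jump_goto(int x, int y, int z) {
    int result;
    if (x != 0)          /* cmpl $0, 8(%ebp) followed by jne .L2 */
        goto L2;
    result = y - z;      /* fall through: this is the case x == 0 */
    goto L3;             /* jmp .L3 */
L2:
    result = z - y;      /* this is the case x != 0 */
L3:
    return result;       /* common return */
}
```

Note that the compiler inverted the test: the C source checks x == 0, but the assembly jumps when x != 0 and falls through otherwise.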


Jump instructions can be encoded in several ways.
The simplest uses a PC-relative destination.
After cc -c -O1 jump.c and objdump -d jump.o we get:
00000000 <simple_jump>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 7d 08 00             cmpl   $0x0,0x8(%ebp)
   7:   75 08                   jne    11 <simple_jump+0x11>
   9:   8b 45 0c                mov    0xc(%ebp),%eax
   c:   2b 45 10                sub    0x10(%ebp),%eax
   f:   eb 06                   jmp    17 <simple_jump+0x17>
  11:   8b 45 10                mov    0x10(%ebp),%eax
  14:   2b 45 0c                sub    0xc(%ebp),%eax
  17:   5d                      pop    %ebp
  18:   c3                      ret    

In the disassembled code, labels have been replaced by addresses relative to the start of the code. All addresses and offsets are in hexadecimal.
During the execute phase of the jne instruction at 7, the PC has the value 9, the address of the next instruction.
The encoding of jne (75 08) shows a jump offset of 8, and 9 + 8 = 11.
During execution of the jmp instruction at f, the PC has the value 11.
The jump offset is 6, giving 11 + 6 = 17.
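
The target calculation can be sketched in C. The function name short_jump_target is hypothetical, and the sketch covers only the two-byte form seen in the listing (one opcode byte followed by a signed 8-bit offset).

```c
#include <stdint.h>

/* Target of a short jump: address of the next instruction plus a signed offset. */
uint32_t short_jump_target(uint32_t jump_addr, int8_t rel8) {
    const uint32_t length = 2;                  /* opcode byte + offset byte */
    uint32_t next_pc = jump_addr + length;      /* PC during the execute phase */
    return next_pc + (uint32_t)(int32_t)rel8;   /* sign-extend, then add */
}
```

With the values from the listing, short_jump_target(0x7, 0x08) gives 0x11 and short_jump_target(0xf, 0x06) gives 0x17. Because the offset is signed, a negative value produces a backward jump, which is how loops are encoded.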



Section 3.6.5: Loops

Loop Example 1: a do-while loop
int fact_do(int n) {
   int result = 1;
   do {
      result *=n;
      n--;
   } while (n > 1);
   return result;
}


and the corresponding assembly code:
fact_do:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %edx  // n in %edx
        movl    $1, %eax       // result is in %eax
.L2:
        imull   %edx, %eax     // result = result * n
        subl    $1, %edx       // n--;
        cmpl    $1, %edx       // compare 1 to n
        jg      .L2            // jump if n > 1
        popl    %ebp
        ret
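
The structure of this assembly corresponds to the following goto form of the loop. It is a sketch for illustration; the name fact_do_goto and the label loop are not from the compiler.

```c
/* Hypothetical goto-style rewrite of fact_do that mirrors the assembly. */
int fact_do_goto(int n) {
    int result = 1;
loop:                      /* .L2 */
    result *= n;           /* imull %edx, %eax */
    n--;                   /* subl $1, %edx */
    if (n > 1)             /* cmpl $1, %edx followed by jg .L2 */
        goto loop;
    return result;
}
```

The body always runs at least once, so fact_do_goto(0) returns 0 rather than 1, exactly as the original do-while loop does.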


Loop Example 2: A while loop
int fact_while(int n) {
   int result = 1;
   while (n > 1) {
      result *= n;
      n--;
   }
   return result;
}


and the corresponding assembly code:
fact_while:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %edx   // n in %edx
        movl    $1, %eax        // result in %eax
        cmpl    $1, %edx        // see if n > 1
        jle     .L3             // no, we are done
.L6:
        imull   %edx, %eax      // result = result * n
        subl    $1, %edx        // n--;
        cmpl    $1, %edx        // compare again
        jg      .L6             // keep going if n > 1
.L3:
        popl    %ebp
        ret

Note the use of the test before entering the loop and again at the end of the loop.
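
In goto form, the two tests look like this. The sketch below uses the hypothetical name fact_while_goto, and its labels mirror .L6 and .L3 in the listing.

```c
/* Hypothetical goto-style rewrite of fact_while that mirrors the assembly. */
int fact_while_goto(int n) {
    int result = 1;
    if (n <= 1)            /* cmpl $1, %edx followed by jle .L3 */
        goto done;
loop:                      /* .L6 */
    result *= n;           /* imull %edx, %eax */
    n--;                   /* subl $1, %edx */
    if (n > 1)             /* cmpl $1, %edx followed by jg .L6 */
        goto loop;
done:                      /* .L3 */
    return result;
}
```

After the initial guard, the loop is the same as the do-while loop: the compiler has turned the while loop into a guarded do-while.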


Loop Example 3: A for loop
int fact_for(int n) {
   int i;
   int result = 1;
   for (i=2; i <= n; i++)
      result *=i;
   return result;
}


and the corresponding assembly code:
fact_for:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %ecx  // n in %ecx
        movl    $2, %edx       // 2 in %edx (this is i)
        movl    $1, %eax       // 1 into %eax (the result)
        cmpl    $1, %ecx       // compare n to 1
        jle     .L3            // done if n <= 1 (continue if n >= 2)
.L6:
        imull   %edx, %eax    // result = result * i;
        addl    $1, %edx      // i++
        cmpl    %edx, %ecx    // compare n to i
        jge     .L6           // continue if n >= i
.L3:
        popl    %ebp
        ret

You can find a trace of this example here
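
The for loop follows the same pattern as the while loop. In the sketch below (fact_for_goto is a hypothetical name), the initial test i <= n with i equal to 2 appears as the guard n <= 1, matching the cmpl $1, %ecx in the listing.

```c
/* Hypothetical goto-style rewrite of fact_for that mirrors the assembly. */
int fact_for_goto(int n) {
    int i = 2;             /* movl $2, %edx */
    int result = 1;        /* movl $1, %eax */
    if (n <= 1)            /* cmpl $1, %ecx followed by jle .L3 */
        goto done;
loop:                      /* .L6 */
    result *= i;           /* imull %edx, %eax */
    i++;                   /* addl $1, %edx */
    if (n >= i)            /* cmpl %edx, %ecx followed by jge .L6 */
        goto loop;
done:                      /* .L3 */
    return result;
}
```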


We will skip sections 3.6.6 and 3.6.7

Today's News: March 19

Section 3.7: Procedures

A procedure involves:
Section 3.7.1: Stack Frame Structure

The stack is used for passing parameters, for local variables, and storing other values.

The stack is organized into pieces called Stack Frames.

See
Figure 3.21 from the book.

Look at the current frame in the diagram.
All procedures start with the following two instructions:
pushl %ebp
movl %esp, %ebp

Notice the following:


Section 3.7.2: Transferring Control

Three instructions used for supporting procedures:
call label
leave
ret

call can also have the form call *Operand, but we will not be using it.

call pushes the return address (the current PC, which is the address of the instruction after the call) on the stack and sets the PC to the label.
leave is equivalent to:
movl %ebp, %esp
popl %ebp

The purpose of the first of these is to restore the stack pointer to the value it had after the initial push of %ebp.
We have not seen leave before because none of our procedures have needed to change %esp, so the first of these was not necessary.

ret pops the return address into the PC
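
The effect of these three instructions on the stack can be sketched with a toy machine in C. Everything here is a simplified model with hypothetical names: the stack is an array of words, and esp and ebp are word indices rather than byte addresses.

```c
#include <stdint.h>

/* Toy machine state; all names are hypothetical. */
typedef struct {
    uint32_t mem[64];   /* word-addressed stack memory */
    uint32_t esp;       /* index of the top-of-stack word (stack grows down) */
    uint32_t ebp;       /* frame pointer */
    uint32_t pc;        /* program counter */
} machine;

void push(machine *m, uint32_t v) { m->esp -= 1; m->mem[m->esp] = v; }
uint32_t pop(machine *m) { uint32_t v = m->mem[m->esp]; m->esp += 1; return v; }

/* call: push the address of the instruction after the call, then jump. */
void do_call(machine *m, uint32_t return_addr, uint32_t target) {
    push(m, return_addr);
    m->pc = target;
}

/* leave: movl %ebp, %esp followed by popl %ebp */
void do_leave(machine *m) { m->esp = m->ebp; m->ebp = pop(m); }

/* ret: pop the return address into the PC. */
void do_ret(machine *m) { m->pc = pop(m); }
```

A typical sequence is do_call, then the prologue (push the old ebp and copy esp into ebp), then some stack allocation, then do_leave and do_ret. Afterwards esp, ebp, and pc are back to the values the caller expects.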

Section 3.7.3: Register Usage Conventions

IA32 has 8 32-bit registers.

Section 3.7.4: A Procedure Example from the book

Example: swap_add (from book)

int swap_add(int *xp, int *yp) {
   int x = *xp;
   int y = *yp;
   *xp = y;
   *yp = x;
   return x + y;
}


And here is the caller:
int caller() {
   int arg1 = 534;
   int arg2 = 1057;
   int sum = swap_add(&arg1, &arg2);
   int diff = arg1 - arg2;
   return sum*diff;
}


See Figure 3.24 from the book.

Here is the assembly code generated for swap_add:
swap_add:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movl    8(%ebp), %edx   // xp
        movl    12(%ebp), %ecx  // yp
        movl    (%edx), %ebx    // x
        movl    (%ecx), %eax    // y
        movl    %eax, (%edx)    // *xp = y
        movl    %ebx, (%ecx)    // *yp = x
        addl    %ebx, %eax      // x + y for return
        popl    %ebx
        popl    %ebp
        ret


And here is the assembly language code for the caller:
caller:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp      // allocate 6 words on the stack
        movl    $534, -4(%ebp) // 534 on stack
        movl    $1057, -8(%ebp)// 1057 on stack
        leal    -8(%ebp), %eax // address of 1057 into %eax
        movl    %eax, 4(%esp)  // address of 1057 on stack
        leal    -4(%ebp), %eax // address of 534 into %eax
        movl    %eax, (%esp)   // address of 534 on stack
        call    swap_add
.R1:    movl    -4(%ebp), %edx // arg1 into %edx
        subl    -8(%ebp), %edx // arg1 - arg2 into %edx
        imull   %edx, %eax     // diff * return value in %eax
        leave                  // restore the stack pointer
        ret

I added the .R1 label to aid in tracing.
You can find a trace of this example here
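
To check a trace by hand, it helps to know the values involved. The sketch below repeats swap_add from above and adds caller_traced, a hypothetical variant of caller that also reports the intermediate sum and diff.

```c
/* swap_add as given above. */
int swap_add(int *xp, int *yp) {
    int x = *xp;
    int y = *yp;
    *xp = y;
    *yp = x;
    return x + y;
}

/* Hypothetical variant of caller that also reports the intermediate values. */
int caller_traced(int *sum_out, int *diff_out) {
    int arg1 = 534;
    int arg2 = 1057;
    int sum = swap_add(&arg1, &arg2);   /* arg1 and arg2 are swapped in place */
    int diff = arg1 - arg2;             /* now 1057 - 534 */
    *sum_out = sum;
    *diff_out = diff;
    return sum * diff;
}
```

Here sum is 1591 and diff is 523, so the value returned in %eax is 1591 * 523 = 832093.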


Why did the compiler reserve 6 words = 24 bytes on the stack when it only needed 3 words?


Today's News: March 21


Section 3.7.5: Recursive Procedures

Recursive Factorial
int rfact(int n) {
   int result;
   if (n < 1)
      result = 1;
   else
      result = n * rfact(n-1);
   return result;
}


and the corresponding assembly code:
rfact:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx              // %ebx is a callee-save register
        subl    $4, %esp          // reserve 4 extra bytes on the stack
        movl    8(%ebp), %ebx     // n into %ebx
        movl    $1, %eax          // 1 into %eax
        testl   %ebx, %ebx        // test n
        jle     .L3               // jump if n <= 0  (same as n < 1)
        leal    -1(%ebx), %eax    // %eax = n - 1
        movl    %eax, (%esp)      // move n-1 onto the stack
        call    rfact             // call rfact with parameter n-1
.R1:
        imull   %ebx, %eax        // n * return value into %eax for return
.L3:
        addl    $4, %esp          // restore %esp
        popl    %ebx              // restore %ebx
        popl    %ebp
        ret

I have added the label .R1 so we can use it in tracing.
You can find a trace of this example here
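
One way to explore how deep the recursion goes is to instrument the C code. The sketch below is an assumption-laden model: rfact_depth, max_depth, and rfact_stack_bytes are hypothetical names, and the 16 bytes per call are read off the assembly above (return address, saved %ebp, saved %ebx, and the 4 bytes reserved by subl $4, %esp).

```c
/* Bytes added to the stack by each call of rfact in the assembly above. */
enum { RFACT_FRAME_BYTES = 16 };

static int max_depth = 0;   /* hypothetical instrumentation, not in the original */

/* rfact with an extra parameter that records how deep the recursion goes. */
int rfact_depth(int n, int depth) {
    if (depth > max_depth)
        max_depth = depth;
    if (n < 1)
        return 1;
    return n * rfact_depth(n - 1, depth + 1);
}

/* Stack bytes in use by rfact frames at the deepest point of rfact(n). */
int rfact_stack_bytes(int n) {
    max_depth = 0;
    (void)rfact_depth(n, 1);
    return max_depth * RFACT_FRAME_BYTES;
}
```

For n = 5 the calls are made with 5, 4, 3, 2, 1, and 0, so six frames are active at the deepest point and rfact_stack_bytes(5) is 96.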


Recursion Efficiency


Question:
What is the maximum amount of stack space needed for rfact(n)?
Answer:


Today's News: March 24


Question:
We need to push %ebx since it is a callee-save register.
How would this change if we used a caller-save register instead?
Answer:


Question:
Compare the time for calculating n! using the for loop and the recursive method.
Answer:


