CS 3843 Computer Organization Notes on Chapter 3 Sections 3.4 and 3.5

Section 3.4: Accessing Information

IA32 Registers
There are 8 8-bit registers, 8 16-bit registers, and 8 32-bit registers.
The 16-bit registers are the low 16 bits of the 32-bit registers.
The 8-bit registers are the high and low bytes of the first four 16-bit registers (%ax, %cx, %dx, %bx).

31              15     8 7      0
 -------------------------------
|%eax        %ax|  %ah  |  %al  |
 -------------------------------
 -------------------------------
|%ecx        %cx|  %ch  |  %cl  |
 -------------------------------
 -------------------------------
|%edx        %dx|  %dh  |  %dl  |
 -------------------------------
 -------------------------------
|%ebx        %bx|  %bh  |  %bl  |
 -------------------------------
 -------------------------------
|%esi        %si|               |
 -------------------------------
 -------------------------------
|%edi        %di|               |
 -------------------------------
 -------------------------------
|%esp        %sp|               | stack pointer
 -------------------------------
 -------------------------------
|%ebp        %bp|               | base pointer
 -------------------------------

The first 6 32-bit registers can be considered general purpose registers, but historically they had specific uses.

You can modify the 8-bit registers without modifying the rest of the bits of the corresponding 32-bit register.
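The effect of writing an 8-bit register can be modeled in C. This is only a sketch: it treats %eax as a plain uint32_t, and the helper names set_al, set_ah, and get_ax are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Model of writing %al: replace bits 7..0, keep bits 31..8 unchanged. */
uint32_t set_al(uint32_t eax, uint8_t value) {
    return (eax & 0xFFFFFF00u) | value;
}

/* Model of writing %ah: replace bits 15..8, keep all other bits unchanged. */
uint32_t set_ah(uint32_t eax, uint8_t value) {
    return (eax & 0xFFFF00FFu) | ((uint32_t)value << 8);
}

/* Model of reading %ax: the low 16 bits of %eax. */
uint16_t get_ax(uint32_t eax) {
    return (uint16_t)(eax & 0xFFFFu);
}
```

For example, if %eax holds 0x12345678, writing 0xAB to %al gives 0x123456AB and writing 0xAB to %ah gives 0x1234AB78.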

Section 3.4.1: Operand Specifiers

There are 11 basic forms for operands.
One is for immediate (constant) values
One is for registers
The other 9 are for memory.

Type	Form	Operand Value	Name
Immediate	$Imm	Imm	Immediate

Register	E_a	R[E_a]	Register

Memory	Imm	M[Imm]	Absolute

Memory	(E_a)	M[R[E_a]]	Indirect
Memory	Imm(E_b)	M[Imm+R[E_b]]	Base + Displacement
Memory	(E_b,E_i)	M[R[E_b]+R[E_i]]	Indexed
Memory	Imm(E_b,E_i)	M[Imm+R[E_b]+R[E_i]]	Indexed

Memory	(,E_i,s)	M[R[E_i]*s]	Scaled Indexed
Memory	Imm(,E_i,s)	M[Imm+R[E_i]*s]	Scaled Indexed
Memory	(E_b,E_i,s)	M[R[E_b]+R[E_i]*s]	Scaled Indexed
Memory	Imm(E_b,E_i,s)	M[Imm+R[E_b]+R[E_i]*s]	Scaled Indexed
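All nine memory forms are special cases of the last one, Imm(E_b,E_i,s). The sketch below computes the address that form names. The function name effective_address is hypothetical, register contents are passed in as ordinary integers, and a missing part of the operand is represented by 0 (or by a scale of 1).

```c
#include <stdint.h>

/* Hypothetical helper: computes Imm + R[E_b] + R[E_i]*s.
   On IA32 the scale s must be 1, 2, 4, or 8.
   Arithmetic wraps modulo 2^32, as 32-bit address arithmetic does. */
uint32_t effective_address(int32_t imm, uint32_t base,
                           uint32_t index, uint32_t scale) {
    return (uint32_t)imm + base + index * scale;
}
```

With %ebp holding 0x1000, the operand 8(%ebp) names address effective_address(8, 0x1000, 0, 1), which is 0x1008. With %edx holding 0x2000 and %eax holding 3, the operand (%edx,%eax,4) names address 0x200C.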

Examples
The examples below show Immediate, Register, Base+Displacement, and Scaled indexed addressing modes.

Example 1:

The C program:

int simple(int x) { 
   return x + 17;
}

Compiles to:

simple:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        addl    $17, %eax
        popl    %ebp
        ret

Question:

Find examples of register addressing, immediate addressing, base+displacement addressing.

Answer:

Question:

What addressing modes are used in the second movl and the addl instructions?

Answer:

Example 2:

The C program:

int array(int* s, int i) { 
   return s[i];
}

Compiles to:

array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    8(%ebp), %edx
        movl    (%edx,%eax,4), %eax
        popl    %ebp
        ret

Questions:

Which instruction uses scaled addressing?
If we changed this to an array of short, could we just change the 4 to a 2?

Answers:

Example 3: Example 2 using short instead of int

The C program:

short array(short* s, int i) { 
   return s[i];
}

Compiles to:

array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    8(%ebp), %edx
        movzwl  (%edx,%eax,2), %eax
        popl    %ebp
        ret

Questions:

What does the movzwl instruction do?
What value would be returned in %eax if the array entry contained -1?
Why doesn't the compiler use movswl instead of movzwl?

Answers:

Today's News: February 24

No news yet!

Today's News: February 26

Still working on driver program!

A driver for Example 3:

Here is a driver program to test Example 3:

int main() {
   short a[2] = {5,7}; 
   short value;
   value = array(a,1);
   printf("value is %d\n",(int)value);
}

which compiles to

main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $36, %esp
        movw    $5, -8(%ebp)      // initialize a[0]
        movw    $7, -6(%ebp)      // initialize a[1]
        movl    $1, 4(%esp)       // second parameter of array
        leal    -8(%ebp), %eax    // first parameter is an address
        movl    %eax, (%esp)      // it takes 2 instructions for mem to mem
        call    array
        cwtl                      // convert word to long (signed ax to eax)
        movl    %eax, 8(%esp)     // return value is 3rd param of __printf_chk
        movl    $.LC0, 4(%esp)    // format string is 2nd param of __printf_chk
        movl    $1, (%esp)        // first parameter of __printf_chk is flags
        call    __printf_chk
        addl    $36, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret

Question:

Rewrite

leal   -8(%ebp), %eax

without using leal.

Answer:

Example 4: setting an element of an array

The C program:

void array_set(int* s, int i, int value) { 
   s[i] = value;
}

Compiles to:

array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx        // value into %ecx
        movl    12(%ebp), %edx        // i into %edx
        movl    8(%ebp), %eax         // s into %eax
        movl    %ecx, (%eax,%edx,4)   // value into memory (s + 4*i)
        popl    %ebp
        ret

Example 5: Example 4 using short

The C program:

void array_set(short* s, int i, short value) { 
   s[i] = value;
}

Compiles to:

array_set:
        pushl   %ebp
        movl    %esp, %ebp
        movl    16(%ebp), %ecx        // value into %ecx
        movl    12(%ebp), %edx        // i into %edx
        movl    8(%ebp), %eax         // s into %eax
        movw    %cx, (%eax,%edx,2)    // value into memory (s + 2*i)
        popl    %ebp
        ret

Note the use of movw and cx instead of movl and ecx.
Note that 4 bytes are used to store value on the stack, even though only two are needed.

Today's News: February 28

Driver program for Example 5:

The C program:

int main() {
   short a[2];
   array_set(a,1,27);
   printf("value is %d\n",(int)a[1]);
}

Compiles to:

main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $36, %esp
        movl    $27, 8(%esp)    // third parameter of array_set
        movl    $1, 4(%esp)     // second parameter of array_set
        leal    -8(%ebp), %eax  // first parameter of array_set
        movl    %eax, (%esp)
        call    array_set
        movswl  -6(%ebp),%eax  // a is at -8(%ebp) so a[1] is at -6(%ebp)
        movl    %eax, 8(%esp)  // third parameter of __printf_chk
        movl    $.LC0, 4(%esp) // second parameter
        movl    $1, (%esp)     // first parameter
        call    __printf_chk
        addl    $36, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret

Example 6: using long long

The C program:

long long array(long long* s, int i) { 
   return s[i];
}

compiles to:

array:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %edx       // move i into %edx
        movl    8(%ebp), %eax        // address of s into %eax
        leal    (%eax,%edx,8), %edx  // address of s[i] into %edx
        movl    (%edx), %eax         // low 32 bits of s[i] (little endian) into %eax
        movl    4(%edx), %edx        // high 32 bits of s[i] into %edx
        popl    %ebp                 // 64-bit return value in %edx, %eax
        ret

Driver program for Example 6:

The C program:

int main() {
   long long a[2] = {99,2468135792468LL}; 
   long long value;
   value = array(a,1);
   printf("value is %lld\n",value);
}

compiles to:

main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $36, %esp
        movl    $99, -24(%ebp)    // low 32 bits of 99
        movl    $0, -20(%ebp)     // high 32 bits of 99
        movl    $-1470402732, -16(%ebp)  // low 32 bits of 2468135792468LL
        movl    $574, -12(%ebp)          // high 32 bits
        movl    $1, 4(%esp)       // second parameter of array
        leal    -24(%ebp), %eax   // address of s
        movl    %eax, (%esp)
        call    array
        movl    %eax, 8(%esp)    // low 32 bits of return value in %eax
        movl    %edx, 12(%esp)   // high 32 bits of return value in %edx
        movl    $.LC0, 4(%esp)   // second parameter of __printf_chk
        movl    $1, (%esp)       // first parameter of __printf_chk
        call    __printf_chk
        addl    $36, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret

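Note: 2468135792468 = 574*2³² + 2,824,564,564
Read as a signed 32-bit value, 2,824,564,564 is -1,470,402,732 (since 2³² - 2,824,564,564 = 1,470,402,732), which is why the listing stores $-1470402732 as the low 32 bits.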
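The arithmetic in this note can be checked with a short C sketch. The helper names low_word, high_word, and as_signed32 are hypothetical; they split a 64-bit value into the two 32-bit halves that the listing stores separately.

```c
#include <stdint.h>

/* Low 32 bits of a 64-bit value (stored at the lower address on a
   little-endian machine). */
uint32_t low_word(int64_t v) {
    return (uint32_t)((uint64_t)v & 0xFFFFFFFFu);
}

/* High 32 bits of a 64-bit value. */
uint32_t high_word(int64_t v) {
    return (uint32_t)((uint64_t)v >> 32);
}

/* The same 32-bit pattern read as a signed (two's complement) value,
   computed arithmetically so no implementation-defined cast is needed. */
int64_t as_signed32(uint32_t w) {
    return (w & 0x80000000u) ? (int64_t)w - 4294967296LL : (int64_t)w;
}
```

For 2468135792468LL, high_word gives 574 and low_word gives 2824564564, and as_signed32 of that low word gives -1470402732, matching the constants in the assembly.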

Today's News: March 3

Section 3.4.2: Data Movement Instructions

Three move instructions: mov, movs, movz
mov moves between objects of the same size: movb, movw, movl

movs moves from a smaller object to a larger one using sign extension.
This is for converting signed char to short, char to int, short to int.
Instructions are movsbw, movsbl, movswl

movz moves from smaller object to a larger one using zero extension.
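This is for converting unsigned types: unsigned char to unsigned short, unsigned char to unsigned int, unsigned short to unsigned int.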
Instructions are movzbw, movzbl, movzwl
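The difference between sign extension and zero extension can be sketched in C. The function names below are hypothetical models of the instructions, not real APIs; each takes the raw bit pattern of the smaller object and returns the 32-bit result.

```c
#include <stdint.h>

/* Model of movswl: copy bit 15 of the word into bits 31..16. */
int32_t model_movswl(uint16_t word) {
    return (word & 0x8000u) ? (int32_t)word - 65536 : (int32_t)word;
}

/* Model of movzwl: bits 31..16 of the result are always 0. */
uint32_t model_movzwl(uint16_t word) {
    return (uint32_t)word;
}

/* Model of movsbl: copy bit 7 of the byte into bits 31..8. */
int32_t model_movsbl(uint8_t byte) {
    return (byte & 0x80u) ? (int32_t)byte - 256 : (int32_t)byte;
}
```

With the 16-bit pattern 0xFFFE, model_movswl gives -2 while model_movzwl gives 65534. The two agree whenever the top bit of the source is 0.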

Example 7 (Practice Problem 3.4 from the text, done in recitation)

Moving between integer types of different sign and size.
In each case, assume that the source is in %al, %ax, or %eax and the destination address is in %edx.
Implement an assignment to the destination from the source for the following data types:

	src	dest
a)	int	int
b)	char	int
c)	char	unsigned
d)	unsigned char	int
e)	int	char
f)	unsigned	unsigned char
g)	unsigned	int

push and pop
pushl and popl move 4 bytes onto or off of the stack.

The stack pointer: %esp points to the last item pushed on the stack.
pushl decrements %esp by 4 and then copies the item onto the stack
popl copies the item from the stack (4 bytes starting at %esp) and increments the stack pointer by 4.

pushl %ebp is equivalent to:

subl $4, %esp
movl %ebp, (%esp)

popl %eax is equivalent to:

movl (%esp),%eax
addl $4,%esp

If several items need to be pushed onto the stack,
it is often more efficient to decrement the stack pointer once
and use base+displacement addressing.
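The two-step behavior of pushl and popl can be modeled in C. This is a minimal sketch under stated assumptions: memory is a small byte array, %esp is an index into it, and the type Stack and the functions stack_init, model_pushl, and model_popl are hypothetical names. No bounds checking is done.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a 64-byte memory with the stack growing downward. */
typedef struct {
    uint8_t  mem[64];
    uint32_t esp;       /* index of the last item pushed */
} Stack;

void stack_init(Stack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = sizeof s->mem;          /* empty stack: esp is just past the end */
}

/* pushl: decrement esp by 4, then copy the 4-byte item to (esp). */
void model_pushl(Stack *s, uint32_t value) {
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, 4);
}

/* popl: copy the 4 bytes at (esp), then increment esp by 4. */
uint32_t model_popl(Stack *s) {
    uint32_t value;
    memcpy(&value, &s->mem[s->esp], 4);
    s->esp += 4;
    return value;
}
```

Pushing 5 and then 7 moves esp from 64 down to 56, and the two pops return 7 and then 5, leaving esp back at 64.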

Section 3.4.3: An example using a pointer parameter
Note: this is not the same example as in the book.
Example 8

The C program:

void exchange(int *xp, int *yp) { 
   int temp;
   temp = *xp;
   *xp = *yp;
   *yp = temp;
}

Compiles to:

exchange:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx            // %ebx is a callee save register
        movl    8(%ebp), %edx   // move first address into %edx
        movl    12(%ebp), %ecx  // move second address into %ecx
        movl    (%edx), %ebx    // move first value into %ebx
        movl    (%ecx), %eax    // move second value into %eax
        movl    %eax, (%edx)    // move second value into first address
        movl    %ebx, (%ecx)    // move first value into second address
        popl    %ebx            // restore %ebx
        popl    %ebp
        ret

Question:

Once the parameters are moved into registers, it takes 4 movl instructions to do the interchange.
The C source code does this in 3 assignments.
Why does it take 4 IA32 instructions?

Answer:


Section 3.5.1: Arithmetic and Logical Operations

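The leal instruction.
Load effective address: leal computes the address named by its memory operand and stores that address, not the contents of memory, in the destination register.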
Simple Examples:
The following example is Practice Problem 3.6 from the text and will be done in recitation.
Example 9:

Suppose %eax holds x and %ecx holds y.
What is the result of each of the following?

leal 6(%eax), %edx
leal (%eax, %ecx), %edx
leal (%eax, %ecx, 4), %edx
leal 7(%eax, %eax,8), %edx
leal 0xA(,%ecx,4), %edx
leal 9(%eax,%ecx,2), %edx

Example 10:

Consider the C program:

int arith(int x, int y, int z) { 
   int t1 = x + y;
   int t2 = z + t1;
   int t3 = x + 4;
   int t4 = y * 48;
   int t5 = t3 + t4;
   int rval = t2 * t5;
   return rval;
}

After cc -O1 -S arith.c, the file arith.s contains:

arith:
        pushl   %ebp
        movl    %esp, %ebp

        movl    8(%ebp), %ecx        // x into %ecx
        movl    12(%ebp), %edx       // y into %edx
        leal    (%edx,%edx,2), %eax
        sall    $4, %eax
        leal    4(%ecx,%eax), %eax
        addl    %ecx, %edx
        addl    16(%ebp), %edx      // add z to %edx
        imull   %edx, %eax

        popl    %ebp
        ret

Question:

What is going on here?

Answer:
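One way to read the arith listing is to write it back into C with one statement per instruction. The sketch below does that; the function name arith_traced and the variable names are hypothetical, chosen to match the registers. It assumes inputs small enough that no signed overflow occurs, since overflow is undefined in C even though the hardware simply wraps.

```c
/* Each statement mirrors one instruction of the arith listing. */
int arith_traced(int x, int y, int z) {
    int ecx = x;                 /* movl  8(%ebp), %ecx                 */
    int edx = y;                 /* movl  12(%ebp), %edx                */
    int eax = edx + edx * 2;     /* leal  (%edx,%edx,2), %eax   = 3*y   */
    eax = eax * 16;              /* sall  $4, %eax              = 48*y  */
    eax = 4 + ecx + eax;         /* leal  4(%ecx,%eax), %eax    = t5    */
    edx = edx + ecx;             /* addl  %ecx, %edx            = t1    */
    edx = edx + z;               /* addl  16(%ebp), %edx        = t2    */
    eax = eax * edx;             /* imull %edx, %eax            = rval  */
    return eax;
}

/* The original C function, repeated for comparison. */
int arith(int x, int y, int z) {
    int t1 = x + y;
    int t2 = z + t1;
    int t3 = x + 4;
    int t4 = y * 48;
    int t5 = t3 + t4;
    int rval = t2 * t5;
    return rval;
}
```

The multiplication by 48 never appears as a multiply: leal forms 3*y and the shift by 4 multiplies that by 16.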

Section 3.5.2: Unary and Binary Operations

Unary operations: inc, dec, neg, not

Question:

Which addressing mode cannot be used with these instructions?
What is wrong with the following:
inc (%edx)

Answer:

Binary Operations: add, sub, imul, xor, or, and
Operate on source and destination, storing result in destination.
The last three are bitwise operators.

Notes:

subl %eax, %edx means subtract %eax from %edx and put the result in %edx.
imull throws away the high order bits that do not fit in the destination.
imull can be used for both signed and unsigned arguments.
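The claim that one imull serves both signed and unsigned arguments can be checked in C: the low 32 bits of a product are the same either way. In this sketch the function names are hypothetical, and 64-bit arithmetic is used so that the full product is available before it is truncated.

```c
#include <stdint.h>

/* Low 32 bits of the product, with the operands treated as unsigned. */
uint32_t low_unsigned_product(uint32_t a, uint32_t b) {
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}

/* Low 32 bits of the product, with the operands treated as signed. */
uint32_t low_signed_product(int32_t a, int32_t b) {
    return (uint32_t)(uint64_t)((int64_t)a * (int64_t)b);
}
```

The bit pattern 0xFFFFFFFD is -3 when read as signed and 4294967293 when read as unsigned. Multiplying by 5 gives the low 32 bits 0xFFFFFFF1 in both cases.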

Section 3.5.3: Shift Operations

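The shift instructions are: sal (also called shl), sar, and shr.
The sa instructions are arithmetic shifts and the sh instructions are logical shifts; the two differ only for right shifts, which is why sal and shl are the same instruction.
The first parameter is the number of bits to shift, given as an immediate or in %cl, and the second is the item to be shifted.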

Examples:

   sarl $3, %eax
   sall %cl, (%edx)
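The difference between the arithmetic and logical right shifts can be sketched in C. The function names are hypothetical models of the instructions. They work on the raw 32-bit pattern and assume a shift count n between 0 and 31; the arithmetic shift is built by hand because right-shifting a negative signed value in C is implementation-defined.

```c
#include <stdint.h>

/* Model of shrl: vacated high bits are filled with 0. */
uint32_t model_shrl(uint32_t x, unsigned n) {
    return x >> n;
}

/* Model of sarl: vacated high bits are filled with copies of the sign bit. */
uint32_t model_sarl(uint32_t x, unsigned n) {
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) && n > 0) {
        shifted |= ~(0xFFFFFFFFu >> n);   /* set the top n bits */
    }
    return shifted;
}

/* Model of sall (same as shll): vacated low bits are filled with 0. */
uint32_t model_sall(uint32_t x, unsigned n) {
    return x << n;
}
```

For the pattern 0xFFFFFFF0 (which is -16 as a signed value) and a count of 3, model_sarl gives 0xFFFFFFFE (which is -2) while model_shrl gives 0x1FFFFFFE.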

Section 3.5.4: Discussion

Question:

What does the following instruction do?

xorl %eax, %eax

Answer:

Today's News: March 5

Section 3.5.5: Special Arithmetic Operations

Five special arithmetic operations

imull, mull, cltd, idivl, divl

imull and mull:

take one operand
multiply the operand by %eax.
the resulting 64 bits are put in %edx (high bits) and %eax.
imull is for signed and mull is for unsigned.

Question:

Why are there two separate instructions, but there is only one 2-parameter imull?

Answer:

Example:

     movl  12(%ebp), %eax    // x into %eax
     imull 8(%ebp)           // x * y in [%edx,%eax]
     movl  %eax, (%esp)      // store low 32 bits
     movl  %edx, 4(%esp)     // store high 32 bits

Note: Assumes a little endian machine
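The one-operand multiplies can be modeled in C with 64-bit arithmetic. This is a sketch only: the type RegPair and the functions model_imull and model_mull are hypothetical names, and the struct fields stand in for the two result registers.

```c
#include <stdint.h>

/* Holds the 64-bit result as the register pair [%edx,%eax]. */
typedef struct {
    uint32_t edx;   /* high 32 bits */
    uint32_t eax;   /* low 32 bits  */
} RegPair;

/* Model of one-operand imull: signed 32 x 32 -> 64-bit product. */
RegPair model_imull(int32_t eax, int32_t operand) {
    uint64_t product = (uint64_t)((int64_t)eax * (int64_t)operand);
    RegPair r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}

/* Model of mull: unsigned 32 x 32 -> 64-bit product. */
RegPair model_mull(uint32_t eax, uint32_t operand) {
    uint64_t product = (uint64_t)eax * (uint64_t)operand;
    RegPair r = { (uint32_t)(product >> 32), (uint32_t)product };
    return r;
}
```

With the pattern 0xFFFFFFFF in %eax and an operand of 2, both models leave 0xFFFFFFFE in eax, but model_imull (reading the pattern as -1) leaves 0xFFFFFFFF in edx while model_mull leaves 1.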

cltd: takes no operands; sign-extends %eax into [%edx,%eax] by filling %edx with copies of the sign bit of %eax

idivl and divl:

take one operand
divide the 64-bit [%edx,%eax] by the operand
the quotient is put in %eax (it must fit in 32 bits; otherwise the processor raises a divide error)
the remainder is put in %edx

Example:

   movl  8(%ebp), %eax    // x into %eax
   cltd                   // sign extend into %edx
   idivl 12(%ebp)         // divide x by y
   movl %eax, 4(%esp)     // x/y
   movl %edx, (%esp)      // x % y
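The signed sequence cltd followed by idivl can be modeled in C. This is a sketch under stated assumptions: the type DivResult and the functions model_cltd and model_cltd_idivl are hypothetical names, and the caller must avoid y == 0 and the case x == INT32_MIN with y == -1, both of which fault on the hardware.

```c
#include <stdint.h>

/* Holds the two results of the division. */
typedef struct {
    int32_t eax;    /* quotient  */
    int32_t edx;    /* remainder */
} DivResult;

/* Model of cltd: the value placed in %edx is all 1s if %eax is
   negative and all 0s otherwise. */
uint32_t model_cltd(int32_t eax) {
    return (eax < 0) ? 0xFFFFFFFFu : 0u;
}

/* Model of cltd followed by idivl y, with x in %eax.
   C division truncates toward zero, as idivl does. */
DivResult model_cltd_idivl(int32_t x, int32_t y) {
    int64_t dividend = (int64_t)x;      /* [%edx,%eax] after cltd */
    DivResult r;
    r.eax = (int32_t)(dividend / y);
    r.edx = (int32_t)(dividend % y);
    return r;
}
```

Dividing -7 by 2 leaves -3 in eax and -1 in edx: the quotient is truncated toward zero and the remainder takes the sign of the dividend.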

Question:

How would this change if x and y were unsigned?

Answer:
