CS 3843 Computer Organization
Notes on Chapter 3: through Section 3.3

Start Reading Chapter 3

Chapter 3: Machine Level Representation of Programs

This chapter deals with the representation of machine code and its relationship to assembly language.

All of the examples in this chapter are related to the Intel architecture, mainly the 32-bit IA32 instructions.

Section 3.1: History

The Intel instruction set has its ancestry in the Intel 8008 microprocessor, which was launched in 1972.
Each successive microprocessor was backward compatible with the previous one, so the later microprocessors still carry instructions that are now mostly unused.

History: Moore's Law
In 1965, Gordon Moore, who later co-founded Intel, predicted that the number of transistors that could fit on a chip would double every year for 10 years.
In 1975, he revised this to doubling every 2 years.
Question:
If the number of transistors on a chip was 3500 in 1972,
and the number doubled every 2 years,
how many transistors could fit on a chip in 2012?
Answer:
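One way to work this out: 2012 - 1972 = 40 years, which is 20 doublings, so the count is 3500 * 2^20 = 3,670,016,000. A minimal C sketch of the same arithmetic follows. The function name transistors_in and its parameters are made up for illustration and are not from the textbook.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: transistor count in end_year if the count
   doubles once every `period` years, starting from start_count
   in start_year. */
uint64_t transistors_in(uint64_t start_count, int start_year,
                        int end_year, int period)
{
    assert(period > 0 && end_year >= start_year);
    int doublings = (end_year - start_year) / period;  /* 40 / 2 = 20 */
    assert(doublings < 64);
    return start_count << doublings;                   /* start * 2^doublings */
}
```

Calling transistors_in(3500, 1972, 2012, 2) gives 3670016000, which is why the result needs a 64-bit type: it does not fit in a signed 32-bit int.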

Section 3.2: Program Encoding

IA32 instruction set
The full IA32 has a large number of instructions, partly because of backward compatibility.
We will learn about it by looking at how C code is converted into assembly language.

What happens when you compile a C program?
Today's News: February 14
Exam on Wednesday of next week.


IA32 32-bit registers
Name Use
%eax accumulator - general caller-save register used for returning 32-bit values
%ecx counter - general caller-save register
%edx data - general caller-save register
%ebx base - general callee-save register
%esi source - general callee-save register
%edi destination - general callee-save register
%esp stack pointer
%ebp frame pointer

Question:
What is meant by "caller-save" and "callee-save" registers?
Answer:
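In short: a caller-save register may be overwritten by the function being called, so the caller must save it before the call if it still needs the value afterwards. A callee-save register must hold the same value on return as on entry, so a called function that wants to use it has to save and restore it. The following C sketch is only a toy model of that convention. The struct regs type and the toy_callee and toy_caller functions are invented for illustration, and real code does this with pushl and popl on the stack rather than with C variables.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of six IA32 registers; not real machine state. */
struct regs { uint32_t eax, ecx, edx, ebx, esi, edi; };

/* A callee that follows the convention: it may freely overwrite
   %eax, %ecx and %edx, but must restore %ebx before returning. */
void toy_callee(struct regs *r)
{
    assert(r);
    uint32_t saved_ebx = r->ebx;   /* callee-save: save before use ... */
    r->ebx = 111;                  /* use %ebx as scratch space */
    r->ecx = 222;                  /* caller-save: overwrite without saving */
    r->eax = r->ebx + r->ecx;      /* return value goes in %eax */
    r->ebx = saved_ebx;            /* ... and restore before returning */
}

/* A caller that still needs %ecx after the call must save it itself. */
uint32_t toy_caller(struct regs *r)
{
    assert(r);
    uint32_t saved_ecx = r->ecx;   /* caller-save: save around the call */
    toy_callee(r);
    r->ecx = saved_ecx;            /* restore the caller's own value */
    return r->eax;                 /* the callee's return value */
}
```

After toy_caller returns, %ebx is unchanged because the callee restored it, and %ecx is unchanged only because the caller saved it.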


Why these strange names for the registers:
By contrast, the 64-bit IA-64 (Itanium) architecture has 128 64-bit general registers called r0 - r127.

Example 1
int add(int x, int y) { 
   int z;
   z = x + y;
   return z;
}

If this is in sum.c we can produce assembly code with:
gcc -O1 -S sum.c
This generates the following:
        .file   "sum.c"
        .text
.globl add
        .type   add, @function
add:
        pushl   %ebp
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        addl    8(%ebp), %eax
        popl    %ebp
        ret
        .size   add, .-add
        .ident  "GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
        .section        .note.GNU-stack,"",@progbits

Note:
Today's News: February 17


Question:
Could we directly use %esp to avoid using %ebp and eliminate the push and pop?
That is, would the following code work? (3 instructions instead of 6)
add:
   movl    12(%esp), %eax
   addl    8(%esp), %eax
   ret
Answer:


We can generate an object file using:
cc -c sum.s
This produces sum.o which we can examine with:
objdump -d sum.o
which produces:
sum.o:     file format elf32-i386

Disassembly of section .text:

00000000 <add>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 0c                mov    0xc(%ebp),%eax
   6:   03 45 08                add    0x8(%ebp),%eax
   9:   5d                      pop    %ebp
   a:   c3                      ret  
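The left column of the listing is the byte offset of each instruction, which is just the sum of the encoded lengths of the instructions before it. A small C sketch of that bookkeeping follows, using the lengths read off the listing above. The names add_insn_len and insn_offset are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Encoded length in bytes of each instruction of add, read off the
   objdump listing: push, mov, mov, add, pop, ret. */
static const size_t add_insn_len[] = { 1, 2, 3, 3, 1, 1 };

/* Offset of instruction `index` = sum of the lengths before it.
   Passing index == 6 gives the total size of the function. */
size_t insn_offset(size_t index)
{
    size_t n = sizeof add_insn_len / sizeof add_insn_len[0];
    assert(index <= n);
    size_t off = 0;
    for (size_t i = 0; i < index; i++)
        off += add_insn_len[i];
    return off;
}
```

For example, insn_offset(5) is 10 (0xa, the offset of ret), and insn_offset(6) is 11, so the whole function occupies 11 bytes.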

Note:
To use this program, we need a main to call it:
#include <stdio.h>

int add(int x, int y);

int main() {
   int x = 12;
   int y = 31;
   int z;
   z = add(x, y);
   printf("x is %d, y is %d, and z is %d\n",x,y,z);
   return 0;
}

If this is called e1.c we do:
cc -O1 -S e1.c
to create: e1.s which is
	.file	"e1.c"
	.section	.rodata.str1.4,"aMS",@progbits,1
	.align 4
.LC0:
	.string	"x is %d, y is %d, and z is %d\n"
	.text
.globl main
	.type	main, @function
main:
	leal	4(%esp), %ecx
	andl	$-16, %esp
	pushl	-4(%ecx)
	pushl	%ebp
	movl	%esp, %ebp
	pushl	%ecx
	subl	$20, %esp
	movl	$31, 4(%esp)
	movl	$12, (%esp)
	call	add
	movl	%eax, 16(%esp)
	movl	$31, 12(%esp)
	movl	$12, 8(%esp)
	movl	$.LC0, 4(%esp)
	movl	$1, (%esp)
	call	__printf_chk
	movl	$0, %eax
	addl	$20, %esp
	popl	%ecx
	popl	%ebp
	leal	-4(%ecx), %esp
	ret
	.size	main, .-main
	.ident	"GCC: (Ubuntu 4.3.3-5ubuntu4) 4.3.3"
	.section	.note.GNU-stack,"",@progbits


Question:
When add is called in the main program, the first parameter is at (%esp) and the second parameter is at 4(%esp).
Why does the add function use offsets 8 and 12 to access these?
Answer:
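The call instruction pushes a 4-byte return address, and add then pushes the 4-byte %ebp, so by the time add copies %esp into %ebp the stack pointer has moved down by 8 bytes. Relative to the new %ebp, x is therefore at offset 8 and y at offset 12. The C sketch below replays those steps on a toy stack. The toy_stack type, the push and load helpers, and the 0xdeadbeef stand-in for the return address are all assumptions made for illustration. The real main also stores the arguments with movl rather than pushing them, but the resulting layout is the same.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 64 };

/* Toy 32-bit stack that grows toward lower addresses.
   "Addresses" are byte offsets into mem. */
struct toy_stack {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;   /* byte offset of the current top of stack */
    uint32_t ebp;
};

static void push(struct toy_stack *s, uint32_t v)
{
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = v;
}

static uint32_t load(const struct toy_stack *s, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return s->mem[addr / 4];
}

/* Replays "call add" plus the body of add on the toy stack. */
uint32_t toy_call_add(uint32_t x, uint32_t y)
{
    struct toy_stack s = { {0}, STACK_BYTES, 0 };

    push(&s, y);            /* main: y ends up at 4(%esp) */
    push(&s, x);            /* main: x ends up at (%esp) */
    push(&s, 0xdeadbeef);   /* call add: pushes the return address */
    push(&s, s.ebp);        /* add: pushl %ebp */
    s.ebp = s.esp;          /* add: movl %esp, %ebp */

    uint32_t eax = load(&s, s.ebp + 12);   /* movl 12(%ebp), %eax  (y) */
    eax += load(&s, s.ebp + 8);            /* addl 8(%ebp), %eax   (+ x) */
    return eax;
}
```

With the values from main, toy_call_add(12, 31) returns 43.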

Section 3.3: Data Formats

Data formats for IA32:
suffix  type                bits      used for
b       Byte                8         char
w       Word                16        short
l       Double Word         32        int, long, pointers
s       Single Precision    32        float
l       Double Precision    64        double
t       Extended Precision  80 or 96  long double
Note: No direct support for long long (64-bit integer)
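As a rough illustration of the integer rows of the table, the following C sketch maps an operand size in bytes to its instruction suffix. The function name int_suffix is made up, and returning '?' for sizes the table does not cover is a choice made here, not an assembler convention.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the IA32 integer operand-size suffix for an object of
   `size` bytes, or '?' if the table has no integer entry for it. */
char int_suffix(size_t size)
{
    assert(size > 0);
    switch (size) {
    case 1:  return 'b';   /* byte:        char  */
    case 2:  return 'w';   /* word:        short */
    case 4:  return 'l';   /* double word: int, and long/pointers on IA32 */
    default: return '?';   /* e.g. 8-byte long long: no single suffix */
    }
}
```

For example, int_suffix(sizeof(char)) is 'b', which is why a one-byte move is written movb. On a 64-bit machine sizeof(long) and pointer sizes are typically 8, so they fall outside this IA32 table.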

