Using the Concurrent UNIX IO Simulator

This document was last modified on November 23, 2005 and reflects changes made in version 1.00.


Table of Contents

Overview
System and User Requirements
Starting the Simulator
Basic Operation
Simple Configuration
Simple Programs
Scheduling
Threads
Execution on the Fly
Advanced Configuration
Local Logging
Additional Features
Examples


Overview

The UNIX Concurrent IO Simulator simulates a programs consisting of open, close, read, write, fork, wait, pthread_create, pthread_detach, and pthread_join instructions. Like the Process Ring Simulator, all I/O is done using file descriptors, but the I/O is done through files rather than through standard input and output. It illustrates the interaction among the process file descriptor table, the system open file table, the inodes and the disk. The intended use of the simulator is to illustrate concurrent I/O done via processes and threads. Although there is no limit to the size of the programs, the number of open file descriptors, or the number of processes and threads created, the simulator is meant to be used with programs of a few lines, one or two open descriptors and at most a handful of threads and processes. It is best used as a class demonstration tool or for short assignments. Some sample assignments and demonstrations are described at the end of this guide.

Although the program can handle threads, it can be used to illustrate standard process I/O by avoiding use of the various pthread instructions.

The simulator is a Java program that uses the JELI package. Many of the parts of the simulator use this standard interface.


System and User Requirements

System requirements:

User requirements:

If you are only interested in this simulator, you can download a zip file io.zip that contains all of the files that you need.

Running in a non-Windows environment:
If you installed the simulator by copying the files from a CD, the scripts may not have the correct permissions to run. The convert and runio files should be executable.
The ASCII files distributed with this distribution are in the Windows format in which lines end in a carriage return followed by a line feed. If you are running under UNIX, Linux, or Mac OSX, it may be more convenient to have the carriage returns removed. You can remove all of the carriage returns from these files by executing the convert script.


Starting the Simulator

If you are running from an image of the simulator CD, start in the io subdirectory of the run directory. If you unzipped the IO.zip file in a single directory, start in that directory. In either case, you can start the simulator from a command window by executing runio. In a Windows environment you can also just click on the file runio.bat.

If you do a custom installation, you can put the jar files anywhere you want. Modify the runio.bat (Windows) or runio (Unix, Linux, Mac OSX) file so that the JARDIR1 or JARDIR2 variables points to the location of the jar files.

If the simulator does not start, make sure you have the Java runtime executables in your path. In a command window, execute:
java -version
and make sure this displays a version of at least 2.0.


Basic Operation

Start the simulator. If all goes well the simulator will start up and look something like Figure 1.

Figure 1: The main simulator window.

The screen is divided into sections. The top part of the screen shows a pictorial representation of the computer system. The bottom part has a collection of clickable buttons and other controls that modify the simulator properties and run the simulation. The top part of the screen is divided into three sections. These sections, from left to right are the User Space which shows the processes and threads, the System Space which shows the system open file table, the in-memory inodes and the list of processes, and the Disk Space which shows the disk blocks.

The buttons and controls are arranged in 5 columns. The first column controls the logging features and the display. The second and third columns control the scheduling of processes and threads. The Fourth column controls the rate at which the program runs, save or restore the state of the simulator, and allows you to execute an instruction on the fly. The last column is for running the program.

In the last column, the step button will execute the next instruction. The Step Back and Step Forward buttons allow you to move through already executed parts of the program. The Run button will run the program until it completes or an error occurs. It will run at a rate controlled by the slider to the left of the run button. The delay between instructions is in seconds. The Reset button will reset and start the program from the beginning. Below this is the Active button which shows which process or thread has the CPU. Clicking on this button does a context switch. Lastly, the Choose Program button allows you to choose which program to run.

Figure 2 shows the simulator with a more complicated program after it has been executing for a while. There are now two processes, the first having 3 additional threads and the second having 2 additional threads. The User Space is divided into two regions, one for each process. For each process, the box on the far left shows the ID of the process and its parent, as well as the values of variables.

To the right of the variables is a box showing the program being executed. If the program has not yet completed, a triangular arrow indicates the next instruction to be executed (the program counter). The arrow is red if the process is in the CPU (one of the threads of process 1002). It is hollow if the process is suspended (either waiting on a join or blocked I/O) as in process 1001. All variables are assumed to be appropriately declared. The values of these variables are shown in the variables box once they have been initialized.

To the right of the program box are boxes showing the functions executed by the threads that have been created. One box appears for each thread function and represents one or more threads running that code. Thread IDs are represented by an ordered pair giving the process ID and a unique integer thread identifier. The thread identifiers are assigned in numerical order. In the upper left corner of each thread function box are the thread identifiers for the threads running this function. Each will have an arrow representing the program counter for that thread unless the thread has completed.

In Figure 2, the first process has three threads. It is suspended, waiting to join the thread with ID (1001,1). The first thread has thread identifier 1, is running the program firstThread and is about to execute the read instruction. The second thread is executing the same function but has finished. The other thread is executing the function secondThread. The thread with identifier 2 has completed (it does not have an arrow for the program counter) and the thread with identifier 3 is about to execute its close instruction. The vertical bar at the end of the program counter arrow indicates that this thread is detached.

On the right side of the User Space of each process is the file descriptor table (FDT) for that process. The first three entries corresponding to standard input, standard output, and standard error are shown on the first line of the FDT. Since the simulator is concerned with file I/O, the contents of these entries are not shown. Each of the other open file descriptors points to an entry in the system file table (SFT) in the System Space.

In Figure 2, the first process shows file descriptors 3 and 4 pointing to independent entries corresponding to reading from the file infile. The second process also has two open file descriptors other than 0, 1, or 2. File descriptor 3 is reading from infile and file descriptor 4 is writing to outfile.

Three items are shown in System Space, the System File Table, the list of active processes, and the files.

Each SFT entry shows the file being accessed, the count of the number of open file descriptors using this entry, and the current file offset.

To the right of the open files is the in-memory inodes. In this case there are two files. The file infile can only be read from. The file outfile can only be written to. All outfile files are empty when created. This file has had 5 bytes written to it by Thread 2 of the second process.

Under this is the list of active processes or threads, indicated by its ID (process ID or thread ID). The state of each process or thread is also given, Running, Ready, Joining, Waiting, or Zombie.

Lastly, the Disk Space shows the contents of the files.

You can easily try the simulator with the included programs. Figure 2 corresponds to Program 7. Use the Choose Program button in the lower right corner to pick one of these program and either step through it with the Step button or run it with the Run button. When the program is running, the Run button changes to Pause and the Reset button changes to Abort Run. When paused, the Pause button changes to Resume.

To exit the simulator, push the pink Quit button that is in the upper left corner of the the frame.

Figure 2: The main simulator window during execution of a program.


Simple Configuration

This section discusses the very basics of the configuration file. More detailed configuration issues is contained in a later section.

The default configuration file is called ioconfig. A copy of this file is included with the standard distribution. The configuration file contains lines consisting of a keyword followed by parameters. The keyword is not case sensitive. Two of the important keywords are program and thread. Each of these is followed by a single token giving the name of the file containing the code for the given program or thread. These files must exist in the same directory as the file ioconfig.


Simple Programs

This section describes the contents of a simple program file that does not use multiple threads.

Each program has access to a number of variables that are assumed to be correctly declared. No variables are initialized, and using the value of an uninitialized variable is considered a fatal error. The simulator terminates a run when a fatal error occurs. The available variables when threads are not used are:
     int totaln;
     char buf
n[LARGE_SIZE];
     int child
n;
     int fd
n;
The bufn are assumed to be arrays of arbitrarily large size that can be copied into. The n can be any non-negative integer, or can be omitted completely. These allow such variables as child0, child5, child123, child or fd7.

Each program line has one of the following forms:
     fdn = open("string",O_RDONLY);
     total
n += read(fdm,bufn+totaln,p);
     fd
n = open("string",wrflags,0777);
     fd
n = open("string",wrflagst,0777);
     fd
n = open("string",wrflagsa,0777);
     fd
n = open("string",wrflagsta,0777);
     write(fd
n,"string",p);
     close(fd
n);
     fork();
     child
n = fork();
     if (child
n) fork();
     if (child
n) childm = fork();
     if (!child
n) fork();
     if (!child
n) childm = fork();
     child
n = wait(NULL);
     if (child
n) {
     if (!child
n) {
     else {
     }
Here, string represents an arbitrary sequence of printing characters, n and m are non-negative integers (0, 1, 2, etc.) and p is a positive integer (1, 2, 3, etc.).

The last 4 instruction types allow different blocks of code to be executed by parent and child processes.

The open for write instructions use the following write flags that allow opening use of the append and truncate flags:
int wrflags = O_WRONLY | O_CREAT;
int wrflagst = O_WRONLY | O_CREAT | O_TRUNC;
int wrflagsa = O_WRONLY | O_CREAT | O_APPEND;
int wrflagsta = O_WRONLY | O_CREAT | O_TRUNC | O_APPEND;

Some restrictions apply to how these are used. Blocks must be nested properly or the program will be rejected by the simulator. Files open for reading must have their names in an appropriate configuration line so the simulator knows the contents of these files. Files used for reading cannot also be used for writing.

The simulator does little checking on the syntax of the program file lines. These lines are only checked so as to distinguish one type of instruction from another. For example, only three of the above lines contain an open brace, so the following are valid lines in a program file:
{
child2 {
{ !child4
and these are interpreted as:
else {
if (child2) {
if (!child4) {


Scheduling

The simulator allows control over several aspects of the scheduling of the processes using the purple buttons in the second and third columns. The scheduling parameters determine when context switched take place and which process is chosen when there is a context switch.

In addition, the top button in the second column determines what is duplicated when a fork occurs. The POSIX thread library specifies that a fork only duplicates the currently running thread, not all of the thread of the process. This is the default for the simulator. The simulator can also be set so the when a fork occurs, the process and all of its threads are duplicated. This button also controls whether thread IDs are unique across processes. The POSIX standard leaves this to the implementation. If thread IDs are unique, then a join in a forked process can refer to a thread in another process. This is not allowed and the simulator considers this a fatal runtime error.

The second button in the second column determines how the next process is chosen when a context switch takes place. The possibilities are:

The third button in the second column controls when context switches take place. The possibilities are: The first button of the third column controls what process executes after a fork instruction. The possibilities are: The second button of the third column controls what thread executes after a thread create. The possibilities are: The third button of the third column indicates whether I/O is atomic. If I/O is atomic, then a process never loses the CPU during a read or write operation. If I/O is not atomic, under this button will be a slider giving the probability that the instruction loses the CPU after each I/O byte is processed. When stepping through a program using non-atomic I/O, each step processes one byte. When a non-atomic I/O instruction is executing, the triangular program counter arrow is preceded by an integer giving the number of bytes processed so far. For a read operation, the total number of bytes to read is given in the instruction, but is reduced if there are fewer bytes in the file. For a write operation, it is considered a fatal error to write more bytes than in the string in the second parameter of the write instruction.

Threads

The simulator supports three thread-related instructions for creating, joining and detaching threads. The instructions are:
     pthread_create(&tidn,NULL,"string",NULL);
     pthread_join(tid
n,NULL);
     pthread_detach(tid
n,NULL);
Here, n is a non-negative integer (0, 1, 2, ...) and string is a sequence of printing characters representing the name of a thread function. Each thread function used must be defined in a file referenced in the configuration file using the keyword thread. Each thread function starts with a line in the form:
     void *string(void *args) {
and ends with the two lines:
     return NULL;
     }

in between these are any valid instructions, that is any of the three described above or the ones in the section Simple Programs.

When a thread is created in the simulator, one of the tid variables is set to the ID of the created thread. The simulator displays these thread IDs as an ordered pair containing the process ID of the creating process and a integer thread identifier which is unique within the process. POSIX does not specify the format of a thread ID and in many systems it is just an integer. POSIX also leaves open the question of whether thread IDs are unique throughout the system (no two threads have the same thread ID) or whether they are just unique within a given process (no two threads of the same process have the same thread ID). Usually this does not make a difference since a process cannot reference (with detach or join) a thread of another process. The simulator displays thread IDs as if they are unique throughout the system because it makes it easier to see what is happening when a program is traced.

It is usually not a good idea to mix thread creation, fork and join as these may interact in unexpected ways. Suppose a process creates a thread and sets tid2 to the ID of the created thread. If the process then forks a child, the child has a copy of the variable tid2. What thread does this represent? In the POSIX standard, a fork only duplicates a single thread, the one that executed the fork. The child process does not have a thread corresponding to tid2. What would happen if the fork duplicated all of the threads of the parent process? The child would have a thread corresponding to tid2 in the parent. Would tid2 represent the thread in the parent or the child? The answer depends on whether thread IDs are unique across the system or just within the process. POSIX does not address this. The simulator allow you to experiment with each of these possibilities using the Fork: dup button at the top of the second column of buttons.

Execution on the Fly

In additional to stepping through a program, any instruction may be executed on the fly using the bottom button in the fourth column. This pulls up a menu of instructions that can be executed. Use the Instruction Parameters menu item to set the parameters for a instruction to be executed. These parameters indicate which fd, buf, child, tid or count value to use. A value of zero is used to omit the number. You may also set the name of the input or output file used by open, the characters written by a write instruction, or the thread to use with the pthread_create instruction.

All of the instructions described above are available except for those with a condition, or the start and end block instructions. The instruction is executed by the currently running process or thread. You can change the active process or thread with the Active button which is in the last column.


Advanced Configuration

The default configuration file is called ioconfig but may be changed by giving the name of the file as the command line parameter when the simulator is started.

If no configuration file is found, the simulator uses a default configuration along with corresponding program and thread files that are stored in the jar file containing the simulator code. The default configuration and files is the same as the configuration file provided with the simulator distribution.

In addition to the simple configuration described above, the configuration file can contain line affecting many aspects of the simulation, including how the scheduling is done. Also, any of the configuration lines may appear in a program file by preceding the line with a #. These lines are not displayed by the with the program in the box at the left of each process User Space, but change the operation of the simulator when the program is chosen. After of the configuration options only make sense when the simulator is started and these are ignore when they appear in a program.

The table below lists all of the configuration options.

KeywordValuesMeaning
General Parameters
useranythingThe name of the current user. This appears in the log file.
quietnone Turns off all sounds generated by the simulator.
directorystring The directory containing the programs and threads.
fontsizeinteger The font size used for displaying everything but program code and button labels.
programfontsizeinteger The font size used for displaying program code.
Programs, Threads, and Files
programfilenameThe name a file containing a program. The filename also serves as the name of the program. There can be any number of these configuration lines.
threadfilenameThe name of a file containing a thread function. This name is used only for reading in the thread function. The name of the thread is contained in the file itself. There can be any number of these configuration lines.
filetwo stringsThe first string is the name of an input file and the second is the contents of the file. This file is used internally to the program and is not related to anything stored on the disk. Thread can be any number of these configuration lines.
Scheduling and Running Parameters
rundelayan floating point number The number of seconds to delay between instructions when the Run button is used to run a program.
IOAtomicnone IO instructions are atomic.
IONotAtomicnone IO instructions are atomic.
AtomicInstructionspecific token Allowable tokens are true and false. If set and IO is atomic, so is setting of total in read instruction.
SchedulingNoPreemptnone Scheduling is not preemptive.
SchedulingRRan integer Scheduling is done using round robin with the given quantum. The quantum represents the number of instructions that can be executed before a context switch is forced.
SchedulingRandoma probability A context switch occurs after each instruction with the given probability. Anther ready process is chosen at random.
forkactionspecific token The allowable tokens are all and thread. If all, a fork duplicates the forking process and all of its threads. If thread, only the thread executing the fork is duplicated, as in the POSIX standard.
uniqueidspecific token The allowable tokens are true and false. If true, thread IDs are unique across processes. If false, thread IDs are unique only within a process.
choosespecific token The allowable tokens are FCFS, next, and random. This determines how the next process is chosen when a context switch occurs due to a process being blocked or terminated. If FCFS, the process that arrived in the ready queue first is chosen. If next, processes are chosen by their process or thread ID. If random one of the ready processes is chosen at random.
afterforkspecific token The allowable tokens are parent, child, either, and random. This determines which process executes after a fork. If parent the parent does not lose the CPU because of the fork. If child, the child process always executes after the fork. If either, either the parent or the child executes after the fork, with one chosen at random. If random a random process or thread is chosen after the fork.
aftercreatespecific token The allowable tokens are original, new, either, and random. This determines which process executes after a pthread_create. If original the original thread does not lose the CPU because of the pthread_create. If new, the newly created thread always executes after the pthread_create. If either, either the original thread or the newly created one executes after the pthread_create, with one chosen at random. If random a random process or thread is chosen after the pthread_create.
Logging Parameters
logfilea filename The name of the log file. The extension .html will be appended to this. The default log file is logfile.html.
logdira directory pathIf given, this is the directory for storing the log file. If not given, the log file is stored in the current directory.
uniquefilenamean integer If given, use a unique file name each time a log file is opened. Do this by appending an integer with the given number of digits to the filename.
imagenamea string The base filename used for logged images. An integer is appended to the name to make the image files unique.
portan integer If this value is greater than zero, remote logging will be used with this port. The default is 0.


Local Logging

The simulator can display information in a log file. Log files are in HTML format and can be displayed by or printed from a standard browser. The first column of light blue buttons controls the logging functions of the simulator.

Initially, these buttons are used to control the logging functions before the log file is open or after it is closed. The first button opens the log. When the log file is successfully opened most of the buttons change their function to control the log functions appropriate when logging is in progress.

While the log is closed, the second button toggles between the modes Replace Old Log and Append to Log. In the former case, opening the log file overwrites any file with the same name that already exists. In the latter case, the new log is appended to the old one. The third button can be used to change the name and directory used for storing the log file. Pushing this button pops up a dialogue box in which the user can change the directory and log file names. The names of the files for storing the graph image files created by the simulator can also be changed.

The Show Remote Log button can be used to pop up a browser window containing the log file generated when the log is stored remotely.

After the log is successfully opened, the Open Log button changes to Close Log, the second button changes to Log Comment, and the third button changes to Stop Log.

Close Log terminates logging and closes the log file. Stop Log temporarily stops adding information to the log file but keeps the log file open. When pushed, this button changes to Start Log which resumes the logging.

The Log Comment button pops up a window which allows the user to enter comments into the log file.

The Log Image button puts a copy of the currently displayed diagram in the log file.


Additional Features

The Save or Restore State button can be use to save or restore the state of the simulator in a file. A restored state contains only the current state of the simulator so you cannot step back from a saved state.

Most of the widgets can be dragged within their display area. Process widgets: the variable list, the program code, and the file descriptor table must stay in the area reserved for that process in the User Space. The system widgets: the system file table, the list of processes and the in-memory inode list may stay in the System Space. The file contents widgets must stay in the Disk Space. The boundaries between these spaces and between the processes can also be modified by dragging them.

Each of the widgets has a standard position.

Each of these widgets will jump to its standard position if dragged to within a few pixels of its standard position. If a widget is in its standard position, dragging it away will require dragging the mouse several pixels. If the file contents of a file is in its standard position, it will follow the corresponding in-memory inode entry if it is dragged.

The Modify Display button at the bottom of the first column of buttons allows for the modification of the display in several ways. Clicking on this button brings up a menu with 5 options.


Examples

Example 1: two independent processes reading
Program 1 looks like this:
     fork();
     fd0 = open("infile",O_RDONLY);
     total += read(fd0,buf+total,2);
     total += read(fd0,buf+total,2);
     close(fd0);
The process forks a child, each process opens the file and reads 4 bytes from the file. Each time open is called, a new SFT entry is created. Since the two processes are using different SFT entries and therefore different file offsets, the two processes each read the first 6 bytes of the file, independent of how the processes are scheduled.

Example 2: two processes that share the file offset
Program 2 looks like this:
     fd0 = open("infile",O_RDONLY);
     fork();
     total += read(fd0,buf+total,2);
     total += read(fd0,buf+total,2);
     close(fd0);
It is identical to Program 1, except that the open occurs before the fork. The open function is called only once, so only one SFT entry is created. The count in the SFT entry is initially 1, indicating that only one file descriptor is accessing this entry. After the fork, the count in this entry becomes 2. The two processes share the same SFT entry, and therefore share the file offset. When one process reads two bytes, the file offset is 2. Whichever process reads next, that process will read the next two bytes of the file. If the parent completed execution before the child starts reading, the parent will get abcd and the child will get efgh. If the parent loses the CPU after the first read and the child executes both of its reads before the parent gets the CPU again, the parent will get abgh and the child will get cdef. If the reads are atomic, one process will get abcd, abef or abgh and the other process gets the other 4 bytes in alphabetical order. You can experiment with this by changing the scheduling to round robin with a quantum of 3, or using random scheduling. You could also single step through the program and change the running process at different points in the execution.

When the first close is called (either by the parent or child) the count stored in the SFT entry is decremented to 1. The second call to close decrements the count to 0 and the entry is removed.

The situation is move complicated if (as according the the POSIX specification), reads do not have to be atomic. In this case, a process can lose the CPU between reading the first and second byte of the 2-bytes read instructions. The possibilities are more numerous. There are 20 possibilities for the process that gets the first byte: abcd, abce, abcf, abcg, abde, etc. Each process will read 4 bytes (assuming that each read does return 2) in alphabetical order and the other process will read the other 4 bytes. You can experiment with this by setting the I/O to be not atomic.

On real systems, these other possibilities will occur rarely (if at all), but programs but work correctly if they do occur.

Example 3: Writes with independent processes
Program 3 looks like this:
     child1=fork();
     fd0 = open("outfile",wrflags,0777);
     if (child1) {
     write(fd0,"ab",2);
     write(fd0,"cd",2);
     }
     else {
     write(fd0,"AB",2);
     write(fd0,"CD",2);
     }
     close(fd0);
The parent and child each open the file and write 4 bytes to the file, the parent writing abcd and the child writing ABCD. The wrflags variable is defined as O_WRONLY | O_CREAT. Since the processes use different SFT entries, the file offsets are independent. The first write of each process writes to the start of the file. If the parent completes writing before the child starts, the child will overwrite the 4 bytes written by the parent and the file will contain ABCD. If the writing is interleaved, a number of possibilities may occur including ABcd. If writes are not performed atomically, any combination of upper and lower case abcd may occur in the file, including AbCd.

Example 4: Writes that share a file offset
Program 4 looks like this:
     fd0 = open("outfile",wrflags,0777);
     child1=fork();
     if (child1) {
     write(fd0,"ab",2);
     write(fd0,"cd",2);
     }
     else {
     write(fd0,"AB",2);
     write(fd0,"CD",2);
     }
     close(fd0);
Since the open occurs before the fork, the processes share a SFT entry and therefore share the file offset. Each process writes to the end of the file and the file should contain 8 bytes when both processes complete. There are many possibilities including abABCDcd. If the writes are performed atomically, there are 6 possibilities. If writes are not performed atomically, there are many more possibilities. The POSIX standard states that the file offset is updated when the write returns but does not indicate the state of the offset if a process loses the CPU while performing a write. The simulator assumes that the offset is updated with each byte written. In this case the possibilities include aABbcCdD and abcABCDd. The bytes abcd (in that order) can be arbitrarily interleaved with ABCD (in that order).

Example 3a: Opening with the append flag
Program 3t is identical to Program 3, except that the open is now:
     fd0 = open("outfile",wrflagsa,0777);
where wrflagst is O_WRONLY | O_CREAT | O_APPEND. When a file is opened with the O_APPEND flag set, the file offset is set the the end of the file before a write takes place. In addition, no other changes can occur to the file between the changing of the file offset and the completion of the write operation. This essentially makes the writing atomic. The POSIX standard does not indicate how this is implemented. In the simulator, a write to a file with the append flags set locks the file until the operation is complete. The process may lose the CPU in the middle of a write (if I/O is set to not atomic in the simulator) but another process (or thread) loses the CPU if it attempts a write operation on this file before the first write completes. In this case even when writes are not atomic, Program 3a behaves as if they are.

Example 3ta: Opening with append and truncate flags.
Program 3ta is identical to Program 3 and Program 3a except that the open is now:
     fd0 = open("outfile",wrflagsta,0777);
where wrflagst is O_WRONLY | O_CREAT | O_TRUNC | O_APPEND. The O_TRUNC causes the file to be truncated to 0 length when it is opened. Any bytes written to the file by the other process are discarded when the open is executed. It is possible for the file to contain fewer than 8 bytes when both processes are complete. For example, if the parent finishes its first write and then the child starts and completes execution, the final contents of the file will be ABCDcd.

Example 5: Reading from threads.
Program 5 looks like this:
     fd1 = open("infile",O_RDONLY,0777);
     pthread_create(&tid1,NULL,firstThread,NULL);
     total+=read(fd1,buf+total,2);
The created thread executes the following function:
     void *firstThread(void *args) {
         total+=read(fd1,buf+total,2);
         return NULL;
     }
The call to pthread_create does not increment the count is SFT entry. The thread and the main process use the same FDT entry and therefore the same SFT entry and file offset. This program uses incorrect synchronization and the results will depend on the way the scheduling is done. The main thread and the created thread each read two bytes from the file. If these reads are done atomically, one reads ab and the other reads cd. The buffer will contain abcd. If the main program terminates by calling return or exit, or just falls off the end, it terminates all of its non-detached threads. It is possible that the main program will terminate before the thread starts and the thread will never execute.

Things to try with IO atomic:

Example 6: Reading from threads, non-atomic I/O
Program 5 can behave in an unexpected manner if the instructions are not executed atomicly. Program 6 is the same as Program 5 except that the read in the main program is repeated. The main program looks like this:
     fd1 = open("infile",O_RDONLY,0777);
     pthread_create(&tid1,NULL,firstThread,NULL);
     total+=read(fd1,buf+total,2);
     total+=read(fd1,buf+total,2);
Set the IO to be not atomic and step through Program 6 as follows.

  1. Step through the main program until 1 byte has been read.
  2. Switch to the thread and read one byte.
  3. Switch back to the main program and read one more byte, completing the first read instruction.
  4. Switch back the the thread and step through it until it has completed.
  5. Step through the main program until it has completed.
Try to understand what has happened. Why does the buffer contain what it does? Look at the values of the variable total and buf as well as the offset in the SFT entry.

Explanation: Remember that when two threads (in this case the main thread and the created one) share variables, use of these variables form critical sections that should be protected with synchronization constructs. The simulator illustrates what happens when you do not protect critical sections.

In the following explanation, we assume that the file offset is updated with each partial read. The POSIX standard does not address the issue of whether the offset is adjust for each read or only when the read is complete.

  1. After one byte has been read in the main program, buf contains a and the read instruction is prepared to put the next byte in the second position of the buffer. Before it can do this, it loses the CPU. Note that the value of total has not been modified since the read has not completed.
  2. The thread reads one byte, b and puts it in the first slot of buf, overwriting what was put in it by the main program because total is still 0. It is prepared to put the next byte read into the buffer at the second position. Before it can do this, it loses the CPU.
  3. The main program completes it first read. It puts the byte read, c in the second position of buf and updates total. buf now contains bc and total is 2.
  4. The thread reads the second byte, d and puts it in buf in the second position, overwriting the c put in by the main program. The value of total is now updated.
    What is the new value of total? It depends on how the instruction: total += read(fd1,buf+total,2) is compiled.
    One possibility is:
         x = read(fd1,buf+total,2);
         total = total + x;
    In the first of these, buf+total was computed when the first byte was read, so the first byte was put in the first position of buf. After the second byte was read, x has the value 2, and since total was set to 2 by the main program, the value of total is now 4. Notice that buf only contains 2 bytes.
  5. The main program now executes its second read. Total is 4, so two bytes are read ef and put in the 5th and 6th positions of the array, buf. total is set to 6 and the file offset is also 6. However, nothing was ever put in the 3rd or 4th position of buf. We indicate this as bd..ef.
Notice that the results depend on several details of the implementation, including how += is compiled. Notice that the results would be different of the read instruction were compiled as:
     y = total;
     x = read(fd1,buf+total,2);
     total = y + x;
In this case, y would be using the value of total when the first byte was read, rather than the value of total when the second byte is read. Notice that this issue is not related to the I/O being nonatomic. The instruction x += y is usually not atomic, independent of how I/O is implemented.