CS 3733 Operating Systems, Fall 2002 Assignment 1


Due Thursday, September 19, 2002

Assignment 1 has 6 parts.


Introduction
The purpose of this assignment is to test your skills at C programming and testing, including the use of strings and pointers. You will also write routines that may be useful in a later assignment.

Web browsers and web servers communicate using a protocol called HTTP. In the simplest situation, a browser sends a request to a web server and the web server sends a reply. The initial line of a request has the following format:
Method SP request-URI SP HTTP-version CRLF
SP represents white space, any number of blank and tab characters. CRLF represents a carriage return followed by a line feed. The Method is one of a small number of valid requests. The one of interest to us is GET.
The request-URI gives information about the location of the requested resource. For this assignment, the HTTP-version will be HTTP/1.0.

Some request-URIs specify a host name while others do not. If the host name is not specified, the request-URI is just an absolute path, that is, a path to a file starting with an initial / character.

If the host name is specified, we will assume that the request-URI starts with http:// which will be immediately followed by the host name and the absolute path.

In this assignment you will be parsing the initial line of an HTTP request and outputting information about the request.

You may use any of the C library routines to do this assignment, except for those whose man pages specifically indicate that they are Unsafe in multithreaded applications. In particular, you may not use strtok. You should also avoid using strtok_r for this assignment. The test programs you write (the ones whose names end in test) may assume that input lines have a reasonable maximum length and may truncate the input if necessary. The other programs may not assume anything about the size of the input. No programs that you write should allow a buffer overflow to occur. For example, avoid using gets, but fgets is OK.
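One way a test program could read its input without risking a buffer overflow is sketched below. The helper name read_line_truncated and the decision to end a truncated line with a line feed are assumptions made for illustration; the assignment does not prescribe them.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (capacity size, at least 3).
 * Returns 1 if a line was read, 0 on end of file or error.
 * Over-long lines are truncated; the stored line always ends in '\n'. */
int read_line_truncated(FILE *in, char *buf, size_t size)
{
    size_t len;
    int c;

    if (size < 3 || fgets(buf, (int)size, in) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return 1;                     /* the whole line fit */
    /* Line too long, or last line lacks '\n': discard the remainder. */
    while ((c = getc(in)) != EOF && c != '\n')
        ;
    if (len == size - 1)
        len--;                        /* make room for the line feed */
    buf[len] = '\n';
    buf[len + 1] = '\0';
    return 1;
}
```

Because fgets never stores more than size - 1 characters, the buffer cannot overflow, and the routines under test can still rely on finding a line feed at the end of the array.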


Part 0: Looking for the GET
Make a clean directory called assign1 for this assignment. Make a part0 subdirectory for this part. Copy all of the files from /usr/local/courses/cs3733/spring2002/assign1/part0 into this directory. Write a function with the following prototype:

int parse_simple(const char *inline);
that returns 1 if the array given by inline starts with the token GET. It returns -1 if it does not. The array is terminated by a line feed. Note that this is not a string as it may not have a string terminator. A valid GET token may be preceded by any amount of white space (blanks or tabs) and will be followed by at least one blank or tab. Tokens may contain any characters except for blanks, tabs, carriage returns or line feeds. Case is important here. Put your function in a file called parse_simple.c. Note that parse_simple is not allowed to change the input line.
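A minimal sketch of this check follows, as one possible approach rather than the required one. The parameter is named line instead of inline because inline is a keyword in C99 and later, so a modern compiler would reject the original name. The function never reads past the line feed that ends the array.

```c
#include <stddef.h>

/* Sketch: returns 1 if the '\n'-terminated array starts with the token
 * GET (after optional blanks and tabs), otherwise -1. */
int parse_simple(const char *line)
{
    const char *p = line;
    const char *get = "GET";
    size_t i;

    while (*p == ' ' || *p == '\t')       /* skip leading white space */
        p++;
    for (i = 0; get[i] != '\0'; i++)      /* compare byte by byte; a '\n' */
        if (p[i] != get[i])               /* mismatches and stops the scan */
            return -1;
    /* the token must be followed by at least one blank or tab */
    return (p[3] == ' ' || p[3] == '\t') ? 1 : -1;
}
```

For instance, "GET /index.html HTTP/1.0" followed by a line feed is accepted, while "get / HTTP/1.0" and "GETX / HTTP/1.0" are rejected because case matters and the token must end after the T.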

Write a main program called parse_simple_test for testing this. Your test program will read lines from standard input, pass them to parse_simple, and display the result in human-readable form. Use separate compilation and the makefile you copied to compile and lint this program. You will hand in the lint output and sample output generated by testing the program. Use cut-and-paste to put the lint output and the test output in a file to be printed.


Part 1: Looking for three tokens
Copy all of the files from your part0 directory into a new directory called part1. Execute "make clean". Rename parse_simple_test.c to parse_token_test.c, rename parse_simple.c to parse_token.c, and modify the makefile accordingly. Change parse_token_test to call parse_token instead of parse_simple. Now modify parse_token so that in addition to checking for a valid GET request, it also checks to see if the input line has exactly 3 tokens. Tokens are still separated by blanks or tabs, but the last token is not required to have a blank or tab before the line feed. It is allowed, but not required, to have a carriage return right before the line feed. Run lint, compile and test the program as in Part 0.
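The token rules above can be illustrated with a small counting routine. This is only a sketch: count_tokens is a hypothetical helper name, and returning -1 for a carriage return that is not immediately followed by the line feed is an assumption based on the rule that tokens may not contain carriage returns.

```c
/* Hypothetical helper: counts the tokens in an array terminated by '\n'.
 * Returns the count, or -1 if a '\r' appears anywhere except right
 * before the final '\n'. */
int count_tokens(const char *line)
{
    const char *p = line;
    int count = 0;

    for (;;) {
        while (*p == ' ' || *p == '\t')   /* skip separators */
            p++;
        if (*p == '\r')                   /* legal only right before '\n' */
            return (p[1] == '\n') ? count : -1;
        if (*p == '\n')
            return count;
        count++;                          /* a token starts here */
        while (*p != ' ' && *p != '\t' && *p != '\r' && *p != '\n')
            p++;
    }
}
```

A caller could then accept the line only when the GET check succeeds and the count is exactly 3.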


Part 2: Parsing the initial request
Copy all of the files from your part1 directory into a new directory called part2. Execute "make clean". Rename parse_token_test.c to parse_initial_test.c, rename parse_token.c to parse_initial.c, and modify the makefile accordingly. Change parse_initial to have the following prototype:

int parse_initial(char *inline, char **commandp, char **serverp, char **pathp,
                  char **protocolp);
The first parameter will be as before, but note that it is no longer constant. You will modify the line so that string terminators are put in after each token. If it returns without error, *commandp will point to the first token, *pathp will point to the second token, and *protocolp will point to the third token. These will now each be strings because of the string terminators you insert. The *serverp parameter will be set to NULL. (This will be used in Part 3.) The parse_initial function will return -1 on error or 1 on success as before. Modify parse_initial_test so that if there is no error, it outputs the command, path, and protocol on separate lines in a nice format. The only change to the input line that should be made is to replace 3 bytes with string terminators.

The example below shows a case in which there are two blanks after the GET and after the path and no blanks after the protocol. The first blank after the GET and the first blank after the path are replaced, along with the carriage return. Note that if the optional carriage return had not been there, the line feed character would have been replaced.
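The byte replacement can also be sketched in code. The helper split_three is hypothetical and assumes the line has already been validated as holding exactly three tokens; it is an illustration of the in-place technique, not the required structure of parse_initial.

```c
/* Hypothetical helper: line ends in '\n' and is already known to hold
 * exactly three tokens. Stores a pointer to each token in tok[] and
 * overwrites the single byte that follows each token with '\0'. */
void split_three(char *line, char *tok[3])
{
    char *p = line;
    int i;

    for (i = 0; i < 3; i++) {
        while (*p == ' ' || *p == '\t')   /* skip blanks before the token */
            p++;
        tok[i] = p;
        while (*p != ' ' && *p != '\t' && *p != '\r' && *p != '\n')
            p++;
        *p++ = '\0';                      /* exactly one byte per token */
    }
}
```

With two blanks after GET, only the first blank becomes a terminator and the second is left untouched. After the third token the replaced byte is the carriage return if one is present, otherwise the line feed.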


Part 3: Parsing the server
Copy all of the files from your part2 directory into a new directory called part3. The parse_initial function will behave exactly as in Part 2 unless the second token starts with http:// and also contains at least one additional / character. In this case the name of the server is between the second / following http: and the next / that starts the path. You must move the server name to the left one character and put a string terminator between the server name and the path. Set *serverp to point to the server name and return 2. The figure below shows this in a case in which there are two blanks after the GET and after the path and no blanks after the protocol.
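Moving the server name one character to the left frees exactly one byte for the terminator without disturbing the path. A sketch of that step is shown here; split_server is a hypothetical helper that works on the second token after it has been terminated, and the host name used in the note below is only an example.

```c
#include <string.h>

/* Hypothetical helper: uri is the NUL-terminated second token.
 * Returns 1 and sets *serverp and *pathp if a server name was split
 * out, or 0 if the token has no http:// prefix or no following '/'. */
int split_server(char *uri, char **serverp, char **pathp)
{
    char *host;
    char *slash;

    if (strncmp(uri, "http://", 7) != 0)
        return 0;
    host = uri + 7;                       /* first byte after "http://" */
    slash = strchr(host, '/');            /* the '/' that starts the path */
    if (slash == NULL)
        return 0;
    /* regions overlap, so memmove is used instead of memcpy */
    memmove(host - 1, host, (size_t)(slash - host));
    slash[-1] = '\0';                     /* terminator between name and path */
    *serverp = host - 1;
    *pathp = slash;
    return 1;
}
```

For the token http://www.example.com/a/b.html, the server name www.example.com ends up starting where the second / used to be, and the path /a/b.html stays where it was.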


Part 4: Logging HTTP GET commands

Copy all of the files from /usr/local/courses/cs3733/spring2002/assign1/part4 into a part4 directory along with your parse_initial.c. You will have a makefile and a few test files. Write a function with the following prototype:

int command_logger(int fd, char *inline);
Put this in the file command_logger.c. The first parameter is an open file descriptor and the second is a line to be interpreted as a possible HTTP GET command line. Note again that this parameter might not be a string. This function writes logging information to the given file descriptor in the format described below. The first line is a copy of the line given by inline. Next is either a single line indicating that this is not a valid GET line, or 4 lines giving the four parts on separate lines. If it is not a valid GET (as determined by parse_initial), output the following line exactly:
Line is not a valid GET command.
After the period there should be a single line feed character. If it is a valid GET command, output 4 lines in the format below:
Command: commandname
Server: servername
Path: pathname
Protocol: protocolname
Each of these lines should end in a single line feed with no carriage returns. There should be a single blank character after the colon. The commandname, servername, pathname, and protocolname should come from parse_initial and contain no white space. If parse_initial returns a NULL pointer for the server, then the line feed will appear after the blank following the colon on the server line.

The function command_logger returns 0 on success and -1 on error. The command_logger should only return an error if an I/O error occurs. It is not considered an error for command_logger if parse_initial returns -1. The only I/O command_logger does is to the file descriptor parameter as described above.
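Since the output goes to a file descriptor rather than a FILE pointer, the write system call is the natural tool, and write may transfer fewer bytes than requested. The following sketch shows one common way to handle that; write_all is a hypothetical helper name, not something supplied with the assignment.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: writes all n bytes of buf to fd.
 * Returns 0 on success, -1 on an I/O error. */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w == -1) {
            if (errno == EINTR)           /* interrupted: try again */
                continue;
            return -1;                    /* genuine I/O error */
        }
        buf += w;                         /* skip what was written */
        n -= (size_t)w;
    }
    return 0;
}
```

A call such as write_all(fd, "Server: \n", 9) would produce the server line for the case in which parse_initial reports no server.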

Lint and compile your program using the makefile given. The logger_test program has been supplied. It reads lines from standard input and sends them to command_logger using standard output for the output. Test your program thoroughly. When you think it is working correctly, execute:

   logger_test < infile > outfile
The resulting outfile should be identical to outfile_correct. Use diff to compare these. There should be no differences.


Part 5: Final test

Copy all of your files from your part4 directory into a part5 directory. Then copy the files from /usr/local/courses/cs3733/spring2002/assign1/part5 into this directory. This will give you a new makefile and a new object file called tester.o. Rewrite command_logger so that all output generated by a given call to it is written with a single write statement. Do not make any assumptions about the size of the input line. This will require that you allocate and free storage inside command_logger. Under no circumstances should command_logger allocate any memory that is not freed before it returns. If it does, this is called a memory leak. Memory leaked in this way cannot be freed without terminating the process. Test to see that your program produces exactly the same output as in Part 4.
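One possible shape for the single-write version is sketched below for the valid GET case. The helper log_report, its parameters, and the REPORT_FMT macro are assumptions for illustration. The sketch also assumes the caller measured the original line (up to and including its line feed) and saved a copy before parse_initial modified it, and it treats a short write as an error.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define REPORT_FMT "Command: %s\nServer: %s\nPath: %s\nProtocol: %s\n"

/* Hypothetical helper: sends a copy of the original line (line_len bytes,
 * including its '\n') followed by the four report lines, using one write
 * call. server may be NULL. Returns 0 on success, -1 on error. */
int log_report(int fd, const char *line, size_t line_len,
               const char *cmd, const char *server,
               const char *path, const char *proto)
{
    size_t total;
    char *buf;
    int n;
    ssize_t written;

    if (server == NULL)
        server = "";
    /* sizeof REPORT_FMT counts the four "%s" and the '\0', so it
       overestimates the fixed text and leaves room for the terminator. */
    total = line_len + sizeof REPORT_FMT + strlen(cmd) + strlen(server)
          + strlen(path) + strlen(proto);
    buf = malloc(total);
    if (buf == NULL)
        return -1;
    memcpy(buf, line, line_len);
    n = snprintf(buf + line_len, total - line_len, REPORT_FMT,
                 cmd, server, path, proto);
    if (n < 0) {
        free(buf);                        /* free on every exit path */
        return -1;
    }
    written = write(fd, buf, line_len + (size_t)n);
    free(buf);
    return (written == (ssize_t)(line_len + (size_t)n)) ? 0 : -1;
}
```

The buffer size is computed from the actual lengths, so nothing is assumed about the size of the input line, and every return path after the allocation frees the buffer.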

When you are satisfied that your program is working correctly, run the tester program. If all goes well you should have two messages that say "Congratulations". If not, try to find your errors and correct them. You should only run the tester program when you think your program is working correctly. Try to run the tester program as few times as possible. Keep track of the number of times you ran it. If tester reports an error, the file tester.log will contain information about what test failed.



Handing in the program
Print out the cover page and fill in the page numbers of the various parts of the assignment. Include all of the documents indicated on the cover page. Use a single staple in the upper left corner to attach the cover page to your output. Hand everything in at the beginning of class on the due date.