CS 3733 Operating Systems, Fall 2011 Assignment 5


Assignment 5: Wednesday, December 7
Note: No late assignments for this one!

This assignment is a continuation of Assignment 4.
You will use a network of computers to find primes.
Please read the entire assignment before writing any code.


Part 1: The server
Write a parallel UICI server. The server should have one command line parameter, the port number it will listen on. When a connection is made it receives one long long integer from the client, a number to test to see if it is a prime. It sends back to the client a single character, '0' if it was not a prime, or '1' if it was a prime. After sending this single character it closes the connection. Use your isLongLongPrime function.

Write a simple client that takes three command line parameters, a long long to test, a port number, and the name of a host, in this order. It will send the value to the host and report back with a message whether that number was prime.

Test your client and server.

Note: It is not specified what format to use to send the number to the server. Choose a method. Whatever method you choose should work even if the client and server are running on different hardware.

Answer the following questions:
1) In what format is the number sent from the server to the client? Your answer should have sufficient detail to allow someone to write code that sends the number to the server.
2) How did you take into account the fact that a read does not always return the number of bytes requested?

Part 2: The client
Write a program that takes at least 4 command line parameters. The first two are similar to the one of your Part 3 program of Assignment 4. The first parameter is a starting prime number and the second is the number of primes to find. The third parameter is a port number. The rest of the parameters are host names.

The program will attempt to find the required primes by distributing the work to the hosts given on the command line. It will display the results as in Assignment 4, with the primes in numerical order. The only output to standard output should be the original number followed by the primes, one per line. For corresponding input, the output should be exactly the same as the output produced by Assignment 4, assuming no errors occur.

You need to design the algorithm used, with the restriction that the client program cannot directly check to see if any particular number is prime. That work needs to be done by the servers. The goal is to do this calculation in parallel. Do not solve the problem by sending a value to a host and waiting for the result before sending the next value.

Think about the following: How do you keep the servers busy without overloading them? If you are using 2 servers and send them each 1000 numbers to try, each will have 1000 children competing with each other, and make little progress. It might be better to send each server 1 number and not send another until the results come back. However, if the servers have 4 cores, this only uses 25% of the processing power of each server. Perhaps it would be better to keep each server busy doing 4 calculations.

Describe the algorithm that you use to solve this problem. Include the following:

  1. How do you distribute the work to the hosts?
  2. How do you wait for the results to come back so that the work is done in parallel and the servers are kept busy?
  3. How do you collect the results for display? Remember that you cannot display anything until all values have been calculated.
Test your program with a single server and the same test you used in Part 3 of Assignment 4. Produce corresponding output and save it. Run the same test as in Part 3 of Assignment 4 using test number 9 and finding 20 primes. Redirect standard output to part2a.out. Execute:
ls -l part2a.out
wc part2a.out
md5sum part2a.out
and save the result. Turn this in.
Do not continue on until this gives the correct result.

Now do the same test using 3 servers and save the output in part2b.out. Execute:
ls -l part2b.out
wc part2b.out
md5sum part2b.out
and save the result. Turn this in along with a description of the machines you used as servers. Make sure you state the number of cores each has.

Important: Keep track of which machines that you run your server on. It is your responsibility to kill the servers when they are not in use. You must remain logged into any machine that is running your server. Leaving the server running when you are not logged in, or for long periods of time, will result in a severe grade penalty.

Part 3:Making the client more efficient
There is an overhead involved in sending a number over the network. It does not make sense to do this if number of divisible by a small prime. Modify your client so that it only sends to the servers numbers which are not divisible by small primes. You must determine what "small" means in this context. Answer the following question:
3) What is the meaning of "small" in this context and how did you determine this?

Do the same tests as in Part 2 and save the output. Also, run timing tests and compare the results of Part 2 and Part 3. Put your results in an easy-to-read table and discuss the results.


Handing in your assignment
Use this cover sheet.