UTSA CS 3733 Operating Systems Assignment 4 Spring 2008

CS 3733 Operating Systems, Spring 2008 Assignment 4

Due Friday, March 28

In this assignment you will write several versions of a program to find the factors of an integer. We will use a fairly stupid algorithm. The purpose of the assignment is not to solve the problem, but to show how you might use concurrency to speed up the execution.

In each part you are asked to modify the program from the previous part. In doing so, make a new file containing the new program. You may want to name the programs part0, part1, etc. Save the programs for each part, as you may need to run them again.

Each program should start by outputting which part of the assignment this is, your name and the number that you are trying to factor. It should then list all of the prime factors, one per line. The output the number again along with the product of the factors you have found. Do this in such a way that it is easy to tell that these two numbers agree. Make sure you also include the appropriate words along with your output so you can tell what the numbers mean. Sample output might look like this:

part1 written by S. Robbins
Trying to factor 63269392157957
Number of prime factors is 2:
2712161
23328037
The prime factors of 
63269392157957 multiply to 
63269392157957

Notice how the last two numbers of the output appear one below the other so it is clear that they have the same value.

At the end of the assignment, the requirements for turning in your program are described. Run all of the tests on the same machine, if possible.

Basic algorithm:
Given an integer n, find the factors that are less than sqrt(n) by trying all possible values. If k is an integer between 2 and sqrt(n), k is a factor of n if n%k is 0.

For each value of k in this range we test whether n%k is 0. If it is, we have found a factor. We store the factor, replace n by n/k and calculate a new sqrt(n). We then continue looking from where we left off. In doing so, we need to use the same value of k again in case a power of k divides n.

When we are done, we have a list of factors of n. Convince yourself that these must be prime factors. Multiply them together to get m. If m is not equal to n, then n/m is the remaining prime factor of n.

Do not modify the algorithm to make it more efficient. For example, we know that other than 2, prime factors must be odd. Do not use this fact.

Part 0
Write a program that takes one command line parameter, an integer, n, and determines the prime factors of n using the method described.

Notes:

You may use atoi to convert the command line parameter to an int.
Exit the program with an appropriate message if it does not have exactly one command line parameter.
You will have to find a way to store an arbitrarily long list of integers. Do not assume a limit to the size of this list (even though we know that a 32-bit int cannot have more than 31 factors). Write a function called saveValue that saves a given value in a list.
saveValue will need to do some memory allocation and therefore can produce an error. Find a way to handle this in which you will be able to tell if an error occurred and print out an appropriate message. In this case, output the values found so far. saveValue should not do any printing, but should have a way of indicating to the calling program that an error occurred.
Test your program with some big numbers. It should be pretty quick no matter which number you use. It will take the longest when the number is prime or the product of two large primes, but it should produce the results in well under a second.

Here are some good int values to try:

12347
135799
399168
1760251
1493749
16431211
21172267
62270208
180743191
202591578
254168543
1418141046
1988174621
2147483647

Part 1
Convert your program to use the data type long long. This is usually a 64-bit data type. Very little of your program needs to be changed, except for replacing int by long long in the appropriate places. Only use long long where necessary. For example, a counter for the number of prime factors should still be an int. You can use atoll to convert the command line string to a long long. Make sure saveValue can handle long long values. Be sure that when you calculate sqrt(n) you round up, rather than truncating. Some long long may not exactly fit in a double and so the sqrt function may not give the exact answer when n is a perfect square.

Try your program with the same values for Part 0, plus some additional larger values below.

Here are some larger values you might want to try:
The ones marked with a * may take a long time to run.

3050639959
21869920649
36612831053
439366540651
5272420096343
29108864245399
63269392157957
121645100408832
320197506699379
5109094217170944
3522172573692853
5109094217170944
38743898310620761 *
57227106440852922
112400072777760768
426182881416828241 *
1311956179350817129 *
2585201673888497664
4611685975477714963 *
4688011695585110197 *
9223372036854775783 *
9223372036854775802
9223372036854775807 *

Part 2
Here we will prepare to exploit the use of 2 CPUs. We will break the problem into two parts and eventually run each part on a different CPU. For now we will run the two parts sequentially.

Modify Part 1 to use functions, findFactors1 and findFactors2. Each of these takes a number n as a parameter. The first function searches for factors in the first half on the range (2 to sqrt(n)/2) and the second one searches the second half of the range. Each of these should use the data type long long. If findFactors1 finds a factor, adjust the value of its n and its range, accordingly. Call findFactors1 followed by findFactors2. findFactors2 should use the original value of n.

Note: When findFactors1 finds a factor, it should adjust its maximum value to sqrt(n) rather than sqrt(n)/2. Why?

Note:There are cases in which findFactors2 will find factors that are not prime. Do not worry about this for now. It will be fixed in Part 3.

Part 3
When you ran the program from Part 2, you might have noticed that:

This program sometimes runs much slower than the one in Part 1.
If findFactors1 finds any factors, it will find all factors less than sqrt(n) so findFactors2 will never find any additional factors.
If findFactors1 does not find any factors, findFactors2 will find at most one factor and it will be prime.

Convince yourself that these statements are true. Modify Part 2, to take these things into account. The program should now run at about the same speed as in Part 1.

Part 4
We want to run findFactors1 and findFactors2 concurrently. Fork a child. Have the parent run findFactors1 and have the child run findFactors2. If the child finds a factor (it can find at most 1) it must communicate that factor back to the parent. Note that this will not fit in 8 bits, so the return value cannot be used. One way is to use a pipe. Implement this.

If findFactors1 finds a factor, then the child cannot. Kill the child if findFactors1 finds a factor. This should speed up execution on a single-CPU machine.

Part 4 can run up to twice as fast as Part 3 if you have 2 CPUs. Run Parts 3 and 4 for some of the cases that take a long time and see how the running times compare. Most of the machines we have in the lab will only run one CPU. If you have access to a dual-core Linux machine, run some tests and determine under which circumstances Part 4 runs much faster than Part 3.

Part 5
We can speed this up even more in some cases. When findFactors1 in Part 4 finds a factor and kills the child, it has reduced the problem to a smaller size. Have it handle only half of what is left and create a child to handle the other half.

Handing in your assignment
You will hand in the source code for each of the 5 parts of the assignment. In addition, you will run specific tests for each part. You will time each test using the time command. Make sure the time command you are using displays the user time, the system time and the elapsed time.

Run the following tests for each part:
For Part 0, run tests 1, 4, 8, 10 and 14. Include timing information for each, but these should all run very quickly.

For each of the remaining parts, run the following test cases from the list under the Part 1 discussion: 9, 10, 17, 18, 19. Include timing information for each. Which of these tests run slower under Part 2 than Part 1? Did Part 3 solve this problem?

Use this cover sheet.