Introduction to Computer Science II -- Project 6 for 2007-Nov-16

Introduction to Computer Science II

Project 6 for 2007-Nov-16

Evaluating sorting algorithms

Overview

We have examined a number of sorting algorithms, many of them fundamentally different from the others. We have also performed a high-level categorization of the speed of each (what computer scientists' call an asymptotic running-time analysis.) However, mathematical analyses like that are just models---simplified approximations of what we expect the running-time of certain algorithms to exhibit when implemented and run on a real computing device. Thus, our primary question here is, is that model sufficiently realistic?

In order to answer that question, we need to perform some experiments. Specifically, we must implement some of the sorting algorithms that we have discussed, and then we to measure their performance as they run on a real computer. Measuring performance, however, is not quite as simple as it seems. Here are some of the issues that we must consider:

For small arrays, sorting will happen so quickly that we need a high-resolution timer to get useful measurements. Alternatively, we can sort the small arrays many times and then take the average based on a lower-resolution timer.
We weren't very good at determining the "average case"; instead, we mostly discussed the worst and best cases. How much do those diverge?
Even though the compiled program is the same, the hardware is the same, and the input data is the same, the amount of time required to execute some code may vary. By how much do we expect it to?
Complete, real computer systems are complex beasts. What other things are going on inside that could affect how long our sorting algorithms take?

Our goal will be to measure a handful of sorting algorithms, see how long they really take as the number of items to sort increases, and look for anomalies. Do the timings come out as expected? If not, why not?

Which algorithms to implement

You must implement three sorting algorithms. Specifically, you must implement:

Either selection sort or insertion sort.
Merge sort.
Counting sort
Optionally, if you are masochistic, you may try to implement stupid sort.

My code

I have provided code that should get you started. Note that as of 2007-Nov-09, this code has changed since your initial coping during lab. Get the new version! You can get a copy of it like so:

cp -r ~sfkaplan/public/cs12/project-6 .

You will see four .java files---for this project, you are welcome to examine the code that I have provided. Be aware that it is not as thoroughly commented as I would like, so you may have to infer a bit. Here are descriptions of each of the classes:

SortTest: This class contains the main() method that starts the program. It's job is to drive the testing of the given algorithm. Specifically, based on the name of the sorting algorithm requested by the user, it calls Sort.create() to make a Sort object. Starting at a size of 1 element and doubling the size with each iteration, it creates an array of that size and fills it with random int values. It then passes the array to the sorter, timing how long the sorting process requires. For each array size, some number (given by the user) of repetitions of that array size is performed.
Sort: An abstract class that defines the sort() method, which all subclasses must have. This method must not only sort the array of int, but must all return the number of comparison, array read, and array write operations performed. To assist with that task, it defines a data member, operations, that should be used to count such actions. It also provides methods that perform comparisons, array reads, and array writes, while incrementing the operations performed with each call. Note that you may need to add methods to this class or to subclasses for different uses of these operations.
BubbleSort: A sample Sort subclass that implements one type of sorting algorithm. Note that it uses the methods from its parent to keep track of the operations performed. Thus, it contains only the definition of the sort() method itself, with everything else being inherited.
Support: Provides the handy-dandy abort() method for cleaner error-checking code.

You will notice that the code is (now) fully commented and documented. All of the comments that preceed class declarations, methods, and data members are in a format known as JavaDoc. You can take advantage of these, getting a better overview of the details of each class, by following these steps:

Compile the documentation: Go into your project-6 directory. From there, issue the following command, which will create a number of additional files:

javadoc SortTest.java Sort.java BubbleSort.java Support.java
View the documentation: Open a web browser on the index.html file that was created by the previous command. You can do so by using the following command, again from your project-6 directory:

firefox index.html &

How to do the experiments

The goal is to test each sorting algorithm on a range of array sizes, performing multiple tests on each array size. Each test should count the total number of operations and the total amount of time required to perform the sort. The questions that we want to answer are:

What is the average time required to sort each array size with each algorithm?
What is the standard deviation of time required to sort each array size with each algorithm?
What is the rate of growth for time and operations for each algorithm as the arrays size increases?
Is there a consistent relationship between the number of operations and the measured time required to sort with each algorithm? If there is none, why not?

To answer these questions, you may need to collect the results of each timing in a manner that allows you to perform these computations. You may use Java code, a spreadsheet, or a statistics program (or maybe something else) to do these calculations. Depending on what you choose, you may need to change the output format of the timings to put them into a more convenient layout -- or you may not print them at all, and simply use them internally.

Submitting your work

From within your project-6 subdirectory, use the following command to submit you work (all of your Java code files) when you are done:

cs12-submit project-6 *.java

This assignment is due Friday, 2007-Nov-16, at 11:59 pm

Scott F. H. Kaplan

Last modified: Sat Nov 10 22:02:41 EST 2007