CS 12 -- Project 3


Compressing English text

This project is easily described: write two Java programs, one that compresses English text and another that decompresses it. The details are as follows:

  1. The compression program should take two command-line arguments. The first is the name of the source (uncompressed) file, and the second is the name of the destination (compressed) file. Thus, it should be invoked like so:

    java Compressor original compressed

    The contents of the original should be compressed and written to the file compressed. While you may write as many classes as you see fit, the main() method for this program should be a member of the Compressor class.

  2. The decompression program should also take two command-line arguments. The first is the filename of the compressed source, and the second is the filename of the uncompressed destination. It should be invoked like this:

    java Decompressor compressed decompressed

    The compressed data in the file compressed should be decompressed and written to the file decompressed. The destination file's contents should be identical to those of the original file used to create the compressed representation; that is, the compression and decompression must be lossless.

    Again, you may write as many classes as needed, but the main() method of the decompression program must be a member of the Decompressor class.

  3. You must write a file, named README, that describes your compression algorithm at an abstract level. You must make it clear not only that your compressor works, but that you understand how it works. You will get little credit for a compression strategy that you cannot explain.

  4. All code must be carefully formatted and well commented. It should be clear how the compressor and decompressor work by reading the code and its comments.

  5. Both programs must run correctly! A program that crashes or mangles the data it compresses and decompresses will get little credit.
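
As a rough illustration of the command-line handling described in items 1 and 2, a skeleton for the compression program might look like the sketch below. It is not a compression strategy: the compress() helper is a hypothetical placeholder that returns its input unchanged, and the decision to read the whole file into memory is an assumption made for brevity. A Decompressor class could mirror the same shape.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Compressor {

    // Hypothetical placeholder: returns a copy of the input, unchanged.
    // A real submission would replace this with an actual compression scheme.
    static byte[] compress(byte[] input) {
        return input.clone();
    }

    public static void main(String[] args) throws IOException {
        // Exactly two arguments are expected: source file, then destination file.
        if (args.length != 2) {
            System.err.println("Usage: java Compressor <original> <compressed>");
            System.exit(1);
        }

        Path source = Paths.get(args[0]);
        Path destination = Paths.get(args[1]);

        // Assumption: the input is small enough to hold in memory at once.
        byte[] original = Files.readAllBytes(source);
        Files.write(destination, compress(original));
    }
}
```

With this skeleton, the invocation shown in item 1 simply copies original to compressed; the interesting work belongs inside compress().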

To test your work, I recommend creating a few short files of your own. Once your code is working, you can copy the following two files from my public directory to test the efficacy of your compression techniques:

~sfkaplan/public/cs12/alice-in-wonderland.txt
~sfkaplan/public/cs12/as-you-like-it.txt
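
One way to check both requirements at once (that the round trip is lossless, and how much space is saved) is a small helper program along the lines of the sketch below. The class name RoundTripCheck and its methods are hypothetical and not part of the assignment; it assumes the files fit in memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class RoundTripCheck {

    // True when the two files hold exactly the same bytes.
    static boolean sameContents(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    // Compressed size divided by original size; smaller is better.
    // For an empty original this is NaN or Infinity rather than an error.
    static double ratio(Path original, Path compressed) throws IOException {
        return (double) Files.size(compressed) / Files.size(original);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println(
                "Usage: java RoundTripCheck <original> <compressed> <decompressed>");
            System.exit(1);
        }

        Path original = Paths.get(args[0]);
        Path compressed = Paths.get(args[1]);
        Path decompressed = Paths.get(args[2]);

        System.out.println(sameContents(original, decompressed)
            ? "lossless: yes"
            : "lossless: NO");
        System.out.printf("compressed size / original size = %.3f%n",
            ratio(original, compressed));
    }
}
```

After running the compressor and decompressor as described above, it could be invoked like so:

    java RoundTripCheck original compressed decompressed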

A Competition!

You should submit the best compressor that you can write based on the above criteria. However, I will also evaluate the compressors against one another to see which one generates the smallest compressed files. The winner of this competition automatically receives an A on the final exam. There are some important rules to this competition:

A violation of these requirements will disqualify you from the competition. Note that I reserve the right not to award the prize: If the only submissions that meet the above requirements use a trivial approach to compression, there may be no winner of the competition. A winning submission must be sufficiently compelling to warrant exemption from the final exam.

I will judge your programs by using them on the sample files provided above as well as other English text files of my choosing. I will also judge them based on the clarity of the code and the explanation in the README file.


Submitting your work

Use the cs12-submit program to submit only your source code, which should have been written as a number of .java files (one per class). Submit your work as project-3, something like this:

cs12-submit project-3 Compressor.java Decompressor.java

This assignment is due on Friday, November 21st at 11:59 pm!

Scott F. Kaplan
Last modified: Mon Nov 3 14:12:24 EST 2003