Introduction to Computer Science II

Project 7 for 2007-Dec-02

State machines for parsing

We seek to perform parsing of numbers using a state machine. In this lab, we need to implement classes that, through good object oriented design, allow us to represent a state machine. We should be able to write code that constructs the state machine and then uses it to parse a particular kind of input, determining whether that input can be successfully identified by the state machine.


Classes for a state machine

A state machine comprises states (vertices) and transitions (directed edges). We will represent each as objects in our programs. Specifically, each state contains a number of outgoing transitions, each of which correspond to some subset of possible inputs to the machine. We need a way to represent these states and transitions, and a strategy for traversing that structure as we process each symbol of the input.

First, we need classes that will represent the structure itself:

Next, you will need an "outer" class that encapsulates the entire state machine and its interface. Specifically, there must be an IntegerParser class. An object of this class constructs and controls a state machine, built from State and Transition objects, that is capable of parsing an integer. An IntegerParser should have one public method (in addition to a constructor) that provides its basic interface to other classes:

public Integer parse (String s)

This method should use the state machine contained by the IntegerParser object to determine if the String object to which s points is valid integer value. If it is, this method should return an Integer object with the parsed value; otherwise, null should be returned. This method must traverse the state machine, using one character from the String at a time, until either a failure state or a success state is achieved, triggering the generation of a return value for the method.

A helpful observation: Few methods that you will write for this project are complex. The challenge of this project is not in the complexity within individual methods, but rather the complexity of the interactions between objects. Do not overthink the role of each method, as it may well be simple. Be aware that the power of the state machine comes in its structure, and not from any actions that any one part of it performs.


So what is an integer?

Certainly we would accept "13" and "-8" as integer values. But what about "-34,123,998"? That's a valid integer, too.

For the first version of your code, don't worry about integers that contain commas to separate every 3rd digit. However, note that if your code works correctly, accepting integers with commas is only a matter of constructing the state machine differently; the rest of your code should not need to change.


How to build the state machine

It requires some thought to determine how to construct an actual state machine from the objects described above. Specifically, you must create all of the State and Transition objects, and somehow make them point to one another. You also must be sure that each Transition contains the appropriate set of symbols that indicate which inputs will trigger that transition to be followed.

Thus, you must write these classes with an eye toward the interface that you need to create these objects. For example, once I create a particular State, it must be possible to then call a method on that object to tell it where its outgoing Transitions are. It must also be possible to pass to each Transition which state is its target. Thus, you will need an appropriate set of methods to do so.

Once you have these methods, you must chart out the sequence of object creations and method calls on those objects to make the full state machine.


Your assignment

Write the classes described above. You may choose to write other classes not described if you find that they would be helpful. Be sure to choose good names for such classes so that their use is obvious. Also be sure to divide the tasks cleanly into the classes and their methods. Use good object oriented design -- wise use of inheritence should make the interaction with states and transitions simple.

Finally, write a main() method in IntegerParser that creates an IntegerParser object and passes Strings taken from the command line that may be an integers or not. The command-line usage should look like this:

  % java IntegerParser 13
  Value = 13

  % java IntegerParser -6,153
  Value = -6153

  % java IntegerParser foobarbaz
  Not an integer = foobarbaz

  % java IntegerParser -8 15quux81 38,38 98,998 003,519
  Value = -8
  Not an integer = 15quux81
  Not an integer = 38,38
  Value = 98998
  Not an integer = 003,519
    

Note that, in your final version, leading zeros should be considered illegal, even though, technically, all integers have a (usually unwritten) infinite sequence of leading zeros.


What you must submit

Once you have written and tested the various classes, you must use the cs12-submit command to submit them. Submit this assignment as project-7 when you use the cs12-submit command.


This assignment is due on Sunday, 2007-Dec-02, at 11:59 pm.

Scott F. Kaplan
Last modified: Sun Nov 5 22:48:52 EST 2000