Getting started with C++

Depending on your background, you may need to read only certain parts of this lengthy page. It describes how to connect to the Math/CS Department server (graphically), get started with XEmacs, structure C++ files (including basic use of classes and objects), compile and link C++ programs, and use a source-level debugger on a C++ program. Skip over any parts that you don't need!

Logging into sirius

For this course, I will assume that you will use sirius.cs.amherst.edu (the Math/CS Department server) to do your work. While you are welcome to do your work on another system (perhaps your own desktop/laptop machine), your code must compile, run, and be submitted on sirius. This page and others like it will assume that you are using sirius.

Note that if you are in SMudd 007, you can simply log in to sirius directly using one of the NeoWare X-Terminal machines. You do not need SSH or VNC.

We begin by describing how to connect to sirius from outside of SMudd 007 using VNC, which will provide you a graphical interface. More specifically, this section will describe how to establish a VNC connection that is tunneled through an SSH connection. The use of SSH ensures that your communication is encrypted, which is a good habit considering that the Amherst College network, like many other networks connected to the Internet, is open and rather insecure. If you are connecting from outside of the Amherst College network, you must use SSH tunneling to establish a VNC connection. I recommend that you use SSH tunneling under all circumstances.

The following steps will allow you to establish a VNC connection using SSH tunneling:

Connect to sirius using an SSH client configured to establish the tunnel: The choice of SSH client and the description of how to use it depend on the type of system that you are using. Below are instructions for Mac OS X, Linux/UNIX, and Windows systems. If you have another type of system (such as Mac OS 9), contact me and I will attempt to point you to a web page or a person that can help you.
- Mac OS X, Linux, and other forms of UNIX: Recent versions of these OS's come with ssh installed. If your system does not have it, try The OpenSSH home page. Begin by opening a shell. (For Mac OS X users, use Applications -> Utilities -> Terminal to open a shell.) At the command line, simply type:
  ssh -L 5900:sirius.cs.amherst.edu:5900 username@sirius.cs.amherst.edu
  
  Of course, you would replace username with your username on sirius. Also note that if you are on the Amherst College network, you may not need to type the trailing .amherst.edu.
  
  You will be prompted for your password, so simply provide it. Assuming that your login proceeds normally, you will have established a tunnel for VNC connections (which use port 5900). Simply leave this login open so long as you are using VNC.
- Windows: For those on the Amherst College network, the college provides an SSH client. On the college's public machines, use Start -> All Programs -> Network and Internet -> SSH Client. For your own machine, check the college's software collection for students. (See The Amherst College IT pages to find out what software is available to you and how to get it.) Alternatively, if you are not an Amherst College student or not connected to the College's network, try The SSH home page. In their Downloads section, you may download the Windows Workstation client for free as a non-commercial user.
  
  Once you have (installed and) started the SSH Client, select Edit -> Settings.... On the left side of the setting window, select Profile Settings -> Tunneling, and the right side of the window should provide you tunneling options. Click the Outgoing tab, and click the Add... button. Fill out the window that appears as follows:
  - Display Name: VNC tunnel
  - Type: TCP
  - Listen Port: 5900
  - Allow Local Connections Only: Check.
  - Destination Host: sirius.cs.amherst.edu
  - Destination Port: 5900
  Click OK on this window, and then click OK for the settings window. So that you don't have to do this again (assuming that this is your machine and not a public College machine), select File -> Save Settings.
  
  Click the Quick Connect button on the main SSH window. Fill out the window as follows (noting that the lower two values are most likely already correctly provided):
  - Host Name: sirius.cs.amherst.edu
  - User Name: Your username on sirius
  - Port Number: 22
  - Authentication Method: <Profile Settings> or Password
  Click Connect, and shortly after, a window should appear asking for your password to sirius. Enter it, and you should be logged into sirius and given a command prompt. At this point, you've established a tunnel for VNC and should leave yourself logged into sirius via SSH until done with your VNC connection.
Open a VNC connection: Again, the software used is system dependent, and I describe how to use software for each of the systems described above:
- Mac OS X or Mac OS 8.1 (or later): Download a copy of VNCThing (you can also get it from the VNCThing homepage), which is a VNC client for this platform. (Note that this same VNC client is-or-soon-will-be available on the College's public Macs.) Start the application, and fill out the window that appears like so:
  - Server: localhost
  - Password: Leave blank.
  Click OK, and a new window should appear containing an entire graphical interface to sirius. Note that you can use Command-F to toggle between this format and a full screen mode.
- Linux/UNIX: Again, recent distributions have a VNC client included. If yours does not, you can download the RealVNC client as a tarball, or as an RPM package. You could also visit the RealVNC home page.
  
  You must be logged into your machine and running an X-Windows server on your console. Open a shell, and give the following command:
  
  vncviewer localhost
  
  A new window should appear containing an entire graphical interface to sirius. To obtain full screen mode, try pressing the F8 key, and a drop-down menu should appear with that option.
- Windows: You can download a copy of RealVNC, or you can visit the RealVNC home page. This is the same VNC client used on the College's public Windows machines.
  
  When you start your VNC viewer, fill out the window that appears like so:
  - Server: localhost
  - Password: Leave blank.
  (Note that this window may be different if you use another VNC client.) A new window should appear with an entire graphical interface to sirius. To obtain full screen mode, click on the upper-left corner of the window frame to get the window's drop down menu, and select the full screen option.
Login to sirius: Now that you are graphically connected, simply use the login window. A windowing environment will appear.
Open a shell: There should be, along the bottom of your screen, a set of icons. Among these should be an icon that looks like a red hat in the lower-left corner. It works like a Windows "start" menu. Click that, and then select System Tools, and within that menu select Terminal. You should get a new terminal window in which a shell will start and you will be given a command line.

You can now issue commands or use the graphical interface to work on sirius. Note that there are an enormous number of programs available on this system, and a number of variations that you can choose for graphical interaction, including completely different windowing environments. Experiment with these at your own risk!

On Using XEmacs

I will assume that you are going to use XEmacs as your programming editor and environment. Feel free to use something else, but I will likely be able to help you only with this particular program (or, for those who prefer it, the original and similar Emacs). I recommend XEmacs because it will help to format C++ code, allow you to open a shell within it, direct your compilations (see below), and direct your debugging (see below).

I will also assume that you are at least modestly familiar with XEmacs from previous courses. If this program is completely new to you, try the following steps:

From a command line, type:

xemacs &

Note that the ampersand (&) causes the program to run in the background, thus continuing to run the program while also allowing you to enter other commands.
When a new window appears, select from the menus Help -> Tutorials -> English. (Of course if you prefer, you can choose another language!) This command will lead you through an extensive tutorial that will help you get familiar with how XEmacs works and what it can do.

You may notice that in using XEmacs, the window may split, perhaps multiple times. (This window splitting occurs, for example, when you use XEmacs to direct compilation and debugging.) You can control this splitting with the commands given below. (Try them!)

To return the window to show a single buffer:

C-x 1
To split the window horizontally:

C-x 2
To split the window vertically (which can be particularly handy if you've enlarged your window to cover the entire screen):

C-x 3

Some of the sections below will describe more advanced uses of XEmacs that you are likely to find helpful in working on these projects.

C++ File Layout

C++ is a modular language in that you can write independent modules of code, each of which can be separately compiled into an object file; later, the object files can be linked together to form a single executable file. Java is somewhat similar in that each .java file is its own module of code. The Java compiler (javac) will compile each such file into an object file (.class). When you run a Java program using java, one of the things it does (more or less) is link those object files to form a single, running program.

First, observe that the commands for compiling, linking, and running C++ programs are different from those for Java. Second, note that each C++ module comprises two files instead of one. This section will describe the basics of creating such a module, and the next section will describe how to compile it and link it with others.

We'll begin with a very simple, single-module program. In this simple case, we will need only one file for the module, and we will be able to compile and link it in one step. Consider a single file, named sample-program.cc, with the following contents:

#include <stdio.h>

int double_it (int x) {

  return (x * 2);

}

int main () {

  int i = 15;
  int j = double_it(i);

  printf("i = %d, j = %d\n", i, j);

  return 0;

}

There are a few important things to notice about this simple program:

The #include directive, as shown, is a special way of informing the C++ compiler of some other module you want to use. Here, stdio.h provides a set of standard C input/output functions -- in this case, the function printf(). The angle-brackets (<>) indicate that this other module is a standard one, and should be found "automatically" by the compiler in some standard location.
The function double_it() must appear before its use in main(). Otherwise, the compiler will complain that it is unaware of any such function name.
The main() function is the entry point for the program, just as it is in Java. There must be a main() in exactly one module that you link to form a valid executable.
The printf function is an odd one. The first argument provides a format string, showing how the output should appear. The instances of %d in that string indicate places where integer values will be inserted by the function in decimal form when producing the output. Each argument that follows the format string will provide the value to be inserted, where arguments should be given in the order corresponding to the %d instances.
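
The ordering requirement in the second observation can also be met with a declaration (a prototype) placed ahead of the first use, leaving the definition for later in the file. The following is a minimal sketch of that idea, reusing double_it() from the sample program; report() is a hypothetical helper added only for illustration, and main() is left out so that the listing can be dropped into a program of your own.

```cpp
#include <stdio.h>

// Declaration (prototype): tells the compiler the name, return type, and
// parameter types of double_it() before any code uses it.
int double_it (int x);

// A hypothetical helper that uses double_it() before its definition appears.
int report (int i) {

  int j = double_it(i);
  printf("i = %d, j = %d\n", i, j);
  return j;

}

// The definition can now come after its first use.
int double_it (int x) {

  return (x * 2);

}
```

Calling report(15) from main() prints i = 15, j = 30. Header files, described later on this page, rely on the same separation between declaring something and defining it.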

To obtain more information about printf() and functions related to it (as well as a large number of standard C functions), use the manual pages. At a command line, you can type the following to see the page on printf():

man -S 3 printf

Note that the -S 3 is necessary because there is also a printf command available at the command line. By providing the extra option to man, you ensure that you'll see a description of the printf() function used in C/C++ programs.

Alternatively, you can view manual pages within XEmacs. Use M-x manual-entry, and when prompted for which manual entry, enter printf(3). The manual page will appear in a new buffer.

You can compile this single C++ module directly into an executable form, using the following command line:

g++ -o sample-program sample-program.cc

The result will be an executable file named sample-program. To run this file, you simply enter the filename as a command to the shell, like so:

./sample-program

Notice that you must precede the filename with a ./ to ensure that the program will run. The dot (.) indicates to the shell that you're interested in something within the current directory, and the slash (/) separates the current-directory indicator from the filename itself. Without this prefix, the shell may not know that it should look for executable programs within the current directory, and will thus complain that it cannot find the command named sample-program.

Now we can move on to seeing how to create a C++ program using multiple modules. Our example will include the use of object orientation, since it is likely that you will want to create classes and objects for later projects.

Consider first a module that contains only the main() function for this program. Since it only contains that function, it comprises only a single file named main.cc whose contents are as follows:

#include <stdio.h>
#include "Foo.hh"

int main () {

  // Create a new Foo object on the heap.
  Foo* x = new Foo(5);

  // Get the Foo object to do its thing.
  x->do_something();

  // Eliminate the object.
  delete x;
  x = NULL;

  return 0;

}

Notice that there is a second #include directive for this module. Because it will make use of the Foo class which will be contained in another module, this one must import that module to compile successfully. Because the module being used is not one of the standard modules, we place the filename in quotes ("") to indicate that the compiler should search the current directory to find that file.

Now we create a module for the Foo class. Note that it is good programming practice (but not strictly necessary) to write one module for each class that you create. Also note that in C++, a module doesn't need to contain a class, as it may contain any collection of data and functions (thus not using the object oriented features of the language). We will see some non-OO use of C++ in later projects.

Any module that is going to be imported by another module (as is the case with the Foo module here) must be written as two files. These files are:

The header file (using the .hh suffix) will define classes and declare functions and variables that may be used by other modules. In short, it is the external interface that should be presented to other modules, revealing only the elements necessary to use that module.
The code file (using the .cc suffix) will contain the definitions of functions and methods and the assignments of variables. In other words, it is the internal, inner workings of a module, usually consisting of the actual function and method definitions.
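
To illustrate this split for a module that contains no class at all, here is a sketch of a hypothetical Counter module. Both files are shown in one listing, separated by comments; the names Counter.hh, Counter.cc, counter_total, and counter_add() are invented for this example. In a real project, the second part would live in its own file and begin with #include "Counter.hh" rather than following the header text directly.

```cpp
// ---- Counter.hh (hypothetical header file): the external interface ----

// Declares a variable that is defined elsewhere. The keyword 'extern'
// promises the compiler that storage for it exists in some code file.
extern int counter_total;

// Declares a function: name, return type, and parameters, but no body.
int counter_add (int amount);

// ---- Counter.cc (hypothetical code file): the inner workings ----

// Defines the variable (allocating its storage) and assigns its initial value.
int counter_total = 0;

// Defines the function that the header declared.
int counter_add (int amount) {

  counter_total = counter_total + amount;
  return counter_total;

}
```

Any other module that includes the header part could call counter_add() or read counter_total, while the details of how the total is kept stay inside the code file.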

First, we present the header file for the Foo class module which would be contained in the file Foo.hh, and whose contents would look like this:

#if !defined (_FOO_HH)
#define _FOO_HH

class Foo {

public:

  // The constructor.
  Foo (int x);

  // A method that does, well, _something_...
  void do_something ();

private:

  int value;

};

#endif // _FOO_HH

There are a number of elements of this file that bear explanation, and we provide those below. First, notice that this is the file that the main module will import. The #include directive that mentions this file will simply cause the compiler to read and process this entire file as though it were copied-and-pasted into the location of the #include line. Now, onto the observations about the file itself:

The first two lines and the last line of the file (the #if, #define and #endif directives) allow the compiler to know whether or not it has already read and processed this file in compiling a single module. Since one file can #include any other, it is possible to have circular dependencies among files. These special lines let the compiler avoid looping forever through these dependencies.

It is a good habit always to include these lines in every header file that you write. Notice that the symbol name used is related to the filename (_FOO_HH), thus ensuring that different modules will use different symbols.
Beginning a class definition looks the same as it does with Java. One slight but important difference is that the end of the class definition, specifically the closing curly-brace (}) must be followed by a semicolon (;).
Instead of labeling each member of the class as being public or private, C++ uses paragraphs. That is, everything following the public: declaration is part of a paragraph of public members, and everything following the private: declaration is part of the paragraph of private members.
Unlike in Java, methods are only declared and not defined. That is, there is no body to the methods, and just a signature providing the name of the method, its return type, and its parameters.
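
To see the guard lines at work, the sketch below mimics what the compiler would process if a hypothetical header Point.hh were pulled in twice while compiling one module. The class Point and the symbol POINT_HH are invented for this example. (The symbol omits the leading underscore because C++ reserves names that begin with an underscore followed by a capital letter for the compiler and its libraries.) The second copy is skipped because POINT_HH is already defined; without the guard lines, the compiler would reject the second class definition as a redefinition.

```cpp
// ---- First inclusion of the hypothetical Point.hh ----
#if !defined (POINT_HH)
#define POINT_HH

class Point {

public:

  // The constructor.
  Point (int x, int y);

  // Returns the sum of the two coordinates.
  int sum ();

private:

  int x_coord;
  int y_coord;

};  // Note the required semicolon.

#endif // POINT_HH

// ---- Second inclusion of Point.hh: skipped, since POINT_HH is defined ----
#if !defined (POINT_HH)
#define POINT_HH

class Point {
public:
  Point (int x, int y);
  int sum ();
private:
  int x_coord;
  int y_coord;
};

#endif // POINT_HH

// ---- What the matching code file, Point.cc, would contain ----
Point::Point (int x, int y) {

  x_coord = x;
  y_coord = y;

}

int Point::sum () {

  return (x_coord + y_coord);

}
```

A quick experiment is to delete the guard lines from both copies and compile again to see the error message that results.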

Now onto the code file for the Foo class module, which would be named Foo.cc and would look like this:

#include <stdio.h>
#include "Foo.hh"

// The constructor.
Foo::Foo (int x) {

  value = x;

}

void Foo::do_something () {

  printf("The value, in hex, happens to be %x\n", value);

}

Again, some observations about this file:

Even though the main module also imports stdio.h, this module must import it as well, in spite of the fact that the two modules will later be linked together. Since this code file includes the use of printf(), it must separately import stdio.h.
When compiling this module, the compiler must see the class definition before it sees the definitions of the individual methods of the class. Thus, Foo.hh must be imported here as well.
In the code file, the name of each method must be explicitly scoped. That is, it must be preceded (in this case) by Foo::, thus indicating to the compiler that you're defining not just a function named do_something, but rather a method called do_something that is a member of the Foo class.
Similarly to Java, constructors in C++ have no return type.
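
The scoping rule can be seen in the following sketch, in which a method and an ordinary function share the name twice(). The class Scaler and both functions are hypothetical, and the class definition that would normally live in a header file is shown inline to keep the listing in one piece.

```cpp
// Normally in Scaler.hh: the class definition, with methods only declared.
class Scaler {

public:

  Scaler (int x);
  int twice ();

private:

  int value;

};

// Normally in Scaler.cc: the definitions, each scoped with Scaler::.

// The constructor: scoped, and with no return type.
Scaler::Scaler (int x) {

  value = x;

}

// A method of Scaler: the Scaler:: prefix makes it a member of the class,
// which is what gives it access to the private data member 'value'.
int Scaler::twice () {

  return (value * 2);

}

// An ordinary function with the same name. Without the Scaler:: prefix it is
// not a member of the class, so it cannot refer to 'value' and must be handed
// its input as a parameter.
int twice (int x) {

  return (x * 2);

}
```

Given Scaler s(21), the call s.twice() uses the method and yields 42, while twice(5) uses the ordinary function and yields 10.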

Given this collection of files, we'll move onto the next section, where we will consider performing a more complex compilation and linking of these modules.

Compiling and linking

How can we compile and link multiple C++ modules into an executable file? We will examine the steps required in this section, showing not only how to compile and link, but also how to do so within XEmacs, which can help in the task of examining compilation errors.

Throughout this section, we will continue the example begun in the previous section. This example assumes two modules. The first is the main module, which contains only the main() function and consists only of a code file named main.cc. The second is the Foo module, which contains all components of the Foo class and exists as a header file named Foo.hh and a code file named Foo.cc.

Note that this section does not address the use of the make utility. We will encounter make in project 1.

Follow these steps to compile and link this (silly little) program:

Compile the main module: Within XEmacs, use the following command:

M-x compile

XEmacs will prompt you to enter a specific command to perform compilation. By default, it will provide make -k as this command. Delete that default, and instead provide the following command:

g++ -ggdb -c main.cc

You will see XEmacs split the window and open another buffer to hold the results of the compilation. The actual compilation should only take a few moments.

The first option passed to the compiler (-ggdb) tells it to include extra debugging information in its output. The second option (-c) tells the compiler only to perform the compilation step, and not the linking step. Therefore, the result of this command will be a file named main.o, which is an object file.
Introduce an error into the Foo module: For instructional purposes, try modifying Foo.cc so that it contains an error. For example, change the use of one of the variables so that it is misspelled. You will therefore be able to see how XEmacs can help handle compilation errors.
Start another compilation in XEmacs using the same command as before. Notice that it will remember your last compilation command and present that as a default. Edit the default so that you issue the following command to compile the Foo module:

g++ -ggdb -c Foo.cc

Notice that you need only tell the compiler about the code file of the module. The header file will be used whenever a #include directive requests it, as this module and the main module do.
Use XEmacs to find the compilation error: The compilation started in the previous step will indicate an error due to the modification to Foo.cc performed earlier. Try the following command to XEmacs (and don't forget the backquote (`)):

C-x `

XEmacs will show, in the compilation buffer, the error. Meanwhile, it will also move the cursor in the Foo.cc buffer to the line that contained the error. Repeated use of the above command will step through each compilation error in sequence (if you have multiple such errors), where XEmacs will do the work of finding the location of the error in your code.
Fix the error and recompile: Simply fix the error that you introduced into Foo.cc and then compile the Foo module again. An object file named Foo.o should be produced.
Link the object files: We want to join the two object files to form a single, executable file. Use the M-x compile command within XEmacs again. Alter the compilation command to be:

g++ -o final-product main.o Foo.o

This command will produce an executable file named final-product. It contains not only the object code from main.o and Foo.o, but also some special bootstrapping code that allows the program to begin running and to call main(). The places where main() uses Foo objects and methods have been linked together.
Run the program: If you haven't already, open a shell within XEmacs itself. Specifically, use the command:

M-x shell

At the command line, run your program by entering:

./final-product

It should execute, do something not at all particularly interesting, and then exit. If that happens, you've successfully written, compiled, and linked a multi-module C++ program.

Debugging

Although the sample programs that we've presented on this page are simple and should work, it is likely that you will write code that will not immediately run properly. Using GDB, a source-level debugger, will allow you to control the execution of your programs and to catch errors. XEmacs can control the execution of GDB, particularly by showing you the location in your code of the current point of execution. We'll see here how to get GDB started within XEmacs, and how to perform some basic tasks with the debugger.

We'll use the final-product program generated by the multi-module example above. You could use some other program, but it will work well only if the program was compiled using the -ggdb option. Follow these steps:

Starting GDB: You can start the debugger on your executable by using the XEmacs command:

M-x gdb

The editor will ask you to select a file on which you want to run GDB, providing the current directory as the starting point for a complete pathname for that file. Type the name of the desired executable (which in this case is final-product).

The window will split, and a new buffer will open up, with GDB running and your executable having been loaded into it. You will be given a prompt that reads:

(gdb)
Be aware of the help: GDB contains its own, online help section. Simply use the command:

(gdb) help

This help facility provides plenty of information. However, you may prefer to view the HTML documentation for GDB.
Run your program: You can simply run your program within GDB like so:

(gdb) run

First, note that if your program has no errors, then it will run normally to completion, leaving you back at the (gdb) prompt. If there were errors, execution would stop, and GDB would show you the point of execution at which the error occurred.

Second, if your program requires command-line arguments, you should enter them after the run command just as you would if running the program outside of GDB and at a shell prompt.
Set a breakpoint: Since the example we're assuming contains no errors, execution will run to completion in the previous step. To see how GDB can allow you to control the execution of a program, you can set a breakpoint, which is a location in the code where the debugger should pause execution and wait for further commands. Let's set a breakpoint in a Foo method, like so:

(gdb) b Foo::do_something

This command will stop execution any time the do_something() method is called on a Foo object. Note that you must explicitly scope method names within GDB. If you were to specify a function instead of a method, you would provide only the function name.

You can also set a breakpoint at an arbitrary line of code, like so:

(gdb) b main.cc:13

This command will stop execution at line 13 of main.cc, which is where the Foo object is deleted within main(). If you specify a line number that does not contain a C++ expression (for example, a line containing only a comment), then GDB will set a breakpoint at the nearest following line that does contain an expression.
Re-run your program: Run the program again with the run command, and you will see execution stop at the Foo::do_something() breakpoint. Notice that GDB will open the Foo.cc file and point to the next line to be executed. The debugger will provide its prompt and wait for a command. The following steps will describe some of the many things you can do once execution has been paused.
Examining the activation stack: There are many things you can do to examine the state of the program. A good starting point is obtaining a backtrace:

(gdb) bt

This command shows you the activation stack; that is, which functions called which others, and with what values for the arguments. You will see, in our example, that main() has called Foo::do_something().
Viewing variable values: You can print the values of any variables visible within the current scope. For example, within do_something, the data member value is visible, and you can see its contents with this command:

(gdb) p value

It is sometimes useful to view a value in hexadecimal, which you can do like so:

(gdb) p/x value

This command can also be used to print more complicated values. If you had a variable named my_array that pointed to an array of pointers to Foo objects, then you could examine the value data member of the object to which the array element at index 3 (that is, the fourth element) points:

(gdb) p my_array[3]->value
Moving through the stack: You may wish to examine the variables from the other frames shown in the activation stack. You can select the frame number (notice that the bt command enumerates the frames) that you would like to see and operate within, like so:

(gdb) f 1

In our example, you will be brought into the frame for main(), and shown the point at which that function called Foo::do_something(). The variables local to main() become visible as part of the current scope, and you can examine their contents. For example, you can examine the pointer to the Foo object:

(gdb) p/x x
Step-by-step execution: Now that the program is paused, you can continue the execution one step at a time. First, you can step line-by-line with the next command, like so:

(gdb) n

This command will execute one line of code and return you to the prompt. If there is a function call on that line of code, the next command will step over that call; that is, the whole function call will be performed, and you will regain control at the next line of code, where that function call will have returned.

In contrast, you can use the step command to step into such function calls, like so:

(gdb) s

This command also progresses one line of code at a time. However, if a line of code contains a function/method call, it will jump into that function/method, stopping at the first line that contains an expression. An examination of the stack will show you that a new frame has been pushed.
Continuing execution: When you no longer want to control execution step-by-step, you can tell GDB to resume full-speed execution with the continue command:

(gdb) c

This command will simply resume execution until an error, a breakpoint, or the end of the program is reached.
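
For practice with the commands above, the following sketch builds an array of pointers to objects, the arrangement assumed in the my_array example. To keep the listing in one piece, it defines a small stand-in class named Item rather than reusing the Foo module; Item, get_value(), and sum_items() are all hypothetical names. One way to use it is to call sum_items() from your own main(), compile with -ggdb, and set a breakpoint with b sum_items.

```cpp
#include <stdio.h>

// A small stand-in for a class like Foo, with one private data member.
class Item {

public:

  Item (int x);
  int get_value ();

private:

  int value;

};

Item::Item (int x) {

  value = x;

}

int Item::get_value () {

  return value;

}

// Build an array of pointers to Item objects, add up their values, and
// release the objects. Returns the sum.
int sum_items () {

  const int count = 4;
  Item* my_array[count];

  // Create each object on the heap; element i holds the value i * 10.
  for (int i = 0; i < count; i++) {
    my_array[i] = new Item(i * 10);
  }

  // A good place to step: 'n' steps over get_value(), 's' steps into it.
  int total = 0;
  for (int i = 0; i < count; i++) {
    total = total + my_array[i]->get_value();
  }

  // Eliminate the objects.
  for (int i = 0; i < count; i++) {
    delete my_array[i];
    my_array[i] = NULL;
  }

  printf("total = %d\n", total);
  return total;

}
```

Once the first loop has finished, p my_array[3]->value should show 30, and bt issued from inside get_value() should list sum_items() as its caller.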

Most of all, play with GDB, as it can do a number of useful things to help you find errors. In programs that do not execute correctly, the challenge will be to figure out how to stop program execution with a breakpoint just slightly before the error occurs. That way, you can see the error occur as you control the execution, keeping track of the way in which the variable values change, and thus discovering the point at which the computation veers off track.

Scott F. Kaplan

Last modified: Tue Sep 27 09:40:27 EDT 2005