CS 39 -- Assembler Documentation

Overview

This assembler is designed for the CS 39 Virtual Machine. It allows you to write mostly-normal assembly code, using labels to specify locations and using symbols to specify register numbers. This page describes the assembly code format and use of the assembler itself.

The register set

Below are listed the addressable registers for this architecture and the symbols used to refer to those registers conveniently in assembly code. A description of the typical use for each in assembly programming is provided:

%zero is the zero register which always holds the value 0. Attempts to assign other values into this register will fail silently.
%sp is the stack pointer. It stores the address of the first byte that precedes the top stack frame. (Recall that stacks grow downward in the address space.)
%fp is the frame pointer. It stores the address of the last byte of the top stack frame. Note that this register is often unused in hand-written assembly code, as it is typically used by compilers in the creation of annotated code used by source-level debuggers.
%ra holds the return address for the call instruction (see the instruction listing below). When a function call is performed, the address of the instruction following the call is stored in this register.
%s0 - %s3 are the system registers. These hold values used for communication between a process and the kernel, including arguments and return values for system calls.
%a0 - %a7 are the argument registers used for passing arguments during function calling.
%t0 - %t7 are the temporary registers. By convention, these are not preserved on a function call, and can be used for anything.
%g0 - %g7 are the general registers. These also can be used for anything, but are expected to be preserved at function calls.

Assembly code instruction format

The input format for the assembler is somewhat unusual due to its implementation in Scheme, a derivative of the LISP programming language. The code takes the following format (where the code snippet shown here does nothing interesting, and actually forms an infinite loop):

(
    (lbl-4 stri %g0 5)       ; Set %g0 to 5.
    (()    stri %t2 #x2e30)  ; Set %t2 to hold hex value `2e30'
    (()    subr %g1 %g0 %t2) ; %g1 = %g0 - %t2

    ; A useless comment for demonstration purposes.
    (()    jmpi lbl-4)       ; Jump back to the top and repeat.
)

There are a number of elements of this format to observe:

The entire program is enclosed in a pair of outer parentheses.
Each instruction is enclosed in its own pair of inner parentheses.
Blank lines are ignored.
Anything following a semicolon (;) is ignored, allowing you to provide comments on separate lines or on the same lines as instructions themselves.
Field 1 of each instruction holds the label for that instruction, if any. By choosing to label an instruction, you make it possible to refer to that label elsewhere in branching and jumping instructions. The assembler will then replace those references with the actual address at which the labelled instruction will be loaded at runtime. If no label is used, the empty list (open and close parentheses with nothing in between) marks the instruction as not labelled.
Field 2 is the opcode. The list of opcodes and the operands on which they work is provided below. Note that every opcode has a four-character name.
Fields 3, 4, and 5 are the operands. Each operand is either a label, a register name, or an immediate value. The choice of which to use depends on the opcode for that instruction. Note that if an opcode requires fewer than all three operands, then only the required operands should be provided.
Immediate operand values are by default in decimal, but can also be specified in hexadecimal using the prefix #x.
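
As a small illustration of the format rules above, here is a sketch of a loop that does finish its work: it sums the integers from 10 down to 1, using opcodes from the instruction index. The label names loop and done are arbitrary choices made for this example, and the final self-jump is only a placeholder, since this page does not describe how a program should halt.

```
(
    (()    stri %t0 #xa)         ; %t0 = 10, written as a hexadecimal immediate
    (()    stri %t1 0)           ; %t1 = 0, the running sum
    (loop  addr %t1 %t1 %t0)     ; %t1 = %t1 + %t0
    (()    subi %t0 %t0 1)       ; %t0 = %t0 - 1
    (()    brne loop %t0 %zero)  ; Repeat until %t0 reaches 0.

    ; Placeholder ending (an assumption): spin in place.
    (done  jmpi done)
)
```

When the branch finally falls through, %t1 holds 55. Note that only the instructions that are jump targets carry labels; every other instruction uses the empty list in field 1.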

Instruction Index

Below is a catalog of opcodes and their associated interpretation of the operands that follow. Specifically, in each description, we will refer to the opcode, as well as operand-0, operand-1, and operand-2. For each instruction, we will provide an example of its use.

sysc -- SYStem Call
Ex: (() sysc)
This instruction generates an intentional interrupt that forces the processor to trap into the kernel. It is the method by which a program can perform an unusual kind of procedure call, where the procedure is inside the kernel and the arguments are passed in the system registers.

All three operands are unused by this instruction. However, the settings of the system registers are the means of communication with the kernel. The convention for how these registers are used is determined by the kernel itself, which may define how values are passed into and returned from the system call.
strr -- SeT Register from Register
Ex: (() strr %g0 %t5) ; %g0 = %t5
Copy a value from one register into another. Set register operand-0 to hold the value contained in register operand-1.
stri -- SeT Register from Immediate
Ex: (() stri %a0 15) ; %a0 = 15
Copy an immediate value into a register. Set register operand-0 to hold the immediate value operand-1.
load -- LOAD from main memory into register
Ex: (() load %g3 %sp 8) ; %g3 = *(%sp + 8)
Copy a value from a main memory location into a register. The main memory address is computed by adding register operand-1 to immediate operand-2. The value at this address is copied into register operand-0. Note that the address must be a word-aligned location.
stor -- STORe from register into main memory
Ex: (() stor %t7 %fp 4) ; *(%fp + 4) = %t7
Copy a value from a register into a main memory location. The main memory address is computed by adding register operand-1 to immediate operand-2. The value in register operand-0 is copied into this address. Note that the address must be a word-aligned location.
addr -- ADD Register to register
Ex: (() addr %s0 %g2 %g3) ; %s0 = %g2 + %g3
Add register operand-1 to register operand-2 and store the result in register operand-0.
addi -- ADD Immediate to register
Ex: (() addi %s1 %t0 15) ; %s1 = %t0 + 15
Add register operand-1 to immediate operand-2 and store the result in register operand-0.
subr -- SUBtract Register from register
Ex: (() subr %s0 %g2 %g3) ; %s0 = %g2 - %g3
Subtract register operand-2 from register operand-1 and store the result in register operand-0.
subi -- SUBtract Immediate from register
Ex: (() subi %s1 %t0 15) ; %s1 = %t0 - 15
Subtract immediate operand-2 from register operand-1 and store the result in register operand-0.
mulr -- MULtiply Register and register
Ex: (() mulr %s0 %g2 %g3) ; %s0 = %g2 * %g3
Multiply register operand-1 with register operand-2 and store the result in register operand-0.
muli -- MULtiply Immediate and register
Ex: (() muli %s1 %t0 15) ; %s1 = %t0 * 15
Multiply register operand-1 with immediate operand-2 and store the result in register operand-0.
divr -- DIVide register by Register
Ex: (() divr %s0 %g2 %g3) ; %s0 = %g2 / %g3
Divide register operand-1 by register operand-2 and store the result in register operand-0.
divi -- DIVide register by Immediate
Ex: (() divi %s1 %t0 15) ; %s1 = %t0 / 15
Divide register operand-1 by immediate operand-2 and store the result in register operand-0.
andr -- AND bitwise Register and register
Ex: (() andr %s0 %g2 %g3) ; %s0 = %g2 & %g3
Bitwise AND register operand-1 with register operand-2 and store the result in register operand-0.
andi -- AND bitwise Immediate and register
Ex: (() andi %s1 %t0 15) ; %s1 = %t0 & 15
Bitwise AND register operand-1 with immediate operand-2 and store the result in register operand-0.
orr_ -- OR bitwise Register and register
Ex: (() orr_ %s0 %g2 %g3) ; %s0 = %g2 | %g3
Bitwise OR register operand-1 with register operand-2 and store the result in register operand-0. The trailing underscore (_) is critical to ensure a four-byte opcode.
ori_ -- OR bitwise Immediate and register
Ex: (() ori_ %s1 %t0 15) ; %s1 = %t0 | 15
Bitwise OR register operand-1 with immediate operand-2 and store the result in register operand-0. The trailing underscore (_) is critical to ensure a four-byte opcode.
notr -- NOT bitwise Register
Ex: (() notr %s0 %g2) ; %s0 = ~%g2
Bitwise NOT register operand-1 and store the result in register operand-0.
noti -- NOT bitwise Immediate
Ex: (() noti %s1 15) ; %s1 = ~15
Bitwise NOT immediate operand-1 and store the result in register operand-0.
shrl -- SHift Register Left
Ex: (() shrl %s0 %g2 %g3) ; %s0 = %g2 << %g3
Shift register operand-1 left by the number of bits specified in register operand-2, and store the result in register operand-0. Note that bits with the value 0 are inserted into the word from the right end.
shrr -- SHift Register Right
Ex: (() shrr %s0 %g2 %g3) ; %s0 = %g2 >> %g3
Shift register operand-1 right by the number of bits specified in register operand-2, and store the result in register operand-0. Note that bits with the value 0 are inserted into the word from the left end.
jmpr -- JuMP Register
Ex: (() jmpr %ra)
Jump to the address in register operand-0. This instruction causes the virtual machine to set the program counter to the new address in the executable.
jmpi -- JuMP Immediate
Ex: (() jmpi lbl-1)
Jump to the address in immediate operand-0. This instruction causes the virtual machine to set the program counter to the new address in the executable.
breq -- BRanch if EQual
Ex: (() breq lbl-1 %g0 %g1) ; if (%g0 == %g1) goto lbl-1
If register operand-1 is equal to register operand-2, jump to the address in immediate operand-0.
brne -- BRanch if Not Equal
Ex: (() brne lbl-1 %g0 %g1) ; if (%g0 != %g1) goto lbl-1
If register operand-1 is not equal to register operand-2, jump to the address in immediate operand-0.
brgt -- BRanch if Greater Than
Ex: (() brgt lbl-1 %g0 %g1) ; if (%g0 > %g1) goto lbl-1
If register operand-1 is greater than register operand-2, jump to the address in immediate operand-0.
brlt -- BRanch if Less Than
Ex: (() brlt lbl-1 %g0 %g1) ; if (%g0 < %g1) goto lbl-1
If register operand-1 is less than register operand-2, jump to the address in immediate operand-0.
brge -- BRanch if Greater Than or Equal
Ex: (() brge lbl-1 %g0 %g1) ; if (%g0 >= %g1) goto lbl-1
If register operand-1 is greater than or equal to register operand-2, jump to the address in immediate operand-0.
brle -- BRanch if Less Than or Equal
Ex: (() brle lbl-1 %g0 %g1) ; if (%g0 <= %g1) goto lbl-1
If register operand-1 is less than or equal to register operand-2, jump to the address in immediate operand-0.
call -- CALL a procedure
Ex: (() call foo)
Jump to the address in immediate operand-0. Before jumping, store PC + 32 (that is, the address of the next instruction) into the return address register (%ra).
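
To tie call and jmpr together, here is a minimal sketch of a procedure call and return. It assumes that execution begins at the first instruction and that the procedure hands its result back in %a0; neither is stated on this page, and the label names main, twice, and halt are made up for the example.

```
(
    (main   stri %a0 21)       ; Pass the argument: %a0 = 21
    (()     call twice)        ; %ra = address of the next instruction; jump to twice
    (()     strr %g0 %a0)      ; Execution resumes here after the return; %g0 = 42
    (halt   jmpi halt)         ; Placeholder ending (an assumption): spin in place.

    ; The procedure: double the value in %a0.
    (twice  addr %a0 %a0 %a0)  ; %a0 = %a0 + %a0
    (()     jmpr %ra)          ; Return to the instruction following the call.
)
```

Because call overwrites %ra, a procedure that itself performs a call would need to save the old value of %ra first and restore it before returning.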

How to use the assembler on `algol`

Use of the VP assembler is reasonably simple. The only difficulty is in handling the errors. But first, a simple example of its use. Assume that you have a file named fib.vma in the current directory that you want to assemble. The following command will assemble it, placing the result in a file named fib.vmx:

assemble.sh fib

If the assembly code is syntactically correct, then it will assemble without any output -- no news is good news. If there is an error in the code, however, you will see one of two kinds of errors:

An error in the number or type of the operands: This kind of error will cause a specific (and somewhat descriptive) error message to be emitted, and the assembly process to terminate, returning you to the shell.
Any other error: All other errors (invalid opcodes, unknown register specifiers, unknown label names, etc.) will result in a more cryptic error message. You will be left within the Scheme interpreter, at a single greater-than (>) prompt. The following will allow you to return to the shell:

(exit)

Future versions of the assembler may provide more helpful error messages in these cases. Meanwhile, I hope that the error messages provide some clue as to the error.

Be sure to follow the format of the assembly code (particularly the use of parentheses) described above on this page. Errors in this format may cause some of the most cryptic errors.

Scott F. Kaplan