CS 39 -- Project 2

Supporting virtual memory

There is a new version of the virtual machine (hereafter the CPU) that now sports a memory management unit (MMU). As a result, some things have disappeared from the CPU: specifically, the base and limit registers that controlled memory access.

This MMU will perform mapping from virtual addresses to physical addresses (which are within the simulated machine's main memory). The MMU, at any given moment, will only be able to translate virtual to physical addresses for a small set of pages. Specifically, only those virtual pages whose page table entries (PTEs) reside in the MMU's translation lookaside buffer (TLB) can be successfully translated. See below for more information on the page table, its structure, and the management of the TLB.

The CPU itself (particularly the LOAD and STOR instructions) have been modified to work through the MMU. One other addition to the CPU has been a page table register (PTR). This register holds the starting address of the top-level page table. This register dictates which page table is being used by the MMU at any given moment.

Your goal in this assignment will be to provide virtual memory for processes running on the virtual platform. The details are described below, but first, a note: You will not be implementing a full virtual memory system! There are simplifications to the virtual memory support that we want for this assignment, so be sure to read the requirements carefully, and ask questions if the requirements aren't clear!

New CPU functions

As mentioned above, all functions associated with the base and limit registers have been removed. Specifically, the following functions will no longer be available:

  setBase()
  getBase()
  setLimit()
  getLimit()

There is a pair of new functions available for manipulating the page table register (PTR):

void setPTR (vMachineWord_t pageTableAddress);

This function takes the starting address of the top level of the current page table, and sets the ptr register to hold that address. This function then calls flushTLB() to invalidate all TLB entries. (See below)
vMachineWord_t getPTR ();

This function simply returns value of the ptr register.

The MMU interface

A few new functions have been added to the CPU/MMU. This section catalogs the operation of those new functions.

The MMU allows the kernel (or the CPU) to read from or write to a specific location in the virtual memory space of the current process. Below are descriptions of the four functions that allow this access, plus two functions needed for TLB management:

bool readWord (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a word within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the requested machine word into the requested register number. Note that the virtual address provided must be word-aligned. The function returns whether or not it was able successfully to perform the translation and read the word.
bool writeWord (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a word within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the machine word contained in the given register into the location specified by the mapped physical address. Again, the virtual address provided must be word-aligned. The function returns whether or not it was able successfully to perform the translation and write the word.
bool readByte (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a byte within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the requested byte into the requested register. Note that the upper three bytes of the register will be set to 0. The function returns whether or not it was able successfully to perform the translation and read the byte.
bool writeByte (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a byte within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the lowest-order byte contained in the specified register into the location specified by the mapped physical address. The function returns whether or not it was able successfully to perform the translation and write the byte.
void flushTLB ()
This function invalidates all TLB entries. Note that this function is called by the setPTR() function!.
void flushOneTLBEntry (vMachineWord_t virtualAddress);
This function searches the TLB for a PTE that maps the virtual page indicated by virtualAddress. If found, that entry is invalidated; otherwise, the TLB is left unmodified.

Handling MMU-generated traps

When the four read/write functions above attempt to translate the given virtual address to a physical address, that mapping can fail. (Note that there is an instruction-fetching MMU function that is used within the CPU that can also fail during a virtual-to-physical mapping.) There are two possible interrupts that can be generated due to a failure:

Address alignment interrupt: If the virtual address provided is not properly aligned, then this interrupt will cause the CPU to call the address alignment trap handler in the kernel. Word-sized memory accesses must use word-aligned virtual addresses, and instruction-sized memory accesses must use instruction-sized virtual addresses. Notice that it is not possible to be misaligned with respect to characters, which are single bytes.

The CPU will place the offending virtual address in register %s0 before trapping into the kernel.
Memory access interrupt: If the attempt to translate the virtual address to a physical address fails, this interrupt will cause the CPU to call the memory access trap handler in the kernel. This type of failure can occur for any of the following reasons:
- The PTE's resident bit is 0.
- For a read operation, the PTE's read permission bit is 0.
- For a write operation, the PTE's write permission bit is 0.
- The physical address specified is invalid. (That is, there is no such page frame in the simulated machine's main memory.)

Upon an memory access interrupt, the CPU will place the following data in registers:

The requested virtual address is placed in %s0.
The operation that caused the invalid access is placed in %s1; the value 0x0 is placed in that register if a read operation was being performed, and the value 0x1 is placed in that register if a write operation was being performed.

The kernel's trap handler will then be given control. It must determine why the MMU could not proceed. If the problem can be fixed by the kernel (e.g. by fetching the page from disk), then the kernel should do so. If the problem cannot be resolved by the kernel (e.g. a bug in the program), then the process should be killed.

Important note on returning control to the MMU: After this trap handler is used, the MMU does not immediately try again to perform the mapping and access the space. Rather, the CPU leaves the PC unmodified, and then restarts the fetch-decode-execute cycle.

However, if your kernel calls one of the MMU access functions directly, then it should be prepared for a false return value. That is, your kernel should appropriate retry the access if its first attempt fails.

Page table and PTE formats

The page table for this architecture is a multi-level page table. Specifically, the CPU uses 32-bit addresses and 4 KB pages. Each page table is organized as a two-level table, and therefore each virtual address can be divided into three components:

The uppermost 10 bits (31 to 22) are the top-level index.
The next 10 bits (21 to 12) are the bottom-level index.
The remaining 12 bits (11 to 0) are the offset.

The top-level directory (there is only one per page table) is a single page that contains 1024 word-sized entries. Each entry is a pointer to a bottom-level directory.

A bottom-level directory (there are as many as 1024 of them) is also a single page that contains 1024 word-sized entries. Each entry is a PTE.

A PTE contains the actual translation information for a virtual page. It's format is as follows:

Bits 31 to 12 are the page frame number.
Bit 11 is the reference bit.
Bit 10 is the dirty bit.
Bit 9 is the read permission bit.
Bit 8 is the write permission bit.
Bit 7 is the resident bit.
Bits 6 to 0 are available and may be used by the kernel for other purposes.

It is the responsibility of the kernel to allocate top- and bottom-level directories and to assign their entries. The MMU will read a page table, and will modify it only to set reference and dirty bits within PTEs. Notice that there is no requirement that the page table be contained with the simulated machine's main memory space. Like other kernel data structures, it can be allocated anywhere.

Your assignment

You can obtain the code for the new virtual machine by unpacking the tarball available in my public directory:

tar -xjvpf ~sfkaplan/public/cs39/VP-v1.2.2.tar.bz2

Note that, unlike last time, the tarball does not include a kernel -- it's just the machine itself. You will have to adapt your kernel to the new and different capabilities of the CPU and MMU. You will also need to write a few programs to operate on the new CPU. Feel free to share yours with other teams.

Your kernel must be capable of the following:

Create a new system call, MMAP. A user level process can request this system call by placing the constant 0x11111100 in %s0 before using the SYSC instruction.

The process must also specify a virtual address in register %s1. The kernel should create a new PTE for the virtual page that contains the given virtual address. If that virtual page already has an entry in the page table, then the kernel should do nothing.

Notice that this system call should never fail. All 32-bit values correspond to legal virtual addresses, and so there cannot be an invalid request.

Notice also that this system call should merely assign a PTE. It should not actually allocate a page frame and establish a mapping. Only an attempt to reference the virtual page should trigger those events.
Write the address alignment trap handler. It is likely that this handler should kill the process that used a misaligned address. If you can think of a way that the kernel can make the process survive the error, do that!
Write the invalid memory access handler. It should be able to determine the cause of the trap, and then deal with it appropriately, either by killing an offending process, or by managing the pages so that the MMU can succeed the next time.

This handler should include the capability for automatic page swapping to a backing store. Backing store space should be "fake" -- that is, your kernel should allocate a chunk of space that it can use as a backing store, outside of the virtual machine's main memory.
Update the loading of an executable image for a new process. In particular, you must allocate page frames, assign PTEs that map virtual pages to these page frames, and load the image into those page frames. The image should, of course, appear to be continguous within the virtual address space.
Update the FORK system call -- specifically, update the part of that system call which makes an identical copy of the main memory space for the child process. You have two options for this aspect of the project:
1. Non-sharing: Duplicate the parent's pages in newly allocated space for the child. This approach is simple, but wastes space and time.
2. Sharing: Implement copy-on-write (COW), where the child gets a copy of the parent's page table and all pages are marked read-only. Each page will be shared until a write operation causes a page fault, at which time the kernel should make two separate copies, one for each process. This approach is more complex, but more efficient in space and time.

No changes to the virtual machine, including the MMU, should be needed. Bugs should be reported to me, as I may need to modify the virtual machine and issue an update to all of the groups.

Fixes from Project 1

There were a few bugs and errors in project 1 that have been fixed. We catalog them here.

Previously, an invalid instruction opcode would cause a invalid instruction interrupt during the decoding phase. Unfortunately, following the corresponding trap into the kernel, the CPU would continue with the malformed instruction, thus causing the simulator to abort during the execution cycle.

This CPU bug had been fixed. After the decoding phase generates an invalid instruction interrupt, it restarts the fetch-decode-execute cycle. Thus, the kernel can modify the CPU's state, and the CPU will not continue trying to process an invalid instruction.

Notice that if an invalid opcode is somehow discovered during the execution phase, the CPU still aborts the simulator. This error would be due to a bug in the CPU code, and not due to an input error -- therefore, the simulator should and does abort.
In Project 1, if a FORK failed, there was no explicit code that could be returned to the calling process to indicate that failure. We fix this process by dictating that a failed FORK should return 0xFFFFFFFF as a failure code to the calling process. As a consequence, this value can never be used for a process ID.
The original version of the assembler would provide a cryptic message if a branch/jump instruction refered to a label that could not be correlated to any instruction. (That is, it tried to jump to a label that did not exist.) The assembler now produces a clear error message in response to this kind of programming error.

Submitting your work

When your group has completed its project, submit all files as project-2 using the cs39-submit program. Note that this program can take directory names as arguments.

This assignment is due on Sunday, October 26^th at 11:59 pm!

Scott F. Kaplan

Last modified: Wed Oct 22 21:31:51 EDT 2003