CS 39 -- Project 2

This page is long and contains a great number of important details. Read carefully and thoroughly!

Supporting virtual memory

There is a new version of the virtual machine (hereafter the CPU) that now sports a memory management unit (MMU). As a result, some things have disappeared from the CPU: specifically, the base and limit registers that controlled memory access.

This MMU will perform mapping from virtual addresses to physical addresses (which are within the simulated machine's main memory). The MMU, at any given moment, will only be able to translate virtual to physical addresses for a small set of pages. Specifically, only those virtual pages whose page table entries (PTEs) reside in the MMU's translation lookaside buffer (TLB) can be successfully translated. See below for more information on the page table, its structure, and the management of the TLB.

The CPU itself (particularly the LOAD and STOR instructions, as well as the instruction fetch mechanism) has been modified to work through the MMU. One other addition to the CPU has been a page table register (PTR). This register holds the starting address of the top-level page table. This register dictates which page table is being used by the MMU at any given moment.

Your goal in this assignment will be to provide virtual memory for processes running on the virtual platform. The details are described below, but first, a note: You will not be implementing a full virtual memory system! There are simplifications to the virtual memory support that we want for this assignment, so be sure to read the requirements carefully, and ask questions if the requirements aren't clear!

New CPU functions

As mentioned above, all functions associated with the base and limit registers have been removed. Specifically, the following functions will no longer be available:

  setBase()
  getBase()
  setLimit()
  getLimit()

There is a pair of new functions available for manipulating the page table register (PTR):

void setPTR (vMachineWord_t pageTableAddress);

This function takes the starting address of the top level of the current page table, and sets the ptr register to hold that address. This function then calls flushTLB() (see below) to invalidate all TLB entries.
vMachineWord_t getPTR ();

This function simply returns value of the ptr register.

The MMU interface

A few new functions have been added to the CPU/MMU. This section catalogs the operation of those new functions.

The MMU allows the kernel (or the CPU) to read from or write to a specific location in the virtual memory space of the current process. Below are descriptions of the four functions that allow this access, plus two functions needed for TLB management:

bool readWord (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a word within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the requested machine word into the requested register number. Note that the virtual address provided must be word-aligned. The function returns whether or not it was able successfully to perform the translation and read the word.
bool writeWord (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a word within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the machine word contained in the given register into the location specified by the mapped physical address. Again, the virtual address provided must be word-aligned. The function returns whether or not it was able successfully to perform the translation and write the word.
bool readByte (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a byte within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the requested byte into the requested register. Note that the upper three bytes of the register will be set to 0. The function returns whether or not it was able successfully to perform the translation and read the byte.
bool writeByte (vMachineWord_t virtualAddress, int registerNumber)
This function allows the caller to specify, via virtualAddress, the location of a byte within the virtual address space of the current process. The MMU will perform the translation from virtual to physical address, and then copy the lowest-order byte contained in the specified register into the location specified by the mapped physical address. The function returns whether or not it was able successfully to perform the translation and write the byte.
void flushTLB ()
This function invalidates all TLB entries. Note that this function is called by the setPTR() (see above) function!.
void flushOneTLBEntry (vMachineWord_t virtualAddress);
This function searches the TLB for a PTE that maps the virtual page indicated by virtualAddress. If found, that entry is invalidated; otherwise, the TLB is left unmodified.

Handling MMU-generated traps

When the four read/write functions above attempt to translate the given virtual address to a physical address, that mapping can fail. (Note that there is an instruction-fetching MMU function that is used within the CPU that can also fail during a virtual-to-physical mapping.) There are two possible interrupts that can be generated due to a failure:

Address alignment interrupt: If the virtual address provided is not properly aligned, then this interrupt will cause the CPU to call the address alignment trap handler in the kernel. Word-sized memory accesses must use word-aligned virtual addresses, and instruction-sized memory accesses must use instruction-sized virtual addresses. Notice that it is not possible to be misaligned with respect to characters, which are single bytes.

The CPU will place the offending virtual address in register %s0 before trapping into the kernel.
Memory access interrupt: If the attempt to translate the virtual address to a physical address fails, this interrupt will cause the CPU to call the memory access trap handler in the kernel. This type of failure can occur for any of the following reasons:
- The PTE's resident bit is 0.
- For a read operation, the PTE's read permission bit is 0.
- For a write operation, the PTE's write permission bit is 0.
- For an instruction fetch, the PTE's execute permission bit is 0.
- The physical address specified is invalid. (That is, there is no such page frame in the simulated machine's main memory.)

Upon an memory access interrupt, the CPU will place the following data in registers:

The requested virtual address is placed in %s0.
The operation that caused the invalid access is placed in %s1. The following values may be placed in this register, and they imply the corresponding operation:
1. 0x0 => instruction fetch
2. 0x1 => write
3. 0x2 => read

The kernel's trap handler will then be given control. It must determine why the MMU could not proceed. If the problem can be fixed by the kernel (e.g. by fetching the page from disk), then the kernel should do so. If the problem cannot be resolved by the kernel (e.g. a bug in the program), then the process should be killed.

Important note on returning control to the MMU: After this trap handler is used, the MMU does not immediately try again to perform the mapping and access the space. Rather, the CPU leaves the PC unmodified, and then restarts the fetch-decode-execute cycle.

However, if your kernel calls one of the MMU access functions directly, then it should be prepared for a false return value. That is, your kernel should appropriate retry the access if its first attempt fails.

Page table and PTE formats

The page table for this architecture is a multi-level page table. Specifically, the CPU uses 32-bit addresses and 4 KB pages. Each page table is organized as a two-level table, and therefore each virtual address can be divided into three components:

The uppermost 10 bits (31 to 22) are the top-level index.
The next 10 bits (21 to 12) are the bottom-level index.
The remaining 12 bits (11 to 0) are the offset.

The top-level directory (there is only one per page table) is a single page that contains 1024 word-sized entries. Each entry is a pointer to a bottom-level directory.

A bottom-level directory (there are as many as 1024 of them) is also a single page that contains 1024 word-sized entries. Each entry is a PTE.

A PTE contains the actual translation information for a virtual page. It's format is as follows:

Bits 31 to 12 are the page frame number.
Bit 11 is the reference bit (set by MMU).
Bit 10 is the dirty bit (set by MMU).
Bit 9 is the read permission bit.
Bit 8 is the write permission bit.
Bit 7 is the execute permission bit.
Bit 6 is the resident bit.
Bits 5 to 0 are unused and may be defined and used by the kernel for other purposes.

It is the responsibility of the kernel to allocate top- and bottom-level directories and to assign their entries. The MMU will read a page table, and will modify it only to set reference and dirty bits within PTEs. Notice that there is no requirement that the page table be contained with the simulated machine's main memory space. Like other kernel data structures, it can be allocated anywhere.

Your assignments (note the plural!)

You can obtain the code for the new virtual machine by unpacking the tarball available in my public directory:

tar -xjvpf ~sfkaplan/public/cs39/VP-v1.2.3.tar.bz2

Note that, unlike last time, the tarball does not include a kernel -- it's just the machine itself. You will have to adapt your kernel to the new and different capabilities of the CPU and MMU. You will also need to write a few programs to operate on the new CPU. Feel free to share yours with other teams.

For part A of this assignment, your kernel must have the implement the following capabilities:

Create a new system call, MMAP. A user level process can request this system call by placing the constant 0x11111100 in %s0 before using the SYSC instruction.

The process must also specify a virtual address in register %s1 and a permission mask in register %s2. For the permission mask, only the least significant 3 bits of the value are meaningful. Each of these bits indicates whether a given type of permission should be associated with the page, where a 1 indicates permission and 0 the lack of permission:
- Bit 2: Read permission
- Bit 1: Write permission
- Bit 0: Execute permission
The kernel should create a new PTE for the virtual page specified in the virtual address; that PTE should be assigned the requested permissions. If that virtual page is already mapped (that is, there is already a valid PTE for it), then the kernel should simply update the permissions of the PTE.

Notice that this system call should never fail. All 32-bit values correspond to legal virtual addresses and all combinations of permission bits are valid, and so there cannot be an invalid request.

Notice also that this system call should merely assign a PTE. It should not actually allocate a page frame and establish a mapping. Only an attempt to reference the virtual page should trigger those events.
Write the address alignment trap handler. It is likely that this handler should kill the offending process (unless you can somehow think of something nicer to do).
Write the invalid memory access handler. It should be able to determine the cause of the trap, and then deal with it appropriately, either by killing an offending process, or by managing the pages so that the MMU can succeed the next time.

First note a case for this handler that we have not discussed in class, namely the new page allocation case. If a page has been MMAPed, but not previously referenced, then it will have a valid PTE but no real space (main memory or backing store) corresponding to that virtual page. Thus, the page must be created anew in main memory and the PTE made to map that virtual page to the given page frame. Note that the page frame should be cleared (all zeros) during this process as a security measure.

Second, for part A of this project, this handler should not implement page swapping. There should be no backing store. If all of the page frames are being used and a process requests a new page, then that process should be killed. Its page frames should then be reclaimed and re-used for any subsequent requests for new pages by other processes.
Update the loading of an executable image for a new process. In particular, you must allocate page frames, assign PTEs that map virtual pages to these page frames, and load the image into those page frames. The image should, of course, appear to be continguous within the virtual address space.

Note that the kernel should use the MMU to write the executable image into process's memory. That is, your kernel code should not use physical address in loading the image, but rather should use the MMU functions writeByte and/or writeWord.

For part B of this assignment, you must add these additional capabilities to your kernel:

Add automatic page swapping. Allocate a fake backing store (simply an allocated chunk of memory that you will manage as though it were disk space), into which you will swap pages that are selected for replacement. You must implement either the FIFO/LRU SEGQ policy or a CLOCK policy to drive replacement choices. You must also, of course, update your invalid memory access handler to handle references to valid but non-resident pages.
Update the FORK system call -- specifically, update the part of that system call which makes an identical copy of the main memory space for the child process. You must implement copy-on-write (COW), a mechanism that behaves as follows:
- When a FORK occurs, the child process is given a copy of the parent's page table. That is, both processes will share the parent's pages. All of the valid PTE's in both page tables are marked read-only (that is, the read permission bit is set to 0) as part of the FORK operation.
  
  Note that if the child then itself performs a FORK, all three processes (parent, child, and grandchild) share the pages. A page may therefore be shared by an arbitrary number of processes.
- If a process attempts to write to a COW (read-only) page, and if that page is still shared by multiple processes, then the invalid memory access handler must create a copy of the page and redirect the faulting PTE so that it maps to this copy. It can then restore write-permission to that page.
  
  Note that if the COW page is no longer shared with other processes (the others have faulted on the page and been assigned copies), then the remaining process that uses that page may be given permission to write that page.XS
  
  Also note that the kernel must distinguish between pages that have been made read-only by the COW mechanism and pages that are read-only at the process's request (as part of an MMAP call).

No changes to the virtual machine, including the MMU, should be needed. Bugs should be reported to me, as I may need to modify the virtual machine and issue an update to all of the groups.

Fixes from Project 1

There were a few bugs and errors in project 1 that have been fixed. We catalog them here.

Previously, an invalid instruction opcode would cause a invalid instruction interrupt during the decoding phase. Unfortunately, following the corresponding trap into the kernel, the CPU would continue with the malformed instruction, thus causing the simulator to abort during the execution cycle.

This CPU bug had been fixed. After the decoding phase generates an invalid instruction interrupt, it restarts the fetch-decode-execute cycle. Thus, the kernel can modify the CPU's state, and the CPU will not continue trying to process an invalid instruction.

Notice that if an invalid opcode is somehow discovered during the execution phase, the CPU still aborts the simulator. This error would be due to a bug in the CPU code, and not due to an input error -- therefore, the simulator should and does abort.
In Project 1, if a FORK failed, there was no explicit code that could be returned to the calling process to indicate that failure. We fix this process by dictating that a failed FORK should return 0xFFFFFFFF as a failure code to the calling process. As a consequence, this value can never be used for a process ID.

Submitting your work

When your group has completed each part of this project, it should have one member submit the entire directory as project-2A (for part A) or project-2B (for part B) using the cs39-submit program. Note that this program can take directory names as arguments.

Part A of this assignment is due on Sunday, October 30^th at 11:59 pm!

Part B of this assignment is due on Friday, November 18^th at 11:59 pm!

Scott F. Kaplan

Last modified: Thu Nov 17 21:26:32 EST 2005