Here are links to some of my research topics. Current projects are listed at the top, while older projects have drifted toward the bottom.
Collecting traces of virtual memory and file system references often introduces large execution time overhead. kVMTrace is a kernel-based reference tracing tool that gathers page-level traces efficiently, introducing as little as a few percent overhead. All processes and their kernel-level threads are traced, and each reference is associated with the thread that performed it. All uses of shared memory are reconciled. This tool allows for the collection of reference traces for workloads that run for tens of minutes or hours and that reference many GB of data.
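As a rough illustration, a per-reference record in such a trace might carry the fields sketched below; the field names and widths here are illustrative assumptions, not kVMTrace's actual on-disk format.

```c
/* Hypothetical sketch of a per-reference record that a page-level,
 * kernel-based tracer might emit; these fields are assumptions for
 * illustration, not kVMTrace's actual format. */
#include <stdint.h>
#include <stdio.h>

typedef enum { REF_READ, REF_WRITE, REF_FETCH } ref_type_t;

typedef struct {
    uint64_t timestamp;     /* reference time (e.g., cycles or ns) */
    uint32_t pid;           /* owning process */
    uint32_t tid;           /* kernel-level thread that made the reference */
    uint64_t vpn;           /* virtual page number within that process */
    uint64_t canonical_id;  /* shared mappings resolve to one canonical page */
    uint8_t  type;          /* one of ref_type_t */
} trace_record_t;

int main(void) {
    /* A simulator reading such a trace could key its page table on
     * canonical_id, so that a shared page is counted once. */
    printf("record size: %zu bytes\n", sizeof(trace_record_t));
    return 0;
}
```

Keying simulation on a canonical page identifier is one possible way to reconcile references made through shared mappings.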
A re-examination of the problem of dynamic virtual memory management (DVMM), where modern system workloads and assumptions of fairness are considered. This topic has not been thoroughly revisited in some time, and old solutions (Working Set, Global LRU, Page Fault Frequency) have substantial weaknesses with modern workloads. We seek not just to improve the performance of DVMM, but to re-focus the problem on providing control over the use of memory as a resource.
Information about the recency with which programs reference pages can be valuable. Our work on EELRU, compressed caching, demand prepaging, and dynamic memory management (all described on this page) relies on this information: specifically, it assumes that a system can keep a histogram of the hits to LRU queue positions, where an LRU queue provides the logical order of most recent use for each page. We investigate how to gather this data in real time via an in-kernel implementation without incurring substantial overhead, which is essential to making this approach to memory management realizable.
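A minimal user-space sketch of the idea, assuming an array-based LRU queue and a fixed page bound: each reference is looked up in the queue, a hit is counted at its depth, and the page is moved to the front. The in-kernel problem is doing this cheaply; this O(n)-per-reference version is only for illustration.

```c
/* Minimal sketch of a recency histogram: for each reference, find the
 * page's position in an LRU queue, count a hit at that depth, and move
 * the page to the front. */
#include <stdio.h>

#define MAX_PAGES 1024

static int  queue[MAX_PAGES];           /* queue[0] is most recently used */
static int  queue_len = 0;
static long histogram[MAX_PAGES + 1];   /* hits at each depth; last bucket = first touches */

static void reference(int page) {
    int pos = -1;
    for (int i = 0; i < queue_len; i++)
        if (queue[i] == page) { pos = i; break; }

    if (pos < 0) {
        histogram[MAX_PAGES]++;          /* first touch: no recency yet */
        if (queue_len < MAX_PAGES) queue_len++;
        pos = queue_len - 1;
    } else {
        histogram[pos]++;                /* hit at LRU depth pos */
    }
    for (int i = pos; i > 0; i--)        /* move the page to the front */
        queue[i] = queue[i - 1];
    queue[0] = page;
}

int main(void) {
    int refs[] = {1, 2, 3, 1, 2, 3, 4, 1};
    for (unsigned i = 0; i < sizeof(refs) / sizeof(refs[0]); i++)
        reference(refs[i]);
    for (int d = 0; d < 8; d++)
        printf("depth %d: %ld hits\n", d, histogram[d]);
    return 0;
}
```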
It is common to experiment with memory management policies in simulation, thus making it possible to compare policies on the same inputs and reproduce the results. However, such simulation is based on reference traces---logs of the memory operations performed by real, executing processes. Previous methods of collecting reference traces either work only on single processes, or they do not collect enough information about processes, threads, and memory mappings to make the simulation of a multi-programmed workload possible. Laplace is a tool designed to collect reference traces of every kernel-level thread and the kernel itself, as well as information about threads, processes, and memory mappings. With this information, it becomes possible to simulate a whole system, including a scheduler and memory manager, as it would execute a collection of processes.
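To give a flavor of what such a trace must contain, the sketch below interleaves process, thread, and mapping events with page references so that a simulator can reconstruct a whole system; the event names are illustrative assumptions, not Laplace's actual event set.

```c
/* Hedged sketch of the kinds of events a whole-system trace might
 * interleave with references; illustrative only. */
#include <stdint.h>
#include <stdio.h>

typedef enum {
    EV_PROCESS_CREATE,   /* new address space */
    EV_THREAD_CREATE,    /* new kernel-level thread in a process */
    EV_MMAP,             /* region mapped (file-backed or anonymous) */
    EV_MUNMAP,           /* region unmapped */
    EV_CONTEXT_SWITCH,   /* scheduler dispatches another thread */
    EV_REFERENCE         /* one page-level memory reference */
} event_type_t;

typedef struct {
    event_type_t type;
    uint32_t pid, tid;
    uint64_t arg0, arg1;   /* e.g., a mapped range, or the referenced page */
} event_t;

/* A whole-system simulator would consume the stream in order, maintaining
 * per-process address spaces plus a model scheduler and memory manager. */
static void consume(const event_t *ev) {
    if (ev->type == EV_REFERENCE)
        printf("thread %u touched page %llu\n",
               (unsigned)ev->tid, (unsigned long long)ev->arg0);
}

int main(void) {
    event_t ev = { EV_REFERENCE, 1, 1, 42, 0 };
    consume(&ev);
    return 0;
}
```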
A simple idea for reducing the number of page faults by speculatively fetching a number of pages at each fault and then caching those pages for "some time". We re-evaluate an idea whose previous analysis was incomplete and based on reference behavior for processes from over two decades ago. Specifically, previous work focused on the selection of pages to prepage, but not on the problem of caching those pages. We focus on the caching of prepaged pages and their competition with pages that have actually been referenced.
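The sketch below models that competition in a simple way, assuming a small FIFO prefetch buffer kept separate from the resident set and a fixed prepaging degree; the sizes, the neighbor heuristic, and the FIFO ordering are illustrative assumptions, not the policies evaluated in this work.

```c
/* Hedged sketch of demand prepaging with a separate prefetch buffer:
 * on a fault we fetch the missing page plus PREPAGE_DEGREE neighbors,
 * but a prepaged page joins the resident set only if it is referenced
 * before being pushed out of the buffer. */
#include <stdio.h>
#include <string.h>

#define RESIDENT_CAP    8
#define PREFETCH_CAP    4
#define PREPAGE_DEGREE  2

static int resident[RESIDENT_CAP], nres = 0;   /* pages with real references */
static int prefetch[PREFETCH_CAP], npre = 0;   /* speculatively fetched pages */
static int faults = 0;

static int find(const int *a, int n, int page) {
    for (int i = 0; i < n; i++) if (a[i] == page) return i;
    return -1;
}

static void admit_resident(int page) {          /* FIFO for brevity, not LRU */
    if (nres == RESIDENT_CAP) memmove(resident, resident + 1, (--nres) * sizeof(int));
    resident[nres++] = page;
}

static void admit_prefetch(int page) {
    if (find(resident, nres, page) >= 0 || find(prefetch, npre, page) >= 0) return;
    if (npre == PREFETCH_CAP) memmove(prefetch, prefetch + 1, (--npre) * sizeof(int));
    prefetch[npre++] = page;
}

static void reference(int page) {
    if (find(resident, nres, page) >= 0) return;            /* resident hit */
    int p = find(prefetch, npre, page);
    if (p >= 0) {                                           /* prepage paid off */
        memmove(prefetch + p, prefetch + p + 1, (--npre - p) * sizeof(int));
        admit_resident(page);
        return;
    }
    faults++;                                               /* real page fault */
    admit_resident(page);
    for (int d = 1; d <= PREPAGE_DEGREE; d++)               /* speculate on neighbors */
        admit_prefetch(page + d);
}

int main(void) {
    int refs[] = {10, 11, 12, 30, 31, 10, 32};
    for (unsigned i = 0; i < sizeof(refs) / sizeof(refs[0]); i++)
        reference(refs[i]);
    printf("faults: %d\n", faults);
    return 0;
}
```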
Reducing the demand on the backing store by compressing in-memory program data, thus splitting main memory into an uncompressed cache and a compressed cache. By dynamically adapting the amount of compressed data stored in main memory, we minimize the combined time spent transferring pages to and from disk and compressing/decompressing data.
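A minimal cost-model sketch of that adaptation, under assumed constants and an assumed 2:1 compression ratio: for each candidate split of physical frames, charge compressed-cache hits a compression/decompression cost, charge references beyond both regions a disk cost, and keep the split with the lowest estimate. It reuses the recency histogram described above and is not the measured model from this work.

```c
/* Hedged cost-model sketch for sizing the compressed cache: the constants
 * and compression ratio are illustrative assumptions. */
#include <stdio.h>

#define DEPTHS 64

int main(void) {
    long hist[DEPTHS] = {0};                   /* hits at each LRU depth (toy data) */
    hist[3] = 500; hist[10] = 400; hist[20] = 300; hist[40] = 200;

    const int    nframes   = 16;
    const double ratio     = 2.0;              /* compressed pages per frame */
    const double comp_cost = 0.05;             /* ms to compress + decompress */
    const double disk_cost = 5.0;              /* ms per page fault to disk */

    int best_c = 0; double best_cost = -1.0;
    for (int c = 0; c <= nframes; c++) {       /* c frames devoted to compressed pages */
        int uncompressed = nframes - c;
        int total_cached = uncompressed + (int)(c * ratio);
        double cost = 0.0;
        for (int d = 0; d < DEPTHS; d++) {
            if (d < uncompressed)       cost += 0.0;                  /* uncompressed hit */
            else if (d < total_cached)  cost += hist[d] * comp_cost;  /* compressed hit */
            else                        cost += hist[d] * disk_cost;  /* disk fault */
        }
        if (best_cost < 0 || cost < best_cost) { best_cost = cost; best_c = c; }
    }
    printf("best split: %d of %d frames compressed (estimated cost %.1f ms)\n",
           best_c, nframes, best_cost);
    return 0;
}
```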
A page replacement algorithm that dynamically identifies when it would be beneficial to perform non-LRU evictions on some pages so that others may remain resident for longer.
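A simplified sketch of the kind of decision involved, again using a recency histogram: compare the hits that only strict LRU would capture (between an early eviction point and the memory size) against the hits that only early eviction would capture (just beyond the memory size). The regions and the rule here are a simplification for illustration, not the exact EELRU algorithm.

```c
/* Hedged sketch of an EELRU-style decision rule; simplified for
 * illustration, not the algorithm itself. */
#include <stdio.h>

static int prefer_early_eviction(const long *hist, int e, int m, int l) {
    long inside = 0, beyond = 0;
    for (int d = e; d < m; d++) inside += hist[d];  /* served only under strict LRU */
    for (int d = m; d < l; d++) beyond += hist[d];  /* served only with early eviction */
    return beyond > inside;
}

int main(void) {
    long hist[64] = {0};
    /* A loop slightly larger than memory: most hits land just beyond m. */
    for (int d = 34; d < 40; d++) hist[d] = 100;
    printf("early eviction beneficial: %s\n",
           prefer_early_eviction(hist, 8, 32, 48) ? "yes" : "no");
    return 0;
}
```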
Novel algorithms and their implementations for the lossy reduction of reference traces, where the information lost is guaranteed not to affect the behavior of LRU and OPT memories at or above a user-selected minimum size.
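As a deliberately conservative illustration of the guarantee, the sketch below drops only immediate re-references to the same page, which cannot change the contents of any LRU or OPT memory of size one or more; the actual algorithms drop far more aggressively while preserving fidelity above the user-selected size.

```c
/* Deliberately conservative trace-reduction sketch: keep only the first
 * reference of each run of identical pages.  Illustrates the flavor of
 * the fidelity guarantee, not the published algorithms. */
#include <stdio.h>

int main(void) {
    int trace[]  = {5, 5, 5, 7, 7, 5, 9, 9, 9, 7};
    int reduced[16];
    int n = sizeof(trace) / sizeof(trace[0]), m = 0, last = -1;

    for (int i = 0; i < n; i++) {
        if (trace[i] != last)            /* drop immediate re-references */
            reduced[m++] = trace[i];
        last = trace[i];
    }

    printf("kept %d of %d references:", m, n);
    for (int i = 0; i < m; i++) printf(" %d", reduced[i]);
    printf("\n");
    return 0;
}
```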