kVMTrace is a patch to the Linux 2.4.20 kernel that enables a system to log the virtual memory and file system references performed by every user-level task. It records sufficient information to attribute every reference to the task that performed it. Furthermore, all uses of shared space are identified, thus allowing a simulator using kVMTrace traces to avoid "double-counting" such pages.
kVMTrace captures virtual memory references using the same principle as the Tapeworm II trace collector described by Uhlig, Nagle, Mudge, and Sechrest in a 1994 ASPLOS paper: all but a small, fixed portion of a process's virtual memory mappings are access protected. A reference to any protected page triggers a page fault, which kVMTrace handles by logging the reference and unprotecting the page, potentially protecting some other page. References to unprotected pages occur normally, with no interference by kVMTrace. Thus, so long as the pool of unprotected pages is sufficiently large, the system's performance is quite tolerable.
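The mechanism can be illustrated with a small user-space model. The sketch below is not kVMTrace code: the class name UnprotectedPool, its method names, and the FIFO choice of which page to re-protect are assumptions made for illustration (the text does not say how kVMTrace picks the page to protect), and a real implementation would manipulate page protections inside the kernel's fault handler.

```python
from collections import OrderedDict


class UnprotectedPool:
    """Toy model of a fixed-size pool of unprotected pages (hypothetical)."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pool = OrderedDict()  # unprotected pages, oldest first
        self.log = []              # references recorded by the "tracer"

    def reference(self, page):
        """Simulate one reference; return True if it faulted and was logged."""
        if page in self.pool:
            # Unprotected page: the reference proceeds with no fault,
            # so the tracer never sees it.
            return False
        # Protected page: the fault handler logs the reference...
        self.log.append(page)
        # ...re-protects another page if the pool is full
        # (assumption: oldest-first, i.e. FIFO)...
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)
        # ...and unprotects the faulting page.
        self.pool[page] = None
        return True


if __name__ == "__main__":
    pool = UnprotectedPool(capacity_pages=2)
    for page in [1, 2, 1, 3, 1, 2]:
        pool.reference(page)
    print(pool.log)
```

With a pool of two pages, the reference string 1, 2, 1, 3, 1, 2 logs 1, 2, 3, 1, 2: only the third reference hits an unprotected page and goes unrecorded. A larger pool absorbs more references without faulting, which is why the pool size governs both overhead and trace detail.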
More concretely, real main memories are currently at least 256 MB, and often 512 MB or 1 GB. For an unprotected pool of 2 MB, any current main memory will be more than a factor of 100 larger. The inaccuracy introduced by kVMTrace is similar in kind to that of the stack deletion method of reference trace reduction, for which a factor of 30 has been shown to be sufficient to ensure accurate results with LRU, CLOCK, segmented queue, and other common, basic replacement policies. With a 2 MB unprotected pool size, kVMTrace will slow system execution by a factor of anywhere from 1.5 to 5, which is fast enough for regular use.
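The ratios quoted above can be checked with a few lines of arithmetic. This is only a worked calculation using the figures from the text (2 MB pool, memories of 256 MB to 1 GB, a required factor of 30); the constant and function names are illustrative.

```python
MB = 1024 * 1024
POOL_BYTES = 2 * MB     # unprotected pool size from the text
REQUIRED_FACTOR = 30    # factor cited in the text for stack deletion


def memory_to_pool_ratio(memory_bytes, pool_bytes=POOL_BYTES):
    """Return how many times larger main memory is than the pool."""
    return memory_bytes / pool_bytes


if __name__ == "__main__":
    for size_mb in (256, 512, 1024):
        ratio = memory_to_pool_ratio(size_mb * MB)
        ok = ratio >= REQUIRED_FACTOR
        print(f"{size_mb} MB / 2 MB = {ratio:.0f}x, meets factor of 30: {ok}")
```

Even the smallest memory considered, 256 MB, is 128 times the pool size, comfortably above both the factor of 100 mentioned in the text and the factor of 30 cited for stack deletion.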
Thus, kVMTrace can be applied to real systems running real multi-programmed workloads that may be compute-based, server-based, or interactive. It requires neither source code nor object code, and it is efficient enough that gathering traces of hours or days of execution, referencing tens of GB, can be practical.