We describe here the basic trace format emitted by Laplace. This format closely mirrors the sequences of memory references and kernel events as Laplace itself collects them. It is possible to modify Laplace to process these sequences in arbitrary ways before they are emitted. It is also possible simply to pipe the traces described below into separate utilities that perform such processing. The former approach has the advantage that the I/O work of emitting and re-reading the original sequence is avoided entirely.
The first component of Laplace, implemented within the machine simulator, emits a log of memory references performed by the processor. Each load, store, and instruction fetch is recorded. Because the speed at which this log can be recorded is the most limiting factor on the performance of trace collection, the log is not converted into a human-readable form within Laplace. Instead, it is emitted in a simple binary format, described below. For those who wish to convert this binary format into a human-readable format, we have implemented raw-reference-binary-to-text -- a simple utility that performs this conversion.
Each memory reference is stored as one record within the raw reference trace. Each such record is stored as a 5-tuple whose fields take the following form (given in pseudo-code):
struct reference_record_s {
    8-bit char type;
    64-bit unsigned int timestamp;
    8-bit unsigned int length;
    32-bit unsigned int virtual_address_space;
    32-bit unsigned int virtual_address;
}
Thus, each trace record comprises 144 bits (18 bytes), where each field has a fixed size. We now give a description of the values that could be stored in each one of these fields, and thus how they could be interpreted:
The raw-reference-binary-to-text utility can read this format and instead emit records, in text, of the following form:
r 123456789abcdef0 4 9f8e7 9a8b7c6d
Here, the order of the fields and their meanings are identical to the above. Each line of text represents one record with five fields of the same length, where the numerical values are given in hexadecimal. We recommend the use of the human-readable text format as the canonical format that other utilities can read and perhaps as the primary format for storage. The binary format exists only for speed of logging at trace collection time. Throughout the rest of the documentation, we assume the use of the text format.
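The relationship between the binary record and its text form can be sketched in Python. This is only an illustration, not the raw-reference-binary-to-text utility itself: it assumes the binary record is packed without padding, in little-endian byte order, with fields in the order of the pseudo-code struct, and that the text form prints numbers in hexadecimal without zero padding. The function names are hypothetical.

```python
import struct

# Assumed layout: packed, little-endian, fields in pseudo-code order.
# The documentation gives only the field widths; byte order and the
# absence of padding are assumptions made for this sketch.
RECORD = struct.Struct("<cQBII")  # 1 + 8 + 1 + 4 + 4 = 18 bytes


def decode_binary_record(raw):
    """Unpack one 18-byte record into (type, timestamp, length, space, address)."""
    kind, timestamp, length, space, address = RECORD.unpack(raw)
    return kind.decode("ascii"), timestamp, length, space, address


def format_text_record(record):
    """Render a record as one text line with hexadecimal numeric fields."""
    kind, timestamp, length, space, address = record
    return f"{kind} {timestamp:x} {length:x} {space:x} {address:x}"


def parse_text_record(line):
    """Parse one text line back into the same 5-tuple."""
    kind, *numbers = line.split()
    if len(numbers) != 4:
        raise ValueError(f"expected five fields, got: {line!r}")
    timestamp, length, space, address = (int(n, 16) for n in numbers)
    return kind, timestamp, length, space, address


# Build a binary record holding the values of the text example above,
# then convert it to text and back.
raw = RECORD.pack(b"r", 0x123456789ABCDEF0, 4, 0x9F8E7, 0x9A8B7C6D)
record = decode_binary_record(raw)
line = format_text_record(record)
print(line)
```

If the byte order or padding of a real trace differs, only the format string passed to struct.Struct should need to change.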
While the kernel may be made to log any arbitrary event, we list here the kernel events that we have found useful for simulating a multi-programmed, multi-threaded workload on a system capable of different scheduling and memory management disciplines. Because kernel events are generated at a far lower rate than memory references, logging kernel events is not a performance bottleneck. Therefore, the kernel trace is recorded by Laplace in a human-readable format, described below.
A kernel event is recorded as a single trace record. The format of each record depends on the type of kernel event. However, each record begins with the same two fields -- a type tag and a timestamp -- and ends with a newline character. The type tag is used to determine how to parse the remainder of the record. The timestamp value is taken from the same cycle counting register that is used for the raw reference trace. We catalog here each type of kernel event that is logged, as well as the record format used for that event type.
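Because every record starts with a type tag and a timestamp, a reader can split off that common prefix before dispatching on the tag. The following Python sketch shows one way to do so. It assumes the fields are separated by whitespace and that the timestamp is written in hexadecimal, as in the raw reference text format; the function name is hypothetical.

```python
def split_kernel_record(line):
    """Split one kernel trace line into (type tag, timestamp, rest).

    `rest` is returned unparsed, since its layout depends on the tag.
    """
    parts = line.rstrip("\n").split(None, 2)
    if len(parts) < 2:
        raise ValueError(f"record lacks a type tag and timestamp: {line!r}")
    tag = parts[0]
    timestamp = int(parts[1], 16)  # assumed to be hexadecimal
    rest = parts[2] if len(parts) > 2 else ""
    return tag, timestamp, rest


tag, timestamp, rest = split_kernel_record("s 123456789abcdef0 100\n")
print(tag, hex(timestamp), rest)
```

Keeping the remainder as a single string leaves the tag-specific parser free to decide how to treat a trailing field such as a file name.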
Schedule: When a task is scheduled by the kernel for execution on the CPU, that event is logged with the following format:
s 123456789abcdef0 100
Fork: Use of the fork() system call is logged to record the creation of a new task. Note that a task is initially assumed to be a new thread of the calling process; only later, when the task is assigned a new address space, is that task considered the first thread of a new process. A forking event is logged as a record that takes the following format:
F 123456789abcdef0 100 77
Exit: When a task uses the exit() system call, that event is logged to record the deletion of that task. An exit event is logged as a record that takes the following format:
T 123456789abcdef0 100
Execute: When a task uses the exec() system call (or one of its derivatives), a new executable image is associated with the address space and all tasks assigned to it. An execute event is logged as a record that takes the following format:
X 123456789abcdef0 100 1a2b3b4c 35 5 foobar.so
Address space assignment: When a task is assigned to a new address space, that event is logged. The assignment of an address space event is logged as a record that takes the following format:
A 123456789abcdef0 100 1a2b3
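The interplay of fork, exit, and address space assignment records can be illustrated with a small Python sketch that tracks which address space each live task belongs to. The field meanings are assumptions, since only example records are shown: the sketch takes the fields after the timestamp to be the parent and child task IDs in an F record, the task ID in a T record, and the task ID followed by the address space ID in an A record, all in hexadecimal. The function name is hypothetical.

```python
def track_tasks(lines):
    """Return {task_id: address_space_id} for tasks alive after `lines`."""
    space_of = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "F":
            # Assumed fields: parent task ID, child task ID.
            parent, child = int(fields[2], 16), int(fields[3], 16)
            # A new task starts as a thread of the calling process, so it
            # shares the parent's address space (None if that is unknown).
            space_of[child] = space_of.get(parent)
        elif tag == "A":
            # Assumed fields: task ID, address space ID.
            task, space = int(fields[2], 16), int(fields[3], 16)
            space_of[task] = space
        elif tag == "T":
            # Assumed field: ID of the exiting task.
            space_of.pop(int(fields[2], 16), None)
    return space_of


trace = [
    "A 123456789abcdef0 100 1a2b3",
    "F 123456789abcdef1 100 77",
    "A 123456789abcdef2 77 9f8e7",
    "T 123456789abcdef3 100",
]
tasks = track_tasks(trace)
print({hex(t): hex(s) for t, s in tasks.items()})
```

In this example, task 77 begins as a thread in the address space of task 100 and becomes the first thread of a new process once its own A record appears.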
Duplicate range: Once a new task has been created and assigned to a new address space that no other task uses, a new process exists in its own address space. The address space of the parent process is then shared with it in a copy-on-write (COW) fashion. As the kernel makes each region of the parent's space shared with the child's new address space, Laplace records that event as a range duplication. Each such duplication event is logged as a record that takes the following format:
D 123456789abcdef0 1a2b3 9f8e7 56f73000 56f8000
Memory mapping a file: When a process maps a file into its address space using the mmap() system call, that event is logged using the following format:
M 123456789abcdef0 100 1a2b3c4d 9000 8efa5567 35 5 1029384756afbecd foobar.so
Anonymous memory mapping: A process may declare a region of its virtual address space to be anonymous using the mmap() system call -- that is, reserved and addressable, but not associated with any backing file. Such events are logged using the following format:
m 123456789abcdef0 100 1a2b3c4d 9000
IPC shared memory attachment: An IPC shared memory region may be attached by mapping it into an address space using the shmat() system call. Such events are logged using the following format:
S 123456789abcdef0 100 1a2b3c4d 9000 8912ef45
Memory unmapping: A region of a virtual address space may be unmapped either because the process directly called munmap() or because some other action indirectly caused the kernel to call munmap() on behalf of that process. Unmapping events are logged using the following format:
U 123456789abcdef0 100 1a2b3c4d 9000
Complete unmapping: The kernel may clear an entire address space by destroying all of its mappings; these events are common as part of processing exec() or exit() system calls. Complete unmapping events are logged as follows:
u 123456789abcdef0 1a2b3
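As an illustration of how a trace consumer might replay the anonymous mapping, unmapping, and complete unmapping records above, the following Python sketch maintains the list of mapped ranges for each address space. It rests on several assumptions: the fields after the timestamp in m and U records are taken to be a task ID, a starting address, and a length in bytes; the field in a u record is taken to be an address space ID; and the caller supplies a hypothetical space_of dictionary that maps task IDs to address space IDs. File and shared memory mappings are left out to keep the sketch short.

```python
def subtract(regions, start, end):
    """Remove [start, end) from a list of disjoint (s, e) half-open ranges."""
    kept = []
    for s, e in regions:
        if e <= start or s >= end:
            kept.append((s, e))  # no overlap: keep unchanged
            continue
        if s < start:
            kept.append((s, start))  # part below the removed range survives
        if e > end:
            kept.append((end, e))  # part above the removed range survives
    return kept


def replay_mappings(lines, space_of):
    """Return {address_space_id: sorted list of mapped (start, end) ranges}."""
    regions = {}
    for line in lines:
        f = line.split()
        if not f:
            continue
        if f[0] in ("m", "U"):
            # Assumed fields: task ID, start address, length in bytes.
            space = space_of[int(f[2], 16)]
            start = int(f[3], 16)
            end = start + int(f[4], 16)
            current = subtract(regions.get(space, []), start, end)
            if f[0] == "m":
                current.append((start, end))  # new mapping replaces overlap
            regions[space] = sorted(current)
        elif f[0] == "u":
            # Assumed field: ID of the address space being cleared.
            regions[int(f[2], 16)] = []
    return regions


space_of = {0x100: 0x1A2B3}
trace = [
    "m 123456789abcdef0 100 10000 9000",
    "U 123456789abcdef1 100 12000 2000",
]
regions = replay_mappings(trace, space_of)
print([(hex(s), hex(e)) for s, e in regions[0x1A2B3]])
```

Unmapping the middle of a range splits it in two, which is why each address space holds a list of ranges rather than one entry per mapping call.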
COW unmapping and duplication: When a page has been mapped as read-only for COW purposes (see range duplication, above) and a process attempts to write to that page, a page fault occurs. A new copy of the page is created, and that new page is mapped with read/write permissions into the faulting process's address space. Such COW unmapping and duplication events are logged using the following form:
C 123456789abcdef0 100 1a2b3000
Buffer cache allocation and deallocation, or filesystem cache deallocation: When the kernel allocates a page for buffer cache use, when the kernel deallocates such a page that has been used for the buffer cache, or when the kernel deallocates a page that has been used for the filesystem cache, each such event is logged in the following form:
B 123456789abcdef0 c1234000
b 123456789abcdef0 d9898000
p 123456789abcdef0 dc102000
Filesystem cache allocation: When the kernel allocates a page to the filesystem cache, it does so for a specific page of the logical address space of a specific file. Each such allocation is logged in the following form:
P 123456789abcdef0 dc102000 1a2b3c4d 35 5 1029384756afbecd
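The cache records above lend themselves to a simple replay that tracks which pages are currently held by each cache. The Python sketch below is one way to do that, under assumptions drawn only from the order of the examples: B and b are taken to mark buffer cache allocation and deallocation, P and p to mark filesystem cache allocation and deallocation, and the field after the timestamp is taken to be the address of the page. The function name is hypothetical.

```python
def live_cache_pages(lines):
    """Return (buffer cache pages, filesystem cache pages) still allocated."""
    buffer_pages, file_pages = set(), set()
    for line in lines:
        f = line.split()
        if len(f) < 3:
            continue
        tag = f[0]
        if tag not in ("B", "b", "P", "p"):
            continue  # some other kind of kernel event
        page = int(f[2], 16)  # assumed to be the page address
        if tag == "B":
            buffer_pages.add(page)
        elif tag == "b":
            buffer_pages.discard(page)
        elif tag == "P":
            file_pages.add(page)  # trailing file fields are ignored here
        else:
            file_pages.discard(page)
    return buffer_pages, file_pages


trace = [
    "B 123456789abcdef0 c1234000",
    "B 123456789abcdef1 d9898000",
    "b 123456789abcdef2 d9898000",
    "P 123456789abcdef3 dc102000 1a2b3c4d 35 5 1029384756afbecd",
    "p 123456789abcdef4 dc102000",
]
buffer_pages, file_pages = live_cache_pages(trace)
print(sorted(hex(p) for p in buffer_pages), sorted(hex(p) for p in file_pages))
```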
File open: When the open() system call is used to open a file, that event is logged using the following format:
( 123456789abcdef0 100 1a2b3c4d 35 5 bazquux.txt
File close: When the close() system call is used to close a file, that event is logged using the following format:
) 123456789abcdef0 100 1a2b3c4d 35 5
File read/write: When the read() or write() system call is used to load data from a file or store it into a file, each such event is logged using the following format:
< 123456789abcdef0 100 1a2b3c4d 35 5 1029384756afbecd e012
> 123456789abcdef0 105 1a2b3c4d 35 5 1029384756afbeff 1b1b
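A consumer of the read/write records might, for example, total the amount of data transferred per file. The Python sketch below does this under stated assumptions: the three fields shared with the open and close records (1a2b3c4d 35 5 in the examples) are taken to identify the file, the next field is taken to be a file offset, and the final field is taken to be a byte count in hexadecimal. Totals are keyed by the record's type tag rather than by "read" or "write", since the text does not say which tag denotes which call. The function name is hypothetical.

```python
from collections import defaultdict


def bytes_by_file(lines):
    """Sum the assumed byte-count field per (type tag, file identity)."""
    totals = defaultdict(int)
    for line in lines:
        f = line.split()
        if len(f) == 8 and f[0] in ("<", ">"):
            file_key = tuple(f[3:6])  # assumed to identify the file
            totals[(f[0], file_key)] += int(f[7], 16)  # assumed byte count
    return dict(totals)


trace = [
    "< 123456789abcdef0 100 1a2b3c4d 35 5 1029384756afbecd e012",
    "> 123456789abcdef0 105 1a2b3c4d 35 5 1029384756afbeff 1b1b",
    "< 123456789abcdef1 100 1a2b3c4d 35 5 1029384756afcedf 10",
]
totals = bytes_by_file(trace)
for (tag, file_key), count in sorted(totals.items()):
    print(tag, " ".join(file_key), hex(count))
```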
File deletion: A task may call for a file to be deleted via a system call. However, that file may not be deleted if there are other directory entries (hard links) to its inode. Only when an inode is deleted and the filesystem blocks occupied by the logical pages of that file are removed is a file deletion event logged, as follows:
- 123456789abcdef0 1a2b3c4d 35 5
Kernel trace completion: The end of the kernel trace is marked with an explicit token:
c
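The completion token gives a reader a way to tell a complete kernel trace from a truncated one. A minimal Python sketch, assuming the token appears as a line containing only the character c (the function name is hypothetical):

```python
def read_kernel_trace(lines):
    """Collect records up to the completion token.

    Returns (records, complete), where `complete` is True only if the
    explicit end-of-trace token was seen.
    """
    records = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text == "c":
            return records, True
        records.append(text)
    return records, False


records, complete = read_kernel_trace(["s 123456789abcdef0 100\n", "c\n"])
print(len(records), complete)
```

Matching the whole line, rather than just its first character, keeps the token distinct from any record whose type tag happens to be a c followed by further fields.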