Limitations of dynamic instrumentation
The most obvious limitation of dynamic instrumentation with tools such as Valgrind and Purify is that the instrumented program runs much slower than the original program. The instrumented program has to keep a database of every allocated chunk of memory, and each memory read and write instruction has to be augmented with extra instructions that consult this database to find out whether the memory location is valid. Instrumented programs typically run more than 30x slower. This slowdown often makes it impractical to do dynamic checking of a program with a large test suite. Consequently, dynamic checking is mostly used for debugging a specific problem or with a smaller test suite.
There are also some limitations on the errors that dynamic instrumentation can detect. The ability to detect memory errors relies on being able to distinguish between valid and invalid memory accesses. Some kinds of memory access are clearly an error; for example, a read of uninitialized memory is unambiguously a problem. Other situations are not so clear cut.
As an example, consider an application that allocates a structure, uses that structure, and then frees it. If a pointer to this region of memory is used after the free, the tool can detect that the access is to invalid memory and report it as a freed memory access error. Such a pointer is known as a dangling pointer: a pointer to a region of memory that is no longer valid.
However, if a later malloc() reuses the same region of memory, then the memory is considered valid again. Now a memory access through the stale pointer is indistinguishable from a memory access through a legitimate pointer to the region. So it is not possible for a tool to report an error when the stale pointer is used.
    int *area1 = malloc(64);
    free(area1);
    char *area2 = malloc(64); // area2 gets the memory area just freed by area1
    area1[0] = 0;             // Stale Pointer Access
There is a similar situation where a pointer gets corrupted. If the corrupted pointer happens to point to a valid region of memory, then it’s not possible for a tool to determine that this is a corrupted, rather than legitimate, pointer.
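The blind spot described above can be illustrated with a toy model. The sketch below is not how Valgrind or Purify are implemented; it assumes a hypothetical fixed-size arena, a first-fit allocator (toy_malloc, toy_free), and a per-chunk shadow table standing in for the tool's database. All of these names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: a 256-byte arena split into 64-byte chunks,
   with one shadow entry per chunk playing the role of the database. */
#define ARENA_SIZE 256
#define CHUNK_SIZE 64
#define NUM_CHUNKS (ARENA_SIZE / CHUNK_SIZE)

enum chunk_state { CHUNK_FREE = 0, CHUNK_LIVE = 1 };

static unsigned char arena[ARENA_SIZE];
static enum chunk_state shadow[NUM_CHUNKS];

/* First-fit allocation: hand out the lowest-numbered free chunk. */
static void *toy_malloc(void) {
    for (size_t i = 0; i < NUM_CHUNKS; i++) {
        if (shadow[i] == CHUNK_FREE) {
            shadow[i] = CHUNK_LIVE;
            return &arena[i * CHUNK_SIZE];
        }
    }
    return NULL; /* arena exhausted */
}

/* Mark the chunk containing p as free again. */
static void toy_free(void *p) {
    uintptr_t offset = (uintptr_t)p - (uintptr_t)arena;
    assert(offset < ARENA_SIZE);
    assert(shadow[offset / CHUNK_SIZE] == CHUNK_LIVE);
    shadow[offset / CHUNK_SIZE] = CHUNK_FREE;
}

/* The check an instrumenting tool would run before every load or store:
   returns 1 if p lies in a live chunk, 0 otherwise. */
static int access_is_valid(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)arena;
    if (addr < base || addr >= base + ARENA_SIZE)
        return 0;
    return shadow[(addr - base) / CHUNK_SIZE] == CHUNK_LIVE;
}
```

In this model, a pointer returned by toy_malloc() fails access_is_valid() immediately after toy_free(), which is the dangling pointer case a tool can report. Once a second toy_malloc() reuses the same chunk, the stale pointer passes the check again, because the shadow table records only the state of the memory, not which pointer is being used to reach it.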
Hardware support for memory error detection
New SPARC processors, starting with the SPARC M7 and SPARC T7 processors, have hardware support for Silicon Secured Memory (SSM, previously known as ADI). This hardware support allows real-time detection of memory access errors.
Data is stored in memory in units of 64 bytes called cache lines. When one or more bytes of data are loaded from memory, the entire 64-byte block containing that data is fetched. The latest SPARC processors extend this by adding four additional bits to each cache line. Fetching the 64-byte cache line also fetches these additional bits. These four bits are invisible to the application and are used to hold additional information for SSM.
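To make the 64-byte granularity concrete, the following sketch computes which cache line an address falls in and how many lines an access touches. It is plain address arithmetic under the 64-byte line size stated above; the function names line_base and lines_touched are illustrative and not part of any SPARC or Solaris interface.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64u

/* Address of the first byte of the 64-byte line that contains addr. */
static uint64_t line_base(uint64_t addr) {
    return addr & ~(uint64_t)(CACHE_LINE_BYTES - 1);
}

/* Number of cache lines covered by an access of len bytes starting at addr.
   Assumes addr + len - 1 does not wrap around the 64-bit address space. */
static uint64_t lines_touched(uint64_t addr, uint64_t len) {
    assert(len > 0);
    uint64_t first = line_base(addr);
    uint64_t last = line_base(addr + len - 1);
    return (last - first) / CACHE_LINE_BYTES + 1;
}
```

A one-byte load at 0x103F fetches the whole line starting at 0x1000, while a two-byte access at the same address straddles two lines. Because the four extra bits belong to the line, every byte within one 64-byte line shares the same value of those bits.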
The best way of thinking of the bits is to imagine them containing a color. For example, a value of one could be thought of as Red, a value of two Green, and so on. So a cache line, of 64 bytes, can be thought of as both containing 64 bytes of data and having a color.
Whenever we need to access a memory location, we need to have a pointer to that memory location. Pointers are 64 bits in size, which allows a 64-bit processor to potentially access 16 EiB of data, about 17,000,000 TB. No current system can hold this much memory. For example, a SPARC M7-32 system can contain a staggering 64 TB of memory. Consequently, a 64-bit processor does not need to use all 64 bits in a pointer. Normally the unused bits are constrained to be all zero or all one, but SSM uses them for a different purpose. Instead of requiring the most significant four bits to be all zero or all one, SSM uses them to store color values. This means that all pointers can be thought of as colored, in the same way that all the cache lines in memory are colored.
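The claim that a pointer has bits to spare can be checked with a little arithmetic. The sketch below computes how many address bits are needed to reach every byte of a memory of a given size; it assumes, for illustration, that 64 TB means 64 x 2^40 bytes, and the helper name address_bits_needed is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of bits n such that 2^n >= bytes, i.e. enough bits
   to give every byte of the memory a distinct address. */
static unsigned address_bits_needed(uint64_t bytes) {
    assert(bytes > 0);
    unsigned bits = 0;
    while (bits < 64 && (UINT64_C(1) << bits) < bytes)
        bits++;
    return bits;
}
```

Under that assumption, 64 TB needs 46 address bits, which leaves 18 of the 64 pointer bits unused for addressing, comfortably more than the four bits SSM takes for the color.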
SSM uses the fact that we can color both pointer and memory to check for invalid memory accesses. A “red” cache line can only be accessed through a “red” pointer. It is an error to use a “red” pointer to access a “green” cache line. The hardware will cause a trap when this color mismatch occurs.
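The color check can be modeled in software to show the idea. The sketch below is a simulation only: it assumes a hypothetical eight-line memory, keeps each line's color in an ordinary array, and stores the pointer's color in the four most significant bits of a 64-bit integer. The real check is done by the processor, and none of these function names correspond to an actual SSM interface.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_BYTES   64u
#define NUM_LINES    8u                 /* hypothetical tiny memory */
#define COLOR_SHIFT  60                 /* top four bits of the pointer */
#define ADDR_MASK    ((UINT64_C(1) << COLOR_SHIFT) - 1)

#define COLOR_RED    1u
#define COLOR_GREEN  2u

/* Simulated extra four bits attached to each cache line. */
static uint8_t line_color[NUM_LINES];

/* Build a colored pointer from a plain address and a 4-bit color. */
static uint64_t color_pointer(uint64_t addr, unsigned color) {
    assert(color < 16);
    return (addr & ADDR_MASK) | ((uint64_t)color << COLOR_SHIFT);
}

/* Extract the color stored in the pointer's top four bits. */
static unsigned pointer_color(uint64_t ptr) {
    return (unsigned)(ptr >> COLOR_SHIFT);
}

/* Color the cache line that contains addr. */
static void set_line_color(uint64_t addr, unsigned color) {
    uint64_t line = (addr & ADDR_MASK) / LINE_BYTES;
    assert(line < NUM_LINES && color < 16);
    line_color[line] = (uint8_t)color;
}

/* Returns 1 if the pointer's color matches the line's color,
   0 where the simulated hardware would trap. */
static int access_ok(uint64_t ptr) {
    uint64_t line = (ptr & ADDR_MASK) / LINE_BYTES;
    assert(line < NUM_LINES);
    return line_color[line] == pointer_color(ptr);
}
```

If line 0 is colored red and line 1 green, a red pointer to address 0 passes the check for every byte up to offset 63. Advancing the same pointer by 64 bytes moves it onto the green line, where the colors no longer match and the simulated access is rejected.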
The advantages of hardware support
The most obvious advantage of hardware support for memory error detection is the massive performance gain. The hardware takes responsibility for checking that every memory access is valid, and this usually incurs a cost only if the access is invalid and the hardware has to trap to report the error. Consequently, most applications run at close to their usual speed.
Another important advantage is that the software changes needed to support SSM can be provided in a library. An application does not need to have any instrumentation added in order to be checked. This means that even existing applications whose source code has been lost can be checked for correctness. For example, if the application is run with the command a.out, the following will enable SSM checking of the application.
% LD_PRELOAD_64= |