
Valgrind - Make Your Memory Safe Again


[0x00] Valgrind!



What does Valgrind mean to you and your memory?

[0x01] Hey, Your memory looks good to me



This is a typical layout of a Linux (32-bit) process's virtual memory. From high addresses to low addresses, we can see regions reserved for the kernel, the user stack, shared libraries, the runtime heap, the static data segment, and the program code.

Every time we call malloc() or calloc(), the allocator hands us a piece of memory from the runtime heap. Although the heap is large enough in most cases (even within a 32-bit program's virtual address space), nobody wants a heap memory leak. If your program is a long-lived process that keeps leaking, the heap may eventually be exhausted. So how can you detect a potential memory leak, or any other memory error?

[0x02] What is Valgrind

Valgrind is an instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. And we can also use Valgrind to build new tools.

The Valgrind distribution currently includes six production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache and branch-prediction profiler, and a heap profiler. It also includes three experimental tools: a stack/global array overrun detector, a second heap profiler that examines how heap blocks are used, and a SimPoint basic block vector generator. It runs on the following platforms: X86/Linux, AMD64/Linux, ARM/Linux, ARM64/Linux, PPC32/Linux, PPC64/Linux, PPC64LE/Linux, S390X/Linux, MIPS32/Linux, MIPS64/Linux, X86/Solaris, AMD64/Solaris, ARM/Android (2.3.x and later), ARM64/Android, X86/Android (4.0 and later), MIPS32/Android, X86/Darwin and AMD64/Darwin (Mac OS X 10.12).

[0x03] How Valgrind works

Framework of Valgrind

In general, Valgrind takes control of your program and runs it on a simulated CPU. It is as if Valgrind said: "Hey, no rush, let me execute this for you and check it as I go!"

Valid-Address Map and Valid-Value Map

Valgrind relies on two maps, the Valid-Address (VA) map and the Valid-Value (VV) map, to detect invalid reads/writes and uninitialized values, respectively.

In the Valid-Address map, 1 bit represents 1 byte of virtual memory. A bit of 1 means the byte is valid for reading and writing; a bit of 0 means it is not, and Valgrind reports any access to it.
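The one-bit-per-byte bookkeeping can be sketched in C. This is a toy model under stated assumptions, not Valgrind's implementation: the 64-byte address space and the names `va_map`, `va_init`, `va_set_range`, and `va_access_ok` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy address space of 64 bytes. */
#define TOY_MEM_SIZE 64

/* One bit per byte of toy memory: 1 = valid to access, 0 = not valid. */
typedef struct {
    uint8_t bits[TOY_MEM_SIZE / 8];
} va_map;

/* Start with every byte marked invalid. */
void va_init(va_map *m)
{
    assert(m != NULL);
    memset(m->bits, 0, sizeof m->bits);
}

/* Mark [addr, addr + len) as valid (valid != 0) or invalid (valid == 0).
   Addresses beyond the toy address space are ignored. */
void va_set_range(va_map *m, size_t addr, size_t len, int valid)
{
    assert(m != NULL);
    for (size_t a = addr; a < addr + len && a < TOY_MEM_SIZE; a++) {
        if (valid)
            m->bits[a / 8] |= (uint8_t)(1u << (a % 8));
        else
            m->bits[a / 8] &= (uint8_t)~(1u << (a % 8));
    }
}

/* Return 1 if every byte of the access [addr, addr + len) is valid,
   0 if at least one byte is not (this is where a tool would report). */
int va_access_ok(const va_map *m, size_t addr, size_t len)
{
    assert(m != NULL);
    for (size_t a = addr; a < addr + len; a++) {
        if (a >= TOY_MEM_SIZE || !((m->bits[a / 8] >> (a % 8)) & 1u))
            return 0;
    }
    return 1;
}
```

In this model, an allocation of 40 bytes at toy address 16 would call `va_set_range(&m, 16, 40, 1)`, and freeing it would call the same function with `valid` set to 0. A 4-byte write at address 56 then fails `va_access_ok`, which is the toy equivalent of an invalid write.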

In the Valid-Value map, 1 byte represents 1 byte of virtual memory or of a register, so every bit of data has its own validity bit. It records whether the data is meaningful content or just uninitialized random trash. When trash is used in a way that can change what the program does, for example to decide a conditional jump or to form a memory address, Valgrind reports the use of an uninitialized value.
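A shadow byte per data byte can be sketched the same way. This is a toy model, not Valgrind's code: the names `shadowed_mem`, `sm_init`, `sm_store`, `sm_copy`, and `sm_is_defined` are hypothetical, and the polarity (a set shadow bit means "initialized") is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MEM_SIZE 16   /* hypothetical toy memory size */

/* Toy memory plus its shadow: vv[i] describes mem[i] bit for bit.
   Assumption for this sketch: a set shadow bit means "initialized". */
typedef struct {
    uint8_t mem[TOY_MEM_SIZE];
    uint8_t vv[TOY_MEM_SIZE];
} shadowed_mem;

/* Fresh memory: every bit is marked uninitialized. Real memory would hold
   arbitrary trash; it is zeroed here only to keep the sketch deterministic. */
void sm_init(shadowed_mem *s)
{
    assert(s != NULL);
    memset(s->mem, 0, sizeof s->mem);
    memset(s->vv, 0, sizeof s->vv);
}

/* Storing a known value makes all 8 bits of the byte initialized. */
void sm_store(shadowed_mem *s, size_t addr, uint8_t value)
{
    assert(s != NULL && addr < TOY_MEM_SIZE);
    s->mem[addr] = value;
    s->vv[addr] = 0xFF;
}

/* Copying moves the shadow along with the data. Copying trash is not an
   error in this model; the destination simply stays marked as trash. */
void sm_copy(shadowed_mem *s, size_t dst, size_t src)
{
    assert(s != NULL && dst < TOY_MEM_SIZE && src < TOY_MEM_SIZE);
    s->mem[dst] = s->mem[src];
    s->vv[dst] = s->vv[src];
}

/* Return 1 if the byte is fully initialized, i.e. safe to branch on. */
int sm_is_defined(const shadowed_mem *s, size_t addr)
{
    assert(s != NULL && addr < TOY_MEM_SIZE);
    return s->vv[addr] == 0xFF;
}
```

The design point is that the check happens at the moment of use (`sm_is_defined` before a branch), not at the moment of copying, so moving uninitialized bytes around does not by itself produce a report.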

As for allocation, freeing, and leaks, Valgrind tracks every memory operation and records it. Only when the program terminates can Valgrind tell you whether there is a memory leak, and whether the leaked memory is definitely lost or still reachable.
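The record-keeping idea can be illustrated with a small wrapper around malloc() and free(). This is only a sketch of the principle, assuming a fixed-size table; `alloc_tracker`, `tracked_malloc`, `tracked_free`, and `leaked_bytes` are hypothetical names, and Valgrind itself intercepts the real allocator rather than asking the program to call wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_BLOCKS 128   /* hypothetical fixed capacity for the sketch */

typedef struct {
    void  *ptr;    /* address handed to the program */
    size_t size;   /* requested size in bytes */
    int    live;   /* 1 until the block is freed */
} block_record;

typedef struct {
    block_record blocks[MAX_BLOCKS];
    size_t count;  /* number of records in use; must start at 0 */
} alloc_tracker;

/* Wrapper around malloc() that records the new block. */
void *tracked_malloc(alloc_tracker *t, size_t size)
{
    assert(t != NULL && t->count < MAX_BLOCKS);
    void *p = malloc(size);
    if (p != NULL)
        t->blocks[t->count++] = (block_record){ p, size, 1 };
    return p;
}

/* Wrapper around free() that marks the matching record as dead.
   Returns 0 on success, -1 if p is not a live tracked block
   (the toy equivalent of an invalid free). */
int tracked_free(alloc_tracker *t, void *p)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->blocks[i].live && t->blocks[i].ptr == p) {
            t->blocks[i].live = 0;
            free(p);
            return 0;
        }
    }
    return -1;
}

/* Called at exit: total bytes in blocks that were never freed. */
size_t leaked_bytes(const alloc_tracker *t)
{
    assert(t != NULL);
    size_t total = 0;
    for (size_t i = 0; i < t->count; i++) {
        if (t->blocks[i].live)
            total += t->blocks[i].size;
    }
    return total;
}
```

The leak total is only meaningful once the program is done, because a block that is live halfway through the run may still be freed later. That is why the verdict has to wait for termination.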

[0x04] Get your hands dirty

Unfortunately, Valgrind doesn't support the current version (10.14) of macOS, but we can simply install and test it on Ubuntu 18.04.

You can either build Valgrind from source:
$ tar -jxvf valgrind-3.15.0.tar.bz2 && cd valgrind-3.15.0
$ ./configure && make -j$(nproc)
$ sudo make install 

Or install it with APT (the Advanced Package Tool):
$ sudo apt install valgrind
Now let's have a look at this simple program:
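The listing itself is not reproduced in this text. A minimal sketch that has exactly the two problems named below, assuming a function `f` that allocates ten ints (the name and the size are assumptions), could be:

```c
#include <stdlib.h>

/* Deliberately buggy: meant to be run under Valgrind, not used as-is. */
void f(void)
{
    int *x = malloc(10 * sizeof(int));
    x[10] = 0;   /* problem 1: heap block overrun, valid indices are 0..9 */
}                /* problem 2: memory leak, x is never freed */
```

To try it, call `f()` from a `main` function, build with debug information (for example `gcc -g`), and run the binary with `valgrind --leak-check=full`. With those assumptions, Memcheck should report an invalid write of size 4 and 40 bytes definitely lost, with line numbers thanks to `-g`.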


We can easily find two problems:
1) heap block overrun
2) memory leak -- x not freed
The question is whether Valgrind is smart enough to detect them.


I ran this program under Valgrind, and it indeed reported both of the problems we found above. Great work!

[0x05] Lost In Valgrind's ways

You may have noticed that the LEAK SUMMARY section of the Valgrind report lists different types of leaks. What is the difference between them?

[definitely lost] means your program is leaking memory -- fix those leaks!

[indirectly lost] means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the definitely lost leaks, the indirectly lost leaks should go away.

[possibly lost] means your program is leaking memory, unless you're doing funny things with pointers. This is sometimes reasonable. Use --show-possibly-lost=no if you don't want to see these reports.

[still reachable] means your program is probably ok -- it didn't free some memory it could have. This is quite common and often reasonable.

[suppressed] means that a leak error has been suppressed. There are some suppressions in the default suppression files. You can ignore suppressed errors.
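A small program fragment can make the first, second, and fourth categories concrete. This is a sketch under assumptions: `node`, `g_cache`, `new_node`, `build_tree`, and `leak_demo` are hypothetical names, and the exact numbers Valgrind prints can vary with compiler and platform.

```c
#include <stdlib.h>

typedef struct node {
    struct node *left;
    struct node *right;
} node;

/* A global pointer: whatever it points to can still be found at exit. */
node *g_cache;

/* Allocate one node; calloc() zeroes it, so both children start empty. */
node *new_node(void)
{
    return calloc(1, sizeof(node));
}

/* Build a three-node tree (a root with two children) and return the root. */
node *build_tree(void)
{
    node *root = new_node();
    if (root != NULL) {
        root->left = new_node();
        root->right = new_node();
    }
    return root;
}

void leak_demo(void)
{
    /* Never freed, but the pointer is kept in a global:
       expected to show up as "still reachable". */
    g_cache = new_node();

    /* The only pointer to the root is overwritten: the root is expected to
       be "definitely lost", and its two children "indirectly lost", since
       they are only reachable through the lost root. */
    node *tree = build_tree();
    tree = NULL;
    (void)tree;
}
```

Calling `leak_demo()` from `main` and running the binary with `valgrind --leak-check=full --show-leak-kinds=all` should list all three kinds. Freeing the tree properly (children first, then the root) would make both the "definitely lost" and the "indirectly lost" entries go away together.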

Well, I know you may still be confused by the difference between [definitely lost] and [still reachable]. Let me put it this way:

There is more than one way to define "memory leak". In particular, there are two primary definitions of "memory leak" that are in common usage among programmers.
The first commonly used definition of "memory leak" is, "Memory was allocated and was not subsequently freed before the program terminated." However, many programmers argue that certain types of memory leaks that fit this definition don't actually pose any sort of problem, and therefore should not be considered true "memory leaks".
An arguably stricter (and more useful) definition of "memory leak" is, "Memory was allocated and cannot be subsequently freed because the program no longer has any pointers to the allocated memory block." In other words, you cannot free memory that you no longer have any pointers to. Such memory is therefore a "memory leak". Valgrind uses this stricter definition of the term "memory leak". This is the type of leak which can potentially cause significant heap depletion, especially for long lived processes.
The "still reachable" category within Valgrind's leak report refers to allocations that fit only the first definition of "memory leak". These blocks were not freed, but they could have been freed (if the programmer had wanted to) because the program still was keeping track of pointers to those memory blocks.
In most cases, there is no need to worry about "still reachable" blocks. They don't pose the sort of problem that true memory leaks can cause. For instance, there is normally no potential for heap exhaustion from "still reachable" blocks. This is because these blocks are usually one-time allocations, references to which are kept throughout the duration of the process's lifetime. While you could go through and ensure that your program frees all allocated memory, there is usually no practical benefit from doing so since the operating system will reclaim all of the process's memory after the process terminates, anyway. Contrast this with true memory leaks which, if left unfixed, could cause a process to run out of memory if left running long enough, or will simply cause a process to consume far more memory than is necessary.

In short: when the program terminates, if Valgrind can still find a pointer to a block that the program never freed, the block is "still reachable". If no pointer to the block can be found anywhere, it is "definitely lost".
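That decision can be sketched as a function over one unfreed block and a set of root words (values found in globals, the stack, and registers). This is a simplified model with hypothetical names (`leak_kind`, `classify_block`): it does not follow pointers stored inside other heap blocks, so it cannot produce "indirectly lost", and it treats a pointer into the middle of a block as "possibly lost".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    STILL_REACHABLE,   /* some root points to the start of the block */
    POSSIBLY_LOST,     /* only pointers into the middle of the block exist */
    DEFINITELY_LOST    /* no root points anywhere into the block */
} leak_kind;

/* Classify one unfreed block [start, start + size) against the root words.
   Simplification: only roots are scanned, not the contents of other blocks. */
leak_kind classify_block(uintptr_t start, size_t size,
                         const uintptr_t *roots, size_t n_roots)
{
    assert(roots != NULL || n_roots == 0);
    leak_kind kind = DEFINITELY_LOST;
    for (size_t i = 0; i < n_roots; i++) {
        if (roots[i] == start)
            return STILL_REACHABLE;
        if (roots[i] > start && roots[i] < start + size)
            kind = POSSIBLY_LOST;
    }
    return kind;
}
```

With roots `{0x1000, 0x2010}` and 64-byte blocks, a block at `0x1000` is still reachable, one at `0x2000` is possibly lost, and one at `0x3000` is definitely lost.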

[0x06] Have fun :)

Do you want to see how your embeddedlab behaves under Valgrind? Check this out!

github.com/Roadsong/SecurityIoT-Memcheck

[0x07] Reference

https://inst.eecs.berkeley.edu/~cs161/sp15/slides/lec3-sw-vulns.pdf
http://valgrind.org
http://valgrind.org/docs/valgrind2007.pdf
https://www.ibm.com/developerworks/cn/linux/l-cn-valgrind/index.html
https://github.com/google/sanitizers
https://stackoverflow.com/questions/3840582/still-reachable-leak-detected-by-valgrind/3856938
https://stackoverflow.com/questions/7886176/memory-not-freed-but-still-reachable-is-it-leaking

