Detecting C++ memory leaks
Posted on: 2007-04-07 19:38:11
A while ago I had the problem of detecting memory leaks in my code, and I didn't want to spend lots of money on a brittle software package to do that. It's fairly simple to redefine malloc() and free() to your own functions, to track the file and line number of memory leaks. But what about the new() and delete() operators? It's a little more difficult with C++, if you want to figure out the exact line number of a resource leak.

In this article, I'll explain how you can get a stack trace for where your resource leaks occur. This method is for Microsoft Windows. Linux developers are better served with Valgrind.

Download the source code:

Overview

  • We will use #define to replace the standard implementation of malloc() and free() with ones that record the file and line numbers where they are called. That way, we can track where memory leaks occur for allocations made using the standard C allocation functions.
  • We will overload the new and delete operators to record the address of the function that called them, by walking backwards up the stack.
  • Finally, we will parse the .map file generated by the linker. This will let us figure out where new() and delete() were called based on the return address information.

The header file

The first thing we'll do is wrap everything in an #ifdef, because memory tracking slows the program down. You'll want to compile it out of release versions of your code.

debug.h:

#ifdef DEBUG_MEM
#include <stdlib.h>  // assumption: declares the real malloc()/free(); must precede the macros below
#define malloc(A) _dbgmalloc( __FILE__, __LINE__, (A) )
#define free(A) _dbgfree( __FILE__, __LINE__, (A) )
// ... continue with calloc, realloc, strdup, etc.
#endif

Every *.cpp source file in your program should include this file. Strictly speaking it's optional, but be consistent: if you allocate something in a module that includes debug.h and free it in one that doesn't, your program will crash, because the block was allocated with _dbgmalloc() and released with plain free() instead of _dbgfree().
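To make the macro trick concrete, here is a minimal sketch using only the standard library. The names SKETCH_MALLOC, sketch_malloc and g_lastAlloc are made up for illustration and are not part of debug.h. The real _dbgmalloc() stores the call site in a per-block record rather than in a single global.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical record of the most recent tracked allocation.
struct LastAlloc {
    const char* file;
    int line;
    std::size_t size;
};

static LastAlloc g_lastAlloc = { 0, 0, 0 };

// Stand-in for _dbgmalloc(): remembers the call site, then allocates.
inline void* sketch_malloc(const char* file, int line, std::size_t size)
{
    g_lastAlloc.file = file;
    g_lastAlloc.line = line;
    g_lastAlloc.size = size;
    return std::malloc(size);
}

// The macro is expanded at the call site, so __FILE__ and __LINE__
// describe the caller, not the function that does the allocation.
#define SKETCH_MALLOC(size) sketch_malloc(__FILE__, __LINE__, (size))
```

A function cannot discover its caller's file and line by itself, which is why the wrapper has to be a macro. This is also why the same trick does not carry over directly to the new operator.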

The implementation for malloc

void*
_dbgmalloc( const char* file, int line, size_t size )
{
    void* ptr;

    if ( !_init ) {
        return malloc( size );
    }

    ptr = add_record( file, line, size );
    if ( ptr == 0 ) {
        dbgprint(( DMEMORY, "Out of memory." ));
        return 0;
    }

    dbgprint(( DMEMORY, "%s:%d: malloc( %u ) [%p]", file, line, (unsigned)size, ptr ));

    return ptr;
}

void _dbgfree( const char* file, int line, void* ptr )
{
    if ( ptr == 0 ) {
        return;
    }

    if ( !_init ) {
        free( ptr );
        return;
    }

    MemBlock* block = (MemBlock*)ptr - 1;
    int size = block->size;

    del_record( file, line, ptr );

    dbgprint(( DMEMORY, "%s:%d: free( [%p], %d )", file, line, ptr, size ));
}

The add_record() and del_record() functions perform the real work of memory tracking. They allocate the requested amount of memory plus space for extra tracking information. The tracking information is stored in the first few bytes of the memory block, and the pointer returned to the caller is offset by that amount. We also reserve extra space at the end of the block so that we can detect writes past the end of the array. We write a specific sequence of bytes (here, 0x12345678) at this location. If those bytes have been modified by the time the block is freed, your program has done something it shouldn't have, and del_record() will complain.

void*
add_record( const char* file, int line, size_t size )
{
    MemBlock* block;
    assert(_init);

    block = (MemBlock*)malloc( sizeof( MemBlock ) + size + sizeof( SENTRY ) );

    if ( block == 0 ) {
        dbgprint(( DMEMORY, "Out of memory." ));
        return 0;
    }

    block->sentry = SENTRY;
    block->size = size;
    block->line = line;
    block->file = _strdup( file );
    if ( 0 == block->file && file ) {
        free( block );
        dbgprint(( DMEMORY, "Out of memory." ));
        return 0;
    }

    memcpy( (char*)block + sizeof(*block) + size, &SENTRY, 
        sizeof( SENTRY ) );

    EnterCriticalSection(&_cs);
    list_add_tail( &_blockList, &block->list );
    LeaveCriticalSection(&_cs);

    return block + 1;
}
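The article does not show del_record(), so here is a rough, portable sketch of the check it is described as performing. Everything in it is an assumption made for illustration: the GuardedHeader struct is a cut-down stand-in for MemBlock (no file, line or list links), and guarded_alloc(), guards_intact() and guarded_free() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

static const unsigned GUARD = 0x12345678;  // same byte pattern the article uses

// Hypothetical, simplified header placed in front of the caller's memory.
struct GuardedHeader {
    unsigned guard;     // detects writes just before the payload
    std::size_t size;   // payload size requested by the caller
};

// Allocate header + payload + trailing guard, and return the payload address.
void* guarded_alloc(std::size_t size)
{
    GuardedHeader* h = (GuardedHeader*)std::malloc(
        sizeof(GuardedHeader) + size + sizeof(GUARD));
    if (h == 0) {
        return 0;
    }
    h->guard = GUARD;
    h->size = size;
    // memcpy avoids an unaligned store when size is not a multiple of 4.
    std::memcpy((char*)(h + 1) + size, &GUARD, sizeof(GUARD));
    return h + 1;
}

// Returns true when the guards before and after the payload are untouched.
bool guards_intact(const void* ptr)
{
    assert(ptr != 0);
    const GuardedHeader* h = (const GuardedHeader*)ptr - 1;
    unsigned tail;
    std::memcpy(&tail, (const char*)ptr + h->size, sizeof(tail));
    return h->guard == GUARD && tail == GUARD;
}

// Releases the block; returns false if an overrun or underrun was detected.
bool guarded_free(void* ptr)
{
    if (ptr == 0) {
        return true;
    }
    bool ok = guards_intact(ptr);
    std::free((GuardedHeader*)ptr - 1);
    return ok;
}
```

The pointer handed to the caller is h + 1, so freeing has to step back by one header to find the address that malloc() originally returned. That is the same arithmetic as the (MemBlock*)ptr - 1 expression in _dbgfree().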

What about new?

That's all fine and good for malloc() and free(), and strdup() and _tcsdup() and calloc() and realloc(), but what about C++? When you call malloc() above, you see that the macro puts in the file and line number information, but this is not possible for the new operator. Instead, we will do it the hard way. We'll redefine the new operator and then search up the stack for the caller's address and store that. Later, we'll parse the linker's map file to figure out which function it was from the address.

Here's the implementation for new() and delete(). They are almost the same as malloc() and free() above, except that they record the return address instead of the file and line information.

void* operator new( size_t size ) throw ( std::bad_alloc )
{
    static bool recurse = false;
    void* ret;
    CrashPosition_t pos;
    if ( recurse || !_init) {
        return malloc( size );
    }

    EnterCriticalSection(&_cs);
    // Guard against re-entry in case the lookup or logging code calls new.
    recurse = true;
    pos = getFileLine(1);
    if ( pos.file == 0 ) {
        pos.file = pos.function;
    }

    ret = add_record( pos.file, pos.line, size );
    if ( ret == 0 ) {
        dbgprint(( DMEMORY, "Out of memory." ));
        recurse = false;
        LeaveCriticalSection(&_cs);
        // operator new reports failure by throwing, not by returning 0.
        throw std::bad_alloc();
    }

    dbgprint(( DMEMORY, "%s:%d: new( %u ) [%p]", pos.file, pos.line, (unsigned)size, ret ));
    recurse = false;
    LeaveCriticalSection(&_cs);
    return ret;
}


/******************************************************************************
 *****************************************************************************/
void operator delete( void* ptr ) throw ()
{
    CrashPosition_t pos;
    
    if ( !_init ) {
        free( ptr );
        return;
    }

    if ( ptr == 0 ) {
        return;
    }
    EnterCriticalSection(&_cs);

    pos = getFileLine(2);
    LeaveCriticalSection(&_cs);

    dbgprint(( DMEMORY, "%s:%d: delete [%p]", pos.file, pos.line, ptr ));
    del_record( pos.file, pos.line, ptr );
}

Walking the stack

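Here's where the magic happens. Because file and line number information is not available to the new operator, we walk the stack and record the return addresses. Later on, we'll map each address to the name of the function that made the call. Note that this code is specific to 32-bit x86 and MSVC inline assembly, and it relies on every function keeping a standard EBP frame, so frame-pointer omission must be turned off.

static int 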
GetCallStack( unsigned* stack, int max )
{
    unsigned* my_ebp = 0;
    int i;

    __asm {
        mov eax, ebp
        mov dword ptr [my_ebp], eax;
    }

    // It is not safe to use this function in a WIN32 standard exception handler!
    if ( IsBadReadPtr( my_ebp + 1, 4 ) ) {
        return 0;
    }

    stack[0] = *(my_ebp + 1);
    for ( i = 1; i < max; i++ ) {
        unsigned addr;
        if ( IsBadReadPtr( my_ebp, 4 ) ) {
            break;
        }
        my_ebp = (unsigned*)(*my_ebp);

        if ( IsBadReadPtr( my_ebp + 1, 4 ) ) {
            break;
        }

        addr = *(my_ebp + 1);
        if ( addr ) {
            stack[i] = addr;
        } else {
            break;
        }
    }

    return i;
}
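If the pointer arithmetic above is hard to follow, this small simulation may help. It is not a replacement for GetCallStack(): instead of reading the live stack, it walks a hand-built chain of SimFrame structs (a hypothetical type) laid out the way x86 frames appear through EBP, with the saved EBP first and the return address right after it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of one x86 stack frame as seen through EBP:
// [ebp] holds the caller's saved EBP, [ebp+4] holds the return address.
struct SimFrame {
    const SimFrame* savedEbp;
    unsigned returnAddress;
};

// Follow the chain of saved frame pointers, collecting return addresses.
// Stops at a null frame, at a zero return address, or after max entries.
int walk_frames(const SimFrame* frame, unsigned* stack, int max)
{
    assert(stack != 0);
    int count = 0;
    while (frame != 0 && count < max) {
        if (frame->returnAddress == 0) {
            break;
        }
        stack[count++] = frame->returnAddress;
        frame = frame->savedEbp;  // same step as my_ebp = (unsigned*)(*my_ebp)
    }
    return count;
}
```

In the real function the chain lives in raw stack memory that may be corrupt, which is why every dereference there is preceded by an IsBadReadPtr() check. The simulation can simply test for a null pointer.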

Making the map file

So far, for malloc() and free() calls, we have recorded the file and line number information, but for new() and delete() we have only the return address. How do we figure out which function called new() and delete()?

We will tell the linker to create a .map file. Add these options to your makefile when calling link.exe, replacing example with the name of your executable output file. (The debug.cpp code assumes that the map file has the same base name as the executable.)

/MAP:example.map /MAPINFO:LINES

Compiler differences

Note: In Microsoft Visual Studio 2005, Microsoft removed the /MAPINFO:LINES option. You should either use an earlier version of the compiler or be content without line numbers. You will still have function names.

The Map File

The map file contains a list of every function in your program and the address at which each is loaded. Using a binary search, we can look up the function that contains a given address. I have implemented this process in Mapfile.cpp, which is called directly from debug.cpp.
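The lookup step can be sketched as follows. This is not the code from Mapfile.cpp, which is not shown here. It assumes the map file has already been parsed into a vector sorted by address, and MapSymbol and find_symbol() are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical in-memory form of one map file entry.
struct MapSymbol {
    unsigned address;   // start address of the function
    std::string name;
};

// Ordering used by the search: compares a raw address with an entry.
static bool address_less(unsigned addr, const MapSymbol& sym)
{
    return addr < sym.address;
}

// symbols must be sorted by ascending address.
// Returns the last symbol whose start address is <= addr, or 0 if
// addr lies below the first symbol.
const MapSymbol* find_symbol(const std::vector<MapSymbol>& symbols, unsigned addr)
{
    std::vector<MapSymbol>::const_iterator it =
        std::upper_bound(symbols.begin(), symbols.end(), addr, address_less);
    if (it == symbols.begin()) {
        return 0;
    }
    --it;
    return &*it;
}
```

A return address almost never equals a function's start address, so the search looks for the nearest symbol at or below the address rather than an exact match. Without size information, an address past the end of the last function is still attributed to that function.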

Putting it together

When your program exits, the debug.cpp module will automatically execute this cleanup code, which dumps out any memory blocks that were never freed.

static void
dump_blocks()
{
    list_entry_t* entry = list_head( &_blockList );
    while( entry != &_blockList ) {
        MemBlock* block = list_entry( entry, MemBlock, list );
        dbgprint(( DMEMLEAK, "Leaked %u bytes from %s:%d [%p]",
                    (unsigned)block->size, block->file, block->line, (void*)(block + 1)
                 ));
        
        entry = entry->next;
    }

    if ( list_empty(&_blockList ) ) {
        dbgprint(( DMEMLEAK, "No memory leaks detected." ));
    }
}

dbgprint

To see the memory leaks, you will have to implement a debug message handler. I don't have time to explain this right now, but it should be pretty obvious from the source code. Or, you can replace dbgprint() with OutputDebugString(), or printf(), or MessageBox(), or whatever you want.
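If you just want something that works, a stand-in along these lines may be enough. It is a sketch, not the handler from the downloadable source: sketch_dbgprint() and g_lastMessage are hypothetical names, and the numeric values given to DMEMORY and DMEMLEAK are invented because the real ones are not shown in this article.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical message categories; the real values are not shown in the article.
enum { DMEMORY = 1, DMEMLEAK = 2 };

// Keeps the most recent message so other code can inspect it.
static std::string g_lastMessage;

// printf-style handler: formats the message, remembers it, writes it to stderr.
void sketch_dbgprint(int category, const char* format, ...)
{
    char buffer[512];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    g_lastMessage = buffer;
    std::fprintf(stderr, "[%d] %s\n", category, buffer);
}

// Call sites are written dbgprint(( ... )) with double parentheses, so the
// macro receives one argument: the whole parenthesised argument list.
#define dbgprint(args) sketch_dbgprint args
```

The double parentheses let a one-argument macro forward a variable number of arguments on compilers without variadic macros. They also make it easy to define dbgprint(args) as nothing in release builds, so the calls and their arguments disappear entirely.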

Enjoy!


Alice

2007-04-10 22:19:36
You know what helps you to detect C++ memory leaks? A new scarf that I knitted.

Aun

2007-07-22 08:34:12
Can this be used with VS 2005 ?

Ulysses

2007-08-17 14:55:12
Reinventing free dump heap and heap compare tools from MS?

Steve Hanov

2007-08-20 10:35:24
Ulysses,

There is an advantage to having memory leak tracking built-in, instead of generating a huge log file with a separate tool. Whenever the debug version of my program exits, it performs the heap check, so I instantly know when and where a memory leak occurs during development. The MS UMDH tool is meant to be run when you suspect there is a problem, not all the time.

Reinvention is an important part of being a good programmer. When someone re-writes an existing software tool, the result is a tool that is a perfect fit for the purpose, plus new skills for the developer. Depending on the scope of the task, there may be a time savings over having to learn the existing tool and working around its bugs.

Fernando

2007-08-29 17:07:31
Steve, I tried compiling the files in my application, which is built as Unicode, and your files fail to compile because of the mix of TCHAR, PTSTR, char and so on...

Do you have any updated code?

Anyway, excellent tool.

Ming

2007-09-24 23:31:46
Ignore Ulysses... It's much better to have the callstack available at runtime. Ulysses should feel fortunate that he hasn't had to debug anyone's code where it was necessary to have a little tool like this.

John

2008-06-26 10:00:57
I am trying to find a hidden memory leak and I liked your methodology and your code.

But, my linker is spitting out undefined references to dbgSetHandler and dbgSetKeys.

Any ideas on what could be the problem?

Thanks.

Neo Law

2008-06-28 05:48:14
I am using it now. I'll get back and tell you if it works or not. Anyway, thanks.

Steve Hanov

2008-06-29 15:51:08
Hi, John!

The library is designed to be stripped out if DEBUG and DEBUG_MEM are not defined. In order to use it, you have to make sure DEBUG and DEBUG_MEM are defined.

Jörg

2008-07-01 10:20:36
It's a very nice article.

I'm trying to find a hidden memory leak and I'm using your code.

It's very helpful.

Thanks a lot.

elnyka

2008-09-16 09:53:48
Dude, I wish I've had that 8 years ago. Cool stuff :)

Peter

2010-04-29 00:01:58
MSVC has a very similar built-in function, which records the file name and line number via __FILE__ and __LINE__ macros:

hxxp://msdn.microsoft.com/en-us/library/e5ewb1h3(VS.80).aspx

(change hxxp to http)

Debug heap functions in MSVC can also detect buffer overrun (google for _CrtSetDbgFlag).

Vasya2

2010-11-27 15:29:27
To avoid memory leaks just use Deleaker

Marc Lepage

2011-03-25 14:17:52
These are important skills as noted. You can find a lot of similar tricks noted in places like Dr. Dobbs Journal.

Another good way to avoid memory leaks is to rely more upon the RAII idiom for managing resources using objects. For example, using smart pointers.

MastAvalons

2011-12-07 14:15:18
In such cases I use deleaker. And in general - do not allow leaks during the creation of algorithm.

Jojn Depth

2012-01-09 22:03:25
I usually use the deleaker in such cases......

Ory

2013-09-04 13:37:19
Well, when working on Windows anyway, I can recommend the "Visual Leak Detector": it's free, works great under Visual Studio, and provides a stack trace to where the leaked memory was allocated. All that in a format VS understands, so when running a debug build from within VS, double-clicking on the stack traces will actually take you to the code location...
Email
steve.hanov@gmail.com
