Sun, 03 Jan 2010

Debugging Memory Leaks in a GTK+ House of Cards.

Recently I've been hacking on Sweep, Conrad Parker's sound editing, audio mangling and DJing tool. As part of my bug fixing and clean-up work I ran Sweep under Valgrind, the Linux world's favourite memory debugging tool. Even after running Valgrind with the officially sanctioned environment variables and GTK+ suppressions file, the resulting 500k gzipped output file was a little shocking.

Now I'm pretty sure a number of those leaks are in Sweep, but a significant number of them seem to be in GTK+ and GLib. Since trying to differentiate the leaks in Sweep from the leaks in GTK+ was proving to be a very difficult and frustrating task, I decided to look at the behaviour of a simple GTK+ program under Valgrind. The program I chose was the helloworld example from the GTK+ tutorial.

Compiling that on Ubuntu 9.10 and running it under valgrind using the following commands:


  export G_SLICE=always-malloc
  export G_DEBUG=gc-friendly
  valgrind --tool=memcheck --leak-check=full --leak-resolution=high \
    --num-callers=50 --show-reachable=yes --suppressions=gtk.suppression \
    ./helloworld > helloworld-vg.txt 2>&1

resulted in a memcheck summary of:


  ==22566== LEAK SUMMARY:
  ==22566==    definitely lost: 1,449 bytes in 8 blocks
  ==22566==    indirectly lost: 3,716 bytes in 189 blocks
  ==22566==      possibly lost: 4,428 bytes in 107 blocks
  ==22566==    still reachable: 380,505 bytes in 7,898 blocks
  ==22566==         suppressed: 35,873 bytes in 182 blocks

The full memcheck report is available here.

The simplest GTK+ hello world program is 100 lines of code, yet it produces a leak report covering over 8,000 blocks (most of them still reachable rather than definitely lost), even when using the recommended Valgrind suppressions file and GTK+ debugging environment variables. If someone modifies that code and adds another leak, finding that leak needle in the GTK+ leak haystack is going to be a needlessly difficult task.
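The "over 8,000" figure can be checked by adding up the block counts in the leak summary. The following is only a rough sketch: the function name count_leaked_blocks is made up for illustration, and it assumes the summary lines look exactly like the ones quoted above (category name followed by a colon, block count just before the word "blocks"). The suppressed category is deliberately left out of the total.

```shell
# Sum the block counts of the non-suppressed categories in a memcheck
# LEAK SUMMARY read from stdin. (Hypothetical helper, not part of Valgrind.)
count_leaked_blocks() {
  awk '
    /definitely lost:|indirectly lost:|possibly lost:|still reachable:/ {
      n = $(NF - 1)      # the block count is the field before "blocks"
      gsub(/,/, "", n)   # strip thousands separators: 7,898 -> 7898
      total += n
    }
    END { print total + 0 }
  '
}
```

Run as `count_leaked_blocks < helloworld-vg.txt`, this prints 8202 for the summary above (8 + 189 + 107 + 7,898).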

Researching this some more, I found that GTK+ is known to make a large number of allocations that are done once per program run and never released. Furthermore, the GTK+ developers seem to think this is OK, and from the point of view of a user running a GTK+ program it is. However, for developers coding against GTK+ and hoping to use Valgrind to find leaks in their own code, this is a royal PITA. Leaks in the developer's code can easily be swamped and hidden by GTK+ leaks. My guess is that most people don't even bother checking unless their program's memory footprint grows over time for no good reason.
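One partial way to thin the haystack is to look only at the loss records memcheck marks as definitely lost, since once-per-run allocations that are still pointed to at exit normally land in the still reachable category. The sketch below is an assumption-laden illustration, not a fix: the function name definitely_lost_records is invented, and it relies on each loss record starting with a line containing "are definitely lost in loss record" and on records being separated by lines that hold only the "==PID==" prefix.

```shell
# Print only the loss records that memcheck classifies as "definitely lost".
# Reads a memcheck report on stdin. (Hypothetical helper, not part of Valgrind.)
definitely_lost_records() {
  # Strip the "==PID== " prefix so separator lines become truly blank,
  # then let awk's paragraph mode (RS="") treat each record as one unit.
  sed 's/^==[0-9]*== \{0,1\}//' |
    awk 'BEGIN { RS = ""; ORS = "\n\n" } /are definitely lost in loss record/'
}
```

Usage would be `definitely_lost_records < helloworld-vg.txt`. As the summary above shows, even helloworld has 8 definitely lost blocks, so this narrows the search rather than eliminating the GTK+ noise.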

Obviously, I'm not the first to realise how hard it is to debug memory leaks in a program when the library it links against throws up so many warnings. In fact, back in 2001 a bug was raised in the GTK+ bug tracker requesting the addition of a call, to be used only during debugging, that would release all memory so that client programs are easier to debug. That bug has remained open and without action for over 8 years.

As far as I am concerned, this is completely unacceptable. If this were my code, I would be too ashamed to put my name on it. Edit: Being able to valgrind GTK+ client code is worth the effort and cost of changing the otherwise perfectly reasonable behaviour of not accounting for lifetime-of-the-app data structures (thanks Andrew).

Note: Anyone who wishes to comment on this can do so on reddit.

Posted at: 11:38 | Category: CodeHacking