www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Who wants to have some fun memory debugging?

reply Robert Fraser <fraserofthenight gmail.com> writes:
Running this program with Tango SVN + DMD 1.045 or Tango 0.98 + DMD 
1.041 on WinXP SP3 32-bit results in a memory leak (the program keeps 
increasing in size at every iteration)

leak.d:
-------------------------------------------
module leak;

import tango.stdc.stdio;
import tango.core.Memory;

struct Data
{
	Data* prev;
	char[4092] something;
}

public void main()
{
	Data* data;
	Data* newData;
	int i;
	while(true)
	{
		for(i = 0; i < 10_000; i++)
		{
			newData = new Data;
			newData.prev = data;
			data = newData;
		}
		data = null;
		newData = null;
		i = 0;
		GC.collect();
		printf("Iteration...");
		fflush(stdout);
		fgetc(stdin);
	}
}
-------------------------------------------

Running it through a debugger shows that at the printf, the entire 
contents of the stack frame of main is zeros and there's no global data 
referenced by the program.
May 12 2009
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Simpler version, sans printf:

module leak;

import tango.core.Memory;

struct Data
{
     Data* prev;
     char[4092] something;
}

public void main()
{
     Data* data;
     Data* newData;
     int i;
     while(true)
     {
         for(i = 0; i < 10_000; i++)
         {
             newData = new Data;
             newData.prev = data;
             data = newData;
         }
         data = null;
         newData = null;
         i = 0;
         GC.collect();
     }
}
May 12 2009
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Robert Fraser wrote:
 Simpler version, sans printf:
 
 module leak;
 
 import tango.core.Memory;
 
 struct Data
 {
     Data* prev;
     char[4092] something;
 }
 
 public void main()
 {
     Data* data;
     Data* newData;
     int i;
     while(true)
     {
         for(i = 0; i < 10_000; i++)
         {
             newData = new Data;
             newData.prev = data;
             data = newData;
         }
         data = null;
         newData = null;
At this point it's quite possible that you still have a reference to newData in a register. If you want to be sure the collect call below works as intended for this test, try adding: newData = new Data; here.
         i = 0;
         GC.collect();
     }
 }
Beyond that, you'll just potentially end up with a bunch of false positives, since you have a big static array in the same block as a pointer. But since you aren't actually setting the array contents in this test, that clearly isn't a problem here.
May 12 2009
next sibling parent Robert Fraser <fraserofthenight gmail.com> writes:
Sean Kelly wrote:
 At this point it's quite possible that you still have a reference to 
 newData in a register.  
That argument works for one iteration, but this fool just keeps on growing. Also, this is a pared down test from a much larger program, so this is unlikely.
 If you want to be sure the collect call below 
 works as intended for this test, try adding:
 
           newData = new Data;
 
 here.
 
         i = 0;
         GC.collect();
     }
 }
Nope, didn't do anything. It still swells forever.
 Beyond that, you'll just potentially end up with a bunch of false 
 positives, since you have a big static array in the same block as a 
 pointer.  But since you aren't actually setting the array contents in 
 this test, that clearly isn't a problem here.
The real program is a tree structure that contains class references, so is entirely pointers. The root of the tree is then discarded, and a new tree created, however the old tree isn't being collected.
May 12 2009
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Sean Kelly wrote:
 ...
 
 At this point it's quite possible that you still have a reference to
 newData in a register.  If you want to be sure the collect call below
 works as intended for this test, try adding:
 
           newData = new Data;
 
 here.
Debugging it with ddbg under Win XP, I found that at "data = null", there is what appears to be a pointer to the newest allocation in EDX, and a pointer to the previous one in ECX.
         i = 0;
After this line, the ECX reference hasn't been touched.
         GC.collect();
Disassembling the code, there's nothing that touches ECX before the call to GC.collect, so it's quite probable that this is why the memory isn't being freed in a given iteration, but not necessarily why it isn't being collected at all. I modified the code to explicitly zero-out ECX just before the call to GC.collect, and it didn't appear to help. I then instrumented the code with calls to gc_stats. Here is the run of it reporting the stats AFTER each forced collect (output via printf): GCStats: poolsize, usedsize, freeblocks, freelistsize, pageblocks GCStats: 81985536, 640, 16, 7552, 9999 GCStats: 163840000, 640, 3200, 7552, 18399 GCStats: 232652800, 640, 2030, 7552, 27384 GCStats: 306315264, 640, 5948, 7552, 34417 GCStats: 360251392, 640, 6954, 7552, 40498 GCStats: 410517504, 640, 5502, 7552, 47360 GCStats: 469958656, 640, 8498, 7552, 53118 The program also got progressively slower as the size of the heap increased. Odd thing is that it would pause at a certain heap size for many seconds at a time, increase in size, pause, increase, pause... it was like the allocations were getting slower. What's also weird is that even though the heap usage should be the same each iteration, it's clearly creating more and more blocks. It's like the GC is freeing those blocks, but the allocator isn't using them. Then I ran it so that it would compute the difference of gc_stats before and after the collect. This one is somewhat depressing: GCStats: poolsize, usedsize, freeblocks, freelistsize, pageblocks GCStats: 0, 0, 2, 0, -1 GCStats: 0, 0, 2, 0, -1 GCStats: 0, 0, 2, 0, -1 GCStats: 0, 0, 2, 0, -1 GCStats: 0, 0, 2, 0, -1 GCStats: 0, 0, 2, 0, -1 It appears to free *two* blocks and consume an extra page during a collection, but doesn't affect the pool or used size at all. Maybe it's time to put together an instrumented GC... -- Daniel
May 12 2009
parent reply grauzone <none example.net> writes:
 Maybe it's time to put together an instrumented GC...
Wishlist: - some way to know _when_ a collection happened (or how often) - logging of allocations (module/linenumber of allocator, size of allocation, type of allocation) - per TypeInfo allocation statistics - per module allocation statistics
May 12 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Wed, 13 May 2009 10:48:53 +0400, grauzone <none example.net> wrote:

 Maybe it's time to put together an instrumented GC...
Wishlist: - some way to know _when_ a collection happened (or how often) - logging of allocations (module/linenumber of allocator, size of allocation, type of allocation) - per TypeInfo allocation statistics - per module allocation statistics
At work, we manage memory ourselves (C++ that is). As result of a program run, a memory log is created which is then visualized by a Memory Monitor. MM gives a picture of memory state at any given time. You use a scroller to advance in time. You may a watch a history of every memory block - when it was first allocated, how it was reused, etc. It also gives you a great picture of how big memory fragmentation is, how efficient it is reused etc. I may post a few screenshots or even a whole program with a sample memory log to show you how useful it is. Perhaps, it will be an inspiration for someone to create something like it for D GC. Unfortunately, an implementation is heavily memory-management specific and thus there is little to no code to share.
May 13 2009
parent Nick B <nick.barbalich gmail.com> writes:
Denis Koroskin wrote:

 At work, we manage memory ourselves (C++ that is). As result of a 
 program run, a memory log is created which is then visualized by a 
 Memory Monitor. MM gives a picture of memory state at any given time. 
 You use a scroller to advance in time. You may a watch a history of 
 every memory block - when it was first allocated, how it was reused, 
 etc. It also gives you a great picture of how big memory fragmentation 
 is, how efficient it is reused etc.
 
 I may post a few screenshots or even a whole program with a sample 
 memory log to show you how useful it is. Perhaps, it will be an 
 inspiration for someone to create something like it for D GC.
 Unfortunately, an implementation is heavily memory-management specific 
 and thus there is little to no code to share.
Dennis - this sounds very interesting. Could you post some screen shots etc. Nick B.
May 13 2009
prev sibling parent reply grauzone <none example.net> writes:
Wild guess: there's a false pointer, that keeps one element in the list 
from being collected, and because the list-prev pointers are still 
there, all following elements won't be collected either in consequence.

If I had time, I'd try two experiments:
1. before freeing everything, reset the prev field of each element; if 
this makes the leak go away, my guess would probably be right
2. use the destructor (Data had to be a class) to keep track of how many 
elements are actually freed by the GC (just a thread safe counter, 
that's incremented in the ctor and decremented in the dtor); just to 
find out if this is an internal GC problem, or if you have too many live 
garbage
May 12 2009
parent Robert Fraser <fraserofthenight gmail.com> writes:
grauzone wrote:
 Wild guess: there's a false pointer, that keeps one element in the list 
 from being collected, and because the list-prev pointers are still 
 there, all following elements won't be collected either in consequence.
 
 If I had time, I'd try two experiments:
 1. before freeing everything, reset the prev field of each element; if 
 this makes the leak go away, my guess would probably be right
...But from where? The only pointers in the program (not counting the runtime) are on the stack. Even when the registers are zeroed, the problem still exists.
 2. use the destructor (Data had to be a class) to keep track of how many 
 elements are actually freed by the GC (just a thread safe counter, 
 that's incremented in the ctor and decremented in the dtor); just to 
 find out if this is an internal GC problem, or if you have too many live 
 garbage
Interesting! I reduced it to 1000 allocations per iteration, and here are my results: 1000 1856 2744 3704 3768 3768 4600 4600 4600 4600 ... So it stabilizes at some point (albiet a point further than I'd like since for the actual application 4.6 * one usage would be around 600MB). So I guess it's just a lot less memory efficient than it should be. Okay, thanks for all your help... This doesn't exactly *solve* my problem, but at least it means I can assume my code is good for now and hope for a new GC before I start running it on multi-gig data sets.
May 13 2009