
digitalmars.D.learn - std.container.Array/RefCounted(T) leaking memory?

reply %u <wfunction hotmail.com> writes:
Hi,

This code seems to leak memory, as the memory isn't reclaimed:

//Test memory here: low
{
	auto b = Array!(bool)();
	b.length = 1024 * 1024 * 128 * 8;
	//Test memory here: high
}
//Test memory here: high

Am I missing something about how Array(T) (and RefCounted) works, or is this
really a bug?

Thank you!
Jan 07 2011
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
On 08/01/2011 05:56, %u wrote:
 {
 	auto b = Array!(bool)();
 	b.length = 1024 * 1024 * 128 * 8;
 	//Test memory here: high
 }
What method are you using to test the memory? I'm puzzled that you've put a comment there rather than the code you're actually using.

If you run this code twice, does the memory usage double?

Stewart.
Jan 08 2011
parent reply %u <wfunction hotmail.com> writes:
 What method are you using to test the memory?
 I'm puzzled that you've put a comment there rather than the code you're
 actually using.

I'm not using code, I'm checking the working set of my process in Task Manager, and through every iteration, it adds 128 MB.

 If you run this code twice, does the memory usage double?

Yes. I ran this code:

{
	auto b = Array!(bool)();
	b.length = 1024 * 1024 * 128 * 8;
}
{
	auto b = Array!(bool)();
	b.length = 1024 * 1024 * 128 * 8;
}

and Task Manager showed two increases of 128 MB.

Thank you!
Jan 08 2011
parent reply %u <wfunction hotmail.com> writes:
Sorry to bump this up, but is RefCounted(T) really leaking, or am I missing
something? I would like to use this in my program, and I'm curious as to why no
one responded, since if it's actually leaking, it would be an important issue.

Thanks!
Jan 12 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 12, 2011 15:29:51 %u wrote:
 Sorry to bump this up, but is RefCounted(T) really leaking, or am I missing
 something? I would like to use this in my program, and I'm curious as to
 why no one responded, since if it's actually leaking, it would be an
 important issue.
There probably aren't all that many people who saw your post, and out of those who did, there are probably very few - if any - who have actually done much with RefCounted. It's fairly new. There's at least one major bug on Array at the moment ( http://d.puremagic.com/issues/show_bug.cgi?id=4942 ). There are also several bugs having to do with destructors at the moment which could be causing you problems.

Now, even assuming that you're not seeing any problem with a destructor-related bug and that you're not hitting a known bug with Array, there are three things that you need to be aware of which would likely show high memory usage regardless:

1. Array uses an array internally, and there is some caching that goes on with regards to arrays that has to do with appending. This means that if you're dealing with large arrays, you could have several which haven't been garbage collected yet simply because they're cached. Steven Schveighoffer has talked about it in several posts, and he has done some work to improve the situation, but I'm not sure that any of it has been in a release yet.

2. The garbage collector does not currently run in its own thread. IIUC, it only gets run when you try and allocate memory. So, if you allocate a bunch of memory, and then you never try and allocate memory again, no memory will be collected, regardless of whether it's currently being used or not.

3. As I understand it, the current garbage collector _never_ gives memory back to the OS. It will reclaim memory that you're not referencing any longer so that it doesn't necessarily need to go grab more memory from the OS when you try and allocate something, but once the garbage collector has gotten a block of memory from the OS, it doesn't give it back. So, currently you will _never_ see the memory usage of a D program go down, unless you're explicitly using malloc and free instead of GC-allocated memory.

So, if you really want to be testing for leaks, you should probably be testing with short-lived small arrays in a loop with lots of iterations or something similar. Testing a couple of large arrays will almost certainly mean that the memory usage will be high and that it won't drop. On the other hand, if you keep creating small arrays in a loop where they have no references outside the loop and could be collected after they go out of scope, then you have lots of arrays for the garbage collector to collect, and if it fails to properly collect them, _then_ you'll see the memory usage continue to rise and show that there's a leak. But at this point, using a few large arrays is almost certainly going to look like it's leaking.

I'm sure that at some point D's garbage collector will improve so that these issues don't exist or are at least diminished, but fixing the GC is not exactly at the top of the TODO list, so it's not currently as performant as would be nice. Once more important stuff has been fixed though, it'll get its turn.

- Jonathan M Davis
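
The loop test suggested above could look roughly like the following. This is only a sketch: the iteration count and element count are made-up values, and the pause at the end is just there so that the process's memory usage can be inspected from outside (e.g. in Task Manager) before the program exits.

```d
import std.container : Array;
import std.stdio : writeln, readln;

void main()
{
    // Hypothetical sizes: many short-lived, small arrays
    // instead of a few huge ones.
    enum iterations = 1_000_000;
    enum elements   = 1024;

    foreach (i; 0 .. iterations)
    {
        auto a = Array!bool();
        a.length = elements;
        // `a` goes out of scope at the end of each iteration;
        // its storage should be released at that point.
    }

    writeln("Loop finished; check the process's memory usage, then press Enter.");
    readln();
}
```

If memory usage keeps climbing as the iteration count is raised, that would point to a real leak rather than to memory that has merely not been handed back to the OS.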
Jan 12 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 12 Jan 2011 19:29:30 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, January 12, 2011 15:29:51 %u wrote:
 Sorry to bump this up, but is RefCounted(T) really leaking, or am I  
 missing
 something? I would like to use this in my program, and I'm curious as to
 why no one responded, since if it's actually leaking, it would be an
 important issue.
There probably aren't all that many people who saw your post, and out of those who did, there are probably very few - if any - who have actually done much with RefCounted. It's fairly new. There's at least one major bug on Array at the moment ( http://d.puremagic.com/issues/show_bug.cgi?id=4942 ). There are also several bugs having to do with destructors at the moment which could be causing you problems. Now, even assuming that you're not seeing any problem with a destructor-related bug and that you're not hitting a known bug with Array, there are three things that you need to be aware of which would likely show high memory usage regardless: 1. Array uses an array internally, and there is some caching that goes on with regards to arrays that has to do with appending. This means that if you're dealing with large arrays, you could have several which haven't been garbage collected yet simply because they're cached. Steven Schveighoffer has talked about it in several posts, and he has done some work to improve the situation, but I'm not sure that any of it has been in a release yet.
No, there is no release yet, but the code is checked into svn. But Array doesn't use D appending anyways.
 2. The garbage collector does not currently run in its own thread. IIUC,  
 it only
 gets run when you try and allocate memory. So, if you allocate a bunch of
 memory, and then you never try and allocate memory again, no memory will  
 be
 collected, regardless of whether it's currently being used or not.
 3. As I understand it, the current garbage collector _never_ gives  
 memory back
 to the OS. It will reclaim memory that you're not referencing any longer  
 so that
 it doesn't necessarily need to go grab more memory from the OS when you  
 try and
 allocate something, but once the garbage collector has gotten a block of  
 memory
 from the OS, it doesn't give it back. So, currently you will _never_ see  
 the
 memory usage of a D program go down, unless you're explicitly using  
 malloc and
 free instead of GC-allocated memory.
Um... Array actually uses malloc and free to allocate its data.

But even so, malloc and free have the same property where they don't always give back memory to the OS. IIUC, Linux can only change the size of memory it wants, it cannot free pages in the middle of the block.

-Steve
Jan 13 2011
parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Steven Schveighoffer Wrote:

 But even so, malloc and free have the same property where they don't  
 always give back memory to the OS.  IIUC, Linux can only change the size  
 of memory it wants, it cannot free pages in the middle of the block.
 
 -Steve
Disclaimer: I don't know what I am talking about.

I think this is correct; the program isn't responsible for reclaiming the memory, that is what the OS does. If you don't have an OS then you don't have anything to return the memory to, so it just becomes free memory. Modern operating systems aren't going to take their memory back until it is needed (waste of cycles).

What I observed using Linux and $ free: each section would result in a reduction of free memory and an increase in buffered data. This suggests to me that the OS doesn't want the memory yet.

Tracking memory in a modern OS is not easy, and this is probably why no one wanted to make a statement on what was really happening. As I said, I don't know if this is what is happening, but it usually isn't as straightforward as checking memory usage.
Jan 13 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 13 Jan 2011 12:40:04 -0500, Jesse Phillips  
<jessekphillips+D gmail.com> wrote:

 Steven Schveighoffer Wrote:

 But even so, malloc and free have the same property where they don't
 always give back memory to the OS.  IIUC, Linux can only change the size
 of memory it wants, it cannot free pages in the middle of the block.

 -Steve
 Disclaimer: I don't know what I am talking about. I think this is correct; the program isn't responsible for reclaiming the memory, that is what the OS does. If you don't have an OS then you don't have anything to return the memory to, so it just becomes free memory. Modern operating systems aren't going to take their memory back until it is needed (waste of cycles). What I observed using Linux and $ free: each section would result in a reduction of free memory and an increase in buffered data. This suggests to me that the OS doesn't want the memory yet. Tracking memory in a modern OS is not easy, and this is probably why no one wanted to make a statement on what was really happening. As I said, I don't know if this is what is happening, but it usually isn't as straightforward as checking memory usage.
I think all memory is allocated/deallocated from the OS via the sbrk/brk system call:

        brk() and sbrk() change the location of the program break, which
        defines the end of the process's data segment (i.e., the program
        break is the first location after the end of the uninitialized data
        segment). Increasing the program break has the effect of allocating
        memory to the process; decreasing the break deallocates memory.

So you can only ever add to the *end* of memory, and you can only ever deallocate from the *end*. And the OS doesn't ever just jump in and claim memory, you have to tell it that you are deallocating memory.

Which means, if you say allocated 100MB, and wanted to deallocate the first 99MB, you still couldn't release any back to the OS.

A moving GC would allow for more memory to be freed, but we aren't there yet.

Of course, I could be completely wrong about all this, I've never really used sbrk or brk :)

-Steve
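
The "only from the end" behaviour of the program break can be poked at with a small sketch like this one. It assumes a Linux/glibc-style system where sbrk is available; the extern(C) prototype is declared by hand here purely for illustration, and the 4096-byte step is an arbitrary choice. It is also worth keeping in mind that allocators such as glibc's malloc do not rely on the break alone: large blocks are typically obtained with mmap and can be returned to the OS individually, so the picture above is a simplification.

```d
import std.stdio : writefln;

// Assumption: a Linux/glibc-style sbrk, declared by hand for this sketch.
extern (C) void* sbrk(ptrdiff_t increment) nothrow @nogc;

void main()
{
    enum step = 4096; // arbitrary amount, chosen for illustration

    void* start    = sbrk(0);     // current program break
    void* previous = sbrk(step);  // grow the heap; returns the old break
    void* grown    = sbrk(0);     // break after growing
    sbrk(-step);                  // shrink: only the end can be given back
    void* shrunk   = sbrk(0);     // break after shrinking

    writefln("start:    %s", start);
    writefln("previous: %s", previous);
    writefln("grown:    %s", grown);
    writefln("shrunk:   %s", shrunk);
}
```

There is no call here that could release a page from the middle of the region below the break, which is the limitation being described.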
Jan 13 2011
parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Thanks, very nice info. Just two guys babbling about things they've only read, I
guess, but you seem much better informed.

Steven Schveighoffer Wrote:

 I think all memory is allocated/deallocated from the OS via the sbrk/brk
 system call:

         brk() and sbrk() change the location of the program break, which
         defines the end of the process's data segment (i.e., the program
         break is the first location after the end of the uninitialized data
         segment). Increasing the program break has the effect of allocating
         memory to the process; decreasing the break deallocates memory.

 So you can only ever add to the *end* of memory, and you can only ever
 deallocate from the *end*.  And the OS doesn't ever just jump in and claim
 memory, you have to tell it that you are deallocating memory.

 Which means, if you say allocated 100MB, and wanted to deallocate the
 first 99MB, you still couldn't release any back to the OS.

 A moving GC would allow for more memory to be freed, but we aren't there
 yet.

 Of course, I could be completely wrong about all this, I've never really
 used sbrk or brk :)

 -Steve
Jan 13 2011
parent reply %u <wfunction hotmail.com> writes:
 Tracking memory in a modern OS is not easy, and this is probably why no one
 wanted to make a statement on what was really happening.

The issue is that the memory *is* leaking -- it's because the struct destructor is simply not getting called. If I call free() manually, the memory usage decreases normally, so it's not a measurement problem.

Furthermore, this doesn't seem to be an Array(T)-related bug at all -- it seems that pretty much *any* struct with a destructor will not have its destructor called on exit. In fact, after reading the language specifications, it seems like the glossary contradicts itself: it defines Plain Old Data as referring "to a struct that [...] has no destructor. D structs are POD." By definition, if D structs were POD, then they could not have any destructors.

It seems like the language contradicts itself, and the compiler only *sometimes* calls struct destructors. Any ideas? Is this a bug?

And thank you for all your great responses! :)
Jan 15 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 15 January 2011 20:27:26 %u wrote:
 Tracking memory in a modern OS is not easy, and this is probably why no
 one
wanted to make a statement on what was really happening. The issue is that the memory *is* leaking -- it's because the struct destructor is simply not getting called. If I call free() manually, the memory usage decreases normally, so it's not a measurement problem. Furthermore, this doesn't seem to be an Array(T)-related bug at all -- it seems that pretty much *any* struct with a destructor will not have its destructor called on exit. In fact, after reading the language specifications, it seems like the glossary contradicts itself: it defines Plain Old Data as referring "to a struct that [...] has no destructor. D structs are POD." By definition, if D structs were POD, then they could not have any destructors. It seems like the language contradicts itself, and the compiler only *sometimes* calls struct destructors. Any ideas? Is this a bug? And thank you for all your great responses! :)
It's probably this bug: http://d.puremagic.com/issues/show_bug.cgi?id=2834

However, there are several bugs relating to destructors, and stuff that ends up on the heap is a big problem as far as destructors go, IIRC. So, it's definitely a bug.

- Jonathan M Davis
Jan 15 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 15 Jan 2011 23:27:26 -0500, %u <wfunction hotmail.com> wrote:

 Tracking memory in a modern OS is not easy, and this is probably why no  
 one
wanted to make a statement on what was really happening. The issue is that the memory *is* leaking -- it's because the struct destructor is simply not getting called. If I call free() manually, the memory usage decreases normally, so it's not a measurement problem. Furthermore, this doesn't seem to be an Array(T)-related bug at all -- it seems that pretty much *any* struct with a destructor will not have its destructor called on exit. In fact, after reading the language specifications, it seems like the glossary contradicts itself: it defines Plain Old Data as referring "to a struct that [...] has no destructor. D structs are POD."
This is definitely a bug. A struct dtor should be called on scope exit. That documentation is also out of date. D1 structs had no destructors or constructors, so it's probably just a stale doc. I find it very hard to believe that struct dtors are never called. There must be some situations where they are called, or the feature would not have made it this far without outcry. The bug referenced by Jonathan is referring to structs not having their dtors called on collection. But that is a completely different problem. -Steve
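
For reference, the expected scope-exit behaviour can be checked with a small sketch like the following. The Buffer struct is hypothetical and only mimics the general idea of a struct that owns malloc'd memory; it is not how Array or RefCounted are actually implemented.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

// Hypothetical struct that owns a malloc'd block and frees it in its dtor.
struct Buffer
{
    void* ptr;

    this(size_t size)
    {
        ptr = malloc(size);
        writeln("ctor");
    }

    // Copying is disabled so the block cannot be freed twice.
    @disable this(this);

    ~this()
    {
        if (ptr !is null)
        {
            free(ptr);
            ptr = null;
            writeln("dtor");
        }
    }
}

void main()
{
    writeln("before scope");
    {
        auto b = Buffer(1024);
        writeln("inside scope");
    } // the dtor should run here, when `b` leaves scope
    writeln("after scope");
}
```

If the dtor works as intended, "dtor" should be printed between "inside scope" and "after scope".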
Jan 17 2011
parent reply %u <wfunction hotmail.com> writes:
 I find it very hard to believe that struct dtors are never called.

Sorry, that part was my bad -- last time I checked, they didn't get called, but maybe my example was too complicated, since they did get called for a *simple* example.

However, here's a situation in which no postblit or destructor is called whatsoever:

import std.stdio;

struct S
{
	this(int dummy) { writeln("ctor"); }
	this(this) { writeln("postblit"); }
	~this() { writeln("dtor"); }
}

S test(int depth)
{
	return depth > 0 ? test(depth - 1) : S(0);
}

int main(string[] argv)
{
	test(3);
	return 0;
}
Jan 17 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 18 Jan 2011 01:16:51 +0000, %u wrote:

 I find it very hard to believe that struct dtors are never called.

 Sorry, that part was my bad -- last time I checked, they didn't get called, but maybe my example was too complicated, since they did get called for a *simple* example.

 However, here's a situation in which no postblit or destructor is called whatsoever:

 import std.stdio;

 struct S
 {
 	this(int dummy) { writeln("ctor"); }
 	this(this) { writeln("postblit"); }
 	~this() { writeln("dtor"); }
 }

 S test(int depth)
 {
 	return depth > 0 ? test(depth - 1) : S(0);
 }

 int main(string[] argv)
 {
 	test(3);
 	return 0;
 }

That would be bug 3516, wouldn't it? http://d.puremagic.com/issues/show_bug.cgi?id=3516 -Lars
Jan 18 2011
parent %u <wfunction hotmail.com> writes:
 That would be bug 3516, wouldn't it?
Huh... yes, it indeed would. Thanks for the link, I couldn't think of the right keywords to search for. :)
Jan 18 2011