digitalmars.D.learn - Manually freeing up memory

reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
Hello all,

I'm doing some work with a fairly large dataset.  For various reasons it's 
convenient to import it first as simply an array of data points which is then 
used to generate other data structures (actually, technically it's an array of 
data points plus a couple of associative arrays, which ideally would instead be 
sets; but I think that's a minor detail).

Once the various data structures are in place, it's possible to discard the 
initial array data.  It would be very desirable to free up the memory allocated, 
as it's a very large amount.  However, I can't work out how to do this.

I've tried calling destroy() on the input data, with and without a subsequent 
GC.collect(), but the program's memory usage still remains at its peak level. 
This is a shame, because that peak memory usage only needs to last for a short 
part of the program's total runtime, and it seems only polite to other computer 
users to give back the excess memory.

Can anyone advise?  I would rather not disable the GC entirely as there's lots 
of Phobos I want to be able to use -- but I'd really like it if I could indicate 
categorically to the GC, "these objects and arrays need to be deleted and the 
memory freed _now_".

Thanks and best wishes,

       -- Joe
Nov 07 2012
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 Can anyone advise?  I would rather not disable the GC entirely 
 as there's lots of Phobos I want to be able to use -- but I'd 
 really like it if I could indicate categorically to the GC, 
 "these objects and arrays need to be deleted and the memory 
 freed _now_".
One solution is to allocate the original array on the C heap. Another solution is to allocate it normally from the GC heap and then use GC.free(). Maybe a third option is to use a memory-mapped file for the first array.

Bye,
bearophile
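The first two suggestions can be sketched roughly as follows. This is only an illustration under assumptions: `Link`, `sumOnCHeap` and `sumOnGCHeap` are hypothetical names, and the GC variant uses `GC.malloc` rather than `new` so that the pointer handed to `GC.free` is certainly the base address of the block (`GC.free` is documented to do nothing for interior pointers).

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

// Hypothetical stand-in for one raw data point.
struct Link { size_t from, to; }

// Option 1: keep the temporary array on the C heap, outside the GC.
size_t sumOnCHeap(size_t n)
{
    if (n == 0) return 0;
    auto p = cast(Link*) malloc(n * Link.sizeof);
    if (p is null) assert(0, "malloc failed");
    scope (exit) free(p);          // released as soon as the function returns

    Link[] links = p[0 .. n];      // a D slice over memory the GC does not own
    size_t total = 0;
    foreach (i, ref l; links)
    {
        l = Link(i, i + 1);
        total += l.to;
    }
    return total;
}

// Option 2: take the block from the GC heap and free it explicitly.
size_t sumOnGCHeap(size_t n)
{
    if (n == 0) return 0;
    // NO_SCAN: the block holds no pointers, so the GC need not scan it.
    auto p = cast(Link*) GC.malloc(n * Link.sizeof, GC.BlkAttr.NO_SCAN);
    scope (exit) GC.free(p);       // p is the base address returned by GC.malloc

    Link[] links = p[0 .. n];
    size_t total = 0;
    foreach (i, ref l; links)
    {
        l = Link(i, i + 1);
        total += l.to;
    }
    return total;
}

void main()
{
    assert(sumOnCHeap(1000) == 500_500);
    assert(sumOnGCHeap(1000) == 500_500);
}
```

In both cases no slice of the temporary array may outlive the free; only values copied out of it are safe to keep. Whether the freed pages actually go back to the OS is a separate question that depends on the allocator.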
Nov 07 2012
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/07/2012 03:17 PM, bearophile wrote:
 One solution is to allocate the original array on the C heap. Another solution
 is to allocate it normally from the GC heap and then use GC.free().
Well, what I've got is something like this:

    auto raw = rawInput();      /* loads data and outputs a struct containing the array of data */
    auto data = rawToData(raw); // converts the raw input to data structure
    GC.free(raw.links.ptr);     // _should_ free up the allocated memory?

... but despite the GC.free(), memory usage stays at peak level for the rest of the runtime of the function. I tried preceding the free() with a destroy(raw) or destroy(raw.links), also to no avail.
 Maybe a third option is to use a memory-mapped file for the first array.
That's an interesting thought, which I'll look into. Another thought was to dump the data into an SQL DB and read/sample from there as necessary, but IIRC the SQL support available for D is somewhat limited right now ... ?
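For the memory-mapped route, a minimal sketch using Phobos' `std.mmfile` might look like this. It assumes the raw data already sits on disk as a flat array of `size_t` values; the function name `sumMapped` and the file name are hypothetical, made up for the example.

```d
import std.file : write, remove, tempDir;
import std.mmfile : MmFile;
import std.path : buildPath;

// Read a flat file of size_t values through a read-only mapping and reduce it.
size_t sumMapped(string path)
{
    auto mmf = new MmFile(path);       // read-only mapping of the whole file
    scope (exit) destroy(mmf);         // unmap when done, before the file is removed

    // Reinterpret the mapped bytes; the file length must be a multiple of size_t.sizeof.
    auto values = cast(const(size_t)[]) mmf[];
    size_t total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    auto path = buildPath(tempDir(), "raw_links_example.bin"); // hypothetical file name
    size_t[] data = [1, 2, 3, 4];
    write(path, data);                 // stand-in for the real input file
    scope (exit) remove(path);
    assert(sumMapped(path) == 10);
}
```

The mapped pages belong to the OS page cache rather than the GC heap, so unmapping hands them back without involving the collector.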
Nov 07 2012
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Joseph Rushton Wakeling:

 ... but despite the GC.free(), memory usage stays at peak level 
 for the rest of the runtime of the function.
GC.free() usually works. Some memory allocators don't give the memory back to the OS, no matter what, until the process is over, even though that memory is free for the process to use in other ways (this is what often happens in Python on Windows). If I am right, then if you allocate memory from the same program after GC.free(), the total memory used by that process will not increase.

Bye,
bearophile
Nov 07 2012
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Nov 07, 2012 at 06:12:52PM +0100, bearophile wrote:
 Joseph Rushton Wakeling:
 
 ... but despite the GC.free(), memory usage stays at peak level
 for the rest of the runtime of the function.
 GC.free() usually works. Some memory allocators don't give back the memory to the OS, no matter what, until the process is over, despite that memory is free for the process to use in other ways (this is what often happens in Python on Windows).
[...]

I think on Posix systems, malloc/free does not return freed memory back to the OS, it just gets reused by the process later on. If you want to return memory back to the OS, you could call sbrk()... but that is highly *NOT* recommended unless you know exactly what you're doing, and you know the innards of your C library (*and* D runtime) like the back of your hand. But it *is* the "hardcore" way of doing it. :-)

An easier workaround might be to fork() a process that constructs whatever data structures you need, transmits that to the main process somehow, then exit. If I understand it correctly, the large memory allocations will be restricted to the child process, which will get returned to the OS once it exits. (Note that you have to use fork(), not threads, because threads share memory in the same process so you end up with the same problem.)

T

-- 
Question authority. Don't ask why, just do it.
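The fork() workaround could be sketched like this on a Posix system. It is a deliberately small illustration, not a general solution: `summariseInChild` is a hypothetical name, the "result" sent back is a single `ulong` over a pipe, and it assumes a single-threaded program at the point of the fork.

```d
import core.sys.posix.sys.wait : waitpid;
import core.sys.posix.unistd : fork, pipe, read, write, close, _exit;
import std.exception : enforce;

// Do the memory-hungry work in a child process; only the small result
// crosses back into the parent.
ulong summariseInChild(size_t n)
{
    int[2] fds;
    enforce(pipe(fds) == 0, "pipe failed");

    auto pid = fork();
    enforce(pid >= 0, "fork failed");

    if (pid == 0)
    {
        // Child: build the large temporary array and reduce it.
        close(fds[0]);
        auto raw = new ulong[](n);
        ulong total = 0;
        foreach (i, ref x; raw)
        {
            x = i;
            total += x;
        }
        write(fds[1], &total, total.sizeof);
        close(fds[1]);
        _exit(0);   // all of the child's memory goes back to the OS here
    }

    // Parent: read the result and reap the child.
    close(fds[1]);
    ulong result;
    auto got = read(fds[0], &result, result.sizeof);
    close(fds[0]);
    int status;
    waitpid(pid, &status, 0);
    enforce(got == result.sizeof, "short read from child");
    return result;
}

void main()
{
    assert(summariseInChild(1000) == 499_500);
}
```

Real derived data structures would need some serialisation step in place of the single `write`, which is the "transmits that to the main process somehow" part of the suggestion.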
Nov 07 2012
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/07/2012 06:53 PM, H. S. Teoh wrote:
 I think on Posix systems, malloc/free does not return freed memory back
 to the OS, it just gets reused by the process later on.
I have to say that in this program, it looks like the memory usage keeps increasing even after the free(), even though theoretically the amount it's possible to free up would dwarf any subsequent memory requirements. Using GC.minimize() seems to return a little bit of memory to the OS, depending on which compiler is used, but nowhere near the amount it's theoretically possible to hand back.
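For reference, `core.memory` provides `GC.minimize()`, which asks the runtime to return free physical memory to the OS. A minimal sketch of the free-then-minimize sequence, with the hypothetical helper name `useAndRelease` and no claim about how much memory is actually returned on any given platform:

```d
import core.memory : GC;

// Allocate a scratch buffer from the GC heap, use it, then release it
// and ask the GC to hand unused pages back to the OS.
// Assumes bytes > 0.
ubyte useAndRelease(size_t bytes)
{
    auto p = cast(ubyte*) GC.malloc(bytes, GC.BlkAttr.NO_SCAN);
    p[0 .. bytes] = 1;             // touch the memory so it is really committed
    ubyte last = p[bytes - 1];

    GC.free(p);                    // the block is now free for the GC to reuse
    GC.minimize();                 // request that free physical memory be returned
    return last;
}

void main()
{
    assert(useAndRelease(1 << 20) == 1);
}
```

How much the process footprint shrinks afterwards is up to the runtime and the OS, which matches the observation above that only a small amount came back.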
 An easier workaround might be to fork() a process that constructs
 whatever data structures you need, transmits that to the main process
 somehow, then exit. If I understand it correctly, the large memory
 allocations will be restricted to the child process, which will get
 returned to the OS once it exits. (Note that you have to use fork(), not
 threads, because threads share memory in the same process so you end up
 with the same problem.)
Nice thought! I'll have a look at doing this.
Nov 07 2012
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 07 Nov 2012 19:56:35 +0100
schrieb Joseph Rushton Wakeling <joseph.wakeling webdrake.net>:

 On 11/07/2012 06:53 PM, H. S. Teoh wrote:
 I think on Posix systems, malloc/free does not return freed memory back
 to the OS, it just gets reused by the process later on.
I have to say that in this program, it looks like the memory usage keeps increasing even after the free(), even though theoretically the amount it's possible to free up would dwarf any subsequent memory requirements.
Could it be that you still hold a reference to the raw memory in your data structures? A slice would be a typical candidate:

    s.name = raw[a .. b];

You probably checked that already...

-- 
Marco
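The slice concern is easy to check in isolation. In the sketch below (variable names are made up for the example), a slice aliases the original block, while `.dup` makes an independent copy that does not keep the original alive:

```d
void main()
{
    auto raw = new int[](10);

    auto view = raw[2 .. 5];       // shares memory with raw
    auto copy = raw[2 .. 5].dup;   // independent copy in its own block

    raw[2] = 42;

    assert(view[0] == 42);             // the slice sees the change
    assert(copy[0] == 0);              // the copy does not
    assert(view.ptr == raw.ptr + 2);   // same block, interior pointer
    assert(copy.ptr != raw.ptr + 2);
}
```

While something like `view` is reachable, a collection cannot reclaim the block behind `raw`; and if the block is freed explicitly, `view` is left dangling. Storing `.dup`ed copies avoids both problems at the cost of the extra copy.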
Nov 07 2012
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/08/2012 05:50 AM, Marco Leise wrote:
 Could it be that you still hold a reference to the raw memory
 in your data structures ? A slice would be a typical candidate:
 s.name = raw[a .. b];
 You probably checked that already...
I don't _think_ so, although there is a point where data is passed to another struct something like this:

    foreach(link; raw.links)    // raw is struct, links is array
        data.add(link.expand);  // each entry in links is a Tuple!(size_t, size_t)

where add() takes as input a pair of size_t's. I assumed the values here would be copied. I've tried tweaking it to take out the link.expand and it makes no difference.
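The assumption that the values are copied can be confirmed with a small stand-alone test. `Graph` and its `add` method here are hypothetical stand-ins for the real `data` structure, which the thread does not show:

```d
import std.typecons : Tuple;

alias Link = Tuple!(size_t, size_t);

// Hypothetical stand-in for the derived data structure.
struct Graph
{
    size_t[] heads, tails;

    void add(size_t head, size_t tail)
    {
        heads ~= head;   // stores the values, not references into the tuple
        tails ~= tail;
    }
}

void main()
{
    Link[] links;
    foreach (size_t i; 0 .. 3)
        links ~= Link(i, i + 10);

    Graph data;
    foreach (link; links)
        data.add(link.expand);   // expands to two size_t arguments, passed by value

    links[0][0] = 99;            // mutate the raw input afterwards

    assert(data.heads == [0, 1, 2]);     // unaffected by the mutation
    assert(data.tails == [10, 11, 12]);
}
```

Since `size_t` arguments are passed by value, nothing in a structure built this way points back into `links`, so this loop on its own would not keep the raw array alive.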
Nov 08 2012
prev sibling parent "Rob T" <rob ucora.com> writes:
On Thursday, 8 November 2012 at 04:51:00 UTC, Marco Leise wrote:
 Could it be that you still hold a reference to the raw memory
 in your data structures ? A slice would be a typical candidate:
Good point. I find that with GC'd memory, you have to diligently keep track of where and when your references will be released, to ensure that no persistent references are left behind by mistake. I find that apps built with GC languages like Java tend to suffer from severe memory leak issues, perhaps due to persistently referenced memory that the programmer is unaware of. I come from a C++ background, so I am painfully aware of why I cannot lower my guard just because there's a GC kicking about; in fact I find myself much more concerned than ever, because I'm never certain when the GC will kick in, or whether it will do the job correctly, and so forth.

--rt
Nov 09 2012