digitalmars.D - Memory issues. GC not giving back memory to OS?

Cristian Becerescu <cristian.becerescu yahoo.com> writes:

Hi!

A little bit of context first:

I was using DPP and I noticed huge amounts of RAM being used.
So I used valgrind massif and found out that 98% of the process’ 
memory (~6GB) was allocated for arrays / Appender with mmap.

I then performed a simple test where I incrementally appended 
2^30 integers (4GB) to a dynamic array (memory measurements are 
the same for Appender).
-> Memory used (peak; increasing towards the end of execution): 
~7GB
-> capacity == 1.107 * size (at the end of the program)

This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the 
peak memory consumption was 7GB. Apparently, the GC can correctly 
collect the memory when manually calling collect() at the end of 
appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is 
never given back to the system. At least this is our intuition 
after making those observations.

I have created a gist with the test code and results (thanks Edi 
for augmenting the test code to profile the GC): 
https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f

Just some notes:
- if reserving 2^30 elements for the array (or Appender) 
beforehand, memory peaks are at 4GB
- C++'s std::vector, without reservation, never gets beyond 4GB 
and has size == capacity at the end
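The test described above can be approximated with a minimal sketch like the following. This is not the code from the gist: the helper name appendInts and the reduced element count are assumptions chosen so the example runs quickly, and the numbers printed will vary by compiler, runtime version, and platform.

```d
import core.memory : GC;
import std.stdio : writefln;

// Appends n ints one at a time, optionally reserving the full capacity first.
// Returns the final capacity (in elements) of the array.
// (Hypothetical helper, not taken from the original gist.)
size_t appendInts(size_t n, bool reserveFirst)
{
    int[] arr;
    if (reserveFirst)
        arr.reserve(n);
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr.capacity;
}

void main()
{
    enum n = 1 << 22; // 4M ints (16 MiB); the original test used 2^30
    foreach (reserveFirst; [false, true])
    {
        immutable cap = appendInts(n, reserveFirst);
        immutable s = GC.stats(); // GC-wide totals, not per-array figures
        writefln("reserve=%s capacity/size=%.3f usedSize=%s freeSize=%s",
                 reserveFirst, cast(double) cap / n, s.usedSize, s.freeSize);
    }
}
```

GC.stats() only reports what the GC itself considers used or free; process-level figures such as the ones from valgrind massif have to be measured externally.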

Apr 21 2020

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, April 21, 2020 12:31:28 PM MDT Cristian Becerescu via 
Digitalmars-d wrote:
 This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the
 peak memory consumption was 7GB. Apparently, the GC can correctly
 collect the memory when manually calling collect() at the end of
 appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is
 never given back to the system. At least this is our intuition
 after making those observations.

It is my understanding that under normal circumstances, the GC will never
return memory to the OS until the program terminates but rather will just
keep it around to reuse when more memory needs to be allocated. However, the
documentation for core.memory's GC.minimize says that it will return free
memory to the OS. So, if you need memory to be returned to the OS while the
program is running, you'll probably need to use that.

- Jonathan M Davis
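
A minimal sketch of the suggestion above: run a collection, then ask the GC to hand free memory back with GC.minimize. The helper names report, allocateAndDrop, and collectAndMinimize are made up for illustration. Whether any memory is actually returned depends on the GC implementation and the OS, so no particular output is implied.

```d
import core.memory : GC;
import std.stdio : writefln;

// Prints the GC's own view of used and free memory.
void report(string label)
{
    immutable s = GC.stats();
    writefln("%-16s used=%s free=%s", label, s.usedSize, s.freeSize);
}

// Allocates a large GC array in its own frame so no reference outlives the call.
void allocateAndDrop()
{
    auto big = new int[](1 << 24); // 16M ints, 64 MiB
    big[] = 1;                     // touch the memory
    report("allocated");
}

// Collects garbage, then requests that free pools be returned to the OS.
GC.Stats collectAndMinimize()
{
    GC.collect();
    GC.minimize();
    return GC.stats();
}

void main()
{
    allocateAndDrop();
    GC.collect();
    report("after collect");
    collectAndMinimize();
    report("after minimize");
}
```

Because the GC scans conservatively, a stale value on the stack or in a register can keep the block alive, so the "after collect" line is not guaranteed to show the memory as free.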

Apr 21 2020

Arafel <er.krali gmail.com> writes:

On 21/4/20 22:23, Jonathan M Davis wrote:
 On Tuesday, April 21, 2020 12:31:28 PM MDT Cristian Becerescu via
 Digitalmars-d wrote:
 This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the
 peak memory consumption was 7GB. Apparently, the GC can correctly
 collect the memory when manually calling collect() at the end of
 appending, but that memory (we are talking 7 - 4.4 = 2.6GB) is
 never given back to the system. At least this is our intuition
 after making those observations.

 
 It is my understanding that under normal circumstances, the GC will never
 return memory to the OS until the program terminates but rather will just
 keep it around to reuse when more memory needs to be allocated. However, the
 documentation for core.memory's GC.minimize says that it will return free
 memory to the OS. So, if you need memory to be returned to the OS while the
 program is running, you'll probably need to use that.
 
 - Jonathan M Davis
 
 
 

I had a similar issue some time ago, and found that the memory wouldn't 
be returned to the OS even after the GC had freed it. I had to call 
malloc_trim [1] manually; this seems to be a libc / OS issue (I'm 
exclusively using Linux, so I don't know if this is also an issue on 
Windows or Mac).

Could this also be happening here?

A.

[1]: http://man7.org/linux/man-pages/man3/malloc_trim.3.html
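
For reference, a sketch of calling malloc_trim from D. The function is a glibc extension, so it is declared by hand here and guarded with version (CRuntime_Glibc); the wrapper name trimCHeap is made up. One caveat, stated as an assumption rather than a verified fact: malloc_trim only affects glibc's malloc heap, and the massif output in the original post points at mmap, so it may not help with memory the D GC obtained directly.

```d
import core.stdc.stdlib : free, malloc;

version (CRuntime_Glibc)
{
    // glibc extension, declared manually instead of relying on a druntime binding.
    extern (C) int malloc_trim(size_t pad) nothrow @nogc;
}

// Returns true if glibc reported that memory was released to the system.
// (Hypothetical wrapper; always false on other C runtimes.)
bool trimCHeap()
{
    version (CRuntime_Glibc)
        return malloc_trim(0) != 0;
    else
        return false;
}

void main()
{
    import std.stdio : writeln;

    // Make many small C-heap allocations, free them, then ask glibc to trim.
    enum count = 100_000;
    auto blocks = new void*[](count);
    foreach (ref b; blocks)
        b = malloc(1024);
    foreach (b; blocks)
        free(b);

    writeln("malloc_trim released memory: ", trimCHeap());
}
```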

Apr 22 2020

Steven Schveighoffer <schveiguy gmail.com> writes:

On 4/21/20 2:31 PM, Cristian Becerescu wrote:

 I then performed a simple test where I incrementally appended 2^30 
 integers (4GB) to a dynamic array (memory measurements are the same for 
 Appender).
 -> Memory used (peak; increasing towards the end of execution): ~7GB
 -> capacity == 1.107 * size (at the end of the program)
 
 This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and the peak 
 memory consumption was 7GB. Apparently, the GC can correctly collect the 
 memory when manually calling collect() at the end of appending, but that 
 memory (we are talking 7 - 4.4 = 2.6GB) is never given back to the 
 system. At least this is our intuition after making those observations.

The GC doesn't automatically give back memory to the OS. And it really 
can't. There's a GC.minimize function, but that is only going to release 
memory to the OS that can be released. It highly depends on the 
implementation and the mechanism the OS gives to access memory.

So for example, if all the "free" memory is in the middle of the 
OS-provided memory segment, then it can't give it back.

 
 I have created a gist with the test code and results (thanks Edi for 
 augmenting the test code to profile the GC): 
 https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f
 
 Just some notes:
 - if reserving 2^30 elements for the array (or Appender) beforehand, 
 memory peaks are at 4GB

Right, because it will never reallocate, it just grows within the 
original memory block. This is what I'd recommend for something like this.

If you don't reserve, then as it grows, it needs a bigger and bigger 
segment.

And it's not always going to reuse memory that you already used on your 
way up. Why? Because it can't get a contiguous segment that is free and 
fits the new requirement. It does try extending in-place if it can, but 
once it can't, that memory is not usable because the segment is too 
small to fit your massive data.

But I'd say that the stats you are printing are a bit puzzling. Why does 
it all of a sudden allow you to collect at the end when it didn't 
before? It does seem like your output doesn't match your example code. 
But there are a number of reasons why the GC may not do what you are 
expecting, including possible bugs in the GC.

 - C++'s std::vector, without reservation, never gets beyond 4GB and has 
 size == capacity at the end

C++ frees the original memory immediately when growing, so it's going to 
be more memory efficient. With a GC, you are never going to match the 
space efficiency of manually managed memory.

-Steve
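
To illustrate the contrast with manual management, here is a minimal sketch of a growable buffer in D that hands the old block back as soon as it grows, using C's realloc. The type IntBuffer and the doubling growth factor are assumptions for illustration; this is not how std::vector or D's Appender is implemented.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical manually managed int buffer, for illustration only.
struct IntBuffer
{
    int* ptr;
    size_t length;
    size_t capacity;

    @disable this(this); // no copies, so the block is freed exactly once

    ~this() @nogc nothrow
    {
        free(ptr); // free(null) is a no-op
    }

    void put(int value) @nogc nothrow
    {
        if (length == capacity)
        {
            // Assumed growth policy: start at 16 elements, then double.
            immutable newCap = capacity ? capacity * 2 : 16;
            // realloc either extends in place or moves the data and
            // releases the old block; no garbage is left behind.
            auto p = cast(int*) realloc(ptr, newCap * int.sizeof);
            assert(p !is null, "out of memory");
            ptr = p;
            capacity = newCap;
        }
        ptr[length++] = value;
    }
}

void main()
{
    IntBuffer buf;
    foreach (i; 0 .. 1_000_000)
        buf.put(i);
    assert(buf.length == 1_000_000);
    assert(buf.ptr[999_999] == 999_999);
}
```

The trade-off is the usual one: the caller is responsible for lifetime, and slices into the buffer are invalidated by any put that reallocates.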

Apr 21 2020

Arun Chandrasekaran <aruncxy gmail.com> writes:

On Tuesday, 21 April 2020 at 20:29:37 UTC, Steven Schveighoffer 
wrote:
 
 C++ frees the original memory immediately when growing. So it's 
 going to be more memory efficient. You are never going to match 
 a manually managed memory efficiency in terms of space used 
 with a GC.

How much of Phobos is betterC compatible?

I encountered the same issues with the GC a couple of years ago and 
abandoned our plans to migrate from C++ to D for one of our core 
products. (I'm not encouraging anyone to do the same; do your own 
analysis and make your own decision.)

To see recent posts about chasing Rust with @live while all this 
existing baggage remains... Hmm. I don't know what to say. This might 
excite a PL theorist/researcher, but not a programmer who can't 
get his app to work in its most basic form.

Walter, memory efficiency first please, arcane safety later.

--
If you don't have anything nice to say, don't say anything at all.

Apr 22 2020

welkam <wwwelkam gmail.com> writes:

On Wednesday, 22 April 2020 at 07:25:34 UTC, Arun Chandrasekaran 
wrote:
 Walter, memory efficiency first please, arcane safety later.

You can do everything in D that you can do in C++ when it comes 
to memory management. Also, a good system that tracks pointers can 
be used to turn GC allocations into malloc/free pairs, and some 
allocations can be turned into stack allocations (LLVM does some of 
that). Safety features can be used as performance features with 
some additional work.
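
As a rough, hand-written illustration of those two forms (not output of any compiler optimization), here is a sketch with a malloc/free pair and a stack array, both usable from @nogc code. The function names are made up.

```d
import core.stdc.stdlib : free, malloc;

// Sums 0 .. n-1 using a temporary C-heap buffer that never touches the GC.
// Returns -1 if the allocation fails (malloc(0) may also return null).
long sumWithMalloc(size_t n) @nogc nothrow
{
    auto p = cast(int*) malloc(n * int.sizeof);
    if (p is null)
        return -1;
    scope (exit) free(p); // released deterministically on every exit path

    int[] buf = p[0 .. n];
    long total = 0;
    foreach (i, ref x; buf)
    {
        x = cast(int) i;
        total += x;
    }
    return total;
}

// Same computation with a fixed-size array that lives on the stack.
long sumOnStack() @nogc nothrow
{
    int[256] buf;
    long total = 0;
    foreach (i, ref x; buf)
    {
        x = cast(int) i;
        total += x;
    }
    return total;
}

void main()
{
    assert(sumWithMalloc(256) == 32_640);
    assert(sumOnStack() == 32_640);
}
```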

Apr 22 2020

Arun Chandrasekaran <aruncxy gmail.com> writes:

On Wednesday, 22 April 2020 at 13:13:29 UTC, welkam wrote:
 On Wednesday, 22 April 2020 at 07:25:34 UTC, Arun 
 Chandrasekaran wrote:
 Walter, memory efficiency first please, arcane safety later.

 You can do everything in D that you can do in C++ when it comes 
 to memory management.

We can do the same with Java as well, use JNI, manual memory 
management, etc. But will we?

So when I say "we can't" it doesn't mean technically we can't. It 
is just that the alternatives are better than what's being 
offered in D.

Apr 22 2020

ikod <geller.garry gmail.com> writes:

On Tuesday, 21 April 2020 at 18:31:28 UTC, Cristian Becerescu 
wrote:
 Hi!

 A little bit of context first:

 I was using DPP and I noticed huge amounts of RAM being used.
 So I used valgrind massif and found out that 98% of the 
 process’ memory (~6GB) was allocated for arrays / Appender with 
 mmap.

 I then performed a simple test where I incrementally appended 
 2^30 integers (4GB) to a dynamic array (memory measurements are 
 the same for Appender).
 -> Memory used (peak; increasing towards the end of execution): 
 ~7GB
 -> capacity == 1.107 * size (at the end of the program)

 This is a bit odd, because 1.107 * 2^30 is roughly 4.4GB, and 
 the peak memory consumption was 7GB. Apparently, the GC can 
 correctly collect the memory when manually calling collect() at 
 the end of appending, but that memory (we are talking 7 - 4.4 = 
 2.6GB) is never given back to the system. At least this is our 
 intuition after making those observations.

 I have created a gist with the test code and results (thanks 
 Edi for augmenting the test code to profile the GC): 
 https://gist.github.com/cbecerescu/e6606a8530c56ae06c52e5b1cd32b31f

 Just some notes:
 - if reserving 2^30 elements for the array (or Appender) 
 beforehand, memory peaks are at 4GB
 - C++'s std::vector, without reservation, never gets beyond 4GB 
 and has size == capacity at the end

IMHO this happens because each time you request a larger 
contiguous memory region, the runtime has to allocate (or 
reallocate) a larger piece of memory (at higher addresses), copy 
the old content, and then release the old piece of memory. But the old 
piece can't be released to the OS, as the heap area can only be 
released from the top.

Apr 21 2020
