
digitalmars.D - Usage of memory by arrays

unDEFER <undefer gmail.com> writes:
Hello!
Here is a very simple test program:
------------------->8--------------------
import std.conv;
import std.stdio;

import std.string;

int MemoryUsage()
{
     auto file = File("/proc/self/status");
     foreach (line; file.byLine())
     {
         if (line[0..6] == "VmRSS:")
         {
             return line[7..$-3].strip().to!(int);
         }
     }
     return 0;
}

void main()
{
     float[3] f;
     float[3][] x;
     writefln("float = %s bytes", float.sizeof);
     writefln("float[3] = %s bytes", f.sizeof);

     int before = MemoryUsage();

     int total = 100;
     foreach(i; 0..total)
     {
         foreach (j; 0..1000)
         {
             x ~= [0.01, 0.02, 0.03];
         }
     }

     int after = MemoryUsage();
     writefln("%dK * float[3] = %d Kbytes", total, (after-before));
}
-------------------8<--------------------

It prints:
$ ./memory
float = 4 bytes
float[3] = 12 bytes
100K * float[3] = 2356 Kbytes

Why not 1200 Kbytes?
Apr 05 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/5/18 4:58 PM, unDEFER wrote:

 It prints:
 $ ./memory
 float = 4 bytes
 float[3] = 12 bytes
 100K * float[3] = 2356 Kbytes
 
 Why not 1200 Kbytes?
Array appending is complex. As you append, it continually "fills in" the memory block you have. But once it outgrows that block, it needs to allocate and move everything to a new bigger block. But the old block doesn't go away! It's collected and stored in a free list for future allocations. Since you are only ever growing your block, and not allocating smaller ones, it never gets to use those free list blocks which are too small.

Normal non-pathological programs are going to use that memory for all kinds of things, so your overall memory should stabilize at some level.

-Steve
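
A minimal sketch of the effect described above, assuming nothing beyond the built-in array semantics: it appends elements one at a time and counts how often the array's storage ends up at a different address. `countRelocations` is a hypothetical helper name, and the count it reports depends on the runtime and GC version (the GC can sometimes extend a block in place), so no particular number should be expected.

```d
import std.stdio;

/// Appends `n` elements one at a time and counts how often the array's
/// storage moved to a different address. The first append always counts,
/// because the array starts out with a null pointer.
size_t countRelocations(size_t n)
{
    float[3][] x;
    float[3] v = [0.01f, 0.02f, 0.03f];
    size_t moves = 0;
    foreach (i; 0 .. n)
    {
        auto oldPtr = x.ptr;
        x ~= v;
        // A changed pointer means a new block was allocated and the
        // old block was left behind for the GC to reclaim later.
        if (x.ptr !is oldPtr)
            ++moves;
    }
    return moves;
}

void main()
{
    writefln("storage moved %s times while appending 100000 elements",
             countRelocations(100_000));
}
```

Every move leaves the previous block behind, which is why the resident size can exceed the size of the final array.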
Apr 05 2018
unDEFER <undefer gmail.com> writes:
On Thursday, 5 April 2018 at 21:11:53 UTC, Steven Schveighoffer 
wrote:

 But the old block doesn't go away! It's collected and stored in 
 a free list for future allocations.

 -Steve
Big thanks, Steve! Indeed, a program like the following:

==============8<==============
void main()
{
     float[3] f;
     float[3][] x;
     float[3][] y;
     writefln("float = %s bytes", float.sizeof);
     writefln("float[3] = %s bytes", f.sizeof);

     int total = 200;
     foreach(i; 0..total)
     {
         foreach (j; 0..1000)
         {
             x ~= [0.01, 0.02, 0.03];
         }
     }

     int before = MemoryUsage();

     foreach(i; 0..total/2)
     {
         foreach (j; 0..1000)
         {
             y ~= [0.01, 0.02, 0.03];
         }
     }

     int after = MemoryUsage();
     writefln("%dK * float[3] = %d Kbytes", total/2, (after-before));
}
==============>8==============

Prints:
float = 4 bytes
float[3] = 12 bytes
100K * float[3] = 1320 Kbytes

Very well. I'm thinking about how to optimize my game program so that it is not "pathological" when loading the models.
Apr 05 2018
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, April 05, 2018 21:27:54 unDEFER via Digitalmars-d wrote:
 On Thursday, 5 April 2018 at 21:11:53 UTC, Steven Schveighoffer

 wrote:
 But the old block doesn't go away! It's collected and stored in
 a free list for future allocations.

 -Steve
 Very well. I'm thinking how to optimize my game program to not be "pathological" on loading the models.
If you know the size of the arrays beforehand, then you can call reserve on the array before you start appending, in which case the capacity of the array will be roughly the actual length. Alternatively, you can just allocate the dynamic array at the correct length and then assign each element. I don't know which would result in less memory being used (since they're both likely to have some extra capacity depending on the exact number and size of the elements), but they both should take up roughly the amount of memory required, and you'll only get one heap allocation, meaning that you won't be creating any extra garbage in the process.

- Jonathan M Davis
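
The two options can be sketched as follows. This is only an illustration: `buildWithReserve` and `buildWithNew` are hypothetical helper names, and the exact capacity printed will vary with the runtime version.

```d
import std.stdio;

/// Option 1: reserve the final capacity up front, then append.
float[3][] buildWithReserve(size_t count)
{
    float[3][] a;
    a.reserve(count);              // request room for `count` elements
    float[3] v = [0.01f, 0.02f, 0.03f];
    foreach (i; 0 .. count)
        a ~= v;                    // appends fill the reserved block
    return a;
}

/// Option 2: allocate at the final length, then assign each element.
float[3][] buildWithNew(size_t count)
{
    auto a = new float[3][](count);
    float[3] v = [0.01f, 0.02f, 0.03f];
    foreach (ref e; a)
        e = v;
    return a;
}

void main()
{
    auto a = buildWithReserve(100_000);
    auto b = buildWithNew(100_000);
    writefln("reserve: length = %s, capacity = %s", a.length, a.capacity);
    writefln("new:     length = %s, capacity = %s", b.length, b.capacity);
}
```

In both cases the capacity should end up close to the length, unlike the repeated-append loop, which can leave a much larger block only partly filled.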
Apr 05 2018
Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 5 April 2018 at 20:58:32 UTC, unDEFER wrote:
 100K * float[3] = 2356 Kbytes

 Why not 1200 Kbytes?
My guess is the reallocation triggered by ~= just passed the doubling threshold there. When the runtime appends, it usually reserves (about) 2x of what it actually needs. This is a performance optimization in most cases because it cuts down on extra allocations, copies, and fragmentation. But if you catch it right at the edge, after it doubles but before it is filled in by more loop iterations, it can look like excess memory. It is also possible that you are just seeing some fixed overhead with the process.
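
One way to check this guess is to look at how much reserved-but-unused room the array has when the loop stops. The sketch below uses a hypothetical helper, `capacityAfter`, and prints the slack for a few element counts. The growth factor is an implementation detail of the runtime and is not necessarily exactly 2x, so the numbers are only indicative.

```d
import std.stdio;

/// Appends `n` elements one at a time and returns the resulting capacity.
size_t capacityAfter(size_t n)
{
    float[3][] x;
    float[3] v = [0.01f, 0.02f, 0.03f];
    foreach (i; 0 .. n)
        x ~= v;
    return x.capacity;
}

void main()
{
    foreach (size_t n; [1_000, 10_000, 100_000, 200_000])
    {
        auto cap = capacityAfter(n);
        // Slack is the room that has been reserved but not yet filled.
        writefln("n = %s: capacity = %s, slack = %s elements (%s bytes)",
                 n, cap, cap - n, (cap - n) * (float[3]).sizeof);
    }
}
```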
Apr 05 2018
unDEFER <undefer gmail.com> writes:
OK, without reallocation:

====================8<====================
void main()
{
     float[3] f;
     float[3][] x;
     writefln("float = %s bytes", float.sizeof);
     writefln("float[3] = %s bytes", f.sizeof);

     int before = MemoryUsage();

     int total = 100;
     x = new float[3][total*1000];

     int after = MemoryUsage();
     writefln("%dK * float[3] = %d Kbytes", total, (after-before));
}
====================>8====================

$ ./memory
float = 4 bytes
float[3] = 12 bytes
100K * float[3] = 2300 Kbytes

Why is this so?
Apr 05 2018
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, April 05, 2018 21:44:35 unDEFER via Digitalmars-d wrote:
 OK, without reallocation:

 ====================8<====================
 void main()
 {
      float[3] f;
      float[3][] x;
      writefln("float = %s bytes", float.sizeof);
      writefln("float[3] = %s bytes", f.sizeof);

      int before = MemoryUsage();

      int total = 100;
      x = new float[3][total*1000];

      int after = MemoryUsage();
      writefln("%dK * float[3] = %d Kbytes", total, (after-before));
 }
 ====================>8====================

 $ ./memory
 float = 4 bytes
 float[3] = 12 bytes
 100K * float[3] = 2300 Kbytes

 Why is this so?
You could also look at how x.capacity compares to x.length as well as core.memory.GC.stats() to see what the GC thinks that it's using. On my system, the x.capacity was only 9 greater than x.length, and GC.stats printed as Stats(1200128, 598016) whereas 100,000 * 12 is 1,200,000, meaning that the GC claims to only be using 128 more bytes than the dynamic array itself.

Since FreeBSD doesn't have the same /proc as Linux, I can't test using your MemoryUsage function, so I don't know what it would claim, but if that's giving the entire memory of the program, then the extra memory could be used by something in the C runtime. Either way, I'd suggest looking at the result of GC.stats to see what the GC thinks it's using on your system, which should give you a clue as to what's using the memory. And at minimum, x.capacity will tell you how much is used specifically for the buffer that the dynamic array is a slice of.

- Jonathan M Davis
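
A short sketch of that check, using `core.memory.GC.stats()` and the array's `capacity` property. The helper name `allocate` is hypothetical, and the figures printed depend on the platform and runtime version, so they may not match the Stats(1200128, 598016) quoted above.

```d
import core.memory : GC;
import std.stdio;

/// Allocates `count` elements in a single GC allocation.
float[3][] allocate(size_t count)
{
    return new float[3][](count);
}

void main()
{
    auto before = GC.stats();
    auto x = allocate(100_000);
    auto after = GC.stats();

    writefln("length       = %s", x.length);
    writefln("capacity     = %s", x.capacity);
    writefln("payload      = %s bytes", x.length * (float[3]).sizeof);
    // usedSize and freeSize are the GC's own accounting, not the RSS
    // that the operating system reports for the process.
    writefln("GC usedSize  = %s bytes (was %s)", after.usedSize, before.usedSize);
    writefln("GC freeSize  = %s bytes", after.freeSize);
}
```

Comparing these figures with the VmRSS delta shows how much of the process's memory the GC actually accounts for.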
Apr 05 2018
unDEFER <undefer gmail.com> writes:
On Thursday, 5 April 2018 at 22:06:10 UTC, Jonathan M Davis wrote:

 You could also look at how x.capacity compares to x.length as 
 well as core.memory.GC.stats() to see what the GC thinks that 
 it's using. On my system, the x.capacity was only 9 greater 
 than x.length, and GC.stats printed as
Yes, the capacity for me is also only 9 greater. So... in my game GC.stats() shows 105 MB, but the system shows 320 MB of memory used by the process. The same happens under Windows. I really don't understand what all this means.
Apr 05 2018
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, April 05, 2018 22:29:54 unDEFER via Digitalmars-d wrote:
 On Thursday, 5 April 2018 at 22:06:10 UTC, Jonathan M Davis wrote:
 You could also look at how x.capacity compares to x.length as
 well as core.memory.GC.stats() to see what the GC thinks that
 it's using. On my system, the x.capacity was only 9 greater
 than x.length, and GC.stats printed as
 Yes, the capacity for me is also only 9 greater. So... in my game GC.stats() shows 105 MB, but the system shows 320 MB of memory used by the process. The same happens under Windows. I really don't understand what all this means.
Well, if it's not the GC, AFAIK, it would pretty much have to be the C runtime which was using the rest of the memory, though 205 MiB seems like a lot extra. Are you doing anything that would allocate using malloc instead of the GC? e.g. IIRC, std.container.array.Array allocates stuff internally using malloc rather than the GC. Also, if you have done anything which would involve freeing malloc-ed memory, there could be some amount of heap fragmentation, though unless you were allocating and freeing a lot of memory, I wouldn't expect fragmentation of that magnitude.

- Jonathan M Davis
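
To illustrate why malloc-ed memory would not show up in GC.stats, here is a small sketch that allocates and touches a buffer through the C runtime and compares the GC's used size before and after. `gcDeltaAfterMalloc` is a hypothetical helper and the 16 MiB size is arbitrary.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memset;
import std.stdio;

/// Allocates `bytes` with C malloc, touches every byte so the pages become
/// resident, and returns how much the GC's used size changed meanwhile.
ptrdiff_t gcDeltaAfterMalloc(size_t bytes)
{
    auto before = GC.stats().usedSize;
    void* p = malloc(bytes);
    if (p is null)
        return 0;                  // allocation failed, nothing to measure
    memset(p, 1, bytes);
    auto after = GC.stats().usedSize;
    free(p);
    return cast(ptrdiff_t)(after - before);
}

void main()
{
    enum size_t bytes = 16 * 1024 * 1024;
    writefln("malloc-ed %s bytes, GC usedSize changed by %s bytes",
             bytes, gcDeltaAfterMalloc(bytes));
}
```

The process's resident size grows while the buffer is live, but the GC does not account for it, so memory allocated by C libraries or graphics drivers can explain a gap between the two figures.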
Apr 05 2018
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 4/5/18 5:44 PM, unDEFER wrote:
 OK, without reallocation:
 
 ====================8<====================
 void main()
 {
      float[3] f;
      float[3][] x;
      writefln("float = %s bytes", float.sizeof);
      writefln("float[3] = %s bytes", f.sizeof);
 
      int before = MemoryUsage();
 
      int total = 100;
      x = new float[3][total*1000];
 
      int after = MemoryUsage();
      writefln("%dK * float[3] = %d Kbytes", total, (after-before));
 }
 ====================>8====================
 
 $ ./memory
 float = 4 bytes
 float[3] = 12 bytes
 100K * float[3] = 2300 Kbytes
 
 Why is this so?
1. Compare to C malloc-ing 1.2MB at once (GC uses C malloc underneath)
2. Have you examined smaller numbers for total? Does it scale linearly or is there a threshold? (my guess is the latter)

Note that the GC does keep some metadata, but it's 1/32 the size of the actual memory, so I'm not sure it's relevant here.

Is there an end goal for this exercise? That is, did you find a problem and are trying to diagnose it by using this test? Maybe if we know the real problem you are having, it can be explained differently.

-Steve
Apr 05 2018
unDEFER <undefer gmail.com> writes:
On Thursday, 5 April 2018 at 22:23:12 UTC, Steven Schveighoffer 
wrote:
 1. Compare to C malloc-ing 1.2MB at once (GC uses C malloc 
 underneath)
Yes, after initializing 1.2 MB malloc'ed in C, it consumes 1.6 MB. 4.8 MB => 4.9 MB.
 2. Have you examined smaller numbers for total? Does it scale 
 linearly or is there a threshold? (my guess is the latter)
OK, if total = 1000 the last test shows 12916 Kbytes. It's fine.
 Note that the GC does keep some metadata, but it's 1/32 the 
 size of the actual memory, so I'm not sure it's relevant here.

 Is there an end goal for this exercise? That is, did you find a 
 problem and are trying to diagnose it by using this test? Maybe 
 if we know the real problem you are having, it can be explained 
 differently.
I just don't like that my game application takes too much memory. I will think more.
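
The two measurements reported in this thread (2300 Kbytes for total = 100 and 12916 Kbytes for total = 1000) can be compared with the raw payload size. The sketch below does that arithmetic; `payloadKiB` and `extraKiB` are hypothetical helper names, and it assumes the VmRSS figures are in units of 1024 bytes.

```d
import std.stdio;

/// Size in KiB (rounded up) of `count` elements of type float[3].
long payloadKiB(long count)
{
    enum long elemSize = (float[3]).sizeof; // 12 bytes
    return (count * elemSize + 1023) / 1024;
}

/// Difference between a measured VmRSS delta and the raw payload size.
long extraKiB(long count, long measuredKiB)
{
    return measuredKiB - payloadKiB(count);
}

void main()
{
    // Measured values as reported in the thread.
    writefln("total = 100:  payload %s KiB, extra %s KiB",
             payloadKiB(100_000), extraKiB(100_000, 2300));
    writefln("total = 1000: payload %s KiB, extra %s KiB",
             payloadKiB(1_000_000), extraKiB(1_000_000, 12916));
}
```

The extra memory is 1128 KiB in the first case and 1197 KiB in the second, so it stays at roughly 1.1 to 1.2 MB while the array grows tenfold. That is consistent with a mostly fixed overhead rather than one proportional to the array size.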
Apr 05 2018
unDEFER <undefer gmail.com> writes:
So, I have found all the answers.
In my game, out of 260 MB:
100 MB is consumed by the GC,
100 MB is consumed by the scene in glNewList,
and 30 MB by textures in glTexImage2D.

Very well, now I know what to do and how to get it smaller.
Big thanks to all.
Apr 05 2018