
digitalmars.D.learn - Large memory allocations

reply bearophile <bearophileHUGS lycos.com> writes:
While allocating a lot of memory for a small memory-hungry program, I have
found results that I don't understand, so I have written the following test
programs. Maybe someone can give me some information on the matter.
I am using a default install of 32-bit Windows XP with 2 GB RAM (so, for
example, I can't allocate 3 GB of RAM). (I presume the answers to my
questions are Windows-related.)

From C (MinGW 4.2.1) this is about the largest memory block I can allocate
(even though it swaps and takes 7+ seconds to run), 1_920_000_000 bytes:

#include "stdio.h"
#include "stdlib.h"
#define N 480000000
int main() {
    unsigned int* a = (unsigned int*)malloc(N * sizeof(unsigned int));
    unsigned int i;
    if (a != NULL)
        for (i = 0; i < N; ++i)
           a[i] = i;
    else
        printf("null!");
    return 0;
}


But from D this is about the largest memory block I can allocate with
std.c.stdlib.malloc, 1_644_000_000 bytes. Do you know why there's this
difference?

//import std.gc: malloc;
import std.c.stdlib: malloc;
import std.c.stdio: printf;
void main() {
    const uint N = 411_000_000;
    uint* a = cast(uint*)malloc(N * uint.sizeof);
    if (a !is null)
        for (uint i; i < N; ++i)
           a[i] = i;
    else
        printf("null!");
}

(If I use std.gc.malloc the situation is different again, and generally worse.)

-----------------------

So I have tried to use a sequence of smaller memory blocks instead; this is
the C code (every block is about 1 MB):

#include "stdio.h"
#include "stdlib.h"

#define N 250000

int main(int argc, char** argv) {
    unsigned int i, j;
    unsigned int m = argc == 2 ? atoi(argv[1]) : 100;

    for (j = 0; j < m; ++j) {
        unsigned int* a = (unsigned int*)malloc(N * sizeof(unsigned int));

        if (a != NULL) {
            for (i = 0; i < N; ++i)
               a[i] = i;
        } else {
            printf("null! %d\n", j);
            break;
        }
    }

    return 0;
}


And the D code:

//import std.gc: malloc;
import std.c.stdlib: malloc;
import std.c.stdio: printf;
import std.conv: toUint;

void main(string[] args) {
    const uint N = 250_000;
    uint m = (args.length == 2) ? toUint(args[1]) : 100;

    for (uint j; j < m; ++j) {
        uint* a = cast(uint*)malloc(N * uint.sizeof);

        if (a !is null) {
            for (uint i; i < N; ++i)
               a[i] = i;
        } else {
            printf("null! %d\n", j);
            break;
        }
    }
}

With such code I can allocate 1_708_000_000 bytes from D and up to
2_038_000_000 bytes from C (but near the last 100-200 MB of RAM the C code
swaps a lot).
So I can't use all my RAM from my D code? And do you know why?

Bye,
bearophile
Nov 14 2008
next sibling parent reply Kagamin <spam here.lot> writes:
bearophile Wrote:

 So I can't use all my RAM from my D code? And do you know why?

Because you use different implementations of malloc. Aggressive allocation
makes Windows sluggish, so it's a good idea to stop allocations before the
OS becomes unresponsive.
Nov 15 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Kagamin:
 Because you use different implementations of malloc.

But as you have noticed, there's a large difference.

 Aggressive allocation makes Windows sluggish, so it's a good idea to stop
 allocations before the OS becomes unresponsive.

So the allocator used by D may be better... Thank you for the answer,

bearophile
Nov 15 2008
parent Kagamin <spam here.lot> writes:
bearophile Wrote:

 So the allocator used by D may be better...
 

Nov 15 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On 15.11.08 at 00:56, bearophile wrote:

 While allocating a lot of memory for a small memory-hungry program, I
 have found results that I don't understand. [...]

 With such code I can allocate 1_708_000_000 bytes from D and up to
 2_038_000_000 bytes from C (but near the last 100-200 MB of RAM the C
 code swaps a lot).
 So I can't use all my RAM from my D code? And do you know why?

Comparing DMD against DMC might be more consistent, since they share the same CRT.
Nov 15 2008
prev sibling next sibling parent reply BCS <ao pathlink.com> writes:
Reply to bearophile,

 While allocating a lot of memory for a small memory-hungry program, I
 have found results that I don't understand, so I have written the
 following test programs. Maybe someone can give me some information on
 the matter.
 
 I am using a default install of 32-bit Windows XP with 2 GB RAM (so,
 for example, I can't allocate 3 GB of RAM). (I presume the answers to
 my questions are Windows-related.)
 
 From C (MinGW 4.2.1) this is about the largest memory block I can
 allocate (even though it swaps and takes 7+ seconds to run),
 1_920_000_000 bytes:
 

IIRC without special work, 32-bit Windows apps can't use more than 2 GB of
total address space, regardless of how much RAM you have. It's an OS-imposed
limit. With special setup (the /3GB boot option plus a large-address-aware
executable) that can be raised to 3 GB, but that is it.
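
One way to check that ceiling directly, independent of any malloc
implementation, is to binary-search the largest contiguous region
VirtualAlloc will reserve. A minimal Win32 C sketch (the 1 MB resolution and
the 2 GB starting bound are arbitrary choices):

#include <stdio.h>
#include <windows.h>

/* Binary-search the largest contiguous region VirtualAlloc will
   reserve. MEM_RESERVE claims address space without committing
   physical memory, so this measures the address-space ceiling,
   not available RAM. */
int main(void) {
    SIZE_T lo = 0, hi = (SIZE_T)1 << 31;   /* 2 GB upper bound */
    while (lo + (1 << 20) < hi) {          /* stop at 1 MB resolution */
        SIZE_T mid = lo + (hi - lo) / 2;
        void* p = VirtualAlloc(NULL, mid, MEM_RESERVE, PAGE_NOACCESS);
        if (p != NULL) {
            VirtualFree(p, 0, MEM_RELEASE);
            lo = mid;                      /* succeeded; try bigger */
        } else {
            hi = mid;                      /* failed; try smaller */
        }
    }
    printf("largest contiguous reservation: about %lu bytes\n",
           (unsigned long)lo);
    return 0;
}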
Nov 15 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
BCS:
 IIRC without special work, 32bit windows apps can't use more than 2GB of 
 total address space regardless of how much ram you have.

If you notice, the numbers I have shown relative to D (single allocation or
many smaller blocks) aren't very close to the 2 GB limit (I haven't tried to
raise the limit to 3 GB yet).

Bye,
bearophile
Nov 15 2008
parent BCS <ao pathlink.com> writes:
Reply to bearophile,

 BCS:
 
 IIRC without special work, 32-bit Windows apps can't use more than 2 GB
 of total address space, regardless of how much RAM you have.
 

 If you notice, the numbers I have shown relative to D (single allocation
 or many smaller blocks) aren't very close to the 2 GB limit (I haven't
 tried to raise the limit to 3 GB yet).

You're within 10-25% of that limit; I'd say that's close enough to make a
difference depending on what kind of overhead you're looking at. For example,
D might be reserving address space for every DLL it /might/ have to load,
while the C program might be waiting to reserve that until it actually needs
to load them.
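
One way to pin down how much overhead each runtime carries would be to
binary-search the largest single block malloc will grant and compare the
result across compilers (MinGW, DMC, and DMD calling the C malloc). A
minimal C sketch of such a probe, assuming the probing itself doesn't
fragment the heap:

#include <stdio.h>
#include <stdlib.h>

/* Binary-search the largest single block this CRT's malloc will
   return, to 1 MB resolution. Each probe block is freed at once,
   so nothing stays committed between iterations. */
int main(void) {
    size_t lo = 0, hi = (size_t)1 << 31;   /* 2 GB upper bound */
    while (lo + (1 << 20) < hi) {
        size_t mid = lo + (hi - lo) / 2;
        void* p = malloc(mid);
        if (p != NULL) {
            free(p);
            lo = mid;                      /* granted; try bigger */
        } else {
            hi = mid;                      /* refused; try smaller */
        }
    }
    printf("largest single malloc: about %lu bytes\n",
           (unsigned long)lo);
    return 0;
}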
Nov 16 2008
prev sibling parent reply Janderson <ask me.com> writes:
bearophile wrote:
 While allocating a lot of memory for a small memory-hungry program, I
 have found results that I don't understand. [...]
 So I can't use all my RAM from my D code? And do you know why?
 
 Bye,
 bearophile

Different allocation schemes have different strengths and weaknesses. Some
are fast, some fragment less, some have less overhead, some allow larger
blocks. Often these goals aren't mutually compatible, so there are always
tradeoffs. For example, to improve speed an allocator may allocate into
particular buckets, which might restrict the maximum size of one allocation
(a toy sketch of that idea follows below).

I wonder how nedmalloc or Hoard perform with your tests?

-Joel
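
A toy sketch of that bucket idea (purely illustrative, not how any real
allocator is implemented): requests are rounded up to power-of-two size
classes served from per-class free lists, and anything above the largest
class must take a fallback path, so the bucketed fast path has a built-in
size ceiling.

#include <stddef.h>
#include <stdlib.h>

/* Toy size-class allocator: requests round up to a power of two and
   are served from per-class free lists; anything above the largest
   class falls back to plain malloc. Only meant to show why a
   bucketed fast path has a built-in size ceiling. */
#define MAX_CLASS 20                     /* largest fast-path class: 1 MB */

static void* free_lists[MAX_CLASS + 1];

static int size_class(size_t n) {        /* smallest c with (1 << c) >= n */
    int c = 3;                           /* 8-byte minimum holds the link */
    while (((size_t)1 << c) < n) ++c;
    return c;
}

void* bucket_alloc(size_t n) {
    int c = size_class(n);
    if (c > MAX_CLASS)
        return malloc(n);                /* slow path: too big for buckets */
    if (free_lists[c] != NULL) {         /* fast path: pop a cached block */
        void* p = free_lists[c];
        free_lists[c] = *(void**)p;
        return p;
    }
    return malloc((size_t)1 << c);       /* cold path: grab a fresh block */
}

void bucket_free(void* p, size_t n) {
    int c = size_class(n);
    if (c > MAX_CLASS) { free(p); return; }
    *(void**)p = free_lists[c];          /* push onto the class free list */
    free_lists[c] = p;
}

A real implementation would also carve large chunks into blocks and track
sizes internally; the point here is only the MAX_CLASS ceiling on the fast
path.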
Nov 15 2008
parent Kagamin <spam here.lot> writes:
lol, quote damage!

Janderson Wrote:

 [...]
 Different allocation schemes have different strengths and weaknesses.
 [...] I wonder how nedmalloc or Hoard perform with your tests?

Nov 16 2008