digitalmars.D.learn - Out of memory error (even when using destroy())

realhet <real_het hotmail.com> writes:

Hi,

I'm kinda new to the D language and I love it already. :D So far 
I haven't had any serious problems, but this one seems to be 
beyond me.

import std.stdio;
void main(){
     foreach(i; 0..2000){
         writeln(i);
         auto st = new ubyte[500_000_000];
         destroy(st); //<-this doesn't matter
     }
}

Compiled with DMD 2.074.0 Win32 it produces the following output:
0
1
2
core.exception.OutOfMemoryError src\core\exception.d(696): Memory 
allocation failed

It doesn't matter whether I call destroy() or not. This is ok 
because, as I learned, destroy only calls the destructor and marks 
the memory block as unused.

But I also learned that the GC will start to collect when it runs 
out of memory. This time, however, the following happens:
3x half a GB of allocations and deallocations, and on the 4th the 
system runs into the 2GB limit, which is ok. At this point the GC 
already has 1.5GB of free memory, but instead of using that, it 
throws an OutOfMemoryError. 
Why?

Note: This is not a problem when I use smaller blocks (like 50MB).
But I want to use large blocks, without making a slow wrapper 
that emulates a large block by using smaller GC allocated blocks.

Is there a solution to this?

Thank You!

May 25 2017

Jonathan M Davis via Digitalmars-d-learn writes:

On Friday, May 26, 2017 06:31:49 realhet via Digitalmars-d-learn wrote:
 Hi,

 I'm kinda new to the D language and I love it already. :D So far
 I haven't got any serious problems but this one seems like beyond
 me.

 import std.stdio;
 void main(){
      foreach(i; 0..2000){
          writeln(i);
          auto st = new ubyte[500_000_000];
          destroy(st); //<-this doesnt matter
      }
 }

 Compiled with DMD 2.074.0 Win32 it produces the following output:
 0
 1
 2
 core.exception.OutOfMemoryError src\core\exception.d(696): Memory
 allocation failed

 It doesn't matter that I call destroy() or not. This is ok
 because as I learned: destroy only calls the destructor and marks
 the memory block as unused.

 But I also learned that GC will start to collect when it run out
 of memory but in this time the following happens:
 3x half GB of allocations and deallocations, and on the 4th the
 system runs out of the 2GB
   limit which is ok. At this point the GC already has 1.5GB of
 free memory but instead of using that, it returns a Memory Error.
 Why?

 Note: This is not a problem when I use smaller blocks (like 50MB).
 But I want to use large blocks, without making a slow wrapper
 that emulates a large block by using smaller GC allocated blocks.

It's likely an issue with false pointers. The GC thinks that the memory is
referenced when it isn't, because some of the values match the pointers that
would need to be freed.

 Is there a solution to this?

Use 64-bit. False pointers don't tend to be a problem with 64-bit, whereas
they can be with 32-bit - especially when you're allocating large blocks of
memory like that.

- Jonathan M Davis

May 26 2017

realhet <real_het hotmail.com> writes:

Thanks for the answer!

But hey, the GC knows that it should not search for any pointers 
in those large blocks.
And the buffer is full of zeros at the start, so there can't be 
any 'false pointers' in it. And I think the GC will not search in 
it either.

The only reference to the buffer is 'st', which dies shortly 
after it has been allocated.

64-bit is not a solution because I need to produce a 32-bit DLL, 
and I also want to use 32-bit asm objs.
The total of 2GB of memory is more than enough for the 
problem.
My program has to produce 300..500 MB of contiguous data 
frequently. This works in MSVC32, but with D's GC it starts to 
eat memory and fails at the 4th iteration. Actually, it never 
releases the previous blocks, even if I say so with destroy().

At this point I can only think of:
a) Working with the D allocator but emulating large blocks by 
virtually stitching small blocks together. (this is unnecessary 
complexity)
b) Allocating memory via the Win32 API and not using D goodies 
anymore (also unnecessary complexity)

But these are ugly workarounds. :S

I also tried to make each allocation smaller than the previous 
one, so it would easily fit into the previously released space, 
and yet it keeps eating memory:

void alloc_dealloc(size_t siz){
     auto st = new ubyte[siz];
}

void main(){
     foreach(i; 0..4) alloc_dealloc(500_000_000 - 50_000_000*i);
}

May 26 2017

rikki cattermole <rikki cattermole.co.nz> writes:

On 26/05/2017 9:15 AM, realhet wrote:
 Thanks for the answer!
 
 But hey, the GC knows that is should not search for any pointers in 
 those large blocks.
 And the buffer is full of 0-s at the start, so there can't be any 'false 
 pointers' in it. And I think the GC will not search in it either.
 
 The only reference to the buffer is 'st' which will die shortly after it 
 has been allocated.
 
 64bit is not a solution because I need to produce a 32bit dll, and I 
 also wanna use 32bit asm objs.
 The total 2GB amount of memory is more than enough for the problem.
 My program have to produce 300..500 MB of continuous data frequently. 
 This works in MSVC32, but with D's GC it starts to eat memory and fails 
 at the 4th iteration. Actually it never releases the previous blocks 
 even I say so with destroy().
 
 At this point I only can think of:
 a) Work with the D allocator but emulate large blocks by virtually 
 stitching small blocks together. (this is unnecessary complexity)
 b) Allocating memory by Win32 api and not using D goodies anymore (also 
 unnecessary complexity)
 
 But these are ugly workarounds. :S
 
 I also tried to allocate smaller blocks than the previous one, so it 
 would easily fit to the prevouisly released space, and yet it keeps 
 eating memory:
 
 void alloc_dealloc(size_t siz){
      auto st = new ubyte[siz];
 }
 
 void main(){
      foreach(i; 0..4) alloc_dealloc(500_000_000 - 50_000_000*i);
 }

If you have to use such large amounts frequently, you really have to go 
with buffers of memory that you control, not the GC. Memory allocation 
is always expensive; if you can avoid it, all the better.

May 26 2017

Guillaume Piolat <first.last gmail.com> writes:

On Friday, 26 May 2017 at 08:15:49 UTC, realhet wrote:
 64bit is not a solution because I need to produce a 32bit dll, 
 and I also wanna use 32bit asm objs.
 The total 2GB amount of memory is more than enough for the 
 problem.
 My program have to produce 300..500 MB of continuous data 
 frequently. This works in MSVC32, but with D's GC it starts to 
 eat memory and fails at the 4th iteration. Actually it never 
 releases the previous blocks even I say so with destroy().

If you have issues with false pointers, you can use malloc 
instead of the GC to use much less memory.
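
A minimal sketch of the malloc approach, assuming the buffer holds 
plain bytes and no references to GC-managed objects. The function 
name useMallocBuffer is hypothetical; malloc and free come from 
core.stdc.stdlib:

```d
import core.stdc.stdlib : free, malloc;
import std.stdio : writeln;

/// Allocates `size` bytes outside the GC heap, touches both ends of the
/// buffer, and releases it again. Returns false if malloc fails.
/// (Illustrative helper, not part of any library.)
bool useMallocBuffer(size_t size)
{
    auto p = cast(ubyte*) malloc(size);
    if (p is null)
        return false;
    scope (exit) free(p); // released deterministically when the function returns

    // Slicing the raw pointer gives bounds-checked, array-like access.
    ubyte[] st = p[0 .. size];
    st[0] = 1;     // note: unlike `new`, malloc does not zero the memory
    st[$ - 1] = 2;
    return true;
}

void main()
{
    foreach (i; 0 .. 2000)
    {
        if (!useMallocBuffer(500_000_000)) // same size as in the original post
        {
            writeln("malloc failed at iteration ", i);
            return;
        }
    }
    writeln("done");
}
```

Because the block never belongs to the GC, a stray value on the stack 
cannot keep it alive. The cost is that the lifetime has to be managed 
by hand, and if the buffer ever stores pointers into the GC heap, the 
GC must be told about it (for example with GC.addRange).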

May 26 2017

ag0aep6g <anonymous example.com> writes:

On 05/26/2017 10:15 AM, realhet wrote:
 But hey, the GC knows that is should not search for any pointers in 
 those large blocks.
 And the buffer is full of 0-s at the start, so there can't be any 'false 
 pointers' in it. And I think the GC will not search in it either.

The issue is not that the block contains a false pointer, but that 
there's a false pointer elsewhere that points into the block. The bigger 
the block, the more likely it is that something (e.g. an int on the 
stack) is mistaken for a pointer into it.
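
A rough back-of-the-envelope sketch of why block size matters, under 
the simplifying assumption that a stray 32-bit word is uniformly 
distributed over all 2^32 values (real stack contents are not). The 
function name hitChance is made up for this illustration:

```d
import std.stdio : writefln;

/// Chance that one uniformly random 32-bit word happens to point into a
/// block of `blockSize` bytes. Simplifying assumption: every one of the
/// 2^32 possible values is equally likely.
double hitChance(ulong blockSize)
{
    return cast(double) blockSize / 4_294_967_296.0; // 2^32 addresses
}

void main()
{
    writefln("500 MB block: %.1f%%", 100 * hitChance(500_000_000)); // 11.6%
    writefln(" 50 MB block: %.1f%%", 100 * hitChance(50_000_000));  // 1.2%
}
```

Under that assumption, each stray word has roughly a one-in-nine 
chance of pinning a 500 MB block, so a handful of such words on the 
stack or in static data is enough to make it likely. For a 50 MB 
block the per-word chance is ten times smaller.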

May 26 2017

Mike B Johnson <Mikey Ikes.com> writes:

On Friday, 26 May 2017 at 14:05:34 UTC, ag0aep6g wrote:
 On 05/26/2017 10:15 AM, realhet wrote:
 But hey, the GC knows that is should not search for any 
 pointers in those large blocks.
 And the buffer is full of 0-s at the start, so there can't be 
 any 'false pointers' in it. And I think the GC will not search 
 in it either.

 The issue is not that the block contains a false pointer, but 
 that there's a false pointer elsewhere that points into the 
 block. The bigger the block, the more likely it is that 
 something (e.g. an int on the stack) is mistaken for a pointer 
 into it.

Wow, if that is the case then the GC has some real issues. The GC 
should be informed about all pointers and an int is not a pointer.

May 26 2017

Stanislav Blinov <stanislav.blinov gmail.com> writes:

On Friday, 26 May 2017 at 18:06:42 UTC, Mike B Johnson wrote:
 On Friday, 26 May 2017 at 14:05:34 UTC, ag0aep6g wrote:
 On 05/26/2017 10:15 AM, realhet wrote:
 But hey, the GC knows that is should not search for any 
 pointers in those large blocks.
 And the buffer is full of 0-s at the start, so there can't be 
 any 'false pointers' in it. And I think the GC will not 
 search in it either.

 The issue is not that the block contains a false pointer, but 
 that there's a false pointer elsewhere that points into the 
 block. The bigger the block, the more likely it is that 
 something (e.g. an int on the stack) is mistaken for a pointer 
 into it.

 Wow, if that is the case then the GC has some real issues. The 
 GC should be informed about all pointers and an int is not a 
 pointer.

What is a pointer if not an int? :)

That is not an issue. The GC holds off releasing memory if 
there's even a suspicion that someone might be holding on to it. 
In most problems, ints are small. Pointers are always big, so 
there's not much overlap there. Accidents do happen occasionally, 
but it's better to have a system that is too cautious than one 
that ruins your data.

Working with huge memory chunks isn't really a domain for GC 
though.

May 26 2017

"H. S. Teoh via Digitalmars-d-learn" <digitalmars-d-learn puremagic.com> writes:

On Fri, May 26, 2017 at 06:06:42PM +0000, Mike B Johnson via
Digitalmars-d-learn wrote:
 On Friday, 26 May 2017 at 14:05:34 UTC, ag0aep6g wrote:
 On 05/26/2017 10:15 AM, realhet wrote:
 But hey, the GC knows that is should not search for any pointers
 in those large blocks.  And the buffer is full of 0-s at the
 start, so there can't be any 'false pointers' in it. And I think
 the GC will not search in it either.

 
 The issue is not that the block contains a false pointer, but that
 there's a false pointer elsewhere that points into the block. The
 bigger the block, the more likely it is that something (e.g. an int
 on the stack) is mistaken for a pointer into it.

 
 Wow, if that is the case then the GC has some real issues. The GC
 should be informed about all pointers and an int is not a pointer.

Unfortunately, it can't, because (1) D interfaces with C code, and you
don't have this kind of information from a C object file, and (2) you
can turn a pointer into an int with a cast or a union in @system code,
and since the GC cannot assume @safe for all code, it needs to be
conservative and assume any int-like data could potentially be a
pointer.

You could improve GC performance by giving it type info from @safe code
so that it skips over blocks that *definitely* have no pointers (it
already does this to some extent, e.g., data in an int[] will never be
scanned for pointers because the GC knows it can't contain any). But you
can't make the GC fully non-conservative because it may crash the
program when it wrongly assumes a memory block is dead when it's
actually still live. All it takes is one pointer on the stack that's
wrongly assumed to be just int, and you're screwed.


T

-- 
Dogs have owners ... cats have staff. -- Krista Casada

May 26 2017

Mike B Johnson <Mikey Ikes.com> writes:

On Friday, 26 May 2017 at 18:19:48 UTC, H. S. Teoh wrote:
 On Fri, May 26, 2017 at 06:06:42PM +0000, Mike B Johnson via 
 Digitalmars-d-learn wrote:
 On Friday, 26 May 2017 at 14:05:34 UTC, ag0aep6g wrote:
 On 05/26/2017 10:15 AM, realhet wrote:
 But hey, the GC knows that is should not search for any 
 pointers in those large blocks.  And the buffer is full of 
 0-s at the start, so there can't be any 'false pointers' 
 in it. And I think the GC will not search in it either.

 
 The issue is not that the block contains a false pointer, 
 but that there's a false pointer elsewhere that points into 
 the block. The bigger the block, the more likely it is that 
 something (e.g. an int on the stack) is mistaken for a 
 pointer into it.

 
 Wow, if that is the case then the GC has some real issues. The 
 GC should be informed about all pointers and an int is not a 
 pointer.

 Unfortunately, it can't, because (1) D interfaces with C code, 
 and you don't have this kind of information from a C object 
 file, and (2) you can turn a pointer into an int with a cast or 
 a union in @system code, and since the GC cannot assume @safe 
 for all code, it needs to be conservative and assume any 
 int-like data could potentially be a pointer.

 You could improve GC performance by giving it type info from 
 @safe code so that it skips over blocks that *definitely* have 
 no pointers (it already does this to some extent, e.g., data in 
 an int[] will never be scanned for pointers because the GC 
 knows it can't contain any). But you can't make the GC fully 
 non-conservative because it may crash the program when it 
 wrongly assumes a memory block is dead when it's actually still 
 live. All it takes is one pointer on the stack that's wrongly 
 assumed to be just int, and you're screwed.

And what if one isn't interfacing to C? All pointers should be 
known. You can't access memory by an int or any other 
non-pointer type! Hence, when pointers are created or ints are 
cast to pointers, the GC should be informed and then handle them 
appropriately (then, instead of scanning a 100MB block of memory 
for "pointers", it should scan the list of possible pointers, 
which will generally be much, much smaller).

Therefore, in a true D program (no outsourcing) with no pointers 
used, the GC should never have to scan anything.

It seems the GC could be smarter than it is instead of just making 
blanket assumptions about the entire program (which rarely hold), 
which is generally a poor choice when it comes to 
performance...

In fact, when interfacing with C or other programs, memory could 
be partitioned, and any memory that may escape D could be treated 
differently from the memory used only by D code.

After all, if we truly want to be safe, why not scan the entire 
memory of the system? Who knows, some pointer externally might be 
peeping in on our hello world program.

May 27 2017

Stanislav Blinov <stanislav.blinov gmail.com> writes:

On Saturday, 27 May 2017 at 17:57:03 UTC, Mike B Johnson wrote:

 And what if one isn't interfacing to C? All pointers should be 
 known. You can't access memory by and int or any other 
 non-pointer type! Hence, when pointers are created or ints are 
 cast to pointers, the GC should be informed and then handle 
 them appropriately

Eh? So *every* cast from and to a pointer should become a call 
into the runtime, poking the GC? Or rather, every variable 
declaration should somehow be made magically known to the GC 
without any runtime cost?

 (then, instead of scanning a 100MB block of memory for 
 "pointers" it should scan the list of possible pointers(which 
 will generally be much much lower).

That's precisely what it does, it scans the possible suspects, 
nothing more. That is, the stack (it has no idea what's there, 
it's just a block of untyped memory), memory it itself allocated 
*only if* it needs to (e.g. you allocated a typed array, and the 
type has pointers), memory you've specifically asked it to scan. 
It won't scan that block of 500 million ubytes the OP allocated, 
unless told to do so. It would scan it if it was a void[] block 
though.
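
One way to observe this is to ask the runtime for the attributes of 
the block behind an array. The sketch below assumes the default 
druntime GC; the helper name isNoScan is made up, while GC.getAttr, 
GC.addrOf and GC.BlkAttr.NO_SCAN are taken from core.memory:

```d
import core.memory : GC;
import std.stdio : writeln;

/// True if the GC block that contains `p` carries the NO_SCAN attribute,
/// i.e. the collector will not look for pointers inside it.
/// (Illustrative helper, not part of druntime.)
bool isNoScan(void* p)
{
    // GC.addrOf maps a possibly interior pointer to the start of its block.
    return (GC.getAttr(GC.addrOf(p)) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    auto bytes = new ubyte[64];  // element type cannot hold pointers
    auto ptrs  = new void*[16];  // element type is a pointer
    auto raw   = new void[64];   // untyped, so it might hold pointers

    writeln("ubyte[] NO_SCAN: ", isNoScan(bytes.ptr)); // expected: true
    writeln("void*[] NO_SCAN: ", isNoScan(ptrs.ptr));  // expected: false
    writeln("void[]  NO_SCAN: ", isNoScan(raw.ptr));   // expected: false
}
```

Note that NO_SCAN only controls whether the contents of a block are 
searched. It does not stop a value somewhere else from being taken 
for a pointer *into* the block, which is the problem discussed 
earlier in the thread.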

 Therefor, in a true D program(no outsourcing) with no pointers 
 used, the GC should never have to scan anything.

No pointers used? No arrays, no strings, no delegates?.. That's a 
rather limited program. But thing is, you're right, in such a 
program the GC will indeed never have to scan anything. If you 
never allocate, GC collection never occurs either.

 It seems the GC can be smarter than it is instead of just 
 making blanket assumptions about the entire program(which 
 rarely hold), which is generally always a poor choice when it 
 comes to performance...

Unnecessary interaction with the GC, e.g. informing it about 
every cast, is a poor choice for performance.

 After all, if we truly want to be safe, why not scan the entire 
 memory of the system? Who knows, some pointer externally might 
 be peeping in on our hello world program.

What?

May 27 2017

nkm1 <t4nk074 openmailbox.org> writes:

On Saturday, 27 May 2017 at 17:57:03 UTC, Mike B Johnson wrote:

 And what if one isn't interfacing to C? All pointers should be 
 known.

Apparently some people are (were?) working on semi-precise GC: 
https://github.com/dlang/druntime/pull/1603
That still scans the stack conservatively, though.

 Therefor, in a true D program(no outsourcing) with no pointers 
 used, the GC should never have to scan anything.

All realistic programs (in any language) use a lot of pointers - 
for example, all slices in D have embedded pointers (slice.ptr), 
references are pointers, classes are references, etc.

 It seems the GC can be smarter than it is instead of just 
 making blanket assumptions about the entire program(which 
 rarely hold), which is generally always a poor choice when it 
 comes to performance...

If you only have compile time information, making blanket 
assumptions is inevitable - after all, compiler can't understand 
how a nontrivial program actually works. The alternative is doing 
more work at runtime (marking pointers that changed since 
previous collection, etc), which is also not good for performance.

 Who knows, some pointer externally might be peeping in on our 
 hello world program.

Of course, there is a pointer :)

void main()
{
     import std.stdio;

     writeln("hello world".ptr);
}

May 27 2017

Jordan Wilson <wilsonjord gmail.com> writes:

On Friday, 26 May 2017 at 06:31:49 UTC, realhet wrote:
 Hi,

 I'm kinda new to the D language and I love it already. :D So 
 far I haven't got any serious problems but this one seems like 
 beyond me.

 import std.stdio;
 void main(){
     foreach(i; 0..2000){
         writeln(i);
         auto st = new ubyte[500_000_000];
         destroy(st); //<-this doesnt matter
     }
 }

 Compiled with DMD 2.074.0 Win32 it produces the following 
 output:
 0
 1
 2
 core.exception.OutOfMemoryError src\core\exception.d(696): 
 Memory allocation failed

 It doesn't matter that I call destroy() or not. This is ok 
 because as I learned: destroy only calls the destructor and 
 marks the memory block as unused.

 But I also learned that GC will start to collect when it run 
 out of memory but in this time the following happens:
 3x half GB of allocations and deallocations, and on the 4th the 
 system runs out of the 2GB
  limit which is ok. At this point the GC already has 1.5GB of 
 free memory but instead of using that, it returns a Memory 
 Error. Why?

 Note: This is not a problem when I use smaller blocks (like 
 50MB).
 But I want to use large blocks, without making a slow wrapper 
 that emulates a large block by using smaller GC allocated 
 blocks.

 Is there a solution to this?

 Thank You!

I believe the general solution would be to limit allocation 
within loops (given the issue Jonathan mentioned).

This I think achieves the spirit of your code, but without the 
memory exception:
import std.stdio;
void main(){
     ubyte[] st; // one buffer, declared outside the loop and reused
     foreach(i; 0..2000){
         writeln(i);
         st.length=500_000_000; // instead of: auto st = new ubyte[500_000_000];
         st.length=0; // instead of: destroy(st)
         // prevent reallocation by assuming it's ok to overwrite
         // what's currently in st
         st.assumeSafeAppend;
     }
}

May 26 2017

realhet <real_het hotmail.com> writes:

 Jordan Wilson wrote:
 This I think achieves the spirit of your code, but without the 
 memory exception:
     ubyte[] st;
     foreach(i; 0..2000){
         writeln(i);
         st.length=500_000_000; // auto st = new ubyte[500_000_000];
         st.length=0; // destroy(st)
         // prevent allocation by assuming it's ok to overwrite
         // what's currently in st
         st.assumeSafeAppend;
     }

Yea, that's the perfect solution. It uses exactly the amount of 
memory that is required, and I'm still using only D things.
The only difference is that I need one variable outside of 
the loop, but it's well worth it because I only need one large 
buffer at a time.
It also refreshed my knowledge of assumeSafeAppend(), which is now 
clear to me, thanks to you.

Using this information I'll be able to write a BigArray class that 
will hold a large amount of data without worrying that the program 
uses 3x more memory than needed :D
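
A possible shape for such a wrapper, as a minimal sketch only. The 
BigArray name comes from the paragraph above, but the acquire method 
and the struct layout are assumptions made for illustration. The 
sizes are scaled down from the thread's 500 MB so that it runs 
quickly:

```d
import std.stdio : writeln;

/// Hypothetical reusable buffer built on the assumeSafeAppend trick.
struct BigArray
{
    private ubyte[] store; // the single GC block that gets recycled

    /// Returns a zero-filled slice of `n` bytes. While `n` fits into the
    /// capacity of the existing block, no new allocation takes place.
    ubyte[] acquire(size_t n)
    {
        store.length = 0;         // forget the old contents, keep the block
        store.assumeSafeAppend(); // allow growing in place over the old data
        store.length = n;         // grows in place and zero-initializes
        return store;
    }
}

void main()
{
    BigArray big;
    // Shrinking sizes, as in the second test program in the thread.
    foreach (i; 0 .. 4)
    {
        auto buf = big.acquire(100_000_000 - 10_000_000 * i);
        writeln(i, ": ", buf.length, " bytes at ", buf.ptr);
    }
}
```

Any slice returned by an earlier acquire call is overwritten by the 
next one, so callers must not hold on to old slices. That is the 
promise assumeSafeAppend asks the programmer to make.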

Thanks to everyone,
Such a helpful community you have here!

May 26 2017
