www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - Memory allocation faile on string concat

reply Xie <xiemargl gmail.com> writes:
Can't run a simple program. What's wrong, GC?

import std.stdio;
import std.date;

void f0()
{
	wstring a[];

	foreach(i; 0 .. 100_000_000)
	{
  		a ~= " "w;
  	}
}

void main()
{
	auto r = benchmark!(f0)(1);
	writeln(r, "ms");
}

DMD 2.047
Nov 10 2010
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, November 10, 2010 08:33:45 Xie wrote:
 Can't run a simple program. What's wrong, GC?
 
 import std.stdio;
 import std.date;
 
 void f0()
 {
 	wstring a[];
 
 	foreach(i; 0 .. 100_000_000)
 	{
   		a ~= " "w;
   	}
 }
 
 void main()
 {
 	auto r = benchmark!(f0)(1);
 	writeln(r, "ms");
 }
 
 DMD 2.047

You just tried to create an array with 100_000_000 elements, and arrays try and maintain extra capacity for extra appending, so you're actually trying to create one bigger than that. That's a _lot_ of memory, and it's _contiguous_ memory. You probably hit some sort of memory limit in there somewhere and would have the same problem if you tried to create an array that large in C or C++. That's an _enormous_ array. Maybe it's a 32-bit barrier which will go away once 64-bit dmd is complete, but it's not exactly normal to try and use an array wich 100 million elements in it. The computer does have limits you know. - Jonathan M Davis
Nov 10 2010
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 11:33:45 -0500, Xie <xiemargl gmail.com> wrote:

 Can't run a simple program. What's wrong, GC?

 import std.stdio;
 import std.date;

 void f0()
 {
 	wstring a[];

 	foreach(i; 0 .. 100_000_000)
 	{
   		a ~= " "w;
   	}
 }

 void main()
 {
 	auto r = benchmark!(f0)(1);
 	writeln(r, "ms");
 }

The results on my machine with 1G of memory is that it consumes 2G of memory and the system starts thrashing. I changed the value to 10_000_000, and it runs in a couple seconds, I change it to 50_000_000 and it runs in 200 seconds. Something is definitely amiss here, because the following doesn't help (it still consumes 1.2GB of memory): void f0() { wstring a[]; a.reserve(100_000_000); foreach(i; 0 .. 100_000_000) { a ~= " "w; } } This should take the explosive nature of appending out of the equation, because a reallocation should never occur. I'll look into it. -Steve
Nov 10 2010
parent reply Xie <Xiemargl gmail.com> writes:
Sorry, it a mistypo (i began from wchar[], later changed to wstring)

Real problem can be seen here

import std.stdio;
import std.date;

void f0()
{
	wstring a;

	foreach(i; 0 .. 100_000_000)
	{
  		a ~= " "w;
  	}
}

void main()
{
	auto r = benchmark!(f0)(10);
	writeln(r, "ms");
}
Nov 10 2010
parent Xie <Xiemargl gmail.com> writes:
OK, this actually makes sense to me.

It's a manifestation of this issue:
http://d.puremagic.com/issues/show_bug.cgi?id=3929

I'm think - it's truth but not at all. Sorry, but i'm give incomplete data. My example run fine, when benchmark(1), (2), but not 10. This means, that memory not collected _between_ calls.
Nov 10 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 13:33:11 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 10 Nov 2010 11:33:45 -0500, Xie <xiemargl gmail.com> wrote:

 Can't run a simple program. What's wrong, GC?

 import std.stdio;
 import std.date;

 void f0()
 {
 	wstring a[];

 	foreach(i; 0 .. 100_000_000)
 	{
   		a ~= " "w;
   	}
 }

 void main()
 {
 	auto r = benchmark!(f0)(1);
 	writeln(r, "ms");
 }

The results on my machine with 1G of memory is that it consumes 2G of memory and the system starts thrashing. I changed the value to 10_000_000, and it runs in a couple seconds, I change it to 50_000_000 and it runs in 200 seconds. Something is definitely amiss here, because the following doesn't help (it still consumes 1.2GB of memory): void f0() { wstring a[]; a.reserve(100_000_000); foreach(i; 0 .. 100_000_000) { a ~= " "w; } } This should take the explosive nature of appending out of the equation, because a reallocation should never occur. I'll look into it.

Waaaait a second, a is not a wstring, it's an *array* of wstrings. That's completely different. This all makes sense now. It's not 200MB, it's 800MB your code is asking for. This might be a bit much for a test, a 32-bit system only supports 4GB of address space, and usually 1GB is reserved, so you are trying to allocate about 1/3 of the memory you could possibly allocate. Is there any reason you expect this to perform well? -Steve
Nov 10 2010
prev sibling next sibling parent reply sybrandy <sybrandy gmail.com> writes:
On 11/10/2010 11:33 AM, Xie wrote:
 Can't run a simple program. What's wrong, GC?

 import std.stdio;
 import std.date;

 void f0()
 {
 	wstring a[];

 	foreach(i; 0 .. 100_000_000)
 	{
    		a ~= " "w;
    	}
 }

 void main()
 {
 	auto r = benchmark!(f0)(1);
 	writeln(r, "ms");
 }

 DMD 2.047

In addition to what everybody else is suggesting you look at, you may want to look into using an appender (defined in std.array). Not sure if it will help with the memory usage, but it's supposed to be faster for appending data to an array. Casey
Nov 10 2010
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 13:55:26 -0500, sybrandy <sybrandy gmail.com> wrote:

 On 11/10/2010 11:33 AM, Xie wrote:
 Can't run a simple program. What's wrong, GC?

 import std.stdio;
 import std.date;

 void f0()
 {
 	wstring a[];

 	foreach(i; 0 .. 100_000_000)
 	{
    		a ~= " "w;
    	}
 }

 void main()
 {
 	auto r = benchmark!(f0)(1);
 	writeln(r, "ms");
 }

 DMD 2.047

In addition to what everybody else is suggesting you look at, you may want to look into using an appender (defined in std.array). Not sure if it will help with the memory usage, but it's supposed to be faster for appending data to an array.

Just tried it, Appender is actually slower. This *is* a problem, it should be way faster than builtin array appending. I will look into it. -Steve
Nov 10 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 14:06:40 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 Just tried it, Appender is actually slower.  This *is* a problem, it  
 should be way faster than builtin array appending.

 I will look into it.

More data, Appender taking an array of elements is significantly slower than taking a single element (by an order of magnitude). I'll try to figure out why. BTW, if you append individual wchars instead of a string of a single wchar, Appender does beat the pants off of builtin appends. -Steve
Nov 10 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 14:38:02 -0500, Xie <Xiemargl gmail.com> wrote:

 Sorry, it a mistypo (i began from wchar[], later changed to wstring)

 Real problem can be seen here

 import std.stdio;
 import std.date;

 void f0()
 {
 	wstring a;

 	foreach(i; 0 .. 100_000_000)
 	{
   		a ~= " "w;
   	}
 }

 void main()
 {
 	auto r = benchmark!(f0)(10);
 	writeln(r, "ms");
 }

OK, this actually makes sense to me. It's a manifestation of this issue: http://d.puremagic.com/issues/show_bug.cgi?id=3929 In essence, in order to aid performance, there is an 8-element cache of arrays that are being appended. At the moment, the append code relies on this cache keeping the array from being collected to remain sane. However, as I wrote in the last message on that bug report, I think I can fix it so the cache is not considered a pointer to the memory, and it should be collected. But I haven't done that yet. So what's happening in your code is the first 8 times executing that function, the memory is not collected. This results in you allocating actually 800 million wchars, or 1.6GB before anything can be collected. So in the meantime, jump on the CC for that bug, and I hope to get to it sometime in the near future (maybe by 2.051-2.052). -Steve
Nov 10 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 10 Nov 2010 23:33:58 -0500, Xie <Xiemargl gmail.com> wrote:

 OK, this actually makes sense to me.

 It's a manifestation of this issue:
 http://d.puremagic.com/issues/show_bug.cgi?id=3929

I'm think - it's truth but not at all. Sorry, but i'm give incomplete data. My example run fine, when benchmark(1), (2), but not 10. This means, that memory not collected _between_ calls.

Yes, this is what the bug report is about -- the memory is not collected even though there are no references left, because the cache used to speed up appending still has a reference. -Steve
Nov 11 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 11 Nov 2010 07:42:36 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 10 Nov 2010 23:33:58 -0500, Xie <Xiemargl gmail.com> wrote:

 OK, this actually makes sense to me.

 It's a manifestation of this issue:
 http://d.puremagic.com/issues/show_bug.cgi?id=3929

I'm think - it's truth but not at all. Sorry, but i'm give incomplete data. My example run fine, when benchmark(1), (2), but not 10. This means, that memory not collected _between_ calls.

Yes, this is what the bug report is about -- the memory is not collected even though there are no references left, because the cache used to speed up appending still has a reference.

OK, so I have actually implemented a fix for bug 3929. Essentially, the cache will no longer hold hostage the data that was being appended. However, I have found that this *still* doesn't fix the problem. Even though the cache no longer holds the memory as allocated, the example does something that we can't really get around. When you allocate a very very large block of data (in this case 1/20 of your address space), it is inevitable with a conservative GC that this data never gets collected. Why? Because the stack is scanned conservatively, and the chances that some 4-bytes of data looks like it points to the space is very high. So in order to fix this, we would not only need precise heap scanning, but *also* precise stack scanning, and precise TLS/global data scanning. But I'm still going to check this in to fix the bug, because it does seem to help with smaller blocks. I'll post more on the main newsgroup on this issue. -Steve
Dec 23 2010