www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Pop quiz [memory usage]

reply "Vladimir Panteleev" <thecybershadow gmail.com> writes:
// Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
version(Tango) import tango.io.Console;
else           import std.stdio;

struct S
{
	ubyte[40_000] data;
}

void main()
{
	S[] a;
	a ~= S();

	// QUESTION: How much memory will this program consume upon reaching this  
point?
	version(Tango) Cin.copyln();
	else           readln();
}
Jun 06 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:
 	// QUESTION: How much memory will this program consume upon reaching this  
 point?

There's some serious bug here. It allocates 40+ MB. The following code even crashes LDC during the compilation, I'll ask in the LDC channel: struct S { ubyte[40_000] d; } void main() { S[] a; a ~= S(); } Bye and thank you for the little test, bearophile
Jun 06 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
 The following code even crashes LDC during the compilation, I'll ask in the
LDC channel:<

The good ChristianK has already added it: http://www.dsource.org/projects/ldc/ticket/320 Bye, bearophile
Jun 06 2009
parent reply Christian Kamm <kamm-incasoftware removethis.de> writes:
bearophile wrote:

 The following code even crashes LDC during the compilation, I'll ask in
 the LDC channel:<

The good ChristianK has already added it: http://www.dsource.org/projects/ldc/ticket/320

And thankfully Frits van Bommel has already fixed it: it consumes about 40kb of heap memory at runtime now. This seems to be because we don't use the _d_arrayappend functions at all but emit a _d_arraysetlength instead. What's the memory growing behavior of that function? Should this be considered a bug?
Jun 07 2009
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-07 10:47:47 +0200, Christian Kamm 
<kamm-incasoftware removethis.de> said:

 bearophile wrote:
 
 The following code even crashes LDC during the compilation, I'll ask in
 the LDC channel:<

The good ChristianK has already added it: http://www.dsource.org/projects/ldc/ticket/320

And thankfully Frits van Bommel has already fixed it: it consumes about 40kb of heap memory at runtime now. This seems to be because we don't use the _d_arrayappend functions at all but emit a _d_arraysetlength instead. What's the memory growing behavior of that function? Should this be considered a bug?

I would say that setlength should not allocate extra space, because one should trust that the user knows his needs, whereas if the array grows while appending then it is reasonable to add extra space because the user is probably appending without really knowing/caring about array resizes. So it is reasonable (and probably good) to try to second guess him, and add extra space. This way repeatedly appending to that array has a good performance. Thus yes I would say that it should be considered a bug (that will degrade repeated appending performance). The solution is to use a better newCapacity function, instead of the flawed one. Fawzi
Jun 07 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
Fawzi Mohamed wrote:
 On 2009-06-07 10:47:47 +0200, Christian Kamm 
 <kamm-incasoftware removethis.de> said:
 
 bearophile wrote:

 The following code even crashes LDC during the compilation, I'll ask in
 the LDC channel:<

The good ChristianK has already added it: http://www.dsource.org/projects/ldc/ticket/320

And thankfully Frits van Bommel has already fixed it: it consumes about 40kb of heap memory at runtime now. This seems to be because we don't use the _d_arrayappend functions at all but emit a _d_arraysetlength instead. What's the memory growing behavior of that function? Should this be considered a bug?

I would say that setlength should not allocate extra space, because one should trust that the user knows his needs

char[] x; x.length = 100; // setlength allocates memory
Jun 08 2009
parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-08 15:55:26 +0200, Sean Kelly <sean invisibleduck.org> said:

 Fawzi Mohamed wrote:
 On 2009-06-07 10:47:47 +0200, Christian Kamm 
 <kamm-incasoftware removethis.de> said:
 
 bearophile wrote:
 
 The following code even crashes LDC during the compilation, I'll ask in
 the LDC channel:<

The good ChristianK has already added it: http://www.dsource.org/projects/ldc/ticket/320

And thankfully Frits van Bommel has already fixed it: it consumes about 40kb of heap memory at runtime now. This seems to be because we don't use the _d_arrayappend functions at all but emit a _d_arraysetlength instead. What's the memory growing behavior of that function? Should this be considered a bug?

I would say that setlength should not allocate extra space, because one should trust that the user knows his needs

char[] x; x.length = 100; // setlength allocates memory

I thought it was clear that I meant that (apart some rounding up done by the gc), the allocation created by x.length should no allocate *extra* space (meaning more that needed), because it is quite clear that the user wants that much space, and the likelihood that he doesn't really know, and some extra space would be good is not so high. Thus adding space would most likely be wasted. If the user appends to an array, and reallocation is needed on the other hand, the probability that that array will be appended again is much higher, so it makes sense to allocate some extra space. This makes continuously appending to an array perform O(log(N)) reallocations. In any case I fixed the newCapacity function in tango, with one of my guesses for a good function. But if anybody comes with some benchmarks that suggest and improvement I will gladly do it. Fawzi
Jun 08 2009
prev sibling next sibling parent reply "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sat, 06 Jun 2009 16:17:10 +0300, bearophile <bearophileHUGS lycos.com>  
wrote:
 There's some serious bug here. It allocates 40+ MB.

Um, it should be much more than that. What are you running? -- Best regards, Vladimir mailto:thecybershadow gmail.com
Jun 06 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:
Um, it should be much more than that. What are you running?<

I am running DMD v1.042 with Phobos on WinXP, the compilation needs only tents of a second, and at runtime it allocated 48.3 MB. DMD v2.030 with Phobos needs 49.028 MB and the compilation is a bit slower, 0.35 seconds. Bye, bearophile
Jun 06 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Vladimir Panteleev:
 Ah, that's just the working set. Have a look at the virtual memory usage.<

How can I tell them apart? How can I measure that ' virtual memory usage' on Windows? Bye, bearophile
Jun 06 2009
next sibling parent reply BCS <none anon.com> writes:
Hello bearophile,

 Vladimir Panteleev:
 
 Ah, that's just the working set. Have a look at the virtual memory
 usage.<
 

usage' on Windows? Bye, bearophile

bring up task manager
Jun 06 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
BCS:
 bring up task manager

That's what I have done to take those numbers. Then I have used another small program that has given me similar numbers... Bye, bearophile
Jun 06 2009
parent bearophile <bearophileHUGS lycos.com> writes:
davidl:
 You need to bring up the column of virtual memory usage

Ah, thank you for the suggestion\tip. Now it shows it's using 1033 MB of virtual memory! Bye, bearophile
Jun 06 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, Jun 6, 2009 at 10:53 AM, bearophile<bearophileHUGS lycos.com> wrote:
 BCS:
 bring up task manager

That's what I have done to take those numbers. Then I have used another small program that has given me similar numbers...

You can add extra columns to the task manager to see all sorts of information. That, or use procexp.
Jun 06 2009
prev sibling parent davidl <davidl nospam.org> writes:
在 Sat, 06 Jun 2009 22:53:27 +0800,bearophile <bearophileHUGS lycos.com>  
写道:

 BCS:
 bring up task manager

That's what I have done to take those numbers. Then I have used another small program that has given me similar numbers... Bye, bearophile

You need to bring up the column of virtual memory usage -- 使用 Opera 革命性的电子邮件客户程序: http://www.opera.com/mail/
Jun 06 2009
prev sibling next sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sat, 06 Jun 2009 17:11:45 +0300, bearophile <bearophileHUGS lycos.com>  
wrote:

 Vladimir Panteleev:
 Um, it should be much more than that. What are you running?<

I am running DMD v1.042 with Phobos on WinXP, the compilation needs only tents of a second, and at runtime it allocated 48.3 MB. DMD v2.030 with Phobos needs 49.028 MB and the compilation is a bit slower, 0.35 seconds.

Ah, that's just the working set. Have a look at the virtual memory usage. -- Best regards, Vladimir mailto:thecybershadow gmail.com
Jun 06 2009
prev sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, Jun 6, 2009 at 10:19 AM, Vladimir
Panteleev<thecybershadow gmail.com> wrote:
 On Sat, 06 Jun 2009 17:11:45 +0300, bearophile <bearophileHUGS lycos.com>
 wrote:

 Vladimir Panteleev:
 Um, it should be much more than that. What are you running?<

I am running DMD v1.042 with Phobos on WinXP, the compilation needs only tents of a second, and at runtime it allocated 48.3 MB. DMD v2.030 with Phobos needs 49.028 MB and the compilation is a bit slower, 0.35 seconds.

Ah, that's just the working set. Have a look at the virtual memory usage. -- Best regards, =A0Vladimir =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0mailto:the=


1.5GB. Wow.
Jun 06 2009
prev sibling next sibling parent reply Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else =A0 =A0 =A0 =A0 =A0 import std.stdio;

 struct S
 {
 =A0 =A0 =A0 =A0ubyte[40_000] data;
 }

 void main()
 {
 =A0 =A0 =A0 =A0S[] a;
 =A0 =A0 =A0 =A0a ~=3D S();

 =A0 =A0 =A0 =A0// QUESTION: How much memory will this program consume upo=

 this point?
 =A0 =A0 =A0 =A0version(Tango) Cin.copyln();
 =A0 =A0 =A0 =A0else =A0 =A0 =A0 =A0 =A0 readln();
 }

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult =3D 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.
Jun 06 2009
next sibling parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-06 17:12:58 +0200, Jarrett Billingsley 
<jarrett.billingsley gmail.com> said:

 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else      import std.stdio;
 
 struct S
 {
    爑byte[40_000] data;
 }
 
 void main()
 {
    燬[] a;
    燼 ~= S();
 
    // QUESTION: How much memory will this program consume upo

 this point?
    爒ersion(Tango) Cin.copyln();
    爀lse      readln();
 }
 

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

Indeed we were discussing this in the IRC, Actually it is interesting to note that the continuos function written as comment in newCapacity double mult2 = 1.0 + (size / log10(pow(newcap * 2.0,2.0))); does *not* have that behaviour. It seems to me that it is generally much better to work on the total memory rather than on the number of elements. I would use something like long mult = 100 + 200L / log2plus2(newcap) and round up newext = cast(size_t)((newcap * mult) / 100); newext += size-(newext % size); This is what I am adding in tango. One could add something that further favors large sizes, but I miss the rationale behind that, I would rather expect that one typically concatenates strings (size=1..4) and so there is more to gain by making that faster. I can also understand if someone wants to use only the number of elements (rather than the total size), but what was implemented wasn't that either. If someone has some insight, or good benchmarks to choose a better function it would be welcome. Fawzi
Jun 06 2009
parent Fawzi Mohamed <fmohamed mac.com> writes:
 Indeed we were discussing this in the IRC,
 Actually it is interesting to note that the continuos function written 
 as comment in newCapacity
 	double mult2 = 1.0 + (size / log10(pow(newcap * 2.0,2.0)));
 does *not* have that behaviour.
 It seems to me that it is generally much better to work on the total 
 memory rather than on the number of elements.
 I would use something like
             long mult = 100 + 200L / log2plus2(newcap)
 and round up
             newext = cast(size_t)((newcap * mult) / 100);
             newext += size-(newext % size);
 
 This is what I am adding in tango.

thinking more about this, given that one starts at pagesize ~4024, log2=12 this might be too conservative, I will test a little bit more...
 One could add something that further favors large sizes, but I miss the 
 rationale behind that, I would rather expect that one typically 
 concatenates strings (size=1..4) and so there is more to gain by making 
 that faster.
 I can also understand if someone wants to use only the number of 
 elements (rather than the total size), but what was implemented wasn't 
 that either.

maybe the number of elements is really the correct thing to do...
 
 If someone has some insight, or good benchmarks to choose a better 
 function it would be welcome.
 
 Fawzi

Jun 06 2009
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Jarrett Billingsley wrote:
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

I've been debating for a while whether newCapacity shoulds exist at all. The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it? Particularly in D2 where append operations on arrays are probably less common as a result of string being invariant. I'll fix newCapacity and run some tests, depending on their result I may disable it entirely.
Jun 06 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Sean Kelly wrote:
 Jarrett Billingsley wrote:
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon 
 reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

I've been debating for a while whether newCapacity shoulds exist at all. The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it? Particularly in D2 where append operations on arrays are probably less common as a result of string being invariant. I'll fix newCapacity and run some tests, depending on their result I may disable it entirely.

I think that's a great point. Andrei
Jun 06 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Sean Kelly:
 Particularly in D2 where append 
 operations on arrays are probably less common as a result of string 
 being invariant.

They aren't much common maybe because they are currently dead-slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common... Bye, bearophile
Jun 06 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append 
 operations on arrays are probably less common as a result of string 
 being invariant.

They aren't much common maybe because they are currently dead-slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

Yes but appending to an immutable string is never performed in place, which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.
Jun 06 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:
 
 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

What gave you that idea? void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); }

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.
Jun 06 2009
parent Sean Kelly <sean invisibleduck.org> writes:
Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 21:59:39 -0700, Sean Kelly wrote:
 
 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

Oh, I know. It's a long-standing issue with immutability, but I think if appending gets fixed as Andrei suggests, this should be fixed as well. I was just saying that your statement about immutable data never being appended in-place was false.

Oops, I replied based on an educated guess--I should have tested it. Still, immutable arrays are hardly immutable if an append can alter their contents.
Jun 07 2009
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean invisibleduck.org>
 wrote:
 
 Jarrett Billingsley wrote:
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon 
 reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

_d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

I've been debating for a while whether newCapacity shoulds exist at all. The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?

Do you mean at the end of pages, or pools?

Blocks. Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get the "double in size" behavior in Java even without newCapacity. newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.
 If you mean that DMD will allocate memory pools larger than the immediate
 memory requirement, that's true, however D's GC is "greedy" in that it
 always allocates memory at the lowest address possible. This means that
 when all previous pools fill up, the next object will "cap" the new array,
 and the array will no longer be able to extend.

This is a quirk of the current GC. A new GC may not behave the same way (in fact I can guarantee that the one I'm working on is implemented differently).
 So, I don't think that your idea is feasable with the current GC
 implementation.

I'm not sure I understand. The only "idea" I proposed was to get rid of newCapacity.
 I wonder how much would things get improved if objects
 would be divided between "growing" and "non-growing", with the GC
 prioritizing allocating new objects in free space not directly following
 "growing" objects.

The user already knows best which arrays are going to grow though. Is this really something the runtime should try to figure out?
Jun 06 2009
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-06 21:07:40 +0200, Sean Kelly <sean invisibleduck.org> said:

 Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean invisibleduck.org>
 wrote:


 
 I've been debating for a while whether newCapacity shoulds exist at 
 all.   The GC already tends to leave free space at the end of blocks, 
 so why should the runtime second-guess it?

Do you mean at the end of pages, or pools?

Blocks. Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get the "double in size" behavior in Java even without newCapacity. newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.

at the moment the behavior is exactly the opposite (leaving the small array to the GC handling you just sketched)), only array larger than a page have the special treatment, I think that the idea is that appending to large arrays can be potentially very expensive, so those get the special treatment. But as I said previously I don't fully understand the rationale, especially behind the idea of having more reserved space if the elements are larger, I could understand having the reserved space just referring to the elements, like this: long mult = 100 + 200L / log2plus2(newlength) instead of long mult = 100 + 1000L / log2plus2(newcap) or something in between these give at most 1.5 or 1.71, so a waste of at most 100% or 71% and decreases as 1/log toward 1.02. To really test this one should benchmark some "typical" applications. The current choice is neither of these, I don't know exactly why this high weight to the high size elements. I think it is an error, but maybe there was an idea (at least for smallish sizes), that I don't see. Fawzi
 
 If you mean that DMD will allocate memory pools larger than the immediate
 memory requirement, that's true, however D's GC is "greedy" in that it
 always allocates memory at the lowest address possible. This means that
 when all previous pools fill up, the next object will "cap" the new array,
 and the array will no longer be able to extend.

This is a quirk of the current GC. A new GC may not behave the same way (in fact I can guarantee that the one I'm working on is implemented differently).
 So, I don't think that your idea is feasable with the current GC
 implementation.

I'm not sure I understand. The only "idea" I proposed was to get rid of newCapacity.
 I wonder how much would things get improved if objects
 would be divided between "growing" and "non-growing", with the GC
 prioritizing allocating new objects in free space not directly following
 "growing" objects.

The user already knows best which arrays are going to grow though. Is this really something the runtime should try to figure out?

Jun 06 2009
parent reply Sean Kelly <sean invisibleduck.org> writes:
Fawzi Mohamed wrote:
 On 2009-06-06 21:07:40 +0200, Sean Kelly <sean invisibleduck.org> said:
 
 Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean invisibleduck.org>
 wrote:


 I've been debating for a while whether newCapacity shoulds exist at 
 all.   The GC already tends to leave free space at the end of 
 blocks, so why should the runtime second-guess it?

Do you mean at the end of pages, or pools?

Blocks. Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get the "double in size" behavior in Java even without newCapacity. newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.

at the moment the behavior is exactly the opposite (leaving the small array to the GC handling you just sketched)), only array larger than a page have the special treatment, I think that the idea is that appending to large arrays can be potentially very expensive, so those get the special treatment.

Hm. I still think we'd be better off letting the user handle this. If they're going to be appending and performance is important then they'll use an ArrayAppender anyway. I'd rather not have extra space tacked onto the end of every array I create "just in case" I decide to append to it.
Jun 06 2009
parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-07 04:07:45 +0200, Sean Kelly <sean invisibleduck.org> said:

 Fawzi Mohamed wrote:
 On 2009-06-06 21:07:40 +0200, Sean Kelly <sean invisibleduck.org> said:
 
 Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean invisibleduck.org>
 wrote:


 
 I've been debating for a while whether newCapacity shoulds exist at 
 all.   The GC already tends to leave free space at the end of blocks, 
 so why should the runtime second-guess it?

Do you mean at the end of pages, or pools?

Blocks. Blocks less than 4k in the current allocator are sized in powers of two, so array appends already get the "double in size" behavior in Java even without newCapacity. newCapacity just has the potential to throw an array into the next larger block early, thus potentially wasting more space if the array will never be appended to. I think newCapacity isn't supposed to reserve extra space for blocks larger than 4k, but it's been a while since I've looked at it.

at the moment the behavior is exactly the opposite (leaving the small array to the GC handling you just sketched)), only array larger than a page have the special treatment, I think that the idea is that appending to large arrays can be potentially very expensive, so those get the special treatment.

Hm. I still think we'd be better off letting the user handle this. If they're going to be appending and performance is important then they'll use an ArrayAppender anyway. I'd rather not have extra space tacked onto the end of every array I create "just in case" I decide to append to it.

well it isn't so bad, newCapacity is used only when the use *is* adding in place, and the array is not large enough. Fawzi
Jun 07 2009
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
Steve Schveighoffer wrote:
 On Mon, 08 Jun 2009 00:34:10 +0400, Denis Koroskin wrote:
 
 On Sun, 07 Jun 2009 15:47:52 +0400, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:

 On Sat, 06 Jun 2009 21:59:39 -0700, Sean Kelly wrote:

 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

 void main()
 {
   auto str1 = "hello".idup;
   auto str2 = str1;
   str1 ~= "world";
   assert(str1.ptr == str2.ptr);
 }

auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

if appending gets fixed as Andrei suggests, this should be fixed as well. I was just saying that your statement about immutable data never being appended in-place was false. -Steve


By-design behavior. See http://www.digitalmars.com/d/2.0/ arrays.html#resize poorly designed, perhaps, but it's by design. My point was that there's no special treatment of immutable data (as was suggested by Sean), it suffers from the same issues as mutable appending. BTW, I'm not in favor of the current behavior, but I do think that if something is fixed for this array allocation issue, it should cover this problem as well. -Steve

This problem was one of the main drivers behind the proposal to formally separate arrays and slices (the T[new] vs T[] part of the D 2.0 talk). It's kinda fallen down on the todo list, but it's a pretty key usability problem, imho. Later, Brad
Jun 07 2009
prev sibling parent reply Lionello Lunesu <lio lunesu.remove.com> writes:
Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 18:12:58 +0300, Jarrett Billingsley 
 <jarrett.billingsley gmail.com> wrote:
 
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon 
 reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

Bulls-eye. I think mr. Dave Fladebo deserves a math lesson or something, and mr. Walter Bright should review patches that go into the language's runtime more carefully. :P Can we discuss a suitable replacement? I suggested in #d that we look at how other platforms do it. For example, Java's Vector just doubles its capacity by default ( http://java.sun.com/javase/6/docs/api/java/util/Vector.html ).

In my own array classes I'm using Python's: size += max(size>>3, 8); Using this, you won't end up wasting a lot of memory. Preallocating is always preferred anyway. L.
Jun 06 2009
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-07 02:23:47 +0200, Lionello Lunesu <lio lunesu.remove.com> said:

 Vladimir Panteleev wrote:
 On Sat, 06 Jun 2009 18:12:58 +0300, Jarrett Billingsley 
 <jarrett.billingsley gmail.com> wrote:
 
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;
 
 struct S
 {
        ubyte[40_000] data;
 }
 
 void main()
 {
        S[] a;
        a ~= S();
 
        // QUESTION: How much memory will this program consume upon reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }
 

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

Bulls-eye. I think mr. Dave Fladebo deserves a math lesson or something, and mr. Walter Bright should review patches that go into the language's runtime more carefully. :P Can we discuss a suitable replacement? I suggested in #d that we look at how other platforms do it. For example, Java's Vector just doubles its capacity by default ( http://java.sun.com/javase/6/docs/api/java/util/Vector.html ).

In my own array classes I'm using Python's: size += max(size>>3, 8); Using this, you won't end up wasting a lot of memory. Preallocating is always preferred anyway. L.

actually the original function proposed by Dave Fladebo is not so bad, it calculates the average storage requirement per bit (total_size/num_bits) and tries to allocate that space more. The advantage is that it stays exponential like which means ~O(log(n)) (maybe log(n)*log(log(n))) reallocations needed when continuously adding, but tries to waste less and less space the larger the array is. I did propose two "reasonable" possibilities, that I give again here in a cleaner way {{{ const int b=2; version(A){ const a =1000L; } else { const a=100L; } static int log2plusB(size_t c) { int i=b; while(c >>= 1){ ++i; } return i; } variant(A) long mult = 100 + a*b / log2plusB(newcap); } else { long mult = 100 + a*b / log2plusB(newlength); } // testing shows 1.02 for large arrays is about the point of diminishing return if (mult < 102) mult = 102; newext = cast(size_t)((newcap * mult) / 100); newext += size-(newext % size); }}} version A uses the total memory, the other the number of elements. Increasing a makes the the amount of reserved space larger (it is the maximum amount of extra reserved space in %). The value of a I choose is different from one version to the other because min(log2(newcap))=~12, min(log2(newlength))=1 Increasing b makes the decrease flatter, and goes closer to treating all sizes with the same extra space. For small b the decrease for larger arrays is faster. In general I think that the functional form is sound, and it is superior to either fixed "double the size" or "add a constant size". This approach should be used also in other places that need to periodically reallocate. The question is which values of a and b are optimal, and if it is better to use the total size or the number of elements. Now if someone would to some meaningful benchmarks (on real applications), then I would be very happy, otherwise I would just take one of my guesses... Fawzi
Jun 07 2009
parent Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-06-07 11:33:06 +0200, Fawzi Mohamed <fmohamed mac.com> said:

Ehm some small corrections to the constants, so that their meaning is 
better connected to what one expects in both versions

 actually the original function proposed by Dave Fladebo is not so bad, 
 it calculates the average storage requirement per bit 
 (total_size/num_bits) and tries to allocate that space more.
 The advantage is that it stays exponential like which means ~O(log(n)) 
 (maybe log(n)*log(log(n))) reallocations needed when continuously 
 adding, but tries to waste less and less space the larger the array is.
 I did propose two "reasonable" possibilities, that I give again here in 
 a cleaner way

const int b=2; const a =100L; version(A){ const minBits=11; // normally page size is at least 2K } else { const minBits=1; } static int log2plusB(size_t c) { int i=b; while(c >>= 1){ ++i; } return i; } variant(A) long mult = 100 + a*(minBits+b) / log2plusB(newcap); } else { long mult = 100 + a*(minBits+b) / log2plusB(newlength); } // testing shows 1.02 for large arrays is about the point of diminishing return if (mult < 102) mult = 102; newext = cast(size_t)((newcap * mult) / 100); newext += size-(newext % size); }}}
 version A uses the total memory, the other the number of elements.
 
 Increasing a makes the the amount of reserved space larger (it is the 
 maximum amount of extra reserved space in %).

The value of minBits I choose is different from one version to the other because min(log2(newcap))=~12, min(log2(newlength))=1
 Increasing b makes the decrease flatter, and goes closer to treating 
 all sizes with the same extra space. For small b the decrease for 
 larger arrays is faster.
 
 In general I think that the functional form is sound, and it is 
 superior to either fixed "double the size" or "add a constant size". 
 This approach should be used also in other places that need to 
 periodically reallocate.
 
 The question is which values of a and b are optimal, and if it is 
 better to use the total size or the number of elements.
 Now if someone would to some meaningful benchmarks (on real 
 applications), then I would be very happy, otherwise I would just take 
 one of my guesses...
 
 Fawzi

Jun 07 2009
prev sibling next sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sat, 06 Jun 2009 18:12:58 +0300, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon  
 reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

There seems to be something wrong with the newCapacity function that _d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

Bulls-eye. I think mr. Dave Fladebo deserves a math lesson or something, and mr. Walter Bright should review patches that go into the language's runtime more carefully. :P Can we discuss a suitable replacement? I suggested in #d that we look at how other platforms do it. For example, Java's Vector just doubles its capacity by default ( http://java.sun.com/javase/6/docs/api/java/util/Vector.html ). -- Best regards, Vladimir mailto:thecybershadow gmail.com
Jun 06 2009
prev sibling next sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sat, 06 Jun 2009 20:19:45 +0300, Sean Kelly <sean invisibleduck.org>
wrote:

 Jarrett Billingsley wrote:
 On Sat, Jun 6, 2009 at 8:03 AM, Vladimir
 Panteleev<thecybershadow gmail.com> wrote:
 // Works for DMD1/Phobos, DMD1/Tango and DMD2/Phobos
 version(Tango) import tango.io.Console;
 else           import std.stdio;

 struct S
 {
        ubyte[40_000] data;
 }

 void main()
 {
        S[] a;
        a ~= S();

        // QUESTION: How much memory will this program consume upon  
 reaching
 this point?
        version(Tango) Cin.copyln();
        else           readln();
 }

_d_arrayappendcT calls. From an element size of 20000 (I halved it just to make the allocation faster) and an array length of 1, it somehow calculates the new size to be 266686600. Hm. That seems a bit off. It seems this line: long mult = 100 + (1000L * size) / log2plus1(newcap); is to blame. I don't think large value types were taken into account here. The resulting multiplier is 1,333,433, which is hilariously large.

I've been debating for a while whether newCapacity shoulds exist at all. The GC already tends to leave free space at the end of blocks, so why should the runtime second-guess it?

Do you mean at the end of pages, or pools? Pages are only 4K in size, causing a reallocation on every 4K doesn't make sense performance-wise. If you mean that DMD will allocate memory pools larger than the immediate memory requirement, that's true, however D's GC is "greedy" in that it always allocates memory at the lowest address possible. This means that when all previous pools fill up, the next object will "cap" the new array, and the array will no longer be able to extend. Allow me to demonstrate: ----------------------- import std.gc; import std.stdio; struct S1 { ubyte[4095] data; } struct S2 { ubyte[4095*4] data; } void main() { S1[] a; S2*[] b; for (int i=0;i<1024;i++) { a ~= S1(); b ~= new S2; writefln("%d", i); } } ----------------------- This program allocates 4-page blocks and appends 1-page blocks to an array in a loop. Here's a Diamond[1] analysis screenshot from before and after the first two garbage collects: http://dump.thecybershadow.net/b4af5badf32c954b7a18b848b7d9da64/1.png The P+++ segments are S2 instances. The segments with the longer tails of +es are copies of S1[]. As you can see, as soon as the previous pools fill up, the pool containing the S1[] gets rapidly filled, because it's just a loop of reallocating S1[] every time an S2[] is allocated at its end. So, I don't think that your idea is feasable with the current GC implementation. I wonder how much would things get improved if objects would be divided between "growing" and "non-growing", with the GC prioritizing allocating new objects in free space not directly following "growing" objects. [1] http://www.dsource.org/projects/diamond -- Best regards, Vladimir mailto:thecybershadow gmail.com
Jun 06 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, Jun 6, 2009 at 1:42 PM, bearophile<bearophileHUGS lycos.com> wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

They aren't much common maybe because they are currently dead-slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

Especially since D2 has one in the stdlib. I always find myself writing my own anyway, since I don't like depending on unspecified heap behavior.
Jun 06 2009
prev sibling next sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Sat, 06 Jun 2009 22:07:40 +0300, Sean Kelly <sean invisibleduck.org>  
wrote:

 Blocks.  Blocks less than 4k in the current allocator are sized in  
 powers of two, so array appends already get the "double in size"  
 behavior in Java even without newCapacity.  newCapacity just has the  
 potential to throw an array into the next larger block early, thus  
 potentially wasting more space if the array will never be appended to. I  
 think newCapacity isn't supposed to reserve extra space for blocks  
 larger than 4k, but it's been a while since I've looked at it.

Yes, but you do agree that the penalty of reallocating every time the array size goes over a 4kB boundary (in the worst case of heap fragmentation similar like the one I demonstrated) is not acceptable?
 This is a quirk of the current GC.  A new GC may not behave the same way  
 (in fact I can guarantee that the one I'm working on is implemented  
 differently).

Could you tell us more about that? I was toying with a new GC idea myself (since last year) but haven't gotten around to finishing it yet.
 I'm not sure I understand.  The only "idea" I proposed was to get rid of  
 newCapacity.

I understood as you wanting to change the code so it would behave as if newCapacity always returned newlength * size.
 The user already knows best which arrays are going to grow though.  Is  
 this really something the runtime should try to figure out?

No, but then you'll need to change the language specification to allow the user to specify growable arrays. I guess the proper solution here is to force the user to use specialized classes with their own "newCapacity" implementations. -- Best regards, Vladimir mailto:thecybershadow gmail.com
Jun 06 2009
prev sibling next sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

They aren't much common maybe because they are currently dead-slow. Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

Yes but appending to an immutable string is never performed in place, which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

What gave you that idea? void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); } -Steve
Jun 06 2009
prev sibling next sibling parent davidl <davidl nospam.org> writes:
在 Sun, 07 Jun 2009 12:59:39 +0800,Sean Kelly <sean invisibleduck.org>  
写道:

 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); }

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

Oh, file a bug report! you find the bug! -- 使用 Opera 革命性的电子邮件客户程序: http://www.opera.com/mail/
Jun 06 2009
prev sibling next sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Sat, 06 Jun 2009 21:59:39 -0700, Sean Kelly wrote:

 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:
 
 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

What gave you that idea? void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); }

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

Oh, I know. It's a long-standing issue with immutability, but I think if appending gets fixed as Andrei suggests, this should be fixed as well. I was just saying that your statement about immutable data never being appended in-place was false. -Steve
Jun 07 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sun, 07 Jun 2009 15:47:52 +0400, Steve Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Sat, 06 Jun 2009 21:59:39 -0700, Sean Kelly wrote:

 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of string
 being invariant.

Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

What gave you that idea? void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); }

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

Oh, I know. It's a long-standing issue with immutability, but I think if appending gets fixed as Andrei suggests, this should be fixed as well. I was just saying that your statement about immutable data never being appended in-place was false. -Steve

Your proof relies on buggy behavior.
Jun 07 2009
prev sibling parent Steve Schveighoffer <schveiguy yahoo.com> writes:
On Mon, 08 Jun 2009 00:34:10 +0400, Denis Koroskin wrote:

 On Sun, 07 Jun 2009 15:47:52 +0400, Steve Schveighoffer
 <schveiguy yahoo.com> wrote:
 
 On Sat, 06 Jun 2009 21:59:39 -0700, Sean Kelly wrote:

 Steve Schveighoffer wrote:
 On Sat, 06 Jun 2009 12:03:03 -0700, Sean Kelly wrote:

 bearophile wrote:
 Sean Kelly:
 Particularly in D2 where append
 operations on arrays are probably less common as a result of
 string being invariant.

Appending to an immutable string is a common operation. But I guess Array appenders will get more common...

place, which is the only time the extra space reserved by newCapacity matters. I suspect the memory wasted by newCapacity is more of an issue than any time savings it provides.

What gave you that idea? void main() { auto str1 = "hello".idup; auto str2 = str1; str1 ~= "world"; assert(str1.ptr == str2.ptr); }

auto str1 = "hello".idup; auto str2 = str3 = str1; str2 ~= " world"; str3 ~= " garbage"; Doesn't seem terribly safe to me.

Oh, I know. It's a long-standing issue with immutability, but I think if appending gets fixed as Andrei suggests, this should be fixed as well. I was just saying that your statement about immutable data never being appended in-place was false. -Steve

Your proof relies on buggy behavior.

By-design behavior. See http://www.digitalmars.com/d/2.0/ arrays.html#resize poorly designed, perhaps, but it's by design. My point was that there's no special treatment of immutable data (as was suggested by Sean), it suffers from the same issues as mutable appending. BTW, I'm not in favor of the current behavior, but I do think that if something is fixed for this array allocation issue, it should cover this problem as well. -Steve
Jun 07 2009