digitalmars.D.learn - long compile time question

"Dan" <dbdavidson yahoo.com> writes:

The following takes nearly three minutes to compile.
The culprit is the line barr ~= B();
What is wrong with this?

Thanks,
Dan
----------------
struct B {
   const size_t SIZE = 1024*64;
   int[SIZE] x;
}

void main() {
   B[] barr;
   barr ~= B();
}
-----------------

Oct 23 2012

1100110 <0b1100110 gmail.com> writes:

On Tue, 23 Oct 2012 22:50:46 -0500, Dan <dbdavidson yahoo.com> wrote:

 The following takes nearly three minutes to compile.
 The culprit is the line barr ~= B();
 What is wrong with this?

 Thanks,
 Dan
 ----------------
 struct B {
    const size_t SIZE = 1024*64;
    int[SIZE] x;
 }

 void main() {
    B[] barr;
    barr ~= B();
 }
 -----------------

I have the same issue on linux x64 2.060

So appending to a dynamic array isn't really that efficient.
But this goes WAY over that line.  I'm timing your test now.

It's still going...



Oct 23 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:
 The following takes nearly three minutes to compile.
 The culprit is the line barr ~= B();
 What is wrong with this?


 I have the same issue on linux x64 2.060

 So appending to a dynamic array isn't really that efficient. 
 But this goes WAY over that line.  I'm timing your test now.

 It's still going...

  It appears all the time is spent copying the struct's init value; why, 
I am not sure.

[code]
struct B {
   enum SIZE = 1024 * 64;
   int[SIZE] x;
}

//same timing issue, no array involved
void test(B b) {}

void main() {
   test(B());
}
[/code]

  I've changed the *64 to various multipliers and gotten curious 
results. Each row is multiplier: compile time, and mb is the (estimated) 
memory footprint of the compiler during the build.

1:   0m0.725s   mb:??
2:   0m1.414s   mb:??
4:   0m2.620s   mb:28
8:   0m8.937s   mb:30
16:  0m35.869s  mb:34
32:  2m36.922s  mb:42
64:  9m27.353s  mb:56
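
A quick way to read these numbers is to look at how much the time grows each time the multiplier doubles. The sketch below does that arithmetic on the timings above; `growthFactors` is a made-up helper name, and the conclusion is only a reading of these seven data points, not something measured separately.

```d
import std.stdio : writefln;

// Hypothetical helper: ratio of each timing to the previous one.
double[] growthFactors(const double[] secs) {
    double[] result;
    foreach (i; 1 .. secs.length)
        result ~= secs[i] / secs[i - 1];
    return result;
}

void main() {
    // Multipliers and compile times (converted to seconds) from the table above.
    immutable int[] mult = [1, 2, 4, 8, 16, 32, 64];
    immutable double[] secs =
        [0.725, 1.414, 2.620, 8.937, 35.869, 156.922, 567.353];

    auto factors = growthFactors(secs);
    foreach (i, f; factors)
        writefln("%2d -> %2d: x%.2f", mult[i], mult[i + 1], f);
}
```

From a multiplier of 4 upward, each doubling costs somewhere between about 3.4x and 4.4x, which would be consistent with compile time growing roughly with the square of the array size.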

Oct 24 2012

"Maxim Fomin" <maxim maxim-fomin.ru> writes:

On Wednesday, 24 October 2012 at 09:50:38 UTC, Era Scarecrow 
wrote:
 On Wednesday, 24 October 2012 at 04:49:19 UTC, 1100110 wrote:
 The following takes nearly three minutes to compile.
 The culprit is the line barr ~= B();
 What is wrong with this?


 I have the same issue on linux x64 2.060

 So appending to a dynamic array isn't really that efficient. 
 But this goes WAY over that line.  I'm timing your test now.

 It's still going...

  It appears it's all happening during copying init, why I am 
 not sure.

 [code]
 struct B {
   enum SIZE = 1024 * 64;
   int[SIZE] x;
 }

 //same timing issue, no array involved
 void test(B b) {}

 void main() {
   test(B());
 }
 [/code]

  I've changed the *64 to various numbers and gotten curious 
 results. The mb is the noted (estimated) memory footprint it 
 used during compiling.

 1:   0m0.725s   mb:??
 2:   0m1.414s   mb:??
 4:   0m2.620s   mb:28
 8:   0m8.937s   mb:30
 16:  0m35.869s  mb:34
 32:  2m36.922s  mb:42
 64:  9m27.353s  mb:56

According to the assembly, dmd generates one store instruction per 
element to zero the memory instead of emitting a loop. A workaround is 
to give the array a void initializer and then zero it in a loop at run time.
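
A minimal sketch of that workaround, assuming the struct from the earlier test case. The helper `zero` is a hypothetical name, not something from the thread or from Phobos.

```d
struct B {
    enum SIZE = 1024 * 64;
    int[SIZE] x = void; // no default initializer is generated for x
}

// Hypothetical helper: zero the array at run time with an ordinary loop.
void zero(ref B b) {
    foreach (ref e; b.x)
        e = 0;
}

void main() {
    B[] barr;
    barr ~= B();     // contents of x are unspecified at this point
    zero(barr[0]);   // make them well-defined before any read
    assert(barr[0].x[0] == 0 && barr[0].x[B.SIZE - 1] == 0);
}
```

The slice assignment `b.x[] = 0;` should do the same job as the explicit loop. Either way, every B has to be zeroed by hand before it is read, since B.init no longer guarantees anything about x.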

Oct 24 2012

"thedeemon" <dlang thedeemon.com> writes:

On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
 The following takes nearly three minutes to compile.
 The culprit is the line barr ~= B();
 What is wrong with this?

 Thanks,
 Dan
 ----------------
 struct B {
   const size_t SIZE = 1024*64;
   int[SIZE] x;
 }

 void main() {
   B[] barr;
   barr ~= B();
 }
 -----------------

The code DMD generates for initializing the struct does not use 
loops, so it's
xor     ecx, ecx
mov     [eax], ecx
mov     [eax+4], ecx
mov     [eax+8], ecx
mov     [eax+0Ch], ecx
mov     [eax+10h], ecx
mov     [eax+14h], ecx
mov     [eax+18h], ecx
mov     [eax+1Ch], ecx
mov     [eax+20h], ecx
mov     [eax+24h], ecx
mov     [eax+28h], ecx
mov     [eax+2Ch], ecx
mov     [eax+30h], ecx
mov     [eax+34h], ecx
mov     [eax+38h], ecx
...

So your code creates a lot of work for the compiler.
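
For scale, here is a rough count of what that listing implies, assuming one 4-byte store per int element as shown. The name `stores` is made up for this sketch, and nothing here constructs a B, so it should not hit the slow path itself.

```d
struct B {
    enum SIZE = 1024 * 64;
    int[SIZE] x;
}

// One store per int element, as in the listing above.
enum stores = B.sizeof / int.sizeof;
static assert(stores == 65_536);
static assert(B.sizeof == 256 * 1024); // 256 KiB per B

void main() {}
```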

Oct 24 2012

Don Clugston <dac nospam.com> writes:

On 24/10/12 17:39, thedeemon wrote:
 On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
 The following takes nearly three minutes to compile.
 The culprit is the line barr ~= B();
 What is wrong with this?

 Thanks,
 Dan
 ----------------
 struct B {
   const size_t SIZE = 1024*64;
   int[SIZE] x;
 }

 void main() {
   B[] barr;
   barr ~= B();
 }
 -----------------

 The code DMD generates for initializing the struct does not use loops,
 so it's
 xor     ecx, ecx
 mov     [eax], ecx
 mov     [eax+4], ecx
 mov     [eax+8], ecx
 mov     [eax+0Ch], ecx
 mov     [eax+10h], ecx
 mov     [eax+14h], ecx
 mov     [eax+18h], ecx
 mov     [eax+1Ch], ecx
 mov     [eax+20h], ecx
 mov     [eax+24h], ecx
 mov     [eax+28h], ecx
 mov     [eax+2Ch], ecx
 mov     [eax+30h], ecx
 mov     [eax+34h], ecx
 mov     [eax+38h], ecx
 ...

 So your code creates a lot of work for the compiler.

That's incredibly horrible, please add to bugzilla.

Oct 24 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Oct 24, 2012 at 06:04:10PM +0200, Don Clugston wrote:
 On 24/10/12 17:39, thedeemon wrote:
On Wednesday, 24 October 2012 at 03:50:47 UTC, Dan wrote:
The following takes nearly three minutes to compile.
The culprit is the line barr ~= B();
What is wrong with this?

Thanks,
Dan
----------------
struct B {
  const size_t SIZE = 1024*64;
  int[SIZE] x;
}

void main() {
  B[] barr;
  barr ~= B();
}
-----------------

The code DMD generates for initializing the struct does not use loops,
so it's
xor     ecx, ecx
mov     [eax], ecx
mov     [eax+4], ecx
mov     [eax+8], ecx
mov     [eax+0Ch], ecx
mov     [eax+10h], ecx
mov     [eax+14h], ecx
mov     [eax+18h], ecx
mov     [eax+1Ch], ecx


[...]

Yikes!! Why aren't we using memset (or equivalent) here?!


 That's incredibly horrible, please add to bugzilla.

[...]

Yeah, no kidding! For comparison, GDC does a way better job in this
department:

	$ time dmd test.d

	real    0m7.564s
	user    0m7.529s
	sys     0m0.029s
	$ time gdc test.d

	real    0m0.107s
	user    0m0.069s
	sys     0m0.036s
	$

This is with SIZE = 1024*16 (I didn't dare try it with 1024*64).  That's
roughly a 70:1 ratio between dmd and gdc, which is pretty horrible, since
dmd is usually significantly faster than gdc.

Surprisingly, though, dmd still produces a smaller executable than gdc
for this code! I'm guessing the optimizer cleans up that code
afterwards? (Or maybe there are other factors at play here that I'm not
aware of.)


T

-- 
EMACS = Extremely Massive And Cumbersome System

Oct 24 2012

"thedeemon" <dlang thedeemon.com> writes:

On Wednesday, 24 October 2012 at 17:43:11 UTC, H. S. Teoh wrote:

 Surprisingly, though, dmd still produces a smaller executable 
 than gdc
 for this code! I'm guessing the optimizer cleans up that code
 afterwards? (Or maybe there are other factors at play here that 
 I'm not aware of.)

Must be other factors. "Optimized" code (generated by dmd with 
-release -O) looks like
mov DWORD PTR [edx], 0
mov DWORD PTR [edx+4], 0
mov DWORD PTR [edx+8], 0
...
so it should be even bigger and probably slower.

Oct 24 2012

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, October 24, 2012 18:04:10 Don Clugston wrote:
 That's incredibly horrible, please add to bugzilla.

There are at least a couple of potentially related bugs already:

http://d.puremagic.com/issues/show_bug.cgi?id=8828
http://d.puremagic.com/issues/show_bug.cgi?id=8449

though this may still merit a new report.

- Jonathan M Davis

Oct 24 2012

"Era Scarecrow" <rtcvb32 yahoo.com> writes:

On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:
 The code DMD generates for initializing the struct does not use 
 loops, so it's
 xor     ecx, ecx
 mov     [eax], ecx
 mov     [eax+4], ecx
 mov     [eax+8], ecx
 ...

 So your code creates a lot of work for the compiler.

  That seems silly. I would think that once the struct's init contents 
are known, the compiler would emit a single block holding that init 
value and bulk copy it every time it is needed (if the struct is beyond 
a certain size, say 32 bytes).

  Also memset only works if the data can be defaulted to 0. Hmmm...

Oct 24 2012

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Thu, Oct 25, 2012 at 12:05:25AM +0200, Era Scarecrow wrote:
 On Wednesday, 24 October 2012 at 15:39:19 UTC, thedeemon wrote:
The code DMD generates for initializing the struct does not use
loops, so it's
xor     ecx, ecx
mov     [eax], ecx
mov     [eax+4], ecx
mov     [eax+8], ecx
...

So your code creates a lot of work for the compiler.

 
  That seems silly. I would think after the struct's init/contents
 were known it would make a single block that holds the basic init
 for it and bulk copy every time it needed it (if it's beyond a
 certain size, say 32 bytes).
 
  Also memset only works if the data can be defaulted to 0. Hmmm...

Not true. Memset is *usually* used to set memory to 0, but it can set
memory to other byte values too. Although this doesn't help if the .init
value isn't something consisting of repeated byte values.

In any case, though, separately initializing every member of an array is
silly. That's what a loop is for. That, or a memcpy from an immutable
copy of .init.
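
Both options can be sketched in user code, assuming the C functions exposed through core.stdc.string. The struct names and the two helper functions are hypothetical, and this only imitates by hand what the compiler could emit; it is not what dmd actually does.

```d
import core.stdc.string : memcpy, memset;

struct Bytes {
    ubyte[1024] data = 0xAB; // .init is a single byte value repeated
}

struct Mixed {
    int id = 42;             // .init is not a repeated byte pattern
    int[255] x;
}

// memset can fill with any byte value, not only 0.
Bytes makeBytes() {
    Bytes b = void;
    memset(&b, 0xAB, Bytes.sizeof);
    return b;
}

// For a general .init, bulk-copy from one immutable image of it.
Mixed makeMixed() {
    static immutable Mixed image = Mixed.init;
    Mixed m = void;
    memcpy(&m, &image, Mixed.sizeof);
    return m;
}

void main() {
    assert(makeBytes() == Bytes.init);
    assert(makeMixed() == Mixed.init);
}
```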


T

-- 
Life would be easier if I had the source code. -- YHL

Oct 24 2012

"BLM768" <blm768 gmail.com> writes:

 In any case, though, separately initializing every member of an 
 array is
 silly. That's what a loop is for. That, or a memcpy from an 
 immutable
 copy of .init.

I think that the reasoning behind DMD's implementation is that 
for small structs, writing out the instructions is more efficient 
than a loop or a memcpy(); it's essentially the equivalent of 
loop unrolling and function inlining. However, that reasoning 
breaks down as soon as the struct's size goes beyond a certain 
value.
In my opinion, though, this behavior should be kept for small 
structs. For example, if you have a struct that just wraps a 
size_t, just generating a move instruction is _way_ faster than a 
call to memcpy().
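
As a rough illustration of that trade-off, here is a sketch that picks the strategy by size at compile time. The 32-byte cutoff is just the figure floated earlier in the thread, and `inlineLimit` and `resetToInit` are made-up names; a real compiler would make this choice during code generation, not in library code.

```d
import core.stdc.string : memcpy;

enum size_t inlineLimit = 32; // hypothetical cutoff, not a dmd setting

// Hypothetical helper: reset a plain-data struct to its .init value.
void resetToInit(T)(ref T value) {
    static if (T.sizeof <= inlineLimit) {
        value = T.init;                    // small: a few direct stores
    } else {
        static immutable T image = T.init; // large: one shared image
        memcpy(&value, &image, T.sizeof);
    }
}

struct Small { size_t n = 7; }
struct Large { int tag = 1; int[1023] x; }

void main() {
    auto s = Small(99);
    Large l;
    l.tag = 5;
    l.x[10] = 3;

    resetToInit(s);
    resetToInit(l);
    assert(s == Small.init && l == Large.init);
}
```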

Oct 27 2012

"BLM768" <blm768 gmail.com> writes:

On Saturday, 27 October 2012 at 23:07:19 UTC, BLM768 wrote:
 In any case, though, separately initializing every member of 
 an array is
 silly. That's what a loop is for. That, or a memcpy from an 
 immutable
 copy of .init.

 I think that the reasoning behind DMD's implementation is that 
 for small structs, writing out the instructions is more 
 efficient than a loop or a memcpy(); it's essentially the 
 equivalent of loop unrolling and function inlining. However, 
 that reasoning breaks down as soon as the struct's size goes 
 beyond a certain value.
 In my opinion, though, this behavior should be kept for small 
 structs. For example, if you have a struct that just wraps a 
 size_t, just generating a move instruction is _way_ faster than 
 a call to memcpy().

I just realized that this post is redundant; other posts have 
also mentioned optimization for small structs. That makes two 
relatively dumb posts from me in a day; maybe I should just stop 
for now. :)

Oct 27 2012

Manfred Nowak <svv1999 hotmail.com> writes:

Dan wrote:

 The following takes nearly three minutes to compile.

... and this returns immediately:

--------------------------------
struct B {
   const size_t SIZE = 1024*64;
   int[SIZE] x= void; // !!!
}

void main() {
   B[] barr;
   barr ~= B();
}
-------------------------------
- manfred

Oct 24 2012

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, October 24, 2012 21:35:22 Manfred Nowak wrote:
 Dan wrote:
 The following takes nearly three minutes to compile.

 
 ... and this returns immediately:
 
 --------------------------------
 struct B {
 const size_t SIZE = 1024*64;
 int[SIZE] x= void; // !!!
 }
 
 void main() {
 B[] barr;
 barr ~= B();
 }
 -------------------------------
 - manfred

That's incredibly dangerous though, because then B.init has garbage in it. 
Still, it's interesting that it would affect the speed. Maybe the compiler 
recognizes that it's garbage and so doesn't bother copying it (meaning that 
you'll get different garbage for every use of B.init).

- Jonathan M Davis

Oct 24 2012
