digitalmars.D - D stack allocation

van eeshan (95/117) Aug 10 2004 needs

Andy Friesen (7/15) Aug 10 2004 You'll notice, though, that there's absolutely nothing preventing D from...

van eeshan (6/21) Aug 10 2004 Yep; agreed (and noted below). It just doesn't do that today, so I thoug...
van eeshan (205/205) Aug 11 2004 Attached a little graph that shows arrays allocated on the heap versus

Sean Kelly <sean f4.ca>s (8/19) Aug 11 2004 Regan, if I remember correctly, reported a bug where executable size is

stonecobra (8/20) Aug 11 2004 That's why I think the ability to have it on by default is great. If it...
van eeshan (23/42) Aug 11 2004 testing

Sean Kelly (14/39) Aug 11 2004 Ah okay. I thought perhaps the current code generation involving arrays...
Regan Heath (21/79) Aug 11 2004 I'm not sure it was me who reported it initially, Arcane Jill was
Walter (7/19) Aug 11 2004 stack-based

Walter (43/65) Aug 11 2004 to

van eeshan (59/101) Aug 12 2004 The whole point of that post was to selectively and optionally /inhibit/

Derek Parnell (10/37) Aug 12 2004 Sounds like a good idea. It allows a coder to take some responsibility f...
Ben Hinkle (27/65) Aug 12 2004 As a general principle I agree it would be nice to be able to skip

van eeshan (65/116) Aug 12 2004 easily

Walter (28/37) Aug 12 2004 the

van eeshan (57/96) Aug 12 2004 deal.

Walter (32/71) Aug 12 2004 approach

Michael (3/3) Aug 13 2004 Just out of curiosity, how does alloca() work? I guess it just manipulat...

Walter (8/11) Aug 13 2004 the

Michael (3/13) Aug 13 2004 Ahh, I hadn't thought about expression temps. Very interesting stuff. I ...

van eeshan (58/118) Aug 13 2004 out

Sean Kelly (8/13) Aug 13 2004 I think Walter meant that this is typically the proper behavior for an
Walter (11/23) Aug 13 2004 that

van eeshan (21/45) Aug 13 2004 May I point you towards this post, which has an attempted summary of the

van eeshan (24/61) Aug 12 2004 Addendum.

Walter (39/59) Aug 12 2004 I'm confused - you wrote: "The allocated (temporary) workspace is typica...

Arcane Jill (5/8) Aug 12 2004 That's probably good news, but:

Walter (4/11) Aug 13 2004 (Please

Arcane Jill (16/34) Aug 13 2004 That is a very good point. In fact, it's an excellent point, and one whi...

Walter (10/16) Aug 13 2004 with

van eeshan (88/123) Aug 13 2004 flaw

Arcane Jill (10/22) Aug 13 2004 I think you may have misunderstood Walter. Arbitrary random data can ind...

van eeshan (21/43) Aug 13 2004 Fair point Jill.
Walter (17/33) Aug 13 2004 that

van eeshan (37/96) Aug 30 2004 So, the resolution on this has apparently been implemented. If you write...

Ilya Minkov (11/11) Aug 12 2004 Ahem, you're speaking of allocating a *buffer*. A large thing, perhaps
Ben Hinkle (7/17) Aug 13 2004 another option is to look for "void" as an initialization value:

Arcane Jill (8/15) Aug 13 2004 /another/ option is for the compiler, on seeing the code:

Ben Hinkle (12/28) Aug 13 2004 large
Andy Friesen (8/16) Aug 13 2004 I like this approach. The programmer didn't say where he wants the

van eeshan (10/26) Aug 13 2004 With respect Andy, the programmer was quite explicit about where the mem...

Andy Friesen (32/38) Aug 13 2004 One of the greatest things about D is that we have a chance to change

van eeshan (24/64) Aug 13 2004 All good points. As long as there is some easy means of avoiding unwante...

Andy Friesen (12/32) Aug 13 2004 I actually didn't mean to address implicit initialization directly at

Ben Hinkle (5/9) Aug 13 2004 Perl! ouch! ;-)
van eeshan (44/50) Aug 13 2004 In light of what we now know (but still don't quite know :-) I think it'...

ben 0x539.de (41/41) Aug 15 2004 Pretty ignorant of some of this thread, I would like to make some points...

Derek Parnell (13/60) Aug 15 2004 This idea is stilling making sense to me, even when taking Walter's

van eeshan (3/20) Aug 13 2004 That's a very nice idea Ben ...

Ben Hinkle (6/148) Aug 11 2004 Have you tried alloca? I haven't tried running it but I'm thinking of

van eeshan (3/8) Aug 11 2004 Tried that, Ben; alloca() also clears everything to zero.

Ben Hinkle (6/16) Aug 11 2004 Hmm. I just tried on Linux and got the times:
Walter (6/15) Aug 11 2004 No, it does not. It simply calls C's alloca(), and gets whatever C's

van eeshan (34/182) Aug 11 2004 Well, blow me down with a feather. I must have really screwed that test ...

Arcane Jill (12/15) Aug 12 2004 My experiments seem to suggest that the following actually works rather ...

van eeshan (24/40) Aug 12 2004 Now that is exactly what the compiler ought to (optionally) provide for,...
van eeshan (7/23) Aug 12 2004 I've been trying to convert this to a mixin, but not having much luck. G...

"van eeshan" <vanee hotmail.net> writes:

"Walter" <newshound digitalmars.com> wrote in message
news:cfbip7$ora$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfb72j$9l7$1 digitaldaemon.com...
 "Walter" <newshound digitalmars.com> wrote in message
 news:cfb5v9$76u$1 digitaldaemon.com...
 The problem, however, is that since Java does not allow any objects or
 arrays on the stack (or in static data), whenever you need one it



needs
 to
 be allocated on the heap. This is a lot more expensive than stack
 allocation. Since one winds up allocating a lot more heap objects in


 Java
 than one does in C/C++, the end result is slower. Java offers no
 alternatives to gc allocation.

 This is why D allows stack based objects and value objects.

 Are you saying that D allows one to create Objects entirely upon the

 stack?

 For auto objects, it is allowed for the compiler to put them there. But I
 was primarilly referring to structs and arrays.

 And entirely within static data?

 Structs and arrays, yes. Java does not allow structs, nor does it allow
 arrays on the stack.


Thanks for clarifying that Walter. Would it be prudent to note that D
allocates auto Objects on the heap? Stack allocation may be allowed for in
the spec, but that's not how the compiler operates.

Also, there is that point about heap allocations being a lot more expensive
than stack allocations. Naturally, one would lean towards wholeheartedly
agreeing; but this is just not true when it comes to D. Here's a little
(contrived) test jig to illustrate:

private import std.c.time;
private import std.c.stdlib;
private import std.c.windows.windows;

void heap()
{
        void *x = malloc (16 * 1024);
        free (x);
}

void stack()
{
        ubyte[16 * 1024] x;
}

void trace (uint startTime)
{
        FILETIME ft;
        GetSystemTimeAsFileTime(&ft);
        printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000);
}

void main()
{
        FILETIME start;

       GetSystemTimeAsFileTime(&start);
        trace (start.dwLowDateTime);

        sleep (3);
        trace (start.dwLowDateTime);

        for (int i=1_000_000; --i;)
             heap();

        trace (start.dwLowDateTime);

        for (int i=1_000_000; --i;)
             stack();

        trace (start.dwLowDateTime);
}

The results? On this machine it takes the heap() routine ~500ms to execute,
and the stack() version ~3500ms. Yep, that's seven times longer. Why? What
wasn't stated is that D ensures all stack variables are initialized ...
including arrays. The innocuous stack allocation actually clears the 16KB to
zero each time, which is why it takes so long. For a 32KB 'buffer', the
stack() execution time goes up to ~8600ms versus the ~500ms for heap(); you
can see there's some rather non-linear behavior for the stack version <g>

The point here is not about the benefits/detriments of deterministic
initialization. It's about clarifying a point of performance-tuning that is
far from obvious. Without wishing to put too fine a point on it, the quoted
post was somewhat misleading. That is; allocating an array on the D stack
can consume far more cycles than the equivalent heap allocation. It depends
upon the size of the array, the bus speed, the size of the cache, heap
fragmentation and many other things.

D stack allocations are simply not like C/C++ at all. Far from it.


For the curious, here's the disassembly for both routines. It's worth noting
that the array initialization could be sped up. Also worth noting is that
the loop/call overhead was 10ms.

10:   {
11:           void *x = malloc (16 * 1024);
12:           free (x);
00402011   push        4000h
00402016   call        _malloc (0040574c)
0040201B   add         esp,4
0040201E   push        eax
0040201F   call        _free (004057a4)
00402024   add         esp,4
13:   }
00402027   pop         eax
00402028   ret
14:
15:
16:   void stack()
0040202C   push        ebp
0040202D   mov         ebp,esp
0040202F   mov         edx,4
00402034   sub         esp,1000h
0040203A   test        dword ptr [esp],esp
0040203D   dec         edx
0040203E   jne         _D5hello5stackFZv+8 (00402034)
00402040   sub         esp,edx
00402042   mov         ecx,4000h
00402047   xor         eax,eax
17:   {
18:           ubyte[16 * 1024] x;
00402049   push        edi
0040204A   lea         edi,[x]
00402050   rep stos    byte ptr [edi]
19:   }
00402052   pop         edi
00402053   mov         esp,ebp
00402055   pop         ebp
00402056   ret

Aug 10 2004

Andy Friesen <andy ikagames.com> writes:

van eeshan wrote:
 Thanks for clarifying that Walter. Would it be prudent to note that D
 allocates auto Objects on the heap? Stack allocation may be allowed for in
 the spec, but that's not how the compiler operates.
 
 Also, there is that point about heap allocations being a lot more expensive
 than stack allocations. Naturally, one would lean towards wholeheartedly
 agreeing; but this is just not true when it comes to D. Here's a little
 (contrived) test jig to illustrate:

You'll notice, though, that there's absolutely nothing preventing D from 
performing these allocations on the stack as an optimization.

All the compiler needs is to be certain that the object's lifetime is 
limited to the scope.  Auto references offer precisely this.

(incidently, it is legal to write "auto int[] x = new int[...];")

  -- andy

Aug 10 2004

"van eeshan" <vanee hotmail.net> writes:

Yep; agreed (and noted below). It just doesn't do that today, so I thought
it was worth a clarification (for those who didn't know).

"Andy Friesen" <andy ikagames.com> wrote in message
news:cfbpc5$13eh$1 digitaldaemon.com...
 van eeshan wrote:
 Thanks for clarifying that Walter. Would it be prudent to note that D
 allocates auto Objects on the heap? Stack allocation may be allowed for


in
 the spec, but that's not how the compiler operates.

 Also, there is that point about heap allocations being a lot more


expensive
 than stack allocations. Naturally, one would lean towards wholeheartedly
 agreeing; but this is just not true when it comes to D. Here's a little
 (contrived) test jig to illustrate:

 You'll notice, though, that there's absolutely nothing preventing D from
 performing these allocations on the stack as an optimization.

 All the compiler needs is to be certain that the object's lifetime is
 limited to the scope.  Auto references offer precisely this.

 (incidently, it is legal to write "auto int[] x = new int[...];")

   -- andy

Aug 10 2004

"van eeshan" <vanee hotmail.net> writes:

Attached a little graph that shows arrays allocated on the heap versus
arrays declared as local (stack) variables. You can see that stack
allocations are faster only for array sizes of below ~64 bytes (on this
machine), and quickly become significantly slower once you start using local
arrays for anything bigger than around 512-bytes.

There's an interesting quirk at the 4096-byte mark: the codegen changes
strategy slightly at this point, and I'd guess it is doing something akin to
paging (ensuring the stack page-set is available), though can't be sure.
Take a look at the previously posted assembler for reference.

If local (stack) arrays were to use the C/C++ approach, the time consumed
would be ~10ms across the board, rather than the ~500ms for heap
allocations. That is, for a 32KB local array, the initialization cost is
~865 times /more/ than it would be for the equivalent C/C++ approach (~10ms
versus ~8650ms); almost three orders of magnatude.

This indicates two things:

1) making claims about D's perfomance superiority over Java due to the
former's ability to use the stack (versus always on the heap) is based on
speculation; not fact. Testing shows this to be entirely the opposite for
arrays beyond 64 bytes in length. Similarly, any kind of performance testing
against C/C++ within this realm will be viewed as a negative mark against D.

2) Implicit initialization is great. However, it can clearly and easily be
demonstrated to be a serious impediment. Given that D is supposed to be a
'systems' language, I humbly suggest that a mechanism be provided to bypass
such nicities when one is prepared to assume full responsibility. For
example, there might be a qualifier for variables that disables the
automatic initialization (along the lines of the 'auto' qualifier).

What do you folks think?









begin 666 stack allocation.PNG
MB5!.1PT*& H````-24A$4 ```C$```$L" (```"]K75[````!&=!34$`````
MBR5 30```"!C2%)-``````````````````````````````````````````" 





M`0;P.    $;KI%$P"D;!*! %`PS =1! `(W62:- %(R"43 *!AC ZR" `*)5

M>($/$$#TZ

M$*V3``)H8,;NB.<28_ H& 6C8!2,`C0P1.LD  `:DI,W0\NUHV 4C()10'\P

M"D;!*!B&8(C620`!-%HGC8)1, I&P3 $0[1.` B T3II%(R"43 *AB$8HG42


MZZ11, I&P2 8U("! 9S:98C620`!-%HGC8)1, I&P: &(ZI.` B T3II%(R"
M43 *!B\ KT+Z/V3K)( `&JV31L$H& 6C8) "LBND_T.V3 ((H-$Z:12, E$P
M" 8I&(%U$D `C=9)HV 4C()1,! !)172_R%;)P$$T&B=- I&P2 8!8,1C,PZ

M`.45TO\A6R<!!!!-ZB3,. :901(7E_E4=.TH& 6C8!0,*C"2ZR2 `*)'/XD!
M!M#M'JV31L$H& 6C`!50I4+Z/V3K)( `HE.=A,P8K9-&P2 8!:,`*Z!6A?1_



M^C]DZR2 `!JMDT;!*! %HV  P6B=]!^I5 <(H-$Z:12, E$P" 8,T*A"^C]D


M==(H& 6C8!30&]"Z0OH_9.LD  `:K9-&P2 8!:. KH .%=+_(5LG`030:)TT
M"D;!*! %= 6C=1(F )?J``$T6B>- E$P"D8!_0!]*J3_0[9.` B T3II%(R"
M43 *Z 3H5B']'[)U$D `C=9)HV 4C()10"<P6B?A`O!2'2" 1NND43 *1L$H
MH > 9X7T?\C620`!-%HGC8)1, I&`<T!G2ND_T.V3 ((H-$Z:12, E$P"F .
M1NLD_ !>J ,$T&B=- I&P2 8!;0%]*^0_ _9. D  &A5)V$U$&$K&""+X^$2
M:? H& 6C8!0,3C!:)Q$$\%(=((!H4B=AK53  L U$T$N+O.IZ-I1, I&P2B 
M'1B0"NG_D*V3``*(3OTDY/IFM$X:!:- %(P0,% 5TO\A6R<!!! ]ZB22*J'1
M.FD4C()1,&S :)U$)("7Z `!1*<Z"0T JR&I3D(S812, E$P" 8M&, *Z?^0


MP6"HD/X/V3H)((!&ZZ11, I&P2B )ABMD\ `\%(=((!&ZZ11, I&P2B &A D
M%=+_(5LG`030:)TT"D;!*! %U &#IT+Z/V3K)( `&JV31L$H& 6C #I M$XB
M&\!+=8 `&JV31L$H& 6C  I 4%5(_X=LG0000*-UTB 8!:- %% !C-9)E !X
MJ0X00*-UTB 8!:- %% *!EN%]'_(UDD``31:)XV"43 *1 %%8!!62/^';)T$


M%(R"43 *R &#N4+Z/V3K)( `&JV31L$H& 6C !PP6B=1$<!+=8 `&JV31L$H


M)G429AV#S"")B\M\*KIV%(R"43 *B 1#I4+Z/V3K)( `HE,_"5EPM$X:!:- 
M% Q1,%HGT0C 2W6 `*)?G819V8S62:- %(R"(02&4(7T?\C620`!-,3J) 88
MH*YK1\$H& 6C " 8K9-H!^"E.D `T6^-`Q:[1_M)HV 4C((A`H96A?1_R-9)


M:(7T?\C620`!-%HGC8)1, I&`4XP6B?1!\!+=8 `&JV31L$H& 6C`#L8NA72
M_R%;)P$$T&B=- I&P2 8!=C!:)U$-P`OU0$":+1.& 6C8!2,`BQ 2%=(_X=L
MG0000*-UTB 8!:- %*"#H5XA_1^R=1) `(W62:- %(R"48 .1NLD. -XJ0X0
M0*-UTB 8!:- %*" 85 A_1^R=1) `(W62:- %(R"48 "1NLD^ -XJ0X00*-U



M"^"E.D `C8[=C8)1, I&.AA^%=+_(5LG`030:)TT"D;!*!CI8+1.&G `+]4!
M`HA6=1*: 6 S3R1Q"1H^"D;!*! %9(-A62']'[)U$D `D5PG(73BUH)9QR S


M-JA8)Z%5K:- %(R"44 V&*V3!AQ %O ``42/-0ZC_:11, I&P6 #P[M"^C]D

M_KJ[`02#P0VC8!2, J$+1NLDJ ."QA)9)P$$T&B=- I&P2 866 D5$C_:5\G
M(8]I,6 #R,K^8ZM^L H"!!"MQNYH"D;KI%$P"D8!>6"$5$C_:5PG81J")H)9
M#Z
M6B>- E$P"D8*&#D5TG]ZS2>A52IXN$3VDP`"B)PZ:73L;A2, E$P%,%HG?2?


MCH)1, H&/QBMD^! 4)6?F'420 "-CMV- E$P"H8Y&($5TO\A6R<!!-!HG30*

M%<& "M-1, I&P: %([9"^C]DZR2 `")SS^QHG30*1L$H&/Q M$["!(.J_,2L

MUDE8Q0=YG0000*-K'$;!*! %PQ",\ KI/]W/%H(+HBF #ZIA[<E UDD``411
M/VET[&X4C()1,#C!:)U$5)W$0#I


M""!Z])-&ZZ11, I&`=W :)T$`8-J?Q(N2S$+?( `&H"QN]$Z:12, E% (S!:
M(<'!$*V3``*(G#II



MD,%HA80)AFB=!!! =%H+3ETP&-PP"D;!*! \8+1.P 1#M
MP= &HQ425C!$ZR2 `!JMDT;!*! %0QN,UDE8P1"MDP`":+1.& 6C8!0,83!:
M(>$"0[1.` B T3II%(R"43"$P6B=A L,T3H)((!&ZZ11, I&P5 %HQ42'C!$
MZR2 `!JMDT;!*! %0Q6,UDEXP!"MDP`":+1.& 6C8!0,23!:(>$'0[1.` B 
MT3II%(R"43 DP6B=A!\,T3H)((!&ZZ11, I&P= #HQ4203!$ZR2 `!JMDT;!
M*! %0P^,UDD$P1"MDP`":+1.& 6C8!0,,3!:(1$#AFB=!!! HW72*! %HV"(
M =$ZB1 P1.LD  `:K9-&P2 8!4,)C%9(1((A6B<!!-!HG30*1L$H&$I M$XB
M$ S1. D  .A4)Z%=MD02%ZMIM'#D*! %HV"0 ]$*B7 P1.LD  "B1YV$L S,
M((F+W\!1, I&P8 "HW42\6"(UDD``42G. FYZS-:)XV"43 *R "C%1))8(C6

M` B (58G$90:!:- % Q+,%HGD0J&:)T$$$ #L.X.<WH)#Q>7:31U[2 8!:- 

M$AE B-9)``$T6B>- E% !3!::-(.C(8M>6"(UDD``31:)XV"44 I !2:HT4G
MC<!HP)('AFB=!!! HW72*! %% %XB3E:=-("C(8JV6"(UDD``31:)XV"44 ^



M.1 -0\K!$*V3``)HM$X:!:. !$!D63E:I%((1 .0<C!$ZR2 `!JMDT;!*" 6
MD%10CI:J9(/1H*,*&*)U$D `C=9)HV 4$ 5(+2A'"U:RP6C0404,T3H)((!&
MZZ11, H(`_)*R=&RE0PP&FC4`D.T3 ((H-$Z:12,` * [%)RM' E`XP&&K7 
M$*V3``)HM$X:!:,`'Z"PB!PM84D"H\%%13!$ZR2 `!JMDT;!*, )*"\B1PM9




M`IH6CJ,E+RXP&C*T`$.T3 (((#K526 `V5*2ZB0T$T;!** BH'7).%KR8 6C
MP4(C,$3K)( `&H U#J/]I%$P" %]2L;1\A<3C(8)C< 0K9,``FA ZB2TWA(>
M+AYS1L$HH!: 6[$X6OZB =$`H1T8HG420 "-[D\:!2,=T+E8'"V%D<%H:- .

M-!QH"H9HG0000*-UTB 8H6! "\31XG T!& -AFB=!!! HW72*!B)8, +Q %W
MP("#T1" -1BB=1) `(W62:- Q(%!4AH.$F?0'P`]/F+]3D\P1.LD  `:K9-&





MK9,``FBT3AH%PPH,W<)QB#K[_U!V^? &0[1.` B T3II% P3,'1K(S 8<NX?
M!F$^C,$0K9,``FBT3AH%0QX,FY)Q:/EB:+EV!((A6B<!!-!HG30*AC 8-K41


MHX 2,.QK(S 8;-X<.2$_G, 0K9,``HA^=1)F93-:)XT"(L%(*Q,'E6<'E6-&

MW:/]I%& &XSP`G% _3["`W]X "%:)P$$$)W6."#W;]#Z.OBYN



MJ]$`'[% B-9)``$T6B>-`GJ T6*1$D!&N(T&^" 8HG420 "-UDFC (9 M&2D

M #I M%BD-< ,V]$P'P5XP!"MDP`":+1.& 7D ]$N$3T!/)!'PWP4$ .&:)T$

M&JV31 $^,#HZ-PI&P1 %0[1.` B T3II%" `< TT6 ^- E$PI,$0K9,``FBT
M3AJ)`+/N&:V!1L$H&&9 B-9)``$T6B<-6X"KXAFM>T;!*! )8(C620`!-%HG


M20B7X782.:X=\*[W(.O7CX)1, J&-QBB=1) `(V8. F?C;3LP8R"43 *1L% 
M "%:)P$$T$BIDT8KF%$P"D;!B )#M$X""* A5B<QC()1, I&P2  ` S1. D 
M (98G410BA8N&;5E\- R;#PR:LL M&+4E $$<-<"!-! K), `+\:^KADU);!


MHV 4C()1, H&&,#K
M8 "O P`":+1.& 6C8!2, E$PP !>!P$$X-Z,<0``01CH_U_M1 Q8H-K!R.AP
M;0G$B9__)'?G5)X]253L42N!Y"2A(L1=B0E5G/G<!LM'*$1FY3I85=,01S(`


M2::1EXF(CV7*?80G6G$E8^)M(9 !\1M+HY*-_ #N?H `W%4Q$ ` "/K_JVMK
M2$!3N^MR<PA25#ZI1X10MP4<#N4EYV


MS2>)2Y(M)%42Y%E!O&D45A $N63'#C


M#R8




M  $&\#H(((!&ZZ11, I&P2 8!0,,X'400 "-UDFC8!2, E$P" 88P.L   `:





MC_\8W2!D76B,_WCK)/S=IO^CV7 4C& `3_P``31:)XV"40`%:-4)FC N]5 5

MHV"D`7CB!PB T3II%(P", '^7$-&GAK-AJ- Q )XX <(H-$Z:12,`O(!KJ4*
MHQ72*! %) %X^ <(H-$Z:12, E$P"D;!``-X'0000*-UTB 8!:- %(R"`0;P
M.    $;KI%$P"D;!*! %`PS =1! `(W62:- %(R"43 *!AC ZR" `!JMDT;!


M& 6C8!2,  $&\#H((("P'"(Y"D;!*! %HV 4T!E *B. `!KM'HV"43 *1L$H
J&"P`((!&ZZ11, I&P2 8!8,%``08`&FN<"M[6$Y2`````$E%3D2N0F""
`
end

Aug 11 2004

Sean Kelly <sean f4.ca>s writes:

In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...
1) making claims about D's perfomance superiority over Java due to the
former's ability to use the stack (versus always on the heap) is based on
speculation; not fact. Testing shows this to be entirely the opposite for
arrays beyond 64 bytes in length. Similarly, any kind of performance testing
against C/C++ within this realm will be viewed as a negative mark against D.

Regan, if I remember correctly, reported a bug where executable size is
proportional to the size of static arrays the application contains.  This
performance issue sounds like it may be related.

2) Implicit initialization is great. However, it can clearly and easily be
demonstrated to be a serious impediment. Given that D is supposed to be a
'systems' language, I humbly suggest that a mechanism be provided to bypass
such nicities when one is prepared to assume full responsibility. For
example, there might be a qualifier for variables that disables the
automatic initialization (along the lines of the 'auto' qualifier).

An impediment how?  Just in terms of performance?  In general, I would say that
default initialization is worth the performance cost for its ability to prevent
bugs.


Sean

Aug 11 2004

stonecobra <scott stonecobra.com> writes:

2) Implicit initialization is great. However, it can clearly and easily be
demonstrated to be a serious impediment. Given that D is supposed to be a
'systems' language, I humbly suggest that a mechanism be provided to bypass
such nicities when one is prepared to assume full responsibility. For
example, there might be a qualifier for variables that disables the
automatic initialization (along the lines of the 'auto' qualifier).

 
 
 An impediment how?  Just in terms of performance?  In general, I would say that
 default initialization is worth the performance cost for its ability to prevent
 bugs.
 

That's why I think the ability to have it on by default is great.  If it 
were able to be turned off, that is no different from me reusing a 
buffer.  The buffer is only initialized once, yet I re-use it thousands 
of times without much problem.

Performance does and will always matter, regardless of the secret 
contract between Intel and Microsoft to castrate newer, faster PCs. 
MHO, of course.

Scott

Aug 11 2004

"van eeshan" <vanee hotmail.net> writes:

"Sean Kelly s" <sean f4.ca> wrote in message
news:cfe4hu$19cf$1 digitaldaemon.com...
 In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...
1) making claims about D's perfomance superiority over Java due to the
former's ability to use the stack (versus always on the heap) is based on
speculation; not fact. Testing shows this to be entirely the opposite for
arrays beyond 64 bytes in length. Similarly, any kind of performance


testing
against C/C++ within this realm will be viewed as a negative mark against


D.
 Regan, if I remember correctly, reported a bug where executable size is
 proportional to the size of static arrays the application contains.  This
 performance issue sounds like it may be related.

AFAIK this is not related in the slightest, I'm afraid. This topic is purely
about CPU cycles, and the stack. Nothing to do with static data or
executable size.

2) Implicit initialization is great. However, it can clearly and easily


be
demonstrated to be a serious impediment. Given that D is supposed to be a
'systems' language, I humbly suggest that a mechanism be provided to


bypass
such nicities when one is prepared to assume full responsibility. For
example, there might be a qualifier for variables that disables the
automatic initialization (along the lines of the 'auto' qualifier).

 An impediment how?  Just in terms of performance?  In general, I would say

that
 default initialization is worth the performance cost for its ability to

prevent
 bugs.

I don't think anything negative was said about default initialization as a
feature, Sean. If it was then I apologize. Default/implicit initialization
is great (as was noted; so we agree), but there are times when you need to
take full control over what's going on. It may not matter to you
specifically, but there are many situations whereby one needs to take full
responsibility over a select set of variables (such as temporary stack-based
buffers). Providing the mechanism as an /option/ is not detrimental to the
language. It's like applying "const", or one of the other optional
qualifiers.

I'm just illustrating that the language could really use a means to disable
default-initialization for those cases where it is deemed both completely
unecessary and too costly. Surely that's not too much to consider?

Aug 11 2004

Sean Kelly <sean f4.ca> writes:

In article <cfe60s$1a2i$1 digitaldaemon.com>, van eeshan says...
"Sean Kelly s" <sean f4.ca> wrote in message
news:cfe4hu$19cf$1 digitaldaemon.com...

 Regan, if I remember correctly, reported a bug where executable size is
 proportional to the size of static arrays the application contains.  This
 performance issue sounds like it may be related.

AFAIK this is not related in the slightest, I'm afraid. This topic is purely
about CPU cycles, and the stack. Nothing to do with static data or
executable size.

Ah okay.  I thought perhaps the current code generation involving arrays had
issues that might be corrected in the future.

 An impediment how?  Just in terms of performance?  In general, I would say

that
 default initialization is worth the performance cost for its ability to

prevent
 bugs.

I don't think anything negative was said about default initialization as a
feature, Sean. If it was then I apologize. Default/implicit initialization
is great (as was noted; so we agree), but there are times when you need to
take full control over what's going on. It may not matter to you
specifically, but there are many situations whereby one needs to take full
responsibility over a select set of variables (such as temporary stack-based
buffers). Providing the mechanism as an /option/ is not detrimental to the
language. It's like applying "const", or one of the other optional
qualifiers.

I'm just illustrating that the language could really use a means to disable
default-initialization for those cases where it is deemed both completely
unecessary and too costly. Surely that's not too much to consider?

Not at all.  However this might violate Walter's rule against pragmas.  Even so,
I had overlooked the fact that declaring an array would implicitly initialize
all array elements.  This kind of stinks from a performance perspective, since
90% of the time I use arrays I have no need for them to be initialized.  Perhaps
a new attribute?

declare int x[10000000];
declare {
float a[100000];
char b[1000000];
}


Sean

Aug 11 2004

Regan Heath <regan netwin.co.nz> writes:

On Wed, 11 Aug 2004 15:21:50 -0700, van eeshan <vanee hotmail.net> wrote:
 "Sean Kelly s" <sean f4.ca> wrote in message
 news:cfe4hu$19cf$1 digitaldaemon.com...
 In article <cfdoer$13pt$1 digitaldaemon.com>, van eeshan says...
1) making claims about D's perfomance superiority over Java due to the
former's ability to use the stack (versus always on the heap) is based 

 on
speculation; not fact. Testing shows this to be entirely the opposite 

 for
arrays beyond 64 bytes in length. Similarly, any kind of performance


 testing
against C/C++ within this realm will be viewed as a negative mark 

 against

 D.
 Regan, if I remember correctly, reported a bug where executable size is
 proportional to the size of static arrays the application contains.  
 This
 performance issue sounds like it may be related.

 AFAIK this is not related in the slightest, I'm afraid. This topic is 
 purely
 about CPU cycles, and the stack. Nothing to do with static data or
 executable size.

I'm not sure it was me who reported it initially, Arcane Jill was 
discussing it more recently..

I think what Sean was referring to by mentioning the size of executables 
is an indication of how static arrays are currently implemented, the array 
seems to be compiled into the executable, whether this is to increase 
performance or decrease complexity, we don't know.

2) Implicit initialization is great. However, it can clearly and easily


 be
demonstrated to be a serious impediment. Given that D is supposed to 

 be a
'systems' language, I humbly suggest that a mechanism be provided to


 bypass
such nicities when one is prepared to assume full responsibility. For
example, there might be a qualifier for variables that disables the
automatic initialization (along the lines of the 'auto' qualifier).

 An impediment how?  Just in terms of performance?  In general, I would 
 say

 that
 default initialization is worth the performance cost for its ability to

 prevent
 bugs.

 I don't think anything negative was said about default initialization as 
 a
 feature, Sean. If it was then I apologize. Default/implicit 
 initialization
 is great (as was noted; so we agree), but there are times when you need 
 to
 take full control over what's going on. It may not matter to you
 specifically, but there are many situations whereby one needs to take 
 full
 responsibility over a select set of variables (such as temporary 
 stack-based
 buffers). Providing the mechanism as an /option/ is not detrimental to 
 the
 language. It's like applying "const", or one of the other optional
 qualifiers.

 I'm just illustrating that the language could really use a means to 
 disable
 default-initialization for those cases where it is deemed both completely
 unecessary and too costly. Surely that's not too much to consider?

After seeing the cost of default initialization, I can see when someone 
might not want it, it should be optional IMO.

Perhaps it's a good default behaviour, I am un-convinced, AFAIKS 
un-initialised data generally causes a spectaclar and obvious failure, in 
fact shouldn't good DBC catch them, so, if I have to choose between 
default initialization and efficient arrays I choose the latter.

default initialization also seems to be responsible for hiding my un-used 
variables? or is dmd just not looking for them, I guess an 'error' is 
overkill for an un-used variable, and there might possibly be a reason to 
use one, can anyone think of one?

I am a tidy programmer and these bug me.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Aug 11 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfe60s$1a2i$1 digitaldaemon.com...
 I don't think anything negative was said about default initialization as a
 feature, Sean. If it was then I apologize. Default/implicit initialization
 is great (as was noted; so we agree), but there are times when you need to
 take full control over what's going on. It may not matter to you
 specifically, but there are many situations whereby one needs to take full
 responsibility over a select set of variables (such as temporary

stack-based
 buffers). Providing the mechanism as an /option/ is not detrimental to the
 language. It's like applying "const", or one of the other optional
 qualifiers.

 I'm just illustrating that the language could really use a means to

disable
 default-initialization for those cases where it is deemed both completely
 unecessary and too costly. Surely that's not too much to consider?

You can take full control in D by doing:

    byte* p = malloc(size);

and p will point to an uninitialized buffer.

Aug 11 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfdoer$13pt$1 digitaldaemon.com...
 There's an interesting quirk at the 4096-byte mark: the codegen changes
 strategy slightly at this point, and I'd guess it is doing something akin

to
 paging (ensuring the stack page-set is available), though can't be sure.
 Take a look at the previously posted assembler for reference.

That's because of how the underlying OS allocates space to the stack. This
shift in strategy happens with C/C++, too.

 If local (stack) arrays were to use the C/C++ approach, the time consumed
 would be ~10ms across the board, rather than the ~500ms for heap
 allocations. That is, for a 32KB local array, the initialization cost is
 ~865 times /more/ than it would be for the equivalent C/C++ approach

(~10ms
 versus ~8650ms); almost three orders of magnatude.

To do an apples/apples comparison, benchmark against calloc/free, not
malloc/free.  I normally use calloc in C++, not malloc, because the
uninitialization bugs aren't worth it.

 This indicates two things:

 1) making claims about D's perfomance superiority over Java due to the
 former's ability to use the stack (versus always on the heap) is based on
 speculation; not fact. Testing shows this to be entirely the opposite for

No, it does not show that at all:

1) Your test was with C++, not Java.
2) You cannot allocate uninitialized arrays (or any uninitialized data) in
Java.
3) I've written a Java compiler and have done a lot of performance tuning of
Java apps trying to make them go faster. I'm very familiar with where the
cycles got eaten.
4) I presume that someone allocating an array in C++ would presumably store
something in it - just allocating it, doing nothing with it, and freeing it
is about as representative as benchmarking the sequence of code:

    x = x + 1;
    x = x - 1;

 arrays beyond 64 bytes in length.

Most lightweight objects are less than 64 bytes in length. Allocating 1000
byte objects on the stack is very rare, not the usual case.

 Similarly, any kind of performance testing
 against C/C++ within this realm will be viewed as a negative mark against

D.

I've had problems with inappropriate use of benchmarks since my compiler
executed the following loop:

void foo()
{  int i;
    double d;
    for (i = 0; i < 1000; i++)
        d += 1;
}

in orders of magnitude faster than the competition. The journalist
disassembled the code, found that the compiler had removed the floating
point instructions, and so declared the compiler to be "buggy" in his
writeup. He never bothered to contact us to find out why. What actually
happened was that it was the ONLY compiler reviewed that did data flow
analysis, and concluded that the benchmark routine did nothing and so
deleted it. Today, nearly all compilers do that.


 2) Implicit initialization is great. However, it can clearly and easily be
 demonstrated to be a serious impediment. Given that D is supposed to be a
 'systems' language, I humbly suggest that a mechanism be provided to

bypass
 such nicities when one is prepared to assume full responsibility. For
 example, there might be a qualifier for variables that disables the
 automatic initialization (along the lines of the 'auto' qualifier).

The solution is improving the optimizer so it can determine when the
initialization is redundant. It already does this for most variables, just
not for arrays.

Aug 11 2004

"van eeshan" <vanee hotmail.net> writes:

"Walter" <newshound digitalmars.com> wrote
 If local (stack) arrays were to use the C/C++ approach, the time


consumed
 would be ~10ms across the board, rather than the ~500ms for heap
 allocations. That is, for a 32KB local array, the initialization cost is
 ~865 times /more/ than it would be for the equivalent C/C++ approach

 (~10ms
 versus ~8650ms); almost three orders of magnatude.

 To do an apples/apples comparison, benchmark against calloc/free, not
 malloc/free.  I normally use calloc in C++, not malloc, because the
 uninitialization bugs aren't worth it.

The whole point of that post was to selectively and optionally /inhibit/
initialization, where that eats far too many cycles. That's why malloc() was
used rather than calloc(). There are plenty times where you need to avoid
the (totally superfluous) overhead with workspace buffers. You were stating
that heap allocation is a lot slower than stack allocation. Well, that is
absolutely not true when you don't want the initialization; heap allocation
is (bizarrely) faster within D since you /cannot/ inhibit the darned array
initialization when you really need to. C/C++ stack allocations are
blindingly fast by comparision; just a couple of instructions in many cases.


 1) Your test was with C++, not Java.

That was D source-code.


 4) I presume that someone allocating an array in C++ would presumably

store
 something in it - just allocating it, doing nothing with it, and freeing

it
 is about as representative as benchmarking the sequence of code:

     x = x + 1;
     x = x - 1;

Thanks Walter ... you have never heard of allocating a stack buffer, and
then passing it off to another function for population? Yes, something else
is done with the buffer, but often there's only a very /small/ part of it
used. The overhead of initialization totally swamps the typical processing
time. Do that often enough (in, say, something like text processing) and it
adds up rapidly. You've never done this kind of thing before? Fine.

This was a test to see where the unknown overhead was coming from during a C
port; naturally, the test was whittled down to isolate the culprit. Isn't
that how you like them?


 Most lightweight objects are less than 64 bytes in length. Allocating 1000
 byte objects on the stack is very rare, not the usual case.

What usual case? Do you mean in all software? Please be a little more
specific, as "usual case" is terribly vague. Allocating double[1024],
char[8192], and often much bigger, on the stack is not at all rare; far from
it.

How about just one example? Microsoft does it /all/ the time in their
Operating Systems, IE, and probably all their other mainstream software. Why
do you think AMD and Intel have the NX instruction in their CPUs now?

What's nice about D is that you always know the length of the array, and
debug options catch out-of-bounds conditions. Rare? I'd claim it's perhaps a
rather usual case in existing C and C++ code. May I ask where you get your
use-cases from, please?


 I've had problems with inappropriate use of benchmarks since my compiler
 executed the following loop:

 void foo()
 {  int i;
     double d;
     for (i = 0; i < 1000; i++)
         d += 1;
 }

 in orders of magnitude faster than the competition. The journalist
 disassembled the code, found that the compiler had removed the floating
 point instructions, and so declared the compiler to be "buggy" in his
 writeup. He never bothered to contact us to find out why. What actually
 happened was that it was the ONLY compiler reviewed that did data flow
 analysis, and concluded that the benchmark routine did nothing and so
 deleted it. Today, nearly all compilers do that.

You didn't bother to ask about the code-paths that prompted this topic,
Walter; yet seem to go out of your way to belittle any and all points made;
and then vehemently deny any possible existence of the problem. I really
don't understand why this post has apparently inflamed you so, but do offer
sincere and profound apologize. One would have thought knowing where the
codegen was seriously non-optimal for given scenarios would be helpful to
you. Would you rather not know?


 2) Implicit initialization is great. However, it can clearly and easily


be
 demonstrated to be a serious impediment. Given that D is supposed to be


a
 'systems' language, I humbly suggest that a mechanism be provided to

 bypass
 such nicities when one is prepared to assume full responsibility. For
 example, there might be a qualifier for variables that disables the
 automatic initialization (along the lines of the 'auto' qualifier).

 The solution is improving the optimizer so it can determine when the
 initialization is redundant. It already does this for most variables, just
 not for arrays.


Great!  Will it eliminate initialization for this type of (trivail) usage:

void A (void delegate (char[]) formatter)
{
    char [1024 * 32] workspace;

    formatter (workspace);
}

Because, if not, then the following modification is the sort of (keyword
qualifier) thing that would hand control back to the programmer:

{
    uninitialized char[1024 * 32] workspace;

    formatter (workspace);
}

That's perhaps not the best name for it, but you get the idea. It's very
clear what's going on, it's entirely optional, and the compiler will do
exactly what the programmer intends it to do. No under-the-tablecloth-magic
required. Oh, and it's easy to grep for. It's just a suggestion.

Aug 12 2004

Derek Parnell <derek psych.ward> writes:

On Thu, 12 Aug 2004 00:31:38 -0700, van eeshan wrote:

 "Walter" <newshound digitalmars.com> wrote

[snip]

 The solution is improving the optimizer so it can determine when the
 initialization is redundant. It already does this for most variables, just
 not for arrays.

 
 Great!  Will it eliminate initialization for this type of (trivail) usage:
 
 void A (void delegate (char[]) formatter)
 {
     char [1024 * 32] workspace;
 
     formatter (workspace);
 }
 
 Because, if not, then the following modification is the sort of (keyword
 qualifier) thing that would hand control back to the programmer:
 
 {
     uninitialized char[1024 * 32] workspace;
 
     formatter (workspace);
 }
 
 That's perhaps not the best name for it, but you get the idea. It's very
 clear what's going on, it's entirely optional, and the compiler will do
 exactly what the programmer intends it to do. No under-the-tablecloth-magic
 required. Oh, and it's easy to grep for. It's just a suggestion.

Sounds like a good idea. It allows a coder to take some responsibility for
their own work. There is also nothing wrong with D's auto-initialized
arrays *when* the coder accepts them too. I feel that D ought to support
both ideas. A new keyword is probably the way to go.

-- 
Derek
Melbourne, Australia
12/Aug/04 5:54:15 PM

Aug 12 2004

Ben Hinkle <bhinkle4 juno.com> writes:

[snip]

 2) Implicit initialization is great. However, it can clearly and easily


 be
 demonstrated to be a serious impediment. Given that D is supposed to be


 a
 'systems' language, I humbly suggest that a mechanism be provided to

 bypass
 such nicities when one is prepared to assume full responsibility. For
 example, there might be a qualifier for variables that disables the
 automatic initialization (along the lines of the 'auto' qualifier).

 The solution is improving the optimizer so it can determine when the
 initialization is redundant. It already does this for most variables,
 just not for arrays.

 
 
 Great!  Will it eliminate initialization for this type of (trivail) usage:
 
 void A (void delegate (char[]) formatter)
 {
     char [1024 * 32] workspace;
 
     formatter (workspace);
 }
 
 Because, if not, then the following modification is the sort of (keyword
 qualifier) thing that would hand control back to the programmer:
 
 {
     uninitialized char[1024 * 32] workspace;
 
     formatter (workspace);
 }
 
 That's perhaps not the best name for it, but you get the idea. It's very
 clear what's going on, it's entirely optional, and the compiler will do
 exactly what the programmer intends it to do. No
 under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's
 just a suggestion.

As a general principle I agree it would be nice to be able to skip
initialization just like it's nice to be able to turn off array-bounds
checking. But in practice (in D) I wonder how often that would pay off.
Allocating a 32K array on the stack is pretty evil, and I wonder how often
that really happens. What C code were you porting that was doing that? I
assume it was allocating such a big array not because it was going to use
it all (as you said only a small portion of the array gets used) but
because the coder was lazy and was trying to avoid worrying about array
bounds errors. As we all know overwriting array bounds invites exploits and
bugs. In D the way I've been coding APIs that take buffers as input is to
take a dynamic array as "a typical size buffer" and if the buffer turns out
to be too small then the function increases the length field of the local
variable which could potentially allocate new GC'ed memory. For example in
std/stream.d there is the function
  char[] readLine(char[] buffer);
which takes an input buffer and fills it with the next line of text from the
stream. If the line fits in the buffer the output array's length is the
number of characters read and the char* is the buffer's char*. If the line
didn't fit then readLine (potentially) allocates a new buffer (just by
increasing the length field of the local copy of the buffer variable) and
returns the char* of the new buffer. The "overflow" buffer is garbage
collected eventually so there isn't any memory leak going on. So in essence
the input buffer is just a suggested buffer for readLine but there is no
guarantee that the buffer will contain the result of the funtion call when
it is done. This approach means you can target your buffers to the typical
usage and avoid array-bounds errors.

Aug 12 2004

"van eeshan" <vanee hotmail.net> writes:

"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cfg7af$2dbv$1 digitaldaemon.com...
 [snip]

 2) Implicit initialization is great. However, it can clearly and




easily
 be
 demonstrated to be a serious impediment. Given that D is supposed to




be
 a
 'systems' language, I humbly suggest that a mechanism be provided to

 bypass
 such nicities when one is prepared to assume full responsibility. For
 example, there might be a qualifier for variables that disables the
 automatic initialization (along the lines of the 'auto' qualifier).

 The solution is improving the optimizer so it can determine when the
 initialization is redundant. It already does this for most variables,
 just not for arrays.


 Great!  Will it eliminate initialization for this type of (trivail)


usage:
 void A (void delegate (char[]) formatter)
 {
     char [1024 * 32] workspace;

     formatter (workspace);
 }

 Because, if not, then the following modification is the sort of (keyword
 qualifier) thing that would hand control back to the programmer:

 {
     uninitialized char[1024 * 32] workspace;

     formatter (workspace);
 }

 That's perhaps not the best name for it, but you get the idea. It's very
 clear what's going on, it's entirely optional, and the compiler will do
 exactly what the programmer intends it to do. No
 under-the-tablecloth-magic required. Oh, and it's easy to grep for. It's
 just a suggestion.

 As a general principle I agree it would be nice to be able to skip
 initialization just like it's nice to be able to turn off array-bounds
 checking. But in practice (in D) I wonder how often that would pay off.
 Allocating a 32K array on the stack is pretty evil, and I wonder how often
 that really happens.

Is allocating a 2K array on the stack equally evil? From the reply to AJ:

2K allocations:
heap:      504
stack[]:   5681
alloca():  57
asm:       10

32K allocations:
heap:      520
stack[]:   8812
alloca():  118
asm:       10

You can see that the 2K array initialization is almost equally obnoxious,
compared to the traditional [asm] C/C++ uninitialized approach. The whole
point is that here's a scenario where the compiler thinks it knows best, and
won't hand over control unless we run rings around it. Perhaps worse, the
whole issue is squirrelled away under the covers; you have to dig like a
rabbit just to find the culprit.

 What C code were you porting that was doing that? I
 assume it was allocating such a big array not because it was going to use
 it all (as you said only a small portion of the array gets used) but
 because the coder was lazy and was trying to avoid worrying about array
 bounds errors.

Heavy-duty text processing. The coder wasn't being lazy at all, as
out-of-bounds conditions are being checked for. The allocated (temporary)
workspace is typically used in only a small amounts, but sporadically will
get used more fully. The approach taken was apparently "we'll check for
end-of-buffer, but if anything exceeds these limits, then it's so beyond the
design-specification that it's an error condition". I understand that to be
a rather common approach. It's certainly valid, and rather effective. The D
approach to this is, " ...you /vill/ pay ze initialization cost, regardless
of vot ze design!"

(that is a caricature; no offense intended to anyone)

The existing code simply took advantage of the fact that it's perfectly
legitimate, and very fast, to do that kind of thing in C and C++. If D is
going to outlaw such approaches then it had better get it down in the
"manifesto". However, I didn't realize that D was such an ideological
exercise.

<snip>
 it is done. This approach means you can target your buffers to the typical
 usage and avoid array-bounds errors.

Yes, this is all well and good. But /redesigning/ existing software to suit
is just not a solution. While the approach you discuss is perfectly valid,
so was the original (given its apparent design scope) since it did not have
array-bounds or buffer-overrun conditions. Further, you can't reliably place
an extensible buffer on the stack; thus, you are forced to forego the
performance benefits of doing so.

This thread is not about a design-methodology issue ~ if it were, nobody
would be allowed to use the stack for array allocation, at all! It's a
simple issue of the D language not providing the programmer with
"accessible" tools to decide what's best. In this particular case it showed
up in ported code. There's nothing to say that it won't show up in original
D software either.

As you say, Ben; as a general principal, D should (absolutely) permit the
skipping of array initialization. The whole default initialization thing is
arguably wonderful, but then not providing accessible control over it is
akin to tinpot-dictatorship. For example, I will probably resort to assembly
to get around this issue: that's not exactly an accessible or acceptable
solution, for obvious reasons. I think we can agree on that basic tenet.
Your other points are perhaps more about a design philosophy: that's a
different ball-game and, applicable or not, unfortunately does not jive with
porting existing code.

The optional "keyword qualifier" example (at the top of the page) is likely
to be as trivial as one could hope for within the compiler; combined with a
little documentation, that would resolve such issues in a pragmatic and
considerate manner. Surely that is far simpler than building an optimizer to
read the programmers mind?

Or, is there some fundamental ideology at stake here?

Aug 12 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfge6g$2g1j$1 digitaldaemon.com...
 Heavy-duty text processing. The coder wasn't being lazy at all, as
 out-of-bounds conditions are being checked for. The allocated (temporary)
 workspace is typically used in only a small amounts, but sporadically will
 get used more fully. The approach taken was apparently "we'll check for
 end-of-buffer, but if anything exceeds these limits, then it's so beyond

the
 design-specification that it's an error condition". I understand that to

be
 a rather common approach. It's certainly valid, and rather effective. The

D
 approach to this is, " ...you /vill/ pay ze initialization cost,

regardless
 of vot ze design!"

Here's an example of an alternate way to do it, for code that would
typically use only a small amount, but occaisionally could use a great deal.
It can also be set up to never overflow (at least until all memory is
exhausted) by repeatedly resizing it. This has the advantage of not needing
to set up any error recovery, or create and document an error message for
the limits.

import std.stdio;

char[] formatter(char[] s)
{
    if (s.length < 14)        // if our passed buffer isn't big enough
        s.length = 14;        // yes, we can reallocate things that were
originally on the stack
    s[0..14] = "abcdefghijklmn";    // copy result into s[]
    return s;
}


void main()
{   char[4] workspace;    // assume 4 is size typically needed in
formatter()
    char[] s;

    s = formatter(workspace);
    writefln(s);
}

Aug 12 2004

"van eeshan" <vanee hotmail.net> writes:

"Walter" <newshound digitalmars.com> wrote ...
 Here's an example of an alternate way to do it, for code that would
 typically use only a small amount, but occaisionally could use a great

deal.

Thanks Walter, for the alternate implementation. Ben had already suggested
that, and I am aware of various other approaches too.

However, let's not confuse this topic with that of design-methodology. As
was noted (in the post you replied to) this is a port. Do you think it
reasonable to state, " ... Now listen up you ...scurrilous ... miserable
bunch of C coders! If you want to join the D party, then you're gonna' have
to shape-up, and drop all those implementation approaches that were valid
and comfortable. We don't have any of that nonsense over here! If you want
your damned existing code to run fast, well then, by golly you'd better make
sure you redesign it too!" ?

... and all because there's no accessible way to inhibit default
initialization?  The issues at hand are several:

1) the original implementation was, and still is, a perfectly valid approach
(given the requirements). It is also executes rather efficiently. The same
pattern is quite likely to occur in original D code also. Let's try to avoid
whether other approaches are better or worse; that doesn't help. Not even
the best education can thoroughly resolve such problems. For the record,
your alternative still needs an additional boundary-check plus error-message
for equivalent pathological conditions (out of memory).

2) what you are now suggesting is that the program should be redesigned
instead. That is simply not realistic for working, optimal code. The code
compiled and executed without too much effort, but it exhibits performance
issues. That's what drove the foray into this topic. I will probably fix it
with the assembly-code thoughtfully provided by AJ (and thanks to Ben also
for both his suggestions) but that's hardly an accessible, or portable,
resolution in the general sense.

3) the solution offered (by both yourself and by Ben) is not allocated on
the stack. The whole point here is to utilize the stack because it's a much
faster means of providing a temporary workspace. Going to the heap means a
partial redesign, and the program will still run slower in D than it did
originally. Forcing stack arrays to initialize will /always/ be dramatically
slower than the equivalent C/C++ code, whereas a simple embellishment to the
existing initialization mechanism would resolve that completely; and in a
jiffy. This is about choice, Walter, and supporting the programmer in what
they are doing.

4) the approach to addressing this seems to be one of "guilty until proven
innocent". That is, there doesn't seem to be any notion on your part that
this might, actually, be a valid observation. With solid number to back it
up. You're basically saying, " ... the approach never had, and never will,
have any validity whatsoever. Rip out the code and do it just like I would".
I do sincerely hope that I'm misunderstanding you, but that's kinda' how it
has been coming across.

May I ask why some easy manual control over default-initialization is such a
terrible thing? Is it really /so/ impossible that you'd rather force people
into re-implementing and retesting parts of their existing codebase instead
(as you suggest above) ?

Or is there perhaps some fundamental ideology at stake, that I just don't
see?




"Walter" <newshound digitalmars.com> wrote in message
news:cfgqv4$2lq9$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfge6g$2g1j$1 digitaldaemon.com...
 Heavy-duty text processing. The coder wasn't being lazy at all, as
 out-of-bounds conditions are being checked for. The allocated


(temporary)
 workspace is typically used in only a small amounts, but sporadically


will
 get used more fully. The approach taken was apparently "we'll check for
 end-of-buffer, but if anything exceeds these limits, then it's so beyond

 the
 design-specification that it's an error condition". I understand that to

 be
 a rather common approach. It's certainly valid, and rather effective.


The
 D
 approach to this is, " ...you /vill/ pay ze initialization cost,

 regardless
 of vot ze design!"

 Here's an example of an alternate way to do it, for code that would
 typically use only a small amount, but occaisionally could use a great

deal.
 It can also be set up to never overflow (at least until all memory is
 exhausted) by repeatedly resizing it. This has the advantage of not

needing
 to set up any error recovery, or create and document an error message for
 the limits.

 import std.stdio;

 char[] formatter(char[] s)
 {
     if (s.length < 14)        // if our passed buffer isn't big enough
         s.length = 14;        // yes, we can reallocate things that were
 originally on the stack
     s[0..14] = "abcdefghijklmn";    // copy result into s[]
     return s;
 }


 void main()
 {   char[4] workspace;    // assume 4 is size typically needed in
 formatter()
     char[] s;

     s = formatter(workspace);
     writefln(s);
 }

Aug 12 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfgvi5$2npu$1 digitaldaemon.com...
 1) the original implementation was, and still is, a perfectly valid

approach
 (given the requirements).

And it is valid and works in D.

 It is also executes rather efficiently. The same
 pattern is quite likely to occur in original D code also. Let's try to

avoid
 whether other approaches are better or worse; that doesn't help. Not even
 the best education can thoroughly resolve such problems. For the record,
 your alternative still needs an additional boundary-check

Yes.

 plus error-message
 for equivalent pathological conditions (out of memory).

This is not necessary, because very few programs can proceed if they run out
of memory, so they just issue an error message and abort. In D, this is
handled automatically - running out of memory throws an exception with a
message in it, if you don't catch the exception, the message is printed and
the program aborted. For nearly all programs, this is the right behavior.

 2) what you are now suggesting is that the program should be redesigned
 instead. That is simply not realistic for working, optimal code. The code
 compiled and executed without too much effort, but it exhibits performance
 issues. That's what drove the foray into this topic. I will probably fix

it
 with the assembly-code thoughtfully provided by AJ (and thanks to Ben also
 for both his suggestions) but that's hardly an accessible, or portable,
 resolution in the general sense.

I suggest the alloca() approach, and do some timing tests on the real app
with it vs AJ's approach. I'm willing to bet you a beer that you won't be
able to measure the difference, especially if the typical use of the array
will fill it 2 or 3 Kb worth.

 3) the solution offered (by both yourself and by Ben) is not allocated on
 the stack.

The alloca() version is.

 The whole point here is to utilize the stack because it's a much
 faster means of providing a temporary workspace. Going to the heap means a
 partial redesign, and the program will still run slower in D than it did
 originally. Forcing stack arrays to initialize will /always/ be

dramatically
 slower than the equivalent C/C++ code, whereas a simple embellishment to

the
 existing initialization mechanism would resolve that completely; and in a
 jiffy. This is about choice, Walter, and supporting the programmer in what
 they are doing.

 4) the approach to addressing this seems to be one of "guilty until proven
 innocent". That is, there doesn't seem to be any notion on your part that
 this might, actually, be a valid observation. With solid number to back it
 up. You're basically saying, " ... the approach never had, and never will,
 have any validity whatsoever. Rip out the code and do it just like I

would".
 I do sincerely hope that I'm misunderstanding you, but that's kinda' how

it
 has been coming across.

 May I ask why some easy manual control over default-initialization is such

a
 terrible thing? Is it really /so/ impossible that you'd rather force

people
 into re-implementing and retesting parts of their existing codebase

instead
 (as you suggest above) ?

 Or is there perhaps some fundamental ideology at stake, that I just don't
 see?

Uninitialized data can be a source of bugs and trouble, as I pointed out in
my other posting, even when used correctly. One design goal of D is to
improve reliability and portability by eliminating sources of undefined
behavior, and uninitialized data is one huge source of undefined,
unportable, erratic and unpredictable behavior. I'm trying to squeeze that
out of the normal constructs in D - but access to C functions and inline
assembler is there for folks who really need it.

Give the other methods presented a chance, and time them in context.

Aug 12 2004

Michael <Michael_member pathlink.com> writes:

Just out of curiosity, how does alloca() work? I guess it just manipulates the
Stack and Base Pointers?

My apologies for going a bit off topic, I've just never seen alloca() before.

Aug 13 2004

"Walter" <newshound digitalmars.com> writes:

"Michael" <Michael_member pathlink.com> wrote in message
news:cfhr6d$9qj$1 digitaldaemon.com...
 Just out of curiosity, how does alloca() work? I guess it just manipulates

the
 Stack and Base Pointers?

 My apologies for going a bit off topic, I've just never seen alloca()

before.

It manipulates the ESP register. There is some complexity in it due to
needing to 'touch' each 4k page allocated, and to relocate expression
temporaries that are pushed on the stack. The source for it comes on the
DMC++ CD.

Aug 13 2004

Michael <Michael_member pathlink.com> writes:

Ahh, I hadn't thought about expression temps. Very interesting stuff. I think I
may order that CD. 

Thanks very much for the information.


 Just out of curiosity, how does alloca() work? I guess it just manipulates

the
 Stack and Base Pointers?

 My apologies for going a bit off topic, I've just never seen alloca()

before.

It manipulates the ESP register. There is some complexity in it due to
needing to 'touch' each 4k page allocated, and to relocate expression
temporaries that are pushed on the stack. The source for it comes on the
DMC++ CD.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

"Walter" <newshound digitalmars.com> wrote in message
news:cfhk9n$5jh$1 digitaldaemon.com...
 plus error-message
 for equivalent pathological conditions (out of memory).

 This is not necessary, because very few programs can proceed if they run

out
 of memory, so they just issue an error message and abort. In D, this is
 handled automatically - running out of memory throws an exception with a
 message in it, if you don't catch the exception, the message is printed

and
 the program aborted. For nearly all programs, this is the right behavior.

Perhaps you haven't had the opportunity to write resiliant software? For
example, halting an entire web-site, because some file is not in the correct
format, is just not a solution. Frankly, I'm really surprised you'd
encourage
something so obviously naiive in this day and age.

With a server, you do whatever is necessary to track and release allocations
to ensure the server stays alive as long as it can. Even better; you  focus
on temporarily /stack/ allocations (surprise!), and thread-local reusable
buffers, so there's never any issue about running out of memory (as a bonus,
it's all thread-safe). Please don't waste your efforts on me in this
respect: Been
there. Done that. Seen the movie a gazillion times. Optimize it while I
sleep.


 2) what you are now suggesting is that the program should be redesigned
 instead. That is simply not realistic for working, optimal code. The


code
 compiled and executed without too much effort, but it exhibits


performance
 issues. That's what drove the foray into this topic. I will probably fix

 it
 with the assembly-code thoughtfully provided by AJ (and thanks to Ben


also
 for both his suggestions) but that's hardly an accessible, or portable,
 resolution in the general sense.

 I suggest the alloca() approach, and do some timing tests on the real app
 with it vs AJ's approach. I'm willing to bet you a beer that you won't be
 able to measure the difference, especially if the typical use of the array
 will fill it 2 or 3 Kb worth.

You are well aware that this is not about alloca() versus assembly.


 3) the solution offered (by both yourself and by Ben) is not allocated


on
 the stack.

 The alloca() version is.

Was referring to the example Ben gave that looked rather like your own one.
The one he noted in the post you  recently replied to.


 The whole point here is to utilize the stack because it's a much
 faster means of providing a temporary workspace. Going to the heap means


a
 partial redesign, and the program will still run slower in D than it did
 originally. Forcing stack arrays to initialize will /always/ be

 dramatically
 slower than the equivalent C/C++ code, whereas a simple embellishment to

 the
 existing initialization mechanism would resolve that completely; and in


a
 jiffy. This is about choice, Walter, and supporting the programmer in


what
 they are doing.

 4) the approach to addressing this seems to be one of "guilty until


proven
 innocent". That is, there doesn't seem to be any notion on your part


that
 this might, actually, be a valid observation. With solid number to back


it
 up. You're basically saying, " ... the approach never had, and never


will,
 have any validity whatsoever. Rip out the code and do it just like I

 would".
 I do sincerely hope that I'm misunderstanding you, but that's kinda' how

 it
 has been coming across.

 May I ask why some easy manual control over default-initialization is


such
 a
 terrible thing? Is it really /so/ impossible that you'd rather force

 people
 into re-implementing and retesting parts of their existing codebase

 instead
 (as you suggest above) ?

 Or is there perhaps some fundamental ideology at stake, that I just


don't
 see?

 Uninitialized data can be a source of bugs and trouble, as I pointed out

in
 my other posting, even when used correctly. One design goal of D is to
 improve reliability and portability by eliminating sources of undefined
 behavior, and uninitialized data is one huge source of undefined,
 unportable, erratic and unpredictable behavior. I'm trying to squeeze that
 out of the normal constructs in D - but access to C functions and inline
 assembler is there for folks who really need it.

Most admirable, Walter. And agreed, as I've stated in virtually every single
post on this topic.

But here's where that position truly falls apart ... what about those
multitude of cases where buffers are reused over and over again? What about
thread-local workspaces? What about pooled buffers?  what about all those X,
Y, and Z variations? They do /not/ get automagically re-initialized.

Your argument, as you state it, is one-shot only. That is, the whole basis
of said argument hold validity only as long as the "uninitialized data" is
used once, and once only. Anyone who produces high-performance designs
and/or code will chortle with glee at this glaring logic hole (where such
buffer are deliberately reused thousands, sometimes millions, of times per
second). Such systems do not have the time to spend clearing out each little
byte of data ... they simply ensure that it's never a problem. Can we please
get past this level of discussion?

You know; in the time you've taken over these postings, you could probably
have implemented a keyword qualifier to optionally inhibit the default
initialization. Again, we're talking about a deliberate action by the
programmer to say " I don't want that frickin CPU gobblin initialization!".
From the developers position, your noble ideology has already  been placed
to one side at that point. The D language should be able to help there,
rather
than just get in the way. Remember? It's a "Practical Language for Practical
Programmers".

Please consider.

Aug 13 2004

Sean Kelly <sean f4.ca> writes:

van eeshan wrote:
 
 Perhaps you haven't had the opportunity to write resiliant software? For
 example, halting an entire web-site, because some file is not in the correct
 format, is just not a solution. Frankly, I'm really surprised you'd
 encourage something so obviously naiive in this day and age.

I think Walter meant that this is typically the proper behavior for an 
out of memory condition.  He also notes that an exception is thrown so 
"resilient software" has a chance to do something about the problem 
instead of aborting.  C++ behaves the exact same way, however C++ does 
not automatically catch the error and display a meaningful message if 
the programmer forgot to write a try/catch block.


Sean

Aug 13 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfhv05$bl9$1 digitaldaemon.com...
 Uninitialized data can be a source of bugs and trouble, as I pointed out

 in
 my other posting, even when used correctly. One design goal of D is to
 improve reliability and portability by eliminating sources of undefined
 behavior, and uninitialized data is one huge source of undefined,
 unportable, erratic and unpredictable behavior. I'm trying to squeeze


that
 out of the normal constructs in D - but access to C functions and inline
 assembler is there for folks who really need it.

 But here's where that position truly falls apart ... what about those
 multitude of cases where buffers are reused over and over again? What

about
 thread-local workspaces? What about pooled buffers?  what about all those

X,
 Y, and Z variations? They do /not/ get automagically re-initialized.

Right, they don't. However, data buffers are unlikely to contain garbage
that looks like pointers into the gc heap, whereas garbage on the stack *is*
very likely to. Next, detritus in recycled data buffers is not going to
consist of objects that are no longer valid, the source of the other problem
I mentioned. And lastly, D can't close every hole, but that doesn't mean it
shouldn't close what it can.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

May I point you towards this post, which has an attempted summary of the
discussion thus far? news:cfhqhu$9jj$1 digitaldaemon.com

Would very much appreciate your comments on those points.

I think, perhaps, this unfolding picture indicates it's not a question of
ideology per se; but one of stack-usage being somewhat incompatible with the
current GC design. If there were some means to quickly eliminate what look
like old GC references from the (reused) portion of the stack that's still
live, then what Ben has suggested would presumably be more palatable?
Currently, the method employed to achieve this is to wipe everything to a
predefined default value; which can be agonizingly slow compared to the
C/C++ approach.

Is that a fair assessment?


"Walter" <newshound digitalmars.com> wrote in message
news:cfisfb$s39$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfhv05$bl9$1 digitaldaemon.com...
 Uninitialized data can be a source of bugs and trouble, as I pointed



out
 in
 my other posting, even when used correctly. One design goal of D is to
 improve reliability and portability by eliminating sources of



undefined
 behavior, and uninitialized data is one huge source of undefined,
 unportable, erratic and unpredictable behavior. I'm trying to squeeze


 that
 out of the normal constructs in D - but access to C functions and



inline
 assembler is there for folks who really need it.

 But here's where that position truly falls apart ... what about those
 multitude of cases where buffers are reused over and over again? What

 about
 thread-local workspaces? What about pooled buffers?  what about all


those
 X,
 Y, and Z variations? They do /not/ get automagically re-initialized.

 Right, they don't. However, data buffers are unlikely to contain garbage
 that looks like pointers into the gc heap, whereas garbage on the stack

*is*
 very likely to. Next, detritus in recycled data buffers is not going to
 consist of objects that are no longer valid, the source of the other

problem
 I mentioned. And lastly, D can't close every hole, but that doesn't mean

it
 shouldn't close what it can.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

Addendum.

I misread your code somewhat, Water, but what you suggest has a fatal flaw
in it:  you are still initializing the original stack-based array, and
you're making the assumption that the typical size is very small. Further,
you are then extending the array via the heap which will, I imagine,
initialize the extended part of the array. That just doesn't work because:

a) there's a significant amount of effort to convert all the "workers" to do
what you suggest. Plus, the code is less efficient, there's more of it, and
there's possible heap-fragmentation involved.

b) the initial size needs to be around the 2 to 3KB mark. Given that, your
example is hardly faster than the original code. In fact, the initialization
cost is ~57000% slower than the equivalent C/C++ array allocation. That's
better than ~88000%, but who's to quibble at these levels?

I wish it were so simple as your suggestion conveyed. Conversely, if there
were a syntactic means of optionally inhibiting the array initialization,
there would be no related issue at all. We could all be radiantly happy, and
have a beer or something.




"Walter" <newshound digitalmars.com> wrote in message
news:cfgqv4$2lq9$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfge6g$2g1j$1 digitaldaemon.com...
 Heavy-duty text processing. The coder wasn't being lazy at all, as
 out-of-bounds conditions are being checked for. The allocated


(temporary)
 workspace is typically used in only a small amounts, but sporadically


will
 get used more fully. The approach taken was apparently "we'll check for
 end-of-buffer, but if anything exceeds these limits, then it's so beyond

 the
 design-specification that it's an error condition". I understand that to

 be
 a rather common approach. It's certainly valid, and rather effective.


The
 D
 approach to this is, " ...you /vill/ pay ze initialization cost,

 regardless
 of vot ze design!"

 Here's an example of an alternate way to do it, for code that would
 typically use only a small amount, but occaisionally could use a great

deal.
 It can also be set up to never overflow (at least until all memory is
 exhausted) by repeatedly resizing it. This has the advantage of not

needing
 to set up any error recovery, or create and document an error message for
 the limits.

 import std.stdio;

 char[] formatter(char[] s)
 {
     if (s.length < 14)        // if our passed buffer isn't big enough
         s.length = 14;        // yes, we can reallocate things that were
 originally on the stack
     s[0..14] = "abcdefghijklmn";    // copy result into s[]
     return s;
 }


 void main()
 {   char[4] workspace;    // assume 4 is size typically needed in
 formatter()
     char[] s;

     s = formatter(workspace);
     writefln(s);
 }

Aug 12 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfh14k$2ouk$1 digitaldaemon.com...
 I misread your code somewhat, Water, but what you suggest has a fatal flaw
 in it:  you are still initializing the original stack-based array, and
 you're making the assumption that the typical size is very small.

I'm confused - you wrote: "The allocated (temporary) workspace is typically
used in only a small amounts, but sporadically will get used more fully." My
suggestion was tuned to that.

 Further,
 you are then extending the array via the heap which will, I imagine,
 initialize the extended part of the array.

That's correct.

 That just doesn't work because:
 a) there's a significant amount of effort to convert all the "workers" to

do
 what you suggest.

Correct, but resizing dynamic arrays in D is pretty easy. It's not as klunky
and error-prone as in C.

 Plus, the code is less efficient,

Since the resize will only happen in atypical cases, its efficiency should
not impact overall performance.

 there's more of it, and
 there's possible heap-fragmentation involved.

Heap fragmentation is much less of an issue than it used to be, with large
memory pools and virtual memory. I know how to build a gc that will compact
memory to deal with fragmentation when it does arise, so although this is
not in the current D it is not a technical problem to do it.

 b) the initial size needs to be around the 2 to 3KB mark. Given that, your
 example is hardly faster than the original code. In fact, the

initialization
 cost is ~57000% slower than the equivalent C/C++ array allocation. That's
 better than ~88000%, but who's to quibble at these levels?

If your app typically is going to fill 2 to 3 Kb, I suggest using malloc()
(or alloca()) for it. The handful of extra instructions to allocate via
malloc as opposed to the stack will be lost in the thousands of instructions
needed to fill 2 to 3 Kb.

 I wish it were so simple as your suggestion conveyed. Conversely, if there
 were a syntactic means of optionally inhibiting the array initialization,
 there would be no related issue at all. We could all be radiantly happy,

and
 have a beer or something.

Allocating large, uninitialized arrays on the stack has its own problems:

1) The uninitialized data that is on the stack will get scanned by the
garbage collector looking for any references to allocated memory. Since the
uninitialized data consists of old D stack frames, it is highly likely that
some of that garbage will look like references into the gc heap, and the gc
memory will not get freed. This problem really does happen, and can be
pretty frustrating to track down. Some of my C code that uses GC gets a
memset() on the stack frame for that reason.

2) It's possible for a function to pass out of it a reference to data on
that function's stack frame. By then allocating a new stack frame over the
old data, and not initializing, the reference to the old data may still
appear to be valid. The program will then behave erratically. Initializing
all data on the stack frame will greatly increase the probability of forcing
that bug into the open in a repeatable manner.

3) Large arrays on the stack bring up the specter of stack exhaustion with
even modest amounts of recursion.

That said, there are things I can do to generate faster code for the
initialization.

Aug 12 2004

Arcane Jill <Arcane_member pathlink.com> writes:

In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...

I know how to build a gc that will compact
memory to deal with fragmentation when it does arise, so although this is
not in the current D it is not a technical problem to do it.

That's probably good news, but:

Will it be possible to mark specific arrays on the heap as "immovable"? (Please
say yes. I'm sure it is not a technical problem).

Arcane Jill

Aug 12 2004

"Walter" <newshound digitalmars.com> writes:

"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cfhoe3$793$1 digitaldaemon.com...
 In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...

I know how to build a gc that will compact
memory to deal with fragmentation when it does arise, so although this is
not in the current D it is not a technical problem to do it.

 That's probably good news, but:

 Will it be possible to mark specific arrays on the heap as "immovable"?

(Please
 say yes. I'm sure it is not a technical problem).

Yes. It's called "pinning". And I can guess why you need this feature <g>.

Aug 13 2004

Arcane Jill <Arcane_member pathlink.com> writes:

In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...

Allocating large, uninitialized arrays on the stack has its own problems:

1) The uninitialized data that is on the stack will get scanned by the
garbage collector looking for any references to allocated memory. Since the
uninitialized data consists of old D stack frames, it is highly likely that
some of that garbage will look like references into the gc heap, and the gc
memory will not get freed. This problem really does happen, and can be
pretty frustrating to track down. Some of my C code that uses GC gets a
memset() on the stack frame for that reason.

That is a very good point. In fact, it's an excellent point, and one which I had
not previously considered. Although, I suppose you could get round it with
something like this:








2) It's possible for a function to pass out of it a reference to data on
that function's stack frame. By then allocating a new stack frame over the
old data, and not initializing, the reference to the old data may still
appear to be valid. The program will then behave erratically. Initializing
all data on the stack frame will greatly increase the probability of forcing
that bug into the open in a repeatable manner.

That's another very good point.


3) Large arrays on the stack bring up the specter of stack exhaustion with
even modest amounts of recursion.

I'm not sure about this one. I thought that in the 80386+ architecture, the
stack(s) can grow indefinitely. Or is it just that Windows doesn't take
advantage of this possibility.



That said, there are things I can do to generate faster code for the
initialization.

I suspect that if you succeed in making this lightening-fast, everyone will stop
complaining. You only have to look at the figures which have been posted
recently to see what everybody's complaining about.

Aug 13 2004

"Walter" <newshound digitalmars.com> writes:

"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cfhppt$97t$1 digitaldaemon.com...
 In article <cfhhbe$3jf$1 digitaldaemon.com>, Walter says...
3) Large arrays on the stack bring up the specter of stack exhaustion


with
even modest amounts of recursion.

 I'm not sure about this one. I thought that in the 80386+ architecture,

the
 stack(s) can grow indefinitely. Or is it just that Windows doesn't take
 advantage of this possibility.

Stacks are allocated a range of memory. Once that's filled up, since the
stack cannot be relocated, the program comes to a (crashing) halt. The
problem gets worse if you create a lot of threads, because each thread will
necessarilly have a fixed maximum stack size.

The stack size only appears to be infinite to those of us used to 16 bit
machines <g>.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

"Walter" <newshound digitalmars.com> wrote in message
news:cfhhbe$3jf$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfh14k$2ouk$1 digitaldaemon.com...
 I misread your code somewhat, Water, but what you suggest has a fatal


flaw
 in it:  you are still initializing the original stack-based array, and
 you're making the assumption that the typical size is very small.

 I'm confused - you wrote: "The allocated (temporary) workspace is

typically
 used in only a small amounts, but sporadically will get used more fully."

My
 suggestion was tuned to that.

This topic had become attuned somewhat to 16KB and 32KB buffers, so the
typical population of 2 to 3KB is relatively small; I can understand how
that might have been misinterpreted to become the < 5 bytes you had tuned
for.

 b) the initial size needs to be around the 2 to 3KB mark. Given that,


your
 example is hardly faster than the original code. In fact, the

 initialization
 cost is ~57000% slower than the equivalent C/C++ array allocation.


That's
 better than ~88000%, but who's to quibble at these levels?

 If your app typically is going to fill 2 to 3 Kb, I suggest using malloc()
 (or alloca()) for it. The handful of extra instructions to allocate via
 malloc as opposed to the stack will be lost in the thousands of

instructions
 needed to fill 2 to 3 Kb.

Good that you should note the thousands of instructions, because that's
effectively what the default initialization does (fills the entire array
with default data).  This particular app is actually very efficient in
filling the 'typical' space used; you might think of it as effectively a
quick lookup followed by a memcpy(). If you consider that as the core
element of the algorithm; then the default initialization will slow that
core down by almost a factor of two (clear the entire buffer, then fill the
whole thing again). That slows down the overall application quite
noticeably.  One is rather likely to notice that level of redundant overhead
in any CPU-bound application that doesn't completely toss the notion of
efficiency out of the proverbial window.

 Allocating large, uninitialized arrays on the stack has its own problems:

 1) The uninitialized data that is on the stack will get scanned by the
 garbage collector looking for any references to allocated memory. Since

the
 uninitialized data consists of old D stack frames, it is highly likely

that
 some of that garbage will look like references into the gc heap, and the

gc
 memory will not get freed. This problem really does happen, and can be
 pretty frustrating to track down. Some of my C code that uses GC gets a
 memset() on the stack frame for that reason.

So what you are saying is that the current GC is incompatible with stack
allocation in general? What you describe is not restricted to uninitialized
data, Walter ~ what you appear to be saying is that /any/ data on the stack
could potentially look like a reference into the gc heap (as you describe
it). If this is the case, the use of alloca() will have exactly the same
issues, yet you recommend that as a solution a few paragraphs above. Where
does this really stand? Are you really saying that the D GC is incompatible
with long-term stack content? That is, if I place some arbitrary values in a
stack array during main(), will those values cause the GC confusion
throughout the remainder of program execution if they happen to look like GC
references?

 2) It's possible for a function to pass out of it a reference to data on
 that function's stack frame. By then allocating a new stack frame over the
 old data, and not initializing, the reference to the old data may still
 appear to be valid. The program will then behave erratically. Initializing
 all data on the stack frame will greatly increase the probability of

forcing
 that bug into the open in a repeatable manner.

Oh ... I thought we were already well past the general case here, and into
the realm of where the programmer had alreay taken full responsibility for
their actions. Specifically, not initializing the stack content. Let's face
it: it's quite feasible for any function to barf all over the heap, stack,
and everywhere else until the hardware catches it.

And what about the cases whereby a stack buffer is consistently reused
within the same, or subordinate, function? It's not suddenly cleared all
over again before each subsequent usage (e.g. between foreach loops), is it?
This appears to be a completely moot (and potentially misleading) point, but
please educate me if I'm wrong. Seriously ...

 3) Large arrays on the stack bring up the specter of stack exhaustion with
 even modest amounts of recursion.

Anyone who has read this thread is well aware that we're not talking
recursion. Regardless; your sudden concern didn't stop this being a common
and successful approach in C/C++. I mean, did you outlaw large(ish)
stack-arrays in the DM C++ compiler? That is not a rhetorical question ...

Besides, as noted previously, it's best we not confuse this topic with
design-methodologies; as even the best education cannot reconcile such
issues. I hope we can agree on that.

 That said, there are things I can do to generate faster code for the
 initialization.

Yes, I'm aware of that. You could, for example, initialize in chunks of four
rather than one. However, that will still not even get close to a C/C++
style array-allocation on the stack. But the thought is both appreciated and
noted.



To summarize thus far:

a) we have a D language facility to initialize all variables with default
values. This is arguably a wonderful facility, and can help catch certain
bugs in the case of general programming.

b) unfortunately there's no easy "accessible" means to inhibit that
initialization for stack-based arrays; leaving the programmer to eat a
potentially (and repeated) vast overhead of CPU consumption, or resort to
some non-obvious guru-tricks (none of which will easily work on anything
other than a single dimensional array) ... Not to mention that this CPU cost
is completely hidden from source-level view in the first place.

c) you indicate, Walter, that the GC has problems with uninitialized data on
the stack. Well, that will happen regardless of whether or not the language
supports non-initialization. That is; a valid stack array can contain
arbitrary data patterns ~ said content may look exactly like gc references.
Please correct me if I'm wrong.

d) The language explicitly exposes a built-in assembler, to do whatever the
heck anyone might want to (barring the usual platform issues). Yet, there
seems to be a massive pushback against providing an easy to use,
symmetrical, high performance, and thoroughly useful counter to the default
initialization ~ the alternative to which is to drop into assembly (or
similar guru-grok), to simulate what C/C++ provides as its default, highly
efficient, behavior. There's not even the vaguest counter-argument relating
to implementation complexity.

Please forgive the unsavory mess while this latter, wildly contradictory
concept, blows my tiny non-academic mind all over the inside of your screen
...

Let me finish with the sincere hope that my (obviously) thoroughly weak
grasp on reality will soon be at an end. Can you please help me wrangle that
into shape Walter?

Aug 13 2004

Arcane Jill <Arcane_member pathlink.com> writes:

In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...

Since

the
 uninitialized data consists of old D stack frames, it is highly likely

that
 some of that garbage will look like references into the gc heap, and the

gc
 memory will not get freed.


Are you really saying that the D GC is incompatible
with long-term stack content? That is, if I place some arbitrary values in a
stack array during main(), will those values cause the GC confusion
throughout the remainder of program execution if they happen to look like GC
references?

I think you may have misunderstood Walter. Arbitrary random data can indeed give
the GC false positives, but this usually doesn't matter, since it is rare. What
Walter said is that uninitialized stack data is /not random/ - in that it is
almost certain to contain fragments of some previous stack frame, and therefore
also /real, valid/ (but out-of-date) references to GC-controlled memory. So the
number of false positives will shoot up from "almost none" to "very many".

(And my previous suggestion of temporarily disabling the GC won't help either,
so I withdraw the suggestion).

Jill

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

Fair point Jill.

But at that stage, one might as well state that the stack is entirely off
limits for general array usage. You cannot predict the frequency of what
might or might not look like GC references for any given application. It
would require dynamic sampling to find a trend and try to offset that
against GC behavior. The real problem, in this particular case, is the GC ~
it's simply incompatible with arbitrary (yet valid) stack-based array
content. How much of a problem is that? It may actually be a red-herring. On
the other hand, I may be waning considerably at this very moment (eight
hours behind you).



"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cfhvbq$bra$1 digitaldaemon.com...
 In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...

Since

the
 uninitialized data consists of old D stack frames, it is highly likely

that
 some of that garbage will look like references into the gc heap, and



the
gc
 memory will not get freed.


Are you really saying that the D GC is incompatible
with long-term stack content? That is, if I place some arbitrary values


in a
stack array during main(), will those values cause the GC confusion
throughout the remainder of program execution if they happen to look like


GC
references?

 I think you may have misunderstood Walter. Arbitrary random data can

indeed give
 the GC false positives, but this usually doesn't matter, since it is rare.

What
 Walter said is that uninitialized stack data is /not random/ - in that it

is
 almost certain to contain fragments of some previous stack frame, and

therefore
 also /real, valid/ (but out-of-date) references to GC-controlled memory.

So the
 number of false positives will shoot up from "almost none" to "very many".

 (And my previous suggestion of temporarily disabling the GC won't help

either,
 so I withdraw the suggestion).

 Jill

Aug 13 2004

"Walter" <newshound digitalmars.com> writes:

"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cfhvbq$bra$1 digitaldaemon.com...
 In article <cfhqhu$9jj$1 digitaldaemon.com>, van eeshan says...

Since the
 uninitialized data consists of old D stack frames, it is highly likely



that
 some of that garbage will look like references into the gc heap, and



the
 gc memory will not get freed.


Are you really saying that the D GC is incompatible
with long-term stack content? That is, if I place some arbitrary values


in a
stack array during main(), will those values cause the GC confusion
throughout the remainder of program execution if they happen to look like


GC
references?

 I think you may have misunderstood Walter. Arbitrary random data can

indeed give
 the GC false positives, but this usually doesn't matter, since it is rare.

What
 Walter said is that uninitialized stack data is /not random/ - in that it

is
 almost certain to contain fragments of some previous stack frame, and

therefore
 also /real, valid/ (but out-of-date) references to GC-controlled memory.

So the
 number of false positives will shoot up from "almost none" to "very many".

You're right, I'd forgotten to make that clear. Uninitialized data is
"garbage", but that's not at all the same thing as "random". In my extensive
experiments with this, text and integer "garbage" rarely looks like a valid
pointer into the gc heap, but old stack frames are very likely to.

Also, it is not a matter of "incompatibility" with gc, it's more a matter of
the gc being more effective.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

So, the resolution on this has apparently been implemented. If you write:

{
    char[] x;
    const int size = 2048;

    x = cast(char[]) alloca(size)[0..size];
}

... then you will get the equivalent of a C/C++ stack array, without all the
prior initialization overhead involved. While this is a VeryGoodThing, it
could use some syntactic sugar ...

There were a number of suggestions made throughout this topic; here's a
brief recap of three:

{
    uninitialized char[2048] x;

    char[2048] x = void;

    char[] x = stackalloc char[2048];
}

The first two may find use outside of stack-allocation, whilst the third is
clearly stack-based. The second suggestion avoids creating an additional
language 'qualifier'.

How about it?



"Walter" <newshound digitalmars.com> wrote in message
news:cfhhbe$3jf$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfh14k$2ouk$1 digitaldaemon.com...
 I misread your code somewhat, Water, but what you suggest has a fatal


flaw
 in it:  you are still initializing the original stack-based array, and
 you're making the assumption that the typical size is very small.

 I'm confused - you wrote: "The allocated (temporary) workspace is

typically
 used in only a small amounts, but sporadically will get used more fully."

My
 suggestion was tuned to that.

 Further,
 you are then extending the array via the heap which will, I imagine,
 initialize the extended part of the array.

 That's correct.

 That just doesn't work because:
 a) there's a significant amount of effort to convert all the "workers"


to
 do
 what you suggest.

 Correct, but resizing dynamic arrays in D is pretty easy. It's not as

klunky
 and error-prone as in C.

 Plus, the code is less efficient,

 Since the resize will only happen in atypical cases, its efficiency should
 not impact overall performance.

 there's more of it, and
 there's possible heap-fragmentation involved.

 Heap fragmentation is much less of an issue than it used to be, with large
 memory pools and virtual memory. I know how to build a gc that will

compact
 memory to deal with fragmentation when it does arise, so although this is
 not in the current D it is not a technical problem to do it.

 b) the initial size needs to be around the 2 to 3KB mark. Given that,


your
 example is hardly faster than the original code. In fact, the

 initialization
 cost is ~57000% slower than the equivalent C/C++ array allocation.


That's
 better than ~88000%, but who's to quibble at these levels?

 If your app typically is going to fill 2 to 3 Kb, I suggest using malloc()
 (or alloca()) for it. The handful of extra instructions to allocate via
 malloc as opposed to the stack will be lost in the thousands of

instructions
 needed to fill 2 to 3 Kb.

 I wish it were so simple as your suggestion conveyed. Conversely, if


there
 were a syntactic means of optionally inhibiting the array


initialization,
 there would be no related issue at all. We could all be radiantly happy,

 and
 have a beer or something.

 Allocating large, uninitialized arrays on the stack has its own problems:

 1) The uninitialized data that is on the stack will get scanned by the
 garbage collector looking for any references to allocated memory. Since

the
 uninitialized data consists of old D stack frames, it is highly likely

that
 some of that garbage will look like references into the gc heap, and the

gc
 memory will not get freed. This problem really does happen, and can be
 pretty frustrating to track down. Some of my C code that uses GC gets a
 memset() on the stack frame for that reason.

 2) It's possible for a function to pass out of it a reference to data on
 that function's stack frame. By then allocating a new stack frame over the
 old data, and not initializing, the reference to the old data may still
 appear to be valid. The program will then behave erratically. Initializing
 all data on the stack frame will greatly increase the probability of

forcing
 that bug into the open in a repeatable manner.

 3) Large arrays on the stack bring up the specter of stack exhaustion with
 even modest amounts of recursion.

 That said, there are things I can do to generate faster code for the
 initialization.

Aug 30 2004

Ilya Minkov <minkov cs.tum.edu> writes:

Ahem, you're speaking of allocating a *buffer*. A large thing, perhaps 
used partially. Normally, i would feel somewhat uneasy to do that on the 
stack - the heap does it quite well, and, well, if there is still a 
performance problem, a buffer can be cached/shared/whatever, fairly 
easily. And if i remember correctly, there was some circumstance which 
could make the OS break if it had to extend the stack by more than 1 
page at a time.

The bulk of performance loss from heap allocation comes from many little 
objects, much more than from a few big ones... So what's the fuss about? 
Besides, you'd be using it by a pointer anyway.

-eye

Aug 12 2004

Ben Hinkle <bhinkle4 juno.com> writes:

 Because, if not, then the following modification is the sort of (keyword
 qualifier) thing that would hand control back to the programmer:
 
 {
     uninitialized char[1024 * 32] workspace;
 
     formatter (workspace);
 }
 
 That's perhaps not the best name for it, but you get the idea.

another option is to look for "void" as an initialization value:
{
  char[1024*32] workspace = void;
  formatter (workspace);
}
that way we don't need another keyword. If the initializer is void the
variable isn't initialized.

Aug 13 2004

Arcane Jill <Arcane_member pathlink.com> writes:

In article <cfiam0$hhc$1 digitaldaemon.com>, Ben Hinkle says...

another option is to look for "void" as an initialization value:
{
  char[1024*32] workspace = void;
  formatter (workspace);
}
that way we don't need another keyword. If the initializer is void the
variable isn't initialized.

/another/ option is for the compiler, on seeing the code:



would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mighty large
number of bytes to be putting on the stack. I think I'll allocate it from the
heap instead. The programmer won't care. Hell, the programmer won't even
/notice/ unless they start looking at the generated assembler!"

Arcane Jill

Aug 13 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cfidel$j0u$1 digitaldaemon.com...
 In article <cfiam0$hhc$1 digitaldaemon.com>, Ben Hinkle says...

another option is to look for "void" as an initialization value:
{
  char[1024*32] workspace = void;
  formatter (workspace);
}
that way we don't need another keyword. If the initializer is void the
variable isn't initialized.

 /another/ option is for the compiler, on seeing the code:



 would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mighty

large
 number of bytes to be putting on the stack. I think I'll allocate it from

the
 heap instead.

That would be very unusual for a compiler to do - and it raises various
questions like what "large" is. Plus unless it uses malloc to allocate it
still won't be as fast as C due to the initialization overhead.

 The programmer won't care. Hell, the programmer won't even
 /notice/ unless they start looking at the generated assembler!"

uhh - that's obviously not completely true. Some programmers will care and
will definitely notice. The question is how many (and how bad the problem
is). Walter thinks the number is small (ie-not large enough to change
things). The OP thinks the number is significant. I'm in the camp of
nice-to-have but not a high priority problem.

 Arcane Jill

Aug 13 2004

Andy Friesen <andy ikagames.com> writes:

Arcane Jill wrote:

 /another/ option is for the compiler, on seeing the code:
 

 
 would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a mighty large
 number of bytes to be putting on the stack. I think I'll allocate it from the
 heap instead. The programmer won't care. Hell, the programmer won't even
 /notice/ unless they start looking at the generated assembler!"

I like this approach.  The programmer didn't say where he wants the 
memory to come from, so it's only logical to assume that he isn't 
thinking about it and is therefore relying on the compiler to do the 
right thing for him.

If the programmer wants to force the storage to one place or another, he 
can say so explicitly. (using calloc, malloc, gc.malloc, etc)

  -- andy

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

With respect Andy, the programmer was quite explicit about where the memory
should come from ... it's a locally-declared array, and the language
documentation points out such declarations are place on the stack (just like
C/C++) because there are some very good reasons to do so. If you abstract
that notion away, then we're not discussing D anymore. Note that the
declaration does not have static, const, or any other qualifier either.


"Andy Friesen" <andy ikagames.com> wrote in message
news:cfim8j$nq2$1 digitaldaemon.com...
 Arcane Jill wrote:

 /another/ option is for the compiler, on seeing the code:



 would think to itself: "Hmmm. Now 1024*32 - that's 32768. That's a


mighty large
 number of bytes to be putting on the stack. I think I'll allocate it


from the
 heap instead. The programmer won't care. Hell, the programmer won't even
 /notice/ unless they start looking at the generated assembler!"

 I like this approach.  The programmer didn't say where he wants the
 memory to come from, so it's only logical to assume that he isn't
 thinking about it and is therefore relying on the compiler to do the
 right thing for him.

 If the programmer wants to force the storage to one place or another, he
 can say so explicitly. (using calloc, malloc, gc.malloc, etc)

   -- andy

Aug 13 2004

Andy Friesen <andy ikagames.com> writes:

van eeshan wrote:
 With respect Andy, the programmer was quite explicit about where the memory
 should come from ... it's a locally-declared array, and the language
 documentation points out such declarations are place on the stack (just like
 C/C++) because there are some very good reasons to do so. If you abstract
 that notion away, then we're not discussing D anymore. Note that the
 declaration does not have static, const, or any other qualifier either.

One of the greatest things about D is that we have a chance to change 
it.  Therefore, we are discussing D even if we abstract that notion 
away, because D itself may follow suit if we can convince Walter. :)

The declaration, taken purely at face value:

     char[1024 * 32] workspace;

declares 32768 consecutive chars under the name 'workspace'.  Since the 
'new' keyword is not used, this storage is only guaranteed to exist 
within the function's scope.

It doesn't necessarily have to say any more than that: the D language 
(not the D compiler implementation) is much more than spec for a 
portable assembler, (as C was) and it makes perfect sense to describe it 
in conceptual terms where possible because concepts are by far the most 
common use case. (code is rarely optimized before it is written :) )

Ideally, the compiler knows what the output has to do, but is completely 
free to achieve that any way it wants. (this is the main reason O'Caml 
is so fast, as I understand it)

However, there's no denying that D is a systems language, so it goes 
without saying that there absolutely must be a convenient way to say 
that you *do* care precisely where that storage comes from and how you 
want it to be handled.


its name, allocates stack storage.  By being a language primitive, it 
presents plenty of information to the code optimizer.

So, hypothetically, we would be able to write

     // want chars.  Don't care where they came from.
     char[1024*32] workspace;

     // get 'new' storage from the GC.
     char[] workspace = new char[1024*32];

     // allocate stack storage
     char[] workspace = stackalloc char[1024*32];

  -- andy

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

All good points. As long as there is some easy means of avoiding unwanted
initialization overhead (particularly vis-a-vis the stack, since it is by
far the fastest place to allocate locally-scoped storage), then I'd
certainly be all for it.

Regarding the stack in particular, the declaration:

      // allocate stack storage
      char[] workspace = stackalloc char[1024*32];

... would likely map to the alloca() function (which does not waste valuable
time initializing all of the requested storage space). However, as Walter
has described it, the GC is apparently incompatible with usage of alloca()
also ~ because potentially old GC references will suddenly appear back in
the "live" stack area again. If this is truly the case, then you should
expect the same pushback on this notion as I've had with the notion of an
"uninitialized" qualifier, or Ben's alternative "=void". After all, your
"stackalloc" keyword is a variation on the theme.

On the other hand, if the GC turns out to be compatible with alloca (as it
is today), then we can all sing and dance with joy when Walter wraps that
function with some "more accessible" syntax <g>. I prefer Ben's approach
though, because it's potentially applicable in more places than just the
stack.

Bon Chance.


"Andy Friesen" <andy ikagames.com> wrote in message
news:cfivtl$vcr$1 digitaldaemon.com...
 van eeshan wrote:
 With respect Andy, the programmer was quite explicit about where the


memory
 should come from ... it's a locally-declared array, and the language
 documentation points out such declarations are place on the stack (just


like
 C/C++) because there are some very good reasons to do so. If you


abstract
 that notion away, then we're not discussing D anymore. Note that the
 declaration does not have static, const, or any other qualifier either.

 One of the greatest things about D is that we have a chance to change
 it.  Therefore, we are discussing D even if we abstract that notion
 away, because D itself may follow suit if we can convince Walter. :)

 The declaration, taken purely at face value:

      char[1024 * 32] workspace;

 declares 32768 consecutive chars under the name 'workspace'.  Since the
 'new' keyword is not used, this storage is only guaranteed to exist
 within the function's scope.

 It doesn't necessarily have to say any more than that: the D language
 (not the D compiler implementation) is much more than spec for a
 portable assembler, (as C was) and it makes perfect sense to describe it
 in conceptual terms where possible because concepts are by far the most
 common use case. (code is rarely optimized before it is written :) )

 Ideally, the compiler knows what the output has to do, but is completely
 free to achieve that any way it wants. (this is the main reason O'Caml
 is so fast, as I understand it)

 However, there's no denying that D is a systems language, so it goes
 without saying that there absolutely must be a convenient way to say
 that you *do* care precisely where that storage comes from and how you
 want it to be handled.


 its name, allocates stack storage.  By being a language primitive, it
 presents plenty of information to the code optimizer.

 So, hypothetically, we would be able to write

      // want chars.  Don't care where they came from.
      char[1024*32] workspace;

      // get 'new' storage from the GC.
      char[] workspace = new char[1024*32];

      // allocate stack storage
      char[] workspace = stackalloc char[1024*32];

   -- andy

Aug 13 2004

Andy Friesen <andy ikagames.com> writes:

van eeshan wrote:
 All good points. As long as there is some easy means of avoiding unwanted
 initialization overhead (particularly vis-a-vis the stack, since it is by
 far the fastest place to allocate locally-scoped storage), then I'd
 certainly be all for it.
 
 Regarding the stack in particular, the declaration:
 
 
     // allocate stack storage
     char[] workspace = stackalloc char[1024*32];

 
 
 ... would likely map to the alloca() function (which does not waste valuable
 time initializing all of the requested storage space). However, as Walter
 has described it, the GC is apparently incompatible with usage of alloca()
 also ~ because potentially old GC references will suddenly appear back in
 the "live" stack area again. If this is truly the case, then you should
 expect the same pushback on this notion as I've had with the notion of an
 "uninitialized" qualifier, or Ben's alternative "=void". After all, your
 "stackalloc" keyword is a variation on the theme.

I actually didn't mean to address implicit initialization directly at 
all so much as a need for some way to let the compiler decide what's faster.

I think the same reasoning applies, though: as a language which allows 
the programmer to hit the metal when he needs to, D needs to offer some 
mechanism that communicates that the programmer knows better than the 
compiler.

Using the 'void' keyword for this purpose sounds a lot like the Perl 
practice of assigning arbitrary meaning to symbols for the sake of 
terseness.  That being said, I can't think of anything better off the 
top of my head, so I'll shut up now. :)

  -- andy

Aug 13 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

 Using the 'void' keyword for this purpose sounds a lot like the Perl
 practice of assigning arbitrary meaning to symbols for the sake of
 terseness.  That being said, I can't think of anything better off the
 top of my head, so I'll shut up now. :)

Perl! ouch! ;-)
Void currently means "no type" so it seems very natural to me to extend that
to mean "no initial value" when used as an initializer.

On a related note, I think the only way to get uninitialized memory from the
heap should be through malloc.

Aug 13 2004

"van eeshan" <vanee hotmail.net> writes:

"Andy Friesen"  wrote ..
 I actually didn't mean to address implicit initialization directly at
 all so much as a need for some way to let the compiler decide what's

faster.
 I think the same reasoning applies, though: as a language which allows
 the programmer to hit the metal when he needs to, D needs to offer some
 mechanism that communicates that the programmer knows better than the
 compiler.

In light of what we now know (but still don't quite know :-) I think it's
worth recapturing the initial request. I hope you folks will correct and/or
embellish the following, as my knowledge of the internal GC operation is
meager:

a) there are two things that a programmer uses the stack for: 1) nonchalant
locally-scoped variables, and 2) a deliberate and explicit means of
allocating temporary storage very quickly ( ~50x faster than the heap).

b) the default initialization scheme works great for most variables, and
even arrays, where they belong to the nonchalant-usage group.


storage requested is of anything other than a truly trivial size (~5 bytes).
These issues apparently (still to be clarified) are borne due to the nature
and behavior of GC stack-scans. That is, the GC apparently performs more
efficiently where the "live" stack does not contain "old" things that look

attempt to eliminate such references from the stack content (wipe everything
to a default value).

This begs the question, "... how often does the GC run?" And also, "... how
much impact do "old" GC references have on its performance?"

Let's make an assertion here first: If a programmer is explicitly and

assume said storage will be given up very soon. After all, if the execution
time is long-lived then one might as well allocate on the heap instead;
right? The core issue brought up by this thread/topic is about those cases
where execution-time is very short-lived, such that the overhead of
initialization is enough to be noticeably detrimental. Let's give this short
execution-window a name:  X ...

Okay; so given this assertion, how many times will the GC become active
during X? One might expect something close to zero ... especially so because
the currently executing thread will not be making any heap requests. If it
did, then the benefits of the original stack-allocation would likely be
drowned out anyway.

The problem remains that /another/ thread may cause the GC to scan the (now
multiple) stacks during X. So, how often does that happen? Within this short
window? One might expect "very few".

Another approach would presumably be to disable GC stack-walks during X;
that is, inhibit the GC from scanning a specific stack (the one belonging to
the X-thread) for the duration of X.

Either way, the upshot is this: the programmer wants to take full control
over what's going on; and D needs a decent mechanism to enable such control.
Placing blinkers over the eyes of the programmer is not a good approach to
this.

Aug 13 2004

ben 0x539.de writes:

Pretty ignorant of some of this thread, I would like to make some points anyway.

Firstly, I would consider the ability to request storage on the stack without it
being initialised a rather important feature for a system programming language.
Yes, I absolutely want this piece of storage on the stack, and no, my dear
compiler, I do not want you to waste my time by filling it with zeroes. ;)
There have been some fine suggestions that address this point, yet they appear
to be less accessible (like alloca, or even asm which is not even portable)
than, or bring the semantics of the language too far away from, the way C or C++
do it.
I recognise that D need not be another iteration of C++, yet I consider the ease
of adoption of D for folks used to C'ish languages to be an important strength.
If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed
heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''),
it would probably appear to be unintuitive.

But that is just my unbased personal opinion.

Therefore I consider some simple qualifier with obvious semantics like ``
uninitialised '' to be worth the possible danger. This does not mean that
everybody should default to using uninitialised memory unless they absolutely
need initialisation, for methological reasons that were well stated by Walter.
But if you want it, here it is, no need for alloca or asm hacks.

This might confuse the garbage collector, yet I do not see how that is much of a
problem. Some of the uses of uninitialised arrays demonstrated in this thread, I
think, quickly overwrote the array with memcpy'd data, and in others the array
might not even live long enough to `` block '' memory from being free'd for any
timespan that would matter.
Of course the problem exists, but `` false positives '' are not exactly fatal.


As another approach to this problem, I think it might be a worthwhile hack to
the semantics of the language itself to have all pointers' memory be zeroed when
it goes out of scope.
This basically means that pointers do the opposite of what arrays do, being ``
initialised '' once we do not need them anymore instead of being initialised
when they are created.
So when the memory occupied by them goes back in scope due to any means of
allocating uninitialised memory on the stack, what used to be used for pointers
does now not confuse the garbage collector.
Considering we already have a lot of initialisation going on, I do not think
that it would be a terrible overhead to have deinitialisation, but I might be
wrong, of course.


Please excuse my amateurish intervention. I certainly have no qualifications at
all to point at anything you folks have said and say `` NO YOU ARE WRONG! '',
yet I hope that I might have helped.

Aug 15 2004

Derek Parnell <derek psych.ward> writes:

On Mon, 16 Aug 2004 04:43:43 +0000 (UTC), ben 0x539.de wrote:

 Pretty ignorant of some of this thread, I would like to make some points
anyway.
 
 Firstly, I would consider the ability to request storage on the stack without
it
 being initialised a rather important feature for a system programming language.
 Yes, I absolutely want this piece of storage on the stack, and no, my dear
 compiler, I do not want you to waste my time by filling it with zeroes. ;)
 There have been some fine suggestions that address this point, yet they appear
 to be less accessible (like alloca, or even asm which is not even portable)
 than, or bring the semantics of the language too far away from, the way C or
C++
 do it.
 I recognise that D need not be another iteration of C++, yet I consider the
ease
 of adoption of D for folks used to C'ish languages to be an important strength.
 If someone as simple as int[2048] foo; could have a meaning (`` allocate GC'ed
 heap memory '') `` totally '' different from that in C (`` subq %rsp, 8192 ''),
 it would probably appear to be unintuitive.
 
 But that is just my unbased personal opinion.
 
 Therefore I consider some simple qualifier with obvious semantics like ``
 uninitialised '' to be worth the possible danger. This does not mean that
 everybody should default to using uninitialised memory unless they absolutely
 need initialisation, for methological reasons that were well stated by Walter.
 But if you want it, here it is, no need for alloca or asm hacks.
 
 This might confuse the garbage collector, yet I do not see how that is much of
a
 problem. Some of the uses of uninitialised arrays demonstrated in this thread,
I
 think, quickly overwrote the array with memcpy'd data, and in others the array
 might not even live long enough to `` block '' memory from being free'd for any
 timespan that would matter.
 Of course the problem exists, but `` false positives '' are not exactly fatal.
 
 As another approach to this problem, I think it might be a worthwhile hack to
 the semantics of the language itself to have all pointers' memory be zeroed
when
 it goes out of scope.
 This basically means that pointers do the opposite of what arrays do, being ``
 initialised '' once we do not need them anymore instead of being initialised
 when they are created.
 So when the memory occupied by them goes back in scope due to any means of
 allocating uninitialised memory on the stack, what used to be used for pointers
 does now not confuse the garbage collector.
 Considering we already have a lot of initialisation going on, I do not think
 that it would be a terrible overhead to have deinitialisation, but I might be
 wrong, of course.
 
 Please excuse my amateurish intervention. I certainly have no qualifications at
 all to point at anything you folks have said and say `` NO YOU ARE WRONG! '',
 yet I hope that I might have helped.

This idea is stilling making sense to me, even when taking Walter's
objections into consideration.

Okay, so an uninitialized stack array might be a likely source of bugs. But
if its not the default behaviour, and the coder explicitly takes
responsibility of it by coding the keyword 'uninitialized' or similar, then
why shouldn't D say "Okay if that's what you want, that's what you'll get!"

D allows 'goto' which is probably a more common source of bugs and high
maintenance costs.

-- 
Derek
Melbourne, Australia
16/Aug/04 3:26:36 PM

Aug 15 2004

"van eeshan" <vanee hotmail.net> writes:

That's a very nice idea Ben ...

"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cfiam0$hhc$1 digitaldaemon.com...
 Because, if not, then the following modification is the sort of (keyword
 qualifier) thing that would hand control back to the programmer:

 {
     uninitialized char[1024 * 32] workspace;

     formatter (workspace);
 }

 That's perhaps not the best name for it, but you get the idea.

 another option is to look for "void" as an initialization value:
 {
   char[1024*32] workspace = void;
   formatter (workspace);
 }
 that way we don't need another keyword. If the initializer is void the
 variable isn't initialized.

Aug 13 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Have you tried alloca? I haven't tried running it but I'm thinking of

void stack2()
{
  void* x = alloca(16*1024);
}


van eeshan wrote:

 "Walter" <newshound digitalmars.com> wrote in message
 news:cfbip7$ora$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfb72j$9l7$1 digitaldaemon.com...
 "Walter" <newshound digitalmars.com> wrote in message
 news:cfb5v9$76u$1 digitaldaemon.com...
 The problem, however, is that since Java does not allow any objects
 or arrays on the stack (or in static data), whenever you need one it



 needs
 to
 be allocated on the heap. This is a lot more expensive than stack
 allocation. Since one winds up allocating a lot more heap objects in


 Java
 than one does in C/C++, the end result is slower. Java offers no
 alternatives to gc allocation.

 This is why D allows stack based objects and value objects.

 Are you saying that D allows one to create Objects entirely upon the

 stack?

 For auto objects, it is allowed for the compiler to put them there. But I
 was primarilly referring to structs and arrays.

 And entirely within static data?

 Structs and arrays, yes. Java does not allow structs, nor does it allow
 arrays on the stack.

 
 
 Thanks for clarifying that Walter. Would it be prudent to note that D
 allocates auto Objects on the heap? Stack allocation may be allowed for in
 the spec, but that's not how the compiler operates.
 
 Also, there is that point about heap allocations being a lot more
 expensive than stack allocations. Naturally, one would lean towards
 wholeheartedly agreeing; but this is just not true when it comes to D.
 Here's a little (contrived) test jig to illustrate:
 
 private import std.c.time;
 private import std.c.stdlib;
 private import std.c.windows.windows;
 
 void heap()
 {
         void *x = malloc (16 * 1024);
         free (x);
 }
 
 void stack()
 {
         ubyte[16 * 1024] x;
 }
 
 void trace (uint startTime)
 {
         FILETIME ft;
         GetSystemTimeAsFileTime(&ft);
         printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000);
 }
 
 void main()
 {
         FILETIME start;
 
        GetSystemTimeAsFileTime(&start);
         trace (start.dwLowDateTime);
 
         sleep (3);
         trace (start.dwLowDateTime);
 
         for (int i=1_000_000; --i;)
              heap();
 
         trace (start.dwLowDateTime);
 
         for (int i=1_000_000; --i;)
              stack();
 
         trace (start.dwLowDateTime);
 }
 
 The results? On this machine it takes the heap() routine ~500ms to
 execute, and the stack() version ~3500ms. Yep, that's seven times longer.
 Why? What wasn't stated is that D ensures all stack variables are
 initialized ... including arrays. The innocuous stack allocation actually
 clears the 16KB to zero each time, which is why it takes so long. For a
 32KB 'buffer', the stack() execution time goes up to ~8600ms versus the
 ~500ms for heap(); you can see there's some rather non-linear behavior for
 the stack version <g>
 
 The point here is not about the benefits/detriments of deterministic
 initialization. It's about clarifying a point of performance-tuning that
 is far from obvious. Without wishing to put too fine a point on it, the
 quoted post was somewhat misleading. That is; allocating an array on the D
 stack can consume far more cycles than the equivalent heap allocation. It
 depends upon the size of the array, the bus speed, the size of the cache,
 heap fragmentation and many other things.
 
 D stack allocations are simply not like C/C++ at all. Far from it.
 
 
 For the curious, here's the disassembly for both routines. It's worth
 noting that the array initialization could be sped up. Also worth noting
 is that the loop/call overhead was 10ms.
 
 10:   {
 11:           void *x = malloc (16 * 1024);
 12:           free (x);
 00402011   push        4000h
 00402016   call        _malloc (0040574c)
 0040201B   add         esp,4
 0040201E   push        eax
 0040201F   call        _free (004057a4)
 00402024   add         esp,4
 13:   }
 00402027   pop         eax
 00402028   ret
 14:
 15:
 16:   void stack()
 0040202C   push        ebp
 0040202D   mov         ebp,esp
 0040202F   mov         edx,4
 00402034   sub         esp,1000h
 0040203A   test        dword ptr [esp],esp
 0040203D   dec         edx
 0040203E   jne         _D5hello5stackFZv+8 (00402034)
 00402040   sub         esp,edx
 00402042   mov         ecx,4000h
 00402047   xor         eax,eax
 17:   {
 18:           ubyte[16 * 1024] x;
 00402049   push        edi
 0040204A   lea         edi,[x]
 00402050   rep stos    byte ptr [edi]
 19:   }
 00402052   pop         edi
 00402053   mov         esp,ebp
 00402055   pop         ebp
 00402056   ret

Aug 11 2004

"van eeshan" <vanee hotmail.net> writes:

"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cfe87l$1akg$1 digitaldaemon.com...
 Have you tried alloca? I haven't tried running it but I'm thinking of

 void stack2()
 {
   void* x = alloca(16*1024);
 }

Tried that, Ben; alloca() also clears everything to zero.

Aug 11 2004

Ben Hinkle <bhinkle4 juno.com> writes:

van eeshan wrote:

 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:cfe87l$1akg$1 digitaldaemon.com...
 Have you tried alloca? I haven't tried running it but I'm thinking of

 void stack2()
 {
   void* x = alloca(16*1024);
 }

 
 Tried that, Ben; alloca() also clears everything to zero.

Hmm. I just tried on Linux and got the times:
heap:   .17sec
stack:  14
stack2: .06

looks like on Linux alloca is the fastest.

Aug 11 2004

"Walter" <newshound digitalmars.com> writes:

"van eeshan" <vanee hotmail.net> wrote in message
news:cfe8ob$1b85$1 digitaldaemon.com...
 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:cfe87l$1akg$1 digitaldaemon.com...
 Have you tried alloca? I haven't tried running it but I'm thinking of

 void stack2()
 {
   void* x = alloca(16*1024);
 }

 Tried that, Ben; alloca() also clears everything to zero.

No, it does not. It simply calls C's alloca(), and gets whatever C's
alloca() does. Note, however, that under linux the OS seems to clear pages
that are faulted into the stack, so it could *appear* to be 0 initialized
when it really is just fresh snow <g>.

Aug 11 2004

"van eeshan" <vanee hotmail.net> writes:

Well, blow me down with a feather. I must have really screwed that test up;
thank you Ben ~ that gets me out of a hole for now. So, in light of this,
where are we now? Currently, if you want to quickly allocate an array on the
stack you must do something like the following:

  char[] array = (cast(char *) alloca(Size))[0..Size];

I feel it would be beneficial to both programmers and to the language
reputation to provide something rather more "accessible" than this ...
thing. Perhaps a globally available template? Something, where taking
advantage of the stack is not hidden away in the darkest recesses. This is
why I figured something like an optional qualifier on the variable (to
inhibit default initialization) would be ideal: it's easy to apply, easy to
understand, easy to document, and very likely easy to implement. If this
sort of thing remains poorly documented, or is not easily accessible, people
will be left wondering why their supposed fast stack-arrays are actually
eating serious chunks of CPU time. After all, porting some C code is what
led to this foray (temporary stack arrays are commonplace in C).

Finally, the alloca() approach is still ~700% slower than the C/C++ standard
mechanism, yet it is faster than simply declaring a "char[20]" D local
variable (that's kinda' hoky when you think about it ... a char[20]
declaration is ~750% slower in D than in C).

What to do? Can we please have a simple, clean, optional mechanism for
inhibiting default initialization? Is it really so much trouble?













"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cfe87l$1akg$1 digitaldaemon.com...
 Have you tried alloca? I haven't tried running it but I'm thinking of

 void stack2()
 {
   void* x = alloca(16*1024);
 }


 van eeshan wrote:

 "Walter" <newshound digitalmars.com> wrote in message
 news:cfbip7$ora$1 digitaldaemon.com...
 "van eeshan" <vanee hotmail.net> wrote in message
 news:cfb72j$9l7$1 digitaldaemon.com...
 "Walter" <newshound digitalmars.com> wrote in message
 news:cfb5v9$76u$1 digitaldaemon.com...
 The problem, however, is that since Java does not allow any objects
 or arrays on the stack (or in static data), whenever you need one





it
 needs
 to
 be allocated on the heap. This is a lot more expensive than stack
 allocation. Since one winds up allocating a lot more heap objects





in
 Java
 than one does in C/C++, the end result is slower. Java offers no
 alternatives to gc allocation.

 This is why D allows stack based objects and value objects.

 Are you saying that D allows one to create Objects entirely upon the

 stack?

 For auto objects, it is allowed for the compiler to put them there. But



I
 was primarilly referring to structs and arrays.

 And entirely within static data?

 Structs and arrays, yes. Java does not allow structs, nor does it allow
 arrays on the stack.


 Thanks for clarifying that Walter. Would it be prudent to note that D
 allocates auto Objects on the heap? Stack allocation may be allowed for


in
 the spec, but that's not how the compiler operates.

 Also, there is that point about heap allocations being a lot more
 expensive than stack allocations. Naturally, one would lean towards
 wholeheartedly agreeing; but this is just not true when it comes to D.
 Here's a little (contrived) test jig to illustrate:

 private import std.c.time;
 private import std.c.stdlib;
 private import std.c.windows.windows;

 void heap()
 {
         void *x = malloc (16 * 1024);
         free (x);
 }

 void stack()
 {
         ubyte[16 * 1024] x;
 }

 void trace (uint startTime)
 {
         FILETIME ft;
         GetSystemTimeAsFileTime(&ft);
         printf ("%u\n", (ft.dwLowDateTime - startTime) / 10000);
 }

 void main()
 {
         FILETIME start;

        GetSystemTimeAsFileTime(&start);
         trace (start.dwLowDateTime);

         sleep (3);
         trace (start.dwLowDateTime);

         for (int i=1_000_000; --i;)
              heap();

         trace (start.dwLowDateTime);

         for (int i=1_000_000; --i;)
              stack();

         trace (start.dwLowDateTime);
 }

 The results? On this machine it takes the heap() routine ~500ms to
 execute, and the stack() version ~3500ms. Yep, that's seven times


longer.
 Why? What wasn't stated is that D ensures all stack variables are
 initialized ... including arrays. The innocuous stack allocation


actually
 clears the 16KB to zero each time, which is why it takes so long. For a
 32KB 'buffer', the stack() execution time goes up to ~8600ms versus the
 ~500ms for heap(); you can see there's some rather non-linear behavior


for
 the stack version <g>

 The point here is not about the benefits/detriments of deterministic
 initialization. It's about clarifying a point of performance-tuning that
 is far from obvious. Without wishing to put too fine a point on it, the
 quoted post was somewhat misleading. That is; allocating an array on the


D
 stack can consume far more cycles than the equivalent heap allocation.


It
 depends upon the size of the array, the bus speed, the size of the


cache,
 heap fragmentation and many other things.

 D stack allocations are simply not like C/C++ at all. Far from it.


 For the curious, here's the disassembly for both routines. It's worth
 noting that the array initialization could be sped up. Also worth noting
 is that the loop/call overhead was 10ms.

 10:   {
 11:           void *x = malloc (16 * 1024);
 12:           free (x);
 00402011   push        4000h
 00402016   call        _malloc (0040574c)
 0040201B   add         esp,4
 0040201E   push        eax
 0040201F   call        _free (004057a4)
 00402024   add         esp,4
 13:   }
 00402027   pop         eax
 00402028   ret
 14:
 15:
 16:   void stack()
 0040202C   push        ebp
 0040202D   mov         ebp,esp
 0040202F   mov         edx,4
 00402034   sub         esp,1000h
 0040203A   test        dword ptr [esp],esp
 0040203D   dec         edx
 0040203E   jne         _D5hello5stackFZv+8 (00402034)
 00402040   sub         esp,edx
 00402042   mov         ecx,4000h
 00402047   xor         eax,eax
 17:   {
 18:           ubyte[16 * 1024] x;
 00402049   push        edi
 0040204A   lea         edi,[x]
 00402050   rep stos    byte ptr [edi]
 19:   }
 00402052   pop         edi
 00402053   mov         esp,ebp
 00402055   pop         ebp
 00402056   ret

Aug 11 2004

Arcane Jill <Arcane_member pathlink.com> writes:

In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...

Currently, if you want to quickly allocate an array on the
stack you must do something like the following:

  char[] array = (cast(char *) alloca(Size))[0..Size];

My experiments seem to suggest that the following actually works rather well.

Aug 12 2004

"van eeshan" <vanee hotmail.net> writes:

Now that is exactly what the compiler ought to (optionally) provide for, at
the high level :-)

16 bytes allocations (versus relative execution time):
heap:      269
stack[]:   50
alloca():  57
asm:       6

2K allocations:
heap:      504
stack[]:   5681
alloca():  57
asm:       10

32K allocations:
heap:      520
stack[]:   8812
alloca():  118
asm:       10

There's that 32KB stack-array initialization being ~88000% slower than the
traditional [asm] C/C++ uninitialized style. Barring further issue, I will
use that ~ thanks very much Jill. Perhaps one of the Wiki owners will take
an interest in noting these workarounds.



"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cff6a9$1ugb$1 digitaldaemon.com...
 In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...

Currently, if you want to quickly allocate an array on the
stack you must do something like the following:

  char[] array = (cast(char *) alloca(Size))[0..Size];

 My experiments seem to suggest that the following actually works rather

well.

Aug 12 2004

"van eeshan" <vanee hotmail.net> writes:

I've been trying to convert this to a mixin, but not having much luck. Guess
I don't know enough about templates ... can you point me in the right
direction please Jill? Or will a mixin always place this inside its own
private function scope (and thus restore ESP before the array can be used) ?


"Arcane Jill" <Arcane_member pathlink.com> wrote in message
news:cff6a9$1ugb$1 digitaldaemon.com...
 In article <cfete2$1p7g$1 digitaldaemon.com>, van eeshan says...

Currently, if you want to quickly allocate an array on the
stack you must do something like the following:

  char[] array = (cast(char *) alloca(Size))[0..Size];

 My experiments seem to suggest that the following actually works rather

well.

Aug 12 2004

D Programming

C/C++ Programming

Other

digitalmars.D - D stack allocation