digitalmars.D - We need to define the semantics of block initialization of arrays

"Don" <turnyourkidsintocash nospam.com> writes:
DMD has always accepted this initializer syntax for static arrays:

float [50] x = 1.0;

If this declaration happens inside a function, or in global 
scope, the compiler sets all members of x to 1.0.  That is, it's 
the same as:

float [50] x = void;
x[] = 1.0;

In my DMD pull requests, I've called this 'block initialization', 
since there was no standard name for it.
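
A minimal sketch of the equivalence described above, assuming a current D compiler. The helper names blockInit and voidThenFill are made up for illustration and are not part of DMD or Phobos.

```d
// Sketch only: blockInit and voidThenFill are hypothetical helper names.
float[50] blockInit()
{
    float[50] x = 1.0;   // block initialization: all 50 elements become 1.0
    return x;
}

float[50] voidThenFill()
{
    float[50] x = void;  // skip default initialization entirely
    x[] = 1.0;           // slice assignment then writes every element
    return x;
}

void main()
{
    auto a = blockInit();
    auto b = voidThenFill();
    assert(a == b);                          // element-wise comparison
    assert(a[0] == 1.0f && a[49] == 1.0f);   // first and last element agree
}
```

Both helpers end up with the same 50 values; the second form only spells out what the one-line initializer is expected to do.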

A lot of code relies on this behaviour, but the spec doesn't 
mention it!!!

The problem is not simply that this is unspecified. A long time 
ago, if this same declaration was a member of a struct 
declaration, the behaviour was completely different. It used to 
set x[0] to 1.0, and leave the others at float.init. I'll call 
this "first-element-initialization", and it still applies in many 
cases, for example when you use a struct static initializer. I.e., 
it's the same as:

float [50] x;
x[0] = 1.0;
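
To make the difference concrete, here is a small sketch of what the hand-written form above leaves in the array. It only shows the explicit two-statement version, not what any particular compiler release does for struct static initializers; firstElementOnly is a hypothetical helper name.

```d
import std.math : isNaN;

// Sketch only: firstElementOnly is a hypothetical helper name.
float[50] firstElementOnly()
{
    float[50] x;     // default initialization: every element is float.init (NaN)
    x[0] = 1.0;      // only the first element receives the value
    return x;
}

void main()
{
    auto x = firstElementOnly();
    assert(x[0] == 1.0f);
    assert(isNaN(x[1]) && isNaN(x[49]));   // the rest are still float.init
}
```

Since float.init is NaN, the other 49 elements are not merely "different from 1.0", they are NaN, which is part of why this behaviour is surprising.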

Note however that this part of the compiler has historically been 
very bug-prone, and the behaviour has changed several times.


I didn't know about first-element-initialization when I 
originally did the CTFE code, so when CTFE is involved, it always 
does block initialization instead.
Internally, the compiler has two functions, defaultInit() and 
defaultInitLiteral(). The first does first-element-init, the 
second does block-init.
There are several other situations which do block initialization 
(not just CTFE). There are a greater number of situations where 
first-init can happen, but the most frequently encountered 
situations use block-init. There are even some foul cases, like 
bug 10198, where due to a bug in CTFE, you currently get a 
bizarre mix of both first-init and block-init!


So, we have a curious mix of the two behaviours. Which way is 
correct?

Personally I'd like to just use block-init everywhere. I 
personally find first-element-init rather unexpected, but maybe 
that's just me. I don't know when it would be useful. But 
regardless, we need to get this sorted out.
It's a blocker for my CTFE work.


Here's an example of some of the oddities:
----
struct S {
    int [3] x;
}

struct T {
     int [3] x = 8;
}

struct U {
    int [3][3] y;
}

void main()
{
    int [3][4] w = 7;
    assert( w[2][2] == 7); // Passes, it was block-initialized

    S s =  { 8 }; // OK, struct static initializer. first-element-init
    S r = S( 8 ); // OK, struct literal, block-init.
    T t;          // Default initialized, block-init
    assert( s.x[2] == 8); // Fails; it was first-element-initialized
    assert( r.x[2] == 8); // Passes; all elements are 8. Block-init.
    assert( t.x[2] == 8); // Passes; all elements are 8. Block-init.

    U u = { 9 };  // Does not compile
    // Error: cannot implicitly convert expression (9) of type int to int[3LU][3LU]
}
---
Jun 03 2013
"Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 3 June 2013 at 09:06:25 UTC, Don wrote:
 Personally I'd like to just use block-init everywhere. I 
 personally find first-element-init rather unexpected, but maybe 
 that's just me. I don't know when it would be useful.

+1 I see no point in just initialising the first member. If you want that, just default init then set the first member.
Jun 03 2013
Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 6/3/13, Don <turnyourkidsintocash nospam.com> wrote:
 A lot of code relies on this behaviour, but the spec doesn't
 mention it!!!

I didn't know about it until Walter mentioned the syntax to me. I've found it quite useful since then. E.g.:

char[100] buffer = 0;

Without this, buffer is normally initialized with 0xFF, and this could break C functions when you pass a pointer to such an array.

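A short sketch of the buffer case, assuming a current D compiler. zeroedBuffer is a hypothetical helper name; strlen is the C function exposed through core.stdc.string.

```d
import core.stdc.string : strlen;

// Sketch only: zeroedBuffer is a hypothetical helper name.
char[100] zeroedBuffer()
{
    char[100] buffer = 0;   // block initialization: every byte is '\0'
    return buffer;
}

void main()
{
    char[100] raw;          // default initialization: every byte is char.init (0xFF)
    auto buffer = zeroedBuffer();

    assert(raw[0] == 0xFF && raw[99] == 0xFF);
    assert(buffer[0] == 0 && buffer[99] == 0);
    assert(strlen(buffer.ptr) == 0);   // C sees a valid empty string
}
```

The default-initialized array has no terminating zero anywhere, so handing raw.ptr to a C string function would read past the end of the array.
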
 Personally I'd like to just use block-init everywhere.

Me too. You get my vote.
Jun 03 2013
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2013-06-03, 11:06, Don wrote:

 Personally I'd like to just use block-init everywhere. I personally find  
 first-element-init rather unexpected, but maybe that's just me. I don't  
 know when it would be useful. But regardless, we need to get this sorted  
 out.
 It's a blocker for my CTFE work.

Votes++;

-- Simen
Jun 03 2013
Timon Gehr <timon.gehr gmx.ch> writes:
On 06/03/2013 11:06 AM, Don wrote:
 Personally I'd like to just use block-init everywhere. I personally find
 first-element-init rather unexpected, but maybe that's just me. I don't
 know when it would be useful. But regardless, we need to get this sorted
 out.
 It's a blocker for my CTFE work.

Agreed, kill first-element init.
Jun 03 2013
"David Nadlinger" <code klickverbot.at> writes:
On Monday, 3 June 2013 at 19:41:35 UTC, Timon Gehr wrote:
 On 06/03/2013 11:06 AM, Don wrote:
 Personally I'd like to just use block-init everywhere. […]

 Agreed, kill first-element init.

Kill it with a vengeance! Honestly, I can't see how that could have ever been intended as a feature, and while fixing an issue in LDC a while back, I already removed that one instance with a first-element struct initializer from the test suite: https://github.com/ldc-developers/dmd-testsuite/blob/ldc/runnable/test42.d#L3226

David
Jun 03 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, June 03, 2013 11:06:22 Don wrote:
 So, we have a curious mix of the two behaviours. Which way is
 correct?
 
 Personally I'd like to just use block-init everywhere. I
 personally find first-element-init rather unexpected, but maybe
 that's just me. I don't know when it would be useful. But
 regardless, we need to get this sorted out.
 It's a blocker for my CTFE work.

I honestly didn't know about either, but first element init seems very wrong to me, whereas block init makes sense.

- Jonathan M Davis
Jun 03 2013
"deadalnix" <deadalnix gmail.com> writes:
On Monday, 3 June 2013 at 09:06:25 UTC, Don wrote:
 Personally I'd like to just use block-init everywhere. I 
 personally find first-element-init rather unexpected, but maybe 
 that's just me. I don't know when it would be useful. But 
 regardless, we need to get this sorted out.
 It's a blocker for my CTFE work.

KILL IT WITH FIRE !
Jun 03 2013
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 03 Jun 2013 05:06:22 -0400, Don <turnyourkidsintocash nospam.com>  
wrote:

 DMD has always accepted this initializer syntax [with unexplained black  
 magic behavior that is inconsistent from one usage to the next]

BURRRRRN IT!!!!

Seriously, I had no idea this was going on. It would surprise the hell out of me if I had discovered it!

-Steve
Jun 03 2013
Kenji Hara <k.hara.pg gmail.com> writes:

2013/6/3 Don <turnyourkidsintocash nospam.com>

 DMD has always accepted this initializer syntax for static arrays:

 float [50] x = 1.0;

 If this declaration happens inside a function, or in global scope, the
 compiler sets all members of x to 1.0.  That is, it's the same as:

 float [50] x = void;
 x[] = 1.0;

 In my DMD pull requests, I've called this 'block initialization', since
 there was no standard name for it.

 A lot of code relies on this behaviour, but the spec doesn't mention it!!!

 The problem is not simply that this is unspecified. A long time ago, if
 this same declaration was a member of a struct declaration, the behaviour
 was completely different. It used to set x[0] to 1.0, and leave the others
 at float.init. I'll call this "first-element-initialization", and it
 still applies in many cases, for example when you use a struct static
 initializer. Ie, it's the same as:

 float [50] x;
 x[0] = 1.0;

 Note however that this part of the compiler has historically been very
 bug-prone, and the behaviour has changed several times.


 I didn't know about first-element-initialization when I originally did the
 CTFE code, so when CTFE is involved, it always does block initialization
 instead.
 Internally, the compiler has two functions, defaultInit() and
 defaultInitLiteral(). The first does first-element-init, the second does
 block-init.
 There are several other situations which do block initialization (not just
 CTFE). There are a greater number of situations where first-init can
 happen, but the most frequently encountered situations use block-init.
 There are even some foul cases, like bug 10198, where due to a bug in CTFE,
 you currently get a bizarre mix of both first-init and block-init!


 So, we have a curious mix of the two behaviours. Which way is correct?

 Personally I'd like to just use block-init everywhere. I personally find
 first-element-init rather unexpected, but maybe that's just me. I don't
 know when it would be useful. But regardless, we need to get this sorted
 out.
 It's a blocker for my CTFE work.

First-element-init is definitely a bug. I can argue that nobody wants the strange behavior.

 Here's an example of some of the oddities:
 ----
 struct S {
    int [3] x;
 }

 struct T {
     int [3] x = 8;
 }

 struct U {
    int [3][3] y;
 }

 void main()
 {
    int [3][4] w = 7;
    assert( w[2][2] == 7); // Passes, it was block-initialized

Currently, block initialization of multi-dimensional static arrays is only allowed for variable declarations in statement scope. I'm planning to fix bugs 3849 and 7019, but changing this behavior might affect them. I'd prefer to keep this as-is for now, since I haven't finished thinking it through.

    S s =  { 8 }; // OK, struct static initializer. first-element-init

This is definitely a bug. Instead, block-init should occur.
    S r = S( 8 ); // OK, struct literal, block-init.
    T t;          // Default initialized, block-init

OK.
    assert( s.x[2] == 8); // Fails; it was first-element-initialized

Also, definitely a bug.
    assert( r.x[2] == 8); // Passes; all elements are 8. Block-init.
    assert( t.x[2] == 8); // Passes; all elements are 8. Block-init.

OK.
    U u = { 9 };  // Does not compile
    // Error: cannot implicitly convert expression (9) of type int to int[3LU][3LU]

For reasons I've already mentioned in `int [3][4] w = 7;`, I'd like to keep this current behavior.
 }
 ---

Kenji Hara
Jun 03 2013
"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Monday, 3 June 2013 at 09:06:25 UTC, Don wrote:
 DMD has always accepted this initializer syntax for static 
 arrays:

 float [50] x = 1.0;

I would expect block initialization and have never considered first element initialization.
Jun 03 2013
"Don" <turnyourkidsintocash nospam.com> writes:
On Tuesday, 4 June 2013 at 02:33:54 UTC, Kenji Hara wrote:
 Personally I'd like to just use block-init everywhere. I 
 personally find
 first-element-init rather unexpected, but maybe that's just 
 me. I don't
 know when it would be useful. But regardless, we need to get 
 this sorted
 out.
 It's a blocker for my CTFE work.

First-element-init is definitely a bug. I can argue that nobody wants the strange behavior.

Good, it seems that everyone agrees. It's therefore a bug in todt.c, StructInitializer::todt()
 void main()
 {
    int [3][4] w = 7;
    assert( w[2][2] == 7); // Passes, it was block-initialized

Currently, block initialization of multi-dimensional static arrays is only allowed for variable declarations in statement scope. I'm planning to fix bugs 3849 and 7019, but changing this behavior might affect them. I'd prefer to keep this as-is for now, since I haven't finished thinking it through.

Yeah, there's difficulties with things like:

int [3][4] x = [7, 7, 7];

which could be a block initialization -- is this allowed or not? Though I think we have already dealt with such issues for block assignment.

    S s =  { 8 }; // OK, struct static initializer. first-element-init

This is definitely a bug. Instead, block-init should occur.

Good.
    S r = S( 8 ); // OK, struct literal, block-init.
    T t;          // Default initialized, block-init

OK.
    assert( s.x[2] == 8); // Fails; it was first-element-initialized

Also, definitely a bug.
    assert( r.x[2] == 8); // Passes; all elements are 8. Block-init.
    assert( t.x[2] == 8); // Passes; all elements are 8. Block-init.

OK.
    U u = { 9 };  // Does not compile
    // Error: cannot implicitly convert expression (9) of type int to int[3LU][3LU]

For reasons I've already mentioned in `int [3][4] w = 7;`, I'd like to keep this current behavior.

There is still one problem, bug 10198. This currently compiles, and does something stupid:
---
struct U {
    int [3][3] y;
}

U u = U(4);
---
What do you think should happen here?
Jun 04 2013
"bearophile" <bearophileHUGS lycos.com> writes:
Don:

 Yeah, there's difficulties with things like:

 int [3][4] x = [7, 7, 7];

 which could be a block initialization -- is this allowed or not?

I hope it keeps being disallowed, to avoid programmer mistakes. (Maybe this is acceptable: int [3][4] x = [7, 7, 7][]; )

Bye,
bearophile
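
Whatever the initializer rules end up being, one way to get the "every row is [7, 7, 7]" result without any ambiguity is an explicit per-row assignment. This is only a sketch under that assumption, not a proposal for the syntax question above; fillRows is a hypothetical helper name.

```d
// Sketch only: fillRows is a hypothetical helper name.
int[3][4] fillRows()
{
    int[3][4] x;               // 4 rows of int[3], all elements 0
    foreach (ref row; x)
        row = [7, 7, 7];       // assign one whole row at a time
    return x;
}

void main()
{
    auto x = fillRows();
    int[3][4] w = 7;           // scalar block initialization, as in the first post
    assert(x == w);            // here both forms give the same 12 elements
    assert(x[3][2] == 7);
}
```

The loop makes it obvious that the literal is a row value rather than a scalar, which is exactly the distinction the one-line initializer would have to guess.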
Jun 04 2013
Kenji Hara <k.hara.pg gmail.com> writes:

2013/6/4 Don <turnyourkidsintocash nospam.com>

 There is still one problem, bug 10198. This currently compiles, and does
 something stupid:
 ---

 struct U {
    int [3][3] y;
 }

 U u = U(4);
 ---
 What do you think should happen here?

Oh! I did not know it is currently accepted.

I think accepting multi-dimensional block initialization in StructLiteralExp arguments is very bug-prone behavior. Unlike a variable declaration in statement scope, the target type is not visible at the point of use, so inferring the cost of the static array construction is difficult.

struct U { int[3][3][3][3] w; }
U u = U(1);    // looks trivial, but actually costly operation.

int[3][3][3][3] w = 1;   // initializing cost is very obvious

At most, it would be better to restrict it to one-dimensional block initialization, the same as StructInitializer.

Kenji Hara
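
To put a number on the cost argument, here is a rough sketch that counts how many elements the statement-scope declaration writes. It only uses the variable-declaration form; countOnes is a hypothetical helper name.

```d
// Sketch only: countOnes is a hypothetical helper name.
size_t countOnes()
{
    int[3][3][3][3] w = 1;     // one scalar, written to every element
    size_t count;
    foreach (ref a; w)
        foreach (ref b; a)
            foreach (ref c; b)
                foreach (e; c)
                    if (e == 1)
                        ++count;
    return count;
}

void main()
{
    // 3 * 3 * 3 * 3 elements, each the size of an int
    static assert((int[3][3][3][3]).sizeof == 81 * int.sizeof);
    assert(countOnes() == 81);
}
```

With the declaration, the 81-element type is right there on the line; with a struct literal taking a single argument, the reader has to look up the field type to see the same cost.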
Jun 04 2013
"bearophile" <bearophileHUGS lycos.com> writes:
Sometimes I'd like to write code like this:


struct Foo {
     int x;
}
void main() {
     Foo[100] foos;
     foos[].x = 10;
}
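
For comparison, a sketch of two forms that should give the intended result with the language as it is, assuming a current D compiler. viaLoop and viaBlockInit are hypothetical helper names, and Foo is the struct from the snippet above.

```d
struct Foo {
    int x;
}

// Sketch only: viaLoop and viaBlockInit are hypothetical helper names.
Foo[100] viaLoop()
{
    Foo[100] foos;
    foreach (ref f; foos)      // ref, so the array elements themselves are updated
        f.x = 10;
    return foos;
}

Foo[100] viaBlockInit()
{
    Foo[100] foos = Foo(10);   // block initialization with a struct value
    return foos;
}

void main()
{
    auto a = viaLoop();
    auto b = viaBlockInit();
    assert(a == b);
    assert(a[0].x == 10 && a[99].x == 10);
}
```

The second helper is the same block initialization discussed in this thread, just with a struct as the element type; it sets whole elements, though, not a single field of each.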


Bye,
bearophile
Jun 04 2013