digitalmars.D.learn - array initialization problem

reply Qian Xu <quian.xu stud.tu-ilmenau.de> writes:
Hi All,

I have accidentally written a buggy class.

Briefly described as follows:
  1. The class contains a list of strings
  2. The list of strings is assigned from a constant in the constructor
  3. Try to change the values in the list
  4. Create another instance by repeating steps 1-3
  5. Add both of them to a LinkSeq object
  6. Print their values again
Now you will find that their lists have the same values.

Can someone explain, why the values are different before they are 
inserted into a list?
And why this.str has no problem?


The console output and the source are included below.

##################### console output begin ############################

list: [111,222,]
  str: hello
-----------------------------
list: [333,444,]
  str: world
-----------------------------
--- after insert ---
-----------------------------
list: [333,444,]
  str: hello
-----------------------------
list: [333,444,]
  str: world
-----------------------------

##################### console output end ############################

######################### code begin ############################

module test;

import tango.io.Console;
import tango.util.collection.LinkSeq;

const char[][] CLIST = [null, null];
const char[] CSTR = "hello";

class Entity
{
   char[][] list;
   char[] str;

   this()
   {
     this.list = CLIST;
     this.str = CSTR;
   }

   void print()
   {
     Cout.opCall("list: [");
     foreach (char[] s; list)
     {
       Cout.opCall(s ~ ",");
     }
     Cout.opCall("]\n");
     Cout.opCall(" str: "~this.str);
     Cout.opCall("\n-----------------------------\n");
   }
}

void main()
{
   Entity e = new Entity();
   e.list[0] = "111";
   e.list[1] = "222";
   e.str = "hello";
   e.print();

   Entity e2 = new Entity();
   e2.list[0] = "333";
   e2.list[1] = "444";
   e2.str = "world";
   e2.print();

   Cout.opCall("--- after insert ---\n-----------------------------\n");
   LinkSeq!(Entity) l = new LinkSeq!(Entity)();
   l.append(e);
   l.append(e2);

   foreach (Entity entity; l)
   {
     entity.print();
   }
}

######################### code end ############################


-- 
Xu, Qian (stanleyxu)
  http://stanleyxu2005.blogspot.com
Jan 16 2009
next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Fri, Jan 16, 2009 at 4:19 PM, Qian Xu <quian.xu stud.tu-ilmenau.de> wrote:

 Can someone explain, why the values are different before they are inserted
 into a list?
The values are different _before you insert values into e2_. Try printing out the contents of e _after_ you put strings in e2, and you'll notice it now has the same values as e2. This is because arrays in D are by reference. e and e2 point to the same array (CLIST). When you modify the contents of e2.list, the modifications show up in e as well. Accessing this.str is fine because they each point to different strings.
    Cout.opCall("list: [");
Also, lol, opCall is an operator overload of (). You aren't supposed to call it directly, use: Cout("list: ["); instead.
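
To illustrate the point about opCall, here is a minimal sketch. It assumes a D2 compiler rather than the D1/Tango setup used in the thread, and `Printer` is a hypothetical stand-in for Cout that collects text in a buffer instead of writing to the console:

```d
// Hypothetical stand-in for Tango's Cout (assumption: D2 compiler).
struct Printer
{
    string buf;

    // opCall overloads the () operator on an instance.
    void opCall(string s)
    {
        buf ~= s;
    }
}

void main()
{
    Printer p;
    p("list: [");       // idiomatic: the call syntax invokes opCall
    p.opCall("]\n");    // legal, but redundant
    assert(p.buf == "list: []\n");
}
```

Both calls do the same thing; the first form is simply what the operator overload exists for.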
 void main()
 {
  Entity e = new Entity();
  e.list[0] = "111";
  e.list[1] = "222";
  e.str = "hello";
  e.print();

  Entity e2 = new Entity();
  e2.list[0] = "333";
  e2.list[1] = "444";
See, e.list and e2.list are the same array here.
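
The sharing described above can be reproduced in a few lines. This is only a sketch under assumptions: it targets D2 with Phobos instead of D1 with Tango, and the constructor parameter `defaults` is a hypothetical replacement for the global CLIST (D2 would reject assigning a const array to a mutable field):

```d
import std.stdio;

class Entity
{
    string[] list;

    // `defaults` is a hypothetical parameter standing in for the thread's CLIST.
    this(string[] defaults)
    {
        // Copies only the slice (pointer + length); the elements stay shared.
        this.list = defaults;
    }
}

void main()
{
    auto clist = new string[](2);   // two empty entries, like [null, null]

    auto e = new Entity(clist);
    e.list[0] = "111";
    e.list[1] = "222";

    auto e2 = new Entity(clist);
    e2.list[0] = "333";
    e2.list[1] = "444";

    assert(e.list is e2.list);          // same pointer and length
    assert(e.list == ["333", "444"]);   // e sees the writes made through e2
    assert(clist == ["333", "444"]);    // so does the shared default
    writeln("e.list  = ", e.list);
    writeln("e2.list = ", e2.list);
}
```

The `is` comparison checks that both slices refer to the same memory, which is the situation the original program ends up in.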
Jan 16 2009
prev sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 17 Jan 2009 00:19:46 +0300, Qian Xu <quian.xu stud.tu-ilmenau.de> wrote:

 Can someone explain, why the values are different before they are  
 inserted into a list?
 And why this.str has no problem?


You have two instances of class Entity. Both point to the same variables - CLIST and CSTR. Thus, modifying the contents of CSTR and CLIST would have an effect on e.str, e.list, e2.str and e2.list, because they share the data (as opposed to owning it).

For example, let's modify CSTR and see what happens:

CSTR[0] = 'J'; // now it is "Jello"

Printing e.str and e2.str gives us the following output:

Jello
Jello

i.e. both strings have been changed, too! Once again, this happens because they don't own the data but share it with CSTR.

It happens because arrays are not copied upon assignment, i.e. the following line:

this.str = CSTR;

makes sure that there is only 1 instance of "hello" in memory, not three distinct copies (CSTR, e.str and e2.str). Therefore modifying any of CSTR, e.str or e2.str would have an effect on all 3 variables. Here is a picture for you:

CSTR:
  length = 5
  ptr = ----------------> hello
                         / |
e.str:                  /  |
  length = 5           /   |
  ptr = --------------*    |
                           |
e2.str:                    |
  length = 5               |
  ptr = -------------------*

If you want to be able to modify without affecting others, make a copy!

this.str = CSTR.dup;

This way memory will contain three copies of "hello" - CSTR, e.str and e2.str.

I hope this is clear, let's move on. Just like modifying e.str contents, modifying e.list contents will have an effect on all variables - CLIST, e.list and e2.list. Here is what happens step by step:

0 - Program startup
State: CLIST : [null, null];
       e     : <doesn't exist>;
       e2    : <doesn't exist>

1 - Entity e = new Entity();
State: CLIST : [null, null];
       e     : list = [null, null]; str = "hello";
       e2    : <doesn't exist>

2 - e.list[0] = "111";
State: CLIST : ["111", null]; // note that CLIST has been changed, too!
       e     : list = ["111", null]; str = "hello";
       e2    : <doesn't exist>

3 - e.list[1] = "222";
State: CLIST : ["111", "222"]; // note that CLIST has been changed, too!
       e     : list = ["111", "222"]; str = "hello";
       e2    : <doesn't exist>

4 - Entity e2 = new Entity();
State: CLIST : ["111", "222"];
       e     : list = ["111", "222"]; str = "hello";
       e2    : list = ["111", "222"]; str = "hello"; // !!!

5 - e2.list[0] = "333";
State: CLIST : ["333", "222"];
       e     : list = ["333", "222"]; str = "hello";
       e2    : list = ["333", "222"]; str = "hello";

6 - e2.list[1] = "444";
State: CLIST : ["333", "444"];
       e     : list = ["333", "444"]; str = "hello";
       e2    : list = ["333", "444"]; str = "hello";

7 - e2.str = "world";
State: CLIST : ["333", "444"];
       e     : list = ["333", "444"]; str = "hello";
       e2    : list = ["333", "444"]; str = "world";

Hope it helps.
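
The .dup advice applies to the list as well as to the string. A minimal sketch, assuming D2 with Phobos rather than the thread's D1/Tango, and passing the defaults in through hypothetical constructor parameters instead of reading the globals CLIST and CSTR:

```d
import std.stdio;

class Entity
{
    string[] list;
    char[] str;

    // `defaults` and `s` are hypothetical parameters replacing CLIST and CSTR.
    this(string[] defaults, string s)
    {
        this.list = defaults.dup;  // new outer array: element assignments stay private
        this.str = s.dup;          // private, mutable copy of the characters
    }
}

void main()
{
    auto clist = new string[](2);  // two empty entries, like [null, null]

    auto e = new Entity(clist, "hello");
    e.list[0] = "111";
    e.list[1] = "222";

    auto e2 = new Entity(clist, "hello");
    e2.list[0] = "333";
    e2.list[1] = "444";
    e2.str[0] = 'J';               // only e2's copy changes

    assert(e.list == ["111", "222"]);
    assert(e2.list == ["333", "444"]);
    assert(clist[0].length == 0 && clist[1].length == 0);  // defaults untouched
    assert(e.str == "hello" && e2.str == "Jello");
    writeln(e.list, " ", e2.list);
}
```

Note that .dup on an array of arrays is shallow: it copies the outer array, not the strings it refers to. That is enough here because the program only replaces whole elements.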
Jan 16 2009
next sibling parent Qian Xu <quian.xu stud.tu-ilmenau.de> writes:
Denis Koroskin wrote:
 7 - e2.str = "world";
 State: CLIST : ["333", "444"];
       e     : list = ["333", "444"]; str = "hello";
       e2    : list = ["333", "444"]; str = "world";
 
 
 Hope it helps.
Thanks for your nice answer. You made my day ;-)

-- 
Xu, Qian (stanleyxu)
  http://stanleyxu2005.blogspot.com
Jan 18 2009
prev sibling parent reply Qian Xu <quian.xu stud.tu-ilmenau.de> writes:
Denis Koroskin wrote:

 ...
 
 For example, let's modify CSTR and see what happens:
 CSTR[0] = 'J'; // now it is "Jello"
 
 printing e.str and e2.str gives us the following output:
 Jello
 Jello
 
 ...
Hi again,

but there is one thing I do not understand. CSTR is a constant. But with "CSTR[0] = 'J'", you can modify a const anyway, can't you?

BTW: Do you know why D does not use copy-on-write semantics instead of referencing? IMO, copy-on-write is much more performant.

--Qian
Jan 19 2009
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Mon, 19 Jan 2009 12:21:59 +0300, Qian Xu <quian.xu stud.tu-ilmenau.de> wrote:

 Denis Koroskin wrote:

 ...

 For example, let's modify CSTR and see what happens:
 CSTR[0] = 'J'; // now it is "Jello"

 printing e.str and e2.str gives us the following output:
 Jello
 Jello

 ...
 Hi again, but there is one thing I do not understand. CSTR is a constant. But with "CSTR[0] = 'J'", you can modify a const anyway, can't you?
D1 has no const support.
 BTW: Do you know why D does not use copy-on-write semantics instead of
 referencing? IMO, copy-on-write is much more performant.

 --Qian
It's not about performance (explicit memory management is faster, too), but semantics. Arrays in D are reference types. Besides, it's best to avoid hidden allocations.
Jan 19 2009
parent reply Rainer Deyke <rainerd eldwood.com> writes:
Denis Koroskin wrote:
 Arrays in D are reference types. Besides, it's best to avoid hidden
 allocations.
Arrays in D are reference types except when they're not.

int[] a = [5];
int[] b = a;
a[0] = 4;
assert(b[0] == 4);
a.length = 2;
assert(b.length == 1);
a[0] = 3;
// Is b[0] 3 or 4?

-- 
Rainer Deyke - rainerd eldwood.com
Jan 19 2009
parent Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:
Rainer Deyke wrote:
 Denis Koroskin wrote:
 Arrays in D are reference types. Besides, it's best to avoid hidden
 allocations.
 Arrays in D are reference types except when they're not.

 int[] a = [5];
 int[] b = a;
 a[0] = 4;
 assert(b[0] == 4);
 a.length = 2;
 assert(b.length == 1);
 a[0] = 3;
 // Is b[0] 3 or 4?
To be really pedantic about it, D's arrays aren't really reference types at all, but bear the *illusion* of reference semantics because of what they really are (a struct with a length field and a pointer field). In the above example, the value of b[0] depends on whether a was resized in place or not. Which is why slicing, albeit a fantastically useful feature, has to be handled with care.

-- 
Chris Nicholson-Sauls <ibisbasenji Google Mail>
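
The "resized in place or not" case can be checked at run time by comparing the two slices' pointers. A small sketch, assuming a D2 compiler; which branch is taken depends on the runtime's allocator, so the code asserts only the relationship between the two outcomes:

```d
import std.stdio;

void main()
{
    int[] a = [5];
    int[] b = a;          // b copies a's pointer and length
    a[0] = 4;
    assert(b[0] == 4);    // shared element

    a.length = 2;         // may extend in place or move a to a new block
    assert(b.length == 1);

    a[0] = 3;
    bool inPlace = (a.ptr is b.ptr);
    // Still shared if a was extended in place, otherwise b keeps the old value.
    assert(b[0] == (inPlace ? 3 : 4));
    writeln("resized in place: ", inPlace, ", b[0] = ", b[0]);
}
```

If the two slices must not affect each other after this point, taking a .dup before resizing removes the ambiguity.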
Jan 20 2009
prev sibling parent Christopher Wright <dhasenan gmail.com> writes:
Qian Xu wrote:
 Denis Koroskin wrote:
 
 ...

 For example, let's modify CSTR and see what happens:
 CSTR[0] = 'J'; // now it is "Jello"

 printing e.str and e2.str gives us the following output:
 Jello
 Jello

 ...
 Hi again, but there is one thing I do not understand. CSTR is a constant. But with "CSTR[0] = 'J'", you can modify a const anyway, can't you?
CSTR is a string constant. It's in a data segment of the binary that DMD creates. However, on Windows, string constants are in a read-write area of memory, so you can change them; but for efficiency, there is only one copy of each string constant in the binary. On Linux, that code would produce a segmentation fault -- there, string constants are in a read-only text segment. (I believe I heard that the MinGW compiler on Windows makes string constants read-only, so this may be compiler specific.)
 BTW: Do you know why D does not use copy-on-write semantics instead of
 referencing? IMO, copy-on-write is much more performant.
It makes the compiler a fair bit more complicated. It requires syntax to create a copy-on-write array versus a by-reference array, or to refer to a COW array by reference (so if you modify it, aliases to the same array get modified). And copy-on-write does not give you better performance. Most of all, nobody's made a compelling case to Walter about this. It's easy enough to .dup an array if you're about to modify it, though bugs from accidentally modifying an array in place are rather hard to track. On the other hand, if you have a reasonable const system, these bugs turn into compile errors.
 --Qian
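
As an illustration of the last point about a const system, here is a sketch that assumes a D2 compiler, where `immutable` is enforced on array contents (the D1 compiler discussed in this thread performs no such check). The two static asserts confirm that the problematic lines from the original program would be rejected at compile time:

```d
// Assumption: D2 compiler; the name CSTR is taken from the thread.
immutable char[] CSTR = "hello";

void main()
{
    // Writing through the constant does not compile.
    static assert(!__traits(compiles, CSTR[0] = 'J'));

    // Neither does binding it to a mutable slice, as `this.str = CSTR` did.
    static assert(!__traits(compiles, { char[] m = CSTR; }));

    // An explicit copy is required before modifying.
    char[] mine = CSTR.dup;
    mine[0] = 'J';
    assert(mine == "Jello");
    assert(CSTR == "hello");
}
```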
Jan 19 2009