digitalmars.D.learn - question about passing associative array to a function

"rbutler" <rbutler mtsu.edu> writes:
I have searched and cannot understand something about passing
AAs to a function.
I have reduced the gist of the question to a tiny program below.
If I put "ref" in the function statement it works, i.e.:
         ref int[int] aa
My confusion is that AAs are supposed to be passed as refs
anyway, so I do not understand why I should have to use ref to
make it work.

Relatedly, it also works if I uncomment the line    d[9] = 9;

Thanks for any helpful comments you can make.
--rbutler

import std.stdio;

void test(int[int] aa, int x) {
     aa[x] = x;
     aa[8] = 8;
}

void main() {
     int[int] d;
     writeln(d.length);
     // d[9] = 9;
     test(d, 0);
     writeln(d);
}
May 11 2014
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 11 May 2014 at 14:46:35 UTC, rbutler wrote:
> My confusion is that AAs are supposed to be passed as refs
> anyway, so I do not understand why I should have to use ref
> to make it work.

There are problems with the implementation of associative arrays. What you are seeing above is a consequence of the associative array not being correctly initialised (I think...).

I often create my associative arrays with the following function to avoid the problem you're having:

/// Hack to properly initialise an empty AA
auto initAA(T)()
{
    T t = [typeof(T.keys[0]).init : typeof(T.values[0]).init];
    t.remove(typeof(T.keys[0]).init);
    return t;
}

import std.stdio;

void test(int[int] aa, int x) {
    aa[x] = x;
    aa[8] = 8;
}

void main() {
    int[int] d = initAA!(int[int]);
    test(d, 0);
    writeln(d);
}
May 11 2014
"rbutler" <rbutler mtsu.edu> writes:
On Sunday, 11 May 2014 at 15:22:29 UTC, John Colvin wrote:
> There are problems with the implementation of associative
> arrays. What you are seeing above is a consequence of the
> associative array not being correctly initialised (I think...).

OK. :-) That makes it difficult to talk about in a classroom, especially when trying to stress adherence to the principle of least surprise. Thanks very much for the quick reply.
May 11 2014
"Ali Çehreli" <acehreli yahoo.com> writes:
On 05/11/2014 07:46 AM, rbutler wrote:

> My confusion is that AAs are supposed to be passed as refs
> anyway, so I do not understand why I should have to use ref
> to make it work.
>
> Related, it also works if I UN-comment the line    d[9] = 9;

The problem is with the initial state of associative arrays, which happens to be null. When AAs are copied when null, both copies are null, not being associated with anything, not even an initial table to store the hash buckets in. As a result, null AAs cannot be references to each other's (non-existent) data.

When a null AA starts receiving data, it first creates its own data memory but the other one cannot know about that data.

ref parameter works because then there is only one AA to speak of. d[9] entry works as well because then the first AA is not null.

Ali
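
The three cases described above can be sketched as follows. This is only a minimal illustration, assuming a current D compiler; the function names fillByValue and fillByRef are made up for the example and do not come from the thread.

```d
import std.stdio;

// By-value parameter: the function gets its own copy of the AA handle.
void fillByValue(int[int] aa) { aa[8] = 8; }

// ref parameter: the function works on the caller's handle itself.
void fillByRef(ref int[int] aa) { aa[8] = 8; }

void main()
{
    int[int] a;                 // default-initialized, so the handle is null
    assert(a is null);
    fillByValue(a);             // the table is allocated for the copy only
    assert(a.length == 0);      // the caller's handle is still empty

    int[int] b;
    fillByRef(b);               // the caller's own handle gets the new table
    assert(b.length == 1 && b[8] == 8);

    int[int] c;
    c[9] = 9;                   // the first insertion allocates the table
    fillByValue(c);             // the copy now shares that table
    assert(c.length == 2 && c[8] == 8);

    writeln(a, " ", b, " ", c);
}
```

Only the first case loses the insertion, because it is the only one in which the caller's handle is still null at the moment of the call.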
May 11 2014
"John Colvin" <john.loughran.colvin gmail.com> writes:
On Sunday, 11 May 2014 at 16:54:18 UTC, Ali Çehreli wrote:
> The problem is with the initial state of associative arrays,
> which happens to be null.

Remind me again why we can't just change this to a sensible initial state? Or at least add a .initialize()?
May 11 2014
"Ali Çehreli" <acehreli yahoo.com> writes:
On 05/11/2014 10:00 AM, John Colvin wrote:
> Remind me again why we can't just change this to a sensible initial state?

First, I am not familiar with the current implementation of AAs and I deduced what I've written just from the behavior. I think it is this way primarily for lazy initialization so that nothing is done until there is at least one element. It could still work as expected though if there were another level of indirection, which would naturally add some cost. (Although, lazy initialization brings a constant cost as well, right? In the form of "has this been initialized yet"; but that cost is as cheap as checking the value of a local variable. On the other hand, an indirection would be cache-unfriendly. And this is pure speculation... :p)

> Or at least add a .initialize()?

Your initAA() function seems to be the only way that a user can manage to do that. Although, it would help if we renamed it as initialize() and added a template constraint so that it is called only for AAs.

Ali
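
A rough sketch of what such a constrained helper might look like. It is hypothetical: the name initialize is just the one suggested above, and the insert-then-remove trick relies on the runtime keeping the table allocated after the last entry is removed, which is an implementation detail rather than a documented guarantee. isAssociativeArray, KeyType and ValueType come from std.traits.

```d
import std.stdio;
import std.traits : isAssociativeArray, KeyType, ValueType;

/// Hypothetical helper: returns an empty AA whose table is already allocated.
/// The template constraint rejects anything that is not an associative array.
T initialize(T)() if (isAssociativeArray!T)
{
    alias K = KeyType!T;
    alias V = ValueType!T;
    T aa = [K.init : V.init];   // inserting one entry forces the allocation
    aa.remove(K.init);          // drop it again, leaving an empty AA
    return aa;
}

void test(int[int] aa, int x)
{
    aa[x] = x;
    aa[8] = 8;
}

void main()
{
    auto d = initialize!(int[int])();
    assert(d.length == 0);
    test(d, 0);                 // d is passed by value, without ref
    writeln(d.length);

    // The constraint keeps the helper from being used with non-AA types.
    static assert(!__traits(compiles, initialize!(int[])()));
}
```

With this helper, test() can take its parameter by value and the caller should still see both entries, since the copy and the original share the table that already exists.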
May 11 2014
Jonathan M Davis via Digitalmars-d-learn writes:
On Sun, 11 May 2014 17:00:13 +0000
> Remind me again why we can't just change this to a sensible
> initial state? Or at least add a .initialize()?

All reference types have a null init value. Arrays and classes have the exact same issue as AAs. Anything else would require not only allocating memory but also that the state persist from compile time to runtime, because the init value must be known at compile time, and there are many cases where a variable exists at compile time (e.g. a module-level or static variable), making delayed initialization problematic.

Previously, it was impossible to allocate anything other than arrays at compile time and have its state persist through to runtime, though it's now possible to do that with classes (I don't know about AAs). So, it _might_ now be possible to make it so that AAs had an init value other than null, but because there's only one init value per type, even if the init value for AAs wasn't null, it wouldn't solve the problem. It would just result in all AAs of the same type sharing the same value unless they were directly initialized rather than having their init value used.

Essentially, the way that default-initialization works in D makes it so that a default-initialized AA can't be its own value like you're looking for. For that, we'd need default construction (like C++ has), but then we'd lose out on the benefits of having a known init value for all types and would have the problems that that was meant to solve. It causes us problems with structs too for similar reasons (the lack of default construction there also gets complained about fairly frequently).

Ultimately, it's a set of tradeoffs, and you're running into the negative side of this particular one.

- Jonathan M Davis
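
The same effect can be reproduced with dynamic arrays and classes. A small sketch, in which the names grow, create and Counter are made up for illustration:

```d
import std.stdio;

class Counter { int n; }

// Each function receives a copy of a null reference and changes only the copy.
void grow(int[] arr)   { arr ~= 1; }
void create(Counter c) { c = new Counter; }

void main()
{
    int[] arr;                  // default-initialized: null, length 0
    Counter c;                  // default-initialized: null

    grow(arr);                  // the append allocates for the copy only
    create(c);                  // the new object is bound to the copy only

    assert(arr.length == 0);
    assert(c is null);
    writeln(arr, " ", c is null);
}
```

As with the AA, declaring either parameter as ref would make the change visible to the caller.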
May 11 2014
"ed" <sillymongrel gmail.com> writes:
On Sunday, 11 May 2014 at 14:46:35 UTC, rbutler wrote:
> My confusion is that AAs are supposed to be passed as refs
> anyway, so I do not understand why I should have to use ref
> to make it work.

The AA is passed by value but its underlying data is referenced, making the copy cheap. The snippet below also shows the same behaviour even when the AA has data in it before calling the function.

---
import std.stdio;

void func(string[int] aa)
{
    writefln("[FUNC1] &aa:%s=%s", &aa, aa);
    // Reassign the data here in func()'s copy and
    // main never sees it
    aa = [2:"two"];
    writefln("[FUNC2] &aa:%s=%s", &aa, aa);
}

void main()
{
    string[int] aa;
    aa[1] = "one";
    writefln("[MAIN1] &aa:%s=%s", &aa, aa);
    func(aa);
    writefln("[MAIN2] &aa:%s=%s", &aa, aa);
}
---

It is the same as passing a C++ shared_ptr<> by value.

Cheers,
ed
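
To make the handle-versus-data distinction concrete, here is a short sketch using only core language features. It contrasts inserting through a copied handle with rebinding that copy:

```d
import std.stdio;

void main()
{
    string[int] a = [1: "one"];
    auto b = a;                 // copies the handle, not the table
    assert(a is b);             // both handles refer to the same table

    b[2] = "two";               // an insertion through one handle...
    assert(2 in a);             // ...is visible through the other

    b = [3: "three"];           // rebinding b points it at a new table
    assert(a !is b);
    assert(3 !in a && a.length == 2);

    writeln(a.length, " ", b.length);
}
```

Insertion modifies the shared table, while assignment only changes which table the local handle points at, which is why func() in the snippet above cannot affect main's AA.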
May 11 2014