digitalmars.D.learn - question about passing associative array to a function


"rbutler" <rbutler mtsu.edu> writes:

I have searched but cannot understand something about passing
AAs to a function.
I have reduced the question to the tiny program below.
If I put "ref" in the function signature it works (main sees the
entries added by test), i.e.:
         ref int[int] aa
My confusion is that AAs are supposed to be passed by reference
anyway, so I do not understand why I should have to use ref to
make it work.

Relatedly, it also works if I uncomment the line    d[9] = 9;

Thanks for any helpful comments you can make.
--rbutler

import std.stdio;

void test(int[int] aa, int x) {
     aa[x] = x;
     aa[8] = 8;
}

void main() {
     int[int] d;
     writeln(d.length);
     // d[9] = 9;
     test(d, 0);
     writeln(d);
}

May 11 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

There are problems with the implementation of associative arrays. 
What you are seeing above is a consequence of the associative 
array not being correctly initialised (I think...).

I often create my associative arrays with the following function 
to avoid the problem you're having:

/// Hack to properly initialise an empty AA
auto initAA(T)()
{
	T t = [typeof(T.keys[0]).init : typeof(T.values[0]).init];
	t.remove(typeof(T.keys[0]).init);
	return t;
}

import std.stdio;

void test(int[int] aa, int x) {
     aa[x] = x;
     aa[8] = 8;
}

void main() {
     int[int] d = initAA!(int[int]);
     test(d, 0);
     writeln(d);
}

May 11 2014

"rbutler" <rbutler mtsu.edu> writes:

OK. :-)
That makes it difficult to talk about in a classroom, especially
when trying to stress adherence to the principle of least surprise.
Thanks very much for the quick reply.

May 11 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

The problem is with the initial state of associative arrays, which
happens to be null. When a null AA is copied, both copies are null:
neither is associated with anything, not even an initial table to
store the hash buckets in. As a result, null AAs cannot be references
to each other's (nonexistent) data.

When a null AA starts receiving data, it first allocates its own
storage, but the other copy cannot know about that storage.

The ref parameter works because then there is only one AA to speak of.

Adding the d[9] entry works as well because then the first AA is not
null when it is copied.
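
The three cases above can be put side by side in a minimal sketch. The helper names addEntries and addEntriesByRef are made up for illustration and are not part of the original program; the asserts state the behaviour described in this thread.

```d
import std.stdio : writeln;

// Hypothetical helper: receives its own copy of the AA handle.
void addEntries(int[int] aa)
{
    aa[0] = 0;
    aa[8] = 8;
}

// Hypothetical helper: operates on the caller's variable itself.
void addEntriesByRef(ref int[int] aa)
{
    aa[0] = 0;
    aa[8] = 8;
}

void main()
{
    int[int] a;              // default-initialized, so null
    assert(a is null);
    addEntries(a);           // the copy allocates storage; a stays null
    assert(a is null && a.length == 0);

    int[int] b;
    addEntriesByRef(b);      // only one AA variable is involved
    assert(b.length == 2);

    int[int] c;
    c[9] = 9;                // c now has storage of its own
    addEntries(c);           // the copy refers to the same storage
    assert(c.length == 3);

    writeln(a.length, " ", b.length, " ", c.length);
}
```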

Ali

May 11 2014

"John Colvin" <john.loughran.colvin gmail.com> writes:

Remind me again why we can't just change this to a sensible 
initial state? Or at least add a .initialize()?

May 11 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

On 05/11/2014 10:00 AM, John Colvin wrote:
 Remind me again why we can't just change this to a sensible initial
 state?

First, I am not familiar with the current implementation of AAs and I 
deduced what I've written just from the behavior.

I think it is this way primarily for lazy initialization so that nothing 
is done until there is at least one element. It could still work as 
expected though if there were another level of indirection, which would 
naturally add some cost. (Although, lazy initialization brings a 
constant cost as well, right? In the form of "has this been initialized 
yet"; but that cost is as cheap as checking the value of a local 
variable. On the other hand, an indirection would be cache-unfriendly. 
And this is pure speculation... :p)

 Or at least add a .initialize()?

Your initAA() function seems to be the only way that a user can manage 
to do that. Although, it would help if we renamed it as initialize() and 
added a template constraint so that it is called only for AAs.
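
One way that suggestion might look is sketched below. It assumes the same insert-then-remove trick as initAA() and uses isAssociativeArray, KeyType and ValueType from std.traits; the name initialize is taken from the suggestion above and is not an existing library function.

```d
import std.traits : isAssociativeArray, KeyType, ValueType;

/// Sketch only: returns an empty but non-null AA of type T.
T initialize(T)()
    if (isAssociativeArray!T)
{
    // Same trick as initAA(): insert a dummy entry, then remove it,
    // which leaves the AA empty but with allocated storage.
    T aa = [KeyType!T.init : ValueType!T.init];
    aa.remove(KeyType!T.init);
    return aa;
}

void test(int[int] aa, int x)
{
    aa[x] = x;
    aa[8] = 8;
}

void main()
{
    auto d = initialize!(int[int])();
    assert(d !is null && d.length == 0);

    test(d, 0);              // the copy shares d's storage
    assert(d.length == 2);

    // int[] is not an AA, so the constraint rejects the instantiation.
    static assert(!__traits(compiles, initialize!(int[])()));
}
```

The constraint turns a misuse such as initialize!(int[]) into a compile-time error at the call site instead of an error inside the function body.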

Ali

May 11 2014

Jonathan M Davis via Digitalmars-d-learn writes:

On Sun, 11 May 2014 17:00:13 +0000, John Colvin wrote:
 Remind me again why we can't just change this to a sensible
 initial state? Or at least add a .initialize()?

All reference types have a null init value. Arrays and classes have the exact
same issue as AAs. Anything else would require not only allocating memory but
also having that state persist from compile time to runtime, because the init
value must be known at compile time, and there are many cases where a variable
exists at compile time (e.g. a module-level or static variable), making delayed
initialization problematic. Previously, it was impossible to allocate anything
other than arrays at compile time and have its state persist through to
runtime, though it's now possible to do that with classes (I don't know about
AAs).
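
As a rough illustration of the claim that arrays and classes behave the same way, here is a small sketch. The Counter class and the append and create functions are invented for the example. Note that for slices the caller never sees an append made through a by-value parameter, whether or not the slice started out null.

```d
class Counter
{
    int value;
}

// Appending changes only the local slice; the caller's slice is untouched.
void append(int[] arr)
{
    arr ~= 1;
}

// Assigning to the parameter rebinds only the local class reference.
void create(Counter c)
{
    c = new Counter;
    c.value = 1;
}

void main()
{
    int[] arr;               // init value of a dynamic array is null
    assert(arr is null);
    append(arr);
    assert(arr is null && arr.length == 0);

    Counter c;               // init value of a class reference is null
    assert(c is null);
    create(c);
    assert(c is null);
}
```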

So, it _might_ now be possible to make it so that AAs had an init value other
than null, but because there's only one init value per type, even if the init
value for AAs wasn't null, it wouldn't solve the problem. It would just result
in all AAs of the same type sharing the same value unless they were directly
initialized rather than having their init value used.

Essentially, the way that default-initialization works in D makes it so that a
default-initialized AA can't be its own value like you're looking for. For
that, we'd need default construction (like C++ has), but then we'd lose out on
the benefits of having a known init value for all types and would have the
problems that that was meant to solve. It causes us problems with structs too
for similar reasons (the lack of default construction there also gets
complained about fairly frequently).

Ultimately, it's a set of tradeoffs, and you're running into the negative
side of this particular one.

- Jonathan M Davis

May 11 2014

"ed" <sillymongrel gmail.com> writes:

The AA is passed by value, but its underlying data is referenced,
making the copy cheap. The snippet below shows the same kind of
behaviour even when the AA has data in it before the function is
called: reassigning the parameter only rebinds func()'s copy.
---
import std.stdio;

void func(string[int] aa)
{
     // The address differs from main's: this is a separate variable
     writefln("[FUNC1]    &aa:%s=%s", &aa, aa);

     // Reassign the data here in func()'s copy and
     // main never sees it
     aa = [2:"two"];
     writefln("[FUNC2]    &aa:%s=%s", &aa, aa);
}

void main()
{
     string[int] aa;
     aa[1] = "one";
     writefln("[MAIN1]    &aa:%s=%s", &aa, aa);
     func(aa);
     writefln("[MAIN2]    &aa:%s=%s", &aa, aa);
}
---

It is the same as passing a C++ shared_ptr<> by value.
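
To make the distinction concrete, the sketch below contrasts inserting through the by-value copy with reassigning it. The function names insert and rebind are invented for the example.

```d
// Inserts through the copied handle into the storage shared with the caller.
void insert(string[int] aa)
{
    aa[2] = "two";
}

// Points the local copy at a brand-new AA; the caller's AA is untouched.
void rebind(string[int] aa)
{
    aa = [3: "three"];
}

void main()
{
    string[int] aa = [1: "one"];

    insert(aa);
    assert(aa.length == 2 && aa[2] == "two");

    rebind(aa);
    assert(aa.length == 2 && 3 !in aa);
}
```

In shared_ptr terms, insert() modifies the pointee, while rebind() resets only the local pointer.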

Cheers,
ed

May 11 2014
