www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Initialization of std.typecons.RefCounted objects

reply Matthias Walter <xammy xammy.info> writes:
Hi,

I looked at Bug #6153 (Array!(Array!int) failure) and found that the
reason is the following behavior:

Given some type T you wrap as RefCounted!T object, then proper use of
the auto-initialization feature or manual initialization on demand
ensures that no null-pointer dereference will happen. But the other
problem is that as long as the wrapped object is uninitialized, the
object behaves like a value type instead of a reference type.

This exactly is what makes the following code fail:

Array!(Array!int) array2d;
array2d.length = 1;
array2d[0].insert(1);

The inner array "array2d[0]" was not initialized and hence the reference
pointer is null. Since Array.opIndex returns by value, the 'insert'
method is called on a temporary object and does not affect the inner
array (still being empty) which is stored in the outer array.

What do you think about this?

Must the user ensure that the Array container is always initialized
explicitly? If yes, how shall this happen since the only constructor
takes a (non-empty) tuple of new elements. Or shall opIndex return by
reference?

The problem arises in the same way if Array was a class - but then the
user typically knows that after enlarging the outer array, the inner
array is a null pointer which must be created by new.

Best regards,

Matthias Walter
Jul 18 2012
next sibling parent reply travert phare.normalesup.org (Christophe Travert) writes:
Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
 I looked at Bug #6153 (Array!(Array!int) failure) and found that the

 This exactly is what makes the following code fail:
 
 Array!(Array!int) array2d;
 array2d.length = 1;
 array2d[0].insert(1);
 
 The inner array "array2d[0]" was not initialized and hence the reference
 pointer is null. Since Array.opIndex returns by value, the 'insert'
 method is called on a temporary object and does not affect the inner
 array (still being empty) which is stored in the outer array.
 
 What do you think about this?
 
 Must the user ensure that the Array container is always initialized
 explicitly? If yes, how shall this happen since the only constructor
 takes a (non-empty) tuple of new elements. Or shall opIndex return by
 reference?

I think opIndex should return by reference. opIndexAssign is of no help when the user want to use a function that takes a reference (here Array.insert). It is normal that Array uses default construction when someone increases the array's length. Besides that point, I don't see why default-constructed Array have an uninitialised Payload. This makes uninitialised Array behaves unexpectedly, because making a copy and using the copy will not affect the original, which is not the intended reference value behavior. Correcting default-initialization of Array would correct your bug, but would not correct the wrong behavior of Array when the value returned by opIndex is used with a non-const method with other objects (dynamic arrays?). So both have to be changed in my opinion. -- Christophe
Jul 18 2012
next sibling parent travert phare.normalesup.org (Christophe Travert) writes:
I see you found the appropriate entry to discuss this bug:

http://d.puremagic.com/issues/show_bug.cgi?id=6153
Jul 18 2012
prev sibling parent reply travert phare.normalesup.org (Christophe Travert) writes:
"monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
 I think it would be better to "initialize on copy", rather than 
 default initialize. There are too many cases an empty array is 
 created, then initialized on the next line, or passed to 
 something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an Array. I think people use Array more often than they create uninitialized ones that are not going to be used before an other Array instance is assigned to them, so Array would be more efficient if it was default initialized and never check it is initialized again. But that's just speculation.
 You'd get the correct behavior, and everything else (except dupe) 
 works fine anyways.

Keeping the adress of the content secret may be a valuable intention, but as long as properties and opIndex does not allow to correctly forward methods, this is completely broken. Is there even a begining of a plan to implement this? I don't see how properties or opIndex could safely forward methods that uses references and that we do not control without escaping the reference to them. That's not possible until D has a complete control of escaping references, which is not planned in the near or distant future. Not to mention that having a complete control of escaping reference break lots of code anyway, and might considerably decrease the need for ref counted utilities like... Array. Please, at least give me hope that there is light at the end of the tunnel. -- Christophe
Jul 19 2012
parent travert phare.normalesup.org (Christophe Travert) writes:
"monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
 One of the reason the implementation doesn't let you escape a 
 reference is that that reference may become (_unverifiably_) 
 invalid.

The same applies to a dynamic array: it is undistinguishable from a sliced static array. More generally, as long as you allow variables on the stack with no escaped reference tracking, you can't ensure references remain valid. Even in safe code. If I want my references to remain valid, I use dynamic array and garbage collection. If I use Array, I accept that my references may die. Array that protects the validity of their references are awesome. But, IMHO, not at that cost.
 ...That said, I see no reason for the other containers (SList, 
 I'm looking at you), not to expose references.

I'm against not exposing reference, but all containers will be implemented with custom allocator someday.
 The current work around? Copy-Extract, manipulate, re-insert. 
 Sucks.

IMO, what sucks even more is that arr[0].insert(foo) compiles while it has no effect. arr[0] is a R-value, but applying method to R-value is allowed. I don't know the state of debates about forbiding to call non-const methods on R-values. I think this would break too much code.
Jul 19 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 18 July 2012 at 13:32:39 UTC, 
travert phare.normalesup.org (Christophe Travert) wrote:
 I think opIndex should return by reference. opIndexAssign is of 
 no help
 when the user want to use a function that takes a reference 
 (here
 Array.insert).

Having already brought this up before, the answer to that is that the containers inside "container" are meant to be "closed" and not give access to the addresses of their internals, so the return by value is on purpose: Here is Andrei's post about this: http://forum.dlang.org/thread/ceftaiklanejfhodbpix forum.dlang.org?page=2#post-jthdko:24231u:241:40digitalmars.com That said, I had brought up the exact same issue here: http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org
 It is normal that Array uses default construction when
 someone increases the array's length.
 Besides that point, I don't see why default-constructed Array 
 have an
 uninitialised Payload. This makes uninitialised Array behaves
 unexpectedly, because making a copy and using the copy will not 
 affect
 the original, which is not the intended reference value 
 behavior.

I think it would be better to "initialize on copy", rather than default initialize. There are too many cases an empty array is created, then initialized on the next line, or passed to something else that does the initialization proper. You'd get the correct behavior, and everything else (except dupe) works fine anyways.
Jul 18 2012
prev sibling next sibling parent Matthias Walter <xammy xammy.info> writes:
On 07/18/2012 03:32 PM, Christophe Travert wrote:
 Matthias Walter , dans le message (digitalmars.D:172673), a écrit :
 I looked at Bug #6153 (Array!(Array!int) failure) and found that the

 This exactly is what makes the following code fail:

 Array!(Array!int) array2d;
 array2d.length = 1;
 array2d[0].insert(1);

 The inner array "array2d[0]" was not initialized and hence the reference
 pointer is null. Since Array.opIndex returns by value, the 'insert'
 method is called on a temporary object and does not affect the inner
 array (still being empty) which is stored in the outer array.

 What do you think about this?

 Must the user ensure that the Array container is always initialized
 explicitly? If yes, how shall this happen since the only constructor
 takes a (non-empty) tuple of new elements. Or shall opIndex return by
 reference?

I think opIndex should return by reference. opIndexAssign is of no help when the user want to use a function that takes a reference (here Array.insert). It is normal that Array uses default construction when someone increases the array's length.

Okay, I fully agree here.
 Besides that point, I don't see why default-constructed Array have an 
 uninitialised Payload. This makes uninitialised Array behaves 
 unexpectedly, because making a copy and using the copy will not affect 
 the original, which is not the intended reference value behavior.

Well the reason is that no user-defined function is called when a struct is default-constructed and then copied into another variable (the first one is this(this) which only known the new variable). Hence no user code can be written that would initialize the payload. Do we need a paradigm to perform explicit initialization of structs? Best regards, Matthias
Jul 18 2012
prev sibling next sibling parent Matthias Walter <xammy xammy.info> writes:
On 07/19/2012 10:14 AM, Christophe Travert wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
 I think it would be better to "initialize on copy", rather than 
 default initialize. There are too many cases an empty array is 
 created, then initialized on the next line, or passed to 
 something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an Array. I think people use Array more often than they create uninitialized ones that are not going to be used before an other Array instance is assigned to them, so Array would be more efficient if it was default initialized and never check it is initialized again. But that's just speculation.

I agree here. Additionally my question: Is it possible (at the moment) to really do "initialize on copy"? As far as I see it, the only way to interact here is to implement 'this(this)' which is called after bit-copying and hence cannot access the source of the copy process.
Jul 19 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 19 July 2012 at 08:14:25 UTC, 
travert phare.normalesup.org (Christophe Travert) wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172700), a 
 écrit :
 I think it would be better to "initialize on copy", rather 
 than default initialize. There are too many cases an empty 
 array is created, then initialized on the next line, or passed 
 to something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an Array.

I'm back pedaling and agreeing 100% actually. Plus it keeps the implementation simpler. +1 to you.
 Keeping the adress of the content secret may be a valuable 
 intention,
 but as long as properties and opIndex does not allow to 
 correctly
 forward methods, this is completely broken. Is there even a 
 begining of
 a plan to implement this? I don't see how properties or opIndex 
 could
 safely forward methods that uses references and that we do not 
 control
 without escaping the reference to them. That's not possible 
 until D has
 a complete control of escaping references, which is not planned 
 in the
 near or distant future. Not to mention that having a complete 
 control of
 escaping reference break lots of code anyway, and might 
 considerably
 decrease the need for ref counted utilities like... Array.

 Please, at least give me hope that there is light at the end of 
 the
 tunnel.

One of the reason the implementation doesn't let you escape a reference is that that reference may become (_unverifiably_) invalid. Ranges to Arrays, on the other hand, keep a reference to the Array itself, and always* remain valid. (always as long as the Array doesn't shrink past the range's boundaries, but in this case, at least, the range throws an exception, rather than crash the program). A reference to a dynamic array, on the other hand, will ALWAYS be valid 100% of the time, because the original data is never actually destroyed (until unreferenced). So it is perfectly safe to expose references in an array. So escaping references from an Array is something _very_ dangerous, especially in a language that kind of guarantees it won't core dump if you don't manipulate pointers (which could if arrays escaped references). While I get your point, it really feels like a "lose lose" situation here: You get stiffed either way...with Array... ...That said, I see no reason for the other containers (SList, I'm looking at you), not to expose references. The way I see it, Array *could* escape references, but then the whole class would have to be qualified unsafe (if such a thing existed) or something along these lines. The current work around? Copy-Extract, manipulate, re-insert. Sucks. On Thursday, 19 July 2012 at 10:39:56 UTC, Matthias Walter wrote:
 On 07/19/2012 10:14 AM, Christophe Travert wrote:
 I agree here. Additionally my question: Is it possible (at the 
 moment)
 to really do "initialize on copy"? As far as I see it, the only 
 way to
 interact here is to implement 'this(this)' which is called after
 bit-copying and hence cannot access the source of the copy 
 process.

Good point. The answer is no though. A RefCounted (also known as SharedPointer in C++) needs to be initialized first if you want it aliased. Allowing such a behavior is possible, but prohibitively expensive.
Jul 19 2012
prev sibling parent Matthias Walter <xammy xammy.info> writes:
On 07/19/2012 02:16 PM, Christophe Travert wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
 One of the reason the implementation doesn't let you escape a 
 reference is that that reference may become (_unverifiably_) 
 invalid.

The same applies to a dynamic array: it is undistinguishable from a sliced static array. More generally, as long as you allow variables on the stack with no escaped reference tracking, you can't ensure references remain valid. Even in safe code. If I want my references to remain valid, I use dynamic array and garbage collection. If I use Array, I accept that my references may die. Array that protects the validity of their references are awesome. But, IMHO, not at that cost.
 ...That said, I see no reason for the other containers (SList, 
 I'm looking at you), not to expose references.

I'm against not exposing reference, but all containers will be implemented with custom allocator someday.
 The current work around? Copy-Extract, manipulate, re-insert. 
 Sucks.

IMO, what sucks even more is that arr[0].insert(foo) compiles while it has no effect. arr[0] is a R-value, but applying method to R-value is allowed. I don't know the state of debates about forbiding to call non-const methods on R-values. I think this would break too much code.

As it seems the issue really should be resolved by making opIndex return by reference and press thumbs hard that something like a 'scope ref' will be implemented? Furthermore, since RefCounted objects do not behave like reference types until initialized, they *must* be initialized before anything else happens and hence I propose to change std.container.Array like Christophe said: Replace 'isInitialized()' checks by assertions and add a method with which the user explicitly initialized the reference counter. Or is there a reasonable alternative? Best regards, Matthias Walter
Jul 19 2012