digitalmars.D - Initialization of std.typecons.RefCounted objects

Matthias Walter (26/26) Jul 18 2012 Hi,

travert phare.normalesup.org (Christophe Travert) (15/33) Jul 18 2012 I think opIndex should return by reference. opIndexAssign is of no help

travert phare.normalesup.org (Christophe Travert) (2/2) Jul 18 2012 I see you found the appropriate entry to discuss this bug:
monarch_dodra (16/30) Jul 18 2012 Having already brought this up before, the answer to that is that

travert phare.normalesup.org (Christophe Travert) (21/27) Jul 19 2012 Not default-initializing Array has a cost for every legitimate use of an...

Matthias Walter (5/17) Jul 19 2012 I agree here. Additionally my question: Is it possible (at the moment)
monarch_dodra (33/71) Jul 19 2012 I'm back pedaling and agreeing 100% actually. Plus it keeps the

travert phare.normalesup.org (Christophe Travert) (15/22) Jul 19 2012 The same applies to a dynamic array: it is undistinguishable from a

Matthias Walter (12/40) Jul 19 2012 As it seems the issue really should be resolved by making opIndex return

Matthias Walter (9/38) Jul 18 2012 Well the reason is that no user-defined function is called when a struct

Matthias Walter <xammy xammy.info> writes:

Hi,


reason is the following behavior:

Given some type T you wrap as RefCounted!T object, then proper use of
the auto-initialization feature or manual initialization on demand
ensures that no null-pointer dereference will happen. But the other
problem is that as long as the wrapped object is uninitialized, the
object behaves like a value type instead of a reference type.

This exactly is what makes the following code fail:

Array!(Array!int) array2d;
array2d.length = 1;
array2d[0].insert(1);

The inner array "array2d[0]" was not initialized and hence the reference
pointer is null. Since Array.opIndex returns by value, the 'insert'
method is called on a temporary object and does not affect the inner
array (still being empty) which is stored in the outer array.

What do you think about this?

Must the user ensure that the Array container is always initialized
explicitly? If yes, how shall this happen since the only constructor
takes a (non-empty) tuple of new elements. Or shall opIndex return by
reference?

The problem arises in the same way if Array was a class - but then the
user typically knows that after enlarging the outer array, the inner
array is a null pointer which must be created by new.

Best regards,

Matthias Walter

Jul 18 2012

travert phare.normalesup.org (Christophe Travert) writes:

Matthias Walter , dans le message (digitalmars.D:172673), a �crit�:


 This exactly is what makes the following code fail:
 
 Array!(Array!int) array2d;
 array2d.length = 1;
 array2d[0].insert(1);
 
 The inner array "array2d[0]" was not initialized and hence the reference
 pointer is null. Since Array.opIndex returns by value, the 'insert'
 method is called on a temporary object and does not affect the inner
 array (still being empty) which is stored in the outer array.
 
 What do you think about this?
 
 Must the user ensure that the Array container is always initialized
 explicitly? If yes, how shall this happen since the only constructor
 takes a (non-empty) tuple of new elements. Or shall opIndex return by
 reference?

I think opIndex should return by reference. opIndexAssign is of no help 
when the user want to use a function that takes a reference (here 
Array.insert). It is normal that Array uses default construction when 
someone increases the array's length.

Besides that point, I don't see why default-constructed Array have an 
uninitialised Payload. This makes uninitialised Array behaves 
unexpectedly, because making a copy and using the copy will not affect 
the original, which is not the intended reference value behavior.

Correcting default-initialization of Array would correct your bug, but 
would not correct the wrong behavior of Array when the value returned by 
opIndex is used with a non-const method with other objects (dynamic 
arrays?). So both have to be changed in my opinion.

-- 
Christophe

Jul 18 2012

travert phare.normalesup.org (Christophe Travert) writes:

I see you found the appropriate entry to discuss this bug:

http://d.puremagic.com/issues/show_bug.cgi?id=6153

Jul 18 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Wednesday, 18 July 2012 at 13:32:39 UTC, 
travert phare.normalesup.org (Christophe Travert) wrote:
 I think opIndex should return by reference. opIndexAssign is of 
 no help
 when the user want to use a function that takes a reference 
 (here
 Array.insert).

Having already brought this up before, the answer to that is that 
the containers inside "container" are meant to be "closed" and 
not give access to the addresses of their internals, so the 
return by value is on purpose:

Here is Andrei's post about this:
http://forum.dlang.org/thread/ceftaiklanejfhodbpix forum.dlang.org?page=2#post-jthdko:24231u:241:40digitalmars.com

That said, I had brought up the exact same issue here:
http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org

 It is normal that Array uses default construction when
 someone increases the array's length.
 Besides that point, I don't see why default-constructed Array 
 have an
 uninitialised Payload. This makes uninitialised Array behaves
 unexpectedly, because making a copy and using the copy will not 
 affect
 the original, which is not the intended reference value 
 behavior.

I think it would be better to "initialize on copy", rather than 
default initialize. There are too many cases an empty array is 
created, then initialized on the next line, or passed to 
something else that does the initialization proper.

You'd get the correct behavior, and everything else (except dupe) 
works fine anyways.

Jul 18 2012

travert phare.normalesup.org (Christophe Travert) writes:

"monarch_dodra" , dans le message (digitalmars.D:172700), a �crit�:
 I think it would be better to "initialize on copy", rather than 
 default initialize. There are too many cases an empty array is 
 created, then initialized on the next line, or passed to 
 something else that does the initialization proper.

Not default-initializing Array has a cost for every legitimate use of an 
Array. I think people use Array more often than they create 
uninitialized ones that are not going to be used before an other Array 
instance is assigned to them, so Array would be more efficient if it was 
default initialized and never check it is initialized again. But that's 
just speculation.

 You'd get the correct behavior, and everything else (except dupe) 
 works fine anyways.

Keeping the adress of the content secret may be a valuable intention, 
but as long as properties and opIndex does not allow to correctly 
forward methods, this is completely broken. Is there even a begining of 
a plan to implement this? I don't see how properties or opIndex could 
safely forward methods that uses references and that we do not control 
without escaping the reference to them. That's not possible until D has 
a complete control of escaping references, which is not planned in the 
near or distant future. Not to mention that having a complete control of 
escaping reference break lots of code anyway, and might considerably 
decrease the need for ref counted utilities like... Array.

Please, at least give me hope that there is light at the end of the 
tunnel.

-- 
Christophe

Jul 19 2012

Matthias Walter <xammy xammy.info> writes:

On 07/19/2012 10:14 AM, Christophe Travert wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172700), a écrit :
 I think it would be better to "initialize on copy", rather than 
 default initialize. There are too many cases an empty array is 
 created, then initialized on the next line, or passed to 
 something else that does the initialization proper.

 
 Not default-initializing Array has a cost for every legitimate use of an 
 Array. I think people use Array more often than they create 
 uninitialized ones that are not going to be used before an other Array 
 instance is assigned to them, so Array would be more efficient if it was 
 default initialized and never check it is initialized again. But that's 
 just speculation.

I agree here. Additionally my question: Is it possible (at the moment)
to really do "initialize on copy"? As far as I see it, the only way to
interact here is to implement 'this(this)' which is called after
bit-copying and hence cannot access the source of the copy process.

Jul 19 2012

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 19 July 2012 at 08:14:25 UTC, 
travert phare.normalesup.org (Christophe Travert) wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172700), a 
 écrit :
 I think it would be better to "initialize on copy", rather 
 than default initialize. There are too many cases an empty 
 array is created, then initialized on the next line, or passed 
 to something else that does the initialization proper.

 Not default-initializing Array has a cost for every legitimate 
 use of an Array.

I'm back pedaling and agreeing 100% actually. Plus it keeps the 
implementation simpler. +1 to you.

 Keeping the adress of the content secret may be a valuable 
 intention,
 but as long as properties and opIndex does not allow to 
 correctly
 forward methods, this is completely broken. Is there even a 
 begining of
 a plan to implement this? I don't see how properties or opIndex 
 could
 safely forward methods that uses references and that we do not 
 control
 without escaping the reference to them. That's not possible 
 until D has
 a complete control of escaping references, which is not planned 
 in the
 near or distant future. Not to mention that having a complete 
 control of
 escaping reference break lots of code anyway, and might 
 considerably
 decrease the need for ref counted utilities like... Array.

 Please, at least give me hope that there is light at the end of 
 the
 tunnel.

One of the reason the implementation doesn't let you escape a 
reference is that that reference may become (_unverifiably_) 
invalid. Ranges to Arrays, on the other hand, keep a reference to 
the Array itself, and always* remain valid. (always as long as 
the Array doesn't shrink past the range's boundaries, but in this 
case, at least, the range throws an exception, rather than crash 
the program).

A reference to a dynamic array, on the other hand, will ALWAYS be 
valid 100% of the time, because the original data is never 
actually destroyed (until unreferenced). So it is perfectly safe 
to expose references in an array.

So escaping references from an Array is something _very_ 
dangerous, especially in a language that kind of guarantees it 
won't core dump if you don't manipulate pointers (which could if 
arrays escaped references).

While I get your point, it really feels like a "lose lose" 
situation here: You get stiffed either way...with Array...
...That said, I see no reason for the other containers (SList, 
I'm looking at you), not to expose references.

The way I see it, Array *could* escape references, but then the 
whole class would have to be qualified  unsafe (if such a thing 
existed) or something along these lines.

The current work around? Copy-Extract, manipulate, re-insert. 
Sucks.

On Thursday, 19 July 2012 at 10:39:56 UTC, Matthias Walter wrote:
 On 07/19/2012 10:14 AM, Christophe Travert wrote:
 I agree here. Additionally my question: Is it possible (at the 
 moment)
 to really do "initialize on copy"? As far as I see it, the only 
 way to
 interact here is to implement 'this(this)' which is called after
 bit-copying and hence cannot access the source of the copy 
 process.

Good point. The answer is no though. A RefCounted (also known as 
SharedPointer in C++) needs to be initialized first if you want 
it aliased.

Allowing such a behavior is possible, but prohibitively expensive.

Jul 19 2012

travert phare.normalesup.org (Christophe Travert) writes:

"monarch_dodra" , dans le message (digitalmars.D:172710), a �crit�:
 One of the reason the implementation doesn't let you escape a 
 reference is that that reference may become (_unverifiably_) 
 invalid.

The same applies to a dynamic array: it is undistinguishable from a 
sliced static array. More generally, as long as you allow variables on 
the stack with no escaped reference tracking, you can't ensure 
references remain valid. Even in safe code.

If I want my references to remain valid, I use dynamic array and garbage 
collection. If I use Array, I accept that my references may die. Array 
that protects the validity of their references are awesome. But, IMHO, 
not at that cost.

 ...That said, I see no reason for the other containers (SList, 
 I'm looking at you), not to expose references.

I'm against not exposing reference, but all containers will be 
implemented with custom allocator someday.

 The current work around? Copy-Extract, manipulate, re-insert. 
 Sucks.

IMO, what sucks even more is that arr[0].insert(foo) compiles while it 
has no effect. arr[0] is a R-value, but applying method to R-value is 
allowed. I don't know the state of debates about forbiding to call 
non-const methods on R-values. I think this would break too much code.

Jul 19 2012

Matthias Walter <xammy xammy.info> writes:

On 07/19/2012 02:16 PM, Christophe Travert wrote:
 "monarch_dodra" , dans le message (digitalmars.D:172710), a écrit :
 One of the reason the implementation doesn't let you escape a 
 reference is that that reference may become (_unverifiably_) 
 invalid.

 
 The same applies to a dynamic array: it is undistinguishable from a 
 sliced static array. More generally, as long as you allow variables on 
 the stack with no escaped reference tracking, you can't ensure 
 references remain valid. Even in safe code.
 
 If I want my references to remain valid, I use dynamic array and garbage 
 collection. If I use Array, I accept that my references may die. Array 
 that protects the validity of their references are awesome. But, IMHO, 
 not at that cost.
 
 ...That said, I see no reason for the other containers (SList, 
 I'm looking at you), not to expose references.

 
 I'm against not exposing reference, but all containers will be 
 implemented with custom allocator someday.
 
 The current work around? Copy-Extract, manipulate, re-insert. 
 Sucks.

 
 IMO, what sucks even more is that arr[0].insert(foo) compiles while it 
 has no effect. arr[0] is a R-value, but applying method to R-value is 
 allowed. I don't know the state of debates about forbiding to call 
 non-const methods on R-values. I think this would break too much code.

As it seems the issue really should be resolved by making opIndex return
by reference and press thumbs hard that something like a 'scope ref'
will be implemented?

Furthermore, since RefCounted objects do not behave like reference types
until initialized, they *must* be initialized before anything else
happens and hence I propose to change std.container.Array like
Christophe said: Replace 'isInitialized()' checks by assertions and add
a method with which the user explicitly initialized the reference counter.

Or is there a reasonable alternative?

Best regards,

Matthias Walter

Jul 19 2012

Matthias Walter <xammy xammy.info> writes:

On 07/18/2012 03:32 PM, Christophe Travert wrote:
 Matthias Walter , dans le message (digitalmars.D:172673), a écrit :


 This exactly is what makes the following code fail:

 Array!(Array!int) array2d;
 array2d.length = 1;
 array2d[0].insert(1);

 The inner array "array2d[0]" was not initialized and hence the reference
 pointer is null. Since Array.opIndex returns by value, the 'insert'
 method is called on a temporary object and does not affect the inner
 array (still being empty) which is stored in the outer array.

 What do you think about this?

 Must the user ensure that the Array container is always initialized
 explicitly? If yes, how shall this happen since the only constructor
 takes a (non-empty) tuple of new elements. Or shall opIndex return by
 reference?

 
 I think opIndex should return by reference. opIndexAssign is of no help 
 when the user want to use a function that takes a reference (here 
 Array.insert). It is normal that Array uses default construction when 
 someone increases the array's length.

Okay, I fully agree here.

 Besides that point, I don't see why default-constructed Array have an 
 uninitialised Payload. This makes uninitialised Array behaves 
 unexpectedly, because making a copy and using the copy will not affect 
 the original, which is not the intended reference value behavior.

Well the reason is that no user-defined function is called when a struct
is default-constructed and then copied into another variable (the first
one is this(this) which only known the new variable). Hence no user code
can be written that would initialize the payload. Do we need a paradigm
to perform explicit initialization of structs?

Best regards,

Matthias

Jul 18 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Initialization of std.typecons.RefCounted objects