
digitalmars.D - array cast should be supported corrected

reply Frank Benoit <keinfarbton googlemail.com> writes:
interface I{}
class C : I {}

C[] carray = getInstance();
I[] iarray = carray; // compile error
I[] iarray = cast(I[])carray; // runtime error (1)

// correct way:
I[] iarray = new I[carray.length];
foreach( idx, c; carray ){
	iarray[idx] = c; // implicit cast
}

I use a template for doing this, but this looks so ugly.
I[] iarray = arraycast!(I)(carray);

I think the D compiler should call a runtime method in (1) to do the 
cast in a loop, instead of doing a simple type change that is not 
working correctly.
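
A minimal sketch of what such a template might look like, assuming a current D compiler. Only the call site `arraycast!(I)(carray)` appears in this post, so the body below is an assumption, not the actual helper:

```d
// Hypothetical implementation: only the name and call syntax come from the post.
T[] arraycast(T, S)(S[] source)
{
    auto result = new T[source.length];
    foreach (idx, element; source)
        result[idx] = element; // per-element implicit conversion, O(n) overall
    return result;
}

interface I {}
class C : I {}

void main()
{
    C[] carray = [new C, new C];
    I[] iarray = arraycast!(I)(carray);

    assert(iarray.length == carray.length);
    // Casting each interface reference back yields the original object.
    assert(cast(C) iarray[0] is carray[0]);
    assert(cast(C) iarray[1] is carray[1]);
}
```

The result is a fresh array, so the O(n) cost and the allocation are visible at the call site rather than hidden in a cast.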
Aug 06 2008
next sibling parent reply BCS <ao pathlink.com> writes:
Reply to Frank,

 interface I{}
 class C : I {}
 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)
 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 iarray[idx] = c; // implicit cast
 }
 I use a template for doing this, but this looks so ugly. I[] iarray =
 arraycast!(I)(carray);
 
 I think the D compiler should call a runtime method in (1) to do the
 cast in a loop, instead of doing a simple type change that is not
 working correctly.
 

this has come up before (IIRC) and the reply from Walter was that because the conversion is O(n) it should not be put into a cast that is otherwise O(1). OTOH if the cast(I[])carray is broken, that should be fixed (forbidden).
Aug 06 2008
parent reply Frank Benoit <keinfarbton googlemail.com> writes:
BCS schrieb:
 Reply to Frank,
 
 interface I{}
 class C : I {}
 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)
 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 iarray[idx] = c; // implicit cast
 }
 I use a template for doing this, but this looks so ugly. I[] iarray =
 arraycast!(I)(carray);

 I think the D compiler should call a runtime method in (1) to do the
 cast in a loop, instead of doing a simple type change that is not
 working correctly.

this has come up before (IIRC) and the reply from Walter was that because the conversion is O(n) it should not be put into a cast that is otherwise O(1). OTOH if the cast(I[])carray is broken, that should be fixed (forbidden).

Yes, it is broken by design. When casting from a class reference to an interface reference, an offset must be applied to the numerical value of the reference. So if the O(n) operation (the loop above) is not performed, the resulting array is not usable. However, which is better: a template arraycast that is O(n), or a cast? The programmer always needs to know what the consequences are. By this argument, the string concatenation operator ~ should also be banned from the language, because a hidden heap allocation always has unbounded execution time.
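
The offset can be made visible with a small sketch. It assumes the usual object layout of current D compilers, where an interface reference points at a slot inside the object rather than at its start; the variable names are only illustrative:

```d
interface I {}
class C : I {}

void main()
{
    C c = new C;
    I i = c; // implicit conversion; the compiler adjusts the reference here

    void* classAddress = cast(void*) c;
    void* interfaceAddress = cast(void*) i;

    // Same object, but the two references hold different addresses
    // (assumption: typical layout with the interface slot after the header).
    assert(classAddress !is interfaceAddress);

    // Casting back undoes the adjustment and yields the original object.
    assert(cast(C) i is c);
}
```

Reinterpreting a C[] as an I[] skips this adjustment for every element, which is why the result is unusable.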
Aug 06 2008
next sibling parent BCS <ao pathlink.com> writes:
Reply to Frank,

 However. What is better in a template arraycast with O(n) or a cast?
 The programmer always needs to know what the consequences are. With
 this argumentation the string concatenation ~ should also be banned from
 the language, because hidden heap allocation has always unbounded execution
 time.

Array cats /always/ do an O(?) allocation and an O(n+m) copy. A cast usually does an O(1) transformation. Having the corner case be much worse than the normal case is the problem, not the fact that it's O(n).
Aug 06 2008
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Frank Benoit:
 The programmer always needs to know what the consequences are. With this 
 argumentation the string concatenation ~ should also be banned from the 
 language, because hidden heap allocation has always unbounded execution time.

Here I agree with Walter: a good language *must* act as transparently as humanly possible, otherwise the life of the programmer becomes miserable (the only problems I find in D are when I mix GC-managed pointers with GC-unmanaged pointers in data structures; this leads to complex situations because I haven't written the GC and I don't know much about how its internals work, so in those situations it's *not* transparent for me). But D is a language designed to prevent programmer bugs when possible, so a simple solution is to make the compiler raise an error at compile time in those situations. Then, if people want it, a different kind of cast can optionally be added to D, maybe an array_cast(), that is known to work in O(n).

Bye,
bearophile
Aug 06 2008
prev sibling parent reply "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Aug 7, 2008 at 6:06 AM, BCS <ao pathlink.com> wrote:
 Reply to Frank,

 However. What is better in a template arraycast with O(n) or a cast?
 The programmer always needs to know what the consequences are. With
 this argumentation the string concatenation ~ should also be banned from
 the language, because hidden heap allocation has always unbounded execution
 time.

Array cat's /always/ do a O(?) allocation and a O(n+m) copy.

Not true for ~=. Sometimes it allocates, sometimes it doesn't. You'll have to come up with a better argument than that. :-)

That said, I don't see why writing arraycast!(T)(x) is so much worse than cast(T)x. I wish D had more flavors of cast. The current cast is a very blunt instrument. It seems obvious to me that it's a bad idea to use the same construct to mean "please convert this to a T" and "please pretend the bits of this are actually bits of a T". C++ added all those extra kinds of casts for a reason.
 Cast usually
 does a O(1) transformation. Having the corner case be much worse than the
 normal case is the problem, not the fact it's O(n).

I think casting to an interface is O(n) in the number of interfaces the object supports, no? --bb
Aug 06 2008
parent reply BCS <ao pathlink.com> writes:
Reply to Bill,

 On Thu, Aug 7, 2008 at 6:06 AM, BCS <ao pathlink.com> wrote:
 
 Reply to Frank,
 
 However. What is better in a template arraycast with O(n) or a cast?
 The programmer always needs to know what the consequences are. With
 this argumentation the string concatenation ~ should also be banned
 from
 the language, because hidden heap allocation has always unbounded
 execution
 time.


 Not true for ~=. Sometimes allocates sometimes doesn't. You'll have
 to come up with a better argument than that. :-)

http://www.digitalmars.com/d/1.0/arrays.html
"Concatenation always creates a copy of its operands"

http://www.digitalmars.com/d/1.0/expression.html
"Assignment operator expressions, such as: a op= b are semantically equivalent to: a = a op b"
 Cast usually
 does a O(1) transformation. Having the corner case be much worse than
 the
 normal case is the problem, not the fact it's O(n).

 I think casting to an interface is O(n) in the number of interfaces
 the object supports, no?

I think that it's just a simple addition, as the interface vtbl in a given class is at a constant offset from the base of the object. I don't /think/ a class-to-interface cast can fail at run time. OTOH, casting from an interface to a base class of the actual type... I haven't a clue what that's doing. It might be O(n) in the number of derivations.

The O(n) I was looking at would be in the length of the array.
 
 --bb
 

Aug 06 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"BCS" <ao pathlink.com> wrote in message 
news:55391cb3300e68cac5e93619c98c news.digitalmars.com...
 Reply to Bill,

 On Thu, Aug 7, 2008 at 6:06 AM, BCS <ao pathlink.com> wrote:

 Reply to Frank,

 However. What is better in a template arraycast with O(n) or a cast?
 The programmer always needs to know what the consequences are. With
 this argumentation the string concatenation ~ should also be banned
 from
 the language, because hidden heap allocation has always unbounded
 execution
 time.


 Not true for ~=. Sometimes allocates sometimes doesn't. You'll have
 to come up with a better argument than that. :-)

http://www.digitalmars.com/d/1.0/arrays.html "Concatenation always creates a copy of its operands" http://www.digitalmars.com/d/1.0/expression.html "" Assignment operator expressions, such as: a op= b are semantically equivalent to: a = a op b ""

I once thought as you do. However, there is a conflicting statement later in that page:

http://www.digitalmars.com/d/1.0/arrays.html#resize

Although it says that it applies to the ~ and ~= operators, in reality it only applies to the ~= operator. I think that is an error in the spec. Go ahead and try it out:

char[] x = new char[1];
x[0] = 'a';
char[] y = x;
x ~= 'a';
assert(x.ptr == y.ptr);
y ~= 'b';
assert(x[1] == 'b');
// now the concatenation operator
y = y ~ 'c';
assert(y.ptr != x.ptr);

-Steve
Aug 07 2008
parent BCS <ao pathlink.com> writes:
Reply to Steven,

 "BCS" <ao pathlink.com> wrote in message
 news:55391cb3300e68cac5e93619c98c news.digitalmars.com...
 
 Reply to Bill,
 
 On Thu, Aug 7, 2008 at 6:06 AM, BCS <ao pathlink.com> wrote:
 
 Reply to Frank,
 
 However. What is better in a template arraycast with O(n) or a
 cast?
 The programmer always needs to know what the consequences are.
 With
 this argumentation the string concatenation ~ should also be
 banned
 from
 the language, because hidden heap allocation has always unbounded
 execution
 time.


 Not true for ~=. Sometimes allocates sometimes doesn't. You'll
 have to come up with a better argument than that. :-)

 http://www.digitalmars.com/d/1.0/arrays.html
 "Concatenation always creates a copy of its operands"
 http://www.digitalmars.com/d/1.0/expression.html
 "Assignment operator expressions, such as: a op= b are semantically equivalent to: a = a op b"

 I once thought as you do. However, there is a conflicting statement
 later in that page: http://www.digitalmars.com/d/1.0/arrays.html#resize
 Although it says that it applies to the ~ and ~= operator, in reality it only applies to the ~= operator, I think that is an error in the spec.

OK, new version: "Array cats are /always/ supposed to do an O(?) allocation and an O(n+m) copy."
Aug 08 2008
prev sibling next sibling parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Frank Benoit Wrote:

 interface I{}
 class C : I {}
 
 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)
 
 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 	iarray[idx] = c; // implicit cast
 }
 
 I use a template for doing this, but this looks so ugly.
 I[] iarray = arraycast!(I)(carray);
 
 I think the D compiler should call a runtime method in (1) to do the 
 cast in a loop, instead of doing a simple type change that is not 
 working correctly.

The only way to do this is if the cast performed a copy. Consider:

interface I { }
class A : I { }
class B : I { }

A[] aarray;
I[] iarray = aarray;
iarray ~= new B();
A a = aarray[0]; // you BROKE my TYPE SYSTEM -- this is an instance of B

When upcasting an array you NEED to copy; you cannot alias. And a cast should never make a copy, since that's unexpected behavior.

In Java, List<A> is not upcastable to List<I>, only to List<? extends I>, to solve this exact problem. This syntax is pretty ugly, though, so I wouldn't recommend it for D.
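
For contrast, a minimal sketch of the copying approach, assuming a current D compiler (the variable names mirror the example above; nothing here is a proposed language feature). Because the I[] is a separate array, adding a B to it cannot leak into the A[]:

```d
interface I {}
class A : I {}
class B : I {}

void main()
{
    A[] aarray = [new A];

    // Explicit O(n) copy: each element is converted individually.
    I[] iarray = new I[aarray.length];
    foreach (idx, a; aarray)
        iarray[idx] = a;

    iarray ~= new B; // only the copy grows

    assert(aarray.length == 1);
    assert(iarray.length == 2);
    assert(cast(A) iarray[0] is aarray[0]); // still the same A object
    assert(cast(A) iarray[1] is null);      // the B is not an A, and the cast says so
}
```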
Aug 06 2008
next sibling parent maelp <mael.primet gmail.com> writes:
 interface I { }
 class A : I { }
 class B : I { }
 
 A[] aarray;
 I[] iarray = aarray;
 iarray ~= new B();
 A a = aarray[0]; // you BROKE my TYPE SYSTEM -- this is an instance of B

I guess, for most of the D programming language design decisions, it would help if we had a kind of wiki listing those counterexamples, because they help a lot when trying to wrap one's head around "why is this feature un/available". We should just collect those and possibly link them from the D documentation.
Aug 07 2008
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Robert Fraser" wrote
 Frank Benoit Wrote:

 interface I{}
 class C : I {}

 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)

 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 iarray[idx] = c; // implicit cast
 }

 I use a template for doing this, but this looks so ugly.
 I[] iarray = arraycast!(I)(carray);

 I think the D compiler should call a runtime method in (1) to do the
 cast in a loop, instead of doing a simple type change that is not
 working correctly.

The only way to do this is if the cast performed a copy. Consider: interface I { } class A : I { } class B : I { } A[] aarray; I[] iarray = aarray; iarray ~= new B(); A a = aarray[0]; // you BROKE my TYPE SYSTEM -- this is an instance of B When upcasting an array you NEED to copy; you cannot alias. And a cast should never make a copy, since that's unexpected behavior.

Aliasing is possible. The problem is that for interfaces, the pointer is changed. If you change I to be a class, not an interface, then your code will compile (of course, it has the same problem, as you have added a B to the end of an A array). But in general, casting from A[] to B[], where A is implicitly castable to B and both A and B use the same amount of space, is allowed. Except for interfaces :)

-Steve
Aug 07 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Steven Schveighoffer Wrote:

 "Robert Fraser" wrote
 Frank Benoit Wrote:

 interface I{}
 class C : I {}

 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)

 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 iarray[idx] = c; // implicit cast
 }

 I use a template for doing this, but this looks so ugly.
 I[] iarray = arraycast!(I)(carray);

 I think the D compiler should call a runtime method in (1) to do the
 cast in a loop, instead of doing a simple type change that is not
 working correctly.

The only way to do this is if the cast performed a copy. Consider: interface I { } class A : I { } class B : I { } A[] aarray; I[] iarray = aarray; iarray ~= new B(); A a = aarray[0]; // you BROKE my TYPE SYSTEM -- this is an instance of B When upcasting an array you NEED to copy; you cannot alias. And a cast should never make a copy, since that's unexpected behavior.

Aliasing is possible. The problem is that for interfaces, the pointer is changed. If you change I to be a class, not an interface, then your code will compile (of course, it has the same problem as you have added a B to the end of an A array, but in general, casting from A[] to B[] where A is implicitly castable to B and both A and B use the same amount of space, is allowed. Except for Interfaces :) )

If you can replace "I" with an abstract class and get it to compile right, that's a good old-fashioned bug. It doesn't matter if A and B are the same size (unless you're using scope classes, which might lead to slicing); the problem is that the bits are being interpreted as different types than they are.

// Note these definitions -- A and B are not type-compatible!
class Base { }
class A : Base { char[] str; }
class B : Base { int x, y; this(int x, int y) { this.x = x; this.y = y; } }

A[] a_array;
Base[] base_array = a_array;
base_array ~= new B(5, 10);
A a = a_array[0];
writefln(a.str); // Segfault!

I don't have a D compiler on this computer so I can't test it out, but if the compiler can implicitly cast A[] to Base[], then this is a hole in the type system. The issue with interfaces is tangential.

It's also extremely easy to overlook this error. Say you're passing an array of A[] to a function that modifies an array of Base[] by possibly adding things. In a large system, it can be a tricky bug to track down, since the error may not manifest itself until a while after it happens, and may not always be a segfault (if they're the same size and it's just the bits being interpreted as the wrong type, things could get very ugly).
Aug 07 2008
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Robert Fraser" wrote
 Steven Schveighoffer Wrote:

 "Robert Fraser" wrote
 Frank Benoit Wrote:

 interface I{}
 class C : I {}

 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)

 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
 iarray[idx] = c; // implicit cast
 }

 I use a template for doing this, but this looks so ugly.
 I[] iarray = arraycast!(I)(carray);

 I think the D compiler should call a runtime method in (1) to do the
 cast in a loop, instead of doing a simple type change that is not
 working correctly.

The only way to do this is if the cast performed a copy. Consider: interface I { } class A : I { } class B : I { } A[] aarray; I[] iarray = aarray; iarray ~= new B(); A a = aarray[0]; // you BROKE my TYPE SYSTEM -- this is an instance of B When upcasting an array you NEED to copy; you cannot alias. And a cast should never make a copy, since that's unexpected behavior.

Aliasing is possible. The problem is that for interfaces, the pointer is changed. If you change I to be a class, not an interface, then your code will compile (of course, it has the same problem as you have added a B to the end of an A array, but in general, casting from A[] to B[] where A is implicitly castable to B and both A and B use the same amount of space, is allowed. Except for Interfaces :) )

If you can replace "I" with an abstract class and get it to compile right, that's a good old-fashioned bug. It doesn't matter if A and B are the same size (unless you're using scope classes which might lead to slicing), the problem is that the bits are being interpreted as different types than they are. // Note these definitions -- A and B are not type-compatible! class Base { } class A : Base { char[] str; } class B : Base { int x, int y; } A[] a_array; Base[] base_array = a_array; base_array ~= new B(5, 10); A a = a_array[0]; writefln(a.str); // Segfault!

Actually, because of the way arrays work, this will cause a segfault only in release mode, because a_array is still 0-length (and null). Without release mode, you get an array bounds exception.

A better example would be:

A[] a_array;
a_array ~= new A;
Base[] base_array = a_array;
B b = new B;
b.x = b.y = 100;
base_array[0] = b;
A a = a_array[0]; // a now points to b
writefln(a.str); // Segfault!
 I don't have a D compiler on this computer so I can't test it out, but if
 the compiler can implicitly cast A[] to Base[], then this is a hole in the 
 type
 system. The issue with interfaces is tangential.

The issues are orthogonal. When casting to an interface array O(n) copying is the only reasonable choice. O(1) aliasing is OK for a base class if you are not planning on changing the existing elements.
 It's also extremely easy to overlook this error. Say you're passing an 
 array
 of A[] to a function that modifies an array of Base[] by possibly adding
 things. In a large system, it can be a tricky bug to track down since the
 error may not manifest itself until a while after it happens, and may not
 always be a segfault (if they're the same size & it's just the bits being
 interpreted as the wrong type, things could get very ugly -- it may not
 always just segfault).

This is not a common case (passing in an abstract base array in which you plan on changing elements). The common case is to use the base class array as a means of writing a common function that *uses* the array, but does not create elements in it. For that, the cast is perfectly safe. Or to use it as a covariant return value. I think the benefits of being able to cast this way outweigh the uncommon pitfalls. I look at it no differently than doing:

class C {int x;}

C c;
c.x = 5; // segfault

But the interface thing is a performance question. Whether or not to implicitly generate heap activity and run an O(n) algorithm by default is a completely different question.

-Steve
Aug 07 2008
parent reply Robert Fraser <fraserofthenight gmail.com> writes:
Steven Schveighoffer Wrote:
 Actually, because of the way arrays work, this will cause a segfault only in 
 release mode, because a_array is still 0-length (and null).  Without release 
 mode, you get an array bounds exception
 
 A better example would be:
 A[] a_array;
 a_array ~= new A;
 Base[] base_array = a_array;
 B b = new B;
 b.x = b.y = 100;
 base_array[0] = b;
 A a = a_array[0]; // a now points to b
 writefln(a.str); // Segfault!

Yes, you're right; my bad.
 I don't have a D compiler on this computer so I can't test it out, but if
 the compiler can implicitly cast A[] to Base[], then this is a hole in the 
 type
 system. The issue with interfaces is tangential.

The issues are orthogonal. When casting to an interface array O(n) copying is the only reasonable choice.

We're saying the same thing ;-P. From wiktionary:
tangential = "Only indirectly related."
orthogonal = "Able to be treated separately."
 O(1) aliasing is OK for a base class if you 
 are not planning on changing the existing elements.

In this case it should require an explicit cast. A[] could be implicitly castable to const(Base[]) but not to const(Base)[] or Base[].
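
A minimal sketch of the read-only case, assuming a current D2 compiler and the const(Base[]) parameter type mentioned above. The weight method and the totalWeight function are hypothetical, invented only for illustration:

```d
class Base
{
    int weight() const { return 1; }
}

class A : Base
{
    override int weight() const { return 2; }
}

// Takes a read-only view: the function can inspect the elements
// but cannot store a different Base into the caller's array.
int totalWeight(const(Base[]) items)
{
    int total = 0;
    foreach (item; items)
        total += item.weight();
    return total;
}

void main()
{
    A[] aArray = [new A, new A];
    // No cast needed: the array is only read through the const view.
    assert(totalWeight(aArray) == 4);
}
```

Since nothing can be written through the const view, the "B stored in an A[]" problem cannot arise on this path.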
 It's also extremely easy to overlook this error. Say you're passing an 
 array
 of A[] to a function that modifies an array of Base[] by possibly adding
 things. In a large system, it can be a tricky bug to track down since the
 error may not manifest itself until a while after it happens, and may not
 always be a segfault (if they're the same size & it's just the bits being
 interpreted as the wrong type, things could get very ugly -- it may not
 always just segfault).

This is not a common case (to pass in a abstract base array in which you plan on changing elements). The common case is to use the base class array as a means of writing a common function that *uses* the array, but does not create elements in it. For that, the cast is perfectly safe. Or to use it as a co-variant return value. I think the benefits of being able to cast this way outweigh the uncommon pitfalls.

I agree the advantages are enough to justify allowing aliasing without serious subversion of the type system. But the cast _must be explicit_, so users see a red flag there. It's potentially dangerous, since you might accidentally interpret one bit pattern as a different type. Short of unions, only a cast can do that.
 I look at it no differently than doing:
 
 class C {int x;}
 
 C c;
 c.x = 5; // segfault

How is that at all the same? In the case of implicit base class casting, you're tricking the compiler (likely without meaning to) into interpreting the data as an incorrect type. In this case, the compiler is acting correctly and you have a bug in your code.
 But the interface thing is a performance question.  Whether or not to 
 implicitly generate heap activity and run an O(n) algorithm by default is a 
 completely different question.

Agreed.
Aug 07 2008
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
"Robert Fraser" wrote
 Steven Schveighoffer Wrote:
 O(1) aliasing is OK for a base class if you
 are not planning on changing the existing elements.

In this case it should require an explicit cast. A[] could be implicitly castable to const(Base[]) but not to const(Base)[] or Base[].

That eliminates the possibility of having covariant functions that return arrays of objects.
 It's also extremely easy to overlook this error. Say you're passing an
 array
 of A[] to a function that modifies an array of Base[] by possibly 
 adding
 things. In a large system, it can be a tricky bug to track down since 
 the
 error may not manifest itself until a while after it happens, and may 
 not
 always be a segfault (if they're the same size & it's just the bits 
 being
 interpreted as the wrong type, things could get very ugly -- it may not
 always just segfault).

This is not a common case (to pass in a abstract base array in which you plan on changing elements). The common case is to use the base class array as a means of writing a common function that *uses* the array, but does not create elements in it. For that, the cast is perfectly safe. Or to use it as a co-variant return value. I think the benefits of being able to cast this way outweigh the uncommon pitfalls.

I agree the advantages are enough to justify allowing aliasing without serious subversion of the type system. But the cast _must be explicit_, so users see a red flag there. It's potentially dangerous, since you might accidentally interpret one bit pattern as a different type. Short of unions, only a cast can do that.
 I look at it no differently than doing:

 class C {int x;}

 C c;
 c.x = 5; // segfault

How is that at all the same? In the case of implicit base class casting, you're tricking the compiler (likely without meaning to) into interpreting the data as an incorrect type. In this case, the compiler is acting correctly and you have a bug in your code.

I'm trying to show that there is a precedent of being able to make code that fails without explicit casting. Everyone knows that the 'right' thing to do is to create a new object before using it. I could say exactly the same thing about your example. The compiler is acting correctly and you have a bug in your code. -Steve
Aug 08 2008
prev sibling parent Frank Benoit <keinfarbton googlemail.com> writes:
Frank Benoit schrieb:
 interface I{}
 class C : I {}
 
 C[] carray = getInstance();
 I[] iarray = carray; // compile error
 I[] iarray = cast(I[])carray; // runtime error (1)
 
 // correct way:
 I[] iarray = new I[carray.length];
 foreach( idx, c; carray ){
     iarray[idx] = c; // implicit cast
 }
 
 I use a template for doing this, but this looks so ugly.
 I[] iarray = arraycast!(I)(carray);
 
 I think the D compiler should call a runtime method in (1) to do the 
 cast in a loop, instead of doing a simple type change that is not 
 working correctly.
 
 
 

I filed this as report 2270.
Aug 07 2008