www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - [typing] Type-erasure re generics

reply Justin Johansson <no spam.com> writes:
Sorry if I should ask to this on D.help.

Can someone please inform me if D does type-erasure on generic
types as does Java?

Here's one Java reference on the subject that I found.
http://download.oracle.com/javase/tutorial/java/generics/erasure.html

Also I would like to be informed as to the positives and negatives
aspects of type-erasure as might be appropriate in a D versus Java
context.

Thanks for answers,

-- Justin Johansson
Sep 29 2010
next sibling parent Justin Johansson <no spam.com> writes:
More apologies, perhaps a subject line of
[RTTI] Type-erasure in D generics
would be more apt?

RTTI: run-time type identification/information

-- JJ
Sep 29 2010
prev sibling next sibling parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
Justin Johansson Wrote:

 Also I would like to be informed as to the positives and negatives
 aspects of type-erasure as might be appropriate in a D versus Java
 context.

Here is a quote that is most relevant: "[Type-erasure is] a process where the compiler removes all information related to type parameters and type arguments within a class or method. Type erasure enables Java applications that use generics to maintain binary compatibility with Java libraries and applications that were created before generics." The answer is, D does not do this. The reason is that it does not need to be backwards compatible with Java libraries written prior to generics... There really isn't any benefit to this, the point of types is so the compiler can check that you are only doing valid operations on it. If you start removing type information you might as well just be working with Object and leave it at that (actually use variant, I think it is more type safe.) The only benefit, which should be solve in another manner is having this code work: class A {} class B:A {} class Container(T) {} void main() { Container!(A) a = new Container!(B)(); } I think this was reported as a bug, but can't seem to find it now.
Sep 29 2010
next sibling parent reply Pelle <pelle.mansson gmail.com> writes:
On 09/29/2010 05:53 PM, Jesse Phillips wrote:
 The only benefit, which should be solve in another manner is having this code
work:

 class A {}
 class B:A {}

 class Container(T) {}

 void main() {
      Container!(A) a = new Container!(B)();
 }

Sorry for falling off topic, but that code shouldn't work. a.insert(new A)
Sep 29 2010
parent Jesse Phillips <jessekphillips+D gmail.com> writes:
 On Wednesday, September 29, 2010 10:18:17 Pelle wrote:
 On 09/29/2010 05:53 PM, Jesse Phillips wrote:
 The only benefit, which should be solve in another manner is having this
 code work:
 
 class A {}
 class B:A {}
 
 class Container(T) {}
 
 void main() {
 
      Container!(A) a = new Container!(B)();
 
 }

Sorry for falling off topic, but that code shouldn't work. a.insert(new A)


I will first quote Wikipedia about C#[1]: "For example, in C# 3.0 generic parameters did not support co or contravariance; List<A> was not equivalent to List<TypeDerivedFromTypeA> as one might intuit; however, this is now supported in C# 4.0, though standard arrays have always supported covariance & contravariance since .NET was introduced." Then I will show an example that does compile[2]: void main() { A[] = [new B(), new A()]; } 1. http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science) 2. http://ideone.com/ZzDTs
Sep 29 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 29, 2010 10:18:17 Pelle wrote:
 On 09/29/2010 05:53 PM, Jesse Phillips wrote:
 The only benefit, which should be solve in another manner is having this
 code work:
 
 class A {}
 class B:A {}
 
 class Container(T) {}
 
 void main() {
 
      Container!(A) a = new Container!(B)();
 
 }

Sorry for falling off topic, but that code shouldn't work. a.insert(new A)

Yeah. Having a container of one type should not be castable to a container of another type. They are completely separate types even if they hold types which have an is-a relationship. And your example shows one of the reasons why. It's a fundamental aspect of the concept of generic containers, not a result of the implementation. It _seems_ like it should work, but once you really get into it, it quickly becomes apparent that it doesn't. - Jonathan M Davis
Sep 29 2010
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Sep 2010 13:44:32 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, September 29, 2010 10:18:17 Pelle wrote:
 On 09/29/2010 05:53 PM, Jesse Phillips wrote:
 The only benefit, which should be solve in another manner is having  

 code work:

 class A {}
 class B:A {}

 class Container(T) {}

 void main() {

      Container!(A) a = new Container!(B)();

 }

Sorry for falling off topic, but that code shouldn't work. a.insert(new A)

Yeah. Having a container of one type should not be castable to a container of another type. They are completely separate types even if they hold types which have an is-a relationship. And your example shows one of the reasons why. It's a fundamental aspect of the concept of generic containers, not a result of the implementation. It _seems_ like it should work, but once you really get into it, it quickly becomes apparent that it doesn't.

I think actually that concept is in the latest C# with type contra-variance. Essentially, you should be able to implicitly cast Container!(B) to const(Container!(A)) conceptually, but actually you may have to require only pure calls (I don't think there's a way to do this). I think C# deals with this via designating certain types on the template parameters as "input" or "output" types, and restricting functions from using types in the wrong direction. -Steve
Sep 29 2010
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2010-09-29 17:14, Justin Johansson wrote:
 Sorry if I should ask to this on D.help.

 Can someone please inform me if D does type-erasure on generic
 types as does Java?

 Here's one Java reference on the subject that I found.
 http://download.oracle.com/javase/tutorial/java/generics/erasure.html

 Also I would like to be informed as to the positives and negatives
 aspects of type-erasure as might be appropriate in a D versus Java
 context.

 Thanks for answers,

 -- Justin Johansson

D does not perform type-erasure. It creates a new version of every template for every type that is used with that template. For example: T add (T) (T x, T y) { return x + y; } add(3, 4) add(4.0, 5.0); The above template will generate two instances, like this: int add (int x, int y) { return x + y; } double add (double x, double y) { return x + y; } If we look at D's templates: pro: * stronger type system con: * code bloat * perhaps longer compile time -- /Jacob Carlborg
Sep 29 2010
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday 29 September 2010 08:14:38 Justin Johansson wrote:
 Sorry if I should ask to this on D.help.
 
 Can someone please inform me if D does type-erasure on generic
 types as does Java?
 
 Here's one Java reference on the subject that I found.
 http://download.oracle.com/javase/tutorial/java/generics/erasure.html
 
 Also I would like to be informed as to the positives and negatives
 aspects of type-erasure as might be appropriate in a D versus Java
 context.
 
 Thanks for answers,
 
 -- Justin Johansson

Wow. Goodness no. Generics and templates are _completely_ different. Java is the only language that I'm aware of that does type erasure, and the only reason that it does it is that they needed generics to be backwards compatible, so the generated bytecode is the same with generics as it is without. So, generics are pretty much just compile-time checks in Java. I believe that C# does generics in a way that results in multiple types using the same code but that the type information is maintained with the type, so the VM knows about the generics. D's templates are more like C++'s templates. In both D and C++, templates are literally for generating code. If you declare struct S(T) { T val; } and use S!int and S!double, then two struct types are created (typically you would say that S was instantiated 2 twice): struct S!int { int val; } struct S!double { double val; } They are as seperate as if you had declared struct SInt { int val; } struct SDouble { double val; } If you used only S!int, then only the code for S!int would be created. If you used S!int, S!double, and S!float, then three sets of code would be created, each with its own type. If you didn't use any S, then no code would be created. This is so literal that if you declare struct S(T) { unittest { ... } T val; } then if you don't declare any S's, then the unittest code (with whatever it does) doesn't exist and is never run, while if you declared S!int, S!double, S!float, and S!(S!long), you'd end up with 4 versions of it and run for times (4, since S!(S!long)) would generate both the S!long and S!(S!Long)) types). Neither D or C++ templates are saying to create a class or struct or which takes any type. They're saying to generate that code for each type that you use with it. It's literally as if you had copy and pasted it for every time you needed it for a new type, and replaced T with the new type for each. Take D's eponymous templates for instance (this from std.metastrings): /** * Convert constant argument to a string. */ template toStringNow(ulong v) { static if (v < 10) enum toStringNow = "" ~ cast(char)(v + '0'); else enum toStringNow = toStringNow!(v / 10) ~ toStringNow!(v % 10); } unittest { static assert(toStringNow!(1uL << 62) == "4611686018427387904"); } Using to(toStringNow!(1uL << 62)) recursively generates code, each time replacing the template with the variable with its name until all you have left is the generated value. toStringNow!(1234); becomes template toStringNow!1234 { enum toStringNow!1234 = toStringNow!(123) ~ toStringNow!4; } template toStringNow!123 { enum toStringNow!123 = toStringNow!(12) ~ toStringNow!3; } template toStringNow!4 { enum toStringNow!4 = "" ~ '4'; } tempalet toStringNow!12 { enum toStringNow!12 = toStringNow!1 ~ toStringNow!2; } template toStringNow!2 { enum toStringNow!2 = "" ~ "2"; } template toStringNow!1 { enum toStringNow!1 = "" ~"1" } which is reduced to template toStringNow!1234 { enum toStringNow!1234 = "1234"; } which is reduced to "1234" The code is literally generated it - in this case, recursively so. You can't even dream of doing this sort of thing in Java. And because the code is generated, the type information is always there as much as it would have been had you produced all of that code by hand. The type of S!int won't be erased to S anymore than SInt would have been. The different template instantiations are completely new sets of code. In Java or C#, class S<T> { T val; } is really class S<Object> { Object val; } and that's why it has to do all kinds of casts and box primitive types and whatnot. Since, C++ and D generate completely new sets of code for each template instantiation, they don't have that problem. No casts are necessary, and primitive types are used directly. Now, this is often acused of causing code bloat (though a particularly advanced compiler can actually use the same code underneath for types of the same size - so S!int and S!float would be joined into one while S!long wouldn't be; dmd doesn't do that at this point; it would be an optimization to do so), and C# or Java folks might point that out as a reason that C# and Java's approaches are better. However, C# and Javas's approaches make generics pretty much only useful for container classes. They can't even dream of stuff like eponymous templates. So, while they make gains in the size of the code, they lose out on a _lot_ of power. C++ and D templates are literally code generation mechanisms, and if you want to make full use of them, you need to realize that every time that you instantiate a new template, you are literally generating new code. That has all kinds of powerful implications, and Phobos definitely takes advantage of them. It's not without cost (it _is_ a tradeoff between space and power), but the benefit is so large that most of us would agree that the benefits dwarf the cost in space. And maybe someday dmd will get the aforementioned optimization would be make template instantiations with the same size for their type actually use one set of underlying code. But there's no way that I'd trade D templates for C#'s generics which use one set of code but maintain type information, let alone Java's which use one set of code and erase all type information. - Jonathan M Davis
Sep 29 2010
next sibling parent reply Justin Johansson <no spam.com> writes:
On 30/09/2010 2:18 AM, Jonathan M Davis wrote:
 On Wednesday 29 September 2010 08:14:38 Justin Johansson wrote:
 Can someone please inform me if D does type-erasure on generic
 types as does Java?

Wow. Goodness no. Generics and templates are _completely_ different. Java is the only language that I'm aware of that does type erasure, and the only

Thanks Jonathan et. al. for many well-thought responses and perhaps material for TDPL Edition 2. Andrei will be sure to buy you guys, and other respondents, a beer for each new page written. :-)
Sep 29 2010
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/29/10 9:30 PDT, Justin Johansson wrote:
 On 30/09/2010 2:18 AM, Jonathan M Davis wrote:
 On Wednesday 29 September 2010 08:14:38 Justin Johansson wrote:
 Can someone please inform me if D does type-erasure on generic
 types as does Java?

Wow. Goodness no. Generics and templates are _completely_ different. Java is the only language that I'm aware of that does type erasure, and the only

Thanks Jonathan et. al. for many well-thought responses and perhaps material for TDPL Edition 2. Andrei will be sure to buy you guys, and other respondents, a beer for each new page written. :-)

TDPL includes a long discussion between homogeneous and heterogeneous translation (as defined by the paper "Pizza into Java" by Odersky/Wadler). Andrei
Sep 29 2010
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 And maybe someday dmd will get the aforementioned optimization would be make 
 template instantiations with the same size for their type actually use one set 
 of underlying code.

LDC is already able to do it in not too much complex situations if you compile D1 code using the -mergefunc switch too. I have a LLVM bug report open on this, they say they will further improve this feature. Bye, bearophile
Sep 29 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 29, 2010 11:11:53 Steven Schveighoffer wrote:
 I think actually that concept is in the latest C# with type
 contra-variance.
 
 Essentially, you should be able to implicitly cast Container!(B) to
 const(Container!(A)) conceptually, but actually you may have to require
 only pure calls (I don't think there's a way to do this).
 
 I think C# deals with this via designating certain types on the template
 parameters as "input" or "output" types, and restricting functions from
 using types in the wrong direction.

There may be a way to get it to work with const or some other sort of constraints that disallowed mutation - or at least mutating what is directly in the container (mutating which is referred to by the elements in the container could work) - but you can't allow mutation of the elements directly in the container or you'd be allowed to put incorrect types in the container. I'd hate to be the one to try to get such a feature to work without const like they would have had to do in C#. - Jonathan M Davis
Sep 29 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Sep 2010 14:39:43 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, September 29, 2010 11:11:53 Steven Schveighoffer wrote:
 I think actually that concept is in the latest C# with type
 contra-variance.

 Essentially, you should be able to implicitly cast Container!(B) to
 const(Container!(A)) conceptually, but actually you may have to require
 only pure calls (I don't think there's a way to do this).

 I think C# deals with this via designating certain types on the template
 parameters as "input" or "output" types, and restricting functions from
 using types in the wrong direction.

There may be a way to get it to work with const or some other sort of constraints that disallowed mutation - or at least mutating what is directly in the container (mutating which is referred to by the elements in the container could work) - but you can't allow mutation of the elements directly in the container or you'd be allowed to put incorrect types in the container. I'd hate to be the one to try to get such a feature to work without const like they would have had to do in C#.

I think it may have to be more than just const: class C(T) { static T globalval; } class A {} class B : A {} void foo(C!B c) { const(C!A) c2 = c; c2.globalval = new A; } If the entire class is marked as pure (with pure defined as weak purity a la Don's idea) that might work as setting globalval would be illegal. -Steve
Sep 29 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 29, 2010 11:54:30 Steven Schveighoffer wrote:
 I think it may have to be more than just const:
 
 class C(T)
 {
     static T globalval;
 }
 
 class A {}
 class B : A {}
 
 void foo(C!B c)
 {
    const(C!A) c2 = c;
    c2.globalval = new A;
 }
 
 If the entire class is marked as pure (with pure defined as weak purity a
 la Don's idea) that might work as setting globalval would be illegal.

Ah. Once you get beyond arrays, you have the issue of the container type itself being different. Hmmm. For C#, you might be okay since they're all technically the same type anyway (and would share any globals), but for D, the two types could have absolutes nothing in common with one another. All class variables would be by definition local to each instantiation, so no const would not be enough. But worse still, consider this: With templates, you could have a container type which used completely different implementations depending on the type(s) you use to instantiate it. It could use an array for ints, a hash table for floats, and a red-black tree for everything else. In such a case, would would it even _mean_ to try and assign one container type to another? Sure, if they're classes, and they share the same API, you may be able to get it to work thanks to the fact that you're dealing with references. But for structs? Forget it. Just because the types Array!int and Array!float are instantiated from the same template doesn't mean that they have anything in common other than the base name. The only way that you can even consider having containers be effectively in an inheritance hiercharcy because their elements are in an inheritence hiearchy is if the containers are classes. And since it looks like most - if not all - containers in Phobos are going to be structs, it quickly becomes a moot point for anything but arrays. With arrays, you could do it with just constness, I think. But for other container types? It _might_ be feasible with classes if you could somehow restrict access to all of their class variables and non-pure, static member functions, but it_won't work for structs. - Jonathan M Davis
Sep 29 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Sep 2010 15:16:28 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, September 29, 2010 11:54:30 Steven Schveighoffer wrote:
 I think it may have to be more than just const:

 class C(T)
 {
     static T globalval;
 }

 class A {}
 class B : A {}

 void foo(C!B c)
 {
    const(C!A) c2 = c;
    c2.globalval = new A;
 }

 If the entire class is marked as pure (with pure defined as weak purity  
 a
 la Don's idea) that might work as setting globalval would be illegal.

Ah. Once you get beyond arrays, you have the issue of the container type itself being different. Hmmm. For C#, you might be okay since they're all technically the same type anyway (and would share any globals), but for D, the two types could have absolutes nothing in common with one another. All class variables would be by definition local to each instantiation, so no const would not be enough. But worse still, consider this: With templates, you could have a container type which used completely different implementations depending on the type(s) you use to instantiate it.

Excellent point. If this *were* a feature we wanted, the compiler would have to make that determination (I think it could be done, but I'm not sure of all the nuances).
 It could use an array for ints, a hash table for floats, and a
 red-black tree for everything else.

Wait, the types have to be bit-for-bit compatible, or it doesn't work. You can't cast floats to ints without reinterpreting the data. I don't even think you could do classes to interfaces, because you need to do a translation.
 In such a case, would would it even _mean_
 to try and assign one container type to another? Sure, if they're  
 classes, and
 they share the same API, you may be able to get it to work thanks to the  
 fact
 that you're dealing with references.

No, even with that, you brought it up earlier -- class C could say: static if(is(T == A)) void anotherfunction() {} and then the vtable for C!B is different than C!A, even though the vtable for B is compatible with the vtable for A. There has to be some way for the compiler to determine this, but again, only if this feature makes sense to implement.
 But for structs? Forget it. Just because
 the types Array!int and Array!float are instantiated from the same  
 template
 doesn't mean that they have anything in common other than the base name.

For floats and ints, yes. But this doesn't even matter, because it has to be bit-for-bit compatible. There's no way you could case S!int to S!float without having to translate everything. Something like Array!int to Array!(const int) would be feasible.
 The
 only way that you can even consider having containers be effectively in  
 an
 inheritance hiercharcy because their elements are in an inheritence  
 hiearchy is
 if the containers are classes. And since it looks like most - if not all  
 -
 containers in Phobos are going to be structs

All containers in dcollections are classes ;)
 it quickly becomes a moot point
 for anything but arrays. With arrays, you could do it with just  
 constness, I
 think. But for other container types? It _might_ be feasible with  
 classes if you
 could somehow restrict access to all of their class variables and  
 non-pure,
 static member functions, but it_won't work for structs.

I think it can be done, but it may require really strict rules, and those rules may not be assertable without some new syntax. For example, one place where this can be extremely useful is tail-const. If S!(T) implicitly casts to S!(const(T)), then we can implement tail-const in the library (though I'd prefer a compiler solution). -Steve
Sep 29 2010
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 29 Sep 2010 15:36:02 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 29 Sep 2010 15:16:28 -0400, Jonathan M Davis
 But worse still, consider this: With templates, you could have a  
 container type
 which used completely different implementations depending on the  
 type(s) you use
 to instantiate it.

Excellent point. If this *were* a feature we wanted, the compiler would have to make that determination (I think it could be done, but I'm not sure of all the nuances).

By "that determination" I meant, determine if the template alters its implementation based on the type passed in. -Steve
Sep 29 2010
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 29, 2010 12:36:02 Steven Schveighoffer wrote:
 In such a case, would would it even _mean_
 to try and assign one container type to another? Sure, if they're
 classes, and
 they share the same API, you may be able to get it to work thanks to the
 fact
 that you're dealing with references.

No, even with that, you brought it up earlier -- class C could say: static if(is(T == A)) void anotherfunction() {} and then the vtable for C!B is different than C!A, even though the vtable for B is compatible with the vtable for A. There has to be some way for the compiler to determine this, but again, only if this feature makes sense to implement.

I forgot about the vtable. That makes things even harder. Overall, it looks like you'd have to really work at it t omake it possible and that even then it would only work under limited circumstances, and those circumstances may not even be particularly apparent to the programmer. So, it would probably be a very difficult feature to understand, let alone use, because it wouldn't be at all clear when it would work. And to make things worse, all it would take would be one small change in the container code which broke the binary compatability and all your code that relied on it working with that type would break. So, it may be theoretically possible to get some version container compatability based on the inheritance hiercharcy of their elements to work, but it would be hard to implement, likely difficult to use, and easy to break. It's just easier to cope with containers not being assignable to one another just because the type of the elements of one is derived from the type of the elements of the other. - Jonathan M Davis
Sep 29 2010
prev sibling parent Philippe Sigaud <philippe.sigaud gmail.com> writes:
Jonathan:
 It's just easier to cope with containers not being assignable to one another
 just because the type of the elements of one is derived from the type of the
 elements of the other.

Scala does this with + or - annotation on types: List[+T] : List is parameterized by type T, and is covariant in T. A contravariant version would use [-T] See http://www.scala-lang.org/node/129 In D, a List(T) struct could have an opAssign doing the conversion: struct List(T) { ... void opAssign(U : T)(List!U rhs) { ... walk the list, transform the U's into T's } } You can then test any container to see if it's covariant or contravariant: see if the type defines the correct opAssign. Or maybe an opCast(LU)() if (is(LU lu == List!U, U) && is(U : T)) ^^^^^^^^^^ I'm not sure we can do that in template constraints. We may have to express it with an external template: if( isA!(List, LU) && is(ExtractType!LU : T)) Philippe
Sep 29 2010