www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Type unions in D

reply Justin Johansson <procode adam-dott-com.au> writes:
What's the best way of emulating a system of quantified type unions in D  (D1)?

By type unions (sometimes called algebraic sums in the literature), I don't
mean the C/C++ "union" facility.

Suppose there is some class X, I'd like to have a first class type for
representing the following quantified types (using RegExp notation):
X?  zero or one instances of an X  (call this type XQ)
X a single instance of X
X* zero or more instances of X  (call this type XS)
X+ at least one X (call this type XP)

To unify the whole thing, now let X, XQ, XS and XP all be derived from a common
base type T.

Note that Scala has the Option, Some and None classes which kind of handles
(but not exactly) the XQ case.

Brendan Eich, JavaScript inventor, apparently has suggested type unions for
that language.  Some FP languages also have type systems that support type
unions.

http://wiki.ecmascript.org/doku.php?id=meetings:minutes_oct_13_2005

"Type Unions
Brendan thinks we should have some ability for type unions in the language.

var x : Number | Null

It would be good to have something like this because otherwise developers end
up enforcing their invariants on their own, with the potential for buggy code."

As always, thanks for all replies.

Justin Johansson
Sep 15 2009
next sibling parent Justin Johansson <procode adam-dott-com.au> writes:
Correction:

 Suppose there is some class X, I'd like to have a first class type for
representing the following quantified types (using RegExp notation):
 X?  zero or one instances of an X  (call this type XQ)
 X a single instance of X
 X* zero or more instances of X  (call this type XS)
 X+ at least one X (call this type XP)
 To unify the whole thing, now let X, XQ, XS and XP all be derived from a
common base type T.
Should be:
 X?  zero or one instances of T  (call this type XQ)
 X a single instance of T
 X* zero or more instances of T  (call this type XS)
 X+ at least one T (call this type XP)
JJ
Sep 15 2009
prev sibling next sibling parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Justin Johansson Wrote:

 What's the best way of emulating a system of quantified type unions in D  (D1)?
 
 By type unions (sometimes called algebraic sums in the literature), I don't
mean the C/C++ "union" facility.
 
 Suppose there is some class X, I'd like to have a first class type for
representing the following quantified types (using RegExp notation):
 X?  zero or one instances of an X  (call this type XQ)
 X a single instance of X
 X* zero or more instances of X  (call this type XS)
 X+ at least one X (call this type XP)
 
 To unify the whole thing, now let X, XQ, XS and XP all be derived from a
common base type T.
 
 Note that Scala has the Option, Some and None classes which kind of handles
(but not exactly) the XQ case.
 
 Brendan Eich, JavaScript inventor, apparently has suggested type unions for
that language.  Some FP languages also have type systems that support type
unions.
 
 http://wiki.ecmascript.org/doku.php?id=meetings:minutes_oct_13_2005
 
 "Type Unions
 Brendan thinks we should have some ability for type unions in the language.
 
 var x : Number | Null
 
 It would be good to have something like this because otherwise developers end
up enforcing their invariants on their own, with the potential for buggy code."
 
 As always, thanks for all replies.
 
 Justin Johansson
 
What you want sounds a lot like a variant type. Check std.variant in phobos, it has a template to let you define custom variants through type tuples.
Sep 15 2009
parent reply Justin Johansson <procode adam-dott-com.au> writes:
Jeremie Pelletier Wrote:

 Justin Johansson Wrote:
 
 What's the best way of emulating a system of quantified type unions in D  (D1)?
 What you want sounds a lot like a variant type. Check std.variant in phobos,
it has a template to let you define custom variants through type tuples.
Thanks Jeremie. It certainly does .. but the reason I haven't seen it before is because I'm using D1. Sure enough though I poked the D2 Phobos code and there it was. Right at the top of the file is the link to Andrei's circa 2002 article in DDJ which makes for very interesting reading. http://erdani.org/publications/cuj-04-2002.html A colleague of mine is suggesting that I really do take a closer look at D2 now but I'm not sure that I'm ready to go standing on the bleading bleading (the blood doesn't clot) edge just right yet. :-( JJ
Sep 15 2009
parent reply Fawzi Mohamed <fmohamed mac.com> writes:
On 2009-09-16 02:40:02 +0200, Justin Johansson <procode adam-dott-com.au> said:

 Jeremie Pelletier Wrote:
 
 Justin Johansson Wrote:
 
 What's the best way of emulating a system of quantified type unions in D  (D1)?
 What you want sounds a lot like a variant type. Check std.variant in 
 phobos, it has a template to let you define custom variants through 
 type tuples.
Thanks Jeremie. It certainly does .. but the reason I haven't seen it before is because I'm using D1. Sure enough though I poked the D2 Phobos code and there it was. Right at the top of the file is the link to Andrei's circa 2002 article in DDJ which makes for very interesting reading. http://erdani.org/publications/cuj-04-2002.html A colleague of mine is suggesting that I really do take a closer look at D2 now but I'm not sure that I'm ready to go standing on the bleading bleading (the blood doesn't clot) edge just right yet. :-( JJ
Tango has variant (tango.core.Variant), and is D1, if that could be an option for you.
Sep 16 2009
parent reply language_fan <foo bar.com.invalid> writes:
Wed, 16 Sep 2009 12:41:06 +0200, Fawzi Mohamed thusly wrote:

 On 2009-09-16 02:40:02 +0200, Justin Johansson
 <procode adam-dott-com.au> said:
 A colleague of mine is suggesting that I really do take a closer look
 at D2 now but I'm not sure that I'm ready to go standing on the
 bleading bleading (the blood doesn't clot) edge just right yet. :-(
 
 JJ
Tango has variant (tango.core.Variant), and is D1, if that could be an option for you.
Do the various variant structures presented here support recursive type definitions such as lists? Is this done efficiently?
Sep 17 2009
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
language_fan wrote:
 Wed, 16 Sep 2009 12:41:06 +0200, Fawzi Mohamed thusly wrote:
 
 On 2009-09-16 02:40:02 +0200, Justin Johansson
 <procode adam-dott-com.au> said:
 A colleague of mine is suggesting that I really do take a closer look
 at D2 now but I'm not sure that I'm ready to go standing on the
 bleading bleading (the blood doesn't clot) edge just right yet. :-(

 JJ
Tango has variant (tango.core.Variant), and is D1, if that could be an option for you.
Tango and Phobos2's Variants are very different beasts. Phobos2's Variant is a discriminated union; you give it a set of types, and that's what it can store. The general-purpose one, I believe, can store any type so long as it's no larger than the biggest primitive types (it can't store big structs or static arrays, for example). Tango's Variant is a generic boxer.
 Do the various variant structures presented here support recursive type 
 definitions such as lists? Is this done efficiently?
Given the above, this question doesn't apply to Tango at all.
Sep 17 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Daniel Keep wrote:
 
 language_fan wrote:
 Wed, 16 Sep 2009 12:41:06 +0200, Fawzi Mohamed thusly wrote:

 On 2009-09-16 02:40:02 +0200, Justin Johansson
 <procode adam-dott-com.au> said:
 A colleague of mine is suggesting that I really do take a closer look
 at D2 now but I'm not sure that I'm ready to go standing on the
 bleading bleading (the blood doesn't clot) edge just right yet. :-(

 JJ
Tango has variant (tango.core.Variant), and is D1, if that could be an option for you.
Tango and Phobos2's Variants are very different beasts. Phobos2's Variant is a discriminated union; you give it a set of types, and that's what it can store. The general-purpose one, I believe, can store any type so long as it's no larger than the biggest primitive types (it can't store big structs or static arrays, for example).
std.variant includes three parameterized types. One is the backend VariantN, which has the maximum size as a template argument, so it can accommodate any type size as long as the maximum is known in advance. The second is Algebraic, which is useful when you want to model a holder for any of a finite and known set of types. Essentially Algebraic aliases itself to VariantN with an appropriately computed maximum size. Finally, there is the generic Variant which can hold any type in an unbounded set. Until last dmd release, Variant had the limitation that types larger than a delegate could not be stored directly. I fixed that, so now for large types Variant uses dynamic allocation and pointers under the wraps. I have big plans with Variant - I want to make it (or a related type) the variable type common in dynamic languages, with flexibility, dynamic invocation using common syntax, the works. So if you write a short script and want dynamic typing, you should be able to use Variant throughout. Andrei
Sep 17 2009
next sibling parent Justin Johansson <procode adam-dott-com.au> writes:
Andrei Alexandrescu Wrote:

 I have big plans with Variant - I want to make it (or a related type) 
 the variable type common in dynamic languages, with flexibility, dynamic 
 invocation using common syntax, the works. So if you write a short 
 script and want dynamic typing, you should be able to use Variant 
 throughout.
Now that sounds absolutely wicked. :-) JJ
Sep 17 2009
prev sibling parent "Danny Wilson" <bluezenix gmail.com> writes:
Op Thu, 17 Sep 2009 16:56:41 +0200 schreef Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org>:


 I have big plans with Variant - I want to make it (or a related type)  
 the variable type common in dynamic languages, with flexibility, dynamic  
 invocation using common syntax, the works. So if you write a short  
 script and want dynamic typing, you should be able to use Variant  
 throughout.
Including some sort of object literal syntax? You know, like javascript var obj = {i_am:"Dynamic", ...};
Sep 17 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
language_fan wrote:
 Wed, 16 Sep 2009 12:41:06 +0200, Fawzi Mohamed thusly wrote:
 
 On 2009-09-16 02:40:02 +0200, Justin Johansson
 <procode adam-dott-com.au> said:
 A colleague of mine is suggesting that I really do take a closer look
 at D2 now but I'm not sure that I'm ready to go standing on the
 bleading bleading (the blood doesn't clot) edge just right yet. :-(

 JJ
Tango has variant (tango.core.Variant), and is D1, if that could be an option for you.
Do the various variant structures presented here support recursive type definitions such as lists? Is this done efficiently?
In Phobos' Algebraic, if you mention the type This inside the allowed types list, it will expand to the Algebraic's type. The support for complex constructs involving This is at the moment incomplete. Andrei
Sep 17 2009
parent language_fan <foo bar.com.invalid> writes:
Thu, 17 Sep 2009 09:46:01 -0500, Andrei Alexandrescu wrote:

 language_fan wrote:
 Do the various variant structures presented here support recursive type
 definitions such as lists? Is this done efficiently?
In Phobos' Algebraic, if you mention the type This inside the allowed types list, it will expand to the Algebraic's type. The support for complex constructs involving This is at the moment incomplete. Andrei
Roger that. Thanks Andrei for the clear answer.
Sep 17 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Justin Johansson:
 What's the best way of emulating a system of quantified type unions in D  (D1)?
Having nullable values may be useful, having nonnull class references can probably be very useful, having tagged unions (I am talking as C unions here) can be useful. But can you show me some usercases/purposes for what you are asking for? I think other people too here are curious. Bye, bearophile
Sep 15 2009
parent reply Justin Johansson <procode adam-dott-com.au> writes:
bearophile Wrote:

 Justin Johansson:
 What's the best way of emulating a system of quantified type unions in D  (D1)?
Having nullable values may be useful, having nonnull class references can probably be very useful, having tagged unions (I am talking as C unions here) can be useful. But can you show me some usercases/purposes for what you are asking for? I think other people too here are curious.
Thanks for asking. Here's a use case. Suppose you have a container class which represents a list of numbers. Let's call this class ListOfNums. Now a ListOfNums can contain zero or more numbers. Now you want a function called max that takes a ListOfNums, xs, as an argument and returns the number in the list, xs, that is greater than or equal to all other numbers in xs. So if xs contains (3, 9, 9, 2), max( xs) returns 9. Now it makes sense that xs should contain at least one number, so if xs contains just (3), max( xs) returns 3. But what if xs contains no numbers i.e. xs = (), the empty list? What should max( xs) return in the way of a meaningful result short of throwing an exception? Put another way, what is the maximum number in an empty list of numbers? If you allow the numbers in the list to be real, then I suppose you could return NaN but there is no way of expressing NaN if the numbers you are dealing with are purely integer .. and it certainly is not a good idea to make a function that is dealing purely with integers to return a real just allow the corner case to return NaN (a real). An elegant solution to this problem is to specify the max function as taking a quantified list of zero or more numbers and returning another quantified list of numbers which may contain either zero numbers or just one number: zero numbers if the argument list is empty and a single number (being the maximum number in the argument list) if the argument number list is non-empty. The important point in this solution is realizing a type system that treats a list of zero_or_one numbers, a list of zero numbers, a list of zero_or_more numbers, etc, as distinct types. Trust this makes sense. <JJ/>
Sep 15 2009
next sibling parent reply Jeremie Pelletier <jeremiep gmail.com> writes:
Justin Johansson Wrote:

 bearophile Wrote:
 
 Justin Johansson:
 What's the best way of emulating a system of quantified type unions in D  (D1)?
Having nullable values may be useful, having nonnull class references can probably be very useful, having tagged unions (I am talking as C unions here) can be useful. But can you show me some usercases/purposes for what you are asking for? I think other people too here are curious.
Thanks for asking. Here's a use case. Suppose you have a container class which represents a list of numbers. Let's call this class ListOfNums. Now a ListOfNums can contain zero or more numbers. Now you want a function called max that takes a ListOfNums, xs, as an argument and returns the number in the list, xs, that is greater than or equal to all other numbers in xs. So if xs contains (3, 9, 9, 2), max( xs) returns 9. Now it makes sense that xs should contain at least one number, so if xs contains just (3), max( xs) returns 3. But what if xs contains no numbers i.e. xs = (), the empty list? What should max( xs) return in the way of a meaningful result short of throwing an exception? Put another way, what is the maximum number in an empty list of numbers? If you allow the numbers in the list to be real, then I suppose you could return NaN but there is no way of expressing NaN if the numbers you are dealing with are purely integer .. and it certainly is not a good idea to make a function that is dealing purely with integers to return a real just allow the corner case to return NaN (a real). An elegant solution to this problem is to specify the max function as taking a quantified list of zero or more numbers and returning another quantified list of numbers which may contain either zero numbers or just one number: zero numbers if the argument list is empty and a single number (being the maximum number in the argument list) if the argument number list is non-empty. The important point in this solution is realizing a type system that treats a list of zero_or_one numbers, a list of zero numbers, a list of zero_or_more numbers, etc, as distinct types. Trust this makes sense. <JJ/>
You don't need a variant type for such a thing. A simple template will do the job. As for the max of an empty set, you can just return zero. class ListOfNums(T) if(isNumeric!T) { T max() const { T max = 0; foreach(n; _nums) if(n > max) max = n; return max; } T[] _nums; } It's been too long since I last used D1 so the above may use D2 only features.
Sep 15 2009
parent reply Justin Johansson <procode adam-dott-com.au> writes:
Jeremie Pelletier Wrote:

 As for the max of an empty set, you can just return zero.
Well you could return -1, 99 or any arbitrary integer for that matter; just that it doesn't make sense to do that from a pure maths perspective. Mathematicians have spent centuries trying to make number systems consistent. The invention (or discovery**) of "i", the square-root of minus-one, was a major major break-through. In similar vein, it's nice if the type systems we use for programming follow from a consistent set of axioms. ;-) (**Is mathematics invention or discovery?) <JJ/>
Sep 15 2009
next sibling parent Jeremie Pelletier <jeremiep gmail.com> writes:
Justin Johansson Wrote:

 Jeremie Pelletier Wrote:
 
 As for the max of an empty set, you can just return zero.
Well you could return -1, 99 or any arbitrary integer for that matter; just that it doesn't make sense to do that from a pure maths perspective. Mathematicians have spent centuries trying to make number systems consistent. The invention (or discovery**) of "i", the square-root of minus-one, was a major major break-through. In similar vein, it's nice if the type systems we use for programming follow from a consistent set of axioms. ;-) (**Is mathematics invention or discovery?) <JJ/>
The line between invention and discovery is rather thin, I myself like to think of both as two sides of a same coin. For inventions you need to first discover how to do it, and for discoveries you need to invent the system to describe it. Basically anything that draws a line between how we perceive something and how we communicate it can be thought of in terms of semantics and syntax respectively. You discover the semantics and then invent the syntax to describe it, so other people can study your syntax and hopefully get a grasp of the semantics you originally had in mind. As for your code, you could always use more template parameters to control the behavior of your list type: class ListOfNums(T, T DefaultValue = T.init, bool ThrowOnEmpty = false) if(isNumeric!T) { T max() const { static if(ThrowOnEmpty) if(!nums.length) throw new EmptyListOfNumsException(); T max = Defaultvalue; foreach(n; _nums) if(n > max) max = n; return max; } T[] _nums; }
Sep 15 2009
prev sibling parent Edward Diener <eddielee_no_spam_here tropicsoft.com> writes:
Justin Johansson wrote:
 Jeremie Pelletier Wrote:
 
 As for the max of an empty set, you can just return zero.
Well you could return -1, 99 or any arbitrary integer for that matter; just that it doesn't make sense to do that from a pure maths perspective. Mathematicians have spent centuries trying to make number systems consistent. The invention (or discovery**) of "i", the square-root of minus-one, was a major major break-through. In similar vein, it's nice if the type systems we use for programming follow from a consistent set of axioms. ;-)
Your return situation is what an 'optional' template is made for. The entire idea about 'optional' is that you should be able to always return a value for a type which is not a valid value for that type. Some other languages use a built-in value, like 'None' in Python, to solve the problem. C++ and D can use an 'optional' template to solve this problem. See optional<T> in Boost for the C++ answer. I believe Andrei is working on a similar solution for D.
Sep 15 2009
prev sibling parent Ary Borenszweig <ary esperanto.org.ar> writes:
Justin Johansson escribió:
 bearophile Wrote:
 
 Justin Johansson:
 What's the best way of emulating a system of quantified type unions in D  (D1)?
Having nullable values may be useful, having nonnull class references can probably be very useful, having tagged unions (I am talking as C unions here) can be useful. But can you show me some usercases/purposes for what you are asking for? I think other people too here are curious.
Thanks for asking. Here's a use case. Suppose you have a container class which represents a list of numbers. Let's call this class ListOfNums. Now a ListOfNums can contain zero or more numbers. Now you want a function called max that takes a ListOfNums, xs, as an argument and returns the number in the list, xs, that is greater than or equal to all other numbers in xs. So if xs contains (3, 9, 9, 2), max( xs) returns 9. Now it makes sense that xs should contain at least one number, so if xs contains just (3), max( xs) returns 3. But what if xs contains no numbers i.e. xs = (), the empty list? What should max( xs) return in the way of a meaningful result short of throwing an exception? Put another way, what is the maximum number in an empty list of numbers?
So after invoking max on the list you would want to know whether it contains a single element or it's the empty list, right? If so, why not asking it before invoking the max function? if (xs.isEmpty) { // no max, do something } else { auto max = max(xs); // do something *else* } I've seen this code a lot of times, and it being: auto max = max(xs); if (max.isEmpty) { // no max, do something } else { // do something *else* } is exactly the same thing, unless you are planning to do something with the list that contains max. At least one of the first things I learned when studying computer science if function pre and postconditions, and max's is "list is not empty". :) What's wrong with that?
Sep 16 2009
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Justin Johansson:

But what if xs contains no numbers i.e. xs = (), the empty list?  What should
max( xs) return in the way of a meaningful result short of throwing an
exception?<
In both Python and my dlibs it throws an exception.
An elegant solution to this problem is to specify the max function as taking a
quantified list of zero or more numbers and returning another quantified list
of numbers which may contain either zero numbers or just one number: zero
numbers if the argument list is empty and a single number (being the maximum
number in the argument list) if the argument number list is non-empty.<
Cute.
The important point in this solution is realizing a type system that treats a
list of zero_or_one numbers, a list of zero numbers, a list of zero_or_more
numbers, etc, as distinct types.<
I don't understand the correlation from this last part and what you have written about max(). Maybe what can be useful is to define as a type a "not empty iterable", so if max() takes as input such type it's sure to not throw an exception. In general iterables can be lazy, their items can be computed on the fly, so there's no way to know at compile time if an iterable will be empty or not. Bye, bearophile
Sep 17 2009
parent Justin Johansson <procode adam-dott-com.au> writes:
bearophile Wrote:
 Justin Johansson:
The important point in this solution is realizing a type system that treats a
list of zero_or_one numbers, a list of zero numbers, a list of zero_or_more
numbers, etc, as distinct types.<
I don't understand the correlation from this last part and what you have written about max(). Maybe what can be useful is to define as a type a "not empty iterable", so if max() takes as input such type it's sure to not throw an exception. In general iterables can be lazy, their items can be computed on the fly, so there's no way to know at compile time if an iterable will be empty or not.
Perhaps I'm not as eloquent as some writers are, so perhaps I could recast the scene as follows: What is a type system? (Rhetorical question) When speaking of types most programmers think, well, is an int or a double, is it a thing (aka object) that one could apply some taxonomy to? Classical CS101: You have a car and a bike. Now come up with the typical OO class structure that has Vehicle as a base class of Car and of Bike. Mostly when thinking about types we think about the qualitative attributes of what we are trying to model .. and we make distinction of types based upon those qualities (and class hierarchies accordingly -- well up to the point that single inheritance breaks down but that's a side issue). OTH quantitative typing (layperson speak .. I'm not a published academic) considers the quantity of things to be separate types. For example, in some problem spaces, typically pattern matching, it may be useful to consider a dozen eggs (12 off) as being a different type to baker's dozen of eggs (13 off). Taking this further, one might well consider how to unify a type system based upon both qualities and quantities. Anyway, regarding my max() example, this was just a single case taken illustrating a problem in the quantitative typing domain. Noting that a number of repondents on this thread suggested various work arounds/solutions for this particular example, me thinks the bigger picture for my particular problem was missed. I've got a good collection of links regarding type systems which I can post if people are interested further in the subject. To be honest though, some of the literature looks like you have to be Einstein to understand it. Google for "type union", "existential types", "quantified types" and similar and you'll find yourself buried up to your neck in literature in no time. Cheers Justin Johansson
Sep 17 2009