www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Variants and pattern matching implementation

reply Reiner Pope <reiner none.com> writes:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

An interesting exercise I've been working on in D has been implementing 
type-safe variants (aka discriminated unions) in D, the main feature of 
which is the ability to do a simple form of pattern matching on them.

It's now at a usable stage. Below is some annotated sample code, showing 
the features, and attached is the sample code in a file, and the 
implementation itself.

Comments would be much appreciated.

Cheers,

Reiner


Variant sample:

import variant;
import std.stdio;

// Setting it up for examples below
void showOverload(int n) { writefln("int"); }
void showOverload(char[] c) { writefln("char[]"); }


void main()
{
//  The base data-structure is the Variant struct, which takes a list of 
types:
     alias Variant!(int, long, char[]) MyVariant; // MyVariant is a type 
which can be assigned ints, char[]s, or Foo's.

//  This data-type has opAssign overloaded, to allow:
     MyVariant v;
     v = 5; // it now holds an int
     v = 5L; // it now holds a long
     v = "World!"; // now holds a char[]

//  The point of interest is that it keeps track of the stored type, and 
pattern matching (similar to a switch statement)
//  can be used to access the data in a type-safe way:
     mixin(Match!(v,
              Case!("int n", `writefln("Twice your number is %d", 2*n);`),
              Case!("char[] s", `writefln("Hello ", s);`)));

//  The Match!() statement ensures that the value is only used as an 
appropriate type.
//  However, it needn't be used with the exact type. Instead (just as a 
proof of concept), if no exact match is found,
//  it follows D's overloading rules (to a certain extent), and it will 
match with any type which can be implicitly
//  converted to the type specified in the pattern, so
     mixin(Match!(v,
              Case!("real r", `writefln("Your value is a long or an int 
with a value of ", r);`)));

//  Also as a neat thing, we can express multiple cases with one 
pattern. The supplied code is then inserted once for each
//  different type specified by the pattern (similar to the quasi-static 
foreach -- that's the main reason I put it in):
     mixin(Match!(v,
              Case!("{int|char[]} n", `showOverload(n);`)));
//  The above code will print "int" if matched with an int, and "char[]" 
if matched with a char[]

//  If a pattern is unreachable, you will be told at compile time:
     mixin(Match!(v,
              Case!("real r", `writefln("Matches int and long");`)
//            , Case!("int n", `writefln("Matches int");`) // This line 
would give an error:
                                                            // 
"Unreachable case statement: int n"
              ));

//  Finally, you can use MatchStrict!() to express a match statement 
with the requirement that every possible type is
//  handled. This is useful if the possible data-types will change, and 
you want to be warned in the future if you don't handle
//  all of them. Other than this requirement, MatchStrict!() behaves 
just like Match!()
     mixin(MatchStrict!(v,
              Case!("{int|char[]} a", `// Do nothing`),
              Case!("long l", `writefln("Aha! A long");`) // Commenting 
this line out would cause an error at compile-time:
                                                          // "Not all 
cases expressed in match statement"
                   ));

}
May 06 2007
next sibling parent reply Reiner Pope <reiner none.com> writes:
Reiner Pope wrote:
 Comments would be much appreciated.

I'll start them off myself <g>. There are two main things in the current result that I'm not satisfied with: 1. The error messages don't propagate correctly, so that the line number given is the line in variant.d, although the error is actually on the caller's side. Of course, printing an error backtrace, although it would work here, but may just be an annoyance in other code. 2. Exposure doesn't work correctly, and I don't know what to do. For instance, I don't currently know how to declare a type in some module, and use it in Variant!(). The problem is that, to create the opAssign's, I currently *need* to use string mixins (because of a template mixin annoyance detailed below), and the string mixin appears to be evaluated in a scope which is unaware of my declared type. Similarly, I need to expose Variant's internals, although that makes more sense, since I access them at the call site. Two other annoyances are: 1. Funny template mixin behaviour. After using a lot of text mixins, I tried to make my code cleaner by using other language features (I turned one into a simple tuple, and I tried to turn the opAssign declarations into a template mixin). To make multiple opAssigns, I used the following template: template makeOpAssigns(ulong index, T...) { static if (index < T.length) { pragma(msg, itoa(index)); typeof(*this) opAssign(T[index] value) { _variant_data[index] = value; _variant_type = index; return *this; } mixin makeOpAssigns!(index + 1, T); } } but it appears that the T[index] in the function prototype doesn't use the 'index' template parameter, but actually uses 'index' from an external scope. At least, that's what it seems, although this surprises me. 2. Lack of types for templates parameters. It would really be nice to express what *kind* of template parameters you want in more detail (this specifically refers to alias parameters, and their tuple counterparts). I know of at least three specific times in writing the variant that I forgot/misunderstood the type the parameter was -- I can't help feeling that some kind of Concepts idea for template parameters would be nice. But other than that, writing it was great -- it is quite amazing how much you can do with mixin/CTFE/static if. Cheers, Reiner
May 06 2007
parent reply "David B. Held" <dheld codelogicconsulting.com> writes:
Reiner Pope wrote:
 [...]
 2. Lack of types for templates parameters.
 It would really be nice to express what *kind* of template parameters 
 you want in more detail (this specifically refers to alias parameters, 
 and their tuple counterparts). I know of at least three specific times 
 in writing the variant that I forgot/misunderstood the type the 
 parameter was -- I can't help feeling that some kind of Concepts idea 
 for template parameters would be nice.
 [...]

This is my favorite hobby-horse, so if you could elaborate on this with some specific examples, that would be great. ;) Dave
May 09 2007
parent reply Reiner Pope <reiner none.com> writes:
David B. Held wrote:
 Reiner Pope wrote:
 [...]
 2. Lack of types for templates parameters.
 It would really be nice to express what *kind* of template parameters 
 you want in more detail (this specifically refers to alias parameters, 
 and their tuple counterparts). I know of at least three specific times 
 in writing the variant that I forgot/misunderstood the type the 
 parameter was -- I can't help feeling that some kind of Concepts idea 
 for template parameters would be nice.
 [...]

This is my favorite hobby-horse, so if you could elaborate on this with some specific examples, that would be great. ;) Dave

Take some of the code currently in variant.d:
 template GenerateCases(VType, bool Strict, IndicesThenCases...)
 {
     ...
     const ulong[] indices = TypeIndices!(TheType, VType.Types);
     ...
 }

1. A varargs template parameter is the only way to pass around a ulong[] as a template parameter; you can't write template Foo(ulong[] T1) {...} That is why the IndicesThenCases parameter is so named, since it's actually two parameters. 2. The tail of IndicesThenCases is (obviously enough) a list of cases. It makes sense if you think about it, but the following line from above:
          const ulong[] indices = TypeIndices!(TheType, VType.Types);

used to actually be
         const ulong[] indices = TypeIndices!(VType, Cases);

When I came to that bug, the compiler printed several pages of template instantiations. What was worse was that the ulong[] there caused them to be displayed mangled, not as plain text. It was mostly guesswork to find that as the bug, since the compiler didn't give any useful information. Of course, if Cases had been typed as a list of template instantiations, and TypeIndices had stated it required a list of types, it would have been a simple type mismatch, and easier to find. 3. Another problem is the form of the Match!() template, because it expects a list of Cases. The code pattern I use to check the user is providing this is the following idea:
 template CaseImpl(...)
 {
     const bool IsCase = true;
     ....
 }
 

and later (although I forgot this in the code I released):
 static assert(is(typeof(Cases[0])), "The match statement expects a list of
Cases");

Which seems a bit silly when you consider I am just ensuring the template parameters are the right type. That's all that comes to mind now. Are there any thoughts about implementing some kind of concepts/checking for template parameters? I'm especially uncertain of nested tuples and value parameters, which both currently (and a lot more) are all hidden behind the single
 template Foo(T...)

idiom. Cheers, Reiner
May 13 2007
parent "David B. Held" <dheld codelogicconsulting.com> writes:
Reiner Pope wrote:
 [...]
 That's all that comes to mind now. Are there any thoughts about 
 implementing some kind of concepts/checking for template parameters? I'm 
 especially uncertain of nested tuples and value parameters, which both 
 currently (and a lot more) are all hidden behind the single
 
 template Foo(T...)

idiom.

Well, concepts are just a special form of metatypes, which is what I am interested in more generally. I actually argued for metatypes several years ago, from my headaches working with C++, and believing there had to be A Better Way. The C++ Concepts proposal proves that lots of other people have come to the same conclusion. However, Concepts are designed primarily to support structural metatypes, which makes sense and is useful, but is by no means complete. D has an opportunity to get metatyping correct without having to bolt it on as an afterthought, like Concepts. However, the need for metatypes must be motivated by real-world examples, and since metaprogramming is considerably less frequent than muggleprogramming, it's harder to convince the Unbelievers. Interestingly enough, Andrei is one of the Unbelievers that thinks the problems solved by metatyping can be addressed satisfactorily without first-class metatypes, and even Walter tends to think that specialization is Good Enough. So it's an uphill battle for me to champion first-class metatypes, but having examples that don't just come from my own experience definitely helps. Consider that an Open Call to the D community at large to bring forward more examples like yours (which is great, BTW...thanks for shaing). Dave
May 13 2007
prev sibling next sibling parent Gregor Richards <Richards codu.org> writes:
Could you add a license to this?

  - Gregor Richards
May 07 2007
prev sibling parent Reiner Pope <reiner none.com> writes:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Re-attached code, released into public domain.
May 07 2007