www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Pattern matching is-expressions

reply Stefan Koch <uplink.coder googlemail.com> writes:
Hi there,

Recently I have been thinking about converting commonly used is 
expressions into __traits,
since having that makes type functions more consitent.

So let me first talk about the problem.

let's say for type T you wanted write an is expression which 
matchs a on slice types.
you would write:
static if (is(T == U[], U))
{
     pragma(msg, "I have seen a slice of type " ~ U.stringof);
}
Which means If there is any U such that U[] is the same type as 
T, return true and inject the name U into the scope below.
As long as that scope is a `static if`.

We are effectively changing the ast here.
The is expression is not just a query but also injecting a node.

Note: what I wrote above is what's in the spec.
In reality the is expression makes U a name immediately and not 
just in the static if scope below.

consider this example:

`static if (is(P[0] S == super) && is(S[0] == ASTNode))`
Which means:
iff there is a tuple P AND P has at least one element AND for 
tuple P the first element of P is a class or interface AND P[0] 
is not object or the first interface in an interface hierachy, 
introduce a new name S which is a tuple of the parent classs or 
interfaces of P[0], THEN evluate the is expression to true;
AND iff there is a tuple S AND S has at least one element AND 
there is a type called ASTNode AND the first elemennt of S is the 
type called ASTNode THEN evaluate the is-expression to true.
If the latter is expression is true that implies that the former 
is expression was true.


We can see that S is a valid tuple immediately after, (and even 
inside), the expression which created it.
That's helpful in this case because otherwise if'd have to one 
nest more `static if`.

To make the issue clearer let me reframe this a little.
this expression is(P S == super) is rougly equivalent to the 
question:
"Do you have a father?"

So you ask that question and the answer is "Yes I do have a 
father, his name is Bernd, also that's him just there, and he is 
going to live with you now."

That is a problem if you don't have room for Bernd at your home.


If you write the following into a new file
```
int[] ia;
pragma(msg, "ia is an array ", is(typeof(ia) == U[], U), " and 
it's element type is: ", U);
```

The output will be:
`ia is an array true and it's element type is: int`

which you can check here https://run.dlang.io/is/iKZSvQ


This works because the is expression installed the name U in your 
scope.

At least in my opinion this is very diffrent to the way D behaves 
anywhere else in the language.

What do you think?

Greetings,
Stefan
Aug 19 2020
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 07:29:18 UTC, Stefan Koch wrote:
 Hi there,

 Recently I have been thinking about converting commonly used is 
 expressions into __traits,
 since having that makes type functions more consistent.
[ ... ] There is even more inconsistency, which comes from the fact that pattern matching is expressions are not actually able to match arbitrarily complex patterns. You cannot even use it to extract parts from a function type. which is why the other wired forms exist for an example look at this: package void invert(double[] from) { from[] *= 7.0; pragma(msg, typeof(invert)); // output void(double[] from) pragma(msg, is(void function (double[]))); // true void function (double[]) is a type pragma(msg, is(typeof(invert) == function)); // true typeof invert is a function pragma(msg, is(typeof(invert) == void function(double[]))); // false ??? pragma(msg, is(typeof(invert) == R function (A), R, A)); // also false, that would suggest that typeof(invert) is actually function without return type or parameters. // what we see here is that is expressions are broken. // pragma(msg, is(typeof(invert) == R F (A), R, A, F)); // if it's not a function let F be a free symbol as well // answer: expected identifier `F` in declarator // this is a parser error // therefore the line had to be commented out } the same code is also at: https://run.dlang.io/is/76q4wG
Aug 19 2020
next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Wednesday, 19 August 2020 at 09:53:41 UTC, Stefan Koch wrote:
     		pragma(msg, is(typeof(invert) == void 
 function(double[]))); // false ???
The `function` keyword represents a function pointer, while invert is a function. Change it to `typeof(&invert)` and it will hold true.
 // what we see here is that is expressions are broken.
 // pragma(msg, is(typeof(invert) == R F (A), R, A, F)); // if 
 it's not a function let F be a free symbol as well
is expression were never specified to match arbitrary patterns, so I wouldn't call it inconsistent or broken, just limited.
Aug 19 2020
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 11:56:18 UTC, Dennis wrote:
 On Wednesday, 19 August 2020 at 09:53:41 UTC, Stefan Koch wrote:
     		pragma(msg, is(typeof(invert) == void 
 function(double[]))); // false ???
The `function` keyword represents a function pointer, while invert is a function. Change it to `typeof(&invert)` and it will hold true.
Thanks!
 // what we see here is that is expressions are broken.
 // pragma(msg, is(typeof(invert) == R F (A), R, A, F)); // if 
 it's not a function let F be a free symbol as well
is expression were never specified to match arbitrary patterns, so I wouldn't call it inconsistent or broken, just limited.
Let's call it limited then. Does a limited pattern matching still fulfill a desirable power to complexity ratio though?
Aug 19 2020
parent reply Dennis <dkorpel gmail.com> writes:
On Wednesday, 19 August 2020 at 12:03:59 UTC, Stefan Koch wrote:
 Does a limited pattern matching still fulfill a desirable power 
 to complexity ratio though?
If D were to be rebuilt from the ground up with first-class types, things could definitely be simplified. The existence of meta-programming libraries wrapping traits and is-expressions is a sign that D's introspection is not intuitive or ergonomic. https://code.dlang.org/packages/bolts https://code.dlang.org/packages/mirror However, considering the current template parameter system is so deeply ingrained into the D language and is-expressions simply leverage most of that, I don't think we should do away with is-expressions.
Aug 19 2020
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 19 August 2020 at 13:37:54 UTC, Dennis wrote:
 The existence of meta-programming libraries wrapping traits and 
 is-expressions is a sign that D's introspection is not 
 intuitive or ergonomic.
I'm not convinced of that, I think these are just ideological. Like the reason std.traits exists and __traits has the __ is just that they felt it *should* be library, not that it *had* to be. But perhaps there is just too steep of a learning curve, like I said, I didn't *really* get it myself until I was forced to by writing the book on it. Just... maybe we should make the docs easier to understand (the spec is quite bad for really getting it - but realizing it just mirrors the declaration helps a lot) and see if these libs start dying. I'd also love to kill the libs because 1) they are invariably buggy and 2) they always bloat compile times and dmd ram to unacceptable amounts.
Aug 19 2020
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 13:48:49 UTC, Adam D. Ruppe wrote:
 On Wednesday, 19 August 2020 at 13:37:54 UTC, Dennis wrote:
 The existence of meta-programming libraries wrapping traits 
 and is-expressions is a sign that D's introspection is not 
 intuitive or ergonomic.
I just found out that is(T S == super) returns true, even if T is object.Object.
Aug 21 2020
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 21 August 2020 at 17:39:12 UTC, Stefan Koch wrote:
 I just found out that
 is(T S == super)
 returns true, even if T is object.Object.
yeah it actually sets S to be a tuple of (base class, interfaces...). So you need to check the length of it too. It gives false only if passed a type that has no super at all, like a struct or a basic type.
Aug 21 2020
prev sibling next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 13:37:54 UTC, Dennis wrote:
 On Wednesday, 19 August 2020 at 12:03:59 UTC, Stefan Koch wrote:
 Does a limited pattern matching still fulfill a desirable 
 power to complexity ratio though?
If D were to be rebuilt from the ground up with first-class types, things could definitely be simplified. The existence of meta-programming libraries wrapping traits and is-expressions is a sign that D's introspection is not intuitive or ergonomic. https://code.dlang.org/packages/bolts https://code.dlang.org/packages/mirror However, considering the current template parameter system is so deeply ingrained into the D language and is-expressions simply leverage most of that, I don't think we should do away with is-expressions.
I fully agree. I want to keep is expressions as they are. They have their place. I hope that with the slow introduction of features which demonstrate first class types we can migrate away from having to use them for common patterns though. And also from having to have _costly_ wrapper libraries. So far the only thing I see which can remove some of the cost of templates is to provide optimized compiler intrinsics.
Aug 19 2020
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Aug 19, 2020 at 01:37:54PM +0000, Dennis via Digitalmars-d wrote:
[...]
 If D were to be rebuilt from the ground up with first-class types, things
 could definitely be simplified.
[...] In an ideal world, types would be first class entities that you can pass around and manipulate at will. Inside templates / CTFE, operations on types would operate directly on the compiler's internal type representation. If any of these types end up in runtime code, they are replaced with runtime equivalents (typeid, et al). The same type-manipulating function would be usable both at CTFE and at runtime, with the same semantics. Of course, this idealistic scenario may not be very realistic, but moving in its direction could unify quite a number of currently divergent constructs, simplify the language, yet increase expressivity. T -- The best way to destroy a cause is to defend it poorly.
Aug 19 2020
parent Bruce Carneal <bcarneal gmail.com> writes:
On Wednesday, 19 August 2020 at 15:51:01 UTC, H. S. Teoh wrote:
 On Wed, Aug 19, 2020 at 01:37:54PM +0000, Dennis via 
 Digitalmars-d wrote: [...]
 If D were to be rebuilt from the ground up with first-class 
 types, things could definitely be simplified.
[...] In an ideal world, types would be first class entities that you can pass around and manipulate at will. Inside templates / CTFE, operations on types would operate directly on the compiler's internal type representation. If any of these types end up in runtime code, they are replaced with runtime equivalents (typeid, et al). The same type-manipulating function would be usable both at CTFE and at runtime, with the same semantics.
Yes. Were types first class citizens we'd have a lot of power but backing away from full mutability could be a good thing wrt comprehension and correctness. Monadic forms perhaps.
 Of course, this idealistic scenario may not be very realistic, 
 but moving in its direction could unify quite a number of 
 currently divergent constructs, simplify the language, yet 
 increase expressivity.
Yes. D language users and implementers are at the forefront of practical metaprogramming. It's great, challenging but great. Stefan's type functions could be a good addition. I like the parsimonious naming and the move towards full reification.
Aug 19 2020
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 19 August 2020 at 09:53:41 UTC, Stefan Koch wrote:
     		pragma(msg, is(typeof(invert) == void 
 function(double[]))); // false ???
                 pragma(msg, is(typeof(invert) == R function 
 (A), R, A)); // also false, that would suggest that 
 typeof(invert) is actually function without return type or 
 parameters.
Same mistake on both of these, to get a function pointer you must use &. // true! pragma(msg, is(typeof(&invert) == R function (A), R, A)); And it does indeed set both R and A to the correct types. package void invert(double[] from, int a) { static if(is(typeof(&invert) == R function (A), R, A...)) { pragma(msg, R); // void pragma(msg, A); // (double[], int) } } The is expression isn't as limited as you think!
 That is a problem if you don't have room for Bernd at your home.
Then don't ask for his name! The spec says if you give an identifier there, the compiler gives it to you. You can just omit that and do class A {} pragma(msg, is(A == super)); // true, no new symbol The is expression is confusingly documented, I didn't really realize it until I was writing a section on it in my D Cookbook but it basically just mirrors a variable declaration with a lot of optional parts. Using more or less of these optional parts is how you get to the seven forms the dlang.org docs mention. is(A) is(A B) // only accepted inside static if though is(A == C) // adding optional specifier is(A B == C) // adding optional specifier and alias As for the lifetime of the added symbols, the spec prohibits is(A B) outside static if, but indeed, the others are allowed in other places. And then the names outlive the static if itself which I complained about in bugzilla not long ago: https://issues.dlang.org/show_bug.cgi?id=21078 So there could be a little tweak/fix to the scope lifetime of these symbols but it does overall work really quite well and there's ways to avoid at least some of the symbols if you like and maybe the others should be able to be set to local or whatever too. But it does do a LOT of useful things.
Aug 19 2020
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 12:56:37 UTC, Adam D. Ruppe wrote:

 So there could be a little tweak/fix to the scope lifetime of 
 these symbols but it does overall work really quite well and 
 there's ways to avoid at least some of the symbols if you like 
 and maybe the others should be able to be set to local or 
 whatever too.

 But it does do a LOT of useful things.
Yes it is useful. And I think I have found a way to support them within the typefunctions. is(T S == super) would become auto f(alias T) { static literal bool r = is(T S == super); // here alias S = (r ? SuperTupleOf(T) : ErrorType) is automatically declared. // where ErrorType is the value for which `!is(ErrorType)` is true } still the pattern matching part is going to be a pain to implement correctly. Without the some limitations we have right now.
Aug 19 2020
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 19 August 2020 at 13:07:40 UTC, Stefan Koch wrote:
 Yes it is useful.
 And I think I have found a way to support them within the 
 typefunctions.
 is(T S == super) would become
 auto f(alias T)
 {
    static literal bool r = is(T S == super);
    // here alias S = (r ? SuperTupleOf(T) : ErrorType) is 
 automatically declared.
    // where ErrorType is the value for which `!is(ErrorType)` 
 is true
 }

 still the pattern matching part is going to be a pain to 
 implement correctly.
 Without the some limitations we have right now.
Why is it that type functions need their own parallel implementation of an existing language feature? Shouldn't it be possible to re-use the existing implementation?
Aug 19 2020
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 13:15:37 UTC, Paul Backus wrote:
 On Wednesday, 19 August 2020 at 13:07:40 UTC, Stefan Koch wrote:
 Yes it is useful.
 And I think I have found a way to support them within the 
 typefunctions.
 is(T S == super) would become
 auto f(alias T)
 {
    static literal bool r = is(T S == super);
    // here alias S = (r ? SuperTupleOf(T) : ErrorType) is 
 automatically declared.
    // where ErrorType is the value for which `!is(ErrorType)` 
 is true
 }

 still the pattern matching part is going to be a pain to 
 implement correctly.
 Without the some limitations we have right now.
Why is it that type functions need their own parallel implementation of an existing language feature? Shouldn't it be possible to re-use the existing implementation?
Type functions work by doing partial evaluation, In the first, static pass they can use the regular semantic transform and checking pass that the rest of dmd uses. However everything which touches the type variables, cannot be semantically analyzed or transformed because the types aren't known yet. Furthermore if during the interpretation the types are in a known state, you cannot change/enrich the type information within the AST based on that. Since that would change the AST which would either violate the rule that ctfe has to be pure, and cause all sorts of issues, or you would have duplicate the whole subtree (this is what templates, do.). Both of these things are not an option for me. Therefore regular implementations of features can modify the meaning of existing nodes or indeed introduce new nodes, cannot be used. As you can see is expressions do introduce new nodes, so the regular implementation of them cannot just be forwarded to. Also the existing implementation has no notion of type variables. Since those only exist within the dynamic environment of the interpreter.
Aug 19 2020
parent Paul Backus <snarwin gmail.com> writes:
On Wednesday, 19 August 2020 at 15:18:03 UTC, Stefan Koch wrote:
 Type functions work by doing partial evaluation,
 In the first, static pass they can use the regular semantic 
 transform and checking pass that the rest of dmd uses.
 However everything which touches the type variables, cannot be 
 semantically analyzed or transformed because the types aren't 
 known yet.
 Furthermore if during the interpretation the types are in a 
 known state, you cannot change/enrich the type information 
 within the AST based on that.

 Since that would change the AST which would either violate the 
 rule that ctfe has to be pure, and cause all sorts of issues, 
 or you would have duplicate the whole subtree (this is what 
 templates, do.).

 Both of these things are not an option for me.
I see. I had assumed that type functions would work by duplicating the subtree, using the copy as a temporary "stack frame" in which local mutation was permitted, and then throwing it away at the end of interpretation. This is still more efficient than using recursive templates, since it reduces the number of syntax-tree copies from O(n) to O(1), but I guess that's not enough of a performance win for what you're trying to accomplish.
Aug 19 2020
prev sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 13:07:40 UTC, Stefan Koch wrote:
 And I think I have found a way to support them within the 
 typefunctions.
 is(T S == super) would become
 auto f(alias T)
 {
    static literal bool r = is(T S == super);
    // here alias S = (r ? SuperTupleOf(T) : ErrorType) is 
 automatically declared.
    // where ErrorType is the value for which `!is(ErrorType)` 
 is true
 }
Back to the drawing board, it turns out my initial fears were appropriate. Making an identifier into a variable declaration depending on whether you are in a type function or not; Is a funky thing to do. I'll to go with additional __traits. Besides I do think that __traits(getSuperType, T) does look more readable than is(T S == super) even though it does not give you the "symbolic equation" feeling :)
Aug 20 2020
prev sibling parent reply Dennis <dkorpel gmail.com> writes:
On Wednesday, 19 August 2020 at 07:29:18 UTC, Stefan Koch wrote:
 At least in my opinion this is very diffrent to the way D 
 behaves anywhere else in the language.
is expressions are very similar to template parameters, IIRC they use the same mechanisms. You can do this for example: ``` void foo(T, A = T)() { pragma(msg, A); } alias f = foo!int; ``` Here the symbol T is also put in scope immediately so it can be used later in the parameter list.
 What do you think?
I think they have a learning curve and can be unintuitive. I still sometimes have to think whether to place the pattern left or right of the ==. However, I've grown to like them. I especially like using them when 'switching' over a generic type: ``` static if (is(T == struct) || is(T == union)) { } else static if (is(T == E*)) { } else static if (is(T == E[n], E, int n)) { } else static if (is(T E == enum)) { } ``` It's uniform, it's clear what each branch represents, it does not require importing and instantiating templates. Compare that to: ``` import std.traits; static if (isAggregateType!T) { } else static if (isPointer!T) { alias E = PointerTarget!E; } else static if (__traits(isStaticArray, T)) { } else static if (is(T E == enum)) { } ``` With traits, you might have to look up the documentation to read it. (E.g. isAggregateType also accepts class and interface, but would you know that immediately? Does isIntegral!char or isIntegral!bool hold?). Also they are not complete (there is trait for checking if something is an enum), though that can be worked on. (I know there's a Pull Request in dmd for adding `__traits(isDynamicArray)`) The advantage of traits is that they can be more future proof however. `isIntegral` would still work unlike `is(T : long)` even if `cent` and `ucent` types are added, or even if (ulong : long) implicit conversion gets deprecated. I wish the redundant traits (isAssociativeArray, isStaticArray) weren't there since they are completely covered by is expressions though.
Aug 19 2020
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Wednesday, 19 August 2020 at 13:03:59 UTC, Dennis wrote:
  Also they are not
 complete (there is trait for checking if something is an enum),
The expression is(T == enum) works, and does detect if something has an enum type. Are you talking about detecting a manifest constant?
 (I know there's a Pull Request in dmd for adding 
 `__traits(isDynamicArray)`)
Indeed and it shows a 16% performance improvement over using the isDynamicArray template in phobos. (Since it doesn't have to match anything, it just returns a part of the compiler internal data) But even if that weren't the case I find it much more intuitive than pattern matching is expressions.
 The advantage of traits is that they can be more future proof 
 however. `isIntegral` would still work unlike `is(T : long)` 
 even if `cent` and `ucent` types are added, or even if (ulong : 
 long) implicit conversion gets deprecated.
True I hadn't even considered that aspect yet.
 I wish the redundant traits (isAssociativeArray, isStaticArray) 
 weren't there since they are completely covered by is 
 expressions though.
The is expressions are more expensive, because of the matching part.
Aug 19 2020
parent Dennis <dkorpel gmail.com> writes:
On Wednesday, 19 August 2020 at 13:14:22 UTC, Stefan Koch wrote:
 On Wednesday, 19 August 2020 at 13:03:59 UTC, Dennis wrote:
  Also they are not
 complete (there is trait for checking if something is an enum),
The expression is(T == enum) works, and does detect if something has an enum type. Are you talking about detecting a manifest constant?
I meant to say "there is *no* trait for checking if something is an enum" (in __traits() or std.traits), so you still have to use an is expression there. My point is that is is-expressions are basically a swiss army knife for all type identification, while __traits and std.traits are more like swiss cheese (you can identify some types, but others you can't).
Aug 19 2020