digitalmars.dip.development - Third and Hopefully Last Draft: Primary Type Syntax
- Quirin Schroll (2/2) Sep 20 2024 The obligatory
- Richard (Rikki) Andrew Cattermole (5/5) Sep 21 2024 I recommend that you put it through Grammarly prior to Mike getting it,
- Quirin Schroll (14/19) Sep 21 2024 I gave it two people to proofread and probably one just didn't do
- Richard (Rikki) Andrew Cattermole (10/29) Sep 21 2024 Yeah you did good, its just that tools are guaranteed to catch stuff
- IchorDev (7/9) Sep 22 2024 Not sure what was wrong with the other two drafts, but this one
- Tim (81/83) Sep 22 2024 The grammar changes look good. I found some new ambiguities, but
- Quirin Schroll (78/162) Sep 23 2024 In general, ambiguities are resolved considering Maximum Munch:
- Quirin Schroll (20/22) Sep 24 2024 Done. And I updated the DIP draft to include the new Maximum
The obligatory [permalink](https://github.com/Bolpat/DIPs/blob/0562d0c1708f4f8bb79e72392218154ee39b1d4f/DIPs/DIP-2NNN-QFS.md) and [latest draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md)
Sep 20 2024
I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload. I.e. ``excpetion`` Otherwise it is looking pretty good, and good job on doing the implementation!
Sep 21 2024
On Saturday, 21 September 2024 at 13:29:05 UTC, Richard (Rikki) Andrew Cattermole wrote:
> I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload. I.e. ``excpetion`` Otherwise it is looking pretty good, and good job on doing the implementation!

I gave it to two people to proofread, and probably one just didn't do it (he said it's good); the other sent me a revised version, which did contain some style suggestions. It's not like I didn't try something. I'll try Grammarly. Haven't used it in ages.

The implementation has some workarounds that I'd hope won't make it into the compiler. But as Walter pointed out in the Monthly Meeting, it's not obvious the grammar changes won't lead to weird parsings. Therefore, I hope the implementation can give people like you, Paul Backus, and Timon Gehr the opportunity to find holes or, hopefully, find none, which might be enough to dispel Walter's concerns.
Sep 21 2024
On 22/09/2024 4:01 AM, Quirin Schroll wrote:
> On Saturday, 21 September 2024 at 13:29:05 UTC, Richard (Rikki) Andrew Cattermole wrote:
>> I recommend that you put it through Grammarly prior to Mike getting it, it'll lessen his workload. I.e. ``excpetion`` Otherwise it is looking pretty good, and good job on doing the implementation!
>
> I gave it two people to proofread and probably one just didn't do it (he said it's good), the other sent me a revised version, which did contain some style suggestions. It's not like I didn't try something.

Yeah you did good, it's just that tools are guaranteed to catch stuff like this :)

> The implementation has some workarounds that I'd hope won't make it into the compiler. But as Walter pointed out in the Monthly Meeting, it's not obvious the grammar changes won't lead to weird parsings. Therefore, I hope the implementation can give people like you, Paul Backus, and Timon Gehr the opportunity to find holes or, hopefully, find none, which might be enough for Walter to dispel his concerns.

Grammar stuff like this isn't where I shine; as long as it passes buildkite I'm happy. The text shows you've done your research and put in the effort.

Ideally we'd throw a fuzzer at the parser to verify that it works as expected.

https://llvm.org/docs/LibFuzzer.html
https://johanengelen.github.io/ldc/2018/01/14/Fuzzing-with-LDC.html
Sep 21 2024
On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
> The obligatory [permalink](https://github.com/Bolpat/DIPs/blob/0562d0c1708f4f8bb79e72392218154ee39b1d4f/DIPs/DIP-2NNN-QFS.md) and [latest draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md)

Not sure what was wrong with the other two drafts, but this one seems equally great. This feature would represent a massive improvement to string mixin code generation, and general language cohesion. Much like how you’re always allowed to use trailing commas in comma-separated lists.
Sep 22 2024
On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
> The obligatory [permalink](https://github.com/Bolpat/DIPs/blob/0562d0c1708f4f8bb79e72392218154ee39b1d4f/DIPs/DIP-2NNN-QFS.md) and [latest draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md)

The grammar changes look good. I found some new ambiguities, but the implementation seems to always prefer the old meaning, so it should be no problem.

```d
// deprecated (size_t) x1 = 1; // Syntax error
// align (size_t) x2 = 1; // Syntax error
// package (size_t) x3 = 1; // Syntax error
// extern (size_t) x4 = 1; // Syntax error
struct UDA{}
// @UDA (size_t) x5 = 1; // Syntax error
```

The attributes `deprecated`, `align`, `package` and `extern` as well as UDAs can be followed by optional arguments in parens, like the deprecation message. These parens are now ambiguous with a basic type in parens. The implementation seems to always try to parse the parens as arguments for the attribute, so it remains backward compatible. Maybe this could be confusing for the user, when a declaration uses a type in parens and later an attribute is added.

```d
alias exit = Object;
Object x1;
void main()
{
    scope (exit) x1 = new Object(); // Still a scope guard
    // scope (Object) x2 = new Object(); // Syntax error
    // scope (int) x3 = 3; // Syntax error
    @0 scope (exit) x4 = new Object(); // Declares variable with type exit
}
```

The first statement is a scope guard with the current grammar. With the new grammar it could also be a variable declaration of type `exit` and storage class `scope`. The implementation still parses it as a scope guard, so it remains backward compatible.

The next line could also be a variable declaration, but it is still parsed as a scope guard. DMD then prints an error, because `Object` is not a valid scope identifier. The line with `x3` is a syntax error for the same reason.

The last statement is parsed as a variable declaration, because scope guards can't have UDAs.

```d
auto test1 = function (float){return 0;};
// auto test2 = function (float)(int){return 0;}; // Syntax error
```

Function literals have an optional return type and optional parameters. The type `float` for `test1` could be a parameter or a return type in parens. The implementation always parses the parens as parameters, so it remains backward compatible.

The second function literal has both a return type and parameters, but it results in a syntax error, because the parens are parsed as parameters and no other parens are expected after that.

```d
void main()
{
    auto o1 = new class (Object) {};
}
```

The parens could be constructor arguments or a basic type in `AnonBaseClassList?`. The implementation always tries to parse constructor arguments, which should be fine.
Sep 22 2024
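For readers less familiar with the two function-literal forms Tim refers to, here is a minimal sketch in today's D (no DIP syntax involved). The name `callBoth` and the literal bodies are made up for illustration; the point is only that the return type is optional and, when present, is written before the parameter list, which is why a parenthesized type in that position collides with the parameter list.

```d
// Hypothetical helper, for illustration only; valid with the current grammar.
int callBoth()
{
    // Parameters only: the return type is inferred from the body.
    auto inferred = function (float x) { return cast(int) x + 1; };

    // Explicit return type, written before the parameter list.
    auto explicitRet = function int (float x) { return cast(int) x + 2; };

    // Both literals convert to the same plain function pointer type.
    static assert(is(typeof(inferred) : int function(float)));
    static assert(is(typeof(explicitRet) : int function(float)));

    return inferred(1.0f) + explicitRet(1.0f);
}
```

With the proposed grammar, `function (float) ...` alone stays a parameter list, as described in the post above.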
On Sunday, 22 September 2024 at 10:58:55 UTC, Tim wrote:
> On Saturday, 21 September 2024 at 01:01:22 UTC, Quirin Schroll wrote:
>> The obligatory [permalink](https://github.com/Bolpat/DIPs/blob/0562d0c1708f4f8bb79e72392218154ee39b1d4f/DIPs/DIP-2NNN-QFS.md) and [latest draft](https://github.com/Bolpat/DIPs/blob/PrimaryTypeSyntax/DIPs/DIP-2NNN-QFS.md)
>
> The grammar changes look good. I found some new ambiguities, but the implementation seems to always prefer the old meaning, so it should be no problem.

In general, ambiguities are resolved considering Maximum Munch: If the next token can be parsed as part of the entity that the grammar suggests, it will be; only if it can’t is the entity closed, or it’s an error.

> ```d
> // deprecated (size_t) x1 = 1; // Syntax error
> // align (size_t) x2 = 1; // Syntax error
> // package (size_t) x3 = 1; // Syntax error
> // extern (size_t) x4 = 1; // Syntax error
> struct UDA{}
> // @UDA (size_t) x5 = 1; // Syntax error
> ```

Those all fall under Maximum Munch: A parenthesis following any of these attributes constitutes their optional arguments. Attributes with optional arguments are greedy. I wasn’t even aware of `align` without argument. The biggest one is `extern` because it’s realistically used with the new parsing. If you have a class `C`, `extern (C)` is ambiguous – except for Maximum Munch.

> The attributes `deprecated`, `align`, `package` and `extern` as well as UDAs can be followed with optional arguments in parens, like the deprecation message. These parens are now ambiguous with a basic type in parens. The implementation seems to always try to parse the parens as arguments for the attribute, so it remains backward compatible.

Yes, and it follows MM, which is generally something programmers can rely on. What can be done about those? For one:
```d
attribute { declaration; }
```
always works at declaration scope, but at statement scope, that’s not possible. Here, I thought one could use an empty UDA list `@()`, but those are expressly illegal, so one has to resort to using a dummy UDA like `@("")`. Not nice, but if you insist on expressing something at statement scope in one swath, I guess we can ask the programmer for some concessions.

> Maybe this could be confusing for the user, when a declaration uses a type in parens and later an attribute is added.

There’s unfortunately little that can be done about it. A better implementation can possibly backtrack and re-interpret what used to be an attribute’s argument as a basic type, but to be honest, that is a lot of work.

> ```d
> alias exit = Object;
> Object x1;
> void main()
> {
>     scope (exit) x1 = new Object(); // Still a scope guard
>     // scope (Object) x2 = new Object(); // Syntax error
>     // scope (int) x3 = 3; // Syntax error
>     @0 scope (exit) x4 = new Object(); // Declares variable with type exit
> }
> ```

The big issue with these is, basically, that IMO this _must_ work:
```d
scope (ref void function())* fpp = null;
```
And it doesn’t.

> The first statement is a scope guard with the current grammar. With the new grammar it could also be a variable declaration of type `exit` and storage class `scope`. The implementation still parses it as a scope guard, so it remains backward compatible.

IIRC, I ran into this and implemented a look-ahead to handle scope guards correctly. Scope guards utilize magic identifiers, and unlike `__traits` or `pragma`, there is a no-argument `scope`.

> The next line could also be a variable declaration, but it is still parsed as a scope guard. DMD then prints an error, because `Object` is not a valid scope identifier. The line with `x3` is a syntax error for the same reason.

I just fixed that because it was fairly easy to do so. My implementation now looks ahead to see if it’s `scope(`exit/success/failure`)` and if it’s not, it tries to parse it as the `scope` attribute.

> The last statement is parsed as a variable declaration, because scope guards can't have UDAs.

This is interesting. It’s unlikely that something like that is going to be a real-world problem, though, as it requires two unlikely things: someone naming a type `exit` and putting parentheses around it, and using a UDA at statement scope. My fix from above doesn’t change that, but again, it’s really unlikely to be in code anyways.

> ```d
> auto test1 = function (float){return 0;};
> // auto test2 = function (float)(int){return 0;}; // Syntax error
> ```
> Function literals have an optional return type and optional parameters. The type `float` for `test1` could be a parameter or a return type in parens. The implementation always parses the parens as parameters, so it remains backward compatible.

Yes, for backwards compatibility, it must be done that way. However, this is a MM violation and must be mentioned in the DIP.

> The second function literal has both a return type and parameters, but it results in a syntax error, because the parens are parsed as parameters and no other parens are expected after that.

The second one should be allowed; otherwise some things aren’t expressible. This should work because there’s no valid reason why it can’t:
```d
auto fp = function (ref int function()) () => null;
```
However, this currently works and must keep its behavior:
```d
auto fp = function (ref int function()) => null;
static assert(is(typeof(fp) : typeof(null) function(ref int function())));
```
The implementation will do a look-ahead to figure out if it’s seeing `(Params) FunctionLiteralBody` or `(Type)(params) FunctionLiteralBody`. It might be noteworthy that this is not a MM violation. There is no other way to parse `(Type)(Parameters) FunctionLiteralBody`.

> ```d
> void main()
> {
>     auto o1 = new class (Object) {};
> }
> ```
> The parens could be constructor arguments or a basic type in `AnonBaseClassList?`. The implementation always tries to parse constructor arguments, which should be fine.

I’m going to look into this. Probably this is low-priority because a base class or interface name following `new class` never requires parens. But it should not be an error either. Probably I’ll do the same as with function literals: Look ahead and see if there’s another set of parens. If yes, it’s `new class (Type)(Arguments) {}`. If not, it’s `new class /*implicit Object*/(Arguments) {}` because of backwards compatibility.

I’ll commit my stuff probably tomorrow. I can’t do it now, unfortunately.
Sep 23 2024
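The two look-ahead rules described in the reply above (for `scope` and for function literals) can be sketched roughly as follows. This is a toy model over pre-split string tokens, not the actual parser code from the implementation: `ScopeKind`, `classifyScope`, `skipParens` and `firstGroupIsReturnType` are hypothetical names invented here, and the sketch ignores anything that precedes the keyword (such as a UDA).

```d
// Hypothetical token-level sketch; assumes tokens are already split into strings.
enum ScopeKind { guard, attribute }

/// Classifies what `scope` starts, given the tokens that follow it.
ScopeKind classifyScope(const string[] rest)
{
    // A scope guard is exactly `(` + exit/success/failure + `)`.
    if (rest.length >= 3 && rest[0] == "(" && rest[2] == ")")
    {
        if (rest[1] == "exit" || rest[1] == "success" || rest[1] == "failure")
            return ScopeKind.guard;
    }
    // Anything else is treated as the `scope` attribute.
    return ScopeKind.attribute;
}

/// Returns the index just past the `)` matching the `(` at tokens[open],
/// or tokens.length if the parentheses are unbalanced.
size_t skipParens(const string[] tokens, size_t open)
{
    int depth = 0;
    foreach (i; open .. tokens.length)
    {
        if (tokens[i] == "(")
            ++depth;
        else if (tokens[i] == ")" && --depth == 0)
            return i + 1;
    }
    return tokens.length;
}

/// After `function`: a leading paren group is a return type only if a
/// second paren group (the parameter list) follows it immediately.
bool firstGroupIsReturnType(const string[] afterFunction)
{
    if (afterFunction.length == 0 || afterFunction[0] != "(")
        return false;
    const next = skipParens(afterFunction, 0);
    return next < afterFunction.length && afterFunction[next] == "(";
}
```

Under these assumptions, `scope (exit) ...` is classified as a guard while `scope (Object) ...` and `scope (ref void function())* ...` fall through to the attribute case, and `function (float)(int) {...}` reports a return type while `function (ref int function()) => null` does not.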
On Monday, 23 September 2024 at 19:03:47 UTC, Quirin Schroll wrote:
> I’ll commit my stuff probably tomorrow. I can’t do it now, unfortunately.

Done. And I updated the DIP draft to include the new Maximum Munch exceptions. I did everything as suggested in my post, except for the anonymous class stuff. There, I was mistaken. The constructor arguments go first, then the base class / interfaces follow:
```
new class ConstructorArgs? AnonBaseClassList? AggregateBody
```
This means there is no real issue. If someone writes `new class (Object)`, that’s a compile error today (if `Object` refers to a type, which it usually does) as parsing takes `(Object)` as the argument list, and it will stay one. Someone who wants to surround the first base class / interface with parentheses has to use an explicit empty argument list, e.g. `new class () (Object) {}`.

----

Please review the latest draft [here](https://forum.dlang.org/post/cekqyahwnumvesppxsfs@forum.dlang.org).
Sep 24 2024
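To make the ordering in the last post concrete, here is a small sketch in today's D showing constructor arguments first, followed by the base class and interface list. `Base`, `Named` and `makeAnon` are hypothetical names used only for this illustration.

```d
// Hypothetical types, for illustration only; valid with the current grammar.
class Base
{
    int value;
    this(int value) { this.value = value; }
}

interface Named
{
    string name();
}

Base makeAnon()
{
    // `(42)` is the constructor argument list; `Base, Named` is the
    // AnonBaseClassList that follows it.
    return new class (42) Base, Named
    {
        this(int v) { super(v); }
        string name() { return "anon"; }
    };
}
```

Because the argument list is parsed first, a parenthesized group right after `new class` is always taken as arguments, which matches the explanation in the post.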