digitalmars.D - SAOC 2024 "Learning about AST Nodes and Semantic Analysis in Compiler
- Dennis (102/102) Sep 22 **Tasks Accomplished**
- Nicholas Wilson (5/8) Sep 22 Excellent work! please do open a pull request (draft if you feel
**Tasks Accomplished**

Before delving into how to decouple AST nodes from semantic functions, I looked at how compilers work in general and the processes involved. A typical compiler works this way:

Character Stream => |**Lexer**| => Tokens => |**Parser**| => AST => |**Semantic Routines**| => **Intermediate Representation (Optimization)** => |Code Generator| => **Assembly Code**

**Character stream:** Also known as the source code, this is the input that the programmer wrote.

**Lexer/scanner:** Lexing (lexical analysis) is the process of breaking a string down into meaningful units. The results of this process are called tokens.

**Parser:** The parser obtains a string of tokens from the lexical analyzer and verifies that the string can be generated by the grammar of the source language. It detects and reports any syntax errors and produces a parse tree from which intermediate code can be generated. The output of the parser is an abstract syntax tree (AST).

**Abstract syntax tree (AST):** The AST is like a blueprint that represents the structure of my code. It breaks the code down into smaller chunks and organizes them in a tree-like structure that the compiler can work with. An important fact I learnt is that the AST only contains information related to analyzing the source text and ignores the extra syntactic information used for parsing it. In the dmd compiler codebase, AST nodes are classes and structs, while the semantic routines are functions tightly coupled to the AST classes.

I also learnt about the core differences between an AST and a parse tree. In summary, an AST focuses on the essential elements and their relationships: it captures the underlying structure and semantics of the code and leaves out unnecessary syntactic details. A parse tree captures the complete structure of the input code, including all the syntactic details such as parentheses, semicolons, and other language-specific constructs.
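The lexer => parser => AST stages described above can be sketched in a few lines of D. This is only an illustrative toy for arithmetic expressions, with no syntax-error reporting: `Node`, `IntLit`, `BinOp`, `lex` and `Parser` are hypothetical names invented for this sketch and are not taken from dmd. It also shows the AST versus parse tree point: the parentheses are consumed by the parser but no node is created for them.

```d
import std.ascii : isDigit, isWhite;
import std.conv : to;

// Hypothetical node types for illustration; dmd's real AST classes differ.
abstract class Node
{
    abstract string show();
}

final class IntLit : Node
{
    long value;
    this(long value) { this.value = value; }
    override string show() { return value.to!string; }
}

final class BinOp : Node
{
    char op;
    Node lhs, rhs;
    this(char op, Node lhs, Node rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }
    override string show()
    {
        return "(" ~ op ~ " " ~ lhs.show() ~ " " ~ rhs.show() ~ ")";
    }
}

// Lexer: turn the character stream into tokens (plain strings here).
string[] lex(string src)
{
    string[] tokens;
    size_t i = 0;
    while (i < src.length)
    {
        if (isWhite(src[i])) { i++; continue; }
        const start = i;
        if (isDigit(src[i]))
        {
            while (i < src.length && isDigit(src[i])) i++;
        }
        else
        {
            i++; // single-character token such as + * ( )
        }
        tokens ~= src[start .. i];
    }
    return tokens;
}

// Parser: recursive descent over the tokens, producing AST nodes.
struct Parser
{
    string[] tokens;
    size_t pos;

    // expr := term ('+' term)*
    Node parseExpr()
    {
        Node left = parseTerm();
        while (pos < tokens.length && tokens[pos] == "+")
        {
            pos++;
            left = new BinOp('+', left, parseTerm());
        }
        return left;
    }

    // term := atom ('*' atom)*
    Node parseTerm()
    {
        Node left = parseAtom();
        while (pos < tokens.length && tokens[pos] == "*")
        {
            pos++;
            left = new BinOp('*', left, parseAtom());
        }
        return left;
    }

    // atom := integer | '(' expr ')'
    Node parseAtom()
    {
        if (tokens[pos] == "(")
        {
            pos++;                    // consume '('
            Node inner = parseExpr();
            pos++;                    // consume ')'; the AST gets no node for the parentheses
            return inner;
        }
        return new IntLit(tokens[pos++].to!long);
    }
}

unittest
{
    assert(lex("(1 + 2) * 3") == ["(", "1", "+", "2", ")", "*", "3"]);
    auto p = Parser(lex("(1 + 2) * 3"));
    assert(p.parseExpr().show() == "(* (+ 1 2) 3)");
}
```

Compiling with `dmd -unittest -main -run` on a file containing this block should run the checks in the `unittest` block.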
A simple AST node I constructed for practice: https://github.com/dchidindu5/test_demo/blob/main/README.md

**Semantic Analysis:** This is the compilation stage in which the compiler checks whether the code is logical and meaningful. Its major role is type checking, which confirms that variable declarations, functions, and control flow adhere to the semantics of the language.

The processes described so far make up the front end of the dmd compiler.

- To fully understand the directory layout of the dmd codebase, I used this guide, which outlines the files and what they do: https://github.com/dlang/dmd/blob/master/compiler/src/dmd/README.md
- Looked into each file I would work on.
- Chose the `attrib.d` AST node file, as recommended by my mentor.
- Examined the imports and commented out `import dmd.dsymbolsem`, which is a semantic import.
- Built the compiler, which produced errors.
- Looked at the error messages and moved the affected semantic functions to `dsymbolsem.d`, which is a semantic analysis file.
- The affected function was `newScope`.
- Converted it into a visitor, a design pattern that keeps an operation separate from the classes it operates on. I had trouble mastering it, so my mentor sent me an earlier PR that uses visitors: [Extract dsymbol.Dsymbol.importAll and turn it into a visitor](https://github.com/dlang/dmd/pull/15870/)
- Applied the same approach to the `newScope` function.

**First error encountered:**
```
src/dmd/dsymbolsem.d(7494): Error: function `extern (C++) Scope* dmd.dsymbolsem.newScopeVisitor.visit(Scope* sc)` does not override any function, did you mean to override alias `dmd.visitor.Visitor.visit`?
src/dmd/dsymbolsem.d(7494): Functions are the only declarations that may be overridden
Functions are the only declarations that may be overridden
```

**First commit:** https://github.com/dlang/dmd/commit/c01f76b25b4eb210d92d0ab858dd025ee72bfc6a

**Solution**
My mentor helped me discover that the method signature in `newScopeVisitor` was not exactly the same as in the base class `Visitor`.
That means the method I was trying to override did not have exactly the same name, return type, and parameters. I reworked it to use the exact name and parameter type and a `void` return type, since the `visit` method being overridden is a `void` function (it does not return any value).

**Challenges**
I am still refactoring the code and working through new errors.

**Current commit:**
https://github.com/dlang/dmd/compare/master...dchidindu5:dmd:practice1?expand=1
https://github.com/dlang/dmd/commit/36489c94755a502f7141168ed6e006ef95339062

**Summary:** This week was focused on building a strong theoretical foundation in compiler design, particularly around AST nodes and semantic analysis, while also getting acquainted with the practical aspects of contributing to the DMD compiler project.

**Resources:**
AST
https://medium.com/basecs/leveling-up-ones-parsing-game-with-asts-d7a6fc2400ff
https://pgrandinetti.github.io/compilers/page/what-is-semantic-analysis-in-compilers/
Visitors
https://www.geeksforgeeks.org/visitor-method-design-patterns-in-c/
D language book
http://ddili.org/ders/d.en/index.html
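The override problem described under **Solution** can be reproduced in miniature. The sketch below is a simplified, hypothetical model: `Scope`, `Dsymbol`, `AttribDeclaration`, `Visitor`, `NewScopeVisitor` and `newScope` are stand-ins that borrow names from the post and do not reproduce dmd's real declarations or logic. It assumes the base `visit` methods return `void`, so the visitor stores its result in a field and a free function hands it back to the caller.

```d
// Hypothetical miniature of the pattern; not dmd's actual code.
struct Scope { int depth; }

class Dsymbol
{
    void accept(Visitor v) { v.visit(this); }
}

class AttribDeclaration : Dsymbol
{
    override void accept(Visitor v) { v.visit(this); }
}

class Visitor
{
    void visit(Dsymbol) {}
    // By default, fall back to the handler for the parent node type.
    void visit(AttribDeclaration a) { visit(cast(Dsymbol) a); }
}

class NewScopeVisitor : Visitor
{
    alias visit = Visitor.visit; // keep the other base overloads visible

    Scope* sc; // input scope, and where the result is stored

    this(Scope* sc) { this.sc = sc; }

    // Same name, parameter type and `void` return type as the base method.
    // Declaring `override Scope* visit(Scope* sc)` here instead is rejected
    // with "does not override any function".
    override void visit(AttribDeclaration)
    {
        sc = new Scope(sc.depth + 1); // placeholder logic for the sketch
    }
}

// Free function living outside the node classes: the semantic routine
// no longer needs to be a member of the AST node.
Scope* newScope(Dsymbol d, Scope* sc)
{
    auto v = new NewScopeVisitor(sc);
    d.accept(v);
    return v.sc;
}

unittest
{
    auto outer = new Scope(0);
    assert(newScope(new AttribDeclaration, outer).depth == 1);
    assert(newScope(new Dsymbol, outer) is outer); // default: scope unchanged
}
```

The design point is that the node classes only need `accept`; the operation itself sits in the visitor, which is what allows the semantic import to be dropped from the AST node file.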
Sep 22
On Sunday, 22 September 2024 at 20:33:22 UTC, Dennis wrote:
> **Tasks Accomplished** [...]

Excellent work! Please do open a pull request (draft if you feel it is not yet ready) and the rest of the reviewers (and CI; e.g. your missing newline at the end of file for `dsymbolsem.d`) can give you feedback.
Sep 22
On Monday, 23 September 2024 at 03:37:01 UTC, Nicholas Wilson wrote:
> On Sunday, 22 September 2024 at 20:33:22 UTC, Dennis wrote:
>> **Tasks Accomplished** Design: [...]
>
> Excellent work! please do open a pull request (draft if you feel it is not yet ready) and the rest of the reviewers (and CI; ex your missing newline at the end of file for `dsymbolsem.d`) can give you feedback.

**Noted**
Sep 23
On Monday, 23 September 2024 at 03:37:01 UTC, Nicholas Wilson wrote:
> On Sunday, 22 September 2024 at 20:33:22 UTC, Dennis wrote:
>> **Tasks Accomplished** Design: [...]
>
> Excellent work! please do open a pull request (draft if you feel it is not yet ready) and the rest of the reviewers (and CI; ex your missing newline at the end of file for `dsymbolsem.d`) can give you feedback.

This is the link to the draft PR: https://github.com/dlang/dmd/pull/16880
Sep 25