digitalmars.D - [SAOC 2024] SARIF Output Analysis and Documentation - Weekly Update #2
Royal Simpson Pinto, Sep 29

## Summary of Progress (September 23 – September 29)
In this second week of Milestone 1, I continued my focus on the DMD compiler with an emphasis on **SARIF output**. My work this week centered on analyzing SARIF outputs generated by **GCC** and **Clang**, documenting the results, and understanding how they can be applied to DMD’s SARIF integration. I also documented key differences between `physicalLocation` and `logicalLocations`, which will be included in the DMD developer documentation once reviewed.

---

I ran test programs to generate SARIF outputs using both the **GCC** and **Clang** compilers. The goal was twofold:

- To quickly identify SARIF output patterns, validating them with the **SARIF Viewer** extension in VS Code.
- To investigate whether SARIF requires each error to have a unique error code. My analysis showed that while unique codes are not required, they are recommended for clarity.

This analysis is critical as it will inform the **SARIF integration** for DMD. The findings show that DMD doesn’t need to strictly enforce unique error codes but should aim to include them for better usability.

I began documenting the SARIF patterns observed from GCC and Clang (an example can be found [here](https://docs.google.com/document/d/1Hl0Zbmr93XpapSubd8tLOIIunNfsBFM-DJjWj0BoaJ /edit?usp=sharing)), focusing on how errors are reported. A key part of this documentation was providing a real example of an error scenario, which will help in understanding how to map error outputs to SARIF format in the DMD compiler.

**`physicalLocation` vs. `logicalLocations`**: Following my mentor’s guidance, I analyzed the difference between `physicalLocation` and `logicalLocations` in SARIF outputs:

- **GCC** provides both `physicalLocation` (the exact file/line/column) and `logicalLocations` (the function or class context).
- **Clang**, on the other hand, mostly includes `physicalLocation` without additional logical context.
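To make the distinction above concrete, here is a minimal Python sketch of a SARIF `location` object with and without logical context. The property names (`physicalLocation`, `artifactLocation`, `region`, `logicalLocations`) come from the SARIF 2.1.0 schema; the helper `make_location`, the file name `example.c`, and the function name `main` are hypothetical and chosen only for illustration. They are not copied from actual GCC or Clang output.

```python
import json


def make_location(uri, line, column, function_name=None):
    """Build a SARIF `location` object (hypothetical helper).

    `physicalLocation` is always emitted; `logicalLocations` is added
    only when an enclosing function name is supplied.
    """
    location = {
        "physicalLocation": {
            "artifactLocation": {"uri": uri},
            "region": {"startLine": line, "startColumn": column},
        }
    }
    if function_name is not None:
        # Logical context: where the error sits in the program structure.
        location["logicalLocations"] = [
            {"name": function_name, "kind": "function"}
        ]
    return location


# Shape resembling what the post describes for GCC: both kinds of location.
with_logical = make_location("example.c", 4, 12, function_name="main")

# Shape resembling what the post describes for Clang: physical location only.
physical_only = make_location("example.c", 4, 12)

print(json.dumps(with_logical, indent=2))
print(json.dumps(physical_only, indent=2))
```

A consumer of SARIF output therefore cannot assume `logicalLocations` is present and should treat it as optional context.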
I documented these findings (currently in my local repo, available [here](https://github.com/royalpinto007/d-drafts/blob/main/ph sicalvslogical.md)); they will be added to the DMD developer docs after my mentor's review to guide other contributors working with SARIF.

I utilized several resources to assist with my analysis and documentation, including the **[SARIF tutorials](https://github.com/microsoft/sarif-tutorials/tree/main/docs)** and the **SARIF Viewer [extension](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer) for VS Code**. These tools were helpful in validating SARIF outputs and ensuring they conform to the specification.

---

While both compilers provide SARIF outputs, their handling of error reporting, particularly in how they use `logicalLocations`, was different, making it necessary to adapt the documentation and integration plan accordingly. Understanding these differences took some time, but it will be valuable when implementing SARIF support in DMD.

Fully grasping the SARIF schema, especially in terms of which fields are required and which are optional, was initially challenging. However, after going through the SARIF tutorials and examining real outputs, I now have a clearer understanding of how to structure SARIF output for the DMD compiler.

---

- Complete the documentation on **SARIF integration** for DMD, including the key differences between `physicalLocation` and `logicalLocations`, and finalize the error output example.
- Begin working on integrating SARIF support into DMD, focusing on mapping the compiler’s error reporting system to the SARIF schema.
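As a rough illustration of what mapping a compiler error to the SARIF schema could look like, the Python sketch below builds a minimal SARIF 2.1.0 log. It is not DMD's implementation: the `Diagnostic` class, the `to_sarif_log` function, the sample messages, and the rule id `EXAMPLE-0001` are all hypothetical. It relies on the schema requiring only `version` and `runs` at the top level, `tool.driver.name` for each run, and `message` for each result, so `ruleId` is emitted only when a code is available.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diagnostic:
    """Hypothetical stand-in for one compiler error (not DMD's real type)."""
    message: str
    file: str
    line: int
    column: int
    rule_id: Optional[str] = None  # error code, if the compiler has one


def to_sarif_log(tool_name, diagnostics):
    """Convert diagnostics into a minimal SARIF 2.1.0 log as a dict."""
    results = []
    for d in diagnostics:
        result = {
            "level": "error",
            "message": {"text": d.message},  # the only required result field
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": d.file},
                    "region": {"startLine": d.line, "startColumn": d.column},
                }
            }],
        }
        if d.rule_id is not None:
            # Optional in SARIF, but recommended for clarity.
            result["ruleId"] = d.rule_id
        results.append(result)
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


log = to_sarif_log("DMD", [
    Diagnostic("undefined identifier `x`", "app.d", 3, 5),
    Diagnostic("sample error with a code", "app.d", 7, 1, rule_id="EXAMPLE-0001"),
])
print(json.dumps(log, indent=2))
```

Printing the log as JSON and saving it with a `.sarif` extension is one way to inspect the result in the SARIF Viewer extension.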