digitalmars.D - [SAOC 2024] SARIF Output Analysis and Documentation - Weekly Update #2



In this second week of Milestone 1, I continued to focus on 
**SARIF output** for the DMD compiler. My work this week centered 
on analyzing the SARIF outputs generated by **GCC** and 
**Clang**, documenting the results, and understanding how they 
can be applied to DMD’s SARIF integration. I also documented the 
key differences between `physicalLocation` and 
`logicalLocations`; these notes will be added to the DMD 
developer documentation once reviewed.

---




I ran test programs to generate SARIF outputs using both **GCC** 
and **Clang** compilers. The goal was twofold:
- To quickly identify SARIF output patterns, validating them with 
the **SARIF Viewer** extension in VS Code.
- To investigate whether SARIF requires each error to have a 
unique error code. My analysis showed that while unique codes are 
not required, they are recommended for clarity.

This analysis is critical as it will inform the **SARIF 
integration** for DMD. The findings show that DMD doesn’t need to 
strictly enforce unique error codes but should aim to include 
them for better usability.
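
To illustrate the point about error codes, here is a minimal Python sketch of a SARIF 2.1.0 log in which one result carries a `ruleId` and another does not. The tool name `dmd`, the code `DMD0001`, and both messages are hypothetical placeholders, not DMD's actual output. In the SARIF schema a result only needs a `message`, which is why the first result is still well-formed.

```python
import json


def make_result(message, rule_id=None, level="error"):
    """Build a SARIF result object; `ruleId` is attached only when given."""
    result = {"level": level, "message": {"text": message}}
    if rule_id is not None:
        result["ruleId"] = rule_id
    return result


def make_log(tool_name, results):
    """Wrap results in the smallest log structure: version, runs, tool.driver.name."""
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


# Hypothetical diagnostics: the first has no error code, the second has one.
log = make_log("dmd", [
    make_result("undefined identifier `x`"),
    make_result("example diagnostic with a code", rule_id="DMD0001"),
])
print(json.dumps(log, indent=2))
```

A viewer can group or filter the second result by its code, which is the usability benefit of including one.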


I began documenting the SARIF patterns (an example can be found 
[here](https://docs.google.com/document/d/1Hl0Zbmr93XpapSubd8tLOIIunNfsBFM-DJjWj0BoaJ
/edit?usp=sharing)) observed from GCC and Clang, focusing on how errors are
reported. A key part of this documentation was providing a real example of an
error scenario, which will help in understanding how to map error outputs to
SARIF format in the DMD compiler.


**`physicalLocation` vs. `logicalLocations`**:
Following my mentor’s guidance, I analyzed the difference between 
`physicalLocation` and `logicalLocations` in SARIF outputs:
- **GCC** provides both `physicalLocation` (the exact 
file/line/column) and `logicalLocations` (the function or class 
context).
- **Clang**, on the other hand, mostly includes 
`physicalLocation` without additional logical context.
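
The difference between the two fields can be sketched in Python as follows. The helper `make_location`, the file name, and the function name are made up for illustration, and the output is not copied from either compiler. The field names (`artifactLocation`, `region`, `startLine`, `startColumn`, `name`, `kind`) come from the SARIF 2.1.0 schema.

```python
import json


def make_location(uri, line, column, function_name=None):
    """Build a SARIF location: always physical, optionally logical as well."""
    location = {
        "physicalLocation": {
            "artifactLocation": {"uri": uri},
            "region": {"startLine": line, "startColumn": column},
        }
    }
    if function_name is not None:
        # logicalLocations is an array: it names the enclosing construct,
        # independent of where it sits in the file.
        location["logicalLocations"] = [
            {"name": function_name, "kind": "function"}
        ]
    return location


# Physical and logical context together (the GCC-like shape described above).
with_logical = make_location("test.c", 4, 12, function_name="main")
# Physical context only (the Clang-like shape described above).
physical_only = make_location("test.c", 4, 12)

print(json.dumps(with_logical, indent=2))
print(json.dumps(physical_only, indent=2))
```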

I documented these findings (currently in my personal repo, 
[here](https://github.com/royalpinto007/d-drafts/blob/main/physicalvslogical.md)); 
they will be added to the DMD developer docs after my mentor's 
review, to guide other contributors working with SARIF.


I used several resources for my analysis and documentation, 
including the **[SARIF 
tutorials](https://github.com/microsoft/sarif-tutorials/tree/main/docs)** and
the **[SARIF Viewer extension](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-viewer) for VS Code**.
Both were helpful in validating the SARIF outputs and checking that they
conform to the specification.

---




While both compilers emit SARIF, they report errors differently, 
particularly in how they use `logicalLocations`, so the 
documentation and integration plan had to account for both 
approaches. Understanding these differences took some time, but 
it will be valuable when implementing SARIF support in DMD.


Fully grasping the SARIF schema, especially in terms of which 
fields are required and which are optional, was initially 
challenging. However, after going through the SARIF tutorials and 
examining real outputs, I now have a clearer understanding of how 
to structure SARIF output for the DMD compiler.
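
As a rough aid for the required-versus-optional question, the Python sketch below checks a log for the handful of properties the SARIF 2.1.0 schema marks as required on the objects used here: `version` and `runs` on the log, `tool.driver.name` on each run, and `message` on each result. The function name `missing_required` is my own, and this is only a partial check, not a replacement for validating against the full schema.

```python
def missing_required(log):
    """Return paths of required SARIF properties that are absent from `log`."""
    problems = []
    if "version" not in log:
        problems.append("version")
    runs = log.get("runs")
    if runs is None:
        problems.append("runs")
        return problems
    for i, run in enumerate(runs):
        driver = run.get("tool", {}).get("driver", {})
        if "name" not in driver:
            problems.append(f"runs[{i}].tool.driver.name")
        # `results` itself is optional on a run; each result needs a message.
        for j, result in enumerate(run.get("results", [])):
            if "message" not in result:
                problems.append(f"runs[{i}].results[{j}].message")
    return problems


# A deliberately incomplete log: no driver name, and a result without a message.
incomplete = {
    "version": "2.1.0",
    "runs": [{"tool": {"driver": {}}, "results": [{"ruleId": "X"}]}],
}
print(missing_required(incomplete))
```

Everything not listed in the returned paths (for example `ruleId`, `level`, or `locations`) is optional in this sketch's view of the schema.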

---



- Complete the documentation on **SARIF integration** for DMD, 
including the key differences between `physicalLocation` and 
`logicalLocations`, and finalize the error output example.
- Begin working on integrating SARIF support into DMD, focusing 
on mapping the compiler’s error reporting system to the SARIF 
schema.
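
To give an idea of what such a mapping might look like, here is a small Python sketch that converts a DMD-style diagnostic line into a SARIF result. It assumes the familiar `file(line): Error: message` text format (with an optional column, as printed under `-vcolumns`) and only handles `Error` and `Warning`. The regular expression and the function `to_sarif_result` are illustrative assumptions; an actual integration would work inside the compiler's error reporting code rather than parse text.

```python
import re

# Assumed text shape: path(line[,column]): Error|Warning: message
DIAGNOSTIC = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)(?:,(?P<col>\d+))?\): "
    r"(?P<kind>Error|Warning): (?P<msg>.*)$"
)
LEVELS = {"Error": "error", "Warning": "warning"}


def to_sarif_result(text):
    """Convert one diagnostic line to a SARIF result, or None if it doesn't match."""
    match = DIAGNOSTIC.match(text)
    if match is None:
        return None
    region = {"startLine": int(match.group("line"))}
    if match.group("col") is not None:
        region["startColumn"] = int(match.group("col"))
    return {
        "level": LEVELS[match.group("kind")],
        "message": {"text": match.group("msg")},
        "locations": [{
            "physicalLocation": {
                "artifactLocation": {"uri": match.group("file")},
                "region": region,
            }
        }],
    }


result = to_sarif_result("app.d(5,10): Error: undefined identifier `x`")
print(result)
```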
Sep 29