www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Introduce a New DRuntime Hook Lowering Pass to Unify the Logic for All

reply Teodor Dutu <teodor.dutu gmail.com> writes:
Hi,

I am trying to add a template lowering for `ArrayLiteralExp`s. It 
should call `_d_arrayliteralTX` [1]. I introduced all previous 
lowerings in `expressionsem.d` and intended to do the same with 
this one by adding it here [2]. However, `ArrayLiteralExp`s are 
more finicky than other expressions because the compiler makes 
many optimisations that end up creating additional 
`ArrayLiteralExp`s or changing their type. Below are the main 
issues:

a. `constfold.d` will replace expressions such as `[1, 2] ~ 3` 
with `[1, 2, 3]` here [3]. The new `ArrayLiteralExp` is not 
semantically analysed (because it's obvious that the result has a 
correct type if the operands do so) and its type is set to that 
of `[1, 2]`. However, this causes `[1, 2, 3]` to miss its 
lowering. Also notice that now the lowering of `[1, 2]` is no 
longer used.

b. Sometimes the type of the array literal is promoted to the 
type of the variable in which it's stored. Consider this 
expression: `long[] arr = [1, 2, 3]`. `[1, 2, 3]` is initially 
analysed and its type inferred to `int[]` [2]. Then its type is 
bumped up to `long[]` using `implicitCastTo` [4]. If the hook is 
inserted in [2], its type is int[], thus causing an incorrect 
allocation of `3 * 4 = 12` bytes instead of `3 * 8 = 24`. This, 
in turn, leads to heap corruptions when initialising the elements 
of the array as `long`s.

Problem a can be solved individually by copying the lowering of 
`[1, 2]` to `[1, 2, 3]` and then updating the length argument of 
the hook from 2 to 3 in `e2ir.d` [5], which is where the AST no 
longer suffers any changes. This is ugly because it requires many 
changes to `constfold.d` and changes the logic of `e2ir.d`, which 
will require the same changes to be mimicked by GDC and LDC's 
glue layers. Problem b can also be solved individually by also 
performing the lowering in `implicitCastTo` [4]. But this 
solution is also ugly because now the lowering is performed in 2 
places, one of which is totally unintuitive. Coupled with the 
difficulties introduced by the solution to problem b, this patch 
is scattered all over the place, making code difficult to 
maintain and lacking in performance because of all the redundant 
lowerings and analyses made.

Hence what I am proposing as a global solution to both a and b 
and also applicable to other hooks is to create another pass just 
for lowering, after semantic analysis has finished and has 
produced a definitive AST. This will provide a unified place to 
introduce all DRuntime lowerings, thus leading to more readable 
and maintainable code in the long run. Traversing the AST in this 
manner will hurt performance, which is why it would be enough to 
just store the nodes that require lowerings in an array so that 
this new pass just iterates this much smaller array instead of 
the whole AST.

What do you think about my proposal in the paragraph above? Can 
you think of better approaches that you can suggest instead?

Thanks,
Teo

[1] 
https://github.com/dlang/dmd/blob/67135935086494408e511abad3d1dae7eda4699d/druntime/src/rt/lifetime.d#L2122-L2140
[2] 
https://github.com/dlang/dmd/blob/67135935086494408e511abad3d1dae7eda4699d/compiler/src/dmd/expressionsem.d#L4382-L4427
[3] 
https://github.com/dlang/dmd/blob/67135935086494408e511abad3d1dae7eda4699d/compiler/src/dmd/constfold.d#L1624
[4] 
https://github.com/dlang/dmd/blob/67135935086494408e511abad3d1dae7eda4699d/compiler/src/dmd/dcast.d#L185-L194
[5] 
https://github.com/dlang/dmd/blob/67135935086494408e511abad3d1dae7eda4699d/compiler/src/dmd/e2ir.d#L4051-L4054
Feb 08
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/02/2024 3:58 AM, Teodor Dutu wrote:
 Hence what I am proposing as a global solution to both a and b and also 
 applicable to other hooks is to create another pass just for lowering, 
 after semantic analysis has finished and has produced a definitive AST. 
 This will provide a unified place to introduce all DRuntime lowerings, 
 thus leading to more readable and maintainable code in the long run. 
 Traversing the AST in this manner will hurt performance, which is why it 
 would be enough to just store the nodes that require lowerings in an 
 array so that this new pass just iterates this much smaller array 
 instead of the whole AST.
It might be a good idea to question if a linked list wouldn't be a better solution than an array to store the nodes. It depends how often lowering has to take place. If its often then the extra ram usage would be worth the linked list.
Feb 08
parent Teodor Dutu <teodor.dutu gmail.com> writes:
On Thursday, 8 February 2024 at 15:08:05 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 On 09/02/2024 3:58 AM, Teodor Dutu wrote:
 Hence what I am proposing as a global solution to both a and b 
 and also applicable to other hooks is to create another pass 
 just for lowering, after semantic analysis has finished and 
 has produced a definitive AST. This will provide a unified 
 place to introduce all DRuntime lowerings, thus leading to 
 more readable and maintainable code in the long run. 
 Traversing the AST in this manner will hurt performance, which 
 is why it would be enough to just store the nodes that require 
 lowerings in an array so that this new pass just iterates this 
 much smaller array instead of the whole AST.
It might be a good idea to question if a linked list wouldn't be a better solution than an array to store the nodes. It depends how often lowering has to take place. If its often then the extra ram usage would be worth the linked list.
Indeed. Counting the nodes that require lowering in larger codebases (like phobos for example) might be useful to decide this. In addition, on second thought, an unordered set would be better suited because, in the case of const folds, the old array (`[1, 2]` in my example) should be removed from the container (to name it generically) and replaced with the folded one (`[1, 2, 3]`). Regardless, this is more of an implementation detail in case the new pass is to be introduced. The exact nature of how to store the expressions that require lowering can be improved once the general approach is decided upon.
Feb 08