digitalmars.D - Template lowering of druntime hooks that CTFE cannot interpret

Teodor Dutu <teodor.dutu gmail.com> writes:
Hi,

The current workflow of the compiler is that semantic analysis 
deduces types for all expressions in the AST. Then, if CTFE is 
required, the compiler performs the interpretation in 
`dinterpret.d`.
```d
bool f()
{
    // ...
    return true;
}
static assert(f());
```
Before the backend can generate the code, the intermediate code 
generator performs lowerings from expressions such as `a ~= b` 
(when `b` is an array) to `_d_arrayappendT(a, b)` 
[here](https://github.com/dlang/dmd/blob/25bf00749406171f4e7b52dbf0b6df9cb1181854/src/dmd/e2ir.d#L2715-L2734).

The intermediate code generator receives a fully decorated AST, 
therefore it does not run any semantic analysis. As a 
consequence, it is impossible to instantiate templates at this 
level without introducing calls to semantic analysis routines 
(currently there is no such precedent in the intermediate code 
generator). In addition, this layer differs between the various 
compilers, because each uses a different intermediate representation. 
However, one advantage of this approach is that the CTFE 
interpreter does not need to be aware of any hooks since the 
lowering takes place at a lower level.
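
To make the starting point concrete, here is a minimal sketch (the function name `build` is made up for illustration) in which the same `~=` expressions are evaluated once by CTFE and once at run time. Under CTFE the interpreter handles `~=` directly; at run time the append goes through whatever hook the compiler lowers it to.

```d
// Appends one element and one array to a dynamic array.
int[] build()
{
    int[] a = [1, 2];
    a ~= 3;        // element append
    a ~= [4, 5];   // array append
    return a;
}

// Forces CTFE: the interpreter evaluates `~=` without calling any runtime hook.
static assert(build() == [1, 2, 3, 4, 5]);

void main()
{
    // At run time the same source expressions are lowered to druntime hooks.
    assert(build() == [1, 2, 3, 4, 5]);
}
```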

This causes issues when the lowering is moved up from the 
intermediate code generator to the frontend, because now CTFE 
must recognize the hooks and interpret them either by

1. interpreting the runtime hook itself or by
2. generating interpretable code.

The first option is not viable, since most hooks call C 
stdlib functions, such as `memcpy` or `malloc`, which cannot be 
interpreted. Therefore, the alternative is to lower the calls to 
templates during semantic analysis, intercept such lowerings in 
CTFE, and bypass interpreting the runtime hooks. As an 
example, when lowering the expression `S[n] a = b` to 
`_d_arrayctor(a, b)`, the approach we chose in [this 
PR](https://github.com/dlang/dmd/pull/13116) was to have CTFE 
rewrite `_d_arrayctor(a, b)` back to `S[n] a = b` 
[here](https://github.com/teodutu/dmd/blob/eeb7f7fad360a5955d3db90fc1b98be535d790f6/src/dmd/dinte
pret.d#L4816-L4838) and then interpret it as a `ConstructExp`.
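
As an illustration of why interpreting a hook directly is problematic, here is a minimal sketch with two made-up helpers (`copyWithMemcpy` and `copyWithAssign` are not druntime hooks). The first copies through the C library, as many hooks do internally; the second uses plain D that the interpreter can evaluate.

```d
import core.stdc.string : memcpy;

// Hypothetical helper: copies via the C library.
int[3] copyWithMemcpy()
{
    int[3] src = [1, 2, 3];
    int[3] dst;
    memcpy(dst.ptr, src.ptr, src.sizeof);
    return dst;
}

// Hypothetical helper: the same copy written as a plain D construction.
int[3] copyWithAssign()
{
    int[3] src = [1, 2, 3];
    int[3] dst = src;
    return dst;
}

// The plain D version can be evaluated at compile time.
static assert(copyWithAssign() == [1, 2, 3]);

// Uncommenting the next line is expected to be rejected, because CTFE has
// no source code for `memcpy` to interpret:
// enum viaMemcpy = copyWithMemcpy();

void main()
{
    // At run time both helpers produce the same result.
    assert(copyWithMemcpy() == copyWithAssign());
}
```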

The solution above doesn't work when dealing with a lowering to 
`_d_arrayappendcTX`, because there is no single `CallExp` to 
rewrite to a corresponding `CatAssign` expression. This mismatch 
existed prior to our work and was solved by lowering `a ~= b` to 
`_d_arrayappendcTX(a, 1), a[$ - 1] = b, a` in `e2ir.d`. If we 
kept the same lowering when using the new templated hook, then in 
order to reconstruct the original expression, CTFE would have to 
search through the lowered `CommaExp` and look for 
`_d_arrayappendcTX`. This approach is both inelegant and 
impractical. Thus, the approach we chose was to lower `a ~= b` to:
```d
__ctfe ? (a ~= b) : (_d_arrayappendcTX(a, 1), a[$ - 1] = b, a);
```
This makes it so that CTFE will pick the `true` branch of the 
`__ctfe` condition and not bother with the `false` branch. But 
while solving the problem of interpreting the expression 
correctly during CTFE, this approach passes the entire `CondExp` 
to e2ir.d, which then has to 
[ignore](https://github.com/dlang/dmd/blob/92d463064b567dd2e0a88aba2d32117a65be47d6/src/dmd
e2ir.d#L2911-L2922) the `CondExp` and the `true` branch. Moreover, s2ir.d has
to do [something similar](https://github.com/dlang/dmd/blob/92d463064b567dd2e0a88aba2d32117a65be47d6/src/d
d/s2ir.d#L188-L210) for certain `IfStatement`s.
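
A source-level analogue of this lowering might look like the sketch below. It is only an approximation: `growBy` is a made-up stand-in for `_d_arrayappendcTX`, and the branches are written as `if`/`else` because user code cannot use the value of a comma expression.

```d
// Hypothetical stand-in for `_d_arrayappendcTX(a, n)`: makes room for `n` more elements.
void growBy(T)(ref T[] a, size_t n)
{
    a.length = a.length + n;
}

int[] appendOne(int[] a, int b)
{
    if (__ctfe)
    {
        a ~= b;         // the branch that CTFE interprets
    }
    else
    {
        growBy(a, 1);   // the branch that reaches code generation
        a[$ - 1] = b;
    }
    return a;
}

// CTFE path.
static assert(appendOne([1, 2], 3) == [1, 2, 3]);

void main()
{
    // Run-time path.
    assert(appendOne([1, 2], 3) == [1, 2, 3]);
}
```

Both branches have to stay observably equivalent, which is exactly the duplication that the IR generators then have to skip over.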

The solution above can be improved so as to not require code 
changes to e2ir.d and s2ir.d. We aim to do this by breaking away 
from the old hooks when necessary and implementing new templated 
ones that correspond to the expressions from which they will be 
lowered. In the case of `_d_arrayappendcTX`, for example, we plan 
to modify the existing template `_d_arrayappendT` to perform `~=` 
regardless of whether the rhs is an array or a single element. 
This way, CTFE will be able to identify calls to 
`_d_arrayappendT`, convert them to `a ~= b` and then interpret 
the latter expression.
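
A rough sketch of what such a one-to-one hook could look like is given below. The name `appendSketch` and both signatures are assumptions made for illustration; the real `_d_arrayappendT` in druntime is declared and implemented differently.

```d
// Hypothetical unified hook: array on the right-hand side.
ref T[] appendSketch(T)(return ref T[] a, T[] b)
{
    immutable oldLength = a.length;
    a.length = oldLength + b.length;
    a[oldLength .. $] = b[];
    return a;
}

// Hypothetical unified hook: single element on the right-hand side.
ref T[] appendSketch(T)(return ref T[] a, T b)
{
    a.length = a.length + 1;
    a[$ - 1] = b;
    return a;
}

void main()
{
    int[] viaHook = [1];
    appendSketch(viaHook, 2);
    appendSketch(viaHook, [3, 4]);

    int[] viaOperator = [1];
    viaOperator ~= 2;
    viaOperator ~= [3, 4];

    // Each hook call corresponds to exactly one `~=` expression.
    assert(viaHook == viaOperator);
}
```

Because every call maps to a single `a ~= b`, an interpreter that sees the call has everything it needs to reconstruct the original expression.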

Additionally, we have also considered an alternative solution, 
whereby we introduce a new visitor between CTFE and the 
intermediate code generator. This visitor would eliminate all 
`__ctfe` `CondExp`s and `IfStatement`s as well as their `true` 
branches before passing the AST to the IR generator. This 
solution is, however, inefficient, as it adds another pass 
through the AST in order to remove some code that we ourselves 
insert. The real problem is that the hooks do not perform exactly 
the same actions as the expressions from which they're lowered, 
and an extra visitor does not address that. The first approach 
(new templated hooks that mirror their source expressions) does.
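
For reference, the extra pass could be sketched on a toy AST as follows. None of these classes are dmd's real node types, and `stripCtfe` is a made-up name; the sketch only shows the shape of the rewrite.

```d
// Toy AST; these classes are illustrative and are not dmd's real node types.
abstract class Exp {}

final class CtfeVar : Exp {}   // stands for the `__ctfe` variable

final class Leaf : Exp
{
    string text;
    this(string text) { this.text = text; }
}

final class Comma : Exp
{
    Exp left, right;
    this(Exp left, Exp right) { this.left = left; this.right = right; }
}

final class Cond : Exp
{
    Exp test, ifTrue, ifFalse;
    this(Exp test, Exp ifTrue, Exp ifFalse)
    {
        this.test = test;
        this.ifTrue = ifTrue;
        this.ifFalse = ifFalse;
    }
}

// Replaces every `__ctfe ? x : y` with `y`, recursing into the children.
Exp stripCtfe(Exp e)
{
    if (auto c = cast(Cond) e)
    {
        if ((cast(CtfeVar) c.test) !is null)
            return stripCtfe(c.ifFalse);
        c.test = stripCtfe(c.test);
        c.ifTrue = stripCtfe(c.ifTrue);
        c.ifFalse = stripCtfe(c.ifFalse);
        return c;
    }
    if (auto m = cast(Comma) e)
    {
        m.left = stripCtfe(m.left);
        m.right = stripCtfe(m.right);
        return m;
    }
    return e;
}

void main()
{
    Exp lowered = new Cond(new CtfeVar,
        new Leaf("a ~= b"),
        new Comma(new Leaf("_d_arrayappendcTX(a, 1)"), new Leaf("a[$ - 1] = b")));

    // Only the run-time branch survives the pass.
    auto stripped = cast(Comma) stripCtfe(lowered);
    assert(stripped !is null);
    assert((cast(Leaf) stripped.left).text == "_d_arrayappendcTX(a, 1)");
}
```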

Do you suggest any other solutions than those we propose?

Thanks,
Teodor
Jan 17 2022
max haughton <maxhaton gmail.com> writes:
On Monday, 17 January 2022 at 15:25:57 UTC, Teodor Dutu wrote:
 Hi,

 The current workflow of the compiler is that semantic analysis 
 deduces types for all expressions in the AST. Then, if CTFE is 
 required, the compiler performs the interpretation in 
 `dinterpret.d`.
 ```d
 bool f()
 {
 // ...
 }
 static assert(f());
 ```
 Before the backend can generate the code, the intermediate code 
 generator performs lowerings from expressions such as `a ~= b` 
 (when `b` is an array) to `_d_arrayappendT(a, b)` 
 [here](https://github.com/dlang/dmd/blob/25bf00749406171f4e7b52dbf0b6df9cb1181854/src/dmd/e2ir.d#L2715-L2734).

 [...]
Moving rewriting steps further up the compiler is a good thing. In this case it seems like a practical necessity anyway, because LDC and GDC will have to do this eventually and they do not use e2ir et al. at all.
Jan 17 2022
Timon Gehr <timon.gehr gmx.ch> writes:
On 17.01.22 16:25, Teodor Dutu wrote:
 
 Do you suggest any other solutions than those we propose?
Detect `if(__ctfe)` and don't do the rewrites in its body. Then use `if(__ctfe)` in the runtime hooks to provide an implementation that's compatible with CTFE. Of course, this won't work with the hooks that you have designed in a way that invokes UB, but I think that's a feature as those should be redesigned.
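
A minimal sketch of this idea, assuming a made-up hook `copyHook` (not one of druntime's): the hook itself branches on `__ctfe`, so the interpreter only ever executes the plain D path, while generated code takes the `memcpy` path.

```d
import core.stdc.string : memcpy;

// Hypothetical hook: copies `from` into `to`.
void copyHook(T)(T[] to, T[] from)
{
    assert(to.length == from.length);
    if (__ctfe)
    {
        // Plain D that the interpreter can evaluate.
        foreach (i, ref e; to)
            e = from[i];
    }
    else
    {
        // Fast path that only runs in the generated code.
        memcpy(to.ptr, from.ptr, T.sizeof * to.length);
    }
}

int[3] viaHook()
{
    int[3] src = [1, 2, 3];
    int[3] dst;
    copyHook(dst[], src[]);
    return dst;
}

// CTFE takes the `if (__ctfe)` branch.
static assert(viaHook() == [1, 2, 3]);

void main()
{
    // Run time takes the `memcpy` branch.
    assert(viaHook() == [1, 2, 3]);
}
```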
Jan 17 2022