digitalmars.D - Continuation passing style vs. wrapper objects in dmd

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dmd has a few string functions with names having "Then" as a prefix that 
take a lambda and call it with a temporary string converted for OS 
purposes (zero-terminated, encoded a specific way etc). The use goes 
like this:

int i = module_name.toCStringThen!(name => stat(name.ptr, &statbuf));

The way it goes, `module_name` gets converted from `char[]` to 
null-terminated `char*`, the lambda gets invoked, then whatever 
temporary memory allocated is freed just after the lambda returns.
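
To make the mechanics concrete, here is a minimal sketch of how such a "Then" function could be written. It is an assumption-laden stand-in, not dmd's actual code: the name `toCStringThenSketch` is hypothetical, and it always uses `malloc`, whereas the real helper may avoid heap allocation for short strings.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : strlen;

// Hypothetical stand-in for dmd's `toCStringThen`; the real one may differ.
auto toCStringThenSketch(alias dg)(const(char)[] src)
{
    // Room for the text plus the terminating zero.
    char* buf = cast(char*) malloc(src.length + 1);
    assert(buf !is null, "out of memory");
    scope (exit) free(buf); // released right after `dg` returns

    buf[0 .. src.length] = src[];
    buf[src.length] = '\0';

    // The slice includes the terminator, so `.ptr` is a valid C string.
    return dg(buf[0 .. src.length + 1]);
}

unittest
{
    const(char)[] name = "hello";
    // Whatever the lambda returns is forwarded to the caller.
    size_t n = name.toCStringThenSketch!(s => strlen(s.ptr));
    assert(n == 5);
}
```

The sketch can be exercised with `dmd -unittest -main -run` followed by the file name.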

I was thinking there's an easier way that's also more composable:

int i = stat(stringz(name).ptr, &statbuf);

where `stringz` returns a temporary struct offering primitives such as 
`ptr` and `opSlice`. In the destructor, the struct frees temporary 
memory if allocated. Better yet, it can return them as `scope` variables, 
thereby ensuring correctness in safe code.

Destruction of temporary objects has been sketchy in the past but I 
assume things have been ironed out by now.
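
A minimal sketch of the wrapper idea, under stated assumptions: `StringZ` and its fields are hypothetical names, the buffer always comes from `malloc`, and no `scope` annotations are attempted, so nothing here prevents the pointer from escaping.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : strlen;

// Hypothetical wrapper type; names are illustrative only.
struct StringZ
{
    private char* buf;
    private size_t len;

    @disable this(this);   // no copies, so the buffer is freed exactly once

    ~this() { free(buf); } // free(null) is a no-op

    // Both results are valid only while this struct is alive.
    const(char)* ptr() const { return buf; }
    const(char)[] opSlice() const { return buf[0 .. len]; }
}

StringZ stringz(const(char)[] src)
{
    char* p = cast(char*) malloc(src.length + 1);
    assert(p !is null, "out of memory");
    p[0 .. src.length] = src[];
    p[src.length] = '\0';
    return StringZ(p, src.length);
}

unittest
{
    // Immediate use: the temporary lives until the end of the full expression.
    size_t n = strlen(stringz("dmd").ptr);
    assert(n == 3);

    // Named use: `tmp` lives until the end of the enclosing scope.
    auto tmp = stringz("abc");
    assert(tmp[] == "abc");
}
```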
May 25
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/25/21 1:21 PM, Andrei Alexandrescu wrote:
 dmd has a few string functions with names having "Then" as a prefix
s/prefix/suffix/
May 25
prev sibling next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu 
wrote:
 int i = stat(stringz(name).ptr, &statbuf);
I much prefer that form too, but unfortunately it only works with immediate use. This is a big pitfall:

```D
{
    const(char)* s = stringz(name).ptr; // destructor runs here
    printf(s); // use after free
}
```

You have to write:

```D
{
    auto temp = stringz(name);
    const(char)* s = temp.ptr;
    printf(s); // good!
    // destructor runs here
}
```

The first form should be preventable with -dip1000, but it isn't currently:
https://issues.dlang.org/show_bug.cgi?id=20880
https://issues.dlang.org/show_bug.cgi?id=21868
May 25
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/25/21 1:56 PM, Dennis wrote:
 On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
 int i = stat(stringz(name).ptr, &statbuf);
 I much prefer that form too, but unfortunately it only works with immediate use. This is a big pitfall:
 ```D
 {
     const(char)* s = stringz(name).ptr; // destructor runs here
     printf(s); // use after free
 }
 ```
 You have to write:
 ```D
 {
     auto temp = stringz(name);
     const(char)* s = temp.ptr;
     printf(s); // good!
     // destructor runs here
 }
 ```
The CPS is exposed to the problem as well:

const(char)* p;
module_name.toCStringThen!(name => p = name.ptr);

 The first form should be preventable with -dip1000, but it isn't currently:
 https://issues.dlang.org/show_bug.cgi?id=20880
 https://issues.dlang.org/show_bug.cgi?id=21868
DIP1000 should help both, but the wrapper object is (to me at least) vastly easier on the eyes. Not to mention composability (what if you have two of those...)
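
To illustrate the "two of those" case, here is a hedged sketch that puts both styles side by side. Everything in it is hypothetical: `dupz`, `withCString` and `tempz` are stand-ins for the real helpers, and `strcmp` stands in for an OS call that takes two C strings.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : strcmp;

// Hypothetical helper: copies `src` into a malloc'ed, zero-terminated buffer.
private char* dupz(const(char)[] src)
{
    char* p = cast(char*) malloc(src.length + 1);
    assert(p !is null, "out of memory");
    p[0 .. src.length] = src[];
    p[src.length] = '\0';
    return p;
}

// CPS form (stand-in for `toCStringThen`).
auto withCString(alias dg)(const(char)[] src)
{
    char* p = dupz(src);
    scope (exit) free(p);
    return dg(cast(const(char)*) p);
}

// Wrapper form (stand-in for `stringz`).
struct TempZ
{
    private char* p;
    @disable this(this);
    ~this() { free(p); }
    const(char)* ptr() const { return p; }
}

TempZ tempz(const(char)[] src) { return TempZ(dupz(src)); }

// Two strings, CPS: one nested lambda per string.
int compareCps(const(char)[] a, const(char)[] b)
{
    return a.withCString!(pa => b.withCString!(pb => strcmp(pa, pb)));
}

// Two strings, wrappers: a single flat expression.
int compareWrapped(const(char)[] a, const(char)[] b)
{
    return strcmp(tempz(a).ptr, tempz(b).ptr);
}

unittest
{
    assert(compareCps("abc", "abc") == 0);
    assert(compareWrapped("abc", "abc") == 0);
    assert(compareCps("abc", "abd") < 0);
    assert(compareWrapped("abc", "abd") < 0);
}
```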
May 25
prev sibling next sibling parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu 
wrote:
 int i = module_name.toCStringThen!(name => stat(name.ptr, 
 &statbuf));
This can use alloca, if it does not inline.
 int i = stat(stringz(name).ptr, &statbuf);
This cannot...
May 25
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 26 May 2021 at 04:59:36 UTC, Ola Fosheim Grostad 
wrote:
 int i = stat(stringz(name).ptr, &statbuf);
This cannot...
Well, actually it could use alloca if you are 100% sure stringz is inlined, but it would be bad in a loop.
May 25
parent Daniel N <no public.email> writes:
On Wednesday, 26 May 2021 at 05:18:57 UTC, Ola Fosheim Grostad 
wrote:
 On Wednesday, 26 May 2021 at 04:59:36 UTC, Ola Fosheim Grostad 
 wrote:
 int i = stat(stringz(name).ptr, &statbuf);
This cannot...
Well, actually it could use alloca if you are 100% sure stringz is inlined, but it would be bad in a loop.
Thread necromancy yields:

ref E stalloc(E)(ref E mem = *(cast(E*)alloca(E.sizeof)))

Guaranteed to work as default parameter initialisation occurs in the caller's context, but indeed loops are no fun.
May 26
prev sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu 
wrote:
 dmd has a few string functions with names having "Then" as a 
 prefix that take a lambda and call it with a temporary string 
 converted for OS purposes (zero-terminated, encoded a specific 
 way etc). The use goes like this:

 [...]

 I was thinking there's an easier way that's also more 
 composable:

 int i = stat(stringz(name).ptr, &statbuf);

 where `stringz` returns a temporary struct offering primitives 
 such as `ptr` and `opSlice`. In the destructor, the struct 
 frees temporary memory if allocated. Better yet, it can return 
 them as `scope` variable, that way ensuring correctness in safe 
 code.
Well, the usage of CPS is limited to one level here. As your example shows, we can return whatever we want from `toCStringThen` and, if needed, chain the return value with something else.

The `struct` approach works to a certain degree: DIP1000 would *not* provide the tool to make this pattern work in a `@safe` context. I have yet to see a container that is `@safe` to use with DIP1000 (e.g. https://github.com/dlang/phobos/pull/8101 ), but making the CPS work with DIP1000 is possible (provided DIP1000 works as intended). It's from this observation that this approach became my preferred one, and that's what led to `toCStringThen` (origin: https://github.com/dlang/dmd/pull/8585 ).

But this function is only used a handful of times (~10?) in DMD, and only for C functions or in trampoline functions (turning a slice into a pointer). Is there a large ROI in finding the best possible pattern for it? There are many large architectural problems in DMD that need to be addressed, such as the absolute lack of abstraction despite the OOP hierarchy. The semantic routines will cast a base type (`Expression`, `Dsymbol`, etc.) to a more specialized type literally everywhere, instead of relying on virtual functions / properties available in the base classes. Just grep for `cast(TypeFunction)` to get an idea of what I mean.
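
The contrast between downcasting and virtual dispatch can be sketched on a miniature hierarchy. The classes below merely reuse two names for illustration and are not dmd's actual definitions; `isFunction`, `viaCast` and `viaVirtual` are hypothetical.

```d
// Hypothetical miniature hierarchy; not dmd's actual classes.
class Type
{
    // Virtual query with a default answer in the base class.
    bool isFunction() const { return false; }
}

class TypeFunction : Type
{
    override bool isFunction() const { return true; }
}

// Style criticised above: the caller downcasts to learn the kind.
bool viaCast(Type t)
{
    return cast(TypeFunction) t !is null;
}

// Alternative: ask through the base-class interface.
bool viaVirtual(const Type t)
{
    return t.isFunction();
}

unittest
{
    Type f = new TypeFunction;
    Type g = new Type;
    assert(viaCast(f) && viaVirtual(f));
    assert(!viaCast(g) && !viaVirtual(g));
}
```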
May 26
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 2021-05-26 3:48, Mathias LANG wrote:
 On Tuesday, 25 May 2021 at 17:21:16 UTC, Andrei Alexandrescu wrote:
 dmd has a few string functions with names having "Then" as a prefix 
 that take a lambda and call it with a temporary string converted for 
 OS purposes (zero-terminated, encoded a specific way etc). The use 
 goes like this:

 [...]

 I was thinking there's an easier way that's also more composable:

 int i = stat(stringz(name).ptr, &statbuf);

 where `stringz` returns a temporary struct offering primitives such as 
 `ptr` and `opSlice`. In the destructor, the struct frees temporary 
 memory if allocated. Better yet, it can return them as `scope` 
 variable, that way ensuring correctness in safe code.
 Well, the usage of CPS is limited to one level here. As your example shows, we can return whatever we want from `toCStringThen` and, if needed, chain the return value with something else.
 The `struct` approach works to a certain degree: DIP1000 would *not* provide the tool to make this pattern work in a `@safe` context. I have yet to see a container that is `@safe` to use with DIP1000 (e.g. https://github.com/dlang/phobos/pull/8101 ), but making the CPS work with DIP1000 is possible (provided DIP1000 works as intended). It's from this observation that this approach became my preferred one, and that's what led to `toCStringThen` (origin: https://github.com/dlang/dmd/pull/8585 ).
It's great that DIP1000 works with CPS. Given the familiarity and ubiquity of wrapper structs, the more important conclusion here is that we must make DIP1000 work with them. A struct should be able to expose its innards with `scope` and have the compiler make sure their use doesn't outlive the struct. It's pretty much the primary use case of DIP1000.
 But this function is only used a handful of times (~10?) in DMD, and 
 only for C functions or in trampoline functions (turning a slice into a 
 pointer). Is there a large ROI in finding the best possible pattern for 
 it? There are many large architectural problems in DMD that need to be 
 addressed, such as the absolute lack of abstraction despite the OOP 
 hierarchy. The semantic routines will cast a base type (`Expression`, 
 `Dsymbol`, etc...) to a more specialized type literally everywhere, 
 instead of relying on virtual functions / properties available in the 
 base classes. Just grep for `cast(TypeFunction)` to get an idea of what 
 I mean.
Sure. It's a new idiom in D and out of character for dmd so it's worth exploring its pros and cons with an eye for further adoption.
May 26