digitalmars.D - [DIP idea] out variables

Q. Schroll <qs.il.paperinik gmail.com> writes:
Main goal: Make the `out` parameter storage class live up to 
promises.
In current semantics, `out` is basically `ref` but with 
documented intent. The initialization of the parameter is more 
like a detail.
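
For reference, a minimal sketch of what `out` does in D today. The function name `readsBeforeWriting` is made up for illustration. The parameter is reset to its type's `.init` value on function entry, and nothing stops the callee from reading it before writing it:

```d
// Current D semantics: an `out` parameter is passed by reference and
// reset to T.init on function entry. The callee may read it freely.
int readsBeforeWriting(out int value)
{
    int seen = value; // legal today; always sees int.init, i.e. 0
    value = 42;
    return seen;
}

unittest
{
    int x = 7;                          // the caller's value is discarded
    assert(readsBeforeWriting(x) == 0); // the callee saw 0, not 7
    assert(x == 42);
}
```

This should compile and run with `dmd -main -unittest -run file.d`.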

General Idea
============

The idea of an out variable is one that **must** be passed to a 
function in an `out` parameter position. Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.

An `out` variable cannot be read until initialized by a function 
call in an `out` parameter position. Since D has exact evaluation 
order, it is easily determined that one usage of `x` initializes 
it and another in the same overall expression reads it (and not 
the other way around):

     out int x, y;
     /*1*/ if (h(x, y) > 0 && x < y) { .. }
     /*2*/ g(f(x), f(y), x, y);

Evaluation order says in /*1*/ that h(x, y) is executed before x 
and y are read for testing `x < y`.
Evaluation order says in /*2*/ that f(x) and f(y) are executed 
before x and y are read for passing them to g.

Also, multiple execution paths can lead to different 
initialization points:

     out int x, y, z;
     if (g(0)) { f(x); f(y); f(z); } else h(x, y);
     // x, y are initialized.
     g(x, y); // okay: x and y initialized on both branches
     g(z); // invalid: z might not be initialized.

It is always possible to initialize `out` variables using an 
ordinary assignment:

     out int x, y, z;
     if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
     g(z); // valid: z initialized on both branches


Templates
=========

Similar to `ref`, there will be `auto out` which infers `out` 
based on the arguments passed. `auto out` can be combined with 
`ref` (meaning pass by reference always, but if the argument is 
an out value, this is its initialization) and `auto ref` (meaning 
pass by reference if possible, and if the argument is an out 
value, this is its initialization; it cannot be passed by value 
and be initialized).

With __traits(isOut, param) one can test whether `auto out` 
boiled down to `out` or not.

After being (potentially|definitely|?) initialized, `out` 
variables do not trigger `auto out` to become `out`.


In-place `out` Variables
========================

When calling a function with an `out` parameter, instead of 
passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }

The type of an in-place out variable can be left out, when it can 
be inferred from the called function. [Clearly it can be done in 
some cases and clearly it cannot be in all templates. Exact rules 
TBD.]
In the first else branch, `x` can be used: regardless of whether 
`f(out int x) > 0 && x > 0` is true or false, evaluating it 
initializes `x`.
In the second else branch, `x` cannot be used because `x` might 
not be initialized if g(0) is false.
The visibility of in-place out variables is limited to the 
statement they're declared in. For `if` statements this 
encompasses both branches, but for expression statements, it only 
encompasses that expression:

     x = f(out a) + a; // valid
     y = f(out b);
     // y += b; // error, b not visible
     out int c;
     f(c);
     z += c; // valid

One obvious use-case is functions that return a bool value 
indicating success and the result is an `out` parameter. Usually, 
these functions' names begin with try:

     if (tryParseInt(str, out x)) { use(x); }
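
For comparison, in current D the same pattern needs the variable declared up front. A minimal sketch, assuming a hypothetical `tryParseInt` built on `std.conv.to` (this thread does not define it):

```d
import std.conv : to, ConvException;

// Hypothetical helper: returns true and stores the parsed value on
// success; on failure `result` keeps the int.init it got on entry.
bool tryParseInt(string s, out int result)
{
    try
    {
        result = s.to!int;
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

unittest
{
    int x; // must be declared before the call in current D
    assert(tryParseInt("42", x));
    assert(x == 42);

    assert(!tryParseInt("abc", x));
    assert(x == 0); // reset to int.init by `out`
}
```

With an in-place `out` variable, the separate `int x;` declaration would go away.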

Another could be unpacking:

     out T x;
     out S y;
     tuple.unpack(x, y);
     // or
     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

What do you think? Worth it?
Jan 25 2021
12345swordy <alexanderheistermann gmail.com> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]

 What do you think? Worth it?
`in`, `out`, and `inout` badly need reworking. There is a preview for `in`, but sadly not for the others. -Alex
Jan 25 2021
Q. Schroll <qs.il.paperinik gmail.com> writes:
On Tuesday, 26 January 2021 at 02:44:20 UTC, 12345swordy wrote:
 What do you think? Worth it?
 `in`, `out`, and `inout` badly need reworking. There is a preview for `in`, but sadly not for the others.
While `in` and `out` are opposites in a sense, `inout` is something completely unrelated. For the most part, I consider `in` to be fixed. With the preview, it works exactly as one would expect. `out`, on the other hand, is nearly useless: in its current state, making `out` an alias for `ref` wouldn't be much of a breaking change.
Jan 26 2021
Tobias Pankrath <tobias+dlang pankrath.net> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }
already. It makes functions with `out` parameters so much more pleasant to use. Many argue that we should not overload D with even more features, but I'd say that if it makes D more fun to use and it is just syntactic sugar / a simple lowering, then we should consider it.
Jan 26 2021
Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]
A few thoughts. I like the concept of `out` applied to lvalues to catch things being used too early.

The concept of introducing a new variable *inside* an expression sounds like a nightmare. I think the following construct is not only easier to implement but also more generally applicable elsewhere in the language:

     if (out x; expr(x)) { }
     // lowers to
     out x; if (expr(x)) { }

I have left out any types from the above. Although deferred type inference could be very useful, it would also have to be considered very carefully.

Finally, this would be yet another thing that rhymes with dataflow analysis in the core language, so it needs to be specified carefully.
Jan 26 2021
Luhrel <lucien.perregaux gmail.com> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 General Idea
 ============

 The idea of an out variable is one that **must** be passed to a 
 function in an `out` parameter position.
 Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.
I would add "the icing on the cake": as DMD would know whether an `out` variable is initialized or not, we should be able to emit a generic error like "error: variable `d` is not initialized." for code like this:

```
class D
{
    int x;
    void foo() { }
}

void main()
{
    D d;
    d.foo(); // error: variable `d` is not initialized.
}
```

... instead of a raw crash with signal 11. That would clearly save some time.
 An `out` variable cannot be read until initialized by a 
 function call in an `out` parameter position. Since D has exact 
 evaluation order, it is easily determined that one usage of `x` 
 initializes it and another in the same overall expression reads 
 it (and not the other way around):

     out int x, y;
     /*1*/ if (h(x, y) > 0 && x < y) { .. }
     /*2*/ g(f(x), f(y), x, y);

 Evaluation order says in /*1*/ that h(x, y) is executed before 
 x and y are read for testing `x < y`.
 Evaluation order says in /*2*/ that f(x) and f(y) are executed 
 before x and y are read for passing them to g.

 Also, multiple execution paths can lead to different 
 initialization points:

     out int x, y, z;
     if (g(0)) { f(x); f(y); f(z); } else h(x, y);
     // x, y are initialized.
     g(x, y); // okay: x and y initialized on both branches
     g(z); // invalid: z might not be initialized.

 It is always possible to initialize `out` variables using an 
 ordinary assignment:

     out int x, y, z;
     if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
     g(z); // valid: z initialized on both branches
I imagine that it will still be possible to call f()/h() with a non-`out` variable?
 Templates
 =========

 Similar to `ref`, there will be `auto out` which infers `out` 
 based on the arguments passed.

 `auto out` can be combined with `ref`
`void f(T)(auto out ref T t);` ?
 (meaning pass by reference always, but if the argument is an 
 out value, this is its initialization) and `auto ref` (meaning 
 pass by reference if possible, and if the argument is an out 
 value, this is its initialization; it cannot be passed by value 
 and be initialized).


 With __traits(isOut, param) one can test whether `auto out` 
 boiled down to `out` or not.
ok.
 After being (potentially|definitely|?) initialized, `out` 
 variables do not trigger `auto out` to become `out`.
That doesn't make sense.
 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }
I don't like that idea. That makes the code more difficult to read.
 The type of an in-place out variable can be left out, when it 
 can be inferred from the called function. [Clearly it can be 
 done in some cases and clearly it cannot be in all templates. 
 Exact rules TBD.]
 In the first else branch, `x` can be used, since regardless 
 whether the `f(out int x) > 0 && x > 0` is true or false, 
 evaluating it will initialize `x`.
 In the second else branch, `x` cannot be used because `x` might 
 not be initialized if g(0) is false.
 The visibility of in-place out variables is limited to the 
 statement they're declared in. For `if` statements this 
 encompasses both branches, but for expression statements, it 
 only encompasses that expression:

     x = f(out a) + a; // valid
     y = f(out b);
     // y += b; // error, b not visible
     out int c;
     f(c);
     z += c; // valid
Meh, I really don't like the idea of declaring a variable inside a function call's argument list. Also, I don't think that it will be easy to implement.
 One obvious use-case is functions that return a bool value 
 indicating success and the result is an `out` parameter. 
 Usually, these functions' names begin with try:

     if (tryParseInt(str, out x)) { use(x); }

 Another could be unpacking:

     out T x;
     out S y;
     tuple.unpack(x, y);
     // or
     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }
Same as above.
 What do you think? Worth it?
Yes, except the `in-place`.
Jan 26 2021
Ogi <ogion.art gmail.com> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]
Is there any reason to use out parameters at all instead of returning a tuple?
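
To make the comparison concrete, here is a minimal sketch of the two styles in current D, using `std.typecons.Tuple` for the tuple version. The names `divMod` and `divModOut` are made up for illustration:

```d
import std.typecons : Tuple;

// Style 1: return both results as a tuple with named fields.
Tuple!(int, "quotient", int, "remainder") divMod(int a, int b)
{
    return typeof(return)(a / b, a % b);
}

// Style 2: deliver both results through `out` parameters.
void divModOut(int a, int b, out int quotient, out int remainder)
{
    quotient = a / b;
    remainder = a % b;
}

unittest
{
    auto r = divMod(17, 5);
    assert(r.quotient == 3 && r.remainder == 2);

    int q, rem; // must exist before the call
    divModOut(17, 5, q, rem);
    assert(q == 3 && rem == 2);
}
```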
Jan 27 2021
Max Haughton <maxhaton gmail.com> writes:
On Wednesday, 27 January 2021 at 09:34:36 UTC, Ogi wrote:
 On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]
Is there any reason to use out parameters at all instead of returning a tuple?
Struct ABI can mean overhead in places you don't expect
Jan 27 2021
Jacob Carlborg <doob me.com> writes:
On 2021-01-27 19:25, Max Haughton wrote:

 Struct ABI can mean overhead in places you don't expect
If proper tuples are built into the language, the language can invent its own ABI for that type, just like it does for arrays and delegates.

On the other hand, there are a bunch of existing C functions that encode out parameters as pointers. When declaring these in D, they can be declared with `out`, which is more descriptive and safer than a pointer. It shows the intent better.

-- /Jacob Carlborg
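
The point about C functions can be sketched with the C library's `frexp`, whose C prototype is `double frexp(double, int*)`. This is an illustration only. It assumes the platform's C math library is linked (as it normally is for D programs), and the parameter names are my own:

```d
// Hand-written binding: the C `int*` result parameter is declared as
// `out int`. At the ABI level an `out` parameter is passed as a pointer,
// so the D caller passes the address of `exponent`.
extern (C) double frexp(double value, out int exponent) nothrow @nogc;

unittest
{
    int e;
    double m = frexp(8.0, e); // 8.0 == 0.5 * 2^4
    assert(m == 0.5);
    assert(e == 4);
}
```

The caller no longer writes `&e`, and the signature documents that `exponent` is a pure result.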
Jan 29 2021
Afgdr <zerzre.rertert gmx.com> writes:
On Saturday, 30 January 2021 at 07:27:16 UTC, Jacob Carlborg 
wrote:
 On 2021-01-27 19:25, Max Haughton wrote:

 Struct ABI can mean overhead in places you don't expect
If proper tuples are built into the language, the language can invent its own ABI for that type, just like it does for arrays and delegates. On the other hand, there are a bunch of existing C functions that encode out parameters as pointers.
Totally expected. "out" and "ref" parameters are pointers in a backend. In a front end they are "just" an abstraction that allows special checks, typically: 1. accept only lvalues and 2. don't try implicit conversions. In addition, "out" adds a default initialization (to the type's `.init`) on function entry: since it is a kind of return value, it must be defined even if the callee does not modify it.
Jan 30 2021
Dukc <ajieskola gmail.com> writes:
On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 In current semantics, `out` is basically `ref` but with 
 documented intent.
It is more, at least potentially: an optimization aid. The calling function knows that the contents of the `out` variable won't affect the result, unlike with `ref`.
 General Idea
 ============

 The idea of an out variable is one that **must** be passed to a 
 function in an `out` parameter position. Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.
I don't like this. It is going to get annoying in cases like this:

```
int f(out int, int);

int func()
{
    out int x;
    if (someCond) x.f(0);
    else if (someOtherCond) x.f(1);
    return x;
}
```

What should the compiler do? It cannot know whether x can be returned uninitialized. It can issue an error just in case, and we hate to refactor code due to false alarms like that. Or it can ignore it, in which case the `out` storage class will sometimes work and sometimes silently fail. One is still going to need to void-initialize things to be sure to elide the default initialization.
 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }
This, however, sounds better. I'd only drop the requirement for the caller to specify `out`, and also allow the same for `ref` parameters.
Jan 28 2021
Q. Schroll <qs.il.paperinik gmail.com> writes:
On Thursday, 28 January 2021 at 09:17:47 UTC, Dukc wrote:
 int f(out int, int);

 int func()
 {  out int x;
    if(someCond) x.f(0);
    else if(someOtherCond) x.f(1);
    return x;
 }

 What should the compiler do? It cannot know whether it's 
 possible x can be returned uninitialized. It can issue an error 
 just in case, and we hate to refactor code due to false alarms 
 like that.
Exactly this. Something similar is true for an immutable constructor:

     struct S
     {
         int a;
         this(int x) immutable
         {
             if (x < 0) { a = 1; }
             else if (x > 0) { a = 2; }
             if (x == 0) a = 3; // error
         }
     }

This problem is inherent.
Jan 30 2021