digitalmars.D - [DIP idea] out variables

Q. Schroll <qs.il.paperinik gmail.com> writes:

Main goal: Make the `out` parameter storage class live up to 
promises.
In current semantics, `out` is basically `ref` but with 
documented intent. The initialization of the parameter is more 
like a detail.
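
For comparison, here is a minimal sketch of what `out` does in current D, as opposed to `ref`. The function names `reset` and `bump` are made up for illustration; the only language behavior relied on is that an `out` parameter is set to its type's `.init` value on function entry.

```d
// `out`: the argument is set to int.init (0) on entry,
// no matter what the caller stored in it before.
void reset(out int value) {}

// `ref`: the callee sees and may modify the caller's value.
void bump(ref int value) { value += 1; }

void main()
{
    int a = 42;
    bump(a);
    assert(a == 43); // ref read the old value and updated it
    reset(a);
    assert(a == 0);  // out wiped it, although reset assigns nothing
}
```

Nothing here stops the caller from reading `a` before the call, which is the gap the proposal is aimed at.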

General Idea
============

The idea of an out variable is one that **must** be passed to a 
function in an `out` parameter position. Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.

An `out` variable cannot be read until initialized by a function 
call in an `out` parameter position. Since D has exact evaluation 
order, it is easily determined that one usage of `x` initializes 
it and another in the same overall expression reads it (and not 
the other way around):

     out int x, y;
     /*1*/ if (h(x, y) > 0 && x < y) { .. }
     /*2*/ g(f(x), f(y), x, y);

Evaluation order says in /*1*/ that h(x, y) is executed before x 
and y are read for testing `x < y`.
Evaluation order says in /*2*/ that f(x) and f(y) are executed 
before x and y are read for passing them to g.

Also, multiple execution paths can lead to different 
initialization points:

     out int x, y, z;
     if (g(0)) { f(x); f(y); f(z); } else h(x, y);
     // x, y are initialized.
     g(x, y); // okay: x and y initialized on both branches
     g(z); // invalid: z might not be initialized.

It is always possible to initialize `out` variables using an 
ordinary assignment:

     out int x, y, z;
     if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
     g(z); // valid: z initialized on both branches


Templates
=========

Similar to `ref`, there will be `auto out` which infers `out` 
based on the arguments passed. `auto out` can be combined with 
`ref` (meaning pass by reference always, but if the argument is 
an out value, this is its initialization) and `auto ref` (meaning 
pass by reference if possible, and if the argument is an out 
value, this is its initialization; it cannot be passed by value 
and be initialized).

With __traits(isOut, param) one can test whether `auto out` 
boiled down to `out` or not.

After being (potentially|definitely|?) initialized, `out` 
variables do not trigger `auto out` to become `out`.


In-place `out` Variables
========================

When calling a function with an `out` parameter, instead of 
passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }

The type of an in-place out variable can be left out, when it can 
be inferred from the called function. [Clearly it can be done in 
some cases and clearly it cannot be in all templates. Exact rules 
TBD.]
In the first else branch, `x` can be used, since regardless of
whether `f(out int x) > 0 && x > 0` is true or false,
evaluating it will initialize `x`.
In the second else branch, `x` cannot be used because `x` might 
not be initialized if g(0) is false.
The visibility of in-place out variables is limited to the 
statement they're declared in. For `if` statements this 
encompasses both branches, but for expression statements, it only 
encompasses that expression:

     x = f(out a) + a; // valid
     y = f(out b);
     // y += b; // error, b not visible
     out int c;
     f(c);
     z += c; // valid

One obvious use case is functions that return a bool value
indicating success and deliver the result through an `out`
parameter. Usually, these functions' names begin with try:

     if (tryParseInt(str, out x)) { use(x); }
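
For reference, this is roughly how the same pattern has to be written in current D, where the variable must be declared ahead of the call. `tryParseInt` is a hypothetical helper, not a Phobos function; the sketch assumes it is built on `std.conv.to`.

```d
import std.conv : to, ConvException;

// Hypothetical helper: returns true on success and stores the number in result.
bool tryParseInt(string str, out int result)
{
    try
    {
        result = str.to!int;
        return true;
    }
    catch (ConvException)
    {
        // result still holds int.init, set on entry because of `out`
        return false;
    }
}

void main()
{
    int x; // current D: declared (and default-initialized) before the call
    if (tryParseInt("123", x))
        assert(x == 123);
    assert(!tryParseInt("abc", x));
    assert(x == 0);
}
```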

Another could be unpacking:

     out T x;
     out S y;
     tuple.unpack(x, y);
     // or
     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

What do you think? Worth it?

Jan 25 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]

 What do you think? Worth it?

in, out, and inout badly need reworking. There is a
preview for in, but sadly not for the others.

-Alex

Jan 25 2021

Q. Schroll <qs.il.paperinik gmail.com> writes:

On Tuesday, 26 January 2021 at 02:44:20 UTC, 12345swordy wrote:
 What do you think? Worth it?

 in, out, and inout badly need reworking. There is a
 preview for in, but sadly not for the others.

While in and out are opposites in a sense, inout is something 
completely unrelated.
For the most part, I consider `in` to be fixed. With the preview,
it works exactly as one would expect.
On the other hand, `out` is near useless: In the current state, 
making `out` an alias for `ref` wouldn't be that much of a 
breaking change.

Jan 26 2021

Tobias Pankrath <tobias+dlang pankrath.net> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }


I recently started using C# professionally which has this feature
already. It makes functions with out parameters so much more
pleasant to use.

Many argue that we should not overload D with even more features,
but I'd say, if it makes D more fun to use and it is just syntax
sugar / a simple lowering, then we should consider it.

Jan 26 2021

Max Haughton <maxhaton gmail.com> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]

A few thoughts,

I like the concept of out applied to lvalues to catch things 
being used too early.

The concept of introducing a new variable *inside* an expression
sounds like a nightmare. I think the following construct is not
only easier to implement but also more generally applicable
elsewhere in the language:

if(out x; expr(x))
{

}

-- lowers to --
out x;
if(expr(x))
{

}


I have left out any types from the above. Although deferred type
inference could be very useful, it would also have to be
considered very carefully.

Also, finally, this would be yet another thing that rhymes with 
dataflow analysis in the core language, so it needs to be 
specified carefully.

Jan 26 2021

Luhrel <lucien.perregaux gmail.com> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 General Idea
 ============

 The idea of an out variable is one that **must** be passed to a 
 function in an `out` parameter position.
 Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.


I would add "the icing on the cake": as DMD would know whether an
`out` variable is initialized or not, we should be able to throw
a generic error like "error: variable `d` is not initialized."
for code like this:

```
class D
{
     int x;
     void foo()
     {
     }
}

void main()
{
     D d;
     d.foo(); // error: variable `d` is not initialized.
}
```

... instead of a raw crash with signal 11.

That would clearly save some time.

 An `out` variable cannot be read until initialized by a 
 function call in an `out` parameter position. Since D has exact 
 evaluation order, it is easily determined that one usage of `x` 
 initializes it and another in the same overall expression reads 
 it (and not the other way around):

     out int x, y;
     /*1*/ if (h(x, y) > 0 && x < y) { .. }
     /*2*/ g(f(x), f(y), x, y);

 Evaluation order says in /*1*/ that h(x, y) is executed before 
 x and y are read for testing `x < y`.
 Evaluation order says in /*2*/ that f(x) and f(y) are executed 
 before x and y are read for passing them to g.

 Also, multiple execution paths can lead to different 
 initialization points:

     out int x, y, z;
     if (g(0)) { f(x); f(y); f(z); } else h(x, y);
     // x, y are initialized.
     g(x, y); // okay: x and y initialized on both branches
     g(z); // invalid: z might not be initialized.

 It is always possible to initialize `out` variables using an 
 ordinary assignment:

     out int x, y, z;
     if (g(0)) { /*as above*/ } else { h(x, y); z = 0; }
     g(z); // valid: z initialized on both branches

I imagine that it will still be possible to call f()/h() with a
non-`out` variable?

 Templates
 =========

 Similar to `ref`, there will be `auto out` which infers `out` 
 based on the arguments passed.

 `auto out` can be combined with `ref`

`void f(T)(auto out ref T t);` ?

 (meaning pass by reference always, but if the argument is an 
 out value, this is its initialization) and `auto ref` (meaning 
 pass by reference if possible, and if the argument is an out 
 value, this is its initialization; it cannot be passed by value 
 and be initialized).


 With __traits(isOut, param) one can test whether `auto out` 
 boiled down to `out` or not.

ok.

 After being (potentially|definitely|?) initialized, `out` 
 variables do not trigger `auto out` to become `out`.

That doesn't make sense.

 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }

I don't like that idea. That makes the code more difficult to 
read.

 The type of an in-place out variable can be left out, when it 
 can be inferred from the called function. [Clearly it can be 
 done in some cases and clearly it cannot be in all templates. 
 Exact rules TBD.]
 In the first else branch, `x` can be used, since regardless of
 whether `f(out int x) > 0 && x > 0` is true or false,
 evaluating it will initialize `x`.
 In the second else branch, `x` cannot be used because `x` might 
 not be initialized if g(0) is false.
 The visibility of in-place out variables is limited to the 
 statement they're declared in. For `if` statements this 
 encompasses both branches, but for expression statements, it 
 only encompasses that expression:

     x = f(out a) + a; // valid
     y = f(out b);
     // y += b; // error, b not visible
     out int c;
     f(c);
     z += c; // valid

Meh, I really don't like the idea of declaring a variable inside
a function's argument list.
Also, I don't think that it will be easy to implement.

 One obvious use case is functions that return a bool value
 indicating success and deliver the result through an `out`
 parameter. Usually, these functions' names begin with try:

     if (tryParseInt(str, out x)) { use(x); }

 Another could be unpacking:

     out T x;
     out S y;
     tuple.unpack(x, y);
     // or
     if (tuple.unpack(out a, out b) && condition(a, b)) { .. }

Same as above.

 What do you think? Worth it?

Yes, except the `in-place`.

Jan 26 2021

Ogi <ogion.art gmail.com> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]

Is there any reason to use out parameters at all instead of 
returning a tuple?
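
As a sketch of the alternative being asked about, a library tuple from `std.typecons` can already carry the success flag and the value together. `parseInt` and the field names `ok` and `value` are hypothetical and chosen only for this illustration.

```d
import std.conv : to, ConvException;
import std.typecons : Tuple;

// Hypothetical helper returning the success flag and the value in one tuple.
Tuple!(bool, "ok", int, "value") parseInt(string str)
{
    try
    {
        return typeof(return)(true, str.to!int);
    }
    catch (ConvException)
    {
        return typeof(return)(false, 0);
    }
}

void main()
{
    auto r = parseInt("123");
    assert(r.ok && r.value == 123);
    assert(!parseInt("abc").ok);
}
```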

Jan 27 2021

Max Haughton <maxhaton gmail.com> writes:

On Wednesday, 27 January 2021 at 09:34:36 UTC, Ogi wrote:
 On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 Main goal: Make the `out` parameter storage class live up to 
 promises.
 In current semantics, `out` is basically `ref` but with 
 documented intent. The initialization of the parameter is more 
 like a detail.

 [...]

 Is there any reason to use out parameters at all instead of 
 returning a tuple?

Struct ABI can mean overhead in places you don't expect

Jan 27 2021

Jacob Carlborg <doob me.com> writes:

On 2021-01-27 19:25, Max Haughton wrote:

 Struct ABI can mean overhead in places you don't expect

If proper tuples are built-in to the language the language can invent 
its own ABI for that type. Just like it does for arrays and delegates.

On the other hand, there are a bunch of existing C functions that
encode out parameters as pointers. When declaring these in D, they can
be declared with `out`, which is more descriptive and safer than a
pointer. It shows the intent better.
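
A minimal sketch of that idea, using the C standard library's `modf` as the example. The redeclaration below is an assumption made for illustration (druntime's `core.stdc.math` declares the second parameter as a pointer); it relies on `out` parameters being passed as pointers at the ABI level.

```d
// Hypothetical redeclaration for illustration only: C's
//     double modf(double x, double *intpart);
// declared with `out` instead of a pointer parameter.
extern (C) double modf(double x, out double intpart) nothrow @nogc;

void main()
{
    double whole;
    immutable frac = modf(3.25, whole); // no `&whole` at the call site
    assert(whole == 3.0);
    assert(frac == 0.25);
}
```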

-- 
/Jacob Carlborg

Jan 29 2021

Afgdr <zerzre.rertert gmx.com> writes:

On Saturday, 30 January 2021 at 07:27:16 UTC, Jacob Carlborg 
wrote:
 On 2021-01-27 19:25, Max Haughton wrote:

 Struct ABI can mean overhead in places you don't expect

 If proper tuples are built-in to the language the language can 
 invent its own ABI for that type. Just like it does for arrays 
 and delegates.

 On the other hand, there are a bunch of existing C functions
 that encode out parameters as pointers.

Totally expected. "out" and "ref" parameters in a backend are
pointers.
In a front end it's "just" an abstraction that allows special
checks, typically: 1. accept only lvalues and 2. don't try
implicit conversions.
In addition, for "out" it adds a zero-initialization before the
call: since it's a kind of return value, it must be defined even
if the callee does not modify it.

Jan 30 2021

Dukc <ajieskola gmail.com> writes:

On Tuesday, 26 January 2021 at 01:01:54 UTC, Q. Schroll wrote:
 In current semantics, `out` is basically `ref` but with 
 documented intent.

It is more, at least potentially: an optimization aid. The 
calling function knows that contents of the `out` variable won't 
affect the result, unlike with `ref`.

 General Idea
 ============

 The idea of an out variable is one that **must** be passed to a 
 function in an `out` parameter position. Basic example:

     int f(out int value);
     int g(int[] value...);
     int h(out int a, out int b);

     out int x;
     // g(x); // illegal: reads x, but x is not yet initialized.
     // h(x, x); // illegal:
         // reads the second x before the initialization of the
         // first x is complete.
     f(x); // initializes x.

I don't like this. It is going to get annoying in cases like this:

```
int f(out int, int);

int func()
{  out int x;
    if(someCond) x.f(0);
    else if(someOtherCond) x.f(1);
    return x;
}
```

What should the compiler do? It cannot know whether x can be
returned uninitialized. It can issue an error just in case, and
we hate to refactor code due to false alarms like that. Or it can
ignore it, in which case the `out` storage class will sometimes
work and sometimes silently fail. One is still going to need to
void-initialize stuff to be sure to elide the default
initialization.
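
For illustration, this is roughly what the void-initialized version looks like in current D. The body of `f` and the two `bool` parameters of `func` are assumptions added so that the sketch compiles; the post above only declares `int f(out int, int)`.

```d
// Assumed body for illustration; only the signature comes from the example above.
int f(out int value, int seed)
{
    value = seed + 10;
    return value;
}

int func(bool someCond, bool someOtherCond)
{
    int x = void;                   // skip the default initialization of x
    if (someCond) x.f(0);
    else if (someOtherCond) x.f(1);
    else x = -1;                    // the remaining path must be covered by hand
    return x;
}

void main()
{
    assert(func(true, false) == 10);
    assert(func(false, true) == 11);
    assert(func(false, false) == -1);
}
```

Without the final `else`, the compiler accepts the code and `x` may be returned with an indeterminate value.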


 In-place `out` Variables
 ========================

 When calling a function with an `out` parameter, instead of 
 passing an argument, a fresh variable can be declared instead:

     if (f(out int x) > 0 && x > 0) { g(x); } else { .. }
     if (g(0) && f(out x) > 0) { g(x); } else { .. }

This, however, sounds better. I'd only drop the requirement
for the caller to specify `out`, and also allow this for
`ref` parameters.

Jan 28 2021

Q. Schroll <qs.il.paperinik gmail.com> writes:

On Thursday, 28 January 2021 at 09:17:47 UTC, Dukc wrote:
 int f(out int, int);

 int func()
 {  out int x;
    if(someCond) x.f(0);
    else if(someOtherCond) x.f(1);
    return x;
 }

 What should the compiler do? It cannot know whether x can be
 returned uninitialized. It can issue an error just in case, and
 we hate to refactor code due to false alarms like that.

Exactly this. Something similar is true for an immutable 
constructor:

struct S
{
     int a;
     this(int x) immutable
     {
         if (x < 0) { a = 1; }
         else if (x > 0) { a = 2; }
         if (x == 0) a = 3; // error
     }
}
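
As a sketch of the usual workaround, the constructor can be restructured into a single if/else chain, so that every path initializes `a` exactly once:

```d
struct S
{
    int a;
    this(int x) immutable
    {
        // One chain: each path assigns `a` exactly once.
        if (x < 0) { a = 1; }
        else if (x > 0) { a = 2; }
        else { a = 3; }
    }
}

void main()
{
    auto s = immutable(S)(0);
    assert(s.a == 3);
    assert(immutable(S)(-5).a == 1);
    assert(immutable(S)(7).a == 2);
}
```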

This problem is inherent.

Jan 30 2021
