digitalmars.D - Feedback Thread: DIP 1040--Copying, Moving, and Forwarding--Community

Mike Parker (50/50) Mar 05 2021 This is the feedback thread for the first round of Community

Ben Jones (3/5) Mar 05 2021 Minor bug: The functions `gun` and `sun` in the last use section
Paul Backus (28/35) Mar 05 2021 This rule has the potential to cause "spooky action at a

Walter Bright (11/50) Mar 08 2021 It is possible to do the DFA through nested functions. I plan on doing t...

deadalnix (5/16) Mar 08 2021 That seems like the best approach. It allows for thing like fixed

tsbockman (6/8) Mar 05 2021 There should be at least one example demonstrating WHY a
deadalnix (212/214) Mar 06 2021 First, very good proposal (in fact, I sent something very similar

Walter Bright (14/71) Mar 08 2021 That's exactly how the @live analysis is currently implemented. I didn't...

Timon Gehr (9/22) Mar 08 2021 The general framework is also called abstract interpretation, where the

Imperatorn (2/17) Mar 08 2021 Just curious, is there any connection here to symbolic execution?

Tobias Pankrath (5/7) Mar 08 2021 When I look at the wikipedia article about symbolic execution, it
Timon Gehr (14/31) Mar 08 2021 Most program analysis is abstract interpretation in some way as it is,

Imperatorn (6/28) Mar 08 2021 Interesting. That sounds quite powerful. But what about

deadalnix (7/9) Mar 08 2021 I assumed so, but IMO this is very much adding to the

Mike Parker (8/11) Mar 08 2021 I would like to remind everyone of the Feedback Thread rules. All

Atila Neves (25/89) Mar 10 2021 Awesome! Now for the feedback.
tsbockman (25/33) Mar 10 2021 This paragraph seems garbled to me.
Dukc (29/31) Mar 16 2021 It is labourous for the reader to get the difference between move

Walter Bright (8/34) Mar 20 2021 There'll be some more text added to deal with Rvalue Constructors, that ...

Elronnd (6/10) Mar 17 2021 I think the order should be undefined. Forcing the moves to

Walter Bright (5/14) Mar 20 2021 The trouble with that is the user may inadvertently rely on the

deadalnix (14/15) Mar 18 2021 Looking at how the discussion is going, this seems to me like

Walter Bright (3/21) Mar 20 2021 While this is an attractive idea, (1) without some help from the program...

Walter Bright (16/16) Mar 20 2021 The Move Constructor is defined as:
Mike Parker (5/9) Mar 20 2021 This review round is complete. Thanks to everyone who
Manu (56/58) Mar 22 2021 I'm really late to the party here, and there's a lot to catch up, so may...

Max Haughton (16/23) Mar 22 2021 Thank you for the feedback, I'm still mulling over it, but the
deadalnix (4/7) Mar 23 2021 The extern(C++) case has not been discussed much so far. It goes

Manu (4/12) Mar 23 2021 The DIP mentions C++ compat, and it's bad. I think it would be best if t...

Max Haughton (2/20) Mar 23 2021 I think this is the right approach.

Mike Parker <aldacron gmail.com> writes:

This is the feedback thread for the first round of Community 
Review of DIP 1040, "Copying, Moving, and Forwarding".

===================================
**THIS IS NOT A DISCUSSION THREAD**

Posts in this thread must adhere to the feedback thread rules 
outlined in the Reviewer Guidelines (and listed at the bottom of 
this post).

https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

That document also provides guidelines on contributing feedback 
to a DIP review. Please read it before posting here. If you would 
like to discuss this DIP, please do so in the discussion thread:

https://forum.dlang.org/post/ncoqnixvllbjsxdzbyxi forum.dlang.org
==================================

You can find DIP 1040 here:

https://github.com/dlang/DIPs/blob/a9c553b0dbab1c2983a801b5e89b51c5c33d5180/DIPs/DIP1040.md

The review period will end at 11:59 PM ET on March 19, or when I 
make a post declaring it complete. Feedback posted to this thread 
after that point may be ignored.

At the end of this review round, the DIP will be moved into the 
Post-Community Round 1 state. Significant revisions resulting 
from this review round may cause the DIP manager to require 
another round of Community Review, otherwise the DIP will be 
queued for the Final Review.

==================================
Posts in this thread that do not adhere to the following rules 
will be deleted at the DIP author's discretion:

* All posts must be a direct reply to the DIP manager's initial 
post, with only two exceptions:

     - Any commenter may reply to their own posts to retract 
feedback contained in the original post

     - The DIP author may (and is encouraged to) reply to any 
feedback solely to acknowledge the feedback with agreement or 
disagreement (preferably with supporting reasons in the latter 
case)

* Feedback must be actionable, i.e., there must be some action 
the DIP author can choose to take in response to the feedback, 
such as changing details, adding new information, or even 
retracting the proposal.

* Feedback related to the merits of the proposal rather than to 
the contents of the DIP (e.g., "I'm against this DIP.") is 
allowed in Community Review (not Final Review), but must be 
backed by supporting arguments (e.g., "I'm against this DIP 
because..."). The supporting arguments must be reasonable. 
Obviously frivolous arguments waste everyone's time.

* Feedback should be clear and concise, preferably listed as 
bullet points (those who take the time to do an in-depth review 
and provide feedback in the form of answers to the questions in 
the documentation linked above will receive much gratitude). 
Information irrelevant to the DIP or which is not provided in 
service of clarifying the feedback is unwelcome.

Mar 05 2021

Ben Jones <fake fake.fake> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

Minor bug: The functions `gun` and `sun` in the last use section 
look like they're missing an S parameter.

Mar 05 2021

Paul Backus <snarwin gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

In the section on "Last Use":

 Module level functions that contain nested functions or lambda 
 functions that access an outer local variable will not be 
 subject to last use dataflow analysis. The reason is that a 
 pointer to the nested function or lambda could be escaped from 
 the containing function

This rule has the potential to cause "spooky action at a 
distance," where a change to one part of a function silently 
causes the meaning of seemingly-unrelated code in the same 
function to change. For example:

     struct EMO { /* ... */ }

     version (Before)
     void example()
     {
         int a;
         EMO b;
         fun(a);
         gun(b); // last use of b -> move
     }

     version (After)
     void example()
     {
         int a;
         EMO b;
         void helper() { fun(a); } // DFA disabled here
         helper();
         gun(b); // no DFA -> copy
     }

Ideally, only variables that are actually accessed from nested 
functions would have their last-use analysis disabled. In the 
example above, that would include `a`, which is accessed in 
`helper`, but not `b`.

Mar 05 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 3/5/2021 11:15 AM, Paul Backus wrote:
 On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community Review of DIP 
 1040, "Copying, Moving, and Forwarding".

 
 In the section on "Last Use":
 
 Module level functions that contain nested functions or lambda functions that 
 access an outer local variable will not be subject to last use dataflow 
 analysis. The reason is that a pointer to the nested function or lambda could 
 be escaped from the containing function

 
 This rule has the potential to cause "spooky action at a distance," where a 
 change to one part of a function silently causes the meaning of 
 seemingly-unrelated code in the same function to change. For example:
 
      struct EMO { /* ... */ }
 
      version (Before)
      void example()
      {
          int a;
          EMO b;
          fun(a);
          gun(b); // last use of b -> move
      }
 
      version (After)
      void example()
      {
          int a;
          EMO b;
          void helper() { fun(a); } // DFA disabled here
          helper();
          gun(b); // no DFA -> copy
      }
 
 Ideally, only variables that are actually accessed from nested functions would 
 have their last-use analysis disabled. In the example above, that would
include 
 `a`, which is accessed in `helper`, but not `b`.

It is possible to do the DFA through nested functions. I plan on doing that
with 
 live functions.

What the DIP should make clear, however, is that changing copies to moves
should 
be regarded as implementation dependent, i.e. it is a mistake if the user 
attempts to rely on a particular sequence of moves and copies. Much like
relying 
on this for NRVO is a mistake. NRVO has always been an optional optimization, 
and if it happens or not is difficult to determine just by looking at the code. 
In practice, the only problem that has cropped up has been pressure to find
more 
cases where NRVO can be applied, so I think we're good with labeling the 
copy=>move semantics as an implementation-dependent optimization.

Mar 08 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 8 March 2021 at 10:27:07 UTC, Walter Bright wrote:
 What the DIP should make clear, however, is that changing 
 copies to moves should be regarded as implementation dependent, 
 i.e. it is a mistake if the user attempts to rely on a 
 particular sequence of moves and copies. Much like relying on 
 this for NRVO is a mistake. NRVO has always been an optional 
 optimization, and if it happens or not is difficult to 
 determine just by looking at the code. In practice, the only 
 problem that has cropped up has been pressure to find more 
 cases where NRVO can be applied, so I think we're good with 
 labeling the copy=>move semantics as an 
 implementation-dependent optimization.

That seems like the best approach. It allows for thing like fixed 
point analysis to bail in case it needs too many iterations. In 
this case, the list of cases presented should be presented as 
examples rather than spec.

Mar 08 2021

tsbockman <thomas.bockman gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

There should be at least one example demonstrating WHY a 
non-default move constructor might be useful, not just a 
syntactical description of HOW to declare one. How about 
including an example struct with an interior pointer that needs 
to be updated whenever it is moved or copied?

Mar 05 2021

deadalnix <deadalnix gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

First, very good proposal (in fact, I sent something very similar 
to Andrei, Walter and Atila in 2019, and I hope this had some 
influence over this proposal). For completeness, I will paste 
this email at the end of this reply.

It cover some subtle but important point, notably:
  - The argument is invalid after this move, and is not destructed.
  - Moving an EMO means that the responsibility for destruction of 
that value moves along with it.

This is something I stressed out at the time, because it has 
profound consequences at the ABI level. It is unfortunate that 
this design goal - while achieved - has dropped from the 
rationale. To stress out how important it is, let me contrast 
with C++.

In C++, the caller remains in charge of the destruction of an 
object. This means that structs that are not trivially 
destructible, movable or copyable need to be passed by reference. 
This introduce many indirections in the generated code, but, 
worse, adds a ton of pressure on the alias analysis, which will 
prevent the optimizer to do its job properly.

The second downside is that structs always need to have a "null" 
state. They need to remain destructible, even after the callee 
move the object. This forces the callee to do extra work in order 
to set the moved object into a null state and extra work from the 
caller to check for the null state during destruction. A concrete 
example of this is for instance that a lock which locks a mutex 
in its constructor and unlocks it in its destructor now must 
check for the mutex being nulled in its destruction, introducing 
an unecessary branch in all path that unlock the mutex.

These are serious downside in the C++ way, and this proposal 
should make it explicit in its rationale that both problems must 
be avoided:
  - Ability to pass non POD as value even at the ABI level.
  - Ability to have movable structs without a "null" state.

Another aspect of the proposal that needs improvement is the 
"Last Use" section. the description of this last use is 
incomplete - and maybe doesn't adopt the right approach, because 
listing all scenarii is always going to be a challenge. One such 
scenario is the case of divergent branches:

S s;
if (...) {
     // Even though it is
     fun(s);
} else {
     // ...
}

The way to cover all base is to go for a fixed point analysis. A 
fixed point analysis approach would go ass follow: define a state 
for all local variable, then follow the control flow updating the 
state of said variable. When a merge happen in the control flow, 
verify that both both path through which you arrive at the merge 
point have a different state, then apply a fix and reprocess 
affected branches of the control flow, up to the point all merge 
point are consistent - you reached a fixed point.

Adopting such an approach will ensure the behavior is not only 
defined for current control flow constructs, but also whatever 
might come up in the future. In the pasted email, I go over in 
more detail about what a fixed point analysis could look like for 
this. While it is likely not formal enough to make it into the 
DIP as this, it makes for a good starting point, IMO.

======

Hi both of you,

This is a topic I discussed with Atila and Andrei informally, and 
now I'd to loop you in Walter. If this gains traction, it'll be 
time for a DIP, but I'd really like to gather some feedback 
before, as putting everything in a format that is formal enough 
is seriously time consuming.

As I assume Walter already knows, C++ copy/destruction semantic 
is actually quite an important barrier to optimization. For the 
sake of making the point, I'm going to use unique_ptr and 
shared_ptr as example, and what we imagine equivalent in D would 
look like. We will also consider that the compiler inline 
ctor/dtor and copies.

Let's consider this sample code:

void foo(unique<T> ptr):

unique<T> t = ...;
foo(move(t));

The first thing we consider s that unique<T>, as well as 
shared<T> are not PODs. This is rather obvious, as this is the 
whole point of having them, if we wanted a POD, we'd use a raw 
pointer. In C++, non POD must have an address. in D, they must 
have an address for the ctor to run, but after this, the compiler 
is free to do whatever it wants as D struct are movable by 
default.

An interesting side effect of that fact is that non POD are 
ALWAYS passed by reference in C++, never by value. The compiler 
creates a temporary in the caller in which the copy is stored. 
The code is lowered to something that look like this:

foo(T** ptrref);

T* t = ...;
T* tmp = nullptr;
swap(t, tmp);
foo(&tmp);
if (tmp != null) destroy(tmp);
if (t != null) destroy(t);

The compiler will have no problem figuring out that t is always 
null, and therefore optimize away its destruction, but the tmp 
one remains. A second order problem is that EVERY struct must 
have some sort of 'null' state. While this is fairly obvious what 
it is for a pointer (!) it is the source of many trap when it 
comes to correctness for others types of structs. Imagine a lock, 
for instance, it now has to check every time if the mutex it 
refers to has been nulled or not, introducing a branch every time 
you unlock the mutex.

Now immagine that foo does something like:

void foo(unique<T> ptr) {
   global = move(ptr);
}

Now foo end up having to go through an indirection to get the 
value of ptr, which right there is 3 cycles at best, a cache miss 
at worse. But worse, it MUST store null into ptr that the caller 
will then dutifully check is null and decide to not destroy 
anything. t will do so in 100% of the case, yet the compiler is 
unable to do anything to optimize it away. In addition, store to 
load forwarding has some bizarre perf edge cases on many CPUs 
(that is, loading an address that is still in the store buffer 
and hasn't hit cache yet) so it is something you'd like to see 
optimized.

If we replace the unique<T> by a shared<T> in the example, it 
gets worse because the increment/decrement operation cannot be 
optimized away at all. This result in code where you got to 
sprinkle calls to move all over the place constantly and it's a 
guarantee that you'll forget to put it sometime and end up with a 
ton of unexpected copies.

Here is what I propose.

1/ The ABI for non POD is the exact same as for POD. it means 
that the responsibility of the destruction falls on the the foo 
method in our example, not on the caller. This means it can be 
passed in registers in the case of unique ptr. The obvious gotcha 
is that the adress of the this pointer will be different in foo 
that it was when the copy constructor was called. D provides no 
guarantee on that front anyways, so I think we are good on that 
front.

Another side effect is that we lose the "russian doll" effect of 
construction destruction when things are moved to another 
function. However, by specifying that the copy need to be made in 
LTR order for parameters, and destruction from the last parameter 
to the first, we end up with this remaining true almost all the 
time, which is probably the right tradeof.

In our example, it means that foo can receive the unique<T> in a 
register, then the compiler can see that the value when we exit 
the function is always null and can optimize away the destruction.

2/ Decide when to copy using the following algorithm:

We go over the function body, keeping track of a status per 
lvalue that exist: destroyed, alive or consumed. All parameters, 
globals, or value reached through indirections are considered 
initially alive. All locals are considered destroyed initially, 
then alive once the constructor as run/initial value is set.

When an lvalue is converted into an rvalue, for instance when 
returned, when passed as argument, when assigned to another 
lvalue, etc... the following happens:
  - if it is alive, it is marked consumed.
  - if it is consumed, then we backtrack to the point at which it 
was consumed and insert a copy operation. The lvalue is 
reconsidered alive after the copy, the the algorithm restart from 
there.
  - if it is destroyed, then this is an error (this should 
basically never happen and is likely a compiler bug).

When an lvalue needs to be destroyed, the compiler does as follow:
  - if it is alive, call the desructor.
  - if it is consumed, do nothing.
  - if it is already destroyed, then once again, you probably have 
a compiler bug.

This is relatively straightforward, but there is one more tricky 
situation, is when you reach a "merge" point i the control flow, 
for instance, after an if/then/else construct. In this case, a 
variable can have different state for each codepath. The 
resolution happens as follow:
  - alive/alive : all good.
  - alive/destroyed: error. Some codepath did not initialize the 
variable or you have  compiler bug.
  - alive/consume: backtrack the consume path and insert copies. 
The lvalue marked alive.
  - consume/destroyed: Mark destroyed.

When an lvalue is assigned to, it's previous value is move to a 
temporary and then destroyed immediately after the assignment 
finishes. The temporary also keeps the state so destruction only 
occurs when the variable was alive.

All rvalues must be immediately returned, assigned, passed as 
argument or destroyed by calling the constructor.

Existing optimizations such as RVO naturaly arise out of this 
process, but this is much more generic and powerful, as it 
generate the minimal number of copies possible assuming structs 
are movable, which is what we want.

In case of loops, the state tracked by the algorithm at the end 
of the loop can affect the merge at the top of the loop, so you 
may have to pass over the loop twice. Same wise, backtracking may 
cause you to have to go over some code twice, but if you play 
with the idea you'll figure out that it converge pretty damn 
quickly and that it is not easy to come up with example requiring 
many passes, and it ALWAYS converges, if for no other reason that 
it inserted copy at each point a C++ compiler would.

In practice, this means that

shared<T> t = ...;
foo(t);

Would simple move t's ownership to foo and never generate a copy. 
However, if later one does

shared<T> t = ...;
foo(t);
bar(t);

Then the call to foo will be patched by the compiler to insert a 
copy. In both cases, t's destructor will never be called because 
it is consumed by the end of the control flow, so destruction is 
a noop.

This is really where we need to go IMO. We avoid so many copy by 
basically moving by default that and solution such as a RC 
solution will be significantly better than C++'s, and on par with 
stuff like ARC.

In addition, the same algorithm can be used to ensure that all 
fields are initialized for constructors. The start in the 
destroyed state and must all be in the alive state by the end of 
the constructor's execution.

Mar 06 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 3/6/2021 12:28 PM, deadalnix wrote:
 First, very good proposal (in fact, I sent something very similar to Andrei, 
 Walter and Atila in 2019, and I hope this had some influence over this 
 proposal). For completeness, I will paste this email at the end of this reply.
 
 It cover some subtle but important point, notably:
   - The argument is invalid after this move, and is not destructed.
   - Moving an EMO means that the responsibility for destruction of that value 
 moves along with it.
 
 This is something I stressed out at the time, because it has profound 
 consequences at the ABI level. It is unfortunate that this design goal - while 
 achieved - has dropped from the rationale. To stress out how important it is, 
 let me contrast with C++.
 
 In C++, the caller remains in charge of the destruction of an object. This
means 
 that structs that are not trivially destructible, movable or copyable need to
be 
 passed by reference. This introduce many indirections in the generated code, 
 but, worse, adds a ton of pressure on the alias analysis, which will prevent
the 
 optimizer to do its job properly.

 The second downside is that structs always need to have a "null" state. They 
 need to remain destructible, even after the callee move the object. This
forces 
 the callee to do extra work in order to set the moved object into a null state 
 and extra work from the caller to check for the null state during destruction.
A 
 concrete example of this is for instance that a lock which locks a mutex in
its 
 constructor and unlocks it in its destructor now must check for the mutex
being 
 nulled in its destruction, introducing an unecessary branch in all path that 
 unlock the mutex.
 
 These are serious downside in the C++ way, and this proposal should make it 
 explicit in its rationale that both problems must be avoided:
   - Ability to pass non POD as value even at the ABI level.
   - Ability to have movable structs without a "null" state.

Ok, good idea, although to be clear this is implicit in the Description.

 Another aspect of the proposal that needs improvement is the "Last Use"
section. 
 the description of this last use is incomplete - and maybe doesn't adopt the 
 right approach, because listing all scenarii is always going to be a
challenge. 
 One such scenario is the case of divergent branches:
 
 S s;
 if (...) {
      // Even though it is
      fun(s);
 } else {
      // ...
 }
 
 The way to cover all base is to go for a fixed point analysis. A fixed point 
 analysis approach would go ass follow: define a state for all local variable, 
 then follow the control flow updating the state of said variable. When a merge 
 happen in the control flow, verify that both both path through which you
arrive 
 at the merge point have a different state, then apply a fix and reprocess 
 affected branches of the control flow, up to the point all merge point are 
 consistent - you reached a fixed point.

That's exactly how the  live analysis is currently implemented. I didn't know
it 
was called a fixed point analysis.

The downside of this is the data flow analysis in  live is a bit expensive. 
Perhaps the DIP should allow for the DFA being optional, and assume worst case 
(i.e. never last use) for an initial implementation.


 Adopting such an approach will ensure the behavior is not only defined for 
 current control flow constructs, but also whatever might come up in the
future. 
 In the pasted email, I go over in more detail about what a fixed point
analysis 
 could look like for this. While it is likely not formal enough to make it into 
 the DIP as this, it makes for a good starting point, IMO.

The idea with DFA is to not think in terms of if, while, foreach, etc. But to 
think instead of flow as nodes connected by edges. In other words, replace all 
the structured code with goto's. Anything that can be lowered to gotos will
then 
work naturally.

Andrei and I have discussed "last use" optimizations for over a decade :-) but
I 
always balked because it needed DFA.

Thanks for reposting the email. It's full of good information.

Mar 08 2021

Timon Gehr <timon.gehr gmx.ch> writes:

On 08.03.21 11:21, Walter Bright wrote:
 The way to cover all base is to go for a fixed point analysis. A fixed 
 point analysis approach would go ass follow: define a state for all 
 local variable, then follow the control flow updating the state of 
 said variable. When a merge happen in the control flow, verify that 
 both both path through which you arrive at the merge point have a 
 different state, then apply a fix and reprocess affected branches of 
 the control flow, up to the point all merge point are consistent - you 
 reached a fixed point.

 
 That's exactly how the  live analysis is currently implemented. I didn't 
 know it was called a fixed point analysis.

The general framework is also called abstract interpretation, where the 
merge operation is called a join of abstract elements. There's also a 
meet operation that can introduce information from conditionals into the 
state. (E.g. if you want to do value range propagation this can be 
useful.) In general, abstract interpretation does not converge to a 
fixed point in finite time, then you need to introduce some sort of 
widening operator. In practice, this is often the main challenge to make 
it work.

Mar 08 2021

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Monday, 8 March 2021 at 11:28:53 UTC, Timon Gehr wrote:
 On 08.03.21 11:21, Walter Bright wrote:
 [...]

 
 That's exactly how the  live analysis is currently 
 implemented. I didn't know it was called a fixed point 
 analysis.

 The general framework is also called abstract interpretation, 
 where the merge operation is called a join of abstract 
 elements. There's also a meet operation that can introduce 
 information from conditionals into the state. (E.g. if you want 
 to do value range propagation this can be useful.) In general, 
 abstract interpretation does not converge to a fixed point in 
 finite time, then you need to introduce some sort of widening 
 operator. In practice, this is often the main challenge to make 
 it work.

Just curious, is there any connection here to symbolic execution?

Mar 08 2021

Tobias Pankrath <tobias+dlang pankrath.net> writes:

On Monday, 8 March 2021 at 14:01:33 UTC, Imperatorn wrote:
 Just curious, is there any connection here to symbolic 
 execution?

When I look at the wikipedia article about symbolic execution, it 
looks a lot like symbolic model checking, and Patrick Cousot has 
somewhere shown that model checking can be seen as a subform of 
abstract interpretation.

Mar 08 2021

Timon Gehr <timon.gehr gmx.ch> writes:

On 08.03.21 15:01, Imperatorn wrote:
 On Monday, 8 March 2021 at 11:28:53 UTC, Timon Gehr wrote:
 On 08.03.21 11:21, Walter Bright wrote:
 [...]

 That's exactly how the  live analysis is currently implemented. I 
 didn't know it was called a fixed point analysis.

 The general framework is also called abstract interpretation, where 
 the merge operation is called a join of abstract elements. There's 
 also a meet operation that can introduce information from conditionals 
 into the state. (E.g. if you want to do value range propagation this 
 can be useful.) In general, abstract interpretation does not converge 
 to a fixed point in finite time, then you need to introduce some sort 
 of widening operator. In practice, this is often the main challenge to 
 make it work.

 
 Just curious, is there any connection here to symbolic execution?

Most program analysis is abstract interpretation in some way as it is, 
well, abstract.

You can consider symbolic execution as an abstract domain consisting of 
symbolic formulas such that the concretization function γ maps each 
symbolic formula to the set of program states that satisfy it. Possible 
choices for meet and join are symbolic conjunction and disjunction.



fully model some aspects of the program semantics you will get the 


Widening for this domain is invariant synthesis and if you do it 
(automated or manually), you get something that's getting close to 
what's usually called a program verifier instead of a static analyzer.

Mar 08 2021

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Monday, 8 March 2021 at 17:42:16 UTC, Timon Gehr wrote:
 On 08.03.21 15:01, Imperatorn wrote:
 On Monday, 8 March 2021 at 11:28:53 UTC, Timon Gehr wrote:
 [...]

 
 Just curious, is there any connection here to symbolic 
 execution?

 Most program analysis is abstract interpretation in some way as 
 it is, well, abstract.

 You can consider symbolic execution as an abstract domain 
 consisting of symbolic formulas such that the concretization 
 function γ maps each symbolic formula to the set of program 
 states that satisfy it. Possible choices for meet and join are 
 symbolic conjunction and disjunction.



 execution does not fully model some aspects of the program 
 semantics you will get the ordinary soundness condition f[γ(φ)] 


 Widening for this domain is invariant synthesis and if you do 
 it (automated or manually), you get something that's getting 
 close to what's usually called a program verifier instead of a 
 static analyzer.

Interesting. That sounds quite powerful. But what about 
undecidability? Wait... The abstract semantics is a super set 
here right? Hmm, so that's why that works.

This is actually kinda cool.

Do you have any good books on the subject to recommend?

Mar 08 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 8 March 2021 at 10:21:37 UTC, Walter Bright wrote:
 Ok, good idea, although to be clear this is implicit in the 
 Description.

I assumed so, but IMO this is very much adding to the 
rationale/requirement part of the DIP, because this will be 
forgotten or overlooked by most readers. When it is in there then 
it is easy to point further discussions that break this 
assumption to it. Without it, this has to be explained again and 
again.

Mar 08 2021

Mike Parker <aldacron gmail.com> writes:

On Monday, 8 March 2021 at 21:50:19 UTC, deadalnix wrote:
 On Monday, 8 March 2021 at 10:21:37 UTC, Walter Bright wrote:
 Ok, good idea, although to be clear this is implicit in the 
 Description.


I would like to remind everyone of the Feedback Thread rules. All 
posts must be a reply to my original post and must provide 
actionable feedback. DIP authors can reply to feedback, but there 
should be no replies to their posts unless they ask for clarity. 
Discussions belong in the discussion thread.

I’ll let all the posts above this one stand, but I will delete 
any further such posts.

Mar 08 2021

Atila Neves <atila.neves gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

 ===================================
 **THIS IS NOT A DISCUSSION THREAD**

 Posts in this thread must adhere to the feedback thread rules 
 outlined in the Reviewer Guidelines (and listed at the bottom 
 of this post).

 https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

 That document also provides guidelines on contributing feedback 
 to a DIP review. Please read it before posting here. If you 
 would like to discuss this DIP, please do so in the discussion 
 thread:

 https://forum.dlang.org/post/ncoqnixvllbjsxdzbyxi forum.dlang.org
 ==================================

 You can find DIP 1040 here:

 https://github.com/dlang/DIPs/blob/a9c553b0dbab1c2983a801b5e89b51c5c33d5180/DIPs/DIP1040.md

 The review period will end at 11:59 PM ET on March 19, or when 
 I make a post declaring it complete. Feedback posted to this 
 thread after that point may be ignored.

 At the end of this review round, the DIP will be moved into the 
 Post-Community Round 1 state. Significant revisions resulting 
 from this review round may cause the DIP manager to require 
 another round of Community Review, otherwise the DIP will be 
 queued for the Final Review.

 ==================================
 Posts in this thread that do not adhere to the following rules 
 will be deleted at the DIP author's discretion:

 * All posts must be a direct reply to the DIP manager's initial 
 post, with only two exceptions:

     - Any commenter may reply to their own posts to retract 
 feedback contained in the original post

     - The DIP author may (and is encouraged to) reply to any 
 feedback solely to acknowledge the feedback with agreement or 
 disagreement (preferably with supporting reasons in the latter 
 case)

 * Feedback must be actionable, i.e., there must be some action 
 the DIP author can choose to take in response to the feedback, 
 such as changing details, adding new information, or even 
 retracting the proposal.

 * Feedback related to the merits of the proposal rather than to 
 the contents of the DIP (e.g., "I'm against this DIP.") is 
 allowed in Community Review (not Final Review), but must be 
 backed by supporting arguments (e.g., "I'm against this DIP 
 because..."). The supporting arguments must be reasonable. 
 Obviously frivolous arguments waste everyone's time.

 * Feedback should be clear and concise, preferably listed as 
 bullet points (those who take the time to do an in-depth review 
 and provide feedback in the form of answers to the questions in 
 the documentation linked above will receive much gratitude). 
 Information irrelevant to the DIP or which is not provided in 
 service of clarifying the feedback is unwelcome.

Awesome! Now for the feedback.

I think the DIP would be helped by links to definitions that come 
after their first usage, or being restructured so that such links 
are no longer necessary. There are multiple cases, but the first 
example I ran into was "The argument is invalid after this move, 
and is not destructed.", and I wanted to write "what does 
'invalid' mean?". As it turns out it's defined later on in the 
document.

 Introduces the notion of a Moveable Reference

But then the DIP refers to these as "move refs" afterwards.

 where s is constructed into the parameter.

This confused me; it also seems to me that deleting it doesn't 
alter what the section is trying to say.

 A Move Constructor that throws is illegal.

I assume this means it'll fail to compile?

 The Move Assignment Operator is nothrow, even if nothrow is not 
 explicitly specified.

And also illegal if it happens to throw?

 An EMO is a struct that has both a Move Constructor and a Move 
 Assignment Operator

Including a compiler-generated version of either function?

 A Move Ref is a parameter that is a reference to an EMO

I eventually understood what this meant, but this confused me 
when I read it the first time. I'd reword it to mention that the 
syntax looks like a by-value parameter but ends up being passed 
by reference. It also confused me that the 2nd function had `ref` 
in there.

  If NRVO cannot be performed, s is copied to the return value 
 on the caller's stack.

Why is it copied instead of moved?

 When the call is made to f(), a copy is made.

I guess this only applies to f(S())?

 While it appears that C++ best practice is to use rvalue 
 references instead of values for parameters

Not always, sometimes the best practice can be to pass by value 
then move if a copy has to be made anyway: 
https://stackoverflow.com/questions/16724657/why-do-we-copy-then-move

Mar 10 2021

tsbockman <thomas.bockman gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

 From the DIP:
 A Move Assignment Operator is a struct member assignment 
 operator that moves, rather than copies, the argument 
 corresponding to its first parameter into the constructed 
 object. After the move is complete, the destructor is called on 
 the original contents of the constructed object. The argument 
 is invalid after this move, and is not destructed.

This paragraph seems garbled to me.

- What is "the constructed object"? It's ambiguous: presumably 
both `this` and the source argument to the move `opAssign` were 
initialized at some time in the past, otherwise why call the move 
`opAssign` instead of the move constructor? Generally, the phrase 
"the constructed object" is terribly confusing and for clarity 
should be replaced throughout the DIP with less ambiguous 
phrasing, such as "the source of the move" or "the destination of 
the move".

- "After the move is complete, the destructor is called on the 
original contents of the constructed object." This makes no 
sense: the original contents of the destination of the move 
cannot be destroyed *after* the move, as it has already been 
overwritten by the move, and is gone. It would have to either be 
destroyed *before* the move, or copied to a temporary first, 
which seems like a pointless waste.

- If the destination is to be destroyed before the move, then 
there seems to be no need for the separate move `opAssign` at 
all; the compiler should instead just call the destructor on the 
destination, if it has already been initialized, before calling 
the move constructor.

- If there is some good reason for the distinct move `opAssign` 
that I have missed, it should be explained clearly in the DIP.

Mar 10 2021

Dukc <ajieskola gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 This is the feedback thread for the first round of Community 
 Review of DIP 1040, "Copying, Moving, and Forwarding".

It is labourous for the reader to get the difference between move 
and copy constructors, which is that copy constructors have a 
`ref` parameter. It'd be easier if this was explicitly mentioned.

I don't like the term "move assignment operator". The operator is 
the same assignment operator we already have, no need to rename 
it. Just talk about how identity assignments are handled when the 
argument can be moved. You can call that a move assignment, but 
leave the trailing "operator" out, since there's no new operator.

There's no discussion how a move constructor will interact with a 
possible `opPostMove` in a struct or one of it's fields (DIP1014).

As for all the stuff about EMOs, I don't like that at all. We 
already allow the compiler to move a struct instead of copying 
it, if it can detect a last use. And also we have the possibility 
to manually elide copying via `std.algorithm.move`. If I recall 
TDPL book correctly, returning a value or passing a Rvalue to a 
function is already guaranteed to move. And those rules are for 
all structs, not just for those that have some fancy operator 
overloads.

I also don't like all the rules about detecting last uses. Those 
are optimizations, not something the language spec should 
complicate itself for.

There is a breaking change that hasn't been mentioned. An 
identity assignment with no `ref` for the argument, or an `auto 
ref` argument, is going to become a move assignment. Thus it will 
break if it throws something.

All in all, I like the idea of move constructor - it is the same 
improvement over `opPostMove` that copy constructors are over 
postblits. But I cannot say the same about rest of the DIP.

Mar 16 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 3/16/2021 10:19 AM, Dukc wrote:
 It is labourous for the reader to get the difference between move and copy 
 constructors, which is that copy constructors have a `ref` parameter. It'd be 
 easier if this was explicitly mentioned.

There'll be some more text added to deal with Rvalue Constructors, that should 
take care of this.

 I don't like the term "move assignment operator". The operator is the same 
 assignment operator we already have, no need to rename it. Just talk about how 
 identity assignments are handled when the argument can be moved. You can call 
 that a move assignment, but leave the trailing "operator" out, since there's
no 
 new operator.

It's good to distinguish between a move and a copy.


 There's no discussion how a move constructor will interact with a possible 
 `opPostMove` in a struct or one of it's fields (DIP1014).

https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1014.md

Although #DIP1014 was approved, this DIP replaces it. That should be covered in 
the text.


 As for all the stuff about EMOs, I don't like that at all. We already allow
the 
 compiler to move a struct instead of copying it, if it can detect a last use. 
 And also we have the possibility to manually elide copying via 
 `std.algorithm.move`. If I recall TDPL book correctly, returning a value or 
 passing a Rvalue to a function is already guaranteed to move. And those rules 
 are for all structs, not just for those that have some fancy operator
overloads.
 
 I also don't like all the rules about detecting last uses. Those are 
 optimizations, not something the language spec should complicate itself for.
 
 There is a breaking change that hasn't been mentioned. An identity assignment 
 with no `ref` for the argument, or an `auto ref` argument, is going to become
a 
 move assignment. Thus it will break if it throws something.

Only if the move assignment is defined, so it shouldn't be breaking.

 All in all, I like the idea of move constructor - it is the same improvement 
 over `opPostMove` that copy constructors are over postblits. But I cannot say 
 the same about rest of the DIP.

Mar 20 2021

Elronnd <elronnd elronnd.net> writes:

 If a Move Constructor is not defined for a struct that has a 
 Move Assignment Operator, a default Move Constructor is defined 
 and implemented as a move for each of its fields, in lexical 
 order.

I think the order should be undefined.  Forcing the moves to 
happen in lexical order may inhibit some optimizations; and code 
which relies on the order of implicit moves is fragile anyway and 
shouldn't be encouraged.  If the order is important for 
legitimate reasons, then those reasons should be made explicit in 
the code with an explicit move constructor.

Mar 17 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 3/17/2021 11:14 PM, Elronnd wrote:
 If a Move Constructor is not defined for a struct that has a Move Assignment 
 Operator, a default Move Constructor is defined and implemented as a move for 
 each of its fields, in lexical order.

 
 I think the order should be undefined.  Forcing the moves to happen in
lexical 
 order may inhibit some optimizations; and code which relies on the order of 
 implicit moves is fragile anyway and shouldn't be encouraged.  If the order
is 
 important for legitimate reasons, then those reasons should be made explicit
in 
 the code with an explicit move constructor.

The trouble with that is the user may inadvertently rely on the 
implementation-defined order, and his code will break when using another 
implementation.

The optimization advantage, if any, would be slight and not worth the risk.

Mar 20 2021

deadalnix <deadalnix gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 [...]

Looking at how the discussion is going, this seems to me like 
this DIP should be split into two DIPs because it's really doing 
two things. One of these things seems to generally be agreed 
upon, while the other is quite intensively discussed.

1/ The transformation of copy/destruct couples into move if the 
object isn't used in between the copy and the destruction. This 
seems to be fairly uncontroversial and the proposal is sound.

2/ The addition of new move constructor/assignment operators. 
There seems to be a lot of discussion around these, and I am 
myself not convinced that the design is sound.

Splitting this DIP in two would allow to move forward with 1/ 
relatively quickly and provide value to D user right away, while 
providing time and space for 2/ to ironed out.

Mar 18 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 3/18/2021 4:22 PM, deadalnix wrote:
 On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 [...]

 
 Looking at how the discussion is going, this seems to me like this DIP should
be 
 split into two DIPs because it's really doing two things. One of these things 
 seems to generally be agreed upon, while the other is quite intensively
discussed.
 
 1/ The transformation of copy/destruct couples into move if the object isn't 
 used in between the copy and the destruction. This seems to be fairly 
 uncontroversial and the proposal is sound.
 
 2/ The addition of new move constructor/assignment operators. There seems to
be 
 a lot of discussion around these, and I am myself not convinced that the
design 
 is sound.
 
 Splitting this DIP in two would allow to move forward with 1/ relatively
quickly 
 and provide value to D user right away, while providing time and space for 2/
to 
 ironed out.

While this is an attractive idea, (1) without some help from the programmer via 
adding the Move functions seems likely to cause problems.

Mar 20 2021

Walter Bright <newshound2 digitalmars.com> writes:

The Move Constructor is defined as:

     this(S s) { ... } // move constructor

It's been pointed out (not sure which post) that this conflicts with an Rvalue 
Constructor, which has the same syntax:

     this(S s) { ... } // rvalue constructor

Note that an Rvalue Constructor is *not* considered to be a Copy Constructor:

     this(ref S s) { ... } // copy constructor

So the syntax is already in use. Need to resolve the conflict.

---------

The Rvalue Constructor and Copy Constructor are currently not allowed to
coexist:

     this(S);
     this(ref S); // error, can't have both an rvalue and copy constructor

which suggests a straightforward resolution: it's an Rvalue Constructor if 
there's no Copy Constructor, and a Move Constructor if there is.

While this sounds confusing, I suspect in practice the MC is simply a more 
advanced RC.

Mar 20 2021

Mike Parker <aldacron gmail.com> writes:

On Friday, 5 March 2021 at 12:20:36 UTC, Mike Parker wrote:
 
 The review period will end at 11:59 PM ET on March 19, or when 
 I make a post declaring it complete. Feedback posted to this 
 thread after that point may be ignored.

This review round is complete. Thanks to everyone who 
participated.

The DIP authors may reply to unanswered feedback, but I ask that 
others please refrain from leaving any more posts in this thread.

Mar 20 2021

Manu <turkeyman gmail.com> writes:

On Fri, Mar 5, 2021 at 10:25 PM Mike Parker via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 This is the feedback thread for the first round of Community
 Review of DIP 1040, "Copying, Moving, and Forwarding".


I'm really late to the party here, and there's a lot to catch up, so maybe
I'm repeating discussion that already happened... but I think there's a few
points that need consideration.

First, I like this DIP, I think this is good for D. That said, I also care
deeply about C++ compatibility, and this DIP introduces some serious issues.

1. This changes the extern(C++) ABI, and it's a major breaking change, yet
it's not mentioned in the breaking changes section; every by-val
constructor call will have an ABI change, and C++ compatibility will
completely explode.

2. By-val declarations matching f(T&&) is not a great default. It's
relatively uncommon in C++ for these arguments to exist unless it makes
specific sense, and many cases where it is present C++ uses 'universal
references', that infer rval-ness.

3. ...and even when they do exist, this DIP has different ABI semantics!
The C++ `T&&` function still expects the caller to destruct the argument,
but in D, the caller will never destruct a moved argument. You talk about
this, saying that it's a requirement to pay special attention to the
semantic difference, but I don't think that's reasonable. You're going to
have endless reports of issues from people that don't read or don't
*understand* the difference, and complain their program is crashing.

4. Is it that C++'s criteria for an EMO matches your definition, or will
there emerge cases where a differing EMO assessment will produce a mismatch
in calling convention?

I think there are more problems I can imagine if I dive into the weeds, but
at least these need to be addressed in the DIP.

My off-the-cuff thoughts are:
1 & 2. Rather than applying EMO semantics to extern(C++) functions by
default, and using  value to nominate the default behaviour; since you've
accepted an attribute just for C++ disambiguation, why not assume by-val
semantics as default, and require an  rval-ref attribute instead to select
the C++ move ABI?
3. Since your EMO semantics don't actually apply to C++, that is,
destruction responsibility does NOT carry into the call, I don't think EMO
semantics as described actually apply to C++ at all, and shouldn't be
attempted. We should use a separate solution for C++ compat (like the
attribute I mention above) and EMO semantics have no impact on C++.

So, in the presence of extern(C++), f(T arg) behaves 'traditionally' (as
you describe when applying ` value`), and maybe something like f(T  rvalref
arg) introduces the C++ rules (together with some attention to
overloading). EMO semantics as described have no applicability to C++.

Your C++ compatibility section needs a complete do-over as far as I'm
concerned. They don't improve the situation for C++ compatibility, they
make it worse.
Perhaps it should be tackled as a separate matter, and in that event, I
think the simplest thing to do is to assume f(T) for EMO remains by-val for
extern(C++), that is "EMO calling only applies to extern(D) code", and then
we make an  rvalref attribute that applies only to extern(C++) for the sake
of matching calling convention in a future DIP?

I might be interested in attempting a DIP that adds an extern(C++)-only
rval-ref attribute on top of this DIP; It needs more work. Existing work
(for instance, last SAOC) was taken from the perspective that it solves
move semantics in D at the same time as interfacing C++. With this DIP
addressing elaborate move in D, I think a new approach which ONLY addresses
interfacing C++ is an easier sell, and keeps the feature much more isolated.

Mar 22 2021

Max Haughton <maxhaton gmail.com> writes:

On Tuesday, 23 March 2021 at 00:35:31 UTC, Manu wrote:
 On Fri, Mar 5, 2021 at 10:25 PM Mike Parker via Digitalmars-d < 
 digitalmars-d puremagic.com> wrote:

 [...]


 I'm really late to the party here, and there's a lot to catch 
 up, so maybe I'm repeating discussion that already happened... 
 but I think there's a few points that need consideration.

 [...]

Thank you for the feedback, I'm still mulling over it, but the 
idea of having move semantics for D that we can make the best 
they can possibly be, and then a solution for C++ compatibility 
seems like a good one. I'm not a huge fan of magic quasi-UDA 
things on parameters personally.

One thing I have put to Walter and would like more opinions on, 
is being a little more like Rust and less like C++ by having more 
violent move semantics that (often described as destructive) 
bypass the .init process described here. Although I'd like it, 
I'm not convinced of how practical it is, but it would greatly 
simplify analysis of the program both for a static analyzer 
(presumably the successor to ob.d) and for the programmers brain 
(e.g. think about how many value categories C++ has, and how use 
of move semantics in C++ can end up introducing a subtle new 
state to objects).

Mar 22 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 23 March 2021 at 00:35:31 UTC, Manu wrote:
 I'm really late to the party here, and there's a lot to catch 
 up, so maybe I'm repeating discussion that already happened... 
 but I think there's a few points that need consideration.

The extern(C++) case has not been discussed much so far. It goes 
without saying that it needs to be as close as C++ as it can be. 
I assumed the DIP to not apply to extern(C++).

Mar 23 2021

Manu <turkeyman gmail.com> writes:

On Tue, Mar 23, 2021 at 10:05 PM deadalnix via Digitalmars-d <
digitalmars-d puremagic.com> wrote:

 On Tuesday, 23 March 2021 at 00:35:31 UTC, Manu wrote:
 I'm really late to the party here, and there's a lot to catch
 up, so maybe I'm repeating discussion that already happened...
 but I think there's a few points that need consideration.

 The extern(C++) case has not been discussed much so far. It goes
 without saying that it needs to be as close as C++ as it can be.
 I assumed the DIP to not apply to extern(C++).

The DIP mentions C++ compat, and it's bad. I think it would be best if the
whole DIP apply only to extern(D), and we'll address C++ compat separately.

Mar 23 2021

Max Haughton <maxhaton gmail.com> writes:

On Wednesday, 24 March 2021 at 00:14:27 UTC, Manu wrote:
 On Tue, Mar 23, 2021 at 10:05 PM deadalnix via Digitalmars-d < 
 digitalmars-d puremagic.com> wrote:

 On Tuesday, 23 March 2021 at 00:35:31 UTC, Manu wrote:
 I'm really late to the party here, and there's a lot to 
 catch up, so maybe I'm repeating discussion that already 
 happened... but I think there's a few points that need 
 consideration.

 The extern(C++) case has not been discussed much so far. It 
 goes
 without saying that it needs to be as close as C++ as it can 
 be.
 I assumed the DIP to not apply to extern(C++).

 The DIP mentions C++ compat, and it's bad. I think it would be 
 best if the whole DIP apply only to extern(D), and we'll 
 address C++ compat separately.

I think this is the right approach.

Mar 23 2021

D Programming

C/C++ Programming

Other

digitalmars.D - Feedback Thread: DIP 1040--Copying, Moving, and Forwarding--Community