digitalmars.D - Slice expressions - exact evaluation order, dollar

kinke <noone nowhere.com> writes:

The following snippet is interesting:

__gshared int step = 0;
__gshared int[] globalArray;

ref int[] getBase()
{
     assert(step == 0);
     ++step;
     return globalArray;
}

int getLowerBound(size_t dollar)
{
     assert(step == 1);
     ++step;
     assert(dollar == 0);
     globalArray = [ 666 ];
     return 1;
}

int getUpperBound(size_t dollar)
{
     assert(step == 2);
     ++step;
     assert(dollar == 1);
     globalArray = [ 1, 2, 3 ];
     return 3;
}


void main()
{
     auto r = getBase()[getLowerBound($) .. getUpperBound($)];
     assert(r == [ 2, 3 ]);
}



Firstly, it fails with DMD 2.071 because $ in the upper bound 
expression is 0, i.e., it doesn't reflect the updated length (1) 
after evaluating the lower bound expression. LDC does.
Secondly, DMD 2.071 throws a RangeError, most likely because it's 
using the initial length for the bounds checks too.

Most interesting IMO though is the question when the slicee's 
pointer is to be loaded. This is only relevant if the base is an 
lvalue and may therefore be modified when evaluating the bound 
expressions. Should the returned slice be based on the slicee's 
buffer before or after evaluating the bounds expressions?
This has been triggered by 
https://github.com/ldc-developers/ldc/issues/1433 as LDC loads 
the pointer before evaluating the bounds.

Jun 17 2016

kinke <noone nowhere.com> writes:

Ping. Let's clearly define these hairy evaluation order details 
and add corresponding tests; that'd be another advantage over C++.

Jun 25 2016

Timon Gehr <timon.gehr gmx.ch> writes:

On 17.06.2016 21:59, kinke wrote:
 Most interesting IMO though is the question when the slicee's pointer is
 to be loaded. This is only relevant if the base is an lvalue and may
 therefore be modified when evaluating the bound expressions. Should the
 returned slice be based on the slicee's buffer before or after
 evaluating the bounds expressions?
 This has been triggered by
 https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
 pointer before evaluating the bounds.

Evaluation order should be strictly left-to-right. DMD and GDC get it 
wrong here.

Jun 25 2016

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 17.06.2016 21:59, kinke wrote:
 Most interesting IMO though is the question when the slicee's pointer is
 to be loaded. This is only relevant if the base is an lvalue and may
 therefore be modified when evaluating the bound expressions. Should the
 returned slice be based on the slicee's buffer before or after
 evaluating the bounds expressions?
 This has been triggered by
 https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
 pointer before evaluating the bounds.


 Evaluation order should be strictly left-to-right. DMD and GDC get it wrong
 here.

It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().

Jun 26 2016

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 26 June 2016 at 09:36, Iain Buclaw <ibuclaw gdcproject.org> wrote:

 On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 17.06.2016 21:59, kinke wrote:
 Most interesting IMO though is the question when the slicee's pointer is
 to be loaded. This is only relevant if the base is an lvalue and may
 therefore be modified when evaluating the bound expressions. Should the
 returned slice be based on the slicee's buffer before or after
 evaluating the bounds expressions?
 This has been triggered by
 https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the
 pointer before evaluating the bounds.


 Evaluation order should be strictly left-to-right. DMD and GDC get it
 wrong here.

 It is evaluated left-to-right. getBase() -> getLowerBound() ->
 getUpperBound().

Ah, I see what you mean.  I think you may be using an old GDC version.
Before I used to cache the result of getBase().

Old codegen:

_base = *(getBase());
_lwr = getLowerBound(_base.length);
_upr = getUpperBound(_base.length);
r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

---
Now when creating temporaries of references, the reference is stabilized
instead.

New codegen:

*(_ptr = getBase());
_lwr = getLowerBound(_ptr.length);
_upr = getUpperBound(_ptr.length);
r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
---

I suggest you fix LDC if it doesn't already do this. :-)
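
The two lowerings above can be approximated in plain D. This is only a sketch under stated assumptions: `sliceByValue`, `sliceByRef`, `lwr` and `upr` are hypothetical helper names, not compiler output, and the bounds are chosen so that both variants stay in range (unlike the original test case, which starts from an empty array).

```d
int[] globalArray; // module-level variable standing in for the slicee

ref int[] getBase() { return globalArray; }

// Hypothetical lower bound: rebinds the slicee while the bounds are evaluated.
size_t lwr(size_t dollar)
{
    globalArray = [1, 2, 3, 4, 5];
    return 1;
}

// Hypothetical upper bound: depends on whatever $ it is handed.
size_t upr(size_t dollar) { return dollar - 1; }

// Sketch of the "old codegen": copy the (ptr, length) pair once, up front.
int[] sliceByValue()
{
    int[] base = getBase();
    immutable lo = lwr(base.length);
    immutable hi = upr(base.length);
    return base[lo .. hi];
}

// Sketch of the "new codegen": keep a pointer to the lvalue, re-read it each time.
int[] sliceByRef()
{
    int[]* p = &getBase();
    immutable lo = lwr((*p).length);
    immutable hi = upr((*p).length);
    return (*p)[lo .. hi];
}

void main()
{
    globalArray = [10, 20, 30, 40];
    assert(sliceByValue() == [20, 30]);  // old buffer, old length for both bounds

    globalArray = [10, 20, 30, 40];
    assert(sliceByRef() == [2, 3, 4]);   // new buffer, new length for the upper bound
}
```

The by-value variant never sees the reassignment made in `lwr`, while the by-reference variant slices whatever `globalArray` refers to once both bounds have been evaluated.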

Jun 26 2016

kinke <noone nowhere.com> writes:

On Sunday, 26 June 2016 at 08:08:58 UTC, Iain Buclaw wrote:
 Now when creating temporaries of references, the reference is 
 stabilized instead.

 New codegen:

 *(_ptr = getBase());
 _lwr = getLowerBound(_ptr.length);
 _upr = getUpperBound(_ptr.length);
 r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
 ---

 I suggest you fix LDC if it doesn't already do this. :-)

Thx for the replies - so my testcase works for GDC already? So 
since what GDC is doing is what I came up for independently for

Jun 26 2016

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
 Evaluation order should be strictly left-to-right. DMD and GDC get it wrong
 here.

 It is evaluated left-to-right. getBase() -> getLowerBound() ->
 getUpperBound().

 Ah, I see what you mean.  I think you may be using an old GDC version.
 Before I used to cache the result of getBase().

 Old codegen:

 _base = *(getBase());
 _lwr = getLowerBound(_base.length);
 _upr = getUpperBound(_base.length);
 r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

 ---

This seems to be what I'd expect. It's also what CTFE does.
CTFE and run time behaviour should be identical. (So either one of them 
needs to be fixed.)

 Now when creating temporaries of references, the reference is stabilized
 instead.

 New codegen:

 *(_ptr = getBase());
 _lwr = getLowerBound(_ptr.length);
 _upr = getUpperBound(_ptr.length);
 r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
 ---

 I suggest you fix LDC if it doesn't already do this. :-)

I'm not convinced this is a good idea. It makes 
(()=>base)()[lwr()..upr()] behave differently from base[lwr()..upr()].

Jun 26 2016

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
 Old codegen:

 _base = *(getBase());
 _lwr = getLowerBound(_base.length);
 _upr = getUpperBound(_base.length);
 r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

 ---


 This seems to be what I'd expect. It's also what CTFE does.
 CTFE and run time behaviour should be identical. (So either one of them
 needs to be fixed.)

Very likely CTFE.  Anyway, this isn't the only thing where CTFE and
Runtime do things differently.

 Now when creating temporaries of references, the reference is stabilized
 instead.

 New codegen:

 *(_ptr = getBase());
 _lwr = getLowerBound(_ptr.length);
 _upr = getUpperBound(_ptr.length);
 r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
 ---

 I suggest you fix LDC if it doesn't already do this. :-)



 I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()]
 behave differently from base[lwr()..upr()].

No, sorry, I'm afraid you are wrong there. They should both behave
exactly the same.

I may need to step aside and explain what changed in GDC, as it had
nothing to do with this LDC bug.

==> Step

What made this subtle change was in relation to fixing bug 42 and 228
in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our
codegen trees, which in turn makes NRVO work consistently regardless
of optimization flags used - no more optimizer being confused by us
"faking it".

How is the above jargon related? Well, one of the problems faced was
that it must be ensured that lvalues continue being lvalues when
considering creating a temporary in the codegen pass.  Lvalue
references must have the reference stabilized, not the value that is
being dereferenced.  This also came with an added assurance that GDC
will now *never* create a temporary of a decl with a cpctor or dtor,
else it'll die with an internal compiler error trying. :-)

<== Step

(() => base)[lwr()..up()] will make a temporary of (() => base), but
guarantees that references are stabilized first.

base[lwr()..upr()] will create no temporary if base has no side
effects.  And so if lwr() modifies base, then upr() will get the
updated copy.

Jun 26 2016

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
 On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
 Old codegen:

 _base = *(getBase());
 _lwr = getLowerBound(_base.length);
 _upr = getUpperBound(_base.length);
 r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

 ---


 This seems to be what I'd expect. It's also what CTFE does.
 CTFE and run time behaviour should be identical. (So either one of them
 needs to be fixed.)

 Very likely CTFE.  Anyway, this isn't the only thing where CTFE and
 Runtime do things differently.
 ...

All arbitrary differences should be eradicated.

 Now when creating temporaries of references, the reference is stabilized
 instead.

 New codegen:

 *(_ptr = getBase());
 _lwr = getLowerBound(_ptr.length);
 _upr = getUpperBound(_ptr.length);
 r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
 ---

 I suggest you fix LDC if it doesn't already do this. :-)



 I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()]
 behave differently from base[lwr()..upr()].

 No, sorry, I'm afraid you are wrong there. They should both behave
 exactly the same.
 ...

I don't see how that is possible, unless I misunderstood your previous 
explanation. As far as I understand, for the first expression, code gen 
will generate a reference to a temporary copy of base, and for the 
second expression, it will generate a reference to base directly. If 
lwr() or upr() then update the ptr and/or the length of base, those 
changes will be seen for the second slice expression, but not for the first.


 I may need to step aside and explain what changed in GDC, as it had
 nothing to do with this LDC bug.

 ==> Step

 What made this subtle change was in relation to fixing bug 42 and 228
 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our
 codegen trees, which in turn makes NRVO work consistently regardless
 of optimization flags used - no more optimizer being confused by us
 "faking it".

 How is the above jargon related? Well, one of the problems faced was
 that it must be ensured that lvalues continue being lvalues when
 considering creating a temporary in the codegen pass.  Lvalue
 references must have the reference stabilized, not the value that is
 being dereferenced.  This also came with an added assurance that GDC
 will now *never* create a temporary of a decl with a cpctor or dtor,
 else it'll die with an internal compiler error trying. :-)
 ...

What is the justification why the base should be evaluated as an lvalue?

 <== Step

 (() => base)[lwr()..up()] will make a temporary of (() => base), but
 guarantees that references are stabilized first.

(I assume you meant (() => base)()[lwr()..upr()].)

The lambda returns by value, so you will stabilize the reference to a 
temporary copy of base? (Unless I misunderstand your terminology.)

 base[lwr()..upr()] will create no temporary if base has no side
 effects.  And so if lwr() modifies base, then upr() will get the
 updated copy.

Yes, it is clear that upr() should see modifications to memory that 
lwr() makes. The point is that the slice expression itself does or does 
not see the updates based on whether I wrap base in a lambda or not.
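
The value-versus-reference distinction behind this objection can be shown without any slice expression. A minimal sketch, assuming a module-level `base` and hypothetical helpers `baseByRef`, `viaValueLambda` and `viaRefFunction`; it only illustrates ordinary D copy semantics for dynamic arrays, not what any particular compiler emits for slicing.

```d
int[] base;

// Returns the variable itself, so callers can observe later reassignments.
ref int[] baseByRef() { return base; }

// The lambda returns by value: the (ptr, length) pair is copied at call time.
int[] viaValueLambda()
{
    base = [1, 2, 3];
    int[] seen = (() => base)();
    base = [7, 8];           // rebinding base does not affect the copy
    return seen;
}

// A ref-returning function yields the lvalue; its address tracks the variable.
int[] viaRefFunction()
{
    base = [1, 2, 3];
    int[]* seen = &baseByRef();
    base = [7, 8];           // visible through the pointer
    return *seen;
}

void main()
{
    assert(viaValueLambda() == [1, 2, 3]);
    assert(viaRefFunction() == [7, 8]);
}
```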

Jun 26 2016

Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:

On 27 June 2016 at 04:38, Timon Gehr via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
 On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
 Old codegen:

 _base = *(getBase());
 _lwr = getLowerBound(_base.length);
 _upr = getUpperBound(_base.length);
 r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};

 ---



 This seems to be what I'd expect. It's also what CTFE does.
 CTFE and run time behaviour should be identical. (So either one of them
 needs to be fixed.)

 Very likely CTFE.  Anyway, this isn't the only thing where CTFE and
 Runtime do things differently.
 ...


 All arbitrary differences should be eradicated.

 Now when creating temporaries of references, the reference is stabilized
 instead.

 New codegen:

 *(_ptr = getBase());
 _lwr = getLowerBound(_ptr.length);
 _upr = getUpperBound(_ptr.length);
 r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
 ---

 I suggest you fix LDC if it doesn't already do this. :-)




 I'm not convinced this is a good idea. It makes
 (()=>base)()[lwr()..upr()]
 behave differently from base[lwr()..upr()].


 No, sorry, I'm afraid you are wrong there. They should both behave
 exactly the same.
 ...


 I don't see how that is possible, unless I misunderstood your previous
 explanation. As far as I understand, for the first expression, code gen will
 generate a reference to a temporary copy of base, and for the second
 expression, it will generate a reference to base directly. If lwr() or upr()
 then update the ptr and/or the length of base, those changes will be seen
 for the second slice expression, but not for the first.


 I may need to step aside and explain what changed in GDC, as it had
 nothing to do with this LDC bug.

 ==> Step

 What made this subtle change was in relation to fixing bug 42 and 228
 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our
 codegen trees, which in turn makes NRVO work consistently regardless
 of optimization flags used - no more optimizer being confused by us
 "faking it".

 How is the above jargon related? Well, one of the problems faced was
 that it must be ensured that lvalues continue being lvalues when
 considering creating a temporary in the codegen pass.  Lvalue
 references must have the reference stabilized, not the value that is
 being dereferenced.  This also came with an added assurance that GDC
 will now *never* create a temporary of a decl with a cpctor or dtor,
 else it'll die with an internal compiler error trying. :-)
 ...


 What is the justification why the base should be evaluated as an lvalue?

Because changes made to a temporary get lost as they never bind back
to the original reference.

Regardless, creating a temporary of a struct with a cpctor violates
the semantics of the type - it's the job of the frontend to generate
all the code for lifetime management for us.

(Sorry for the belated response, I have been distracted).

Jul 12 2016

Timon Gehr <timon.gehr gmx.ch> writes:

On 12.07.2016 23:56, Iain Buclaw via Digitalmars-d wrote:
 What is the justification why the base should be evaluated as an lvalue?


 Because changes made to a temporary get lost as they never bind back
 to the original reference.
 ...

Which I'd expect. It is just like:

int x = 0;
assert(3 == ++x + ++x);

If the first '++x' was evaluated by reference, this would be 4, not 3.
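
To make the arithmetic explicit, here is a sketch that spells out both readings with named temporaries. The helper `inc` and the two `sum...` functions are hypothetical and only model the operand evaluation by hand; they do not rely on how a compiler treats `++x + ++x` itself.

```d
// Pre-increment that hands back the variable itself.
ref int inc(return ref int v)
{
    ++v;
    return v;
}

// Operands captured by value: 1 + 2.
int sumByValue()
{
    int x = 0;
    int first = inc(x);    // copy of x, which is 1 at this point
    int second = inc(x);   // copy of x, which is now 2
    return first + second;
}

// First operand captured by reference: it is re-read after the second increment.
int sumByReference()
{
    int x = 0;
    int* first = &inc(x);  // points at x
    int second = inc(x);   // x is now 2
    return *first + second;
}

void main()
{
    assert(sumByValue() == 3);
    assert(sumByReference() == 4);
}
```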


 Regardless, creating a temporary of a struct with a cpctor violates
 the semantics of the type - it's the job of the frontend to generate
 all the code for lifetime management for us.
 ...

Yes, but the front end can also be wrong. What is unclear here is if/why 
the front end should evaluate the array base by reference.


 (Sorry for the belated response, I have been distracted).

(Me too.)

Jul 18 2016

kinke <noone nowhere.com> writes:

On Monday, 27 June 2016 at 02:38:22 UTC, Timon Gehr wrote:
 As far as I understand, for the first expression, code gen will 
 generate a reference to a temporary copy of base, and for the 
 second expression, it will generate a reference to base 
 directly. If lwr() or upr() then update the ptr and/or the 
 length of base, those changes will be seen for the second slice 
 expression, but not for the first.

Exactly. That's what I initially asked in

 Should the returned slice be based on the slicee's buffer 
 before or after evaluating the bounds expressions?

So Timon prefers the pre-buffer (apparently what DMD does), GDC 
does the post-buffer, and LDC buggily does something in between 
(for $, we treat base.length as an lvalue, but we load base.ptr 
before evaluating the bounds, hence treating base as an rvalue there).
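
A hand-written sketch of the in-between behaviour described for LDC, assuming hypothetical helpers `lwr`, `upr` and `sliceHybrid`: the pointer is taken from the old buffer, while `$` is re-read from the variable for each bound. This is an illustration of the mismatch, not LDC's actual IR.

```d
int[] globalArray;

// Hypothetical lower bound that rebinds the slicee to a longer array.
size_t lwr(size_t dollar)
{
    globalArray = [1, 2, 3, 4, 5];
    return 1;
}

size_t upr(size_t dollar) { return dollar - 1; }

// Pointer loaded early (rvalue-like), length read late (lvalue-like).
int[] sliceHybrid()
{
    int* ptr = globalArray.ptr;              // old buffer
    immutable lo = lwr(globalArray.length);  // $ is the old length, 4
    immutable hi = upr(globalArray.length);  // $ is the new length, 5
    return ptr[lo .. hi];                    // pointer slicing, no bounds check
}

void main()
{
    globalArray = [10, 20, 30, 40];
    // Elements come from the old buffer, but the upper bound was computed
    // from the new length.
    assert(sliceHybrid() == [20, 30, 40]);
}
```

Here the result happens to stay inside the old buffer; with a larger new length the same lowering would read past its end, which is why mixing the two treatments is the least defensible of the three options.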

Can we agree on something, add corresponding tests and make sure 
CTFE works exactly the same? %)

 The point is that the slice expression itself does or does not 
 see the updates based on whether I wrap base in a lambda or not.

I don't really see a necessity for the lambda to return the same 
kind (lvalue/rvalue) of value as the expression directly.

Jul 13 2016

kinke <noone nowhere.com> writes:

On Wednesday, 13 July 2016 at 21:06:28 UTC, kinke wrote:
 On Monday, 27 June 2016 at 02:38:22 UTC, Timon Gehr wrote:
 The point is that the slice expression itself does or does not 
 see the updates based on whether I wrap base in a lambda or 
 not.

 I don't really see a necessity for the lambda to return the 
 same kind (lvalue/rvalue) of value as the expression directly.

Oh, that's actually 
https://issues.dlang.org/show_bug.cgi?id=16271.

So lambda wrapping isn't the issue here. It's just that both ways 
of dealing with the base are possible and arguably plausible. Is 
the current DMD way (base treated as rvalue) the one to be 
followed or has just nobody given this a deeper thought yet?

Jul 13 2016

Michael Coulombe <kirsybuu gmail.com> writes:

On Friday, 17 June 2016 at 19:59:09 UTC, kinke wrote:

 void main()
 {
     auto r = getBase()[getLowerBound($) .. getUpperBound($)];
     assert(r == [ 2, 3 ]);
 }

 Firstly, it fails with DMD 2.071 because $ in the upper bound 
 expression is 0, i.e., it doesn't reflect the updated length 
 (1) after evaluating the lower bound expression. LDC does.

The docs aren't fully detailed, but this is explicit behavior in 
the DMD front end that is the same no matter what type getBase() 
returns:

"Note that opDollar!i is only evaluated once for each i where $ 
occurs in the corresponding position in the indexing operation." 
- https://dlang.org/spec/operatoroverloading.html

"PostfixExpression is evaluated. if PostfixExpression is an 
expression of type static array or dynamic array, the special 
variable $ is declared and set to be the length of the array. " - 
https://dlang.org/spec/expression.html
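
For user-defined types, the first quoted rule can be probed with a small counter. This is a sketch with a hypothetical struct `Counted`; it prints the number of `opDollar` calls rather than asserting it, since the exact count is what the quoted spec sentence is about and may depend on the compiler version.

```d
import std.stdio : writeln;

struct Counted
{
    int[] data;
    int dollarCalls;   // hypothetical instrumentation

    // Called for $ inside an index or slice expression on this type.
    size_t opDollar()
    {
        ++dollarCalls;
        return data.length;
    }

    // One-dimensional slicing: c[lo .. hi]
    int[] opSlice(size_t lo, size_t hi)
    {
        return data[lo .. hi];
    }
}

void main()
{
    auto c = Counted([1, 2, 3, 4]);
    auto r = c[$ - 3 .. $ - 1];   // $ appears in both bounds
    assert(r == [2, 3]);
    writeln("opDollar calls: ", c.dollarCalls);
}
```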

Jul 13 2016
