digitalmars.D - Slice expressions - exact evaluation order, dollar
- kinke (45/46) Jun 17 2016 The following snippet is interesting:
- kinke (2/2) Jun 25 2016 Ping. Let's clearly define these hairy evaluation order details
- Timon Gehr (3/11) Jun 25 2016 Evaluation order should be strictly left-to-right. DMD and GDC get it
- Iain Buclaw via Digitalmars-d (3/16) Jun 26 2016 It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperB...
- Iain Buclaw via Digitalmars-d (18/39) Jun 26 2016 Ah, I see what you mean. I think you may be using an old GDC version.
- kinke (4/13) Jun 26 2016 Thx for the replies - so my testcase works for GDC already? So
- Timon Gehr (6/29) Jun 26 2016 This seems to be what I'd expect. It's also what CTFE does.
- Iain Buclaw via Digitalmars-d (27/54) Jun 26 2016 Very likely CTFE. Anyway, this isn't the only thing where CTFE and
- Timon Gehr (15/79) Jun 26 2016 I don't see how that is possible, unless I misunderstood your previous
- Iain Buclaw via Digitalmars-d (8/87) Jul 12 2016 Because changes made to a temporary get lost as they never bind back
- Timon Gehr (8/19) Jul 18 2016 Which I'd expect. It is just like:
- kinke (10/20) Jul 13 2016 So Timon prefers the pre-buffer (apparently what DMD does), GDC
- kinke (7/13) Jul 13 2016 Oh, that's actually
- Michael Coulombe (11/20) Jul 13 2016 The docs aren't fully detailed, but this is explicit behavior in
The following snippet is interesting:

__gshared int step = 0;
__gshared int[] globalArray;

ref int[] getBase()
{
    assert(step == 0);
    ++step;
    return globalArray;
}

int getLowerBound(size_t dollar)
{
    assert(step == 1);
    ++step;
    assert(dollar == 0);
    globalArray = [ 666 ];
    return 1;
}

int getUpperBound(size_t dollar)
{
    assert(step == 2);
    ++step;
    assert(dollar == 1);
    globalArray = [ 1, 2, 3 ];
    return 3;
}

void main()
{
    auto r = getBase()[getLowerBound($) .. getUpperBound($)];
    assert(r == [ 2, 3 ]);
}

Firstly, it fails with DMD 2.071 because $ in the upper bound expression is 0, i.e., it doesn't reflect the updated length (1) after evaluating the lower bound expression. LDC does.

Secondly, DMD 2.071 throws a RangeError, most likely because it's using the initial length for the bounds checks too.

Most interesting IMO though is the question when the slicee's pointer is to be loaded. This is only relevant if the base is an lvalue and may therefore be modified when evaluating the bound expressions. Should the returned slice be based on the slicee's buffer before or after evaluating the bounds expressions?

This has been triggered by https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the pointer before evaluating the bounds.
Jun 17 2016
Ping. Let's clearly define these hairy evaluation order details and add corresponding tests; that'd be another advantage over C++.
Jun 25 2016
On 17.06.2016 21:59, kinke wrote:
> Most interesting IMO though is the question when the slicee's pointer is to be loaded. This is only relevant if the base is an lvalue and may therefore be modified when evaluating the bound expressions. Should the returned slice be based on the slicee's buffer before or after evaluating the bounds expressions?
>
> This has been triggered by https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the pointer before evaluating the bounds.

Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
Jun 25 2016
On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 17.06.2016 21:59, kinke wrote:
>> Most interesting IMO though is the question when the slicee's pointer is to be loaded. This is only relevant if the base is an lvalue and may therefore be modified when evaluating the bound expressions. Should the returned slice be based on the slicee's buffer before or after evaluating the bounds expressions?
>>
>> This has been triggered by https://github.com/ldc-developers/ldc/issues/1433 as LDC loads the pointer before evaluating the bounds.
>
> Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.

It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().
Jun 26 2016
On 26 June 2016 at 09:36, Iain Buclaw <ibuclaw gdcproject.org> wrote:
> On 26 June 2016 at 03:30, Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
>
> It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().

Ah, I see what you mean. I think you may be using an old GDC version. Before I used to cache the result of getBase().

Old codegen:

_base = *(getBase());
_lwr = getLowerBound(_base.length);
_upr = getUpperBound(_base.length);
r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
---

Now when creating temporaries of references, the reference is stabilized instead.

New codegen:

*(_ptr = getBase());
_lwr = getLowerBound(_ptr.length);
_upr = getUpperBound(_ptr.length);
r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
---

I suggest you fix LDC if it doesn't already do this. :-)
Jun 26 2016
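The difference between the two codegen listings above can be reproduced by hand in plain D. The following is only a sketch under assumptions: `base`, `lwr`, `upr`, `sliceViaCopy` and `sliceViaRef` are illustrative names that do not appear in the thread, and the lowering is written out manually rather than taken from any compiler.

```d
// Hand-written lowering of the two strategies; all names are illustrative.
int[] base;

// Returns the base by reference, like getBase() in the thread.
ref int[] getBase() { return base; }

// A lower bound that rebinds base to a fresh buffer as a side effect.
size_t lwr() { base = [10, 20, 30, 40]; return 1; }

size_t upr() { return 3; }

// "Old codegen": copy the slice descriptor (ptr + length) before the bounds run.
int[] sliceViaCopy()
{
    int[] copy = getBase();
    immutable lo = lwr();
    immutable hi = upr();
    return copy[lo .. hi];
}

// "New codegen": keep only the reference and dereference it after the bounds run.
int[] sliceViaRef()
{
    int[]* p = &getBase();
    immutable lo = lwr();
    immutable hi = upr();
    return (*p)[lo .. hi];
}

void main()
{
    base = [1, 2, 3, 4];
    assert(sliceViaCopy() == [2, 3]);   // slice of the buffer seen before lwr()

    base = [1, 2, 3, 4];
    assert(sliceViaRef() == [20, 30]);  // slice of the buffer assigned by lwr()
}
```

Both helpers evaluate base, lower bound and upper bound in the same left-to-right order; they differ only in whether the base is captured as a value or as a reference.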
On Sunday, 26 June 2016 at 08:08:58 UTC, Iain Buclaw wrote:
> Now when creating temporaries of references, the reference is stabilized instead.
>
> New codegen:
>
> *(_ptr = getBase());
> _lwr = getLowerBound(_ptr.length);
> _upr = getUpperBound(_ptr.length);
> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
> ---
>
> I suggest you fix LDC if it doesn't already do this. :-)

Thx for the replies - so my testcase works for GDC already? So since what GDC is doing is what I came up for independently for
Jun 26 2016
On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>>> Evaluation order should be strictly left-to-right. DMD and GDC get it wrong here.
>>
>> It is evaluated left-to-right. getBase() -> getLowerBound() -> getUpperBound().
>
> Ah, I see what you mean. I think you may be using an old GDC version. Before I used to cache the result of getBase().
>
> Old codegen:
>
> _base = *(getBase());
> _lwr = getLowerBound(_base.length);
> _upr = getUpperBound(_base.length);
> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
> ---

This seems to be what I'd expect. It's also what CTFE does. CTFE and run time behaviour should be identical. (So either one of them needs to be fixed.)

> Now when creating temporaries of references, the reference is stabilized instead.
>
> New codegen:
>
> *(_ptr = getBase());
> _lwr = getLowerBound(_ptr.length);
> _upr = getUpperBound(_ptr.length);
> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
> ---
>
> I suggest you fix LDC if it doesn't already do this. :-)

I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()] behave differently from base[lwr()..upr()].
Jun 26 2016
On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 26.06.2016 10:08, Iain Buclaw via Digitalmars-d wrote:
>> Old codegen:
>>
>> _base = *(getBase());
>> _lwr = getLowerBound(_base.length);
>> _upr = getUpperBound(_base.length);
>> r = {.length=(_upr - _lwr), .ptr=_base.ptr + _lwr * 4};
>> ---
>
> This seems to be what I'd expect. It's also what CTFE does. CTFE and run time behaviour should be identical. (So either one of them needs to be fixed.)

Very likely CTFE. Anyway, this isn't the only thing where CTFE and Runtime do things differently.

>> Now when creating temporaries of references, the reference is stabilized instead.
>>
>> New codegen:
>>
>> *(_ptr = getBase());
>> _lwr = getLowerBound(_ptr.length);
>> _upr = getUpperBound(_ptr.length);
>> r = {.length=(_upr - _lwr), .ptr=_ptr.ptr + _lwr * 4};
>> ---
>>
>> I suggest you fix LDC if it doesn't already do this. :-)
>
> I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()] behave differently from base[lwr()..upr()].

No, sorry, I'm afraid you are wrong there. They should both behave exactly the same.

I may need to step aside and explain what changed in GDC, as it had nothing to do with this LDC bug.

==> Step

What made this subtle change was in relation to fixing bug 42 and 228 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our codegen trees, which in turn makes NRVO work consistently regardless of optimization flags used - no more optimizer being confused by us "faking it".

How is the above jargon related? Well, one of the problems faced was that it must be ensured that lvalues continue being lvalues when considering creating a temporary in the codegen pass. Lvalue references must have the reference stabilized, not the value that is being dereferenced. This also came with an added assurance that GDC will now *never* create a temporary of a decl with a cpctor or dtor, else it'll die with an internal compiler error trying. :-)

<== Step

(() => base)[lwr()..up()] will make a temporary of (() => base), but guarantees that references are stabilized first.

base[lwr()..upr()] will create no temporary if base has no side effects. And so if lwr() modifies base, then upr() will get the updated copy.
Jun 26 2016
On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
> On 26 June 2016 at 14:33, Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:
>> This seems to be what I'd expect. It's also what CTFE does. CTFE and run time behaviour should be identical. (So either one of them needs to be fixed.)
>
> Very likely CTFE. Anyway, this isn't the only thing where CTFE and Runtime do things differently.
> ...

All arbitrary differences should be eradicated.

>> I'm not convinced this is a good idea. It makes (()=>base)()[lwr()..upr()] behave differently from base[lwr()..upr()].
>
> No, sorry, I'm afraid you are wrong there. They should both behave exactly the same.
> ...

I don't see how that is possible, unless I misunderstood your previous explanation.

As far as I understand, for the first expression, code gen will generate a reference to a temporary copy of base, and for the second expression, it will generate a reference to base directly. If lwr() or upr() then update the ptr and/or the length of base, those changes will be seen for the second slice expression, but not for the first.

> I may need to step aside and explain what changed in GDC, as it had nothing to do with this LDC bug.
>
> ==> Step
>
> What made this subtle change was in relation to fixing bug 42 and 228 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our codegen trees, which in turn makes NRVO work consistently regardless of optimization flags used - no more optimizer being confused by us "faking it".
>
> How is the above jargon related? Well, one of the problems faced was that it must be ensured that lvalues continue being lvalues when considering creating a temporary in the codegen pass. Lvalue references must have the reference stabilized, not the value that is being dereferenced. This also came with an added assurance that GDC will now *never* create a temporary of a decl with a cpctor or dtor, else it'll die with an internal compiler error trying. :-)
> ...

What is the justification why the base should be evaluated as an lvalue?

> <== Step
>
> (() => base)[lwr()..up()] will make a temporary of (() => base), but guarantees that references are stabilized first.

(I assume you meant (() => base)()[lwr()..upr()].)

The lambda returns by value, so you will stabilize the reference to a temporary copy of base? (Unless I misunderstand your terminology.)

> base[lwr()..upr()] will create no temporary if base has no side effects. And so if lwr() modifies base, then upr() will get the updated copy.

Yes, it is clear that upr() should see modifications to memory that lwr() makes. The point is that the slice expression itself does or does not see the updates based on whether I wrap base in a lambda or not.
Jun 26 2016
On 27 June 2016 at 04:38, Timon Gehr via Digitalmars-d <digitalmars-d puremagic.com> wrote:
> On 26.06.2016 20:08, Iain Buclaw via Digitalmars-d wrote:
>> Very likely CTFE. Anyway, this isn't the only thing where CTFE and Runtime do things differently.
>> ...
>
> All arbitrary differences should be eradicated.
>
>> No, sorry, I'm afraid you are wrong there. They should both behave exactly the same.
>> ...
>
> I don't see how that is possible, unless I misunderstood your previous explanation.
>
> As far as I understand, for the first expression, code gen will generate a reference to a temporary copy of base, and for the second expression, it will generate a reference to base directly. If lwr() or upr() then update the ptr and/or the length of base, those changes will be seen for the second slice expression, but not for the first.
>
>> I may need to step aside and explain what changed in GDC, as it had nothing to do with this LDC bug.
>>
>> ==> Step
>>
>> What made this subtle change was in relation to fixing bug 42 and 228 in GDC, which involved turning on TREE_ADDRESSABLE(type) bit in our codegen trees, which in turn makes NRVO work consistently regardless of optimization flags used - no more optimizer being confused by us "faking it".
>>
>> How is the above jargon related? Well, one of the problems faced was that it must be ensured that lvalues continue being lvalues when considering creating a temporary in the codegen pass. Lvalue references must have the reference stabilized, not the value that is being dereferenced. This also came with an added assurance that GDC will now *never* create a temporary of a decl with a cpctor or dtor, else it'll die with an internal compiler error trying. :-)
>> ...
>
> What is the justification why the base should be evaluated as an lvalue?

Because changes made to a temporary get lost as they never bind back to the original reference.

Regardless, creating a temporary of a struct with a cpctor violates the semantics of the type - it's the job of the frontend to generate all the code for lifetime management for us.

(Sorry for the belated response, I have been distracted).
Jul 12 2016
On 12.07.2016 23:56, Iain Buclaw via Digitalmars-d wrote:
>> What is the justification why the base should be evaluated as an lvalue?
>
> Because changes made to a temporary get lost as they never bind back to the original reference.
> ...

Which I'd expect. It is just like:

int x = 0;
assert(3 == ++x + ++x);

If the first '++x' was evaluated by reference, this would be 4, not 3.

> Regardless, creating a temporary of a struct with a cpctor violates the semantics of the type - it's the job of the frontend to generate all the code for lifetime management for us.
> ...

Yes, but the front end can also be wrong. What is unclear here is if/why the front end should evaluate the array base by reference.

> (Sorry for the belated response, I have been distracted).

(Me too.)
Jul 18 2016
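The `++x + ++x` comparison in the post above can be spelled out without relying on how any particular compiler lowers the expression. This is a sketch under assumptions: `incByValue`, `incByRef`, `sumByValue` and `sumByRef` are hypothetical helpers that model "operand read as a value" and "operand kept as a reference" explicitly.

```d
// Increments through the pointer and returns a snapshot of the new value.
int incByValue(int* v) { ++*v; return *v; }

// Increments through the pointer and returns the location itself.
int* incByRef(int* v) { ++*v; return v; }

// Each operand is read as a value right after its own increment.
int sumByValue()
{
    int x = 0;
    immutable a = incByValue(&x); // snapshot taken when x is 1
    immutable b = incByValue(&x); // snapshot taken when x is 2
    return a + b;
}

// Both operands alias x and are read only after both increments have run.
int sumByRef()
{
    int x = 0;
    int* a = incByRef(&x);
    int* b = incByRef(&x);
    return *a + *b;
}

void main()
{
    assert(sumByValue() == 3);
    assert(sumByRef() == 4);
}
```

The by-value helper corresponds to the result of 3 expected in the post, and the by-reference helper to the 4 it warns about.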
On Monday, 27 June 2016 at 02:38:22 UTC, Timon Gehr wrote:
> As far as I understand, for the first expression, code gen will generate a reference to a temporary copy of base, and for the second expression, it will generate a reference to base directly. If lwr() or upr() then update the ptr and/or the length of base, those changes will be seen for the second slice expression, but not for the first.

Exactly. That's what I initially asked in

> Should the returned slice be based on the slicee's buffer before or after evaluating the bounds expressions?

So Timon prefers the pre-buffer (apparently what DMD does), GDC does the post-buffer, and LDC buggily does something in between (for $, we treat base.length as lvalue, but we load base.ptr before evaluating the bounds, hence treating base as rvalue there). Can we agree on something, add corresponding tests and make sure CTFE works exactly the same? %)

> The point is that the slice expression itself does or does not see the updates based on whether I wrap base in a lambda or not.

I don't really see a necessity for the lambda to return the same kind (lvalue/rvalue) of value as the expression directly.
Jul 13 2016
On Wednesday, 13 July 2016 at 21:06:28 UTC, kinke wrote:
> On Monday, 27 June 2016 at 02:38:22 UTC, Timon Gehr wrote:
>> The point is that the slice expression itself does or does not see the updates based on whether I wrap base in a lambda or not.
>
> I don't really see a necessity for the lambda to return the same kind (lvalue/rvalue) of value as the expression directly.

Oh, that's actually https://issues.dlang.org/show_bug.cgi?id=16271.
So lambda wrapping isn't the issue here. It's just that both ways of dealing with the base are possible and arguably plausible. Is the current DMD way (base treated as rvalue) the one to be followed or has just nobody given this a deeper thought yet?
Jul 13 2016
On Friday, 17 June 2016 at 19:59:09 UTC, kinke wrote:
> void main()
> {
>     auto r = getBase()[getLowerBound($) .. getUpperBound($)];
>     assert(r == [ 2, 3 ]);
> }
>
> Firstly, it fails with DMD 2.071 because $ in the upper bound expression is 0, i.e., it doesn't reflect the updated length (1) after evaluating the lower bound expression. LDC does.

The docs aren't fully detailed, but this is explicit behavior in the DMD front end that is the same no matter what type getBase() returns:

"Note that opDollar!i is only evaluated once for each i where $ occurs in the corresponding position in the indexing operation."
- https://dlang.org/spec/operatoroverloading.html

"PostfixExpression is evaluated. if PostfixExpression is an expression of type static array or dynamic array, the special variable $ is declared and set to be the length of the array."
- https://dlang.org/spec/expression.html
Jul 13 2016
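The two readings of `$` discussed in this thread can be written out by hand. The sketch below assumes the spec wording quoted above means "`$` is a variable initialised once, before the bounds are evaluated"; `arr`, `growingLowerBound`, `upperWithSnapshot` and `upperWithReread` are illustrative names, and neither helper is claimed to be what any compiler actually emits.

```d
int[] arr;

// A lower-bound expression that appends to the array as a side effect.
size_t growingLowerBound() { arr ~= 4; return 0; }

// Spec reading: $ is set once to the length, before any bound is evaluated.
size_t upperWithSnapshot()
{
    immutable dollar = arr.length;
    growingLowerBound();
    return dollar;          // the upper bound `$` still sees the old length
}

// Alternative reading (reported for LDC in this thread): length re-read at each `$`.
size_t upperWithReread()
{
    growingLowerBound();
    return arr.length;      // the upper bound `$` sees the grown array
}

void main()
{
    arr = [1, 2, 3];
    assert(upperWithSnapshot() == 3);

    arr = [1, 2, 3];
    assert(upperWithReread() == 4);
}
```

Under the snapshot reading, kinke's `assert(dollar == 1)` in getUpperBound cannot hold, which matches the DMD 2.071 failure described in the first post.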