www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - foreach ref very broken: fails to call front(val)

reply "monarch_dodra" <monarch_dodra gmail.com> writes:
I think this is a pretty serious bug: when one writes: 
"foreach(ref a, range)", the underlying (ref'd) object will ONLY 
get modified if the range object provides a "ref T front()" 
method.

What is worrisome is that:
a) The code compiles anyways.
b) This even happens when range provides a "void front(T t)" 
method.

Here is the bug in action, with the very standard Array class, as 
well as the generic Map algorithm:
----
import std.container;
import std.stdio;
import std.algorithm;

void main()
{
   Array!int arr;
   arr.length = 3;

   foreach(a; arr)
     a = 2;

   map!("a+=2")(arr[]);

   writeln(arr[]);
}
----
Output:
----
[0, 0, 0]
----

As you can see, not only is "foreach" broken, but so is any 
algorithm that tries to mutate an Array.

Note that "Array" itself can be fixed by my recommendation here:
http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org

However, in the case of containers that can't return a ref'ed 
front (eg Array!bool), this is actually a double bug:

1) foreach: It makes compile time call to "T front()" regardless 
of context. Because of this, even when we write "a = 2", a 
straight up assignment to a temporary takes place, rather than 
compiling as "arr.front(2)".

2) front: Unlike opIndex, front is not an operator. This means 
that writing code such as "arr.front += 5" or "++arr.front" is 
not supported by the language (when front returns by value). This 
means the implementer of a Range (which can't return a ref'd 
front) has absolutely no way of intercepting operations that want 
to occur directly on front
Jul 02 2012
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
On Monday, 2 July 2012 at 12:44:59 UTC, monarch_dodra wrote:
 I think this is a pretty serious bug: when one writes: 
 "foreach(ref a, range)", the underlying (ref'd) object will 
 ONLY get modified if the range object provides a "ref T 
 front()" method.
Somethig related, zip(a,b) allows to sort the two arrys, but you can't modify the arry items with a ref foreach: import std.stdio: writeln; import std.algorithm: sort; import std.range: zip; void foo1() { auto a = [10, 20, 30]; auto b = [100, 200, 300]; foreach (ref x, y; zip(a, b)) x++; writeln(a); } void foo2() { int[] a = [1, 2, 3]; string[] b = ["c", "b", "a"]; writeln(zip(a, b)[2]); writeln(a, "\n", b); sort!("a[0] > b[0]")(zip(a, b)); writeln(a, "\n", b); } void main() { foo1(); foo2(); } Output: [10, 20, 30] Tuple!(int,string)(3, "a") [1, 2, 3] ["c", "b", "a"] [3, 2, 1] ["a", "b", "c"] Bye, bearophile
Jul 02 2012
prev sibling next sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Monday, 2 July 2012 at 12:44:59 UTC, monarch_dodra wrote:
 ...
I opened this thread a week ago, and got little no feed back. I've realized since: 1) "map" was a bad example, since it actually returns a new range (does not modify the underlying values), so the output was normal 2) I pasted the wrong code, I forgot to put a "ref" in there, so the output was normal. Now take this example: ---- import std.stdio; import std.algorithm; import std.container; void main() { Array!int arr = Array!int(1, 2, 3, 4, 5); foreach(ref a; arr[]) { a = 10; } writeln("Output: ", arr[]); } ---- Output: [1, 2, 3, 4, 5] ---- I realize I'm still new to D, but I'd appreciate if somebody told me if I am wrong(or being stupid). Anyways, I'm bothered by several things: Quoteth "The D programing language", 12.9.1: "If you specify ref with value, the compiler replaces all uses of value with calls to __c.front throughout the body of the loop". Surely, there is something wrong here... right?
Jul 09 2012
next sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Monday, 9 July 2012 at 09:24:29 UTC, monarch_dodra wrote:
 ...
Forgive me for insisting, but at this point, I'm just trying to figure out if I am being stupid and misunderstood how D works. Could someone please just tell me if what I'm seeing is expected?
Jul 10 2012
parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"monarch_dodra" <monarch_dodra gmail.com> wrote in message 
news:fbithbggnynmazrxkfjw forum.dlang.org...
 On Monday, 9 July 2012 at 09:24:29 UTC, monarch_dodra wrote:
 ...
Forgive me for insisting, but at this point, I'm just trying to figure out if I am being stupid and misunderstood how D works. Could someone please just tell me if what I'm seeing is expected?
Looks like a bug.
Jul 10 2012
prev sibling next sibling parent "monarch_dodra" <monarch_dodra gmail.com> writes:
On Monday, 9 July 2012 at 09:24:29 UTC, monarch_dodra wrote:
 ...
Forgive me for insisting, but at this point, I'm just trying to figure out if I am being stupid and misunderstood how D works. Could someone please tell me if what I'm seeing is expected?
Jul 10 2012
prev sibling parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 07/09/2012 02:24 AM, monarch_dodra wrote:
 On Monday, 2 July 2012 at 12:44:59 UTC, monarch_dodra wrote:
 ...
I opened this thread a week ago, and got little no feed back.
Please also consider the D.learn newsgroup. Some people are more responsive over there. :) (And you may have already asked there anyway. :))
 import std.stdio;
 import std.algorithm;
 import std.container;

 void main()
 {
 Array!int arr = Array!int(1, 2, 3, 4, 5);
 foreach(ref a; arr[])
 {
 a = 10;
 }
 writeln("Output: ", arr[]);
 }
 ----
 Output: [1, 2, 3, 4, 5]
 ----
Yet using the InputRange functions directly works: Array!int arr = Array!int(1, 2, 3, 4, 5); auto all = arr[]; for ( ; !all.empty; all.popFront()) { all.front = 10; } Now, the output: [10, 10, 10, 10, 10] Using for and indexes also works: Array!int arr = Array!int(1, 2, 3, 4, 5); auto all = arr[]; for (auto i = 0; i != all.length; ++i) { all[i] = 10; } Again, the output: [10, 10, 10, 10, 10]
 I realize I'm still new to D, but I'd appreciate if somebody told me if
 I am wrong(or being stupid).
I have found std.container to be confusing.
 Anyways, I'm bothered by several things:

 Quoteth "The D programing language", 12.9.1: "If you specify ref with
 value, the compiler replaces all uses of value with calls to __c.front
 throughout the body of the loop".

 Surely, there is something wrong here... right?
Yes. :) Ali
Jul 10 2012
prev sibling next sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
There are two problems:

1. std.container.Array.(front, back, opIndex, ...) doesn't return its
elements by ref.
2. std.algorithm.map doesn't consider original range's ref-ness.

Bye.

Kenji Hara

2012/7/2 monarch_dodra <monarch_dodra gmail.com>:
 I think this is a pretty serious bug: when one writes: "foreach(ref a,
 range)", the underlying (ref'd) object will ONLY get modified if the range
 object provides a "ref T front()" method.

 What is worrisome is that:
 a) The code compiles anyways.
 b) This even happens when range provides a "void front(T t)" method.

 Here is the bug in action, with the very standard Array class, as well as
 the generic Map algorithm:
 ----
 import std.container;
 import std.stdio;
 import std.algorithm;

 void main()
 {
   Array!int arr;
   arr.length = 3;

   foreach(a; arr)
     a = 2;

   map!("a+=2")(arr[]);

   writeln(arr[]);
 }
 ----
 Output:
 ----
 [0, 0, 0]
 ----

 As you can see, not only is "foreach" broken, but so is any algorithm that
 tries to mutate an Array.

 Note that "Array" itself can be fixed by my recommendation here:
 http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org

 However, in the case of containers that can't return a ref'ed front (eg
 Array!bool), this is actually a double bug:

 1) foreach: It makes compile time call to "T front()" regardless of context.
 Because of this, even when we write "a = 2", a straight up assignment to a
 temporary takes place, rather than compiling as "arr.front(2)".

 2) front: Unlike opIndex, front is not an operator. This means that writing
 code such as "arr.front += 5" or "++arr.front" is not supported by the
 language (when front returns by value). This means the implementer of a
 Range (which can't return a ref'd front) has absolutely no way of
 intercepting operations that want to occur directly on front
Jul 10 2012
parent =?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:
On 07/10/2012 12:21 AM, kenji hara wrote:
 There are two problems:

 1. std.container.Array.(front, back, opIndex, ...) doesn't return its
 elements by ref.
It's not that obvious to me. The following picture should supposedly provide references to elements: The latter of the following pair of Array.Range.front() should be used in the foreach body: property T front() { enforce(!empty); return _outer[_a]; } property void front(T value) { enforce(!empty); _outer[_a] = move(value); } There, _outer is the Array itself. And Array.opIndexAssign() seems to be doing the right thing: void opIndexAssign(T value, size_t i) { enforce(_data.RefCounted.isInitialized); _data._payload[i] = value; } Ali
Jul 10 2012
prev sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
Posted a pull request:
https://github.com/D-Programming-Language/phobos/pull/678

With the patch, following code works as expected.

import std.container;
import std.stdio;
import std.algorithm;
void main()
{
    Array!int arr;
    arr.length = 3;

    foreach(ref a; arr)   // requre 'ref'
        a = 2;

    writeln(arr[]);
    // prints [2,2,2]

    // Use writeln to evaluate all elements of map range.
    map!("a+=2")(arr[]).writeln();
    // prints [4,4,4]
}

2012/7/10 kenji hara <k.hara.pg gmail.com>:
 There are two problems:

 1. std.container.Array.(front, back, opIndex, ...) doesn't return its
 elements by ref.
 2. std.algorithm.map doesn't consider original range's ref-ness.

 Bye.

 Kenji Hara

 2012/7/2 monarch_dodra <monarch_dodra gmail.com>:
 I think this is a pretty serious bug: when one writes: "foreach(ref a,
 range)", the underlying (ref'd) object will ONLY get modified if the range
 object provides a "ref T front()" method.

 What is worrisome is that:
 a) The code compiles anyways.
 b) This even happens when range provides a "void front(T t)" method.

 Here is the bug in action, with the very standard Array class, as well as
 the generic Map algorithm:
 ----
 import std.container;
 import std.stdio;
 import std.algorithm;

 void main()
 {
   Array!int arr;
   arr.length = 3;

   foreach(a; arr)
     a = 2;

   map!("a+=2")(arr[]);

   writeln(arr[]);
 }
 ----
 Output:
 ----
 [0, 0, 0]
 ----

 As you can see, not only is "foreach" broken, but so is any algorithm that
 tries to mutate an Array.

 Note that "Array" itself can be fixed by my recommendation here:
 http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org

 However, in the case of containers that can't return a ref'ed front (eg
 Array!bool), this is actually a double bug:

 1) foreach: It makes compile time call to "T front()" regardless of context.
 Because of this, even when we write "a = 2", a straight up assignment to a
 temporary takes place, rather than compiling as "arr.front(2)".

 2) front: Unlike opIndex, front is not an operator. This means that writing
 code such as "arr.front += 5" or "++arr.front" is not supported by the
 language (when front returns by value). This means the implementer of a
 Range (which can't return a ref'd front) has absolutely no way of
 intercepting operations that want to occur directly on front
Jul 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/10/12 4:29 AM, kenji hara wrote:
 Posted a pull request:
 https://github.com/D-Programming-Language/phobos/pull/678
Hmm, that has a couple of issues. First, map that modifies things in place is a bug more often than a feature given that the caller of map may use front() an arbitrary number of times. So I see little intentional and legitimate use for things like map!"a += 2"(range). Second, the return by value from Array is intentional and has to do with sealing. Array is intended to never escape the addresses of its elements. That way, the collection is "sealed" in the sense there can never be uncontrolled pointers to its elements. This allows using efficient allocation strategies for the array without compromising its safety. In more recent discussions with Walter we discussed the possibility of making "ref" returns transitory, i.e. no caller code can actually save a ref result beyond the call. That would allow us to keep sealing while also benefitting of ref returns. But that feature has not been implemented yet. Andrei
Jul 10 2012
next sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
2012/7/10 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
 On 7/10/12 4:29 AM, kenji hara wrote:
 Posted a pull request:
 https://github.com/D-Programming-Language/phobos/pull/678
Hmm, that has a couple of issues. First, map that modifies things in place is a bug more often than a feature given that the caller of map may use front() an arbitrary number of times. So I see little intentional and legitimate use for things like map!"a += 2"(range). Second, the return by value from Array is intentional and has to do with sealing. Array is intended to never escape the addresses of its elements. That way, the collection is "sealed" in the sense there can never be uncontrolled pointers to its elements. This allows using efficient allocation strategies for the array without compromising its safety. In more recent discussions with Walter we discussed the possibility of making "ref" returns transitory, i.e. no caller code can actually save a ref result beyond the call. That would allow us to keep sealing while also benefitting of ref returns. But that feature has not been implemented yet.
OK. I know the past discussion about 'sealed container' so I can agree with your argument. Then I'll close the pull request. Thanks. Kenji Hara
Jul 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/10/12 10:32 AM, kenji hara wrote:
 OK. I know the past discussion about 'sealed container' so I can agree
 with your argument.
 Then I'll close the pull request.
Sounds good for now but let's keep in mind the possibility of restricting ref returns. At that point we can "open up" ref returns in std.container. Andrei
Jul 10 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/10/2012 05:01 PM, Andrei Alexandrescu wrote:
 On 7/10/12 10:32 AM, kenji hara wrote:
 OK. I know the past discussion about 'sealed container' so I can agree
 with your argument.
 Then I'll close the pull request.
Sounds good for now but let's keep in mind the possibility of restricting ref returns. At that point we can "open up" ref returns in std.container. Andrei
Would that apply to all ref returns or would there be an additional attribute that allows to explicitly restrict them?
Jul 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/10/12 11:03 AM, Timon Gehr wrote:
 On 07/10/2012 05:01 PM, Andrei Alexandrescu wrote:
 On 7/10/12 10:32 AM, kenji hara wrote:
 OK. I know the past discussion about 'sealed container' so I can agree
 with your argument.
 Then I'll close the pull request.
Sounds good for now but let's keep in mind the possibility of restricting ref returns. At that point we can "open up" ref returns in std.container. Andrei
Would that apply to all ref returns or would there be an additional attribute that allows to explicitly restrict them?
If we decide to introduce the feature it would apply to all ref parameters and results. That means functions receiving a ref parameter cannot save its address anywhere, and that callers of functions that return a ref result cannot save the address beyond the function call. Still trying to figure out whether this is too restrictive. My opinion is that that's sensible because many uses opposing it are antipatterns anyway. For example, if a function sneaks away the address of something it better makes that clear in the signature by requiring a pointer in the first place. Andrei
Jul 10 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/10/2012 05:07 PM, Andrei Alexandrescu wrote:
 On 7/10/12 11:03 AM, Timon Gehr wrote:
 On 07/10/2012 05:01 PM, Andrei Alexandrescu wrote:
 On 7/10/12 10:32 AM, kenji hara wrote:
 OK. I know the past discussion about 'sealed container' so I can agree
 with your argument.
 Then I'll close the pull request.
Sounds good for now but let's keep in mind the possibility of restricting ref returns. At that point we can "open up" ref returns in std.container. Andrei
Would that apply to all ref returns or would there be an additional attribute that allows to explicitly restrict them?
If we decide to introduce the feature it would apply to all ref parameters and results. That means functions receiving a ref parameter cannot save its address anywhere, and that callers of functions that return a ref result cannot save the address beyond the function call. Still trying to figure out whether this is too restrictive. My opinion is that that's sensible because many uses opposing it are antipatterns anyway. For example, if a function sneaks away the address of something it better makes that clear in the signature by requiring a pointer in the first place. Andrei
I am somewhat opposed to that change. One issue is that it would make built-in arrays more magical. struct S{ ???? } version(A) int[] array; version(B) S array; auto p = &array[i]; However, the need for restricted ref returns is quite obvious. 'scope ref' ?
Jul 10 2012
parent "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 10 July 2012 at 22:22:12 UTC, Timon Gehr wrote:
 'scope ref' ?
http://d.puremagic.com/issues/show_bug.cgi?id=8121
Jul 16 2012
prev sibling parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
On Tuesday, 10 July 2012 at 14:19:04 UTC, Andrei Alexandrescu 
wrote:
 ...

 Second, the return by value from Array is intentional and has 
 to do with sealing. Array is intended to never escape the 
 addresses of its elements. That way, the collection is "sealed" 
 in the sense there can never be uncontrolled pointers to its 
 elements. This allows using efficient allocation strategies for 
 the array without compromising its safety.

 ...

 Andrei
Thanks for the reply. I had actually opened a thread about making Array return by ref, and you answered you thought it was a good point, and said that there should be an enhancement request about it: http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org#post-jss5iu:24qct:241:40digitalmars.com But I guess you meant for ranges in general. I wasn't there for the discussions about sealed classes, so I had not considered it. ... Anyways, still curious about that "foreach with ref" loop: Any chance I can expect it to work with Array in a near future? Or if not, shouldn't I get a compile error that you can't bind a reference to a temporary? And more generally, how would code like this _ever_ work? Array!int arr foreach(ref a; arr) a += 5; Since there is no mechanism to pass the "op" to "front"?
Jul 10 2012
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, July 10, 2012 23:25:41 monarch_dodra wrote:
 On Tuesday, 10 July 2012 at 14:19:04 UTC, Andrei Alexandrescu
 
 wrote:
 ...
 
 Second, the return by value from Array is intentional and has
 to do with sealing. Array is intended to never escape the
 addresses of its elements. That way, the collection is "sealed"
 in the sense there can never be uncontrolled pointers to its
 elements. This allows using efficient allocation strategies for
 the array without compromising its safety.
 
 ...
 
 Andrei
Thanks for the reply. I had actually opened a thread about making Array return by ref, and you answered you thought it was a good point, and said that there should be an enhancement request about it: http://forum.dlang.org/thread/bkozswmsgeibarowfwvq forum.dlang.org#post-jss5 iu:24qct:241:40digitalmars.com But I guess you meant for ranges in general. I wasn't there for the discussions about sealed classes, so I had not considered it. ... Anyways, still curious about that "foreach with ref" loop: Any chance I can expect it to work with Array in a near future? Or if not, shouldn't I get a compile error that you can't bind a reference to a temporary? And more generally, how would code like this _ever_ work? Array!int arr foreach(ref a; arr) a += 5; Since there is no mechanism to pass the "op" to "front"?
I believe that according to TDPL, if you use ref in a foreach with a range, and range.front = value isn't legal, then you should get a compilation error. Whether a += value works depends on whether range.front += value works, and at present, that's pretty much only going to work with ranges whose front returns by ref, since property syntax isn't currently sophisticated enough to combine the getter and setter properties to make stuff like front++ or front += 5 possible. - Jonathan M Davis
Jul 10 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/10/2012 11:51 PM, Jonathan M Davis wrote:
 On Tuesday, July 10, 2012 23:25:41 monarch_dodra wrote:
 ...
 And more generally, how would code like this _ever_ work?

 Array!int arr
 foreach(ref a; arr)
 a += 5;

 Since there is no mechanism to pass the "op" to "front"?
I believe that according to TDPL, if you use ref in a foreach with a range, and range.front = value isn't legal, then you should get a compilation error. Whether a += value works depends on whether range.front += value works, and at present, that's pretty much only going to work with ranges whose front returns by ref, since property syntax isn't currently sophisticated enough to combine the getter and setter properties to make stuff like front++ or front += 5 possible. - Jonathan M Davis
That is a bug. property has very limited value if pairs of property functions are accessed in a way syntactically distinguishable from field access.
Jul 10 2012
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, July 11, 2012 00:15:34 Timon Gehr wrote:
 On 07/10/2012 11:51 PM, Jonathan M Davis wrote:
 On Tuesday, July 10, 2012 23:25:41 monarch_dodra wrote:
 ...
 And more generally, how would code like this _ever_ work?
 
 Array!int arr
 foreach(ref a; arr)
 a += 5;
 
 Since there is no mechanism to pass the "op" to "front"?
I believe that according to TDPL, if you use ref in a foreach with a range, and range.front = value isn't legal, then you should get a compilation error. Whether a += value works depends on whether range.front += value works, and at present, that's pretty much only going to work with ranges whose front returns by ref, since property syntax isn't currently sophisticated enough to combine the getter and setter properties to make stuff like front++ or front += 5 possible. - Jonathan M Davis
That is a bug. property has very limited value if pairs of property functions are accessed in a way syntactically distinguishable from field access.
I don't know if it's technically a bug or not. That would depend on the spec, and I suspect that it's mum on the matter, which would make the compiler technically correct no matter what it does with regards to this, since TDPL doesn't say anything on it either that I recall. But if it's not a bug, it's definitely an enhancement request that should be seriously considered. I believe that there are a couple of related enhancement requests, but I can't find them at the moment. - Jonathan M Davis
Jul 10 2012
parent reply "monarch_dodra" <monarch_dodra gmail.com> writes:
Giving Array an opApply operator fixes both problems:

1) It allows modifying the elements when accessed with "ref".
2) It allows writing "++a" even though "++front" isn't supported.

I implemented this and made a pull request:
https://github.com/D-Programming-Language/phobos/pull/683

I implemented both opApply and opApplyReverse.
I implemented it both for Array.Range and Array (see: 
http://forum.dlang.org/thread/rxpbtrawpjzvdfuuwmwp forum.dlang.org)
I implemented correct behavior for "break" cases.
I documented the public opApply.
I wrote a complete unit test.
I tested && unittested it.

This is my first time doing this, so please forgive me if I did 
something wrong :(

----
If this gets validated, then maybe the same should be done to the 
other containers? AFAIK, none of them work correctly with 
foreach(ref).

----
Note that i did not touch Array!bool. This is because in that 
case, I cannot, even internally, pass a bool by reference to the 
delegate.

A workaround would be to read a bool into a temporary, pass 
temporary to the delegate, write back to container. However, I do 
not know how to distinguish the case where the delegate modifies 
the bool or not, making the general case too expensive.

I preferred not touching it, and doing things one at a time.
----

Thank you all for the time you have already invested in this 
thread (as well as the others...)
Jul 11 2012
parent "monarch_dodra" <monarchdodra gmail.com> writes:
andralex wrote:
 We need to improve ranges instead of using opApply for such 
 trivial
 iteration patterns. Apologies that the thread didn't make this 
 clear.
 Will close this now - sorry.
Thanks for taking the time to process my pull request. I'd argue that we could keep the "opApply fix" until ranges are improved, but at the same time, keeping things (arguably) "broken" may motivate for improvement more :) That said, given the whole "return by value" thing for closed containers, and the fact that "front" can't be forwarded a operators, I'm curious to known just _HOW_ ranges *could* ever be improved to support the current scenario. I'll drop the issue here regardless, but I will admit that I don't see how any solution other than opApply could ever work...
Jul 16 2012