www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - foreach by value iteration

reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
I just got bitten by this again. It usually looks something like 
this:

-----------------
struct Vector
{
     ...
     void normalize() { this /= length; }
}

Vector[] vs;
...
foreach (v; vs)
     v.normalize();
-----------------

The bug here is that the loop does nothing because it is 
normalizing a temporary copy of the vector, rather than the 
vectors in the array.

Am I the only person that gets caught by this trap?

Perhaps the loop variable should be implicitly const so that this 
type of thing won't happen?
Feb 26 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 26, 2012 09:01:23 Peter Alexander wrote:
 I just got bitten by this again. It usually looks something like
 this:
 
 -----------------
 struct Vector
 {
      ...
      void normalize() { this /= length; }
 }
 
 Vector[] vs;
 ...
 foreach (v; vs)
      v.normalize();
 -----------------
 
 The bug here is that the loop does nothing because it is
 normalizing a temporary copy of the vector, rather than the
 vectors in the array.
 
 Am I the only person that gets caught by this trap?
 
 Perhaps the loop variable should be implicitly const so that this
 type of thing won't happen?

That would be annoying. Sometimes, you _want_ to be able to alter the variable, even if it _is_ a copy. And if it's a reference type, then you don't have the problem at all, and making it const would be completely restrictive. I really don't know how we could make it easier for you to avoid this error without making it impossible to other, useful things. But your code works if you use ref. You just have to remember to do that. - Jonathan M Davis
Feb 26 2012
prev sibling next sibling parent "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Peter Alexander" <peter.alexander.au gmail.com> wrote in message 
news:awuvuwxwgjezcdrfswyn forum.dlang.org...

 Perhaps the loop variable should be implicitly const so that this type of 
 thing won't happen?

Making it const would break heaps of valid use cases. It should probably be an rvalue, and so should the index. I think there's a discussion about this burried somewhere in bugzilla.
Feb 26 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 26, 2012 20:57:10 Daniel Murphy wrote:
 "Peter Alexander" <peter.alexander.au gmail.com> wrote in message
 news:awuvuwxwgjezcdrfswyn forum.dlang.org...
 
 Perhaps the loop variable should be implicitly const so that this type of
 thing won't happen?

Making it const would break heaps of valid use cases. It should probably be an rvalue, and so should the index. I think there's a discussion about this burried somewhere in bugzilla.

The index bit is highly debatable, as the discussion on that shows (personally, I tend to favor leaving it as-is, because it's useful that way), but regardless making either an rvalue wouldn't make any sense, because they have to be variables that you can operate on in the loop. Doing anything else would seriously break the semantics of foreach. The question is simply whether they should be actual copies so that you don't end up with them altering anything outside of the loop unless you mark them as ref or if the type itself is a reference type. The element is pretty much its own copy regardless, unless you mark it as ref, whereas the situation with the index is more complicated (as the discussion on that shows). - Jonathan M Davis
Feb 26 2012
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 Am I the only person that gets caught by this trap?

I have being hit by this bug some times. I think this is a common enough bug in D code. But it's not easy to invent a way to solve it. ---------------- Jonathan M Davis:
 I really don't know how we could make it easier for you to avoid this error
 without making it impossible to other, useful things. But your code works if
 you use ref. You just have to remember to do that.

To "just remember" to avoid the bug is an option, but programmers are not perfect. One possible solution is to generate a warning when a mutable struct copy is done. But I think this causes too many false positives (false alarms). In theory the compiler may avoid showing the warning if the copied mutable struct is _clearly_ not modified inside the loop body. This reduces the number of false positives, but I think not enough. Another possible solution is to require something in the foreach, like "ref" or "copy" (or "dup" instead of "copy"). Here I think it's clear that 'v' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is not mutable): struct Vector { void normalize() { this /= length; } } Vector[] vs; foreach (copy v; vs) v.normalize(); A mix of the two ideas is to generate the warning unless a "copy" or "ref" is present when there is a mutable copy. So "copy" is used just to silence the warning. Or maybe just an inlined comment /*copy*/ is needed to silence the warning/error: foreach (/*copy*/ v; vs) v.normalize(); Or a different foreach: foreach_copy (v; vs) v.normalize(); I think none of the ideas I've shown is good enough. Bye, bearophile
Feb 26 2012
prev sibling next sibling parent James Miller <james aatch.net> writes:
On 27 February 2012 10:31, bearophile <bearophileHUGS lycos.com> wrote:
 Peter Alexander:

 Am I the only person that gets caught by this trap?

I have being hit by this bug some times. I think this is a common enough =

 ----------------

 Jonathan M Davis:

 I really don't know how we could make it easier for you to avoid this er=


 without making it impossible to other, useful things. But your code work=


 you use ref. You just have to remember to do that.

To "just remember" to avoid the bug is an option, but programmers are not=

 One possible solution is to generate a warning when a mutable struct copy=

In theory the compiler may avoid showing the warning if the copied mutable = struct is _clearly_ not modified inside the loop body. This reduces the num= ber of false positives, but I think not enough.
 Another possible solution is to require something in the foreach, like "r=

' is a copy of the struct of 'vs' (maybe "copy" is not required if 'v' is n= ot mutable):
 struct Vector {
 =C2=A0 =C2=A0 void normalize() { this /=3D length; }
 }
 Vector[] vs;
 foreach (copy v; vs)
 =C2=A0 =C2=A0 v.normalize();


 A mix of the two ideas is to generate the warning unless a "copy" or "ref=

e the warning.
 Or maybe just an inlined comment /*copy*/ is needed to silence the warnin=

 foreach (/*copy*/ v; vs)
 =C2=A0 =C2=A0 v.normalize();

 Or a different foreach:

 foreach_copy (v; vs)
 =C2=A0 =C2=A0 v.normalize();

 I think none of the ideas I've shown is good enough.

 Bye,
 bearophile

I'm with Jonathon, using `ref` is the best way, it is ridiculous to think that the compiler should be able to figure out what's going on in every case. After getting hit by it a couple times, then you'll remember to use ref if you want to alter a copy. By the way, its the same in other languages, you need to use "reference" syntax to alter values in a loop. This is one case where its not /immediately/ obvious what the problem is, "fixing" it would inevitably cause other code to because "unituitive" so its pretty much a lose-lose scenario. Even in D, we are going to have code that compiles, runs, and doesn't do what you expect for some small reason. At least in this case its the simple matter of adding `ref` to the loop. -- James Miller
Feb 26 2012
prev sibling next sibling parent "Daniel Murphy" <yebblies gmail.com> writes:
On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis 
wrote:
 The index bit is highly debatable, as the discussion on that 
 shows
 (personally, I tend to favor leaving it as-is, because it's 
 useful that way),
 but regardless making either an rvalue wouldn't make any sense, 
 because they
 have to be variables that you can operate on in the loop. Doing 
 anything else
 would seriously break the semantics of foreach. The question is 
 simply whether
 they should be actual copies so that you don't end up with them 
 altering
 anything outside of the loop unless you mark them as ref or if 
 the type itself
 is a reference type. The element is pretty much its own copy 
 regardless,
 unless you mark it as ref, whereas the situation with the index 
 is more
 complicated (as the discussion on that shows).

 - Jonathan M Davis

Why does making either an rvalue not make any sense? foreach(<modifiers> var; agg) var = 7; Then if var is not ref, the compiler marks it as an rvalue and gives an error if you try to modify or take the address of it. It's pretty unlikely to happen, but it fixes the problem of accidentally modifying a foreach index or variable. Just because it's in a variable doesn't mean the compiler can't treat it as an rvalue.
Feb 26 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, February 27, 2012 05:51:37 Daniel Murphy wrote:
 On Sunday, 26 February 2012 at 10:25:31 UTC, Jonathan M Davis
 
 wrote:
 The index bit is highly debatable, as the discussion on that
 shows
 (personally, I tend to favor leaving it as-is, because it's
 useful that way),
 but regardless making either an rvalue wouldn't make any sense,
 because they
 have to be variables that you can operate on in the loop. Doing
 anything else
 would seriously break the semantics of foreach. The question is
 simply whether
 they should be actual copies so that you don't end up with them
 altering
 anything outside of the loop unless you mark them as ref or if
 the type itself
 is a reference type. The element is pretty much its own copy
 regardless,
 unless you mark it as ref, whereas the situation with the index
 is more
 complicated (as the discussion on that shows).
 
 - Jonathan M Davis

Why does making either an rvalue not make any sense? foreach(<modifiers> var; agg) var = 7; Then if var is not ref, the compiler marks it as an rvalue and gives an error if you try to modify or take the address of it. It's pretty unlikely to happen, but it fixes the problem of accidentally modifying a foreach index or variable. Just because it's in a variable doesn't mean the compiler can't treat it as an rvalue.

That would be _completely_ inconsistent with the rest of the language. Variables are _always_ lvalues. The closest that you can get to them being rvalues is if they're const or immutable, but even then, they're still lvalues, and you can take their address. It's just that the type system protects them from being altered. But regardless, being unable to alter a foreach loop variable would be like making it so that you can no longer alter function parameters. Sure, it might fix some bugs, but it would be horribly restrictive a lot of the time. - Jonathan M Davis
Feb 26 2012
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
 I really don't know how we could make it easier for you to avoid this  
 error
 without making it impossible to other, useful things. But your code  
 works if
 you use ref. You just have to remember to do that.

Feb 27 2012
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Sunday, 26 February 2012 at 23:20:24 UTC, James Miller wrote:
 I'm with Jonathon, using `ref` is the best way, it is 
 ridiculous to
 think that the compiler should be able to figure out what's 
 going on
 in every case. After getting hit by it a couple times, then 
 you'll
 remember to use ref if you want to alter a copy.

There's no need for the compiler to figure anything out. I was curious if there were any syntactic ways of solving this problem. Changing the existing syntax is not really an option, but brainstorming ideas can't hurt. I agree that I can "just remember", but that's not a good way to go about creating robust software. People make mistakes. D already has lots of errors to help prevent against these kinds of common mistakes and making the obvious thing correct and safe is one of D's design goals.
 By the way, its the same in other languages, you need to use
 "reference" syntax to alter values in a loop.

C# isn't the same. If you iterate over structs you cannot modify the iteration variable. You have to explicitly copy.
Feb 27 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Peter Alexander:

 C# isn't the same. If you iterate over structs you cannot modify 
 the iteration variable. You have to explicitly copy.

C# developers are old wise foxes, again :o) My respect for them grows some more every year. Bye, bearophile
Feb 27 2012
prev sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 28 February 2012 at 01:56:48 UTC, bearophile wrote:
 Peter Alexander:

 C# isn't the same. If you iterate over structs you cannot 
 modify the iteration variable. You have to explicitly copy.

C# developers are old wise foxes, again :o) My respect for them grows some more every year. Bye, bearophile

I agree. They really know how to create a practical language that isn't held back by dogmatic adherence to some underlying principle. When coding in C#, it's amazing how everything Just Works, and does what is expected. There are very few gotchas in the languages, at least any that you'll encounter in day-to-day usage.
Feb 28 2012