
digitalmars.D - assignment: left-to-right or right-to-left evaluation?

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Consider:

uint fun();
int gun();
...
int[] a = new int[5];
a[fun] = gun;

Which should be evaluated first, fun() or gun()? It's a rather arbitrary 
decision. C/C++ don't even define an order. Python chooses 
left-to-right, EXCEPT for assignment, which is right-hand side first. 
Lisp and C# choose consistent left-to-right. I don't like exceptions and 
I'd like everything to be left-to-right. However, this leads to some odd 
cases. Consider this example in TDPL:

import std.stdio, std.string;

void main() {
   uint[string] dic;
   foreach (line; stdin.byLine) {
     string[] words = split(strip(line));
     foreach (word; words) {
       if (word in dic) continue; // nothing to do
       uint newID = dic.length;
       dic[word] = newID;
       writeln(newID, '\t', word);
     }
   }
}

If we want to get rid of newID, we'd write:

       writeln(dic.length, '\t', word);
       dic[word] = dic.length;

by the Python rule, and

       writeln(dic.length, '\t', word);
       dic[word] = dic.length - 1;

by the C# rule.

What's best?


Andrei
May 09 2009
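
For anyone who wants to see what a particular compiler does with the first example, here is a minimal sketch. The `callLog` array, the `assignOnce` helper, and the return values 2 and 42 are assumptions added for illustration; they are not part of the original post. The stored value is the same under either order, and only the logged call order differs.

```d
import std.stdio;

string[] callLog;  // hypothetical helper: records the order the calls run in

uint fun() { callLog ~= "fun"; return 2; }
int  gun() { callLog ~= "gun"; return 42; }

int[] assignOnce()
{
    int[] a = new int[5];
    a[fun()] = gun();  // the statement under discussion
    return a;
}

void main()
{
    auto a = assignOnce();
    // Whatever order is printed here is the order this compiler chose.
    writeln("call order: ", callLog);
    writeln("a = ", a);
}
```

Because the index and the value do not depend on each other here, the final array is identical either way; the question in the post only matters once the two sides share state, as in the `dic.length` example.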
next sibling parent reply Frank Benoit <keinfarbton googlemail.com> writes:
Andrei Alexandrescu schrieb:
 Consider:
 
 uint fun();
 int gun();
 ....
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()? It's a rather arbitrary
 decision. C/C++ don't even define an order. Python chooses
 left-to-right, EXCEPT for assignment, which is right-hand side first.
 Lisp and C# choose consistent left-to-right. I don't like exceptions and
 I'd like everything to be left-to-right. However, this leads to some odd
 cases. Consider this example in TDPL:
 
 import std.stdio, std.string;
 
 void main() {
   uint[string] dic;
   foreach (line; stdin.byLine) {
     string[] words = split(strip(line));
     foreach (word; words) {
       if (word in dic) continue; // nothing to do
       uint newID = dic.length;
       dic[word] = newID;
       writeln(newID, '\t', word);
     }
   }
 }
 
 If we want to get rid of newID, we'd write:
 
       writeln(dic.length, '\t', word);
       dic[word] = dic.length;
 
 by the Python rule, and
 
       writeln(dic.length, '\t', word);
       dic[word] = dic.length - 1;
 
 by the C# rule.
 
 What's best?
 
 
 Andrei

From my POV, it would be nice if it were the same as in Java, because I am porting lots of it to D.
May 09 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Frank Benoit wrote:
 From my POV, it would be nice if it would be the same as in Java,
 because i am porting lots of it to D.

Good point. I searched for that one and found:

http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html

"The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right."

I also searched for the way Perl does it and got a tad disappointed:

http://www.nntp.perl.org/group/perl.perl5.porters/2003/09/msg82032.html

Essentially the order of evaluation in Perl is as arbitrary and as specific-case-driven as if nobody really sat down and thought any of it.

Andrei
May 09 2009
parent Frank Benoit <keinfarbton googlemail.com> writes:
Andrei Alexandrescu schrieb:
 Frank Benoit wrote:
 From my POV, it would be nice if it would be the same as in Java,
 because i am porting lots of it to D.

Good point. I searched for that one and found: http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html "The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right."

I think code relying on the order is bad code. And I think there is no "right" way. But...

+ it is good to have it defined
+ it is good if ported code will not break because of that difference

I see no other argument. So the question would be: from which language would you expect the most ported code? I think it will be C/C++ bindings for libs, and Java code. For bindings/declarations the evaluation order is not of relevance. So choose Java's scheme. :)
May 09 2009
prev sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Sat, May 9, 2009 at 1:52 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 I also searched for the way Perl does it and got a tad disappointed:

 http://www.nntp.perl.org/group/perl.perl5.porters/2003/09/msg82032.html

 Essentially the order of evaluation in Perl is as arbitrary and as
 specific-case-driven as if nobody really sat down and thought any of it.

You're surprised by that? ;)
May 09 2009
prev sibling next sibling parent Michiel Helvensteijn <nomail please.com> writes:
Andrei Alexandrescu wrote:

 Consider:
 
 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()? It's a rather arbitrary
 decision. C/C++ don't even define an order. Python chooses
 left-to-right, EXCEPT for assignment, which is right-hand side first.
 Lisp and C# choose consistent left-to-right. I don't like exceptions and
 I'd like everything to be left-to-right. However, this leads to some odd
 cases.

I find this a very interesting issue. I've just coauthored a paper describing three expression evaluation strategies. Their main advantage is that they preserve operator commutativity. In other words, the evaluation order of side-effects is independent from their textual order. No left-to-right or right-to-left.

One of the strategies (parallel evaluation) evaluates all subexpressions in the old program state. It maintains operator idempotence. An expression is only legal if no two of its subexpressions make incompatible changes to the state. This strategy requires some extra memory and runtime, at least in its most naive implementation. It does scale extremely well to multi-processor systems, however.

Another strategy (ordering by dependency) recognizes read/write and write/write hazards and orders evaluation accordingly. An expression would only be legal if its dependency graph has no cycles. This may be difficult to verify for more complex languages, which have aliasing and such.

The third strategy (order agnostic evaluation) basically evaluates all smallest side-effect subexpressions before all pure subexpressions (unless the pure one is a subexpression of a side-effect one). Ordering between side-effects is required to be semantically irrelevant. An expression that depends on any particular ordering of side-effects is illegal. This is non-trivial to prove at compile-time but always possible to test at run-time. And it leaves the compiler free to choose the most efficient evaluation order.

Our new programming language Mist will use order agnostic expression evaluation.

-- 
Michiel Helvensteijn
May 09 2009
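
The remark that order independence is "always possible to test at run-time" can be illustrated very loosely in D. The sketch below is not the algorithm from the paper and has nothing to do with how Mist works; `State`, `Effect`, and `orderIrrelevant` are names invented here. It simply applies two side effects to copies of a starting state in both orders and compares the results.

```d
import std.stdio;

struct State { int x; int y; }

// A side effect is modelled as a delegate that mutates the state.
alias Effect = void delegate(ref State);

// True when applying the two effects in either order, starting from
// `start`, leaves the state the same. This checks one starting state
// only; it is a test, not a proof.
bool orderIrrelevant(State start, Effect first, Effect second)
{
    State ab = start;
    State ba = start;
    first(ab);  second(ab);
    second(ba); first(ba);
    return ab == ba;
}

void main()
{
    Effect incX   = delegate(ref State s) { s.x += 1; };
    Effect incY   = delegate(ref State s) { s.y += 1; };
    Effect copyXY = delegate(ref State s) { s.y = s.x; };

    // Two writes to different fields commute.
    writeln(orderIrrelevant(State(0, 0), incX, incY));
    // A write to x and a read of x form a read/write hazard.
    writeln(orderIrrelevant(State(0, 0), incX, copyXY));
}
```

The second pair is the same shape of problem as `dic[word] = dic.length`: one side changes something the other side reads.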
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gu4bqu$bq$1 digitalmars.com...
 Consider:

 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;

 Which should be evaluated first, fun() or gun()? It's a rather arbitrary 
 decision. C/C++ don't even define an order. Python chooses left-to-right, 
 EXCEPT for assignment, which is right-hand side first. Lisp and C# choose 
 consistent left-to-right. I don't like exceptions and I'd like everything 
 to be left-to-right. However, this leads to some odd cases. Consider this 
 example in TDPL:

This may sound strange coming from me, but I think I actually like the Python way of doing it. I may be getting slightly ahead of myself by saying this, but I think the inconsistency lies more with the assignment statement itself than function evaluation order.

First of all:

lvalue = expression;

Anyone with any real experience in imperative programming looks at that and sees "expression (on the right) gets evaluated, and then the result gets placed into lvalue (on the left)". If both sides contain a function call, and the functions in the lvalue get evaluated first, then that would seem to contradict the intuitive understanding of the assignment's natural right-to-left order.

I might be venturing slightly offtopic here, but lately, I've been giving a bit of thought to matters of left/right directions. Consider the following:

fooA(); fooB(); fooC();

Flow of execution can be followed left-to-right. Which makes sense, because most of the more common (human) languages are left-to-right. But then there's:

lvalue = fooC(fooB(fooA(stuff)))

Now all of a sudden, it's all completely reversed! If you want to follow this statement's flow-of-execution step-by-step, the entire thing goes right-to-left: First stuff is evaluated, then fooA is called, then fooB is called, then fooC is called, then a value gets assigned to lvalue.

Of course, function chaining syntax has been gaining popularity, so in *some* cases you may be able to fix the ordering of the right-hand-side:

lvalue = stuff.fooA().fooB().fooC();

(Which, of course, also has the side-benefit of reducing parenthesis-hell). Now, at least the right-hand-side reads completely in a natural left-to-right order.

However, there are languages out there that flip the assignment:

auto num;
5 => num;

I've never actually liked that in the past. I literally grew up using ApplesoftBASIC and QBASIC, which use the more common "target_var = expression" syntax, so the few times I came across the above, it always felt awkward and uncomfortable. But lately, because of all of this, I've been starting to think "expr => target_var" does have its merits. Of course, the downside is one could argue that "lvalue = fooC(fooB(fooA(stuff)))" reads in order from "general overview" to "more specific", and with "stuff.fooA().fooB().fooC() => target_var;" you have to go all the way to the end to see the general idea of what's finally gotten accomplished.

The reason I bring all this up is because that (not that it's likely to actually happen in D) would allow the sensible Python-style of "functions in the expression get evaluated before functions on the assignment-side" *without* making any exceptions to the left-to-right rule (which, as I pointed out before, is really more of an exception with the basic assignment syntax).
May 09 2009
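
The nested and chained spellings in the post above can be compared directly in present-day D, which accepts the chained form for free functions through uniform function call syntax (a feature that postdates this thread). The bodies of `fooA`, `fooB`, and `fooC` below are placeholders made up for the sketch.

```d
import std.stdio;

// Placeholder bodies; only the call shape matters here.
int fooA(int x) { return x + 1; }
int fooB(int x) { return x * 2; }
int fooC(int x) { return x - 3; }

void main()
{
    int stuff = 5;

    // Reads inside-out: fooA runs first although it is written last.
    int nested = fooC(fooB(fooA(stuff)));

    // Reads left to right in the order the calls actually run.
    int chained = stuff.fooA().fooB().fooC();

    assert(nested == chained);
    writeln(nested, " ", chained);
}
```

Both spellings perform the same calls in the same order; only the reading direction changes, which is the point being made about the right-hand side.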
prev sibling next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Sat, 09 May 2009 11:43:09 -0500, Andrei Alexandrescu wrote:

 Consider:
 
 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()? It's a rather arbitrary 
 decision. C/C++ don't even define an order. Python chooses 
 left-to-right, EXCEPT for assignment, which is right-hand side first. 
 Lisp and C# choose consistent left-to-right. I don't like exceptions and 
 I'd like everything to be left-to-right. However, this leads to some odd 
 cases. Consider this example in TDPL:
 
 import std.stdio, std.string;
 
 void main() {
    uint[string] dic;
    foreach (line; stdin.byLine) {
      string[] words = split(strip(line));
      foreach (word; words) {
        if (word in dic) continue; // nothing to do
        uint newID = dic.length;
        dic[word] = newID;
        writeln(newID, '\t', word);
      }
    }
 }
 
 If we want to get rid of newID, we'd write:
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length;
 
 by the Python rule, and
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length - 1;
 
 by the C# rule.
 
 What's best?

I'm not sure about 'best', but I'd prefer the Python method. The example is similar to ...

   array = array ~ array.length;

in as much as the result of the assignment is that the array length changes, but here it is easier to see that the pre-assignment length is being used by the RHS.

In COBOL-like syntax ...

   move dic.length to dic[word].

it is also more obvious what the coder's intentions were.

In assembler-like syntax (which is what eventually gets run, of course) ...

   mov regA, dic.length
   mov dic[word], regA

It just seems counter-intuitive that the target expression's side-effects should influence the source expression.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
May 09 2009
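
The "load, then store" reading described above is what the original TDPL version with `newID` already spells out. A small sketch of that form follows; it sidesteps the question entirely because the length is read in its own statement. The `assignIDs` wrapper and the sample words are assumptions for illustration, and the `cast(uint)` is there because `dic.length` is a `size_t` on current compilers.

```d
import std.stdio;

// Gives each distinct word the next free ID, in order of first appearance.
uint[string] assignIDs(string[] words)
{
    uint[string] dic;
    foreach (word; words)
    {
        if (word in dic) continue;  // nothing to do
        // Read the length first, in its own statement...
        uint newID = cast(uint) dic.length;
        // ...then insert, so no evaluation-order rule is involved.
        dic[word] = newID;
    }
    return dic;
}

void main()
{
    auto dic = assignIDs(["a", "b", "a", "c"]);
    assert(dic["a"] == 0 && dic["b"] == 1 && dic["c"] == 2);
    writeln(dic.length);
}
```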
parent reply Georg Wrede <georg.wrede iki.fi> writes:
Steven Schveighoffer wrote:
 For example:
 
 mydic[x] = mydic[y] = mydic[z] = mydic.length;

I distinctly remember Walter discouraging chained assignments in the docs, already in the very early versions of D.
May 11 2009
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.utrtl8x7eav7ka steves.networkengines.com...
 On Mon, 11 May 2009 09:37:56 -0400, Georg Wrede <georg.wrede iki.fi> 
 wrote:

 Steven Schveighoffer wrote:
 For example:
  mydic[x] = mydic[y] = mydic[z] = mydic.length;

I distinctly remember Walter discouraging chained assignments in the docs, already in the very early versions of D.

Seriously? So the correct method is to do this:

auto tmp = mydic.length;
mydic[x] = tmp;
mydic[y] = tmp;
mydic[z] = tmp;

??? That sucks. We have to remember, there are reasons why we stopped having to use assembly :)

I was giving a little bit of thought to assignment chaining the other day. Unless someone can point out why I'm wrong, I think some of the functional-style stuff we've been getting into can make assignment chaining obsolete.

Hypothetical example:

[mydic[x], mydic[y], mydic[z]].fill(mydic.length);

I think something like that would be more clear than both the "tmp" and assignment chaining versions, and perhaps allow any language complexities that arise from the assignment chaining feature to be removed.
May 11 2009
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:gua0dm$121j$1 digitalmars.com...
 I was giving a little bit of thought to assignment chaining the other day. 
 Unless someone can point out why I'm wrong, I think some of the 
 functional-style stuff we've been getting into can make assignment 
 chaining obsolete.

 Hypothetical example:
 [mydic[x], mydic[y], mydic[z]].fill(mydic.length);

Or maybe something like:

[mydic[x], mydic[y], mydic[z]].each = mydic.length;

 I think something like that would be more clear than both the "tmp" and 
 assignment chaining versions, and perhaps allow any language complexities 
 that arise from the assignment chaining feature to be removed.

May 11 2009
prev sibling next sibling parent reply Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:
Nick Sabalausky wrote:

 I was giving a little bit of thought to assignment chaining the other day.
 Unless someone can point out why I'm wrong, I think some of the
 functional-style stuff we've been getting into can make assignment
 chaining obsolete.
 
 Hypothetical example:
 [mydic[x], mydic[y], mydic[z]].fill(mydic.length);
 
 I think something like that would be more clear than both the "tmp" and
 assignment chaining versions, and perhaps allow any language complexities
 that arise from the assignment chaining feature to be removed.

Seriously? [mydic[x], mydic[y], mydic[z]].fill(mydic.length);, I admit, is not bad, but it's certainly not easier on the eyes than mydic[x] = mydic[y] = mydic[z] = mydic.length;. And you need a good optimizing compiler to get the same speed. Plus, doesn't it suffer from the same problems as the assignment-chaining version? If x, y and/or z are not already in mydic, which mydic.length are you using there? -- Michiel Helvensteijn
May 11 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"Michiel Helvensteijn" <m.helvensteijn.remove gmail.com> wrote in message 
news:gua0ub$130b$1 digitalmars.com...
 Nick Sabalausky wrote:
 I was giving a little bit of thought to assignment chaining the other 
 day.
 Unless someone can point out why I'm wrong, I think some of the
 functional-style stuff we've been getting into can make assignment
 chaining obsolete.

 Hypothetical example:
 [mydic[x], mydic[y], mydic[z]].fill(mydic.length);

 I think something like that would be more clear than both the "tmp" and
 assignment chaining versions, and perhaps allow any language complexities
 that arise from the assignment chaining feature to be removed.

Seriously? [mydic[x], mydic[y], mydic[z]].fill(mydic.length);, I admit, is not bad, but it's certainly not easier on the eyes than mydic[x] = mydic[y] = mydic[z] = mydic.length;.

True, that's why I replied again and suggested something like:

[mydic[x], mydic[y], mydic[z]].each = mydic.length;

Although that might be more difficult to implement properly.

 And you need a good optimizing compiler to get the same speed.

True, although as our compilers progress that will hopefully become less and less of an issue. And in the meantime, if you really needed the speed, you could make a templated string mixin that does the "tmp" version:

// Assuming we actually need the extra speed in this case:
mixin(fill!([mydic[x], mydic[y], mydic[z]], mydic.length));
// (or something roughly like that)

 Plus, doesn't
 it suffer from the same problems as the assignment-chaining version? If x,
 y and/or z are not already in mydic, which mydic.length are you using
 there?

At least in the case of the ".fill()" version, the function-call syntax makes it clear that you're using the original "mydic.length". Other than that, I'm convinced that order-of-function-call for assignments should be rhs first, then lhs, regardless of what syntax is used to assign an rvalue to multiple lvalues.
May 11 2009
parent reply Rainer Deyke <rainerd eldwood.com> writes:
Nick Sabalausky wrote:
 True, that's why I replied again and suggested something like:
 
 [mydic[x], mydic[y], mydic[z]].each = mydic.length;

[[mydic[x], mydic[y], mydic[z]].each].each = mydic.length;
[[[mydic[x], mydic[y], mydic[z]].each].each].each = mydic.length;
[[[[mydic[x], mydic[y], mydic[z]].each].each].each].each = mydic.length;
...

What's wrong with assignment chaining? I'm amazed anybody would find chained assignment more confusing than unchained assignment.

-- 
Rainer Deyke - rainerd eldwood.com
May 11 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"Rainer Deyke" <rainerd eldwood.com> wrote in message 
news:guamfi$2b0d$1 digitalmars.com...
 Nick Sabalausky wrote:
 True, that's why I replied again and suggested something like:

 [mydic[x], mydic[y], mydic[z]].each = mydic.length;

 [[mydic[x], mydic[y], mydic[z]].each].each = mydic.length;
 [[[mydic[x], mydic[y], mydic[z]].each].each].each = mydic.length;
 [[[[mydic[x], mydic[y], mydic[z]].each].each].each].each = mydic.length;
 ...

I'm not sure what point you're trying to make here, but my idea is that "each" would basically be a write-only "extension property" (to borrow C# terminology) of array. So "each" wouldn't have a getter or a return value, and therefore trying to stick its nonexistent read/return value into an array literal (as you're doing above) would be an error.

 What's wrong with assignment chaining?  I'm amazed anybody would find
 chained assignment more confusing than unchained assignment.

I never said I had any problem with assignment chaining. I find them very straightforward (provided that their function-evaluation order is "rhs first, then lhs"). I'm just saying that if there's compelling reason to say "ok, assignments shouldn't be expressions", for the sake of simplifying certain aspects of the language, and would therefore be giving up assignment-chaining, then we wouldn't have to resort to temp-var and non-DRY methods to assign a single value to multiple targets. ...I suppose I was getting a little farther off-topic than I thought I was, maybe that's where the confusion came from.
May 11 2009
parent reply Rainer Deyke <rainerd eldwood.com> writes:
Nick Sabalausky wrote:
 "Rainer Deyke" <rainerd eldwood.com> wrote in message 
 news:guamfi$2b0d$1 digitalmars.com...
 [[mydic[x], mydic[y], mydic[z]].each].each = mydic.length;
 ...

I'm not sure what point you're trying to make here, but my idea is that "each" would basically be a write-only "extension property" (to borrow C# terminology) of array. So "each" wouldn't have a getter or a return value, and therefore trying to stick its nonexistent read/return value into an array literal (as you're doing above) would be an error.

I had assumed that there was some sort of magic going on that would allow this. If '[mydic[x], mydic[y], mydic[z]].each' is a write-only property of an array, then assigning to it will modify the array, but leave 'mydic' untouched. -- Rainer Deyke - rainerd eldwood.com
May 11 2009
parent "Nick Sabalausky" <a a.a> writes:
"Rainer Deyke" <rainerd eldwood.com> wrote in message 
news:guatj2$2pft$1 digitalmars.com...
 Nick Sabalausky wrote:
 "Rainer Deyke" <rainerd eldwood.com> wrote in message
 news:guamfi$2b0d$1 digitalmars.com...
 [[mydic[x], mydic[y], mydic[z]].each].each = mydic.length;
 ...

I'm not sure what point you're trying to make here, but my idea is that "each" would basically be a write-only "extension property" (to borrow C# terminology) of array. So "each" wouldn't have a getter or a return value, and therefore trying to stick its nonexistent read/return value into an array literal (as you're doing above) would be an error.

I had assumed that there was some sort of magic going on that would allow this. If '[mydic[x], mydic[y], mydic[z]].each' is a write-only property of an array, then assigning to it will modify the array, but leave 'mydic' untouched.

I wrote up a short app to try to think my idea through a bit better. This is the best I was able to get in D1:

-------------------------
import tango.io.Stdout;

void each(T, V)(T*[] array, V val)
{
    foreach(T* elem; array)
        *elem = val;
}

void main()
{
    int x, y, z;

    [&x, &y, &z].each(7);

    // Output: 7 7 7
    Stdout.formatln("{} {} {}", x, y, z);
}
-------------------------

As you can see, it's not quite what I was trying to get. Things that would probably need to be changed to get what I really want:

- Some sort of variable reference tuple (not a type tuple like we have now), so we could get rid of the pointer stuff, ie: Change "each([&x, &y, &z], 7);" to something like "each((x, y, z), 7);"

- A *real* extension method syntax and a *real* property syntax, like in C#, plus the ability to combine them, so then we could do:

-------------------------
import tango.io.Stdout;

// The "this" indicates it's an extension of T[]
// and therefore can be called with
// tuple.each instead of each(tuple)
T each(T, V)(this T() tuple)
{
    // set{}, so this is a hypothetical "extension property"
    // instead of an extension method.
    set(V val)
    {
        foreach(ref T elem; tuple)
            elem = val;
    }
}

void main()
{
    int x, y, z;

    // Notice that the "xxx.each()" and "each = yyy"
    // syntaxes no longer conflict with each other.
    (x, y, z).each = 7;

    // Output: 7 7 7
    Stdout.formatln("{} {} {}", x, y, z);
}
-------------------------
May 11 2009
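
For readers without Tango, the pointer-based version in the post above can be approximated with Phobos alone. This is only a sketch of the same idea: the name `assignEach` is hypothetical, chosen so it cannot be confused with `each` in `std.algorithm`, and the value is computed once before the call, which is the "source first" behaviour argued for earlier in the thread.

```d
import std.stdio;

// Stores `val` through every pointer in `targets`.
void assignEach(T, V)(T*[] targets, V val)
{
    foreach (p; targets)
        *p = val;
}

void main()
{
    int x, y, z;

    // The argument is evaluated once, then written to all three targets.
    [&x, &y, &z].assignEach(7);

    assert(x == 7 && y == 7 && z == 7);
    writefln("%s %s %s", x, y, z);
}
```

The pointer syntax is still there, so this does not get any closer to the hypothetical `(x, y, z).each = 7` form.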
prev sibling parent Derek Parnell <derek psych.ward> writes:
On Mon, 11 May 2009 16:03:39 -0400, Nick Sabalausky wrote:

 I was giving a little bit of thought to assignment chaining the other day. 
 Unless someone can point out why I'm wrong, I think some of the 
 functional-style stuff we've been getting into can make assignment chaining 
 obsolete.
 
 Hypothetical example:
 [mydic[x], mydic[y], mydic[z]].fill(mydic.length);
 
 I think something like that would be more clear than both the "tmp" and 
 assignment chaining versions, and perhaps allow any language complexities 
 that arise from the assignment chaining feature to be removed.

This is close to what we are going to do for the Euphoria language. Currently, Euphoria does not allow assignment chaining (an assignment is not an expression) so the proposed syntax is ...

   (mydic[x], mydic[y], mydic[z]) = length(mydic)

Plus that language evaluates the source expression before tackling the target expression, so your idea is not unheard of.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
May 11 2009
prev sibling parent reply Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:
Jarrett Billingsley wrote:

 It's
 incredibly rare that I have to assign the same value to multiple
 targets, and when I do, the little time I could save by not typing the
 second assignment is offset by the increased time it takes me to look
 for assignments to a variable when I come back to the code later.

It's not about time. It's about code duplication. -- Michiel Helvensteijn
May 11 2009
parent Christopher Wright <dhasenan gmail.com> writes:
Michiel Helvensteijn wrote:
 Jarrett Billingsley wrote:
 
 It's
 incredibly rare that I have to assign the same value to multiple
 targets, and when I do, the little time I could save by not typing the
 second assignment is offset by the increased time it takes me to look
 for assignments to a variable when I come back to the code later.

It's not about time. It's about code duplication.

Code duplication is only an issue when it can be a maintenance problem. Repeating the name of a temporary variable in five consecutive assignments is not a maintenance problem. A preference for or against assignment chaining is mainly stylistic. The exception is when you reference the same variables in the LHS as the RHS and modify them in one of the locations -- this gets confusing, even with a specified order. For that reason, I always use the ++ and -- operators as statements. Other people can use them as expressions and not get confused, but I always have to take a moment to think about which value will get returned. It only takes a moment, but I prefer not having to spend thought on such trivial matters when I'm reading code.
May 12 2009
prev sibling next sibling parent reply Georg Wrede <georg.wrede iki.fi> writes:
Andrei Alexandrescu wrote:
 Consider:
 
 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()? 

arra[i] = arrb[i++];
arra[i++] = arrb[i];

I'm not sure that such dependences are good code. By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.
May 11 2009
next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-11 05:49:01 -0400, Georg Wrede <georg.wrede iki.fi> said:

 Andrei Alexandrescu wrote:
 Consider:
 
 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()?

arra[i] = arrb[i++]; arra[i++] = arrb[i]; I'm not sure that such dependences are good code. By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.

Well, I agree with you that we shouldn't encourage this kind of code. But leaving it undefined (as in C) isn't a good idea because even if it discourages people from relying on it, it also makes any well tested code potentially buggy when switching compiler.

You could simply make it an error in the language to avoid that being written in the first place. But even then you can't catch all the cases statically. For instance, two different pointers or references can alias the same value, as in:

	int i;
	func(i, i);

	void func(ref int i, ref int j)
	{
		arra[i++] = arrb[j]; // how can the compiler issue an error for this?
	}

So even if you make it an error for the obvious cases, you still need to define the evaluation order for the ones the compiler can't catch.

And, by the way, I don't think we should make it an error even for the so-called obvious cases. Deciding what's obvious and what is not is going to complicate the rules more than necessary.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 11 2009
next sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Michel Fortin wrote:

           arra[i++] = arrb[j]; // how can the compiler issue an
           error for this? 

assert( &i != &j); -manfred
May 11 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Steven Schveighoffer wrote:
 On Mon, 11 May 2009 08:20:07 -0400, Manfred Nowak <svv1999 hotmail.com> 
 wrote:
 
 Michel Fortin wrote:

           arra[i++] = arrb[j]; // how can the compiler issue an
           error for this?

assert( &i != &j); -manfred

That is not a compiler error, it is an inserted runtime error.

Besides, it's just a particular case. Generally you can't tell modularly whether two expressions change the same variable. Andrei
May 11 2009
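
To make the aliasing exchange above concrete, here is a small sketch of the run-time check being discussed. The names `sameVariable` and `copyElement` and the array contents are invented for the example. As noted in the replies, the `assert` fires while the program runs (and only in builds that keep asserts), so it is not a compile-time diagnosis and it covers only this one pattern.

```d
import std.stdio;

// True when both ref parameters name the same variable.
bool sameVariable(ref int i, ref int j)
{
    return &i is &j;
}

void copyElement(int[] arra, const int[] arrb, ref int i, ref int j)
{
    // Run-time guard: rejects the aliased call instead of depending
    // on an evaluation order.
    assert(!sameVariable(i, j), "i and j alias the same variable");
    arra[i++] = arrb[j];
}

void main()
{
    int[] arra = [0, 0, 0];
    int[] arrb = [10, 20, 30];
    int i = 0, j = 1;

    copyElement(arra, arrb, i, j);  // distinct variables: no ambiguity
    writeln(arra, " ", i);

    writeln(sameVariable(i, i));
    // copyElement(arra, arrb, i, i) would trip the assert at run time.
}
```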
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Michel Fortin Wrote:

 On 2009-05-11 05:49:01 -0400, Georg Wrede <georg.wrede iki.fi> said:
 
 Andrei Alexandrescu wrote:
 Consider:
 
 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;
 
 Which should be evaluated first, fun() or gun()?

arra[i] = arrb[i++]; arra[i++] = arrb[i]; I'm not sure that such dependences are good code. By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.

Well, I agree with you that we shouldn't encourage this kind of code. But leaving it undefined (as in C) isn't a good idea because even if it discourages people from relying on it, it also makes any well tested code potentially buggy when switching compiler. You could simply make it an error in the language to avoid that being written in the first place. But even then you can't catch all the cases statically. For instance, two different pointers or references can alias the same value, as in: int i; func(i, i); void func(ref int i, ref int j) { arra[i++] = arrb[j]; // how can the compiler issue an error for this? }

D2 could have no ordering guarantees, and simply give an error when reordering could affect impure operations. Flow analysis could relax this rule a bit. Local primitives that have not escaped are immune to side effects affecting other variables.

 So even if you make it an error for the obvious cases, you still need 
 to define the evaluation order for the ones the compiler can't catch.
 
 And, by the way, I don't think we should make it an error even for the 
 so-called obvious cases. Deciding what's obvious and what is not is 
 going to complicate the rules more than necessary.
 
 
 -- 
 Michel Fortin
 michel.fortin michelf.com
 http://michelf.com/
 

May 11 2009
parent Georg Wrede <georg.wrede iki.fi> writes:
Jason House wrote:
 Michel Fortin Wrote:
 
 On 2009-05-11 05:49:01 -0400, Georg Wrede <georg.wrede iki.fi> said:

 Andrei Alexandrescu wrote:
 Consider:

 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;

 Which should be evaluated first, fun() or gun()?

arra[i++] = arrb[i]; I'm not sure that such dependences are good code. By stating a definite order between lvalue and rvalue, you would actually encourage this kind of code.

Well, I agree with you that we shouldn't encourage this kind of code. But leaving it undefined (as in C) isn't a good idea because even if it discourages people from relying on it, it also makes any well tested code potentially buggy when switching compiler.

D2 could have no ordering guarantees, and simply give an error when reordering could affect impure operations. Flow analysis could relax this rule a bit. Local primitives that have not escaped are immune to side effects affecting other variables.
 
 So even if you make it an error for the obvious cases, you still need 
 to define the evaluation order for the ones the compiler can't catch.


C didn't define it for good reason. It should not be used, period. Defining it in any way, or forbidding it, both mean that the compiler writer has to write lines of code to *try* to analyse it somehow. D is not a language for the infantile (even if I strongly advocate its use in language education), so we don't have to make this a bicycle with assist-wheels. Walter's time is better spent on things that give more reward and take less of his time. And Andrei's, too.
May 11 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 May 2009 08:20:07 -0400, Manfred Nowak <svv1999 hotmail.com>  
wrote:

 Michel Fortin wrote:

           arra[i++] = arrb[j]; // how can the compiler issue an
           error for this?

  assert( &i != &j);

-manfred

That is not a compiler error; it is an inserted runtime error.

-Steve
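
To make the compile-time versus run-time distinction concrete, here is a minimal D sketch. The helper name `copyOne` and its parameters are hypothetical, not from the thread. The call compiles whether or not `i` and `j` refer to the same variable; the aliasing check only fires when the program runs, and only in builds where asserts are enabled.

```d
// Hypothetical helper (not from the thread): copies one element from
// arrb to arra and advances i. The aliasing check is a run-time assert;
// the compiler accepts the call either way.
void copyOne(int[] arra, const int[] arrb, ref size_t i, ref size_t j)
{
    assert(&i != &j, "i and j must not alias");
    arra[i++] = arrb[j];
}

unittest
{
    auto arra = new int[3];
    int[] arrb = [10, 20, 30];
    size_t i = 0, j = 2;

    // i and j are distinct, so the statement has no order dependence.
    copyOne(arra, arrb, i, j);
    assert(arra == [30, 0, 0]);
    assert(i == 1 && j == 2);
}
```

Passing the same variable for both `i` and `j` would still compile, which is the point being made: the check is inserted code, not a diagnostic.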
May 11 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Georg Wrede wrote:
 Andrei Alexandrescu wrote:
 Consider:

 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;

 Which should be evaluated first, fun() or gun()? 

  arra[i] = arrb[i++];
  arra[i++] = arrb[i];

I'm not sure that such dependences are good code. By stating a definite order
between lvalue and rvalue, you would actually encourage this kind of code.

By not stating it, I introduce a gratuitous nonportability.

Andrei
May 11 2009
parent reply Georg Wrede <georg.wrede iki.fi> writes:
Andrei Alexandrescu wrote:
 Georg Wrede wrote:
 Andrei Alexandrescu wrote:
 Consider:

 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;

 Which should be evaluated first, fun() or gun()? 

  arra[i] = arrb[i++];
  arra[i++] = arrb[i];

I'm not sure that such dependences are good code. By stating a definite order
between lvalue and rvalue, you would actually encourage this kind of code.

By not stating it, I introduce a gratuitous nonportability.

If the programmer has introduced dependencies on the evaluation order, yes.
But if he hasn't, then it will not introduce anything.

With

  a[fun] = gun;

a rewrite

  auto f = a[fun];
  a[f] = gun;

makes it explicit how the programmer wants it done. It also removes any
uncertainty (and need to remember an arbitrary rule) for other people.

If you'd really want things easy for Walter, unambiguous, and clear for
the reader, then you'd advocate forbidding expressions in lvalues.
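
A minimal D sketch of the temporaries idea follows. It assumes the rewrite is meant to hoist the index computation itself (that is, `auto f = fun();`), which is a guess at the intent. The names `calls`, `indexFirst` and `valueFirst` are hypothetical, and `fun` returns `size_t` here for convenience. Once both calls sit in their own statements, the order is fixed by the statement sequence and no evaluation-order rule for assignment is involved.

```d
int[] calls;   // records the order in which fun and gun actually ran

size_t fun() { calls ~= 1; return 2; }
int gun()    { calls ~= 2; return 42; }

// Index first, then value: the statements spell out the order.
int[] indexFirst()
{
    auto a = new int[5];
    auto f = fun();
    auto g = gun();
    a[f] = g;
    return a;
}

// Value first, then index.
int[] valueFirst()
{
    auto a = new int[5];
    auto g = gun();
    auto f = fun();
    a[f] = g;
    return a;
}

unittest
{
    calls = null;
    assert(indexFirst() == [0, 0, 42, 0, 0]);
    assert(calls == [1, 2]);

    calls = null;
    assert(valueFirst() == [0, 0, 42, 0, 0]);
    assert(calls == [2, 1]);
}
```

Both versions store the same value in the same slot; only the order of the side effects differs, and it is visible in the source.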
May 11 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Georg Wrede wrote:
 If the programmer has introduced dependencies on the evaluation order, 
 yes. But if he hasn't, then it will not introduce anything.

If violations could be checked such that invalid code is rejected, your solution would work.
 With
 
   a[fun] = gun;
 
 a rewrite
 
   auto f = a[fun];
   a[f] = gun;
 
 makes it explicit how the programmer wants it done. It also removes any 
 uncertainty (and need to remember an arbitrary rule) for other people.
 
 If you'd really want things easy for Walter, unambiguous, and clear for 
 the reader, then you'd advocate forbidding expressions in lvalues.

I think that would be too restrictive. a[b] is already an expression.

The solution is simple: define an order of evaluation such that even bad
code behaves consistently.

Andrei
May 11 2009
parent Georg Wrede <georg.wrede iki.fi> writes:
Andrei Alexandrescu wrote:
 Georg Wrede wrote:
 If the programmer has introduced dependencies on the evaluation order, 
 yes. But if he hasn't, then it will not introduce anything.

If violations could be checked such that invalid code is rejected, your solution would work.
 With

   a[fun] = gun;

 a rewrite

   auto f = a[fun];
   a[f] = gun;

 makes it explicit how the programmer wants it done. It also removes 
 any uncertainty (and need to remember an arbitrary rule) for other 
 people.

 If you'd really want things easy for Walter, unambiguous, and clear 
 for the reader, then you'd advocate forbidding expressions in lvalues.

I think that would be too restrictive. a[b] is already an expression.

I guessed you'd say that. But I thought it'd be condescending to explain that the compiler should notice a leaf expression.
 The solution is simple: define an order of evaluation such that even bad 
 code behaves consistently.

Then pick lexical order. For consistency, that's what statement.html has all over the place anyway.
May 11 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 May 2009 19:15:59 -0400, Derek Parnell <derek psych.ward> wrote:

 On Sat, 09 May 2009 11:43:09 -0500, Andrei Alexandrescu wrote:

 Consider:

 uint fun();
 int gun();
 ...
 int[] a = new int[5];
 a[fun] = gun;

 Which should be evaluated first, fun() or gun()? It's a rather arbitrary
 decision. C/C++ don't even define an order. Python chooses
 left-to-right, EXCEPT for assignment, which is right-hand side first.
 Lisp and C# choose consistent left-to-right. I don't like exceptions and
 I'd like everything to be left-to-right. However, this leads to some odd
 cases. Consider this example in TDPL:

 import std.stdio, std.string;

 void main() {
    uint[string] dic;
    foreach (line; stdin.byLine) {
      string[] words = split(strip(line));
      foreach (word; words) {
        if (word in dic) continue; // nothing to do
        uint newID = dic.length;
        dic[word] = newID;
        writeln(newID, '\t', word);
      }
    }
 }

 If we want to get rid of newID, we'd write:

        writeln(dic.length, '\t', word);
        dic[word] = dic.length;

 by the Python rule, and

        writeln(dic.length, '\t', word);
        dic[word] = dic.length - 1;

 by the C# rule.

 What's best?

I'm sure about 'best', but I'd prefer the Python method.

Think you meant 'not sure' :)
 The example is similar to ...

     array = array ~ array.length;

 in as much as the result of the assignment is that the array length
 changes, but here it is easier to see that the pre-assignment length is
 being used by the RHS.

 In COBOL-like syntax ...

    move dic.length to dic[word].

 it is also more obvious what the coder's intentions were.

 In assembler-like syntax (which is what eventually gets run, of course)  
 ...

    mov regA, dic.length
    mov dic[word], regA

 It just seems counter-intuitive that the target expression's side-effects
 should influence the source expression.

This reasoning makes the most sense, but let's leave COBOL out of it :)

I vote for the Python method too.  It's how my brain sees the expression.

Also consider it like this:

uint len;

mydic[x] = len = mydic.length;

Now, it's even more obvious that len = mydic.length should be immune to the
effects of mydic[x].

Longer chained assignment expressions seem like they would make the problem
even harder to understand if it's all evaluated left to right.  You may even
make code more bloated because of it.

For example:

mydic[x] = mydic[y] = mydic[z] = mydic.length;

if evaluating right to left, this looks like:

1. calculate mydic.length, store it in register A.
2. lookup mydic[z], if it doesn't exist, add it.  Store register A to it.
3. lookup mydic[y], if it doesn't exist, add it.  Store register A to it.
4. ditto for mydic[x]

If evaluating left to right, this looks like:

1. lookup mydic[x], if it doesn't exist, add it.  Store a reference to
it on the stack.
2. lookup mydic[y], if it doesn't exist, add it.  Store a reference to
it on the stack.
3. lookup mydic[z], if it doesn't exist, add it.  Store the reference to
it in register B.
4. calculate mydic.length, store it in register A.  Store the result in
the reference pointed to by register B.
5. pop register B from the stack, store register A to the value it
references.
6. Repeat step 5.

Two extra steps, and I have to use a stack.  Maybe 3 chained assignments
would be easy to store without a stack, but try 10 chained assignments.

I'd think the compiler code to evaluate right to left would be simpler
also, because you can reduce the expression at every assignment.

-Steve
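
As a sketch of the outcome the right-to-left reading is after, the length can be read once into a temporary before any key is inserted, which gives the same result under either evaluation rule. The helper name `addThree` is hypothetical, and `size_t` is used for the value type (rather than `uint`) so that `mydic.length` fits without a cast.

```d
// Hypothetical helper: gives three keys the same ID, namely the number
// of entries the table held before any of them was added.
void addThree(ref size_t[string] mydic, string x, string y, string z)
{
    auto tmp = mydic.length;   // read once, before any insertion
    mydic[x] = tmp;
    mydic[y] = tmp;
    mydic[z] = tmp;
}

unittest
{
    size_t[string] mydic;
    mydic["seed"] = 7;

    addThree(mydic, "x", "y", "z");

    // All three keys saw the pre-insertion length of 1.
    assert(mydic["x"] == 1 && mydic["y"] == 1 && mydic["z"] == 1);
    assert(mydic.length == 4);
}
```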
May 11 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 May 2009 07:34:38 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 For example:

 mydic[x] = mydic[y] = mydic[z] = mydic.length;

 if evaluating right to left, this looks like:

 1. calculate mydic.length, store it in register A.
 2. lookup mydic[z], if it doesn't exist, add it.  Store register A to it.
 3. lookup mydic[y], if it doesn't exist, add it.  Store register A to it.
 4. ditto for mydic[x]

 If evaluating left to right, this looks like:

 1. lookup mydic[x], if it doesn't exist, add it.  Store a reference to  
 it on the stack.
 2. lookup mydic[y], if it doesn't exist, add it.  Store a reference to  
 it on the stack.
 3. lookup mydic[z], if it doesn't exist, add it.  Store the reference to  
 it in register B.
 4. calculate mydic.length, store it in register A.  Store the result in  
 the reference pointed to by register B.
 5. pop register B from the stack, store register A to the value it  
 references.
 6. Repeat step 5.

 Two extra steps, and I have to use a stack.  Maybe 3 chained assignments  
 would be easy to store without a stack, but try 10 chained assignments.

 I'd think the compiler code to evaluate right to left would be simpler  
 also, because you can reduce the expression at every assignment.

BTW, I'm curious to know how Java does this...

-Steve
May 11 2009
prev sibling next sibling parent Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:
Consider that mathematically speaking, an array is a function. And an
assignment to an array element actually changes the function.

A[i] = E;

is actually the same as

A = A[E/i];,

where the right-hand side reads: "A where i yields E" (notation not to be
confused with division). It is formally defined:

A[E/i][j] == E    (if i == j)
             A[j] (if i != j).

Of course, there are no side-effects in mathematics, but I believe it's
beneficial to try to keep as many well-known mathematical identities (like
that one) valid in the face of chaos.

So your first example would then be equivalent with

a = a[gun/fun];,

which still leaves the question of side-effect evaluation order. The second
example would read:

dic = dic[dic.length/word];,

which would suggest using the old dic.length.
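
The identity above can be sketched in D as a non-mutating update. The function name `updated` is hypothetical and stands in for the notation A[E/i]. Because the new value is passed as an ordinary function argument, it is computed before the function body runs, so the old length is the one that gets stored.

```d
// Hypothetical helper mirroring the notation A[E/i]: returns a copy of
// `a` in which `key` yields `value`. The argument `a` is not modified.
V[K] updated(K, V)(V[K] a, K key, V value)
{
    V[K] result;
    foreach (k, v; a)
        result[k] = v;
    result[key] = value;
    return result;
}

unittest
{
    size_t[string] dic;
    dic["a"] = 0;
    dic["b"] = 1;

    // dic.length is evaluated (as 2) before updated() is entered,
    // so nothing has been inserted yet when it is read.
    dic = updated(dic, "c", dic.length);

    assert(dic["c"] == 2);
    assert(dic.length == 3);
}
```

Copying the whole table on every update is only for illustration; the point is which length the right-hand side observes.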

-- 
Michiel Helvensteijn
May 11 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 May 2009 09:37:56 -0400, Georg Wrede <georg.wrede iki.fi> wrote:

 Steven Schveighoffer wrote:
 For example:
  mydic[x] = mydic[y] = mydic[z] = mydic.length;

I distinctly remember Walter discouraging chained assignments in the docs, already in the very early versions of D.

Seriously?  So the correct method is to do this:

auto tmp = mydic.length;
mydic[x] = tmp;
mydic[y] = tmp;
mydic[z] = tmp;

???

That sucks.  We have to remember, there are reasons why we stopped having to
use assembly :)

-Steve
May 11 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, May 11, 2009 at 11:07 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

  mydic[x] = mydic[y] = mydic[z] = mydic.length;

 auto tmp = mydic.length;
 mydic[x] = tmp;
 mydic[y] = tmp;
 mydic[z] = tmp;

 ???

 That sucks.  We have to remember, there are reasons why we stopped having to
 use assembly :)

Funny, I vastly prefer the latter to the former. Having more than one thing happen on one line is very difficult to read after having written it, for me.
May 11 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 May 2009 11:26:36 -0400, Jarrett Billingsley  
<jarrett.billingsley gmail.com> wrote:

 On Mon, May 11, 2009 at 11:07 AM, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

  mydic[x] = mydic[y] = mydic[z] = mydic.length;



 auto tmp = mydic.length;
 mydic[x] = tmp;
 mydic[y] = tmp;
 mydic[z] = tmp;

 ???

 That sucks.  We have to remember, there are reasons why we stopped  
 having to
 use assembly :)

Funny, I vastly prefer the latter to the former. Having more than one thing happen on one line is very difficult to read after having written it, for me.

So I take it if you have many function calls, or chained function calls, you
split them out into separate lines? :P

fun(gun(123)).xyz() =>

auto tmp1 = gun(123);
auto tmp2 = fun(tmp1);
tmp2.xyz();

???

I look at chained assignment as no different.  Even if order of operations
was defined as left to right, I'd still prefer using this:

auto tmp = mydic.length;
mydic[x] = mydic[y] = mydic[z] = tmp;

I use assignment chaining all the time in C# Forms when I'm
disabling/enabling a set of controls.

e.g.:

checkBox1.Enabled = textField1.Enabled = textField2.Enabled =
textField3.Enabled = true;

It seems pretty straightforward to me...

-Steve
May 11 2009
prev sibling next sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Mon, May 11, 2009 at 11:34 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

 Funny, I vastly prefer the latter to the former.  Having more than one
 thing happen on one line is very difficult to read after having
 written it, for me.

 So I take it if you have many function calls, or chained function calls, you
 split them out into separate lines? :P

 fun(gun(123)).xyz() =>

 auto tmp1 = gun(123);
 auto tmp2 = fun(tmp1);
 tmp2.xyz();

 ???

 I look at chained assignment as no different.

No. Nested function calls are incredibly common and do not (usually!) have side effects; it's not very different from other algebraic expressions. Chained method calls are still fairly common. It's incredibly rare that I have to assign the same value to multiple targets, and when I do, the little time I could save by not typing the second assignment is offset by the increased time it takes me to look for assignments to a variable when I come back to the code later. It's much more of an argument of my brain not being trained to parse it due to its low incidence than anything else :P
May 11 2009
prev sibling next sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Sat, 09 May 2009 11:43:09 -0500, Andrei Alexandrescu wrote:

 If we want to get rid of newID, we'd write:
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length;
 
 by the Python rule, and
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length - 1;
 
 by the C# rule.
 
 What's best?
 
 
 Andrei

Looking at it, I don't see dic[word] as increasing the length. Nothing has been added yet so why would it change? Ok, I realize a new location in memory has been reserved for storing a value, but at a glance it doesn't appear as though it is happening. You would also leave off the -1 if 'word' was already in there.
May 12 2009
prev sibling parent Sergey Gromov <snake.scaly gmail.com> writes:
Sat, 09 May 2009 11:43:09 -0500, Andrei Alexandrescu wrote:

 If we want to get rid of newID, we'd write:
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length;
 
 by the Python rule, and
 
        writeln(dic.length, '\t', word);
        dic[word] = dic.length - 1;
 
 by the C# rule.
 
 What's best?

If you use the Python rule you can rewrite

  dic[word] = dic.length;

as

  dic.opIndexAssign(dic.length, word);

By the C# rule you cannot do without opIndexForAssignment sort of thing.
It's not a matter of best/worse IMO, it's a matter of feasible/not feasible.
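
A minimal D sketch of that rewrite, using a hypothetical wrapper type `Dict` that exists only to make the lowering visible. D's operator overloading passes the assigned value first and the index second. Since the insertion happens inside the method body, every argument has already been evaluated by the time the table grows, so the stored value is the pre-insertion length.

```d
// Hypothetical wrapper type, used only to show the rewrite.
struct Dict
{
    private size_t[string] data;

    @property size_t length() const { return data.length; }

    // D hands over the assigned value first, then the index.
    void opIndexAssign(size_t value, string key) { data[key] = value; }

    size_t opIndex(string key) const { return data[key]; }
}

unittest
{
    Dict dic;

    dic.opIndexAssign(dic.length, "apple");  // length is 0 at this point
    dic["pear"] = dic.length;                // same call, operator syntax

    assert(dic["apple"] == 0);
    assert(dic["pear"] == 1);
    assert(dic.length == 2);
}
```

For a user-defined type like this one, the result does not depend on which operand is evaluated first, because neither operand inserts anything. The question in the thread concerns built-in associative arrays, where the lookup on the left can itself create the entry.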
Jun 21 2009