digitalmars.D - Operator overloading -- lets collect some use cases

Don <nospam nospam.com> writes:
There's been some interesting discussion about operator overloading over 
the past six months, but to take the next step, I think we need to 
ground it in reality. What are the use cases?

I think that D's existing opCmp() takes care of the plethora of trivial 
cases where <, >= etc are overloaded. It's the cases where the 
arithmetic and logical operations are overloaded that are particularly 
interesting to me.

The following mathematical cases immediately spring to mind:
* complex numbers
* quaternions (interesting since * is non-commutative; for the basis units, i*j = -j*i)
* vectors
* matrices
* tensors
* bignum operations (including bigint, bigfloat, ...)
I think that all of those are easily defensible.
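
To make the quaternion case concrete, here is a minimal sketch. It is written in present-day D (opBinary templates) rather than with the opMul-style method names of the D discussed in this thread, and the Quat type and its field names are hypothetical, invented only for illustration. It shows why operand order matters: i*j gives k, but j*i gives -k.

```d
/// Hypothetical minimal quaternion: w + x*i + y*j + z*k.
struct Quat
{
    double w = 0, x = 0, y = 0, z = 0;

    /// Hamilton product. The result depends on operand order.
    Quat opBinary(string op : "*")(Quat r) const
    {
        return Quat(
            w * r.w - x * r.x - y * r.y - z * r.z,
            w * r.x + x * r.w + y * r.z - z * r.y,
            w * r.y - x * r.z + y * r.w + z * r.x,
            w * r.z + x * r.y - y * r.x + z * r.w);
    }
}

void main()
{
    auto i = Quat(0, 1, 0, 0);
    auto j = Quat(0, 0, 1, 0);
    auto k = Quat(0, 0, 0, 1);

    assert(i * j == k);                 // i*j = k
    assert(j * i == Quat(0, 0, 0, -1)); // j*i = -k
    assert(i * j != j * i);             // so * does not commute
}
```

With named methods the same check would read i.mul(j) versus j.mul(i), which hides the fact that this is ordinary (if non-commutative) multiplication.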

But I know of very few reasonable non-mathematical uses.
In C++, I've seen them used for iostreams, regexps, and some stuff that 
is quite frankly bizarre.

So, please post any use cases which you consider convincing.
Dec 28 2008
Weed <resume755 mail.ru> writes:
Don wrote:
> In C++, I've seen them used for iostreams,

It looks strange and confuses beginners, it seems to me.
Dec 28 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Dimensional analysis, e.g. mass * distance / time / time yields gorce (bonus points for remembering where the typo comes from). Andrei
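
A minimal sketch of how that could look, assuming present-day D (opBinary templates). The Quantity template, its exponent parameters and the aliases below are hypothetical names chosen for illustration, not an existing library. Multiplication and division adjust the dimension exponents at compile time, so a mismatched expression fails to compile.

```d
/// Hypothetical quantity tagged with exponents of mass, length and time.
struct Quantity(int M, int L, int T)
{
    double value;

    /// Multiplying quantities adds the dimension exponents.
    auto opBinary(string op : "*", int M2, int L2, int T2)
                 (Quantity!(M2, L2, T2) rhs) const
    {
        return Quantity!(M + M2, L + L2, T + T2)(value * rhs.value);
    }

    /// Dividing quantities subtracts the dimension exponents.
    auto opBinary(string op : "/", int M2, int L2, int T2)
                 (Quantity!(M2, L2, T2) rhs) const
    {
        return Quantity!(M - M2, L - L2, T - T2)(value / rhs.value);
    }

    /// Addition is only defined for identical dimensions.
    Quantity opBinary(string op : "+")(Quantity rhs) const
    {
        return Quantity(value + rhs.value);
    }
}

alias Mass     = Quantity!(1, 0, 0);
alias Distance = Quantity!(0, 1, 0);
alias Time     = Quantity!(0, 0, 1);
alias Force    = Quantity!(1, 1, -2);

void main()
{
    auto m = Mass(2.0);
    auto d = Distance(3.0);
    auto t = Time(2.0);

    // Type-checks only because the exponents work out to (1, 1, -2).
    Force f = m * d / t / t;
    assert(f.value == 1.5);

    // Adding a mass to a distance is rejected at compile time.
    static assert(!__traits(compiles, m + d));
}
```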
Dec 28 2008
Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
> Dimensional analysis, e.g. mass * distance / time / time yields gorce
> (bonus points for remembering where the typo comes from).

But that's still arithmetic.
Dec 30 2008
"Bill Baxter" <wbaxter gmail.com> writes:
Array-like types which implement opCat/opCatAssign/opIndex/opIndexAssign/opSlice/opSliceAssign/opApply (and multiple flavors of some of those).

I think the biggest problem there is just the sheer number of methods you have to implement to support the array concept.

Merging might be useful there too --- A ~= b ~ c ~ d is probably more efficiently implemented as 3 ~= ops.

--bb
Dec 28 2008
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Bill Baxter wrote:
> Merging might be useful there too --- A ~= b ~ c ~ d is probably more
> efficiently implemented as 3 ~= ops.

Actually, it's probably most efficiently implemented as 1 "~=" with multiple parameters. (DMD already does this for arrays)
Dec 30 2008
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Denis Koroskin wrote:
> On Tue, 30 Dec 2008 17:30:13 +0300, Frits van Bommel <fvbommel remwovexcapss.nl> wrote:
>> Actually, it's probably most efficiently implemented as 1 "~=" with multiple parameters. (DMD already does this for arrays)
>
> Perhaps, but not general enough:
> A += a * b - c / d; // how to do this one?

That's a very different case, IMHO. Look at Don's posts for an answer to that one.

I think '~' and '~=' are more likely to allocate if used for their conventional meaning (adding items to some form of collection). When performing this operation in-place several times on the same collection, it's quite possibly more efficient to do one big allocation instead of several small (temporary) ones, by pre-calculating the required space.

Your example is likely most efficiently implemented as something like

    A += a * b; // Something like FMULADD?
    A -= c / d; // Do FDIVADD-like instructions exist?

for most implementations. (I think Don's suggested semantics should result in this, assuming sufficient optimization.)
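
The "one big allocation" idea for the concatenation case can be sketched roughly as follows. This is an illustration only, in present-day D: the Buffer type and the appendAll name are hypothetical, and it is not a description of what DMD does internally for built-in arrays. The point is that a multi-argument append can add up the lengths first and reserve once.

```d
/// Hypothetical growable buffer; appendAll is an illustrative name.
struct Buffer(T)
{
    T[] data;

    /// Append several slices, reserving the total space up front
    /// instead of growing once per operand.
    void appendAll(const(T)[][] parts...)
    {
        size_t extra = 0;
        foreach (p; parts)
            extra += p.length;

        data.reserve(data.length + extra); // request the capacity once

        foreach (p; parts)
            data ~= p;                     // appends into reserved space
    }
}

void main()
{
    Buffer!int a;
    a.data = [1];

    // Stands in for: A ~= b ~ c ~ d
    a.appendAll([2, 3], [4], [5, 6]);
    assert(a.data == [1, 2, 3, 4, 5, 6]);
}
```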
Dec 30 2008
Christopher Wright <dhasenan gmail.com> writes:
I've seen | and & used for fluent interfaces in C#:

    Assert.That(0, Is.GreaterThan(-1) & Is.LessThan(1));

That is a bit awkward, a bit forced. In this situation, you want to use && or ||, which is possible by defining an implicit cast to bool in C#, but D wouldn't use that cast (I think; hopefully I'm wrong).

This would mess up autocomple-- er, wait, that term isn't backed by Microsoft's army of lawyers. It would mess up Microsoft(r) Intellisense(tm).
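
For comparison, a rough D analogue of that fluent style, sketched in present-day D with opBinary templates. The Constraint type and the greaterThan/lessThan helpers are hypothetical names, not part of any D test library. Overloading & and | builds a combined predicate instead of evaluating anything on the spot.

```d
/// Hypothetical constraint wrapper around a predicate on int.
struct Constraint
{
    bool delegate(int) test;

    /// a & b: a constraint that requires both operands to hold.
    Constraint opBinary(string op : "&")(Constraint rhs)
    {
        auto lhs = this; // copy, so the closure does not point at a temporary
        return Constraint((int v) => lhs.test(v) && rhs.test(v));
    }

    /// a | b: a constraint that accepts either operand.
    Constraint opBinary(string op : "|")(Constraint rhs)
    {
        auto lhs = this;
        return Constraint((int v) => lhs.test(v) || rhs.test(v));
    }
}

Constraint greaterThan(int bound) { return Constraint((int v) => v > bound); }
Constraint lessThan(int bound)    { return Constraint((int v) => v < bound); }

void main()
{
    auto inRange = greaterThan(-1) & lessThan(1);
    assert(inRange.test(0));
    assert(!inRange.test(1));

    auto outside = lessThan(-1) | greaterThan(1);
    assert(outside.test(5));
    assert(!outside.test(0));
}
```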
Dec 28 2008
aarti_pl <aarti interia.pl> writes:
DSL support in the host language. As examples I can give SQL in D, or a mock-test description language (usually also written in D, not as a separate scripting language).

In my previous posts I already gave a few arguments for why the "DSL in the host language" approach is sometimes much handier than string mixins containing the DSL itself.

BR
Marcin Kuszczak (aarti_pl)
Dec 28 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I saw that, but few, if any, of the arguments you made apply to operators vs. named methods. You argued on string mixins vs. using named symbols. In fact one argument of yours works against you as there's no completion for operators. Andrei
Dec 28 2008
aarti_pl <aarti interia.pl> writes:
I put my argument much earlier in this discussion, e.g. here:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=81040

To sum up: using methods to emulate operators is completely unreadable and looks awful. It's very easy to make an error with such a technique.

Later, I just replied to posts arguing that the solution for supporting DSLs in D is string mixins. I believe I was able to provide a few weighty arguments for my opinion that this is not always the best solution to the problem. In many cases it is much better to integrate a DSL into D as a kind of API, not as a separate sub-language. Then we can get some support from the IDE, while with string mixins we would probably get nothing...

Best Regards
Marcin Kuszczak (aarti_pl)
Dec 28 2008
downs <default_357-line yahoo.de> writes:
Scrapple.Tools uses operator overloading to provide fake infix keywords along the lines of

    [2, 3, 4] /map/ (int i) { return format(i); }

with a simple and convenient syntax for defining them:

    mixin(Operator!("map", "something something use lhs and rhs; "));

Forcing the end user to write mixin(function()) for such keywords is *NOT* the way to go, for several reasons: a) it's hard to extend, and b) it's needlessly verbose.

The reason infix keywords are useful is that they can be used as a simple way to chain bijective operations together, i.e. a /foo/ b /map/ c /select/ d. Without infix keywords, this would take the form of select(map(foo(a, b), c), d), which is an atrocity because it has to be read in two directions - middle-leftwards for the operations, and middle-rightwards for the parameters.

(And yes, I know it's wrong to rely on operator evaluation order. So sue me.)

Of course, a more convenient solution would be the ability to extend the D syntax manually, but that's unlikely to appear in our lifetime.
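
For readers who have not seen the trick, here is a minimal sketch of how a fake infix keyword of this kind can be built. It is not the Scrapple.Tools implementation: it assumes present-day D (opBinary and opBinaryRight templates), and the MapOp and MapPartial names are hypothetical. The expression lhs /map/ fn parses as (lhs / map) / fn, so the first division captures the array and the second applies the function.

```d
import std.conv : to;

/// Hypothetical infix helper: lhs /map/ fn parses as (lhs / map) / fn.
struct MapOp
{
    /// Left half: capture the array in a partial application.
    auto opBinaryRight(string op : "/", T)(T[] lhs)
    {
        return MapPartial!T(lhs);
    }
}

/// Holds the left operand until the function arrives.
struct MapPartial(T)
{
    T[] items;

    /// Right half: apply fn to every captured element.
    auto opBinary(string op : "/", F)(F fn)
    {
        alias R = typeof(fn(items[0]));
        R[] result;
        foreach (item; items)
            result ~= fn(item);
        return result;
    }
}

MapOp map; // the "keyword" is just a module-level value

void main()
{
    auto strs = [2, 3, 4] /map/ (int i) { return to!string(i); };
    assert(strs == ["2", "3", "4"]);
}
```

The chaining works because / is left-associative in the grammar; each further infix keyword only needs its own pair of helper types.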
Dec 29 2008
bearophile <bearophileHUGS lycos.com> writes:
downs:
> The reason infix keywords are useful, is because they can be used as a simple way to chain bijective operations together, i.e. a /foo/ b /map/ c /select/ d. Without infix keywords, this would take the form of select(map(foo(a, b), c), d), which is an atrocity because it has to be read in two directions - middle-leftwards for the operations, and middle-rightwards for the parameters.

I agree that some syntax support for some form of function chaining may be useful if D wants to become a little more functional. But such syntax has to be chosen with care. Bye, bearophile
Dec 29 2008
aarti_pl <aarti interia.pl> writes:
downs wrote:
> The reason infix keywords are useful, is because they can be used as a simple way to chain bijective operations together, i.e. a /foo/ b /map/ c /select/ d. Without infix keywords, this would take the form of select(map(foo(a, b), c), d), which is an atrocity because it has to be read in two directions - middle-leftwards for the operations, and middle-rightwards for the parameters.

Exactly. My first thought was to allow defining infix keywords/operators in a similar way to how we can define prefix operators today:

    myFunc a b

or just:

    myFunc(a, b);

Why? Because there are use cases where infix notation is much, much more natural than prefix notation. So we still need:

1. a myFunc b - infix operator
2. a b myFunc - postfix operator

The ability to define infix operators seems to be the most useful right now. I just didn't want to depress Walter with too high expectations, so starting with just overloading of already existing infix operators like && and || seemed to be a good start :-D

BR Marcin Kuszczak (aarti_pl)
Dec 29 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I sometimes think of a subtoken-based approach, e.g. any function name starting and ending with an underscore is by definition infix. It's the kind of solution that turns Walter's nose so I never brought it up to him. Andrei
Dec 29 2008
aarti_pl <aarti interia.pl> writes:
Maybe:

    R op[Infix|Postfix]_Name(A a, B b ...);

with some additional forms for special characters. These special forms could be additionally restricted as necessary, so there would be no Postfix '&&', only Infix, and so on.

Possible examples:
------------------
    SqlExpression opInfix_LIKE(string a, string b);
    int opInfix_&(int a, int b);
    bool opInfix_&&(bool a, bool b);
    SqlExpression opInfix_AND(string a, string b);

Below are examples of postfix operators; I don't yet have an idea for the call-site syntax:

    int opPostfix_+(int a, int b); // e.g. for Reverse Polish Notation
    int opPostfix_*(int a, int b);

It seems that a prefix form will not be necessary, as prefix notation is just a standard function call.

BR
Marcin Kuszczak (aarti_pl)
Dec 29 2008
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
aarti_pl wrote:
> Maybe: R op[Infix|Postfix]_Name(A a, B b ...);

That's not quite elegant. What if there is a symbol called Name in scope? This will confuse the parser to no end. (I forgot to mention that in the sub-token approach you'd still have to write the underscore when issuing a call.) Andrei
Dec 29 2008
next sibling parent aarti_pl <aarti interia.pl> writes:
Andrei Alexandrescu wrote:

Maybe: R op[Infix|Postfix]_Name(A a, B b ...);

That's not quite elegant. What if there is a symbol called Name in scope? This will confuse the parser to no end. (I forgot to mention that in the sub-token approach you'd still have to write the underscore when issuing a call.) Andrei

The rules for the calling side are another subject; I didn't even try to address that part of the problem. Additionally, if we could get operator overloads as free functions, that would be another complicating factor. But let me put down a few thoughts off the top of my head:

1. Operator overloads are valid in the scope in which the 'import' is valid. So when the import occurs at module level, the scope is the module; when the import is in a class, the scope is the class. Depending on the import qualifier, operator overloads are propagated or not. Maybe someday we will also get imports at function level, which would give another level of operator validity. And maybe someday even imports at block level? :-)

    {
        import std.complex;
        Complex c = 2i;
    }

2. Within the scope of the import, overloaded operators are taken before built-in operators without issuing errors. When there are other symbols in scope with the same name as an overloaded operator, it should be an error. To use the operator anyway, you should resolve the conflict using standard D methods:

a. with FQN syntax:

    import doost.db.sql;
    void main() {
        int LIKE = 5;
        //Query query = Select(Table).Where(Table.Name LIKE "A*"); //Error
        Query query = Select(Table).Where(Table.Name doost.db.sql.LIKE "A*");
    }

b. with renamed imports:

    import doost.db.sql : SQL_LIKE = LIKE;

3. There still might be a need to use built-in operators, even in the narrow scope in which an overloaded operator is valid. Then it should be possible to escape the overloaded operator with '.', e.g.

    module doost.db.sql;
    SqlExpression opInfix_==(Column lhs, Column rhs) {}

    module main;
    void main() {
        import doost.db.sql;
        Column col1, col2;
        if (col1 .== col2) {...}
    }

In this case the overload for == would normally be chosen, but because it is escaped with '.', the built-in comparison is used instead.

BR
Marcin Kuszczak (aarti_pl)
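For reference, the scoping half of this idea can be sketched with ordinary symbols: D has since gained function-level and block-level ("scoped") imports, and renamed selective imports behave as described in point 2b. The `opInfix` syntax and the `doost.db.sql` module remain hypothetical, so the sketch below uses only Phobos modules; the names `squareRoot` and `hypotenuse` are made up for illustration.

```d
import std.stdio : writeln;

// Renamed selective import: std.math's `sqrt` is bound to a new local name,
// so it cannot clash with another symbol called `sqrt`.
import std.math : squareRoot = sqrt;

double hypotenuse(double a, double b)
{
    // Function-level import: `abs` and `complex` are visible only in this function.
    import std.complex : abs, complex;
    return abs(complex(a, b));
}

void main()
{
    int sqrt = 5; // a local symbol that does not conflict with the renamed import
    writeln(squareRoot(16.0) + sqrt);
    {
        // Block-level import: visible only until the closing brace.
        import std.algorithm.comparison : max;
        writeln(max(hypotenuse(3.0, 4.0), 1.0));
    }
}
```

The same mechanisms would presumably carry over to operator-like symbols if they were ever importable by name, but that part is speculation.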
Dec 29 2008
prev sibling next sibling parent aarti_pl <aarti interia.pl> writes:
Andrei Alexandrescu wrote:

Maybe: R op[Infix|Postfix]_Name(A a, B b ...);

That's not quite elegant. What if there is a symbol called Name in scope? This will confuse the parser to no end. (I forgot to mention that in the sub-token approach you'd still have to write the underscore when issuing a call.) Andrei

I have just noticed that you were talking more about parsing than about symbol resolution. Sorry, I was too quick with my previous answer :-)

When it comes to parsing I cannot help much. It might be that something like underscores would be necessary to make the parser happy.

Please note that I usually try to speak from the user's perspective, that is, from the perspective of a person who doesn't know all the internal details. And from the user's perspective I can say that I really don't like underscores in core language names. They make the language feel hackish and not properly thought through. So I hope they can be avoided without creating ambiguities for the parser.

BR
Marcin Kuszczak (aarti_pl)
Dec 29 2008
prev sibling parent KennyTM~ <kennytm gmail.com> writes:
Andrei Alexandrescu wrote:

Maybe: R op[Infix|Postfix]_Name(A a, B b ...);

That's not quite elegant. What if there is a symbol called Name in scope? This will confuse the parser to no end. (I forgot to mention that in the sub-token approach you'd still have to write the underscore when issuing a call.) Andrei

    R opInfix_Name (A a, B b ...);
    ...
    a Name b Name c Name d ...

Not that I support allowing infix overloading like this; I just want to show that it is possible to work around the issue.
Dec 29 2008
prev sibling parent Yigal Chripun <yigal100 gmail.com> writes:
downs wrote:
 Scrapple.Tools uses operator overloading to provide fake infix
 keywords along the lines of [2, 3, 4] /map/ (int i) { return
 format(i); }, with a simple and convenient syntax for defining them
 (mixin(Operator!("map", "something something use lhs and rhs; "));
 ).

 Forcing the end user to write mixin(function()) for such keywords is
 *NOT* the way to go, for several reasons: a) it's hard to extend, and
 b) it's needlessly verbose.

 The reason infix keywords are useful, is because they can be used as
 a simple way to chain bijective operations together, i.e. a /foo/ b
 /map/ c /select/ d. Without infix keywords, this would take the form
 of select(map(foo(a, b), c), d), which is an atrocity because it has
 to be read in two directions - middle-leftwards for the operations,
 and middle-rightwards for the parameters.

 (and yes, I know it's wrong to rely on operator evaluation order. So
 sue me. )

This solution is the one taken by functional languages: ML, Scala, etc. Another solution is the one closer to OOP style, where the above snippet becomes:

    a.foo(b).map(c).select(d)

D is supposed to get extension methods in the future, and this already works for arrays, i.e.

    map(T)(T[], void delegate(T) func) {...}
    char[] arr;
    arr.map(whatever);

This solves the problem of having to define the functions in the original class scope. Personally, I'm not sure which is better in this case. I know I liked Smalltalk's way of calling functions:

    arr inject: 0 into: whatever.
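As a rough sketch of the method-chaining style: later D versions generalized the array trick into uniform function call syntax (UFCS), so free functions can be chained left to right on any type. The helper names `twice` and `describe` below are made up for illustration; `map`, `filter`, `array` and `to` are from Phobos.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.conv : to;

// A free function; with UFCS it can be called as xs.twice instead of twice(xs).
int[] twice(int[] xs)
{
    return xs.map!(x => x * 2).array;
}

// The chain reads left to right, in the order the operations are applied.
string[] describe(int[] xs)
{
    return xs.twice
             .filter!(x => x > 4)
             .map!(x => x.to!string)
             .array;
}

void main()
{
    import std.stdio : writeln;
    writeln(describe([2, 3, 4]));
}
```

This gives the single reading direction that the infix-keyword approach is after, without relying on operator overloading.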
 Of course, a more convenient solution would be the ability to extend
 the D syntax manually, but that's unlikely to appear in our
 lifetime.

Other languages solve this already: starting with Ruby, whose very flexible syntax allows DSLs to be defined naturally in Ruby itself, going through functional languages like ML and even Lisp that allow functions to be defined as infix, and finishing with Lisp, which had AST macros in the 60's, and Nemerle, which allows one to extend the syntax with its AST macros (running on .NET). Since D is going to get AST macros, I'm much more optimistic than the above.
Dec 29 2008
prev sibling parent reply Don <nospam nospam.com> writes:
aarti_pl wrote:

DSL support in mother language. As an example I can give SQL in D or mockup tests description language (usually also in D - not as a separate script language).

Could you be more specific about this? For SQL, arithmetic and logical operators don't seem to be involved. The example you gave showed (if I understand correctly) a wish to make expression templates involving comparison operators, a task which is currently impossible.

I also found your first example a little too simple.

    Query query = Select(a).Where(id == 5);

I presume you would also want to do things like:

    for(int i=0; i<10; ++i) {
        Query query = Select(a).Where(id == (arr[i+2] + func(i)) || id==78+i);
        ...
    }

meaning you'd also need to overload the && and || operators. Is that correct?
 In my previous posts I already put few arguments why it is sometimes 
 much more handy to use "DSL in mother language" approach rather than 
 string mixins with DSL language itself.

Still, that doesn't necessarily involve operator overloading. I really want to assemble a list of cases which do.
Dec 29 2008
parent reply aarti_pl <aarti interia.pl> writes:
Don wrote:
 aarti_pl wrote:

DSL support in mother language. As an example I can give SQL in D or mockup tests description language (usually also in D - not as a separate script language).

Could you be more specific about this? For SQL, arithmetic and logical operators don't seem to be involved.

In fact both can be involved. Below are the operators understood by SQLite, one of the simplest databases:

    Binary: || * / % + - << >> & | < <= > >= = == != <> IN LIKE GLOB MATCH REGEXP AND OR
    Unary:  - + ~ NOT

As you can see, there are arithmetic as well as logical operators, all of which are used in SQL expressions.
 The example you gave showed (if I 
 understand correctly) a wish to make expression templates involving 
 comparison operators, a task which is currently impossible.
 

Why do you mention expression templates? In my opinion it's not necessary to use them. Operator overloading should be just right for this task.

I see overloading of built-in operators as part of a wider category: defining infix functions (and eventually also postfix functions). If we got the possibility to define the infix functions 'AND' and 'OR', I would not overload the built-in '&&' and '||'. But having only the possibility to overload '&&', '||' and the other built-ins would also be acceptable.

Only a few of the above operators can be overloaded in D today. Theoretically you can overload <, <=, >, >=, ==, != (opCmp), but the way it is implemented right now is not usable for my purposes.
 I also found your first example a little too simple.
 
 Query query = Select(a).Where(id == 5);
 
 I presume you would also want to do things like:
 
 for(int i=0; i<10; ++i) {
   Query query = Select(a).Where(id == (arr[i+2] + func(i)) || id==78+i);
 ....
 }

Yes. Part of the above expression will be evaluated at runtime with the standard operators:

    val1 = arr[i+2] + func(i)
    val2 = 78 + i

but part of it must be stored somehow, to be evaluated later by the SQL database:

    Query query = Select(a).Where(id == val1 || id == val2 || id == id1 + 5);

(id and id1 are column names; id == id1 + 5 must be stored for evaluation by the SQL database.)

I would also like to add a word to my latest post regarding string mixins, as I see I was not clear enough about it. It should be possible to construct the queries themselves at runtime, like below:

    DbTable Person = ....; //In my case DbTable keeps table columns and some other information

    DbMatrix getPersons(SQLExpression exp) {
        Query query = Select(Person).Where(exp);
        return Database.execute(query);
    }

    void main() {
        SQLExpression exp = (Person.Name == "John");
        DbMatrix res = getPersons(exp);
    }

This is what I mean by constructing queries at runtime. With string mixins you will have to write not only mixin(SQL("SELECT ....")) but also mixin(SQLExpression()), which gets really complicated in the end.
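To make the "stored for later evaluation" idea concrete, here is a minimal sketch in present-day D, under stated assumptions: the `Expr` type, `column`, `operand` and `eq` are invented for illustration and are not part of any library mentioned in this thread. Since `&&` and `||` cannot be overloaded in D, the sketch borrows `&` and `|` for AND and OR and uses a named method instead of `==`. No escaping of string values is attempted.

```d
import std.conv : to;

struct Expr
{
    string sql; // SQL text recorded so far

    // Render an operand: sub-expressions as they are, strings quoted,
    // everything else via to!string. (Hypothetical helper, no escaping.)
    private static string operand(T)(T v)
    {
        static if (is(T : Expr))
            return v.sql;
        else static if (is(T : const(char)[]))
            return "'" ~ v.to!string ~ "'";
        else
            return v.to!string;
    }

    // The operator records the operation instead of computing a value.
    Expr opBinary(string op, T)(T rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/" || op == "&" || op == "|")
    {
        static if (op == "&")
            enum sqlOp = "AND";
        else static if (op == "|")
            enum sqlOp = "OR";
        else
            enum sqlOp = op;
        return Expr("(" ~ sql ~ " " ~ sqlOp ~ " " ~ operand(rhs) ~ ")");
    }

    // Named stand-in for an overloadable `==`.
    Expr eq(T)(T rhs)
    {
        return Expr("(" ~ sql ~ " = " ~ operand(rhs) ~ ")");
    }
}

Expr column(string name) { return Expr(name); }

void main()
{
    import std.stdio : writeln;
    auto id = column("id"), id1 = column("id1");
    int val1 = 7; // evaluated in D, at runtime
    // id1 + 5 is not evaluated here; it is recorded for the database.
    auto where = id.eq(val1) | id.eq(id1 + 5) | column("name").eq("John");
    writeln(where.sql);
}
```

Because `Expr` is an ordinary runtime value, it can be passed to a function such as `getPersons` above, which is the property the post is asking for.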
 meaning you'd also need to overload && and || operators.
 Is that correct?

Yes.
 
 In my previous posts I already put few arguments why it is sometimes 
 much more handy to use "DSL in mother language" approach rather than 
 string mixins with DSL language itself.

Still, that doesn't necessarily involve operator overloading. I really want to assemble a list of cases which do.

Yes, it doesn't necessarily involve operator overloading as operator overloading is just a sub-case of defining functions - in this case infix functions and (possibly) postfix functions. But I would dare to assume that overloading of built-in infix functions would satisfy most of use-cases. And one more thing. It would be very useful to have possibility to define operators overloading as a free functions (not a part of class/struct). It will help decoupling of concepts and implementations. In my system columns are separate concept. They don't have to know anything about SQL and SQL expressions in which they appear. Columns are used by me to get values from resulting table (DbMatrix). This table can be used as a container for values, not necessarily for values taken from SQL queries. But to allow nicer syntax with only in-class operator overloading I will have to teach columns about SQL expressions. And someone who will want to use my table container will get also a lot of useless API from columns to create SQL expressions. BR Marcin Kuszczak (aarti_pl)
Dec 29 2008
parent reply Don <nospam nospam.com> writes:
aarti_pl wrote:
Don wrote:
 aarti_pl wrote:

DSL support in mother language. As an example I can give SQL in D or mockup tests description language (usually also in D - not as a separate script language).

Could you be more specific about this? For SQL, arithmetic and logical operators don't seem to be involved.

In fact they both can be involved. Below are operators which are understood by SQLite - one of the simplest databases: Binary: || * / % + - << >> & | < <= > >= = == != <> IN LIKE GLOB MATCH REGEXP AND OR Unary: - + ~ NOT As you see there are arithmetic operators and as well logical operators, which are used in SQL expressions.
 The example you gave showed (if I understand correctly) a wish to make 
 expression templates involving comparison operators, a task which is 
 currently impossible.

Why do you mention expression templates? It's not necessary to use them in my opinion. Operator overloading should be just right for this task.

It seems to be the same concept: the == operator does not perform an == comparison, rather the return value records the operands which were used, and records the fact that an equality comparison needs to be made. The actual equality comparison is done later, when the complete expression is known (it is done in the SQL statement).
 I see built-in operators overloading as a part of wider category of 
 defining infix functions (eventually also postfix functions).

It seems to be a case where you want an abstract syntax tree.
 Also I would like to add a word to my latest post regarding string 
 mixins as I see I was not clear enough about it. It should be possible 
 to construct queries itself on runtime like below:
 
 DbTable Person = ....; //In my case DbTable keeps table columns and few 
 other informations
 
 DbMatrix getPersons(SQLExpression exp) {
     Query query = Select(Person).Where(exp);
     return Database.execute(query);
 }
 
 void main() {
   SQLExpression exp = (Person.Name == "John");
   DbMatrix res = getPersons(exp);
 }
 
 This is what I mean by constructing queries on runtime. With string 
 mixins you will have to do not only mixin(SQL("SELECT ....")) but also 
 mixin(SQLExpression()), what gets really complicated in the end.

No. You can always move the complexity into the mixin string parser. (since you can get full type information of everything mentioned in the string). BTW, the string mixin is just a temporary evil -- it's a way of getting an abstract syntax tree using existing D. It's severely lacking syntax sugar.
Dec 29 2008
parent reply aarti_pl <aarti interia.pl> writes:
Don wrote:
 aarti_pl wrote:
 I see built-in operators overloading as a part of wider category of 
 defining infix functions (eventually also postfix functions).

It seems to be a case where you want an abstract syntax tree.

An abstract syntax tree can be created quite easily using concepts that already exist in D. Someone here on the D newsgroup showed me a simple way using "parts":

    class Select {
        SelectWherePart Where(Expression expr) {
            return new SelectWherePart(expr);
        }
        FromPart From(FromExpression expr) {
            return new FromPart(expr);
        }
    }

Maybe it is not perfect, but it should work reasonably well, especially with opImplicitCast and with operator overloading.
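A self-contained version of the "parts" idea might look like the sketch below. All type and function names are hypothetical, and plain strings stand in for the `Expression` types above; the point is only that each part exposes just the clauses that may legally follow it, so an invalid clause order fails to compile.

```d
// Each "part" offers only the clauses that may come next.
struct OrderByPart
{
    string sql;
}

struct WherePart
{
    string sql;
    OrderByPart OrderBy(string column) { return OrderByPart(sql ~ " ORDER BY " ~ column); }
}

struct FromPart
{
    string sql;
    WherePart Where(string condition) { return WherePart(sql ~ " WHERE " ~ condition); }
    OrderByPart OrderBy(string column) { return OrderByPart(sql ~ " ORDER BY " ~ column); }
}

struct SelectPart
{
    string sql;
    FromPart From(string table) { return FromPart(sql ~ " FROM " ~ table); }
}

SelectPart Select(string columns) { return SelectPart("SELECT " ~ columns); }

void main()
{
    import std.stdio : writeln;
    auto q = Select("*").From("Person").Where("Name = 'John'").OrderBy("Surname");
    writeln(q.sql);
    // Select("*").Where("...") would not compile: SelectPart has no Where.
}
```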
 Also I would like to add a word to my latest post regarding string 
 mixins as I see I was not clear enough about it. It should be possible 
 to construct queries itself on runtime like below:

 DbTable Person = ....; //In my case DbTable keeps table columns and 
 few other informations

 DbMatrix getPersons(SQLExpression exp) {
     Query query = Select(Person).Where(exp);
     return Database.execute(query);
 }

 void main() {
   SQLExpression exp = (Person.Name == "John");
   DbMatrix res = getPersons(exp);
 }

 This is what I mean by constructing queries on runtime. With string 
 mixins you will have to do not only mixin(SQL("SELECT ....")) but also 
 mixin(SQLExpression()), what gets really complicated in the end.

No. You can always move the complexity into the mixin string parser. (since you can get full type information of everything mentioned in the string).

I don't think it is possible. Imagine that you additionally have OrderByExpression and a few other expressions like GroupByExpression etc. Now your parser will have to guess what you need from just the one method SQL():

    mixin(SQL("SELECT * FROM a;")); // result: SelectQuery
    mixin(SQL("Person.Name == i + 5")); // result: WhereExpression
    mixin(SQL("Person.Surname ASC")); // result: OrderByExpression

I wouldn't want to write such a parser. And additionally, some expressions would probably be ambiguous.

But it is possible that I am missing something here, so if possible please provide usage examples. You can take e.g. the above snippet that passes an SQLExpression into a function.
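For the sake of discussion, here is a deliberately crude sketch of what a single `SQL()` entry point could look like in present-day D. It is not a real parser: it picks the result type from a prefix or suffix of the string, which is exactly the kind of guessing the post worries about. The three result types and the heuristic are assumptions made up for this example, and the DSL text is assumed to contain no double quotes.

```d
import std.algorithm.searching : endsWith, startsWith;

struct SelectQuery       { string sql; }
struct WhereExpression   { string sql; }
struct OrderByExpression { string sql; }

// Runs at compile time when used inside mixin(); returns D source code
// that constructs a type chosen from the shape of the DSL text.
string SQL(string dsl)
{
    string type = "WhereExpression"; // fallback guess
    if (dsl.startsWith("SELECT "))
        type = "SelectQuery";
    else if (dsl.endsWith(" ASC") || dsl.endsWith(" DESC"))
        type = "OrderByExpression";
    return type ~ "(\"" ~ dsl ~ "\")";
}

void main()
{
    auto q = mixin(SQL("SELECT * FROM a;"));
    auto w = mixin(SQL("Person.Name == i + 5"));
    auto o = mixin(SQL("Person.Surname ASC"));

    // The result types differ although the entry point is the same.
    static assert(is(typeof(q) == SelectQuery));
    static assert(is(typeof(w) == WhereExpression));
    static assert(is(typeof(o) == OrderByExpression));
}
```

Whether such a heuristic can be made unambiguous for a realistic SQL subset is the open question raised above; the sketch only shows that the dispatch itself is expressible.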
 BTW, the string mixin is just a temporary evil -- it's a way of getting 
 an abstract syntax tree using existing D. It's severely lacking syntax 
 sugar.

I don't know much about what will be cooked in future for D. But listening to proposed features would be definitely interesting :-) BR Marcin Kuszczak (aarti_pl)
Dec 29 2008
parent Don <nospam nospam.com> writes:
aarti_pl wrote:
Don wrote:
 aarti_pl wrote:
 DSL support in mother language. As an example I can give SQL in D 
 or mockup tests description language (usually also in D - not as a 
 separate script language).

Could you be more specific about this? For SQL, arithmetic and logical operators don't seem to be involved.

In fact they both can be involved. Below are operators which are understood by SQLite - one of the simplest databases: Binary: || * / % + - << >> & | < <= > >= = == != <> IN LIKE GLOB MATCH REGEXP AND OR Unary: - + ~ NOT As you see there are arithmetic operators and as well logical operators, which are used in SQL expressions.
 The example you gave showed (if I understand correctly) a wish to 
 make expression templates involving comparison operators, a task 
 which is currently impossible.

Why do you mention expression templates? It's not necessary to use them, in my opinion. Operator overloading should be just right for this task.

It seems to be the same concept: the == operator does not perform an == comparison, rather the return value records the operands which were used, and records the fact that an equality comparison needs to be made. The actual equality comparison is done later, when the complete expression is known (it is done in the SQL statement).
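The "record now, evaluate later" idea can be sketched in present-day D2 syntax (opBinary rather than the D1-era opAdd used elsewhere in this thread). This is only an illustration under stated assumptions: Expr, column, literal and eq are hypothetical names, not part of aarti_pl's library, and eq is a named stand-in because, as noted in this post, == cannot currently be overloaded to return an expression node.

```d
import std.conv : to;

// Hypothetical expression node: the operators record their operands
// as text instead of computing a value.
struct Expr
{
    string sql; // textual form of the recorded sub-expression

    // Arithmetic operators build a larger expression, not a number.
    Expr opBinary(string op)(Expr rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        return Expr("(" ~ sql ~ " " ~ op ~ " " ~ rhs.sql ~ ")");
    }

    // Named stand-in for ==: records that a comparison is wanted.
    Expr eq(Expr rhs)
    {
        return Expr(sql ~ " = " ~ rhs.sql);
    }
}

// Hypothetical helpers for leaves of the tree.
Expr column(string name) { return Expr(name); }
Expr literal(long value) { return Expr(value.to!string); }

void main()
{
    // Nothing is compared or added here; the expression is only recorded.
    auto where = column("Person.Age").eq(column("i") + literal(5));
    assert(where.sql == "Person.Age = (i + 5)");
}
```

The actual comparison would happen later, when the recorded text is handed to the database as part of the SQL statement.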
 I see built-in operators overloading as a part of wider category of 
 defining infix functions (eventually also postfix functions).

It seems to be a case where you want an abstract syntax tree.

Abstract syntax tree can be quite easily created using concepts already existing in D.

Of course they can. My point is simply: creating syntax trees is what you're using operator overloading for. You're actually not creating operations.
 Also I would like to add a word to my latest post regarding string 
 mixins as I see I was not clear enough about it. It should be 
 possible to construct the queries themselves at runtime like below:

 DbTable Person = ....; // In my case DbTable keeps table columns and 
                        // a few other pieces of information

 DbMatrix getPersons(SQLExpression exp) {
     Query query = Select(Person).Where(exp);
     return Database.execute(query);
 }

 void main() {
   SQLExpression exp = (Person.Name == "John");
   DbMatrix res = getPersons(exp);
 }

 This is what I mean by constructing queries at runtime. With string 
 mixins you will have to do not only mixin(SQL("SELECT ....")) but 
 also mixin(SQLExpression()), which gets really complicated in the end.

No. You can always move the complexity into the mixin string parser. (since you can get full type information of everything mentioned in the string).

I don't think it is possible. Imagine that you additionally have OrderByExpression and a few other expressions like GroupByExpression etc. Now your parser will have to guess what you need using just one method, SQL():

mixin(SQL("SELECT * FROM a;")); // result: SelectQuery
mixin(SQL("Person.Name == i + 5")); // result: WhereExpression
mixin(SQL("Person.Surname ASC")); // result: OrderByExpression

I wouldn't want to write such a parser. Additionally, some expressions would probably be ambiguous. But it is possible that I am missing something here, so if possible please provide usage examples. You can take e.g. the above snippet that passes an SQLExpression into a function.

You're right, I misunderstood what you were doing.
 BTW, the string mixin is just a temporary evil -- it's a way of 
 getting an abstract syntax tree using existing D. It's severely 
 lacking syntax sugar.

I don't know much about what is being cooked up for D in the future. But hearing about proposed features would definitely be interesting :-)

BR
Marcin Kuszczak
(aarti_pl)

Dec 29 2008
prev sibling next sibling parent reply "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sun, Dec 28, 2008 at 11:50 AM, Don <nospam nospam.com> wrote:

 So, please post any use cases which you consider convincing.

Virtually all I use it for is making containers. I do almost no numerical programming. I would probably not miss much if all I could overload were opIndex[Assign], opSlice[Assign], opCat[Assign], and opApply. As you've noted before, overloading of the arithmetic operators isn't useful when you only have access to two operands. It seems like an AST transformation (macros!) on mathematical transformations into possibly-compound operations would be much more useful than plain old operator overloading. We're already forced to use expression templates to do anything useful with operator overloading; why not put a more general, efficient, concise form of that in the language itself?
Dec 28 2008
parent reply Yigal Chripun <yigal100 gmail.com> writes:
Jarrett Billingsley wrote:
 On Sun, Dec 28, 2008 at 11:50 AM, Don<nospam nospam.com>  wrote:

 So, please post any use cases which you consider convincing.

Virtually all I use it for is making containers. I do almost no numerical programming. I would probably not miss much if all I could overload were opIndex[Assign], opSlice[Assign], opCat[Assign], and opApply. As you've noted before, overloading of the arithmetic operators isn't useful when you only have access to two operands. It seems like an AST transformation (macros!) on mathematical transformations into possibly-compound operations would be much more useful than plain old operator overloading. We're already forced to use expression templates to do anything useful with operator overloading; why not put a more general, efficient, concise form of that in the language itself?

Could you please elaborate on this, and/or provide a code example? Also, I just wanted to mention that other languages provide similar concepts with slightly different mechanisms. For example, Scala allows a function with two parameters to be declared as an infix function; this idea is also present in functional languages like ML. So my question is: what are the pros and cons of limiting this to the operators that already exist in the language (similar to C++) versus allowing any kind of symbol and/or function name? Downs uses his famous operand1 /func/ operand2 pattern, for example with a map function. Should D support this in the language?
Dec 28 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Jarrett Billingsley:
 His library works by having you write your code in a DSL in strings,
 which you pass to the library and then mix in the resulting X86.

Does anyone have benchmarks showing that Blade gives a performance improvement over the code produced by DMD and LDC? That is the most important thing; otherwise such things aren't useful.

Bye,
bearophile
Dec 28 2008
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Jarrett Billingsley:
 His library works by having you write your code in a DSL in strings,
 which you pass to the library and then mix in the resulting X86.

Does anyone have benchmarks showing that Blade gives a performance improvement over the code produced by DMD and LDC? That is the most important thing; otherwise such things aren't useful.

It's MUCH faster. Factor of 5 is typical. BTW, Blade is currently unusable because of the CTFE memory allocation bug. Also, now that array operations are in the language, it's much less necessary for the application I created it for.
 
 Bye,
 bearophile

Dec 28 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter:
 http://d.puremagic.com/issues/show_bug.cgi?id=1382
 From what I can tell they've had it fixed in LDC since mid-November by
 applying that change with no problems.  Christian K.,  can you confirm
 that it's been working fine in LDC?

The LDC home page also says:
"optional Hans Boehm GC 7.0 (direct download link) (for ldc itself, buggy and not recommended)"

Bye,
bearophile
Dec 29 2008
prev sibling parent Christian Kamm <kamm-incasoftware removethis.de> writes:
Don wrote:
 BTW, Blade is currently unusable because of the CTFE memory allocation
 bug. 


Bill Baxter wrote:
 A good opportunity to mention once again that LDC guys have figured
 out some compile settings that seem to fix the problem.
 http://d.puremagic.com/issues/show_bug.cgi?id=1382
 From what I can tell they've had it fixed in LDC since mid-November by
 applying that change with no problems.  Christian K.,  can you confirm
 that it's been working fine in LDC?

We have not tested it extensively (though we did run our tests as well as dstress and found only improvements) and have not enabled boehm-gc by default. The current plan is to include two versions of LDC in the release: the default one and an experimental one that uses the boehm-gc hack and has the forward reference fixes applied.
Dec 29 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sun, Dec 28, 2008 at 4:17 PM, Jarrett Billingsley
<jarrett.billingsley gmail.com> wrote:
 On Sun, Dec 28, 2008 at 11:50 AM, Don <nospam nospam.com> wrote:

 So, please post any use cases which you consider convincing.

Virtually all I use it for is making containers. I do almost no numerical programming. I would probably not miss much if all I could overload were opIndex[Assign], opSlice[Assign], opCat[Assign], and opApply.

I also want to add that even these could be improved. How many times have I wanted this:

dest[lo1 .. hi1] = src[lo2 .. hi2];

to be a single operation? You can't do it. You're forced to create a stupid temporary in the opSlice of src, and then copy out of that into dest's opSliceAssign.
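The temporary being complained about can be made visible with a small sketch. Buffer and its counter are hypothetical, and the two-argument opSlice and three-argument opSliceAssign are the older overload forms under discussion here (present-day D still accepts them for backward compatibility, as far as I know).

```d
// Hypothetical container used only to count temporaries.
struct Buffer
{
    int[] data;
    static int temporaries; // how many slice copies were made

    // src[lo .. hi]: returns a fresh Buffer holding a copy of the range.
    Buffer opSlice(size_t lo, size_t hi)
    {
        ++temporaries;
        return Buffer(data[lo .. hi].dup);
    }

    // dest[lo .. hi] = rhs: copies out of the temporary into place.
    void opSliceAssign(Buffer rhs, size_t lo, size_t hi)
    {
        data[lo .. hi] = rhs.data[];
    }
}

void main()
{
    auto src = Buffer([1, 2, 3, 4, 5]);
    auto dest = Buffer(new int[5]);

    // Two separate operator calls: opSlice on src, then opSliceAssign on dest.
    dest[0 .. 2] = src[3 .. 5];

    assert(dest.data == [4, 5, 0, 0, 0]);
    assert(Buffer.temporaries == 1); // the intermediate copy
}
```

A fused operation would copy src.data[3 .. 5] straight into dest.data[0 .. 2] and leave the counter at zero.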
Dec 28 2008
prev sibling next sibling parent "Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:
On Sun, Dec 28, 2008 at 6:14 PM, Yigal Chripun <yigal100 gmail.com> wrote:
 could you please elaborate on this, and/or provide a code example?

Don's Blade library is more or less a proof of concept for what I would imagine to be possible with macros: http://www.dsource.org/projects/mathextra/browser/trunk/blade His library works by having you write your code in a DSL in strings, which you pass to the library and then mix in the resulting X86. http://www.dsource.org/projects/mathextra/browser/trunk/blade/BladeDemo.d If we had AST macros, the parsing would be performed by the compiler instead of hackishly using CTFE and templates, and the library would just be concerned with matching certain patterns of expressions, performing transforms on them (simplifications, optimizations, etc.), and turning them into D code. String mixins are incredibly powerful. But they're powerful in the way that assembly language is powerful - you can do anything, but there are virtually no useful abstractions provided, so you end up having to do everything yourself - parsing, pattern matching, etc.
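For readers who have not used string mixins, here is a deliberately tiny sketch of the mechanism. It is not Blade: the elementwise helper is hypothetical, it emits plain D rather than X86, and it does no parsing at all, which is exactly the work a real library has to do by hand.

```d
// Hypothetical generator, evaluated at compile time (CTFE) when used in a mixin.
// It returns D source text for an element-wise loop over `dst`.
string elementwise(string dst, string expr)
{
    return "foreach (i; 0 .. " ~ dst ~ ".length) "
         ~ dst ~ "[i] = " ~ expr ~ ";";
}

void main()
{
    double a = 2.0;
    double[] x = [1.0, 2.0, 3.0];
    double[] y = [10.0, 20.0, 30.0];
    auto r = new double[3];

    // The string is built during compilation and compiled as ordinary code:
    // foreach (i; 0 .. r.length) r[i] = a * x[i] + y[i];
    mixin(elementwise("r", "a * x[i] + y[i]"));

    assert(r == [12.0, 24.0, 36.0]);
}
```

Note that the caller has to spell out the index i inside the string; recognising "a * x + y" and inserting the indexing automatically is the parsing and pattern-matching burden described above.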
Dec 28 2008
prev sibling next sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
Don wrote:
 There's been some interesting discussion about operator overloading over 
 the past six months, but to take the next step, I think we need to 
 ground it in reality. What are the use cases?
 
 I think that D's existing opCmp() takes care of the plethora of trivial 
 cases where <, >= etc are overloaded. It's the cases where the 
 arithmetic and logical operations are overloaded that are particularly 
 interesting to me.
 
 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions (interesting since * is anti-commutative, a*b = -b*a)

Not true. Quaternion multiplication does have the distinction of being non-commutative in the general case, but anti-commutativity occurs in only some special cases. If you really want anti-commutativity, look at vectors under cross multiplication. Other unusual number systems: hypercomplex, biquaternions, octonions, p-adic numbers
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)

And possibly rational and Euclidean number types. Stewart.
Dec 28 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Dec 29, 2008 at 3:00 PM, Don <nospam nospam.com> wrote:
 bearophile wrote:
 Jarrett Billingsley:
 His library works by having you write your code in a DSL in strings,
 which you pass to the library and then mix in the resulting X86.

Does anyone have benchmarks showing that Blade gives a performance improvement over the code produced by DMD and LDC? That is the most important thing; otherwise such things aren't useful.

It's MUCH faster. Factor of 5 is typical. BTW, Blade is currently unusable because of the CTFE memory allocation bug. Also, now that array operations are in the language, it's much less necessary for the application I created it for.

A good opportunity to mention once again that LDC guys have figured out some compile settings that seem to fix the problem. http://d.puremagic.com/issues/show_bug.cgi?id=1382 From what I can tell they've had it fixed in LDC since mid-November by applying that change with no problems. Christian K., can you confirm that it's been working fine in LDC? --bb
Dec 28 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Mon, Dec 29, 2008 at 1:50 AM, Don <nospam nospam.com> wrote:
 There's been some interesting discussion about operator overloading over the
 past six months, but to take the next step, I think we need to ground it in
 reality. What are the use cases?

 I think that D's existing opCmp() takes care of the plethora of trivial
 cases where <, >= etc are overloaded. It's the cases where the arithmetic
 and logical operations are overloaded that are particularly interesting to
 me.

 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions (interesting since * is anti-commutative, a*b = -b*a)
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.

Geometric algebra things. Here's a file from a GA library I just found: http://gasandbox.svn.sourceforge.net/viewvc/gasandbox/ga_sandbox/libgasandbox/e2ga.h?revision=540&view=markup I see "rotors" and "bivectors" being used with operator overloads in addition to some of the other types above. Also saw the keywords "multivector" "versor" and "blade" appearing elsewhere. Not sure if those are types that have op overloads or not. One thing I've heard about GA is that it doesn't lead to very efficient code. So it's perhaps better for working ideas out than for actually implementing them. But that might make it a very interesting target for an operator overloading/merging/optimizing framework. By that I mean that with sufficiently capable analysis, it may be possible to turn even GA operations into optimal code. That would be a pretty big coup for D I think. --bb
Dec 28 2008
prev sibling next sibling parent Chad J <gamerchad __spam.is.bad__gmail.com> writes:
Don wrote:
 There's been some interesting discussion about operator overloading over
 the past six months, but to take the next step, I think we need to
 ground it in reality. What are the use cases?
 
 I think that D's existing opCmp() takes care of the plethora of trivial
 cases where <, >= etc are overloaded. It's the cases where the
 arithmetic and logical operations are overloaded that are particularly
 interesting to me.
 
 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions (interesting since * is anti-commutative, a*b = -b*a)
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.
 
 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps, and some stuff that
 is quite frankly bizarre.
 
 So, please post any use cases which you consider convincing.
 

Array-alikes/Containers.

dstring (okay, mtext now, damn new aliases *grumble grumble*)
http://www.dprogramming.com/mtext.php

Call chaining.
http://www.dsource.org/projects/tango/docs/current/tango.io.Stdout.html
Notice the `Stdout ("abc") ("def") (3.14); => abcdef3.14`

Hybrid uses opCall and opIndex to do some of its fancy stuff.
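The call-chaining pattern can be sketched as follows. Sink is a hypothetical type, not Tango's Stdout; the only point is that opCall returns the object itself, so calls can be strung together.

```d
import std.conv : to;

// Hypothetical output sink that accumulates text in a string.
struct Sink
{
    string text;

    // Each call appends the value and returns the sink by reference,
    // which is what allows sink(a)(b)(c).
    ref Sink opCall(T)(T value)
    {
        text ~= value.to!string;
        return this;
    }
}

void main()
{
    Sink sink;
    sink("abc")("def")(42);
    assert(sink.text == "abcdef42");
}
```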
Dec 29 2008
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Don wrote:
 There's been some interesting discussion about operator overloading over 
 the past six months, but to take the next step, I think we need to 
 ground it in reality. What are the use cases?
 
 I think that D's existing opCmp() takes care of the plethora of trivial 
 cases where <, >= etc are overloaded. It's the cases where the 
 arithmetic and logical operations are overloaded that are particularly 
 interesting to me.
 
 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.
 
 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps, and some stuff that 
 is quite frankly bizarre.
 
 So, please post any use cases which you consider convincing.

Some observations based on the use cases to date:

(1) a += b is ALWAYS a = a + b (and likewise for all other operations). opXXXAssign therefore seems to be a (limited) performance optimisation. The compiler should be allowed to synthesize += from +. This would almost halve the minimum number of repetitive functions required.
A straightforward first step would be to state in the spec that "the compiler is entitled to assume that X+=Y yields the same result as X=X+Y"

(2) There seems to be a need for abstract syntax trees, which is NOT necessarily related to performance. (If we had a 'perfect performance' solution for operator overloading, it would not remove the desire for abstract syntax trees).

(3) The array operations ~, [], [..] need further attention. A solution for $ is also required.
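Observation (1) can be approximated by hand for a value type. The sketch below uses present-day D2 operator templates (opBinary/opOpAssign) instead of the D1 opAdd/opAddAssign pairs discussed in this thread, and Vec2 is a hypothetical example type; it only shows the compound assignment being defined once, generically, in terms of the binary operator.

```d
// Hypothetical value type: only + and - are written out.
struct Vec2
{
    double x, y;

    // The "real" operators, written once for both + and -.
    Vec2 opBinary(string op)(Vec2 rhs) const
        if (op == "+" || op == "-")
    {
        return mixin("Vec2(x " ~ op ~ " rhs.x, y " ~ op ~ " rhs.y)");
    }

    // a op= b synthesized as a = a op b, with no per-operator code.
    ref Vec2 opOpAssign(string op)(Vec2 rhs)
        if (op == "+" || op == "-")
    {
        this = mixin("this " ~ op ~ " rhs");
        return this;
    }
}

void main()
{
    auto a = Vec2(1, 2);
    auto b = Vec2(10, 20);

    a += b;
    assert(a == Vec2(11, 22));
    a -= b;
    assert(a == Vec2(1, 2));
}
```

Because Vec2 is a struct with no indirection, the rewrite is unobservable apart from performance; whether the same holds for reference types is a separate question.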
Dec 29 2008
next sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Don wrote:
 Some observations based on the use cases to date:
 (1)
 a += b is ALWAYS a = a + b (and likewise for all other operations).
 opXXXAssign therefore seems to be a (limited) performance optimisation. 
 The compiler should be allowed to synthesize += from +. This would 
 almost halve the minimum number of repetitive functions required.

Not quite true:

class A
{
    int value;
    A opAdd(A other)
    {
        return new A(value + other.value);
    }
    A opAddAssign(A other)
    {
        value += other.value;
    }
}

class B
{
    A a;
    this (A value) { a = value; }
}

void main ()
{
    auto a = new A;
    auto b1 = new B(a);
    auto b2 = new B(a);
    auto a2 = new A;
    b1.a += a2; // okay, b1.a is b2.a
    b1.a = b1.a + a2; // now b1.a !is b2.a
}
Dec 30 2008
next sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
Christopher Wright wrote:
<snip>
 void main ()
 {
     auto a = new A;
     auto b1 = new B(a);
     auto b2 = new B(a);
     auto a2 = new A;
     b1.a += a2; // okay, b1.a is b2.a
     b1.a = b1.a + a2; // now b1.a !is b2.a
 }

If these comments reflect what the code you posted does for you, your compiler is broken. Here, the code correctly doesn't even compile. Stewart.
Dec 30 2008
prev sibling parent KennyTM~ <kennytm gmail.com> writes:
Christopher Wright wrote:
 Don wrote:
 Some observations based on the use cases to date:
 (1)
 a += b is ALWAYS a = a + b (and likewise for all other operations).
 opXXXAssign therefore seems to be a (limited) performance 
 optimisation. The compiler should be allowed to synthesize += from +. 
 This would almost halve the minimum number of repetitive functions 
 required.

Not quite true:

class A
{
    int value;
    A opAdd(A other)
    {
        return new A(value + other.value);
    }
    A opAddAssign(A other)
    {
        value += other.value;
    }
}

class B
{
    A a;
    this (A value) { a = value; }
}

void main ()
{
    auto a = new A;
    auto b1 = new B(a);
    auto b2 = new B(a);
    auto a2 = new A;
    b1.a += a2; // okay, b1.a is b2.a
    b1.a = b1.a + a2; // now b1.a !is b2.a
}

Both expressions should yield the same _value_, not necessarily the same _reference_, so "b1.a == b2.a" should return true.
Dec 31 2008
prev sibling next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don wrote:
 A straightforward first step would be to state in the spec that "the 
 compiler is entitled to assume that X+=Y yields the same result as X=X+Y"

That doesn't hold for reference types, does it? So perhaps this should only be the case for structs? (Shouldn't make much difference, I think all of the examples I've seen would normally be implemented as structs rather than objects)
Dec 30 2008
parent reply Don <nospam nospam.com> writes:
Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the 
 compiler is entitled to assume that X+=Y yields the same result as X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

 So perhaps this should 
 only be the case for structs? (Shouldn't make much difference, I think 
 all of the examples I've seen would normally be implemented as structs 
 rather than objects)

Dec 30 2008
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the 
 compiler is entitled to assume that X+=Y yields the same result as 
 X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

B1 = A;
B1 += C; // A is modified, B1 stays same reference
assert(A is B1);

B2 = A;
B2 = B2 + C; // Doesn't touch A, B2 is reassigned a different ref
assert(A !is B2);

You can't just arbitrarily substitute between these two. Certainly you can't substitute + with +=. I suppose substituting += with + might be reasonable if no += is provided.
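A compilable version of this counter-example, as a sketch in present-day D2 syntax (the thread itself uses the D1 opAdd/opAddAssign names). Num is a hypothetical class with an in-place += and an allocating +.

```d
// Hypothetical class with reference semantics.
class Num
{
    int value;
    this(int v) { value = v; }

    // + allocates and returns a new object.
    Num opBinary(string op : "+")(Num rhs)
    {
        return new Num(value + rhs.value);
    }

    // += mutates the existing object in place.
    Num opOpAssign(string op : "+")(Num rhs)
    {
        value += rhs.value;
        return this;
    }
}

void main()
{
    auto a = new Num(1);
    auto c = new Num(2);

    auto b1 = a;
    b1 += c;              // a is modified, b1 stays the same reference
    assert(b1 is a);
    assert(a.value == 3);

    auto b2 = a;
    b2 = b2 + c;          // a is untouched, b2 is rebound to a new object
    assert(b2 !is a);
    assert(a.value == 3 && b2.value == 5);
}
```

Both forms leave the left-hand variable holding the same numeric value; what differs is whether other references to the original object observe the change.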
Dec 30 2008
next sibling parent reply Don <nospam nospam.com> writes:
Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the 
 compiler is entitled to assume that X+=Y yields the same result as 
 X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

Sure, you can do it (behaviour inherited from C++), but is that _EVER_ a good idea? I can't think of any cases where that's anything other than a bug-breeder.
 You can't just arbitrarily substitute between these two.

sense. No-one has yet come up with such a use case. I postulate that it doesn't exist.
Dec 30 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that 
 "the compiler is entitled to assume that X+=Y yields the same 
 result as X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

Sure, you can do it (behaviour inherited from C++), but is that _EVER_ a good idea? I can't think of any cases where that's anything other than a bug-breeder.
 You can't just arbitrarily substitute between these two.

sense. No-one has yet come up with such a use case. I postulate that it doesn't exist.

Well I forgot whether BigInt is a class, is it? Anyhow, suppose it *is* a class and as such has reference semantics. Then a += b modifies an object in-situ, whereas a = a + b creates a whole new object and happens to bind a to that new object. Andrei
Dec 30 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Tue, 30 Dec 2008 22:26:19 +0300, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that 
 "the compiler is entitled to assume that X+=Y yields the same 
 result as X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

_EVER_ a good idea? I can't think of any cases where that's anything other than a bug-breeder.
 You can't just arbitrarily substitute between these two.

sense. No-one has yet come up with such a use case. I postulate that it doesn't exist.

Well I forgot whether BigInt is a class, is it? Anyhow, suppose it *is* a class and as such has reference semantics. Then a += b modifies an object in-situ, whereas a = a + b creates a whole new object and happens to bind a to that new object. Andrei

It was suggested 2 posts up the thread. I believe Don is looking for a use case where, given

a1 = a + b;
a2 = a;
a2 += b;

the following check intentionally fails:

assert(a1 == a2); // not "a1 is a2"

He postulates that none exists.

Well then the post situated 2 posts up the thread was right because "is" vs. "==" is a red herring. For class types the two are not equivalent. The following two could be equivalent assuming correct definitions:

a1 = a + b;

and

a2 = deepCopy(a);
a2 += b;

This also suggests that it may sometimes be inefficient to define + in terms of += (which is a tad counterintuitive in C++ circles).

Andrei
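The equivalence described here can be written out as a sketch. Tally is a hypothetical class, its dup method plays the role of deepCopy, and the operator templates are present-day D2 syntax rather than the D1 names used in this thread. Here + is defined as "copy, then +=", which is where the extra copy (and the possible inefficiency) comes from.

```d
// Hypothetical class with reference semantics and an explicit deep copy.
class Tally
{
    int[] counts;
    this(int[] c) { counts = c; }

    // Stand-in for deepCopy: a new object with its own array.
    Tally dup() { return new Tally(counts.dup); }

    // In-place addition; assumes both tallies have the same length.
    Tally opOpAssign(string op : "+")(Tally rhs)
    {
        foreach (i, ref c; counts)
            c += rhs.counts[i];
        return this;
    }

    // + defined in terms of +=, at the cost of one deep copy.
    Tally opBinary(string op : "+")(Tally rhs)
    {
        auto result = dup();
        result += rhs;
        return result;
    }

    override bool opEquals(Object o)
    {
        auto other = cast(Tally) o;
        return other !is null && counts == other.counts;
    }
}

void main()
{
    auto a = new Tally([1, 2]);
    auto b = new Tally([10, 20]);

    auto a1 = a + b;

    auto a2 = a.dup();
    a2 += b;

    assert(a1 == a2);            // same value
    assert(a1 !is a2);           // different objects
    assert(a.counts == [1, 2]);  // a itself is untouched by either path
}
```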
Dec 30 2008
prev sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that 
 "the compiler is entitled to assume that X+=Y yields the same 
 result as X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

Sure, you can do it (behaviour inherited from C++), but is that _EVER_ a good idea? I can't think of any cases where that's anything other than a bug-breeder.
 You can't just arbitrarily substitute between these two.

sense. No-one has yet come up with such a use case. I postulate that it doesn't exist.

Well I forgot whether BigInt is a class, is it? Anyhow, suppose it *is* a class and as such has reference semantics. Then a += b modifies an object in-situ, whereas a = a + b creates a whole new object and happens to bind a to that new object.

You're right, though BigInt is not a class. I have, though, seen a BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, the reason it existed was that the only way you can enforce value semantics in D1 is via copy-on-write, which results in many unnecessary heap allocations and copies. So Fritz is correct, it could not be enforced for reference types. The question then is, when are reference semantics desired for an object with arithmetical operator overloading? For matrix slices, maybe? But even then I'm not certain you'd want to allow X=X+Y; you'd probably want to use X[]=X[]+Y[].
Dec 31 2008
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 Well I forgot whether BigInt is a class, is it? Anyhow, suppose it 
 *is* a class and as such has reference semantics. Then a += b modifies 
 an object in-situ, whereas a = a + b creates a whole new object and 
 happens to bind a to that new object.

You're right, though BigInt is not a class. I have, though, seen a BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, the reason it existed was that the only way you can enforce value semantics in D1 is via copy-on-write, which results in many unnecessary heap allocations and copies. So Fritz is correct, it could not be enforced for reference types.

First of all, please note that I'm not German so my name ends with an 's', not a 'z'.
 The question then is, when are reference semantics desired for an object 
 with arithmetical operator overloading?
 
 For matrix slices, maybe? But even then I'm not certain you'd want to 
 allow X=X+Y; you'd probably want to use X[]=X[]+Y[].

I never said it'd be useful to use arithmetic ops with classes. In fact, my suggestion was to just only apply these transformations to structs since they probably wouldn't be very useful for classes anyway.
Dec 31 2008
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Frits van Bommel wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Well I forgot whether BigInt is a class, is it? Anyhow, suppose it 
 *is* a class and as such has reference semantics. Then a += b 
 modifies an object in-situ, whereas a = a + b creates a whole new 
 object and happens to bind a to that new object.

You're right, though BigInt is not a class. I have, though, seen a BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, the reason it existed was that the only way you can enforce value semantics in D1 is via copy-on-write, which results in many unnecessary heap allocations and copies. So Fritz is correct, it could not be enforced for reference types.

First of all, please note that I'm not German so my name ends with an 's', not a 'z'.
 The question then is, when are reference semantics desired for an 
 object with arithmetical operator overloading?

 For matrix slices, maybe? But even then I'm not certain you'd want to 
 allow X=X+Y; you'd probably want to use X[]=X[]+Y[].

I never said it'd be useful to use arithmetic ops with classes. In fact, my suggestion was to just only apply these transformations to structs since they probably wouldn't be very useful for classes anyway.

The problem is that even structs may have reference semantics. Andrei
Dec 31 2008
parent Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Frits van Bommel wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Well I forgot whether BigInt is a class, is it? Anyhow, suppose it 
 *is* a class and as such has reference semantics. Then a += b 
 modifies an object in-situ, whereas a = a + b creates a whole new 
 object and happens to bind a to that new object.

You're right, though BigInt is not a class. I have, though, seen a BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, the reason it existed was that the only way you can enforce value semantics in D1 is via copy-on-write, which results in many unnecessary heap allocations and copies. So Fritz is correct, it could not be enforced for reference types.

First of all, please note that I'm not German so my name ends with an 's', not a 'z'.
 The question then is, when are reference semantics desired for an 
 object with arithmetical operator overloading?

 For matrix slices, maybe? But even then I'm not certain you'd want to 
 allow X=X+Y; you'd probably want to use X[]=X[]+Y[].

I never said it'd be useful to use arithmetic ops with classes. In fact, my suggestion was to just only apply these transformations to structs since they probably wouldn't be very useful for classes anyway.

The problem is that even structs may have reference semantics. Andrei

reference semantics. For example, you might implement a bigint class as a struct X with a dynamic array containing the actual number. If you naively use in-place operations on this array, you quickly run into problems since e.g. X+=1 will cause a reallocation if and only if X=0xFFFFFFFF.... Which gives you reference semantics most of the time, but occasionally gives value semantics instead. I believe this can only be solved with another layer of indirection.

As far as I can tell, reference semantics are only possible if the struct contains a pointer to something with value semantics, for example, a struct, a class, or a fixed-length array; or else to another pointer.

I had once thought that reference semantics may be sometimes desirable for performance reasons, but it seems that it is more likely that they are less efficient.
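The mixed behaviour described above can be reproduced with a toy type. Digits is a hypothetical struct, not the real BigInt: it stores base-2^32 limbs, least significant first, in a dynamic array and increments them naively in place.

```d
// Hypothetical bigint-like struct holding limbs in a dynamic array.
struct Digits
{
    uint[] limbs; // least significant limb first

    // Naive in-place increment with carry.
    void increment()
    {
        foreach (ref limb; limbs)
        {
            if (limb != uint.max)
            {
                ++limb;   // no carry: done, and the array was modified in place
                return;
            }
            limb = 0;     // carry into the next limb
        }
        limbs ~= 1;       // carry out of the top limb: the array must grow
    }
}

void main()
{
    auto x = Digits([uint.max - 1]);
    auto y = x;            // copies the slice, not the limbs it points to

    x.increment();         // in place: y sees the change
    assert(y.limbs == [uint.max]);

    x.increment();         // carry: limb reset in place, then x grows
    assert(x.limbs == [0u, 1u]);
    assert(y.limbs == [0u]); // y is neither its old value nor x's new one
}
```

After the second increment y no longer tracks x reliably, and whether later increments of x remain visible through y depends on whether the append had to reallocate, so no assertion is made about that.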
Jan 01 2009
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Don wrote:
 Andrei Alexandrescu wrote:

 Well I forgot whether BigInt is a class, is it? Anyhow, suppose it 
 *is* a class and as such has reference semantics. Then a += b modifies 
 an object in-situ, whereas a = a + b creates a whole new object and 
 happens to bind a to that new object.


Assuming that BigInt is mutable.
 You're right, though BigInt is not a class. I have, though, seen a 
 BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, 
 the reason it existed was that the only way you can enforce value 
 semantics in D1 is via copy-on-write, which results in many unnecessary 
 heap allocations and copies.
 So Fritz is correct, it could not be enforced for reference types.
 The question then is, when are reference semantics desired for an object 
 with arithmetical operator overloading?
 
 For matrix slices, maybe? But even then I'm not certain you'd want to 
 allow X=X+Y; you'd probably want to use X[]=X[]+Y[].

They would probably do different things:

 - assigning to X would reassign the reference that is X
 - assigning to X[] would fill X in-place.

Stewart.
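
The distinction can be seen with D's built-in slices, which is presumably the model a matrix type would follow. This is only a sketch using `double[]`; the variable names are illustrative.

```d
void main()
{
    double[] x = [1.0, 2.0, 3.0];
    double[] view = x;                 // second reference to x's storage
    double[] y = [10.0, 20.0, 30.0];

    x[] = y[];                         // fills the existing storage in place
    assert(view == [10.0, 20.0, 30.0]);   // the change is visible through view
    assert(x.ptr is view.ptr);            // x still refers to the same memory

    x = y;                             // rebinds x to y's storage
    y[0] = 99.0;
    assert(x[0] == 99.0);              // x now follows y
    assert(view[0] == 10.0);           // the old storage is untouched
}
```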
Dec 31 2008
parent Don <nospam nospam.com> writes:
Stewart Gordon wrote:
 Don wrote:
 Andrei Alexandrescu wrote:

 Well I forgot whether BigInt is a class, is it? Anyhow, suppose it 
 *is* a class and as such has reference semantics. Then a += b 
 modifies an object in-situ, whereas a = a + b creates a whole new 
 object and happens to bind a to that new object.


Assuming that BigInt is mutable.
 You're right, though BigInt is not a class. I have, though, seen a 
 BigIntRef class (in Diemos, I think) which behaved in that way. AFAIK, 
 the reason it existed was that the only way you can enforce value 
 semantics in D1 is via copy-on-write, which results in many 
 unnecessary heap allocations and copies.
 So Fritz is correct, it could not be enforced for reference types.
 The question then is, when are reference semantics desired for an 
 object with arithmetical operator overloading?

 For matrix slices, maybe? But even then I'm not certain you'd want to 
 allow X=X+Y; you'd probably want to use X[]=X[]+Y[].

They would probably do different things:

 - assigning to X would reassign the reference that is X

of matrix, and assigning X to it. I'm not sure that's an operation which you would want. I guess D strings work that way, though.
 - assigning to X[] would fill X in-place.

 Stewart.

Dec 31 2008
prev sibling parent reply Weed <resume755 mail.ru> writes:
Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

The += operator too should return the object (usually "this")
Dec 30 2008
next sibling parent reply Don <nospam nospam.com> writes:
Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"




The += operator too should return the object (usually "this")

ALWAYS 'this'. It's another feature of operator overloading which is redundant.
Dec 30 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:

The += operator too should return the object (usually "this")

ALWAYS 'this'. It's another feature of operator overloading which is redundant.

Well, as you yourself noticed in an older discussion, consistently returning "this" clashes with expression templates. But then IMHO the existence of "auto" makes it imperative that we find a better solution than expression templates.

Changing gears, let's work on a simple example. Say we define a struct Vector and we want the following to be as fast as hand-coded:

    Vector a, b, c;
    double alpha = 0.5;
    ...
    a = b + alpha * c;

The reference hand-coded implementation is:

    enforce(a.length == b.length && b.length == c.length);
    foreach (i; 0 .. a.length) {
        a[i] = b[i] + alpha * c[i];
    }

All right, let's do this with operator overloading :o). Ideas? I have a couple, but I don't want to introduce bias.

Andrei
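
For comparison, here is a sketch of the same update using D's built-in array operations on plain `double[]` slices. This sidesteps the question rather than answering it: it is not operator overloading on a user-defined Vector, and the function name `axpyInto` is made up for illustration. Array operations are intended to be evaluated element by element, without a temporary vector.

```d
import std.exception : enforce;

// Fused update a = b + alpha * c, written with D array operations.
void axpyInto(double[] a, const double[] b, double alpha, const double[] c)
{
    enforce(a.length == b.length && b.length == c.length);
    a[] = b[] + alpha * c[];   // element-wise over all three slices
}

void main()
{
    auto a = new double[3];
    double[] b = [1.0, 2.0, 3.0];
    double[] c = [2.0, 4.0, 6.0];

    axpyInto(a, b, 0.5, c);
    assert(a == [2.0, 4.0, 6.0]);
}
```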
Dec 30 2008
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 [snip]
 
 All right, let's do this with operator overloading :o). Ideas? I have a 
 couple, but I don't want to introduce bias.
 
 Andrei

The only idea I can come up with is to introduce the following:

    struct Vector
    {
        macro opExpression(ast)
        {
            ...
        }
    }

The compile-time opExpression macro's job is to spit out a function that takes each leaf of the AST as an argument and returns the final value of the expression. You could simplify it by ensuring that you don't get any subexpressions that don't result in the type "Vector". For example:

    Vector a, b, c;
    double g, h;
    a = b + (g*h) * c;

The AST would have "(g*h)" as a single node of type double.

Of course, this would probably be exceedingly complex, both on the compiler and user side of things. I can't imagine what the contents of that macro would even look like...

The one positive to this method is that it's about as general as you can get; assuming this was implemented, it would hopefully be possible to implement a simpler scheme on top of it.

    struct Vector
    {
        mixin FusionOverloadFor!("+","-","*","/",".dot",".cross");
    }

I really hope there's a better way.

  -- Daniel
Dec 31 2008
parent reply Don <nospam nospam.com> writes:
Daniel Keep wrote:
 
 
 Andrei Alexandrescu wrote:
 [snip]

 All right, let's do this with operator overloading :o). Ideas? I have 
 a couple, but I don't want to introduce bias.

 Andrei

The only idea I can come up with is to introduce the following: struct Vector { macro opExpression(ast) { ... } } The compile-time opExpression macro's job is to spit out a function that takes each leaf of the AST as an argument and returns the final value of the expression. You could simplify it by ensuring that you don't get any subexpressions that don't result in the type "Vector". For example: Vector a, b, c; double g, h; a = b + (g*h) * c; The AST would have "(g*h)" as a single node of type double. Of course, this would probably be exceedingly complex, both on the compiler and user side of things. I can't imagine what the contents of that macro would even look like...

My BLADE library pretty much does that, so you don't need to guess. The ast is a string of the form "A=B+(C*D)", together with a type tuple Tuple!(Vector, Vector, double, Vector), and a values array ["a","b","(g*h)","c"]. ("g*h" is something like "4.564e+2" if g and h are compile-time constants). One approach would be to look at that code and try to simplify it.
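
To give a flavour of the string-based approach, here is a very small sketch of generating a fused loop from an expression string with a string mixin. It is not BLADE's code or API; `elementwiseLoop` is a hypothetical helper, and all of the simplification described in this thread is left out.

```d
// Hypothetical helper: wrap an element-wise statement in a loop over
// the length of the named array. Evaluated at compile time by mixin.
string elementwiseLoop(string lengthOf, string body_)
{
    return "foreach (i; 0 .. " ~ lengthOf ~ ".length) { " ~ body_ ~ " }";
}

void main()
{
    double[] a = new double[3];
    double[] b = [1.0, 2.0, 3.0];
    double[] c = [2.0, 4.0, 6.0];
    double g = 0.25, h = 2.0;

    // The scalar subexpression (g*h) stays a single node in the string.
    mixin(elementwiseLoop("a", "a[i] = b[i] + (g*h) * c[i];"));
    assert(a == [2.0, 4.0, 6.0]);
}
```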
 
 The one positive to this method is that it's about as general as you can 
 get; assuming this was implemented, it would hopefully be possible to 
 implement a simpler scheme on top of it.
 
 struct Vector
 {
     mixin FusionOverloadFor!("+","-","*","/",".dot",".cross");
 }
 
 I really hope there's a better way.
 
   -- Daniel

Jan 01 2009
parent Don <nospam nospam.com> writes:
Don wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 [snip]

 All right, let's do this with operator overloading :o). Ideas? I have 
 a couple, but I don't want to introduce bias.

 Andrei

The only idea I can come up with is to introduce the following: struct Vector { macro opExpression(ast) { ... } } The compile-time opExpression macro's job is to spit out a function that takes each leaf of the AST as an argument and returns the final value of the expression. You could simplify it by ensuring that you don't get any subexpressions that don't result in the type "Vector". For example: Vector a, b, c; double g, h; a = b + (g*h) * c; The AST would have "(g*h)" as a single node of type double. Of course, this would probably be exceedingly complex, both on the compiler and user side of things. I can't imagine what the contents of that macro would even look like...

My BLADE library pretty much does that, so you don't need to guess. The ast is a string of the form "A=B+(C*D)", together with a type tuple Tuple!(Vector, Vector, double, Vector), and a values array ["a","b","(g*h)","c"]. ("g*h" is something like "4.564e+2" if g and h are compile-time constants). One approach would be to look at that code and try to simplify it.

It performs the following steps. Note that it uses [5...3, 6, , 2..$] or [5...3, 6, 0..$, 2..$] syntax for multi-dimensional slicing.

=== (1) Simplify the expression. ===

(A) Remove all duplicate symbols.
(B) Combine all scalars into a single scalar. Combine slicing operations into vector variables.
(C) Deal with slices.
 - A[B..C, whatever][D..E] = A[(B+D)..(B+E), whatever] for any non-slicable expression A, including something ending with an index. Assert(E-D <= C-B). (Special case A[B..$, whatever][D..$] = A[(B+D)..$, whatever])
 - A[whatever][D..E] = A[D..E, whatever] if A is slicable.
(D) Rank and arithmetic transformations.
 - Use slicing distributive law for linear algebra: Given A[B..C] for expressions A,B,C where B and C are non-slicable, and A is slicable, the slice can be moved to every vector inside A. Note that this may convert matrix multiplications into dot products.
 - Remove unary minus where possible, eg A-(-B) => A+B, abs(-A) => abs(A).
 - Use associativity of * in intrinsics: sum(A*V) => A*sum(V), abs(A*B) => abs(A)*abs(B)
(E) Expression standardisation
 - Move multiplies to left: Convert A[]*B into B*A[] (assumes * is commutative, not valid for quaternions).
 - Convert C-A*B into C+(-A)*B whenever possible.
(F) Remove '$'. Convert x[$] into x[x.length].
(G) Check all of the ranks for each operation. Add each one to the list of asserts.

=== (2) Categorize the expression ===

This determines which type of loop it is. For example, with a matrix-vector multiply you have nested loops.

=== (3) Generate asserts ===

Spit out all the asserts which were generated during the simplification pass.

=== (4) Generate code, based on expression category ===

I only did this for a few cases, but it's generally pretty straightforward. It would get much more complicated once you start adding cache blocking techniques and sparse matrices...

I completely implemented steps 1 to 3, so this isn't complete fantasy.
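
The slice rules in (C) and the '$' rule in (F) can be checked against D's built-in one-dimensional slices. This is only a sanity check of the identities on `int[]`, not of the multi-dimensional syntax mentioned above.

```d
void main()
{
    int[] a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

    // A[B..C][D..E] == A[(B+D)..(B+E)], here with B=2, C=8, D=1, E=3.
    assert(a[2 .. 8][1 .. 3] is a[3 .. 5]);   // same elements, same length
    assert(a[2 .. 8][1 .. 3] == [3, 4]);

    // Special case A[B..$][D..$] == A[(B+D)..$].
    assert(a[2 .. $][3 .. $] is a[5 .. $]);

    // Removing '$': inside the brackets, $ is the length of the array.
    assert(a[$ - 1] == a[a.length - 1]);
}
```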
 
 The one positive to this method is that it's about as general as you 
 can get; assuming this was implemented, it would hopefully be possible 
 to implement a simpler scheme on top of it.

 struct Vector
 {
     mixin FusionOverloadFor!("+","-","*","/",".dot",".cross");
 }

 I really hope there's a better way.

   -- Daniel


Jan 02 2009
prev sibling parent Yigal Chripun <yigal100 gmail.com> writes:
Andrei Alexandrescu Wrote:

 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:

The += operator too should return the object (usually "this")

ALWAYS 'this'. It's another feature of operator overloading which is redundant.

Well as well as you yourself noticed in an older discussion, consistently returning "this" clashes with expression templates. But then IMHO the existence of "auto" makes it imperative that we find a better solution than expression templates. Changing gears, let's work on a simple example. Say we define a struct Vector and we want the following to be as fast as hand-coded: Vector a, b, c; double alpha = 0.5; ... a = b + alpha * c; The reference hand-coded implementation is: enforce(a.length == b.length && b.length == c.length); foreach (i; 0 .. a.length) { a[i] = b[i] + alpha * c[i]; } All right, let's do this with operator overloading :o). Ideas? I have a couple, but I don't want to introduce bias. Andrei

Here's my first attempt:

    struct Vector {
        double[] arr; // internal array

        opMul_r(double val) {
            for (int i; i < arr.length; ++i) {
                yield val.opLazyMul(arr[i]);
            }
        }

        opAdd(Vector other) {
            for (int i; i < arr.length; ++i) {
                yield arr[i].opLazyAdd(other[i]);
            }
        }
        ...
    }

The idea in the [pseudo-]code above is:

a) built-in types (double in the above) provide lazy ops. So, for example, double.opLazyAdd will look similar to this:

    double opLazyAdd(double other) {
        return double apply() { return this + other; };
    }

   note: there should be syntax for this: first thought - op, like in 5 + 6;

b) generators expressed with yield.

Think of this as expression templates with parts of it inside the compiler.
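
D has no `yield`, but a similar lazy, element-at-a-time evaluation can be sketched with Phobos ranges. Note this swaps in a different technique (lazy ranges from std.range and std.algorithm) for the generators in the pseudo-code, and works on plain arrays rather than an overloaded Vector.

```d
import std.algorithm : copy, map;
import std.range : zip;

void main()
{
    double[] a = new double[3];
    double[] b = [1.0, 2.0, 3.0];
    double[] c = [2.0, 4.0, 6.0];
    double alpha = 0.5;

    // Nothing is computed here: this only builds a lazy description
    // of b + alpha * c.
    auto expr = zip(b, c).map!(t => t[0] + alpha * t[1]);

    // Elements are produced one at a time while being written into a.
    expr.copy(a);
    assert(a == [2.0, 4.0, 6.0]);
}
```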
Jan 01 2009
prev sibling parent reply Weed <resume755 mail.ru> writes:
Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:

The += operator too should return the object (usually "this")

ALWAYS 'this'. It's another feature of operator overloading which is redundant.

Not always. It can be more convenient to create a new object and return it. For example, if the object has to hold sorted data, sorting while the returned object is being created can perform better than re-sorting the data in the current object after it changes.
Jan 01 2009
parent reply Don <nospam nospam.com> writes:
Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:


redundant.

Not always. Can be more convenient to create the new object and to return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!
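
A sketch of the situation in question, using D2's `opOpAssign` (the D1 spelling in this thread is `opAddAssign`). The class `SortedBag` is hypothetical and deliberately returns a new object from `+=`.

```d
import std.algorithm : sort;

class SortedBag
{
    int[] data;

    this(int[] d)
    {
        data = d.dup;
        sort(data);          // sorted once, at construction
    }

    // Deliberately returns a NEW object instead of `this`.
    SortedBag opOpAssign(string op)(int v) if (op == "+")
    {
        return new SortedBag(data ~ v);
    }
}

void main()
{
    auto x = new SortedBag([3, 1, 2]);
    auto y = (x += 0);

    assert(y !is x);                  // y is the freshly built object
    assert(y.data == [0, 1, 2, 3]);
    assert(x.data == [1, 2, 3]);      // x never saw the update
}
```

Since the language does not rebind `x` to the returned object, `x` keeps its old contents, so `x` and `y` really do end up different.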
Jan 02 2009
parent reply Weed <resume755 mail.ru> writes:
Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that
 "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:


redundant.

Not always. Can be more convenient to create the new object and to return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!

I agree, my point of view is disputable. The programmer may want to return something other than the current object: the returned object and this would be equivalent but not identical. Do not forget that the object need not be a class; it can be a struct, and in certain cases such a return can save some resources. But if we are forced to return this under threat of a compile error, I will not cry. :) And you have certainly noticed that this again shows the weakness of the decision to divide structs and classes along the lines of "by value" and "by reference". :)
Jan 03 2009
parent reply Don <nospam nospam.com> writes:
Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that
 "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"



new one:


redundant.

Not always. Can be more convenient to create the new object and to return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!

I agree, my point of view disputable. The programmer can have a desire to return not current object: the returned and this will be equivalent but are not identical. Do not forget that this object may be not a class - it can be struct and such return can in certain to save a few resources. But if us will force to return this under the threat of a compile error I will not cry.:) And you have certainly noticed that here the solution inaccuracy again appears to divide structures and classes by a principle "on value" and "reference". :)

Yah. They're almost the same, but not quite. It's interesting that value semantics are IMPOSSIBLE with classes (I didn't know that until Stewart's post), whereas reference semantics are possible (but ugly) with structs. In my experience with D, I use structs + templates far more frequently than classes + polymorphism. And I suspect that if interfaces were a bit more powerful and efficient, struct+interface might replace even more of the use cases for class-based run-time polymorphism. So I must admit, I'm quite biased against classes.
Jan 04 2009
parent reply Weed <resume755 mail.ru> writes:
Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that
 "the
 compiler is entitled to assume that X+=Y yields the same
 result as
 X=X+Y"



new one:


redundant.

Not always. Can be more convenient to create the new object and to return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!

I agree, my point of view disputable. The programmer can have a desire to return not current object: the returned and this will be equivalent but are not identical. Do not forget that this object may be not a class - it can be struct and such return can in certain to save a few resources. But if us will force to return this under the threat of a compile error I will not cry.:) And you have certainly noticed that here the solution inaccuracy again appears to divide structures and classes by a principle "on value" and "reference". :)

Yah. They're almost the same, but not quite. It's interesting that value semantics are IMPOSSIBLE with classes (I didn't know that until Stewart's post), whereas reference semantics with structs are possible (but ugly) with structs. In my experience with D, I use structs + templates far more frequently than classes + polymorphism. And I suspect that if interfaces were a bit more powerful and efficient, struct+interface might replace even more of the use cases for class-based run-time polymorphism. So I must admit, I'm quite biased against classes.

I absolutely agree that interfaces are needed for structs too. But aren't you just reproducing the OOP paradigm by means of templates and mixins?
Jan 04 2009
parent reply Don <nospam nospam.com> writes:
Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that
 "the
 compiler is entitled to assume that X+=Y yields the same
 result as
 X=X+Y"



new one:


redundant.

return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!

The programmer can have a desire to return not current object: the returned and this will be equivalent but are not identical. Do not forget that this object may be not a class - it can be struct and such return can in certain to save a few resources. But if us will force to return this under the threat of a compile error I will not cry.:) And you have certainly noticed that here the solution inaccuracy again appears to divide structures and classes by a principle "on value" and "reference". :)

semantics are IMPOSSIBLE with classes (I didn't know that until Stewart's post), whereas reference semantics with structs are possible (but ugly) with structs. In my experience with D, I use structs + templates far more frequently than classes + polymorphism. And I suspect that if interfaces were a bit more powerful and efficient, struct+interface might replace even more of the use cases for class-based run-time polymorphism. So I must admit, I'm quite biased against classes.

I am absolutely agree with that that the interfaces too are necessary for structs. But whether you by means of templates and mixin repeat an OOP programming paradigm?

I generally use compile-time polymorphism rather than run-time. But there's an interesting question: using opDot() and mixins, how close can you come to implementing classes? A particularly interesting case is the GoF Strategy pattern, where derived classes add no data, they only override virtual functions. The slicing problem never happens with such objects, provided that you include the vtable pointer when you copy the object. Sounds like you want one struct + multiple interfaces.
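
A rough sketch of the Strategy case with a struct, where the "vtable" is reduced to a single function pointer stored in the value. The names `Scaler`, `twice` and `square` are invented for illustration; this shows only that copying the struct copies the strategy along with it, so nothing is sliced off.

```d
// A one-entry, hand-rolled "vtable": the strategy is a function pointer.
struct Scaler
{
    double function(double) apply;   // copied together with the struct

    double run(double x) const { return apply(x); }
}

double twice(double x)  { return 2 * x; }
double square(double x) { return x * x; }

void main()
{
    auto s = Scaler(&twice);
    auto t = s;              // plain value copy keeps the strategy
    assert(t.run(3) == 6);

    t.apply = &square;       // changing the copy leaves s alone
    assert(s.run(3) == 6);
    assert(t.run(3) == 9);
}
```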
Jan 05 2009
parent Weed <resume755 mail.ru> writes:
Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Don wrote:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that
 "the
 compiler is entitled to assume that X+=Y yields the same
 result as
 X=X+Y"



new one:


redundant.

return it. For example: if it is necessary to return the object containing the sorted data those sorting hurriedly at creation of the returned object can give a scoring in performance than if the data is sorted in the current object after their change.

y = x+=b; surely that would give you x and y being different? Wouldn't you want x to also point to the new object? (OK, as Stewart pointed out, you can't!) Otherwise, you have to perform _both_ sorts!

The programmer can have a desire to return not current object: the returned and this will be equivalent but are not identical. Do not forget that this object may be not a class - it can be struct and such return can in certain to save a few resources. But if us will force to return this under the threat of a compile error I will not cry.:) And you have certainly noticed that here the solution inaccuracy again appears to divide structures and classes by a principle "on value" and "reference". :)

semantics are IMPOSSIBLE with classes (I didn't know that until Stewart's post), whereas reference semantics with structs are possible (but ugly) with structs. In my experience with D, I use structs + templates far more frequently than classes + polymorphism. And I suspect that if interfaces were a bit more powerful and efficient, struct+interface might replace even more of the use cases for class-based run-time polymorphism. So I must admit, I'm quite biased against classes.

I am absolutely agree with that that the interfaces too are necessary for structs. But whether you by means of templates and mixin repeat an OOP programming paradigm?

I generally use compile-time polymorphism rather than run-time.

What is the difference between them? The vtable?
 But there's an interesting question: using opDot() and mixins, how close
 can you come to implementing classes?

We have all got there already! :) In that example with matrices (http://www.dsource.org/projects/openmeshd/browser/trunk/LinAlg/linalg/MatrixT.d, template MultReturnType (ArgT)) the return type of the matrices would need to be changed to void (to a pointer to void), with a compile-time check against a list of types added to make it general for all matrices and other structs (vectors etc.). That is, it would imitate a virtual base class. That is the general idea, but I have not checked it. If real vtable support is needed, I think that can be done too; I do not see any problems. But I will be surprised if you do not call it bad design.
 A particularly interesting case is the GoF Strategy pattern, where
 derived classes add no data, they only override virtual functions.
 The slicing problem never happens with such objects, provided that you
 include the vtable pointer when you copy the object.

Imitating real OOP is also possible, and not much more difficult.
 Sounds like you want one struct + multiple interfaces.

I am not against it. Really, if a struct can have methods, then it is logical that it can have an interface too (and interfaces can be inherited). Compile-time inheritance could be added as well, written like this:

    struct RGB
    {
        ubyte r;
        ubyte g;
        ubyte b;
    }

    struct RGBA : RGB
    {
        ubyte a;
    }

Such code is quite clear. But it is possible to live without these. Much more important are good old by-value classes; without them it is apparently impossible.
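
The `struct RGBA : RGB` syntax above is a proposal and is not valid D. Something close to it can be sketched in D2 with composition plus `alias this`; the helper function `red` below is made up for the example.

```d
struct RGB
{
    ubyte r, g, b;
}

struct RGBA
{
    RGB rgb;          // the "base" part, embedded by value
    alias rgb this;   // forwards member lookup and implicit conversion
    ubyte a;
}

// Accepts the "base" type only.
ubyte red(RGB c) { return c.r; }

void main()
{
    auto c = RGBA(RGB(10, 20, 30), 255);
    assert(c.g == 20);        // found through alias this
    assert(c.a == 255);
    assert(red(c) == 10);     // an RGBA converts to its RGB part
}
```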
Jan 05 2009
prev sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Tue, Jan 6, 2009 at 1:10 AM, Weed <resume755 mail.ru> wrote:
 In that example with matrices
 (http://www.dsource.org/projects/openmeshd/browser/trunk/LinAlg/linalg/MatrixT.d,
 template MultReturnType (ArgT)) the returned type of matrices needed to
 be altered in void (in the pointer on void). And add check by list on
 types to make the general for all matrices and other structs (vectors
 etc.) in compile time. That is to imitate a base virtual class.
 As a whole so, but I am did not check it.

I don't think this is true. But I have trouble understanding your English, so it's possible I've misunderstood. It's a little more complicated to write, but MatrixT could accept an extra 'traits' template parameter that describes things like the return type from multiplication. That parameter can also implement a generic interface for querying sizes and accessing elements of matrices. If you do that, then the types don't have to be hard coded like they are there. You can also set it up so that if the user doesn't supply such a template parameter then a default one is used which knows about some types (basically extract that MultReturnType logic out into part of a separate default traits template). --bb
Jan 05 2009
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Don wrote:
 There's been some interesting discussion about operator overloading 
 over the past six months, but to take the next step, I think we need 
 to ground it in reality. What are the use cases?

 I think that D's existing opCmp() takes care of the plethora of 
 trivial cases where <, >= etc are overloaded. It's the cases where the 
 arithmetic and logical operations are overloaded that are particularly 
 interesting to me.

 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.

 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps, and some stuff 
 that is quite frankly bizarre.

 So, please post any use cases which you consider convincing.

Some observations based on the use cases to date: (1) a += b is ALWAYS a = a + b (and likewise for all other operations). opXXXAssign therefore seems to be a (limited) performance optimisation. The compiler should be allowed to synthesize += from +. This would almost halve the minimum number of repetitive functions required.

I don't think compiler technology is good enough to automatically synthesize += from + for e.g. matrices. An easier path would be to synthesize + from +=, but sometimes that would be suboptimal.
 A straightforward first step would be to state in the spec that "the 
 compiler is entitled to assume that X+=Y yields the same result as X=X+Y"

I think that's a good idea.
 (2)
 There seems to be a need for abstract syntax trees, which is NOT 
 necessarily related to performance. (If we had a 'perfect performance' 
 solution for operator overloading, it would not remove the desire for 
 abstract syntax trees).
 (3)
 The array operations ~, [], [..] need further attention. A solution for 
 $ is also required.
 

Cool. Andrei
Dec 30 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 30 Dec 2008 22:26:19 +0300, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that  
 "the compiler is entitled to assume that X+=Y yields the same  
 result as X=X+Y"

That doesn't hold for reference types, does it?

I thought it does? Got any counter examples?

For any class type, with += modifying the object and + returning a new one:

a good idea? I can't think of any cases where that's anything other than a bug-breeder.
 You can't just arbitrarily substitute between these two.

sense. No-one has yet come up with such a use case. I postulate that it doesn't exist.

Well I forgot whether BigInt is a class, is it? Anyhow, suppose it *is* a class and as such has reference semantics. Then a += b modifies an object in-situ, whereas a = a + b creates a whole new object and happens to bind a to that new object. Andrei

It was suggested 2 posts up the thread. I believe Don is looking for a use case where given

    a1 = a + b;
    a2 = a;
    a2 += b;

the following check intentionally fails:

    assert(a1 == a2); // not "a1 is a2"

He postulates that none exists.
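
Spelled out for a class in D2 syntax, with a hypothetical `Counter` type whose `+=` mutates in place: the values agree, which is all the proposed assumption asks for, while the identities differ and the original `a` is modified through the alias.

```d
class Counter
{
    int n;
    this(int n) { this.n = n; }

    // + builds a new object.
    Counter opBinary(string op)(Counter rhs) if (op == "+")
    {
        return new Counter(n + rhs.n);
    }

    // += mutates in place and returns this.
    Counter opOpAssign(string op)(Counter rhs) if (op == "+")
    {
        n += rhs.n;
        return this;
    }
}

void main()
{
    auto a = new Counter(1);
    auto b = new Counter(2);

    auto a1 = a + b;      // new object, a untouched so far
    auto a2 = a;          // a2 is another name for the same object
    a2 += b;              // mutates the object a refers to as well

    assert(a1.n == a2.n); // values agree
    assert(a1 !is a2);    // identities differ
    assert(a.n == 3);     // the visible side effect: a itself changed
}
```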
Dec 30 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
2008/12/31 Don <nospam nospam.com>:
 Weed wrote:
 Frits van Bommel wrote:
 Don wrote:
 Frits van Bommel wrote:
 Don wrote:
 A straightforward first step would be to state in the spec that "the
 compiler is entitled to assume that X+=Y yields the same result as
 X=X+Y"

 That doesn't hold for reference types, does it?

 I thought it does? Got any counter examples?

 For any class type, with += modifying the object and + returning a new
 one:

 The += operator too should return the object (usually "this")

 ALWAYS 'this'. It's another feature of operator overloading which is
 redundant.

In D1 I always return void for += in structs because you can't return
"this" by reference, and the compiler doesn't seem to consider it an
error to modify the result in a way that has no side effects.  So I
just consider it too error prone to have += return anything.  Better
to get a compiler error when you try to do something cute like   y = 2
+ (x+=3).

In D2 with ref returns the situation might be different.  But I
basically never find myself needing a return value from +=.

--bb
Dec 30 2008
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Don wrote:
 Don wrote:
 There's been some interesting discussion about operator overloading 
 over the past six months, but to take the next step, I think we need 
 to ground it in reality. What are the use cases?

 I think that D's existing opCmp() takes care of the plethora of 
 trivial cases where <, >= etc are overloaded. It's the cases where the 
 arithmetic and logical operations are overloaded that are particularly 
 interesting to me.

 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.

 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps, and some stuff 
 that is quite frankly bizarre.

 So, please post any use cases which you consider convincing.

Some observations based on the use cases to date: (1) a += b is ALWAYS a = a + b (and likewise for all other operations). opXXXAssign therefore seems to be a (limited) performance optimisation. The compiler should be allowed to synthesize += from +. This would almost halve the minimum number of repetitive functions required.

I thought it did. I must've been imagining it.

But the really silly thing is that, for classes, it seems you can't even make a += b behave as a = a + b, i.e. reassigning the reference a. Consequently, having op= on an immutable big integer class is out of the question.

There was a proposal that I supported a while back - see
http://www.digitalmars.com/d/archives/digitalmars/D/announce/DMD_0.177_release_6132.html#N6150
and followups. The same would work for opXXXAssign (albeit not static).

In short, the way it could be made to work is:

- If a.opAddAssign(b) exists and returns something, make a += b equivalent to a = a.opAddAssign(b)
- If a.opAddAssign(b) exists and returns void, make a += b equivalent to (a.opAddAssign(b), a)
  (this is a bit like another of my too-much-ignored proposals http://www.digitalmars.com/d/archives/digitalmars/D/10199.html )
- Otherwise, make a += b equivalent to a = a + b.

Of course, the expression represented by a would be evaluated only once in each case.

Stewart.
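The three-way rule in this proposal can be emulated as an ordinary library function, which may make its behaviour easier to see. This is only a sketch under stated assumptions: addAssign, plus and plusAssign are hypothetical names standing in for "a += b", opAdd and opAddAssign, and the three small types exist purely to exercise each branch. It is not how any D compiler actually lowers +=.

```d
/// Emulates the proposed lowering of `a += b` as an ordinary function.
/// Taking `a` by ref means its expression is evaluated only once.
void addAssign(A, B)(ref A a, B b)
{
    static if (__traits(hasMember, A, "plusAssign"))
    {
        static if (is(typeof(a.plusAssign(b)) == void))
            a.plusAssign(b);     // returns void: update in place, keep the binding
        else
            a = a.plusAssign(b); // returns something: rebind a to the result
    }
    else
        a = a.plus(b);           // no plusAssign: fall back to a = a + b
}

// Immutable-style integer: plusAssign hands back a fresh object.
final class ImmInt
{
    immutable int value;
    this(int v) { value = v; }
    ImmInt plusAssign(ImmInt rhs) { return new ImmInt(value + rhs.value); }
}

// Mutable accumulator: plusAssign updates in place and returns void.
final class Acc
{
    int value;
    this(int v) { value = v; }
    void plusAssign(Acc rhs) { value += rhs.value; }
}

// Only plus is defined, so the compound form has to be synthesized from it.
struct Vec
{
    int x;
    Vec plus(Vec rhs) { return Vec(x + rhs.x); }
}

unittest
{
    auto x = new ImmInt(1);
    auto old = x;
    addAssign(x, new ImmInt(2));
    assert(x.value == 3 && x !is old && old.value == 1); // rebound, original untouched

    auto acc = new Acc(1);
    auto same = acc;
    addAssign(acc, new Acc(2));
    assert(acc is same && acc.value == 3); // same object, mutated

    auto v = Vec(1);
    addAssign(v, Vec(2));
    assert(v.x == 3); // synthesized from plus
}
```

The first case is the one that matters for an immutable big integer class: the reference is rebound instead of the object being modified.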
Dec 31 2008
parent Don <nospam nospam.com> writes:
Stewart Gordon wrote:
 Don wrote:
 Don wrote:
 There's been some interesting discussion about operator overloading 
 over the past six months, but to take the next step, I think we need 
 to ground it in reality. What are the use cases?

 I think that D's existing opCmp() takes care of the plethora of 
 trivial cases where <, >= etc are overloaded. It's the cases where 
 the arithmetic and logical operations are overloaded that are 
 particularly interesting to me.

 The following mathematical cases immediately spring to mind:
 * complex numbers
 * quaternions
 * vectors
 * matrices
 * tensors
 * bigint operations (including bigint, bigfloat,...)
 I think that all of those are easily defensible.

 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps, and some stuff 
 that is quite frankly bizarre.

 So, please post any use cases which you consider convincing.

Some observations based on the use cases to date: (1) a += b is ALWAYS a = a + b (and likewise for all other operations). opXXXAssign therefore seems to be a (limited) performance optimisation. The compiler should be allowed to synthesize += from +. This would almost halve the minimum number of repetitive functions required.

I thought it did. I must've been imagining it. But the really silly thing is that, for classes, it seems you can't even make a += b behave as a = a + b, i.e. reassigning the reference a. Consequently, having op= on an immutable big integer class is out of the question.

Yuck! That's good to know.

 There was a proposal that I supported a while back - see
 http://www.digitalmars.com/d/archives/digitalmars/D/announce/DMD_0.177_release_6132.html#N6150
 
 and followups.  The same would work for opXXXAssign (albeit not static).
 
 In short, the way it could be made to work is:
 
 - If a.opAddAssign(b) exists and returns something, make a += b 
 equivalent to a = a.opAddAssign(b)
 - If a.opAddAssign(b) exists and returns void, make a += b equivalent to 
 (a.opAddAssign(b), a)
 (this is a bit like another of my too-much-ignored proposals
 http://www.digitalmars.com/d/archives/digitalmars/D/10199.html )
 - Otherwise, make a += b equivalent to a = a + b.
 
 Of course, the expression represented by a would be evaluated only once 
 in each case.
 
 Stewart.

Dec 31 2008
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 30 Dec 2008 17:30:13 +0300, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:

 Bill Baxter wrote:
 Merging might be useful there too --- A ~= b ~ c ~ d  is probably more
 efficiently implemented as 3 ~= ops.

Actually, it's probably most efficiently implemented as 1 "~=" with multiple parameters. (DMD already does this for arrays)

Perhaps, but not general enough: A += a * b - c / d; // how to do this one?
Dec 30 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Tue, Dec 30, 2008 at 11:30 PM, Frits van Bommel
<fvbommel remwovexcapss.nl> wrote:
 Bill Baxter wrote:
 Merging might be useful there too --- A ~= b ~ c ~ d  is probably more
 efficiently implemented as 3 ~= ops.

Actually, it's probably most efficiently implemented as 1 "~=" with multiple parameters. (DMD already does this for arrays)

True. You mean look at all the inputs, figure out how much space you're going to need in the end, and just do one allocation for all of it. That's a good point. So these kind of ops that increase the size of the output really do need a different kind of fusion strategy. --bb
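The "one allocation for all of it" strategy can be sketched as a small helper. appendAll is a hypothetical name, and this is only an illustration of the idea, not a description of what DMD does internally for arrays: it sums the lengths of all inputs first, requests the final capacity once, and only then appends.

```d
/// Appends all parts to dst, requesting the final capacity up front.
void appendAll(T, Parts...)(ref T[] dst, Parts parts)
{
    size_t extra = 0;
    foreach (p; parts)
        extra += p.length;

    dst.reserve(dst.length + extra); // one capacity request for the whole chain
    foreach (p; parts)
        dst ~= p;                    // should not need to grow again
}

unittest
{
    int[] a = [1];
    appendAll(a, [2, 3], [4], [5, 6]); // stands in for: a ~= b ~ c ~ d
    assert(a == [1, 2, 3, 4, 5, 6]);
}
```

Compared with three separate ~= operations, the sizes of all operands are known before any copying starts, which is the different kind of fusion Bill describes.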
Dec 30 2008
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don wrote:
 But I know of very few reasonable non-mathematical uses.
 In C++, I've seen them used for iostreams, regexps,

Those are not reasonable non-mathematical uses. I've hated C++ iostreams from their very beginning. I never use them outside of test code. I think Andrei got it right with writefln().
Dec 30 2008
next sibling parent reply John Reimer <terminal.node gmail.com> writes:
Hello Walter,

 Don wrote:
 
 But I know of very few reasonable non-mathematical uses. In C++, I've
 seen them used for iostreams, regexps,
 

Those are not reasonable non-mathematical uses. I've hated C++ iostreams from their very beginning. I never use them outside of test code. I think Andrei got it right with writefln().

What does 'fln' stand for? Is that something like "Format with LiNe carriage return"? With apologies to Andrei, I can't agree that it's pretty, but I suppose it works. :) -JJR
Dec 30 2008
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
John Reimer wrote:
 Hello Walter,
 
 Don wrote:

 But I know of very few reasonable non-mathematical uses. In C++, I've
 seen them used for iostreams, regexps,

Those are not reasonable non-mathematical uses. I've hated C++ iostreams from their very beginning. I never use them outside of test code. I think Andrei got it right with writefln().

What does 'fln' stand for? Is that something like "Format with LiNe carriage return"? With apologies to Andrei, I can't agree that it's pretty, but I suppose it works. :)

The name and original implementation of writefln are Walter's and predate my tenure with D. I just defined write() and writeln(). Andrei
Dec 30 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 The name and original implementation of writefln are Walter's and 
 predate my tenure with D. I just defined write() and writeln().

write/writeln are very useful to avoid the silly bugs caused by the possible presence of a % inside the first string given to writefln. But write/writeln still have quite a large number of holes/limits (*) that deserve (or must, I'd say) to be filled/fixed. I have named my pair of fixed ones put/putr (the final "r" stands for return); they are fixed but not perfect (they still have a few bugs), the names are shorter to type, and you can't write writenl by mistake as I've done for a couple of months. (*) If you want I can list a few pages of such problems/limits/bugs. I have not done this already because not a single person here has shown any interest in fixing the writef/writefln/write/writeln functions so far. Bye, bearophile
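The % hazard arises whenever text that may contain a percent sign is used as the format string itself. A small sketch, using std.format.format from current Phobos as a stand-in (the D1 writefln discussed here behaved differently in its details): keeping the data in an argument, or doubling a literal %, avoids the problem.

```d
import std.format : format;

unittest
{
    string userText = "100% done";

    // Safe: the data travels as an argument, never as the format string.
    assert(format("%s", userText) == "100% done");

    // A literal percent sign inside a format string must be doubled.
    assert(format("%s%% done", 100) == "100% done");
}
```

write/writeln sidestep the issue entirely because they take no format string and print their arguments as they are.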
Dec 31 2008
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 The name and original implementation of writefln are Walter's and 
 predate my tenure with D. I just defined write() and writeln().

write/writeln are very useful to avoid the silly bugs caused by the possible presence of a % inside the first string given to writefln. But write/writeln still have quite a large number of holes/limits (*) that deserve (or must, I'd say) to be filled/fixed. I have named my pair of fixed ones put/putr (the final "r" stands for return); they are fixed but not perfect (they still have a few bugs), the names are shorter to type, and you can't write writenl by mistake as I've done for a couple of months. (*) If you want I can list a few pages of such problems/limits/bugs. I have not done this already because not a single person here has shown any interest in fixing the writef/writefln/write/writeln functions so far.

It would be great if you brought them in the open! Andrei
Dec 31 2008
prev sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

It would be great if you brought them in the open!<

OK, but it's late for me now, I'll do it when I can, possibly tomorrow. It's a lot of stuff to write.

----------------

Bill Baxter:
Apparently Andrei was, since he took the trouble to re-write the D1 versions.<

I see, I didn't know it (I have supposed they just differ for not accepting the % syntax). I have never used the D2 versions, so my "improvements" are over the D1 version. So it's possible some things are already fixed in D2... Note that my code is far from perfect, it already has two known bugs... but it clearly shows where I am aiming (and I hope to be able to remove one of those bugs, and some of you may help fix the other bug). Later, bear hugs and a happy new year, bearophile
Dec 31 2008
parent reply bearophile <bearophileHUGS lycos.com> writes:
The first part of this post was posted around October 2 2008, and shows a list
of general bugs I have found in DMD/D.

This post lists several problems/bugs/limits I have found in the write/writefln
of D1.

Most of such problems are solved in my implementation of the put/putr
functions, that you can find in my dlibs, inside the module 'string'. Note that
such functions have two known bugs (and some limits), I'll fix one of such bugs
ASAP. So here I'll talk mostly about what my code does. Note that often there's
no "correct" representation, but having a fixed default one is much better than
nothing.

The name of the functions: put/putr are very short to type, its purpose is easy
to understand, and there's very little risk of typos. So I think they are the
best choice of names. I have used them for almost a year now.

The purpose of such functions: to print data and data structures on the
console. Such printing is mostly for the programmer, especially during
debugging, or for logging. So the print functions have to be:
- Fast, possibly as fast as printf(). At the moment put/putr are slower than
writef/writefln, and writef/writefln are considerably slower than printf().
Ideally writef/writefln can become as fast as printf(). I'd like put/putr to
become faster, but it may require a lot of work. The slowness of put/putr is
generally less important, because where a lot of data has to be printed,
printf() can often be used.
- Unambiguous: the printed data must clearly show the type and content of data.
- Complete: all built-ins must have a good representation.
- Elegant and clean: to speed up reading and to produce good logs.
- A representation: when possible it can be useful to have an alternative way
to represent something in a more precise way. Python tells apart the two
usages: str() and its dual __str__ return a readable textual representation,
while repr() and __repr__ often return a textual representation that, when put
inside the code, generates the same object that was printed. In dlibs I have
found it useful to create the same pair of functions (but in D objects have
only one toString(), so there's no support for a toRepr or something similar).

So in the string module of the dlibs you can find put/putr and str/repr
functions. The second pair returns a textual representation and the first pair
prints.

Some notes on how put/putr work:
- Strings inside arrays/AAs are printed with "" around them and with special
chars escaped. Because sub-items in collections are string-fied using repr and
not str.
- Structs without toString(): prints their fields in the middle of <>.
- delegates are printed between <>.
- Objects: prints just their qualified name (this may be changed in the future).
- Unions: it prints the first field (using a .tupleof[0]) in the middle of {}.
- Pointers are printed as hex integers.
- Printing AAs that contain static arrays (as keys or values) may require lot
of memory. This is a bug of D itself.
- Interfaces are printed as: interface:modulepath.Interfacename.
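The first two rules above (items inside collections go through repr, and structs without toString() print their fields between <>) can be sketched in a few lines of modern D2. This is not the dlibs implementation; it is a minimal reconstruction under my own assumptions that only handles strings, arrays and plain structs, and Pair is a hypothetical type used for the demonstration.

```d
import std.array : join;
import std.conv : to;
import std.traits : isArray, isSomeString;

/// Precise form: strings are quoted and escaped so they stand out in collections.
string repr(T)(T x)
{
    static if (isSomeString!T)
    {
        string r = "\"";
        foreach (dchar c; x)
        {
            switch (c)
            {
                case '"':  r ~= `\"`; break;
                case '\\': r ~= `\\`; break;
                case '\t': r ~= `\t`; break;
                case '\n': r ~= `\n`; break;
                default:   r ~= c;    break;
            }
        }
        return r ~ "\"";
    }
    else
        return str(x);
}

/// Readable form: a top-level string is returned as is; items go through repr.
string str(T)(T x)
{
    static if (isSomeString!T)
        return x.to!string;
    else static if (isArray!T)
    {
        string[] items;
        foreach (e; x)
            items ~= repr(e);
        return "[" ~ items.join(", ") ~ "]";
    }
    else static if (is(T == struct) && !__traits(hasMember, T, "toString"))
    {
        string[] fields;
        foreach (f; x.tupleof) // unrolled at compile time, one step per field
            fields ~= repr(f);
        return "<" ~ fields.join(", ") ~ ">";
    }
    else
        return x.to!string;
}

struct Pair { int x, y; } // hypothetical struct without toString()

unittest
{
    assert(str("hello") == "hello");
    assert(str(["a", "b", "ca"]) == `["a", "b", "ca"]`);
    assert(str(["ab\tc"]) == `["ab\tc"]`); // the tab is shown escaped
    assert(str([Pair(10, 5), Pair(20, 6)]) == "[<10, 5>, <20, 6>]");
}
```

The mutual recursion between str and repr is what makes nesting work: each level decides only how to wrap its own items.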

Limitations:
- Printing very large dictionaries with this function can be 2+ times slower
than writefln.
- It can't print an enum.
- It doesn't work with dstring and wstring yet (to be fixed).
- It doesn't print structs/classes that have private attributes and no
toString() defined.

Bug: This situation with self-nested array of box isn't handled yet:
----------------
import std.boxer: box, Box;
import d.func: putr;
void main() {
  auto a = new Box[1];
  a[0] = box(a);
  putr(a); // Error: Stack Overflow
}
----------------
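One common way to guard against this kind of self-reference is to remember the objects on the current path and stop when one of them is met again. The sketch below uses a hypothetical Node class in place of std.boxer's Box (a D1 module), so it illustrates the technique only; it is not a fix for putr itself.

```d
import std.array : join;

// Hypothetical self-referential structure standing in for the nested Box array.
class Node
{
    string name;
    Node[] kids;
    this(string n) { name = n; }
}

/// Renders a node and its children, cutting off when a node contains itself.
string show(Node n, Node[] path = null)
{
    if (n is null)
        return "null";
    foreach (p; path)
        if (p is n)
            return "<cycle " ~ n.name ~ ">"; // already being printed further up
    string[] parts;
    foreach (k; n.kids)
        parts ~= show(k, path ~ n); // extend the path for this branch only
    return n.name ~ "[" ~ parts.join(", ") ~ "]";
}

unittest
{
    auto a = new Node("a");
    a.kids ~= a; // a contains itself, like a[0] = box(a)
    assert(show(a) == "a[<cycle a>]");
}
```

Tracking only the current path (rather than every node seen so far) means an object that legitimately appears twice in different branches is still printed in full both times.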

Bug: some cases of unions/structs aren't printed correctly yet:
----------------
union XY1 {
  struct { int x, y; }
  long xy;
}

struct XY2 {
  union {
    struct { int x, y; }
    long xy;
  }
}

putr(XY1(10, 20)); // Out: {10}
putr(XY2(10, 20)); // Out: <10, 20, 85899345930>

------------------------


And now, after showing the bugs/problems of put/putr, I can show what they do
well.

This shows how strings are printed:
- Escape characters are printed with a preceding \
- lists are printed with spaces among items to improve readability.
- AAs are printed with a space after the comma and after the colon for the same
purpose.
- strings string-fied with repr() are printed inside "", this helps a LOT tell
apart strings from everything else.

assert(str("hello", ' ', ["hello"], ' ', ['a', 'b']) == "hello [\"hello\"] ab");
assert(str([1, 2, 3]) == "[1, 2, 3]");
assert(str(["a", "b", "ca"]) == "[\"a\", \"b\", \"ca\"]");
string[][] ax = [["Ab", "c"], ["D", "ef"]];
assert(str(ax) == "[[\"Ab\", \"c\"], [\"D\", \"ef\"]]");
string[int] aa = [1:"aa", 2:"ba", 3:"bb"];
assert(str(aa, ' ', 3.154887e-3) == "[1: \"aa\", 2: \"ba\", 3: \"bb\"] 0.00315488");

Structs that don't define a toString have a default representation, their
fields between <>, this is very useful:

struct S1 { int x;}
assert( str(S1(7)) == "<7>");
struct S2 { int x, y;}
assert( str(S2(7, 8)) == "<7, 8>");
struct S3 { int x; float y; int z;}
assert( str(S3(2, 7.1, 8)) == "<2, 7.1, 8>");

S1[] a1 = [S1(10), S1(20), S1(30)];
assert( str(a1) == "[<10>, <20>, <30>]" );

S2[] a2 = [S2(10,5), S2(20,6), S2(30,7)];
assert( str(a2) == "[<10, 5>, <20, 6>, <30, 7>]" );

S3[] a3 = [S3(10,5.5,1), S3(20,6.5,2), S3(30,7.5,3)];
assert( str(a3) == "[<10, 5.5, 1>, <20, 6.5, 2>, <30, 7.5, 3>]" );

But toString comes first, when defined:

struct S4 {
    int x;
    string toString() {
        return "S4<" ~ format(x) ~ ">";
    }
}
assert( str(S4(125)) == "S4<125>");

Unions too are pretty-printed by default, among {}:

union U { int x; char c; float f; }
U u;
u.x = 100;
assert(str(u) == "{100}");

All non-printable chars, such as \t, are represented as \symbol or \hex when
they appear inside a collection:

assert(str("\"", ' ', ["\""]) == `" ["\""]`);
assert(str("ab\tc") == "ab\tc");
assert(str(["ab\tc"]) == "[\"ab\\tc\"]");

// more tests with structs and classes
struct S { int[3] d; }

Everything nests, of course:

auto ay = new S;
ay.d[] = [1, 2, 3];
string str_ay = str(ay);
assert(str_ay.length <= (size_t.sizeof * 2));
foreach(c; str_ay)
    assert( isalnum(c) );
assert(str(*ay) == "<[1, 2, 3]>");

S b;
b.d[] = [1, 1, 1];
assert(str(b) == "<[1, 1, 1]>");

Classes use their toString() when it is defined; otherwise only the qualified
name is printed:

class C1 {
    int[3] d;
    string toString() {return "C1" ~ format(d);}
}
auto c1 = new C1;
c1.d[] = [3, 2, 1];
assert(str(c1) == "C1[3,2,1]");

class C2 { int[3] d; }
auto c2 = new C2;
c2.d[] = [3, 2, 1];
assert( str(c2).startsWith("d.string.") );
assert( str(c2).endsWith(".C2") );

You can tell apart static arrays of chars from strings:

assert( str(["ab", "ba"]) == `["ab", "ba"]`);
assert( format(["ab":12, "ba":5]) == "[[a,b]:12,[b,a]:5]" );

string[int] aa2 = [12:"ab", 5:"ba"];
assert(str(aa2) == `[5: "ba", 12: "ab"]`);

char[2][int] aa3 = [12:"ab", 5:"ba"];
assert(str(aa2) == `[5: "ba", 12: "ab"]`);

assert(str([12:"ab", 5:"ba"]) == `[5: "ba", 12: "ab"]`);

assert(str(["ab":12, "ba":5]) == `["ab": 12, "ba": 5]`);

assert(str(["ab":"AB", "ba":"BA"]) == `["ab": "AB", "ba": "BA"]`);

assert(str(['a':'d','b':'e']) == `['a': 'd', 'b': 'e']`);

Empty associative arrays have a special representation:

assert(str(new int[][0]) == "[]");
char[int] aa_empty;
assert(str(aa_empty) == "AA!(int, char)");

aa3 = null;
assert(str(aa3) == "AA!(int, char[2])");
assert(str(aa_empty) == "AA!(int, char)");

More about classes:

// classes
class Cl0 { int a; }
Cl0 cl0;
assert(str(cl0) == "null");

auto cl0b = new Cl0();
cl0b.a = 10;
assert(str(cl0b).startsWith("d.string."));
assert(str(cl0b).endsWith(".Cl0"));

class Cl1 { int a; Cl1 cl; }
Cl1 cl1;
assert(str(cl1) == "null");

auto cl1b = new Cl1();
cl1b.a = 10;
assert(str(cl1b).startsWith("d.string."));
assert(str(cl1b).endsWith(".Cl1"));

Null objects:

class Cl2 {
    int a;
    Cl2 cl;
    string toString() { return "C2[" ~ str(a) ~ " " ~ str(cl) ~ "]"; }
}
Cl2 cl2;
assert(str(cl2) == "null");

auto cl2b = new Cl2();
cl2b.a = 20;
assert(str(cl2b) == "C2[20 null]");


Complex numbers are printed WAY better (try doing the same with writef). Many
bugs are fixed here; notice the trailing zeros that help tell apart FP from
ints. Hopefully all these special cases are handled correctly:

assert(str(cast(float)-5) == "-5.0");
assert(str(cast(double)-5) == "-5.0");
assert(str(cast(real)-5) == "-5.0");

assert(str(cast(ifloat)-5i) == "-5.0i");
assert(str(cast(idouble)-5i) == "-5.0i");
assert(str(cast(ireal)-5i) == "-5.0i");

assert(str(cast(cfloat)53.25+55i) == "53.25+55.0i");
assert(str(cast(cdouble)53.25+55i) == "53.25+55.0i");
assert(str(cast(creal)53.25+55i) == "53.25+55.0i");

assert(str(cast(cfloat)-53-55i) == "-53.0-55.0i");
assert(str(cast(cdouble)-53-55i) == "-53.0-55.0i");
assert(str(cast(creal)-53-55i) == "-53.0-55.0i");

assert(str(cast(cfloat)-7.25-0i) == "-7.25+0.0i");
assert(str(cast(cdouble)-7.25-0i) == "-7.25+0.0i");
assert(str(cast(creal)-7.25-0i) == "-7.25+0.0i");

assert(str(cast(cfloat)-7-0i) == "-7.0+0.0i");
assert(str(cast(cdouble)-7-0i) == "-7.0+0.0i");
assert(str(cast(creal)-7-0i) == "-7.0+0.0i");

assert(str(7.00001) == "7.00001");
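The trailing ".0" convention for real-valued numbers is easy to sketch on top of an ordinary %g conversion. fpStr is a hypothetical helper, not the dlibs code, and it ignores complex and imaginary types: it only appends ".0" when the formatted text would otherwise look like an integer.

```d
import std.algorithm.searching : all;
import std.format : format;

/// Formats x with %g and appends ".0" when the result would look like an integer.
string fpStr(double x)
{
    auto s = format("%g", x);
    // Only a sign and digits: nothing marks this as floating point yet.
    bool looksIntegral = s.all!(c => c == '-' || (c >= '0' && c <= '9'));
    return looksIntegral ? s ~ ".0" : s;
}

unittest
{
    assert(fpStr(-5) == "-5.0");
    assert(fpStr(53.25) == "53.25");
    assert(fpStr(7.00001) == "7.00001");
}
```

Values that already carry a decimal point, an exponent, or a name such as nan or inf are left unchanged.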

Typedef-ed vars are managed correctly:

// typedef test
typedef int T;
T t = 10;
assert(str(t) == "10");

typedef C1 TC1;
auto tc1 = new TC1;
tc1.d[] = [3, 2, 1];
assert(str(tc1) == "C1[3,2,1]");

tests with void*:

void* void_ptr;
assert( str(void_ptr) == "null" );
void*[] void_star_arr;
assert( str(void_star_arr) == "[]" );
void_star_arr = [null, null, null];
assert( str(void_star_arr) == "[null, null, null]" );


More tricks:

assert(str([`"hello"`]) == `["\"hello\""]`);
assert(str(["`hello`"]) == "[\"`hello`\"]");
assert(str('a', 'b') == "ab");
assert(str(['a', 'b']) == "ab"); // because an array of char is the same as a string
assert(str('\'', '\'') == "''");
assert(str(['\'', '\'']) == "''"); // because an array of char is the same as a string


functions and delegates:

auto d1 = (int i, char c) { return -i; };

assert(str(d1).startsWith("<int delegate(int,char): "));
assert(str(d1).endsWith(">"));

assert(str(d1.funcptr).startsWith("<int function(int,char): "));
assert(str(d1.funcptr).endsWith(">"));


interfaces:

interface I1 {
    void foo(int i);
}
string str_I1 = str(I1.init);
assert( str_I1.startsWith("interface:") );
assert( str_I1.endsWith(".I1") );

As you can see this fixes about 15-20 bugs/troubles with writef/writefln.

Bye,
bearophile
Jan 01 2009
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Bill Baxter

What you've written seems more like a list of enhancements than bugs.<

Seeing how for example complex numbers are written by writefln, I can't agree, but I presume it doesn't matter. (But I essentially refuse to use D without improved and fixed printing/string-fying functions like the ones I have written and use.)
I think that's a fine goal, but it's clearly not a goal of writef/writefln.<

Time to change its goal then, because it will help to program faster and in a safer way.
Making writefln automatically generate a default representation of structs
instead of erroring out would be great.<

It works with classes too, even if it's a little less useful, and partially with unions too (as much as possible).
It hasn't bitten me since then.<

Probably because D is statically typed, so by reading the code you can usually work out the types of the data inside collections. But when you program with higher-level functions like map, filter, select, apply, etc., and all their lazy variants, you let the compiler manage types more automatically (and you may even have nested lazy generators); in such situations having a clean printout becomes more important. Bye, bearophile
Jan 01 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 The first part of this post was posted around October 2 2008, and 
 shows a list of general bugs I have found in DMD/D.
 
 This post lists several problems/bugs/limits I have found in the 
 write/writefln of D1.

Thanks for taking the time to write them down. I was thinking you'd write a number of actual bugs in write(f)(ln) instead of an infomercial on put(r). Anyhow, these are good ideas that should be embodied in Phobos. One note about the charter. I think it's slightly different.
 The purpose of such functions: to print data and data structures on 
 the console.

I disagree. The purpose is to print data and data structures to any text device, with the console as a particular case and a potential shortcut.
 Such printing is mostly for the programmer, especially during debugging,
 or for logging.

I disagree. Printing should work equally well for "real" tasks.
 - Fast, possibly as fast as printf().

Cool.
 - Unambiguous: the printed data must clearly show the type and
 content of data.

I'm unclear on this. It would mean that if you print a short with value 1, you'd have to specify that it's a short. Or if you print a real number valued at 1, you'd have to include the decimal point. Now I agree that the decimal point (followed e.g. by zeros) is sometimes desirable, but sometimes it's just not. So I'm rather confused about this unambiguity principle. I'd replace it with a flexibility requirement, e.g. that one can print numbers in a variety of formats.
 - Complete: all built-ins must have a good representation.

And what happened to "Extensible: user-defined types should be able to define their own printing." toString won't cut it!
 - Elegant and clean: to speed up the reading and have a good logs.

Great.
 - A representation: when possible it can be useful to have an
 alternative way to represent something in a more precise way. Python
 tells apart the two usages: str() and its dual __str__ return a
 readable textual representation, while repr() and __repr__ often
 return a textual representation that, when put inside the code,
 generates the same object that was printed. In dlibs I have found it
 useful to create the same pair of functions (but in D objects have only
 one toString(), so there's no support for a toRepr or something similar).

I think that's a great idea. One format for default printing, and one precise format for serialization. One thing that is sorely missing from this all is parsing. I think library functions for formatted writing must be accompanied by functions for formatted reading. (scanf does that for printf, but I think it's doing a rather awkward job.) The presence of parsing routines changes the charter a bit, e.g. in the precise serialization mode you'd want to print an object in a manner that makes it parsable later. Andrei
Jan 02 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

instead of an infomercial on put(r).<

I have shown what I think is wrong, what I think is right, and I have actually written code that does that. So I'm not just asking for improvements: the code may also show how to implement them, and what practical limits the D1 language imposes on the implementation. So it's a kind of "infomercial", but I think it gives more information than just a list of troubles/limits/bugs of the Phobos functions. And I am not asking you for money :-)
I disagree. The purpose is to print data and data structures to any text
device, with the console as a particular case and a potential shortcut.<

OK.
I disagree. Printing should work equally well for "real" tasks.<

OK, when possible.
- Fast, possibly as fast as printf().<<


Cool.<

But at the moment writef is quite (much) slower than printf. For put/putr speed is for me less important than the other qualities. When I need more speed I generally use printf. Creating put/putr functions almost as fast as printf is probably possible, but it may require lot of work (and some assembly too, maybe).
I'm unclear on this. It would mean that if you print a short with value 1,
you'd have to specify that it's a short. Or if you print a real number valued
at 1, you'd have to include the decimal point.<

Well, in this situation I have adopted a compromise that I hope is sensible: integral numbers are all printed the same way (I have thought about adding a trailing L to longs/ulongs when string-fied with repr() instead of str(), but so far I have decided against it; it's possible, anyway), but FP ones are printed with a trailing .0, because telling apart FP from integrals is important enough.
Now I agree that the decimal point (followed e.g. by zeros) is sometimes
desirable, but sometimes it's just not. So I'm rather confused about this
unambiguity principle.<

There I was talking about default representations, the one you receive if you do just a putr(10.0). If you don't want the trailing zero, then you are supposed to have ways to not print it. Instead putr(repr(10.0)) has to give all the digits, to represent all the bits of the original FP.
I'd replace it with a flexibility requirement, e.g. that one can print numbers
in a variety of formats.<

I like to have a good and not too much ambiguous default representation, plus a way to change representation when I want it. put()/putr() are meant for the default representation. If you want to extend their syntax, you can take a look for example at how C#/Python3 prints formatted things.
And what happened to "Extensible: user-defined types should be able to define
their own printing." toString won't cut it!<

There are many ways to solve this problem, but I think it may lead to a more complex language and semantics, so maybe you don't want that. An intermediate solution may be to have the toString() object/struct method accept arguments too, that can be used by user-defined printing functions for a choice of desired representation...
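One direction along these lines, and a form that Phobos's std.format accepts in current D2, is a toString that receives an output sink instead of returning a string. The sketch below assumes modern Phobos; Point is a hypothetical user-defined type, and nothing here reflects the library as it existed when this thread was written.

```d
import std.format : format, formattedWrite;

// Hypothetical user-defined type, for illustration.
struct Point
{
    int x, y;

    // Writes pieces straight into the caller's sink instead of building a string.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%s, %s)", x, y);
    }
}

unittest
{
    assert(format("%s", Point(1, 2)) == "(1, 2)");
}
```

Because the type writes into whatever sink it is handed, the same method serves the console, a file, or an in-memory buffer, which fits the "any text device" charter raised earlier in the thread.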
I think that's a great idea. One format for default printing, and one precise
format for serialization.<

Note that's a first approximation. It doesn't cover all possibilities, so it's not a fully flexible solution. It covers 80% of the cases while keeping complexity low enough, so it's a compromise between complexity and flexibility. The C++ language often accepts too much complexity in order to gain flexibility. The D language accepts less flexibility to keep its complexity low enough to be usable by humans too. And no, sometimes you can't have both: sometimes you can't have high flexibility and low complexity. The D2 language is clearly more complex than D1, and D1 is considerably more complex than Java. Too much complexity kills a language (or just gives a lot of pain to programmers, produces bugs and slows down development; see the C++ language).
One thing that is sorely missing from this all is parsing.<

I agree. But I was talking about printing only.
The presence of parsing routines changes the charter a bit,<

I see, but I don't know how this changes the situation. Bye, bearophile
Jan 02 2009
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
John Reimer wrote:
 I think Andrei got it right with writefln().

What does 'fln' stand for? Is that something like "Format with LiNe carriage return"?

Yes.
 With apologies to Andrei, I can't agree that it's pretty, but I suppose 
 it works. :)

Use of those characters for that purpose has a long history.
Dec 30 2008
prev sibling next sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Thu, Jan 1, 2009 at 3:16 AM, bearophile <bearophileHUGS lycos.com> wrote:
 Andrei Alexandrescu:
 The name and original implementation of writefln are Walter's and
 predate my tenure with D. I just defined write() and writeln().

write/writeln are very useful to avoid the silly bugs caused by the possible presence of a % inside the first string given to writefln. But write/writeln still have quite a large number of holes/limits (*) that deserve (or must, I'd say) to be filled/fixed. I have named my pair of fixed ones put/putr (the final "r" stands for return); they are fixed but not perfect (they still have a few bugs), the names are shorter to type, and you can't write writenl by mistake as I've done for a couple of months.

Heh heh. I'd say close to half of my failed compiles are due to a "writelfn" I typed somewhere. I really need to find some kind of auto-correct package for emacs.
 (*) If you want I can list few pages of such problems/limits/bugs. I have not
done this already because not a single person here has shown some interest in
fixing the writef/writefln/write/writeln functions so far.

Apparently Andrei was, since he took the trouble to re-write the D1 versions. Are your improvements to the D1 version or the D2 templatized version? --bb
Dec 31 2008
prev sibling parent "Bill Baxter" <wbaxter gmail.com> writes:
On Fri, Jan 2, 2009 at 5:23 AM, bearophile <bearophileHUGS lycos.com> wrote:
 The first part of this post was posted around October 2 2008, and shows a list
of general bugs I have found in DMD/D.

 This post lists several problems/bugs/limits I have found in the
write/writefln of D1.

 [...]
 As you can see this fixes about 15-20 bugs/troubles with writef/writefln.

I'm not really seeing the place where you pointed out any bugs with writef/writefln. What you've written seems more like a list of enhancements than bugs. Most of the changes seem to be related to this design goal:
 - Unambiguous: the printed data must clearly show the type and content of data.

I think that's a fine goal, but it's clearly not a goal of writef/writefln. Writefln is more like a typesafe printf with a little bit of automatic type deduction ability. This one is great:
 - Structs without toString(): prints their fields in the middle of <>.

It's very annoying to have to go write a custom toString function just to be able to print out what's inside a struct. It often forces me to import std.string just so I can use std.string.format to print out a debug representation of a darn two-field struct. Making writefln automatically generate a default representation of structs instead of erroring out would be great. I also like this one:
 - Strings inside arrays/AAs are printed with "" around them and with special
chars escaped. Because sub-items in collections are string-fied using repr and
not str.

I once wasted a fair amount of time trying to figure out why something wasn't working because of not realizing that my array was actually full of strings and not parsed floats like I was thinking. It sure looked like an array of floats when I printed it out. Maybe this was a rare unfortunate interaction between strings, floats, arrays and auto, though. It hasn't bitten me since then. --bb
Jan 01 2009