digitalmars.D - Overloading operators by operator symbol

Bill Baxter <wbaxter gmail.com> writes:

I'm not a big fan of magic operator method names.  Python has its 
__add__ etc methods, Lua has very similar, D has opAdd etc.
Personally I prefer C++'s way of just using the syntax itself.  I find 
it a lot easier to remember and it looks less "magical".

I started wondering if it might be possible to accomplish something like 
that using mixins.  Here's an example of what I've gotten to work so far:

import std.stdio;   // needed for writefln

class AClass
{
     // Look ma! I'm overloading operators by symbols!
     mixin Operator!("+", myPlus);
     mixin Operator!("+=", myPlusEq);
     mixin Operator!("-", myMinus);
     mixin Operator!("-=", myMinusEq);

     // the actual operator overload implementations
     int myPlus(int v){ return m_value + v; };
     int myMinus(int v){ return m_value - v; };
     int myPlusEq(int v){ return m_value += v; };
     int myMinusEq(int v){ return m_value -= v; };

     int m_value = 0;
}

void main()
{
     // example use
     AClass a = new AClass();
     a += 3;
     writefln("a is: ", a.m_value);
     writefln("a+5 is: ", a + 5);
     a -= 10;
     writefln("a-=10; a is now: ", a.m_value);
     writefln("a-5 is: ", a - 5);
}

// The guy who makes it happen
template Operator(char[] op, alias OpFn )
{
     // todo: actually derive these types from OpFn
     alias int RetType;
     alias int ArgType;
     static if(op=="+") {
         RetType opAdd(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="-") {
         RetType opSub(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="+=") {
         RetType opAddAssign(ArgType v) {
             return OpFn(v);
         }
     }
     else static if(op=="-=") {
         RetType opSubAssign(ArgType v) {
             return OpFn(v);
         }
     }
}

This is pretty simplistic and not very complete.  Ideally the syntax 
would look more like

     mixin Operator!("-",
        int(int v){ return m_value - v; }
     );

or best

     int Operator!("-")(int v){ return m_value - v; }

But I couldn't figure out any way to make those work. :-)
Can anyone do better?

--bb

Oct 28 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Bill Baxter" <wbaxter gmail.com> wrote in message 
news:ehv8kj$1d4$1 digitaldaemon.com...

 class AClass
 {
     // Look ma! I'm overloading operators by symbols!
     mixin Operator!("+", myPlus);
     mixin Operator!("+=", myPlusEq);
     mixin Operator!("-", myMinus);
     mixin Operator!("-=", myMinusEq);

     // the actual operator overload implementations
     int myPlus(int v){ return m_value + v; };
     int myMinus(int v){ return m_value - v; };
     int myPlusEq(int v){ return m_value += v; };
     int myMinusEq(int v){ return m_value -= v; };

     int m_value = 0;
 }

Basically you've just replaced "opAdd" etc. with "myPlus" etc.  ;)  I know 
what you're getting at but..

 This is pretty simplistic and not very complete.  Ideally the syntax would 
 look more like

     mixin Operator!("-",
        int(int v){ return m_value - v; }
     );

 or best

     int Operator!("-")(int v){ return m_value - v; }

 But I couldn't figure out any way to make those work. :-)
 Can anyone do better?

I think we'd have to have the ability to dynamically generate symbols with 
templates (i.e. some form of token pasting) in order for this to be 
possible.  But then, of course, you have the problem of not being able to 
declare delegates at a class level, which would make it hard to pass the 
implementation into the Operator template..

Oct 28 2006

Mike Parker <aldacron71 yahoo.com> writes:

Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".
 
 I started wondering if it might be possible to accomplish something like 
 that using mixins.  Here's an example of what I've gotten to work so far:

The idea behind opAdd and friends is that they establish an explicit 
contract. In C++, what exactly is operator+ supposed to be used for? It 
doesn't always mean 'addition'. Even in the standard library, 
std::string uses '+' to mean 'concatenation'. In some C++ vector math 
libraries, '*' is used to calculate the dot product of two vectors, 
since there's no such thing as the multiplication of two vectors (that 
is, it's not uniquely defined), while at the same time being used to 
multiply a vector by a scalar.

opAdd is an explicit interface saying that "this operator does 
addition." Programmers can still abuse it, just as they can abuse the 
contract established by any interface. But when a library implementor 
uses opAdd to do concatenation or something other than addition, you can 
now point at them and say they are breaking the contract. operator+ 
doesn't allow you to do that since '+' by itself does not necessarily 
equate to 'addition'.

Oct 28 2006

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Mike Parker wrote:
 Bill Baxter wrote:
 
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".

 I started wondering if it might be possible to accomplish something like 
 that using mixins.  Here's an example of what I've gotten to work so far:

 
 
 The idea behind opAdd and friends is that they establish an explicit 
 contract. In C++, what exactly is operator+ supposed to be used for? It 
 doesn't always mean 'addition'. Even in the standard library, 
 std::string uses '+' to mean 'concatenation'. In some C++ vector math 
 libraries, '*' is used to calculate the dot product of two vectors, 
 since there's no such thing as the multiplication of two vectors (that 
 is, it's not uniquely defined), while at the same time being used to 
 multiply a vector by a scalar.
 
 opAdd is an explicit interface saying that "this operator does 
 addition." Programmers can still abuse it, just as they can abuse the 
 contract established by any interface. But when a library implementor 
 uses opAdd to do concatenation or something other than addition, you can 
 now point at them and say they are breaking the contract. operator+ 
 doesn't allow you to do that since '+' by itself does not necessarily 
 equate to 'addition'.

Additionally, it allows for some minimizing.  For example, instead of an
'operator<', an 'operator>', and an 'operator==' to make a class comparable,
we merely define an 'opCmp' that does the work of all three.  There could yet
be some more minimizing (I feel that opApply(dg)/opApplyReverse(dg) could, for
example, become opApply(dg,reverse?) instead).

-- Chris Nicholson-Sauls

Oct 28 2006

rm <roel.mathys gmail.com> writes:

Mike Parker wrote:
 Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find
 it a lot easier to remember and it looks less "magical".


Personally, I think binary operators should be static functions or
non-member functions. I understand magic to be something that is
momentarily beyond my horizon of understanding, so whatever syntax is
used ...
I think it's necessary to have operator overloading to not have to
"learn" two ways of doing operations, one for built-in types and one for
 user-defined types.

 I started wondering if it might be possible to accomplish something like
 that using mixins.  Here's an example of what I've gotten to work so far:

 
 The idea behind opAdd and friends is that they establish an explicit
 contract. In C++, what exactly is operator+ supposed to be used for? It
 doesn't always mean 'addition'. Even in the standard library,
 std::string uses '+' to mean 'concatenation'. In some C++ vector math
 libraries, '*' is used to calculate the dot product of two vectors,
 since there's no such thing as the multiplication of two vectors (that
 is, it's not uniquely defined), while at the same time being used to
 multiply a vector by a scalar.

Well, with opMul you can do exactly the same; that's just syntax.
So, drawing a conclusion because in C++ you have to overload operator*
and in D you have to overload opMul ... ?

 opAdd is an explicit interface saying that "this operator does
 addition." Programmers can still abuse it, just as they can abuse the
 contract established by any interface. But when a library implementor
 uses opAdd to do concatenation or something other than addition, you can
 now point at them and say they are breaking the contract. operator+
 doesn't allow you to do that since '+' by itself does not necessarily
 equate to 'addition'.

Where are these definitions given? Or should I "interpret" opAdd as
"operator for addition"? And then again, I wonder what Webster has to
say about addition, e.g. you have to add 100 grams of sugar, then you
have to mix-in :-) 500 ml of milk ...

The more important question is, is all that is needed provided in D?
Maybe the first question should be, what is needed?

roel

Oct 28 2006

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".

The reasons for "opAdd" instead of "operator+" are:

1) opAdd is eminently more greppable. Try grepping for operator+:

	operator /* comment */ + (T t)

	operator +\
	+()

	operator+(T t)

Oops, I found operator/ instead! I thought operator++ was operator+! Is 
that a unary + or a binary +? You practically need a full C++ front end 
to do the job correctly. D can do tolerably well with simple grep.

2) opAdd looks like "opAdd" in the object symbol table rather than "?H" 
(I am not making up ?H, it really is that) giving one a clue without 
needing a decoder ring.

3) it encourages the use of operator overloading for arithmetic 
purposes, rather than "parse this predicate once", which happens with 
C++ operator overloading.

4) operators that are mathematically related can be derived from each 
other: in C++ the == and != are separately overloadable. Anyone who 
wants to do mathematical overloading has to do both and take care that 
one actually is the not of the other. With opEquals, one function can 
serve both. This makes more of a difference with <, <=, >, >= where 4 
overloads are replaced by opCmp.

5) Note C++'s inability to distinguish operator[] as an lvalue and as an 
rvalue. D has opIndexAssign and opIndex.

6) Note the kludge-o-matic C++ overloading of operator++ and its 
different meanings. I can never remember which is which without looking 
it up. D has opAddAssign and opPostInc.
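
As a concrete illustration of the C++ convention point 6 refers to, here is a minimal sketch. The `Counter` type and the `demo_increment` helper are made up for this example. The prefix and postfix overloads share one name and are told apart only by an unused `int` parameter:

```cpp
#include <cassert>

// Hypothetical type, used only to illustrate the convention.
struct Counter {
    int value = 0;

    // Prefix form (++c): takes no parameter and returns the updated object.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix form (c++): the unused int parameter is the only thing that
    // distinguishes it from the prefix form; it returns the old state.
    Counter operator++(int) {
        Counter old = *this;
        ++value;
        return old;
    }
};

// Small self-check showing which overload each expression selects.
void demo_increment() {
    Counter c;
    Counter before = c++;            // postfix: 'before' keeps the old value
    assert(before.value == 0 && c.value == 1);
    Counter& after = ++c;            // prefix: refers to the updated object
    assert(&after == &c && c.value == 2);
}
```

In D (as described in the post) the two cases instead map to differently named functions, opAddAssign and opPostInc.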

Oct 30 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 I'm not a big fan of magic operator method names.  Python has its 
 __add__ etc methods, Lua has very similar, D has opAdd etc.
 Personally I prefer C++'s way of just using the syntax itself.  I find 
 it a lot easier to remember and it looks less "magical".

 
 The reasons for "opAdd" instead of "operator+" are:
 
 1) opAdd is eminently more greppable. Try grepping for operator+:
 
     operator /* comment */ + (T t)
 
     operator +\
     +()
 
     operator+(T t)
 
 Oops, I found operator/ instead! I thought operator++ was operator+! Is 
 that a unary + or a binary +? You practically need a full C++ front end 
 to do the job correctly. D can do tolerably well with simple grep.

Most of those are just perversities that don't exist in real code. 
/operator\s*\+[^+]/ would find you 99% of all real use cases.
On the other hand, say I want to find all operator overloads, period. 
With C++ I can pretty much just grep for 'operator', whereas for D I'd 
have to be a little smarter, because just grepping for 'op' is likely to 
turn up lots of cruft. OK, /\Wop[A-Z]/ would probably do a decent job, 
where \W is the 'not a word character' pattern.  Either way I think this 
one is pretty much a wash.  It's not that hard to grep for either one.
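
To make the grep argument easier to check, here is a rough sketch that runs that kind of pattern through std::regex, with the '+' escaped so it is matched literally. The function name `looks_like_operator_plus` is invented for this example, and the sample lines are adapted from the ones quoted above:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the line contains "operator", optional whitespace,
// a literal '+', and then one character that is not '+'.
bool looks_like_operator_plus(const std::string& line) {
    static const std::regex pattern(R"(operator\s*\+[^+])");
    return std::regex_search(line, pattern);
}

// Small self-check over the kinds of lines discussed in the thread.
void demo_grep() {
    assert(looks_like_operator_plus("T operator+(T t)"));     // plain case: found
    assert(!looks_like_operator_plus("T& operator++()"));     // increment: skipped
    // A comment between the keyword and the symbol defeats the pattern.
    assert(!looks_like_operator_plus("T operator /* comment */ + (T t)"));
}
```

Under these assumptions the simple pattern handles the ordinary spelling and rejects operator++, but it misses the comment-split declaration, and it would also report operator+= as a hit, so both sides of the argument show up in the results.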

 
 2) opAdd looks like "opAdd" in the object symbol table rather than "?H" 
 (I am not making up ?H, it really is that) giving one a clue without 
 needing a decoder ring.

I guess that's nice for the compiler writer.  Does it affect the user 
somehow too?  Because I'm not usually so concerned about how things look 
in the symbol table given all the name mangling going on everywhere.

Besides, couldn't one arrange things so that 'operator+' appeared in the 
symbol table as something like '__operator_plus' if one so desired?

 3) it encourages the use of operator overloading for arithmetic 
 purposes, rather than "parse this predicate once", which happens with 
 C++ operator overloading.

I suppose.  But I suspect programmers will likely see it as a feature 
they can use no matter what you call it.  C++ books generally recommend 
not overloading + for things that are semantically unrelated to adding, 
but people do it anyway.  Similarly people use static opCall in D as a 
constructor.  If the programmer really wants a succinct syntax for some 
common operation, then they're going to consider operator overloading as 
one of their design choices, no matter what those methods are called.

 4) operators that are mathematically related can be derived from each 
 other: in C++ the == and != are separately overloadable. Anyone who 
 wants to do mathematical overloading has to do both and take care that 
 one actually is the not of the other. With opEquals, one function can 
 serve both. This makes more of a difference with <, <=, >, >= where 4 
 overloads are replaced by opCmp.

Well, operator < alone is used in C++, and via similar mathematical 
identities you can construct <=, >, >= out of it.
given a < b operator we have:

a > b  ===  b < a
a <= b === !(b<a)
a >= b === !(a<b)
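
Written out as C++, those identities look roughly like this. It is only a sketch: the `Length` type is hypothetical, and only operator< contains real comparison logic while the other three forward to it:

```cpp
#include <cassert>

// Hypothetical value type used only for this illustration.
struct Length {
    int mm;
};

// The one comparison written by hand.
bool operator<(const Length& a, const Length& b) { return a.mm < b.mm; }

// The remaining three, each derived from operator< via the identities above.
bool operator>(const Length& a, const Length& b)  { return b < a; }
bool operator<=(const Length& a, const Length& b) { return !(b < a); }
bool operator>=(const Length& a, const Length& b) { return !(a < b); }

// Small self-check of the derived operators.
void demo_compare() {
    Length shorter{10};
    Length longer{20};
    assert(shorter < longer);
    assert(longer > shorter);
    assert(shorter <= shorter);
    assert(longer >= shorter);
}
```

The forwarding functions are short, but they still have to be written for every type, whereas a single opCmp covers all four comparisons.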


 5) Note C++'s inability to distinguish operator[] as an lvalue and as an 
 rvalue. D has opIndexAssign and opIndex.

Seems C++ does ok there:
type& operator[](size_t i) { ... }       // lvalue case
type  operator[](size_t i) const { ... } // rvalue case

 6) Note the kludge-o-matic C++ overloading of operator++ and its 
 different meanings. I can never remember which is which without looking 
 it up. D has opAddAssign and opPostInc.

Yeh, that is super hacky and hard to remember.  Maybe C++ should have 
added 'loperator' to distinguish left from right.

Really this 'hard to remember' point is the main reason I think symbols 
for operator overloads would be superior.  Something like this, though I 
realize totally hopeless, would nonetheless be nice:

(Let 'i' mean 'this', though 'this' could be used instead.
  Let 'x' (or any non-i letter) mean the other thing where needed.)

operator[-i]   -- opNeg
operator[+i]   -- opPos
operator[~i]   -- opCom
operator[i++]  -- opPostInc
operator[i--]  -- opPostDec
operator[i+]   -- opAdd
operator[x+i]  -- opAdd_r
operator[i==]  -- opEquals
operator[i+=]  -- opAddAssign
operator[i in] -- opIn
operator[in i] -- opIn_r
operator[i[]]  -- opIndex
operator[i[]=] -- opIndexAssign
operator[i[..]] -- opSlice
operator[i[..]=] -- opSliceAssign
etc...

Then I don't have to remember what name the language chose to represent 
the operator, I just have to remember the syntactical situation in which 
I want that operator to be invoked.

I realize it's unconventional.  (I've never seen such a thing in a 
language before -- maybe Haskell comes close.)  But I've always been 
annoyed by operator overloading in the languages I've used.  Why not 
just make the operator declaration show the exact use case where the 
operator is invoked??

The above could even be implemented as some sort of preprocessor. It's 
just pure syntactic sugar for the more cryptic built-in method names. 
For opCmp, basically you'd only allow operator[i>] and then say it 
should return positive if i>, zero if equal, and negative if less than.

The only issue is I think opApply / opApplyReverse, and there the 
problem is that these are not really operators to begin with, they're 
iterators.  Unlike an operator they have no associated syntax.

--bb

Oct 30 2006

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 Walter Bright wrote:
 Oops, I found operator/ instead! I thought operator++ was operator+! 
 Is that a unary + or a binary +? You practically need a full C++ front 
 end to do the job correctly. D can do tolerably well with simple grep.

 Most of those are just perversities that don't exist in real code. 
 /operator\s*\+[^+]/ would find you 99% of all real use cases.

Perhaps you're right, but I sure get tired of things in C++ that work 
only most of the time (and I didn't even get into what the preprocessor 
can do to any reliance on grep). I like things to work reliably. I want 
to make sure I found all the operator overload cases when I do a code audit.

There's a thread on comp.lang.c about writing a program that can convert 
C++ // comments to /* */ comments. Most of the thread is about all the 
weird corner cases (like trigraphs, line splicing, etc.) that can happen 
in C++ and how doing a correct job of it is far more complicated than it 
looks like it should be. This is not unusual, but typical of C++ source 
code analysis problems.


 2) opAdd looks like "opAdd" in the object symbol table rather than 
 "?H" (I am not making up ?H, it really is that) giving one a clue 
 without needing a decoder ring.

 I guess that's nice for the compiler writer.  Does it affect the user 
 somehow too?  Because I'm not usually so concerned about how things look 
 in the symbol table given all the name mangling going on everywhere.

How they look in the symbol table matters when you're having problems 
getting things to link properly or getting error messages from the 
linker or looking at exported names from a DLL or using a debugger 
without full debug info or using a disassembler, etc.


 Besides, couldn't one arrange things so that 'operator+' appeared in the 
 symbol table as something like '__operator_plus' if one so desired?

Yes, one could. But it's one less level of indirection to connect 
"opAdd" in the symbol table with "opAdd" in the source code.


 3) it encourages the use of operator overloading for arithmetic 
 purposes, rather than "parse this predicate once", which happens with 
 C++ operator overloading.

 I suppose.  But I suspect programmers will likely see it as a feature 
 they can use no matter what you call it.  C++ books generally recommend 
 not overloading + for things that are semantically unrelated to adding, 
 but people do it anyway.  Similarly people use static opCall in D as a 
 constructor.  If the programmer really wants a succinct syntax for some 
 common operation, then they're going to consider operator overloading as 
 one of their design choices, no matter what those methods are called.

Programmers can and will do whatever they want, but it helps to 
encourage correct usage by following the dictum "if it looks wrong, it 
probably is wrong". And overloading opAdd to be "parse" is going to look 
wrong, wrong, wrong.


 4) operators that are mathematically related can be derived from each 
 other: in C++ the == and != are separately overloadable. Anyone who 
 wants to do mathematical overloading has to do both and take care that 
 one actually is the not of the other. With opEquals, one function can 
 serve both. This makes more of a difference with <, <=, >, >= where 4 
 overloads are replaced by opCmp.

 
 Well, operator < alone is used in C++, and via similar mathematical 
 identities you can construct <=, >, >= out of it.
 given a < b operator we have:
 
 a > b  ===  b < a
 a <= b === !(b<a)
 a >= b === !(a<b)

I know you can construct those identities in C++, but the point is you 
have to manually construct them every time, which is tedious and a 
source of error. C++ won't do it for you.


 5) Note C++'s inability to distinguish operator[] as an lvalue and as 
 an rvalue. D has opIndexAssign and opIndex.

 
 Seems C++ does ok there:
 type& operator[](size_t i) { ... }       // lvalue case
 type  operator[](size_t i) const { ... } // rvalue case

That's by learned and commonly followed convention, not by design. Even 
worse, the lvalue case is restricted to only allow assignment through 
the reference - making it impossible to have an lvalue case where some 
post processing needs to be done with the new contents of the lvalue.
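
For context, the workaround C++ code commonly reaches for when a write needs post-processing is to return a small proxy object from operator[] instead of a plain reference. The sketch below is only an illustration under that assumption: `TrackedArray`, `Ref`, and the write counter are all invented names, and the amount of machinery involved arguably supports the point being made:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container that counts how many element writes happen.
class TrackedArray {
public:
    explicit TrackedArray(std::size_t n) : data_(n, 0), writes_(0) {}

    // Proxy returned by the non-const operator[]; it stands in for one element.
    class Ref {
    public:
        Ref(TrackedArray& owner, std::size_t i) : owner_(owner), i_(i) {}

        // Write path: store the value, then run the post-processing step.
        Ref& operator=(int v) {
            owner_.data_[i_] = v;
            ++owner_.writes_;
            return *this;
        }

        // Read path: convert the proxy to the element's current value.
        operator int() const { return owner_.data_[i_]; }

    private:
        TrackedArray& owner_;
        std::size_t i_;
    };

    Ref operator[](std::size_t i) { return Ref(*this, i); }    // lvalue use
    int operator[](std::size_t i) const { return data_[i]; }   // read-only use

    int writes() const { return writes_; }

private:
    std::vector<int> data_;
    int writes_;
};
```

With this in place, `a[0] = 5` goes through `Ref::operator=` and bumps the counter, while `int x = a[0]` only reads. D's opIndexAssign expresses the same write hook as a single member function.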


 Really this 'hard to remember' point is the main reason I think symbols 
 for operator overloads would be superior.  Something like this, though I 
 realize totally hopeless, would nonetheless be nice:
 
 (Let 'i' mean 'this', though 'this' could be used instead.
  Let 'x' (or any non-i letter) mean the other thing where needed.)
 
 operator[-i]   -- opNeg
 operator[+i]   -- opPos
 operator[~i]   -- opCom
 operator[i++]  -- opPostInc
 operator[i--]  -- opPostDec
 operator[i+]   -- opAdd
 operator[x+i]  -- opAdd_r
 operator[i==]  -- opEquals
 operator[i+=]  -- opAddAssign
 operator[i in] -- opIn
 operator[in i] -- opIn_r
 operator[i[]]  -- opIndex
 operator[i[]=] -- opIndexAssign
 operator[i[..]] -- opSlice
 operator[i[..]=] -- opSliceAssign
 etc...
 
 Then I don't have to remember what name the language chose to represent 
 the operator, I just have to remember the syntactical situation in which 
 I want that operator to be invoked.

You'd have to remember the funky syntactical oddities for each operator 
in the above notation (note that it's inconsistent). I don't think 
there's any real improvement.

 The only issue is I think opApply / opApplyReverse, and there the 
 problem is that these are not really operators to begin with, they're 
 iterators.  Unlike an operator they have no associated syntax.

Using the opXxxx convention does enable the overloading of operations 
that do not have an obvious operator symbol.

Oct 31 2006

Daniel Keep <daniel.keep.lists gmail.com> writes:

Bill Baxter wrote:
 ..., Lua has very similar, ...

Just pointing out that Lua's special methods are in a completely
different namespace to the "normal" methods, so it isn't a problem.
Special methods are attached to a table's metatable, which exists just
for that purpose.

 local t = {}
 local mt = getmetatable(t) or {}

 -- metamethods live on the metatable under names starting with "__"
 function mt.__index(t, k)
     return "foo"
 end

 setmetatable(t, mt)

 print(t.blah) -- prints "foo"

Apologies if any of that is incorrect; very sleepy over here :3

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Oct 30 2006
