digitalmars.D - Re: A possible solution for the opIndexXxxAssign morass

Jason House <jason.james.house gmail.com> writes:
```Andrei Alexandrescu Wrote:

Right now we're in trouble with operators: opIndex and opIndexAssign
don't seem to be up to snuff because they don't catch operations like

a[b] += c;

with reasonable expressiveness and efficiency.

I would hope that *= += /= and friends could all be handled efficiently with
one function written by the programmer. As I see it, there are 3 basic steps:
1. Look up a value by index
2. Mutate the value
3. Store the result

it's possible to use opIndex for #1 and opIndexAssign for #3, but that's not
efficient. #1 and #3 should be part of the same function, but I think #2
should not be. What about defining an opIndexOpOpAssign that accepts a delegate
for #2 and then use compiler magic to specialize/inline it?
```
Oct 14 2009
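The three-step split described in the post above (look up, mutate, store) can be sketched in present-day D without any compiler magic. This is only an illustration under stated assumptions: `SparseVec` and `update` are hypothetical names, not an existing or proposed API, and a plain method stands in for the `opIndexOpOpAssign` hook the post imagines.

```d
/// Hypothetical sparse vector; only non-zero entries are stored.
struct SparseVec
{
    private double[size_t] data;

    /// Steps 1 and 3 live in this one function; step 2 is supplied by the caller.
    void update(size_t idx, scope void delegate(ref double) mutate)
    {
        double v = 0.0;
        if (auto p = idx in data)   // step 1: a single lookup
            v = *p;
        mutate(v);                  // step 2: caller-supplied mutation
        if (v == 0.0)               // step 3: store the result, dropping zeros
            data.remove(idx);
        else
            data[idx] = v;
    }

    /// Plain read access; a missing entry reads as 0.
    double opIndex(size_t idx) const
    {
        auto p = idx in data;
        return p ? *p : 0.0;
    }
}

unittest
{
    SparseVec a;
    a.update(3, (ref double v) { v += 2.5; });
    a.update(3, (ref double v) { v *= 2.0; });
    assert(a[3] == 5.0);
    assert(a[7] == 0.0);
}
```

Whether the delegate call is inlined is up to the compiler, which is the cost the post hopes "compiler magic" would remove.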
Bill Baxter <wbaxter gmail.com> writes:
```On Wed, Oct 14, 2009 at 7:42 AM, Jason House
<jason.james.house gmail.com> wrote:
Andrei Alexandrescu Wrote:

Right now we're in trouble with operators: opIndex and opIndexAssign
don't seem to be up to snuff because they don't catch operations like

a[b] += c;

with reasonable expressiveness and efficiency.

I would hope that *= += /= and friends could all be handled efficiently with
one function written by the programmer. As I see it, there are 3 basic steps:
1. Look up a value by index
2. Mutate the value
3. Store the result

And as Chad J reminds us, the same goes for in-place property mutations
like a.b += c.
It's just a matter of accessing .b vs .opIndex(b). And really the
same goes for any function: a.memfun(b) += c could benefit from the
same thing (a.length(index) += 3, anyone?)

it's possible to use opIndex for #1 and opIndexAssign for #3, but that's not
efficient. #1 and #3 should be part of the same function, but I think #2
should not be. What about defining an opIndexOpOpAssign that accepts a delegate
for #2 and then use compiler magic to specialize/inline it?

It could also be done using a template thing to inject the "mutate the
value" operation:

void opIndexOpOpAssignOpSpamOpSpamSpamSpam(string Op)(Thang c, Thing idx) {
    ref v = <lookup [idx] however you like>
    mixin("v "~Op~" c;");
    <store v to [idx] however you like>
}

or make it an alias function argument and use Op(v, b).

Sparse matrices are a good case to look at for issues.  a[b] is
defined for every [b], but if the value is zero nothing is actually
stored.  So there may or may not be something you can return a
reference to.   In C++ things like std::map just declare that if you
try to access a value that isn't there, it gets created.  That way
operator[] can always return a reference.   It would be great if we
could make a[b] not force a ref return in cases where there is no
lvalue that corresponds to the index (or property) being accessed.
Gracefully degrade to the slow path in those cases.

A good thing about a template is you can pretty easily specify which
cases to allow using template constraints:

void opIndexOpOpAssignOpSpamOpSpamSpamSpam(string Op)(Thang c, Thing b)
    if (Op in "+= -=")
{
    ...
}

(+ 1 small dream there about 'in' being defined to mean substring
search for string arguments -- that doesn't currently work does it?)

If the template can't be instantiated for the particular operation,
then the compiler would try to revert to the less efficient standby:
    auto tmp = a[b];
    tmp op= c;
    a[b] = tmp;

The whole thing can generalize to all accessors too.  Instead of just
passing the Op, the compiler could pass the accessor string, and args
for that accessor.  Here an accessor means ".opIndex(b)",  ".foo", or
even a general ".memfun(b)"

void opIndexOpOpAssignOpSpamOpSpamSpamSpam(string Member, string Op)(Thang c, Thing b)
    if (Member in ".foo() .bar() .opIndex()")
{
    // Member looks like ".memfun()"; this turns it into ".memfun(b)"
    string call = ctReplace(Member, "()", "(b)");
    ref v = mixin("this" ~ call ~ ";");
    < any extra stuff you want to do on accesses to v >
    mixin("v "~Op~" c;");
    < store v back to member >
}

It's ugly and perhaps too low-level, but that can be worked on if the
general principle is sound.   Utility functions can be defined to do
whatever it is that turns out to be a recurring pattern.  Lack of
being virtual could be a problem for classes.

--bb
```
Oct 14 2009
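The "template thing" with an operator string and a constraint, as described in the post above, might look roughly like this in present-day D2, where `a[i] op= c` is rewritten to `a.opIndexOpAssign!"op"(c, i)`. This is a sketch, not the code from the post: `SparseVec` is a hypothetical type, and an ordinary `op == "+" || op == "-"` test stands in for the wished-for `Op in "+= -="`.

```d
/// Hypothetical sparse vector; zeros are implicit until written.
struct SparseVec
{
    private double[size_t] data;

    /// Read path: a missing entry reads as 0.
    double opIndex(size_t idx) const
    {
        auto p = idx in data;
        return p ? *p : 0.0;
    }

    /// Plain assignment: a[idx] = c
    void opIndexAssign(double c, size_t idx) { data[idx] = c; }

    /// One template covers every permitted operator; the constraint
    /// restricts it to += and -=.
    void opIndexOpAssign(string op)(double c, size_t idx)
        if (op == "+" || op == "-")
    {
        auto p = idx in data;       // look up once
        if (p is null)
        {
            data[idx] = 0.0;        // create the entry on first write
            p = idx in data;
        }
        mixin("*p " ~ op ~ "= c;"); // inject the "mutate the value" step
    }
}

unittest
{
    SparseVec a;
    a[4] += 1.5;
    a[4] -= 0.5;
    assert(a[4] == 1.0);
    assert(a[9] == 0.0);
}
```

The operator arrives as a compile-time string, so each permitted operator gets its own instantiation and no delegate call is involved.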
Jason House <jason.james.house gmail.com> writes:
```Bill Baxter Wrote:

It could also be done using a template thing to inject the "mutate the
value" operation:

The only issue with templates is that they're never virtual.
```
Oct 14 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```Jason House wrote:
The only issue with templates is that they're never virtual

You can make virtuals out of templates, but not templates out of
virtuals. I think Walter is now inclined to look at a template-based
solution for operator overloading. That would save a mighty lot of code
without preventing classes that prefer virtual dispatch from doing so.

Andrei
```
Oct 14 2009
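"You can make virtuals out of templates" can be illustrated with a small sketch: the operator template is a thin, non-virtual entry point that forwards to an ordinary virtual method, which subclasses override. `Counter`, `SaturatingCounter` and `applyOp` are hypothetical names chosen for the example; nothing here is taken from an actual library.

```d
class Counter
{
    protected long total;

    // Template entry point: never virtual, instantiated once per operator.
    Counter opOpAssign(string op)(long rhs)
        if (op == "+" || op == "-")
    {
        applyOp(op, rhs);   // forward to an ordinary virtual method
        return this;
    }

    // Virtual hook that subclasses may override.
    protected void applyOp(string op, long rhs)
    {
        total += (op == "+") ? rhs : -rhs;
    }

    long value() const { return total; }
}

class SaturatingCounter : Counter
{
    protected override void applyOp(string op, long rhs)
    {
        super.applyOp(op, rhs);
        if (total < 0)
            total = 0;      // clamp at zero
    }
}

unittest
{
    Counter c = new SaturatingCounter;
    c += 3;
    c -= 10;                // reaches SaturatingCounter.applyOp
    assert(c.value == 0);
}
```

The reverse direction is not available: a virtual method has one fixed signature, so it cannot be turned back into a family of compile-time instantiations.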
Fawzi Mohamed <fmohamed mac.com> writes:
```On 2009-10-14 23:09:26 +0200, "Robert Jacques" <sandford jhu.edu> said:

I've done something similar for a SmallVec struct. Most of the operator
overloads are actually aliases of templated functions (one each for
uni-ops, bi-ops, bi-op_r and opassign)

I would really like a solution for all the overloading ops, as I missed
them in NArray. I think that some small rewriting is OK, but it must be
*small*: no magic, because, as others have already said, numerics can be tricky.
Andrei's proposal also seems workable, but there is another solution:

Note that a ref return for opIndex could work in most situations.
As Bill correctly pointed out, sparse matrices offer the most challenging
example. There one wants two different functions, opIndex and
opIndexLhs, the second being called when the index is on the left-hand
side of an assignment, so that reading a 0 entry in a matrix returns 0,
whereas assigning to it allocates space for it.
This makes it slightly more complex to control what is being assigned
(you need to return a structure overloading opXAssign), but I think
it would be OK in most cases.

Fawzi
```
Oct 15 2009
Fawzi Mohamed <fmohamed mac.com> writes:
```On 2009-10-15 17:51:56 +0200, "Robert Jacques" <sandford jhu.edu> said:

On Thu, 15 Oct 2009 04:48:57 -0400, Fawzi Mohamed <fmohamed mac.com> wrote:

[...]
Note that a ref return for opIndex could work in most situations.
As Bill correctly pointed out, sparse matrices offer the most challenging
example. There one wants two different functions, opIndex and
opIndexLhs, the second being called when the index is on the left-hand
side of an assignment, so that reading a 0 entry in a matrix returns 0,
whereas assigning to it allocates space for it.
This makes it slightly more complex to control what is being assigned
(you need to return a structure overloading opXAssign), but I think
it would be OK in most cases.

Fawzi

Would you like some example code?

I suppose you would like it ;)

// example 1
class Matrix(T){
    T opIndex(size_t i,size_t j){
        if (has_(i,j)){
            return data[index(i,j)];
        } else {
            return cast(T)0;
        }
    }
    ref T opIndexLhs(size_t i,size_t j){
        if (has_(i,j)){
            return data[index(i,j)];
        } else {
            //alloc new place and set things so that index(i,j) returns it
            return data[index(i,j)];
        }
    }
}

then
    m[3,4]+=4.0;
would be converted into
    m.opIndexLhs(3,4)+=4.0;

Typically, with just one method (opIndexLhs), all of +=, -=, ... are covered.

If one needs more control:

class AbsMatrix(T){
    T opIndex(size_t i,size_t j){
        if (has_(i,j)){
            return data[index(i,j)];
        } else {
            return cast(T)0;
        }
    }
    struct Setter{
        T* pos;
        // the "opXAssign" overload for +=
        void opAddAssign(T val){
            *pos+=abs(val);
        }
    }
    Setter opIndexLhs(size_t i,size_t j){
        Setter res;
        if (has_(i,j)){
            res.pos=&data[index(i,j)];
        } else {
            //alloc new place and set things so that index(i,j) returns it
            res.pos=&data[index(i,j)];
        }
        return res;
    }
}

If one does not allow ref T as a return type, then one can return a pointer and do
    static if(is(typeof(*m.opIndexLhs(3,4))))
        *m.opIndexLhs(3,4)+=4.0;
    else
        m.opIndexLhs(3,4)+=4.0;

so that the trick with the struct is still possible.
```
Oct 15 2009
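The read/write split from the post above can be tried out in present-day D as a self-contained sketch. There is no `opIndexLhs` hook in the language, so the method is called explicitly here to show what the proposed rewrite of `m[3,4] += 4.0` would do. `SparseMatrix`, `stored` and the row-major key scheme are assumptions made for the example.

```d
/// Hypothetical sparse matrix backed by an associative array.
struct SparseMatrix
{
    private double[size_t] data;
    private size_t cols;

    this(size_t cols) { this.cols = cols; }

    // Row-major key for entry (i, j).
    private size_t index(size_t i, size_t j) const { return i * cols + j; }

    /// Read path: a missing entry reads as 0 and nothing is allocated.
    double opIndex(size_t i, size_t j) const
    {
        auto p = index(i, j) in data;
        return p ? *p : 0.0;
    }

    /// Write path (hypothetical hook): allocates the entry on demand
    /// and returns a reference to it.
    ref double opIndexLhs(size_t i, size_t j)
    {
        immutable k = index(i, j);
        if (auto p = k in data)
            return *p;
        data[k] = 0.0;
        return data[k];
    }

    /// Number of entries actually stored.
    size_t stored() const { return data.length; }
}

unittest
{
    auto m = SparseMatrix(10);
    assert(m[3, 4] == 0.0);
    assert(m.stored == 0);          // reading allocated nothing
    m.opIndexLhs(3, 4) += 4.0;      // what m[3,4] += 4.0 would become
    assert(m[3, 4] == 4.0);
    assert(m.stored == 1);
}
```

A single ref-returning method covers every op= form, at the price of allocating an entry even when the update leaves it at zero.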
Fawzi Mohamed <fmohamed mac.com> writes:
```On 2009-10-15 22:55:02 +0200, Fawzi Mohamed <fmohamed mac.com> said:

On 2009-10-15 17:51:56 +0200, "Robert Jacques" <sandford jhu.edu> said:

On Thu, 15 Oct 2009 04:48:57 -0400, Fawzi Mohamed <fmohamed mac.com> wrote:

[...]
Note that a ref return for opIndex could work in most situations.
As Bill correctly pointed out, sparse matrices offer the most challenging
example. There one wants two different functions, opIndex and
opIndexLhs, the second being called when the index is on the left-hand
side of an assignment, so that reading a 0 entry in a matrix returns 0,
whereas assigning to it allocates space for it.
This makes it slightly more complex to control what is being assigned
(you need to return a structure overloading opXAssign), but I think
it would be OK in most cases.

Fawzi

Would you like some example code?

I suppose you would like it ;)

// example 1
class Matrix(T){
T opIndex(size_t i,size_t j){
if (has_(i,j)){
return data[index(i,j)];
} else {
return cast(T)0;
}
}
ref T opIndexLhs(size_t i,size_t j){
if (has_(i,j)){
return &data[index(i,j)];
} else {
//alloc new place and set things so that index(i,j) returns it
return &data[index(i,j)];
}
}
}

mmmh I mixed up a bit the ref returning and pointer returning case...
clearly there should be no &...

```
Oct 15 2009
Bill Baxter <wbaxter gmail.com> writes:
```On Wed, Oct 14, 2009 at 9:34 AM, Bill Baxter <wbaxter gmail.com> wrote:
It's ugly and perhaps too low-level, but that can be worked on if the
general principle is sound.  Utility functions can be defined to do
whatever it is that turns out to be a recurring pattern.  Lack of
being virtual could be a problem for classes.

After mulling over it some more, basically what I'm describing is
simply a function that gives the user a chance to rewrite the AST of
these kinds of ".memfun(args) op= " type operations.

When the compiler sees "obj.memfun(b) += c",
it gives that bit of the syntax tree to the AST manipulator function
(if obj defines one) and the function can then alter it however
desired.

This is made somewhat clunky by the fact that our only representation
for ASTs is strings.

Actually this could just be a CTFE function.   It doesn't need to be a template.

Just imagine there's a compile-time struct passed in that could do
things like this:

string opWhateverAssign(AST syntax)
{
    // First some examples:
    // Assume obj.memfun(b0,b1) += c   is what appeared in source code.
    enum s0 = syntax.args;  // yields "b0, b1" -- compiler knows args to this fn are called "b0" and "b1"
    enum s1 = syntax.args[0]; // yields "b0"
    enum s2 = syntax.rvalue; // yields "c"
    enum s3 = syntax.member;  // yields  "memfun"
    enum s4 = syntax.formatCallString("v = $syntax.member( x,y )"); // yields "v=memfun(x,y)"
    enum s5 = syntax.defaultImpl; // yields "auto v=memfun(b0,b1); v+=c; memfun(b0,b1)=v;"

    // ok now I'll actually do something
    static if (syntax.member == "opIndex") {
        // say this is a sparse matrix class
        return ctFormat(q{
            if (!this.matrix_contains($syntax.args)) {
                this.create_entry($syntax.args);
            }
            auto v = &this.matrix_storage[$syntax.args];
            *v  $syntax.op  $syntax.rvalue;
        });
    }
    else {
        return syntax.defaultImpl;
    }
}

This assumes we can have CTFE functions inside structs/classes.  It
assumes a function called ctFormat that can format a string at
compile-time and do perl-like variable interpolation.   It assumes we
can pass structs to CTFE functions and use them there.

Really it doesn't have to be just the opAssign type calls either.  You
could allow such interceptors for any method call or member access.

This is really close to a nemerle-like macro, actually.  Just modify 4
lines and it is one.

macro opWhateverAssign(AST syntax)
{
    // First some examples:
    // ok now I'll actually do something
    static if (syntax.member == "opIndex") {
        // say this is a sparse matrix class
        <[
            if (!this.matrix_contains($syntax.args)) {
                this.create_entry($syntax.args);
            }
            auto v = &this.matrix_storage[$syntax.args];
            *v  $syntax.op  $syntax.rvalue;
        ]>
    }
    else {
        <[ $syntax.defaultImpl; ]>
    }
}

And this really makes me think it's silly to put off macro syntax till
D3.  Everything needed is basically there.  In contrast to a new

--bb
```
Oct 14 2009
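The idea in the post above, a compile-time function that returns the replacement code as a string, can be approximated today with CTFE plus a string mixin. The sketch below is far narrower than the `AST syntax` struct imagined in the post (that struct, `ctFormat` and `defaultImpl` do not exist); `sparseUpdate` and `storage` are hypothetical names, and the "syntax" is reduced to three strings.

```d
/// CTFE helper: builds the source text for an in-place update of a
/// sparse entry. `args` is the index expression, `op` the full
/// assignment operator (e.g. "+="), `rvalue` the right-hand side.
string sparseUpdate(string args, string op, string rvalue)
{
    return "if ((" ~ args ~ ") !in storage) storage[" ~ args ~ "] = 0.0;\n"
         ~ "storage[" ~ args ~ "] " ~ op ~ " " ~ rvalue ~ ";";
}

/// Hypothetical sparse vector that generates its op= bodies at compile time.
struct SparseVec
{
    double[size_t] storage;

    void opIndexOpAssign(string op)(double c, size_t idx)
    {
        // The string is computed during compilation and compiled in place.
        mixin(sparseUpdate("idx", op ~ "=", "c"));
    }
}

unittest
{
    SparseVec v;
    v[2] += 3.0;
    v[2] *= 2.0;
    assert(v.storage[2] == 6.0);
}
```

As the post notes, strings are a clunky stand-in for a syntax tree: the helper has to trust that `args` and `rvalue` name things visible at the mixin site.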
"Robert Jacques" <sandford jhu.edu> writes:
```On Wed, 14 Oct 2009 16:49:28 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

You can make virtuals out of templates, but not templates out of
virtuals. I think Walter is now inclined to look at a template-based
solution for operator overloading. That would save a mighty lot of code
without preventing classes that prefer virtual dispatch from doing so.

Andrei

I've done something similar for a SmallVec struct. Most of the operator
overloads are actually aliases of templated functions (one each for
uni-ops, bi-ops, bi-op_r and opassign)
```
Oct 14 2009
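The SmallVec code mentioned above is not shown in the thread, so the following is only a guess at the shape of the pattern, written against present-day D2 operator templates rather than the aliases the post describes: one template each for unary ops, binary ops, right-hand binary ops and op-assign. All names besides `SmallVec` are assumptions.

```d
/// Hypothetical fixed-size vector of N doubles.
struct SmallVec(size_t N)
{
    double[N] v = 0.0;

    /// Unary ops: -a, +a
    SmallVec opUnary(string op)() const
        if (op == "-" || op == "+")
    {
        SmallVec r;
        foreach (i; 0 .. N)
            r.v[i] = mixin(op ~ "v[i]");
        return r;
    }

    /// Binary ops, element-wise: a + b, a - b, a * b, a / b
    SmallVec opBinary(string op)(const SmallVec rhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        SmallVec r;
        foreach (i; 0 .. N)
            r.v[i] = mixin("v[i] " ~ op ~ " rhs.v[i]");
        return r;
    }

    /// Right-hand binary ops with a scalar on the left: 2.0 * a
    SmallVec opBinaryRight(string op)(double lhs) const
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        SmallVec r;
        foreach (i; 0 .. N)
            r.v[i] = mixin("lhs " ~ op ~ " v[i]");
        return r;
    }

    /// Op-assign, element-wise: a += b, a -= b, ...
    ref SmallVec opOpAssign(string op)(const SmallVec rhs)
        if (op == "+" || op == "-" || op == "*" || op == "/")
    {
        foreach (i; 0 .. N)
            mixin("v[i] " ~ op ~ "= rhs.v[i];");
        return this;
    }
}

unittest
{
    alias V3 = SmallVec!3;
    V3 a; a.v = [1.0, 2.0, 3.0];
    V3 b; b.v = [4.0, 5.0, 6.0];
    assert((a + b).v == [5.0, 7.0, 9.0]);
    assert((-a).v == [-1.0, -2.0, -3.0]);
    assert((2.0 * a).v == [2.0, 4.0, 6.0]);
    a += b;
    assert(a.v == [5.0, 7.0, 9.0]);
}
```

Four templates replace what would otherwise be a separately named function per operator, which is the code saving the thread is after.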
"Robert Jacques" <sandford jhu.edu> writes:
```On Thu, 15 Oct 2009 04:48:57 -0400, Fawzi Mohamed <fmohamed mac.com> wrote:

I would really like a solution for all the overloading ops, as I missed
them in NArray. I think that some small rewriting is OK, but it must be
*small*: no magic, because, as others have already said, numerics can be tricky.
Andrei's proposal also seems workable, but there is another solution:

Note that a ref return for opIndex could work in most situations.
As Bill correctly pointed out, sparse matrices offer the most challenging
example. There one wants two different functions, opIndex and
opIndexLhs, the second being called when the index is on the left-hand
side of an assignment, so that reading a 0 entry in a matrix returns 0,
whereas assigning to it allocates space for it.
This makes it slightly more complex to control what is being assigned
(you need to return a structure overloading opXAssign), but I think
it would be OK in most cases.

Fawzi

Would you like some example code?
```
Oct 15 2009