digitalmars.D - Thoughts on improving operators

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

I agree. But there's something that can be done to help make the
operators even more usable.
Example: named foreach loops over objects:

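import std.ascii : isAlpha;
import std.conv : to;

struct TestStruct
{
public:
    // to!(dchar[]) decodes the UTF-8 string; text.dup would yield char[].
    this(string text) { _text = to!(dchar[])(text); }

    // Proposed syntax: an opApply selected by the name "text".
    int opApply(string name : "text")(int delegate(ref dchar) dg)
    {
        int result;
        foreach(ref c; _text)
        {
            result = dg(c);
            if(result)
                break;
        }
        return result;
    }

private:
    dchar[] _text;
}

unittest
{
    auto ts = TestStruct("Hello, world!");
    foreach(ref c; ts.text) // would call opApply!"text" under the proposal
        if(isAlpha(c))
            ++c;
    assert(ts._text == "Ifmmp, xpsme!"d);
}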

Dropping the string name of the operator would mean using it on the
object itself (which is the only choice now).
This would be a very handy way to define more than one kind of opApply
for an object without the need to create and return private structures
with opApply in them.
The same thing can be used for any other operator, including slices,
indices, etc...
This is very different from returning a ref, because two slice
operators with different names could slice the same object in
different ways (e.g. slice a matrix by rows or by columns).
The idea is similar to having properties: you _can_ achieve the same
effect without properties by creating and returning private structures
with the operators overloaded, but this saves a lot of time
and run-time overhead.
It is much like an integrated opDispatch in each operator.
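
For comparison, here is a minimal sketch of the workaround mentioned above, which works in current D: helper structures returned from properties, each overloading opIndex differently. The Matrix type, its row-major layout, and the names Rows, Cols, rows and cols are assumptions made for illustration; they do not come from the proposal.

```d
struct Matrix
{
    int[] data;            // row-major storage (assumed layout)
    size_t nrows, ncols;

    // Helper returned by rows: indexing it yields a slice of one row.
    static struct Rows
    {
        Matrix* m;
        int[] opIndex(size_t r)
        {
            return m.data[r * m.ncols .. (r + 1) * m.ncols];
        }
    }

    // Helper returned by cols: indexing it copies out one column.
    static struct Cols
    {
        Matrix* m;
        int[] opIndex(size_t c)
        {
            int[] col;
            foreach (r; 0 .. m.nrows)
                col ~= m.data[r * m.ncols + c];
            return col;
        }
    }

    @property Rows rows() { return Rows(&this); }
    @property Cols cols() { return Cols(&this); }
}

void main()
{
    auto m = Matrix([1, 2, 3,
                     4, 5, 6], 2, 3);
    assert(m.rows[1] == [4, 5, 6]); // same object, indexed by row
    assert(m.cols[1] == [2, 5]);    // same object, indexed by column
}
```

The two helper types are exactly the boilerplate the named-operator idea is meant to remove.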
This seems to introduce ambiguity with calling the operators of the
members, but there is more than one solution to this:
    * Either disallow using names that are already used by members.
    * Or give the member's operator priority over the named
operator call, so that the named operator runs only if no member of that
name is accessible from the point of the call.
    * Or turn all operators on members into named operators that the
compiler generates automatically (like the default constructor for
classes), and allow the client to override or disable them.
Best of all, this does not in any way break existing code.

Introduce another operator opPred:

struct TestStruct
{
    int[] array;

    bool opPred(string name : "empty")()
    {
        return array.length == 0;
    }
}

unittest
{
    TestStruct ts;
    if(ts.empty?) writeln("yes, it is.");
    bool delegate() isEmpty = &ts.empty?;
    bool isIt = ts.empty ? true : false;
}

This really looks like it would make D context-dependent and break
lots of code, but:

The question mark becomes the opPred operator, which:
    * Either returns bool (when it is not followed by a colon).
    * Or evaluates one of the two expressions (when it is followed by
a colon, i.e. it is the ternary operator).

The question mark is only used in the ternary operator, so there is no
other way to interpret it.
The question mark can be syntactically interpreted as one of two
possible expressions (either with or without the following colon),
which cannot be ambiguous, because the colon also has a very limited
and specific use.
Of course, the question mark, as with the ternary operator, will have
the lowest precedence, to allow taking a delegate of it and generally
following the ternary operator's behavior.
What do you think?
Wouldn't this solve the aforementioned problem with predicates without
breaking anything?

On Wed, Oct 5, 2011 at 11:54 AM, Walter Bright
<newshound2 digitalmars.com> wrote:
 On 10/4/2011 2:46 AM, Jacob Carlborg wrote:
 What are the thoughts around here on function names containing arbitrary
 symbols, like in Scala. Example:

 void ::: (int a) {}

 This, in effect, means "user defined tokens". The lexing pass will then
 become intertwined with semantic analysis. While possible, this will make
 the compiler slow, buggy, impossible to run the passes concurrently, hard to
 write 3rd party parsing tools, etc.

Oct 05 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Wed, 05 Oct 2011 13:26:26 +0300, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 Example: named foreach loops over objects:

You can achieve almost exactly this by iterating over a delegate (define a  
method with the same signature as opApply). The only change at the call  
site is that instead of "foreach(ref c; ts.text)" you'll type "foreach(ref  
c; &ts.text)".
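
A minimal sketch of that approach, assuming the TestStruct from the first post with opApply replaced by an ordinary method named text. The field is left public here only so the result can be checked:

```d
import std.ascii : isAlpha;
import std.conv : to;

struct TestStruct
{
    this(string text) { _text = text.to!(dchar[]); }

    // Ordinary method with the same signature as opApply.
    int text(int delegate(ref dchar) dg)
    {
        foreach (ref c; _text)
        {
            if (auto result = dg(c))
                return result; // the loop body asked to stop (break/return)
        }
        return 0;
    }

    dchar[] _text;
}

void main()
{
    auto ts = TestStruct("Hello, world!");
    // Iterate over the delegate &ts.text instead of over ts itself.
    foreach (ref c; &ts.text)
        if (isAlpha(c))
            ++c;
    assert(ts._text == "Ifmmp, xpsme!"d);
}
```

Several such methods can coexist on one type, which gives the "named foreach" effect without any new syntax.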

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Oct 05 2011

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

I didn't know that was possible! Thanks!
But the case with the ternary operator is the most interesting one.

On Wed, Oct 5, 2011 at 2:54 PM, Vladimir Panteleev
<vladimir thecybershadow.net> wrote:
 On Wed, 05 Oct 2011 13:26:26 +0300, Gor Gyolchanyan
 <gor.f.gyolchanyan gmail.com> wrote:

 Example: named foreach loops over objects:

 You can achieve almost exactly this by iterating over a delegate (define a
 method with the same signature as opApply). The only change at the call site
 is that instead of "foreach(ref c; ts.text)" you'll type "foreach(ref c;
 &ts.text)".

 --
 Best regards,
   Vladimir                            mailto:vladimir thecybershadow.net

Oct 05 2011

travert phare.normalesup.org (Christophe) writes:

Gor Gyolchanyan, in message (digitalmars.D:146100), wrote:
 I agree. But there's something that can be done to help make the
 operators even more usable.
 Example: named foreach loops over objects:
 
 struct TestStruct
 {
 public:
     this(string text) { _text = text.dup; }
 
     int opApply(string name : "text")(int delegate(ref dchar) dg)
     {
         int result;
         foreach(ref c; _text)
         {
             result = dg(c);
             if(result)
                 break;
         }
         return result;
     }
 
 private:
     dchar[] _text;
 }
 
 unittest
 {
     auto ts = TestStruct("Hello, world!");
     foreach(ref c; ts.text)
         if(isAlpha(c))
             ++c;
     assert(ts == "Ifmmp, xpsme!");
 }
 
 Dropping the string name of the operator will mean, using it on the
 object itself (which is the only choice now).
 This would be a very handy way to define more than one type of opApply
 for an object without the need to create and return private structures
 with opApply in them.

This is already possible for opApply.

http://www.d-programming-language.org/statement.html#ForeachStatement

See foreach over delegates.


With your proposal, the parsing of all operators becomes more 
complicated: each time you see a symbol, you must check whether it is 
followed by a special operator; if it is, you don't evaluate the 
symbol, but the operator with this symbol's name as a template argument. 
Even if the compiler were implemented that way without bugs, you could 
still lose the programmer. I'd prefer to keep the language simple enough 
that I can see what is going to be called...

In short, I don't know if ts.empty? is going to call ts.empty.opPred, or 
ts.opPred!"empty" ?

I don't think naming predicates is such a big issue. 'isEmpty' is not 
that ugly. And in special cases, you can even decide to call the 
predicate just 'empty'. No one forces you to have a naming convention 
for predicates. Even if the proposal is adopted and empty? is used, that 
doesn't change the fact that you should not have a method called empty in 
your structure because of the parsing ambiguity, so you could have used an 
empty method in the first place (instead of isEmpty).

One way to solve this ambiguity would be to use ts?empty. But then you 
lose the ternary operator... And this doesn't work nicely with opIndex 
or opApply, for example.

About using ? as a post-fix unary operator converting to bool, I would 
say 'why not ?'. But I think overloading opPred (and other logical 
operators && and ||) is not a good idea at all. These operators must 
keep the same meaning in any condition.



Finally, if you hate constructing a structure to be used by opIndex, use 
a delegate, and construct the structure on the fly with a template:

struct Indexable(T)
{
  this(T delegate(size_t) dg_) { dg = dg_; }
  T delegate(size_t i) dg;
  T opIndex(size_t i) { return dg(i); }
}

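// Helper so that T is deduced from the delegate's return type.
auto indexable(T)(T delegate(size_t) dg)
{
  return Indexable!T(dg);
}

struct Foo
{
  int a, b;
  auto byIndex()
  {
    // The delegate must declare the index parameter it uses.
    return indexable((size_t i) { return i == 0 ? a : b; });
  }
}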


We could even improve the template by templating the index argument, 
proposing to have a length method, etc...
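
One possible shape for that improvement, as a sketch only: the index type becomes a second template parameter and an optional length delegate is stored next to the element delegate. The parameter name I, the lengthDg field and the default of null are assumptions for illustration.

```d
struct Indexable(T, I)
{
    T delegate(I) dg;            // produces the element for an index
    size_t delegate() lengthDg;  // optional; null when no length is known

    T opIndex(I i) { return dg(i); }

    @property size_t length()
    {
        assert(lengthDg !is null, "no length delegate was provided");
        return lengthDg();
    }
}

// Helper so that T and I are deduced from the delegate's type.
auto indexable(T, I)(T delegate(I) dg, size_t delegate() lengthDg = null)
{
    return Indexable!(T, I)(dg, lengthDg);
}

struct Foo
{
    int a, b;
    auto byIndex()
    {
        return indexable((size_t i) { return i == 0 ? a : b; },
                         delegate size_t() { return 2; });
    }
}

void main()
{
    auto foo = Foo(10, 20);
    auto view = foo.byIndex();
    assert(view[0] == 10);
    assert(view[1] == 20);
    assert(view.length == 2);
}
```

The view holds delegates that refer back to foo, so it must not outlive the object it was created from.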

-- 
Christophe

Oct 05 2011

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

The whole point was to put the question mark to a better use.
I mean, it's used in the ternary operator exclusively.
It's such a waste of a token.
The question mark logically belongs to bools (which goes well with the
ternary operator), but the bools are much more often worked with in
the form of predicates, so I'd want to make that question mark more
useful.


Oct 05 2011

kennytm <kennytm gmail.com> writes:

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> wrote:
 The whole point was to put the question mark to a better use.
 I mean, it's used in the ternary operator exclusively.
 It's such a waste of a token.
 The question mark logically belongs to bools (which goes well with the
 ternary operator), but the bools are much more often worked with in
 the form of predicates, so I'd want to make that question mark more
 useful.

The same could be said of '-', which is only used for subtraction. What a
waste of a token.

I'd say as long as the symbol alone is a valid token, it should never be
part of an identifier; doing otherwise is just going to confuse anybody coming from


-1.

Oct 05 2011

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:

I never wanted it to be a part of an identifier. I wanted it to be an
overloadable operator.
'-' already is an overloadable operator, so it can be put to many uses.

know if it will be unambiguous to use it as an operator.
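
For reference, this is roughly how '-' is overloaded in current D, through the opBinary and opUnary templates specialized on the operator string. The Vec2 type is a made-up example, not something from this thread.

```d
struct Vec2
{
    int x, y;

    // Binary minus: a - b
    Vec2 opBinary(string op : "-")(Vec2 rhs) const
    {
        return Vec2(x - rhs.x, y - rhs.y);
    }

    // Unary minus: -a
    Vec2 opUnary(string op : "-")() const
    {
        return Vec2(-x, -y);
    }
}

void main()
{
    auto a = Vec2(3, 5);
    auto b = Vec2(1, 2);
    assert(a - b == Vec2(2, 3));
    assert(-a == Vec2(-3, -5));
}
```

The `string op : "-"` specialization is the same mechanism the proposed `string name : "text"` syntax borrows.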


Oct 06 2011
