
digitalmars.D - Thoughts on improving operators

Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
I agree. But there's something that can be done to help make the
operators even more usable.
Example: named foreach loops over objects:

struct TestStruct
{
public:
    this(string text) { _text = text.dup; }

    int opApply(string name : "text")(int delegate(ref dchar) dg)
    {
        int result;
        foreach(ref c; _text)
        {
            result = dg(c);
            if(result)
                break;
        }
        return result;
    }

private:
    dchar[] _text;
}

unittest
{
    auto ts = TestStruct("Hello, world!");
    foreach(ref c; ts.text)
        if(isAlpha(c))
            ++c;
    assert(ts == "Ifmmp, xpsme!");
}

Dropping the string name of the operator would mean applying it to the
object itself (which is the only choice now).
This would be a very handy way to define more than one kind of opApply
for an object without the need to create and return private structures
with opApply in them.
The same thing could be used for any other operator, including slices,
indices, etc.
This is very different from returning a ref, because two slice
operators with different names could slice the same object in
different ways (e.g. slice a matrix by rows or by columns).
The idea is similar to having properties: you _can_ achieve the same
effect without properties by creating and returning private structures
with the operators overloaded, but this saves lots of time
and run-time overhead.
Much like an integrated opDispatch in each operator.
This seems to introduce ambiguity with calling the operators of the
members, but there is more than one solution to this:
    * Either disallow using names already used by members.
    * Or give the member's operator priority over the named operator
call, so that the named operator runs only if no member of that name is
accessible from the point of the call.
    * Or turn all member operators into named operators that the
compiler generates automatically (like the default constructor for
classes) and allow the client to override or disable them.
Best of all, this does not in any way break existing code.
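
For comparison, here is a minimal sketch of the workaround mentioned above: slicing a matrix by rows or by columns through private helper structures that each overload opIndex. The names Matrix, ByRow, ByColumn, byRow and byColumn are assumptions made up for this illustration; they do not come from any library.

```d
struct Matrix
{
    int[] data;        // row-major storage
    size_t rows, cols;

    // Hypothetical helper: indexing yields one row as a slice of the storage.
    static struct ByRow
    {
        Matrix* m;
        int[] opIndex(size_t r)
        {
            return m.data[r * m.cols .. (r + 1) * m.cols];
        }
    }

    // Hypothetical helper: indexing yields a freshly built copy of one column.
    static struct ByColumn
    {
        Matrix* m;
        int[] opIndex(size_t c)
        {
            int[] col;
            foreach (r; 0 .. m.rows)
                col ~= m.data[r * m.cols + c];
            return col;
        }
    }

    ByRow byRow() { return ByRow(&this); }
    ByColumn byColumn() { return ByColumn(&this); }
}

void main()
{
    auto m = Matrix([1, 2, 3,
                     4, 5, 6], 2, 3);
    assert(m.byRow()[1] == [4, 5, 6]);
    assert(m.byColumn()[2] == [3, 6]);
}
```

Each named view costs one extra type and one accessor method, which is the boilerplate the named-operator idea is meant to remove.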

Introduce another operator opPred:

struct TestStruct
{
    int[] array;

    bool opPred(string name : "empty")()
    {
        return array.length == 0;
    }
}

unittest
{
    TestStruct ts;
    if(ts.empty?) writeln("yes, it is.");
    bool delegate() isEmpty = &ts.empty?;
    bool isIt = ts.empty ? true : false;
}

This really looks like it's gonna make D context-dependent and break
lots of code, but:

The question mark becomes the opPred operator, which:
    * Either returns bool (when no colon follows).
    * Or evaluates one of the two expressions (when a colon follows,
i.e. it is the ternary operator).

The question mark is only used in the ternary operator, so there is no
other way to interpret it.
The question mark can be syntactically interpreted as one of two
possible expressions (either with or without the following colon),
which cannot be ambiguous, because the colon also has a very limited
and specific use.
Of course, the question mark, as with the ternary operator, will have
the lowest precedence, to allow taking a delegate of it and generally
following the ternary operator's behavior.
What do you think?
Wouldn't this solve the aforementioned problem with predicates without
breaking anything?

On Wed, Oct 5, 2011 at 11:54 AM, Walter Bright
<newshound2 digitalmars.com> wrote:
 On 10/4/2011 2:46 AM, Jacob Carlborg wrote:
 What are the thoughts around here on function names containing arbitrary
 symbols, like in Scala. Example:

 void ::: (int a) {}

This, in effect, means "user defined tokens". The lexing pass will then become intertwined with semantic analysis. While possible, this will make the compiler slow, buggy, impossible to run the passes concurrently, hard to write 3rd party parsing tools, etc.

Oct 05 2011
"Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wed, 05 Oct 2011 13:26:26 +0300, Gor Gyolchanyan  
<gor.f.gyolchanyan gmail.com> wrote:

 Example: named foreach loops over objects:

You can achieve almost exactly this by iterating over a delegate (define a method with the same signature as opApply). The only change at the call site is that instead of "foreach(ref c; ts.text)" you'll type "foreach(ref c; &ts.text)".

--
Best regards,
 Vladimir                            mailto:vladimir thecybershadow.net
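
A minimal sketch of the delegate approach described above, reusing the TestStruct from the first post. The constructor body and the chars accessor are assumptions added only so that the sketch compiles and can check its result; isAlpha is taken from std.uni.

```d
import std.uni : isAlpha;

struct TestStruct
{
    this(string text)
    {
        // Decode the UTF-8 input into an array of mutable code points.
        foreach (dchar c; text)
            _text ~= c;
    }

    // Same signature as opApply, but under an ordinary method name.
    int text(int delegate(ref dchar) dg)
    {
        int result;
        foreach (ref c; _text)
        {
            result = dg(c);
            if (result)
                break;
        }
        return result;
    }

    // Hypothetical read-only accessor, used here only to inspect the result.
    const(dchar)[] chars() const { return _text; }

private:
    dchar[] _text;
}

void main()
{
    auto ts = TestStruct("Hello, world!");
    // Note the '&': foreach iterates over the delegate, not over the object.
    foreach (ref c; &ts.text)
        if (isAlpha(c))
            ++c;
    assert(ts.chars == "Ifmmp, xpsme!"d);
}
```

Any number of such methods can coexist on one type, each giving a differently named iteration.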
Oct 05 2011
Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
I didn't know that was possible! Thanks!
But the case with the ternary operator is the most interesting one.

On Wed, Oct 5, 2011 at 2:54 PM, Vladimir Panteleev
<vladimir thecybershadow.net> wrote:
 On Wed, 05 Oct 2011 13:26:26 +0300, Gor Gyolchanyan
 <gor.f.gyolchanyan gmail.com> wrote:

 Example: named foreach loops over objects:

 You can achieve almost exactly this by iterating over a delegate (define a
 method with the same signature as opApply). The only change at the call site
 is that instead of "foreach(ref c; ts.text)" you'll type "foreach(ref c;
 &ts.text)".

 --
 Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Oct 05 2011
travert phare.normalesup.org (Christophe) writes:
Gor Gyolchanyan, in message (digitalmars.D:146100), wrote:
 I agree. But there's something that can be done to help make the
 operators even more usable.
 Example: named foreach loops over objects:
 
 struct TestStruct
 {
 public:
     this(string text) { _text = text.dup; }
 
     int opApply(string name : "text")(int delegate(ref dchar) dg)
     {
         int result;
         foreach(ref c; _text)
         {
             result = dg(c);
             if(result)
                 break;
         }
         return result;
     }
 
 private:
     dchar[] _text;
 }
 
 unittest
 {
     auto ts = TestStruct("Hello, world!");
     foreach(ref c; ts.text)
         if(isAlpha(c))
             ++c;
     assert(ts == "Ifmmp, xpsme!");
 }
 
 Dropping the string name of the operator would mean applying it to the
 object itself (which is the only choice now).
 This would be a very handy way to define more than one kind of opApply
 for an object without the need to create and return private structures
 with opApply in them.

This is already possible for opApply.
http://www.d-programming-language.org/statement.html#ForeachStatement
See foreach over delegates.

With your proposal, the parsing of all operators becomes more complicated: each time you see a symbol, you must check whether it is followed by a special operator; if it is, then you don't evaluate the symbol, but the operator with this symbol name as template argument. Even if the compiler was implemented that way without bugs, you could still lose the programmer. I'd prefer to keep the language simple enough so I can see what will get called... In short, I don't know if ts.empty? is going to call ts.empty.opPred, or ts.opPred!"empty"?

I don't think naming predicates is such a big issue. 'isEmpty' is not that ugly. And in special cases, you can even decide to call the predicate just 'empty'. No one forces you to have a naming convention for predicates. If the proposal is adopted and empty? is used, that doesn't change the fact that you should not have a method called empty in your structure because of parsing ambiguity, so you could have used an empty method in the first place (instead of isEmpty).

One way to solve this ambiguity would be to use ts?empty. But then you lose the ternary operator... And this doesn't work nicely with opIndex or opApply, for example.

About using ? as a post-fix unary operator converting to bool, I would say 'why not?'. But I think overloading opPred (and the other logical operators && and ||) is not a good idea at all. These operators must keep the same meaning in any condition.

Finally, if you hate constructing a structure to be used by opIndex, use a delegate, and construct the structure on the fly with a template:

struct Indexable(T)
{
    this(T delegate(size_t) dg_) { dg = dg_; }
    T delegate(size_t i) dg;
    T opIndex(size_t i) { return dg(i); }
}

// Helper that deduces T from the delegate's return type.
auto indexable(T)(T delegate(size_t) dg)
{
    return Indexable!T(dg);
}

struct Foo
{
    int a, b;
    auto byIndex()
    {
        // The delegate receives the requested index.
        return indexable((size_t i) { return i == 0 ? a : b; });
    }
}

We could even improve the template by templating the index argument, proposing to have a length method, etc...

--
Christophe
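
One possible shape for the improvement suggested at the end of the post (a templated index type plus a length method) is sketched below. This is only an illustration under assumptions: the second template parameter I, the lenDg field and the helper methods at and count are invented here and are not part of the original suggestion.

```d
struct Indexable(T, I = size_t)
{
    T delegate(I) dg;        // computes the element for a given index
    size_t delegate() lenDg; // reports how many elements are available

    T opIndex(I i) { return dg(i); }
    size_t length() { return lenDg(); }
}

struct Foo
{
    int a, b;

    // Hypothetical helpers that back the indexable view.
    int at(size_t i) { return i == 0 ? a : b; }
    size_t count() { return 2; }

    auto byIndex()
    {
        // &at and &count are delegates bound to this instance.
        return Indexable!int(&at, &count);
    }
}

void main()
{
    auto foo = Foo(10, 20);
    auto view = foo.byIndex();
    assert(view[0] == 10);
    assert(view[1] == 20);
    assert(view.length == 2);
}
```

The view holds delegates that point back at foo, so it must not outlive the object it was created from.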
Oct 05 2011
Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
The whole point was to put the question mark to a better use.
I mean, it's used in the ternary operator exclusively.
It's such a waste of a token.
The question mark logically belongs to bools (which goes well with the
ternary operator), but bools are much more often worked with in
the form of predicates, so I'd want to make that question mark more
useful.

On Wed, Oct 5, 2011 at 3:32 PM, Christophe <travert phare.normalesup.org> wrote:
 Gor Gyolchanyan, in message (digitalmars.D:146100), wrote:
 I agree. But there's something that can be done to help make the
 operators even more usable.
 Example: named foreach loops over objects:

 struct TestStruct
 {
 public:
     this(string text) { _text = text.dup; }

     int opApply(string name : "text")(int delegate(ref dchar) dg)
     {
         int result;
         foreach(ref c; _text)
         {
             result = dg(c);
             if(result)
                 break;
         }
         return result;
     }

 private:
     dchar[] _text;
 }

 unittest
 {
     auto ts = TestStruct("Hello, world!");
     foreach(ref c; ts.text)
         if(isAlpha(c))
             ++c;
     assert(ts == "Ifmmp, xpsme!");
 }

 Dropping the string name of the operator would mean applying it to the
 object itself (which is the only choice now).
 This would be a very handy way to define more than one kind of opApply
 for an object without the need to create and return private structures
 with opApply in them.

 This is already possible for opApply.
 http://www.d-programming-language.org/statement.html#ForeachStatement
 See foreach over delegates.

 With your proposal, the parsing of all operators becomes more complicated: each time you see a symbol, you must check whether it is followed by a special operator; if it is, then you don't evaluate the symbol, but the operator with this symbol name as template argument. Even if the compiler was implemented that way without bugs, you could still lose the programmer. I'd prefer to keep the language simple enough so I can see what will get called... In short, I don't know if ts.empty? is going to call ts.empty.opPred, or ts.opPred!"empty"?

 I don't think naming predicates is such a big issue. 'isEmpty' is not that ugly. And in special cases, you can even decide to call the predicate just 'empty'. No one forces you to have a naming convention for predicates. If the proposal is adopted and empty? is used, that doesn't change the fact that you should not have a method called empty in your structure because of parsing ambiguity, so you could have used an empty method in the first place (instead of isEmpty).

 One way to solve this ambiguity would be to use ts?empty. But then you lose the ternary operator... And this doesn't work nicely with opIndex or opApply, for example.

 About using ? as a post-fix unary operator converting to bool, I would say 'why not?'. But I think overloading opPred (and the other logical operators && and ||) is not a good idea at all. These operators must keep the same meaning in any condition.

 Finally, if you hate constructing a structure to be used by opIndex, use a delegate, and construct the structure on the fly with a template:

 struct Indexable(T)
 {
     this(T delegate(size_t) dg_) { dg = dg_; }
     T delegate(size_t i) dg;
     T opIndex(size_t i) { return dg(i); }
 }

 // Helper that deduces T from the delegate's return type.
 auto indexable(T)(T delegate(size_t) dg)
 {
     return Indexable!T(dg);
 }

 struct Foo
 {
     int a, b;
     auto byIndex()
     {
         // The delegate receives the requested index.
         return indexable((size_t i) { return i == 0 ? a : b; });
     }
 }

 We could even improve the template by templating the index argument, proposing to have a length method, etc...

 --
 Christophe

Oct 05 2011
kennytm <kennytm gmail.com> writes:
Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> wrote:
 The whole point was to put the question mark to a better use.
 I mean, it's used in the ternary operator exclusively.
 It's such a waste of a token.
 The question mark logically belongs to bools (which goes well with the
 ternary operator), but bools are much more often worked with in
 the form of predicates, so I'd want to make that question mark more
 useful.

The same could be said of '-', which is only used for subtraction. What a waste of a token. I'd say that as long as the symbol alone is a valid token, it should never be part of an identifier; doing otherwise is just gonna confuse anybody coming from C-like languages, i.e. C, C++, C#, D, Java, JavaScript, etc. -1. You'd have had a slightly better chance if you'd suggested '#'.
Oct 05 2011
Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
I never wanted it to be a part of an identifier. I wanted it to be an
overloadable operator.
'-' already is an overloadable operator, so it can be put to many uses.
'#' is, as far as I know, used in the shebang and the line specifier. I don't
know whether it would be unambiguous to use it as an operator.

On Thu, Oct 6, 2011 at 1:37 AM, kennytm <kennytm gmail.com> wrote:
 Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> wrote:
 The whole point was to put the question mark to a better use.
 I mean, it's used in the ternary operator exclusively.
 It's such a waste of a token.
 The question mark logically belongs to bools (which goes well with the
 ternary operator), but bools are much more often worked with in
 the form of predicates, so I'd want to make that question mark more
 useful.

 The same could be said of '-', which is only used for subtraction. What a waste of a token. I'd say that as long as the symbol alone is a valid token, it should never be part of an identifier; doing otherwise is just gonna confuse anybody coming from C-like languages, i.e. C, C++, C#, D, Java, JavaScript, etc. -1. You'd have had a slightly better chance if you'd suggested '#'.

Oct 06 2011