
digitalmars.D.bugs - [Issue 8381] New: Uniform function call syntax (pseudo member) enhancement suggestions

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8381

           Summary: Uniform function call syntax (pseudo member)
                    enhancement suggestions
           Product: D
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: tommitissari hotmail.com



As I see it, the goal of uniform function call syntax, as 
described here http://www.drdobbs.com/blogs/cpp/232700394, is to 
allow non-intrusively extending the functionality of a type. I 
think the current implementation falls short in accomplishing 
this goal on two accounts:

1) You can't non-intrusively add static member functions
2) You can't non-intrusively add constructors

So, I suggest adding these two features to the language:

1. Static members as free functions
------------------------------------
    If function invocations like the following are encountered...

    A) Type.func(<ARGUMENTS>);
    B) Type.func; // it's a static @property function

    ...and those functions haven't been implemented as members,
    the function invocations get lowered into:

    A) func!Type(<ARGUMENTS>);
    B) func!Type;

    For example:
    enum Fruit {apple, orange}

    // CODE               // LOWERED CODE
    Fruit.letEmRot();     // .letEmRot!Fruit();
    if (Fruit.isEatable)  // if (.isEatable!Fruit)
    {
      Fruit myfruit;
      myfruit.split(3);   // .split!Fruit(3); // only if it can't
                                              // be lowered into:
                                              // .split(myfruit, 3)
    }
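
As a rough sketch of what the lowered form already looks like in today's D, the proposed target calls can be written by hand. The names isEatable and describe are hypothetical stand-ins for the pseudo-members above, and the template constraint is an assumption about how one would restrict them to a single type:

```d
enum Fruit { apple, orange }

// Hypothetical pseudo-static "members", written as templates on the type.
// The constraint limits each one to Fruit, much as a real member would be.
bool isEatable(T)() if (is(T == Fruit))
{
    return true;
}

string describe(T)(int count) if (is(T == Fruit))
{
    import std.conv : to;
    return count.to!string ~ " pieces of " ~ T.stringof;
}

void main()
{
    // Under the proposal, Fruit.isEatable would be rewritten to this call.
    assert(isEatable!Fruit());
    // Likewise, Fruit.describe(3) would become describe!Fruit(3).
    assert(describe!Fruit(3) == "3 pieces of Fruit");
}
```

The only thing the proposal would add is the rewrite from Type.func(...) to func!Type(...); the templates themselves need no new language support.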

2. Class & struct constructors as free functions
-------------------------------------------------
    If a constructor call hasn't been implemented by Type...

    auto t = Type(<ARGUMENTS>);

    ...then it gets lowered into a free function call...

    auto t = .make!Type(<ARGUMENTS>);

    For example:
    enum Fruit {apple, orange}

    // CODE                 // LOWERED CODE
    auto f = Fruit("red");  // auto f = .make!Fruit("red");
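
A minimal sketch of such a free "constructor", written as it would have to be called today. The colour-to-fruit mapping is invented purely for illustration, and make is the hypothetical function name from the proposal, not an existing library function:

```d
enum Fruit { apple, orange }

// Hypothetical free "constructor" for Fruit. The mapping from colour to
// enum member is an illustrative assumption.
T make(T)(string colour) if (is(T == Fruit))
{
    return colour == "red" ? Fruit.apple : Fruit.orange;
}

void main()
{
    // Under the proposal, Fruit("red") would be rewritten to this call.
    auto f = make!Fruit("red");
    assert(f == Fruit.apple);
    assert(make!Fruit("orange") == Fruit.orange);
}
```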

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 12 2012
next sibling parent d-bugmail puremagic.com writes:


David Piepgrass <qwertie256 gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |qwertie256 gmail.com



To some extent I like the goal of the proposal, but I don't like the
implementation:

- The rule is non-obvious. How could a new developer possibly guess that this
happens?
- It would get lowered to a call to a template function, but usually the
programmer wants to extend one specific type.

So I offer the following counterproposal:

1) Adding static member functions

Supposing that I would like to extend the set of static members in class Foo.
How about:

module A;
// assuming "static class" does not already have any meaning?
static class Foo {
    static void F();
    static int x;
}

This block defines members in a class namespace "Foo". If "regular class Foo"
is defined in module A then these static members go into the same class.
However, if the original Foo is defined in module B then "static class Foo"
goes into module A and is considered a separate class.

Now consider some code that tries to use F():

module C;
import A;
import B;
void code() { Foo.F(); }

Currently, the compiler complains that B.Foo and A.Foo conflict with each
other. I propose changing this rule when accessing static members. The compiler
does not need to declare a conflict as soon as it sees "Foo", instead it can
look for static members in ALL Foo classes, using the same anti-hijacking rules
that it uses for free-standing module functions.

The purpose of "static class" is not to facilitate this method lookup per se;
if A and B both contain non-static Foo classes, the compiler should still
search both classes for static members rather than giving an error immediately.
Rather, the main purpose of "static class" is to declare that the class has no
constructor (not merely a @disabled default constructor, but no constructor at
all). Therefore, "new Foo()" cannot mean "new A.Foo", leading the compiler to
the interpretation "new B.Foo". Also, multiple static classes can be defined
with the same name in the same module, and their contents are merged. 

IMO, when searching for static members, a class scope should pretty much behave
the same way as module scope. So the compiler should not report an ambiguous
call if the call has only one interpretation. In the above case, the compiler
should report an error if and only if a B.F() function exists.

One more thing, if module C declares a "class Foo" then it takes priority over
A.Foo and B.Foo. But if C declares "static class Foo" then C.Foo should allow
the same overloading behavior just described; thus "new Foo()" should still
mean "new B.Foo()" and "Foo.F()" should still mean "A.Foo.F()". I think an
equivalent way to say this is that, inside C, the lookup rules proceed as if
C.Foo were declared outside module C and imported into C.


2) Adding constructors aka "Constructors Considered Harmful"

Ahh, constructors, constructors. In my opinion the constructor design in most
languages exposes an implementation detail that should not be exposed, namely,
which class gets allocated and when.

I offer you as "exhibit L" the Lazy<T> class in the .NET framework. This
class's main purpose is to compute a value the first time you access it. For
example:

int x;
Lazy<int> lazy = new Lazy<int>(() => 7 * x);
x = 3;
x = lazy.Value; // 21
x = lazy.Value; // still 21 (Value is initialized only once)

There is an annoying issue, though: Lazy<T> operates in several different modes, and
it also contains a member and extra code to aid debugging. So in addition to
holding the value itself and a reference to the initializer delegate, it's got
a couple of other member variables and the Value property has to check a couple
of things before it returns the value.

So what if, in the future, MS decided to optimize Lazy<T> for its default mode
of operation, and factor out other modes into derived class(es)? Well, they
can't do that. If MS wants to return a LazyThreadSafe<T> object (derived from
Lazy<T>) when the user requests the thread-safe mode, they can't do that
because all the clients are saying "new Lazy", which can only return the exact
class Lazy and nothing else.

MS could add a static function, Lazy.New(...), but it's too late now that the
interface is already defined with all those public constructors.

As "exhibit 2", a.k.a. "exhibit Foo", I recently wrote a library where I needed
to provide a constructor that does a bunch of initialization work (that may
fail) before it actually creates the object. By far the most natural
implementation was a static member function:

// Constructor with dependency injection
public Foo(arg1, arg2, arg3)
{
}
// Static helper method provides an easy way to create MyClass
public static Foo LoadFrom(string filename, ...)
{
    ... // do some work
    arg1 = ...; arg2 = ...; arg3 = ...;
    return new Foo(arg1, arg2, arg3);
}

However, the client didn't like that, and insisted that Create() should be a
constructor "for consistency". I was able to rearrange things to make Create()
into a constructor, but my code was a little clunkier that way.

So what I'm saying is, when clients directly allocate memory themselves with
new(), it constrains the class implementation to work a certain way, and this
constraint is unnecessary.

So I would like to propose a way to solve these three problems. I give you:
"static new" functions:

module P;
class Foo {
    // Normal constructor
    init(A arg1, B arg2, C arg3) { ... }

    static Foo new(string filename, ...)
    {
        ... // do some work
        arg1 = ...; arg2 = ...; arg3 = ...;
        return new Foo(arg1, arg2, arg3);
    }
}

A static method named "new" overloads with constructors and can be called by
the new operator, just like constructors: "new Foo(filename, ...)" calls the
static method. A "static new" function can return a derived class if it wants,
or it can throw an exception without actually allocating any memory for a new
object. 
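
For comparison, here is a sketch of how the same freedom can be approximated in current D with an ordinary static factory method. The class names echo the Lazy example above but are simplified, non-generic stand-ins, and create and kind are hypothetical names chosen for illustration:

```d
class Lazy
{
    // Hypothetical factory: callers never name the concrete class, so the
    // implementation is free to return a derived type.
    static Lazy create(bool threadSafe)
    {
        if (threadSafe)
            return new LazyThreadSafe();
        return new Lazy();
    }

    string kind() const { return "plain"; }
}

class LazyThreadSafe : Lazy
{
    override string kind() const { return "thread-safe"; }
}

void main()
{
    Lazy a = Lazy.create(false);
    Lazy b = Lazy.create(true);
    assert(a.kind() == "plain");
    assert(b.kind() == "thread-safe");
}
```

What the sketch cannot do is make "new Lazy(...)" itself go through the factory, which is the part the "static new" proposal would add.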

Now, remember the original topic: people would like to add "constructors" to
existing classes. The two above proposals together enable this possibility:

module Q;
static class Foo {
    static P.Foo new(Bar b, Baz z) {
        ...
        return new P.Foo(...);
    }
}

Arguably a "static new" function should not be allowed to return an
object that is not derived from a class called "Foo", but note that this
function in class Q.Foo must be able to return a P.Foo even though it is not
related except by name. Also, arguably, "static new" functions should not
return null, but it may be impractical and overkill for the compiler to enforce
such a rule.

One final problem is that standard constructors cannot have names. You can
define static functions with names instead, but clients must use a different
syntax to call static methods compared to constructors, and this difference
feels a little odd, especially when a class provides BOTH static methods AND
constructors to construct objects.

To solve this problem, I would advocate a "uniform constructor call syntax" as
well, which is quite simply to allow users to flip around "new Foo" to
"Foo.new" if they so choose. Thus a class might define:

class Bar {
    init() { ... }
    static Bar newFromResourceId(int id) { ... }
    static Bar newFromFilename(string fn) { ... }
}

In this case the user can call Bar.new(), Bar.newFromResourceId(42) or
Bar.newFromFilename("..."). So this third proposal is not to allow named
constructors per se, but to allow a unification of syntax so that 

1. there need no longer be a big syntactic difference between "new
Bar(...)" and a "named" constructor "Bar.newFromFilename"
2. a client cannot distinguish at the call site whether he is calling a
constructor or a static method. In other words, without this third proposal,
the language would seem inconsistent because it is presumably possible to use
the syntax Foo.new(Bar, Baz) but Bar.new() would be illegal.

Jul 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
> (not merely a @disabled default constructor, but no constructor at all).

Also, declaring an instance of a static class is not possible either, so that
given

module Q;
static class Foo {
    static P.Foo new(Bar b, Baz z) {
        ...
        return new P.Foo(...);
    }
}

the meaning is unambiguous in:

Foo f = new Foo(Bar(...), Baz(...));
// equivalent to P.Foo f = Q.Foo.new(Bar(...), Baz(...));
Jul 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
> - The rule is non-obvious. How could a new developer possibly guess that this
> happens?

You could also argue that it's not obvious that:
var.func(arg);

...can "transform" itself into:
func(var, arg);

I don't think it's that much different to transform:
MyType.func(arg);

...into:
func!(MyType)(arg);

It almost looks like we're trying to pass MyType as first argument to the
function, but because we can't pass type as function argument, we pass it as
first template argument. But you're right that none of that is very obvious.
Maybe the whole uniform function call syntax could have been implemented
somehow differently so that it had been more intuitive to people new to the
language.
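
The existing rewrite under discussion can be checked with a small sketch. The function scaled is a hypothetical name; the only assumption is that int has no member of that name, so the compiler falls back to the free function:

```d
// A free function whose first parameter plays the role of "this".
int scaled(int value, int factor)
{
    return value * factor;
}

void main()
{
    int var = 7;
    // int has no member named scaled, so var.scaled(3) is rewritten by
    // the compiler to scaled(var, 3).
    assert(var.scaled(3) == scaled(var, 3));
    assert(var.scaled(3) == 21);
}
```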
> Supposing that I would like to extend the set of static members in class Foo.
> How about:
>
> module A;
> // assuming "static class" does not already have any meaning?
> static class Foo {
>     static void F();
>     static int x;
> }

But I really mean type when I say, "extending the functionality of a type". I
mean not only class but all user defined types as well, like enum, struct and
union.
Jul 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:


deadalnix <deadalnix gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |deadalnix gmail.com





>> - The rule is non-obvious. How could a new developer possibly guess that this
>> happens?
>
> You could also argue that it's not obvious that:
> var.func(arg);
>
> ...can "transform" itself into:
> func(var, arg);
>
> I don't think it's that much different to transform:
> MyType.func(arg);
>
> ...into:
> func!(MyType)(arg);

This is the most obvious way to transform that.
Jul 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
One small point I forgot to mention about my lowering proposal, which is that
besides adding static pseudo-member functions, you can also add types. For
example:

struct WrapInt
{
    int m_value;
}

template ValueType(T)
    if (is(T == WrapInt))
{
    alias int ValueType;
}

void main(string[] args)
{
    WrapInt.ValueType value = 12;
    // gets lowered into:
    // ValueType!WrapInt value = 12;
}

Jul 12 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
Argh, so many typos, I should be careful when I rename things...

> // Static helper method provides an easy way to create MyClass
> public static Foo LoadFrom(string filename, ...)

> However, the client didn't like that, and insisted that Create() should be a
> constructor "for consistency". I was able to rearrange things to make Create()
> into a constructor, but my code was a little clunkier that way.

s/MyClass/Foo/
s/Create/LoadFrom/

Oh how nice it would be if we could simply correct our posts.

> You could also argue that it's not obvious that:
> var.func(arg);
>
> ...can "transform" itself into:
> func(var, arg);

That feature is used very often, so newcomers will learn it quickly. And it is
definitely a more obvious transformation than func!Type(...).

P.S. I'm just throwing it out there, but couldn't classes themselves be treated
as objects, a la Objective C? That could open the door to another approach, in
which UFCS applies to class objects.
Jul 12 2012
prev sibling parent d-bugmail puremagic.com writes:
> It almost looks like we're trying to pass MyType as first argument to the
> function, but because we can't pass type as function argument, we pass it as
> first template argument. But you're right that none of that is very obvious.
> Maybe the whole uniform function call syntax could have been implemented
> somehow differently so that it had been more intuitive to people new to the
> language.

Well, yeah, maybe. I could probably think of some alternatives that are a bit
more intuitive than UFCS. For example I think it would be nice to allow
"alternate" interfaces to types, something like:

alias MyInt : int {
    bool isPrime() const { ... }
}
MyInt x = 7;
writeln(x.isPrime()); // OK
int y = x+1;
writeln(y.isPrime()); // ERROR, no such function in "int"
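
Something close to this "alternate interface" can be approximated in existing D with a wrapper struct and alias this. This is only a sketch of the idea, not the proposed syntax: the field name value and the trial-division body of isPrime are assumptions, and construction needs MyInt(7) rather than a plain 7:

```d
struct MyInt
{
    int value;
    alias value this; // MyInt converts implicitly to int

    bool isPrime() const
    {
        if (value < 2)
            return false;
        for (int d = 2; d * d <= value; ++d)
            if (value % d == 0)
                return false;
        return true;
    }
}

void main()
{
    auto x = MyInt(7);
    assert(x.isPrime());
    int y = x + 1; // uses the int interface through alias this
    assert(y == 8);
    // Plain int has no isPrime, so this call does not compile.
    static assert(!__traits(compiles, y.isPrime()));
}
```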
> But I really mean type when I say, "extending the functionality of a type". I
> mean not only class but all user defined types as well, like enum, struct and
> union.

My proposal could easily include enums and structs, too:

static enum Goo { static Goo g() {...} }
static struct Hoo { static void f() {...} }

[...] allocated on the heap. If that existed in D then the "static new" part of
my proposal could work well for structs, too.
Jul 12 2012