www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - D UFCS anti-pattern

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
Recently, I observed a conversation happening on the github pull request  
system.

In phobos, we have the notion of output ranges. One is allowed to output  
to an output range by calling the function 'put'.

Here is the implementation of put:

void put(R, E)(ref R r, E e)
{
     static if(is(PointerTarget!R == struct))
         enum usingPut = hasMember!(PointerTarget!R, "put");
     else
         enum usingPut = hasMember!(R, "put");

     enum usingFront = !usingPut && isInputRange!R;
     enum usingCall = !usingPut && !usingFront;

     static if (usingPut && is(typeof(r.put(e))))
     {
         r.put(e);
     }
     else static if (usingPut && is(typeof(r.put((E[]).init))))
     {
         r.put((&e)[0..1]);
     }
     else static if (usingFront && is(typeof(r.front = e, r.popFront())))
     {
         r.front = e;
         r.popFront();
     }
     else static if ((usingPut || usingFront) && isInputRange!E &&  
is(typeof(put(r, e.front))))
     {
         for (; !e.empty; e.popFront()) put(r, e.front);
     }
     else static if (usingCall && is(typeof(r(e))))
     {
         r(e);
     }
     else static if (usingCall && is(typeof(r((E[]).init))))
     {
         r((&e)[0..1]);
     }
     else
     {
         static assert(false,
                 "Cannot put a "~E.stringof~" into a "~R.stringof);
     }
}

There is an interesting issue here -- put can basically be overridden by a  
member function of the output range, also named put. I will note that this  
function was designed and written before UFCS came into existence. So most  
of the machinery here is designed to detect whether a 'put' member  
function exists.

One nice thing about UFCS, now any range that has a writable front(), can  
put any other range whose elements can be put into front, via the  
pseudo-method put.

In other words:

void foo(int[] arr)
{
    int[] result = new int[arr.length];
    result.put(arr); // put arr into result.
}

But there is an issue with this. If the destination range actually  
implements the put member function, but doesn't implement all of the  
global function's niceties,
r.put(...) is not as powerful/useful as put(r,...). Therefore, the odd  
recommendation is to *always* call put(r,...)

I find this, at the very least, to be confusing. Here is a case where UFCS  
ironically is not usable via a function call that so obviously should be  
UFCS.

The anti-pattern here is using member functions to override or specialize  
UFCS behavior. In this case, we even hook the UFCS call with the member  
function, encouraging the name conflict!

As a possible solution, I would recommend simply change the name of the  
hook, and have the UFCS function forward to the hook. This way, calling  
put(r,...) and r.put(...) is always consistent.

Does this make sense? Anyone have any other possible solutions?

A relevant bug report (where I actually advocate for adding more of this  
horrible behavior): https://issues.dlang.org/show_bug.cgi?id=12583

-Steve
Apr 24 2014
next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Thu, 24 Apr 2014 22:21:32 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d puremagic.com>
wrote:

 Recently, I observed a conversation happening on the github pull
 request system.
 
 In phobos, we have the notion of output ranges. One is allowed to
 output to an output range by calling the function 'put'.
 
 Here is the implementation of put:
 
 void put(R, E)(ref R r, E e)
 {
 static if(is(PointerTarget!R == struct))
 enum usingPut = hasMember!(PointerTarget!R, "put");
 else
 enum usingPut = hasMember!(R, "put");
 
 enum usingFront = !usingPut && isInputRange!R;
 enum usingCall = !usingPut && !usingFront;
 
 static if (usingPut && is(typeof(r.put(e))))
 {
 r.put(e);
 }
 else static if (usingPut && is(typeof(r.put((E[]).init))))
 {
 r.put((&e)[0..1]);
 }
 else static if (usingFront && is(typeof(r.front = e,
 r.popFront()))) {
 r.front = e;
 r.popFront();
 }
 else static if ((usingPut || usingFront) && isInputRange!E && 
 is(typeof(put(r, e.front))))
 {
 for (; !e.empty; e.popFront()) put(r, e.front);
 }
 else static if (usingCall && is(typeof(r(e))))
 {
 r(e);
 }
 else static if (usingCall && is(typeof(r((E[]).init))))
 {
 r((&e)[0..1]);
 }
 else
 {
 static assert(false,
 "Cannot put a "~E.stringof~" into a "~R.stringof);
 }
 }
 
 There is an interesting issue here -- put can basically be overridden
 by a member function of the output range, also named put. I will note
 that this function was designed and written before UFCS came into
 existence. So most of the machinery here is designed to detect
 whether a 'put' member function exists.
 
 One nice thing about UFCS, now any range that has a writable front(),
 can put any other range whose elements can be put into front, via
 the pseudo-method put.
 
 In other words:
 
 void foo(int[] arr)
 {
 int[] result = new int[arr.length];
 result.put(arr); // put arr into result.
 }
 
 But there is an issue with this. If the destination range actually 
 implements the put member function, but doesn't implement all of the 
 global function's niceties,
 r.put(...) is not as powerful/useful as put(r,...). Therefore, the
 odd recommendation is to *always* call put(r,...)
 
 I find this, at the very least, to be confusing. Here is a case where
 UFCS ironically is not usable via a function call that so obviously
 should be UFCS.
 
 The anti-pattern here is using member functions to override or
 specialize UFCS behavior. In this case, we even hook the UFCS call
 with the member function, encouraging the name conflict!
 
 As a possible solution, I would recommend simply change the name of
 the hook, and have the UFCS function forward to the hook. This way,
 calling put(r,...) and r.put(...) is always consistent.
 
 Does this make sense? Anyone have any other possible solutions?
 
 A relevant bug report (where I actually advocate for adding more of
 this horrible behavior):
 https://issues.dlang.org/show_bug.cgi?id=12583

If it doesn't work to override a free function with a member function, I honestly don't see much point to UFCS. The whole idea behind it is to make it so that you don't have to care whether a function is a free function or a member function. The current situation essentially forces you to not use UFCS except in cases where you're trying to add "member functions" to built-in types. And as such, calling functions on user-defined types using UFCS runs a high risk of not compiling, because all it takes is for the user-defined type to define a function with the same name - even if it takes completely different arguments - and now the compiler won't even try to use the free function anymore. I really think that we should fix it so that stuff like outputRange.put(foo) works - including when types define put themselves. AFAIK, that means changing the overload rules so that member functions conflict with free functions only when they take the same arguments - in which case the member function would be called, as it is now, except that the cases where a free function matches the arguments would also work, allowing us to override free functions with member functions where appropriate and prevent simple name collisions from making UFCS not work (i.e. when the member function takes completely different arguments, UFCS would still use the free function). Without a change along those lines, I'd be strongly inclined to argue against using UFCS in any situation except in those where you need to add "member functions" to the built-in types. And the only common case for that that I'm aware of is making it so that arrays can function as ranges. But this issue goes far beyond put. - Jonathan M Davis
Apr 24 2014
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 25 April 2014 at 02:21:32 UTC, Steven Schveighoffer 
wrote:
 A relevant bug report (where I actually advocate for adding 
 more of this horrible behavior): 
 https://issues.dlang.org/show_bug.cgi?id=12583

 -Steve

See also: https://issues.dlang.org/show_bug.cgi?id=9074 Appender!string x; x.put(repeat(" ").take(4)); //fails put(x, repeat(" ").take(4)); //works
Apr 24 2014
prev sibling next sibling parent Andrej Mitrovic via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 4/25/14, Steven Schveighoffer via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 Recently, I observed a conversation happening on the github pull request
 system.

Another case of hijacking is the std.conv.text function. I've seen two people so far that have accidentally called the global text function by mistake, e.g.: ----- import std.conv; import std.stdio; class C { this(string user, string text) { _user = user; _text = text; } string _user; string _text; } void main() { auto c = new C("John", "this is my message"); string input = c.text; writeln(input); // test.C, oops! } ----- UFCS has its benefits but it also has its drawbacks. And yet I wouldn't want to ban the above code, but maybe we could figure out some system for protecting against UFCS calls in some contexts. The way I worked around the common `.text` pattern was to define a disable'd field in the class, which would override the global text function and trigger a compilation error.
Apr 25 2014
prev sibling next sibling parent "Brian Schott" <briancschott gmail.com> writes:
On Friday, 25 April 2014 at 07:52:20 UTC, Andrej Mitrovic via 
Digitalmars-d wrote:
 Another case of hijacking is the std.conv.text function. I've 
 seen two
 people so far that have accidentally called the global text 
 function
 by mistake, e.g.:

Make that three. The ability to call non- property functions without parenthesis is another of my favorite anti-features.
Apr 25 2014
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
25-Apr-2014 06:21, Steven Schveighoffer пишет:
 Recently, I observed a conversation happening on the github pull request
 system.

 In phobos, we have the notion of output ranges. One is allowed to output
 to an output range by calling the function 'put'.

 The anti-pattern here is using member functions to override or
 specialize UFCS behavior. In this case, we even hook the UFCS call with
 the member function, encouraging the name conflict!

 As a possible solution, I would recommend simply change the name of the
 hook, and have the UFCS function forward to the hook. This way, calling
 put(r,...) and r.put(...) is always consistent.

+1 -- Dmitry Olshansky
Apr 25 2014
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 25 Apr 2014 00:58:51 -0400, Jonathan M Davis via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 If it doesn't work to override a free function with a member
 function, I honestly don't see much point to UFCS. The whole idea
 behind it is to make it so that you don't have to care whether a
 function is a free function or a member function. The current situation
 essentially forces you to not use UFCS except in cases where you're
 trying to add "member functions" to built-in types. And as such,
 calling functions on user-defined types using UFCS runs a high risk of
 not compiling, because all it takes is for the user-defined type to
 define a function with the same name - even if it takes completely
 different arguments - and now the compiler won't even try to use the
 free function anymore.

Right but the pattern is: someGlobal(T t, U u) { static if(is(typeof(t.someGlobal(u)))) t.someGlobal(u); else // use default implementation. } Essentially, someGlobal is encouraging not only overriding itself for a type, but for only a PORTION of the implementation. If you override put completely, allowing all the mechanisms put uses, that is absolutely fine. But for only implementing a single part, the hook put uses to interface with the type should NOT be named put. This means the member put is far less functional than the global function put. What I'm saying is that the hook and the global function should not be named the same. Not that we should change the override rules.
 I really think that we should fix it so that stuff like
 outputRange.put(foo) works - including when types define put
 themselves. AFAIK, that means changing the overload rules so that
 member functions conflict with free functions only when they take the
 same arguments - in which case the member function would be called, as
 it is now, except that the cases where a free function matches the
 arguments would also work, allowing us to override free functions with
 member functions where appropriate and prevent simple name collisions
 from making UFCS not work (i.e. when the member function takes
 completely different arguments, UFCS would still use the free
 function). Without a change along those lines, I'd be strongly inclined
 to argue against using UFCS in any situation except in those where you
 need to add "member functions" to the built-in types. And the only
 common case for that that I'm aware of is making it so that arrays can
 function as ranges.

I don't think this is a good idea. A type first and foremost is in charge of its API. Another possible option is to make sure put(R, ...) is as limited as the member R.put(...). This means, if R.put doesn't support the parameters, put(R, ...) should also reject them. This at least is consistent, but I think code will break for not much of a good reason. -Steve
Apr 25 2014
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 25 Apr 2014 03:52:09 -0400, Andrej Mitrovic via Digitalmars-d  
<digitalmars-d puremagic.com> wrote:

 On 4/25/14, Steven Schveighoffer via Digitalmars-d
 <digitalmars-d puremagic.com> wrote:
 Recently, I observed a conversation happening on the github pull request
 system.

Another case of hijacking is the std.conv.text function. I've seen two people so far that have accidentally called the global text function by mistake, e.g.: ----- import std.conv; import std.stdio; class C { this(string user, string text) { _user = user; _text = text; } string _user; string _text; } void main() { auto c = new C("John", "this is my message"); string input = c.text; writeln(input); // test.C, oops! } ----- UFCS has its benefits but it also has its drawbacks. And yet I wouldn't want to ban the above code, but maybe we could figure out some system for protecting against UFCS calls in some contexts. The way I worked around the common `.text` pattern was to define a disable'd field in the class, which would override the global text function and trigger a compilation error.

I think this really comes down to a poorly named function. textOf, toText, textify even, are better names that wouldn't cause this problem. -Steve
Apr 25 2014
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 25 April 2014 at 12:21:04 UTC, Steven Schveighoffer 
wrote:
 I think this really comes down to a poorly named function. 
 textOf, toText, textify even, are better names that wouldn't 
 cause this problem.

 -Steve

Well, in this case, yes, because it's a noun. For all functions that are verbs the issue still stands.
Apr 25 2014