D - Re: Ideas, thoughts, and criticisms, part two. About functions.

Antti Sykäri <jsykari cc.hut.fi> Aug 27 2002

Pavel Minayev <evilone omen.ru> Aug 27 2002

Mac Reiter <Mac_member pathlink.com> Aug 28 2002
Suporte Internet <suporte spica.mps.com.br> Aug 28 2002
Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> Aug 28 2002

Pavel Minayev <evilone omen.ru> Aug 28 2002

Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> Aug 30 2002

C.R.Chafer <blackmarlin nospam.asean-mail.com> Aug 30 2002

Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> Aug 30 2002

Antti Sykäri <jsykari cc.hut.fi> writes:

In D, Pavel Minayev <evilone omen.ru> wrote:
 But out parameters are a bit more than that - don't forget that you
 can overload functions based on types of their parameters, and not by
 return type! For example, in stream.d, I wrote:
 
void read(out byte x);
void read(out short x);
void read(out int x);


That looks nice, and might be the right way to solve the problem (which
problem, that I will address shortly).

But, as declaring

void read(out byte x);

is practically the same as declaring

byte read();

- the difference is merely syntactical - then why not extend the idea of
overloading functions to also cover the case of overloading them by the
return type? The intention of the above-mentioned code would not change if
it were written:

byte read();
short read();
int read();

void f()
{
    int x;
    x = read();
}

Only it would mean that the type of return value of "read()", and
therefore the function to be called, would have to be determined by
analyzing its context - which might have unexpected consequences. Think
about that.
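
For comparison, C++ cannot overload on return type, but the effect can be approximated with a proxy object whose conversion operators do the dispatch. The following is only a sketch: ReadProxy and the read_byte/read_short/read_int helpers are hypothetical names, and the returned constants stand in for real stream reads. It relies on initialization, where the exactly matching conversion is preferred.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the three overloads in the post.
std::int8_t  read_byte()  { return 1; }
std::int16_t read_short() { return 2; }
std::int32_t read_int()   { return 3; }

// Proxy returned by read(); the conversion that the context asks for
// decides which reader actually runs.
struct ReadProxy {
    operator std::int8_t()  const { return read_byte(); }
    operator std::int16_t() const { return read_short(); }
    operator std::int32_t() const { return read_int(); }
};

ReadProxy read() { return ReadProxy{}; }

void f() {
    std::int32_t x = read();   // exact match: operator std::int32_t
    std::int16_t s = read();   // exact match: operator std::int16_t
    assert(x == 3);
    assert(s == 2);
}
```

Note that a bare `read();` compiles here but reads nothing, which is one of the "unexpected consequences" of letting the context choose.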

Also, "out" parameters make it impossible to ignore a return value,
since we need a typed lvalue parameter as an argument to determine which
overloaded function to call. That might, or might not, be a good idea.

The core of the problem at hand is ultimately that we want to avoid
redundancy in the code. Consider (C example):

byte read_byte();
short read_short();

void redundancy_problem()
{   byte b;
    b = read_byte();
}

Now assume you want to change the byte to short. You have to change the
function call to read_short(), too. That's not fun, especially if the
function is a long one and you can't see all references. Besides i

So, in a modern language, we have to get rid of this kind of behavior.

Well, in C++ we can do that:
void read(byte& b);
void read(short& b);
void redundancy_solution()
{
    byte b; short s;
    read(b); // calls read(byte&)
    read(s); // calls read(short&)
}

So far, so good. But we can do that also if we had overloading by return
value types, or overloading by "out" types. (Which, the syntax set
aside, are practically the same. Except you can ignore the return
value(s).)

void solution_by_overloading_return_values()
{   byte b;
    b = read(); // byte read() is called
}

void solution_by_overloading_out_value()
{   byte b;
    read(b);    // read(out byte b) is called
}

Finally let's see two interesting alternatives to achieve the
non-redundancy:

1. Type inference (notice how I'm trying to smuggle features commonly
seen in functional languages into D?  No, I'm really not. This is just
an example. :-):

void type_inference_example()
{   b; // type of b will be inferred soon enough... but it is not
       // too visible from the source code and seems like trickery
    b = read_byte();

    b = read_short(); // error: cannot infer common supertype for b
                      // .. or actually, if short _is_ considered
                      // a supertype for b, we actually can!
}

2. Or, we could have the "C++ with typeof" solution:
void wicked_cplusplus_like_example()
{   byte b;
    b = read<typeof(b)>();  // yechh
}

No comments on that one.
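
For what it's worth, both alternatives can be written in later C++ standards (C++11 and up), with decltype playing the role of typeof and auto providing a limited form of type inference. This is a sketch under assumptions: the generic read<T> and its two specializations are hypothetical, and the constants stand in for real reads.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical generic reader. The primary template is only declared,
// so asking for an unsupported type fails at link time.
template <typename T> T read();

template <> std::int8_t  read<std::int8_t>()  { return 1; }
template <> std::int16_t read<std::int16_t>() { return 2; }

void non_redundant_example() {
    std::int16_t b;              // change the type here only
    b = read<decltype(b)>();     // the call follows the declaration of b
    assert(b == 2);

    auto c = read<std::int8_t>();   // type of c inferred from the initializer
    assert(c == 1);
}
```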

Actually, type inference might be nice; consider
class Base { }
class Derived : Base { }
class AnotherDerived : Base { }

class Foobar { }

void type_inference_example_on_steroids(char[][2] object_to_make)
{
    // no type presented, leave it to the compiler to decide
    object0;
    object1;

    // these two cause object0 to have type Base
    switch (object_to_make[0])
    {
    case "derived":
        object0 = new Derived(); break;
    case "anotherderived":
        object0 = new AnotherDerived(); break;
    }

    // and these two cause object1 to be of type Object
    switch (object_to_make[1])
    {
    case "derived":
        object1 = new Derived; break;
    case "foobar":
        object1 = new Foobar; break; // uh-oh... not much in common
    }
}

Actually, type inference might not be nice. At least not for me, since I
want to know the type of the declared variable, and I want it to read
right there in the code.  I don't mind if, for example, a
syntax-directed editor does the work for me (although it would have to
be more like a semantics-directed editor): when I for example change the
Foobar there to be Derived, it could inform me that "Your Object object1
now doesn't need to be Object any more, it should be Derived, so I made
the change for you" or something like that. But anyway, this kind of
work wouldn't be suitable to be done behind the curtains.

> Now, if you want to return nothing, just use:
>
> f();
 
 This makes it harder to parse. Also, when I look at it, my mind sees a 
 function call, and not a declaration.


Was this ("harder to parse") the original reason why the "void"
asymmetry (requiring void in return values but not arguments) wasn't
dumped in C++? Can't remember, don't have a copy of "The Design and
Evolution of C++" at hand, so can't check either. Stroustrup is good at
defending the choices he made when designing C++. (There actually are a
bunch of principles buried under it. *g*)

Still:

In my opinion, too many language features are crippled by the seemingly
well-meaning pursuit of trying to make the language easy to parse.

Consider the template syntax of C++: you have to say, for example,
set<vector<char> > instead of the _far_ more natural set<vector<char>>,
because someone decided that we must build lexical analyzers upon the
glorious principle of always returning the longest possible match. Rat's
ass I think.  I think that a lot of C++ code would simply look a lot
better if, in the design phase, it had been decided to recognize ">>" as
two ">" tokens if we're inside <> brackets.

And I think that nobody's code would be broken even if we started to
adhere to the new rule right now. How often do you use ">>" inside a
template argument list? Theoretically possible that you'd want to do
something like "const int n; set< vec < float, 32>>n > > funny;". But,
hey, dream on.  Besides you could circumvent that by defining a new
constant before the set<float<>> usage, and it'd be much clearer.  So,
including a rather simple special case in the lexer would've made the
language look and feel better. Imagine!

(It would've been simple -- as simple as making the /* */ comments
nested, actually. A language, to be successful, must be cool.)
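
C++11 later adopted essentially this rule: inside a template argument list, ">>" is treated as two closing brackets, and a genuine right shift has to be parenthesized. A small sketch, assuming a C++11 or later compiler; Vec and n are hypothetical names standing in for the "vec<float, 32>>n>" case above.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical fixed-size vector type, only here to carry a number.
template <typename T, int N>
struct Vec { static constexpr int size = N; };

const int n = 2;

// Since C++11, ">>" closes two template argument lists.
std::set<std::vector<char>> nested;

// A real right shift inside a template argument list needs parentheses.
typedef Vec<float, (32 >> n)> Shifted;

void demo() {
    nested.insert(std::vector<char>{'a'});
    assert(nested.size() == 1);
    assert(Shifted::size == 8);   // 32 >> 2
}
```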

And, in my opinion, look and feel of the language matter much more than
the ease of parsing. Sometimes the two goals do not conflict. In fact,
most of the time they don't - which deceitfully makes people associate an
"easily readable language" with "easily parsable language". But when
they do conflict, I'd opt for the better look and feel since the
language has far more users than it has implementers.

(And of course, function calls don't happen in the same places as
function declarations. (At least as long as you cannot declare functions
inside functions.) So it might not be that big an issue.)

Finally, well, look and feel are a matter of opinion. But they are also
always a fresh topic for a debate, since successful opinions are usually
those that change :-)

"void" can be treated as "procedure" in other languages, so I don't
 really see a problem here.


By the way, something I really don't understand is that some languages
actually have different keywords for procedures (which just execute
code) and functions (which return a value). It's just silly.

Maybe what I'm looking for here is a uniform interface - something that
could be (ab)used in metaprograms. Something along these lines:

First, a function f must have the following properties:
f.returnTuple - a tuple type of values it returns
f.argTuple    - a tuple type of values it gets as an argument

// please ignore the invented ad-hoc syntax and concentrate on the idea:
template<function F>
F.returnTuple
printingFunctionWrapper(
    F           fun,
    F.argTuple  arguments)      // this could be replaced with the "rest"
{                               // keyword proposed earlier
    print("I'm calling function ", F.name, " with arguments ");
    printTuple(arguments);

    // do the actual function call, store return values
    F.returnTuple retVals = fun(arguments);

    print(F.name, " returned: ");
    printTuple(retVals);
    return retVals;
}

As it might be obvious, this function is a wrapper that emulates normal
functions but logs their arguments and return values. printTuple is a
meta function which generates code which prints its arguments, their
names and their types. For example:

int squared_x square(int x) = { return x*x; }

main()
{
    printingFunctionWrapper(square, 5);
}

says:

I'm calling function square with arguments int x = 5
square returned: int squared_x = 25

Now, I admit that implementing this and refining the idea might not be
an easy task...
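
A rough approximation of the wrapper is possible with C++11 variadic templates. This is a sketch, not the proposed feature: C++ has no F.name or parameter names, so the function name is passed by hand and only values are printed; printAll and the explicit log stream are assumptions added for illustration, and functions returning void are not handled.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>

// Append each argument to the stream, each preceded by a space.
inline void printAll(std::ostream&) {}

template <typename First, typename... Rest>
void printAll(std::ostream& out, const First& first, const Rest&... rest) {
    out << ' ' << first;
    printAll(out, rest...);
}

// Calls fun(args...), logging the arguments and the result to `log`.
template <typename F, typename... Args>
auto printingFunctionWrapper(std::ostream& log, const std::string& name,
                             F fun, Args&&... args)
    -> decltype(fun(std::forward<Args>(args)...)) {
    log << "I'm calling function " << name << " with arguments";
    printAll(log, args...);
    log << '\n';

    // do the actual function call, store the return value
    auto result = fun(std::forward<Args>(args)...);

    log << name << " returned: " << result << '\n';
    return result;
}

int square(int x) { return x * x; }
```

Calling `printingFunctionWrapper(std::cout, "square", square, 5)` logs the call and returns the result of square(5).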

> private int inline max(int[] array, int maxSoFar, uint idx)
> {
>     if (idx == array.length)
>         maxSoFar;
>     else
>         max(array, max(array[idx], maxSoFar), idx + 1);
> }
 
 Yes, and then D will become a functional language...
 No, thanks! =)


Well, what I'm proposing here might not look like the traditional C
syntax - but on the semantic level it would be exactly the same as the
equivalent C code which 1) creates a nameless temporary value; 2)
generates code which assigns to the temporary value; 3) returns (or
yields) the temporary value at the end.

(I think that the functional form might be easier to optimize. I can't
say - haven't written an optimizing compiler yet.)
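
To make the claimed equivalence concrete, here is the same computation written both ways in C++. It is a sketch: maxFrom and maxLoop are hypothetical names, and the conditional operator stands in for the proposed "if as an expression".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expression-style version: each branch yields the function's value.
int maxFrom(const std::vector<int>& array, int maxSoFar, std::size_t idx) {
    return idx == array.size()
        ? maxSoFar
        : maxFrom(array,
                  array[idx] > maxSoFar ? array[idx] : maxSoFar,
                  idx + 1);
}

// Statement-style version: an explicit temporary is assigned to and
// returned at the end.
int maxLoop(const std::vector<int>& array, int maxSoFar) {
    int result = maxSoFar;
    for (std::size_t idx = 0; idx < array.size(); ++idx) {
        if (array[idx] > result) result = array[idx];
    }
    return result;
}
```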

Another nice-to-have functional-like thing would be 
- lambda functions (there is a need for them in some form in any language,
  even if they were implemented only as syntactical sugar for introducing
  a nameless global function and yielding its address)
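
C++11 lambdas behave much like this description when they capture nothing: such a lambda converts implicitly to an ordinary function pointer. A minimal sketch; applyTwice and lambdaDemo are hypothetical names.

```cpp
#include <cassert>

// Hypothetical helper that takes a plain function pointer.
int applyTwice(int (*fn)(int), int value) {
    return fn(fn(value));
}

int lambdaDemo() {
    // A capture-less lambda converts implicitly to a function pointer.
    return applyTwice([](int x) { return x + 3; }, 1);
}
```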

Let expressions or similar, on the other hand, wouldn't be needed.
Already C has a form of let expressions; the following block structure
is equivalent to (let ((x 5) (y 6)) (do_something)):

  {
    int x = 5;
    int y = 6;
    do_something();
  }

Antti.

(By the way, Pavel, I found it necessary to do some search & replace on
the quoted text, since my newsreader (slrn) shows your articles like
this, apparently not recognizing the character set and escaping each
punctuation mark:
=09void read=28out short x=29=3B
=09void read=28out int x=29=3B
=09=2E=2E=2E

Aug 27 2002

Pavel Minayev <evilone omen.ru> writes:

On Tue, 27 Aug 2002 21:41:59 +0000 (UTC) Antti Sykäri
<jsykari cc.hut.fi> wrote:

> In D, Pavel Minayev <evilone omen.ru> wrote:
> But, as declaring
>
> void read(out byte x);
>
> is practically the same as declaring
>
> byte read();

It's not! read further...

> - the difference is merely syntactical - then why not extend the idea of
> overloading functions to also cover the case of overloading them by the
> return type? The intention of the above-mention code would not change if
> it would be written:
>
> byte read();
> short read();
> int read();
>
> void f()
> {
>     int x;
>     x = read();
> }
>
> Only it would mean that the type of return value of "read()", and
> therefore the function to be called, would have to be determined by
> analyzing its context - which might have unexpected consequences. Think
> about that.

"Context" is exactly the word. For example, one could write:

	*(*(foo(x, y).bar[z])[u]) = read();

The compiler would have quite a lot of problems trying to figure out the
appropriate version of read(). It gets even more complicated with things
like overloaded functions and operators...

> Also, "out" parameters make it impossible to ignore a return value,
> since we need a typed lvalue parameter as an argument to determine which
> overloaded function to call. That might, or might not, be a good idea.

Well, you cannot ignore the result if you overload by return type either,
since then it would be unclear to the compiler what you're trying to call:

	byte read();
	int read();

	read();	// which version is called?

You could probably provide a version returning void in such cases, but so
can you do with out-parameters:

	void read(byte);
	void read(int);
	void read();

Now you can ignore the result.
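
The out-parameter variant maps directly onto C++ reference overloads. The sketch below assumes a hypothetical bytesConsumed counter in place of a real stream, and assumes the no-argument version discards a single byte; neither detail comes from the post.

```cpp
#include <cassert>
#include <cstdint>

int bytesConsumed = 0;   // hypothetical stand-in for a stream position

void read(std::int8_t& x)  { x = 1; bytesConsumed += 1; }
void read(std::int32_t& x) { x = 2; bytesConsumed += 4; }
void read()                { bytesConsumed += 1; }   // read and discard (assumed: one byte)

void demo() {
    std::int8_t b = 0;
    read(b);   // calls read(std::int8_t&), chosen by the argument type
    read();    // result ignored explicitly, no ambiguity
    assert(b == 1);
    assert(bytesConsumed == 2);
}
```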

> Was this ("harder to parse") the original reason why the "void"
> asymmetry (requiring void in return values but not arguments) wasn't
> dumped in C++? Can't remember, don't have a copy of "The Design and
> Evolution of C++" at hand, so can't check either. Stroustrup is good at
> defending the choices he made when designing C++. (There actually are a
> bunch of principles buried under it. *g*)
>
> Still:
>
> In my opinion, too many language features are crippled by the seemingly
> well-meaning pursuit of trying to make the language easy to parse.

Okay, so I don't care about the parser. I just find the "void..." syntax
more beautiful. When I see it, it is clear to me that I see a function
declaration. If it wasn't there, I could think it is a call.

> (And of course, function calls don't happen in the same places as
> function declarations. (At least as long as you cannot declare functions
> inside functions.) So it might not be that big an issue.)

It is. Just pick up some old big K&R C program and try to read through it.
It just drives me mad!

> By the way, something I really don't understand is that some languages
> actually have different keywords for procedures (which just execute
> code) and functions (which return a value). It's just silly.

No, it isn't. It's "academic". Whether it is good or not is a matter of
taste. Me personally, I simply don't care. It works the same way
everywhere, what else? =)

> (By the way, Pavel, I found it necessary to do some search & replace on
> the quoted text, since my newsreader (slrn) shows your articles like
> this, apparently not recognizing the character set and escaping each
> punctuation mark:

Okay, I never liked this newsreader anyhow. =) So, anyone knows a decent
free one?

Aug 27 2002

Mac Reiter <Mac_member pathlink.com> writes:

> (By the way, Pavel, I found it necessary to do some search & replace on
> the quoted text, since my newsreader (slrn) shows your articles like
> this, apparently not recognizing the character set and escaping each
> punctuation mark:

Okay, I never liked this newsreader anyhow. =) So, anyone knows a decent
free one?


I have been very happy with the web interface, but that is only good for these
newsgroups.

Mac

Aug 28 2002

Suporte Internet <suporte spica.mps.com.br> writes:

Pavel Minayev <evilone omen.ru> wrote:
 On Tue, 27 Aug 2002 21:41:59 +0000 (UTC) Antti Sykäri <jsykari cc.hut.fi>
 wrote:
 (By the way, Pavel, I found it necessary to do some search & replace on
 the quoted text, since my newsreader (slrn) shows your articles like
 this, apparently not recognizing the character set and escaping each
 punctuation mark:

 Okay, I never liked this newsreader anyhow. =) So, anyone knows a decent free
 one?


I use xnews for windows and tin for linux.

Aug 28 2002

Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> writes:

Pavel Minayev wrote:
 "Context" is exactly the word. For example, one could write:
 
         *(*(foo(x, y).bar[z])[u]) = read();
 
 The compiler would have quite a lot of problems trying to figure out the
 appropriate version of read().


The compiler already has these problems:
*(*(foo(x, y).bar[z])[u]) = 3.1415926; // implicit conversion to int, or not?

I like the idea of overloading by return type. Multiple return values are, in
my eyes, a concept that borrows too much from functional languages. I have the
feeling that functional programming is the wrong paradigm for D.

-- 
+----------------------------------------------------------------------+
I Dipl. Phys. Olaf Rogalsky                 Institut f. Theo. Physik I I
I Tel.: 09131 8528440                       Univ. Erlangen-Nuernberg   I
I Fax.: 09131 8528444                       Staudtstrasse 7 B3         I
I rogalsky theorie1.physik.uni-erlangen.de  D-91058 Erlangen           I
+----------------------------------------------------------------------+

Aug 28 2002

Pavel Minayev <evilone omen.ru> writes:

On Wed, 28 Aug 2002 19:15:44 +0200 Olaf Rogalsky 
<olaf.rogalsky theorie1.physik.uni-erlangen.de> wrote:

 The compiler already has these problems:
 *(*(foo(x, y).bar[z])[u]) = 3.1415926; // implicit conversion to int, or not?


And does it matter here? Whether the conversion happens or not, it is all
built-in stuff, no user-defined functions are called.

Aug 28 2002

Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> writes:

Pavel Minayev wrote:

 The compiler already has these problems:
 *(*(foo(x, y).bar[z])[u]) = 3.1415926; // implicit conversion to int, or not?


 And does it matter here? Whether the conversion happens or not, it is all
 built-in stuff, no user-defined functions are called.


internal or not, the compiler needs to know the type of the LHS in order
to carry out correct type conversions. based on that knowledge it wouldn't
be too hard to implement function overloading by return type.

Aug 30 2002

C.R.Chafer <blackmarlin nospam.asean-mail.com> writes:

Olaf Rogalsky wrote:

 internal or not, the compiler needs to know the type of the LHS in order
 to carry out correct type conversions. based on that knowledge it wouldn't
 be too hard to implement function overloading by return type.


Yes, but there are several issues involved in resolving ambiguities.
Suppose we had ...

/* prototypes - in reality they would be full definitions */
int f1( int a );        /* #1 */
int f1( long a );       /* #2 */
int f2( long a );       /* #3 */
long f2( long a );      /* #4 */

/* code (in procedure) */
int b = f1( f2( 42 ) );

Now do we call ...
         f1#1 on f2#3
or      f1#2 on f2#4

This problem is not insurmountable - we could resolve this with a cast (and 
D would probably flag an error rather than try to guess the correct format 
if the cast was omitted).
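
The same ambiguity and its resolution by cast can be reproduced in C++ by emulating the two f2 overloads with conversion operators. This is a sketch under that assumption: F2Result is a hypothetical proxy type, and the bodies of f1 just return the overload number so the choice is visible.

```cpp
#include <cassert>

int f1(int)  { return 1; }   // #1
int f1(long) { return 2; }   // #2

// Stand-in for "int f2(long)" (#3) and "long f2(long)" (#4).
struct F2Result {
    long a;
    operator int()  const { return static_cast<int>(a); }  // #3
    operator long() const { return a; }                     // #4
};

F2Result f2(long a) { return F2Result{a}; }

void demo() {
    // int b = f1(f2(42));   // rejected as ambiguous: #1 on #3, or #2 on #4?
    int viaInt  = f1(static_cast<int>(f2(42)));    // cast selects #3, then #1
    int viaLong = f1(static_cast<long>(f2(42)));   // cast selects #4, then #2
    assert(viaInt == 1);
    assert(viaLong == 2);
}
```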

My opinion is that overloading on return type is of limited utility at the 
present time - though may be a useful addition in Dv2.

C 2002/8/30

Aug 30 2002

Olaf Rogalsky <olaf.rogalsky theorie1.physik.uni-erlangen.de> writes:

"C.R.Chafer" wrote:
 My opinion is that overloading on return type is of limited utility at the
 present time - though may be a useful addition in Dv2.


type overloading. It just would be neat :-).


Aug 30 2002
